Friday, August 21, 2015

Clean up Log directory in IBM Cognos TM1

During development, over a period of time we often see plenty of files accumulate in the logs directory.  Typically these files consist of:
  1. Errors encountered in execution of Turbo Integrator (TI) code
  2. Saved files of tm1s.log (after SaveDataAll has been run or service is recycled)
  3. AsciiOutput / TextOutput files generated by TI code for debugging
  4. Files generated as required for upstream/downstream processing
Some method is required to clean up these files periodically; otherwise a lot of disk space on the server can be wasted storing them.


To configure the logs directory, you need to set the LoggingDirectory parameter in the tm1s.cfg file.  While this parameter is optional, it is recommended to set it to a folder of your choice.  Typically this folder is a sibling of the data directory, and my recommendation is that the path to the logs folder in tm1s.cfg be relative rather than fully qualified (a sample entry is shown after the list below).
If the parameter is not specified then:
  • By default TM1 will write the logs to the data directory, thereby cluttering it
  • If there is more than one DataBaseDirectory mentioned in tm1s.cfg, the logs are written to the first one
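For illustration, the relevant tm1s.cfg entries might look like the following.  The folder layout (tm1s.cfg sitting alongside the Data and Logs folders) is an assumption on my part, so adjust the relative paths to your own structure:

DataBaseDirectory=.\Data
LoggingDirectory=.\Logs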
So let’s see how a sample log directory looks after a few weeks/months.  In our case there has not been much development activity, so the minor errors are relatively few in number :)


We built a TI process, scheduled to run every morning/week before we walk in, that clears up the log directory.  Let’s look at the code.  There are 2 parameters for the process:


ps_SearchStr – Specify the beginning part of the name of the files to be deleted.  The parameter itself should NOT contain wildcards.

pi_Days – Specify the number of days.  Any file older than this, and matching the above search pattern, will be deleted; newer files will be retained.  If you want to delete all matching files, enter pi_Days = -1 when running the process.
The code assumes that the file’s date is embedded in the file name.  It is mainly written to cater to deletion of time-stamped tm1s logs as well as TM1ProcessError files; both of these have the date as part of the file name.
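For example, a nightly chore could call the process like this; the process name zSys.CleanLogDir is hypothetical, but the parameter names match those described above:

# Delete TM1ProcessError files older than 7 days (hypothetical process name)
ExecuteProcess ('zSys.CleanLogDir', 'ps_SearchStr', 'TM1ProcessError_', 'pi_Days', 7);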

Prolog

In this tab, we delete the files that match the pattern in the 1st parameter (ps_SearchStr) and whose age in days is greater than the 2nd parameter (pi_Days).
vi_StartIdx  = Long (ps_SearchStr) + 1;
vs_FileName  = '';
vs_FilePath  = GetProcessErrorFileDirectory;
# Search for presence of given file
vs_FileName  = WildcardFileSearch (vs_FilePath | ps_SearchStr | '*.log', '');
WHILE (vs_FileName @<> '');
    # Get its date and subtract it from today's date.  If the difference is more than pi_Days, delete the file
    vLogFileDate = DAYNO (SUBST (vs_FileName, vi_StartIdx, 4) | '-' | SUBST (vs_FileName, vi_StartIdx + 4, 2) | '-' | SUBST (vs_FileName, vi_StartIdx + 4 + 2, 2));
    vTodayDate   = DAYNO (TODAY (1));
    vDifference  = vTodayDate - vLogFileDate;
    IF (vDifference > pi_Days);
        AsciiDelete (vs_FilePath | vs_FileName);
        # Restart the search, since the deleted file can no longer act as the search anchor
        vs_FileName = '';
    ENDIF;
    # Search for the next file
    vs_FileName = WildcardFileSearch (vs_FilePath | ps_SearchStr | '*.log', vs_FileName);
END;

Epilog

The code here is mainly to delete the other types of files that get created: for example, the .txt files created by AsciiOutput / TextOutput calls that developers add for debugging, and perhaps a few .csv, .cma or other .log files.  You’d need to alter the code to suit your needs.
# Delete other .log files here, which users/developers might have created and which are no longer required
vs_LogFile1  = 'TM1ProcessError_';
vs_LogFile2  = 'tm1s';
vs_FileName  = '';
vs_FilePath  = GetProcessErrorFileDirectory;
vs_FileName  = WildcardFileSearch (vs_FilePath | '*.log', '');
WHILE (vs_FileName @<> '');
    # Ensure it is not one of the TI error logs or a timestamped tm1s.log file
    IF (Scan (vs_LogFile1, vs_FileName) = 0 & Scan (vs_LogFile2, vs_FileName) = 0);
        AsciiDelete (vs_FilePath | vs_FileName);
        # Restart the search, since the deleted file can no longer act as the search anchor
        vs_FileName  = '';
    ENDIF;
    vs_FileName = WildcardFileSearch (vs_FilePath | '*.log', vs_FileName);
END;
# Delete miscellaneous files like .txt, .csv, .cma files from the log directory
# Below is an example for .txt alone
vs_FileName  = WildcardFileSearch (vs_FilePath | '*.txt', '');
WHILE (vs_FileName @<> '');
    AsciiDelete (vs_FilePath | vs_FileName);
    # Always restart from the beginning; the file just found has been deleted
    vs_FileName = WildcardFileSearch (vs_FilePath | '*.txt', '');
END;

 


Accomplish Attribute Update in Cognos TM1 Smartly!

Every application involves some kind of metadata build.  Quite often we build the dimension structures and then assign attributes to the elements of the dimension.  When we add an attribute to a dimension, the TM1 server creates a control cube for it, }ElementAttributes_YOUR_DIMNAME.  It is a 2-dimensional cube, built from the dimension in question and a control dimension that is also named }ElementAttributes_YOUR_DIMNAME and holds one element per attribute.
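To make the relationship concrete: reading an attribute through the attribute function and reading the corresponding cell of the control cube return the same value.  A small sketch, where the dimension name Region, element Europe and attribute Description are purely illustrative:

# Both calls below read the same cell of the }ElementAttributes_Region control cube
vs_Value1 = AttrS ('Region', 'Europe', 'Description');
vs_Value2 = CellGetS ('}ElementAttributes_Region', 'Europe', 'Description');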

Depending on the attribute type (numeric or string), we use AttrPutN or AttrPutS to populate the attributes.  I will demonstrate the effect of using AttrPutN / AttrPutS.  Below is a screenshot of the logs directory before we begin any work.


Let us build a very large dimension and populate attributes for it.  You can refer to my earlier post on building a large dimension; the entire code of that process was in the Prolog tab.
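The snippet below is only a sketch of such a dimension-building process, under my own assumptions about the element names and count; refer to that post for the actual code.

# Sketch: build a large dimension with numbered leaf (numeric) elements
vs_DimName = 'aLargeDim';
IF (DimensionExists (vs_DimName) = 1);
    DimensionDeleteAllElements (vs_DimName);
ELSE;
    DimensionCreate (vs_DimName);
ENDIF;
vi_Count = 1;
WHILE (vi_Count <= 500000);
    DimensionElementInsert (vs_DimName, '', 'Element ' | NumberToString (vi_Count), 'N');
    vi_Count = vi_Count + 1;
END;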

Option 1: Use AttrPutS / AttrPutN

Let’s tweak the process and add the code below to the Epilog tab:
vs_AttrName1 = 'Description';
vs_AttrName2 = 'Caption';
vs_AttrName3 = 'Index';
AttrDelete (vs_DimName, vs_AttrName1);
AttrInsert (vs_DimName, '', vs_AttrName1, 'S');
AttrDelete (vs_DimName, vs_AttrName2);
AttrInsert (vs_DimName, '', vs_AttrName2, 'S');
AttrDelete (vs_DimName, vs_AttrName3);
AttrInsert (vs_DimName, '', vs_AttrName3, 'N');
vi_DimSiz = DimSiz (vs_DimName);
WHILE (vi_DimSiz > 0);
    vs_Elem = DimNm (vs_DimName, vi_DimSiz);
    AttrPutS ('Desc_' | vs_Elem, vs_DimName, vs_Elem, vs_AttrName1);
    AttrPutS ('Capt_' | vs_Elem, vs_DimName, vs_Elem, vs_AttrName2);
    AttrPutN (vi_DimSiz, vs_DimName, vs_Elem, vs_AttrName3);
    vi_DimSiz = vi_DimSiz - 1;
END;
I am adding 3 attributes (2 string and 1 numeric) for testing.  Let’s run the process with the following parameter values:


It took 40.16 seconds (as recorded in the log file) to complete the process.  However, the size of tm1s.log has now increased.

  

Let’s delete the dimension and run SaveDataAll before we begin the next step.  Open the tm1s_TIMESTAMP_.log file; you will see tons of entries wherein the system records the old attribute values for the 3 attributes we created.

Option 2: Use AttrPutS / AttrPutN and disable cube logging

We will now modify the above code and disable logging on the attribute cube.  As a result, when an AttrPut call is made, the change is not written to the tm1s.log file.
vs_AttrName1 = 'Description';
vs_AttrName2 = 'Caption';
vs_AttrName3 = 'Index';
AttrDelete (vs_DimName, vs_AttrName1);
AttrInsert (vs_DimName, '', vs_AttrName1, 'S');
AttrDelete (vs_DimName, vs_AttrName2);
AttrInsert (vs_DimName, '', vs_AttrName2, 'S');
AttrDelete (vs_DimName, vs_AttrName3);
AttrInsert (vs_DimName, '', vs_AttrName3, 'N');
# Disable cube logging on the attribute cube
vs_AttrCubeName = '}ElementAttributes_' | vs_DimName;
vi_OldValue = CubeGetLogChanges (vs_AttrCubeName);
CubeSetLogChanges (vs_AttrCubeName, 0);
vi_DimSiz = DimSiz (vs_DimName);
WHILE (vi_DimSiz > 0);
    vs_Elem = DimNm (vs_DimName, vi_DimSiz);
    AttrPutS ('Desc_' | vs_Elem, vs_DimName, vs_Elem, vs_AttrName1);
    AttrPutS ('Capt_' | vs_Elem, vs_DimName, vs_Elem, vs_AttrName2);
    AttrPutN (vi_DimSiz, vs_DimName, vs_Elem, vs_AttrName3);
    vi_DimSiz = vi_DimSiz - 1;
END;
# Revert the logging value on the attribute cube
CubeSetLogChanges (vs_AttrCubeName, vi_OldValue);
The process took 34.78 seconds to finish (saving about 5 seconds) and the tm1s.log file is no longer bloated.

Summary:

  • AttrPutS / AttrPutN are used to populate the attribute value(s) of elements in a dimension.  Each call to AttrPutN / AttrPutS creates an entry in the tm1s.log file, so for a very large dimension with a lot of attribute updates the process will likely run for a longer time
  • In addition, the tm1s.log file can bloat up to almost 1 GB, depending on the volume of change
  • It is therefore advisable to disable logging on the attribute cube before performing bulk attribute updates on a dimension
  • The time difference and tm1s.log bloating may not be an issue if you are working with small dimensions
AttrPutS ('Happy Smart Attribute updating to you!', 'Blog', 'Readers', 'Message');

Friday, August 7, 2015

Clearing the air around IBM Cognos TM1’s VMM setting

In case you are rolling your eyes and wondering why we need yet another blog post about VMM, read on; there is some new information here.  Let’s first start with what VMM is and how it is useful.
  • VMM stands for View Maximum Memory
  • It is a property that is available to be set on each cube
  • It controls the amount of RAM reserved on the server for storing view caches (a.k.a. Stargate views)
  • If the VMM threshold is reached for a given cube, the oldest views are purged to make room for newer ones (FIFO, First In First Out)
VMM and its partner in crime VMT (View Minimum Time) together control the Stargate caching.  You can check out the Operations Guide to know more about VMT.  I felt compelled to write about VMM, given the confusion in its documentation.

What’s the default value for VMM?

That question is the reason for writing this post.  The default value for VMM is 128KB, and it is specified in KB.  For those of you raising your eyebrows, and for those who would swear that it is 64KB, take a look at the chronology of events that has caused this confusion:
  • 9.5 & 10.1 documentation (operations guide in particular) says the default value is 65,536 and it is specified in bytes!
  • In April 2012, an IBM KB article (Ref # 7024354) was released that mentions the correct value


 (Fig 1: KB Article showing the correct values)
  • 10.2 and 10.2.2 documentation (operations guide in particular) says the default value is 65,536 and it is specified in bytes!
  • A new KB article (Ref # 1960495) on tips and techniques for working with MTQ in TM1, released in July 2015, states that the default value is 65KB (not 64KB)!

(Fig 2: An incorrect KB article)


So what is really going on here?  What is the default value of VMM and how is it expressed?  The correct answer is that:
  • Default value of VMM is 128
  • It is specified in kilobytes.  If you enter a VMM of 1024 for a cube, the system will reserve 1024KB = 1MB of RAM for storing that cube’s view caches (Stargate views)
This information is correctly stated in KB Ref # 7024354.  Unfortunately, it has trickled down neither into the operations guide nor into subsequent KB articles that refer to VMM.  Consequently, IBM KB articles and blogs on VMM continue to state that the default value is 64KB or 65KB and that it is expressed in bytes, which is misleading.

What is the Max Value of VMM?

Ok, if the default value is indeed 128KB, then what is the maximum value?  Per the 10.2.2 operations guide, the maximum value that can be assigned to VMM is ~40 GB.

 (Fig 3: Operations Guide showing max value for VMM)

Again, this is not entirely correct.  The valid range for VMM is 0 to 2,147,483,647 KB (which equates to roughly 2 TB), but that is a theoretical limit.  The practical upper limit is constrained by the amount of RAM available on your server, with the assumption that there is still sufficient memory left to load the cubes themselves.  The IBM KB article (Ref # 7024354) mentions this correctly; refer to the 2nd line in Fig 1 above.
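As a side note, a cube’s VMM value lives in the }CubeProperties control cube, so besides setting it through the Cube Properties dialog you can also set it from TI.  A minimal sketch, with the cube name and value as assumptions:

# Reserve 1024 KB (1 MB) of Stargate cache for a hypothetical cube named 'Sales'
# }CubeProperties holds its values as strings
CellPutS ('1024', '}CubeProperties', 'Sales', 'VMM');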

An APAR (PI45248) has been raised to correct the default value as well as the upper limit of VMM.  Hopefully this gets corrected in a future release of the documentation.  You can check the status of the APAR here.

Add dummy data to a cube or muddle data to make it unintelligible

In continuation of my last post, Creating a large dimension, this post describes how we obfuscated data in an existing cube.  The premise on which these 2 processes were built was to mimic the most voluminous cube we support, and then reproduce the myriad issues that accompanied this cube (be it in TM1, Cognos BI or Cognos Insight).

The goal is to obfuscate the numbers, modeled on Dilbert’s logic :)


To illustrate, I have built a sample cube with a few dimensions.  The Time dimension has months from Jan 2014 through Dec 2014.  There is a Chart of Accounts as well as a Regions dimension (picked from the Planning Sample database).  There is also a dimension called “aLargeDim”; this is the one that had close to half a million elements in it, although for the purpose of this demonstration I have created it with 1,000 elements.

The Turbo Integrator (TI) process consists of code in the Prolog, Data and Epilog tabs.  The original process I had is slightly modified here to also add dummy data to the cube.  Since I cannot attach files in WordPress, I will explain it with snippets of code; note that only snippets are shown, so treat them as a template to build on.

Parameters

 

You need to specify the cube name (ps_CubeName) and an option in pi_AddData: 1 to add dummy data, or 2 to muddle existing data and make it unintelligible.  The last 2 parameters (pi_PercentPopulate and ps_DimName) are required only if you are adding new data; they control the volume of data being added (defined as a percentage) and the dimension on which that percentage is applied.

Prolog

Some housekeeping steps, like disabling logging on the cube (and presumably creating the work view used below), are not shown; a sketch of that setup follows the list below.  The bulk of the code in the Prolog is made up of the while loop:
  • If we are adding data, we first calculate the total number of leaf intersections in the cube
  • If we are adding data, we pick only a percentage of leaf members from the dimension named in ps_DimName.  Say there are 1,000 elements in it and pi_PercentPopulate is 20; the code will pick every 5th element (100/20 = 5), bringing that dimension down to 200 elements (20% of 1,000 = 200)
  • If we are adding new data, the view must include cells that are currently zero (skip zeroes set to 0)
  • If we are making existing data unintelligible, we need only the non-zero cells (skip zeroes set to 1)
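For completeness, a minimal sketch of that omitted setup is below.  This is my assumption of what it could look like, not the original code; the view/subset names, the cube-name assignment and the vi_OldLogValue variable are all hypothetical.

# Hypothetical housekeeping (not part of the original snippet)
vs_CubeName = ps_CubeName;
vs_ViewName = 'zMuddleData';
vs_SubName  = 'zMuddleData';
# Remember the current transaction-log setting and switch logging off for this run
vi_OldLogValue = CubeGetLogChanges (vs_CubeName);
CubeSetLogChanges (vs_CubeName, 0);
# (Re)create the work view to which the loop below assigns a subset per dimension
IF (ViewExists (vs_CubeName, vs_ViewName) = 1);
    ViewDestroy (vs_CubeName, vs_ViewName);
ENDIF;
ViewCreate (vs_CubeName, vs_ViewName);
# Use that view as the data source for the Data tab
DataSourceType = 'VIEW';
DataSourceNameForServer = vs_CubeName;
DataSourceCubeview = vs_ViewName;

The while loop from the post then follows.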
vi_CellCount = 1;
vi_AddCount = 0;

i = 1;
vs_DimName = TabDim (vs_CubeName, i);
WHILE (vs_DimName @<> '');
    # Calculate the total cell count.  This is required when we need to populate a % of cells
    vi_DimSiz = DimSiz (vs_DimName);
    vi_ElemCnt = 0;
    IF (pi_AddData = 1);
        WHILE (vi_DimSiz > 0);
            IF (ElLev (vs_DimName, DimNm (vs_DimName, vi_DimSiz)) = 0);
                vi_ElemCnt = vi_ElemCnt + 1;
            ENDIF;
            vi_DimSiz = vi_DimSiz - 1;
        END;
        vi_CellCount = vi_CellCount * vi_ElemCnt;
    ENDIF;
    IF (SubsetExists (vs_DimName, vs_SubName) <> 0);
        SubsetDeleteAllElements (vs_DimName, vs_SubName);
    ELSE;
        SubsetCreate (vs_DimName, vs_SubName);
    ENDIF;
    IF (Upper (ps_DimName) @= Upper (vs_DimName) & pi_AddData = 1);
        # If the dimension name matches the prompt value, then pick only a % of its elements
        # Ex: if pi_PercentPopulate = 20, then pick 20% of the elements from this dimension
        vi_DimSiz = DimSiz (vs_DimName);
        vi_ModCount = Round (100 \ pi_PercentPopulate);
        vi_SubInsPt = 1;
        WHILE (vi_DimSiz > 0);
            IF (ElLev (vs_DimName, DimNm (vs_DimName, vi_DimSiz)) = 0 & Mod (vi_DimSiz, vi_ModCount) = 0);
                SubsetElementInsert (vs_DimName, vs_SubName, DimNm (vs_DimName, vi_DimSiz), vi_SubInsPt);
                vi_SubInsPt = vi_SubInsPt + 1;
            ENDIF;
            vi_DimSiz = vi_DimSiz - 1;
        END;
    ELSE;
        # Otherwise use all the elements in the dimension
        SubsetIsAllSet (vs_DimName, vs_SubName, 1);
    ENDIF;
    ViewSubsetAssign (vs_CubeName, vs_ViewName, vs_DimName, vs_SubName);
    i = i + 1;
    vs_DimName = TabDim (vs_CubeName, i);
END;
vi_NumOfDims = i - 1;
# If we are adding data and ps_DimName is not a valid dimension of the cube, then quit
# Otherwise we would end up processing 100% of the intersections
IF (pi_AddData = 1 & vi_SubInsPt <= 1);
    ProcessQuit;
ENDIF;
IF (pi_AddData = 1);
    # If we are adding data, then we need to consider the blank cells
    ViewExtractSkipZeroesSet (vs_CubeName, vs_ViewName, 0);
ELSEIF (pi_AddData = 2);
    # If we are modifying data, then we need to modify only the non-zero cells
    ViewExtractSkipZeroesSet (vs_CubeName, vs_ViewName, 1);
ENDIF;

Data Tab

In the Data tab, we generate a random value and then multiply it with the existing data (if required).  The value-scrambling formula (the vi_NewVal line) is arbitrary; feel free to change it to whatever you want.

vi_RandVal = Rand ();
IF (pi_AddData = 1);
    # Skip every so often
    vi_Populate = IF (vi_RandVal < 0.33333, 0, 1);
    IF (vi_Populate = 0);
        ItemSkip;
    ENDIF;

    IF (vi_AddCount > (pi_PercentPopulate * vi_CellCount \ 100));
        ProcessBreak;
    ENDIF;
ENDIF;

vi_Sec = StringToNumber (TimSt (Now (), '\s'));
vi_NewVal = (NValue + (100 * vi_Sec)) * 0.75 * vi_RandVal;
vi_AddCount = vi_AddCount + 1;

IF (vi_NumOfDims = 2);
    CellIncrementN (vi_NewVal, ps_CubeName, Dim_001, Dim_002);
ELSEIF (vi_NumOfDims = 3);
    CellIncrementN (vi_NewVal, ps_CubeName, Dim_001, Dim_002, Dim_003);
ELSEIF (vi_NumOfDims = 4);
    CellIncrementN (vi_NewVal, ps_CubeName, Dim_001, Dim_002, Dim_003, Dim_004);
ELSEIF (vi_NumOfDims = 5);
    CellIncrementN (vi_NewVal, ps_CubeName, Dim_001, Dim_002, Dim_003, Dim_004, Dim_005);
# ... repeat the ELSEIF block for as many dimensions as your cube has
ENDIF;

The Epilog tab does not have much code in it, except to turn logging on the cube back to its initial value; a sketch of what that might look like is below.
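This is not the original Epilog; it assumes the hypothetical vi_OldLogValue and view/subset names from the setup sketch shown earlier, so adjust to your own process.

# Restore the transaction-log setting captured in the Prolog (hypothetical variable name)
CubeSetLogChanges (vs_CubeName, vi_OldLogValue);
# Optionally tidy up the work view and the per-dimension subsets
ViewDestroy (vs_CubeName, vs_ViewName);
i = 1;
vs_DimName = TabDim (vs_CubeName, i);
WHILE (vs_DimName @<> '');
    SubsetDestroy (vs_DimName, vs_SubName);
    i = i + 1;
    vs_DimName = TabDim (vs_CubeName, i);
END;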

Demonstration


Ok, not literally … but for those who insist on seeing accurate numbers (to the decimal), it may lead to confusion :)

I have entered some predefined values in the 'Sales' account for Jan 2014; this is what the data looks like, with nice round numbers.  We will run the process with option 2 for the pi_AddData parameter.


 
This is the output after running the process.  The process runs fairly fast, as it has to work with fewer than 100 rows.  As you can see, the numbers are distorted.



 
Let’s clear the data in the cube and re-run the process with pi_AddData = 1, populating about 5% of cells.  The output will look something like this (shown for a couple of accounts for Jan 2014); the data, however, is spread across all months and accounts.



Happy Data Manipulation to you!