Wednesday, September 28, 2011

What a big "If" you have



Have you ever been asked to automate a manual process? If you have then you have probably been ambushed like Little Red Riding Hood. Things start off normally enough Grandma is wearing her nightcap and her glasses or the process you are automating has rules and lists but everything is pretty normal just like Grandma. Then you notice that Grandma has very big eyes, and very big ears and a monster flow chart that defines all the exceptions to the lists and rules...


THE BETTER TO EAT YOU WITH!!!





This is my big bad wolf, I mean "if"

Ok I know it really isn't that bad but the person describing the process to me says
"well every time I do this it changes because we find new things that we want to reclassify"
So now I am faced with writing some complicated code and having to rewrite it every quarter.

YAY Job security!
I am a consultant, having regular projects is very welcome but, one of the things I like about consulting is that I get to leave when the project is done. So this blog is intended to show you my exit strategy for this project. The code is Qlikview load script but the procedure could be implemented in a variety of languages.

So first of all I spent a significant amount figuring out the process for categorizing this data the result of which is found in the flow chart shown above.

Next I Build up the contributing data, fields, tables and mapping lists and I do as much mapping as I can with lists, joins, lookups etc. Now I start to dissect the remaining flow chart and I am going to follow it building a spreadsheet with the folowing columns
Step - The name of the steps, every diamond on the flow chart will have a step.
ConditionStatement - This is a boolean expression which should be listed inside the diamond
TrueType - Tells the progam how to handle the argument as a Literal, Step, or Field each of these types uses different syntax to use within the expression.
TruePart - The value for a literal, another Step name or a field name (be sure to use square braces)
FalseType - Same as the TrueType but used in the false arguments of the expression
FalsePart - Same as TruePart but used as the False Argument.

Here is a really simple example of the flow chart data


Step
ConditionStatement
TrueType
TruePart
FalseType
FalsePart
SA01
[Color] = 'Orange'
Literal
Warm Color
Step
SA02
SA02
[Color] = 'Red'
Literal
Hot Color
Field
[Fruit]

This example would be coded something like this, in no particular language

   If [Color] = 'Orange' Then
      'Warm Color'
   Else if [Color] = 'Red' Then
      'Hot Color'
   Else
      [Fruit]
   End If

However, we are after something that works as part of a QlikView load statement like this...

   If([Color] = 'Orange','Warm Color',If([Color] = 'Red', 'Hot Color', [Fruit]))

Now that we have the table we need to process it in QlikView to achieve the desired expression.
We start by loading the table FlowChartData

   
//For the sake of having an example that will actually run I'm using Inline data

//Some of the syntax does not work as inline data so you need to build the spreadsheet
//although I did not test it I would expect some text file formats would work too.

Fruits:
Load * Inline [
Fruit, Color, Shape
Apple, Red, Round
Banana, Yellow, Long
Orange, Orange, Round
];






FlowChartData:
LOAD Step, 
     ConditionStatement,
     TrueType, 
     TruePart,
     FalseType, 
     FalsePart 
FROM
FlowChartIfData.xlsx
(ooxml, embedded labels, table is Sheet1);





//The procedure used to build the Big Bad If
Sub BuildIf(sTargetVariable, sStep, lCondition, lTruePart, lTrueType, lFalsePart, lFalseType)
   //Required Parameters
   //sTargetVariable - a variable where the "If" will be held
   //sStep - the name of the first step
   //Parameters beginning with "l" should not be passed to the
   //procedure, they are listed so they
   //will be initialized as NULLS and treated as local variables.



   Let lCondition = Lookup('ConditionStatement','Step','$(sStep)','FlowChartData');
   Let lTrueType = Lookup('TrueType','Step','$(sStep)','FlowChartData');
   Let lTruePart = Lookup('TruePart','Step','$(sStep)','FlowChartData');
   Let lFalseType = Lookup('FalseType','Step','$(sStep)','FlowChartData');
   Let lFalsePart = Lookup('FalsePart','Step','$(sStep)','FlowChartData');

   //begin building the expression
   set sTargetVariable = $(sTargetVariable) If($(lCondition),;

   //Build the true Part
   //In order to build the code correctly you have to handle
   //fields reference, literals and Step References differently
   SWITCH lTrueType
      Case 'Literal'
         set sTargetVariable = $(sTargetVariable) '$(lTruePart)';
      Case 'Field'
         set sTargetVariable = $(sTargetVariable) $(lTruePart);
      Case 'Step'
         Call BuildIf(sTargetVariable,'$(lTruePart)')
   END SWITCH

   // add the parameter that comes after the true part
   Set sTargetVariable = $(sTargetVariable),;

   //Build the false part
   SWITCH lFalseType
      Case 'Literal'
         set sTargetVariable = $(sTargetVariable) '$(lFalsePart)';
      Case 'Field'
         set sTargetVariable = $(sTargetVariable) $(lFalsePart);
      Case 'Step'
         Call BuildIf(sTargetVariable,'$(lFalsePart)')
   END SWITCH

   //Close the parenthesis
   Set sTargetVariable = $(sTargetVariable));
END SUB
   



//Variable must be initialized or the value from the previous run will
//persist and the new expression will be concatenated
Let vDynamicIf = '';



Call BuildIf(vDynamicIf, 'SA01')


//use the variable as an expression in a load statement
Salad:
Load [Fruit], $(vDynamicIf) as [Color Feel]
Resident Fruits;



Wild Thing I'll Map Your Pattern Strings

While working on a QlikView project I was presented with an troublesome problem where I was trying to map some codes to a set of descriptions but the mapping was based on a pattern string. So I have a list that looks something like this

Department Pattern

Description

ORT*

Orthopedics

VAS*

Heart & Vascular

CATH*

Heart & Vascular

*FAM*

Family Services

This is a highly abbreviated list but I think you’ll get the idea. I went looking for a mapping function that would help me match the pattern strings and I found two mapping functions ApplyMap() and MapSubstring(). I know ApplyMap won’t help me because it does an exact match and MapSubstring does not really do the job either because it will match a partial string but not a pattern.

The function that I decided to explore was WildMatch(). WildMatch is an interesting function, it requires two arguments, the first is a string usually a field that you would like to examine and then a list of pattern strings to search.

Load
WildMatch([DeptCode], ‘ORT*,’VAS*,’CATH*, *FAM*) as [Department Pattern List Number]
Resident
Departments;

The WildMatch function returns a number that matches the number of the argument where the match was found and zero when there is no match. I thought if I used this function with a carefully prepared mapping table then I could achieve the “WildMap” functionality.

Step 1. Load the map as a regular table and add a row number field to the table.

DeptPatterns:
Load
RowNo() as RowNumber,
DeptPattern,
DeptDesc;
Load * Inline [
DeptPattern, DeptDesc
ORT*, Orthopedics
VAS*, Heart & Vascular
CATH*, Heart & Vascular
*FAM*, Family Services
]
;

Step 2. Concatenate the patterns into a string that will serve as the arguments for the wildmatch function. Notice that I am using the sort weight parameter in the concat function to ensure the list items are in the proper order. I am creating a simple table assigning the field value to a variable and dropping the table

PatternStrings:
Load
Concat( chr(39) & DeptPattern & Chr(39),',',RowNumber) as PatternList
Resident
DeptPatterns;

Let vDeptPatternList = FieldValue('PatternList', 1);

Drop Table PatternStrings;

Step 3. Create the mapping table.

DeptMap:
Mapping
Load
RowNumber as Arg,
DeptDesc as Desc
Resident
DeptPatterns

Order By RowNumber;

Step 4. Use the result of the WildMatch function to determine what row in the mapping table has the desired description.


Load
ApplyMap('DeptMap',WildMatch([DeptCode], $(vDeptPatternList)),'Not Found') as [Department Group],


Resident
Departments;


In conclusion, I found this technique helpful to meet this requirement I hope you will find it useful too.