Saturday, 24 May 2014

Create and fill Age dimension

Case
Is there an easy way to create and populate an age dimension with age groups?

Solution
Creating an age dimension is usually done once and probably not in SSIS, but with a TSQL script.
For each new assignment I use a script similar to this and adjust it to the requirements for that particular assignment.

-- Drop dimension table if exists
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[dim_age]') AND TYPE IN (N'U'))
BEGIN
 DROP TABLE [dbo].[dim_age]
END

-- Create table dim_age
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[dim_age](
  [dim_age_id] [int] IDENTITY(-1,1) NOT NULL,
  [Age] [smallint] NULL,
  [AgeGroup1] [nvarchar](50) NULL,
  [AgeGroup1Sort] [int] NULL,
  [AgeGroup2] [nvarchar](50) NULL,
  [AgeGroup2Sort] [int] NULL,
 CONSTRAINT [PK_dim_age] PRIMARY KEY CLUSTERED 
(
 [dim_age_id] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

-- Enter unknown dimension value (in case a person's date of birth is unknown)
INSERT INTO [dbo].[dim_age]
           ([Age]
           ,[AgeGroup1]
     ,[AgeGroup1Sort]
           ,[AgeGroup2]
     ,[AgeGroup2Sort])
     VALUES
           (-1
           ,'Unknown'
     ,0
           ,'Unknown'
     ,0)
GO

-- Enter all ages
declare @age smallint;
set @age = 0;

-- Loop through ages 0 to 130
WHILE @age < 131
BEGIN
 INSERT INTO [dbo].[dim_age]
  ([Age]
  ,[AgeGroup1]
  ,[AgeGroup1Sort]
  ,[AgeGroup2]
  ,[AgeGroup2Sort])
 VALUES
  (@age
  -- Use the common age groups/categories of your region/branch/industry
  -- This is just an example
  , CASE
    WHEN @age < 15 THEN '0 till 15 year'
    WHEN @age < 25 THEN '15 till 25 year'
    WHEN @age < 35 THEN '25 till 35 year'
    WHEN @age < 45 THEN '35 till 45 year'
    WHEN @age < 55 THEN '45 till 55 year'
    WHEN @age < 65 THEN '55 till 65 year'
    ELSE '65 year and older'
   END
  -- Add value to sort on in SSAS
  , CASE
    WHEN @age < 15 THEN 1
    WHEN @age < 25 THEN 2
    WHEN @age < 35 THEN 3
    WHEN @age < 45 THEN 4
    WHEN @age < 55 THEN 5
    WHEN @age < 65 THEN 6
    ELSE 7
   END
  , CASE
    WHEN @age < 19 THEN 'Juvenile'
    ELSE 'Mature'
   END
  -- Add value to sort on in SSAS
  , CASE
    WHEN @age < 19 THEN 1
    ELSE 2
   END
  )

 -- Goto next age
 set @age = @age + 1
END

The result: filled age dimension
























How could you use this dimension?
A while ago I also posted an example to create and populate a date dimension. So now you can combine those in a datamart. I have an employee table and an absence table with a start- and enddate.
Employee table

Absence table



















I will use the date dimension to split the absence time periods in separate days and then calculate the employee's age of each day of absence. This will go in to a fact table and then I can use the age dimension to see absence per age group.
-- Split absence time periode in separate days, but go back 2 years max and 1 year forward if end date is unknown
SELECT  Absence.AbsenceId
,   Absence.EmployeeNumber
--   Date of absence
,   dim_date.Date as AbsenceDate
,   Absence.ReasonCode
--   Calculation of age at time of absence
,   DATEDIFF(YEAR, Employee.DateOfBirth, dim_date.Date)
   -
   (CASE
    WHEN DATEADD(YY, DATEDIFF(YEAR, Employee.DateOfBirth, dim_date.Date), Employee.DateOfBirth)
     >  dim_date.Date THEN 1
    ELSE 0
   END) as Age
FROM  EmployeeApplication.dbo.Absence
INNER JOIN EmployeeApplication.dbo.Employee
   on Absence.EmployeeNumber = Employee.EmployeeNumber
INNER JOIN  DM_Staff.dbo.dim_date
   on dim_date.Date
   -- change start date to lower bound if it's below it
            BETWEEN CASE WHEN YEAR(Absence.AbsenceStartDate) >= YEAR(GETDATE()) - 2 THEN Absence.AbsenceStartDate
            ELSE DATEADD(yy, DATEDIFF(yy, 0, getdate()) - 2, 0) END
   -- change end date to upper bound if it's null
            AND ISNULL(Absence.AbsenceEndDate, DATEADD(yy, DATEDIFF(yy, 0, getdate()) + 2, -1))
--   Filter absence record with an enddate below the lower bound (perhaps a bit superfluous with the inner join)
WHERE  YEAR(ISNULL(Absence.AbsenceEndDate, GETDATE())) >= YEAR(GETDATE()) - 2


Result of query that can be used in a fact package





















fact absence


























Note: this is a simplified situation to keep things easy to explain.

Tuesday, 13 May 2014

Continue Loop after error - Part II

Case
I used the propagate variable trick to continue a loop on error. That works within the package, but the parent package calling the package with the loop still fails.
Child succeeds, parent fails

















Solution
This is by design according Microsoft. The workaround they suggest is setting the DisableEventHandlers property of the Execute Package task to True. This should ignore all errors in the child package. Below here is an alternative solution to only ignore expected errors. Don't hesitate to post your own solution in the comments.

A parent package variable will be filled by the child package in case of an unexpected error. In the parent package it will be used to throw an error when it's filled.

1) Variable - Parent Package
Add a string variable in the parent package called childError.
Add string variable


























2) OnError - Child Package
Go to the child package where you used the propagate variable 'trick' and add an OnError event handler on package level. Go to Event Handlers tab. Make sure the package is selected as Executable and OnError as Event handler. Then click on the link in the middle of the page: Click here to create an 'OnError' event handler for the executable 'Child1'.
OnError event handler for package


















3a) Script Task - Child Package
Add a Script Task to the event handler and give it a suitable name. Then edit it and add the System variables ErrorCode and ErrorDescription as read only variables and type the name of the parent package in the read write box. You can't select it because it's only known at runtime.
Add the variables




















3b) The Script
Hit the edit button and copy the code of my main method to your main method. The example code is in C#. For a VB.Net version you can use this conversion tool.
// C# Code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_ace2311e5a4c4bbb98101cd54888c7c9
{
    [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        public void Main()
        {
            // Fill parent package variables with error message from this child package.
            Dts.Variables["User::ChildError"].Value = Dts.Variables["System::ErrorCode"].Value.ToString() + " - " + Dts.Variables["System::ErrorDescription"].Value.ToString();
            Dts.TaskResult = (int)ScriptResults.Success;
        }

        #region ScriptResults declaration
        /// 
        /// This enum provides a convenient shorthand within the scope of this class for setting the
        /// result of the script.
        /// 
        /// This code was generated automatically.
        /// 
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

 }
}

4) Propagate - Parent Package
Go back to the parent package and add an empty event handler on the Execute Package Task so that it won't fail if the child package has an error. Go to the event handlers. Select the Execute Package Task as executable and select OnError as eventhandler and then create the event handler by clicking on the link in the middle of the screen: "Click here to create an 'OnError' event handler for executable 'EPT - Child1'.

Next go to the variables pane and hit the Variable Grid Option button to also show system variables (SSIS 2008 has a different button). Then find the system variable Propagate and set it to false.

Last step is to add an annotation in the empty event handler to explain why it's empty.
OnError Event Handler

















Show system variables for SSIS 2008








5a) Script Task - Parent Package
Go back to the Control Flow of your parent package and add a Script Task to fire an error. Connect it to the Execute Package Task with an expression only on the precedence constraint:
@[User::ChildError] != ""
Fire Child Error























5b) The Script
Edit the Script Task and add the string variable ChildError as read-only variable. After that hit the edit button and copy the content of my main method to your main method. The example code is in C#. For a VB.Net version you can use this conversion tool.
Read-only variable























// C# Code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_7b522ca79c9f4428a233a100bfc66e6e
{
    [Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        public void Main()
        {
            // Fire error with error message from child package and fail script task
            Dts.Events.FireError(0, "Child package", "Child package error: " + Dts.Variables["User::ChildError"].Value.ToString(), string.Empty, 0);
            Dts.TaskResult = (int)ScriptResults.Failure;
        }

        #region ScriptResults declaration
        /// 
        /// This enum provides a convenient shorthand within the scope of this class for setting the
        /// result of the script.
        /// 
        /// This code was generated automatically.
        /// 
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

 }
}


6) Precedence Contraint - Parent Package
If there is no unexpected error in the child package then we continue with the next task(s). Each task that is connected to the Execute Package Task with a Success Precedence Constraint should be changed to an expression: @[User::ChildError] == ""
Continue if there are no unexpected errors in the child package


























7) The result
I added a Script Task in the child package that always fails to simulate an unexpected error. On the second run there where only expected errors in the loop, so the child package was successful. In the parent package there is still a red cross, but it continues without (unexpected) errors.
The result

Continue Loop after error - Part I

Case
I want to continue my loop when in one of the iterations a task fails. It should continue with the next iteration/file. I tried changing the task properties FailParentOnFailure and MaximumErrorCount, but nothing seems to work.
Task fails, so loop fails

















Solution
There are a couple of solutions. You could set the MaximumErrorCount to a higher number on the parent container (not on the task that fails). With FailParentOnFailure property on the failing task you can override that setting and fail the parent on the first error.

The easiest/stable solution is to use an empty OnError event handler on the failing task with the system variable Propagate set to false.

1) Empty event handler
Go to the event handlers tab. Select (1) the failing task as Executable and (2) OnError as the event handler. Next (3) click on the link in the middle of the page to create the event handler: Click here to create an 'OnError' event handler for executable 'DFT - Stage files'.
Create empty event handler
















2) System variable propagate
Open the variable pane in the event handler. Click on (1) the Variable Grid Options button. Check (2) the radio button "Show system variables". Search for the Propagate variable and set (3) it to false.
System variable Propagate

























Show system variables for SSIS 2008








3) Annotation
An empty event handler could confuse other developers. Adding a simple annotation could solve that.
Add an annotation























4) The result
Now an error in the Data Flow Task won't fail the Foreach Loop Container. This solution works with all containers (Sequence, For Loop, Foreach Loop and package).
























Note: this solution wont work with parent child packages. Propagate can't be disabled from a child package to a parent package. This is by design according Microsoft. Here is a workaround for that.

Friday, 2 May 2014

BIML doesn't recognize system variable ServerExecutionID

Case
I want to use the SSIS System Variable ServerExecutionID as a parameter for an Execute SQL Task in a BIML Script, but it doesn't recognize it and gives an error:

Could not resolve reference to 'System.ServerExecutionID' of type 'VariableBase'. 'VariableName="System.ServerExecutionID"' is invalid.
























Solution
The current version of BIDS/BIML doesn't recognize all system variables (for example
LocaleId and ServerExecutionID). Other system variables like VersionMajor or VersionBuild will work. You can overcome this by manually adding these variables in your BIML Script.


<Variable Name="ServerExecutionID" DataType="Int64" Namespace="System">0</Variable>


























And if you now run the package (in the catalog) the table gets filled with the System variable ServerExecutionID:

Number added, it works!