Friday, 17 December 2010

Split record with start- and enddate into seperate records per date

Case
If you have records with a start- and enddate and you want to split them into separate records per date, the Script Component might be a solution for you.

Starting point
IdStartdateEnddate
101-21-201001-28-2010
209-26-201010-02-2010

Desired situation
IdFactdate
101-21-2010
101-22-2010
101-23-2010
101-24-2010
101-25-2010
101-26-2010
101-27-2010
101-28-2010
209-26-2010
209-27-2010
209-28-2010
209-29-2010
209-30-2010
210-01-2010
210-02-2010

Solution
1) Add a script component to your dataflow.
Script component (transformation)














2) Select all columns you need and select ReadOnly (default).
ReadOnly Input columns





















3) Now make sure that your task is Asynchronous (number of records in isn't equal to the number of records out) by selecting "None" at SynchronousInputId.
Asynchronous





















4) Copy all input columns to the output columns, but the StartDate and EndDate will be replaced by one date column named FactDate. Make sure that the datatypes match.
Copy columns





















5) Now the script (I used VB.net in this example). SSIS will generate two methods named:  Input0_ProcessInputRow and CreateNewOutputRows. You can remove the last one because we only need Input0_ProcessInputRow. First get the start- and enddate, then loop through the dates and add a row for each date:
' Split a record with a start- and enddate into separate records per date
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

Public Class ScriptMain
    Inherits UserComponent

    ' Method that will be started for each record in you dataflow
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ' Declare variables to use them in the loop
        Dim StartDate, EndDate as Date

        ' Get StartDate from dataflow
        StartDate = Row.StartDate

        ' Get EndDate from dataflow. You could add some default value if the EndDate is unknown.
        If Row.EndDate_IsNull Then
            EndDate = DateTime.Now.Date
        Else
            EndDate = Row.EndDate
        End If

        ' Loop through dates
        While StartDate <= EndDate
            Me.Output0Buffer.AddRow                ' Add a new record
            Me.Output0Buffer.FactDate = StartDate  ' Fill the new created FactDate column
            Me.Output0Buffer.Id = Row.Id           ' Fill all other columns copied from the input columns
            StartDate = StartDate.AddDays(1)       ' Add one day to go to the next date
        End While
    End Sub

End Class





6) Now add a target and run the package to see the result:
2 rows => 15 rows




















Note: There are other ways to fix this case, but this is one is quite easy and quick.

No comments:

Post a Comment

Please use the SSIS MSDN forum for general SSIS questions that are not about this post. I'm a regular reader of that forum and will gladly answer those questions over there.

All comments are moderated manually to prevent spam.