Monday, 29 October 2012

Split multi value column into multiple columns

 I have a source where one column has a divider that splits the sales values for each month. How do I split that value over multiple columns?
The last column contains the sales per month and is divided by a comma instead of a semicolon.

There are various options to split that value.
A) Substring/Findstring
B) Script Component
C) Token

Solution A: Substring/Findstring
You can use an expression in the Derived Column. The first and last deviate from the rest:
SUBSTRING(Sales,1,FINDSTRING(Sales,",",1) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",1) + 1,FINDSTRING(Sales,",",2) - FINDSTRING(Sales,",",1) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",2) + 1,FINDSTRING(Sales,",",3) - FINDSTRING(Sales,",",2) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",3) + 1,FINDSTRING(Sales,",",4) - FINDSTRING(Sales,",",3) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",4) + 1,FINDSTRING(Sales,",",5) - FINDSTRING(Sales,",",4) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",5) + 1,FINDSTRING(Sales,",",6) - FINDSTRING(Sales,",",5) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",6) + 1,FINDSTRING(Sales,",",7) - FINDSTRING(Sales,",",6) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",7) + 1,FINDSTRING(Sales,",",8) - FINDSTRING(Sales,",",7) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",8) + 1,FINDSTRING(Sales,",",9) - FINDSTRING(Sales,",",8) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",9) + 1,FINDSTRING(Sales,",",10) - FINDSTRING(Sales,",",9) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",10) + 1,FINDSTRING(Sales,",",11) - FINDSTRING(Sales,",",10) - 1)
SUBSTRING(Sales,FINDSTRING(Sales,",",11) + 1,LEN(Sales) - FINDSTRING(Sales,",",11))
Substring/Findstring solution

Solution B: Script Component
Add a Script Component (type transformation) and select the Sales column as ReadOnly input column in the Input Columns pane. Then go to the Inputs and Outputs pane and create a column foreach month. After that choose your language and hit the Edit Script button and copy the following method.
// C# Code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

public class ScriptMain : UserComponent
    public override void Input0_ProcessInputRow(Input0Buffer Row)
        // Split input column on comma fill output columns
        // Added the Convert.ToInt32(XXXX) to convert it to int
        string[] sales = Row.Sales.ToString().Split(new char[]{','}, StringSplitOptions.None);
        Row.Jan = Convert.ToInt32(sales[0]);
        Row.Feb = Convert.ToInt32(sales[1]);
        Row.Mar = Convert.ToInt32(sales[2]);
        Row.Apr = Convert.ToInt32(sales[3]);
        Row.May = Convert.ToInt32(sales[4]);
        Row.Jun = Convert.ToInt32(sales[5]);
        Row.Jul = Convert.ToInt32(sales[6]);
        Row.Aug = Convert.ToInt32(sales[7]);
        Row.Sep = Convert.ToInt32(sales[8]);
        Row.Oct = Convert.ToInt32(sales[9]);
        Row.Nov = Convert.ToInt32(sales[10]);
        Row.Dec = Convert.ToInt32(sales[11]);

or VB.Net

'VB.Net code
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute()g _
<CLSCompliant(False)> _
Public Class ScriptMain
    Inherits UserComponent
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ' Split input column on comma fill output columns
        ' Added the Convert.ToInt32(XXXX) to convert it to int
        Dim sales As String() = Row.Sales.ToString().Split(New Char() {","c}, StringSplitOptions.None)
        Row.Jan = Convert.ToInt32(sales(0))
        Row.Feb = Convert.ToInt32(sales(1))
        Row.Mar = Convert.ToInt32(sales(2))
        Row.Apr = Convert.ToInt32(sales(3))
        Row.May = Convert.ToInt32(sales(4))
        Row.Jun = Convert.ToInt32(sales(5))
        Row.Jul = Convert.ToInt32(sales(6))
        Row.Aug = Convert.ToInt32(sales(7))
        Row.Sep = Convert.ToInt32(sales(8))
        Row.Oct = Convert.ToInt32(sales(9))
        Row.Nov = Convert.ToInt32(sales(10))
        Row.Dec = Convert.ToInt32(sales(11))
    End Sub
End Class
This is more readable than the Substring/Findstring expressions!
Script Component solution

Solution C: Token
SSIS 2012 has a new expression called Token which makes life a lot easier than the Substring/Findstring solution:

Token solution

1 comment:

  1. I know this is an older article, but it saved my cookies! Thanks guys!


Please use the SSIS MSDN forum for general SSIS questions that are not about this post. I'm a regular reader of that forum and will gladly answer those questions over there.

All comments are moderated manually to prevent spam.

Related Posts Plugin for WordPress, Blogger...