Thursday 3 November 2011

How to pivot in SSIS

The Pivot Transformation in SSIS has a big drawback: It has no interface which makes it hard to work with, even for seasoned users.

I have a CSV source that looks like this:
Unpivoted data 

And I want to transfer it to this:
Pivoted data

How does the Pivot Transformation work?

First vote for this feedback on Microsoft Connect because even in Denali (ctp3) there is still no good gui and that's a bit of a shame!
* UPDATE 17 November: Pivot GUI for SQL Server 2012 RC0 *

1) Source
For this example I use a CSV file. Add a Flat File Source for the file.
CSV as Source

2) Sort
The data needs to be sorted on the key column. In this case that is the day column (it will be the unique key after the pivot). If your source is a database table, then you should add an order by clause in the source query. If your source can't be sorted, then you have to use a Sort Component. (see Performance Best Practices step 5)
Sort on key column

If you forget to sort you will get more rows and a lot of null values.
Forgot to sort

3) Pivot input columns
Select all the columns you need as (readonly) input columns. In this case select them all.
Input columns

4) Pivot Usage
Now the hard part. You need to determine the so called pivot usage for each column. There are 4 values (0 to 3):
means that the column will pass through unaffected
1 means that the column becomes the key of the pivot (aka the Set Key)
2 means that the column values become the column names of the pivot (aka the Pivot Column)
3 means that the column values are pivoted in the pivot

In our case:
Day gets pivot usage 1 because it will be the new key
Period gets pivot usage 0 because it will be passed through unaffected
Product gets pivot usage 2, because its values will be the new column names
Volume gets pivot usage 3, because its values will be used for the new columns
Pivot Usage (notice the LineageID for the next step)

5) Output columns
Now we have to create new output columns. For the first column we will keep the same name: Day.
Create a new column named Day and in the SourceColumn property you need to fill in the LineageID from the source column Day. See picture of previous step. In my case it's 100, but yours could be different! Name and SourceColumn are the only things you need to fill in.
New column

Do the same for the Period column: Give it the name Period and lookup the LineageID for the SourceColumn.

For the next three columns there are more things to fill in. First create three new columns and give them suitable names. I used the same names as the values in the excel sheet, but you can change that. Now the tricky part. The PivotKeyValue has to be the exact value of the various row values in the product column (Mountainbike, Tricycle, Recumbentbike). The SourceColumn refers to the column where the new value comes from. So in our case it will be the LineageID from the Volume column.
New columns

6) The result
Now you can add a destination and test the result. I used data viewers to show you the result.
The result

Download example package (2008 R2)
Download example file (CSV)

No comments:

Post a Comment

Please use the SSIS MSDN forum for general SSIS questions that are not about this post. I'm a regular reader of that forum and will gladly answer those questions over there.

All comments are moderated manually to prevent spam.

Related Posts Plugin for WordPress, Blogger...