Friday 16 September 2016

Using SAS as a source in SSIS

I want to extract data from a SAS database file (*.sas7bdat). How do I do that in SSIS?

This is possible but not out of the box. You need to install an extra provider to accomplish  this.

1) Download SAS Provider
First you need to download and install the SAS Providers for OLE DB. There are multiple versions make sure to download the correct version (otherwise you get error messages like "This application does not support your platform"). You only need the select SAS Providers for OLE DB.
Install SAS Providers for OLE DB

2) Setup OLE DB Connection Manager
After installation the new provider will be available in OLE DB Connection Manager editor. Make sure to choose "SAS Local Data Provider X.X". This is the provider that can read SAS database files (*.sas7bdat).
SAS Local Data Provider 9.3

Second import step in the setup is to select the folder where the sas7bdat files are located. Don't select a file! All files will appear as tables in the OLE DB Source component. In my case I could leave the User name and Password fields empty because I already had access to the folder (but I'm not an SAS expert).
Fill in folderpath in Server or file name field

3) Setup OLE DB Source Component
Now you can use a regular OLE DB Source Component to extract data from SAS. However there are two concerns. When you select a table and close the editor you will get a warning that there is something wrong with the code page.
Cannot retrieve the column code page info from the OLE DB provider.
  If the component supports the "DefaultCodePage" property, the code page
from that property will be used.  Change the value of the property if the
current string code page values are incorrect.  If the component does not
support the property, the code page from the component's locale ID will
be used.

After clicking OK there will be a warning icon in the OLE DB Source Component which you can remove by setting the "AlwaysUseDefaultCodePage" property on true.
Before and after changing AlwaysUseDefaultCodePage

The second concern is more annoying: all datatypes will be DT_SRT (ansi string) or DT_R8 (float). You cannot change this and you need to add a data conversion.
Date(times) are also numbers: dates will be a number of days after January 1 1960 and datetimes will be the number of seconds after January 1 1960 and any decimals are used for milliseconds. A Derived Column expression for date could look something like:
DATEADD("DD", (DT_I4)[mydatecolumn], (DT_DATE)"1960-01-01")
All string or float

Tip: you can also use BIML to create SSIS packages with a SAS7BDAT source.

No comments:

Post a Comment

Please use the SSIS MSDN forum for general SSIS questions that are not about this post. I'm a regular reader of that forum and will gladly answer those questions over there.

All comments are moderated manually to prevent spam.

Related Posts Plugin for WordPress, Blogger...