My column in Excel contains mixed datatypes (strings and numbers). The string values are shown as NULL. How can I get all values?
Solution
The Excel driver reads the first 8 rows to determine the datatype. If the majority of the rows are numbers it will choose a numeric datatype instead of a string data type and visa versa.
So if your Excelsheet looks like this. The first and second column get DT_R8 and the third column gets DT_WSTR.
The standard result in SSIS |
TypeGuessRows
You can change the default of checking 8 rows (1 to16) by changing the windows registry, but that doesn't change the majority rule! If you want to change that number... Start the Windows Registry-editor and search for "TypeGuessRows". The place of that key depends on the version of Windows and Office. Examples:
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0\Engines\Excel
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\3.0\Engines\Excel
For the ACE provider (Excel 2007, 2010 & 2013):
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel\TypeGuessRows
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\TypeGuessRows
HKEY_LOCAL_MACHINE\Software\Microsoft\Office\15.0\Access Connectivity Engine\Engines\Excel\TypeGuessRows
Note: Setting the TypeGuessRows value to zero (0) will result in checking the first 16384 rows in excel.
Change at your own risk! |
Import Mode
If you want to get the data of all rows, regardless the datatypes in Excel, you should add the string ;IMEX=1 to the Excel Connection String. Changing to Import Mode (0 is Export mode, 1 is Import mode and 2 is Linked mode) will result in getting everything (including numbers) as a string. Goto the properties of the Excel Connection manager and change the ConnectionString property:
Provider=Microsoft.Jet.OLEDB.4.0;Data Source=D:\MyExcelFile1.xls;Extended Properties="EXCEL 8.0;HDR=YES;IMEX=1";
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=D:\MyExcelFile.xlsx;Extended Properties="Excel 12.0 XML;HDR=YES;IMEX=1";
The preview of the same Excelsheet looks very different now.
The result in SSIS with IMEX=1 |
There is one big but.... if the first 8 rows contain a number the datatype will be DT_R8 again (despite of IMEX=1). This could be a problem if the ninth row contains a string.
Still null values! |