Friday, November 13, 2009

Unexplained NULLs from Excel Sources in SSIS

Today when using SSIS to transpose some data I run into a problem which I'd seen before but had never had time to properly investigate and understand. Today the alternative to solving the problem was a lot of manual data manipulation so it was time to dig in and work out what was going on.


So, here's the problem, you may well have struck it yourself.


My Excel spreadsheet has a text column and three columns of integers.

But when I added an SSIS Excel Source and previewed the data I saw the following:

Rather than seeing the values of "Battery Point", 2, 1 and 1 for the first data row I had "Battery Point", 2, 1, NULL. Where had the value for the last column gone?

Some digging around MSDN led me to learn that the Excel driver reads 8 rows to determine the data type for each column in the source file. When a column has mixed data types the majority type wins, with numeric types winning over text types in the event of a tie. Cells in that column with the losing type(s) are given a NULL value. In my case I had one numeric cell and seven text cells in the first eight rows of the column. So my integer value of 1, despite the fact that it was in the first row, lost out under the collective weight of the following seven rows and was made into a NULL.

It is possible to avoid this by telling the driver to work in Import Mode. To do this bring up the Properties Page of the Connection object and add the value IMEX=1 to the extended properties of the connection string. You'll need to manually type the value into the connection string rather than using the ellipse or you won't be able to make the change.

So, in my example I start with this


And change it to this




A quick check of the preview from the Excel source reveals my value of 1 is back - the NULL is gone!


By the way, it's also possible to change the number of rows that Excel samples to guestimate the data type for each column to something other than the default value of 8. This is done by altering the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel and changing the value of TypeGuessRows to your new value.

No comments:

Post a Comment