Thursday, 29 December 2011

ProperCase in SSIS

Case
I have a source where all data is in uppercase and I want to use ProperCase (similar synonyms: UpperCamelCase, PascalCase, TitleCase, StartCase) instead. How do I do that in SSIS?
All uppercase








Solution
A) If your source is a SQL Server table then you could use a User-Defined Function with some TSQL code (example 1, example 2) to proper case your data.

B) You could come up with some clever and complex, but probably unreadable expression.

C) You could use a Script Component with one line of code for the actual proper case.

Let’s elaborate that Script Component solution. VB.net has a standard method for it: StrConv, but it’s also possible in C#.

1) Add Script Component
Go to the Data Flow and add a Script Component (type transformation) after your source. Connect it to your source and give it a suitable name.
Script Component (type transformation)
























2) Input Columns
Edit the Script Component and go to the Input Columns tab. Select the columns that you want to change as ReadWrite.
Input Columns: ReadWrite























3) The Script
Select the programming language (Visual Basic), edit the Script and copy the following VB.net code.
' VB.Net code
' The standard VB.net string function to convert
' a string to ProperCase (also called TitleCase)
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _
<CLSCompliant(False)> _
Public Class ScriptMain
    Inherits UserComponent

    ' Just one line of code foreach column
    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        Row.FirstName = StrConv(Row.FirstName, VbStrConv.ProperCase)
        Row.LastName = StrConv(Row.LastName, VbStrConv.ProperCase)
    End Sub

End Class

or use the official C# variant:
// C# code
// The C# variant of Proper Case (also title case)
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Globalization;     // Added
using System.Threading;         // Added

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // C# alternative for VB.Net StrConv uses culterinfo and threading
        // See: http://support.microsoft.com/kb/312890/
        CultureInfo cultureInfo = Thread.CurrentThread.CurrentCulture;
        TextInfo textInfo = cultureInfo.TextInfo;

        // The actual Proper Casing
        Row.FirstName = textInfo.ToTitleCase(Row.FirstName);
        Row.LastName = textInfo.ToTitleCase(Row.LastName);

        // Update: Or try this!
        // Row.FirstName = new System.Globalization.CultureInfo("en").TextInfo.ToTitleCase(Row.FirstName);
        // Row.LastName = new System.Globalization.CultureInfo("en").TextInfo.ToTitleCase(Row.LastName);
    }
}

NOTE: for some reason it doesn't work if all chars are uppercase, but it's also possible to use the VB.net VbStrConv.ProperCase in your C# code by adding a reference to Visual Basic.Net

// C# code
// Using the Visual Basic VbStrConv.ProperCase
// by adding a reference to Visual Basic.
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using Microsoft.VisualBasic;         // Added
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        Row.FirstName = Strings.StrConv(Row.FirstName, VbStrConv.ProperCase, 0);
        Row.LastName = Strings.StrConv(Row.LastName, VbStrConv.ProperCase, 0);
    }
}

Here is how you add the Visual Basic Reference.
Adding Visual Basic reference


















4) The result
I added a destination and two Data Viewers for testing purposes.
The Result

8 comments:

  1. Thanks for this. I found that the C# code didn't work, I had to go with the VB.NET code which solved the problem. It ran through when I used the C# code, but it didn't change my string to TitleCase. Great post though, Thanks.

    ReplyDelete
    Replies
    1. Which C# code did you use? The 'official' or the one calling a VB.net reference?

      Delete
    2. Maybe the problem is because ToTitleCase does not convert a string with all letters upper case. Convert first to lower case and after to title case.

      Delete
  2. Great post! It does exactly what I was looking for. Thanks!!

    ReplyDelete
  3. Ugh. Wish I had seen the comments before troubleshooting this problem. ToTitleCase does not work on All Caps.

    http://stackoverflow.com/questions/2697203/textinfo-totitlecase-does-not-work-as-expected-for-all-caps-strings

    ReplyDelete
    Replies
    1. Thx for the URL. I have added a note about this bug. You could add a reference to VB and use that method. See NOTE for more details.

      Delete
  4. It works. You saved my time. Thank you so much!!!

    ReplyDelete

Please use the SSIS MSDN forum for general SSIS questions that are not about this post. I'm a regular reader of that forum and will gladly answer those questions over there.

All comments are moderated manually to prevent spam.