Tuesday, 17 April 2012

Custom SSIS Component: Foreach File Enumerator with regex

In some cases the standard wildcard filter in the Foreach File Enumerator isn't flexible enough. Try to get all files ending with a number or all files starting with an 'a' or 'b'. A filter based on a regular expression gives more freedom. See/vote this request at Microsoft Connect.

My .Net colleague helped me to create a File Enumerator where I replaced the standard wildcard filter by a regular expression filter. And we also added a sorting possibility.
Regular Expression Support























This Regex File Enumerator also alows you to select multiple extensions like *.xls + *.csv:
^.*\.(xls|csv)$

Disclaimer
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Downloads:
You can download a SSIS 2008 version on the download page. All my custom 2012 enumerators have an optional regex filter included.

Installation
The installer registers the DLL in the GAC and copies it to the ForEachEnumerators folder of SSIS (example: C:\Program Files\Microsoft SQL Server\100\DTS\ForEachEnumerators\). After that you have to close Visual Studio/BIDS because it caches the GAC on startup.

SSIS
The enumerator is automatically added to the Foreach Loop when you open a SSIS project.
The new enumerator




























Bugs or feature suggestions
Please contact me or leave a comment if you have any suggestions or found a bug in this Custom Enumerator.

Sunday, 15 April 2012

Regex filter for Foreach Loop

Case
The standard Foreach Loop File Enumerator only has a wildcard filter, but in some cases a regular expression filter would be more useful. See/vote this request at Microsoft Connect.

Solution
At the moment I'm working on a custom File Enumerator with regular expression support, but for those who don't want to wait or don't want to use custom components... You could achieve the same result with a Script Task.

*update: Now also available as Custom Foreach Enumerator. *

1) Foreach Loop
For this case I will use a standard Foreach File Enumerator that fills a variable FilePath. The filter is *.* so all files will be returned.
Standard foreach loop


















2) Variables
I will use two extra variables in this example: RegexFilter (string) for storing the regular expression and PassesRegexFilter (boolean) for indicating whether the filename passes the regular expression filter.
Variables













3) Script Task
Add a Script Task in front of the Data Flow Task and give it a suitable name.
Script Task






















4) Add variables
Edit the Script Task and add the FilePath and RegexFilter as ReadOnlyVariables and the PassesRegexFilter as ReadWriteVariable.
Variables






















5) The Script
Copy the following script to the Script Task.
// C# Code for filtering filenames with Regex
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Text.RegularExpressions;   // Added
using System.IO;                        // Added

namespace ST_02b6595da2274d7182409fb43af929ae.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {
        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            // Get the filename from the complete filepath
            String FileName = Path.GetFileName(Dts.Variables["User::FilePath"].Value.ToString());

            // Create a regex object with the pattern from the SSIS variable
            Regex FileFilterRegex = new Regex(Dts.Variables["User::RegexFilter"].Value.ToString());
            
            // Check if it is match and return that value (boolean) to the SSIS variable
            Dts.Variables["User::PassesRegexFilter"].Value = FileFilterRegex.IsMatch(FileName); 
            
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}
or with VB.Net
' VB.Net Code for filtering filenames with Regex
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Text.RegularExpressions   ' Added
Imports System.IO                        ' Added

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
 Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

 Enum ScriptResults
  Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
  Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
 End Enum
 
    Public Sub Main()
        ' Get the filename from the complete filepath
        Dim FileName As String = Path.GetFileName(Dts.Variables("User::FilePath").Value.ToString())

        ' Create a regex object with the pattern from the SSIS variable
        Dim FileFilterRegex As Regex = New Regex(Dts.Variables("User::RegexFilter").Value.ToString())

        ' Check if it is match and return that value (boolean) to the SSIS variable
        Dts.Variables("User::PassesRegexFilter").Value = FileFilterRegex.IsMatch(FileName)

        Dts.TaskResult = ScriptResults.Success
    End Sub

End Class


6) Precedence Constraints Expression
Connect the Script Task to the Data Flow Task and add an expression that does the real filtering:
@[User::PassesRegexFilter] == true
Precedence Constraint with expression





















7) The result
For testing the result, I added an other Script Task below the Data Flow Task that logs all files. My regular expression in this test case: .*[0-9]\.csv$    (all csv files that end with a number in the filename)
The result























Note 1: to gain some performance I should have changed the Foreach Loop File Enumerator wildcard from *.* to *.csv

Note 2: this method could be a bit inefficient if your file collection contains thousands of files and you only need two of them.

Thursday, 12 April 2012

Custom SSIS Component: Checksum Transformation

We have created a new, free of charge, custom component: The Checksum Transformation.
Checksum Editor













This component calculates hash values for one or more columns using a variety of methods like MD5, RIPEMD160, SHA1, SHA256, SHA384 and SHA512. You can either drag and drop or double click the columns to use them as input columns. The salt can be provided as tekst or by a SSIS variable. You can also add this component via BIML.
Values are now hashed




















Disclaimer
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Downloads:
You can download a SSIS 2008 and 2012 version on the download page.

Version 1.1: Added support for GUID columns
Version 1.2: Added support for NTEXT & TEXT colums.
Version 1.3beta*: Added column separator option to prevent column mix ups: "aa" and "aa" was the same as "aaa" and "a".
* release note: auto upgrade is not working yet. If you have a package with an existing checksum transformation with version 1.2 or lower then you first have to remove it and then add it again. Will be solved in final version.

Installation
The installer registers the DLL in the GAC and copies it to the component folder of SSIS (example: C:\Program Files\Microsoft SQL Server\100\DTS\PipelineComponents\). After that you have to close Visual Studio/BIDS because it caches the GAC on startup. (Restart not required for SSDT)

How add the transformation to the toolbox (2008 only)
When you open a SSIS project in Visual Studio/Bids, right click the toolbox and choose "Choose Items...". After this Visual Studio will be busy for a couple of seconds!
Right click toolbox






















When the Choose Toolbox Items window appears, go to the SSIS Data Flow Items and search for the newly installed  Checksum component and select it. Click ok to finish.
Choose Toolbox Items




















Now the new component will appear in the toolbox. Ready to use! Have fun.

New component added



























Bugs or feature suggestions
Please contact me or leave a comment if you have any suggestions or found a bug in this Custom component.

Saturday, 7 April 2012

SSIS 2008 with Team Foundation Server 2010

Case
A lot of my clients are still working with SSIS 2008 (R2), but already have Team Foundation Server 2010 for source control. How can you connect to TFS 2010 with BIDS/Visual Studio 2008?

Solution
Although this blog is about SSIS, this solution works for all kinds of Visual Studio 2008 projects.

1) Install Team Explorer
First you need to install "Visual Studio Team System 2008 Team Explorer" which is a simplified Visual Studio Team System 2008 environment used solely to access Team Foundation Server services. You can download it here.
Next, Next, Finish























2) Install VS 2008 SP1
Secondly install "Microsoft Visual Studio 2008 Service Pack 1" (even if it was already installed before step 1). You can download it here.
Next, Next, Finish
























3) Install Team System 2008 SP1
Thirdly install "Visual Studio Team System 2008 Service Pack 1 Forward Compatibility Update for Team Foundation Server 2010", which you can download here. This Service Pack let's you connect to TFS 2010.
Next, Next, Finish
























4) Connect to Team Foundation Server in BIDS.
a) In the Tools menu click "Connect to Team Foundation Server"
Connect to Team Foundation Server



















b) In the next screen click on "Servers..."
Connect to Team Foundation Server, Servers...



















c) And in the next screen click on "Add..."
Connect to Team Foundation Server, Servers..., Add...

















d) If you connect over http then your url should look something like this (don't use port number and protocol):
Connect to Team Foundation Server over http















And if you connect over https then you url should look something like this (don't use port number and protocol):
Connect to Team Foundation Server over https














Note: If you get this error "TF30335: The server name cannot contain the characters '/' or ':', or start with 'http://' or 'https://'. If the server name is an IPv6 address, it can contain the character ':' only if the full name is enclosed by square brackets.", while trying to connect to Team Foundation Server 2010 then you probably forgot the third step of this post.
Team Foundation Server 2010
















e) Click "OK" to add the server. Depending on the environment/tfs settings you now have to enter your username and password. After this step your tfs server will be added to Visual Studio/BIDS. Select it and click "Close"
Connect to Team Foundation Server, Servers..., Close


















f) Now you can select one or more projects.
Available TFS projects




















g) And they are also available via the Team Explorer / Source Control Explorer so you can open an existing SSIS project.
Team Explorer / Source Control Explorer















Or you can add a new SSIS project to source control.















Note: Disable multiple check-out!