Saturday, 21 May 2011

Referencing a custom assembly inside a Script Task

Case
I created an assembly with some often used methods. How can I reference to this dll inside a SSIS Script Task so that I can use these methods?

Solution
This is possible, but requires some additional steps before you can use the methods. To start with, you will have to add the dll to the GAC. Only assemblies from the GAC can be referenced. For this example I will use the isCorrectEmail method from the regular expression article from some weeks ago. It's made with SSIS 2008 on Windows 7. Some of the steps could differ a little bit on other systems.

1) Class Library
Start Visual Studio and create a new Class Library project (C# in this example).
Class Library project



















2) The example method
Add the following script to the ScriptMain.cs file. I created a static class EmailMethods with one static method IsCorrectEmail. This method will validate the format of an email address and returns True or False.
// C# code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;       // Added

namespace myMethodsForSSIS
{
    // A static class with email methods
    public static class EmailMethods
    {
        // A boolean method that validates an email address     
        // with a regex pattern.     
        public static bool IsCorrectEmail(String emailAddress)
        {
            // The pattern for email         
            string emailAddressPattern = @"^(([^<>()[\]\\.,;:\s@\""]+"
                + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@"
                + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
                + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+"
                + @"[a-zA-Z]{2,}))$";
            // Create a regex object with the pattern
            Regex emailAddressRegex = new Regex(emailAddressPattern);
            // Check if it is match and return that value (boolean)
            return emailAddressRegex.IsMatch(emailAddress);
        }
    }
}
Now build (release) the project to check for errors.

3) Strong name
We have to strong name the assembly before we can use it. More about that in this Codeguru article: Giving a .NET Assembly a Strong Name. Open the Visual Studio 2008 Command Prompt (in Windows start menu). Browse to your project folder and execute the following command to create a key file: sn.exe -k PublicPrivateKeyFile.snk
Microsoft (R) .NET Framework Strong Name Utility
















4) Add key file to project
Back to Visual Studio. Right click the project file and add the newly created key file to the project. See image for more details.
Add existing item














5) AssemblyInfo
Now we have added the key file to the project, we also have to add it to the AssemblyInfo.cs file. Which can be found in your Solution Explorer in the folder Properties. Add the following rows and build the project:
[assembly: AssemblyDelaySign(false)]
[assembly: AssemblyKeyFile("PublicPrivateKeyFile.snk")]











6) Add the assembly to GAC
Before we can use our assembly, we have to add it to the global assembly cache (GAC). Open the Visual Studio 2008 Command Prompt again (for Vista/Windows7/etc. open it as Administrator). And execute the following command.
gacutil /i d:\myMethodsForSSIS\myMethodsForSSIS\bin\Release\myMethodsForSSIS.dll
Microsoft (R) .NET Global Assembly Cache Utility














7) The SSIS project
Start Visual Studio and create a SSIS project. If Visual Studio is still open, close it first because it caches the GAC on start.


8) Variable
Open the default package and create a String variable named email and fill it with a correct email address.
String variable named email











9) Script Task
Add a Script Task to the Control Flow and edit it. Add the variable from the previous step as a ReadOnly variable.
ReadOnlyVariables




















10) Add reference
Edit the script and add a reference (menu project). Browse to the release folder of your assembly and select the dll file. See picture for more details. (Don't forgot to save afterwards. Click Save All to save adding the reference!)
Add reference to assembly
















11) The script
Add the Using and the Messagebox to the script.
// C# code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using myMethodsForSSIS;                 // Added

namespace ST_28f906d94d9745d6a375b42d428da87d.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            // Messagebox True or False
            // indicating that the email
            // format is correct or not
            MessageBox.Show(EmailMethods.IsCorrectEmail(Dts.Variables["email"].Value.ToString()).ToString());

            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}


12) The result
Now execute the Script Task and see what happens. Add for example an extra @ to the email address and see what happens if you execute it again.

Note 1: this can also be used for Script Components in exacty the same way.

Note 2: Instead of using a reference to the project folder, some people prefer to copy the dll to C:\WINDOWS\Microsoft.NET\Framework\<version>  and reference to that file. However for SSIS 2005 the dll should be in this folder: c:\Program Files\Microsoft SQL Server\90\SDK\Assemblies\

Monday, 2 May 2011

Wait until file is unlocked

Case
I often get errors that my source file is still locked by an other process. Is there a solution to wait until the file is released?

Solution
You could make a private copy of the source file with the File System Task but you can also wait for the unlock. Here are a couple of solutions:
1) Script Task
Add a Script Task to the Control Flow and connect it to your Data Flow Task.
Script Task

















2) The script
Copy the following script to the Script Task. If you use 2005 then the script result is a little different. Replace the my script result line with the original line from your main method.
// C# code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.IO;                        // Added to check file existance
using System.Threading;                 // Added for delay

namespace ST_89ab5f10e1de490aa762819b8221ee0a.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            // Variable to store the file path in.
            string myConnectionString = "";

            // For flat files like csv and txt you can use the whole connectionstring
            // ======================================================================
            string connectionString = Dts.Connections["myCsvFile"].ConnectionString;


            // For Excel connection you only need a part of the connectionstring:
            // ======================================================================
            // Provider=Microsoft.Jet.OLEDB.4.0;Data Source=D:\MyExcelFile.xls;Extended Properties="Excel 8.0;HDR=YES";
            // Provider=Microsoft.ACE.OLEDB.12.0;Data Source=D:\MyExcelFile.xlsx;Extended Properties="Excel 12.0 XML;HDR=YES";
            // ======================================================================
            // You only want the part after 'Source=' until the next semicolon (;)
            // myConnectionString = Dts.Connections["myExcelFile"].ConnectionString.Substring(Dts.Connections["myExcelFile"].ConnectionString.IndexOf("Source=") + 6);
            // myConnectionString = myConnectionString.Substring(1, myConnectionString.IndexOf(";") - 1);

            // Check if the file exists before checking if it can be opened
            if (File.Exists(myConnectionString))
            {
                Boolean fireAgain = false;
                Dts.Events.FireInformation(0, "File Lock Check", "File exists, now checking if it can be opened", string.Empty, 0, ref fireAgain);

                // Boolean variable to prevent endless lock warnings
                Boolean ShowLockWarning = true;

                // Boolean variable needed for the while loop
                Boolean FileLocked = true;
                while (FileLocked)
                {
                    try
                    {
                        // Check if the file isn't locked by an other process by opening 
                        // the file. If it succeeds, set variable to false and close stream                     
                        FileStream fs = new FileStream(myConnectionString, FileMode.Open);

                        // No error so it is not locked
                        Dts.Events.FireInformation(0, "File Lock Check", "File not locked", string.Empty, 0, ref fireAgain);
                        FileLocked = false;

                        // Close the file and exit the Script Task
                        fs.Close();
                        Dts.TaskResult = (int)ScriptResults.Success;
                    }
                    catch (IOException ex)
                    {
                        // If opening fails, it's probably locked by an other process. This is the exact message:
                        // System.IO.IOException: The process cannot access the file 'D:\example.csv' because it is being used by another process. 

                        // Log locked status (once)
                        if (ShowLockWarning)
                        {
                            Dts.Events.FireWarning(0, "File Lock Check", "File locked: " + ex.Message, string.Empty, 0);
                        }
                        ShowLockWarning = false;

                        // Wait two seconds before rechecking
                        Thread.Sleep(2000);
                    }
                    catch (Exception ex)
                    {
                        // Catch other unexpected errors and break the while loop
                        Dts.Events.FireError(0, "File Lock Check", "Unexpected error: " + ex.Message, string.Empty, 0);
                        Dts.TaskResult = (int)ScriptResults.Failure;
                        break;
                    }
                }
            }
            else
            {
                // File doesn't exist, so no checking possible.
                Dts.Events.FireError(0, "File Lock Check", "File does not exist: " + myConnectionString, string.Empty, 0);
                Dts.TaskResult = (int)ScriptResults.Failure;
            }
        }
    }
}


or VB.Net code

'VB.Net code
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.IO           ' Added to check file existance
Imports System.Threading    ' Added for delay

<Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute()> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
    Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

    Public Sub Main()
        ' Variable to store the file path in.
        Dim myConnectionString As String = ""

        ' For flat files like csv and txt you can use the whole connectionstring
        ' ======================================================================
        myConnectionString = Dts.Connections("myCsvFile").ConnectionString

        ' For Excel connection you only need a part of the connectionstring:
        ' ======================================================================
        ' Provider=Microsoft.Jet.OLEDB.4.0;Data Source=D:\MyExcelFile.xls;Extended Properties="Excel 8.0;HDR=YES";
        ' Provider=Microsoft.ACE.OLEDB.12.0;Data Source=D:\MyExcelFile.xlsx;Extended Properties="Excel 12.0 XML;HDR=YES";
        ' ======================================================================
        ' You only want the part after 'Source=' until the next semicolon (;)
        ' myConnectionString = Dts.Connections("myExcelFile").ConnectionString.Substring(Dts.Connections("myExcelFile").ConnectionString.IndexOf("Source=") + 6)
        ' myConnectionString = myConnectionString.Substring(1, myConnectionString.IndexOf(";") - 1)

        ' Check if the file exists before checking if it can be opened
        If File.Exists(myConnectionString) Then
            Dim fireAgain As [Boolean] = False
            Dts.Events.FireInformation(0, "File Lock Check", "File exists, now checking if it can be opened", String.Empty, 0, fireAgain)

            ' Boolean variable to prevent endless lock warnings
            Dim ShowLockWarning As [Boolean] = True

            ' Boolean variable needed for the while loop
            Dim FileLocked As [Boolean] = True
            While FileLocked
                Try
                    ' Check if the file isn't locked by an other process by opening 
                    ' the file. If it succeeds, set variable to false and close stream                     
                    Dim fs As New FileStream(myConnectionString, FileMode.Open)

                    ' No error so it is not locked
                    Dts.Events.FireInformation(0, "File Lock Check", "File not locked", String.Empty, 0, fireAgain)
                    FileLocked = False

                    ' Close the file and exit the Script Task
                    fs.Close()
                    Dts.TaskResult = ScriptResults.Success
                Catch ex As IOException
                    ' If opening fails, it's probably locked by an other process. This is the exact message:
                    ' System.IO.IOException: The process cannot access the file 'D:\example.csv' because it is being used by another process. 

                    ' Log locked status (once)
                    If ShowLockWarning Then
                        Dts.Events.FireWarning(0, "File Lock Check", "File locked: " & ex.Message, String.Empty, 0)
                    End If
                    ShowLockWarning = False

                    ' Wait two seconds before rechecking
                    Thread.Sleep(2000)
                Catch ex As Exception
                    ' Catch other unexpected errors and break the while loop
                    Dts.Events.FireError(0, "File Lock Check", "Unexpected error: " & ex.Message, String.Empty, 0)
                    Dts.TaskResult = ScriptResults.Failure
                    Exit Try
                End Try
            End While
        Else
            ' File doesn't exist, so no checking possible.
            Dts.Events.FireError(0, "File Lock Check", "File does not exist: " & myConnectionString, String.Empty, 0)
            Dts.TaskResult = ScriptResults.Failure
        End If
    End Sub

    Enum ScriptResults
        Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
        Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
    End Enum

End Class
Note: you can check both Excel files and flat files like Csv and fixed. Comment out the Flat file or Excel part of the Script.

3) The result
Now you can test your package by opening the source file in for example Excel and see the result:
The test result














Note: you could add some counter or time compare mechanism to accomplish a max number of checks or a max check time.This will prevent endless waiting.

Sunday, 1 May 2011

Download source file from website with SSIS

Case
I have to download my source file from a website. Can SSIS do that for me?

Solution
There are a few options:
1) Third party tools such as CozyRoc or BlueSSIS.
2) Script Task with WebClient
3) Script Task with HTTP Connection Manager

I will show you the second and third option. First the WebClient solution.


The WebClient solution:
1) Start situation
I have a Data Flow that reads a CSV file (Products.csv), does some adjustments and inserts the records into a database table. The CSV file is on my harddrive, but I want a fresh download each time I run the package.
Start situation

















2) Variables
Create a SSIS string variable named DownloadURL and fill it with a correct value.
Variable with the download URL









3) Script Task
Add a Script Task to your Control Flow and give it a suitable name. Connect it to your Data Flow Task.
The Script Task


















4) ReadOnly Variable
Add the DownloadURL variable as ReadOnly to the Script Task.
ReadOnly Variable



















5) The Script
Add the following C#/VB.net Script to your Script Task. It downloads the file and uses the Flat File Connectionstring to save the file.
// C# code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Net;           // Added

namespace ST_929b5207bd37455a882c35e500ab2950.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            try
            {
                // Logging start of download
                bool fireAgain = true;
                Dts.Events.FireInformation(0, "Download File", "Start downloading " + Dts.Variables["DownloadURL"].Value.ToString(), string.Empty, 0, ref fireAgain);

                // Create a webclient to download a file
                WebClient mySSISWebClient = new WebClient();

                // Download file and use the Flat File Connectionstring (D:\SourceFiles\Products.csv)
                // to save the file (and replace the existing file)
                mySSISWebClient.DownloadFile(Dts.Variables["DownloadURL"].Value.ToString(), Dts.Connections["myProductFile"].ConnectionString);

                // Logging end of download
                Dts.Events.FireInformation(0, "Download File", "Finished downloading " + Dts.Connections["myProductFile"].ConnectionString, string.Empty, 0, ref fireAgain);

                // Quit Script Task succesful
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            catch (Exception ex)
            {
                // Logging why download failed
                Dts.Events.FireError(0, "Download File", "Download failed: " + ex.Message, string.Empty, 0);

                // Quit Script Task unsuccesful
                Dts.TaskResult = (int)ScriptResults.Failure;
            }
        }
    }
}

' VB.net Code
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Net      ' Added

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
 Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

 Enum ScriptResults
  Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
  Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
 End Enum
 
    Public Sub Main()
        Try
            ' Logging start of download
            Dim fireAgain As Boolean = True
            Dts.Events.FireInformation(0, "Download File", "Start downloading " + Dts.Variables("DownloadURL").Value.ToString(), String.Empty, 0, fireAgain)

            ' Create a webclient to download a file 
            Dim mySSISWebClient As WebClient = New WebClient()

            ' Download file and use the Flat File Connectionstring (D:\SourceFiles\Products.csv)                 
            ' to save the file (and replace the existing file)
            mySSISWebClient.DownloadFile(Dts.Variables("DownloadURL").Value.ToString(), Dts.Connections("myProductFile").ConnectionString)

            ' Logging end of download                 
            Dts.Events.FireInformation(0, "Download File", "Finished downloading " + Dts.Connections("myProductFile").ConnectionString, String.Empty, 0, fireAgain)

            ' Quit Script Task succesful
            Dts.TaskResult = ScriptResults.Success
        Catch ex As Exception
            ' Logging why download failed
            Dts.Events.FireError(0, "Download File", "Download failed: " + ex.Message, String.Empty, 0)

            ' Quit Script Task unsuccesful
            Dts.TaskResult = ScriptResults.Failure
        End Try
    End Sub
End Class
Note: This is a script with some basic eventlogging and error handling. See this article for more details.

6) Testing
Now run your package to test the result. You can use this CSV file to test it:
https://sites.google.com/site/ssisblogspot/products.csv

Note: See MSDN WebClient page for other properties of the WebClient Class such as Credentials and Proxy. If you don't want to add Credentials and Proxy settings in your Script Task the next solution might be better.





The HTTP Connection Manager solution:
This solution uses the HTTP Connection Manager that is standard available in SSIS. You can configure all Credentials and Proxy settings in this connection manager itself which makes is a little more easier to configure. You still have to do the download part with a Script Task.

1) Start situation
I have a Data Flow that reads a CSV file (Products.csv), does some adjustments and inserts the records into a database table. The CSV file is on my harddrive, but I want a fresh download each time I run the package.
Start situation

















2) HTTP Connection manager
You don't need the URL variable for this solution. Instead add a new connection of the type HTTP.
Right Click in the Connection Manager window and select..


















3) Configure HTTP Connection manager
Add the download URL in the Server URL box and optional set all Credentials and Proxy settings if you need them for your internet connection. I'm directly connected to the internet so I don't have to change any of the settings here.
HTTP Connection Manager

















4) Script Task
Add a Script Task to your Control Flow and give it a suitable name. Connect it to your Data Flow Task.
The Script Task


















5) The Script
Add the following C#/VN.Net Script to your Script Task. It downloads the file and uses the Flat File Connectionstring to save the file.
// C# code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_6799d08685cb4ad78633d035fab12178.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            try
            {
                // Logging start of download
                bool fireAgain = true;
                Dts.Events.FireInformation(0, "Download File", "Start downloading " + Dts.Connections["HTTP Connection Manager"].ConnectionString, string.Empty, 0, ref fireAgain);

                // Get your newly added HTTP Connection Manager
                Object mySSISConnection = Dts.Connections["HTTP Connection Manager"].AcquireConnection(null);

                // Create a new connection
                HttpClientConnection myConnection = new HttpClientConnection(mySSISConnection);

                // Download file and use the Flat File Connectionstring (D:\SourceFiles\Products.csv)
                // to save the file (and replace the existing file)
                myConnection.DownloadFile(Dts.Connections["myProductFile"].ConnectionString, true);

                // Logging end of download
                Dts.Events.FireInformation(0, "Download File", "Finished downloading " + Dts.Connections["myProductFile"].ConnectionString, string.Empty, 0, ref fireAgain);

                // Quit Script Task succesful
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            catch (Exception ex)
            {
                // Logging why download failed
                Dts.Events.FireError(0, "Download File", "Download failed: " + ex.Message, string.Empty, 0);

                // Quit Script Task unsuccesful
                Dts.TaskResult = (int)ScriptResults.Failure;
            }
        }
    }
}

' VB.Net code
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
 Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

 Enum ScriptResults
  Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
  Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
 End Enum

    Public Sub Main()
        Try
            ' Logging start of download 
            Dim fireAgain As Boolean = True
            Dts.Events.FireInformation(0, "Download File", "Start downloading " + Dts.Connections("HTTP Connection Manager").ConnectionString, String.Empty, 0, fireAgain)

            ' Get your newly added HTTP Connection Manager
            Dim mySSISConnection As Object = Dts.Connections("HTTP Connection Manager").AcquireConnection(Nothing)
            ' Create a new connection                 
            Dim myConnection As HttpClientConnection = New HttpClientConnection(mySSISConnection)

            ' Download file and use the Flat File Connectionstring (D:\SourceFiles\Products.csv)
            ' to save the file (and replace the existing file)

            myConnection.DownloadFile(Dts.Connections("myProductFile").ConnectionString, True)
            ' Logging end of download                 
            Dts.Events.FireInformation(0, "Download File", "Finished downloading " + Dts.Connections("myProductFile").ConnectionString, String.Empty, 0, fireAgain)

            ' Quit Script Task succesful
            Dts.TaskResult = ScriptResults.Success
        Catch ex As Exception
            ' Logging why download failed                 
            Dts.Events.FireError(0, "Download File", "Download failed: " + ex.Message, String.Empty, 0)

            ' Quit Script Task unsuccesful
            Dts.TaskResult = ScriptResults.Failure
        End Try
    End Sub
End Class
Note: This is a script with some basic eventlogging and error handling. See this article for more details.

6) Testing
Now run your package to test the result. You can use this CSV file to test it:
https://sites.google.com/site/ssisblogspot/products.csv

Conclusion
Both methods (WebClient and HTTP Connection Manager) will have the same result, but the HTTP Connection Manager is probably a little easier to configure.


NOTE: This script example cannot do website logins or other manual actions!