Showing posts with label DOWNLOAD. Show all posts
Showing posts with label DOWNLOAD. Show all posts

Sunday, 1 May 2011

Download source file from website with SSIS

Case
I have to download my source file from a website. Can SSIS do that for me?

Solution
There are a few options:
1) Third party tools such as CozyRoc or BlueSSIS.
2) Script Task with WebClient
3) Script Task with HTTP Connection Manager

I will show you the second and third option. First the WebClient solution.


The WebClient solution:
1) Start situation
I have a Data Flow that reads a CSV file (Products.csv), does some adjustments and inserts the records into a database table. The CSV file is on my harddrive, but I want a fresh download each time I run the package.
Start situation

















2) Variables
Create a SSIS string variable named DownloadURL and fill it with a correct value.
Variable with the download URL









3) Script Task
Add a Script Task to your Control Flow and give it a suitable name. Connect it to your Data Flow Task.
The Script Task


















4) ReadOnly Variable
Add the DownloadURL variable as ReadOnly to the Script Task.
ReadOnly Variable



















5) The Script
Add the following C#/VB.net Script to your Script Task. It downloads the file and uses the Flat File Connectionstring to save the file.
// C# code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
using System.Net;           // Added

namespace ST_929b5207bd37455a882c35e500ab2950.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            try
            {
                // Logging start of download
                bool fireAgain = true;
                Dts.Events.FireInformation(0, "Download File", "Start downloading " + Dts.Variables["DownloadURL"].Value.ToString(), string.Empty, 0, ref fireAgain);

                // Create a webclient to download a file
                WebClient mySSISWebClient = new WebClient();

                // Download file and use the Flat File Connectionstring (D:\SourceFiles\Products.csv)
                // to save the file (and replace the existing file)
                mySSISWebClient.DownloadFile(Dts.Variables["DownloadURL"].Value.ToString(), Dts.Connections["myProductFile"].ConnectionString);

                // Logging end of download
                Dts.Events.FireInformation(0, "Download File", "Finished downloading " + Dts.Connections["myProductFile"].ConnectionString, string.Empty, 0, ref fireAgain);

                // Quit Script Task succesful
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            catch (Exception ex)
            {
                // Logging why download failed
                Dts.Events.FireError(0, "Download File", "Download failed: " + ex.Message, string.Empty, 0);

                // Quit Script Task unsuccesful
                Dts.TaskResult = (int)ScriptResults.Failure;
            }
        }
    }
}

' VB.net Code
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime
Imports System.Net      ' Added

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
 Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

 Enum ScriptResults
  Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
  Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
 End Enum
 
    Public Sub Main()
        Try
            ' Logging start of download
            Dim fireAgain As Boolean = True
            Dts.Events.FireInformation(0, "Download File", "Start downloading " + Dts.Variables("DownloadURL").Value.ToString(), String.Empty, 0, fireAgain)

            ' Create a webclient to download a file 
            Dim mySSISWebClient As WebClient = New WebClient()

            ' Download file and use the Flat File Connectionstring (D:\SourceFiles\Products.csv)                 
            ' to save the file (and replace the existing file)
            mySSISWebClient.DownloadFile(Dts.Variables("DownloadURL").Value.ToString(), Dts.Connections("myProductFile").ConnectionString)

            ' Logging end of download                 
            Dts.Events.FireInformation(0, "Download File", "Finished downloading " + Dts.Connections("myProductFile").ConnectionString, String.Empty, 0, fireAgain)

            ' Quit Script Task succesful
            Dts.TaskResult = ScriptResults.Success
        Catch ex As Exception
            ' Logging why download failed
            Dts.Events.FireError(0, "Download File", "Download failed: " + ex.Message, String.Empty, 0)

            ' Quit Script Task unsuccesful
            Dts.TaskResult = ScriptResults.Failure
        End Try
    End Sub
End Class
Note: This is a script with some basic eventlogging and error handling. See this article for more details.

6) Testing
Now run your package to test the result. You can use this CSV file to test it:
https://sites.google.com/site/ssisblogspot/products.csv

Note: See MSDN WebClient page for other properties of the WebClient Class such as Credentials and Proxy. If you don't want to add Credentials and Proxy settings in your Script Task the next solution might be better.





The HTTP Connection Manager solution:
This solution uses the HTTP Connection Manager that is standard available in SSIS. You can configure all Credentials and Proxy settings in this connection manager itself which makes is a little more easier to configure. You still have to do the download part with a Script Task.

1) Start situation
I have a Data Flow that reads a CSV file (Products.csv), does some adjustments and inserts the records into a database table. The CSV file is on my harddrive, but I want a fresh download each time I run the package.
Start situation

















2) HTTP Connection manager
You don't need the URL variable for this solution. Instead add a new connection of the type HTTP.
Right Click in the Connection Manager window and select..


















3) Configure HTTP Connection manager
Add the download URL in the Server URL box and optional set all Credentials and Proxy settings if you need them for your internet connection. I'm directly connected to the internet so I don't have to change any of the settings here.
HTTP Connection Manager

















4) Script Task
Add a Script Task to your Control Flow and give it a suitable name. Connect it to your Data Flow Task.
The Script Task


















5) The Script
Add the following C#/VN.Net Script to your Script Task. It downloads the file and uses the Flat File Connectionstring to save the file.
// C# code
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_6799d08685cb4ad78633d035fab12178.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            try
            {
                // Logging start of download
                bool fireAgain = true;
                Dts.Events.FireInformation(0, "Download File", "Start downloading " + Dts.Connections["HTTP Connection Manager"].ConnectionString, string.Empty, 0, ref fireAgain);

                // Get your newly added HTTP Connection Manager
                Object mySSISConnection = Dts.Connections["HTTP Connection Manager"].AcquireConnection(null);

                // Create a new connection
                HttpClientConnection myConnection = new HttpClientConnection(mySSISConnection);

                // Download file and use the Flat File Connectionstring (D:\SourceFiles\Products.csv)
                // to save the file (and replace the existing file)
                myConnection.DownloadFile(Dts.Connections["myProductFile"].ConnectionString, true);

                // Logging end of download
                Dts.Events.FireInformation(0, "Download File", "Finished downloading " + Dts.Connections["myProductFile"].ConnectionString, string.Empty, 0, ref fireAgain);

                // Quit Script Task succesful
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            catch (Exception ex)
            {
                // Logging why download failed
                Dts.Events.FireError(0, "Download File", "Download failed: " + ex.Message, string.Empty, 0);

                // Quit Script Task unsuccesful
                Dts.TaskResult = (int)ScriptResults.Failure;
            }
        }
    }
}

' VB.Net code
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
 Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

 Enum ScriptResults
  Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
  Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
 End Enum

    Public Sub Main()
        Try
            ' Logging start of download 
            Dim fireAgain As Boolean = True
            Dts.Events.FireInformation(0, "Download File", "Start downloading " + Dts.Connections("HTTP Connection Manager").ConnectionString, String.Empty, 0, fireAgain)

            ' Get your newly added HTTP Connection Manager
            Dim mySSISConnection As Object = Dts.Connections("HTTP Connection Manager").AcquireConnection(Nothing)
            ' Create a new connection                 
            Dim myConnection As HttpClientConnection = New HttpClientConnection(mySSISConnection)

            ' Download file and use the Flat File Connectionstring (D:\SourceFiles\Products.csv)
            ' to save the file (and replace the existing file)

            myConnection.DownloadFile(Dts.Connections("myProductFile").ConnectionString, True)
            ' Logging end of download                 
            Dts.Events.FireInformation(0, "Download File", "Finished downloading " + Dts.Connections("myProductFile").ConnectionString, String.Empty, 0, fireAgain)

            ' Quit Script Task succesful
            Dts.TaskResult = ScriptResults.Success
        Catch ex As Exception
            ' Logging why download failed                 
            Dts.Events.FireError(0, "Download File", "Download failed: " + ex.Message, String.Empty, 0)

            ' Quit Script Task unsuccesful
            Dts.TaskResult = ScriptResults.Failure
        End Try
    End Sub
End Class
Note: This is a script with some basic eventlogging and error handling. See this article for more details.

6) Testing
Now run your package to test the result. You can use this CSV file to test it:
https://sites.google.com/site/ssisblogspot/products.csv

Conclusion
Both methods (WebClient and HTTP Connection Manager) will have the same result, but the HTTP Connection Manager is probably a little easier to configure.


NOTE: This script example cannot do website logins or other manual actions!
Related Posts Plugin for WordPress, Blogger...