I want to loop through all files in a FTP folder and process them. I need a Foreach ftp file enumerator, but the Foreach Loop component only loops through local files.
Solution
There are a couple of options. You could use a Script Task, a wildcard in the RemotePath Property of the FTP Task or a Third Party component to download all files to a certain folder and loop through those files with a Foreach Loop file enumerator. This is the most common approach.
This example uses a Script Task to get a list of ftp files and uses a Foreach From Variable Enumerator to loop through those files.
Update: There is now a custom Foreach FTP File Enumerator available.
1) Variables
We need a couple of variables to keep things flexible:
- FtpWorkingDirectory: String variable with the remote folder path. Example: /root/data/
- DownloadDirectory: String variable that contains the path of the download folder: Example: d:\SourceFiles\
- FtpFileURL: String variable used in the foreach loop to store the remote filename in. Example: 01-01-2011.csv
- FtpFileList: Object variable to store the remote file list in.
Variables to keep things flexible |
2) Ftp Connection Manager
Create a FTP Connection Manager by right clicking in the Connection Manager pane and select New Connection... In the new window select FTP and click Add. After that fill in the FTP Connection Manager Editor. Test the connection manager and rename it to myFtpServer (this name is used in the Script Task).
New FTP connection |
After closing you can rename it. |
3) Script Task
Add a Script Task to the Control Flow and give it a suitable name. Edit the Script Task and add variable FtpWorkingDirectory as ReadOnly and FtpFileList as ReadWrite.
Script Task to get list of files |
4) The Script
Edit the script and copy the following code.
// #C Code using System; using System.Collections; // Added using System.Data; using System.Net; using Microsoft.SqlServer.Dts.Runtime; using System.Windows.Forms; namespace ST_f20cc76ba3bc47849f70c6cbfd4701dc.csproj { [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")] public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase { #region VSTA generated code enum ScriptResults { Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success, Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure }; #endregion public void Main() { try { // Get the ftp connection from the Connection Managers collection ConnectionManager ftpServer = Dts.Connections["myFtpServer"]; // Create a FTP connection object and us the credentials of connection manager FtpClientConnection myFtpConnection = new FtpClientConnection(ftpServer.AcquireConnection(null)); // Open the connection myFtpConnection.Connect(); // Set work folder with the value of the variable myFtpConnection.SetWorkingDirectory(Dts.Variables["FtpWorkingDirectory"].Value.ToString()); // Create StringArrays for filenames and folders // The folderNames aren't used, but is mandatory // for the next method. String[] fileNames; String[] folderNames; // Get a directory listing and fill the StringArray variables myFtpConnection.GetListing(out folderNames, out fileNames); // Copy StringArray to ArrayList to fit in Object variable ArrayList fileNamesArray = new ArrayList(fileNames); // Optional sorter fileNamesArray.Sort(); // Fill ssis object variable Dts.Variables["FtpFileList"].Value = fileNamesArray; // Close connection myFtpConnection.Close(); // Close Script Task, set result to success Dts.TaskResult = (int)ScriptResults.Success; } catch (Exception ex) { // Fire error and set result to failure Dts.Events.FireError(0, "FTP Script Task", "Error: " + ex.Message, string.Empty, 0); Dts.TaskResult = (int)ScriptResults.Failure; } } } }
Or VB.Net
' VB.Net code Imports System Imports System.Collections Imports System.Data Imports System.Math Imports Microsoft.SqlServer.Dts.Runtime <System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _ <System.CLSCompliantAttribute(False)> _ Partial Public Class ScriptMain Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase Enum ScriptResults Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure End Enum Public Sub Main() Try ' Get the ftp connection from the Connection Managers collection Dim ftpServer As ConnectionManager = Dts.Connections("myFtpServer") ' Create a FTP connection object and us the credentials of connection manager Dim myFtpConnection As FtpClientConnection = New FtpClientConnection(ftpServer.AcquireConnection(Nothing)) ' Open the connection myFtpConnection.Connect() ' Set work folder with the value of the variable myFtpConnection.SetWorkingDirectory(Dts.Variables("FtpWorkingDirectory").Value.ToString()) ' Create StringArrays for filenames and folders ' The folderNames aren't used, but is mandatory ' for the next method. Dim fileNames() As String Dim folderNames() As String ' Get a directory listing and fill the StringArray variables myFtpConnection.GetListing(folderNames, fileNames) ' Copy StringArray to ArrayList to fit in Object variable Dim fileNamesArray As ArrayList = New ArrayList(fileNames) ' Optional sorter fileNamesArray.Sort() ' Fill ssis object variable Dts.Variables("FtpFileList").Value = fileNamesArray ' Close connection myFtpConnection.Close() ' Close Script Task, set result to success Dts.TaskResult = ScriptResults.Success Catch ex As Exception ' Fire error and set result to failure Dts.Events.FireError(0, "FTP Script Task", "Error: " + ex.Message, String.Empty, 0) Dts.TaskResult = ScriptResults.Failure End Try End Sub End Class
5) Foreach Loop
Add a Foreach Loop after the Script Task. Edit the Foreach Loop and on the Collection tab select Foreach From Variable Enumerator as enumerator and select the object variable FtpFileList as the variable.
Foreach From Variable Enumerator |
6) Foreach Loop Variable Mappings
On the Variable Mappings tab of the Foreach Loop select the variable FtpFileURL.
Variable Mapping |
7) FTP Task General
Add a FTP task within the Foreach Loop. Edit the FTP Task and on the General tab you can select the FTP Connection which we created in step 2. And you can give it a suitable name.
General: select the ftp connection |
8) FTP Task File Transfer
On the File Transfer tab you can select the DownloadDirectory variable for the local folder. In the Operation category you should select Receive files as the operation. For RemotePath you can enter a dummy value because we are overriding that value with an expression in the next step.
File Transfer |
9) FTP Task Expressions
At the Expressions tab you must add an expression for the RemotePath: @[User::FtpWorkingDirectory] + @[User::FtpFileURL] This expression combines the working folder (/root/data/) and the filename (01-01-2011.csv) into: /root/data/01-01-2011.csv.
Click at 2 to open the Expression Editor |
10) Process file in Data Flow
Add a Data Flow Task within the Foreach Loop to process the downloaded files.
Process files |
10) Process file in Connection Manager
Create a Data Flow like the example below. I named the connection for the csv file myDownloadedFtpFile. Go to the properties of that connection and add an expression for the ConnectionString so we can use it in the Foreach Loop: @[User::DownloadDirectory] + @[User::FtpFileURL]
A simple data flow |
Expression to override the ConnectionString property |
11) The result
Run the package to see the result.
The result |
Download example package
Note: In the Script Task you can use the .Net variable folderNames to create some recursive function to loop through all subfolders as well.
Note: If you just want to download all files with a Script Task and use a Foreach Loop file enumerator to process all files, you can replace the C# lines 48 - 55 or the VB.Net lines 41 - 48 with the following code:
// C# Code // Download all files at once. Don't forget to add the SSIS variable DownloadDirectory to the ReadOnlyVariables myFtpConnection.ReceiveFiles(fileNames, Dts.Variables["DownloadDirectory"].Value.ToString(), true, false);
' VB.Net code ' Download all files at once. Don't forget to add the SSIS variable DownloadDirectory to the ReadOnlyVariables myFtpConnection.ReceiveFiles(fileNames, Dts.Variables("DownloadDirectory").Value.ToString(), true, false)
Hi,
ReplyDeleteCan you please setup this package for download? Thanks for replying to my question in the forums.
@PeaceOut: I have added the example package.
ReplyDeleteGrate post. For some reason I had to set UsePassiveMode property to true on the FTP connection manager.
ReplyDelete@AikoAiko: An FTP connection manager can send and receive files using either active mode or passive mode. In active mode, the server initiates the data connection, and in passive mode, the client initiates the data connection.
ReplyDeleteHi,Great post! Quick question, I have recreated the package but nothing is being passed into the data flow tasks to be written to the database table. Any tips on what i have messed up? Hoping to be able to files the files downloaded from the FTP site into the database table.
ReplyDelete@RDH: First check if the Script is working with a debug or by adding a MessageBox.Show in the script to check wether the variable is getting filled.
ReplyDeleteSomething like MessageBox.Show(fileNamesArray.Length().ToString()) or MessageBox.Show(fileNames.ToString())
If it's filled then something is wrong with you Foreach Loop Container (Variable mapping or Expression). If it's not filled then something is probably wrong with your FTP Connection manager (or the script).
There is also an example package that you could try. Let me know your observations...
Great post, very helpful. I have one suggestion to improve it. The package will fail on the Get File List of FTP Site Script Task if there are no files to FTP. To avoid this I've modified the script to pass an empty list if there are no files present. With this modification it will still make it to the Foreach Loop Container and obviously won't do anything since the list is empty. The difference being it finishes with Success.
ReplyDeleteThe changes to the script is as follows:
...
// Get a directory listing and fill the StringArray variables
myFtpConnection.GetListing(out folderNames, out fileNames);
ArrayList fileNamesArray = new ArrayList();
if (fileNames != null)
{
// Copy StringArray to ArrayList to fit in Object variable
fileNamesArray = new ArrayList(fileNames);
// Optional sorter
fileNamesArray.Sort();
}
// Fill ssis object variable
Dts.Variables["FtpFileList"].Value = fileNamesArray;
...
@Anonymous: great addition
ReplyDeleteJoost, could you please tell me where should i insert the "anonymous" code in above script...
ReplyDeletei am using VB code from above and please compile for errors...as i am not expert in .Net
Thanks in Advance!!!!!!
@Anonymous: are you talking about anonymous access? The script uses the ftp connection manager to contact the ftp server. So access should be handled in the connection manager. See this msdn page for more details about the ftpconnection in the script. You can 'overrule' the connection manager part and use your own credentials.
Deletei am talking about the following script...
ReplyDeleteplease post it in VB.net and tell me in where should i insert below code in main script...please compile for errors...i am not .net Pro
// Get a directory listing and fill the StringArray variables
myFtpConnection.GetListing(out folderNames, out fileNames);
ArrayList fileNamesArray = new ArrayList();
if (fileNames != null)
{
// Copy StringArray to ArrayList to fit in Object variable
fileNamesArray = new ArrayList(fileNames);
// Optional sorter
fileNamesArray.Sort();
}
// Fill ssis object variable
Dts.Variables["FtpFileList"].Value = fileNamesArray;
try something like this:
Delete' Get a directory listing and fill the StringArray variables
myFtpConnection.GetListing(folderNames, fileNames)
Dim fileNamesArray As ArrayList
If Not (fileNames Is Nothing) Then
' Copy StringArray to ArrayList to fit in Object variable
fileNamesArray = New ArrayList(fileNames)
' Optional sorter
fileNamesArray.Sort()
End If
' Fill ssis object variable
Dts.Variables("FtpFileList").Value = fileNamesArray
code fails again when there are no files, can u tell me where shud i insert the above code in Main code..
ReplyDeletethat helps....
replace the part between (including these two lines):
Delete' Get a directory listing and fill the StringArray variables
and
Dts.Variables("FtpFileList").Value = fileNamesArray
with the code above... tested it myself and it works. If it doesn't work then contact me via the "contact me page" and include the code and errormessage.
Joost...this works great for downloading the files, but what if I needed to create the parent directory on my side and then download the files into that directory? So if the files on the FTP server reside in FolderA, I need to create FolderA on my side and the download all of the files in FolderA.
ReplyDeleteThe code is for populating the foreach loop only, but you can use the variable from the foreach loop (that contains the complete ftp file path) to create a folder. You could for example create a second string variable with an expression on it that converts the string from the first variable into a local path. Now you can use that variable to create a folder with an Execute File Task and you can also use it in the FTP Task as a download (destination) folder.
Deleteis it possible to modify this code to not only download the files but also it's parent folder?
ReplyDeleteNot sure what you mean... this code is for populating the foreach loop only. It only returns filepaths. You can make a recursive function to also loop through subfolders (or if you don't know .Net you can use my custom ftp file enumerator.
DeleteCreating local folders should be done as described above.
An other great resources is the SSIS MSDN Forum: http://social.msdn.microsoft.com/forums/en-US/sqlintegrationservices/threads/
Excellent Example. Solved all my problems. Thanks a lot!!!!
ReplyDeleteThank you for the code. Very good job.
ReplyDeleteOne small problem I have. I want to download only certain files.
All files that start with "test" as an example.
I've already tried with String.StartsWith but that does not work.
Can you help me?
One option could be to loop trough the string array or the arraylist and add an if-statement inside. Something like (haven't tested it):
Delete// Loop through arraylist of ftp files
ArrayList fileNamesArrayFiltered = new ArrayList();
foreach (string value in fileNamesArray)
{
// Check if it starts with "test"
if (value.StartsWith("test"))
{
fileNamesArrayFiltered.Add(value);
}
}
// Optional sorter
fileNamesArrayFiltered.Sort();
// Fill ssis object variable
Dts.Variables["FtpFileList"].Value = fileNamesArrayFiltered;
An other option could be to use my custom ftp enumerator. It has a filter in it.
Joost,
ReplyDeleteThanks for this, and for all your work responding to users. Without using too much of your time, could you point me in the best direction for using your ftp enumerator for only grabbing the most recent file in a dir? I've searched for ways to alter the script or to use the sort desc and then only run the loop one time, but I haven't hit on the answer.
Thanks again,
scott
Hi Scott,
DeleteThe standard FTP Connection in SSIS doesn't support filedates. So the FTP FILE Enumerator wont help you unless the date is stored in de filename.
I think you have two options:
1) .Net coding in a Script Task (any experience?)
2) Use a FTP client that does support filedates and call it via the Execue Process Task.
Regards,
Joost
how would i get all the files in the subfolders?
ReplyDeleteYou have t make a recursive method that gets all the files and sub folders in a certain folder. And for each folder you call this method again...
DeleteI created a custom foreach ftp file enumerator that does it for you if you don't want to program.
i made th same example but using excel as source data , but i can't excute the package i have an error :
ReplyDeleteErreur sur Tâche de flux de données [Source Excel [1]] : Code d'erreur SSIS DTS_E_CANNOTACQUIRECONNECTIONFROMCONNECTIONMANAGER. Échec de l'appel de la méthode AcquireConnection vers le gestionnaire de connexions « Gestionnaire de connexions Excel ». Code d'erreur : 0xC0202009. Des messages d'erreur peuvent être envoyés au préalable avec des informations indiquant la raison de l'échec de la méthode AcquireConnection.
Erreur sur Tâche de flux de données [SSIS.Pipeline] : Échec de la validation de composant « Source Excel » (1). Code d'erreur : 0xC020801C.
Erreur sur Tâche de flux de données [SSIS.Pipeline] : Échec de la validation d'un ou de plusieurs composants.
Erreur sur Tâche de flux de données : Des erreurs se sont produites au cours de la validation de la tâche.
Sorry don't speak/read French. But I think you have to set delay validation property to true because the excel doesn't yet exist when starting the package. Try setting it for the Excel Connection Manager and/or for the Data Flow Task.
DeleteGreat post!
ReplyDeleteThanks, this article was really helpful. I do wish the SSIS FTP Task was more robust though.
ReplyDeletethanks for sharing .. it helped :)
ReplyDeleteHi there, this is a superb script. However what ammendments would you sugget for automatically moving the file on the FTP to a "Processed" folder; so that the file doesn't get downloaded again when the script re-runs. I have added a "Remove" file ammendment, however I would like to store it in a separate folder.
ReplyDeleteIt's not possible within this example code. You could try the WebRequestMethods.Ftp.Rename
DeleteHi Joost, I need to download the file from local to ftp one by one from the folder with foreach loop container and make sure that if any file loading fail then it try again 2 time and if its still fails then its sent us the meessage in a catch blog.
ReplyDeleteHello Unknown,
DeleteIf you want to upload multiple files to the FTP server then you should use the regular FILE enumerator (not this custom FTP FILE enumerator). Within the loop you can add a FTP task to upload the file. With a retry trick http://microsoft-ssis.blogspot.com/2014/06/retry-task-on-failure.html you can accomplish your other request.
i did not understand the data flow part, can you please add a video.
ReplyDeleteI am downloading images files.