I want my files to be process in filedate order, but the foreach loop doesn't provide a sort mechanism. Is there a way to order by filedate or filename?
Solution
The standard foreach loop container can't be sorted. See/vote this Feedback suggestion at Microsoft. It looks like it's standard ordered by filename. You could use this custom Sorted File Enumerator or you can accomplish it with a Script Task. This Script Task solution let's you order (asc or desc) by filename, creation date, last modified date or any other file property.
1) Control Flow
Drag a Script Task and a Foreach Loop Container to the Control Flow like the image below.
Sorted foreach loop container |
2) Variables
Create three variables:
- startlocation (string) that indicates the folder wherein we gonna search for files. Fill it with a path like "d:\mySourceFiles\".
- dataset (object) which we gonna use for communicating between the Script Task and the Foreach Loop Container.
- filepath (string) which is used for the variable mapping in the Foreach Loop Container.
The variables for this solution |
3) Script Task
Select the startlocation as a readonly variable (we only read the path inside it) and select dataset as a readwritevariable (an ADO object will be stored in it).
Variables in Script Task |
4) The Script
Copy the following script to your Script Task. Line 52 does the filtering and sorting. It is almost the same as a SQL query. See more examples on msdn.
// C# code using System; using System.Data; using System.IO; // Added using Microsoft.SqlServer.Dts.Runtime; using System.Windows.Forms; namespace ST_2e776e26793b45939128add3d850f70d.csproj { [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")] public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase { #region VSTA generated code enum ScriptResults { Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success, Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure }; #endregion public void Main() { // Create a dataset. I named it unsorted, because it's not yet sorted DataSet dsUnsorted = new DataSet(); // Create a new table in the dataset DataTable filelistTable = dsUnsorted.Tables.Add(); filelistTable.Columns.Add("FilePath", typeof(string)); // Filepath needed for connectionstring. filelistTable.Columns.Add("FileName", typeof(string)); // Filename used for sorting [optional]. filelistTable.Columns.Add("FileDate", typeof(DateTime));// Filedate used for sorting [optional]. // Get all files within the folder string[] allFiles = Directory.GetFiles(Dts.Variables["User::startlocation"].Value.ToString()); // Variable for storing file properties FileInfo fileInfo; // Loop through the files in the folder foreach (string currentFile in allFiles) { // Fill fileInfo variable with file information fileInfo = new FileInfo(currentFile); // Choose which the file properties you will use // Columns: FilePath FileName FileDate filelistTable.Rows.Add(fileInfo.FullName, fileInfo.Name, fileInfo.CreationTime); } // Filtering on *.txt extension. Note: like uses * instead of % // Sorting the files on filename (or filedate: FileName DESC) DataRow[] rows = dsUnsorted.Tables[0].Select("FileName like '*.txt'", "FileDate ASC"); // Create a new sorted dataset that the SSIS foreach loop uses. DataSet dsSorted = new DataSet(); DataTable filelistTableSorted = dsSorted.Tables.Add(); // Only interested in the filepath which is needed for the connectionstring filelistTableSorted.Columns.Add("FilePath", typeof(string)); // Fill the new dataset with the sorted rows. foreach (DataRow row in rows) { filelistTableSorted.Rows.Add(row["FilePath"].ToString()); } // Store the dataset in the SSIS variable Dts.Variables["dataset"].Value = dsSorted; Dts.TaskResult = (int)ScriptResults.Success; } } }
or with VB.Net
' VB.Net Code Imports System Imports System.Data Imports System.IO ' Added Imports System.Math Imports Microsoft.SqlServer.Dts.Runtime <System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _ <System.CLSCompliantAttribute(False)> _ Partial Public Class ScriptMain Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase Enum ScriptResults Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure End Enum Public Sub Main() ' Create a dataset. I named it unsorted, because it's not yet sorted Dim dsUnsorted As New DataSet() ' Create a new table in the dataset Dim filelistTable As DataTable = dsUnsorted.Tables.Add() filelistTable.Columns.Add("FilePath", GetType(String)) ' Filepath needed for connectionstring. filelistTable.Columns.Add("FileName", GetType(String)) ' Filename used for sorting [optional]. filelistTable.Columns.Add("FileDate", GetType(DateTime)) ' Filedate used for sorting [optional]. ' Get all files within the folder Dim allFiles As String() = Directory.GetFiles(Dts.Variables("User::startlocation").Value.ToString()) ' Variable for storing file properties Dim fileInfo As FileInfo ' Loop through the files in the folder For Each currentFile As String In allFiles ' Fill fileInfo variable with file information fileInfo = New FileInfo(currentFile) ' Choose which the file properties you will use ' Columns: FilePath FileName FileDate filelistTable.Rows.Add(fileInfo.FullName, fileInfo.Name, fileInfo.CreationTime) Next ' Filtering on *.txt extension. Note: like uses * instead of % ' Sorting the files on filename (or filedate: FileName DESC) Dim rows As DataRow() = dsUnsorted.Tables(0).[Select]("FileName like '*.txt'", "FileDate ASC") ' Create a new sorted dataset that the SSIS foreach loop uses. Dim dsSorted As New DataSet() Dim filelistTableSorted As DataTable = dsSorted.Tables.Add() ' Only interested in the filepath which is needed for the connectionstring filelistTableSorted.Columns.Add("FilePath", GetType(String)) ' Fill the new dataset with the sorted rows. For Each row As DataRow In rows filelistTableSorted.Rows.Add(row("FilePath").ToString()) Next ' Store the dataset in the SSIS variable Dts.Variables("dataset").Value = dsSorted Dts.TaskResult = ScriptResults.Success End Sub End Class
5) Foreach Loop Container
Edit the Foreach Loop Container and change the enumerator on the Collection tab to Foreach ADO Enumerator. This means that it will loop through an ADO object. Select the dataset variable as the ADO object source variable. This is the variable that contains the sorted dataset from the Script Task.
Foreach ADO Enumerator |
6) Variable Mappings
Map the variable filepath to Index 0. This will store the path in this variable.
Variable mapping |
Now you can add your own tasks to the Foreach Loop container and use the variable filepath in the expression for Connection String. I added a simple Script Task for testing that shows the filepath in a MessageBox.
The Result: a sorted Foreach Loop Container |
Note: there are other ways to accomplish this, like a first Foreach Loop Container that adds all filenames to a database table and then a second Foreach Loop Container to loop through that database table. Let me know what your solution is.
Example package 2008
Update: I have created a Custom Foreach Sorted File Enumerator if you don't like scripting.