Sunday, 13 March 2011

Get file properties with SSIS

Case
I recently did a post on how to get filedates in SSIS, but I got questions about how to get other file properties like size, owner and a check whether the file is locked by an other process.

Solution
You could use the open source File Properties Task, but you can also do it yourself with a Script Task. Let's elaborate that Script Task solution.

1) Create variables
I created a whole bunch of variables, one for each file property. Notice the various datatypes. Fill the variable FilePath with a value.
Right click in your Control Flow to activate Variable window














2) Script Task
Add a Script Task to your Control Flow. I didn't add the variables to the readonly and readwrite lists of the Script Task, but locked them in the script. See this article for the difference between those methods.
Script Task














2a) Optional
This step is optional. It depends on whether you choose script 3a or 3b. This step is required for script 3a.
Select all the new variables as ReadWriteVariables. Only the FilePath can be a ReadOnlyVariable.
ReadOnly- and ReadWriteVariables




















3a) The script (if you used 2a)
Copy the following code (usings and method) to your Script Task and remove the properties you don't need.
// C# code
// Fill SSIS variables with file properties
using System;
using System.Data;
using System.IO;                        // Added to get file properties
using System.Security.Principal;        // Added to get file owner
using System.Security.AccessControl;    // Added to get file owner
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_9ef66c631df646e08e4184e34887da16.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            // Variable for file information
            FileInfo fileInfo;

            // Fill fileInfo variable with file information
            fileInfo = new FileInfo(Dts.Variables["User::FilePath"].Value.ToString());

            // Check if file exists
            Dts.Variables["User::FileExists"].Value = fileInfo.Exists;

            // Get the rest of the file properties if the file exists
            if (fileInfo.Exists)
            {
                // Get file creation date
                Dts.Variables["User::FileCreationDate"].Value = fileInfo.CreationTime;
                
                // Get last modified date
                Dts.Variables["User::FileLastModifiedDate"].Value = fileInfo.LastWriteTime;

                // Get last accessed date
                Dts.Variables["User::FileLastAccessedDate"].Value = fileInfo.LastAccessTime;

                // Get size of the file in bytes
                Dts.Variables["User::FileSize"].Value = fileInfo.Length;

                // Get file attributes
                Dts.Variables["User::FileAttributes"].Value = fileInfo.Attributes.ToString();
                Dts.Variables["User::FileIsReadOnly"].Value = fileInfo.IsReadOnly;
                
                //////////////////////////////////////////////////////
                // Check if the file isn't locked by an other process
                try
                {
                    // Try to open the file. If it succeeds, set variable to false and close stream
                    FileStream fs = new FileStream(Dts.Variables["User::FilePath"].Value.ToString(), FileMode.Open);
                    Dts.Variables["User::FileInUse"].Value = false;
                    fs.Close();
                }
                catch (Exception ex)
                {
                    // If opening fails, it's probably locked by an other process
                    Dts.Variables["User::FileInUse"].Value = true;

                    // Log actual error to SSIS to be sure 
                    Dts.Events.FireWarning(0, "Get File Properties", ex.Message, string.Empty, 0);
                }

                //////////////////////////////////////////////////////
                // Get the Windows domain user name of the file owner
                FileSecurity fileSecurity = fileInfo.GetAccessControl();
                IdentityReference identityReference = fileSecurity.GetOwner(typeof(NTAccount));
                Dts.Variables["User::FileOwner"].Value = identityReference.Value;
            }

            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}

or the VB.Net code
' VB.net Code
' Fill SSIS variables with file properties 
Imports System
Imports System.Data
Imports System.IO                        ' Added to get file properties
Imports System.Security.Principal        ' Added to get file owner
Imports System.Security.AccessControl    ' Added to get file owner 
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
 Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

 Enum ScriptResults
  Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
  Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
 End Enum

 Public Sub Main()
        ' Variable for file information 
        Dim fileInfo As FileInfo

        ' Fill fileInfo variable with file information 
        fileInfo = New FileInfo(Dts.Variables("User::FilePath").Value.ToString())

        ' Check if file exists 
        Dts.Variables("User::FileExists").Value = fileInfo.Exists

        ' Get the rest of the file properties if the file exists 
        If (fileInfo.Exists) Then
            ' Get file creation date 
            Dts.Variables("User::FileCreationDate").Value = fileInfo.CreationTime
            ' Get last modified date 
            Dts.Variables("User::FileLastModifiedDate").Value = fileInfo.LastWriteTime
            ' Get last accessed date 
            Dts.Variables("User::FileLastAccessedDate").Value = fileInfo.LastAccessTime
            ' Get size of the file in bytes 
            Dts.Variables("User::FileSize").Value = fileInfo.Length
            ' Get file attributes 
            Dts.Variables("User::FileAttributes").Value = fileInfo.Attributes.ToString()
            Dts.Variables("User::FileIsReadOnly").Value = fileInfo.IsReadOnly
            ''''''''''''''''''''''''''''''''''''''''''''''''''' 
            ' Check if the file isn't locked by an other process 
            Try
                ' Try to open the file. If it succeeds, set variable to false and close stream 
                Dim fs As FileStream = New FileStream(Dts.Variables("User::FilePath").Value.ToString(), FileMode.Open)
                Dts.Variables("User::FileInUse").Value = False
                fs.Close()
            Catch ex As Exception
                ' If opening fails, it's probably locked by an other process 
                Dts.Variables("User::FileInUse").Value = True

                ' Log actual error to SSIS to be sure 
                Dts.Events.FireWarning(0, "Get File Properties", ex.Message, String.Empty, 0)
            End Try

            ''''''''''''''''''''''''''''''''''''''''''''''''''' 
            ' Get the Windows domain user name of the file owner 
            Dim fileSecurity As FileSecurity = fileInfo.GetAccessControl()
            Dim identityReference As IdentityReference = fileSecurity.GetOwner(GetType(NTAccount))
            Dts.Variables("User::FileOwner").Value = identityReference.Value
        End If

        Dts.TaskResult = ScriptResults.Success
 End Sub

End Class




3b) The script (without 2a)
Copy the following code (usings and method) to your Script Task and remove the properties you don't need.
// C# code
// Fill SSIS variables with file properties
using System;
using System.Data;
using System.IO;                        // Added to get file properties
using System.Security.Principal;        // Added to get file owner
using System.Security.AccessControl;    // Added to get file owner
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_cb8dd466d98149fcb2e3852ead6b6a09.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            // Lock SSIS variables
            Dts.VariableDispenser.LockForRead("User::FilePath");

            Dts.VariableDispenser.LockForWrite("User::FileAttributes");
            Dts.VariableDispenser.LockForWrite("User::FileCreationDate");
            Dts.VariableDispenser.LockForWrite("User::FileExists");
            Dts.VariableDispenser.LockForWrite("User::FileInUse");
            Dts.VariableDispenser.LockForWrite("User::FileIsReadOnly");
            Dts.VariableDispenser.LockForWrite("User::FileLastAccessedDate");
            Dts.VariableDispenser.LockForWrite("User::FileLastModifiedDate");
            Dts.VariableDispenser.LockForWrite("User::FileOwner");
            Dts.VariableDispenser.LockForWrite("User::FileSize");

            // Create a variables 'container' to store variables
            Variables vars = null;

            // Add variables from the VariableDispenser to the variables 'container'
            Dts.VariableDispenser.GetVariables(ref vars);

            // Variable for file information
            FileInfo fileInfo;

            // Fill fileInfo variable with file information
            fileInfo = new FileInfo(vars["User::FilePath"].Value.ToString());

            // Check if file exists
            vars["User::FileExists"].Value = fileInfo.Exists;

            // Get the rest of the file properties if the file exists
            if (fileInfo.Exists)
            {
                // Get file creation date
                vars["User::FileCreationDate"].Value = fileInfo.CreationTime;
                
                // Get last modified date
                vars["User::FileLastModifiedDate"].Value = fileInfo.LastWriteTime;

                // Get last accessed date
                vars["User::FileLastAccessedDate"].Value = fileInfo.LastAccessTime;

                // Get size of the file in bytes
                vars["User::FileSize"].Value = fileInfo.Length;

                // Get file attributes
                vars["User::FileAttributes"].Value = fileInfo.Attributes.ToString();
                vars["User::FileIsReadOnly"].Value = fileInfo.IsReadOnly;
                
                //////////////////////////////////////////////////////
                // Check if the file isn't locked by an other process
                try
                {
                    // Try to open the file. If it succeeds, set variable to false and close stream
                    FileStream fs = new FileStream(vars["User::FilePath"].Value.ToString(), FileMode.Open);
                    vars["User::FileInUse"].Value = false;
                    fs.Close();
                }
                catch (Exception ex)
                {
                    // If opening fails, it's probably locked by an other process
                    vars["User::FileInUse"].Value = true;

                    // Log actual error to SSIS to be sure 
                    Dts.Events.FireWarning(0, "Get File Properties", ex.Message, string.Empty, 0);
                }

                //////////////////////////////////////////////////////
                // Get the Windows domain user name of the file owner
                FileSecurity fileSecurity = fileInfo.GetAccessControl();
                IdentityReference identityReference = fileSecurity.GetOwner(typeof(NTAccount));
                vars["User::FileOwner"].Value = identityReference.Value;
            }

            // Release the locks
            vars.Unlock();
            
            Dts.TaskResult = (int)ScriptResults.Success;
        }
    }
}

or the VB.Net code
' VB.net Code
' Fill SSIS variables with file properties 
Imports System
Imports System.Data
Imports System.IO                        ' Added to get file properties
Imports System.Security.Principal        ' Added to get file owner
Imports System.Security.AccessControl    ' Added to get file owner 
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
 Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

 Enum ScriptResults
  Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
  Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
 End Enum

    Public Sub Main()
        ' Lock SSIS variables 
        Dts.VariableDispenser.LockForRead("User::FilePath")

        Dts.VariableDispenser.LockForWrite("User::FileAttributes")
        Dts.VariableDispenser.LockForWrite("User::FileCreationDate")
        Dts.VariableDispenser.LockForWrite("User::FileExists")
        Dts.VariableDispenser.LockForWrite("User::FileInUse")
        Dts.VariableDispenser.LockForWrite("User::FileIsReadOnly")
        Dts.VariableDispenser.LockForWrite("User::FileLastAccessedDate")
        Dts.VariableDispenser.LockForWrite("User::FileLastModifiedDate")
        Dts.VariableDispenser.LockForWrite("User::FileOwner")
        Dts.VariableDispenser.LockForWrite("User::FileSize")

        ' Create a variables 'container' to store variables 
        Dim vars As Variables = Nothing

        ' Add variables from the VariableDispenser to the variables 'container' 
        Dts.VariableDispenser.GetVariables(vars)

        ' Variable for file information 
        Dim fileInfo As FileInfo

        ' Fill fileInfo variable with file information 
        fileInfo = New FileInfo(vars("User::FilePath").Value.ToString())

        ' Check if file exists 
        vars("User::FileExists").Value = fileInfo.Exists

        ' Get the rest of the file properties if the file exists 
        If (fileInfo.Exists) Then
            ' Get file creation date 
            vars("User::FileCreationDate").Value = fileInfo.CreationTime
            ' Get last modified date 
            vars("User::FileLastModifiedDate").Value = fileInfo.LastWriteTime
            ' Get last accessed date 
            vars("User::FileLastAccessedDate").Value = fileInfo.LastAccessTime
            ' Get size of the file in bytes 
            vars("User::FileSize").Value = fileInfo.Length
            ' Get file attributes 
            vars("User::FileAttributes").Value = fileInfo.Attributes.ToString()
            vars("User::FileIsReadOnly").Value = fileInfo.IsReadOnly
            ''''''''''''''''''''''''''''''''''''''''''''''''''' 
            ' Check if the file isn't locked by an other process 
            Try
                ' Try to open the file. If it succeeds, set variable to false and close stream 
                Dim fs As FileStream = New FileStream(vars("User::FilePath").Value.ToString(), FileMode.Open)
                vars("User::FileInUse").Value = False
                fs.Close()
            Catch ex As Exception
                ' If opening fails, it's probably locked by an other process 
                vars("User::FileInUse").Value = True

                ' Log actual error to SSIS to be sure 
                Dts.Events.FireWarning(0, "Get File Properties", ex.Message, String.Empty, 0)
            End Try

            ''''''''''''''''''''''''''''''''''''''''''''''''''' 
            ' Get the Windows domain user name of the file owner 
            Dim fileSecurity As FileSecurity = fileInfo.GetAccessControl()
            Dim identityReference As IdentityReference = fileSecurity.GetOwner(GetType(NTAccount))
            vars("User::FileOwner").Value = identityReference.Value
        End If

        ' Release the locks 
        vars.Unlock()

        Dts.TaskResult = ScriptResults.Success
    End Sub
End Class


4) The result
For testing purposes I added an other Script Task with a messagebox that reads all the variables.
The result






















5) Usage
You can check for example whether a file locked before you start the Data Flow. You do this with an expression on the Precedence Constraint.

Expression

















Note: This script task example can be refined with error handling and logging.

23 comments:

  1. Hello,
    Thank you for the great article, but i'm getting error "Error 1 The type or namespace name 'AddIn' does not exist in the namespace 'System' (are you missing an assembly reference?)"

    i've added all the above references, what's missing?

    Thanks

    ReplyDelete
  2. @Jason Yousef: System.Addin is a standard reference (in my SSIS installation). If it's not present in yours, you should add this reference in the Project Explorer of your Script Task: C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\v3.5\System.AddIn.dll

    ReplyDelete
    Replies
    1. Hi Jason, Great post.
      Can you elaborate the above statement .
      "If it's not present in yours, you should add this reference in the Project Explorer of your Script Task: C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\v3.5\System.AddIn.dll"

      Can you provide me the steps How to add this .dll to my project......
      Thanks In advance......

      Delete
    2. @ranjith: the VB.Net example is made in SSIS 2008. If you still use 2005, just copy the imports rows and copy the content of my main metod to your 2005 main method. Only keep the row where the task result is set to true. You need that to replace the taskresult row from 2008.

      Delete
  3. Hi,
    this is a great post ! Can you please post the code for the second script task for the box message ? it will be really helpful.

    Thank you,

    ReplyDelete
    Replies
    1. It's just a messagebox. Something like:
      Messagebox.Show(dts.variables("User::FilePath").value.ToString());

      Here is how you use variables in a Script Task:
      http://microsoft-ssis.blogspot.com/2010/12/how-to-use-variables-in-script-task.html

      Delete
    2. First of all thanks for the article, it really helped, but I still have issues with the other script task, I connected it and edited the script putting Messagebox.Show(dts.variables("User::FilePath").value.ToString()); but it says message box does not exist in the cuurent content same as dts, I don't know what I am doing wrong, I am new to ssis tho, Also is there a way all this information can be sent to a table in a database?

      Thanks

      Delete
    3. Try adding the name space in the includes:
      Include system.windows.forms;

      Delete
    4. Hello Joost thanks for the update, the message now works, is there a way I can load this information in a Table in a sql database?

      Thanks.

      Delete
    5. Either read the variables in a Derived Column in the next Data Flow Task or use an Execute SQL Task with these variables as parameters for an insert query.

      Delete
  4. Thanks for posting this and it helped alot, going to post soon what I have created :)

    ReplyDelete
  5. Thanks a lot..It helps me a lot to load File properties of all files in a folder to a table in database.

    ReplyDelete
  6. I am trying to read all the files from the directory and get file attributes. Can you help me setup container and read file attributes.

    Thanks

    JP

    ReplyDelete
    Replies
    1. You could add some loop in the Script Task and combine it with an insert query, but if you don't have programming skills then you could also add a Foreach Loop (File Enumerator) around the script task.

      Delete
  7. Thank you very much for your article. I found it very useful!

    ReplyDelete
  8. Is it possible to retrieve 'Last Saved By' from the file using script task in SSIS?

    ReplyDelete
    Replies
    1. I guess you are talking about the property Last Saved By from Offices documents. It is possible, but not sure it is possible without interop (office) dll's. Try stackoverflow or one of the offices forums on MSDN.

      Delete
  9. Hello can I have the package to test it

    ReplyDelete
    Replies
    1. Just create your own package and copy everything between "public void Main() {" and "Dts.TaskResult = (int)ScriptResults.Success;" to your main method. If you can't get it to work fill in the contact form in the menu and let me know where you failed.

      Delete
  10. hi, please how can i apture properties like Author

    ReplyDelete
    Replies
    1. Are you refering to the document properties of a Word/Excel file? There are a couple of options:
      https://msdn.microsoft.com/en-us/library/4e0tda25.aspx
      http://stackoverflow.com/questions/29578944/author-of-a-file-using-c-sharp

      Delete
  11. hi all my attributes dates are showing up as 1601-01-01, can you help? thanks

    ReplyDelete
  12. Fantastic! Thank you very much Joost! Your code worked the first time with no modification in SSIS 2019!

    ReplyDelete

Please use the SSIS MSDN forum for general SSIS questions that are not about this post. I'm a regular reader of that forum and will gladly answer those questions over there.

All comments are moderated manually to prevent spam.