Thursday, 1 March 2012

Adding a header row to a Flat File

Case
I want to add a header row with totals (total number of records, total count) on the first row:
ROWCOUNT: 5  TOTAL: 57
MOUNTAINBIKE BLUE   10
MOUNTAINBIKE RED     8
MOUNTAINBIKE BLACK  21
MOUNTAINBIKE SILVER 13
MOUNTAINBIKE WHITE   5

This information is only available after the Data Flow Task has finished, so you can't dynamicly alter the header property of a Flat File Destination.

Solution
A) You can add a Multicast after your source component and use one flow for the regular data and one flow for calculating the totals and add some header row. Read this post from Keith Hyer for more details for that solution. The flow should look something like this:
Example for adding header.



























B) If your header doesn't contain data that's calculated in the data flow it self, you could try something like this with an expression on the header:
Header with no calculated data

















C) In all other cases you could use a Script Task to add a header to a Flat File. Let's elaborate that solution.

1) Variables
I created two integer variables that will be filled by the Data Flow Task in step 2.
The variables











2) The Data Flow
The Data Flow I will be using for this example looks something like the flow below. You can adjust it for your own needs. There is a Row Count Transformation that counts the number of records and there is a Script Component (Transformation) that calculates the total of my sales column.
My data flow



























And the corresponding Script for calculating the sales total:
// C# Code
// Basic script for calculating
// row totals without aggregate.
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    // Declare variable for the total sales
    int ColumnTotal = 0;

    public override void PostExecute()
    {
        base.PostExecute();
        // Copy value of script variable to SSIS variable
        this.Variables.TotalSales = ColumnTotal;
    }

    public override void Input0_ProcessInputRow(Input0Buffer Row)
    {
        // Add value of sales column to script variable
        ColumnTotal += Row.Sales;
    }
}
or VB.Net

' VB.Net Code
' Basic script for calculating
' row totals without aggregate.
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

<Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute> _
<CLSCompliant(False)> _
Public Class ScriptMain
    Inherits UserComponent

    Dim ColumnTotal As Integer = 0

    Public Overrides Sub PostExecute()
        MyBase.PostExecute()
        ' Copy value of script variable to SSIS variable
        Me.Variables.TotalSales2 = ColumnTotal
    End Sub

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ColumnTotal = ColumnTotal + Row.Sales
    End Sub
End Class

3) The Script Task
Add a Script Task after your Data Flow Task and give it a suitable name. Note the name of the Flat File Connection Manager. It will be used in step 5.
Script Task after Data Flow Task





















4) ReadOnly variables
Add the two variables from step 1 as ReadOnly variables.
ReadOnly variables
























5) The Script
Add the following code to the main method in your Script Task.
// C# Code
// Script Task for appending header row on top of textfile
using System;
using System.Data;
using System.IO;      // Added
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;

namespace ST_cc10f7cb927344e7a8239df8ebf9bca3.csproj
{
    [System.AddIn.AddIn("ScriptMain", Version = "1.0", Publisher = "", Description = "")]
    public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
    {

        #region VSTA generated code
        enum ScriptResults
        {
            Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
            Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
        };
        #endregion

        public void Main()
        {
            try
            {
                // Determine Filenames using the Flat File Connection manager
                string FileName = Dts.Connections["MyFlatFile"].ConnectionString;
                string tempFileName = FileName + ".temp";

                // Create temporary copy of source file
                File.Move(FileName, tempFileName);

                using (StreamReader input = new StreamReader(tempFileName))
                {
                    using (StreamWriter output = new StreamWriter(FileName, false))
                    {
                        // Create header in empty file
                        output.WriteLine("ROWCOUNT: " + Dts.Variables["RowCount"].Value.ToString() +
                            " TOTAL: " + Dts.Variables["TotalSales"].Value.ToString());

                        // Create a buffer. This is needed for large files that won't fit in the servers RAM
                        var buf = new char[4096];

                        // Read temporary copy of source file in blocks
                        // and write in blocks to empty file with header
                        int read = 0;
                        do
                        {
                            read = input.ReadBlock(buf, 0, buf.Length);
                            output.Write(buf, 0, read);
                        } while (read > 0);

                        // Clear and close
                        output.Flush();
                        output.Close();
                        input.Close();
                    }
                }

                // temporary copy of source file
                File.Delete(tempFileName);

                // Close Script Task with Succes
                Dts.TaskResult = (int)ScriptResults.Success;
            }
            catch (Exception ex)
            {
                // Log error
                Dts.Events.FireError(0, "Write header", ex.Message, string.Empty, 0);

                // Close Script Task with Failure
                Dts.TaskResult = (int)ScriptResults.Failure;
            }
        }
    }
}
or VB.Net

' VB.Net Code
' Script Task for appending header row on top of textfile
Imports System
Imports System.Data
Imports System.IO       ' Added
Imports System.Math
Imports Microsoft.SqlServer.Dts.Runtime

<System.AddIn.AddIn("ScriptMain", Version:="1.0", Publisher:="", Description:="")> _
<System.CLSCompliantAttribute(False)> _
Partial Public Class ScriptMain
 Inherits Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase

 Enum ScriptResults
  Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success
  Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
 End Enum
 
    Public Sub Main()
        Try
            ' Determine Filenames using the Flat File Connection manager
            Dim FileName As String = Dts.Connections("MyFlatFile").ConnectionString
            Dim tempFileName As String = FileName + ".temp"

            ' Create temporary copy of source file
            File.Move(FileName, tempFileName)

            Using input As StreamReader = New StreamReader(tempFileName)
                Using output As StreamWriter = New StreamWriter(FileName, False)
                    ' Create header in empty file
                    output.WriteLine("ROWCOUNT: " + Dts.Variables("RowCount").Value.ToString() + _
                                     " TOTAL: " + Dts.Variables("TotalSales").Value.ToString())

                    ' Create a buffer. This is needed for large files that won't fit in the servers RAM
                    Dim buf = New Char(4095) {}

                    ' Read temporary copy of source file in blocks
                    ' and write in blocks to empty file with header
                    Dim read As Integer = 0
                    Do
                        read = input.ReadBlock(buf, 0, buf.Length)
                        output.Write(buf, 0, read)
                    Loop While read > 0

                    ' Clear and close
                    output.Flush()
                    output.Close()
                    input.Close()

                End Using
            End Using

            ' temporary copy of source file
            File.Delete(tempFileName)

            ' Close Script Task with Succes
            Dts.TaskResult = ScriptResults.Success

        Catch ex As Exception
            ' Log error
            Dts.Events.FireError(0, "Write header", ex.Message, String.Empty, 0)

            ' Close Script Task with Failure
            Dts.TaskResult = ScriptResults.Failure
        End Try
    End Sub
End Class


6) The Result
The result is an extra header line on top of the existing file.
The Result

No comments:

Post a Comment

Please use the SSIS MSDN forum for general SSIS questions that are not about this post. I'm a regular reader of that forum and will gladly answer those questions over there.

All comments are moderated manually to prevent spam.