Thursday, 8 March 2018

Show SSIS error messages in the ADF monitor

Case
I want to see SSIS error messages in the ADF monitor or in the ADF email notification so that I can have a quick look to determine the severity of the problem without logging in to the catalog in Azure with SSMS. Is that possible?
Need more details in case of an error

Solution
It is possible, but it requires adjusting the T-SQL statement that executes the package. Please follow all steps in this blog post, but replace the T-SQL code from step 4 with the code below.

The change is in the last part only. Previously it only showed that the execution failed; now it also retrieves the error messages from the catalog. Because space in the monitor is a bit limited, only the first 7 errors are shown. Errors with 'validation' in the text are less useful for the quick look we want, so those are filtered out as well. All messages are separated with a linefeed for better readability.

-- Variables for execution and error message
declare @err_msg as varchar(8000) 
declare @err_msg_part as varchar(1000) 
declare @execution_id as bigint

-- Create execution and fill @execution_id variable
EXEC [SSISDB].[catalog].[create_execution] @package_name=N'Package.dtsx', @execution_id=@execution_id OUTPUT, @folder_name=N'SSISJoost', @project_name=N'MyAzureProject', @use32bitruntime=False, @reference_id=2, @useanyworker=True, @runinscaleout=True
 
-- Set logging level: 0=None, 1=Basic, 2=Performance, 3=Verbose 
EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id,  @object_type=50, @parameter_name=N'LOGGING_LEVEL', @parameter_value=1 
 
-- Set synchronized option 0=ASYNCHRONIZED, 1=SYNCHRONIZED
-- ASYNCHRONIZED: don't wait for the result
EXEC [SSISDB].[catalog].[set_execution_parameter_value] @execution_id,  @object_type=50, @parameter_name=N'SYNCHRONIZED',  @parameter_value=1 
 
-- Execute the package with parameters from above
EXEC [SSISDB].[catalog].[start_execution] @execution_id, @retry_count=0

-- Check if the package executed successfully (only for SYNCHRONIZED execution)
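-- Status values: 1=created, 2=running, 3=canceled, 4=failed, 5=pending,
-- 6=ended unexpectedly, 7=succeeded, 8=stopping, 9=completed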
IF (SELECT [status] FROM [SSISDB].[catalog].[executions] WHERE execution_id=@execution_id) <> 7
BEGIN
    SET @err_msg = 'Your package execution did not succeed for execution ID: ' + CAST(@execution_id AS NVARCHAR(20)) + CHAR(13) + CHAR(10)
    DECLARE err_cursor CURSOR FOR 
    SELECT  TOP(7) CAST([message] AS varchar(1000)) AS [message] -- Max 7 errors
    FROM  [SSISDB].[catalog].[event_messages]
    WHERE  [event_name] = 'OnError' -- Only show errors
    AND   [operation_id] = @execution_id
    AND   [message] NOT LIKE '%validation%'
    ORDER BY [event_message_id] -- Show the errors in order of occurrence
    -- Exclude less useful validation messages like:
    -- Error: One or more component failed validation.
    -- Error: There were errors during task validation.
    -- Error: Error 0xC0012050 while executing package from project reference package "xxx". Package failed validation from the ExecutePackage task. The package cannot run.
    -- Error: xxx failed validation and returned error code 0xC020801C.
    -- Error: "xxx" failed validation and returned validation status "VS_ISBROKEN".

    OPEN err_cursor  
    FETCH NEXT FROM err_cursor INTO @err_msg_part  
    WHILE @@FETCH_STATUS = 0  
    BEGIN
        SET @err_msg = @err_msg + @err_msg_part + CHAR(13) + CHAR(10) 
        FETCH NEXT FROM err_cursor INTO @err_msg_part 
    END 
    CLOSE err_cursor  
    DEALLOCATE err_cursor 

    RAISERROR(@err_msg,15,1)
END

Now it shows more details in the ADF monitor, and if you are also using the ADF email notifications, the same messages will appear in the email. Feel free to suggest improvements in the comments.
More error details

Note that it is not a complete overview of all messages, but in most cases it should be enough for the seasoned developer to quickly identify the problem and take action to solve it.
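
If you do need the complete overview, you can still query the catalog yourself in SSMS. A minimal sketch that lists all error messages of one execution (replace the hypothetical execution ID 12345 with your own, and remove the [event_name] filter to also see warnings and informational messages):

-- Show all error messages of a single execution
SELECT   [event_message_id]
,        [message_time]
,        [message_source_name]
,        [message]
FROM     [SSISDB].[catalog].[event_messages]
WHERE    [operation_id] = 12345 -- Hypothetical execution ID
AND      [event_name] = 'OnError'
ORDER BY [event_message_id]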


*update 13-04-2018: There is a new Execute SSIS Package activity, but without error options.*

Thursday, 1 March 2018

Azure Blob Source ≠ Flat File Source

Case
I'm running my SSIS packages in Azure and my source is a flat file in an Azure Blob Storage container. Therefore I use the Azure Blob Source as a source in my Data Flow Task. However, this source has just a few formatting options compared to the Flat File Source (and its connection manager). I want to specify things like text qualifiers and data types. How do I do that?
Azure Blob Source has too few options

Solution
With the current version of the Azure Blob Source you can only specify the column separator, but there is a workaround available. The Integration Runtime (IR) that is hosted in ADF is actually a virtual machine with Integration Services on it. A simple Script Task running on that IR reveals the drives and their available space. It shows that several drives are available on that virtual machine.
Log with drive details

// C# Code to replace your Main() method
public void Main()
{
    // Create array with drive information
    System.IO.DriveInfo[] allDrives = System.IO.DriveInfo.GetDrives();

    // Create string to store a message
    String DriveDetails = "";

    // Loop through all drives to get info about each one
    foreach (System.IO.DriveInfo d in allDrives)
    {
        // Get drive name (C:\) and type (e.g. Fixed)
        DriveDetails = d.Name + "(" + d.DriveType + ")" + Environment.NewLine;

        // If the drive is ready you can get more details
        if (d.IsReady)
        {
            DriveDetails += " - Volume label: " + d.VolumeLabel + Environment.NewLine;
            DriveDetails += " - File system: " + d.DriveFormat + Environment.NewLine;
            DriveDetails += " - Available space to current user: " + d.AvailableFreeSpace + Environment.NewLine;
            DriveDetails += " - Total available space: " + d.TotalFreeSpace + Environment.NewLine;
            DriveDetails += " - Total size of drive: " + d.TotalSize;
        }

        // Fire the message as warning to stand out between other messages
        Dts.Events.FireWarning(0, "Details", DriveDetails, "", 0);
    }

    // End Script Task
    Dts.TaskResult = (int)ScriptResults.Success;
}

So the solution is to first use the Azure Blob Download Task to download the file from the Blob Storage Container to the Virtual Machine. After that you can use a regular Flat File Source in the Data Flow Task.
Azure Blob Download Task

I'm not sure what the purpose of these disks is, or whether one of them is meant for non-persistent data (disks that are automatically cleaned), but I recommend using the E drive to temporarily store the downloaded files and to clean up afterwards.
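
The cleanup itself could be done with a second Script Task at the end of your package. A minimal sketch, assuming the files were downloaded to a hypothetical folder E:\Temp (adjust the path, or better: store it in a variable):

// C# code to replace your Main() method
public void Main()
{
    // Hypothetical folder that the Azure Blob Download Task downloaded to
    string downloadFolder = @"E:\Temp";

    // Delete all downloaded files, but keep the folder itself
    foreach (string file in System.IO.Directory.GetFiles(downloadFolder))
    {
        System.IO.File.Delete(file);
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}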

Windows Temp folder
An alternative to picking a temporary folder on your IR machine yourself is to use a very simple Script Task with only one line of code that retrieves the path of the Windows temp folder. The path looks something like D:\Users\WATASK_1\AppData\Local\Temp\. If you store this path in an SSIS string variable, then you can use that variable in expressions on your tasks and on the Flat File connection manager. After the next reboot Windows removes all old files in this folder.
// C# code to replace your Main() method
public void Main()
{
    // Store the path of the Windows temp folder in an SSIS string variable
    Dts.Variables["User::tempPath"].Value = System.IO.Path.GetTempPath();

    Dts.TaskResult = (int)ScriptResults.Success;
}

Script Task that fills a variable
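
On the Flat File connection manager you could then, for example, add an expression on the ConnectionString property like @[User::tempPath] + "myfile.csv" (the filename is just an illustration; use whatever your Azure Blob Download Task downloaded).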