Friday, 6 December 2019

Azure DevOps - New Microsoft SSIS Deploy task

Case
Until now we needed Third Party tasks or PowerShell to deploy an SSIS project in DevOps. Now Microsoft finally released its own SSIS DevOps tasks. How does it work?
Replace PowerShell code with Microsoft SSIS Deploy task

Solution
Microsoft just released the SSIS Deploy task (public preview) which makes it much easier to deploy an SSIS project. Below you will find the codeless steps to deploy artifacts created by the SSIS Build task.

1) New Release Pipeline - Empty Job
The first step is to create a new Release Pipeline. For this example we will use an empty job to start with and later on add tasks manually.
  • Go to the Pipelines menu on the left
  • Go to Releases
  • Click on + New to create a New Release pipeline
  • Next choose 'Empty job'

Next step is to give the first Stage its name. A Stage is an environment like acceptance or production where you want to deploy your SSIS packages. For this example we will have three stages / environments: Dev/Tst, Acc and Prd. The next two stages will be created by using cloning, but first we will add and configure the new deploy task so that you don't have to repeat yourself.
  • After choosing 'Empty job' a first stage is created. Rename it 'DEV/TST'
  • Close the Stage pane by clicking the cross in the upper right corner
Create Release Pipeline

2) Select Artifact
In the Artifact pane you need to select the Artifact (the ispac file) that you created in the Build Pipeline. So look up the name before you start.
  • In the Artifacts pane click on + Add to add an artifact
  • Choose the project if you have multiple DevOps projects
  • Select the Artifact of the CI pipeline
  • Select Latest version
  • Optionally rename the source alias
  • Click on the Add button

We also need to determine whether we want to use the Continuous deployment trigger or perhaps another trigger or a schedule. For this example we will use the CD trigger.
  • Click on the lightning icon
  • Enable Build trigger (to release after each new successful build)
  • Close the CD trigger pane by clicking the cross in the upper right corner
Add Artifact and enable Continuous deployment

3) Add variables
For this example we need three variables: SsisServer, SsisDBUsername and SsisDBPassword
  • Go to the Variables tab
  • Click on the + Add to add the first variable
  • Enter the name of the variable
  • Enter the value of the variable
  • Hit the padlock icon when it's a sensitive value like a password
  • Select the first stage 'Dev/Tst' as scope

Repeat this for all three variables and, after the next step, repeat it for each Stage: each stage gets the same three variables, but with a different scope (Acc or Prd), so remember to change the values of the variables accordingly.
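
Side note: on Azure DevOps agents, non-secret pipeline variables are also exposed to script tasks as environment variables (name uppercased, dots replaced by underscores); secret variables like SsisDBPassword are not, unless you map them explicitly. A quick sketch to verify this with an inline PowerShell task:

# PowerShell code (sketch)
# Non-secret pipeline variables appear as environment variables on the agent
Write-Host "SSIS Server from pipeline variable: $env:SSISSERVER"
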
Add variables to DevOps

4) Add task to first stage
Now we need to add a task to do the actual deployment of the SSIS ispac file. This is where the new Microsoft SSIS Deploy DevOps task is used. Note: Click on the Info-icon behind each field title to get more information about the desired value for that field.
  • Within the first Stage pane, click on the '1 job, 0 task' link to go to the job
  • Optionally change the settings of the Agent job (for example its name)
  • Click on the + icon in the Agent Job to add a new task
  • Search for 'SSIS Deploy'
  • Add the task called SSIS Deploy (This is the official Deploy Task for SSIS DevOps from Microsoft). Note: There are various third party tasks with the same name.
  • Click on the new task and rename it to Deploy SSIS [projectname]
  • Use the ellipsis button to browse to the ISPAC file
  • Use SSISDB as Destination Type
  • Use $(SsisServer) as Destination Server (this retrieves the value of the variable)
  • Use "/SSISDB/[projectname]" as Destination Path (replace it with the folder name you need)
  • Use SQL Server Authentication as Authentication type
  • Use $(SsisDBUsername) as Username (this retrieves the value of the variable)
  • Use $(SsisDBPassword) as Password (this retrieves the value of the variable)
  • Leave Overwrite on (you will deploy the same project more than once)
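
If you want to verify the deployment afterwards outside the pipeline, a minimal sketch using Invoke-Sqlcmd from the SqlServer PowerShell module (server, username and password are placeholder values; replace them with your own):

# PowerShell code (sketch)
# List the deployed projects per catalog folder
Invoke-Sqlcmd -ServerInstance "bitools.database.windows.net" -Database "SSISDB" `
    -Username "MySsisDbUser" -Password "MySecretPassword" `
    -Query "SELECT f.[name] AS FolderName, p.[name] AS ProjectName, p.[last_deployed_time]
            FROM catalog.projects p
            JOIN catalog.folders f ON p.folder_id = f.folder_id"
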
Add SSIS Deploy task to Release pipeline

5) Add stages
Now we need to clone the Dev/Tst stage to Acc and to Prd.
  • Click on the Pipeline tab if you don't see your first Stage
  • Hover above the first stage and wait for the clone button to appear and click it
  • Next rename it to Acc
  • Repeat this for Prd

Notice the red exclamation mark on the Variables tab after adding a stage: you need to add the same variables for each stage.
Adding stages for Acc and Prd

For each Stage you can determine how that Stage starts. These are the 'Pre-deployment conditions'. For example for Dev/Tst you could do an automatic deployment, for Acc you first need an approval from one person and for Prd you need two approvals.

  • Click on the lightning/person icon on the left side of each Stage to set it
  • Dev/Tst will use the 'After release' trigger and the next stages will use the 'After stage' trigger
  • For Acc and Prd you could add Pre-deployment approvals by selecting one of your team members.
  • Close the Pre-deployment conditions pane by clicking the cross in the upper right corner
  • Repeat this for all stages

And don't forget to add the variables when you have finished the Pre-deployment settings.
Adding Pre-deployment conditions for each stage

6) Test
Now create a new release by triggering the build pipeline. After this a new artifact will be created and this will trigger the release pipeline. You can also hit the Create Release button in the upper right corner to use the last created artifact. The artifact will automatically be released to the first stage, but the next stage waits for approval (click on the blue ACC button to approve).
Dev/Tst successful, Acc waiting for approval

Note: You can also check the result of the release by clicking on stage DEV/TST and then clicking on the SSIS Deployment step. If it failed you can see the error messages per step.


Conclusion
In this post you saw how easy it is to deploy an SSIS project to the Integration Services Catalog with this new task. It is also easy to switch between various Authentication types. One small disadvantage is that you don't have any options for the SSIS Catalog Environments. You either have to configure those manually or script them with PowerShell or T-SQL after the deployment, as sketched below.
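
As a starting point, here is a minimal PowerShell sketch for creating a Catalog Environment and referencing it from the deployed project. It assumes the Microsoft.SqlServer.Management.IntegrationServices assembly and the $SsisNamespace, $Folder and $Project objects from the PowerShell deployment script further down this page; the environment name is a made-up example.

# PowerShell code (sketch)
$EnvironmentName = "Generic"

# Create the environment if it doesn't exist yet
$Environment = $Folder.Environments[$EnvironmentName]
if (-not $Environment)
{
    $Environment = New-Object $SsisNamespace".EnvironmentInfo" ($Folder, $EnvironmentName, $EnvironmentName)
    $Environment.Create()
}

# Add a reference from the deployed project to the environment
if (-not $Project.References.Contains($EnvironmentName, $Folder.Name))
{
    $Project.References.Add($EnvironmentName, $Folder.Name)
    $Project.Alter()
}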


Azure DevOps - New Microsoft SSIS Build task

Case
Until now we needed Third Party tasks or PowerShell (with NuGet) to build an SSIS project in DevOps. Now Microsoft finally released its own SSIS DevOps tasks. How does it work?
Four tasks that will be replaced by one new task

Solution
Microsoft just released SSIS Build (public preview) which makes it much easier to build an SSIS project. Below you will find the codeless steps to build and publish the artifacts so that they can be deployed.

1) Create new build pipeline
In this step we will create an empty build pipeline which is connected to Azure Repos Git where our SSIS project is located.
  • Go to Pipelines menu in DevOps and then to Builds. In the list of pipelines you will find the + New sign on top. Click on it and choose New build pipeline.
  • Now you can choose where your code is located. However, at the bottom you will find the option "Use the classic editor to create a pipeline without YAML". After clicking on it you must select the source you want to use for building the code (Azure Repos Git in our example).
  • Next step is to create an Empty job/pipeline and give it a useful name.
Create new empty build pipeline

2) Add SSIS Build task
In this step we will add the new SSIS Build task from Microsoft. Make sure to pick the right one because there are several tasks with the same name. You want to look for "by Microsoft Corporation" or "by BeisiZhou".
  • Click on the plus icon behind "Agent job 1" (you probably want to give that one a better name), search for "SSIS Build" and choose the SSIS Build task from Microsoft (see animated gif)
  • Edit the new SSIS Build task. The only thing you need to change is the Project Path property. You can use the ellipsis button to browse to your SSIS project file.
Add SSIS Build task by Microsoft

3) Publish Artifact
The previous task will create an IsPac file that we need to publish so that we can use it in a release pipeline later on.
  • Click on the plus icon behind "Agent job 1", search for "Publish artifact" and choose the Publish artifact task from Microsoft (see animated gif)
  • The only thing you need to change is the Artifact name. Don't use the default 'drop', but use your SSIS project name instead.
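
If you are not sure where the SSIS Build task left the .ispac file on the agent, a quick inline PowerShell sketch like this lists the candidates (assuming the output ends up under the sources directory):

# PowerShell code (sketch)
Get-ChildItem -Path $env:BUILD_SOURCESDIRECTORY -Filter "*.ispac" -Recurse |
    Select-Object -ExpandProperty FullName
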
Publish artifact (IsPac)

4) Add Trigger
To trigger this build pipeline automatically when changes are committed to the Master branch we need to add a trigger. Go to the triggers menu and enable continuous integration. Then add a path filter so that this build will only trigger when the specific project changes. Then save the Build pipeline and trigger it by committing a change or by triggering it manually.
Add trigger

Conclusion
This new task is much easier to use than the PowerShell code and also easier than most of the third party tasks. With a little practice you can now easily create a build pipeline in under two minutes, which is probably faster than the build itself.

If your build fails with the following error message then you are probably using a custom task or component (like the Blob Storage Download Task). These tasks are not installed on the build agents hosted by Microsoft. You can either install them via PowerShell or use a self-hosted agent where you can install all custom components.

System.ArgumentException: Value does not fall within the expected range.
   at Microsoft.SqlServer.Dts.Runtime.Interop.ProjectInterop.ReferencePackage(Package package, String packageLocation)
   at Microsoft.SqlServer.Dts.Runtime.PackageItem.Load(IDTSEvents events)
   at Microsoft.SqlServer.Dts.Runtime.PackageItem.get_Package()
   at Microsoft.DataTransformationServices.Project.DataTransformationsProjectBuilder.IncrementalBuildThroughObj(IOutputWindow outputWindow)
   at Microsoft.DataTransformationServices.Project.DataTransformationsProjectBuilder.BuildIncremental(IOutputWindow outputWindow)
ERR:The process 'C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\Common7\IDE\devenv.exe' failed with exit code 1
Build D:\a\1\s\SSIS\STG\STG_MySource\STG_MySource.dtproj failed
ERR:Build failed
##[error]Build failed

Friday, 8 November 2019

Azure DevOps - Deploy SSIS project (CD)

Case
Recently you showed how to build an SSIS project in DevOps, but how do you deploy an SSIS project within DevOps?
Release SSIS in DevOps

Solution
There are no out of the box tasks yet, but there are a couple of Third Party DevOps tasks for SSIS which you could try out. For this example we will use a little PowerShell code, as we have done before.
Update Dec 6: New Microsoft SSIS Deploy task

1) New Release Pipeline - Empty Job
The first step is to create a new Release Pipeline. For this example we will use an empty job to start with and later on add tasks manually.
  • Go to the Pipelines menu on the left
  • Go to Releases
  • Click on + New to create a New Release pipeline
  • Next choose 'Empty job'
Next step is to give the first Stage its name. A Stage is an environment like acceptance or production where you want to deploy to. For this example we will have three stages / environments: Dev/Tst, Acc and Prd. The next two stages will be created by using cloning, but first we will add tasks so that you don't have to repeat yourself.
  • After choosing 'Empty job' a first stage is created. Rename it 'DEV/TST'
  • Close the Stage pane by clicking the cross in the upper right corner
Create Release Pipeline

2) Select Artifact
In the Artifact pane you need to select the Artifact (the ispac file) you created in the Build Pipeline. So look up the name before you start.
  • In the Artifacts pane click on + Add to add an artifact
  • Choose the project if you have multiple DevOps projects
  • Select the Artifact of the CI pipeline
  • Select Latest version
  • Optionally rename the source alias
  • Click on the Add button
We also need to determine whether we want to use the Continuous deployment trigger or perhaps another trigger or a schedule. For this example we will use the CD trigger.
  • Click on the lightning icon
  • Enable Build trigger (to release after each new successful build)
  • Close the CD trigger pane by clicking the cross in the upper right corner
Add Artifact and enable Continuous deployment

3) Add task to first stage
Now we need to add a task to do the actual deployment of the SSIS ispac file. There are a couple of third party SSIS deployment tasks, but the instructions and examples of those are not always that clear. Hopefully Microsoft will create their own SSIS DevOps tasks soon. For now we will use a little PowerShell scripting.
  • Within the first Stage pane, click on the '1 job, 0 task' link to go to the job
  • Optionally change the settings of the Agent job (for example its name)
  • Click on the + icon in the Agent Job to add a new task
  • Search for 'Powershell'
  • Add the task called PowerShell (Run a PowerShell script on Linux, macOS, or Windows)
  • Click on the new task and rename it to Deploy SSIS project
  • Select Inline as Type
  • Paste the script below in the Script textbox
Add PowerShell code to deploy SSIS

Notice the Params section in the PowerShell code. You could hardcode each value, but for this example we will use Variables from the Release pipeline to set the value of these parameters in the next step. The string "$(SsisServer)" in the code will be replaced by the value of the DevOps variable and result in: "bitools.database.windows.net" (see next step).

# PowerShell code
# Params
$SsisServer ="$(SsisServer)"
$SSISDBUsername = "$(SsisDBUsername)"
$SSISDBPassword = "$(SsisDBPassword)"
$FolderName = "RM"

# Mask the password to show something on
# screen, but not the actual password
# This is for testing purposes only.
$SSISDBPasswordMask = $SSISDBPassword -replace '.', '*'
 

Write-Host "========================================================================================================================================================"
Write-Host "==                                                         Used parameters                                                                            =="
Write-Host "========================================================================================================================================================"
Write-Host "SSIS Server            : " $SsisServer
Write-Host "SQL Username           : " $SSISDBUsername
Write-Host "SQL Password           : " $SSISDBPasswordMask
Write-Host "========================================================================================================================================================"

############## ASSEMBLY ###############
# Add SSIS assembly so you can do SSIS stuff in PowerShell
Write-Host "Referencing SSIS assembly"
$SsisNamespace = "Microsoft.SqlServer.Management.IntegrationServices"
# Load the IntegrationServices Assembly
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Management.IntegrationServices") | Out-Null;


############## SQL SERVER ###############
Write-Host "Connecting to Azure SQL DB server $($SsisServer)"

# Create a connectionstring for the Azure DB Server
# Make sure you use SSISDB as the Initial Catalog!
$SqlConnectionstring = "Data Source=$($SsisServer);User ID=$($SSISDBUsername);Password=$($SSISDBPassword);Initial Catalog=SSISDB;"
 
# Create a connection object
$SqlConnection = New-Object System.Data.SqlClient.SqlConnection $SqlConnectionstring
 
# Check if the connection works
Try
{
    $SqlConnection.Open();
    Write-Host "Connected to Azure SQL DB server $($SsisServer)"
}
Catch [System.Data.SqlClient.SqlException]
{
    Throw  [System.Exception] "Failed to connect to Azure SQL DB server $($SsisServer), exception: $($_)"
}


############## SSISDB ############### 
# Create the Integration Services object
$IntegrationServices = New-Object $SsisNamespace".IntegrationServices" $SqlConnection
  
# Check if SSISDB connection succeeded
if (-not $IntegrationServices)
{
    Throw  [System.Exception] "Failed to connect to SSISDB on $($SsisServer)"
}
else
{
    Write-Host "Connected to SSISDB on $($SsisServer)"
}

# Create object for SSISDB Catalog
$Catalog = $IntegrationServices.Catalogs["SSISDB"]
  
# Check if the SSISDB Catalog exists
if (-not $Catalog)
{
    # Catalog doesn't exist. The user should create it manually.
    # It is possible to create it, but that shouldn't be part of
    # deployment of packages.
    # Also make sure the catalog is SSISDB and not master or any
    # other database.
    Throw  [System.Exception] "SSISDB catalog doesn't exist. Create it manually!"
}
else
{
    Write-Host "Catalog SSISDB found"
}

############## CATALOG FOLDER ###############
# Create object to the (new) folder
$Folder = $Catalog.Folders[$FolderName]
  
# Check if folder already exists
if (-not $Folder)
{
    # Folder doesn't exist, so create the new folder.
    Write-Host "Creating new folder" $FolderName
    $Folder = New-Object $SsisNamespace".CatalogFolder" ($Catalog, $FolderName, $FolderName)
    $Folder.Create()
}
else
{
    Write-Host "Folder" $FolderName "found"
}
 
############## LOOP ISPACS ###############
Get-ChildItem -Filter "StandardReportGenerator.ispac" -Recurse | Where-Object { -Not ($_.FullName -match "obj") } | ForEach-Object {

    ############## PROJECT ###############
    $IspacFilePath = $_.FullName

    # Check if ispac file exists
    if (-Not (Test-Path $IspacFilePath))
    {
        Throw  [System.IO.FileNotFoundException] "Ispac file $IspacFilePath doesn't exist!"
    }
    else
    {
        $IspacFileName = split-path $IspacFilePath -leaf
        Write-Host "Ispac file" $IspacFileName "found"
    }

    # Get project name from ispac file
    $ProjectName = [system.io.path]::GetFileNameWithoutExtension($IspacFilePath)

    # Reading ispac file as binary
    [byte[]] $IspacFile = [System.IO.File]::ReadAllBytes($IspacFilePath) 
    $Folder.DeployProject($ProjectName, $IspacFile) | Out-Null
    $Project = $Folder.Projects[$ProjectName]
    if (-not $Project)
    {
        # Something went wrong with the deployment
        # Throw to stop the script instead of continuing
        Throw  [System.Exception] "Deployment of project $ProjectName failed"
    }

    Return "Ready deploying $IspacFileName " 
}

Note: this script uses a DB user to connect to the SSISDB. For other types of authentication you need to change the code block starting on line 30 (see the sketch below).
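
As a minimal sketch, this is roughly what the connection string would look like for integrated (Windows) authentication, assuming the account running the agent has access to SSISDB:

# PowerShell code (sketch)
# Integrated security instead of a DB user and password
$SqlConnectionstring = "Data Source=$($SsisServer);Initial Catalog=SSISDB;Integrated Security=SSPI;"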

4) Add variables
For this example we need three variables: SsisServer, SsisDBUsername and SsisDBPassword
  • Go to the Variables tab
  • Click on the + Add to add the first variable
  • Enter the name of the variable
  • Enter the value of the variable
  • Hit the padlock icon when it's a sensitive value like a password
  • Select the first stage 'Dev/Tst' as scope
Repeat this for all three variables and later on, after the next step, repeat this for each Stage. Just add the same variables but change the scope to Acc or Prd.
Add variables to DevOps

5) Add stages
Now we need to clone the Dev/Tst stage to Acc and to Prd.
  • Click on the Pipeline tab if you don't see your first Stage
  • Hover above the first stage and wait for the clone button to appear and click it
  • Next rename it to Acc
  • Repeat this for Prd
Notice the red exclamation mark before the variables tab after adding a stage: you need to add the same variables for each stage.
Adding stages for Acc and Prd

For each Stage you can determine how that Stage starts. These are the 'Pre-deployment conditions'. For example for Dev/Tst you could do an automatic deployment, for Acc you first need an approval from one person and for Prd you need two approvals.
  • Click on the lightning/person icon on the left side of each Stage to set it
  • Dev/Tst will use the 'After release' trigger and the next stages will use the 'After stage' trigger
  • For Acc and Prd you could add Pre-deployment approvals by selecting one of your team members.
  • Close the Pre-deployment conditions pane by clicking the cross in the upper right corner
  • Repeat this for all stages
And don't forget to add the variables when you have finished the Pre-deployment settings.
Adding Pre-deployment conditions for each stage

6) Test
Now create a new release by triggering the build pipeline. After this a new artifact will be created and this will trigger the release pipeline. You can also hit the Create Release button in the upper right corner to use the last created artifact. The artifact will automatically be released to the first stage, but the next stage waits for approval (click on the blue ACC button to approve).
Dev/Tst successful, Acc waiting for approval

You can also check the result of the release by clicking on Dev/Tst and then clicking on the PowerShell step. If it failed you can see the error messages per step.
See result of deployment

Conclusions
In this post you saw how to create a Release pipeline with several Stages and learned how these deployment stages are activated with the Pre-deployment conditions. Unfortunately there are no out-of-the-box Microsoft SSIS DevOps tasks yet, which forces you to use either Third Party tasks or PowerShell. Also take a look around and see all the other options in the release pipeline.




Friday, 20 September 2019

GIT Snack: missing buttons in Team Explorer

Case
I can commit and sync, but where are the GIT buttons Pull Request, Work Items and Builds in my Visual Studio's Team Explorer?
Missing 3 Git buttons: Pull Request, Work Items and Builds

Solution
You are not connected to the project in GIT. Let's connect to your project:
  1. In Visual Studio/SSDT go to Team in the menu and click on Manage Connections...
  2. In the Team Explorer an option "Manage Connections" appears. Click on it and choose Connect to a Project...
  3. A new GIT window "Connect to a Project" appears, find and select your project and click on the Connect button.
Connect to GIT project

After connecting, the 3 buttons Pull Request, Work Items and Builds will appear in the Team Explorer pane and then you can start a Pull Request from within Visual Studio/SSDT.
All buttons are now available

Wednesday, 12 June 2019

Azure DevOps - Build SSIS project (CI)

Case
I have my SSIS project in Git (Azure Repos Git) and I want to build my project in Azure DevOps. How do I add Continuous Integration (CI) for my SSIS project?
DevOps - Build pipeline

Solution
For this example we start with an SSIS project already in Azure Repos Git and build the project automatically when committing changes. Deployment (CD) of SSIS will be handled in a separate post.
Update Dec 6: New Microsoft SSIS Build task

1) New Build Pipeline
Go to Pipelines and then to Builds. In the list of pipelines you will find the + New sign. Click on it and choose New build pipeline. Now you can choose where your code is located. However, at the bottom you will find the option "Use the classic editor to create a pipeline without YAML". For this example we will use this option since SSIS is a bit old school. As a last step choose the Empty job and give your CI pipeline a useful name.
Creating new CI pipeline in DevOps for SSIS

2) NuGet Tool Installer
Since we will be needing an SSIS build tool that is available as a NuGet package, we first need to install NuGet itself. Click on the plus icon behind "Agent job 1" (you probably want to give that a better name), search for "NuGet" and choose the NuGet Tool Installer, then determine which version of NuGet you need. For this example I used 5.1.0, but click on the information icon behind Version to get a list of versions.
DevOps - NuGet Tool Installer

3) NuGet SSISBUILD
Now that we have installed the tool NuGet itself we can install a so-called NuGet package. The one we will be needing is called SSISBuild. Add a Task to the agent and search for NuGet again. Choose NuGet and change the command to Custom. Then enter the following command to install SSISBuild on the Build Server: install SSISBuild -Version 2.3.0
DevOps - NuGet install SSISBuild

4) PowerShell to build SSIS project
Now the most important step: building the SSIS project. We will be using PowerShell for this Task. Add a PowerShell Task to run PowerShell on Windows (, macOS, or Linux). Then change the script type to Inline and copy & paste the code below. On the first line of code (excluding comments) you have to specify the path of your SSIS project, starting from the solution folder up to and including the .dtproj extension. In the code a predefined variable called Build.SourcesDirectory will be concatenated to this path to specify the project path on the build server.
DevOps - PowerShell task to build the SSIS project

# PowerShell code
# The path of the project starting from the solution folder
$ProjectToBuild = "\SSIS\RM\StandardReportGenerator\StandardReportGenerator.dtproj"

# Join the path of the build server with the path of the project
$ProjectFilePath = Join-Path -Path $env:BUILD_SOURCESDIRECTORY -ChildPath $ProjectToBuild

# Check if the file exists on the build server
if (!(Test-Path $ProjectFilePath)) {
    # Not found, throw error
    throw "Project $($ProjectFilePath) not found!"
}
else
{
    # Call ssisbuild.exe with parameters to build this project
    &"$($env:BUILD_SOURCESDIRECTORY)\SSISBuild.2.3.0\tools\ssisbuild.exe" $ProjectFilePath -Configuration Development -ProtectionLevel DontSaveSensitive

    # Check whether the build was successful
    if ($LASTEXITCODE -ne 0) {
        # Build failed, throw error
        throw "Build of $($ProjectFilePath) failed.";
    }
}

You can also test this locally on your own computer. Then you have to download and unzip the SSISBuild NuGet package to your Visual Studio solution folder and add one line of code at the top to fill the predefined variable with the solution path on your local computer.
# PowerShell code
$env:BUILD_SOURCESDIRECTORY = "D:\sources\xxxxxxbi\"

Test locally with PowerShell ISE

The PowerShell is very basic at the moment (KISS), but there are some options to extend it. These will be handled in a separate post.

5) Copy Files
Now that we have successfully built the project it is time to copy the result, the .ispac file, to the Artifact Staging Directory. Add a new task called Copy Files. Select *.ispac as content and use the predefined variable build.ArtifactStagingDirectory as Target Folder.
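
For reference, this is roughly what the Copy Files task does, expressed as an inline PowerShell sketch with the same predefined variables:

# PowerShell code (sketch)
# Copy all ispac files from the sources folder to the staging folder
Get-ChildItem -Path $env:BUILD_SOURCESDIRECTORY -Filter "*.ispac" -Recurse |
    Copy-Item -Destination $env:BUILD_ARTIFACTSTAGINGDIRECTORY
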
DevOps - Copy ispac file to stage folder

6) Publish Build Artifacts
The last task is to publish the artifact, the .ispac file, so you can use it in a Release pipeline later on. Add a task called Publish Build Artifacts. By default the same predefined variable build.ArtifactStagingDirectory is used as the path to publish. Rename the artifact so that you can easily find it when building the Release pipeline.
DevOps - Publish Build Artifacts

7) Add Triggers
To trigger this build pipeline automatically when changes are committed to the Master branch we need to add a trigger. Go to the triggers menu and enable continuous integration. Then add a path filter so that this build will only trigger when the specific project changes. Then save the Build pipeline and trigger it by committing a change.
DevOps - Enable trigger to build SSIS project

Conclusion
In this post I showed you how to build an SSIS project in Azure DevOps with the SSISBuild NuGet package. It took quite a few steps, but most steps are relatively simple and easy to replicate in your own DevOps project.

One downside of this method is that it is a bit stricter when building your SSIS project. It could even happen that you can build your project in Visual Studio successfully and even deploy it to the catalog, but that the build fails when using the SSISBuild NuGet package, throwing you this error: ERROR: An item with the same key has already been added.
ERROR: An item with the same key has already been added

The possible cause of this error is duplicate Connection Manager GUIDs. This happens when you copy a package (with a package connection manager) in Visual Studio. I solved it by recreating the connection manager manually.
If you have duplicate package connection managers and you convert one of them to a project connection manager, then it will hide the duplicate connection manager in the copied package, making it very hard to find.

Next step: Deploy your SSIS project

Tuesday, 2 April 2019

Stopping an Integration runtime that is already stopping

Case
My integration runtime is stopping, but it is already taking more than an hour and I see errors in the ADF monitor:

Error 1: Last operation 'Stop' get the status 'Failed'. Error Code: PoolDeleteFailedDueToLockOrReference Error Message: Activity ID: a67615ac-94ff-4547-9244-728d3ae5b25e

Error 2: Error Code: VNetResourceGroupLockedDuringStop Error Message: We are unable to clean up unneeded resources when stopping your IR. Please unlock the resource group containing your VNet and/or remove any reference to the resources created for your IR, and then retry to stop your IR.

Error 3: Error Code: IntegrationRuntimeStoppingLongerThanUsual Error Message: Your integration runtime is taking longer than usual to stop, so please retry to stop it.

It suggests retrying to stop the SSIS IR, but the start and stop buttons are greyed out. How do I retry the stop?

Stop and start are greyed out

Solution
The Azure portal interface does not support retrying the stop at the moment. However, you can force-stop an SSIS IR that is already stopping with PowerShell!

#PowerShell code
# Login with your Azure account (popup will open)
Login-AzureRmAccount 

# Your subscription name (needed if you have more than one subscription)
$SubscriptionName = "xxxxxxx"

# Change the current and default Azure subscription
Select-AzureRmSubscription -SubscriptionName $SubscriptionName
Get-AzureRmSubscription -SubscriptionName $SubscriptionName

# Force stopping SSISIR
Stop-AzureRmDataFactoryV2IntegrationRuntime -DataFactoryName "MyDataFactoryName" -Name "SSISxxxxxxxIR" -ResourceGroupName "MyResourceGroup" -Force
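
Afterwards you can check the state of the Integration Runtime with the matching Get cmdlet from the same AzureRM module (the names below are placeholders):

# PowerShell code (sketch)
Get-AzureRmDataFactoryV2IntegrationRuntime -DataFactoryName "MyDataFactoryName" -Name "SSISxxxxxxxIR" -ResourceGroupName "MyResourceGroup" -Status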


Other steps you should take if the PowerShell does not stop the Integration Runtime:
  1. Check if there are locks on the Resource Group or individual items (like ADF, VNET or SSISDB), as sketched below.
  2. If you are using a VNET, check whether it works. (Create a new SSISIR and then press validate on the VNET step)
  3. If there are no locks and the VNET is valid, but the Integration Runtime is still stopping, then contact the Azure help + support desk either via Twitter or via an official support request in the portal.
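
For step 1 you can also check for locks with PowerShell instead of clicking through the portal. A minimal sketch using the same AzureRM module generation as above (the resource group name is a placeholder):

# PowerShell code (sketch)
# List the locks on the resource group
Get-AzureRmResourceLock -ResourceGroupName "MyResourceGroup"
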
1) Check for locks

2) Validate VNET

3) Add support request (in worst case)

Update: there is a new "retry stop" option available. Click on the Author (pencil) button on the left and then on Connections in the bottom left corner. Now open the Integration Runtimes tab and click on the new "retry stop" button. (I hope they will add this option in the monitor as well.)
Retry stopping the SSIS IR

Monday, 18 February 2019

Power Query Source (Preview)

Case
Yes, new SSIS functionality! In a previous post I had almost written it off, but Microsoft just introduced the SSIS source that we were all waiting for: the Power Query source. How does it work?
SSIS - Power Query Source

Solution
First you need to download and install the latest SSDT (version 15.9.0) and since it is still in preview, you can only use it within SSDT or on an Azure Integration Runtime in Azure Data Factory.

If you drag the new Power Query Source to your Data Flow canvas and edit it, you can paste your Power Query script from Power BI (or Excel) in the query textbox. So there is no editor yet(?), but this should simplify a lot of tasks that previously could only be solved with .NET scripting.
Paste your Power Query script in the editor

After pasting the Power Query script go to the Connection Managers pane and create a Power Query Connection Manager by first clicking on Detect Data Source and then by adding the new Connection Manager via the drop down list.
Power Query Connection Manager

Preview notes:
  • Only works in SSDT or on an Azure-SSIS IR in ADF
  • The Web source does not work on an Azure-SSIS IR with custom setup
  • The Oracle source only works via the Oracle ODBC driver on an Azure-SSIS IR


For more details read Power Query Source (Preview)

Conclusion
This new source was announced (and canceled) a long time ago, but it is finally available. This preview version is still very basic and still has some limitations. I'm not sure whether they will integrate an editor like in Excel and Power BI, but let's hope they will.


Sunday, 17 February 2019

Updated ADF Pipeline Activity for SSIS packages

Case
In April 2018 Microsoft added the SSIS pipeline activity to ADF. A couple of days ago they released a new version. What are the changes?
ADF - Execute SSIS Package activity

Solution
The big difference is that you can now select your packages and environment instead of typing the full paths manually, but there is also a new "Manual entries" option which allows you to set parameters (and connection managers) directly within the pipeline.
Selecting package and environment

See below the old version of the SSIS activity where you had to type the full paths manually.
The old manual entry

Conclusion
Again much easier than before, but in case of package errors still not very helpful: it still forces you to search for errors in the SSIS catalog. If you are using the ADF monitor to check for errors, I would probably still prefer the Stored Procedure activity.

Wednesday, 30 January 2019

Introducing Data Flows in Azure Data Factory

Case
Since last year we can run the good old SSIS packages in the Azure cloud. Not on a self created and self maintained virtual machine, but with the Integration Runtime service in Azure Data Factory.

However Microsoft also introduced Azure Databricks and called it ETL 2.0. So I'm not sure how much effort Microsoft will put in any future release of SSIS. But Azure Databricks is not a visual tool and not everybody is into coding. Is there an alternative?
Visual ETL in Azure

Solution
You can probably keep on using SSIS for another decade, but if indeed the development of SSIS stagnates while the rest of the data platform doesn't, then it is good to learn a couple of new tricks. Within the Microsoft Data Platform we have a couple of alternatives for SSIS:


  1. Azure Databricks
    As mentioned above this requires learning some new coding skills since this isn't a visual development tool. I will post an introduction in a later blog post.
  2. Azure Data Factory with Pipelines and T-SQL
    You could use the Copy Data activity in combination with the Stored Procedure activity and build all transformations in T-SQL. If you are a diehard SSIS developer then this is probably not your cup of tea.
  3. Power BI Dataflows
    The Power BI team just introduced self service ETL within Power BI. We just added a new post to explain this new option which will make some people very excited.
  4. Azure Data Factory Dataflows
    This is a new preview feature in Azure Data Factory to visually create ETL flows. Below I will show you the steps to create your own first simple Data Flow.


1) Request preview access
If you are reading this during the preview period (early 2019), then first request access via this form. With the Azure subscription GUID that you provide, Microsoft will turn this new feature on for that specific subscription.
Check 'Version' for the preview version

2) Create Databricks Service
Yes you are reading this correctly. Under the hood Data Factory is using Databricks to execute the Data flows, but don't worry you don't have to write code.
Create a Databricks Service and choose the right region. This should be the same as your storage region to prevent high data movement costs. As Pricing Tier you can use Standard for this introduction. Creating the service itself doesn't cost anything.
Create Databricks Service

3) Create Databricks cluster
Now go to the newly created service and click on the Launch Workspace button under the big Databricks logo in the center.
Launch Workspace

A new tab will open (just like with data factory) and in this workspace we need to create a cluster. A cluster is a collection of virtual machines that will do the ETL work for you. This is also where you start paying! Click on Clusters in the left menu and then on Create Cluster (or click on New Cluster under Common Tasks in the center).
Create new Databricks cluster

For the Cluster Mode use the default option: Standard. For the Runtime Version choose Runtime 5.0 (Scala 2.11, Spark 2.4.0) and the default Python version. The other option to change is the number of minutes of inactivity before terminating the machine. For this example 60 minutes should be enough to create and execute your new data flow. And most important for your wallet are the Worker Type and the number of workers. For this example one of the cheapest (Standard_DS3_v2) should be more than enough.
Create cluster manually

Note: In a future post I will show you how to create a cluster automatically. For now we will create one manually, but terminate it automatically when you are ready (= not doing anything with the cluster).
Creating a cluster takes a few minutes

4) Create Data Factory V2 with data flow (preview)
Create a new Data Factory and choose 'V2 with data flow (preview)' as the version. At the moment of writing this new feature is only available in 4 regions, but more regions will be added on its way from preview to general availability. So watch out with moving lots of data between regions. Also notice the option to add a couple of examples which will help you to master the Data Flows.
Only available in 4 regions at the moment

5) Add new Data Flow
Now go to the newly created Data Factory and click on Author & Monitor to go to the Data Factory portal.
Azure Data Factory, open portal

In the portal go to the Author page (pencil icon in the left menu) and then click on the three dots behind Data Flows and choose Add Dataflow. A new empty Dataflow will be created and we can start adding a source.
Add Dataflow

6) Add source
For this example I will use an existing file that is located in an Azure Blob Storage Container. Go to the new dataflow and click on the source to specify the file from the Blob Storage Container. First give the source a suitable name. In the second field you can find all existing datasets, but if you click on the +New button behind it, you can create a new dataset. Now you first need to give the dataset a name and then you can specify the connection to the Azure Blob Storage Container.
Adding source part 1

When creating a new Blob Storage Container dataset it only creates a link to the storage account itself. You still need to specify the container, filename and properties like delimiter and column names. To do that hit the Edit button and go to the Connections tab. Here you can specify all properties like location and file format. Next go to the Schema tab to specify the columns. You can do that manually or by clicking on the Import Schema button that will read the column names from the file. After this you can specify the datatypes for each column. Now the source dataset is ready, but we still have to map this to the source in the dataflow.
Adding source part 2

Go back to the new dataflow and go to the Define Schema tab. Use the Import from dataset button to import all columns from the dataset. The datatype is not copied from the dataset; I'm not sure whether this is deliberate or a bug. Now the source is ready and we can start adding transformations.
Adding source part 3

7) Add transformation
The number of different transformations is not yet as rich as those of the SSIS Data Flow Task, but the most important transformations are available and there are also some handy new ones like SurrogateKey and Window (for aggregation on time windows) which are not available in SSIS.

On the other hand the expression language in the Derived Column is much more comprehensive with for example functions for hashing and a lot of string functions which are not available in SSIS.

For this example I will add a Derived Column to uppercase the location column. Click on the little + behind the source to add a new transformation. Click on Derived Column and edit an existing column. When editing the expression a new window will appear. The upper method can be found under String and the column name can be found under Input. If you know SSIS then this should be very familiar.
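For example, assuming the column is called Location, the expression would be something like upper(Location).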
Adding a Derived Column transformation

8) Add destination
For this example I will use an existing table in an Azure SQL Database as destination. Click on the little + behind the Derived Column to add a new item to the dataflow. Scroll down to the bottom and click on Sink. Now you can add a new destination dataset (sink dataset) for this output. First give the output stream a name and then click on the + New button to add a dataset. Choose Azure SQL Server and give the dataset a name in the next step. After that we need to create a new connection to the database server. See the animated gifs for the details.
Destination - Add sink dataset and connection

When creating a new sink dataset and its connection, it only creates a link to the SQL Server database. You still need to specify the table name and columns. To do that hit the Edit button on the dataflow destination and go to the Connections tab. Here you can specify which table you want to use. Next go to the Schema tab to specify the columns. You can do that by clicking on the Import Schema button that will read the columns from the database table. Now the sink dataset is ready, but you still need to create a column mapping.
Destination - Choose table and columns

Go back to the dataflow, click on the destination and then go to the Mapping tab. If the column names in your dataflow are the same then you can use Auto Mapping. Otherwise turn off the auto mapping and do it manually.
Destination - Mapping

9) Debug
You probably already clicked on the Data Preview tab of the source, derived column or destination and saw that you first have to turn on debug mode.
Please turn on the debug mode and wait until cluster is ready to preview data

To do this we first need to get a new token from Azure Databricks to connect from Data Factory. Go to the Databricks portal and click on the person icon in the top right. Then choose User Settings and hit the Generate New Token button. Copy it and keep it safe in, for example, KeePass because you won't be able to retrieve it again.
Generate New Token in Azure Databricks

Now go back to your dataflow and switch on the debug mode in the top left of the dataflow. A new form appears on the right side where you can create a new linked service to the Databricks service. This is where you need the token from the previous step. For details see the animated gif. After that select the cluster that you created and hit the start button. Now it will connect to the cluster and you will be able to see the contents of the Data Preview tabs.
Debugging dataflow (not the actual speed)

10) Create pipeline
The dataflow is ready and we can now add it to a data factory pipeline with the new dataflow preview activity. You need to provide the name of the dataflow you want to execute, but also the link to the Azure Databricks Service. You can reuse the linked service from the previous step.
Adding dataflow to the pipeline (not the actual speed)

And the result viewed with SSMS

Summary
Microsoft created a new visual ETL tool with lots of promising features that could become a great successor for SSIS. However it does not yet have the Visual Studio feeling like SSIS has, so that will take some time to get used to. Debugging is another thing: it works, but takes me far too much waiting time at the moment. It will probably be faster with a more expensive cluster, but I'm not sure it will match the SSDT experience. Besides these two notes, I can't wait to get some actual working experience with it.