Case
How do I execute a package with the new Scale Out function? I don't see any options when executing a package.
Find the Scale Out options
Solution
The new Scale Out execution is not (yet) integrated in the standard package execution window, and the Execute Package Task has not changed either: it will always execute the package on the same worker as the parent package. Both will probably change within a couple of CTP releases.
Catalog
If you right click on SSISDB within the catalog then you will see the new context menu item "Execute in Scale Out..."
Execute in Scale Out...
Next you can choose which packages to execute and on which worker servers.
Execute in Scale Out
After hitting the OK button no report is shown, unlike in a regular execution, but you can find the reports via the context menu of the Catalog.
No open report option
And in the report you can see which 'random' worker executed the particular package.
The Machine property shows which worker was used
Conclusions
Nice first version of the Scale Out. Hopefully the next CTP contains a new version of the Execute Package Task and an integration of the regular execution and scale out execution. Please try it out and let me (or Microsoft) know what you think about it.
Some considerations: because the worker service uses a local system account, you might want to change that to a domain account, or use other options like a proxy or a database user. Another concern is the firewall if you're using a local database on the master, and local paths (d:\myfiles\) on the master won't work either.
PublicKeyToken=89845dcd8080cc91' or one of its dependencies.
The system cannot find the file specified. (mscorlib)
Solution
SSDT 17.0 RC1 still has some bugs; for real projects you should use SSDT 16.5. But if you want to explore, for example, the Scale Out function or the support for Microsoft Dynamics Online Resources of SQL VNEXT, then you could use this version.
Three solutions (in order of recommendation):
Since this bug only occurs in SSDT, you could deploy outside SSDT with PowerShell or by just double-clicking the ISPAC file; then this error won't occur (see the PowerShell sketch below this list).
It's a known issue. Just wait for the next release of SSDT (early 2017).
Or add the missing reference to <drive>:\Program Files (x86)\Microsoft Visual Studio 14.0\Common7\IDE\devenv.exe.config and restart SSDT (at your own risk, of course).
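For the first option, a minimal PowerShell deployment sketch could look like this (it assumes an existing SSISDB folder; the server, folder, project name and ispac path are just examples):
# Load the IntegrationServices assembly and connect to the SSIS server
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Management.IntegrationServices") | Out-Null
$SqlConn = New-Object System.Data.SqlClient.SqlConnection "Data Source=MyServer;Initial Catalog=master;Integrated Security=SSPI;"
$Ssis = New-Object Microsoft.SqlServer.Management.IntegrationServices.IntegrationServices $SqlConn
# Deploy the ispac file to an existing folder in the catalog
$Folder = $Ssis.Catalogs["SSISDB"].Folders["MyFolder"]
[byte[]]$IspacBytes = [System.IO.File]::ReadAllBytes("C:\Projects\MyProject.ispac")
$Folder.DeployProject("MyProject", $IspacBytes) | Out-Null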
Case
SQL VNext has a new Scale Out function. How does it work and how do you install and configure that?
Solution
The new Scale Out option in SSIS VNEXT gives you the ability to execute multiple packages distributed to multiple worker machines. You can select multiple packages on the master that will be executed in parallel by one or more worker machines.
Machine setup
To make sense of a Scale Out you of course need multiple machines. We need a master machine and one or more worker machines. Because the master distributes the executions and doesn't execute packages itself, you may want to consider installing a worker on the same machine as the master to make use of its resources. The worker machines only have a worker installed; a SQL Server engine installation is not necessary on a worker.
Option 1: Master only distributes executions
Option 2: Master also executes packages itself
For this example I will use option 1, with a master and two separate workers on three Hyper-V machines. All machines are identical: Windows Server 2012 R2 with all updates installed.
HyperV Machines
Download
Before installing, you first need to download SQL Server VNEXT and the associated SSMS and SSDT.
Installation steps:
Install SQL Server VNEXT on master
Configure firewall on master
Copy master certificate to workers
Install SQL Server VNEXT on workers
Copy worker certificates to master
Install SSMS VNEXT to add Catalog on master
Install worker certificates on master
Enable scale out workers on master
Installation step 1: install SQL Server VNEXT on master
Install SQL Server VNEXT on the 'Master' machine called "SQLVNEXT_M". Below are the most important steps of the installation. At the bottom all screens are shown in a movie.
We need the Database Engine to store the SSISDB
We need SSIS and the Scale Out Master
SQL Server Authentication mode is required on the SSISDB
Choose port 8391 and create a new SSL certificate
All steps
Installation step 2: configure firewall on master
Open a firewall port on the Scale Out Master. We need at least an inbound rule for TCP port 8391, the EndPoint port supplied in the previous step; a complete list of all SQL Server ports can be found here.
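If you prefer to script this, a minimal PowerShell sketch run on the master could be (the rule name is just an example):
# Allow inbound traffic to the Scale Out Master EndPoint port
New-NetFirewallRule -DisplayName "SSIS Scale Out Master (TCP 8391)" -Direction Inbound -Protocol TCP -LocalPort 8391 -Action Allow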
Installation step 3: copy master certificate to workers
We created a new certificate during installation of the Scale Out Master. You can find it in <drive>:\Program Files\Microsoft SQL Server\140\DTS\Binn. We need that certificate during installation of the Scale Out Workers. So copy it to the worker machines.
Copy SSISScaleOutMaster.cer to worker
Copy SSISScaleOutMaster.cer to worker
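Copying can of course also be scripted; a sketch with hypothetical target folders, run from the master and assuming the admin shares are reachable:
# Copy the master certificate to both workers
$Cert = "C:\Program Files\Microsoft SQL Server\140\DTS\Binn\SSISScaleOutMaster.cer"
Copy-Item $Cert "\\SQLVNEXT_W1\c$\Temp\"
Copy-Item $Cert "\\SQLVNEXT_W2\c$\Temp\"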
Installation step 4: install SQL Server VNEXT on workers
Install SQL Server VNEXT on the 'Worker' machines called "SQLVNEXT_W1" and "SQLVNEXT_W2". Below are the most important steps of the installation. At the bottom all screens are shown in a movie.
Only select SSIS and Scale Out Worker (no engine needed)
Add an EndPoint like https://SQLVNEXT_M:8391. This is the name of the Scale Out Master machine and the port chosen during the installation of the Scale Out Master.
The certificate is the one you copied from the Scale Out Master in one of the previous steps.
All steps
Installation step 5: copy worker certificates to master
During installation of the Scale Out Worker machines, certificates were created which we need to register on the machine with the Scale Out Master. With these certificates the Scale Out Master can authenticate the Scale Out Workers. You can find SSISScaleOutWorker.cer in <drive>:\Program Files\Microsoft SQL Server\140\DTS\Binn (repeat this for all workers).
Copy SSISScaleOutWorker.cer to master
Copy SSISScaleOutWorker.cer to master
Installation step 6: install SSMS VNEXT to add Catalog on master
Now we need to add a catalog to the master. To do this you need to install SSMS VNEXT first. For this demo situation I installed SSMS VNEXT on the master machine.
Install SSMS VNEXT
Add catalog as you normally do, but notice the extra option:
Enable this server as SSIS scale out master
Add catalog
Installation step 7: install worker certificates on master
Now we need to install all Scale Out Worker Certificates on the Scale Out Master machine. They should be stored in the Trusted Root Certification Authorities. Repeat the steps below for all Worker Certificates.
Store certificates in Trusted Root Certification Authorities
All Steps
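Instead of the certificate wizard you could also script this step; a sketch assuming a worker certificate was copied to C:\Temp on the master (hypothetical path), run in an elevated PowerShell session:
# Import the worker certificate into the Trusted Root Certification Authorities store of the local machine
Import-Certificate -FilePath "C:\Temp\SSISScaleOutWorker.cer" -CertStoreLocation Cert:\LocalMachine\Root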
Installation step 8: enable scale out workers on master
Make sure the SSISScaleOutWorker140 service is started on the worker machines and the SSISScaleOutMaster140 service on the master. Then start SSMS, connect to the SQL Server instance on the master and execute the following query:
-- Get Worker info
SELECT * FROM [SSISDB].[catalog].[worker_agents]
It could take a few minutes before the worker machines are registered. Once that happens the query should return records. Use the values from the WorkerAgentId column in the Stored Procedure call below to enable the Scale Out Workers.
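The call is sketched below; replace the placeholder with a WorkerAgentId value returned by the query and repeat it for every worker:
-- Enable a Scale Out Worker
EXEC [SSISDB].[catalog].[enable_worker_agent] @WorkerAgentId = '<WorkerAgentId>'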
Now you're ready to deploy your first project and execute packages with the new Scale Out function. Also see the Microsoft walkthrough for the scale out setup.
Conclusion
The master-worker setup is a great way to distribute package executions over multiple servers. Whether you choose to upgrade your existing SSIS server with more memory and more cores, or go for the new master-worker setup, probably depends on the licensing model. But when you have already maxed out the hardware of your current SSIS server, then this new master-worker setup is an easy way to upgrade.
And what about a future Scale Out to Azure? That could be useful when a weekly, monthly or quarterly run is taking too much time, or when one of your worker servers is down (for maintenance?).
In 2006 Jamie Thomson came up with naming conventions for SSIS tasks and data flow components. These naming conventions make your packages and logs more readable. Five SQL Server versions and a decade later, a couple of tasks and components have been deprecated, but a lot of new tasks and components have also been introduced by Microsoft.
Case
I used PowerShell to deploy my SSIS project to the Catalog. Can I also automatically create a SQL Server Agent job with an SSIS jobstep?
SQL Agent Job for SSIS package
Solution
Yes, almost every Microsoft product supports PowerShell and SQL Server Agent is no exception. Only the SSIS-specific part of the jobstep seems to be a little more difficult to handle. So for this example I first created a SQL Server Agent job(step) for an SSIS package manually in SSMS and then scripted it to see the jobstep command. This command is a long string with all the SSIS-specific information like the package path, environment and parameters. We are interested in the part behind @command= in the generated T-SQL script.
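The generated script itself is not reproduced here, but the relevant fragment of such a scripted jobstep looks roughly like this (an illustrative reconstruction that borrows the parameter values used further down; the @command string is shortened):
EXEC msdb.dbo.sp_add_jobstep @job_name = N'Load DWH', @step_name = N'PowerShell and SSIS - Master.dtsx',
    @subsystem = N'SSIS',
    @command = N'/ISSERVER "\"\SSISDB\Finance\PowerShell and SSIS\Master.dtsx\"" /SERVER "\"MyServer\MSSQLSERVER2016\"" ... /CALLERINFO SQLAGENT /REPORTING E'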
This command string is used in the PowerShell script below, but the hardcoded parts are replaced with values from the PowerShell parameters (see the Replace calls in the script). The rest of the script is more straightforward and easy to adjust or extend. If you're not sure how to adjust the script, first take a look at the generated T-SQL script, which has similar steps and sets the same properties.
#PowerShell SSIS JobStep
################################
########## PARAMETERS ##########
################################
# Destination
$SsisServer = "MyServer\MSSQLSERVER2016"
$FolderName = "Finance"
$ProjectName = "PowerShell and SSIS"
# Job
$JobName = "Load DWH"
$MasterPackage = "Master.dtsx"
$JobStartTime = New-TimeSpan -hours 6 -minutes 30
clear
Write-Host "========================================================================================="
Write-Host "== Used parameters =="
Write-Host "========================================================================================="
Write-Host "SSIS Server : " $SsisServer
Write-Host "FolderName : " $FolderName
Write-Host "ProjectName : " $ProjectName
Write-Host "Job name : " $JobName
Write-Host "MasterPackage : " $MasterPackage
Write-Host "ScheduleTime : " $JobStartTime
Write-Host "========================================================================================="
Write-Host ""
# Reference SMO assembly and connect to the SQL Server Instance
# Check the number in the path which is different for each version
Add-Type -Path 'C:\Program Files\Microsoft SQL Server\130\SDK\Assemblies\Microsoft.SqlServer.Smo.dll'
$SQLSvr = New-Object -TypeName Microsoft.SQLServer.Management.Smo.Server($SsisServer)
# Check if job already exists. Then fail, rename or drop
$SQLJob = $SQLSvr.JobServer.Jobs[$JobName]
if ($SQLJob)
{
# Use one of these 3 options to handle existing jobs
# Fail:
#Throw [System.Exception] "Job with name '$JobName' already exists."
# Rename:
Write-Host "Job with name '$JobName' found, renaming and disabling it"
$SQLJob.Rename($SQLJob.Name +"_OLD_" + (Get-Date -f MM-dd-yyyy_HH_mm_ss))
$SQLJob.IsEnabled = $false
$SQLJob.Alter()
# Drop:
#Write-Host "Job with name $JobName found, removing it"
#$SQLJob.Drop()
}
#Create new (empty) job
$SQLJob = New-Object -TypeName Microsoft.SqlServer.Management.SMO.Agent.Job -argumentlist $SQLSvr.JobServer, $JobName
$SQLJob.OwnerLoginName = "SA"
$SQLJob.Create()
Write-Host "Job '$JobName' created"
# Command of jobstep
# This string is copied from T-SQL, by scripting a job(step) in SSMS
# Then replace the hardcode strings with [NAME] to replace them with variables
$Command = @'
/ISSERVER "\"\SSISDB\[FOLDER]\[PROJECT]\[PACKAGE]\"" /SERVER "\"[INSTANCE]\"" /Par "\"$ServerOption::LOGGING_LEVEL(Int16)\"";1 /Par "\"$ServerOption::SYNCHRONIZED(Boolean)\"";True /CALLERINFO SQLAGENT /REPORTING E
'@
$Command = $Command.Replace("[FOLDER]", $FolderName)
$Command = $Command.Replace("[PROJECT]", $ProjectName)
$Command = $Command.Replace("[PACKAGE]", $MasterPackage)
$Command = $Command.Replace("[INSTANCE]", $SsisServer)
# Create new SSIS job step with command from previous block
$SQLJobStep = New-Object -TypeName Microsoft.SqlServer.Management.SMO.Agent.JobStep -argumentlist $SQLJob, "$ProjectName - $MasterPackage"
$SQLJobStep.OnSuccessAction = [Microsoft.SqlServer.Management.Smo.Agent.StepCompletionAction]::QuitWithSuccess
$SQLJobStep.OnFailAction = [Microsoft.SqlServer.Management.Smo.Agent.StepCompletionAction]::QuitWithFailure
$SQLJobStep.SubSystem = "SSIS"
$SQLJobStep.DatabaseName = $SsisServer
$SQLJobStep.Command = $Command
$SQLJobStep.Create()
Write-Host "Jobstep $SQLJobStep created"
# Create a daily schedule
$SQLJobSchedule = New-Object -TypeName Microsoft.SqlServer.Management.Smo.Agent.JobSchedule -ArgumentList $SQLJob, "Daily $JobStartTime"
$SQLJobSchedule.IsEnabled = $true
$SQLJobSchedule.FrequencyTypes = [Microsoft.SqlServer.Management.SMO.Agent.FrequencyTypes]::Daily
$SQLJobSchedule.FrequencyInterval = 1 # Recurs Every Day
$SQLJobSchedule.ActiveStartDate = Get-Date
$SQLJobSchedule.ActiveStartTimeofDay = $JobStartTime
$SQLJobSchedule.Create()
Write-Host "Jobschedule $SQLJobSchedule created"
# Apply to target server which can only be done after the job is created
$SQLJob.ApplyToTargetServer("(local)")
$SQLJob.Alter()
Write-Host "Job '$JobName' saved"
You could combine this with the deploy script to handle the complete SSIS deployment in one script.
Recently
I presented an SSIS & PowerShell session at the SQL Serverdays in Schelle, Belgium and SQL Saturday in Utrecht, The Netherlands. You can download the PowerPoints:
SSIS Appetizer
I'm not sure I have a purpose for this, but did you know that you can use the cache file of the Cache Transformation (introduced in SSIS 2008) as a source file in the Raw File Source?
Demo
For this demo I use two Data Flow Tasks. The first creates the cache file and the second one uses it as a source.
Two Data Flow Tasks
1) Create Cache
The first Data Flow has a random source (a flat file in this case) and a Cache Transformation named "CTR - Create Cache" as a destination. When you create the Cache Connection Manager, make sure to check "Use file cache" and provide a file path for the cache file. Copy the path for the next step.
The Cache Transformation and Connection Manager
2) Read Cache
The second Data Flow Task uses a Raw File Source. In the editor you can specify the location of the Raw File; paste the path from the Cache Connection Manager (a .caw file). For demonstration purposes I added a dummy Derived Column behind it, with a Data Viewer on the path between them. Now run the package and see the result. You will get some hash columns 'for free'.
Raw File Source
Please let me know in the comments if you found a good purpose for this.
Note: you can't use a raw file as a cache file unless you're able to add the extra hash columns as well.
Case
I want to extract data from a SAS database file (*.sas7bdat). How do I do that in SSIS?
Solution
This is possible but not out of the box. You need to install an extra provider to accomplish this.
1) Download SAS Provider
First you need to download and install the SAS Providers for OLE DB. There are multiple versions; make sure to download the correct version (otherwise you get error messages like "This application does not support your platform"). You only need to select the SAS Providers for OLE DB.
Install SAS Providers for OLE DB
2) Setup OLE DB Connection Manager
After installation the new provider will be available in OLE DB Connection Manager editor. Make sure to choose "SAS Local Data Provider X.X". This is the provider that can read SAS database files (*.sas7bdat).
SAS Local Data Provider 9.3
The second important step in the setup is to select the folder where the sas7bdat files are located. Don't select a file! All files will appear as tables in the OLE DB Source component. In my case I could leave the User name and Password fields empty because I already had access to the folder (but I'm not a SAS expert).
Fill in folderpath in Server or file name field
3) Setup OLE DB Source Component
Now you can use a regular OLE DB Source Component to extract data from SAS. However there are two concerns. When you select a table and close the editor you will get a warning that there is something wrong with the code page.
Cannot retrieve the column code page info from the OLE DB provider.
If the component supports the "DefaultCodePage" property, the code page
from that property will be used. Change the value of the property if the
current string code page values are incorrect. If the component does not
support the property, the code page from the component's locale ID will
be used.
After clicking OK there will be a warning icon in the OLE DB Source Component, which you can remove by setting the "AlwaysUseDefaultCodePage" property to true.
Before and after changing AlwaysUseDefaultCodePage
The second concern is more annoying: all datatypes will be DT_STR (ANSI string) or DT_R8 (float). You cannot change this, so you need to add a data conversion.
Date(time)s are also numbers: dates are the number of days since January 1, 1960, and datetimes are the number of seconds since January 1, 1960, where any decimals are used for milliseconds. A Derived Column expression for a date could look something like:
DATEADD("DD", (DT_I4)[mydatecolumn], (DT_DATE)"1960-01-01")
All string or float
Tip: you can also use BIML to create SSIS packages with a SAS7BDAT source.
Case
I recently created packages with a SAS source, but now I want to use the same SAS source in my BIML Script. But I'm getting an error that the Local Provider doesn't support SQL. How can I solve this?
Error 0 : Node OLE_SRC - DIM_TIJD:
Could not execute Query on Connection PROFIT1:
SELECT * FROM DIM_TIJD
The Local Provider does not currently support SQL processing.
Solution
There is NO easy solution for this. The provider doesn't support SQL queries, and that's what the BIML engine does first to get the metadata from the source table. Luckily there is a search-and-replace workaround. A lot of extra work, but still much easier than creating all packages by hand!
1) Mirror database in SQL Server
I used the metadata from SAS to get all tables and columns, which I then used to create (empty/dummy) SQL Server tables with the same metadata as SAS (the datatype is either varchar or float). The tool to get the SAS metadata is SAS Enterprise Guide. It lets you export the metadata to, for example, Excel, which you can then use to create the dummy tables.
A little script created by a SAS developer to get metadata
Metadata export example in Excel
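As an illustration, such a dummy table could look like the T-SQL below (the column names are invented; only the metadata matters and the table stays empty):
CREATE TABLE dbo.DIM_TIJD (
    TIJD_KEY float NULL,
    DATUM float NULL,
    OMSCHRIJVING varchar(50) NULL
)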
2) BIML
Instead of the SAS OleDB connection manager I used a temporary SQL Server OleDB connection manager, but I also kept the SAS OleDB connection manager in my BIML code and gave both the same name with a different number at the end (easier to replace later on).
BIML Connection Managers
Because the SAS OleDB connection manager isn't used in the BIML code, it won't be created by the BIML engine. To force its creation, I used a second Connections tag between </Tasks> and </Package>. It also lets me give both connection managers nearly the same GUID (easier to replace later on).
BIML Force create connection managers
The end result of the BIML script:
A whole bunch of packages that use the SQL Server database as a source (instead of SAS DB)
Two connection managers with nearly the same name and GUID (SAS OleDB and SQL OleDB)
3) Search and Replace
Now you must open all generated packages by using View Code (instead of View Designer). When all packages are opened you can use Search and Replace to change the name and GUID in all packages. Make sure you don't replace too much, which could damage your generated packages. Then save all changes and close all packages. Next open your packages in the designer to view the result.
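If you prefer to script this step instead of doing it in the editor, a rough PowerShell sketch could perform the replace across the generated packages (the folder, connection manager names and GUIDs are placeholders):
# Replace the temporary SQL OleDB name/GUID with the SAS OleDB name/GUID in every generated package
Get-ChildItem "C:\MyBimlOutput\*.dtsx" | ForEach-Object {
    (Get-Content $_.FullName -Raw).Replace("PROFIT2", "PROFIT1").Replace("<SQL OleDB GUID>", "<SAS OleDB GUID>") |
        Set-Content $_.FullName
}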
Tip: you can also use the same metadata (and a big if-then-else construction) to create a derived column in BIML that casts all float columns to the correct datatypes (int, date, decimal, etc.).