Sunday, 28 February 2016

Searching in packages with PowerShell

Case
My client asked me to make an overview of packages that use stored procedures from a certain database. Of course I can open each package and edit each task/transformation that can use stored procedures. But they have dozens of SSIS solutions with each multiple projects and a lot of packages.

The overview will probably be outdated before I can finish reviewing all packages. Is there a smarter, faster and less tedious solution?

Solution
PowerShell to the rescue! I made a small script that loops through all packages in a certain folder and then searches for certain strings in the source code of the packages. Those search strings are in an array that can be filled manually or automatically with a script that gets all stored procedures from a database.

#PowerShell: Example 1 with hardcoded list of stored procedures
# Where are all my packages
$PackagePath = "H:\My Documents\Visual Studio 2013\projects\TFS"
 
# Where should we save the report
$ReportPath = "H:\My Documents\packagestoredprocedures.csv"
 
# Which stored procedures do we need to find
$StoredProcedures = @("pr_Forced_append","pr_DV_HUB","pr_DV_LINK")

#################################################################################################
# No need to edit below this line
#################################################################################################

# Declare array to store result in
$PackageSPArray = @("Solution;Project;Package;StoredProcedure")
 
# Get all SSIS packages in the configured folder and in all its subfolders
Get-ChildItem $PackagePath -Filter “*.dtsx” -Recurse | Where-Object { $_.Attributes -ne “Directory”} | 
ForEach-Object {
    # Ignore packages in the object folder
    If (!$_.FullName.Contains("\obj\"))
    {
        # Read entire package content like it's a textfile
        $PackageXml = Get-Content $_.FullName
 
        # Loop through Stored Procedure array
        ForEach ($StoredProcedure in $StoredProcedures)
        {
            # Optionally write search to screen
            # Write-Host ("Looking for " + $StoredProcedure + " in package " + $_.Name)
            
            # Check if it contains the stored procedure
            If ($PackageXml | Select-String -Pattern $StoredProcedure)
            {
                # Optionally write find to screen
                # Write-Host ("Found " + $StoredProcedure.ToString() + " in package " + $_.Name.ToString())
 
                # Filling array: Solution;Project;Package;StoredProcedure
                $PackageSPArray  += $_.Directory.Parent.Name + ";"+ $_.Directory.Name + ";"  + $_.Name + ";"  + $StoredProcedure
            }
        }
    }
}
# Optionally write result to screen
# $PackageSPArray | ForEach-Object {$_}
 
# Write result to file
$PackageSPArray | % {$_} | Out-File  $ReportPath 


#PowerShell: Example 2 with query to get stored procedures
# Where are all my packages
$PackagePath = "H:\My Documents\Visual Studio 2013\projects\TFS"

# Which server do we connect to to find stored procedures
$SqlServer = "MyServer\SQL2014"

# Which database do we connect to to find stored procedures
$Database = "DWH" 

# Where should we save the report
$ReportPath = ("H:\My Documents\packagestoredprocedures" + $Database.ToString() + ".csv")

#################################################################################################
# No need to edit below this line
#################################################################################################
 
# Query database to find stored procedures in information_schema
$StoredProceduresQry = @(Invoke-SQLCmd -query ("SELECT ROUTINE_NAME as Name from " + $Database.ToString() + ".INFORMATION_SCHEMA.ROUTINES WHERE ROUTINE_TYPE = 'PROCEDURE' and ROUTINE_NAME NOT LIKE 'sp_%' order by 1") -Server $SqlServer)

# Store query result in array
$StoredProcedures = @($StoredProceduresQry | select-object -ExpandProperty Name)

# Declare array to store result in
$PackageSPArray = @("Solution;Project;Package;StoredProcedure")
 
# Get all SSIS packages in the configured folder and in all its subfolders
Get-ChildItem $PackagePath -Filter “*.dtsx” -Recurse | Where-Object { $_.Attributes -ne “Directory”} | 
ForEach-Object {
    # Ignore packages in the object folder
    If (!$_.FullName.Contains("\obj\"))
    {
        # Read entire package content like it's a textfile
        $PackageXml = Get-Content $_.FullName
 
        # Loop through Stored Procedure array
        ForEach ($StoredProcedure in $StoredProcedures)
        {
            # Optionally write search to screen
            # Write-Host ("Looking for " + $StoredProcedure + " in package " + $_.Name)
            
            # Check if it contains the stored procedure
            If ($PackageXml | Select-String -Pattern $StoredProcedure)
            {
                # Optionally write find to screen
                # Write-Host ("Found " + $StoredProcedure.ToString() + " in package " + $_.Name.ToString())
 
                # Filling array: Solution;Project;Package;StoredProcedure
                $PackageSPArray  += $_.Directory.Parent.Name + ";"+ $_.Directory.Name + ";"  + $_.Name + ";"  + $StoredProcedure
            }
        }
    }
}
# Optionally write result to screen
# $PackageSPArray | ForEach-Object {$_}
 
# Write result to file
$PackageSPArray | % {$_} | Out-File  $ReportPath 


There are a few considerations when using this script:
  • It's looking for pieces characters! If your stored procedure has a very general name it might find that text where it isn't used as a stored procedure.
  • It isn't honoring connection managers. If the same stored procedure is used in another database it will find it.
  • It isn't honoring schemas. If you have a stored procedure multiple times in your database, but with different schemas it can't tell the difference.
  • If your have a lot of stored procedures in your database and you have a lot of packages it could take a while, but it will always be faster then checking each package manually
Conclusion: the script isn't very intelligent, but very useful and fast for this case. All problems above can be solved, but then the script will be much bigger and take a lot of time to develop.




No comments:

Post a Comment

Please use the SSIS MSDN forum for general SSIS questions that are not about this post. I'm a regular reader of that forum and will gladly answer those questions over there.

All comments are moderated manually to prevent spam.