I have a list of teacher and student combinations with one combination per record and I want to created a comma delimited list of students per teacher.

Solution
If your source is a database then the easiest solution is a TSQL query like this:
--TSQL Query WITH UniqueTeachers AS ( SELECT DISTINCT Teacher FROM TeacherStudentTable ) SELECT Teacher , Students = STUFF(( SELECT ',' + Student FROM TeacherStudentTable WHERE Teacher = UniqueTeachers.Teacher ORDER BY Student FOR XML PATH(''), TYPE).value('.','varchar(100)'), 1, 1, '') FROM UniqueTeachers ORDER BY Teacher
![]() |
The query in SSIS as source |
If your source is for example a flat file or a database that doesn't support a query like this, then there are also options within SSIS. For this solution I use a asynchronous Script Component.
1) Sorted source
We need a sorted source because we are comparing records with each other. In this case make sure the source is sorted on teacher first (and optional secondly on student).
![]() |
Add Sort transformation if source isn't sorted |
2) Script Component
Add a Script Component (type transformation) and select the Teacher and Student columns as ReadOnly input columns.
![]() |
Input columns: Teacher and Student |
3) Asynchronous
We need to make the Script Component asynchronous because it throws out a different number of rows than there are incomming. Go to the Inputs and Outputs pane, click on Output 0 and change the SynchronousInputID to None.
![]() |
Asynchonous |
4) Output
We now need to create an output for the Script Component. Expand the Output 0 and add two columns:
Teacher (same data type and size as the input column teacher)
Students (same data type as the input column student, but larger to fit multiple student names)
![]() |
Output columns |
5) The Script
Copy the three variables and the two methods to your Script Component (and remove any other existing methods).
// C# Code using System; using System.Data; using Microsoft.SqlServer.Dts.Pipeline.Wrapper; using Microsoft.SqlServer.Dts.Runtime.Wrapper; [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute] public class ScriptMain : UserComponent { bool initialRow = true; // Indicater for the first row string teacher = ""; // Name of the teacher to track teacherchanges between rows string students = ""; // The comma delimited list of students public override void Input0_ProcessInput(Input0Buffer Buffer) { // Loop through buffer while (Buffer.NextRow()) { // Process an input row Input0_ProcessInputRow(Buffer); // Change the indicator after the first row has been processed initialRow = false; } // Check if this is the last row if (Buffer.EndOfRowset()) { // Fill the columns of the existing output row with values // from the variable before closing this Script Component Output0Buffer.Teacher = teacher; Output0Buffer.Students = students; } } public override void Input0_ProcessInputRow(Input0Buffer Row) { if (initialRow) { // This is for the first input row only // Create a new output row Output0Buffer.AddRow(); // Now fill the variables with the values from the input row teacher = Row.Teacher; students = Row.Student; } else if ((!initialRow) & (teacher != Row.Teacher)) { // This isn't the first row, but the teacher did change // Fill the columns of the existing output row with values // from the variable before creating a new output row Output0Buffer.Teacher = teacher; Output0Buffer.Students = students; // Create a new output row Output0Buffer.AddRow(); // Now fill the variables with the values from the input row teacher = Row.Teacher; students = Row.Student; } else if ((!initialRow) & (teacher == Row.Teacher)) { // This isn't the first row, and the teacher did not change // Concatenate the studentsname to the variable students += "," + Row.Student; } } // Little explanation: // Rows are created in memory with .AddRow() // and will be submitted to the output when a // new / subsequent row is created or when // the last buffer has been finished. }
or in VB.Net
'VB.Net code Imports System Imports System.Data Imports System.Math Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper Imports Microsoft.SqlServer.Dts.Runtime.Wrapper <Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute()> _ <CLSCompliant(False)> _ Public Class ScriptMain Inherits UserComponent Private initialRow As Boolean = True ' Indicater for the first row Private teacher As String = "" ' Name of the teacher to track teacherchanges between rows Private students As String = "" ' The comma delimited list of students Public Overrides Sub Input0_ProcessInput(Buffer As Input0Buffer) ' Loop through buffer While Buffer.NextRow() ' Process an input row Input0_ProcessInputRow(Buffer) ' Change the indicator after the first row has been processed initialRow = False End While ' Check if this is the last row If Buffer.EndOfRowset() Then ' Fill the columns of the existing output row with values ' from the variable before closing this Script Component Output0Buffer.Teacher = teacher Output0Buffer.Students = students End If End Sub Public Overrides Sub Input0_ProcessInputRow(Row As Input0Buffer) If initialRow Then ' This is for the first input row only ' Create a new output row Output0Buffer.AddRow() ' Now fill the variables with the values from the input row teacher = Row.Teacher students = Row.Student ElseIf (Not initialRow) And (teacher <> Row.Teacher) Then ' This isn't the first row, but the teacher did change ' Fill the columns of the existing output row with values ' from the variable before creating a new output row Output0Buffer.Teacher = teacher Output0Buffer.Students = students ' Create a new output row Output0Buffer.AddRow() ' Now fill the variables with the values from the input row teacher = Row.Teacher students = Row.Student ElseIf (Not initialRow) And (teacher = Row.Teacher) Then ' This isn't the first row, and the teacher did not change ' Concatenate the studentsname to the variable students += "," & Convert.ToString(Row.Student) End If End Sub ' Little explanation: ' Rows are created in memory with .AddRow() ' and will be submitted to the output when a ' new / subsequent row is created or when ' the last buffer has been finished. End Class
Note: You can change the delimiter in C# line 70 and VB.Net line 64.
6) Testing
For testing purposes I added a derived column and a couple of data viewer.
![]() |
The result |
Conclusion: both methods have the same result. For a SQL database source, the T-SQL method is probably a little faster. If you can do the sort in the source the performance differences will diminish.
In one of my next blog posts I will show you how to split a comma separated list in to records.
Thanks a lot for posting this, made a task I was working on so much easier.
ReplyDelete