Monthly Archives: August 2011

Improve your SSIS package’s performance

Hello everyone.

I spent almost the whole of last week and the first two days of this week trying to improve my BI solution's performance. In my quest to learn the tricks that make a package faster, I came across the SSIS Performance Design Patterns video by Matt Masson. A comprehensive discussion indeed, which I'm gonna summarize briefly here:
1- Utilize parallelism: It is easy to utilize parallelism in SSIS. All you need to do is recognize which Data Flow Tasks (DFTs) can start at the same time and set your package's control flow precedence constraints so that they can all run simultaneously. Keep in mind that the package's MaxConcurrentExecutables property caps how many tasks actually run in parallel (its default of -1 means the number of logical processors plus 2).

2- Synchronous vs. Asynchronous components: A synchronous transformation in SSIS takes a buffer, processes it, and passes the result on without waiting for the next buffer to come in. An asynchronous transformation, on the other hand, needs to consume all of its input data before it can produce any output. This can cause serious performance issues when the input to the asynchronous transformation is too big to fit in memory and has to be spilled to disk in multiple stages.

3- Execution tree: An execution tree starts where a buffer starts and ends where the same buffer ends. These execution trees specify how buffers and threads are allocated in the package. Each tree creates a new buffer and may execute on a different thread.  When a new buffer is created such as when a partially blocking or blocking transformation is added to the pipeline, additional memory is required to handle the data transformation; however, it is important to note that each new tree may also give you an additional worker thread.

4- OLE DB Command transformation: OLE DB Command is a row-by-row transformation, meaning that it runs its command once for every input row. This makes it damn slow as the number of rows goes up. The fix is to stage the data into a temporary table inside the DFT and then run a single set-based statement in an Execute SQL Task outside it, roughly like the sketch below.
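A minimal sketch of that pattern; the table and column names (Customer, Customer_Staging, Email) are hypothetical placeholders:

-- In the DFT, bulk-load the rows into a staging table with an
-- OLE DB Destination instead of firing OLE DB Command per row.
-- Then one set-based statement in an Execute SQL Task does the work:
UPDATE c
SET    c.Email = s.Email
FROM   dbo.Customer AS c
JOIN   dbo.Customer_Staging AS s
  ON   s.CustomerID = c.CustomerID;

TRUNCATE TABLE dbo.Customer_Staging;  -- ready for the next run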

5- SQL Server Destination vs. OLE DB Destination: There are multiple reasons to use OLE DB Destination rather than SQL Server Destination:

  • OLE DB Destination is usually faster,
  • OLE DB Destination is a lot clearer when it fails (The error message is more helpful),
  • SQL Server Destination works only when SSIS is installed on the destination server.

6- Change Data Capture (CDC): Reduce the amount of data to be transferred as much as you can, and do it as close to the source as possible. A Modified On (last-updated) column on the source table(s) helps a lot here, as in the sketch below.
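A hedged sketch of such an incremental extract; the ModifiedOn column and the high-water-mark variable are hypothetical names:

-- Parameterized query for the DFT's OLE DB Source; the ? placeholder
-- maps to an SSIS variable holding the previous load's high-water mark.
SELECT CustomerID, Name, Email, ModifiedOn
FROM   dbo.Customer
WHERE  ModifiedOn > ?;   -- only new or changed rows enter the pipeline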

7- Slowly Changing Dimension (SCD) transformation: There is only one piece of advice about SSIS's Slowly Changing Dimension transformation, and that is: get rid of it (a set-based alternative is sketched right after this list)! The reasons are:

  • It doesn’t use any cached data, and goes to the data source every single time it is called,
  • It uses many OLE DB Command transformations,
  • Fast Data Load is off by default on its OLE DB Destination.
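The usual replacement is to stage the incoming dimension rows and run a single MERGE in an Execute SQL Task. A minimal sketch for a Type 1 (overwrite) dimension, with hypothetical table and column names:

-- DimCustomer_Staging is loaded by the DFT; one MERGE then replaces
-- the SCD transformation's row-by-row lookups and updates.
MERGE dbo.DimCustomer AS tgt
USING dbo.DimCustomer_Staging AS src
   ON tgt.CustomerID = src.CustomerID
WHEN MATCHED AND (tgt.Name <> src.Name OR tgt.City <> src.City) THEN
    UPDATE SET tgt.Name = src.Name,   -- Type 1: overwrite in place
               tgt.City = src.City
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, Name, City)
    VALUES (src.CustomerID, src.Name, src.City);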

I recommend watching the whole video if you have time; all of these topics are discussed in more detail there.
Cheers!


Making SSRS reports faster: get rid of Parameter Sniffing

Is your SSRS report running slowly? Are you using a stored procedure to pull the data and pass it to the report? If your answer to both questions is yes, then you may be a victim of SQL Server's Parameter Sniffing.
The first question is: what is Parameter Sniffing? It refers to SQL Server's effort to reduce CPU overhead by reusing the same query execution plan for all similar queries instead of compiling the query every time it is executed. As long as the queries really would produce the same plan, this is a big performance win, and SQL Server even tries to turn simple non-parameterized user queries into parameterized ones internally to take advantage of it.
Parameter use, especially in more complex scenarios, can also cause performance issues, though. If the queries are complex, or the data distribution in the columns the parameter is compared against varies, the cost of different plan choices changes: the optimizer estimates selectivity and cardinality using the parameter value it sniffs at compile time, so a plan that is optimal for one value may perform poorly for another.
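If you want to see what sniffing actually compiled, you can peek into the plan cache; a quick diagnostic sketch (the LIKE filter is just an example):

-- Show cached text, plan, and reuse count for the suspect procedure.
SELECT st.text, qp.query_plan, cp.usecounts
FROM   sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle)   AS st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) AS qp
WHERE  st.text LIKE '%SP_Test_ParameterSniffing%';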
So, what can you do to avoid being a victim of this process? It is very easy: all you need to do is declare local variable(s) inside your stored procedure's code and assign the values passed into the SP to them. For example, let's say we have an SP with the following definition:

CREATE PROCEDURE [SP_Test_ParameterSniffing]
    @CustomerID INT
AS
BEGIN

    SELECT *
    FROM Customer c
    WHERE c.CustomerID = @CustomerID

END

All you need to do is declare a local variable in the SP's code, assign the value passed into the SP to this new variable, and use it in your WHERE clause:

CREATE PROCEDURE [SP_Test_ParameterSniffing]
    @CustomerID INT
AS
BEGIN

    -- Copy the parameter into a local variable; the optimizer
    -- cannot sniff a local variable's value at compile time.
    DECLARE @CustomerID2 INT;
    SET @CustomerID2 = @CustomerID;

    SELECT *
    FROM Customer c
    WHERE c.CustomerID = @CustomerID2

END

Because the optimizer cannot sniff the value of a local variable, it now builds a generic plan from average column density instead of a plan tailored to whatever value happened to come in first, so one atypical parameter can no longer poison the cached plan for everyone else.
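The local-variable trick is only one option. If you'd rather state your intent explicitly, SQL Server also has query hints that address sniffing directly; a sketch of both (pick one, not both):

-- Alternative 1: recompile the statement on every execution,
-- getting a fresh plan per parameter value (costs CPU on each call).
SELECT *
FROM Customer c
WHERE c.CustomerID = @CustomerID
OPTION (RECOMPILE);

-- Alternative 2: optimize for an unknown value, which yields the same
-- density-based generic plan as the local-variable trick.
SELECT *
FROM Customer c
WHERE c.CustomerID = @CustomerID
OPTION (OPTIMIZE FOR (@CustomerID UNKNOWN));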


Splitting records into multiple rows using SSIS Script Component

Hi folks.
This is a post in response to the Add Interim Rows Depending on Max and Min Values question asked in the MSDN SSIS Forum, but it can be applied to any similar situation. Here I'm going to describe the steps needed to split the records coming from a database table into multiple records and insert them into a destination table.

Let’s say we have a table with the following data in it:

Key  min  max
A     0    3
B     2    5

and we need to get:

Key  Values
A      0
A      1
A      2
A      3
B      2
B      3
B      4
B      5

out of it. This situation could be easily addressed using SSIS Script Component. The steps to achieve that are as follows:

1- Open BIDS, create a new Integration Services project, and open the Package.dtsx file;

2- Add a new Data Flow Task to the package, and double-click on it to navigate to the Data Flow tab;

3- Add a Script Component to your data flow. When the “Select Script Component Type” dialog box opens, select Source and hit OK;

4- Open the Script Component by double-clicking on it and navigate to the “Inputs and Outputs” tab. Add 2 output columns, oName (string) and oValue (integer), matching the script below, by selecting the “Output Columns” folder and hitting the Add Column button;

5- Add a new Connection Manager named Connection on the “Connection Managers” tab, pointing to your instance of SQL Server DB (the script below refers to it through this name);

6- Now navigate to the Script tab and hit Edit Script. This takes you to Visual Studio Tools for Applications (VSTA), where you can edit the script’s code. Replace the code with the following snippet:

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.Data.SqlClient;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
    private SqlConnection cnn;

    public override void AcquireConnections(object Transaction)
    {
        IDTSConnectionManager100 cnnMgr = this.Connections.Connection;

        // Build a SqlConnection string out of the OLE DB connection string
        // by keeping everything between "Data Source" and the OLE DB-specific
        // "Provider" attribute (this assumes "Data Source" appears before
        // "Provider" in the connection string).
        string full = cnnMgr.ConnectionString;
        int start = full.IndexOf("Data Source");
        int length = full.IndexOf("Provider") - start;
        string cnnstr = full.Substring(start, length);
        cnnstr += "Integrated Security=SSPI;";

        cnn = new SqlConnection(cnnstr);
    }

    public override void PreExecute()
    {
        base.PreExecute();
        cnn.Open();
    }

    public override void PostExecute()
    {
        base.PostExecute();
        cnn.Close();
    }

    public override void CreateNewOutputRows()
    {
        // "ss" is the source table holding the Key/min/max rows.
        SqlCommand cmd = new SqlCommand("SELECT * FROM ss", cnn);
        SqlDataReader reader = cmd.ExecuteReader();

        while (reader.Read())
        {
            // Emit one output row per value between min (column 1)
            // and max (column 2), inclusive.
            for (int i = (int)reader[1]; i <= (int)reader[2]; i++)
            {
                Output0Buffer.AddRow();
                Output0Buffer.oName = (string)reader[0];
                Output0Buffer.oValue = i;
            }
        }

        reader.Close();
    }
}

7- Add a new OLE DB Destination to the Data Flow tab. Take the output (green arrow) of the Script Component and connect it to the OLE DB Destination you just added. Configure the OLE DB Destination to insert the data into the destination table.

Run the package. You’ll get the split records in your destination table.
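By the way, if the ranges already live in SQL Server, the same split can be done entirely in T-SQL with a recursive CTE, without a Script Component. A sketch against the ss table the script reads (the destination table name is a hypothetical placeholder):

-- Expand each [min]..[max] range into one row per value.
WITH expanded AS
(
    SELECT [Key], [min] AS [Value], [max]
    FROM   dbo.ss
    UNION ALL
    SELECT [Key], [Value] + 1, [max]
    FROM   expanded
    WHERE  [Value] < [max]
)
INSERT INTO dbo.DestinationTable ([Key], [Value])
SELECT [Key], [Value]
FROM   expanded
OPTION (MAXRECURSION 0);   -- allow ranges longer than the default 100 levels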