Calling SSIS from .Net

In a recent DotNetRocks show, episode 483, Kent Tegels was discussing SQL Server Integration Services and how it can be useful to both the BI Developer as well as the traditional application developer. While today I am a SQL Server BI guy, I come from a long developer background and could not agree more. SSIS is a very powerful tool that could benefit many developers even those not on Business Intelligence projects. It was a great episode, and I high encourage everyone to listen.

There is one point though that was not made very clear, but I think is tremendously important. It is indeed possible to invoke an SSIS package from a .Net application if that SSIS package has been deployed to the SQL Server itself. This article will give an overview of how to do just that. All of the sample code here will also be made available in download form from the companion Code Gallery site, http://code.msdn.microsoft.com/ssisfromnet .

In this article, I do assume a few prerequisites. First, you have a SQL Server with SSIS installed, even if it’s just your local development box with SQL Server Developer Edition installed. Second, I don’t get into much detail on how SSIS works, the package is very easy to understand. However you may wish to have a reference handy. You may also need the assistance of your friendly neighborhood DBA in setting up the SQL job used in the process.

Summary

While the technique is straightforward, there are a fair number of detailed steps involved. For those of you just wanting the overview, we need to start with some tables (or other data) we want to work with. After that we’ll write the SSIS package to manipulate that data.

Once the package is created it must be deployed to the SQL Server so it will know about it. This deploy can be to the file system or to SQL Server.

Once deployed, a SQL Server Job must be created that executes the deployed SSIS package.

Finally, you can execute the job from your .Net application via ADO.NET and a call to the sp_start_job stored procedure built into the msdb system database.

OK, let’s get to coding!

Setup the Tables

First we need some data to work with. What better than a listing of previous Dot Net Rocks episodes? I simply went to the Previous Shows page, highlighted the three columns of show number, show name, and date, and saved them to a text file. (Available on the Code Gallery site.)

Next we need a place to hold data so SSIS can work with it. I created a database and named it ArcaneCode, however any database should work. Next we’ll create a table to hold “staging” DNR Show data.

CREATE TABLE [dbo].[staging_DNRShows](
  [ShowData] [varchar](250) NOT NULL
) ON [PRIMARY]

This table will hold the raw data from the text file, each line in the text file becoming one row here. Next we want a table to hold the final results.

CREATE TABLE [dbo].[DNRShows](
  [ShowNumber] [int] NOT NULL,
  [ShowName] [varchar](250) NULL,
  [ShowDate] [datetime] NULL,
  CONSTRAINT [PK_DNRShows] PRIMARY KEY CLUSTERED
  (
  [ShowNumber] ASC
  )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
  ) ON [PRIMARY]

The job of the SSIS package will be to read each row in the staging table and split it into 3 columns, the show’s number, name, and date, then place those three columns into the DNRShows table above.

The SSIS Package

The next step is to create the SSIS package itself. Opening up Visual Studio / BIDS, create a new Business Intelligence SQL Server Integration Services project. First let’s setup a shared Data Source to the local server, using the ArcaneCode database as our source.

The default package name of “Package.dtsx” isn’t very informative, so let’s rename it ”LoadDNRShows.dtsx”. Start by adding a reference to the shared data source in the Connection Managers area, taking the default. Then in the Control Flow surface add 3 tasks, as seen here:

clip_image001

The first task is an Execute SQL Task that simply runs a “DELETE FROM dbo.DNRShows” command to wipe out what was already there. Of course in a true application we’d be checking for existing records in the data flow and doing updates or inserts, but for simplicity in this example we’ll just wipe and reload each time.

The final task is also an Execute SQL Task, after we have processed the data we no longer need it in the staging table, so we’ll issue a “DELETE FROM dbo.staging_DNRShows” to remove it.

The middle item is our Data Flow Task. This is what does the heavy lifting of moving the staging data to the main table. Here is a snapshot of what it looks like:

clip_image002

The first task is our OLEDB Source, it references the staging_DNRShows table. Next is what’s called a Derived Column Transformation. This will allow you to add new calculated columns to the flow, or add columns from variables. In this case we want to add three new columns, based on the single column coming from the staging table.

clip_image003

As you can see in under Columns in the upper left, we have one column in our source, ShowData. In the lower half we need to add three new columns, ShowNumber, ShowDate, and ShowName. Here are the expressions for each:

ShowNumber
    (DT_I4)SUBSTRING(ShowData,1,FINDSTRING(ShowData,"\t",1))

ShowDate
    (DT_DBDATE)SUBSTRING(ShowData,FINDSTRING(ShowData,"\t",2) + 1,LEN(ShowData) – FINDSTRING(ShowData,"\t",2))

ShowName
    (DT_STR,250,1252)SUBSTRING(ShowData,FINDSTRING(ShowData,"\t",1) + 1,FINDSTRING(ShowData,"\t",2) – FINDSTRING(ShowData,"\t",1) – 1)

The syntax is an odd blend of VB and C#. Each one starts with a “(DT_”, these are type casts, converting the result of the rest of the expression to what we need. For example, (DT_I4) converts to a four byte integer, which we need because in our database the ShowNumber column was defined as an integer. You will see SUBSTRING and LEN which work like their VB counterparts. FINDSTRING works like the old POS statement, it finds the location of the text and returns that number. The “\t” represents the tab character, here the C# fans win out as the Expression editor uses C# like escapes for special characters. \t for tab, \b for backspace, etc.

Finally we need to write out the data. For this simply add an OLEDB Destination and set it to the target table of dbo.DNRShows. On the mappings tab make sure our three new columns map correctly to the columns in our target table.

Deploy the Package

This completes the coding for the package, but there is one final step we need to do. First, in the solution explorer right click on the project (not the solution, the project as highlighted below) and pick properties.

clip_image004

In the properties dialog, change the “CreateDeploymentUtility” option from false (the default) to True.

clip_image006

Now click the Build, Build Solution menu item. If all went well you should see the build was successful. It’s now time to deploy the package to the server. Navigate to the folder where your project is stored, under it you will find a bin folder, and in it a Deployment folder. In there you should find a file with a “.SSISDeploymentManifest” extension. Double click on this file to launch the Package Installation Wizard.

When the wizard appears there are two choices, File system deployment and SQL Server deployment. For our purposes we can use either one, there are pros and cons to each and many companies generally pick one or the other. In this example we’ll pick SQL Server deployment, but again know that I’ve tested this both ways and either method will work.

Once you pick SQL Server deployment, just click Next. Now it asks you for the server name, I’ve left it at (local) since I’m working with this on a development box; likewise I’ve left “Use Windows Authentication”. Finally I need the package path, I can select this by clicking the ellipse (the …) to the right of the text box. This brings up a dialog where I can select where to install.

clip_image007

In a real world production scenario we’d likely have branches created for each of our projects, but for this simple demo we’ll just leave it in the root and click OK.

Once your form is filled out as below, click Next.

clip_image008

We are next queried to what our installation folder should be. This is where SSIS will cache package dependencies. Your DBA may have a special spot setup for these, if not just click next to continue.

Finally we are asked to confirm we know what we are doing. Just click Next. If all went well, the install wizard shows us it’s happy with a report, and we can click Finish to exit.

Setup the SQL Server Job

We’ve come a long way and we’re almost to the finish line, just one last major step. We will need to setup a SQL Server Job which will launch the SSIS package for us. In SQL Server Management Studio, navigate to the “SQL Server Agent” in your Object Explorer. If it’s not running, right click and pick “Start”. Once it’s started, navigate to the Jobs branch. Right click and pick “New Job”.

When the dialog opens, start by giving your job a name. As you can see below I used LoadDNRShows. I also entered a description.

clip_image010

Now click on the Jobs page over on the left “Select a page” menu. At the bottom click “New” to add a new job step.

In the job step properties dialog, let’s begin by naming the step “Run the SSIS package”. Change the Type to “SQL Server Integration Services Package”. When you do, the dialog will update to give options for SSIS. Note the Run As drop down, this specifies the account to run under. For this demo we’ll leave it as the SQL Server Agent Service Account, check with your DBA as he or she may have other instructions.

In the tabbed area the General tab first allows us to pick the package source. Since we deployed to SQL Server we’ll leave it at the default, however if you had deployed to the file system this is where you’d need to change it to pick your package.

At the bottom we can use the ellipse to pick our package from a list. That done your screen should look something like:

clip_image011

For this demo that’s all we need to set, I do want to take a second to encourage you to browse through the other tabs. Through these tabs you can set many options related to the package. For example you could alter the data sources, allowing you to use one package with multiple databases.

Click OK to close the job step, then OK again to close the Job Properties window. Your job is now setup!

Calling from .Net

The finish line is in sight! Our last step is to call the job from .Net. To make it a useful example, I also wanted the .Net application to upload the data the SSIS package will manipulate. For simplicity I created a WinForms app, but this could easily be done in any environment. I also went with C#, again the VB.Net code is almost identical.

I started by creating a simple WinForm with two buttons and one label. (Again the full project will be on the Code Gallery site).

clip_image012

In the code, first be sure to add two using statements to the standard list:

using System.Data.SqlClient;

using System.IO;

Behind the top button we’ll put the code to copy the data from the text file we created from the DNR website to the staging table.

    private void btnLoadToStaging_Click(object sender, EventArgs e)

    {

      /* This method takes the data in the DNRShows.txt file and uploads them to a staging table */

      /* The routine is nothing magical, standard stuff to read as Text file and upload it to a  */

      /* table via ADO.NET                                                                      */

 

      // Note, be sure to change to your correct path

      string filename = @"D:\Presentations\SQL Server\Calling SSIS From Stored Proc\DNRShows.txt";

      string line;

 

      // If you used a different db than ArcaneCode be sure to set it here

      string connect = "server=localhost;Initial Catalog=ArcaneCode;Integrated Security=SSPI;";

      SqlConnection connection = new SqlConnection(connect);

      connection.Open();

 

      SqlCommand cmd = connection.CreateCommand();

 

      // Wipe out previous data in case of a crash

      string sql = "DELETE FROM dbo.staging_DNRShows";

      cmd.CommandText = sql;

      cmd.ExecuteNonQuery();

 

      // Now setup for new inserts

      sql = "INSERT INTO dbo.staging_DNRShows (ShowData) VALUES (@myShowData)";

 

      cmd.CommandText = sql;

      cmd.Parameters.Add("@myShowData", SqlDbType.VarChar, 255);

 

      StreamReader sr = null;

 

      // Loop thru text file, insert each line to staging table

      try

      {

        sr = new StreamReader(filename);

        line = sr.ReadLine();

        while (line != null)

        {

          cmd.Parameters["@myShowData"].Value = line;

          cmd.ExecuteNonQuery();

          lblProgress.Text = line;

          line = sr.ReadLine();

        }

      }

      finally

      {

        if (sr != null)

          sr.Close();

        connection.Close();

        lblProgress.Text = "Data has been loaded";

      }

 

Before you ask, yes I could have used any number of data access technologies, such as LINQ. I went with ADO.NET for simplicity and believing most developers are familiar with it due to its longevity. Do be sure and update the database name and path to the file in both this and the next example when you run the code.

This code really does nothing special, just loops through the text file and uploads each line as a row in the staging table. It does however serve as a realistic example of something you’d do in this scenario, upload some data, then let SSIS manipulate it on the server.

Once the data is there, it’s finally time for the grand finale. The code behind the second button, Execute SSIS, does just what it says; it calls the job, which invokes our SSIS package.

    private void btnRunSSIS_Click(object sender, EventArgs e)

    {

      string connect = "server=localhost;Initial Catalog=ArcaneCode;Integrated Security=SSPI;";

      SqlConnection connection = new SqlConnection(connect);

      connection.Open();

 

      SqlCommand cmd = connection.CreateCommand();

 

      // Wipe out previous data in case of a crash

      string sql = "exec msdb.dbo.sp_start_job N’LoadDNRShows’";

      cmd.CommandText = sql;

      cmd.ExecuteNonQuery();

      connection.Close();

      lblProgress.Text = "SSIS Package has been executed";

 

    }

The key is this sql command:

exec msdb.dbo.sp_start_job N’LoadDNRShows’

“exec” is the T-SQL command to execute a stored procedure. “sp_start_job” is the stored procedure that ships with SQL Server in the MSDB system database. This stored procedure will invoke any job stored on the server. In this case, it invokes the job “LoadDNRShows”, which as we setup will run an SSIS package.

Launch the application, and click the first button. Now jump over to SQL Server Management Studio and run this query:

select * from dbo.staging_DNRShows;

select * from dbo.DNRShows;

You should see the first query bring back rows, while the second has nothing. Now return to the app and click the “Execute SSIS” button. If all went well running the query again should now show no rows in our first query, but many nicely processed rows in the second. Success!

A few thoughts about xp_cmdshell

In researching this article I saw many references suggesting writing a stored procedure that uses xp_cmdshell to invoke dtexec. DTEXEC is the command line utility that you can use to launch SSIS Packages. Through it you can override many settings in the package, such as connection strings or variables.

xp_cmdshell is a utility built into SQL Server. Through it you can invoke any “DOS” command. Thus you could dynamically generate a dtexec command, and invoke it via xp_cmdshell.

The problem with xp_cmdshell is you can use it to invoke ANY “DOS” command. Any of them. Such as oh let’s say “DEL *.*” ? xp_cmdshell can be a security hole, for that reason it is turned off by default on SQL Server, and many DBA’s leave it turned off and are not likely to turn it on.

The techniques I’ve demonstrated here do not rely on xp_cmdshell. In fact, all of my testing has been done on my server with the xp_cmdshell turned off. Even though it can be a bit of extra work, setting up the job, etc., I still advise it over the xp_cmdshell method for security and the ability to use it on any server regardless of its setting.

In Closing

That seemed like a lot of effort, but can lead to some very powerful solutions. SSIS is a very powerful tool designed for processing large amounts of data and transforming it. In addition developing under SSIS can be very fast due to its declarative nature. The sample package from this article took the author less than fifteen minutes to code and test.

When faced with a similar task, consider allowing SSIS to handle the bulk work and just having your .Net application invoke your SSIS package. Once you do, there are no ends to the uses you’ll find for SQL Server Integration Services.

SSIS For Developers at CodeStock 2009

At the 2009 CodeStock event I am presenting SQL Server Integration Services for Developers. This class will demonstrate tasks commonly done by VB.Net or C# developers within SQL Server Integration Services.

The sample project and documentation for the lab can be found on the code gallery site at http://code.msdn.microsoft.com/SSISForDevs .

Arcane Fun Fridays

WHEW! All of this WPF / XAML sure has been a lot of fun. But I think it’s time to come up for air and see what else is happing out there in Dot Net land.

Alabama Code Camp is coming up in just a little over a week, Saturday October 6th to be exact. Still plenty of time to register and even just a bit of time if you want to get in on the Silverlight programming contest. First prize for that is a Zune! http://www.alabamacodecamp.com/home.html

devLink, the large conference for a cheap price comes up right afterward in Nashville, Friday and Saturday October 12th and 13th. http://www.devlink.net/ . You can tell I’ll be there, my name’s on the front page as a winner of a Barnes and Nobel gift card (look for the dude from AL !)

(By the way, anyone know of a good dog repellent? My nephew is coming to house sit and is bringing Marshmallow and Buttercup, his twin Dobermans along because I have a big back yard they can play in. Last time though they ate the garden hose, chewed the handle off my shovel, and bit through one of my lawnmower tires.)

There’s a new add-on for SQL Server Management Studio I’m eager to try out. It’s still in Beta but looks promising. It was blogged about at http://weblogs.sqlteam.com/mladenp/archive/2007/09/20/SSMS-Tools-Pack—an-add-in-for-SQL-Management-Studio.aspx or you can download it directly at http://www.ssmstoolspack.com/ .

If you are a fan of NUnit, you’ll appreciate the new xUnit. Read James’ announcement at http://jamesnewkirk.typepad.com/posts/2007/09/announcing-xuni.html .

In a recent Dot Net Rocks episode, Carl Franklin announced they would be taking over Shrinkster.com. Shrinkster has been down due to spam abuse, as soon as Carl gets everything setup we’ll be able to go back to using short links again!

Speaking of Dot Net Rocks, I especially enjoyed show 274, where the new features of VB.Net and C# for the 2008 release were discussed. Entertaining and lots of good tidbits. I think my favorite feature so far has got to be C#’s extension methods. http://www.dotnetrocks.com/default.aspx?showNum=274

During my long drive to the Tallahassee Code Camp last week, I put together a podcast theme session, and copied a bunch of related podcasts onto my cheapo SanDisk mp3 player. This time I went with a “Millenator” theme and got all the episodes of Dot Net Rocks that Mark Miller appeared on. Good stuff, lots of thoughtful material combined with some humor. Next time you go on a trip, copy a bunch of past episodes of your favorite podcast that are in the same theme and make that long drive go much quicker.

There have been several updates to the world’s greatest Visual Studio Add-In, CodeRush, over the last few weeks ( http://www.devexpress.com/Home/Announces/CodeRush25.xml ). Apparently Mark Miller and the boys have been busy! If you’re not on 2.5.4 go update yours today.

Speaking of Mark Miller, I love his intro slide for his VSLive session coming up in LasVegas. Take a look, pure genius. http://www.doitwith.net/2007/09/11/MyLastVSLiveSessionEver.aspx

A final note, between getting ready for Alabama Code Camp and going to devLink my blogging may get spotty for the next few weeks, bear with me and I’ll have full reports from both code camps and lots of fun new stuff to share.

Arcane Add-Ins: KNOCKS Solutions VS 2005 Add-In

It’s been a while since I talked about Visual Studio add-ins, so when I ran across KNOCKS Solutions “Knocks VS2005 Add-In”, I knew I’d found the perfect subject. Available at http://www.knocks-solutions.com/VS2005Addins.aspx?sm=idVS20052 , this rather full featured add-in offers many utilities.

First is a personal clipboard, that allows you to store and retrieve up to 9 different items. And they’re persistent, they will stay between VS sessions.

Next is a code snippets manager. This seems a bit redundant in light of VS 2005’s snippets, but I can see it being very useful during a presentation.

In addition to the code snippets is a Notes module. I’ve often wished for the ability to store quick notes to use in a presentation, so this is a handy module.

Up next is the one tool of theirs I have a beef with, the “Re-arrange code” tool. I like the idea of being able to re-arrange my code. Often I’m working on some public method, and realize I need a private helper method and, being in a hurry and lazy, will drop it right under the public method. Later I’ll move it around, which was why I was excited to see this tool.

Sadly, it has a really bad side affect, it strips out any regions you’ve put into your code, and it strips out any comments that might lie between methods (I often put a comment header right above my method, instead of in the method.) When Knocks rearranges your code all of that goes into the bit bucket, making the tool useless in all but (perhaps) the very earliest stages of coding. I would have thought it possible to add regions to the tool as well, and allow code rearranging within a region, between regions, or even to move regions around. Perhaps this will be addressed in the next version (he said, ever full of eternal hope).

There is a nifty zip tool that will zip your entire project, handy for quick backups. Also is a tool that embraces the concept of a favorites for projects. Another tool is one I wonder why no one did sooner, a keyword search on Google (or other configurable search engine, like MS Live). This is one I’ll be using often.

Also included is a simple “Data Object” generator. You bring it up and enter a few property names and types and Knocks will create the basic class for you. While I have seen more full featured code generators, I appreciate the basic simplicity of this one, not to mention the price (which I’ll get to shortly).

knocks01The final two tools I’ll mention are my favorites. First is a Design Explorer. This adds an explorer window (I put mine with the Solution Explorer area) to your display. In it are all the controls for your current form. Clicking on the control not only makes it the active control in the designer, but displays the properties in the lower half of the Design Explorer window.

 

 

 

 

 

 

 

 

 

knocks02The other tool is the Code Explorer. It displays a tree of your current code module. Double clicking on the code element will take you to it in the code window.

I’ve seen other add-ins with code windows, this one seems equal in functionality with many others similarly priced.

 

 

 

 

Oh, did I forget to mention the price? It’s free. Yes, FREE. Knocks has packed a lot of functionality into this add-in, and the fact it’s free makes it well worth the time to download and learn.

Thanks for coming!

I just wanted to thank everyone who took the effort to come to the presentation I did tonight on SQL Server Compact Edition at the Birmingham Dot Net Users Group ( http://www.bugdotnet.com ). It was a small crowd but very engaged, all in all a very enjoyable evening for everyone.

As promised, here is a link to the Power Point presentation (in PDF format) I used during the presentation:

SSCE presentation for BUG.Net group

The complete C# and VB.Net code samples were posted April 13th, 2007:

http://arcanecode.wordpress.com/2007/04/13/sql-server-compact-edition-with-c-and-vbnet/

And finally, the series of posts I mentioned on system Views started with this post on April 16th, 2007:

http://arcanecode.wordpress.com/2007/04/16/system-views-in-sql-server-compact-edition-tables/

If you want to see all of my SSCE posts, simply click the SQL Server Compact Edition tag over in the categories area, or use this link:

http://arcanecode.wordpress.com/tag/sql-server-compact-edition/
Thanks again to  everyone, I had a great time and hope came away with a better understanding of SQL Server Compact Edition.

Arcane Thoughts: Thinking Inside the Box

Today I’m at an offsite meeting, talking about a new project. I won’t get into too many specific details, but we have to pull data from a web service and update an Oracle database. We can use a vendor provided Java API that runs on a Unix box to do the updates, or we can write to the database directly as long as we handle integrity issues.

So we spent the day brainstorming, to come up with possible solutions. Here is the list of contenders:

  • Write a Java app that runs on Unix that uses the vendor API’s.
  • Write a Java app that runs on Unix and updates the database directly.
  • Write a C# app that runs on a Windows Server, where a Batch Scheduler will kick it off.
  • Write a C# app that runs as a Windows Service under XP (we haven’t taken the Vista plunge at work yet).
  • Write a SQL Server Integration Services package that is run by the SQL Server job scheduler. It will use the web service as the input and update Oracle.
  • Use one of the above methods to pull the data then let BizTalk process it from there.

We haven’t made a decision yet, and my point was not so much to talk about the pro’s and cons of each solution. Instead it’s to get you to think creatively when it comes to new solutions for your company. Sitting down and cranking out yet another C# or VB.Net app may not always be the best approach. You may have a task you can accomplish with less code by using SQL Server Integration Services. Or maybe BizTalk might fit the bill.

All too often as programmers our first answer to any solution is to pull up Visual Studio and start grinding out code. Take some time though, to explore a few other options. There’s a rich set of tools out there, and sometimes the best solution to a programming problem may not be programming.

Alabama Code Camp Slides

Here’s a copy of the Slides I’ll be using (or used depending on when you read this) at Alabama Code Camp 4 in Mobile AL. They are in PDF format.

Getting Started with SQL Server Compact Edition – Code Camp Slides

Thanks to all who came to my presentation!

Follow

Get every new post delivered to your Inbox.

Join 101 other followers