Import Filters in PowerPivot

PowerPivot has the ability to import millions of rows of data into Excel 2010 for purposes of analyzing, slicing and dicing. However, even though you can import vast amounts of data you may not always want to. There are many reasons for this.

You may wish to limit the amount of data for security reasons. Perhaps bringing in all of the data may put an unneeded strain on the server. Most likely though, is you simply do not need all of the data. As an analyst you may be interested in only the data for a segment of your organization, such as a single plant or department.

Fortunately filtering data is an easy task. If you are not familiar with importing data into PowerPivot, may I suggest you first review my step by step posted yesterday.

Launch Excel 2010 and go to the PowerPivot tab, then launch the PowerPivot window. Select the "From Database" to start the import process, and for this example we’ll use SQL Server. Now, like in my previous blog post enter your credentials to connect to a SQL Server. For this example we’ll just pick one table, SalesOrderDetail from the AdventureWorksLT2008 database.

With the SalesOrderDetail picked, look in the lower right side of the dialog. You will see a button labeled "Preview & Filter".

clip_image001

When you click that button you should see a new dialog appear.

clip_image002

Here you can see a subset of your data, only the first few rows. It’s enough to give you an idea of what the data looks like, but it won’t show you every single row. Considering the fact that PowerPivot is capable of importing millions of rows, this is probably a good thing.

Within this dialog we can do some pretty powerful things. Let’s start by eliminating a column we don’t need in the data we want to import. Scroll to the right until you see the rowguid column and uncheck the box next to the column header, as you see below.

clip_image003

By unchecking this box this column will not be imported into our Excel 2010 PowerPivot table.

Recall though that the name of this dialog was Preview & Filter. We can also do some review of our data to ensure it’s what we want. Let’s say we want to look to see the range of values for our line totals. Click the downward facing triangle button to the right of the LineTotal column header. A drop down menu will appear. Select "Sort Smallest to Largest"

clip_image004

Scrolling through the data you’ll notice that the data has indeed been sorted. Also note the menu icon to the right of the column name changes to indicate a sort has been applied to this column. (Remember though it’s not showing all rows, just the first few sorted in order.) An important thing to note tough, this sorting applies only to the data as you see it in the Preview & Filter area, once you click the Finish button on the Table Import Wizard the sort is removed. While the preview options in this dialog are not saved, filters are. If you had pressed Finished, you’d have seen that rowguid is not included in the result set. But don’t click Finish yet, we’re not quite done filtering.

For our next filter, let’s decide for purposes of this report we are only interested in large orders. We’ll define large as "Line Total greater than 1,000 dollars". Once again open the menu to the right of the LineTotal column header and select "Number Filters". A pop out menu will appear, from it select "Greater Than…"

clip_image005

When the Custom Filter pops up, enter 1000 next to the is greater than box and click OK.

clip_image006

Clicking OK will reveal lesser amounts have been removed. Other filters besides numeric filters are available. For text data, you have "is equal to" and "is not equal to" available. Date filters work the same, having the "is equal to" and "is not equal to" available. I’m hopeful that for date types further functionality will be added in the future, such as date ranges or "is greater/less than" types of functionality.

Let’s click the OK button on the Preview & Filters window to return to the Table Import Wizard. You’ll now see an Applied filters link in the row with the table name, click it to see what filters are in effect.

clip_image007

When you click the link you’ll see:

clip_image008

While you cannot edit this information, it is nice to see it all in one location. Click OK to close and return to the previous window.

Click Finish to complete the import. You will see the data you asked for, only rows with LineTotal greater than 1000 and without the rowguid column.

Hopefully you’ve seen how powerful the filtering tools included with PowerPivot are. Using them you can remove unwanted rows and columns, limiting not only the amount of data you have to pull across the network but that you have to store locally in the PowerPivot Excel 2010 spreadsheet. Limiting your data will ensure only the rows required for the analysis are included, saving time and enhancing security.

SharePoint Saturday Birmingham – Intro to Microsoft PowerPivot

spsbham Last Saturday I not only attended but presented at SharePoint Saturday Birmingham. First I have to give Terry Webster a shout out, he did a fantastic job organizing and running the event. I learned a lot, although I admit I felt a bit like a fish out of water, being a SQL Server guy in the SharePoint pond.

Thanks too to all the volunteers who helped run the show, my co-presenters, and the many vendors who attended. Without the support of all those folks this event would not have been possible.

I also want to thank everyone who attended my PowerPivot presentation. Last Friday I did a rather lengthy overview that encapsulated much of my presentation; here are the slides that you saw at the event.

I also want to give you a quick overview of the steps I did in my demo so you can follow along. First you’ll need the bits. The PowerPivotPro has a good blog entry on where to download them from:

http://powerpivotpro.com/2009/11/18/powerpivot-ctp3-beta-download-links/

All you need are the first two items, Office 2010 Beta 2 and PowerPivot for Excel 2010 to reproduce what I did in the demo I presented.

Next you’ll need some data. I had a SQL Server instance installed and used the AdventureWorks2008 LT database available at CodePlex. The LT version is the light weight version of the full AdventureWorks sample, I chose it for it’s simplicity.

http://msftdbprodsamples.codeplex.com/

If you don’t have SQL Server on your box you can use alternate data, or check with your friendly neighborhood DBA. Many have AdventureWorks installed on a test box and might be able to help you out.

OK, so to get this far you’ve downloaded and installed everything, and have access to a dataset. Ready? Let’s go.

1. Launch Excel 201, and click the PowerPivot tab.

2. Click the PowerPivot window button.

step01

3. Click the “From Database” button on the Home tab. In the menu select “From SQL Server”

step02

4. The Table Import Wizard will appear. In “Server name” field, click the drop down and pick your database server. Leave it set to the “Use Windows Authentication” (unless told otherwise if you are using a database provided by a DBA). Under Database name, pick AdventureWorksLT2008. It’s probably a good idea to click the Test Connection button just to make sure everything is correct. If so press Next to continue.

step03

5. On the next screen, leave it at the default of “Select from a list of tables and views to choose the data to import” and click Next.

6. On the next screen pick these tables: SalesOrderDetail, SalesOrderHeader, Product, Product Category, Address, Customer, CustomerAddress. Note the “Preview & Filter” button. In the future you may wish to experiment with this to filter the data you are importing, but it’s not needed for this demo. Click Finish to move on.

step04

7. On the next screen you’ll see it importing the tables. When it gives you the Success message click Close.

8. (Note, this next step differs just slightly from the live presentation, in it I added a column but did it in a different place.) Click on the SalesOrderDetail tab. Click on a blank cell under the Add Column out to the right of the data.

9. In this example we’ll decide that 80 percent of our Line Total is cost, which means the other 20% is profit. We’ll calculate the profit by clicking in the fx bar and typing =[LineTotal]*.20

step05

10. Right click on the “CalculatedColumn1” column header and select “Rename column” from the menu. The column header will highlight and allow you to type over it. Change the name to Profit.

11. Click on PivotTable and pick “Single Pivot Table” in the drop down menu.

step06

12. In a pop up dialog, it will ask if you want a new or existing worksheet. Pick “New worksheet” and click OK.

13. Over on the right you will see the “Gemini Task Pane”. Gemini is the original code name for PowerPivot. To create a basic PowerPivot pivot table, follow these steps.

14. Under SalesOrderDetail, click on then drag the LineTotal and Profit fields to the Values area.

15. Under ProductCategory, drag the Name down to Row Labels.

16. Under Product drag the Name to the Row Labels area.

17. Under the Address, click and drag the CountryRegion, then StateProvince to the ColumnLabels area.

18. Drag the StateProvince to the Slicers Vertical area.

19. Your Gemini Task Pane should look something like:

step07

20. Your spreadsheet should look something like:

step08

At this point you have completed all of the steps as shown in the presentation. Now start playing. You can jump back to the PowerPivot data sheets by clicking the “PowerPivot window” button as shown in step 1. Back there you might try experimenting with a chart, or on the pivot table apply different effects and slicers.

Introducing Microsoft PowerPivot

What is PowerPivot? Well according to Microsoft:

“PowerPivot is Microsoft Self-Service Business Intelligence”

I can see from the glazed looks you are giving your monitor that was clear as mud. So let’s step back a bit and first define what exactly is Business Intelligence.

Business Intelligence

Business Intelligence, often referred to as simply “BI”, is all about taking data you already have and making sense of it. Being able to take that information and turn it from a raw jumble of individual facts and transform it into knowledge that you can take informed actions on.

In every organization there is already someone who is doing BI, although they may not realize it. Microsoft (and many IT departments) refer to this person as “that guy”. A power user, who grabs data from anyplace he (or she) can get it, then uses tools like Excel or Access to slice it, dice it, and analyze it. This person might be an actual Business Analyst, but more often it’s someone for who BI is not their main job. Some common examples of people doing their own BI today are production managers, accountants, engineers, or sales managers, all who need information to better do their job. Let’s look at an illustration that will make it a bit clearer.

In this example, put yourself in the role of a sales manager. You have gotten IT to extract all of your sales orders for the last several years into an Excel spreadsheet. In order to determine how well your sales people are doing, you need to measure their performance. You’ve decided that the amount sold will be a good measure, and use Excel to give you totals.

IntroEx01

In BI terms, the column “Total Sales” is known as a measure, or sometimes a fact, as it measures something, in this case the sales amount. The grand total sales amount is often called an aggregation, as it totals up the individual rows of data that IT gave us. But now you might be wondering why Andy’s sales are so low? Well, now you want to dig deeper and look at sales by year.

IntroEx02

In BI terms, the names of the sales people are a dimension. Dimensions are often either a “who” (who sold stuff) or a “what” (what stuff did we sell). Places (where was it sold) and dates (when was it sold) are also common dimensions. In this case the sales dates across the top (2007, 2008, 2009) are a date dimension. When we use two or more dimensions to look at our measures, we have a pivot table.

Now we can see a picture emerging. It’s obvious that Andy must have been hired as a new salesperson in late 2008, since he shows no sales for 2007 and very small amount in 2008. But for Paul and Kimberly we can look at something called trends in the BI world. Kimberly shows a nice even trend, rising slowly over the last three years and earns a gold star as our top performer.

By being able to drill down into our data, we spot another trend that was not readily obvious when just looking at the grand totals. Paul has been trending downward so fast the speed of light looks slow. Clearly then we now have information to take action on, commonly known as actionable intelligence.

So remind me, why do we need PowerPivot?

As you can see in the above example, “that guy” in your company clearly has a need to look at this data in order to do his job. Not only does he need to review it, he also has the issue of how to share this information with his co-workers. Unfortunately in the past the tools available to “that guy” have had some drawbacks. The two main tools used by our analyst have been either Excel, or a complete BI solution involving a data warehouse and SQL Server Analysis Services.

Excel’s main limitations center around the volume of data needed to do good analysis. Excel has limits to the number of rows it can store, and for large datasets a spreadsheet can consume equally large amounts of disk space. This makes the spreadsheet difficult to share with coworkers. In addition mathematical functions like aggregations could be slow. On the good side, Excel is readily available to most workers, and a solution can be put together fairly quickly.

A full blown BI solution has some major benefits over the Excel solution. A data warehouse is created, and then SQL Server Analysis Services (often abbreviated as SSAS) is used to precalculate aggregations for every possible way an analyst might wish to look at them. The data is then very easy to share via tools like Excel and SQL Server Reporting Services. While very robust and powerful solution, it does have some drawbacks. It can take quite a bit of time to design, code, and implement both the data warehouse and the analysis services pieces of the solution. In addition it can also be expensive for IT to implement such a system.

Faster than a speeding bullet, more powerful than a locomotive, it’s PowerPivot!

PowerPivot combines the best of both worlds. In fact, it’s not one tool but two: PowerPivot for Microsoft Excel 2010, and PowerPivot for SharePoint 2010. What’s the difference you ask? Good question.

PowerPivot for Microsoft Excel 2010

PowerPivot acts as an Add-on for Excel 2010, and in many ways is quite revolutionary. First, it brings the full power of SQL Server Analysis Services right into Excel. All of the speed and power of SSAS is available right on your desktop. Second, it uses a compression technology that allows vast amounts of data to be saved in a minimal amount of space. Millions of rows of data can now be stored, sorted, and aggregated in a reasonable amount of disk space with great speed.

PowerPivot can draw its data from a wide variety of sources. As you might expect, it can pull from almost any database. Additionally it can draw data from news feeds, SQL Server Reporting Services, other Excel sheets, it can even be typed in manually if need be.

Another issue that often faces the business analyst is the freshness of the data. The information is only as good as the date it was last imported into Excel. Traditionally “that guy” only got extracts of the database as IT had time, since it was often a time consuming process. PowerPivot addresses this through its linked tables feature. PowerPivot will remember where your data came from, and with one simple button click can refresh the spreadsheet with the latest information.

Because PowerPivot sits inside Microsoft Excel, it not only can create basic pivot tables but has all the full featured functionality of Excel at its disposal. It can format pivot tables in a wide array of styles, create pivot charts and graphs, and combine these together into useful dashboards. Additionally PowerPivot has a rich set of mathematical functionally, combining the existing functions already in Excel with an additional set of functions called Data Analysis eXpressions or DAX.

PowerPivot for SharePoint 2010

PowerPivot for Excel 2010 clearly solves several issues around the issue of analysis. It allows users to quickly create spreadsheets, pivot tables, charts, and more in a compact amount of space. If you recall though, creation was only half of “that guys” problem. The other half was sharing his analysis with the rest of his organization. That’s where PowerPivot for SharePoint 2010 comes into play.

Placing a PowerPivot Excel workbook in SharePoint 2010 not only enables traditional file sharing, but also activates several additional features. First, the spreadsheet is hosted right in the web browser. Thus users who might not have made the transition to Excel 2010 can still use the PowerPivot created workbook, slicing and filtering the data to get the information they require.

Data can also be refreshed on an automated, scheduled basis. This ensures the data is always up to date when doing analysis. Dashboards can also be created from the contents of a worksheet and displayed in SharePoint. Finally these PowerPivot created worksheets can be used as data sources for such tools as SQL Server Reporting Services.

Limitations

First, let me preface this by saying as of this writing all of the components are either in CTP (Community Technology Preview, a pre-beta) or Beta state. Thus there could be some changes between now and their final release next year.

To use the PowerPivot for Excel 2010 components, all you have to have is Excel 2010 and the PowerPivot add-in. If you want to share the workbook and get all the rich functionality SharePoint has to offer, you’ll have to have SharePoint 2010, running Excel Services and PowerPivot 2010 Services. You’ll also have to have SQL Server 2008 R2 Analysis Services running on the SharePoint 2010 box. Since you’ll have to have a SQL Server instance installed to support SharePoint this is not a huge limitation, especially since SSAS comes with SQL Server at no extra cost.

One thing I wish to make clear, SharePoint 2010 itself can run using any version of SQL Server from SQL Server 2005 on. It is the PowerPivot service that requires 2008 R2 Analysis Services.

One other important item to note: at some point the load upon the SharePoint 2010 server may grow too large if especially complex analysis is being done. Fortunately SharePoint 2010 ships with several tools that allow administrators to monitor the load and plan accordingly. At the point where the load is too big, it is a clear indication it’s time to transition from a PowerPivot solution to a full BI solution using a data warehouse and SQL Server Analysis Services.

What does PowerPivot mean for business users?

For business users, and especially “that guy”, it means complex analysis tools can be created in a short amount of time. Rich functionality makes it easier to spot trends and produce meaningful charts and graphs. It also means this information can be shared with others in the organization easily, without imposing large burdens on the corporate e-mail system or local file sharing mechanisms.

No longer will users be dependent on IT for their analysis, they will have the power to create everything they need on their own, truly bringing “self service BI” to fruition.

What does PowerPivot mean for Business Intelligence IT Pros?

The first reaction many BI developers have when hearing about PowerPivot is “oh no, this is going to put me out of a job!” Far from it, I firmly believe PowerPivot will create even more work for BI Professionals like myself.

As upper management grows to rely on the information provided by PowerPivot, they will also begin to understand the true value BI can bring to an organization. Selling a new BI solution into an organization where none currently exists can be difficult, as it can be hard to visualize how such a solution would work and the value it brings. PowerPivot allows BI functionality to be brought into an organization at a low development cost, proving the value of BI with minimal investment. Thus when there is a need to implement a larger, traditional BI project those same managers will be more forthcoming with the dollars.

Second, as users pull more and more data, they are going to want that data better organized than they will find in their current transactional business systems. This will in turn spur the need to create many new data warehouses. Likewise the IT department will also want data warehouses created, to reduce the load placed on those same transactional business systems.

I also foresee PowerPivot being used by BI Pros themselves to create solutions. The database structure of many transactional database systems can be difficult to understand even for experienced IT people, much less users. BI Pros can use PowerPivot to add a layer of abstraction between the database and the users, allowing business analysts to do their job without having to learn the complexity of a database system.

BI Pros can also use PowerPivot to implement quick turnaround solutions for customers, bringing more value for the customer’s dollar. When a BI Pro can prove him (or her) self by providing rich functionality in a short time frame it’s almost always the case they are brought back in for multiple engagements.

PowerPivot also provides great value to BI Pros who are employed full time in an enterprise organization. They can create solutions much quicker than before, freeing them up to do other valuable tasks. In addition PowerPivot solutions can provide a “stop gap” solution, pushing the date at which the organization needs to spend the dollars for a full blown BI solution and allowing IT to plan better.

Finally I see great value in PowerPivot as a prototyping tool for larger BI projects. Now users can see their data, interact with it, analyze it, and ensure the required measures and dimensions are present before proceeding with the larger project.

I’ll reiterate, if anything I believe PowerPivot will create an explosion of work for the Business Intelligence Professional.

Where can I learn more?

Well right here for one. I have become quite interested in PowerPivot since seeing it at the SQL PASS 2009 Summit. I think it will be a valuable tool for both myself and my customers. This will be the first of many blog posts to come on PowerPivot. I am also beginning a series of presentations on PowerPivot for local user groups and code camp events. The first will be Saturday, November 21st 2009 at the SharePoint Saturday in Birmingham Alabama, but there will be many more to come. (If you’d like me to come speak at your group just shoot me an e-mail and we’ll see what we can arrange.)

There’s also the PowerPivot site itself:

I’ve also found a small handful of blogs on PowerPivot, listed in no particular order:

Summary

Thanks for sticking with me, I know this was a rather long blog post but PowerPivot has a lot of rich functionality to offer. While PowerPivot is still in the CTP/Beta stage as of this writing, I see more and more interest in the community, which will continue to grow as PowerPivot moves closer to release. I hope this post has set you off on the right step and you’ll continue to come back for more information.

Populating a Kimball Date Dimension

I’m a big fan of the Kimball method of Data Warehousing. A common task most of us setting up a new Data Warehouse face is creating a Date Dimension. In their book, “The Microsoft Data Warehouse Toolkit With SQL Server 2005 and the Microsoft Business Intelligence Toolset”, they have an example of a good date dimension table in their books sample code. My complaint though was not so much with the layout itself, I liked it and found it fairly complete. Instead it was the method they chose to load it. They used an Excel spreadsheet, then a SQL Server Integration Services package to read the Excel file and load the date dimension table.

To me this approach has a couple of drawbacks. First, if you are doing all the loading on the server itself, you may not have Excel loaded. Thus you may be faced with the headache of creating the sheet then figuring out how to get it to a location the server can read. Second, when you go to add more dates in the future, you have to go into the spreadsheet and reset everything, removing what was there before. It can also be quite a headache to go back several years from know and find both SSIS packages and that Excel spreadsheet. Plus after that time changes may be made to both Excel and SSIS that make that solution no longer workable. Finally quite often it’s a DBA setting up the warehouse, and I’ve found there are still a few DBAs who are uncomfortable relying on SSIS, although I’m happy to say that number continues to shrink.

A T-SQL solution was clearly, to me anyway, the superior answer for both ease of use and long term stability. I assumed that as popular as the Kimball method is, someone would have already created a routine to load their style of date dimension, but some Binging and Googling around proved fruitless. I did find some code for loading some very simple date dimensions, but nothing as complete as the Kimball design. So, relishing a good coding challenge, I rolled up my sleeves and went to work. Below is the fruit of my labor, a script for loading a Kimball like date dimension. All you have to do is set the begin and end dates, indicate the offset for your fiscal year, and let ‘er rip. You can easily go back and add more dates by just adjusting the begin and end times.

A few things you should note. First, I did make a few slight modifications to the standard Kimball date dimension table as found in the previously mentioned book. They have a column titled “DateName” which holds the date as a string in YYYY/MM/DD format. As long as I was putting the date in, I decided to add string versions of the date for the US and Europe. These are in MM/DD/YYYY and DD/MM/YYYY formats and the columns are named “DateNameUS” and “DateNameEU” (for European Union) respectively.

Their table also had an audit key, used presumably by the SSIS package. I didn’t really see the need for an audit key for a date table, so I changed it to an identity column so I could have a secondary surrogate key if I needed it, just something to count the number of date rows easily and track the order they were inserted in.

One final, but very important distinction. I was in a post conference session taught by Erik Veerman at SQL PASS 2009. In it he mentioned using Dim and Fact schemas, thus you’d have [Dim].[Date] instead of [dbo].[DimDate]. I liked the idea as it was something I’d been considering myself, so in this version that is what I did. If you use the more traditional naming format of dbo.DimDate you’ll need to tweak the code.

Below is the code to load the Date Dimension table, which is my creation. Under it I placed my modified version of the Kimball Date Dimension table. It’s core code came from the sample code mentioned in the first paragraph then was modified by me. I include it for completeness.

Code Sample 1 – Script to load a date dimension.

/*---------------------------------------------------------------------------*/
/* Loads a Date Dimension                                                    */
/*---------------------------------------------------------------------------*/

-- A few notes, this code does nothing to the existing table, no deletes
-- are triggered before hand. Because the DateKey is uniquely indexed,
-- it will simply produce errors if you attempt to insert duplicates.
-- You can however adjust the Begin/End dates and rerun to safely add
-- new dates to the table every year.
--
-- If the begin date is after the end date, no errors occur but nothing
-- happens as the while loop never executes.

SET NOCOUNT ON -- turn off all the 1 row inserted messages

-- Hold our dates
DECLARE @BeginDate DATETIME
DECLARE @EndDate DATETIME

-- Holds a flag so we can determine if the date is the last day of month
DECLARE @LastDayOfMon CHAR(1)

-- Number of months to add to the date to get the current Fiscal date
DECLARE @FiscalYearMonthsOffset INT   

-- These two counters are used in our loop.
DECLARE @DateCounter DATETIME    --Current date in loop
DECLARE @FiscalCounter DATETIME  --Fiscal Year Date in loop

-- Set the date to start populating and end populating
SET @BeginDate = '01/01/2008'
SET @EndDate = '12/31/2010' 

-- Set this to the number of months to add to the current date to get
-- the beginning of the Fiscal year. For example, if the Fiscal year
-- begins July 1, put a 6 there.
-- Negative values are also allowed, thus if your 2010 Fiscal year
-- begins in July of 2009, put a -6.
SET @FiscalYearMonthsOffset = 6

-- Start the counter at the begin date
SET @DateCounter = @BeginDate

WHILE @DateCounter <= @EndDate
      BEGIN
            -- Calculate the current Fiscal date as an offset of
            -- the current date in the loop
            SET @FiscalCounter = DATEADD(m, @FiscalYearMonthsOffset, @DateCounter)

            -- Set value for IsLastDayOfMonth
            IF MONTH(@DateCounter) = MONTH(DATEADD(d, 1, @DateCounter))
               SET @LastDayOfMon = 'N'
            ELSE
               SET @LastDayOfMon = 'Y'  

            -- add a record into the date dimension table for this date
            INSERT  INTO [Dim].[Date]
                    (
                      [DateKey]
                    , [FullDate]
                    , [DateName]
                    , [DateNameUS]
                    , [DateNameEU]
                    , [DayOfWeek]
                    , [DayNameOfWeek]
                    , [DayOfMonth]
                    , [DayOfYear]
                    , [WeekdayWeekend]
                    , [WeekOfYear]
                    , [MonthName]
                    , [MonthOfYear]
                    , [IsLastDayOfMonth]
                    , [CalendarQuarter]
                    , [CalendarYear]
                    , [CalendarYearMonth]
                    , [CalendarYearQtr]
                    , [FiscalMonthOfYear]
                    , [FiscalQuarter]
                    , [FiscalYear]
                    , [FiscalYearMonth]
                    , [FiscalYearQtr]
                    )
            VALUES  (
                      ( YEAR(@DateCounter) * 10000 ) + ( MONTH(@DateCounter)
                                                         * 100 )
                      + DAY(@DateCounter)  --DateKey
                    , @DateCounter -- FullDate
                    , CAST(YEAR(@DateCounter) AS CHAR(4)) + '/'
                      + RIGHT('00' + RTRIM(CAST(DATEPART(mm, @DateCounter) AS CHAR(2))), 2) + '/'
                      + RIGHT('00' + RTRIM(CAST(DATEPART(dd, @DateCounter) AS CHAR(2))), 2) --DateName
                    , RIGHT('00' + RTRIM(CAST(DATEPART(mm, @DateCounter) AS CHAR(2))), 2) + '/'
                      + RIGHT('00' + RTRIM(CAST(DATEPART(dd, @DateCounter) AS CHAR(2))), 2)  + '/'
                      + CAST(YEAR(@DateCounter) AS CHAR(4))--DateName
                    , RIGHT('00' + RTRIM(CAST(DATEPART(dd, @DateCounter) AS CHAR(2))), 2) + '/'
                      + RIGHT('00' + RTRIM(CAST(DATEPART(mm, @DateCounter) AS CHAR(2))), 2)  + '/'
                      + CAST(YEAR(@DateCounter) AS CHAR(4))--DateName
                    , DATEPART(dw, @DateCounter) --DayOfWeek
                    , DATENAME(dw, @DateCounter) --DayNameOfWeek
                    , DATENAME(dd, @DateCounter) --DayOfMonth
                    , DATENAME(dy, @DateCounter) --DayOfYear
                    , CASE DATENAME(dw, '11/16/2009')
                        WHEN 'Saturday' THEN 'Weekend'
                        WHEN 'Sunday' THEN 'Weekend'
                        ELSE 'Weekday'
                      END --WeekdayWeekend
                    , DATENAME(ww, @DateCounter) --WeekOfYear
                    , DATENAME(mm, @DateCounter) --MonthName
                    , MONTH(@DateCounter) --MonthOfYear
                    , @LastDayOfMon --IsLastDayOfMonth
                    , DATENAME(qq, @DateCounter) --CalendarQuarter
                    , YEAR(@DateCounter) --CalendarYear
                    , CAST(YEAR(@DateCounter) AS CHAR(4)) + '-'
                      + RIGHT('00' + RTRIM(CAST(DATEPART(mm, @DateCounter) AS CHAR(2))), 2) --CalendarYearMonth
                    , CAST(YEAR(@DateCounter) AS CHAR(4)) + 'Q' + DATENAME(qq, @DateCounter) --CalendarYearQtr
                    , MONTH(@FiscalCounter) --[FiscalMonthOfYear]
                    , DATENAME(qq, @FiscalCounter) --[FiscalQuarter]
                    , YEAR(@FiscalCounter) --[FiscalYear]
                    , CAST(YEAR(@FiscalCounter) AS CHAR(4)) + '-'
                      + RIGHT('00' + RTRIM(CAST(DATEPART(mm, @FiscalCounter) AS CHAR(2))), 2) --[FiscalYearMonth]
                    , CAST(YEAR(@FiscalCounter) AS CHAR(4)) + 'Q' + DATENAME(qq, @FiscalCounter) --[FiscalYearQtr]
                    )

            -- Increment the date counter for next pass thru the loop
            SET @DateCounter = DATEADD(d, 1, @DateCounter)
      END

SET NOCOUNT ON -- turn the annoying messages back on

-- Select all rows inserted for the final year as a sanity check
SELECT  *
FROM    [Dim].[Date]
WHERE DateKey > (YEAR(@EndDate) * 10000)

 

Code Sample 2 – Modified Kimball code to create a Date dimension.

/* Make sure the Dim schema exists */
IF SCHEMA_ID('Dim') IS NULL
   EXECUTE('CREATE SCHEMA [Dim] AUTHORIZATION [dbo]')
GO

/* Drop table DimDate */
IF EXISTS ( SELECT  *
            FROM    dbo.sysobjects
            WHERE   id = OBJECT_ID(N'[Dim].[Date]')
                    AND OBJECTPROPERTY(id, N'IsUserTable') = 1 )
   DROP TABLE [Dim].[Date]
GO

/* Create table DimDate */
CREATE TABLE [Dim].[Date]
       ( [DateKey] BIGINT NOT NULL
       , [FullDate] DATETIME NULL
       , [DateName] CHAR(11) NULL
       , [DateNameUS] CHAR(11) NULL   --US Date FORMAT, MM/DD/YYYY
       , [DateNameEU] CHAR(11) NULL   --European Union Date Format DD/MM/YYYY
       , [DayOfWeek] TINYINT NULL
       , [DayNameOfWeek] CHAR(10) NULL
       , [DayOfMonth] TINYINT NULL
       , [DayOfYear] SMALLINT NULL
       , [WeekdayWeekend] CHAR(7) NULL
       , [WeekOfYear] TINYINT NULL
       , [MonthName] CHAR(10) NULL
       , [MonthOfYear] TINYINT NULL
       , [IsLastDayOfMonth] CHAR(1) NULL
       , [CalendarQuarter] TINYINT NULL
       , [CalendarYear] SMALLINT NULL
       , [CalendarYearMonth] CHAR(7) NULL
       , [CalendarYearQtr] CHAR(7) NULL
       , [FiscalMonthOfYear] TINYINT NULL
       , [FiscalQuarter] TINYINT NULL
       , [FiscalYear] INT NULL
       , [FiscalYearMonth] CHAR(9) NULL
       , [FiscalYearQtr] CHAR(8) NULL
       , [AuditKey] BIGINT IDENTITY NOT NULL
       , CONSTRAINT [PK_DimDate] PRIMARY KEY CLUSTERED ( [DateKey] )
       )
ON     [PRIMARY]
GO

EXEC sys.sp_addextendedproperty @name = N'Table Type', @value = N'Dimension',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date'
EXEC sys.sp_addextendedproperty @name = N'View Name', @value = N'Date',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date'
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Date dimension contains one row for every day, beginning at 1/1/2000. There may also be rows for "hasn''t happened yet."',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date'
EXEC sys.sp_addextendedproperty @name = N'Used in schemas',
  @value = N'Sales (3 roles); Finance; Currency Rates; Sales Quota (2 roles; one at Cal Qtr level)',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date'

GO

INSERT  INTO [Dim].[Date]
        ( DateKey
        , FullDate
        , [DateName]
        , [DateNameUS]
        , [DateNameEU]
        , [DayOfWeek]
        , DayNameOfWeek
        , [DayOfMonth]
        , [DayOfYear]
        , WeekdayWeekend
        , WeekOfYear
        , [MonthName]
        , MonthOfYear
        , IsLastDayOfMonth
        , CalendarQuarter
        , CalendarYear
        , CalendarYearMonth
        , CalendarYearQtr
        , FiscalMonthOfYear
        , FiscalQuarter
        , FiscalYear
        , FiscalYearMonth
        , FiscalYearQtr
        )
VALUES  ( -1
        , NULL
        , 'Unknown'
        , 'Unknown'
        , 'Unknown'
        , NULL
        , 'Unknown'
        , NULL
        , NULL
        , 'Unknown'
        , NULL
        , 'Unknown'
        , NULL
        , 'N'
        , NULL
        , NULL
        , 'Unknown'
        , 'Unknown'
        , NULL
        , NULL
        , NULL
        , 'Unknown'
        , 'Unknown'
        )
GO

EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Surrogate primary key', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'DateKey' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Full date as a SQL date (time=00:00:00)', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'FullDate' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Standard Date Format of YYYY/MM/DD', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'DateName' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Standard US Date Format of MM/DD/YYYY', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'DateNameUS' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Standard European Union Date Format of DD/MM/YYYY', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'DateNameEU' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Number of the day of week; Sunday = 1', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'DayOfWeek' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Day name of week', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'DayNameOfWeek' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Number of the day in the month', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'DayOfMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Number of the day in the year', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'DayOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Is today a weekday or a weekend', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'WeekdayWeekend' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Week of year', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'WeekOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Description', @value = N'Month name',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'MonthName' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Month of year', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'MonthOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Is this the last day of the calendar month?',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'IsLastDayOfMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Calendar quarter', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarQuarter' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Calendar year', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYear' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Calendar year and month', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'CalendarYearMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Calendar year and quarter', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'CalendarYearQtr' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Fiscal month of year (1..12). FY starts in July',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalMonthOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Fiscal quarter', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalQuarter' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Fiscal year. Fiscal year begins in July.',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'FiscalYear' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Fiscal year and month', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'FiscalYearMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'Fiscal year and quarter', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'FiscalYearQtr' ;
EXEC sys.sp_addextendedproperty @name = N'Description',
  @value = N'What process loaded this row?', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'AuditKey' ;
EXEC sys.sp_addextendedproperty @name = N'FK To',
  @value = N'DimAudit.AuditKey', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'AuditKey' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'20041123', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'DateKey' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'11/23/2004', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FullDate' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'23-Nov-2004', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'DateName' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values', @value = N'1..7',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DayOfWeek' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values', @value = N'Sunday',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'DayNameOfWeek' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values', @value = N'1..31',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DayOfMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values', @value = N'1..365',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DayOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'Weekday, Weekend', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'WeekdayWeekend' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'1..52 or 53', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'WeekOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'November', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'MonthName' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'1, 2, …, 12', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'MonthOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values', @value = N'Y, N',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'IsLastDayOfMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'1, 2, 3, 4', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarQuarter' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values', @value = N'2004',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYear' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values', @value = N'2004-01',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYearMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values', @value = N'2004Q1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYearQtr' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'1, 2, …, 12', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalMonthOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'1, 2, 3, 4', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalQuarter' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values', @value = N'2004',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'FiscalYear' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'FY2004-01', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalYearMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Example Values',
  @value = N'FY2004Q1', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalYearQtr' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DateName' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DayOfWeek' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'DayNameOfWeek' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DayOfMonth' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DayOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'WeekdayWeekend' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'WeekOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'MonthName' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'MonthOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'IsLastDayOfMonth' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarQuarter' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYear' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYearMonth' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYearQtr' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalMonthOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalQuarter' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'FiscalYear' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalYearMonth' ;
EXEC sys.sp_addextendedproperty @name = N'SCD  Type', @value = N'1',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalYearQtr' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DateKey' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'FullDate' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DateName' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DayOfWeek' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'DayNameOfWeek' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DayOfMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'DayOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'WeekdayWeekend' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'WeekOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'MonthName' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'MonthOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'IsLastDayOfMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarQuarter' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYear' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYearMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'CalendarYearQtr' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalMonthOfYear' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalQuarter' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN', @level2name = N'FiscalYear' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalYearMonth' ;
EXEC sys.sp_addextendedproperty @name = N'Source System', @value = N'Derived',
  @level0type = N'SCHEMA', @level0name = N'Dim', @level1type = N'TABLE',
  @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'FiscalYearQtr' ;
EXEC sys.sp_addextendedproperty @name = N'Source System',
  @value = N'Derived in ETL', @level0type = N'SCHEMA', @level0name = N'Dim',
  @level1type = N'TABLE', @level1name = N'Date', @level2type = N'COLUMN',
  @level2name = N'AuditKey' ;
EXEC sys.sp_addextendedproperty @name = N'Comments',
  @value = N'In the form: yyyymmdd', @level0type = N'SCHEMA',
  @level0name = N'Dim', @level1type = N'TABLE', @level1name = N'Date',
  @level2type = N'COLUMN', @level2name = N'DateKey' ;
GO

SQL PASS Summit 2009 – Birds of a Feather Lunch

I’m currently at the SQL PASS Summit in Seattle Washington. At today’s Birds of a Feather lunch I’ll be acting as host for a table. Our subject is Full Text Searching and FileStreaming. If you’d like to learn more about either of these topics come on by, would love to meet you and see more of how you use these products in your environment.

TechMixer University – SSIS for Developers

In addition to help recruit speakers, I also had the privilege of speaking at TechMixer University 2009.

The slide deck and main demo can be found at my Code Gallery site:

https://code.msdn.microsoft.com/Release/ProjectReleases.aspx?ProjectName=SSISForDevs&ReleaseId=2883

The calling of SSIS from .Net demo can be found at:

http://code.msdn.microsoft.com/ssisfromnet

Thanks to everyone who attended TechMixer University. I look forward to seeing you next year!

SQL Saturday 25 Gainesville GA – October 10 2009

SQL Saturday 25 Logo

SQL Saturday 25 occurred in the lovely town of Gainesville GA on October 10th. At the event I did two presentations.

My first presentation of the day was Introduction to SQL Server Integration Services. The sample project, slide deck, and step by step instructions can be found at http://code.msdn.microsoft.com/introssis . In addition I also showed how to call SSIS from a .Net application. You can find that sample at http://code.msdn.microsoft.com/ssisfromnet

The second presentation is SQL Server Full Text Searching. You can find the slide deck in PDF format as well as sample code at http://code.msdn.microsoft.com/SqlServerFTS.

Thanks to Stu and the crew for a great event!

Using SSIS Package Configuration Values as Parameters to SQL Tasks

Last weekend I presented at SQL Saturday in Redmond WA. One attendee asked if it was possible to use values from the Package configuration file as parameters to a SQL Task. The answer was a definite yes, although I didn’t have a good example. This post aims to fix that.

Let’s start with the basic workflow. First you will need to create a variable, this variable will be used to pass the value from the config file to the SQL statement. Next you will need to establish a package configuration file, to hold the configured value. After that you will create an Execute SQL Task. In the input statement for the task, simply use ? for each parameter in the T-SQL statement. Finally you’ll click on the Parameter Mapping area of the SQL Task and map the variable to the position of each ?.

Now that you have a basic overview of where we’re going, let’s start this example by creating a simple table to update. Pick your favorite database, or create a test one, then run a simple script to create a table. I used my ArcaneCode database, which I use for samples, and created a table named TestParamToProcs in the dbo schema.

USE [ArcaneCode]
GO

/****** Object:  Table [dbo].[TestParamToProcs]    Script Date: 10/06/2009 21:45:19 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[TestParamToProcs](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [SomeText] [nvarchar](50) NULL,
CONSTRAINT [PK_TestParamToProcs] PRIMARY KEY CLUSTERED
(
    [ID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

GO

Now let’s create a new package. I started by opening BIDS (Business Intelligence Developer Studio), creating a solution and named it ConfigFileToSqlParams. I added a Shared Data Source to connect to my database. Next I renamed the default package to ParamToProcs.dtsx. I added the shared data source to the connection managers area of the package.

OK, that takes care of the basics. Next we need to setup the package for variables and a config file. Open up the variables window (click in the Control Surface area, and in the SSIS menu pick Variables.) Create a new variable, we’ll call it MyParam. Make it a string and insert the value of “X” into it.

Now right click on the Control Flow surface (all that yellow area in the Control Flow tab) and pick Package Configurations. Click the check box to “Enable package configurations”, then click the Add button at the bottom of the screen. You are now presented with the Package Configuration Wizard. The first screen is just a welcome screen, you can click past it. The next screen is the “Select Configuration Type” screen. Make sure the default of “XML configuration file” is selected in the drop down. Next under “Specify configuration files directly” enter the path and file name for your config XML file. I named mine ParamToProcsConfig.xml and clicked Next.

The next step in the wizard is the “Select Properties to Export” screen. Expand the Variables area in the tree, then the tree for MyParam, then Properties. Now make sure the only box checked is the one for “Value”.

1_PropsToExport

Click Next once your screen looks like the one above, and you’ll be at the completion screen. Give your package config a good name, I used MyParamConfig, and clicked Finish.

Now let’s drop 3 SQL Tasks into the Control Flow surface. The first task will simply delete all the rows in the table that was just created. That way we can run the package repeatedly. For all three tasks set the Connection to the shared source you added in the Connection Managers area. Now for the SQLStatement of our first task, we’ll use Direct Input and enter this SQL statement:

DELETE FROM dbo.TestParamToProcs

For the next SQL task we’ll insert a row into our table to manipulate. Use this for the SQL Statement property.

INSERT INTO dbo.TestParamToProcs (SomeText) VALUES (‘Hello’)

OK, those were fairly easy and straight forward. Now comes the fun part. For the next SQL Task, we’ll use this statement for the SQLStatement property.

UPDATE dbo.TestParamToProcs SET SomeText = ?

Note the use of the ? (question mark) for the value of the SomeText field. This is the parameter. Now we need to supply a value to this parameter. Over on the left, click the Parameter Mapping tab. Click Add and a new row will be placed in the grid. In the first column, Variable Name, from the drop down pick “User::MyParam”. Leave the Direction as “Input”. Since this is a string variable, we’ll need to change the Data Type column to “VARCHAR”. Finally we need to set the Parameter Name, For this we’ll enter a 0 (zero). We’ll leave the Parameter Size at the default of –1.

2_Params

When your dialog resembles this one, you can click OK. If you needed to use more than one parameter, just use ? for each place where a parameter would go in your T-SQL statement. Then use the values of 0, 1, 2 etc for the Parameter Name. 0 will map to the first ?, 1 to the second, and so on. OK at this point your SSIS package should look something like:

3_Package

If you now run the package, then jump into SQL Server Management Studio and run a “SELECT * FROM dbo.TestParamToProcs”, you should see one row with the SomeText column set to X. Great! Now lets give it one more test. We should change the config file, and run the package outside of BIDS.

First, open notepad or some other text editor, then open up the ParamToProcsConfig.XML file. Move to the <ConfiguredValue> tags, and change the X to something else. I used “ArcaneCode” as it was rather spiffy, but you can use your name, or your cat’s name if you like. Save the file and you can close your text editor.

Now let’s use dtexec to run our package. Open up a command window and navigate to the folder where your package is located. I then used the dtexecui utility to create a package execution command. I’d encourage you to play with this, but so we’re all on the same page here is the dtexec command line I came up with:

dtexec /FILE "D:\Presentations\SQL Server\ConfigFileToSqlParams\ConfigFileToSqlParams\ConfigFileToSqlParams\ParamToProcs.dtsx" /CHECKPOINTING OFF  /REPORTING EW

Of course you’ll want to change the path to wherever you have the ParamToProcs.dtsx package. You should be able to run the above command line and execute your package. If you the jump back to SSMS and rerun the “SELECT * FROM dbo.TestParamToProcs”, you should now see the row with the new value from the config file.

To summarize, the basic steps are:

1. Create the variable

2. Store the variable in a package configuration XML file.

3. Create a SQL Task. Use ? for the parameters in the T-SQL statement.

4. Map the variables to the parameters.

5. Run. Be happy.

And that’s all there is to it.

SQL Saturday Redmond – October 3 2009

I am fortunate enough to be able to give three presentations at Redmond WA’s SQL Saturday event. The first session is “Introduciton to Data Warehousing / Business Intelligence”. Here is the PDF slide deck for that presentation. (Right click and save as if you want to save a copy for later reference).

The second presentation is SQL Server Full Text Searching. You can find the slide deck in PDF format as well as sample code at http://code.msdn.microsoft.com/SqlServerFTS.

The final presentation of the day was Introduction to SQL Server Integration Services. The sample project, slide deck, and step by step instructions can be found at http://code.msdn.microsoft.com/introssis . In addition I also showed how to call SSIS from a .Net application. You can find that sample at http://code.msdn.microsoft.com/ssisfromnet .

SQL Server MVPs Help War Child International

SQL Server MVP Deep Divers CoverI’m proud to announce the new book SQL Server MVP Deep Dives has been released. You can find out more, as well as place your pre-order for the book at http://www.sqlservermvpdeepdives.com

I am proud for several reasons. First, I am a contributing author. If you look at Chapter 13, Full Text Searching, you’ll find my name, Robert C. Cain. This is my first work in print, and it was a great experience. I got a lot of great advice from the editors, fellow MVPs. It was also good as I got to do some editing myself. (To keep down costs we edited each others chapters.) In addition I got to work with the great folks at Manning and working through their publication process.

But I’m even more proud because all proceeds from the book go to War Child International. War Child International is a federation of charities devoted to helping children in war torn countries. They not only meet the basic needs of the kids, but work to give a message of peace, so when they grow up the cycle of violence will be broken.

The official book launch will take place at the PASS Summit, Nov. 2 to 5 in Seattle Washington. Manning promises to have plenty of the books in the Summit bookstore. Many of the MVPs, including myself, will be there and be glad to sign books for those interested.

To make the most of your donation though, placing your order at http://www.sqlservermvpdeepdives.com will get the most money to War Child. If you order now, you can access the early online version, and a printed copy will be mailed to you. This is a great chance to gain a tremendous amount of knowledge and help a worthy cause at the same time. Plus there is an added bonus for those attending the PASS Summit, if you buy now you can read chapters on line, and be prepared to ask questions of the authors at the summit!

 

wc_wallpaper23_800x600

 

How can you help?  Obviously, start by buying a copy of the book. Then let your fellow SQL Server and Developer geeks know about this effort. Urge them to buy a copy, get your company to buy several for the company library. Or do like I will, and buy several copies to give to friends. Finally, you can take the direct approach. Go directly to the War Child site and make a donation today.

WordCamp Birmingham 2009 – Freedom of Speech

Over the weekend I attended WordCamp Birmingham 2009. For those of you who don’t know, WordCamp is a code camp for WordPress users and developers. WordPress is a very popular open source blogging engine. Users can download the WordPress engine from http://wordpress.org/, then customize it to their needs.

WordPress can do some pretty amazing things, many of the developers there specialize in customizing WordPress to individual corporate needs, and transforming WordPress to a full Content Management System (CMS).

In addition to being an open source engine, it’s sister sight, http://wordpress.com, is a hosting site. Through it you can create your own blog, for free. You can customize your blog from a variety of base templates, then add further custom tweaks through the various widgets and plug-ins offered by the WordPress.com folks. For minimal fees you can add a custom domain name, and extend the amount of space you have available.

That’s what I’ve done with this blog. Arcane Code is hosted on WordPress.com, and I pay the small fee (about 15 dollars US per year) to have the http://arcanecode.com URL. You may ask “gee Robert, you’re a smart guy, why let them host and not get the software from the .org site and host yourself?” You’re right, I am a smart guy! ;-) Seriously, I could host the engine myself, but to be honest I would rather spend my time writing blog posts than worrying about upgrading my blog software to the current version. I let the nice folks at wordpress.com take care of those headaches for me.

Of course there are a few restrictions with the .com site that I would not have with the .org software and self hosting. The biggest is no advertising, I can’t sell ads on the blog while it’s on the .com site. I’m also limited on templates and customizations, I have to use the built in .com templates, with the .org software the sky is the limit to what I want to do. For not having to deal with the headaches of managing my blog engine though, these are trade off’s I’m willing to make. One day in the future I may change my mind, but for now I’m quite happy.

But enough about the blogging software, let’s talk about WordCamp. For my .Net developer buddies or fellow SQL geeks, WordCamp is just like any code camp or SQL Saturday you’ve been to. Speakers are organized into one or more tracks. Most of the speakers are from the local community or surrounding region, with a few big wigs thrown in for good measure.

This year Innovation Depot hosted the Saturday tracks, one for developers and the other for bloggers. The blogger track was aimed at new users or folks who simply wanted to work with social media, and leave the technical considerations to others. The developer track was for the geeks who liked to customize and develop widgets and plug ins for use with WordPress. Lunch was some great BBQ, fitting for a true southern event.

The Sunday event took place at Shift Workspace, which is a facility where you can rent space to work in for under $50 US a month. Tables, comfortable chairs, coffee and soda, and all the wi-fi you can eat. It’s a nice place, and the format was very open. On the first floor small groups gathered to discuss and debate topics around the software. The second floor was the experts area, I saw many groups of two huddled around laptops, getting and giving advice on particular issues folks were having.

The highlight of the event was Matt Mullenweg’s lunchtime keynote on Saturday. Matt is the original creator of WordPress, it was his idea and his guidance that made it successful. In addition Matt also founded a company to host WordPress.com and provide extra services for advanced users. In addition to being a good businessman Matt is also a great speaker, his lunch time presentation was both informative and humorous.

Also in attendance was Dougal Campbell. Dougal was one of the original group of developers of WordPress. He and Matt have been working together on the open source software since 2003. Oddly enough this event was something of a historic occasion for them, even though they have been e-mailing and phone calling with each other since 2003, this weekend was the first time Matt and Dougal had actually met face to face! In the interest of full disclosure I should add that Dougal is my brother-in-law, he is married to my kid sister. But I won’t hold that against him.

The closing keynote on Saturday was from an Iranian Bahrainian blogger. In the interest of protecting their security I won’t say too much, but it was a very moving presentation that reminded us all of how great a privilege freedom of speech is. One Iranian blogger has already died in jail, and another Egyptian blogger is currently in jail right now for doing nothing more than speaking his opinions through his blog. 

I have to give the organizers high marks, the event was run well, lunch arrived on time, and plenty of it. There was a big crowd, I heard about 165 registered, and I think just about every one of them made it from the crowds I saw. We had such a good response the organizers even spoke about the possibility of creating a WordPress user group of sorts, and having smaller events either on a monthly or quarterly basis.

This was a really fun event. I saw some friends (and relatives if you count Dougal) and met a lot of new people. I talked to folks from Nashville TN, Charlotte NC, Atlanta GA, and one lady from New Jersey. I also heard about one person coming in from Arkansas and another from Texas. I also came away with some great ideas around social networking, and using various forms of multimedia to enhance information and knowledge transfer in the work place. I spoke to a lawyer who specializes in discovery and got into an interesting discussion about data mining of unstructured data. I also have an idea that might be relevant for a presentation next year. Finally I am struck with the notion of taking WordPress and making it a dashboard for a SQL Server Business Intelligence solution. Hmmm…..

All in all it was a great WordCamp, and I’m looking forward to the 2010 event.

Calling SSIS from .Net

In a recent DotNetRocks show, episode 483, Kent Tegels was discussing SQL Server Integration Services and how it can be useful to both the BI Developer as well as the traditional application developer. While today I am a SQL Server BI guy, I come from a long developer background and could not agree more. SSIS is a very powerful tool that could benefit many developers even those not on Business Intelligence projects. It was a great episode, and I high encourage everyone to listen.

There is one point though that was not made very clear, but I think is tremendously important. It is indeed possible to invoke an SSIS package from a .Net application if that SSIS package has been deployed to the SQL Server itself. This article will give an overview of how to do just that. All of the sample code here will also be made available in download form from the companion Code Gallery site, http://code.msdn.microsoft.com/ssisfromnet .

In this article, I do assume a few prerequisites. First, you have a SQL Server with SSIS installed, even if it’s just your local development box with SQL Server Developer Edition installed. Second, I don’t get into much detail on how SSIS works, the package is very easy to understand. However you may wish to have a reference handy. You may also need the assistance of your friendly neighborhood DBA in setting up the SQL job used in the process.

Summary

While the technique is straightforward, there are a fair number of detailed steps involved. For those of you just wanting the overview, we need to start with some tables (or other data) we want to work with. After that we’ll write the SSIS package to manipulate that data.

Once the package is created it must be deployed to the SQL Server so it will know about it. This deploy can be to the file system or to SQL Server.

Once deployed, a SQL Server Job must be created that executes the deployed SSIS package.

Finally, you can execute the job from your .Net application via ADO.NET and a call to the sp_start_job stored procedure built into the msdb system database.

OK, let’s get to coding!

Setup the Tables

First we need some data to work with. What better than a listing of previous Dot Net Rocks episodes? I simply went to the Previous Shows page, highlighted the three columns of show number, show name, and date, and saved them to a text file. (Available on the Code Gallery site.)

Next we need a place to hold data so SSIS can work with it. I created a database and named it ArcaneCode, however any database should work. Next we’ll create a table to hold “staging” DNR Show data.

CREATE TABLE [dbo].[staging_DNRShows](
  [ShowData] [varchar](250) NOT NULL
) ON [PRIMARY]

This table will hold the raw data from the text file, each line in the text file becoming one row here. Next we want a table to hold the final results.

CREATE TABLE [dbo].[DNRShows](
  [ShowNumber] [int] NOT NULL,
  [ShowName] [varchar](250) NULL,
  [ShowDate] [datetime] NULL,
  CONSTRAINT [PK_DNRShows] PRIMARY KEY CLUSTERED
  (
  [ShowNumber] ASC
  )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
  ) ON [PRIMARY]

The job of the SSIS package will be to read each row in the staging table and split it into 3 columns, the show’s number, name, and date, then place those three columns into the DNRShows table above.

The SSIS Package

The next step is to create the SSIS package itself. Opening up Visual Studio / BIDS, create a new Business Intelligence SQL Server Integration Services project. First let’s setup a shared Data Source to the local server, using the ArcaneCode database as our source.

The default package name of “Package.dtsx” isn’t very informative, so let’s rename it ”LoadDNRShows.dtsx”. Start by adding a reference to the shared data source in the Connection Managers area, taking the default. Then in the Control Flow surface add 3 tasks, as seen here:

clip_image001

The first task is an Execute SQL Task that simply runs a “DELETE FROM dbo.DNRShows” command to wipe out what was already there. Of course in a true application we’d be checking for existing records in the data flow and doing updates or inserts, but for simplicity in this example we’ll just wipe and reload each time.

The final task is also an Execute SQL Task, after we have processed the data we no longer need it in the staging table, so we’ll issue a “DELETE FROM dbo.staging_DNRShows” to remove it.

The middle item is our Data Flow Task. This is what does the heavy lifting of moving the staging data to the main table. Here is a snapshot of what it looks like:

clip_image002

The first task is our OLEDB Source, it references the staging_DNRShows table. Next is what’s called a Derived Column Transformation. This will allow you to add new calculated columns to the flow, or add columns from variables. In this case we want to add three new columns, based on the single column coming from the staging table.

clip_image003

As you can see in under Columns in the upper left, we have one column in our source, ShowData. In the lower half we need to add three new columns, ShowNumber, ShowDate, and ShowName. Here are the expressions for each:

ShowNumber
    (DT_I4)SUBSTRING(ShowData,1,FINDSTRING(ShowData,"\t",1))

ShowDate
    (DT_DBDATE)SUBSTRING(ShowData,FINDSTRING(ShowData,"\t",2) + 1,LEN(ShowData) – FINDSTRING(ShowData,"\t",2))

ShowName
    (DT_STR,250,1252)SUBSTRING(ShowData,FINDSTRING(ShowData,"\t",1) + 1,FINDSTRING(ShowData,"\t",2) – FINDSTRING(ShowData,"\t",1) – 1)

The syntax is an odd blend of VB and C#. Each one starts with a “(DT_”, these are type casts, converting the result of the rest of the expression to what we need. For example, (DT_I4) converts to a four byte integer, which we need because in our database the ShowNumber column was defined as an integer. You will see SUBSTRING and LEN which work like their VB counterparts. FINDSTRING works like the old POS statement, it finds the location of the text and returns that number. The “\t” represents the tab character, here the C# fans win out as the Expression editor uses C# like escapes for special characters. \t for tab, \b for backspace, etc.

Finally we need to write out the data. For this simply add an OLEDB Destination and set it to the target table of dbo.DNRShows. On the mappings tab make sure our three new columns map correctly to the columns in our target table.

Deploy the Package

This completes the coding for the package, but there is one final step we need to do. First, in the solution explorer right click on the project (not the solution, the project as highlighted below) and pick properties.

clip_image004

In the properties dialog, change the “CreateDeploymentUtility” option from false (the default) to True.

clip_image006

Now click the Build, Build Solution menu item. If all went well you should see the build was successful. It’s now time to deploy the package to the server. Navigate to the folder where your project is stored, under it you will find a bin folder, and in it a Deployment folder. In there you should find a file with a “.SSISDeploymentManifest” extension. Double click on this file to launch the Package Installation Wizard.

When the wizard appears there are two choices, File system deployment and SQL Server deployment. For our purposes we can use either one, there are pros and cons to each and many companies generally pick one or the other. In this example we’ll pick SQL Server deployment, but again know that I’ve tested this both ways and either method will work.

Once you pick SQL Server deployment, just click Next. Now it asks you for the server name, I’ve left it at (local) since I’m working with this on a development box; likewise I’ve left “Use Windows Authentication”. Finally I need the package path, I can select this by clicking the ellipse (the …) to the right of the text box. This brings up a dialog where I can select where to install.

clip_image007

In a real world production scenario we’d likely have branches created for each of our projects, but for this simple demo we’ll just leave it in the root and click OK.

Once your form is filled out as below, click Next.

clip_image008

We are next queried to what our installation folder should be. This is where SSIS will cache package dependencies. Your DBA may have a special spot setup for these, if not just click next to continue.

Finally we are asked to confirm we know what we are doing. Just click Next. If all went well, the install wizard shows us it’s happy with a report, and we can click Finish to exit.

Setup the SQL Server Job

We’ve come a long way and we’re almost to the finish line, just one last major step. We will need to setup a SQL Server Job which will launch the SSIS package for us. In SQL Server Management Studio, navigate to the “SQL Server Agent” in your Object Explorer. If it’s not running, right click and pick “Start”. Once it’s started, navigate to the Jobs branch. Right click and pick “New Job”.

When the dialog opens, start by giving your job a name. As you can see below I used LoadDNRShows. I also entered a description.

clip_image010

Now click on the Jobs page over on the left “Select a page” menu. At the bottom click “New” to add a new job step.

In the job step properties dialog, let’s begin by naming the step “Run the SSIS package”. Change the Type to “SQL Server Integration Services Package”. When you do, the dialog will update to give options for SSIS. Note the Run As drop down, this specifies the account to run under. For this demo we’ll leave it as the SQL Server Agent Service Account, check with your DBA as he or she may have other instructions.

In the tabbed area the General tab first allows us to pick the package source. Since we deployed to SQL Server we’ll leave it at the default, however if you had deployed to the file system this is where you’d need to change it to pick your package.

At the bottom we can use the ellipse to pick our package from a list. That done your screen should look something like:

clip_image011

For this demo that’s all we need to set, I do want to take a second to encourage you to browse through the other tabs. Through these tabs you can set many options related to the package. For example you could alter the data sources, allowing you to use one package with multiple databases.

Click OK to close the job step, then OK again to close the Job Properties window. Your job is now setup!

Calling from .Net

The finish line is in sight! Our last step is to call the job from .Net. To make it a useful example, I also wanted the .Net application to upload the data the SSIS package will manipulate. For simplicity I created a WinForms app, but this could easily be done in any environment. I also went with C#, again the VB.Net code is almost identical.

I started by creating a simple WinForm with two buttons and one label. (Again the full project will be on the Code Gallery site).

clip_image012

In the code, first be sure to add two using statements to the standard list:

using System.Data.SqlClient;

using System.IO;

Behind the top button we’ll put the code to copy the data from the text file we created from the DNR website to the staging table.

    private void btnLoadToStaging_Click(object sender, EventArgs e)

    {

      /* This method takes the data in the DNRShows.txt file and uploads them to a staging table */

      /* The routine is nothing magical, standard stuff to read as Text file and upload it to a  */

      /* table via ADO.NET                                                                      */

 

      // Note, be sure to change to your correct path

      string filename = @"D:\Presentations\SQL Server\Calling SSIS From Stored Proc\DNRShows.txt";

      string line;

 

      // If you used a different db than ArcaneCode be sure to set it here

      string connect = "server=localhost;Initial Catalog=ArcaneCode;Integrated Security=SSPI;";

      SqlConnection connection = new SqlConnection(connect);

      connection.Open();

 

      SqlCommand cmd = connection.CreateCommand();

 

      // Wipe out previous data in case of a crash

      string sql = "DELETE FROM dbo.staging_DNRShows";

      cmd.CommandText = sql;

      cmd.ExecuteNonQuery();

 

      // Now setup for new inserts

      sql = "INSERT INTO dbo.staging_DNRShows (ShowData) VALUES (@myShowData)";

 

      cmd.CommandText = sql;

      cmd.Parameters.Add("@myShowData", SqlDbType.VarChar, 255);

 

      StreamReader sr = null;

 

      // Loop thru text file, insert each line to staging table

      try

      {

        sr = new StreamReader(filename);

        line = sr.ReadLine();

        while (line != null)

        {

          cmd.Parameters["@myShowData"].Value = line;

          cmd.ExecuteNonQuery();

          lblProgress.Text = line;

          line = sr.ReadLine();

        }

      }

      finally

      {

        if (sr != null)

          sr.Close();

        connection.Close();

        lblProgress.Text = "Data has been loaded";

      }

 

Before you ask, yes I could have used any number of data access technologies, such as LINQ. I went with ADO.NET for simplicity and believing most developers are familiar with it due to its longevity. Do be sure and update the database name and path to the file in both this and the next example when you run the code.

This code really does nothing special, just loops through the text file and uploads each line as a row in the staging table. It does however serve as a realistic example of something you’d do in this scenario, upload some data, then let SSIS manipulate it on the server.

Once the data is there, it’s finally time for the grand finale. The code behind the second button, Execute SSIS, does just what it says; it calls the job, which invokes our SSIS package.

    private void btnRunSSIS_Click(object sender, EventArgs e)

    {

      string connect = "server=localhost;Initial Catalog=ArcaneCode;Integrated Security=SSPI;";

      SqlConnection connection = new SqlConnection(connect);

      connection.Open();

 

      SqlCommand cmd = connection.CreateCommand();

 

      // Wipe out previous data in case of a crash

      string sql = "exec msdb.dbo.sp_start_job N’LoadDNRShows’";

      cmd.CommandText = sql;

      cmd.ExecuteNonQuery();

      connection.Close();

      lblProgress.Text = "SSIS Package has been executed";

 

    }

The key is this sql command:

exec msdb.dbo.sp_start_job N’LoadDNRShows’

“exec” is the T-SQL command to execute a stored procedure. “sp_start_job” is the stored procedure that ships with SQL Server in the MSDB system database. This stored procedure will invoke any job stored on the server. In this case, it invokes the job “LoadDNRShows”, which as we setup will run an SSIS package.

Launch the application, and click the first button. Now jump over to SQL Server Management Studio and run this query:

select * from dbo.staging_DNRShows;

select * from dbo.DNRShows;

You should see the first query bring back rows, while the second has nothing. Now return to the app and click the “Execute SSIS” button. If all went well running the query again should now show no rows in our first query, but many nicely processed rows in the second. Success!

A few thoughts about xp_cmdshell

In researching this article I saw many references suggesting writing a stored procedure that uses xp_cmdshell to invoke dtexec. DTEXEC is the command line utility that you can use to launch SSIS Packages. Through it you can override many settings in the package, such as connection strings or variables.

xp_cmdshell is a utility built into SQL Server. Through it you can invoke any “DOS” command. Thus you could dynamically generate a dtexec command, and invoke it via xp_cmdshell.

The problem with xp_cmdshell is you can use it to invoke ANY “DOS” command. Any of them. Such as oh let’s say “DEL *.*” ? xp_cmdshell can be a security hole, for that reason it is turned off by default on SQL Server, and many DBA’s leave it turned off and are not likely to turn it on.

The techniques I’ve demonstrated here do not rely on xp_cmdshell. In fact, all of my testing has been done on my server with the xp_cmdshell turned off. Even though it can be a bit of extra work, setting up the job, etc., I still advise it over the xp_cmdshell method for security and the ability to use it on any server regardless of its setting.

In Closing

That seemed like a lot of effort, but can lead to some very powerful solutions. SSIS is a very powerful tool designed for processing large amounts of data and transforming it. In addition developing under SSIS can be very fast due to its declarative nature. The sample package from this article took the author less than fifteen minutes to code and test.

When faced with a similar task, consider allowing SSIS to handle the bulk work and just having your .Net application invoke your SSIS package. Once you do, there are no ends to the uses you’ll find for SQL Server Integration Services.

Intro to DW/BI at the Steel City SQL Users Group

Tonight I’ll be presenting “Introduction to Data Warehousing / Business Intelligence” at the Steel City SQL users group, right here in Birmingham Alabama. If you attended my Huntsville presentation last week, I’ve already added some new slides and revised the deck, so it will be worth another look.

My slide deck is IntroToDataWarehouse.pdf . Come join us tonight at 6 pm at New Horizons, there will be pizza and fun for all.

UPDATE: Before the presentation I was showing a video of Sara Ford jumping off a tower to support CodePlex. Got tons of laughs so here’s a link to the video:

http://blogs.msdn.com/saraford/archive/2009/09/14/my-codeplex-jump-from-tallest-building-in-the-southern-hemisphere-the-full-video.aspx

Intro to DW/BI at the Huntsville User Group meeting

Tonight at the Huntsville User group I am presenting “Introduction to Data Warehousing / Business Intelligence”. The slide deck for my presentation is now available in PDF format at the link below. If you have attended this presentation in the past you may wish to download the slides again as I have updated it with new information.

IntroToDataWarehousing

Data Warehousing / BI at the next HuntUG Meeting!

Business Intelligence is one of the most in demand skill sets right now. Do you want to know more about it? Be guided through all the terminology and concepts? Do you live in the Huntsville Alabama area? Well here’s your golden opportunity!

On Tuesday, September 8th I will be presenting “Introduction to Data Warehousing / Business Intelligence” at the next meeting of the Huntsville User Group. I’ll demystify all the terms around DW/BI and give a demonstration of the Microsoft SQL Server tools used in the DW/BI process. See their site for meeting time and directions.