Zero to Hero with PowerShell and SQL Server at 24 Hours of Pass

On September 9th I am co-presenting “Zero to Hero with PowerShell and SQL Server” for the 24 Hours of PASS. If you’ve not heard of 24 Hours of PASS, it is 24 straight hours of online presentations. This time the sessions are a preview of the PASS Summit in Seattle, WA in November.

At the PASS Summit, along with two co-workers, I am presenting a full-day precon entitled Zero to Hero with PowerShell and SQL Server. I’m also doing a regular session, Make SQL Server POP with PowerShell.

The session for 24 Hours of PASS will take place at 00:00 GMT on September 10th, or for those of us in the States, September 9th at 8 PM Eastern, 7 PM Central, 6 Mountain, or 5 Pacific. The session is titled the same as the precon, Zero to Hero with PowerShell and SQL Server. Through the preceding link you can see more about the session, get to the full schedule, and most importantly register!

Be sure to check out my co-presenters too, Bradley Ball (@SqlBalls | http://sqlballs.com ) and Jason Strate (@StrateSQL | http://www.jasonstrate.com )

PowerScripting Podcast

I just want to say thanks to the guys at the PowerScripting Podcast for having me on tonight. As soon as the episode is released I’ll follow up with a link.

For those who came here from hearing me on the podcast, you can find more info on SQL Saturday at: http://bit.ly/sqlsat328

If you want to find out more about my sessions at the PASS Summit, you can jump to http://bit.ly/acsummit. My co-presenters for the precon are Brad Ball @sqlballs and Jason Strate @stratesql.

The Pragmatic Works webinars can be found on the company website at http://pragmaticworks.com. Just follow the Free Training on the T’s link to get access to the webinars. You can search by author name (Robert Cain will get you mine) or topic.

My other training videos can be found on Pluralsight, http://pluralsight.com/training.

I also have a YouTube channel with a couple of videos, https://www.youtube.com/user/arcanecode. Check out the Column Mode Editing video for a quick editing tip on making your life easier with both PowerShell and SQL Server.

Arcane-SQL–A PowerShell Module for Generating SQL Code

Overview

There are many PowerShell modules available for assisting the busy DBA with managing their SQL Server environment. This isn’t one of them. This module is targeted toward SQL Developers, with special functionality for data warehouse developers. A common task for BI professionals, one that is performed on almost every project, is the creation of a staging area. This might be a set of tables in the data warehouse, perhaps in their own schema, or in an entirely separate database often called an operational data store (ODS).

The staging tables are typically similar in structure to the ones in the source database. Similar, but not identical, as there are some small modifications commonly made to the staging tables. First, large data types such as VARCHAR(MAX) are seldom useful in data analysis and thus could be removed. Next, even the most casual user of SSIS will quickly see that SSIS prefers to work with double-byte character types (WSTR in SSIS, which maps to NVARCHAR in T-SQL) as opposed to the single-byte (STR/VARCHAR) types. It can be helpful to convert these in the staging area.

This module can (optionally) do all of these things and more when it is used to generate CREATE TABLE or SELECT statements. Imagine, if you will, a source system with thousands of tables and the need to create a staging area for it in a new data warehouse. This quickly becomes a long, boring, tedious task. Now imagine being able to write a bit of PowerShell code and generate these tables in just a few minutes’ time.

Before diving in, it is highly suggested you download and review the example script, Arcane-SQL-Example.ps1. This demonstrates the most commonly used functions and provides patterns for their use.

Functionality

While the module is full of functions, there are a few core ones that should be highlighted. Complete documentation can be found in the module itself, which has been fully documented using the native PowerShell help system. In addition there is an example script file which demonstrates some of the most common tasks; a condensed usage sketch also follows the function list below.

  • Enable-SqlAssemblies – This is the most important function; without calling it, nothing else works. Be aware the SQL Server assemblies (including SMO – SQL Server Management Objects – and the SQL Provider) need to be on the machine where this script is run. This module has been tested on, and is intended for, SQL developers with SQL Server Developer Edition installed on their workstations.
  • Join-InvokeInstance and Join-ProviderInstance – Most of the interaction with the SQL Provider requires the server name and instance name, assembled in a path-like syntax. The Invoke-Sqlcmd cmdlet likewise requires this formatting; however, there is a little quirk. If the instance is "default", Invoke-Sqlcmd requires it to be omitted, while the provider requires it to be present. These two functions reduce the confusion: simply pass in the server name and instance, and they will format things correctly.
  • Get-TablesCollection – When working with tables it is common to iterate over all the tables in a database. This function will generate a PowerShell array of table objects, each object being of type Microsoft.SqlServer.Management.Smo.Table. By having table objects the wide variety of properties for the table are available, such as Schema name, Table name, and Row Count.
  • Get-TableByName – Most commonly scripts will retrieve an array of tables using the above Get-TablesCollection, then iterate over them in a foreach loop. There are times, however, when only a single table from the collection is desired. For those cases Get-TableByName can be used to retrieve a specific table object based on the name of the table.
  • Remove-SchemasFromTableCollection and Select-SchemasInTableCollection – Get-TablesCollection will return an array of all the tables in a database. Often there is a need to work with only a subset of that table collection. These two functions will filter based on the schema and return a new array. The first, Remove-SchemasFromTableCollection, removes all tables belonging to the schemas passed in. The second, Select-SchemasInTableCollection, retains only those tables in the schemas passed into the function.
  • Remove-TablesFromTableCollection and Select-TablesInTableCollection – These work as filters, similar to the functions above. Instead of the schema, however, they filter on table name. All tables whose names begin with the text passed in are either removed or, with the latter function, the only ones retained.
  • Get-PrimaryKeyIndex – Returns the primary key index object for the passed-in table object.
  • Get-PrimaryKeyColumnNames – Returns a comma-delimited list of the column names in the primary key.
  • Decode-IsPrimaryKeyColumn – Determines whether the passed-in column name is part of the primary key index.
  • Get-TruncateStatement – Generates a SQL TRUNCATE TABLE statement based on the table object passed in.
  • Get-DropTableStatement – Generates a DROP TABLE statement, including the check to see if the table exists, for the passed-in table object.
  • Get-CreateStatement – To simply say this function generates a create table statement would do it a disservice. It will take a table object and reverse engineer it, generating a create table statement. Unlike other code generators, it has a suite of parameters which allow customization of the generated statement with an eye toward the needs of a data warehouse developer. A few are:
    • DataTypeAlignColumn – Sets the column number to line up the data type declarations on. Passing in a value of 1 will suppress alignment and simply place the data type after the column name. The default is column 50.
    • OverrideSchema – It is common to place staging tables in the data warehouse in their own schema, often named ‘Staging’ or ‘ETL’. Passing in a value here will use that new schema name in the create table declaration. If the table object passed in had a schema other than dbo, the original schema name is placed in front of the table name with an underscore. If it was dbo, the source schema is simply omitted.
    • PrependToTableName and AppendToTableName – Allows extra text to be placed before or after the table name. For example, it is common to create tables with _Delete, _Update, and _Insert in the staging area. This provides a simple way to do that.
    • AdditionalColumns – When creating tables in a data warehousing environment, there are often extra columns to hold metadata about the ETL process. A user of this function can create an array of additional columns using the Add-ColumnDefinition function and have them added to the create table statement.
    • Scrub – This is a very powerful switch. When added it will perform a cleanup to make the output suitable for data warehousing. Columns with large data types such as VARCHAR(MAX) are removed. All single-byte character types in the source are converted to double-byte types.
    • SuppressIdentity – Source systems will sometimes use the IDENTITY clause in the primary key column. Using this switch will suppress that identity clause from being generated in the new create table statement.
    • SuppressNotNull – Often staging tables are not concerned with null versus not null values. Using this switch will create all columns as nullable, regardless of their setting in the source.
    • IncludeDropTable – Adding this switch will include an ‘if exists drop table’ style clause prior to the create table statement.
    • PrimaryKeysOnly – Will generate a create table statement that only has the primary keys found in the source system.

Finally, if a column in the source table object has a custom data type, the script will reverse engineer the data type back to its basic SQL data type.

  • Get-SelectStatement – Like its sister function Get-CreateStatement, under the covers this function provides a lot of power and flexibility to the statement it creates. Additional columns can be added, columns can be specified to order the output by, table aliases can be used, and most powerful of all is the ability to generate a HASHBYTES column, including the ability to remove specified columns from the hash byte calculation. Here are some of its parameters:
    • AsColumn – The routine will line up the AS <column alias> at the column number passed in here. The default is 50. To not use aligning, set this to 1.
    • PrependToColumnName – Text to include before each column name.
    • AppendToColumnName – Text to place after each column name.
    • AdditionalColumns – A collection of additional columns to be added to the SELECT statement. Useful for adding metadata columns. All items in the AdditionalColumns collection should be generated using the Add-OutputColumn function.
    • OrderByColumns – A list of columns to add to the ORDER BY clause. All items in the OrderByColumns collection should be generated using the Add-OutputColumn function.
    • TableAlias – Allows the user to specify an alias to use for the table. The alias is then put in front of each column name.
    • HashBytes – If included the select statement will include a HASHBYTES function with all columns except the primary keys and any columns included in the OmitFromHashBytes collection. The name passed in this parameter will be used for the name of the HashBytes column.
    • OmitFromHashBytes – A collection of column names that should be excluded from the HashBytes calculated column. Useful for excluding metadata columns. All items in the OmitFromHashBytes collection should be generated using the Add-OutputColumn function.
    • Scrub – When included this will remove certain data columns from the output, such as BINARY, NVARCHAR(MAX), XML, and other large types not normally used in data warehouses. Additionally VARCHAR/CHAR are converted to NVARCHAR/NCHAR, and DATETIME converted to DATETIME2(4).
    • Flatten – When included, will return the SELECT statement as one long string, without any Carriage Return / Line Feed characters. Additionally, any extra spacing (such as the alignment specified with AsColumn) is eliminated.
    • IncludeOrderByPK – When included the Primary Keys in the table object are included in the order by clause. If any columns are passed in the OrderByColumns parameter, the Primary Keys occur first in the Order By clause, followed by any columns in the OrderByColumns parameter.
    • IncludeNoLock – When included, a WITH NOLOCK clause is added to the SELECT statement.
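
To give a feel for how these functions fit together, here is a condensed sketch of a typical session. The function and switch names come from the list above, but the positional arguments and exact parameter spellings are illustrative only; treat Arcane-SQL-Example.ps1 as the authoritative reference.

# Sketch only - see Arcane-SQL-Example.ps1 for the canonical patterns
Import-Module Arcane-SQL

# Load the SQL Server assemblies (SMO and the SQL Provider); nothing works until this runs
Enable-SqlAssemblies

# Build a correctly formatted server\instance string for the SQL Provider
$providerInstance = Join-ProviderInstance "localhost" "default"

# Get every table in the source database, then keep only the dbo schema
$tables = Get-TablesCollection $providerInstance "AdventureWorks"
$tables = Select-SchemasInTableCollection $tables "dbo"

# Reverse engineer each table into a staging-friendly CREATE TABLE script
foreach ($table in $tables)
{
  Get-CreateStatement $table -OverrideSchema "Staging" -Scrub -SuppressIdentity -IncludeDropTable |
    Out-File "C:\Staging\$($table.Schema)_$($table.Name).sql"
}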

Construction

Those PowerShell experts who review the code may note that in many places it does not follow the most "powershelly" way of doing things. In some places, for example, a foreach loop was used rather than the pipeline. The intended audience for this module is T-SQL developers, who may not be as comfortable in PowerShell as they are in T-SQL. Code that aligns more closely with T-SQL patterns should be more useful to, and more easily modified by, SQL developers.

When development first started, attempts were made to use advanced functions, using the pipeline for input and output. At some point, however, this didn’t make sense for the majority of the functions, and time constraints further impeded the effort. Some future revision may migrate selected functions back to an advanced design, but for now they will have to stand as is.

Development Environment

This module is intended to be used on a developer workstation, not on a server, and especially not on a production server. As such, deployment is simple: just copy the Arcane-SQL folder to the developer’s PowerShell module library. On a standard Windows 7 machine this would be C:\Users\<username>\Documents\WindowsPowerShell\Modules. If the WindowsPowerShell folder and Modules subfolder do not exist, they will need to be created first.
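
For example, assuming the module was downloaded to C:\Downloads (an arbitrary location used here just for illustration), the copy and import might look like this:

# Copy the module into the current user's module library, creating the folder if needed
$modulePath = Join-Path $HOME "Documents\WindowsPowerShell\Modules"
if (-not (Test-Path $modulePath)) { New-Item -Path $modulePath -ItemType Directory | Out-Null }
Copy-Item -Path "C:\Downloads\Arcane-SQL" -Destination $modulePath -Recurse

# Confirm PowerShell can see the module, then load it
Get-Module -ListAvailable Arcane-SQL
Import-Module Arcane-SQL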

To keep things simple, no attempt was made to sign the script. If this is an issue, the developer using this module can self-sign it on their PC. Check the execution policy on the workstation where the module is installed to ensure sufficient rights to run the module.
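
As a quick sanity check (RemoteSigned is just one reasonable choice here; use whatever policy your organization allows):

# View the current policy; unsigned local scripts won't run under Restricted or AllSigned
Get-ExecutionPolicy

# RemoteSigned allows local, unsigned scripts to run (set this from an elevated prompt)
Set-ExecutionPolicy RemoteSigned

# If Windows flagged the downloaded files as coming from the web, unblock them (PowerShell 3 and up)
Get-ChildItem "$HOME\Documents\WindowsPowerShell\Modules\Arcane-SQL" | Unblock-File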

This module was developed on machines with SQL Server (Developer Edition) 2008R2 and 2012 installed (some machines with both) and worked without issue. One machine it was tested on had three versions of SQL Server: 2008R2, 2012, and 2014. On that machine there were some errors in some of the functions that take SMO table objects. Those are still being investigated.

SQL security was assumed to be handled using built-in Windows credentials. Thus the logged-in user will need rights, based on Windows credentials, to the SQL Server they are targeting.

Development was done using PowerShell v4; however, v3 should work as well.

This module was developed using SAPIEN PowerShell Studio 2014. To make life easy for other developers the PowerShell Studio files (Arcane-SQL.psproj, Arcane-SQL.psproj.build, and Arcane-SQL.psprojs) were included in the code. If you are using a different editor, such as the PowerShell ISE, simply discard these files.

Warranty

To put it succinctly, there is none. No guarantee is made for the code in this module; users of this module assume all risks. While I am happy to receive bug reports, I make no promises or guarantees about if, or when, they will be fixed.

Contributions

No, not the money kind, code contributions. If anyone wishes to extend the functionality of this module I am happy to collaborate as long as the coding standards demonstrated in this module are adhered to, and the contributions are relevant to the goals of this module. Be aware though this is not a money making effort, so expect no monetary reimbursement for any contributions.

Download

You can download the module and its example at:

http://gallery.technet.microsoft.com/Arcane-SQL-A-PowerShell-185651ad

Importing MongoDB Data Using SSIS 2012

I have embarked on a little quest to learn other database platforms (especially NoSQL), as more and more of our clients at Pragmatic Works have them in their enterprise and want to be able to import data from them into their SQL Server data warehouses using SQL Server Integration Services (SSIS). While I found several articles that showed how to do so, they were outdated due to changes in the MongoDB C# driver. After quite a bit of effort figuring out how to get this working, I thought I’d pass along my hard-fought knowledge.

First, I assume you are familiar with MongoDB (http://www.mongodb.org/) and SQL Server (https://www.microsoft.com/en-us/sqlserver/default.aspx). In my examples I am using SSIS 2012 and MongoDB 2.4.8, along with the C# driver version 1.7 for MongoDB available at http://docs.mongodb.org/ecosystem/drivers/csharp/ .

First, download and install the C# driver. This next step is important, as there was a change that occurred with version 1.5 of the driver: the DLLs are no longer installed in the GAC (Global Assembly Cache) automatically. They must be there, however, for SSIS to be able to use them.

By default, my drivers were installed to C:\Program Files (x86)\MongoDB\CSharpDriver 1.7. You’ll want to open a CMD window in Administrator mode and navigate to this folder. Next you’ll need GACUTIL; on my computer I found the most recent version at:

C:\Program Files (x86)\Microsoft SDKs\Windows\v8.1A\bin\NETFX 4.5.1 Tools\x64\

A simple trick to find yours: since you are already in the CMD window, just move to the C:\Program Files (x86) folder and do a "dir /s gacutil.exe". It will list all occurrences of the program; just use the one with the most recent date. Register the DLLs by entering these commands:

"C:\Program Files (x86)\Microsoft SDKs\Windows\v8.1A\bin\NETFX 4.5.1 Tools\x64\gacutil" /i MongoDB.Bson.dll

"C:\Program Files (x86)\Microsoft SDKs\Windows\v8.1A\bin\NETFX 4.5.1 Tools\x64\gacutil" /i MongoDB.Driver.dll

Note the quote marks around the path are important; they let the CMD window correctly separate the gacutil program from its parameters.
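
To verify the registration took, gacutil can also list what is in the GAC; a quick, optional check looks like this:

"C:\Program Files (x86)\Microsoft SDKs\Windows\v8.1A\bin\NETFX 4.5.1 Tools\x64\gacutil" /l MongoDB.Bson

"C:\Program Files (x86)\Microsoft SDKs\Windows\v8.1A\bin\NETFX 4.5.1 Tools\x64\gacutil" /l MongoDB.Driver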

Once that is done, create a new SQL Server Integration Services project in SQL Server Data Tools (SSDT), what used to be called BIDS in SQL Server 2008R2 (and previous). Put a Data Flow Task on the Control Flow design surface. Then open the Data Flow Task for editing.

Next, drag and drop a Script Component transformation onto the Data Flow design surface. When prompted, change the component type to Source.

[Screenshot: selecting Source as the Script Component type]

Now edit the script transform by double clicking on it. Move to the Inputs and Outputs page. For my test, I am using the dbo.DimCurrency collection I created using the technique I documented in the previous post, Exporting Data from SQL Server to CSV Files for Import to MongoDB Using PowerShell ( http://arcanecode.com/2014/01/13/exporting-data-from-sql-server-to-csv-files-for-import-to-mongodb-using-powershell/ )

I renamed the output from “output” to “MongoDB_DimCurrency”. I then added four columns, CurrencyName, CurrencyAlternateKey, CurrencyKey, and ID.

[Screenshot: the MongoDB_DimCurrency output with its four columns]

Make sure to set CurrencyName, CurrencyAlternateKey, and ID to the "Unicode string [DT_WSTR]" data type. Then change CurrencyKey to "four-byte signed integer [DT_I4]".

Now return to the Script page and click Edit Script. In the Solution Explorer pane, expand References, right click and pick Add Reference. Go to Browse, and navigate to the folder where the MongoDB C# drivers are installed. On my system it was in C:\Program Files (x86)\MongoDB\CSharpDriver 1.7\. Add both MongoDB.Driver.dll and MongoDB.Bson.dll.

[Screenshot: the Add Reference dialog showing the MongoDB driver DLLs]

Click OK when done; your Solution Explorer should now look something like:

[Screenshot: Solution Explorer with the MongoDB.Bson and MongoDB.Driver references added]

Now in the script, expand the Namespaces region and add these lines:

using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Bson.Serialization;

Now scroll down to the CreateNewOutputRows() procedure. Here is a sample of the code I used:

public override void CreateNewOutputRows()
{
  // Connection information for the MongoDB server and the source database
  string connectionString = "mongodb://localhost";
  string databaseName = "AdventureWorksDW2014";

  // MongoClient is the current way to connect; older driver versions used MongoServer directly
  var client = new MongoClient(connectionString);
  var server = client.GetServer();
  var database = server.GetDatabase(databaseName);
  string CurrencyKey = "";

  // Walk every document in the collection, adding a row to the output buffer for each
  foreach (BsonDocument document in database.GetCollection<BsonDocument>("dbo.DimCurrency").FindAll())
  {
    MongoDBDimCurrencyBuffer.AddRow();

    // The ternary operator replaces any nulls with empty strings
    MongoDBDimCurrencyBuffer.CurrencyName = document["CurrencyName"] == null ? "" : document["CurrencyName"].ToString();
    MongoDBDimCurrencyBuffer.CurrencyAlternateKey = document["CurrencyAlternateKey"] == null ? "" : document["CurrencyAlternateKey"].ToString();

    // CurrencyKey is an integer in the output buffer, so grab it as a string, then convert
    CurrencyKey = document["CurrencyKey"] == null ? "" : document["CurrencyKey"].ToString();
    MongoDBDimCurrencyBuffer.CurrencyKey = Convert.ToInt32(CurrencyKey);

    MongoDBDimCurrencyBuffer.ID = document["_id"] == null ? "" : document["_id"].ToString();
  }
}

I start by defining a connection string to the MongoDB server, followed by the database name. I then create a MongoClient object. Note the MongoClient is the new way of connecting to the MongoDB server. In earlier versions of the C# driver, you used MongoServer objects.

I then cycle through each document in the collection “dbo.DimCurrency”, using the FindAll() method. For each item I use the AddRow() method to add a row to the buffer. In order to find the proper name for the buffer I went to the Solution Explorer and expanded the BufferWrapper.cs file. This is a class created by the script transform with the name of the output buffer.

[Screenshot: Solution Explorer showing the BufferWrapper.cs file]

For each column in my outputs, I map a column from the document. Note the use of the ternary operator ? : to strip out nulls and replace them with empty strings. String columns can be mapped directly from the document object to the output buffer’s columns.

The CurrencyKey column, being an integer, had to be converted from a string to an integer. To make it simple I created a string variable to hold the return value from the document, then used the Convert class to convert it to an Int32.

Once you’ve done all the above, validate the code by building it. If that all checks out, save your work, close the code window, then close the Script Transformation Editor by clicking OK.

Now place a destination of some kind on the Data Flow. Since I have my company’s Task Factory tools I used a TF Terminator Destination, but you could also use a Row Count destination. On the data flow path between the two, right click and pick Enable Data Viewer. Execute the package, and if all goes well you should see:

[Screenshot: the Data Viewer showing rows returned from the MongoDB collection]

A few final notes. This test was done using a MongoDB document schema that was flat, i.e. it didn’t have any documents embedded in the documents I was testing with. (Hopefully I’ll be able to test that in the future, but it will be the subject of a future post.) Second, the key was the registering of the DLLs in the GAC. Until I did that, I couldn’t get the package to execute. Finally, by using the newer API for the MongoDB objects I’ve ensured compatibility for the future.

Exporting Data from SQL Server to CSV Files for Import to MongoDB Using PowerShell

I’ve been exploring other database systems in order to determine how to import data from them using SQL Server Integration Services (SSIS). My first step, though, was to create some test data. I wanted something familiar, so I decided to export the Adventure Works Data Warehouse sample database and import it into MongoDB. While I had many options, I decided the simplest way was to first export the data to CSV files, then use the MongoDB utility mongoimport. Naturally I turned to PowerShell to create an automated, reusable process.

First, if you need the Adventure Works DW database, you’ll find it at http://msftdbprodsamples.codeplex.com/. Second, I did my export from a special version of Adventure Works DW I created called AdventureWorksDW2014. This is optional, but if you want to have a version of Adventure Works DW updated with current dates, see my post at http://arcanecode.com/2013/12/08/updating-adventureworksdw2012-for-2014/. Third, I assume you are familiar with MongoDB, but if you want to learn more go to http://www.mongodb.org/.

Below is the PowerShell 3 script I created. The script is broken into four regions. The first, User Settings, contains the variables that you, the user, might need to change to get the script to run. It has things like the name of the SQL Server database, the path to MongoDB, etc.

The second region, Common, establishes variables that are used by the remaining two regions. You shouldn’t need to change or alter these. The third region accesses SQL Server and exports each table in the selected database to a CSV format file.

The final region, “Generate MongoDB import commands”, creates a batch (.BAT) file which has all the commands needed to run mongoimport for each CSV file. I decided not to have the PowerShell script execute the .BAT file so it could be reviewed before it is run. There might be some tables you don’t want to import, etc.

It is also quite easy to adapt this script to just create CSV files from SQL Server without the MongoDB piece. Simply remove the fourth and final region, then in the Common and User Settings regions remove any variables that begin with the prefix "mongo".

As the comments do a good job of explaining what happens, I’ll let you review the included documentation for step-by-step instructions.

#==================================================================================================
# SQLtoCSVtoMongoDb.ps1
# Robert C. Cain | @ArcaneCode | http://arcanecode.com
#
# If you need a simple way to export data from SQL Server to MongoDb, here is one way to do it.
# The script starts by setting up some variables to the server environment (see the User Settings
# region)
#
# Next, it exports data from each table in the selected database to individual CSV files.
# Finally, it generates a batch file which executes mongoimport for each csv file to import
# into MongoDb.
#
# I broke this into four regions so if all that is desired is a simple export of data to CSVs,
# you can simply omit the final region along with any variables that begin with "mongo".
#
# While I could have gone ahead and run the batch file at the end, I chose not to in order to
# give you time to review the output prior to running the batch file.
#==================================================================================================

Clear-Host

#region User Settings

  # In this section, set the variables so they are appropriate for your project / environment
 
  # This is the spot where you want to store the generated CSVs.
  # Make sure it does NOT end in a \
  $csvPath = "C:\mongodb"

  # If you are running this on a computer other than the server, set the name of the server below
  $sqlServer = $env:COMPUTERNAME

  # If you have a named instance be sure replace "default" with the name of the instance
  $sqlInstance = "\default"

  # Enter the name of the database to export below
  $sqlDatabaseName = "AdventureWorksDW2014"

  # The settings below only apply to the MongoDB code generation
  # Assemble path to mongodb. This assumes utilities are stored in the default bin folder
  $mongoPath = "C:\mongodb"
  $mongoImport = "$mongoPath\bin\mongoimport"

  # Set the server name and port
  $mongoHost = "localhost"   # Leave blank to default to localhost
  $mongoPort = ""            # Leave blank to default to 27107
 
  # Set the user name and password, leave blank if it isn’t needed
  $mongoUser = ""
  $mongoPW = ""

  # Enter the name of the database to import to.
  $mongoDatabaseName = "AdventureWorksDW2014"

  # Upserts are REALLY slow, especially on large datasets. Setting this to $true will turn off
  # the upsert option. If set to true, you are responsible for either deleting all documents
  # in the collection before hand, or allowing the risk of duplicates.
  #
  # Setting to false will enable the upsert option for mongoimport, and attempt to determine the
  # keys and (if found) add them to the final mongoimport command.
  $mongoNoUpsert = $true

#endregion

#region Common ------------------------------------------------------------------------------
 
  # This section sets variables used by both regions below. There is no need to alter anything
  # in this region.

  # Import the SQLPS provider (if it’s not already loaded)
  if (-not (Get-PSProvider SqlServer -ErrorAction SilentlyContinue))
    { Import-Module SQLPS -DisableNameChecking }

  # Assemble the full servername \ instance (note $sqlInstance already includes the leading \)
  $sqlServerInstance = "$sqlServer$sqlInstance"

  # Assemble the full path for the SQL Provider to get to the database
  $sqlDatabaseLocation = "SQLSERVER:\sql\$sqlServerInstance\databases\$sqlDatabaseName"

  # Now tack on the Tables ‘folder’ to the SQL Provider path, then move there
  $sqlTablesLocation = $sqlDatabaseLocation + "\Tables"
  Set-Location $sqlTablesLocation

  # Get a list of tables in this database
  $sqlTables = Get-ChildItem

#endregion

#region Export SQL Data ---------------------------------------------------------------------
  # In this section we will export data from each table in the database to a CSV file.
  # WARNING: If the CSV file exists, it will be overwritten.

  # These are just used to display informational messages during processing
  $sqlTableIterator = 0
  $sqlTableCount = $sqlTables.Count

  # Iterate over each table in the database
  foreach($sqlTable in $sqlTables)
  {
    $sqlTableName = $sqlTable.Schema + "." + $sqlTable.Name   

    # I’ll grant you the next little bit of formatting for the progress messages is a bit
    # OCD on my part, but I like my output formatted and easy to read.
    $sqlTableIterator++
    $padCount = " " * (1 + $sqlTableCount.ToString().Length – $sqlTableIterator.ToString().Length)
    $sqlTableIteratorFormatted = $padCount + $sqlTableIterator

    if( $sqlTableName.Length -gt 50 )
      { $padTable = " " }
    else
      { $padTable = " " * (50 – $sqlTableName.Length) }

    Write-Host -ForegroundColor White -NoNewline "Processing Table $sqlTableIteratorFormatted of $sqlTableCount : $sqlTableName $padTable"
   
    # If the instance is "default", we have to exclude it when we use Invoke-SqlCmd
    if($sqlInstance.ToLower() -eq "\default")
      { $sqlSI = $sqlServer }
    else
      { $sqlSI = $sqlServerInstance }

    # Load an object with all the data in the table
    # Note if you have especially large tables you may need to modify this
    # section to break things into smaller chunks.
    $sqlCmd = "SELECT * FROM " + $sqlTableName
    $sqlData = Invoke-Sqlcmd -Query $sqlCmd `
                             -ServerInstance $sqlSI `
                             -SuppressProviderContextWarning

    # Now write the data out.
    # Note utf8 encoding is important, as it is all mongoimport understands
    # Also need to omit the Type Info header PowerShell wants to write out
    Write-Host -ForegroundColor Yellow "    Writing to table $sqlTableName.csv"
    $sqlData | Export-Csv -NoTypeInformation -Encoding "utf8" -Path "$csvPath\$sqlTableName.csv"

  }

  # Just add a blank line after the processing ends
  Write-Host

#endregion

#region Generate MongoDB import commands ----------------------------------------------------

  # In this region we will generate the commands to import our newly exported data
  # into an existing database in MongoDB. This is an example of our desired output (wrapped
  # onto multiple lines for readability, in the output it will be a single line):

  #  C:\mongodb>bin\mongoimport --host localhost --port 27017
  #                             --db AdventureWorksDW2014 --collection DimSalesReason
  #                             --username Me --password mySuperSecureP@ssW0rd!
  #                             --type csv --headerline --file DimSalesReason.csv
  #                             --upsert --upsertFields SalesReasonKey

  # Note several of these parameters are optional, and could use defaults, or be potentially
  # omitted from the final output, based on the choices at the very beginning of this script

  # Feel free to alter the $mongoCommand as needed for other circumstances

  # Final warning, the database must already exist in MongoDb in order to import the data. This
  # script will not generate the database for you.

  # Create the name for the batch file we will generate
  $mongoBat = $csvPath + "\Import_SQL_" + $sqlDatabaseName + "_to_MongoDb_" + $mongoDatabaseName + ".bat"

  # See if file exists, if so delete it
  if (Test-Path $mongoBat)
    { Remove-Item $mongoBat }

  # These are just used to display informational messages during processing
  $sqlTableIterator = 0
  $sqlTableCount = $sqlTables.Count

  # mongoimport allows us to do upserts, helping to eliminate duplicate rows on import.
  #
  # To make an upsert work there has to be a key column to match up on. Fortunately,
  # most tables in the SQL Server world have Primary Keys, so we can find out what
  # columns those are and add it to the command. Note if there is no PK in SQL Server,
  # no upsert will be attempted.
  #
  # Note though that upserts are REALLY slow, so the option to skip them is
  # built into the script and set at the top (mongoNoUpsert). The generated batch file
  # assumes that either a) you have deleted all data from the collection ahead of time,
  # or b) you are OK with the risk of duplicate data.

  # Iterate over each table in the database to build the mongoimport command
  foreach($sqlTable in $sqlTables)
  {
    $sqlTableName = $sqlTable.Schema + "." + $sqlTable.Name

    # A bit more OCD progress messages
    $sqlTableIterator++
    $padCount = " " * (1 + $sqlTableCount.ToString().Length – $sqlTableIterator.ToString().Length)
    $sqlTableIteratorFormatted = $padCount + $sqlTableIterator
    Write-Host -ForegroundColor Green "Building mongoimport command for table $sqlTableIteratorFormatted of $sqlTableCount : $sqlTableName"

    # Begin building the command
    $mongoCommand = "$mongoImport "
   
    if ($mongoHost.Length -ne 0)
      { $mongoCommand += "--host $mongoHost " }

    if ($mongoPort.Length -ne 0)
      { $mongoCommand += "--port $mongoPort " }

    $mongoCommand += "--db $mongoDatabaseName --collection $sqlTableName "

    if ($mongoUser.Length -ne 0)
      { $mongoCommand += " --username $mongoUser --password $mongoPW " }

    $mongoCommand += " --type csv --headerline --file $csvPath\$sqlTableName.csv "
       
    # Build the upsert clause, if the user has elected to use it.
    if ($mongoNoUpsert -eq $false)
    {
      $mongoPKs = ""
      foreach($sqlIndex in $sqlTable.Indexes)
      {
        if($sqlIndex.IndexKeyType -eq 'DriPrimaryKey')
        {
          foreach($sqlCol in $sqlIndex.IndexedColumns)
          {
            if ($mongoPKs.Length -ne 0)
              { $mongoPKs += "," }
            # Note column names are returned with [ ] around them, and must be removed
            # Have to use -replace instead of .Replace() because $sqlCol is a column object, not a string
            $mongoPKs += ($sqlCol -replace "\[", "") -replace "\]", ""
          }
               
          $mongoCommand += " –upsert –upsertFields $mongoPKs"
        }           
      }
    }

    # Append the command to the batch file
    $mongoCommand | Out-File -FilePath $mongoBat -Encoding utf8 -Append

  }

  # Just add a blank line after the processing ends
  Write-Host

#endregion

 

Updating AdventureWorksDW2012 for 2014

A while back I did a post that contained a script to update the AdventureWorksDW2012 database to have dates for the 2013 time period. This will allow folks to demo date related queries and be able to simply use things like GETDATE or NOW without having to do funky math tricks to take into account the pitifully out of date offering.

I’ve now updated the script for 2014, thought I’d pass along the updated version. Note some browsers don’t seem to render the script using the mono-spaced font I intend, but just ignore. Copy and paste into SQL Server Management Studio and it should work fine.

/*-----------------------------------------------------------------------------------------------*/
/* Updating AdventureWorksDW2012 for Today                                                       */
/*                                                                                               */
/* Robert C. Cain, http://arcanecode.com @ArcaneCode                                             */
/*                                                                                               */
/* Script Copyright (c) 2013 by Robert C. Cain                                                   */
/* AdventureWorks database Copyright (c) Microsoft.                                              */
/*                                                                                               */
/* This script will make a backup of the AdventureWorksDW2012 database, then copy and restore it */
/* as AdventureWorksDW2014. It will then update it for current dates. 2008 now becomes 2014,     */
/* 2007 is now 2013, and so forth. This script is dependent on the AdventureWorksDW2012 sample   */
/* database already being installed. It won't change AdventureWorksDW2012 in any way.            */
/*                                                                                               */
/* Be warned, if AdventureWorksDW2014 exists, it will be deleted as part of this process.        */
/*                                                                                               */
/*-----------------------------------------------------------------------------------------------*/

PRINT 'Updating AdventureWorksDW2012 for Today - Starting'
GO

/*-----------------------------------------------------------------------------------------------*/
/* Step 1 - Make a copy of AdventureWorksDW2012 and restore as AdventureWorksDW2014              */
/*-----------------------------------------------------------------------------------------------*/
SET NOCOUNT ON

USE [master]

-- Step 1.1. Make a backup of AdventureWorksDW2012 ----------------------------------------------
PRINT 'Backing up AdventureWorksDW2012'
GO

BACKUP DATABASE [AdventureWorksDW2012] 
    TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\AdventureWorksDW2012.bak' 
  WITH NOFORMAT, 
       INIT,  
       NAME = N'AdventureWorksDW2012-Full Database Backup', 
       SKIP, 
       NOREWIND, 
       NOUNLOAD,  
       STATS = 10
GO


-- Step 1.2. Delete the database AdventureWorksDW2014 if it exists ------------------------------
PRINT 'Deleting AdventureWorksDW2014, if it exists'
GO

IF (EXISTS (SELECT 1 
              FROM master.dbo.sysdatabases 
             WHERE name = 'AdventureWorksDW2014' )
   )
   EXEC msdb.dbo.sp_delete_database_backuphistory @database_name = N'AdventureWorksDW2014'
GO

IF (EXISTS (SELECT 1 
              FROM master.dbo.sysdatabases 
             WHERE name = 'AdventureWorksDW2014' )
   )
   DROP DATABASE [AdventureWorksDW2014]
GO

-- Step 1.3. Restore the database to a new copy -------------------------------------------------
PRINT 'Restoring AdventureWorksDW2012 to AdventureWorksDW2014'
GO

RESTORE DATABASE [AdventureWorksDW2014] 
   FROM  DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\Backup\AdventureWorksDW2012.bak' 
   WITH  FILE = 1,  
   MOVE N'AdventureWorksDW2012_Data' 
     TO N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\AdventureWorksDW2014_Data.mdf',  
   MOVE N'AdventureWorksDW2012_Log' 
     TO N'C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\AdventureWorksDW2014_log.ldf',  
        NOUNLOAD,  STATS = 5

GO

PRINT 'Done Creating AdventureWorksDW2014'
GO



/*-----------------------------------------------------------------------------------------------*/
/* Step 2. Create a helper function to convert dates to a YYYYMMDD format Date Id.               */
/*-----------------------------------------------------------------------------------------------*/

USE [AdventureWorksDW2014]
GO

IF EXISTS (SELECT [name] FROM [sys].[all_objects] WHERE [name] = 'DateToDateId')
  DROP FUNCTION [dbo].[DateToDateId];
GO

CREATE FUNCTION [dbo].[DateToDateId]
(
  @Date DATETIME
)
RETURNS INT
AS
BEGIN

  DECLARE @DateId  AS INT
  DECLARE @TodayId AS INT

  SET @TodayId = YEAR(GETDATE()) * 10000
               + MONTH(GETDATE()) * 100
               + DAY(GETDATE())         

  -- If the date is missing, or a placeholder for a missing date, set to the Id for missing dates
  -- Else convert the date to an integer
  IF @Date IS NULL OR @Date = '1900-01-01' OR @Date = -1
    SET @DateId = -1  
  ELSE
    BEGIN
      SET @DateId = YEAR(@Date) * 10000
                  + MONTH(@Date) * 100
                  + DAY(@Date)         
    END  
  
  -- If there's any data prior to 2000 it was incorrectly entered, mark it as missing
  IF @DateId BETWEEN 0 AND 19991231 
    SET @DateId = -1

  -- Commented out for this project as future dates are OK
  -- If the date is in the future, don't allow it, change to missing
  -- IF @DateId > @TodayId 
  --   SET @DateId = -1

  RETURN @DateId

END

GO




/*-----------------------------------------------------------------------------------------------*/
/* Step 3. Add new dates to the dbo.DimDate table.                                               */
/*-----------------------------------------------------------------------------------------------*/
PRINT 'Adding new dates to dbo.DimDate'
GO

SET NOCOUNT ON

-- Later we will be writing an INSERT INTO... SELECT FROM to insert the new record. I want to 
-- join the day and month name memory variable tables, but need to have something to join to. 
-- Since everything is calculated, we'll just create this little bogus table to have something
-- to select from.
DECLARE @BogusTable TABLE
  ( PK TINYINT)

INSERT INTO @BogusTable SELECT 1;


-- Create a table variable to hold the days of the week with their various language versions
DECLARE @DayNameTable TABLE
  ( [DayNumberOFWeek]      TINYINT
  , [EnglishDayNameOfWeek] NVARCHAR(10)
  , [SpanishDayNameOfWeek] NVARCHAR(10)
  , [FrenchDayNameOfWeek]  NVARCHAR(10)
  )

INSERT INTO @DayNameTable
SELECT DISTINCT 
       [DayNumberOFWeek]      
         , [EnglishDayNameOfWeek] 
         , [SpanishDayNameOfWeek] 
         , [FrenchDayNameOfWeek]  
  FROM dbo.DimDate

-- Create a month table to hold the months and their language versions.
DECLARE @MonthNameTable TABLE
  ( [MonthNumberOfYear] TINYINT
  , [EnglishMonthName]  NVARCHAR(10)
  , [SpanishMonthName]  NVARCHAR(10)
  , [FrenchMonthName]   NVARCHAR(10)
  )

INSERT INTO @MonthNameTable
SELECT DISTINCT
       [MonthNumberOfYear] 
     , [EnglishMonthName]  
     , [SpanishMonthName]  
     , [FrenchMonthName]   
  FROM dbo.DimDate

-- These are the start and end dates used to populate the 
-- dbo.DimDate dimension. Change them if you run across this script in a later year.
DECLARE @FromDate AS DATE = '2011-01-01'
DECLARE @ThruDate AS DATE = '2015-12-31'

-- CurrentDate will be incremented each time through the loop below.
DECLARE @CurrentDate AS DATE
SET @CurrentDate = @FromDate

-- FiscalDate will be set six months into the future from the CurrentDate
DECLARE @FiscalDate  AS DATE

-- Now we simply loop over every date between the From and Thru, inserting the
-- calculated values into DimDate.
WHILE @CurrentDate <= @ThruDate
BEGIN

  SET @FiscalDate = DATEADD(m, 6, @CurrentDate)

  INSERT INTO dbo.DimDate
  SELECT [dbo].[DateToDateId](@CurrentDate)
       , @CurrentDate
       , DATEPART(dw, @CurrentDate) AS DayNumberOFWeek
       , d.EnglishDayNameOfWeek
       , d.SpanishDayNameOfWeek
       , d.FrenchDayNameOfWeek
       , DAY(@CurrentDate) AS DayNumberOfMonth
       , DATEPART(dy, @CurrentDate) AS DayNumberOfYear
       , DATEPART(wk, @CurrentDate) AS WeekNumberOfYear
       , m.EnglishMonthName
       , m.SpanishMonthName
       , m.FrenchMonthName
       , MONTH(@CurrentDate) AS MonthNumberOfYear
       , DATEPART(q, @CurrentDate) AS CalendarQuarter
       , YEAR(@CurrentDate) AS CalendarYear
       , IIF(MONTH(@CurrentDate) < 7, 1, 2) AS CalendarSemester
       , DATEPART(q, @FiscalDate) AS FiscalQuarter
       , YEAR(@FiscalDate) AS FiscalYear
       , IIF(MONTH(@FiscalDate) < 7, 1, 2) AS FiscalSemester
    FROM @BogusTable
    JOIN @DayNameTable d
      ON DATEPART(dw, @CurrentDate) = d.[DayNumberOFWeek]
    JOIN @MonthNameTable m
      ON MONTH(@CurrentDate) = m.MonthNumberOfYear

  SET @CurrentDate = DATEADD(d, 1, @CurrentDate)
END
GO

-- If you want to verify you can uncomment this line.
-- SELECT * FROM dbo.DimDate WHERE DateKey > 20110000

PRINT 'Done adding new dates to dbo.DimDate'
GO





/*-----------------------------------------------------------------------------------------------*/
/* Step 4. Update the Fact Tables with the new dates.                                            */
/*-----------------------------------------------------------------------------------------------*/


PRINT 'Update Fact Tables'
GO

SET NOCOUNT ON

-- To move forward six years, we simply add 60,000 to the date key

-- 4.1 FactFinance ------------------------------------------------------------------------------
PRINT '  FactFinance'
GO

UPDATE [dbo].[FactFinance]
   SET [DateKey] = [DateKey] + 60000;


-- 4.2 FactInternetSales ------------------------------------------------------------------------
PRINT '  FactInternetSales'
GO

-- There are a few rows where the due date is on leap year. Update these to back off a day 
-- so the date add works OK
UPDATE [dbo].[FactInternetSales]
   SET [OrderDateKey] = 20080228
     , [OrderDate] = '2008-02-28'
 WHERE [OrderDateKey] = 20080229

UPDATE [dbo].[FactInternetSales]
   SET [DueDateKey] = 20080228
     , [DueDate] = '2008-02-28'
 WHERE [DueDateKey] = 20080229

UPDATE [dbo].[FactInternetSales]
   SET [ShipDateKey] = 20080228
     , [ShipDate] = '2008-02-28'
 WHERE [ShipDateKey] = 20080229

-- Now update the rest of the days. 
UPDATE [dbo].[FactInternetSales]
   SET [OrderDateKey] = [OrderDateKey] + 60000
     , [DueDateKey] = [DueDateKey] + 60000
     , [ShipDateKey] = [ShipDateKey] + 60000
     , [OrderDate] = DATEADD(yy, 6, [OrderDate])
     , [DueDate] = DATEADD(yy, 6, [DueDate])
     , [ShipDate] = DATEADD(yy, 6, [ShipDate])


-- 4.3 FactResellerSales ------------------------------------------------------------------------
PRINT '  FactResellerSales'
GO

-- As with Internet Sales, there are rows where the due date is on leap year. 
-- Update these to back off a day so the date add works OK
UPDATE [dbo].[FactResellerSales]
   SET [OrderDateKey] = 20080228
     , [OrderDate] = '2008-02-28'
 WHERE [OrderDateKey] = 20080229

UPDATE [dbo].[FactResellerSales]
   SET [DueDateKey] = 20080228
     , [DueDate] = '2008-02-28'
 WHERE [DueDateKey] = 20080229

UPDATE [dbo].[FactResellerSales]
   SET [ShipDateKey] = 20080228
     , [ShipDate] = '2008-02-28'
 WHERE [ShipDateKey] = 20080229

-- Now update the table
UPDATE [dbo].[FactResellerSales]
   SET [OrderDateKey] = [OrderDateKey] + 60000
     , [DueDateKey] = [DueDateKey] + 60000
     , [ShipDateKey] = [ShipDateKey] + 60000
     , [OrderDate] = DATEADD(yy, 6, [OrderDate])
     , [DueDate] = DATEADD(yy, 6, [DueDate])
     , [ShipDate] = DATEADD(yy, 6, [ShipDate])

-- 4.4 FactSalesQuota ---------------------------------------------------------------------------
PRINT '  FactSalesQuota'
GO

UPDATE [dbo].[FactSalesQuota] 
   SET [DateKey] = [DateKey] + 60000

-- 4.5 FactSurveyResponse -----------------------------------------------------------------------
PRINT '  FactSurveyResponse'
GO

UPDATE [dbo].[FactSurveyResponse]
   SET [DateKey] = [DateKey] + 60000

-- 4.6 FactCallCenter ---------------------------------------------------------------------------
PRINT '  FactCallCenter'
GO

-- All the rows in call center have a 2010 date, just add 4 years to make these 2014
UPDATE [dbo].[FactCallCenter]
   SET [DateKey] = [DateKey] + 40000


-- 4.7 FactCurrencyRate -------------------------------------------------------------------------
PRINT '  FactCurrencyRate'
GO

-- Because the DateKey is part of the PK, we have to drop the key before we can update it
ALTER TABLE [dbo].[FactCurrencyRate] DROP CONSTRAINT [PK_FactCurrencyRate_CurrencyKey_DateKey]
GO

-- Shift the 2008 Leap Year days to 2012 Leap Year
UPDATE [dbo].[FactCurrencyRate]
   SET [DateKey] = 20120229
 WHERE [DateKey] = 20080229

-- Update everything except the leap year we fixed already
UPDATE [dbo].[FactCurrencyRate]
   SET [DateKey] = [DateKey] + 60000
 WHERE [DateKey] <> 20120229

-- Add the PK back
ALTER TABLE [dbo].[FactCurrencyRate] 
  ADD CONSTRAINT [PK_FactCurrencyRate_CurrencyKey_DateKey] PRIMARY KEY CLUSTERED 
      ( [CurrencyKey] ASC,
          [DateKey] ASC
      )
 WITH ( PAD_INDEX = OFF
      , STATISTICS_NORECOMPUTE = OFF
      , SORT_IN_TEMPDB = OFF
      , IGNORE_DUP_KEY = OFF
      , ONLINE = OFF
      , ALLOW_ROW_LOCKS = ON
      , ALLOW_PAGE_LOCKS = ON
      ) ON [PRIMARY]
GO


-- 4.8 FactProductInventory ---------------------------------------------------------------------
PRINT '  FactProductInventory'
GO

-- As with the previous step, the date is part of the primary key, so we need to drop it first.
ALTER TABLE [dbo].[FactProductInventory] DROP CONSTRAINT [PK_FactProductInventory]
GO

-- Shift the 2008 Leap Year days to 2012 Leap Year
UPDATE [dbo].[FactProductInventory]
   SET [DateKey] = 20120229
 WHERE [DateKey] = 20080229

-- Update everything except the leap year we fixed already
UPDATE [dbo].[FactProductInventory]
   SET [DateKey] = [DateKey] + 60000
 WHERE [DateKey] <> 20120229
 
-- Add the PK back
ALTER TABLE [dbo].[FactProductInventory] 
  ADD CONSTRAINT [PK_FactProductInventory] PRIMARY KEY CLUSTERED 
      (    [ProductKey] ASC
      , [DateKey] ASC
      )
 WITH ( PAD_INDEX = OFF
      , STATISTICS_NORECOMPUTE = OFF
      , SORT_IN_TEMPDB = OFF
      , IGNORE_DUP_KEY = OFF
      , ONLINE = OFF
      , ALLOW_ROW_LOCKS = ON
      , ALLOW_PAGE_LOCKS = ON
      ) ON [PRIMARY]
GO

PRINT 'Done updating the Fact tables'
GO



/*-----------------------------------------------------------------------------------------------*/
/* Step 5. Cleanup, remove the helper function we added earlier.                                 */
/*-----------------------------------------------------------------------------------------------*/
PRINT 'Removing Helper Function'
GO

IF EXISTS (SELECT 1 FROM [sys].[all_objects] WHERE [name] = 'DateToDateId')
  DROP FUNCTION [dbo].[DateToDateId];
GO

/*-----------------------------------------------------------------------------------------------*/
/* All done!                                                                                     */
/*-----------------------------------------------------------------------------------------------*/
PRINT 'Updating AdventureWorksDW2012 for Today - Completed'
GO


SSAS Duplicate Attribute Error – Another Cause

I had a real head banger this afternoon, and I’m not talking about the heavy metal playlist I was jamming to on my iPod.

I had a table that, in addition to the surrogate key, business keys, etc., had these columns:

Level1              Level2
------              ------
Phineas and Ferb    Phineas
Phineas and Ferb    Ferb
Phineas and Ferb    Perry

I had a dimension in SSAS where I had a Level1 -> Level2 Hierarchy built. When I tried to process the dimension, SSAS kept kicking out “duplicate attribute error” on Perry. I did the usual checking, yes my attribute relationships were OK, the Key property was built correctly, etc.

So then I moved to look at the data itself. I first did a SELECT * FROM CoolShow WHERE Level1 = 'Phineas and Ferb' AND Level2 = 'Perry'.

I got back 4 rows. Hmm. After some more head banging (Guns ‘n Roses, Paradise City) I wound up doing a SELECT * FROM CoolShow WHERE Level1 = 'Phineas and Ferb', and I got back 42 rows with Perry. Hmm, I say to myself, "self, that looks odd". To which self replied "duh".

Then self suggested I do a SELECT '*' + Level2 + '*' FROM CoolShow WHERE Level1 = 'Phineas and Ferb'.

This yielded some interesting results: 4 rows read *Perry*, while the other rows read *Perry * (note the blank space between the y and the *).

Well obviously I needed a RTRIM, which I dutifully added then reran the query. Only to get the *Perry * again in the output. At this point self said I was on my own and abandoned me to drown its sorrows in a pitcher of margaritas.

I took the output and copied it into an editor that would do hex mode. So what do I see but a 0D 0A in the space between the y and the *, causing me to scream “AH-HA” as Queen’s Bohemian Rhapsody hit its crescendo. I also scared the cat, but I only mention that because cute cat things are supposed to be popular on the internet and I figure it might help my SEO. For those who don’t speak HEX, 0D 0A is 13 and 10, which turn into a Carriage Return and Line Feed.

Now by this point most of you have probably given up on this handy tip, deciding a pitcher of margaritas sounded pretty good and left to find some. But if you are still hanging in, I modified the view with this code:

RTRIM(REPLACE(REPLACE([Level2], CHAR(13), ''), CHAR(10), '')) AS [Level2]

Returning to the cube I was able to process the dimension successfully and answer the question of “Where’s Perry?” (Answer: He’s at the bar trying to keep a drunken self from using his evil margaritainator invention.)

So the moral of the story: if you get a duplicate attribute error, and your dimension looks okey-dokey, check the data to see if you have some errant CR/LFs. Apparently SSAS doesn’t handle them very well.
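
If you want to hunt for the offenders up front, a quick check along these lines works; CoolShow and Level2 stand in for your own table and column names, and the server and database names are placeholders:

# Find rows whose attribute value contains an embedded carriage return or line feed
$query = @"
SELECT *
  FROM CoolShow
 WHERE Level2 LIKE '%' + CHAR(13) + '%'
    OR Level2 LIKE '%' + CHAR(10) + '%'
"@
Invoke-Sqlcmd -Query $query -ServerInstance "localhost" -Database "MyDatabase"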

Now if you’ll excuse me, I’m going to join self at the bar before self guzzles all the margaritas (self is such a drunken sot). AC/DC, take me away with some “Highway to Hell”!
