Conformed Dimensions

Yesterday I talked about the differences in dimensions versus facts. Today I’d like to extend that discussion with the importance of Conformed Dimensions.

One of the major advantages of a data warehouse is the ability to combine data from various, and sometimes vastly different, systems. Let’s take a common problem: your company has three different systems, sales, production, and purchasing. You’ve bought these from three different vendors, so unfortunately the part numbers used throughout the systems are not consistent, but you need to generate some reports showing how part x went into product y and was sold to customer z.

Unfortunately the part numbers are not consistent across the three systems because, as I mentioned, they came from three different vendors. What’s a programmer to do?

This is where conformed dimensions come in handy. In the part dimension table you create a surrogate key. This is the new primary key for a part, which is simply a made up value. Maybe you chose to use a GUID, or perhaps it’s just an auto incrementing integer. Regardless, this is now your new “part number” for all 3 systems once you bring the data in the warehouse.

You would add three more fields to the part dimension table. In addition to the primary key you would have a field “saleskey”, a field “productionkey”, and finally a “purchasingkey”. Then, when bringing your sales data into the warehouse, you look up the saleskey in the dimension table, get the primary key for the part, and place it in the fact sales table.

Repeat with production and purchasing systems. By now you are beginning to get the idea. Because you have conformed the part key across the three fact tables, you can now draw reports using the new part key as a common thread to join the various fact tables together.

This process is known as a conformed dimension. ALL of your dimensions in your warehouse need to be conformed if you want to truly leverage the power of your warehouse. Employees, parts, customers, and locations are just a few examples of dimensions you’d want to conform.

As you can see, having conformed dimensions is key to the success of your warehouse. Failure to conform your dimensions means you loose one of the most powerful features of warehousing, the ability to produce reports across differing systems.

Dimensions versus Facts in Data Warehousing

I’ve made mention before of a large data warehousing project I’ve been involved with, using the SQL Server 2005 tools. Like a lot of developers on a lot of projects, I was thrown in the deep end and had to learn a lot of new technologies in a short order of time. Fortunately I’m a glutton for punishment.

I found a lot of material on the various tools, and quickly became competent in their use. I also quickly got up to speed on star schema versus snow flake schema, and the divvying up of data into facts and dimensions.

The thing that always puzzled me was, how do you decide what goes into facts versus what goes into your dimensions. Having a short breather I decided to do some reading up to make sure my understanding of theory was up to my understanding of the tools.

The best explanation I’ve found is in “The Microsoft Data Warehouse Toolkit” by Joy Mundy and Warren Thornthwaite. http://shrinkster.com/r4n (Good book, highly recommend it, and standard disclaimer I don’t get any kickbacks so buy it where ever you want.) It covers the Kimball method for warehousing. To quote…

It may help to think of dimensions as things or objects. A thing such as a product can exist without ever being involved in a business event.

Ah, grasshopper, enlightenment begins.

As the book describes, a dimension is your noun. It is something than can exist independent of a business event, such as a sale. Products, employees, equipment, are all things that exist. A dimension either does something, or has something done to it.

Employees sell, customers buy. Employees and customers are examples of dimensions, they do.

Products are sold, they are also dimensions as they have something done to them.

Facts, to carry on another concept from the book are the verb. An entry in a fact table marks a discrete event that happens to something from the dimension table. A product sale would be recorded in a fact table. The event of the sale would be noted by what product was sold, which employee sold it, and which customer bought it. Product, Employee, and Customer are all dimensions that describe the event, the sale.

In addition fact tables also typically have some kind of quantitative data. The quantity sold, the price per item, total price, and so on.

Knowing this makes it much clearer in my mind how to start designing my own warehouses. Or at least ready to take the first step.

SQL Server Integration Services and the “The script files failed to load” Error

I have an SSIS job that has been running on our test server since April, with no issues. Last week it just quit working. About the fourth package in I started getting “The script files failed to load” error when loading a very simple VB Script inside a package.

A web search led me to this forum page:

http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=267047&SiteID=1

To save you from wading through all the messages, the two causes of this error could be 1) Package’s PreCompile was set to False, or 2) I had breakpoints in the package.

So I check, and nope my PreCompile was set to true (the default) and there were no breakpoints. In addition, most of the people posting in the forum link above couldn’t get their packages to run in the first place, this didn’t seem to occur in packages like mine that had been running fine for a while.

Next I thought, OK what about environmental issues? Mining our extensive logs I was able to determine the time when one moment the scripts work, then 15 minutes later they no longer worked.

With that exact 15 minute interval known, the DBA team reviewed what had happened on the server, and sure enough a security patch for the .Net 1.1 Framework had been applied on the server during that exact time. Bingo!

It took us 2 days to work through the issue, the team, and I have to be honest and include myself here, had the thought process “well, the error is just on the test server, we’ll get to it when we can.” Since it was just a test server, and we weren’t testing anything at the time, we weren’t stressing over it.

We should have been. The very same patch was scheduled to be rolled to our production server just an hour and a half after we found it. Had that scheduled deployement occurred we would have been in deep diaper filling.

Let me make a quick distinction here, in that it’s not necessarily fixing the issue that’s urgent, but identifiying it. Once you have it identified, you can control it. By not quickly identifying the issue, we nearly rolled the same patch into production.

So the first point here, if your SSIS package which has been running for a while suddenly starts getting “The script files failed to load” error, check to see if any .Net Framework 1.1 patches were applied.

Second, this shows you how important good logging can be. With our extensive logging I was able to determine with a fifteen minute window when the problem occurred.

Finally, never treat issues that occur on your test server as if they are not important. We were quite lucky, narrowly avoiding disaster on our production server. Identify the cause quickly, so you can control it.

Update: We found a fix that worked, see https://arcanecode.wordpress.com/2007/07/18/ssis-issue-from-monday/ for more detail.

Arcane Talks

On Thursday July 12th I’ll be speaking at the Birmingham Software Developers Association (BSDA). You can get directions from the club’s website, http://www.bsda.info/ .

I’ll be speaking on the subject of SQL Server Compact Edition. It’s been a while since I blogged about this, so I thought I’d provide a few links for quick reference.

As promised, here is a link to the Power Point presentation (in PDF format) I used during the presentation:

 SSCE presentation for BSDA user group

My very first post was back in January 2007:

http://shrinkster.com/nsk

My next series of posts began on April 10th, and described how to create databases using the various tools available to you.

http://shrinkster.com/qtl

The complete C# and VB.Net code samples were posted April 13th, 2007:

http://shrinkster.com/qtm

And finally, the series of posts I mentioned on system Views started with this post on April 16th, 2007:

http://shrinkster.com/qtn

If you want to see all of my SSCE posts, simply click the SQL Server Compact Edition tag over in the categories area, or use this link:

 https://arcanecode.wordpress.com/tag/sql-server-compact-edition/

Please note each of these links is a starting point, be sure to read the blog for the next few days after each link in order to get the full story.

And now a question, I’m working up material for a new presentation. Debating between SQL Server 2005 Full Text Searching and SQL Server Integration Services. Any opinions?

Eventually I’ll do both, but would like to do the first one based on feedback. Even if you can’t attend please post a comment and let me know where your interests lie.

Getting Started with SQL Server 2005 Full Text Searching: Part 5 – Advanced Searching

Last week we looked at how to search for data by passing a simple search word into a SQL Query that uses Full Text Searching. As you would expect, Full Text Searching is capable of much more powerful searches than a single word.

First, you can search for a phrase. To do so, you include the phrase inside double quotes.

SELECT WebSiteID, WebSiteName
FROM MyTableOfCoolWebsites
WHERE CONTAINS(*, ‘”Arcane Code”’)

Will find occurances of the phrase Arcane Code in your index. Let’s say however, you want to search for either Arcane or Code? FTS supports Boolean searches.

WHERE CONTAINS(*, ‘”Arcane” OR “Code”’)

Will return results where either the word Arcane or Code is in the text. If you want them both, use an AND in place of the OR. Additionally, FTS supports the AND NOT keyword, for when you want the first word but not the second.

Full Text Search also supports something called Proximity searches. In a proximity seach, one word appears near another.

WHERE CONTAINS(*, ‘arcane near code’)

So how near is near? Well the online docs are a bit vague on this, a safe guess is about ten words. You may also see this form of the syntax:

WHERE CONTAINS(*, ‘arcane ~ code’)

~ equates to near, although to me not nearly as readable, be aware though in case you run across it.

Finally, you should be aware that certain words are excluded from searches. Common words such as a, an, the, and so on. Microsoft refers to these as “Noise Words”. You can edit the list of noise words in case you have some words in your environment that wind up being Noise Words. Your company name might be one example.

I found the file in the folder C:\Program Files\Microsoft SQL Server\MSSQL.2\MSSQL\FTData (your milage may vary). The file is named noiseenu.txt. (ENU is for English US, not to be confused with noiseeng.txt which our friends in the British Isles will be using.)

This is also handy to know in case there is a reserverd word you need to remove from the list. In our environment one of the reserve words is also an abbreviation for a piece of our equipment, so I would want to remove this from our list.

Getting Started with SQL Server 2005 Full Text Searching: Part 4 – Valid Data Types

Hopefully everyone had some fun over the weekend with their digital cameras and listening to some good photo podcasts. Today we’ll return to the subject of Full Text Searching (FTS) under SQL Server 2005.

If you’re like me one of your questions is “OK, FTS is great, but exactly what data can I index with it?” Good question, a quck reference of the books online gives us the answer.

Char, nchar, varchar, nvarchar, text, ntext, xml, varbinary(max), and image are the valid data types for FTS. However, according to the documentation text, ntext, and image are going to be removed in a future version of SQL Server, so I’m going to avoid them and so should you.

Char, nchar, varchar, nvarchar are pretty straight forward, and probably expected, so I won’t go into them further. XML is pretty cool, as you may be aware with SQL Server 2005 Microsoft added the ability to store XML in the database. You can then do XML queries into these XML fields, a subject for a future post. However, MS also allows you to search inside these XML fields using Full Text Searching as well.

The final one that needs explanation then is varbinary(max). I’m sure you’re scratching you’re head over this, how can FTS work on binary data? Well remember, the Full Text Search engine is built on top of the Microsoft Indexing Service, which can scan inside Word, Excel, etc files stores on a server.

Thus FTS can actually look inside files stored inside varbinary(max) and if they have the proper extension FTS will look inside them and index the contents of the files. How sweet is that?

Getting Started with SQL Server 2005 Full Text Searching: Part 3 – Using SQL

OK, so you have this spiffy catalog, and you’ve populated it with full text searches for your favorite tables. Now you’d like to actually use those index from within your SQL. There are four new commands you can use in SQL to get to your data. All are used as part of the where clause, and have similar syntax but different results.

The first is the one you’ll probably use the most, it’s the contains command. You simply pass in the column name and what you want to search for.

select col1, col2 from myTable
where contains(fts_column, ‘searchword’)

For fts_column you can use the name of one of the columns that was indexed, or you can use * (and asterisk) to search in all of the columns that were full text search indexed. In the single quotes you put in the word or phrase you want to look for.

Contains searchs for an exact match. It either finds it or it doesn’t, and it has to be an independent word. For example, if your text field contained “I love the Mythbusters every week.” and you searched for ‘Mythbuster’, would NOT return a match.

If you want your text searching to be a little more open minded, use the freetext command instead. The syntax is identical to contains, including the ability to use an asterisk.

select col1, col2 from myTable
where freetext(fts_column, ‘searchword’)

In this case, however, a search of our afore mentioned text field for ‘Mythbuster’ would return a match, as freetext understands that Mythbuster and Mythbusters are essentially the same word.

In your application, you might consider using a check box that says “exact match”. For exact match queires use the contains keyword, when the user does not check you can use the freetext command.

It’s also possible to return a list of results that are sorted by a rank. The rank indicates the strength of the match to the search phrase passed in. To get a list of ranks, use either the containstable or freetexttable commands. Their syntax is like their cousins, as is the method it uses for searching (containstable is exact, freetexttable is more liberal). The only addition is the first parameter must be the name of the table, then comes the column name and search condition.

Instead of rows, what is returned are two columns: key and rank. The rank is a relative score from 0 to 1000 that indicates the strength of the match. A higher value means it’s a better match.

The key is the primary key from the table you’re searching. You can then use this key to pull back the data from the main table. Let’s do a simple example: you want an exact match for all employees who live in Alabama. Unfortunately the DBA who created the table had just come off a three day drinking binge, and instead of separate street / city / state fields, just created a big text field called emp_address.

select rank, emp_id, emp_name from empTable
join freetexttable(empTable, emp_address, ‘Alabama’) ftt
on empTable.emp_id = ftt.[KEY]

This would return something like:

Rank emp_id emp_name
255 12345 Jamie Hyneman
128 45678 Adam Savage

And there you go, four ways you can have your SQL leverage the power of full text searching to return results.

Getting Started with SQL Server 2005 Full Text Searching: Part 2 – The Indexes

Yesterday I introduced you to full text searching, and covered the basics on creating catalogs to hold your full text indexes. A full text search index is a little different than a regular index. First, each table can only have one full text search index created for it. Next, the create syntax is slightly different. OK, in fact it’s a lot different. Let’s take a look:

create fulltext index on my_table_name_here
(column1, column2,…)
key index my_tables_unique_index_name
on my_catalog_name_here
with change_tracking {manual | auto | off}, no population


The first thing is also the most obvious, you need to supply the name of the table in the first line. Note we’re not supplying a name for the full text search index. Since there’s only one per table, SQL Server takes care of creating the full text search index name for us.

Next we need to supply the name of the column or columns we want indexed. These can be any sort of text field. Just list them one after another, separated by commas.

The next item is also required, and sort of tricky. Each row in the table you are doing full text searching on must have a unique index. It makes sense when you think about it, for the text search to be efficient it must be able to quickly move to the row with the word you’re hunting for, and the way to do that is via the unique index.

So for this parameter you’ll need to supply a unique index name for “my_tables_unique_index_name”. Keep in mind this is not the name of the columns from the table. Instead this is the name of a “normal” index (not a full text search index) that is unique for the table.

The “on” parameter is optional, you only need it if you set up multiple catalogs and don’t have a default. If you omit it, it will simply put the new index in the default catalog.

Next you will need to tell SQL Server how often to update the index. You do this through the with change_tracking parameter. OFF turns it off entirely, no updates will be done until you issue a rebuild via the alter syntax I’ll cover momentarily. You might want to use OFF when you have a table that gets updated very rarely.

AUTO, on the other hand is for when you have a table that gets updated frequently. It will update the full text search index when the associated table is updated. The final option, MANUAL will flag changes to the underlying table, but it won’t update the full text search index until you tell it to.

The final parameter, no population, only applies when you use OFF. It tells SQL Server not to populate the index when it’s created. If you omit it, or use AUTO or MANUAL, SQL Server will populate the full text search index when the index is created.

OK, so you’ve got this index created and need to change it, or perhaps you need to work with one that’s already in existence. For this there’s the alter command:

alter fulltext index on my_table_name_here
parameters here

There’s quite a few parameters you can pass, so let’s look at them individually. Just know that when you see them below, they should go where you see “parameters here” above.

set change_tracking {off | auto | manual} – This works the same as with the create command, it lets you change the tracking mode.

disable – Disables the full text search index, it’s not used for searching nor is it updated. However the data is left intact, should you want to turn it back on.

enable – Enables the full text search index after a disable.

add ( column ) – Adds the passed in column to the full text search index.

drop ( column ) – Removes the passed in column from the full text search index.

start full population –This rebuilds the index from the ground up.

start incremental population –This will update the index since the last time it was updated. Note you must have a timestamp column on your table for this to work.

start update population –Remember a moment ago when I talked about the change_tracking manual option? Well this command is how you update an index with manual change tracking.

And finally, you may decide one day you no longer need the full text search index. Since the readers of this blog are the smartest, most intelligent readers on the planet you’ve already figured out we’ll need to use a variant of the drop command:

drop fulltext index on my_table_name_here

And there you go, you now know how to create, change, or remove a full text search index. Now there’s one more piece, you need to know how to use them from within your SQL. But we’ll save that for tomorrow.

Getting Started with SQL Server 2005 Full Text Searching: Part 1 – The Catalog

One of the coolest features of SQL Server 2005 is the ease with which you can implement full text searching. True, it was available in previous versions but 2005 makes it very easy to implement and use.

Full Text Search is an offshoot of the Microsoft Index Server technology. It’s what you could call an “add-on”. By default it’s enabled for every database you create in 2005.

But just having it turned on is not enough, now you have to create a catalog to hold the data for your full text data. The catalog is a separate file from your database, and holds all the key words it finds. The syntax to create a catalog is pretty simple:

create fulltext catalog my_catalog_name_here
in path ‘c:\mysqldata\somesubdirectory’
as default

The ‘in path’ is optional, if you omit it your catalog is created in the same place as the data. For small databases this is fine, for large ones you might actually want to store the catalog on a separate hard disk in order to get a performance boost.

The ‘as default’ clause says this catalog will be the default one used for new full text search indexes, or for searching existing ones. Most times you’ll probably only need one catalog for a database, so you can add this and forget it.

Once you have a catalog created, you may need to tweak it. There’s not a lot of tweaking you can do, just three ways you can alter it, and all are implemented via the alter command.

alter fulltext catalog my_catalog_name_here rebuild

alter fulltext catalog my_catalog_name_here reorganize

alter fulltext catalog my_catalog_name_here as default

The first command, rebuild does just what it says. Your old catalog goes to the great bit bucket in the sky (i.e. it’s deleted) and SQL Server will recreate all of your full text search indexes. And it should be obvious, but remember during this time your full text search will not be available.

Reorganize is something like doing a disk defrag, it cleans up and reorganizes your full text search indexes. While it may not be as efficient as doing a complete rebuild, it does have the advantage of not taking the catalog offline while it does it’s work.

Finally ‘as default’ simply makes the catalog the default, in case you either forgot or were distracted by Mike Rowe doing something nauseating on “Dirty Jobs” (http://www.discovery.com/dirtyjobs) .

OK, you now have a catalog. But the catalog is simply a space to hold your full text search indexes, and those we’ll create in the next post.

Arcane Reasons for Data Warehousing

I may have mentioned that of recent I have been doing a lot of work in the Data Warehousing arena. Today I met with some IT folks from another branch of the company who are considering a reporting strategy for their area. One of the people I was meeting with asked me “With so much data available, how do you decide what data to put in the warehouse first, versus what data do you leave in the application, either permanently or until a later point?

Great question, and I thought that you too might be interested in the answer.

Interapplication Reports. Historically trying to combine data from multiple applications has been painful, to put it nicely. Clearly than this turns out to be one of the most compelling reasons for data warehousing, to house data from multiple applications and allow users to easily combine that data into singular reports.

Phasing Out Historical or Ad-Hoc Systems. Accounting systems seem to have an existence all their own. For various reasons they live well beyond their normal lifespan. We have a system at work, written in an old DOS based reporting tool that dates back to the late 1980’s. Over the years it’s been used to do reporting from other systems. As it turns out it has some issues with Vista, and will need replacing. Rather than getting yet another system, we plan to replace its reports with ones from our data warehouse.

Friendlier Reporting. Often when I see databases, the field names are quite cryptic. Names like fklnam (foreign name last name) and accsbcd (account sub code) litter databases. It’s difficult enough for IT Professionals to decipher the field name mayhem, but asking users to do so just to create a few ad-hoc reports can be asking far too much. Not to mention the sometimes bizarre seeming relationships between tables.

Moving to a data warehouse allows you to give much saner, user friendly names to your data. In addition you can flatten out some of the tables, simplifying the relationship structures significantly.

Production Server Load Reduction. Production systems are usually optimized for dealing with single records at a time. As a result, searching through and retrieving data for large quantities of data can be resource intensive on the production system. Shifting reporting to a warehouse means a reduced load for the production system. In addition you elimante the chance that malformed SQL from some ad-hoc query can cripple your production system.

Ease of Offline Maintenance for Production Systems. Finally, having a warehouse makes it easier to take production systems offline for maintenance. If users know they can still get to their data via the warehouse, they will be less concerned about their production system going offline for work, which in turn makes it easier to schedule such work. If you have a system that requires frequent maintenance, your users will be less likely to give you grief if their data is available elsewhere.

Those are my primary ways in which we decide which data is targeted for inclusion into the warehouse. If you need to combine data from multiple applications, have older systems that need replacement, have cryptic field names or complex table relationships, need to reduce the load on your production server, or have systems that need frequent maintenance then consider those systems first for inclusion into a data warehouse.

I’d be curious to hear your comments on your strategies for determining inclusion into your own data warehouse.

Orcas and SQL Server Compact Edition

The new beta of Orcas (the next version of Visual Studio) is now out. You can download it either as an installer or as a Virtual PC image. I opted to download the VPC image (I would have put it in a VPC anyway, so why not save some work?).

The main info page for Orcas is at http://msdn2.microsoft.com/en-us/vstudio/aa700831.aspx or http://shrinkster.com/oqp. From there you can pick the download type you want. As I mentioned before, there’s an installer version and the VPC version. If you have a spare machine lying around, you can use the installer, otherwise I highly recommend the Virtual PC version. (I grabbed the Visual Studio Team Suite Only – VPC).

It’s very very VERY important you completely read the instructions. Did I mention it was important to read the instructions? Well in case I didn’t, be sure to read the instructions.

There’s actual two VPC images you have to download. The first is named Orcas, the second though is called BASE (it’s in an exe). When you run the Orcas VPC the first time, it will ask you to point to the base VPC. In addition, you will need to know the user id and password to login, both available on the download page and buried in the instructions. (See, I told you it was important to read the instructions!)

Like most people playing with Orcas, the first thing I did was tested what I already knew. In this case, I loaded in some of the SQL Server Compact Edition samples I’ve published here in the past few weeks. Most notably the code in my post on April 13th (http://shrinkster.com/oqq).

In some ways I wish I had a lot of new technical content to share with you. Harrowing tales of how I was able to fight the bugs and to conquer the evil things lurking in Orcas. But I can’t. It just worked. And, I’m happy to say worked without any flaws. I didn’t have to install any special add-ins (Orcas comes with SSCE assemblies preinstalled) or make special references.

I’ll keep playing with it, but for now I’m quite happy to report that while I don’t see any radical changes with SSCE in Orcas, I don’t see any issues so far either.

SQL Server Compact Edition To Be Shipped With Windows Mobile 6.0!

I was relaxing Monday evening, watching some of Channel 9 videos I’d downloaded (yeah, I know, but I really do find it relaxing). In this video on Windows Mobile 6: http://channel9.msdn.com/Showpost.aspx?postid=303900 Rory is interviewing Mel Sampat who showed off some cool stuff with Windows Mobile 6.0.

One of the coolest things is that WM6 includes both SQL Server Compact Edition and .Net Compact Framework 2.0. What this means for you as a developer is that you don’t have to worry about deploying all the SSCE plumbing. How sweet will that be!

They also demoed some cool voice capabilities that you have to see. Very very awesome.

Now if they’d just hurry up and release Windows Mobile 6!!

SQL Server 2005 Compact Edition – Important Component

In my post on Getting Started with SQL Server ( http://shrinkster.com/nsk ), I mentioned 3 things you need. SQL Server CE, Visual Studio 2005 SP 1, and the SQL Server CE Books on Line. As it turns out there’s an important fourth item.

The missing component is the “Microsoft SQL Server Compact Edition Tools for Visual Studio 2005 Service Pack 1”, available at http://www.microsoft.com/downloads/details.aspx?familyid=877C0ADC-0347-4A47-B842-58FB71D159AC&displaylang=en or http://shrinkster.com/oam .

This service pack fixes a few things using SSCE with VS, one of them I consider critical. When you do a Project, References from VS, in the .Net list you will now be able to find a selection for System.Data.SqlServerCe.

Second, all of the Create Database dialogs now correctly read “SQL Server Compact Edition” instead of “SQL Server Mobile”. It also updates device CAB files correctly and includes new features such as Click Once support.

This should be installed fourth in your list, so if you’ve already done the other components, you are good to go for installing this. To recap:

  1. Install SQL Server Compact Edition components. (http://shrinkster.com/l9f).
  2. Install Visual Studio Service Pack 1. (http://shrinkster.com/lel )
  3. Install SQL Server CE Books On-line. ( http://shrinkster.com/lem )
  4. Install SSCE for VS SP1, as I’ve described here. ( http://shrinkster.com/oam).

A special thanks to Doug Tunure, our regional MS Developer Evangelist who helped me communicate with the SSCE team to figure out what was missing, and thanks to the SSCE team for a great tool.

System Views in SQL Server Compact Edition: Provider Types

The last view we’ll discuss is the INFORMATION_SCHEMA.PROVIDER_TYPES. Strictly speaking, this does not provide a view into your particular database, but into SQL Server Compact Edition. As such, it probably won’t be very useful in day to day queries into the database.

What it may be useful for is in automating database code generation. This table lists all of the valid database types that SQL Server CE supports. Presumably if Microsoft chooses to add new data types to SSCE they will show up here.

Let’s take a look at the query that’ll return some useful info, and what each field means.

select type_name, data_type, column_size,

       literal_prefix, literal_suffix,

       is_nullable, case_sensitive,

       unsigned_attribute, fixed_prec_scale,

       minimum_scale, maximum_scale, is_fixedlength

 from information_schema.provider_types

 

type_name contains the list of valid datatypes, such as int, real, nvarchar, etc.

data_type contains a numeric representation of the type of data stored in the field. Note this is not unique, for example nchar and nvarchar, and ntext all have the same data_type of 130.

Column_size indicates the max size for a data_type.

Literal_prefix / literal_suffix indicates what you should use to wrap a literal value. This is probably one of the more useful items in this table. Let’s say you have a static text value you want to insert into a SQL statement, perhaps a system name such as the word arcanecode.com. Looking up the literal_prefix for a ntext item I see that it is N’ (the letter N followed by a single quote). The literal_suffix is just a single quote mark, so to include in a SQL statement I’d need to put N’arcanecode.com’ .

Is_nullable is obvious, a 1 indicates the field is allowed to hold null values, a 0 means it can’t. As of this writing, all the field types in SSCE are nullable.

Case_sensitive is similar, 1 shows the field is case sensitive, 0 not. Again, as of this writing none of the fields in SSCE are case sensitive.

Unsigned_attribute is null for non-numeric type, but for numbers a 1indicates the data_type is unsigned only (i.e. no negative values allowed). A 0 indicates negative values are valid.

Fixed_prec_scale is another Boolean field. A 1 indicates the precision (the number of positions after the decimal point) is fixed. A 0 indicates you can change the precision.

Minimum_scale, maximum_scale is tied to the fixed_prec_scale. For items like int’s where the precision is always the same, this is null. However on numerics the user (i.e. you) can set the scale, and these fields indicate the valid range.

Is_fixedlength is a Boolean that flags whether the data_types length is fixed. For numerics this is going to be 1 (true), and for most text types this is 0 (false).

If nothing else this table makes a great quick reference. Since SSCE only supports a subset of the SQL Server data types, it’s handy to have a place to see which types are supported.

In your application, you could use this to build code generators or build SQL statements to create new databases. For example you could use it to look up the prefix / suffix types for a data_type.

And there you go, this completes a week long look at the various views built into SQL Server 2005 Compact Edition, and how you can use them within your applications.