Got us another one!

Looks like I’ve done it again. My friend and now co-worker Jeff Barnes has gotten the Podcast Fever. ( http://jeffbarnes.net/portal/blogs/jeff_barnes/archive/2007/07/31/podcast-fever.aspx ). He’s blogged about his new Zune and how he’s finally using it for educational purposes. Jeff offers a great list for the .Net developer and beginning podcast listener to review. Be sure to check his list out.

Oh, and Jeff, don’t worry, there’s currently no known cure for Podcast Fever!

Arcane Add-Ins: KNOCKS Solutions VS 2005 Add-In

It’s been a while since I talked about Visual Studio add-ins, so when I ran across KNOCKS Solutions’ “Knocks VS2005 Add-In”, I knew I’d found the perfect subject. Available at http://www.knocks-solutions.com/VS2005Addins.aspx?sm=idVS20052 , this rather full-featured add-in offers many utilities.

First is a personal clipboard that allows you to store and retrieve up to 9 different items. They’re persistent, too, so they stay put between VS sessions.

Next is a code snippets manager. This seems a bit redundant in light of VS 2005’s snippets, but I can see it being very useful during a presentation.

In addition to the code snippets is a Notes module. I’ve often wished for the ability to store quick notes to use in a presentation, so this is a handy module.

Up next is the one tool of theirs I have a beef with, the “Re-arrange code” tool. I like the idea of being able to re-arrange my code. Often I’m working on some public method, and realize I need a private helper method and, being in a hurry and lazy, will drop it right under the public method. Later I’ll move it around, which was why I was excited to see this tool.

Sadly, it has a really bad side effect: it strips out any regions you’ve put into your code, and it strips out any comments that might lie between methods (I often put a comment header right above my method, instead of in the method). When Knocks rearranges your code all of that goes into the bit bucket, making the tool useless in all but (perhaps) the very earliest stages of coding. I would have thought it possible to add regions to the tool as well, and allow code rearranging within a region, between regions, or even to move regions around. Perhaps this will be addressed in the next version (he said, ever full of eternal hope).

There is a nifty zip tool that will zip your entire project, handy for quick backups. There’s also a tool that embraces the concept of favorites for projects. Another tool is one I wonder why no one did sooner: a keyword search on Google (or other configurable search engine, like MS Live). This is one I’ll be using often.

Also included is a simple “Data Object” generator. You bring it up and enter a few property names and types and Knocks will create the basic class for you. While I have seen more full featured code generators, I appreciate the basic simplicity of this one, not to mention the price (which I’ll get to shortly).

The final two tools I’ll mention are my favorites. First is a Design Explorer. This adds an explorer window (I put mine with the Solution Explorer area) to your display. In it are all the controls for your current form. Clicking on a control not only makes it the active control in the designer, but also displays its properties in the lower half of the Design Explorer window.

The other tool is the Code Explorer. It displays a tree of your current code module. Double-clicking on a code element will take you to it in the code window.

I’ve seen other add-ins with code windows; this one seems equal in functionality to many others similarly priced.

Oh, did I forget to mention the price? It’s free. Yes, FREE. Knocks has packed a lot of functionality into this add-in, and the fact it’s free makes it well worth the time to download and learn.

Taming the Outlook E-Mail Monster

Over time I’ve read quite a few helpful hints and tips on how to “tame” your e-mail. I have a few I’ve developed over time that I haven’t seen mentioned before, so I thought I’d share.

First, I deal with a lot of different projects at once. One thing I find valuable is to include the name of the project the e-mail is about in the subject line. That helps me later, to quickly categorize my mails. At the very least, make sure to include the project name somewhere in the body of the mail. Nothing’s more confusing than getting cc’d on an e-mail that says “I took care of the files” and not knowing what project the person refers to.

Next comes archiving of your e-mails. Many texts I’ve read tell you to read the e-mail, take action on it, then get it out of your inbox. But what do you do with it when you’re done for the time, but you may want to save it to refer back to later?

I’ve found the best method is to create an individual Outlook data file (.pst) for each project. True, you will wind up with a lot of pst files, but you can easily close them once the project is complete and get them out of your way. You can even burn them to a CD or DVD when you need more disk space, yet still be able to open them.

You still have the luxury of creating individual folders within the project’s pst file, if you need to subdivide more; perhaps meeting minutes, agendas, coding, and testing might be folders you want.

At one time I just had one projects folder with folders and subfolders galore. The problem was it quickly became cluttered from past projects, and kept growing in size. I found moving each project to its own data file to be much easier to manage.

OK, I hear you asking “What about those e-mails not associated with projects?” Maybe it’s a policy notice, or a confirmation about a software purchase, or just some “congrats you did a good job” e-mail you’d like to hang on to. For those I create an Outlook data file for each year. I then have 12 folders, one for each month.

I am very strict with myself about what goes in here, to keep it from becoming a miscellaneous junk bin. I typically have no more than 20 or 25 messages for any given month worthy of hanging on to.

OK, we’ve all been victims of this next situation. We go off to a two hour meeting, come back, and find thirty seconds after we walked away someone sent out an e-mail and copied the entire department. Half the folks chose to respond, then the other half replied to the response, and before you know it there are 42 unread mails on the one subject alone, not to mention all the other mail that’s come in. How do you quickly isolate those e-mails for a given project and deal with them?

For that I find Outlook’s Find tool invaluable. In Outlook 2003, select Tools, Find, Find from the menu:

taming01

You should then see a new tool area just above your inbox:

taming02

In Outlook 2007, the Find feature is turned on and built into the Inbox bar by default:

taming03

In either case, simply type in what you want, like the name of your project (remember my first tip?) and hit enter, or click the word Search (2003) or the magnifying glass (2007). The area where your inbox sits will now show only the messages with your search word in either the subject or the message body.

Once you have filtered your box to show only those messages you want, it becomes an easy matter to move them to archive, delete them, or deal with them in some other manner.

When done, simply click Clear (2003) or the X (it pops up where the magnifying glass is in 2007) and your inbox will be returned to its non-filtered state, hopefully with a few fewer messages for you to deal with.

OK, so you have a piece of mail that you want to keep in your inbox for a few days. You don’t want to file it quite yet, but you don’t have to handle it right this second either. Most common for me are announcements that a database or system will be offline for maintenance. I certainly want to know about it, and be reminded, but don’t need to do anything right now. For this I use the flags.

The rightmost column of your inbox shows a small flag. Clicking on it will turn the flag red. In 2003 you can pick different colors; in 2007 the color is tied to how far in the future the event will occur.

taming04

In either version, one of the menu options is “Add Reminder”. With it, a dialog pops up to let you give a calendar date / time when you need to take action.

taming05

In this example, the e-mail was letting me know of a live radio interview being done with a member of one of my favorite bands, Midnight Syndicate (http://www.midnightsyndicate.com). I’m adding a reminder to that e-mail so I’ll be sure not to miss it.

I’ll then basically ignore the message, letting it sit in my inbox until the time comes for me to deal with it. Once the event is complete, be it a database outage, meeting, or special event, I can click again on the flag, to “Mark as complete”. I can choose to archive the message, respond to it, or delete it.

Speaking of deleting, the final piece of advice I can offer is delete, delete, delete. Let’s face it, how many of those messages do you really need? If you are the recipient of a long chain of e-mails, just keep the last one and delete the rest, since their contents are duplicated in the last one.

Meeting announcements, bake sales, grocery lists from the spouse, are all things which hit the bit bucket as soon as I’m done with them. I’d bet if you’re like me, a good percentage of your e-mail can safely be deleted.

Using these techniques, I’m able to keep my inbox to between 100 and 150 messages, a manageable level. A far cry from the old days where I might have 2,500 messages in my inbox!

I’m always trying to improve, though, so if you have ideas for taming your inbox please post a comment and share with the community.

Arcane Fun Fridays: The Eighth Dimension

I realize the entries this week on facts and dimensions can be enough to give even the most stalwart geek a pain in the pocket protector. So allow me to wrap up the week, yet stay on the topic of dimensions, with a movie suggestion:

The Adventures of Buckaroo Banzai Across the 8th Dimension

(http://www.amazon.com/Adventures-Buckaroo-Banzai-Across-Dimension/dp/B00005JKEX )

This fun little romp from 1984 has a large collection of now well-known stars such as John Lithgow, Peter Weller, Christopher Lloyd, and Jeff Goldblum. The hero, Buckaroo Banzai, is a brain surgeon, particle physicist, and rock star. His experiments in the eighth dimension land Earth right in the middle of an interplanetary war, and that’s just the first few minutes of the movie.

You have to watch this several times in order to catch all of the jokes and subtle humor. I think the funniest line is from Lithgow, playing an apparently insane but in reality alien-possessed Dr. Lizardo, who keeps telling everyone “Laugh-a while you can, monkey boy! You all are-a gonna die!”

After a week of sifting through facts and dimensions, it’s great to be able to relax and spend a little time with Buckaroo in the eighth dimension. And even if you don’t get to see the movie, at least remember Buckaroo Banzai’s advice…

“No matter where you go… there you are.”

Standard disclaimer: I don’t make any money or have any financial affiliation with this flick or anyone who sells it. Just a cool movie. So there, monkey boy!

Just The Facts

I’ve spent the last few days covering dimension tables, but I should briefly mention that there are three types of fact tables as well. The first is the kind most will be familiar with, a transactional fact.

These are straightforward: every time an event occurs in your transaction system, a copy of that event is made in the fact table. When someone sells a product, updates an inventory record, or some similar event occurs, it’s reflected in your transaction fact table. Your warehouse will be made up predominantly of these types of tables.

Often your users will want to quickly reference data that’s aggregated over time: sales to date, sales per week, and so forth. To make getting to these totals fast, you can use the second and third types of fact tables, known as snapshot tables. In addition to speeding the return of data to the user, snapshots also reduce the load on the server, since totals are aggregated once instead of each time a user runs a report.

The first of the snapshots (which is also the second type of fact table) is called a periodic snapshot fact table. At some defined point in time, a job will run that will aggregate the transactional fact data and place it into the periodic snapshot fact table. Perhaps every Saturday night at midnight a job runs which tallies up the previous week’s sales into a sales-for-the-week fact table. Another classic example is the month-ending totals after accounting closes out the business month.
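As a rough sketch of that kind of weekly job, the Python below rolls transactional sales rows up into one total per week and part. The row layout, the names `sales_facts` and `build_weekly_snapshot`, and the sample figures are all made up for illustration; a real warehouse would more likely do this in SQL or an ETL tool.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical transactional fact rows: (sale date, part key, amount).
sales_facts = [
    (date(2007, 7, 30), 1, 100.0),  # Monday
    (date(2007, 8, 1), 1, 50.0),    # Wednesday, same week
    (date(2007, 8, 3), 2, 75.0),    # Friday, same week
    (date(2007, 8, 6), 1, 20.0),    # Monday of the following week
]


def week_start(day):
    """Return the Monday of the week that contains `day`."""
    return day - timedelta(days=day.weekday())


def build_weekly_snapshot(facts):
    """Aggregate transactional rows into one total per (week, part)."""
    totals = defaultdict(float)
    for sale_date, part_key, amount in facts:
        totals[(week_start(sale_date), part_key)] += amount
    return dict(totals)


weekly_snapshot = build_weekly_snapshot(sales_facts)
for (week, part_key), total in sorted(weekly_snapshot.items()):
    print(week, part_key, total)
```

Reports then read the small `weekly_snapshot` table instead of re-adding every transaction each time.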

The other type of snapshot, and the third type of fact table, is an accumulating snapshot. An accumulating snapshot is constantly being updated, instead of waiting for some point in time. Perhaps you have a trigger on the transaction sales table. Whenever a sale is made, that trigger updates the sales to date table.
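To make the trigger idea concrete, here is a minimal sketch using Python’s built-in sqlite3 module with an in-memory database. The table names (`sales`, `sales_to_date`), the trigger name, and the sample amounts are assumptions for illustration only; your own database will have its own trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical transactional fact table.
    CREATE TABLE sales (part_key INTEGER, amount REAL);

    -- Hypothetical sales-to-date table: one running total per part.
    CREATE TABLE sales_to_date (part_key INTEGER PRIMARY KEY, total REAL);

    -- Every insert into sales immediately updates the running total.
    CREATE TRIGGER trg_sales_to_date AFTER INSERT ON sales
    BEGIN
        INSERT OR IGNORE INTO sales_to_date (part_key, total)
            VALUES (NEW.part_key, 0);
        UPDATE sales_to_date
            SET total = total + NEW.amount
            WHERE part_key = NEW.part_key;
    END;
""")

conn.executemany(
    "INSERT INTO sales (part_key, amount) VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 75.0)],
)

# Read the snapshot back as {part_key: total}.
sales_to_date = dict(conn.execute("SELECT part_key, total FROM sales_to_date"))
print(sales_to_date)
```

Notice that the trigger fires once per sale, which is exactly the extra work that adds up on a busy transaction table.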

Accumulating snapshots have the advantage of always being up to date and accurate. However, if you have a lot of them, or the transaction tables they are based on have frequent updates, they can become a major drain on your system resources. For that reason it’s best to limit their use to transaction tables that are seldom updated. Let’s take two examples to help illustrate.

In an average car dealership, I’d guess that they might sell four cars a day. Since the transaction volume is so low, four a day, an accumulating snapshot would be entirely reasonable to track sales to date.

On the other hand, let’s look at a large site like woot.com, that deals with hundreds if not thousands of sales in a very short time span. In their case, an accumulating snapshot would be a major drain on system resources. Instead they would be much better served by a periodic snapshot.

A periodic snapshot, by the way, isn’t limited to a once a week or once a month situation, even though that’s how they are primarily used. You could update your periodic snapshots once a day, once an hour, or even once every 10 minutes if your server could support it. The “periodic” simply means that it runs on a regular schedule rather than constantly.

In addition to your transactional fact data, you should review your reporting needs to see if either of the two types of snapshots, periodic or accumulating, could be of use in your warehouse to speed data reporting and reduce the load on your servers.

Slowly Changing Dimensions

Since the facts in your fact tables mark discrete events in time, you would expect them to change pretty often. Your dimensions, on the other hand, tend to be pretty stable. An Employee dimension, for example, wouldn’t change very often. But, it will change. Employees move, get married, croak, or quit.

To handle these changes, data warehouses have adopted the concept of slowly changing dimensions. They are categorized into three types.

Type 1 is the basic type. With it, previous versions of the dimension are overwritten with the latest data, and changes are not tracked. This can be OK, depending on what the business will be doing with the data.

Take the example of an Employee dimension. Our star employee, Hortence McGillicutty, moves and updates her address. In the warehouse, the only business task done with the address is mailings. We don’t care what her address was two years ago; we only need her current address for mailing her paycheck and other related data. In this case, losing the history of her previous address is fine, so a Type 1 dimension is a good choice.
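In code terms a Type 1 change is nothing more than an overwrite. Here is a tiny sketch in Python; the dictionary layout, the helper name `apply_type1_change`, and both addresses are invented for the example.

```python
# Hypothetical Employee dimension, keyed by the warehouse key.
employee_dim = {
    123: {
        "emp_id": 555,
        "name": "Hortence McGillicutty",
        "address": "12 Old Street",  # made-up address
    },
}


def apply_type1_change(dim, key, **changes):
    """Type 1: overwrite the row in place; the old values are not kept."""
    dim[key].update(changes)


apply_type1_change(employee_dim, 123, address="34 New Avenue")
print(employee_dim[123]["address"])
```

After the update there is no trace of the old address anywhere in the dimension, which is fine for mailings and nothing else.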

Let’s say, however, that the business uses the Employee dimension much more extensively. Perhaps we have an application that on the left side of the screen displays information pertaining to the most recent financial statements. On the right is a scanned in image where our star Hortence McGillicutty has signed off on these statements.

A few months go by, and our heroine falls in love and gets married. If we were still using a Type 1, the scanned in image still has her maiden name, but the display to the left now shows her new married name, Hortence Hollywogger. Not a big deal you say? How about the first time a Sarbanes-Oxley auditor comes in, finds the discrepancy, and threatens big fines?

To solve this, data warehousing has Type 2 slowly changing dimensions. With Type 2, from date and to date fields are added to the dimensional table. These dates are the valid dates for that record. In our example, once Hortence got married we’d have two rows in the employee dimension:

Key  EmpId  Name                   From        To
123  555    Hortence McGillicutty  07/17/2001  06/15/2007
123  555    Hortence Hollywogger   06/16/2007  12/31/9999

The warehouse could then look up the correct name based on the date. In the case of our above example, the system would look up based on the date the report was signed, discover it was during the 07/17/2001-06/15/2007 time frame, and display her name correctly, leaving our auditor looking elsewhere for imperfections.
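The date lookup can be sketched in a few lines of Python. The two rows are the ones from the example; the function name `name_as_of` and the two sign-off dates passed to it are assumptions made up for illustration.

```python
from datetime import date

# The two example rows: (key, emp_id, name, valid_from, valid_to).
employee_dim = [
    (123, 555, "Hortence McGillicutty", date(2001, 7, 17), date(2007, 6, 15)),
    (123, 555, "Hortence Hollywogger", date(2007, 6, 16), date(9999, 12, 31)),
]


def name_as_of(dim, emp_id, as_of):
    """Return the name that was valid for emp_id on the given date."""
    for _key, row_emp_id, name, valid_from, valid_to in dim:
        if row_emp_id == emp_id and valid_from <= as_of <= valid_to:
            return name
    return None  # no row covers that date


# Hypothetical sign-off dates, one before and one after the wedding.
print(name_as_of(employee_dim, 555, date(2006, 3, 1)))
print(name_as_of(employee_dim, 555, date(2007, 8, 1)))
```

The far-future 12/31/9999 “to” date on the current row means the same range test works for current and historical rows alike.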

I mentioned there was a third type, which predictably is named Type 3. With Type 3, you don’t add new rows, but new columns to the table to handle changes. As you might guess, this can become quite a maintenance nightmare, and is rarely if ever used. Like Microsoft Bob ( http://en.wikipedia.org/wiki/Microsoft_bob ), Type 3 is probably best forgotten about.

To summarize, use Type 1 when tracking changes to the data is not necessary. Use Type 2 when you need to know when those changes were made. How do you decide?

That’s where your customers become involved. It will be necessary to know what they plan to do with that information. Yes, I know, talking to customers can be a scary experience, but hey, just envision them using Microsoft Bob and you’ll be OK.

Conformed Dimensions

Yesterday I talked about the differences in dimensions versus facts. Today I’d like to extend that discussion with the importance of Conformed Dimensions.

One of the major advantages of a data warehouse is the ability to combine data from various, and sometimes vastly different, systems. Let’s take a common problem: your company has three different systems, sales, production, and purchasing. You’ve bought these from three different vendors, so unfortunately the part numbers used throughout the systems are not consistent, but you need to generate some reports showing how part x went into product y and was sold to customer z.

So the same physical part shows up under a different number in each system. What’s a programmer to do?

This is where conformed dimensions come in handy. In the part dimension table you create a surrogate key. This is the new primary key for a part, which is simply a made up value. Maybe you choose to use a GUID, or perhaps it’s just an auto-incrementing integer. Regardless, this is now your new “part number” for all three systems once you bring the data into the warehouse.

You would add three more fields to the part dimension table. In addition to the primary key you would have a field “saleskey”, a field “productionkey”, and finally a “purchasingkey”. Then, when bringing your sales data into the warehouse, you look up the saleskey in the dimension table, get the primary key for the part, and place it in the fact sales table.

Repeat with production and purchasing systems. By now you are beginning to get the idea. Because you have conformed the part key across the three fact tables, you can now draw reports using the new part key as a common thread to join the various fact tables together.
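Here is a minimal Python sketch of that lookup step. The field names `saleskey`, `productionkey`, and `purchasingkey` come from the description above; the name `part_key` for the surrogate key, the helper `build_lookup`, and every part number and amount are invented for the example.

```python
# Hypothetical part dimension: surrogate key plus each source system's key.
part_dim = [
    {"part_key": 1, "saleskey": "S-100", "productionkey": "PRD-7",
     "purchasingkey": "PU-42"},
    {"part_key": 2, "saleskey": "S-200", "productionkey": "PRD-9",
     "purchasingkey": "PU-57"},
]


def build_lookup(dim, source_column):
    """Map one source system's part number to the warehouse surrogate key."""
    return {row[source_column]: row["part_key"] for row in dim}


sales_lookup = build_lookup(part_dim, "saleskey")
purchasing_lookup = build_lookup(part_dim, "purchasingkey")

# Incoming rows still carry each system's own part number.
sales_rows = [("S-100", 250.0), ("S-200", 80.0)]
purchasing_rows = [("PU-42", 90.0)]

# The fact rows store only the conformed surrogate key.
fact_sales = [(sales_lookup[key], amount) for key, amount in sales_rows]
fact_purchasing = [(purchasing_lookup[key], cost) for key, cost in purchasing_rows]

# The shared key lets us line up purchasing and sales for the same part.
sold = dict(fact_sales)
for part_key, cost in fact_purchasing:
    print(part_key, "purchased for", cost, "sold for", sold[part_key])
```

Once every fact table carries `part_key`, neither report query needs to know that sales calls the part S-100 while purchasing calls it PU-42.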

A dimension keyed this way is known as a conformed dimension. ALL of your dimensions in your warehouse need to be conformed if you want to truly leverage the power of your warehouse. Employees, parts, customers, and locations are just a few examples of dimensions you’d want to conform.

As you can see, having conformed dimensions is key to the success of your warehouse. Failure to conform your dimensions means you lose one of the most powerful features of warehousing, the ability to produce reports across differing systems.
