If you remember my Intro to PowerPivot post, you’ll recall that every organization has that power user who leveraged the tools he had on hand, mostly Excel, to do their own form of Business Intelligence. One headache that plagued “that guy” was the freshness of data. It could be quite difficult to keep the data used in his or her spreadsheets up to date. One of the great boons that PowerPivot offers is the ability to quickly and easily refresh data.
Once again let’s return to the Excel 2010 spreadsheet we created in my Combining Data from Multiple Sources in PowerPivot. In this example we’ll put ourselves in the role of a salesperson, creating an analysis for one of our best customers, one Lucy Harrington. Going to the Customer tab in our PowerPivot workbook, we click on the drop down menu for the FirstName column. Clear the “Select All” option then scroll down and only check the first name “Lucy” which should be sufficient to find only the customer we wish.
In looking at her record, we are horrified to see that some nit-wit in data entry had improperly given our best customer the title of “Mr.” even though we know Lucy is not only a “Mrs.” but as a newlywed very insistent on the use of the Mrs. title.
Correcting this is easy enough, in the source system we can go make the fix, or perhaps we bribe the DBA with lunch so he’ll make our data fix a high priority and issue this update command to the database:
UPDATE SalesLT.Customer SET Title = 'Mrs.' WHERE FirstName = 'Lucy'
So now how do we refresh our data in PowerPivot? Easy enough. In PowerPivot on the Home tab you’ll notice a Refresh button in the Connections group. Clicking on it gives us two options, one to refresh just the current table we are viewing, another to refresh all of the tables in our PowerPivot work area. For this example we’ll pick just the Refresh command.
Once you pick Refresh the Data Refresh dialog appears to let you know of its progress. When done it will let you know of its success and you can then click the Close button to close the window. At this point you’ll see some behavior that I can only attribute to the fact that as of this writing PowerPivot is still in the CTP stage. My screen flashed a few times, then when it was done I was on the last tab in my PowerPivot workbook, the CountryInfo tab. Simply click on the Customer tab to return.
A great thing about the Refresh is that it left our filters intact, you’ll see we are still only looking at Lucy’s, and even better we can see the title of our best customer is now corrected:
If you’re using the same sheet from the previous lessons, you will also notice the calculated column we’d put in place, FullName, is still present.
A word of caution.
When using the Refresh feature, be aware that every time you do a refresh you are pulling the entire dataset back, not just the changed rows. Using SQL Server Profile I monitored the Refresh operation from the server side, and found the query issues was:
SELECT [SalesLT].[Customer].* FROM [SalesLT].[Customer]
If you were to attempt this on a large table in your production system during a busy production time it could have severely negative consequences. Imagine the slowdown that would occur if this were executed against a system having hundreds of millions of rows.
When getting into data sets that large, I would advise moving towards a more structured data warehousing environment. This would provide a structure better suited for analysis. If nothing else though, get IT to setup a copy of the production system in a location updated during off peak hours so that analysts could pull massive amounts of data without impacting production systems.
Summary
Taking into account the strain on the source systems, the easy one button refresh does provide a great way for analysts to ensure the data they are making business decisions from is current and up to date.
Innnteresting. Does the DBA have any control over when these spreadsheets are refreshed? In a scenario with dozens or hundreds of managers checking out their numbers in Excel, I can totally envision at least a handful of them refreshing their data every morning “just to be safe.”
Unfortunately no, which is something of a flaw with PowerPivot. But I think this will self-police itself in a number of ways.
If the manager is refreshing small datasets, then it’ll be fast and no one will really notice. If however the datasets are huge, then it will only take one time of the manager waiting around a while for his refresh to pull all that data to cure him of the “just to be safe” situation.
Further, if the manager does persist then he’ll not only have phone calls from irate DBAs, but he’ll have to deal with complaints from his own staff (or potentially other managers) for killing performance of production systems.
If / when the PowerPivot sheet is uploaded to Excel 2010, a schedule can be setup there to do the refreshes. Now this may actually mean a SharePoint Admin and a DBA might actually have to talk to each other, but stranger things have happened.
If all that doesn’t fix it, then I advise that a data warehouse, or even just a simple copy of the production data be done each night, and then only allow people to pull from the copy and not the live database. In the long run this is what I suspect will happen in many organizations.
Also recall that SQL Server Analysis Services can be used as a data source, so a very likely scenario would be a BI project is created that sets up an SSAS database, but instead of creating pre-built pivot tables users are allowed to create their own pivot tables / charts in PowerPivot using SSAS as the source.
And there is a silver lining, in that when a DBA sees a lot of hits from the same PowerPivot spreadsheet, IT will know it’s a popular sheet and perhaps some IT dollars should be focused there to create a full blown BI solution, as opposed to the “pointy haired boss” who is yelling loudly to the IT manager that his sheet needs to be converted although he’s the only one who uses it.
Thanks for this fantastic tutorial! It’s awesome to see the Excel community already helping users with 2010. Keep up the great posts!
We’d love to hear more from you at http://www.facebook.com/office
Cheers,
Andy
MSFT Office Outreach Team
Am I able to order more then one at a time?