Creating Tables and Inserting Rows With FILESTREAM

In previous lessons, we setup the server to handle FILESTREAM, then we looked at how to create or alter a database to work with FILESTREAM. In this lesson we will create a table to store a FILESTREAM, then insert a few rows into it.

Before we create our table, be aware of some requirements necessary for FILESTREAM to work. First, you must have a special column that FILESTREAM uses to uniquely identify the stream. It must be a unique non null identifier of type ROWGUIDCOL. If we specify a default of NEWSEQUENTIALID the column becomes self maintaining. When we insert a new value into the row, SQL Server takes care of creating a GUID for us and we essentially can ignore the column. Here is an example:


USE FileStreamFTS
GO

CREATE TABLE DocumentRepository(
  ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, FileStreamID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWSEQUENTIALID()
, DocumentExtension VARCHAR(10)
, DocumentName VARCHAR(256)
, Document VARBINARY(MAX) FILESTREAM DEFAULT(0x)
);
GO

Here the column “FileStreamID” will become the required column for FILESTREAM. Note the column name is not important, I could have called it “FSID”, “FSIdentity”, or even “Spock”. You’ll also note I created an ID column for our use in normal day to day operations. This is not a requirement of FILESTREAM, just good practice. There is a second requirement however. For the column that will be storing the documents, it must be VARBINARY(MAX), and it must add the FILESTREAM clause.

You will also note the default of “0x” (hex 0). This will be important if you wish to insert a new row without supplying the document at the time the row is created. It will create a file to act as a placeholder until such time as the real document is supplied.

You can also alter an existing table to add FILESTREAM capabilities. Simply use the ALTER TABLE command, add the unique identifier column (in this example “FileStreamID”) and the VARBINARY(MAX) column to hold your data (“Document” in the above example).

It’s time to insert some data. Normally the best way to insert data is using a client application, such as something written in .Net. It is possible to add documents though via T-SQL. To supply data for these examples, I decided a little culture was in order. I went to http://shakespeare.mit.edu/ and copied some of the plays into Microsoft Word documents. Note I used the older “.doc” format from 2003, not the 2007 “.docx”. This is not particularly important as far as FILESTREAM is concerned, but will come into play in later lessons when we look at doing Full Text Searches on this data.


INSERT INTO DocumentRepository(DocumentExtension, DocumentName, Document)
SELECT
 'doc' AS DocumentExtension
 , 'Hamlet.doc' AS DocumentName
 , * FROM OPENROWSET(BULK 'D:\Docs\Hamlet.doc', SINGLE_BLOB) 
   AS Document;
GO

Here we’ve inserted a new row into the table, and have ignored the “ID” and “FileStreamID” letting SQL Server create the values. The “DocumentExtension” and “DocumentName” columns are straightforward. To supply the document however, we need to use an OPENROWSET. This will supply the data from a disk file as a single BLOB (Binary Large OBject).

Let’s verify what was just inserted with this query:


SELECT ID
     , FileStreamID
     , DocumentExtension AS Ext
     , DocumentName
     , CAST(Document AS VARCHAR) as DocumentData
FROM DocumentRepository;

Notice the CAST, its necessary in order to get a view into the stored document.

ID FileStreamID Ext DocumentName DocumentData
1 A54A8FFE-F742-DE11-AC61-002243CE0AAB doc Hamlet.doc ÐÏࡱá > þÿ

 

As mentioned previously, it is also possible to insert a new row, then add the document later. To insert a new row into our example table, we simply use:


INSERT INTO DocumentRepository(DocumentExtension, DocumentName)
VALUES ('doc', 'AMidsummerNightsDream.doc');
GO

Now that the row exists, we can update it.


UPDATE  DocumentRepository
SET     Document = ( SELECT *
                     FROM OPENROWSET(
                      BULK 'D:\Docs\AMidsummerNightsDream.doc',
                      SINGLE_BLOB) AS TheDocument
                   )
WHERE   ID = 2 ;
GO

While it is possible to add data in this format, it’s fairly rare. The majority of the time you will want to use a client side application, which is what we’ll look at in the next installment of this series.

Advertisements

Enabling FILESTREAM In The Database

Once you have your server configured for FILESTREAM, you will need to configure your database. The easiest way is to establish it when you first create the database.


CREATE DATABASE FileStreamFTS ON 
PRIMARY ( NAME = FileStreamFTS_Data,
    FILENAME = 'd:\data\FileStreamFTS_Data.mdf'),
FILEGROUP FileStreamFTS_FileGroup1 
  CONTAINS FILESTREAM( NAME = FileStreamFTS_FileGroup,
    FILENAME = 'd:\data\FileStreamFTS_FileGroup')
LOG ON  ( NAME = FileStreamFTS_Log,
    FILENAME = 'd:\data\FileStreamFTS_Log.ldf')
GO

In the above example, I created my database in a directory on my D drive named “D:\Data”. If you examine the folder you ’ll now see two files, the mdf and ldf files. There is also a folder “FileStreamFTS_FileGroup”. It is in this folder that your files will eventually be stored in.

You may be tempted to poking around in the folder, and perhaps even access the stored files directly. My advice to you: don’t. Microsoft has strongly advised against this practice. After all, you went to all the trouble of setting up FILESTREAM to let SQL Server handle this for you, so let it do the handling. If you are curious to dig deeper into how FILESTREAM works behind the scenes, I would recommend Paul Randal’s excellent white paper available at http://msdn.microsoft.com/en-us/library/cc949109.aspx.

The above example shows how to add FILESTREAM capability when you first create your database. That’s great, but what if you already have an existing database you want to add FILESTREAM to? I have just such an existing database, “ArcaneCode”. To add FILESTREAM I need to issue two ALTER DATABASE statements. First, we need to add a FILEGROUP and indicate that file group contains FILESTREAM objects.


ALTER DATABASE ArcaneCode
ADD FILEGROUP ArcaneFileStreamGroup1 
  CONTAINS FILESTREAM
GO

In the second we have to indicate to the FILEGROUP where the FILESTREAM should be stored. Pass in the directory name where you want your files saved as the FILENAME parameter. As with the create statement the directory can not exist, if it does you’ll get an error.


ALTER DATABASE ArcaneCode
ADD FILE  
  (NAME = 'ArcaneFS_Group'
   , FILENAME = 'D:\Data\ArcaneCodeFS'
   )
TO FILEGROUP ArcaneFileStreamGroup1 
GO

This completes the steps needed to create a database with FILESTREAM, or add it to an existing database. In the next lesson we will look at creating tables that use FILESTREAM.

Enabling FILESTREAM on SQL Server 2008

One of the newly touted features of SQL Server 2008 is file streaming. For some time now SQL Server has allowed users to store large binary objects in the database inside a varbinary(max) field. Performance began to suffer, however, when the binary object was over 1 megabyte or so in size.

SQL Server 2008 solves this through its new FILESTREAM feature. With FILESTREAM SQL Server lets the operating system, Windows Server, do what it does best: handle the storage of binary objects, better known as files. In the database, SQL Server simply stores a reference that will let it open and close the file from the disk. FILESTREAM gives us the best of all worlds. When the file group is backed up so are the files, they can also be made part of a transaction.

By default, FILESTREAM is not enabled when you install SQL Server 2008. Activating it is a simple, two step process. As always though make sure to do proper backups before making any changes to your SQL Server, and be sure to first implement on development, then test systems before making changes to any production server.

First, you will need to enable FILESTREAM for the Windows Server it is running on. Begin by opening the SQL Server Configuration Manager. Once inside click on “SQL Server Services” then highlight the server you want to activate. In this example we clicked on “SQL Server (MSSQLSERVER)”.

image

Now that it is highlighted, right click on it and pick properties from the menu. Click on the FILESTREAM tab in the dialog that appears.

image

If you have a default install, your dialog should appear with nothing enabled, as this one does. Check on the first line, “Enable FILESTREAM for Trancact-SQL access”. This turns on the FILESTREAM feature, but with a drawback. It only works from T-SQL, and then only on the server itself. To make it useful we need to go further.

Once the first line is checked on, the second line, “Enable FILESTREAM for file I/O streaming access” will become enabled. Check it on. At this point the server will allow FILESTREAM from a separate application, but only when run on the same Windows Server that SQL Server is running on.

Since most of your users will not be running on the SQL Server itself, you will want to check on the final option which is now available, “Allow remote clients to have streaming access to FILESTREAM data”. Once this is checked, you can click OK to save your changes.

Note there may be one exception to checking on the final option. If you are running SQL Server Express, it is quite possible your application and the database may all be running on the same computer, and that you do not want any other machine to get access to this database. While this would be an uncommon situation, please note that the capability does exist for you to set things up this way if needed.

Now that the Windows Server is ready to handle FILESTREAM we need to enable the SQL Server. Fortunately this winds up being a simple matter. Open up SQL Server Management Studio, and enter the following command:


  EXEC sp_configure filestream_access_level, 2
  RECONFIGURE

Note the number at the end of the first line. Valid values for this are “0”, “1”, and “2”. “0” will disable FILESTREAM completely. “1” will enable it, but only for T-SQL code. “2” will fully enable it and allow access via .NET or other code. Note that these are slightly different from the way the Windows Server rights are set. Here, in order to run anything other than T-SQL you must use a “2”, regardless of where the code is executed (local or remote).

It is also possible to change this setting in the GUI. Right click on the server in SQL Server Management Studio, in this case it is “(local)”, and pick “Properties” from the menu. Go to the “Advanced” page and the FILESTREAM is the top most item, you can see it here with all of its options.

image 

Now your server has been properly configured to handle file streaming, you’ll want to create a database in a way to handle file streaming. We’ll do that in the next installment in this series.