In previous lessons, we setup the server to handle FILESTREAM, then we looked at how to create or alter a database to work with FILESTREAM. In this lesson we will create a table to store a FILESTREAM, then insert a few rows into it.
Before we create our table, be aware of some requirements necessary for FILESTREAM to work. First, you must have a special column that FILESTREAM uses to uniquely identify the stream. It must be a unique non null identifier of type ROWGUIDCOL. If we specify a default of NEWSEQUENTIALID the column becomes self maintaining. When we insert a new value into the row, SQL Server takes care of creating a GUID for us and we essentially can ignore the column. Here is an example:
USE FileStreamFTS
GO
CREATE TABLE DocumentRepository(
ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY
, FileStreamID UNIQUEIDENTIFIER ROWGUIDCOL NOT NULL UNIQUE DEFAULT NEWSEQUENTIALID()
, DocumentExtension VARCHAR(10)
, DocumentName VARCHAR(256)
, Document VARBINARY(MAX) FILESTREAM DEFAULT(0x)
);
GO
Here the column “FileStreamID” will become the required column for FILESTREAM. Note the column name is not important, I could have called it “FSID”, “FSIdentity”, or even “Spock”. You’ll also note I created an ID column for our use in normal day to day operations. This is not a requirement of FILESTREAM, just good practice. There is a second requirement however. For the column that will be storing the documents, it must be VARBINARY(MAX), and it must add the FILESTREAM clause.
You will also note the default of “0x” (hex 0). This will be important if you wish to insert a new row without supplying the document at the time the row is created. It will create a file to act as a placeholder until such time as the real document is supplied.
You can also alter an existing table to add FILESTREAM capabilities. Simply use the ALTER TABLE command, add the unique identifier column (in this example “FileStreamID”) and the VARBINARY(MAX) column to hold your data (“Document” in the above example).
It’s time to insert some data. Normally the best way to insert data is using a client application, such as something written in .Net. It is possible to add documents though via T-SQL. To supply data for these examples, I decided a little culture was in order. I went to http://shakespeare.mit.edu/ and copied some of the plays into Microsoft Word documents. Note I used the older “.doc” format from 2003, not the 2007 “.docx”. This is not particularly important as far as FILESTREAM is concerned, but will come into play in later lessons when we look at doing Full Text Searches on this data.
INSERT INTO DocumentRepository(DocumentExtension, DocumentName, Document)
SELECT
'doc' AS DocumentExtension
, 'Hamlet.doc' AS DocumentName
, * FROM OPENROWSET(BULK 'D:\Docs\Hamlet.doc', SINGLE_BLOB)
AS Document;
GO
Here we’ve inserted a new row into the table, and have ignored the “ID” and “FileStreamID” letting SQL Server create the values. The “DocumentExtension” and “DocumentName” columns are straightforward. To supply the document however, we need to use an OPENROWSET. This will supply the data from a disk file as a single BLOB (Binary Large OBject).
Let’s verify what was just inserted with this query:
SELECT ID
, FileStreamID
, DocumentExtension AS Ext
, DocumentName
, CAST(Document AS VARCHAR) as DocumentData
FROM DocumentRepository;
Notice the CAST, its necessary in order to get a view into the stored document.
ID | FileStreamID | Ext | DocumentName | DocumentData |
1 | A54A8FFE-F742-DE11-AC61-002243CE0AAB | doc | Hamlet.doc | ÐÏࡱá > þÿ |
As mentioned previously, it is also possible to insert a new row, then add the document later. To insert a new row into our example table, we simply use:
INSERT INTO DocumentRepository(DocumentExtension, DocumentName)
VALUES ('doc', 'AMidsummerNightsDream.doc');
GO
Now that the row exists, we can update it.
UPDATE DocumentRepository
SET Document = ( SELECT *
FROM OPENROWSET(
BULK 'D:\Docs\AMidsummerNightsDream.doc',
SINGLE_BLOB) AS TheDocument
)
WHERE ID = 2 ;
GO
While it is possible to add data in this format, it’s fairly rare. The majority of the time you will want to use a client side application, which is what we’ll look at in the next installment of this series.
Very helpful article! Thanks
I did notice though that FILESTREAM best practice says…
“Use NULL instead of 0x to represent a non-initialized FILESTREAM column. The 0x value causes a file to be created, and NULL does not.”
Has that changed?
Hi Arcane,
I noticed when we select the data, the value in column “DocumentData” is unreadable. Is there a way to fix it?