Using PowerShell To Find A Value In A Set Of Files

I often present PowerShell at SQL Saturdays. One question I’m often asked is “Why learn PowerShell? I can do everything in T-SQL.” Well here’s a great example.

I was developing some SSIS packages that took a set of text files and loaded them into multiple dimension and fact tables. In the process I had one particular value that was creating duplicate values in one of the target tables. Obviously, I needed to find that value in the various files and see what’s different about it versus the other data.

The problem was there are close to 200 files I’m loading. Each file has hundreds of lines, and each line is over 200 characters long. Yikes! That’s a lot of data to try to look through. It would have been very time consuming to look in all of those files manually.

PowerShell to the rescue! In just a few minutes I created a PowerShell script to loop over the files, find the ones that had the value I was looking for, and list those file names. Made it easy then to open just the files I needed and look at the rows. I could have gone further, actually printed out the data, or perhaps load it into some array where it would count it, or more.

I decided to share the script, which you will see below. It actually took me longer to write up all the comments than it did to write the initial code. With the comments removed there’s only 9 lines of real code in the script.

Using PowerShell I was able to quickly find the rows I needed and find the issue. This is just one simple example of how PowerShell can be used to quickly solve a problem that might take a long time to accomplish in other languages.


# Script: Search for a string in a group of files
# Author: Robert C. Cain
# Revision History
#   V01 - 05/28/2012 - Initial version
# This PowerShell script will loop over a collection of text files 
# which are delimited somehow (comma, tab, etc) and look for a 
# specific value. It will then print out the name of the file. 

# Here I've hardcoded a value to look for, you could also
# pass this in as a paramter if you convert this to a function
$ValueToLookFor = '123456789'

# Our data files have a header row, this should match the 
# names of those columns. This will allow us to address 
# the columns by name later in the routine
$header = "Column1", "Column2", "ColumnToSearch", "Column3", "Column4"

# Move to the location with all the files
Set-Location 'C:\MyData'

# Retrive a list of files that we need to look thru
# Note the use of the filter to only select some files
# If you are doing all files you can omit the filter
$files = Get-ChildItem -Filter "Data*.txt" | Select-Object Name

foreach($file in $files)
  # Load the data in the file into $data
  # In this example the files are tabbed delimted, hence the need
  # to specify the tab character as a delimter. If your file is
  # comma delimted you can omit the -Delimter, otherwise specify
  # whatever yours is. 
  # We also need to pass in the $header variable so PowerShell 
  # will be able to associate each column of data with a name
  $data = Import-Csv $file.Name -Delimiter "`t" -Header $header

    foreach($line in $data)
      # Here's where the header comes in handy, I can now use the 
      # column name as the .Property to look in
      if ($line.ColumnToSearch -eq $ValueToLookFor)
        # Print out the name of the file. You could do other things
        # like print out the actual line, or load it into a variable,
        # just to name a few. 
        # Use a break if you only want the file name listed once
        # I actually let it print out multiple times to indicate how
        # many times it's in the file

        # break  
    } # foreach($line in $data)
} # foreach($file in $files)


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s