ArcaneBooks – Parsing Library of Congress Control Number (LCCN) Data With PowerShell

Introduction

In my previous post in this series, ArcaneBooks – Library of Congress Control Number (LCCN) – An Overview, I provided an overview of the LCCN and the basics of calling its public web API to retrieve data based on the LCCN.

In this post I will demonstrate how to call the API and dissect the data using PowerShell. This will be a code-intensive post.

You can find the full ArcaneBooks project on my GitHub site. Please note as of the writing of this post the project is still in development.

The code examples for this post can be located at https://github.com/arcanecode/ArcaneBooks/tree/main/Blog_Posts/005.00_LCCN_API. It contains the script that we’ll be dissecting here.

XML from Library of Congress

For this demo, we’ll be using an LCCN of 54-9698, Elements of radio servicing by William Marcus. When we call the web API URL in our web browser, we get the following data.

<zs:searchRetrieveResponse xmlns:zs="http://docs.oasis-open.org/ns/search-ws/sruResponse">
  <zs:numberOfRecords>2</zs:numberOfRecords>
  <zs:records>
    <zs:record>
      <zs:recordSchema>mods</zs:recordSchema>
      <zs:recordXMLEscaping>xml</zs:recordXMLEscaping>
      <zs:recordData>
        <mods xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns="http://www.loc.gov/mods/v3" version="3.8" xsi:schemaLocation="http://www.loc.gov/mods/v3 http://www.loc.gov/standards/mods/v3/mods-3-8.xsd">
          <titleInfo>
            <title>Elements of radio servicing</title>
          </titleInfo>
          <name type="personal" usage="primary">
            <namePart>Marcus, William. [from old catalog]</namePart>
          </name>
          <name type="personal">
            <namePart>Levy, Alex,</namePart>
            <role>
              <roleTerm type="text">joint author</roleTerm>
            </role>
          </name>
          <typeOfResource>text</typeOfResource>
          <originInfo>
            <place>
              <placeTerm type="code" authority="marccountry">nyu</placeTerm>
            </place>
            <dateIssued encoding="marc">1955</dateIssued>
            <issuance>monographic</issuance>
            <place>
              <placeTerm type="text">New York</placeTerm>
            </place>
            <agent>
              <namePart>McGraw Hill</namePart>
            </agent>
            <dateIssued>[1955]</dateIssued>
            <edition>2d ed.</edition>
          </originInfo>
          <language>
            <languageTerm authority="iso639-2b" type="code">eng</languageTerm>
          </language>
          <physicalDescription>
            <form authority="marcform">print</form>
            <extent>566 p. illus. 24 cm.</extent>
          </physicalDescription>
          <subject authority="lcsh">
            <topic>Radio</topic>
            <topic>Repairing. [from old catalog]</topic>
          </subject>
          <classification authority="lcc">TK6553 .M298 1955</classification>
          <identifier type="lccn">54009698</identifier>
          <recordInfo>
            <recordContentSource authority="marcorg">DLC</recordContentSource>
            <recordCreationDate encoding="marc">820525</recordCreationDate>
            <recordChangeDate encoding="iso8601">20040824072855.0</recordChangeDate>
            <recordIdentifier>6046000</recordIdentifier>
            <recordOrigin>Converted from MARCXML to MODS version 3.8 using MARC21slim2MODS3-8_XSLT1-0.xsl (Revision 1.172 20230208)</recordOrigin>
          </recordInfo>
        </mods>
      </zs:recordData>
      <zs:recordPosition>1</zs:recordPosition>
    </zs:record>
  </zs:records>
  <zs:nextRecordPosition>2</zs:nextRecordPosition>
  <zs:echoedSearchRetrieveRequest>
    <zs:version>2.0</zs:version>
    <zs:query>bath.lccn=54009698</zs:query>
    <zs:maximumRecords>1</zs:maximumRecords>
    <zs:recordXMLEscaping>xml</zs:recordXMLEscaping>
    <zs:recordSchema>mods</zs:recordSchema>
  </zs:echoedSearchRetrieveRequest>
  <zs:diagnostics xmlns:diag="http://docs.oasis-open.org/ns/search-ws/diagnostic">
    <diag:diagnostic>
      <diag:uri>info:srw/diagnostic/1/5</diag:uri>
      <diag:details>2.0</diag:details>
      <diag:message>Unsupported version</diag:message>
    </diag:diagnostic>
  </zs:diagnostics>
</zs:searchRetrieveResponse>

Let’s see how to retrieve this data then parse it using PowerShell.

Parsing LCCN Data

First, we’ll start by setting the LCCN in a variable. This is the LCCN for "Elements of radio servicing" by William Marcus.

$LCCN = '54-9698'

To pass in the LCCN to the web API, we need to remove any dashes or spaces.

$lccnCleaned = $LCCN.Replace('-', '').Replace(' ', '')

Beginning in 2001 the LCCN started using a four digit year. By that time, however, books were already printing the ISBN instead of the LCCN. For those books we’ll be using the ISBN, so for this module we can safely assume the LCCNs we receive only have a two digit year.

With that said, we’ll use the following code to extract the two digit year.

$lccnPrefix = $lccnCleaned.Substring(0,2)

Since the characters in positions 0 and 1 are the year, the rest of the LCCN starts at the third character, which is in position 2, and runs to the end of the string. Calling Substring with only a starting position returns everything from that position onward.

Next, the API requires the remaining part of the LCCN to be six digits. So we’ll use the PadLeft method to put 0’s in front to make it six digits.

$lccnPadded = $lccnCleaned.Substring(2).PadLeft(6, '0')

Now combine the reformatted LCCN and save it to a variable.

$lccnFormatted = "$($lccnPrefix)$($lccnPadded)"

Now we’ll combine all the parts to create the URL needed to call the web API.

$baseURL = "http://lx2.loc.gov:210/lcdb?version=3&operation=searchRetrieve&query=bath.lccn="
$urlParams = "&maximumRecords=1&recordSchema=mods"
$url = "$($baseURL)$($lccnFormatted)$($urlParams)"

It’s time now to get the LCCN data from the Library of Congress site. We’ll wrap it in a try/catch so in case the call fails, for example from the internet going down, it will provide a message and exit.

Note at the end of the Write-Host line we use the PowerShell line continuation character of ` (a single backtick) so we can put the foreground color on the next line, making the code a bit more readable.

try {
  $bookData = Invoke-RestMethod $url
}
catch {
  Write-Host "Failed to retrieve LCCN $LCCN. Possible internet connection issue. Script exiting." `
    -ForegroundColor Red
  # If there's an error, quit running the script
  exit
}

Now we need to see if the book was found in the archive. We’ll use an if to check the title property: if the LCCN was not found in their database, the title will be null, and we display a message to that effect.

If it was found, we fall through into the else clause to process the data. The remaining code resides within the else.

# If the title is null the book was not found. We let the user know, and skip the rest of the script
if ($null -eq $bookData.searchRetrieveResponse.records.record.recordData.mods.titleInfo.title)
{
  Write-Host "Retrieving LCCN $LCCN returned no data. The book was not found."
}
else # Great, the book was found, assign the data to variables
{

To get the data, we start at the root object, $bookData. The main node in the returned XML is searchRetrieveResponse. From here we can use standard dot notation to work our way down the XML tree to get the properties we want.
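As a small offline illustration of that dot notation, here is a sketch that casts a hand-written XML string to [xml] and walks down to a value. The sample XML and the variable names are made up for the demo; they are not the real Library of Congress response. It also shows why the null check above works: asking for a node that isn’t there simply returns $null (assuming strict mode is not turned on).

```powershell
# A tiny, made-up XML document standing in for the real response
$sample = [xml]@'
<response>
  <records>
    <record>
      <book>
        <title>Elements of radio servicing</title>
        <edition>2d ed.</edition>
      </book>
    </record>
  </records>
</response>
'@

# Walk down the tree one node at a time using dot notation
$sampleTitle = $sample.response.records.record.book.title

# A node that does not exist comes back as $null rather than an error
$missing = $sample.response.records.record.book.subtitle

"Title: $sampleTitle"
"Subtitle found: $($null -ne $missing)"
```

Working against a local string like this is a handy way to experiment with the path to a property without calling the web API every time.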

Our first entry gets the Library of Congress Number. The syntax is a little odd. If we walk the XML tree, we find this stored in:

<identifier type="lccn">54009698</identifier>

If we display the identifier property using this code:

$bookData.searchRetrieveResponse.records.record.recordData.mods.identifier

We get this result.

type #text
---- -----
lccn 54009698

The LCCN we want is stored in the property named #text. But #text isn’t a valid property name in PowerShell. We can still use it though if we wrap the name in quotes.

  $LibraryOfCongressNumber = $bookData.searchRetrieveResponse.records.record.recordData.mods.identifier.'#text'
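To see when the '#text' property is needed, the sketch below uses a made-up two element XML string. An element that only holds text comes back as a plain string, while an element that also carries attributes comes back as an object, with the attributes and the '#text' value as separate properties. The variable names here are just for the demo.

```powershell
# Made-up sample: one element with an attribute, one without
$sample = [xml]'<mods><identifier type="lccn">54009698</identifier><edition>2d ed.</edition></mods>'

# No attributes: PowerShell hands back the text as a plain string
$edition = $sample.mods.edition

# With attributes: we get an element, so the text lives in '#text'
$idType = $sample.mods.identifier.type
$idText = $sample.mods.identifier.'#text'

"Edition: $edition"
"Identifier ($idType): $idText"
```

This is why, in the real data, some properties need '#text' on the end and others, such as the title, do not.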

From here we can process other properties that are easy to access.

  $Title = $bookData.searchRetrieveResponse.records.record.recordData.mods.titleInfo.title
  $PublishDate = $bookData.searchRetrieveResponse.records.record.recordData.mods.originInfo.dateIssued.'#text'
  $LibraryOfCongressClassification = $bookData.searchRetrieveResponse.records.record.recordData.mods.classification.'#text'
  $Description = $bookData.searchRetrieveResponse.records.record.recordData.mods.physicalDescription.extent
  $Edition = $bookData.searchRetrieveResponse.records.record.recordData.mods.originInfo.edition

Now we get to the section where an XML property can contain one or more values.

Books can have multiple authors, and each is returned as its own item in an array. The same is true of several other fields. One example is the book’s subjects. Here is a sample of the XML:

<subject authority="lcsh">
  <topic>Radio</topic>
  <topic>Repairing. [from old catalog]</topic>
</subject>

As you can see, this has two topics. What we need to do is retrieve the root, in this case subject, then loop over each item.

For our purposes we don’t need them individually, a single string will do. So in the PowerShell we’ll create a new object of type StringBuilder. For more information on how to use StringBuilder, see my post Fun With PowerShell – StringBuilder.

In the loop if the variable used to hold the string builder is empty, we’ll just add the first item. If it’s not empty, we’ll append a comma, then append the next value.

  $authors = [System.Text.StringBuilder]::new()
  foreach ($a in $bookData.searchRetrieveResponse.records.record.recordData.mods.name)
  {
    if ($authors.Length -gt 0)
      { [void]$authors.Append(", $($a.namePart)") }
    else
      { [void]$authors.Append($a.namePart) }
  }
  $Author = $authors.ToString()

As a final step we used the ToString method to convert the data in the string builder back to a normal string and store it in the $Author variable.

From here, we’ll repeat this logic for several other items that can hold multiple values. The book’s subjects are one example.

  $subjects = [System.Text.StringBuilder]::new()
  $topics = $bookData.searchRetrieveResponse.records.record.recordData.mods.subject | Select-Object topic
  foreach ($s in $topics.topic)
  {
    if ($subjects.Length -gt 0)
      { [void]$subjects.Append(", $($s)") }
    else
      { [void]$subjects.Append($s) }
  }
  $Subject = $subjects.ToString()

A book could have multiple publishers over time. The author could shift to a new publisher, or more likely a publishing house could be purchased and the new owner’s name used. The data is returned as an array, so we combine them as we did with authors and subjects.

Note that in the returned data, the publisher is stored as an "agent". We’ll use the name Publisher to keep it consistent with the ISBN data.

  $thePublishers = [System.Text.StringBuilder]::new()
  foreach ($p in $bookData.searchRetrieveResponse.records.record.recordData.mods.originInfo.agent)
  {
    if ($thePublishers.Length -gt 0)
      { [void]$thePublishers.Append(", $($p.namePart)") }
    else
      { [void]$thePublishers.Append($p.namePart) }
  }
  $Publishers = $thePublishers.ToString()

Since there could be multiple publishers, logically there could be multiple publishing locations. This section will combine them into a single string.

  $locations = [System.Text.StringBuilder]::new()
  foreach ($l in $bookData.searchRetrieveResponse.records.record.recordData.mods.originInfo.place.placeTerm)
  {
    if ($locations.Length -gt 0)
      { [void]$locations.Append(", $($l.'#text')") }
    else
      { [void]$locations.Append($l.'#text') }
  }
  $PublisherLocation = $locations.ToString()
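The loops for authors, subjects, publishers, and locations all share the same shape, so the logic could be pulled into a small helper. What follows is only a sketch: Join-WithComma is a hypothetical name and is not part of the ArcaneBooks module. It takes an array of strings and uses a string builder in the same way as the loops above.

```powershell
# Hypothetical helper (not part of ArcaneBooks): joins values with ", "
function Join-WithComma
{
  param ( [string[]] $Values )

  $sb = [System.Text.StringBuilder]::new()
  foreach ($v in $Values)
  {
    # Only add the separator once something is already in the builder
    if ($sb.Length -gt 0) { [void]$sb.Append(', ') }
    [void]$sb.Append($v)
  }
  return $sb.ToString()
}

# Usage with plain sample values
$sampleSubjects = Join-WithComma -Values 'Radio', 'Repairing. [from old catalog]'
$sampleSubjects
```

For simple cases like this one, PowerShell’s built-in -join operator produces the same kind of comma separated string, so which approach to use is largely a matter of taste.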

All done! We’ll give a success message to let the user know.

  Write-Host "Successfully retrieved data for LCCN $LCCN" -ForegroundColor Green

Finally, we’ll display the results. Note some fields may not have data; that’s fairly normal. The Library of Congress only has the data provided by the publisher. In addition, some of the LCCN data dates back many decades, so the data supplied in the 1940’s may be different than what is supplied today.

  "LCCN: $LCCN"
  "Formatted LCCN: $lccnFormatted"
  "Library Of Congress Number: $LibraryOfCongressNumber"
  "Title: $Title"
  "Publish Date: $PublishDate"
  "Library Of Congress Classification: $LibraryOfCongressClassification"
  "Description: $Description"
  "Edition: $Edition"
  "Author: $Author"
  "Subject: $Subject"
  "Publishers: $Publishers"
  "Publisher Location: $PublisherLocation"
}

The Result

Here is the result of the above code.

LCCN: 54-9698
Formatted LCCN: 54009698
Library Of Congress Number: 54009698
Title: Elements of radio servicing
Publish Date: 1955
Library Of Congress Classification: TK6553 .M298 1955
Description: 566 p. illus. 24 cm.
Edition: 2d ed.
Author: Marcus, William. [from old catalog], Levy, Alex,
Subject: Radio, Repairing. [from old catalog]
Publishers: McGraw Hill
Publisher Location: nyu, New York

As you can see it returned a full dataset. Not all books may have data for all the fields, but this one had the full details on record with the Library of Congress.

See Also

This section has links to other blog posts or websites that you may find helpful.

The ArcaneBooks Project – An Introduction

ArcaneBooks – ISBN Overview, PowerShell, and the Simple OpenLibrary ISBN API

ArcaneBooks – PowerShell and the Advanced OpenLibrary ISBN API

ArcaneBooks – Library of Congress Control Number (LCCN) – An Overview

Fun With PowerShell – StringBuilder

The GitHub Site for ArcaneBooks

Conclusion

In this post we saw how to call the web API provided by the Library of Congress from PowerShell, and how to parse the XML it returns. Understanding this code is important when we integrate the call into the ArcaneBooks module.

Fun With PowerShell – StringBuilder

Introduction

As I was creating the next post in my ArcaneBooks series, I realized I had not written about the StringBuilder class. As the code in my ArcaneBooks module relies on it in several places, I thought it best to add a new post to my Fun With PowerShell series explaining how to use it before continuing.

Needing to add more text to an existing string is common in any language, and PowerShell is no exception.

What many people don’t realize though is that PowerShell strings are immutable. They cannot change. As an example, let’s talk about what happens behind the scenes when you execute this code sample.

$x = 'Arcane'
$x = $x + 'Code'

First, PowerShell creates a variable in memory. For an example, we’ll say the memory is located at position 0001.

In the second line of code, PowerShell creates a second variable in memory, let’s say it is position 0002. Into position 0002, it copies the data from position 0001 then adds the Code string.

Next, it changes $x to point to memory location 0002. Finally, it marks position 0001 as no longer in use. At some point in the future, the garbage collector will clean up the memory when there is some idle time. The garbage collector is a system function that removes chunks of memory that are no longer in use, freeing up memory for other code to use.

Why This Is Bad

In the example above, we only had one variable (the one at location 0001) that needed to be garbage collected. Imagine though you were looping over thousands of records of data, building a complex string that perhaps you’ll later save to a file. The amount of work the garbage collector would need to do is enormous. It would have a negative impact on system performance, and create a slow running script.
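To get a feel for the difference on your own machine, here is a rough timing sketch. The count of 5000 is an arbitrary number picked for the demo, and the timings will vary from run to run and machine to machine, so treat the numbers as an indication and not a benchmark. It builds the same text twice, once with concatenation and once with the StringBuilder class described in the rest of this post.

```powershell
$count = 5000   # arbitrary number of pieces for the demo

# Approach 1: concatenation, a brand new string is created on every pass
$watch = [System.Diagnostics.Stopwatch]::StartNew()
$text = ''
for ($i = 0; $i -lt $count; $i++) { $text = $text + 'x' }
$watch.Stop()
$concatMs = $watch.Elapsed.TotalMilliseconds

# Approach 2: StringBuilder, one object that grows as we append
$watch.Restart()
$sb = [System.Text.StringBuilder]::new()
for ($i = 0; $i -lt $count; $i++) { [void]$sb.Append('x') }
$built = $sb.ToString()
$watch.Stop()
$builderMs = $watch.Elapsed.TotalMilliseconds

"Concatenation: $concatMs ms"
"StringBuilder: $builderMs ms"
"Same result:   $($text -eq $built)"
```

With small counts the two may be close; the gap tends to show up as the number of pieces grows.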

To solve this, the StringBuilder class was created. Conceptually, it works like a linked list behind the scenes. Let me walk through an example one step at a time.

Step 1 – Create an empty string builder object

$output = [System.Text.StringBuilder]::new()

Step 2 – Append text to the StringBuilder variable we created

To add a string value, we will use the Append method. Note that methods such as Append return the StringBuilder object itself, which PowerShell would otherwise display. Most of the time we don’t need to see this. By using [void] before the line, the output of the Append method is discarded.

[void]$output.Append('Arcane')

We now have an item in memory, we’ll call it position one. This holds two values, the string value and a pointer to the next item. If there is no next item, the pointer value is null.

Position | Text   | Pointer to next item
0001     | Arcane | null

Step 3 – Append a second string

[void]$output.Append('Code')

The string builder now updates the linked list.

Position | Text   | Pointer to next item
0001     | Arcane | 0002
0002     | Code   | null

Step 4 – Retrieve the data

When we go to retrieve the data, the string builder will go through the chain, assemble the final data and return it. In order to copy it into a standard string variable, we’ll need to use the ToString method to convert the result from a string builder object to a standard string.

$result = $output.ToString()

Why this is a good solution

Here, PowerShell only created one variable, then kept appending to the linked list. When we are done with the variable $output the garbage collector only has to clean up one variable, not hundreds or (potentially) thousands.

When you only have a few items, and are sure their sizes are small, then using a string builder may not provide much benefit in terms of performance. However, when you have an unknown number of items then string builder can be a friend.

In addition to Append, string builder has several more methods that are of use. Let’s look at them now.

Append

While we just looked at using Append, I want to use this section to remind you to include proper spacing when creating your strings.

$output = [System.Text.StringBuilder]::new()
[void]$output.Append( 'PowerShell is awesome!' )
[void]$output.Append( ' It makes my life much easier.' )
[void]$output.Append( ' I think I''ll go watch some of Robert''s videos on Pluralsight.' )
$output.ToString()

This results in:

PowerShell is awesome! It makes my life much easier. I think I'll go watch some of Robert's videos on Pluralsight.

Note that on the second and third calls to the Append method I included a space at the beginning of the line. This was needed to make the output look like a true series of sentences, with spaces after the periods.

You could have also put spaces at the end of the lines, that is up to you and your needs when building your code.

AppendLine

When appending, you sometimes want a carriage return / line feed character added to the end of the text that was appended. To handle this, we have the AppendLine method.

$output = [System.Text.StringBuilder]::new()
[void]$output.Append( 'PowerShell is awesome!' )
[void]$output.AppendLine( ' It makes my life much easier.' )
[void]$output.Append( 'I think I''ll go watch some of Robert''s videos on Pluralsight.' )
$output.ToString()

In the result, you can see the line wraps after the "…much easier." line.

PowerShell is awesome! It makes my life much easier.
I think I'll go watch some of Robert's videos on Pluralsight.

This can be handy when, for example, you are building a string that will be written out as a CSV (comma separated values) file. Each row of data will be saved as an individual line.

You may also have situations where you are building a big string that you want as something more readable. Perhaps you are building a string that will be emailed as a report. In it you’d want blank lines between each paragraph.

To accomplish this, you can just use AppendLine without passing a value into it.

$output = [System.Text.StringBuilder]::new()
[void]$output.Append( 'PowerShell is awesome!' )
[void]$output.AppendLine( ' It makes my life much easier.' )
[void]$output.AppendLine()
[void]$output.Append( 'I think I''ll go watch some of Robert''s videos on Pluralsight.' )
$output.ToString()

The output from this code is:

PowerShell is awesome! It makes my life much easier.

I think I'll go watch some of Robert's videos on Pluralsight.
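To make the CSV idea mentioned above a bit more concrete, here is a minimal sketch that builds a two column CSV string with AppendLine. The rows are sample data for the demo, and the sketch assumes none of the values contain commas or quotes.

```powershell
# Sample rows for the demo
$rows = @(
  @{ LCCN = '54-9698';  Title = 'Elements of Radio Servicing' }
  @{ LCCN = '64-20875'; Title = 'Early Electrical Communication' }
)

$csv = [System.Text.StringBuilder]::new()
[void]$csv.AppendLine('LCCN,Title')   # header row
foreach ($row in $rows)
{
  # One AppendLine call per row puts each record on its own line
  [void]$csv.AppendLine("$($row.LCCN),$($row.Title)")
}
$csvText = $csv.ToString()
$csvText
```

For real data, where values may contain commas or quotes, the built-in Export-Csv cmdlet is usually the safer choice since it handles the quoting for you.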

AppendFormat

The third version of append is AppendFormat. It allows you to append a numerical value, and specify a string format.

In the example below, the first parameter is {0:C}. Into the spot where the 0 is, the numeric value in the second parameter, $value is placed. The :C indicates a currency format should be used.

$value = 33
$output = [System.Text.StringBuilder]::new()
[void]$output.Append( 'The value is: ' )
[void]$output.AppendFormat( "{0:C}", $value )
$output.ToString()

This results in:

The value is: $33.00

The formats supported by string builder are identical to the ones that the string data type uses.
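AppendFormat can also take more than one value, with {0}, {1}, and so on marking where each goes. The sketch below uses two format codes whose output doesn’t depend on your regional settings: D5 pads an integer with zeros to five digits, and X shows an integer in hexadecimal. The variable names and values are made up for the demo.

```powershell
$id    = 42
$flags = 255
$output = [System.Text.StringBuilder]::new()
# {0:D5} pads the first value to five digits, {1:X} shows the second in hexadecimal
[void]$output.AppendFormat( 'Item {0:D5} has flags {1:X}', $id, $flags )
$formatted = $output.ToString()
$formatted
```

This results in:

Item 00042 has flags FF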

For more information on string formatting, please see my post Fun With PowerShell String Formatting.

Insert

You may have a situation where you need to insert text into the text already saved in your string builder variable. To accomplish this, we can use the Insert method.

As the first parameter we pass in the position we wish to start inserting at. The second parameter holds the text to be inserted.

$output = [System.Text.StringBuilder]::new()
[void]$output.Append( 'Arcane' )
[void]$output.Append( ' writes great blog posts.' )
[void]$output.Insert(6, 'Code')
$output.ToString()

The output of the above sample is:

ArcaneCode writes great blog posts.

Remove

In addition to inserting text, we can also remove text using the Remove method. It requires two parameters, the first is the position to start removing at, the second is the number of characters to remove.

$output = [System.Text.StringBuilder]::new()
[void]$output.Append( 'ArcaneCode' )
[void]$output.Append( ' writes great blog posts.' )
[void]$output.Remove(6, 4)
$output.ToString()

In this example I’m removing the text Code from ArcaneCode.

Arcane writes great blog posts.

Replace

You may recall that the string data type has a replace method. So too does the string builder, also named Replace. In the first parameter you pass in the character to be replaced. The second is what you want to replace it with.

$output = [System.Text.StringBuilder]::new()
[void]$output.Append( 'ArcaneCode' )
[void]$output.AppendLine( ' writes great blog posts.' )
[void]$output.Append( 'I think I''ll go watch some of Robert''s videos on Pluralsight.' )
[void]$output.Replace('.', '!')
$output.ToString()

In this simple example, I’m going to replace all periods in the text with exclamation marks.

ArcaneCode writes great blog posts!
I think I'll go watch some of Robert's videos on Pluralsight!

Be aware Replace, as used here, works on the entire text held in string builder, replacing every occurrence found. If you want to limit the replacements, one approach is to do the replace prior to appending the text you want left alone.
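Another way to limit a replacement is the overload of Replace that accepts a starting position and a count of characters to examine. Here is a small sketch, using made-up text, that only changes the period within the first four characters.

```powershell
$output = [System.Text.StringBuilder]::new('One. Two. Three.')
# Replace periods only within the first 4 characters (positions 0 through 3)
[void]$output.Replace('.', '!', 0, 4)
$limited = $output.ToString()
$limited
```

This results in:

One! Two. Three.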

The Replace method is most commonly used to remove special characters from your text, perhaps a result from reading in data from a file that contains things like squiggly braces and brackets.

The replacement character can be an empty string, which results in simply removing the unwanted character.

Finally, you can stack multiple methods into one operation. For example, if the string builder holds the text:

{ArcaneCode}, [arcanecode.com]

You can do:

$output.Replace('{', '').Replace('}', '').Replace('[', '').Replace(']', '')

Which results in the following text:

ArcaneCode, arcanecode.com

And you aren’t limited to stacking replaces, you can mix and match methods.

$output = [System.Text.StringBuilder]::new()
[void]$output.Append( '[ArcaneCode]' ).Replace('[', '').Replace(']', '').Insert(6, ' ')
$output.ToString()

Results in:

Arcane Code

If you get carried away this can get ugly and hard to read. But it is possible so you should know about it. There are times when it can make the code more compact and a bit easier to read, such as:

[void]$output.Replace('[', '').Replace(']', '')

Adding the first string when you create a StringBuilder object

There is one last capability to look at. When you instantiate (fancy word for create) the new string builder object, you can pass in the first text value to be stored in the string builder.

Here I’m passing in the text ArcaneCode when we create the variable.

$output = [System.Text.StringBuilder]::new('ArcaneCode')
[void]$output.Append( ' writes great blog posts.' )
$output.ToString()

The output is like you’d expect.

ArcaneCode writes great blog posts.

See Also

You may find more helpful information at the links below.

Fun With PowerShell Strings

Fun With PowerShell String Formatting

If you want to go deeper on the internals of the StringBuilder class, Andrew Lock has a great series of articles at his blog.

Conclusion

The string builder class can be a great tool for optimizing your scripts that do a lot of text manipulation.

Now that you have an understanding of the string builder class, we’re free to proceed with the next post in the ArcaneBooks project.

ArcaneBooks – Library of Congress Control Number (LCCN) – An Overview

Introduction

This is part of my ongoing series on my ArcaneBooks project. The goal is to provide a module to retrieve book data via provided web APIs. In the SEE ALSO section later in this post I’ll provide links to previous posts which cover the background of the project, as well as how to use the OpenLibrary APIs to get data based on the ISBN.

In this post I will provide an overview of using the Library of Congress API to get data based on the LCCN, short for Library of Congress Control Number.

The next post in this series will provide code examples and an explanation of how to use PowerShell to get data using the Library of Congress API.

LCCN Overview

The abbreviation LCCN, according to the Library of Congress’s own website, stands for Library of Congress Control Number. When the system was first created in 1898, however, LCCN stood for Library of Congress Card Number, and I’ve seen it both ways in publications.

I’ve also seen a few places define it as Library of Congress Catalog Number, although this was never an official designation.

The LCCN was created in 1898 to provide a unique value to every item in the Library of Congress. This not only includes books, but works of art, manuscripts (not in book form), maps, and more.

LCCN Format

The LCCN has two parts, a prefix followed by a serial number. From 1898 to 2000 the prefix was two digits, representing the year. Beginning in 2001 the prefix became four digits, representing the year.

The serial number is simply a sequential number. 45-1 was the first number assigned in 1945. 45-1234 was the 1,234th item assigned in that year.

Be aware from 1969 to 1972 there was an experiment where the single digit of 7 was used for the prefix. They decided this scheme wasn’t going to work out, and reverted to the standard format of year followed by serial number.

Here are a few examples of real LCCNs from books in my personal collection. You can use these in your own testing.

LCCN | Title
54-9698 | Elements of Radio Servicing
40-33904 | Radio Handbook Twenty-Second Edition
41-3345 | The Radio Amateur’s Handbook 42nd Edition 1965
64-20875 | Early Electrical Communication
74-75450 | VHF Handbook for Radio Amateurs
76-190590 | Wire Antennas for Radio Amateurs
71-120473 | 73 Vertical, Beam, and Triangle Antennas

Accessing Book Data from the Library of Congress

The Library of Congress actually provides two web APIs for getting book data. The first API is for accessing assets, such as digital assets. It doesn’t return much data for books.

The second is the LC Z39.50 system, accessible through lx2.loc.gov. Here is an example of calling it to retrieve a record for the book Elements of Radio Servicing, which has the LCCN of 54-9698. (It should, of course, all be used as a single line just in case your web browser wraps it.)

http://lx2.loc.gov:210/lcdb?version=3&operation=searchRetrieve&query=bath.lccn=54009698&maximumRecords=1&recordSchema=mods

Breaking it down, the root call is to http://lx2.loc.gov:210/lcdb. After this is a question mark ?, followed by the parameters.

The first parameter is version=3. This indicates which format to use for the return data. It supports two versions, 1.1 and 3. For our purposes we’ll use the most current version, 3.

Following the ampersand & is operation=searchRetrieve. This instructs the Library of Congress’s API that we want to do a search to retrieve data.

Next is the core piece, we need to tell it what LCCN number to look up, query=bath.lccn=54009698. The root object is bath, then it uses the property lccn.

The LCCN has to be formatted in a specific way. We start with the two or four digit year. In the above example, 54-9698, this would be the two digit year of 54.

Next is the serial number. If the number is less than six digits, it must be left zero padded to become six. Thus 9698 becomes 009698. The year and serial number are combined, removing any dashes, spaces, or other characters and becomes 54009698.
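The formatting rule just described can be sketched in a few lines of PowerShell. This is only an illustration: Format-LccnForQuery is a hypothetical name, not a function from the ArcaneBooks module, and it assumes the LCCN is always written with a dash between the year and the serial number. The four digit example uses a made-up serial number.

```powershell
# Hypothetical helper, not part of the ArcaneBooks module
function Format-LccnForQuery
{
  param ( [string] $Lccn )

  # Split the year prefix from the serial number at the dash
  $parts  = $Lccn.Replace(' ', '').Split('-')
  $year   = $parts[0]
  # Left pad the serial number with zeros so it is six digits long
  $serial = $parts[1].PadLeft(6, '0')

  return "$year$serial"
}

$twoDigit  = Format-LccnForQuery '54-9698'     # two digit year
$fourDigit = Format-LccnForQuery '2001-1234'   # four digit year, made-up serial
$twoDigit
$fourDigit
```

This results in:

54009698
2001001234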

Following is maximumRecords=1, indicating we only expect one record back. That’s all we’ll get back with a single LCCN anyway, so this will work fine for our needs.

The final parameter is recordSchema=mods. The API supports several formats.

Record Schema | Description | Notes
dc | Dublin Core (bibliographic records) | Brings back just the basics (Name, author, etc)
mads | MADS (authority records) | Brief, not a lot of info
mods | MODS (bibliographic records) | Very readable XML schema, most info
marcxml | MARCXML – the default schema | Abbreviated schema, not readable
opacxml | MARCXML (with holdings attached) | As above with a bit more info

You are welcome to experiment with different formats, but for this module we’ll be using mods. It provides the most information, and is in XML. XML is very easy to read, and it works great with PowerShell.

ISBN and Library of Congress

It is possible to use the Library of Congress to look up the ISBN. In my testing though, the interface provided by OpenLibrary provided more data. Thus we’ll be using it for looking up ISBNs in this module.

We’ll use the LCCN API for books where we only have the LCCN.

See Also

The ArcaneBooks Project – An Introduction

ArcaneBooks – ISBN Overview, PowerShell, and the Simple OpenLibrary ISBN API

ArcaneBooks – PowerShell and the Advanced OpenLibrary ISBN API

Conclusion

In this document we covered the basics of the LCCN as well as the web API provided by the Library of Congress. Understanding this information is important when we integrate the call into our PowerShell code.

ArcaneBooks – PowerShell and the Advanced OpenLibrary ISBN API

Introduction

This post continues my series on my ArcaneBooks project. For a background see my post The ArcaneBooks Project – An Introduction.

For this project I am using the OpenLibrary.org website, which provides two web APIs to access book data based on the ISBN. OpenLibrary is sponsored by the Internet Archive.

In a previous post, ArcaneBooks – ISBN Overview, PowerShell, and the Simple OpenLibrary ISBN API, I covered the use of the first API which I nicknamed the Simple API as it is a bit easier to use and dissect the results. I also provided a background on what the ISBN is and how it is formed.

In this post I’ll dive into the more complex of the APIs, what I call the Advanced API.

Be aware the use of Simple and Advanced are my terms, so I can easily distinguish between the two. They are not terms used by the OpenLibrary.

The Advanced OpenLibrary API

The format of the Advanced API is slightly different from the simple. Here is the template.

https://openlibrary.org/api/books?bibkeys=ISBN:[ISBN Goes Here]&jscmd=data&format=json

You will replace the [ISBN Goes Here] text with the ISBN number you want to look up. Be aware this can only be digits, you must remove any spaces, dashes, or other characters.

Let’s look at a code example of calling the API and getting all its properties.

Calling The API with PowerShell

First, set an ISBN to look up. We’ll include some dashes for the demo. The title of the book is "Your HF Digital Companion".

$ISBN = '0-87259-481-5'

Now remove any spaces or dashes, then create the URL.

$isbnFormatted = $ISBN.Replace('-', '').Replace(' ', '')
$baseURL = "https://openlibrary.org/api/books?bibkeys=ISBN:"
$urlParams = "&jscmd=data&format=json"
$url = "$($baseURL)$($isbnFormatted)$($urlParams)"
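The two Replace calls above handle dashes and spaces. As a minimal sketch of a broader cleanup, a single regular expression could strip every character that isn't part of the ISBN. The `$rawIsbn` and `$cleanIsbn` names and the sample input are mine, purely for illustration. I'm also assuming you want to keep the X check character that an ISBN-10 can end with.

```powershell
# Hypothetical input with mixed separators and stray whitespace
$rawIsbn = ' 0-87259.481-5 '

# Remove everything except digits and the X check character (assumption: keep X)
$cleanIsbn = ($rawIsbn -replace '[^0-9Xx]', '').ToUpper()

# Build the URL the same way as the code above
$url = "https://openlibrary.org/api/books?bibkeys=ISBN:$($cleanIsbn)&jscmd=data&format=json"
$url
```

The regular expression is a bit more forgiving than chained Replace calls, since it doesn't need to know ahead of time which separators a user might type.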

Now let’s call the URL and put the data into a variable.

$bookData = Invoke-RestMethod $url

If we look at the data held by the variable, we get back a single property named for the ISBN. It holds the book data, which Invoke-RestMethod has already converted from JSON into an object. (Note I truncated the output for readability purposes.)

$bookData

This is the output of displaying the variable.

ISBN:0872594815
---------------
@{url=https://openlibrary.org/books/OL894295M/Your_HF_digital_companion; key=/books/OL894295M; title=Your HF digital companion; authors=System.Object[]; number_of_pages=197; …

We could address the data like:

$bookData.'ISBN:0872594815'.Title

Note we had to wrap the property name in quotes since a colon : isn’t allowed in an unquoted property name. However, the ISBN will vary from call to call, so we can’t hard-code it.

But we do have it in a variable, and we can use string interpolation to format the property.

$bookData."ISBN:$isbnformatted".title

This returns "Your HF digital companion". And yes, the words "digital" and "companion" should normally be capitalized, but this is the way the title comes from OpenLibrary.
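Before reading properties, it may be worth confirming the ISBN key is actually present. The sketch below works offline against sample JSON rather than calling the API. My understanding is that an unknown ISBN comes back as an empty JSON object, but treat that as an assumption. The `Get-BookEntry` function name is hypothetical.

```powershell
# Sample responses standing in for the real API call (assumed shapes)
$found    = '{"ISBN:0872594815": {"title": "Your HF digital companion"}}' | ConvertFrom-Json
$notFound = '{}' | ConvertFrom-Json

function Get-BookEntry
{
  param ($Response, [string] $IsbnFormatted)

  if ($null -eq $Response) { return $null }

  # Look the property up by name; the indexer gives $null when the key is absent
  $prop = $Response.PSObject.Properties["ISBN:$IsbnFormatted"]
  if ($null -eq $prop) { return $null }
  return $prop.Value
}

$book = Get-BookEntry $found '0872594815'
$book.title

# With the empty response there is no matching key, so we get $null back
$missing = Get-BookEntry $notFound '0872594815'
$null -eq $missing
```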

Now that we have the formatting for the property name down, we can get the other properties. Note that not all properties that are returned will have data.

$ISBN10 = $bookData."ISBN:$isbnformatted".identifiers.isbn_10
$ISBN13 = $bookData."ISBN:$isbnformatted".identifiers.isbn_13
$Title = $bookData."ISBN:$isbnformatted".title
$LCCN = $bookData."ISBN:$isbnformatted".identifiers.lccn
$NumberOfPages = $bookData."ISBN:$isbnformatted".number_of_pages
$PublishDate = $bookData."ISBN:$isbnformatted".publish_date
$LibraryOfCongressClassification = $bookData."ISBN:$isbnformatted".classifications.lc_classifications
$DeweyDecimalClass = $bookData."ISBN:$isbnformatted".classifications.dewey_decimal_class
$Notes = $bookData."ISBN:$isbnformatted".notes
$CoverUrlSmall = $bookData."ISBN:$isbnformatted".cover.small
$CoverUrlMedium = $bookData."ISBN:$isbnformatted".cover.medium
$CoverUrlLarge = $bookData."ISBN:$isbnformatted".cover.large
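Since some properties come back empty, one option is to substitute a placeholder at the time of assignment. This is only a sketch, assuming PowerShell 7 or later for the ?? null-coalescing operator. The sample JSON is a cut-down stand-in for the real response, and the placeholder text is my choice.

```powershell
# Sample data standing in for $bookData."ISBN:$isbnformatted" (assumed shape)
$book = '{"title": "Your HF digital companion", "number_of_pages": 197}' | ConvertFrom-Json

# ?? returns the right-hand value when the left-hand side is $null (PowerShell 7+)
$NumberOfPages = $book.number_of_pages ?? '[Missing Value]'
$ISBN13        = $book.identifiers.isbn_13 ?? '[Missing Value]'
$CoverUrlSmall = $book.cover.small ?? '[Missing Value]'

$NumberOfPages
$ISBN13
$CoverUrlSmall
```

Reading a property from a missing parent object, such as cover here, quietly returns $null by default. If you turn on Set-StrictMode this would raise an error instead.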

The by_statement sometimes begins with the word "by ". If so, we want to remove it. However, if by_statement is null, calling the Replace method on it will result in an error. So we first check for null, and only attempt the replace when by_statement has a value.

if ($null -eq $bookData."ISBN:$isbnformatted".by_statement)
  { $ByStatement = '' }
else
  { $ByStatement = $bookData."ISBN:$isbnformatted".by_statement.Replace('by ', '') }
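Be aware the Replace method is case sensitive and replaces every occurrence it finds, not just the one at the start. As a hedged alternative, the -replace operator is case insensitive by default, and a regular expression can anchor the match to the beginning of the text. The `Remove-ByPrefix` function below is a hypothetical helper, not part of the ArcaneBooks module.

```powershell
function Remove-ByPrefix
{
  # A $null argument arrives as an empty string because of the [string] type
  param ([string] $ByStatement)

  # ^ anchors the match to the start; -replace ignores case by default
  return ($ByStatement -replace '^\s*by\s+', '')
}

Remove-ByPrefix 'by Steve Ford.'
Remove-ByPrefix 'By Steve Ford.'
Remove-ByPrefix $null
Remove-ByPrefix 'Written by Steve Ford.'
```

The first two calls both return "Steve Ford.", the third returns an empty string, and the last is returned unchanged because the "by" isn't at the start.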

For the remaining data, each item can have multiple entries attached. For example, a book could have multiple authors, each returned as its own item in an array. For our purposes we will just combine them into a single string.

We’ll create a new variable of type StringBuilder, then loop over the list of items in the JSON, combining them into a single string.

In the if, we check to see if the string already has data. If so, we append a comma before adding the next author’s name.

Finally we use the ToString method of the StringBuilder class to convert the value back into a standard string data type.

Note that when we call the Append method of the StringBuilder class, we need to prefix it with [void]. Otherwise it will send output to the console, which we don’t want.

$authors = [System.Text.StringBuilder]::new()
foreach ($a in $bookData."ISBN:$isbnformatted".authors)
{
  if ($authors.Length -gt 0)
    { [void]$authors.Append(", $($a.name)") }
  else
    { [void]$authors.Append($a.name) }
}
$Author = $authors.ToString()

Subjects can also be an array, so let’s combine them into a single string.

$subjects = [System.Text.StringBuilder]::new()
foreach ($s in $bookData."ISBN:$isbnformatted".subjects)
{
  if ($subjects.Length -gt 0)
    { [void]$subjects.Append(", $($s.name)") }
  else
    { [void]$subjects.Append($s.name) }
}
$Subject = $subjects.ToString()

A book could have multiple publishers over time. The author could shift to a new publisher, or more likely a publishing house could be purchased and the new owner’s name used. The data is returned as an array, so we combine them as we did with authors and subjects.

$thePublishers = [System.Text.StringBuilder]::new()
foreach ($p in $bookData."ISBN:$isbnformatted".publishers)
{
  if ($thePublishers.Length -gt 0)
    { [void]$thePublishers.Append(", $($p.name)") }
  else
    { [void]$thePublishers.Append($p.name) }
}
$Publishers = $thePublishers.ToString()

Since there could be multiple publishers, logically there could be multiple publishing locations. This will combine them into a single string.

$locations = [System.Text.StringBuilder]::new()
foreach ($l in $bookData."ISBN:$isbnformatted".publish_places)
{
  if ($locations.Length -gt 0)
    { [void]$locations.Append(", $($l.name)") }
  else
    { [void]$locations.Append($l.name) }
}
$PublisherLocation = $locations.ToString()
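The four loops above all share the same shape, so they could be folded into one small helper built on the -join operator. This is just a sketch of that idea. The `Join-Names` function is hypothetical, and the second author in the sample JSON is made up so there is something to join.

```powershell
function Join-Names
{
  param ($Items)

  # Collect the name of every item, skipping any that have none
  $names = foreach ($item in $Items)
  {
    if ($null -ne $item.name) { $item.name }
  }

  # -join concatenates with a separator; an empty list gives an empty string
  return (@($names) -join ', ')
}

# Sample data standing in for $bookData."ISBN:$isbnformatted" (assumed shape)
$book = '{"authors": [{"name": "Steve Ford"}, {"name": "Jane Doe"}], "subjects": [{"name": "Radiotelegraph"}]}' | ConvertFrom-Json

Join-Names $book.authors
Join-Names $book.subjects
Join-Names $book.publishers   # property absent in the sample
```

With -join there is no need to track whether a comma is required, which removes the length check entirely.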

Now print out all the returned data.

$ISBN10
$ISBN13
$Title
$LCCN
$NumberOfPages
$PublishDate
$LibraryOfCongressClassification
$DeweyDecimalClass
$Notes
$CoverUrlSmall
$CoverUrlMedium
$CoverUrlLarge
$ByStatement
$Author
$Subject
$Publishers
$PublisherLocation
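Rather than printing each variable on its own line, you might gather them into a single object so the labels travel with the values. Below is a minimal sketch using a PSCustomObject with only four of the fields. The values are hard coded from the output of this demo so the snippet runs on its own.

```powershell
# Values standing in for the variables gathered above
$Title = 'Your HF digital companion'
$Author = 'Steve Ford'
$NumberOfPages = 197
$PublishDate = '1995'

# Bundle the values into one object, one property per field
$book = [PSCustomObject]@{
  Title         = $Title
  Author        = $Author
  NumberOfPages = $NumberOfPages
  PublishDate   = $PublishDate
}

# Format-List prints one "Name : Value" line per property
$book | Format-List
```

An object like this can also be piped to cmdlets such as Export-Csv or ConvertTo-Json if you want to save the results.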

The Output

Here is the output. I put it into a table for easier reading.

Item | Value
ISBN10 | 95185134
ISBN13 | [Missing Value]
Title | Your HF digital companion
LCCN | [Missing Value]
Number of Pages | 197
Publish Date | 1995
LibraryOfCongressClassification | TK5745 .F572 1995
DeweyDecimalClass | 004.6/4
Notes | Includes bibliographical references. Based on Your RTTY/AMTOR companion. 1st ed. c1993.
CoverUrlSmall | https://covers.openlibrary.org/b/id/12774631-S.jpg
CoverUrlMedium | https://covers.openlibrary.org/b/id/12774631-M.jpg
CoverUrlLarge | https://covers.openlibrary.org/b/id/12774631-L.jpg
ByStatement | Steve Ford.
Author | Steve Ford
Subject | Radiotelegraph, Amateurs’ manuals
Publishers | American Radio Relay League
PublisherLocation | Newington, CT

See Also

The following operators and functions were used or mentioned in this article’s demos. You can learn more about them in some of my previous posts, linked below.

Fun With PowerShell Logic Branching

Fun With PowerShell Loops

Fun With PowerShell Strings

Fun With PowerShell String Formatting

Conclusion

In this post we saw how to use what I call the advanced API offered by OpenLibrary to retrieve book data based on the ISBN.

In the next post we’ll see how to get book data based on the Library of Congress Control Number, using PowerShell and the Library of Congress’s web API.

The demos in this series of blog posts were inspired by my Pluralsight course PowerShell 7 Quick Start for Developers on Linux, macOS and Windows, one of many PowerShell courses I have on Pluralsight. All of my courses are linked on my About Me page.

If you don’t have a Pluralsight subscription, just go to my list of courses on Pluralsight. At the top is a Try For Free button you can use to get a free 10-day subscription to Pluralsight, with which you can watch my courses, or any other course on the site.