WorldCat Identities Change

By jr

I meant to get this out a week or two ago. Seems as if there’s been a little, but possibly significant change in the way WorldCat Identities works. Before you could get directly to a WCID record by knowing the proper base link and the NACO normalization of the name (1XX). The NACO normalization seems to be what some FRBRizations use to match identities. Just matching normalized strings. This value was marked up in the xml of the Identities records as the ‘pnkey.’ For most names you could very easily normalize it and get directly to the record you want, especially if you had a MARC record that already had authority work done on it. The resulting link would look something like: http://orlabs.oclc.org/SRW/search/Identities?query=local.pnkey+exact+%22fforde,%20jasper%22

I was hoping to use WCID as a way to easily grab MARCXML authority records from the Linked Authority File (see the bottom of any WCID record’s page for the link). I thought this would be a nice feature to add into my little Ruby copy cataloging script. It would have been nice to have the ability to grab authority records at the same time as cataloging and maybe enrich the bibliographic record with such things as a link to wikipedia or back to the WCID record.

To make sure I was able to generate the links properly I was working on a little Ruby library to do the NACO normalizations correctly. I invested a lot of time in learning how to pack a hex number representing the UTF-16 value for a letter into the UTF-8 version. The python example for NACO normalizations used the UTF-16 value so instead of looking up and calculating all the UTF-8 octals I wanted to just use these values directly. I spent time with Iconv for Ruby trying to get diacritic stripping transliteration to work before realizing that it wasn’t very portable since it relied on system locales. In fact I could get Iconv to do proper transliteration to ASCII from UTF-8 in irb, but as soon as I tried it in a script it failed replacing everything with a diacritic with a ‘?’. I took a look at the ICU4R library but couldn’t get it to compile on my system. Finally I fell back on the older Unicode gem to do decomposing normalizations and then strip out every byte above 127 which would include all those diacritics. Of course there’s an OCLC web service for NACO normalization but I wanted to learn some of this stuff anyways.

I think I had it mostly working when I wanted to take a look at some of the WCID pnkeys to make sure I had all the spacing correct for a good match. The site seemed down since a link directly to one of the records I often took a look at threw an ugly error. Since the site has been in beta I’ve often seen these types of errors, so I went to the homepage instead. The homepage came up. Searching a name also threw an error so I clicked on the tag cloud.

I discovered that now WCID has changed the pnkey to be the 010$a from the LC authority record with “lccn-” tacked onto the front. The links look nicer (more RESTful?): http://orlabs.oclc.org/Identities/key/lccn-n79-21164 But you’re no longer able to get directly to the record unless you already know the LC control number for the authority record.

It’s still to be seen what other kind of valid links you might be able to create with WCID, but for now it’s not looking good for what I had hoped to do. I can’t get to an authority record I don’t have if I need information from that record to get there. They’ve cut off my one simple route to the information I want. It may be that I’ll have to do a two step process: search first and then have a way to pick the proper record. More cumbersome than I had hoped for sure. It looks to me as if OCLC may have made WCID less generally useful. It’s a beautiful project but if I can’t have an API to it or have to wade through a couple step process and parse all the xml myself, then it’s not really very useful.

It’ll be interesting to see what happens but I don’t know that I’ll be touching anything from OCLC in the future until it’s out of beta. Lesson learned.

Update: Searching WCID now works. It seems for entities without authority records they still use a normalized form of the name as their pnkey.

Tags: , , , ,

3 Responses to “WorldCat Identities Change”

  1. jr Says:

    See the inkdroid post for more on how OCLC could use REST.

  2. Thom Hickey Says:

    Interesting. We weren’t aware anyone was trying to do this. I can think of a several ways we could make this work:

    1) Add a NACO-normalized index to the LAF
    2) Use http://viaf.org/ It’s probably close to what you need if you are willing to fuss with SRU responses
    3) Follow through with our intention to have a ‘name’ URI in addition to the ‘key’ URI on WorldCat Identities

    Sorry, all the links got broken. As you noted, we’re still in beta and we’re just not ready yet to commit to final forms of the URIs, although we hope we are close.

    –Th

  3. jr Says:

    Thom,
    Thanks for your response.
    I think LAF won’t work even for some simple cases. Last I knew the LAF did not have corporate names included. The VIAF I see has a new look and may work just to grab authority records (right now I’m getting a 404).

    Of course you know, the advantage of WorldCat Identities is that a much more positive ID can be made because of all the kinds of information included there that isn’t available in authority records. While initially I wanted to support copy cataloging, I wanted to make it work for out of date copy which hasn’t had authority work updated.

    Also add the link to the WCID record and Wikipedia entry into the authority record as I grab it and configure the OPAC to display those link fields for associated bib records, and then we’ve really got something interesting. :) It’s one thing for a cataloger to make a positive ID and another for the user to start making connections.

    I do hope you add a name URI. Even more interesting is the recent development I’ve read concerning an OCLC developer’s network forming. Mentioned was the possibility of an xID service which I assume is WorldCat Identities related? I’d certainly be interested in helping develop a Ruby library against that.

    (Which reminds me I’ve worked on updating the xISBN rubygem and need to finish cleaning it up, once I get some school work out of the way, and make a release.)

Leave a Reply