Citation IDs
Posted in Uncategorized on November 27th, 2004 by darcusb
It’s clear the future of bibliographic and citation management is greater interoperability and collaboration, not less. In the future, users will create less bibliographic data and consume more of it.
For individual document authors, this raises an issue: how to code citations in such a way that they clearly and unambiguously point to the correct record? In a single-user context, this problem has generally been solved in one of two ways:
- a numeric database id
- a natural language citation key (e.g. Doe99a)
The first approach, used by default in EndNote, has the virtue of uniqueness within a single-user context. Beyond that, however, documents break. An ID of 2312 will point to one record on User A’s system, and another record entirely on User B’s.
I prefer the second approach myself, because the identifier is tied to the content, not the storage. I can look at the citation and deduce the record it refers to. However, the citekey approach has its own problems when you start to scale it from the desktop to the internet. How does one ensure that \cite{Smith99} is unique?
So, the question: what standardized way of identifying citation records best balances these needs? There’s a discussion of this over on the BibDesk wiki. I tend to like the approach they note from CiteSeer, which concatenates the author name, two-digit year, and the title. Example: authorYEARtitle (mccracken03greatPaperAboutStuff). While it is a bit verbose, it seems to me to be the best approach. It is superior to the traditional practice (which I use!) of appending a letter suffix to distinguish multiple works with the same author-year combination (e.g. Doe1999b) because:
- it is more portable
- it is more information-rich (you can see at a glance the specific record it points to)
- it doesn’t add punctuation like colons (which could cause problems in some contexts?)
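To make the comparison concrete, here is a rough sketch in Python of how an authorYEARtitle key might be built next to a plain author-year key. I don't know the exact rules CiteSeer applies, so the normalization here (ASCII letters only, camelCased title words) and the function names `author_year_key` and `author_year_title_key` are my own assumptions for illustration.

```python
import re


def author_year_key(surname: str, year: int) -> str:
    """Plain author-year key, e.g. doe99 (hypothetical helper)."""
    # Keep only ASCII letters from the surname and lowercase them.
    author = re.sub(r"[^A-Za-z]", "", surname).lower()
    # Two-digit, zero-padded year.
    return f"{author}{year % 100:02d}"


def author_year_title_key(surname: str, year: int, title: str) -> str:
    """authorYEARtitle key, e.g. mccracken03greatPaperAboutStuff.

    Assumed normalization, not necessarily what CiteSeer does.
    """
    # Split the title into alphanumeric words, dropping punctuation.
    words = re.findall(r"[A-Za-z0-9]+", title)
    # camelCase: first word lowercase, later words capitalized.
    camel = "".join(
        w.lower() if i == 0 else w[:1].upper() + w[1:].lower()
        for i, w in enumerate(words)
    )
    return author_year_key(surname, year) + camel


# Two different records by the same author in the same year.
records = [
    ("Doe", 1999, "A Study of Things"),
    ("Doe", 1999, "Another Study"),
]

short_keys = [author_year_key(s, y) for s, y, _ in records]
long_keys = [author_year_title_key(s, y, t) for s, y, t in records]

print(short_keys)  # both records collapse to the same key
print(long_keys)   # the title keeps them apart
```

With the plain author-year key both records come out as doe99, which is why a suffix like "b" has to be invented locally. The longer key separates them without any local bookkeeping, though it is still not guaranteed unique: a two-digit year is ambiguous across centuries, and two people may normalize the same title differently.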
Thoughts?