Open XML, Draft 1.4

MS recently released a new draft of their Open XML format. This version includes more information about the citation and bibliographic support. Some notes …

First, citations. There is really nothing to say except that, yes, they’ve documented what they’re doing a bit more (see the PDF, p. 1513), but no, they haven’t bothered to fix any of it. This isn’t really surprising, I suppose, since they’re using generic fields to encode citations. But it still results in rather unfriendly XML.

Second, the bibliographic source format.

All the problems I noted earlier remain:

  1. the personal name model is Western—even U.S.—centric
  2. the model is (almost) totally flat and inflexible

Here is the list of types, with my own parenthetical comments:

  1. Art
  2. ArticleInAPeriodical (should be just “Article”)
  3. Book
  4. BookSection
  5. Case
  6. ConferenceProceedings
  7. DocumentFromInternetSite (should be just “Document”)
  8. ElectronicSource (which is?)
  9. Film
  10. InternetSite
  11. Interview
  12. JournalArticle
  13. MagOrNewsArticle (how is this any different from ArticleInAPeriodical??)
  14. Misc (hints of a broken model)
  15. Patent
  16. Performance
  17. Report
  18. SoundRecording

Again, the list is limited and inconsistent, and it seems to be fixed rather than extensible.

Beyond the types, the fields are pretty much flat, so the only thing to do is list them:

  1. Author
  2. BookTitle
  3. Broadcaster
  4. BroadcastTitle
  5. CaseNumber
  6. ChapterNumber
  7. City
  8. Comments
  9. ConferenceName
  10. Country
  11. CountryRegion
  12. Court
  13. Day
  14. DayAccessed
  15. Department
  16. Distributor
  17. Edition
  18. Guid
  19. Institution
  20. InternetSiteTitle
  21. Issue
  22. JournalName
  23. LCID
  24. Medium
  25. Month
  26. MonthAccessed
  27. NumberVolumes
  28. Pages
  29. PatentNumber
  30. PeriodicalTitle
  31. PlacePublished
  32. ProductionCompany
  33. PublicationTitle
  34. Publisher

As before, because of the flat model, we have six different title properties for the same thing: a related title. The set of fields is fixed, yet their use is uncontrolled in the schema (the properties are just a blunt zero-or-more choice list).

In other words, on the one hand we have a relatively limited data model that does not reflect the kind of complexity and variability of real world citation data. On the other hand, it’s reflected in an incredibly loose schema that cannot be extended. The first problem is compounded by the second.
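To make the "six title properties for one thing" point concrete, here is a minimal Python sketch that folds the flat title fields into a single related-title slot. The field names come from the list above, but which six are meant is my guess (I count JournalName as the sixth). The `SourceType` key, the `container` key, and both function names are hypothetical and not part of the draft.

```python
from typing import Optional

# Flat "related title" fields taken from the field list above.
# Assumption: JournalName is the sixth one; the post does not say which six.
RELATED_TITLE_FIELDS = (
    "BookTitle",
    "BroadcastTitle",
    "InternetSiteTitle",
    "JournalName",
    "PeriodicalTitle",
    "PublicationTitle",
)


def related_title(source: dict) -> Optional[str]:
    """Return the first non-empty related title in a flat source record."""
    for field in RELATED_TITLE_FIELDS:
        value = source.get(field)
        if value:
            return value
    return None


def to_nested(source: dict) -> dict:
    """Copy a flat record, moving the related title onto a 'container' item.

    'container' is a hypothetical name for the related work (book, journal,
    site, ...) that the flat fields all describe.
    """
    nested = {k: v for k, v in source.items() if k not in RELATED_TITLE_FIELDS}
    title = related_title(source)
    if title is not None:
        nested["container"] = {"title": title}
    return nested
```

With a nested shape like this, a new kind of related work needs no new title field, which is the flexibility the flat list lacks.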

Finally, the one place where there is some more structure is contributors.

Awkwardness 1: the main element is Author, but it is in fact a far broader Contributor, since its children include:

  1. Artist
  2. Author
  3. BookAuthor (?? must be another awkward consequence of the flat model)
  4. Compiler
  5. Composer
  6. Conductor
  7. Counsel
  8. Director
  9. Editor
  10. Interviewee
  11. Interviewer
  12. Inventor
  13. Performer
  14. ProducerName
  15. Translator
  16. Writer

This role list is actually one of the few things I like about the schema, aside from the above-mentioned weirdness of paths like b:Author/b:Author.
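A short sketch of what the b:Author/b:Author path looks like in practice, using Python's standard `xml.etree.ElementTree`. The namespace URI is a placeholder, and the `b:Person`, `b:First`, and `b:Last` element names and the sample data are assumptions made for illustration; check the draft for the real structure.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI: an assumption for this sketch, not from the draft.
NS = {"b": "urn:example:bibliography"}

# Hypothetical sample; the b:Person/b:First/b:Last children are assumed.
SAMPLE = """
<b:Source xmlns:b="urn:example:bibliography">
  <b:Author>
    <b:Author>
      <b:Person><b:Last>Doe</b:Last><b:First>Jane</b:First></b:Person>
    </b:Author>
    <b:Editor>
      <b:Person><b:Last>Roe</b:Last><b:First>Richard</b:First></b:Person>
    </b:Editor>
  </b:Author>
</b:Source>
"""


def contributors_by_role(source: ET.Element) -> dict:
    """Map each role element under the outer b:Author to (last, first) pairs."""
    roles = {}
    wrapper = source.find("b:Author", NS)  # outer Author is really "Contributors"
    if wrapper is None:
        return roles
    for role in wrapper:
        role_name = role.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
        roles[role_name] = [
            (
                person.findtext("b:Last", default="", namespaces=NS),
                person.findtext("b:First", default="", namespaces=NS),
            )
            for person in role.findall("b:Person", NS)
        ]
    return roles


result = contributors_by_role(ET.fromstring(SAMPLE))
```

Note that reaching the actual authors means selecting `b:Author` twice, while the editors sit at `b:Author/b:Editor`.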

I find the contributor name model particularly surprising in a format that aspires to be an international standard. The first/middle/last name tradition is quite culturally specific, and I can only guess what Asian users will think about this, or Western users who need to deal with Asian sources. What’s even more frustrating is that it’s easy for them to fix.
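As a rough illustration of the problem, the Python sketch below contrasts a first/middle/last record with one possible alternative that stores family and given names plus an explicit display order. Both classes are hypothetical; neither comes from the draft, and this is not necessarily the fix I would expect them to adopt.

```python
from dataclasses import dataclass


@dataclass
class FlatName:
    """First/middle/last model; display order is baked into the field names."""
    first: str = ""
    middle: str = ""
    last: str = ""

    def display(self) -> str:
        return " ".join(p for p in (self.first, self.middle, self.last) if p)


@dataclass
class Name:
    """Hypothetical alternative: name parts by role, order stored separately."""
    family: str = ""
    given: str = ""
    family_first: bool = False  # True for names written family name first

    def display(self) -> str:
        if self.family_first:
            parts = (self.family, self.given)
        else:
            parts = (self.given, self.family)
        return " ".join(p for p in parts if p)


flat = FlatName(first="Zedong", last="Mao")
better = Name(family="Mao", given="Zedong", family_first=True)
```

The flat record can only render the name in Western order, whereas the second record keeps the same two parts and lets the source say how they should be ordered.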

Hopefully we’ll see some improvements in the next draft.

Creative Commons License