Open XML, Draft 1.4
MS recently released a new draft of their Open XML format. This version includes more information about the citation and bibliographic support. Some notes …
First, citations. Really nothing to say except, yeah, they’ve documented what they’re doing a bit more (see the PDF, p. 1513), but no, they’ve not bothered to fix any of it. This isn’t really surprising I suppose, since they’re using generic fields to encode citations. But it still results in rather unfriendly XML.
Second, the bibliographic source format.
All the problems I noted earlier remain:
- the personal name model is Western—even U.S.—centric
- the model is (almost) totally flat and inflexible
Here is the list of types, with my own parenthetical comments:
- Art
- ArticleInAPeriodical (should be just “Article”)
- Book
- BookSection
- Case
- ConferenceProceedings
- DocumentFromInternetSite (should be just “Document”)
- ElectronicSource (which is?)
- Film
- InternetSite
- Interview
- JournalArticle
- MagOrNewsArticle (how is this any different from ArticleInAPeriodical?)
- Misc (hints of a broken model)
- Patent
- Performance
- Report
- SoundRecording
Again, the list is limited and inconsistent, and it seems to be closed to extension.
Beyond the types, the fields are pretty much flat, so the only thing to do is list them:
- Author
- BookTitle
- Broadcaster
- BroadcastTitle
- CaseNumber
- ChapterNumber
- City
- Comments
- ConferenceName
- Country
- CountryRegion
- Court
- Day
- DayAccessed
- Department
- Distributor
- Edition
- Guid
- Institution
- InternetSiteTitle
- Issue
- JournalName
- LCID
- Medium
- Month
- MonthAccessed
- NumberVolumes
- Pages
- PatentNumber
- PeriodicalTitle
- PlacePublished
- ProductionCompany
- PublicationTitle
- Publisher
As before, because of the flat model, we have six different title properties for the same thing: a related title. The set of fields is also fixed, yet the schema does nothing to control how they are used (the properties are just a blunt zero-or-more choice list).
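To make the related-title problem concrete, here is a rough Python sketch of what a consumer of the flat model ends up doing. It assumes the six title-like properties are BookTitle, BroadcastTitle, InternetSiteTitle, JournalName, PeriodicalTitle and PublicationTitle from the list above; the dictionary representation and the `container_title` helper are purely illustrative and not part of the draft.

```python
# Title-like properties taken from the field list above. Treating exactly
# these six as the "related title" variants is an assumption of this sketch.
RELATED_TITLE_FIELDS = (
    "BookTitle",
    "BroadcastTitle",
    "InternetSiteTitle",
    "JournalName",
    "PeriodicalTitle",
    "PublicationTitle",
)


def container_title(source):
    """Return the first non-empty related title in a flat record, or None."""
    for field in RELATED_TITLE_FIELDS:
        value = source.get(field)
        if value:
            return value
    return None


# Hypothetical flat records keyed by the draft's property names.
chapter = {"BookTitle": "Collected Essays"}
article = {"JournalName": "Journal of Examples"}

print(container_title(chapter))
print(container_title(article))
```

Every tool that reads the format has to carry a lookup table like this, whereas a model with a single related-title relation would need one lookup.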
In other words, on the one hand we have a relatively limited data model that does not reflect the kind of complexity and variability of real world citation data. On the other hand, it’s reflected in an incredibly loose schema that cannot be extended. The first problem is compounded by the second.
Finally, the one place where there is some more structure is contributors.
Awkwardness 1: the main element is called Author, but it is really a far broader Contributor, since its children include:
- Artist
- Author
- BookAuthor (?? must be another awkward consequence of the flat model)
- Compiler
- Composer
- Conductor
- Counsel
- Director
- Editor
- Interviewee
- Interviewer
- Inventor
- Performer
- ProducerName
- Translator
- Writer
This is actually one of the few things I like about the schema, aside from the above-mentioned weirdness of paths like b:Author/b:Author.
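For illustration, here is a small Python sketch of what reading contributors looks like with the doubled path. Only the b:Author/b:Author nesting comes from the notes above; the namespace URI and the NameList/Person/Last/First structure inside each role are assumptions on my part and may not match this draft exactly.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI; the draft's actual URI may differ.
NS = {"b": "http://schemas.openxmlformats.org/officeDocument/2006/bibliography"}

# Hypothetical source record. The NameList/Person/Last/First nesting is an
# assumption made for this sketch.
SAMPLE = """
<b:Source xmlns:b="http://schemas.openxmlformats.org/officeDocument/2006/bibliography">
  <b:Author>
    <b:Author>
      <b:NameList>
        <b:Person><b:Last>Doe</b:Last><b:First>Jane</b:First></b:Person>
      </b:NameList>
    </b:Author>
    <b:Editor>
      <b:NameList>
        <b:Person><b:Last>Smith</b:Last><b:First>John</b:First></b:Person>
      </b:NameList>
    </b:Editor>
  </b:Author>
</b:Source>
""".strip()


def contributors(source, role):
    """Return (last, first) pairs for one role under the outer b:Author wrapper."""
    people = source.findall(f"b:Author/b:{role}/b:NameList/b:Person", NS)
    return [
        (p.findtext("b:Last", "", NS), p.findtext("b:First", "", NS))
        for p in people
    ]


source = ET.fromstring(SAMPLE)
print(contributors(source, "Author"))  # authors live at b:Author/b:Author
print(contributors(source, "Editor"))  # editors live at b:Author/b:Editor
```

The role-per-element design itself is pleasant to query; the oddity is only that the wrapper and one of the roles share a name.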
I find the contributor name model particularly surprising in a format that aspires to be an international standard. The first/middle/last name tradition is quite culturally specific, and I can only guess what Asian users will think about this, or Western users who need to deal with Asian sources. What’s even more frustrating is that it would be easy for them to fix.
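As a sketch of why this matters, compare a first/middle/last record with one possible alternative that stores family and given names plus an explicit display order. Both classes are hypothetical illustrations in Python: `FamilyGiven` is not a proposal taken from the draft or from any existing schema, just one shape a fix could take.

```python
from dataclasses import dataclass


@dataclass
class FirstMiddleLast:
    """Name split into first/middle/last parts, as in the draft's model."""
    first: str = ""
    middle: str = ""
    last: str = ""

    def display(self):
        # With no ordering information, a formatter can only assume
        # Western order: first, middle, last.
        parts = (self.first, self.middle, self.last)
        return " ".join(part for part in parts if part)


@dataclass
class FamilyGiven:
    """Hypothetical alternative: name parts plus an explicit display order."""
    family: str
    given: str = ""
    family_first: bool = False

    def display(self):
        if self.family_first:
            parts = (self.family, self.given)
        else:
            parts = (self.given, self.family)
        return " ".join(part for part in parts if part)


print(FirstMiddleLast(first="Zedong", last="Mao").display())
print(FamilyGiven(family="Mao", given="Zedong", family_first=True).display())
```

The first record can only be rendered given-name-first, which is the wrong order for this name; the second carries enough information to render it correctly.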
Hopefully we’ll see some improvements in the next draft.