OpenDocument and RDF: Storing What Metadata Where?
Posted in Uncategorized on October 30th, 2005 by darcusb
Earlier I discussed one way I envision expanding metadata support in OpenDocument. That consisted of using some RELAX NG magic to constrain the structure of the metadata representation while still allowing it to be easily extended.
But this leaves two obvious questions: what sort of document objects might one want to add custom metadata to, and where might one store that metadata?
On the first question: the most obvious content would be the things that are summarized in lists of figures, captions, citations and so forth. Relevant metadata may include source information, including not only titles and creators and such, but also rights information. All of this metadata can lead not only to smarter documents that can be more easily searched, but also better user experiences. Imagine, for example, not just turbo-charged citation support, but also automatic figure captioning, including publisher-required rights information. Or perhaps information about where to access a data set summarized in a table.
On the question of where to store the metadata, since OD files are just zipped archives, the obvious place is in dedicated files in the file wrapper. Indeed, document metadata is already stored this way, in a “meta.xml” file.
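Because the package is an ordinary zip archive, reading the existing metadata takes nothing beyond a standard library. Here is a minimal sketch in Python. It builds a tiny stand-in archive in memory rather than assuming a real file on disk; the helper names (`build_sample_package`, `read_meta_title`) and the sample title are made up for illustration, and a real package contains more entries than the two shown.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used in OpenDocument's meta.xml.
NS = {
    "office": "urn:oasis:names:tc:opendocument:xmlns:office:1.0",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Illustrative stand-in for the meta.xml found in a real package.
SAMPLE_META = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<office:document-meta "
    'xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<office:meta><dc:title>Sample Title</dc:title></office:meta>"
    "</office:document-meta>"
)


def build_sample_package() -> bytes:
    """Zip a stand-in meta.xml and content.xml into an in-memory archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("meta.xml", SAMPLE_META)
        zf.writestr("content.xml", "<placeholder/>")  # not real content
    return buf.getvalue()


def read_meta_title(package: bytes):
    """Return the dc:title stored in the package's meta.xml, or None."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        root = ET.fromstring(zf.read("meta.xml"))
    node = root.find("office:meta/dc:title", NS)
    return node.text if node is not None else None


package = build_sample_package()
print(read_meta_title(package))
```

The same `zipfile` calls would work on a document saved to disk by passing its path instead of the in-memory buffer.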
Here I see two possibilities:
- Retain the single “meta.xml” file, and create elements to wrap the requisite metadata: meta:Document, meta:Figures, meta:Bibliography, etc.
- Create separate files for each kind of metadata: meta-document.xml, meta-figures.xml, meta-bibliography.xml.
I tend to favor the second approach myself.
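To make the two options concrete, here is a rough Python sketch that writes each layout into an in-memory archive. The wrapper names and file names come from the list above; everything else is assumed for illustration, including the placeholder namespace URI `urn:example:meta`, the `metadata` root element in the single-file case, and the function names. None of this is a proposed format.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Placeholder namespace, NOT a real OpenDocument namespace.
META_NS = "urn:example:meta"
ET.register_namespace("meta", META_NS)

KINDS = ["Document", "Figures", "Bibliography"]


def wrapper(kind: str) -> ET.Element:
    """Empty wrapper element such as meta:Figures; real metadata would go inside."""
    return ET.Element(f"{{{META_NS}}}{kind}")


def single_file_layout() -> dict:
    """Option 1: one meta.xml holding a wrapper element per kind of metadata."""
    root = ET.Element(f"{{{META_NS}}}metadata")  # hypothetical root element
    for kind in KINDS:
        root.append(wrapper(kind))
    return {"meta.xml": ET.tostring(root, encoding="unicode")}


def separate_files_layout() -> dict:
    """Option 2: one file per kind, e.g. meta-figures.xml."""
    return {
        f"meta-{kind.lower()}.xml": ET.tostring(wrapper(kind), encoding="unicode")
        for kind in KINDS
    }


def write_package(files: dict) -> bytes:
    """Zip the given name -> XML mapping into an in-memory archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, xml in files.items():
            zf.writestr(name, xml)
    return buf.getvalue()


# List the archive entries each layout produces.
for layout in (single_file_layout, separate_files_layout):
    with zipfile.ZipFile(io.BytesIO(write_package(layout()))) as zf:
        print(layout.__name__, zf.namelist())
```

One practical difference the sketch makes visible: with separate files, a tool that only cares about the bibliography can read or replace `meta-bibliography.xml` without parsing anything else in the package.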
Incidentally, when Daniel Vogelheim and I wrote the proposal to improve citation coding in OpenDocument last year (which was based on previous work with DocBook, and was approved by the OD TC), we always had something like this model in mind: moving the bibliographic metadata out of the main content.xml file and into its own file. Indeed, the citation proposal is virtually meaningless unless the standard at least specifies that bibliographic metadata is stored outside the content file, even if it does not formalize the format itself for interoperability purposes.
The current RDF discussion simply offers the opportunity to do this in a comprehensive and consistent way. Let’s hope the TC is far-sighted in its deliberations on this matter.

