Ruby sort_by and citeproc-rb
There’s a new Ruby weblog over at O’Reilly. This post on sorting arrays reminds me how to sort an array of references by creator name, then year:
refs.sort_by { |ref| [ref[:creator], ref[:year]] }.each do |ref|
  puts "#{ref[:creator]}, #{ref[:title]}, (#{ref[:year]})"
end
This has me wondering again how much easier it’d be to rewrite CiteProc in Ruby (or Python). Consider how complicated this XSLT code is, for example. Its task is for the most part to sort a reference list by author-year, to track when there is more than one reference in an author group (because many styles replace duplicate author listings with em-dashes), and to assign proper suffixes to duplicate author-years (to get Doe, 2001c).
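As a rough sketch of those three tasks in Ruby, something like the following might do. It is not a port of the XSLT: the method name `format_bibliography`, the sample data, and the `---` placeholder standing in for the dash are all assumptions made for illustration, and the hash keys simply mirror the snippet above.

```ruby
# Sketch only: sorts references by author-year, replaces repeated
# authors with a placeholder, and suffixes duplicate author-years.
# format_bibliography and REPEAT_MARK are hypothetical names.
REPEAT_MARK = "---" # stands in for the dash many styles use

def format_bibliography(refs)
  # Title is added as a third key so suffix order is deterministic.
  sorted = refs.sort_by { |ref| [ref[:creator], ref[:year], ref[:title]] }

  # Count references per author-year to find which ones need a suffix.
  year_counts = Hash.new(0)
  sorted.each { |ref| year_counts[[ref[:creator], ref[:year]]] += 1 }

  emitted = Hash.new(0) # how many of each author-year seen so far
  previous_creator = nil

  sorted.map do |ref|
    key = [ref[:creator], ref[:year]]
    # Only a..z is handled here; more than 26 duplicates is not covered.
    suffix = year_counts[key] > 1 ? ("a".ord + emitted[key]).chr : ""
    emitted[key] += 1

    name = ref[:creator] == previous_creator ? REPEAT_MARK : ref[:creator]
    previous_creator = ref[:creator]

    "#{name}, #{ref[:title]}, (#{ref[:year]}#{suffix})"
  end
end

# Made-up sample data.
sample_refs = [
  { creator: "Doe",   title: "Third",   year: 2001 },
  { creator: "Smith", title: "Alpha",   year: 1999 },
  { creator: "Doe",   title: "First",   year: 2001 },
  { creator: "Doe",   title: "Earlier", year: 1998 }
]

format_bibliography(sample_refs).each { |line| puts line }
```

A real citation style would need much more (multiple authors, name ordering, per-style punctuation), but the grouping and suffix bookkeeping itself comes down to a couple of hashes.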
My hunch is this is much easier to do with Ruby or Python, neither of which I yet have any skill with.
One caveat: a Java-based XSLT processor like Saxon does a good job handling Unicode sorting, which can be critical in bibliographic formatting. I am not sure how well Ruby handles this. If the array data above includes non-ASCII Unicode characters, they do not seem to get sorted correctly.
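To make the problem concrete: Ruby's default string comparison works on the underlying bytes, so in UTF-8 an accented capital sorts after every plain ASCII letter. The sketch below shows that, plus one crude workaround. It assumes a reasonably modern Ruby (2.2 or later, for `String#unicode_normalize`), and `rough_sort_key` is a hypothetical helper, not proper locale-aware collation of the kind Saxon offers.

```ruby
# Made-up names; the escapes are precomposed É (U+00C9) and Å (U+00C5).
names = ["Zola", "\u00C9luard", "Eco", "\u00C5kesson", "Adams"]

# Default sort compares bytes, so the accented names land at the end.
bytewise = names.sort

# Hypothetical helper: decompose to NFD, drop combining marks, and
# lowercase. This approximates "ignore accents and case" only; it is
# not the Unicode Collation Algorithm and ignores language rules.
def rough_sort_key(name)
  name.unicode_normalize(:nfd).gsub(/\p{Mn}/, "").downcase
end

# The original string is a tie-breaker for names with equal keys.
accent_folded = names.sort_by { |name| [rough_sort_key(name), name] }

puts bytewise.inspect
puts accent_folded.inspect
```

Whether accent folding is acceptable depends on the language: in Swedish, for instance, Å is a separate letter sorted near the end of the alphabet, which is exactly the kind of rule a real collation library handles and this sketch does not.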
Ultimately, I think for the OpenOffice bibliographic project we’re going to have to rewrite CiteProc in C++ anyway, so at some point we’re going to need to figure out how to port it to a non-XSLT language.