tech: November 2008 Archives

The Events Calendar on our website dates back to 2003, and the database behind it uses the ISO/IEC 8859-1 (Latin-1) encoding that was current back then. More and more departments are entering event information with full diacritics (such as "César Chávez") as support in operating systems and keyboard layouts gets better and easier to use. The calendar pages themselves also use the old ISO Latin character encoding, and since the only thing on those pages is the calendar itself, it has always worked just fine.

But we had recently transitioned the front page to UTF-8 to simplify editing text there. This led to an unfortunate case of mojibake when events with non-ASCII characters in their titles (bytes above 0x7F in Latin-1) hit our UTF-8 home page, as they do starting 7 days before they occur.
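To make the failure concrete, here is a minimal sketch of why the Latin-1 bytes break on a UTF-8 page. The variable names are illustrative, and it assumes PHP's mbstring extension is available:

```php
<?php
// "César" as stored in the Latin-1 database: é is the single byte 0xE9.
$latin1 = "C\xE9sar";

// The same name in UTF-8: é is the two-byte sequence 0xC3 0xA9.
$utf8 = "C\xC3\xA9sar";

// A UTF-8 page expects 0xE9 to begin a multi-byte sequence, but the next
// byte is a plain "s", so the Latin-1 string is not valid UTF-8.
var_dump(mb_check_encoding($latin1, 'UTF-8'));
var_dump(mb_check_encoding($utf8, 'UTF-8'));
```

The first check fails and the second passes, which is why the browser renders a placeholder or garbage character in place of the accented letter.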

I considered doing a mass conversion of the database to UTF-8, but the accounts on the web from people who have tried this are pretty hairy. A better solution seemed to be to keep everything (the underlying data and the calendar pages themselves) in legacy Latin-1, and to transform the strings in real time, via PHP, when they are pulled from the database for the home page. PHP turns out to have a useful function for this: utf8_encode, which converts an ISO-8859-1 string to UTF-8.

$uperson = utf8_encode($person);

This brings the character encoding into alignment with the rest of the page without creating gratuitous encoding headaches (and possible catastrophes) in our production event database.
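The same conversion can also be sketched with mb_convert_encoding, which names both encodings explicitly; this may be worth considering because utf8_encode is deprecated as of PHP 8.2. The helper name latin1_to_utf8 is hypothetical, and the sketch assumes the mbstring extension is installed:

```php
<?php
// Hypothetical helper: convert a Latin-1 (ISO-8859-1) string read from the
// events database into UTF-8 for display on the UTF-8 home page.
function latin1_to_utf8(string $text): string
{
    // Arguments are: input string, target encoding, source encoding.
    return mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');
}

// "César Chávez" as Latin-1 bytes: é is 0xE9 and á is 0xE1.
$person = "C\xE9sar Ch\xE1vez";
$uperson = latin1_to_utf8($person);

// Print the converted bytes in hex to inspect the result.
echo bin2hex($uperson), "\n";
```

Only strings that really are Latin-1 should go through the helper. Running it on text that is already UTF-8 would encode it a second time and produce a different kind of mojibake.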


Digital Humanities Conference

Here I am giving my poster session on TEI-based markup of runic inscriptions. My neighbors were the University of Alberta, UCLA and the NSF.


About this Archive

This page is an archive of entries in the tech category from November 2008.

