This is a bit random, but does anyone know how to convert the grs1 character set into utf8?
For various reasons, I'm hacking about with Zebra and grs1 output records. My actual scenario (if it matters) is that I have utf8 SOIF records, and get them out of Zebra using Zap. To do this sensibly, I've ended up using grs1 format retrieval into Zap.
The grs1 character set is really weird. I've found that a » character appears to become Â» when piped through grs1 (very much like looking at the utf8 character with iso-8859-1 eyes). I'd like to have a proper grs1 character -> HTML entities converter, but alas, I can't find anything about the character set on t'interweb.
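To check the "iso-8859-1 eyes" hunch, here's a minimal sketch using the core Encode module. It only shows that UTF-8 bytes read one byte per character as ISO-8859-1 give exactly this symptom. It doesn't prove that's what grs1 is doing, and the variable names are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The character from the example above: U+00BB.
my $original = "\x{BB}";

# Step 1: the UTF-8 encoding of U+00BB is the two bytes 0xC2 0xBB.
my $utf8_bytes = encode('UTF-8', $original);

# Step 2: read those bytes as ISO-8859-1, one character per byte.
# 0xC2 is "Â" and 0xBB is "»" in ISO-8859-1, hence the two-character result.
my $misread = decode('ISO-8859-1', $utf8_bytes);

# Print code points rather than raw characters, to avoid terminal encoding issues.
printf "bytes:   %s\n", join ' ', map { sprintf '%02X',   ord($_) } split //, $utf8_bytes;
printf "misread: %s\n", join ' ', map { sprintf 'U+%04X', ord($_) } split //, $misread;
```

This prints `C2 BB` for the bytes and `U+00C2 U+00BB` for the misread string, i.e. Â followed by », which matches what I'm seeing.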
I have found a hack which seems to work, although it will probably cause more problems than it solves. It's a Perl regular expression:
$data =~ s/..Â(\W)/$1/g;
...I dunno why it needs two characters before the Â - hex dumps of the data don't really show what's going on either. I'd really like to get this cleaned up and working properly (i.e. without some god-awful hack in place!).
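A less destructive alternative to the regex, assuming (and it is only an assumption) that the damage is a single layer of UTF-8 being read as ISO-8859-1: reverse the misreading, then turn anything non-ASCII into numeric HTML entities. Both function names below are made up for this sketch; only `encode`, `decode` and `FB_CROAK` come from Encode.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: undo one layer of "UTF-8 read as ISO-8859-1".
# Returns the input unchanged if it doesn't look like that kind of damage.
sub undo_latin1_misread {
    my ($text) = @_;
    # Characters above U+00FF can't have come from a byte-per-character misread.
    return $text if $text =~ /[^\x00-\xFF]/;
    # Turn each character back into the byte it was read from.
    my $bytes = encode('ISO-8859-1', $text);
    # Strict decode: croaks (caught by eval) if the bytes aren't valid UTF-8.
    my $fixed = eval { decode('UTF-8', $bytes, Encode::FB_CROAK()) };
    return defined $fixed ? $fixed : $text;
}

# Hypothetical helper: replace every non-ASCII character with a decimal
# numeric entity. It does not escape &, < or >.
sub to_html_entities {
    my ($text) = @_;
    (my $out = $text) =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $out;
}

# The damaged pair "Â»" (U+00C2 U+00BB) comes back as the entity for U+00BB.
print to_html_entities(undo_latin1_misread("\x{C2}\x{BB}")), "\n";
```

For the damaged pair this prints `&#187;`. Plain ASCII passes through untouched, and a lone » (which isn't valid UTF-8 as a single byte) is left as it is rather than mangled. If grs1 turns out to be doing something other than this, the repair step simply won't apply.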
UPDATE (30th Aug, 2004):
I've got a long way towards sorting this out. Certainly, it now works in the majority of cases, unlike the regex which really doesn't work at all.
I've written a Perl module to perform character conversions from GRS1 to Unicode (and back again). I can't guarantee it's a full implementation, or even that it'll work how you might expect, but it does seem to do what I need (at least). Do with it what you will.