Yes -- I can add a search to Marc Spy.
--TR
-----Original Message-----
From: MarcEdit support in technical and instructional matters [mailto:[log in to unmask]] On Behalf Of Shelley Doljack
Sent: Friday, November 04, 2011 11:37 AM
To: [log in to unmask]
Subject: [MARCEDIT-L] Correcting character encoding issues
Hi Terry,
I found a whole bunch of character encoding problems in a set of vendor ebook MARC records. The LDR/09 value is "a" for UTF-8 but when I break the file (I do not select the checkboxes "translate to UTF-8/MARC-8"), the diacritic is not correct. For instance, where there should be an a umlaut, marcedit displays:
=100 1\$aKl{copy}?ger, Roland.
My font settings are set to Arial Unicode MS, so that's not the problem. And I'm pretty sure marcedit is not the problem either. I've used the MARC Spy tool to see the hex code points for the a umlaut and they are C3 3F. But C3 3F is not a valid UTF-8 sequence, I think. The a umlaut should be C3 A4 according to http://www.fileformat.info/info/unicode/char/e4/index.htm. When I change it to those values and save the file, I get the correct diacritic displaying when I re-break it.
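The byte values above can be checked outside MarcEdit. The following is a minimal Python sketch, not part of MarcEdit or MARC Spy; the function name `find_invalid_utf8` is hypothetical. It shows that C3 A4 decodes to the a umlaut, and it lists the offsets where a byte string stops being valid UTF-8.

```python
def find_invalid_utf8(data: bytes):
    """Return (offset, offending_bytes) for each place UTF-8 decoding fails.

    Hypothetical helper for illustration; it only reports problems and
    changes nothing in the data.
    """
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # Decode the remainder; success means nothing else is invalid.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice we decoded.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            pos = end  # resume scanning just past the bad byte(s)
    return problems


good = b"Kl\xc3\xa4ger, Roland."  # C3 A4 is the UTF-8 encoding of U+00E4
bad = b"Kl\xc3\x3fger, Roland."   # C3 followed by 3F, as seen in MARC Spy

print(good.decode("utf-8"))
print(find_invalid_utf8(good))
print(find_invalid_utf8(bad))
```

C3 is a lead byte that must be followed by a continuation byte in the range 80 to BF, and 3F (an ASCII question mark) is outside that range, so the scan flags the C3 at offset 2. The 3F probably stands in for a byte that was lost in an earlier conversion, so the intended character has to be worked out record by record rather than assumed to be A4 every time.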
I was wondering if you could add a search or find function to the MARC Spy tool so correcting the incorrect code points would be easier. Or maybe others on this list could recommend how they deal with correcting character encoding issues. I normally let the vendor know, but this particular vendor is supposedly not able to replicate the problem or they don't want to deal with it.
I tried attaching a zip and .mrc file of records with character encoding issues but the list rejected my email both times. If anybody wants a copy of the file, let me know and I'll send it to you directly.
Thanks,
Shelley
----
Shelley Doljack
E-Resources Metadata Librarian
Metadata and Library Systems
Stanford University Libraries
[log in to unmask]
650-725-0167
________________________________________________________________________
This message comes to you via MARCEDIT-L, a Listserv(R) list for technical and instructional support in MarcEdit. If you wish to communicate directly with the list owners, write to [log in to unmask] To unsubscribe, send a message "SIGNOFF MARCEDIT-L" to [log in to unmask]