Date: | Fri, 18 Mar 2011 15:53:59 +0700 |
On Fri, Mar 18, 2011 at 5:27 AM, Tina Herman Buck <[log in to unmask]> wrote:
> Hi all,
> I'm looking for a way to find multiple editions of the same title in our
> catalog. Can MarcEdit do some kind of title or author/title comparison if I
> output a few thousand records at a time?
> Thanks,
> Tina
I don't know how MarcEdit can help with this beyond the
data-manipulation pieces, e.g. MARC-to-CSV conversion.
What I do is export the relevant fields to a CSV file (many ILSs will
do this directly) and bring it into a spreadsheet, then do various
sorts and scan through the records manually. I find I can get through
many thousands of records in a few hours with pretty good accuracy.
The critical factor is the consistency of your data - for collections
that haven't made proper use of Authorities, I've run across issues
with author first/last name order (search for commas and flip those),
and sometimes I've had to search/delete stopwords in the titles to get
them right. Side note - which makes me think, perhaps an export based
on your ILS's Z39.50 indexing would help with those kinds of issues?
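
As a rough illustration of the cleanup described above (flipping
"Last, First" author names and dropping stopwords from titles), here is
a minimal Python sketch. The function names and the stopword list are
my own assumptions, not anything MarcEdit or an ILS provides, and
headings that carry dates or other subfields after a second comma would
need extra handling.

```python
import re

# Hypothetical stopword list; adjust for your catalog's languages.
STOPWORDS = {"a", "an", "the", "of", "and"}


def normalize_author(name: str) -> str:
    """Flip 'Last, First' to 'first last', lowercase, collapse spaces."""
    name = name.strip()
    if "," in name:
        # Split on the first comma only; anything after it is treated
        # as the forename part.
        last, first = name.split(",", 1)
        name = f"{first.strip()} {last.strip()}"
    return " ".join(name.lower().split())


def normalize_title(title: str) -> str:
    """Lowercase, strip punctuation, and drop stopwords."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return " ".join(w for w in words if w not in STOPWORDS)
```

Sorting the spreadsheet on the normalized columns rather than the raw
ones should put more of the candidate duplicates next to each other.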
In order to do a better job than a human, an automated "data deduping"
program would need to have elements of AI (anyone know of anything
like this please speak up!), unless you were happy to only get hits on
exact matches - which Excel can automate for you anyway.
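
For the exact-match case, the same grouping can be done outside a
spreadsheet. The sketch below is only an illustration: the column names
record_id, author, and title are assumptions about what your export
contains, and it only catches rows whose author and title agree after
lowercasing and whitespace cleanup.

```python
import csv
import io
from collections import defaultdict


def find_exact_duplicates(csv_text, key_columns=("author", "title")):
    """Group record IDs whose key columns match exactly after
    lowercasing and collapsing whitespace."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = tuple(" ".join(row[col].lower().split()) for col in key_columns)
        # 'record_id' is a hypothetical column name for the ILS record number.
        groups[key].append(row["record_id"])
    # Keep only keys shared by more than one record.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

Anything this misses (variant spellings, subtitles, different
transliterations) is still left for the manual scan.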
But it's amazing how good the human eye/brain is at pattern
recognition. Just don't try to do it for more than a couple of hours
at a time...
________________________________________________________________________
This message comes to you via MARCEDIT-L, a Listserv(R) list for technical and instructional support in MarcEdit. If you wish to communicate directly with the list owners, write to [log in to unmask] To unsubscribe, send a message "SIGNOFF MARCEDIT-L" to [log in to unmask]