Yeah -- as Tim noted, if you pulled these from Connexion, leave the values as they are. OCLC internally uses a longer 008 because it expands the date. I should hide that in the plugin so there's no confusion -- but that's what's going on there. If they are not from Connexion, then you'd need to drop the extra bits. If it were me, I'd use:
Find: (=008 ??)(.*)
Replace: =008 $2
Check the regular expressions option. This would deal with both 19xx and 20xx dates.
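
If you'd rather script this outside MarcEdit, here is a minimal Python sketch of the same idea; it is not anything MarcEdit itself runs. It assumes the file is in mnemonic (.mrk) form, that the 008 data follows the tag after one or more spaces, and that only 008s with exactly 42 characters of data need fixing (a valid 008 has 40). It also reads the ?? in the Find pattern above as "any two characters", written .. here. The names fix_008_line and fix_008_text are made up for illustration.

```python
import re

# Assumption: an affected line is "=008", spaces, then exactly 42 characters
# of data (two extra year digits followed by a normal 40-character 008).
LONG_008 = re.compile(r"^(=008 +)..(.{40})$")


def fix_008_line(line: str) -> str:
    """Drop the first two data characters of an over-long 008 line."""
    # Lines that are not 42-character 008s do not match and are returned as-is.
    return LONG_008.sub(r"\1\2", line)


def fix_008_text(text: str) -> str:
    """Apply fix_008_line to every line of a mnemonic-format file."""
    return "\n".join(fix_008_line(line) for line in text.splitlines())


if __name__ == "__main__":
    filler = "\\" * 29  # backslashes stand for blanks in mnemonic format
    print(fix_008_line("=008  20051110s1603" + filler))
```

Because the pattern only matches when the data is 42 characters long, 008s that are already the right length pass through untouched, so a sketch like this could be run over a file that mixes good and bad records.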
--TR
> -----Original Message-----
> From: MarcEdit support in technical and instructional matters
> [mailto:[log in to unmask]] On Behalf Of Timothy M. McCarthy
> Sent: Tuesday, February 08, 2011 5:34 PM
> To: [log in to unmask]
> Subject: Re: [MARCEDIT-L] 42-character 008s & regex?
>
> Jim,
>
> Are you saving this data back to Connexion? If so, leaving the 008
> should be fine.
>
> If you're saving the file external to Connexion, then the 008 needs
> changing.
>
> I've been able to replace "=008 20" with "=008 " with no problems in
> the past, but my conscience said OCLC should get the export credit, so
> now I highlight all records in the local save file and do a batch
> export to a file on your PC. This will cost your institution the OCLC
> export charge, but will convert the 008 to the correct format.
>
> If you've already done significant editing to the original file, you
> might be able to merge the new file's 008 into the first one using
> Terry's wonderful Merge tool.
>
> My $0.02
> Tim
>
> Timothy M. McCarthy
> University at Buffalo
> 716.645.8577
> [log in to unmask]
>
>
> On 2/8/2011 6:52 PM, Jim Kuhn wrote:
> > Hello:
> >
> >
> >
> > I recently pulled in some OCLC local save file records and it took some
> > weeks before we noticed that the Connexion bib file reader was
> > interpreting the year portion of Date Entered (positions 00-05) as yyyy
> > rather than yy. This pushed our 008 data off by two bytes. For example,
> > compare the first 008 below (which is fine) with the second (which
> > isn't):
> >
> >
> >
> > 090924 m 1577 1583 xx_ ad__ _ _ ____ _ 0 0 0 _ 0 _ eng _ d
> >
> > 200909 2 4m15 7715 83x x_ad _ _ ____ _ _ _ 0 _ 0 _ 0_e n g
> >
> >
> >
> > The thing is, the apparently-missing final two characters are actually
> > present. So I've been puzzling over how to delete just the 1st two
> > characters of each 008 in affected records. That is, I want to make 008s
> > that look like this in MarcEdit:
> >
> >
> >
> > =008 20051110s1603\\\\xx\n\n\\\\\\\\\\\nknger\d
> >
> >
> >
> > instead look like this:
> >
> >
> >
> > =008 051110s1603\\\\xx\n\n\\\\\\\\\\\nknger\d
> >
> >
> >
> > Here's what I've tried, despite knowing only enough about regular
> > expressions to be mildly dangerous:
> >
> >
> >
> > - MARCValidator reports these 008s as too long. Right.
> >
> > - A straight replace to change "=008 20" to "=008 " corrupts
> > the file. Although I successfully produced an uncorrupted MARC file by
> > manually modifying a single record, there are thousands of records
> > involved. Haven't yet tried my fallback method: write a macro to
> > just hop from 008 to 008 "manually" editing each in turn. Inelegant. And
> > maybe not effective? But I'll try it if I can't do this globally.
> >
> > - Edit subfield data, to find "([0-9]{2})(.*)" and replace with
> > "$2". Tried this using both "remove" and "replace", and by calling the
> > 008 position "00" or "02" or "00-05" or "05". No luck.
> >
> > - Then I thought I'd try regex on Find / Replace, or the Script
> > Wizard. But I don't do a heck of a lot with regex, and before I start in
> > on it I thought I'd check in here. Besides, it's not clear to me that 008s
> > are even editable in this way?
> >
> >
> >
> > Thanks in advance for any advice re regex in Replace, in Edit Subfield
> > Data, or in the Script Wizard. Or any other method for deleting just the
> > 1st two characters in invalidly-long 008s.
> >
> >
> >
> > Terry, if you'd like a closer look at how the Connexion plugin is
> > behaving with this data, I can send you our source bib.db.
> >
> >
> >
> > best,
> >
> > Jim Kuhn
> >
> >
> >
> > __________________________________
> >
> > Head of Collection Information Services
> >
> > Folger Shakespeare Library
> >
> > 201 E Capitol St SE
> >
> > Washington DC 20003-1004
> >
> > 202-675-0334
> >
> > [log in to unmask]
> >
> > www.folger.edu
> >
> >
> >
> >
> >
> _______________________________________________________________________
> _
> >
> > This message comes to you via MARCEDIT-L, a Listserv(R) list for
> technical and instructional support in MarcEdit. If you wish to
> communicate directly with the list owners, write to MARCEDIT-L-
> [log in to unmask] To unsubscribe, send a message "SIGNOFF
> MARCEDIT-L" to [log in to unmask]
> >