MARCEDIT-L Archives

November 2013

MARCEDIT-L@LISTSERV.GMU.EDU

Subject:
From:
Benjamin A Abrahamse <[log in to unmask]>
Reply To:
MarcEdit support in technical and instructional matters <[log in to unmask]>
Date:
Thu, 14 Nov 2013 00:03:58 +0000
I'm sure someone could come up with a complex way to do this with a single find/replace, but I would do it in stages:

First, I'm going to assume that the World Bank URLs are the only 856s in the file. If that's not the case, I would use the Swap Fields tool to move only the World Bank URLs into a separate field.

Then I'd use the Edit Subfield utility, with the "Use regular expressions" check box on, to edit the content. Check and SAVE between each step, since you only get 1 level of undo. I'd do it in steps:

1. find: /
    repl: \t

This will replace all of the slashes with tabs.  (This step is not strictly necessary but it makes it easier to read the separate elements of the url.)


2.  We want to replace the term "content" with the term "doi":  

    find: content
    repl: doi

3. Then it's a question of inserting the DOI prefix between /book and the ISBN.  You could probably figure out a way to have the regex count tabs, but it's faster just to:

   find: book
   repl: book/10.1596

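4.  Now for the ISBNs. The first four digits, at least, are static, so you can find "9780" and replace with "978-0-". Probably more of the digits are static, since these all come from the same publisher.

In any case, once "9780" has become "978-0-", you can use a regular expression to capture the remaining nine digits in groups and write them back with hyphens in between. (Do the "9780" replacement first; otherwise the pattern below matches the first nine digits of the ISBN rather than the last nine.)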

   find:  (\d\d\d\d)(\d\d\d\d)(\d)
   repl: $1-$2-$3
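
Step 4 can be tried outside MarcEdit to check the pattern before running it on real records. This is a minimal Python sketch, not MarcEdit itself: the function name hyphenate_isbn_tail is made up for illustration, Python writes back-references as \1 where the replace box above uses $1, and it assumes every ISBN starts with 9780 and splits 4-4-1 after that, as in the example URL.

```python
import re


def hyphenate_isbn_tail(url: str) -> str:
    """Hypothetical helper mirroring step 4 of the staged approach."""
    # Static prefix: only the first "9780" is touched, in case the same
    # four digits happen to recur later in the ISBN.
    url = url.replace("9780", "978-0-", 1)
    # Remaining nine digits: publisher (4), title (4), check digit (1).
    # No other run of nine digits exists in the URL at this point.
    return re.sub(r"(\d{4})(\d{4})(\d)", r"\1-\2-\3", url)


print(hyphenate_isbn_tail(
    "http://elibrary.worldbank.org/doi/book/10.1596/9780821326886"
))
# prints http://elibrary.worldbank.org/doi/book/10.1596/978-0-8213-2688-6
```

The order matters here too: run against the bare 13-digit ISBN, the nine-digit pattern would grab 978082132 instead of 821326886.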

Lastly, if you did step 1 and replaced / with a tab, reverse it:

find: \t
repl: /

Then you should be done. 
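
For anyone who prefers the single find/replace mentioned at the top, the whole change can also be written as one pattern. The sketch below is Python rather than MarcEdit syntax; OLD_URL and rewrite_url are hypothetical names, and it assumes every URL has exactly the shape of the example (978, then 0, then a four-digit publisher prefix). URLs that don't fit that shape are returned unchanged, so they can be reviewed by hand.

```python
import re

# Assumed shape: /content/book/ followed by a 13-digit ISBN that starts
# with 9780 and continues with a four-digit publisher prefix.
OLD_URL = re.compile(
    r"http://elibrary\.worldbank\.org/content/book/"
    r"(978)(0)(\d{4})(\d{4})(\d)$"
)


def rewrite_url(url: str) -> str:
    """Hypothetical one-pass equivalent of steps 2 through 4."""
    return OLD_URL.sub(
        r"http://elibrary.worldbank.org/doi/book/10.1596/\1-\2-\3-\4-\5",
        url,
    )


print(rewrite_url("http://elibrary.worldbank.org/content/book/9780821326886"))
# prints http://elibrary.worldbank.org/doi/book/10.1596/978-0-8213-2688-6
```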

With all the time you've saved, write an email to the World Bank to ask them why they evidently hate catalogers so much.
________________________________________
From: MarcEdit support in technical and instructional matters [[log in to unmask]] on behalf of Bothmann, Robert L [[log in to unmask]]
Sent: Wednesday, November 13, 2013 6:14 PM
To: [log in to unmask]
Subject: [MARCEDIT-L] Reg. expression for adding hyphens

World Bank eLibrary has changed their URLs and I need to update ours that we have in the catalog.

We have this type of URL: http://elibrary.worldbank.org/content/book/9780821326886

And I need to change it into this: http://elibrary.worldbank.org/doi/book/10.1596/978-0-8213-2688-6

It’s a simple find and replace up to the 978, but thereafter I need a regular expression that can take the ISBN without hyphens and add them after the third, fourth, eighth, and twelfth digits.

Is there some simple magic for this?

Thank you most sincerely,
Bobby Bothmann

***********************************
Robert Bothmann
Metadata & Emerging Technologies Librarian
Professor, Library Services
Minnesota State University, Mankato
P.O. Box 8419, ML3097
Mankato, MN 56002
Voice: 507-389-2010
Fax: 507-389-5155
[log in to unmask]

________________________________________________________________________

This message comes to you via MARCEDIT-L, a Listserv(R) list for technical and instructional support in MarcEdit. If you wish to communicate directly with the list owners, write to [log in to unmask] To unsubscribe, send a message "SIGNOFF MARCEDIT-L" to [log in to unmask]
