Here's what we do. It may not catch everything or work in all cases, but
we find it acceptable:
Check for non-standard formatting/punctuation of 300s by doing a regex
"find all" on:
=300 \\\\\$a((.*([^:]|[^ ]:)\$b)|([^$]*(ill|map|port))|(.*([^;]|[^ ];)\$c)|(.*\$b[^$]*(\d| )cm))
If no results are found:
- run the MarcEdit task "Eresourcify 300s" (task text file attached:
task_x-300e.txt)
If any results are found:
- run the MarcEdit task that cleans up common punctuation/formatting
problems (task file attached: task_x-300cl.txt)
- run the regex "find all" search again
-- if there are no results, run the "Eresourcify 300s" task
-- if there are results, clean up manually. If there are recurring
patterns in the problems, send examples to Kristina so she can write
them into the cleanup task
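
The check-then-run decision above can also be tried outside MarcEdit. The Python below is only a rough sketch, not part of the attached tasks: the names find_nonstandard_300s and next_step are made up for illustration, and the " +" after "=300" is an assumption so the pattern tolerates one or two spaces after the tag. The rest of the pattern is the find-all regex copied as is.

```python
import re

# Same check as the find-all regex above. The " +" after the tag is an
# assumption so that one or two spaces in the mnemonic line both match.
NONSTANDARD_300 = re.compile(
    r'=300 +\\\\\$a('
    r'(.*([^:]|[^ ]:)\$b)'        # $b not preceded by " :"
    r'|([^$]*(ill|map|port))'     # ill/map/port still inside $a
    r'|(.*([^;]|[^ ];)\$c)'       # $c not preceded by " ;"
    r'|(.*\$b[^$]*(\d| )cm)'      # dimensions left inside $b
    r')'
)

def find_nonstandard_300s(lines):
    """Return the lines whose 300 punctuation or subfielding looks off."""
    return [line for line in lines if NONSTANDARD_300.search(line)]

def next_step(lines):
    """Mirror the decision above: which task should be run next."""
    return "cleanup" if find_nonstandard_300s(lines) else "eresourcify"

if __name__ == "__main__":
    sample = ['=300  \\\\$aix, 341 p.:$bill.', '=300  \\\\$avi, 514 p.']
    print(find_nonstandard_300s(sample))
    print(next_step(sample))
```

If next_step returns "cleanup", run the cleanup task and check again before running "Eresourcify 300s".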
-=-=
To test the tasks, you will need to put them in the MarcEdit Application
Data directory, which will be located somewhere like:
C:\Users\{Your Windows User Name}\AppData\marcedit\macros
or
C:\Documents and Settings\{Your User Name}\Application Data\marcedit\macros
Then open the "_tasks.txt" file found in the Application Data directory
with a text editor and add two new lines pointing to the new tasks. You
should be able to copy an existing line, change the name of the task to
suit you (the first thing on the line), and edit the file name at the end
of the file path string.
Restart MarcEdit, and you should be able to try the tasks out.
best,
-=-
Kristina M. Spurgin
E-RESOURCES CATALOGER
E-Resources & Serials Management, Davis Library
University of North Carolina at Chapel Hill
CB#3938, Davis Library -- Chapel Hill, NC 27514-8890
919-962-2050 -- [log in to unmask]
On 9/13/2012 10:24 AM, Donley, Leah wrote:
> Hi Shelley,
>
>
>
> Thank you for taking the time to look at this. I would love to simplify my steps :) I tested your suggestion and while it works on the examples I mentioned, my file also contains records that are already in provider neutral format, and those ended up with an extra "1 online resource":
>
> \\$a1 online resource (1 online resource (xiv, 480 p.))
>
>
>
> I don’t understand how, but part of my second step seems to prevent that from occurring. The files I’m working on include almost every possible 300 field format so I'm trying to catch and address them all correctly in the simplest way possible. These examples are also not addressed by my procedure:
>
> \\$ap.$ccm.
>
> \\$ap. cm.
>
>
>
> Using your below steps, they change to:
>
> \\$a1 online resource (p.)$ccm.
>
> \\$a1 online resource (p. cm.)
>
>
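
The doubled "1 online resource" appears because a pattern like a(.+) matches every subfield a, including one that has already been converted. A negative lookahead, like the (?!1 online resource \() in the second step of the original procedure, skips those fields. Here is a small Python sketch of the difference. The wrap helper is made up for illustration and only imitates what the Edit Subfield replace does; it is not MarcEdit's own code.

```python
import re

# Naive: matches any subfield a, converted or not.
NAIVE = re.compile(r'\$a([^$]+)')
# Guarded: the negative lookahead refuses a subfield a that already
# starts with "1 online resource".
GUARDED = re.compile(r'\$a(?!1 online resource)([^$]+)')

def wrap(pattern, field):
    """Wrap the contents of subfield a in '1 online resource (...)'."""
    return pattern.sub(
        lambda m: '$a1 online resource (' + m.group(1) + ')', field
    )

if __name__ == "__main__":
    plain = '\\\\$axiv, 480 p.'
    converted = wrap(NAIVE, plain)
    print(converted)                  # first pass
    print(wrap(NAIVE, converted))     # second pass wraps again
    print(wrap(GUARDED, converted))   # guarded pass leaves it alone
```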
>
> I’m currently handling these examples by extracting the records (I sort using the “Select Individual Records to Make” tool to find the ugly 300 fields and replace them all with “1 online resource.”). If it’s not possible to catch these another way, I think going forward I will incorporate your steps at this point, because the result is preferable to a blanket replace with “1 online resource”, which may or may not be accurate. Ideally, I would love a procedure that perfectly addresses all of the variations I’m coming across, but I understand that may be wishful thinking!
>
>
>
> Thanks,
>
> Leah
>
>
>
>
>
> -----Original Message-----
> From: MarcEdit support in technical and instructional matters [mailto:[log in to unmask]] On Behalf Of Shelley Doljack
> Sent: Wednesday, September 12, 2012 1:34 PM
> To: [log in to unmask]
> Subject: Re: [MARCEDIT-L] Converting 300 field to provider neutral
>
>
>
> It seems to me that the steps you have are over-complicating it. This is what I would do (I tested it and it seems to work):
>
>
>
> Original 300 fields:
>
> =300 \\$avi, 514 p.
> =300 \\$av.
> =300 \\$aix, 341 p. :$bill.
> =300 \\$axviii, 263 p. :$bill. ; $e1 CD-ROM (4 3/4 in.)
> =300 \\$axiii, 357 p. :$bill. ;$c24 cm.
> =300 \\$ax, 399 p. :$bill. ;$c24 cm. +$e1 CD-ROM (4 3/4 in.)
>
>
>
> 1. Use the Edit Subfield tool:
> Field: 300
> Subfield: a
> Field Data: a(.+)
> Replace with: a1 online resource ($1)
> check regex
> click replace text button
>
>
>
> 300 fields after step 1:
>
> =300 \\$a1 online resource (vi, 514 p.)
> =300 \\$a1 online resource (v.)
> =300 \\$a1 online resource (ix, 341 p. :)$bill.
> =300 \\$a1 online resource (xviii, 263 p. :)$bill. ; $e1 CD-ROM (4 3/4 in.)
> =300 \\$a1 online resource (xiii, 357 p. :)$bill. ;$c24 cm.
> =300 \\$a1 online resource (x, 399 p. :)$bill. ;$c24 cm. +$e1 CD-ROM (4 3/4 in.)
>
>
>
> 2. Use the Find/Replace tool:
> Find: (=300.+)([\s]:\))
> Replace: $1) :
> check regex
> click replace all button
>
>
>
> 300 fields after step 2:
>
> =300 \\$a1 online resource (vi, 514 p.)
> =300 \\$a1 online resource (v.)
> =300 \\$a1 online resource (ix, 341 p.) :$bill.
> =300 \\$a1 online resource (xviii, 263 p.) :$bill. ; $e1 CD-ROM (4 3/4 in.)
> =300 \\$a1 online resource (xiii, 357 p.) :$bill. ;$c24 cm.
> =300 \\$a1 online resource (x, 399 p.) :$bill. ;$c24 cm. +$e1 CD-ROM (4 3/4 in.)
>
>
>
> 3. Remove $c's and $e's as you like.
>
>
>
>
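
For anyone who wants to try steps 1 and 2 above on a handful of sample lines before touching a real file, here is a rough Python imitation. It is a sketch under assumptions: the function names are made up, and step 1 only approximates the Edit Subfield tool by treating everything up to the next "$" as the contents of subfield a.

```python
import re

def step1_wrap_subfield_a(field):
    """Imitate: Field Data a(.+)  ->  a1 online resource ($1)."""
    return re.sub(
        r'\$a([^$]+)',
        lambda m: '$a1 online resource (' + m.group(1) + ')',
        field,
        count=1,
    )

def step2_move_colon(field):
    """Imitate: Find (=300.+)([\\s]:\\))  ->  Replace $1) :"""
    return re.sub(r'(=300.+)\s:\)', r'\1) :', field)

def to_provider_neutral(field):
    """Run both steps in order on one mnemonic 300 line."""
    return step2_move_colon(step1_wrap_subfield_a(field))

if __name__ == "__main__":
    for line in ['=300  \\\\$avi, 514 p.',
                 '=300  \\\\$aix, 341 p. :$bill.']:
        print(to_provider_neutral(line))
```

Note that this sketch has no guard against fields that already start with "1 online resource", so running it twice wraps them twice.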
>
> Regards,
>
> Shelley
>
>
>
> ----- Original Message -----
>> From: "Leah Donley" <[log in to unmask]>
>> To: [log in to unmask]
>> Sent: Wednesday, September 12, 2012 6:43:08 AM
>> Subject: Converting 300 field to provider neutral
>
>>
>
>>
>
>>
>
>>
>
>> Good morning,
>>
>> I’ve come across another 300 field “format” in ebook records. I’d
>> like to find a way to change these to provider-neutral format, but
>> again, my procedure (copied below) is not picking these instances
>> up. A few examples are below. Any suggestions would be appreciated –
>> whether it’s modifying the regex in a step or adding another
>> step(s).
>
>>
>
>>
>
>>
>
>> Example 300 fields:
>>
>> \\$aviii, 248 p.
>> \\$axiv, 480 p.
>> \\$av.
>
>>
>
>>
>
>>
>
>>
>
>>
>
>> 1) Find any 300 $a that ends without a semicolon or colon, and
>> replace it with a 300 $a that ends with a semicolon:
>>
>> FIND: =300\s\s\\\\\$a(\d)\sv.$
>> REPLACE: =300 \\$a$1 v. ;
>> USE REGULAR EXPRESSIONS checked
>>
>> 2) This step inserts “1 online resource” and moves the original
>> contents of subfield a into the parentheses:
>>
>> Find: (?<one>=300.*\$a)(?!1 online resource \()(.*?)( [;:].*)
>> Replace: ${one}1 online resource ($1)$2
>>
>> 3) This removes any subfield c’s:
>>
>> Find: (=300.*)( ;\$c.*)
>> Replace: $1
>>
>> 4) Remove semicolons from the ends of any subfield b’s:
>>
>> Field: 300
>> Subfield: b
>> Field data: b(.+)\s;
>> Replace with: b$1
>> Click "Replace text"
>>
>> 5) Remove the “trick” semicolon you inserted into subfield a in the
>> first step:
>>
>> Field: 300
>> Subfield: a
>> Field data: a(.+)\s;
>> Replace with: a$1
>> Click "Replace text"
>>
>> 6) My final step is to add 300 fields to the records that don’t have
>> them, using the add/delete field function.
>
>>
>
>>
>
>>
>
>> Thank you,
>> Leah Donley
>>
>> Leah Donley
>> Information Specialist
>> Brookhaven National Laboratory
>> Email: [log in to unmask]
>
>>
>
>>
>
>>
>
>> ________________________________________________________________________
>>
>> This message comes to you via MARCEDIT-L, a Listserv(R) list for
>> technical and instructional support in MarcEdit. If you wish to
>> communicate directly with the list owners, write to
>> [log in to unmask]. To unsubscribe, send a message
>> "SIGNOFF MARCEDIT-L" to [log in to unmask].
>
>
>
> --
>
> Shelley Doljack
>
> E-Resources Metadata Librarian
>
> Metadata and Library Systems
>
> Stanford University Libraries
>
> [log in to unmask]
>
> 650-725-0167
>
>
>
>
________________________________________________________________________
REPLACE ^(?<a>=300.*)[:,]\$c ${a};$$c 2
REPLACE ^(?<a>=300.*[:;])[/p]\$ ${a}$$ 2
REPLACE ^(?<a>=300.*;) *(?<b>[0-9]) ${a}$$c${b} 2
REPLACE ^(?<a>=300.*) ?: ?bill ${a} :$$bill 2
REPLACE ^(?<a>=300.*) ?; ?c ? ${a} ;$$c 2
REPLACE ^(?<a>=300.*)[^;]\$c; ${a};$$c 2
REPLACE ^(?<a>=300.*;) +\$c ${a}$$c 2
REPLACE ^(?<a>=300.*:) +\$b ${a}$$b 2
REPLACE ^(?<a>=300.*[^ ]): ?\$b ${a} :$$b 2
REPLACE ^(?<a>=300.*)\. ?\$c ${a}. ;$$c 2
REPLACE ^(?<a>=300.*)\. ?\$b ${a}. :$$b 2
REPLACE ^(?<a>=300.*[^ ]);\$c ${a} ;$$c 2
REPLACE ^(?<a>=300.*);\$b ${a}:$$b 2
REPLACE ^(?<a>=300.*[^ ]) ?[;:]\$b *(?<b>[0-9]+ cm\.?) ${a} ;$$c${b} 2
REPLACE ^(?<a>=300.*p\.)?,? ?:? ?(?<b>(ill|port|map)) ${a} :$$b${b} 2
REPLACE ^=300 \\\\\$ap\.? ?[;:]?\$?c.* =300 \\$$a1 online resource. 2
REPLACE ^=300 \\\\\$a(?!1 online resource)(?<a>.*?)(?<b>$| [:;]) =300 \\$$a1 online resource (${a})${b} 2
REPLACE ^(?<a>=300.*) ;\$c.*$ ${a} 2
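
To see what the last three REPLACE lines above do to the examples from this thread, here is a rough Python port. It is only a sketch under assumptions: it ignores the trailing "2" column, it allows one or two spaces after "=300", and it keeps the leading "=300 \\$a" by capturing it in a group called tag (my name, not in the task file). In the .NET-style replacement strings, ${a} is the named group and $$ is a literal dollar sign; the Python version builds the replacement in a function instead.

```python
import re

# (pattern, replacement function) pairs, applied in the order listed.
RULES = [
    # "p. cm." / "p.$ccm." style extents carry no real data: replace outright.
    (re.compile(r'^(?P<tag>=300 +\\\\\$a)p\.? ?[;:]?\$?c.*'),
     lambda m: m.group('tag') + '1 online resource.'),
    # Wrap the existing extent in parentheses unless already converted.
    (re.compile(r'^(?P<tag>=300 +\\\\\$a)(?!1 online resource)'
                r'(?P<a>.*?)(?P<b>$| [:;])'),
     lambda m: f"{m.group('tag')}1 online resource "
               f"({m.group('a')}){m.group('b')}"),
    # Drop " ;$c..." and everything after it.
    (re.compile(r'^(?P<a>=300.*) ;\$c.*$'),
     lambda m: m.group('a')),
]

def eresourcify(line):
    """Apply the three rules in order to one mnemonic 300 line."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line

if __name__ == "__main__":
    for sample in ['=300  \\\\$aix, 341 p. :$bill. ;$c24 cm.',
                   '=300  \\\\$ap. cm.',
                   '=300  \\\\$a1 online resource (xiv, 480 p.)']:
        print(eresourcify(sample))
```

The order matters: the "p. cm." rule has to run before the wrapping rule, otherwise those fields would come out as "1 online resource (p. cm.)".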