MARCEDIT-L Archives

November 2020

MARCEDIT-L@LISTSERV.GMU.EDU

From: Terry Reese <[log in to unmask]>
Reply To: MarcEdit support in technical and instructional matters <[log in to unmask]>
Date: Sun, 29 Nov 2020 22:38:51 -0500

I've taken a look at this. As I've noted, I have a handful of concerns about potentially corrupting records by allowing data that won't be validated, but I have made the modification in the 7.5 codebase, and I could potentially backport it.

Here's the challenge.  The tool currently allows you to take the 008 template and add mnemonics like {260$c} or {264$c}.  These values are validated, so that when the tool encounters data in these fields, it removes anything that isn't valid within the 008 date field.  What's more, if the data isn't present at all, those mnemonics still need to be removed, so they get replaced with an undefined date value.  This is the problem with other data fields.  Since I'm not sure which field is being replaced, it's difficult to validate and almost impossible to handle adding default values.
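
A minimal sketch of that date handling, in Python, purely for illustration. This is not MarcEdit's code: the function name `date_for_008`, the rule of keeping the first run of four digits, and the `uuuu` fallback are all assumptions made for the example.

```python
import re

def date_for_008(raw, undefined="uuuu"):
    """Return a four-character date for the 008, or a placeholder.

    `raw` is the text of a 260$c or 264$c, or None when the subfield
    is absent. The `undefined` default of "uuuu" is an assumption;
    the message does not say which value the tool writes.
    """
    if raw:
        # Keep the first run of four ASCII digits and drop brackets,
        # copyright marks, trailing punctuation and similar debris.
        match = re.search(r"[0-9]{4}", raw)
        if match:
            return match.group(0)
    # Subfield missing, or nothing date-like in it: the mnemonic must
    # still be replaced so the 008 keeps its fixed length.
    return undefined

print(date_for_008("c1999."))
print(date_for_008(None))
```

Either way the mnemonic is always replaced by exactly four characters, which is what keeps the 008 at its fixed length.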

So...what I would propose is the following.  The tool will still validate special data like the {260$c} and {264$c}, because there is regular expression work potentially necessary to clean that information.  But for other data fields, I'll allow the ability to construct replacements like this:
{041$a.und}.  You should be able to see what I'm doing here.  The program looks at the data in the 041$a and uses it (it will only validate for punctuation, removing non-word characters, since punctuation isn't valid in the 008).  However, if the 041$a isn't present, the data following the period will be used as the default value.  This way, the mnemonic can always be removed and the data in the 008 will continue to be valid (both in terms of the length of the field and the data in the field).  Even so, there is real potential for corruption to be introduced into a record this way.

This process will only work with mnemonics that name a subfield and define a default.  This means if you had the following:
* {001.und} (no subfield)
* {041$a} (no default)
* {040.OSU} (no subfield)

All of these would be ignored and the mnemonics would be left in the 008 (invalidating that field and the generated record).  So, if you use the option to utilize user-defined replacements, you will need to take care to provide the necessary default values.
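
The rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not MarcEdit's implementation: the regular expression, the `fill_template` name, and the dictionary used to hold field data are all hypothetical, and the sketch also assumes that a value left empty after cleaning falls back to the default.

```python
import re

# Hypothetical pattern for {TAG$SUBFIELD.DEFAULT}, e.g. {041$a.und}.
# A mnemonic without a subfield or without a default does not match.
MNEMONIC = re.compile(r"\{(\d{3})\$([a-z0-9])\.([^{}]+)\}")

def fill_template(template, fields):
    """Replace user-defined mnemonics in an 008 template.

    `fields` maps (tag, subfield) pairs to raw values; this layout is
    an assumption made for the example.
    """
    def substitute(match):
        tag, subfield, default = match.groups()
        raw = fields.get((tag, subfield))
        if raw is None:
            return default  # subfield absent: use the text after the period
        # Only punctuation is validated: strip non-word characters.
        cleaned = re.sub(r"\W", "", raw)
        # Assumption: an empty result also falls back to the default.
        return cleaned or default
    return MNEMONIC.sub(substitute, template)

print(fill_template("{041$a.und}", {("041", "a"): "eng."}))
print(fill_template("{041$a.und}", {}))
print(fill_template("{001.und} {041$a} {040.OSU}", {("041", "a"): "eng"}))
```

The third call returns its input unchanged, mirroring the three ignored mnemonics listed above. The sketch does not check that the substituted value has the right length for its 008 position; that is left to whoever chooses the source subfield and the default.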

--tr

-----Original Message-----
From: MarcEdit support in technical and instructional matters <[log in to unmask]> On Behalf Of Radim Chvála
Sent: Sunday, November 29, 2020 2:25 PM
To: [log in to unmask]
Subject: Re: [MARCEDIT-L] replace 008 field with content of another MARC files

Thanks for the quick answer, Terry.

So I ask for advice:
1) which positions in 008 can still be modified in this way?

2) I create MARC21 records from Excel, so I have to generate positions in field 008 from the content of other fields (e.g., language from 041). So please advise how to do it in batches (there are thousands of records).

Thank you very much.
Radim

________________________________________________________________________

This message comes to you via MARCEDIT-L, a Listserv(R) list for technical and instructional support in MarcEdit.  If you wish to communicate directly with the list owners, write to [log in to unmask] To unsubscribe, send a message "SIGNOFF MARCEDIT-L" to [log in to unmask]
