MARCEDIT-L Archives

October 2021

MARCEDIT-L@LISTSERV.GMU.EDU

Subject:
From:
Terry Reese <[log in to unmask]>
Reply To:
MarcEdit support in technical and instructional matters <[log in to unmask]>
Date:
Wed, 13 Oct 2021 16:50:11 -0400
Content-Type:
text/plain
Parts/Attachments:
text/plain (355 lines)
If you post the file -- I can take a look.  I may also ask you to check or provide the rules file as these get changed periodically to address changes in how LC processes the data.

--tr

-----Original Message-----
From: MarcEdit support in technical and instructional matters <[log in to unmask]> On Behalf Of Jennings, Catherine
Sent: Wednesday, October 13, 2021 3:03 PM
To: [log in to unmask]
Subject: Re: [MARCEDIT-L] MARCEDIT-L Digest - 11 Oct 2021 to 12 Oct 2021 (#2021-214)

Hi,
I process 2000 records at a time and over 735 came back as invalid. I would say close to 90% are valid. I did notice that it takes a really long time to process, and I am wondering if this is a timeout issue.
I've been running these reports for a long time and have never had this problem before. Anyone willing to run my file through the validation (file contains authors only) processing as a test?
The only change I made was that I updated MarcEdit - but that could just be a coincidence.

Thanks!

Catherine

Catherine Jennings
Technical Services Manager
Senior Cataloging Librarian
Oakland Public Library
510-238-7323

Working:
Monday 10:00-6:00
Tuesday – Friday 8:00-4:00


-----Original Message-----
From: MarcEdit support in technical and instructional matters <[log in to unmask]> On Behalf Of MARCEDIT-L automatic digest system
Sent: Tuesday, October 12, 2021 9:01 PM
To: [log in to unmask]
Subject: MARCEDIT-L Digest - 11 Oct 2021 to 12 Oct 2021 (#2021-214)

----------------------------------------------------------------------
There are 4 messages totalling 1217 lines in this issue.

Topics of the day:

  1. Regex to parse call numbers
  2. Problem subfields in 245 and 300
  3. Validate Headings report
  4. Conditional find/replace stumper

________________________________________________________________________

This message comes to you via MARCEDIT-L, a Listserv(R) list for technical and instructional support in MarcEdit.  If you wish to communicate directly with the list owners, write to [log in to unmask] To unsubscribe, send a message "SIGNOFF MARCEDIT-L" to [log in to unmask]

----------------------------------------------------------------------

Date:    Tue, 12 Oct 2021 11:41:12 +0000
From:    Pamela Swaidner <[log in to unmask]>
Subject: Regex to parse call numbers

Greetings!

We need a regex to parse our local call numbers into different subfields. We've had good luck with Find and Replace for things like Fiction and DVDs, but we get stuck when trying to figure out how to parse data that comes after the call number or cutter.

The top line is the current call number, and the second line reflects what is needed:

=099  \\ $a917.9 AND 2020
=099  \\ $a917.9$bAND$v2020

=099 \\ $aB Earhart, Amelia JAM
=099 \\ $aB Earhart, Amelia$bJAM

=099 \\ $aSpanish 636.7 ABC
=099 \\ $pSpanish$a636.7$bABC

Any help is greatly appreciated!
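
One possible starting point for the three samples above, sketched in Python rather than in MarcEdit's Find/Replace dialog. The pattern and the function name `split_call_number` are assumptions inferred only from those three call numbers (a leading word before a number goes to $p, 2-3 capital letters form the $b, a trailing 4-digit year goes to $v); real data will almost certainly need more cases.

```python
import re

# Hypothetical pattern, inferred only from the three sample call numbers:
#   optional word directly before a number -> $p
#   classification text                    -> $a
#   2-3 capital letters                    -> $b
#   optional trailing 4-digit year         -> $v
CALL_NUMBER = re.compile(
    r"^(?:(?P<p>[A-Za-z]+) (?=\d))?"
    r"(?P<a>.+?) "
    r"(?P<b>[A-Z]{2,3})"
    r"(?: (?P<v>\d{4}))?$"
)


def split_call_number(text):
    """Return the call number as a subfield string, or None if it does not fit."""
    m = CALL_NUMBER.match(text.strip())
    if m is None:
        return None
    parts = []
    if m.group("p"):
        parts.append("$p" + m.group("p"))
    parts.append("$a" + m.group("a"))
    parts.append("$b" + m.group("b"))
    if m.group("v"):
        parts.append("$v" + m.group("v"))
    return "".join(parts)


for sample in ("917.9 AND 2020", "B Earhart, Amelia JAM", "Spanish 636.7 ABC"):
    print(split_call_number(sample))
```

Call numbers that do not fit the pattern come back as None, so they can be set aside for manual review instead of being rewritten incorrectly.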


Pam Swaidner
Manager, Cataloging and Metadata
The Indianapolis Public Library
https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1U1A5HT-2DIK9b2nuZAz7oZuzqDdTPo2wIbCPgm83SedLbGn5rVZkyFMa3Y4GMwVv3ziymhvr-2DbJPgNZysmBz7-2DreGKTvnTYcSKOKtzbccYTQDDYEjZi5t94ObGZMBRmGFi0xkTDECDu4qfiyne0X5P7-5FOgdOarl8BSzaDaE7ZbodRh36jt2uEq-5Fj67DrM8BKrWEwIWY2dthon1Dnx348UYchcIzJ8n3TdVsdLzLfFwkY2d89-2Dyul3s8TNfEITvPM6WcqtvvPF9l1WzELDUk5KYj4T9oG4q2yCj6E4u3dYe0kNue-2DPE0AhnCChKRPEOqHwmPc0B98lVhh9nvPFAtrdvmiTY95-2DfhTz3gQ-5F4Uq7Z6TzIun90CCEdMMa2VnpmIHoz8j15lxS49vB7-2D-5FwBNIZVkaX0d1hqJk7nzDZJUb4-5FgnBLO7A4e1p5DEvy4gUydZmv_http-253A-252F-252Fwww.indypl.org&d=DwIFaQ&c=6ZboKdJzR8nZOqwBjhPnCw&r=HQvKUTT3P1eSvp-R7AvU3xm1zLVDJkEd0-jgD5rHP68&m=DhlhK04pcoc23Cv0_YftqkSj2HqM-AwixrioecN3JZs&s=rKaPeJBaeuBzXDRMs2oSn5YZufqaLx7LTZ_w6bOxlYg&e= 

------------------------------

Date:    Tue, 12 Oct 2021 18:38:59 +0000
From:    "Haley, Kathleen" <[log in to unmask]>
Subject: Re: Problem subfields in 245 and 300

If all the 008 fields you need to change follow this pattern: 20 characters, then the unwanted "nn", then 13 characters, then the "n" you want to flip to "v", then a final 6 characters, then you should be able to do this, again using Find/Replace with regular expressions:

Find:  (=008.{22})(nn)(.{13})(n)(.{6})

Replace: $1$3v$5

The .{22} syntax looks for 22 instances of any character—and you need 22 in the first group to account for the two blank spaces after "=008". Since the grand total is longer than a standard 008, the normal-length 008 fields should not be affected.
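
The same expression can be checked outside MarcEdit before running it on the whole file. This is a small Python sketch using the sample 008 from the question; note that Python writes group references as `\1` (or `\g<1>`) where the Find/Replace above uses `$1`, and the helper name `fix_008` is made up for illustration.

```python
import re

# Sample field from the thread: 42 data characters where 40 are expected.
bad = r"=008  180723p20162012nyu\\nn\\\\\\\\o\\\\n|eng\d"

pattern = re.compile(r"(=008.{22})(nn)(.{13})(n)(.{6})")


def fix_008(line):
    # Keep groups 1, 3 and 5; drop the "nn"; write "v" in place of the lone "n".
    return pattern.sub(r"\g<1>\g<3>v\g<5>", line)


fixed = fix_008(bad)
print(len(bad) - 6, "->", len(fixed) - 6)  # data length without the "=008  " prefix
print(fixed)
```

A field that already has 40 characters is too short for the pattern to match, so it passes through unchanged.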

Kathleen

Kathleen M. Haley
Information Systems Librarian
American Antiquarian Society
185 Salisbury St.
Worcester, MA 01609

e-mail: [log in to unmask]<mailto:[log in to unmask]>
AAS website: American Antiquarian Society<https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1lUBvy2-2DaOVPGLBaMG08R8e-5FUhiYgRMHt6VFzDQceEIfTBKwmykm7ZOIdJx4vkOF337LweM6f-5FewUFTWZmmwDXxe8hq5qTjuWlmJ1g571Mh-5FDYtRR957WY1M9kMNiVEz49feeuws2YcagfkhIfiLddxo7-2D9jRssXUMQUJOr4zm9BZJXwGcGwu1tWBgxR6-5FALqPvPM9OqbHXIQoKaI8MlKoEMI-5FHD0Upaa9CAhFxSyoTyDJKnWQxRuGE4PGkmzaoiVSFE-5FyXoeyC9BGBBJtzdDK9lzs7Qs-5FdcRX3NClV5ym11h3ASY9rTRJHYU0uX0SiqU13pl4MsiTBCcFjyzRk2z7Xf70iNCMV6Ajf4taCk0SeBoOYzBVoCQZaPb1CUqhsamNtd0X7cchozJewOtDF2o3RUkU2fjcjcxW2IgnS2GtzltBQOfCN3cUwLQpBkbm2hh-2DIHmz7C-5FVSy4fBaV7lPRWg_http-253A-252F-252Fwww.americanantiquarian.org-252F&d=DwIFaQ&c=6ZboKdJzR8nZOqwBjhPnCw&r=HQvKUTT3P1eSvp-R7AvU3xm1zLVDJkEd0-jgD5rHP68&m=DhlhK04pcoc23Cv0_YftqkSj2HqM-AwixrioecN3JZs&s=BE9vs_PYUzHCUrLSac598tFwLQJc58RdW7lcOOUHP3U&e= >


From: MarcEdit support in technical and instructional matters <[log in to unmask]> On Behalf Of Cecil, Ivon
Sent: Friday, October 8, 2021 3:26 PM
To: [log in to unmask]
Subject: Re: [MARCEDIT-L] Problem subfields in 245 and 300

Many, many thanks Deb and Kathleen for the help! Problems solved!

However, a new one has cropped up.

I have a second file from the same vendor with a total of 11,000+ records. Running this one through MarcEdit validation uncovered over a thousand records with 008 fields that are 2 characters too long. I tried the Edit Field tool, but I haven’t been able to make it work.

Here’s an example:
=008  180723p20162012nyu\\nn\\\\\\\\o\\\\n|eng\d

The validation tool says there are 42 characters but 40 are expected.

I’m sure the nn shouldn’t be there (at least, not in that position), and I have my doubts about the “no attempt to code” markers, too. I would like to change that last solitary n to v, but I can live with it as is if I have to.

Once again, much appreciation for help anyone can give.

Ivon Cecil
Harrington Library Consortium

From: MarcEdit support in technical and instructional matters [mailto:[log in to unmask]] On Behalf Of Haley, Kathleen
Sent: Wednesday, October 06, 2021 7:35 AM
To: [log in to unmask]<mailto:[log in to unmask]>
Subject: Re: [MARCEDIT-L] Problem subfields in 245 and 300

I haven't seen a reply to this so I'll venture some thoughts. I've run quick tests on #s 1-3, but there may be other things going on in your data.

1. If you're sure there are never more than 2 subfields $b, then I think you could do a find/replace with a regular expression along the lines of:

Find:  (=245.*\$b.*)(\$b)(.*)
Replace: $1 : $3

(Note there's a single blank space on either side of the colon in the Replace)

2. Are the subfields which are supposed to be $b in the 300 fields reliably preceded by space + colon? If so, you could do something like (with regular expressions):

Find: (=300.*? :\$)([^b].*)
Replace: $1b$2

(Note there's a single blank space before the colon in the first Find group. The ".*?" construction looks for as few characters as possible.)

3. With regular expressions:
Find: \t
Replace:

(Use a single blank space in the Replace)

For #4, I often find it handy to convert the file to .mrk format and translate it to the MARC-8 character set. The mnemonics often make odd, small, or invisible characters visible. You can then find/replace the mnemonics with nothing or a blank space (depending on the situation).
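
For anyone who wants to try the three regular-expression replacements above on a scratch copy first, here is a rough Python equivalent. The sample lines are invented for illustration, not taken from the vendor file, and Python writes group references as `\1` where the Replace boxes above use `$1`.

```python
import re

# Hypothetical sample lines in mnemonic (.mrk) style.
title = "=245  10$aMain title$bfirst part$bsecond part /$cAn Author."
phys = r"=300  \\$a250 pages :$illustrations ;$c24 cm"
tabbed = "=500  \\\\$aA note\twith an embedded tab."

# 1. Merge the second $b into the first, separated by " : ".
merged = re.sub(r"(=245.*\$b.*)(\$b)(.*)", r"\1 : \3", title)

# 2. Insert the missing "b" after " :$" in the 300 field.
#    [^b] skips fields where the $ is already followed by "b".
repaired = re.sub(r"(=300.*? :\$)([^b].*)", r"\g<1>b\2", phys)

# 3. Replace each tab with a single blank space.
untabbed = re.sub(r"\t", " ", tabbed)

for line in (merged, repaired, untabbed):
    print(line)
```

One limit of the second pattern: `[^b]` cannot tell a correct $b from a broken subfield whose text happens to begin with "b" (for example "black and white"), so those fields would be left alone.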

Kathleen

Kathleen M. Haley
Information Systems Librarian
American Antiquarian Society
185 Salisbury St.
Worcester, MA 01609

e-mail: [log in to unmask]<mailto:[log in to unmask]>
AAS website: American Antiquarian Society<https://urldefense.proofpoint.com/v2/url?u=http-3A__secure-2Dweb.cisco.com_1mXBv8HHi3wxJ6ojyzbX88gGiTR9MQln-2DUE76ZQUDFwhQ2tVz1bPNEFSjTluYAgnjdoemoaFcl1kcTxRaco3QfvX4HSF9Klnt6SOFPhzC2ZO9GbkCiZkBFiIblMUt6P0n8ZrAkurYyZMsn7llGhrXPMhTENwQe7cUdq-2Dl2ASPwa3AAo9yAlpy8UoOtIpEOz8qsfPsFbh91t2hiuq4q49wgZHLG6LghBEl2j8I4kk1o7HXnFUYiJFAh7cW7SSjPE4PQrQxKG-2DktLri1o-5FUTEs12tYA50UNYkERllhOfwrpZqx-5FrKFUQ4izF-5F0vcP6u6x0kP9RHiV9wH9zQVW-5FcMgfnhPb2THPUion9T2pMyAJLLJ7u5fxrwDSjbyTkHPLSsT60-2DbQ-2DgtCXlGph6vMNBa7T0wXlA0yZp2LgH8u0MS-2DJ1T0MGmf079-5FO0jNCos5ya9l8gE6XZYjHt000e4dG63eUQQ_http-253A-252F-252Fwww.americanantiquarian.org-252F&d=DwIFaQ&c=6ZboKdJzR8nZOqwBjhPnCw&r=HQvKUTT3P1eSvp-R7AvU3xm1zLVDJkEd0-jgD5rHP68&m=DhlhK04pcoc23Cv0_YftqkSj2HqM-AwixrioecN3JZs&s=nlEquK6oyTtaVSwYag78UALZw0R7NOt6niwf3E6s8nU&e= >


From: MarcEdit support in technical and instructional matters <[log in to unmask]<mailto:[log in to unmask]>> On Behalf Of Cecil, Ivon
Sent: Monday, October 4, 2021 12:41 PM
To: [log in to unmask]<mailto:[log in to unmask]>
Subject: [MARCEDIT-L] Problem subfields in 245 and 300

Good morning, everyone.

I’m dealing with a file of 42,000+ MARC records. Help with some problems would be much appreciated.


  1.  Many 245 fields contain two subfields b. I need to merge the contents into one subfield b. It would be nice to have a colon between them if there isn’t one now, but a number of these records lack punctuation altogether--I may have to learn to live without it.
  2.  Many 300 fields are missing the “b” in subfield b. The $ is there; I need to insert “b” after it.
  3.  Some records have embedded tabs, which are, of course, invisible. Is there a way to get rid of them?
  4.  Our ILS is Sirsi Symphony. When I ran the file through MARC Import, it produced numerous errors along the lines of “#Lossless conversion char (0xc2a0) near position 11 in tag 650” My wild guess is that 0xc2a0 could be a Unicode character that is invisible in the .mrk version of the file. I need to know more about what this means and what, if anything, to do about it. I’m not sure I want even a “lossless conversion” character in the records. Could such a thing interfere with indexing, searching or display in the patron interface? (We use Enterprise.) Is there a way to get rid of it in the original file? I realize this question is more appropriate for the Sirsi cataloging listserve, but I’ve been locked out of it for reasons unknown.
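
On item 4: the bytes 0xC2 0xA0 are the UTF-8 encoding of U+00A0, the no-break space, which would explain why nothing is visible at that position. A short Python check, with a helper (`replace_nbsp`, a made-up name) that swaps it for an ordinary space; whether that replacement is appropriate for Symphony is an assumption to test on a copy of the data.

```python
import unicodedata

raw = bytes.fromhex("c2a0")  # the byte pair quoted in the import error
char = raw.decode("utf-8")
print(f"U+{ord(char):04X}", unicodedata.name(char))  # U+00A0 NO-BREAK SPACE


def replace_nbsp(text):
    # Swap every no-break space for a plain ASCII space.
    return text.replace("\u00a0", " ")


# Hypothetical 650 value with a no-break space hidden after the heading.
sample = "$aChemistry\u00a0--Computer simulation."
print(replace_nbsp(sample))
```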

Should regular expressions be called for, I may need someone to hold my hand. I’ve never been proficient, and it’s been so long since I needed one that I appear to have forgotten everything I once knew.

Many, many thanks for help anyone can give with these problems. Alas, there may be more to come.

Ivon Cecil
Cataloger
Harrington Library Consortium



Disclaimer

The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.

This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast.

------------------------------

Date:    Tue, 12 Oct 2021 12:45:32 -0700
From:    Lisa Hatt <[log in to unmask]>
Subject: Re: Validate Headings report

On 10/11/2021 1:54 PM, Teresa Weisser wrote:

> I have seen this happen intermittently from time to time,
> particularly with LC Name Authorities, occasionally with subjects
> too.  I have noticed that, if I try to check the headings manually in
> the LC Authority files during the same timeframe that the report
> comes back with many incorrectly invalid headings, access to the
> Authority file seems to be running slowly.  Is there a timeout
> feature in MarcEdit if the authority file doesn't respond?  I have
> tried re-running the report later and have had success doing that.

I get these false negatives fairly regularly too. Once a month when I 
load Kanopy and Films on Demand records I check the headings first, and 
there are usually at least a couple per batch that fail for no reason I 
can see because they're identical to the authorized heading.

I've also noticed it can fail on headings that are only capital letters, 
e.g.

150 ## DNA

and have wondered if maybe there's a technical reason for that.

-- 
Lisa Hatt
Cataloging | De Anza College Library
[log in to unmask] | 408-864-8459

------------------------------

Date:    Tue, 12 Oct 2021 16:14:54 -0400
From:    Diane Kinney <[log in to unmask]>
Subject: Re: Conditional find/replace stumper

Hi, Terry!  I was wondering if you have any more information on this.  If you can point me towards somewhere that explains it so I might be able to figure some of it out on my own, that'd be appreciated too.

It did just occur to me, is the fact that some of my "or" conditions are phrases, rather than single words, related at all to the issue?  If that would fix it, this might be more straightforward than it seems.

Thanks in advance!



Diane Kinney [she/her/hers]
Metadata Technician
Metadata & Processing Services

Drexel University Libraries 
Drexel University
3300 Market Street
W.W. Hagerty Library, Rm 112D
Philadelphia, PA 19104
Tel: 215.895.6845 |  Fax: 215.895.2070
library.drexel.edu






On Wed, 29 Sep 2021 12:45:21 -0400, Terry Reese <[log in to unmask]> wrote:

>I see -- you can do it this way -- the | are not true or statements in the REGEX.  In the IF statement, you can actually use the keyword (OR) to create multiple lookups that are atomic in nature.  Let me write up an example for you.
>
>--tr
>
>-----Original Message-----
>From: MarcEdit support in technical and instructional matters <[log in to unmask]> On Behalf Of Diane Kinney
>Sent: Wednesday, September 29, 2021 11:54 AM
>To: [log in to unmask]
>Subject: Re: [MARCEDIT-L] Conditional find/replace stumper
>
>Thank you, Terry!  Unfortunately, while that worked for that one record, changing all the find/replace statements hasn't fixed the overall problem.
>
>In this record, I'm including the 650 instead of the 245, because "Chemistry" in the 650 is what's causing the problem this time.  The 650 is causing this problem in some other records too.
>
>=650  \0$aChemistry--Computer simulation.
>=710  2\$aDrexel University.$bSchool.$tThesis.$f2019,$edegree granting institution.
>=999  00$aChemical Engineering
>
>My f/r statements are:
>
>Find: (=710.*)(\$b)(School)(.*)
>
>If: (=999.*Biological Sciences|Chemistry|Communication|Environmental Science|Mathematics|Physics|Psychology*)
>Replace: $1$2College of Arts and Sciences$4
>
>If: (=999.*Architectural|Chemical|Civil|Computer Engineering|Electrical|Environmental Engineering|Materials|Mechanical|Peace*)
>Replace: $1$2College of Engineering$4
>
>If I put the "College of Arts and Sciences" replace statement before "College of Engineering," it uses that statement.  If I put "College of Engineering" first, that's the one it uses.  But it seems like the order of the commands shouldn't make a difference, unless something in the find/replace statements is faulty.
>
>I also noticed in one record there wasn't a 999 field, but it still got the correct find/replace in the 710 field.  So it's getting the info from somewhere else, I just can't figure out why.
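
A possible explanation for both symptoms, sketched in Python. In standard regular expressions `|` binds more loosely than everything else, so in the If statements as posted the `=999.*` prefix belongs only to the first alternative, and a bare "Chemistry" or "Chemical" anywhere in the text satisfies the condition. The sketch assumes the If condition is tested against the whole record, which is a guess about MarcEdit's behavior, and it uses Python's `re` module rather than MarcEdit's own engine; the grouped variants are only one way to scope the alternatives.

```python
import re

record = "\n".join([
    r"=650  \0$aChemistry--Computer simulation.",
    r"=710  2\$aDrexel University.$bSchool.$tThesis.$f2019,$edegree granting institution.",
    r"=999  00$aChemical Engineering",
])

# Conditions as posted: "=999.*" applies only to the first alternative.
arts_posted = r"(=999.*Biological Sciences|Chemistry|Communication|Environmental Science|Mathematics|Physics|Psychology*)"
eng_posted = r"(=999.*Architectural|Chemical|Civil|Computer Engineering|Electrical|Environmental Engineering|Materials|Mechanical|Peace*)"

# Grouped variants: every alternative must follow "=999" on the same line.
arts_grouped = r"=999.*(?:Biological Sciences|Chemistry|Communication|Environmental Science|Mathematics|Physics|Psychology)"
eng_grouped = r"=999.*(?:Architectural|Chemical|Civil|Computer Engineering|Electrical|Environmental Engineering|Materials|Mechanical|Peace)"


def matches(pattern, text):
    return re.search(pattern, text) is not None


# Posted form: both conditions are true, so whichever comes first wins.
print(matches(arts_posted, record), matches(eng_posted, record))
# Grouped form: only the Engineering condition is true for this record.
print(matches(arts_grouped, record), matches(eng_grouped, record))
```

Under the same assumption, a record with no 999 field at all would still satisfy the posted Arts and Sciences condition whenever "Chemistry" appears in a 650.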

------------------------------

End of MARCEDIT-L Digest - 11 Oct 2021 to 12 Oct 2021 (#2021-214)
*****************************************************************
