September 2021


Paul Schaffner <[log in to unmask]>
Thu, 2 Sep 2021 12:51:32 -0400
I hesitate to mention this, since it takes one outside MarcEdit proper, but
most questions of the sort "which records are in X but not in Y"
can be readily addressed by means of those ancient utilities
sort, comm, grep, and uniq (native to Unix but ported to Windows, and
my standbys). If you have unique identifiers in your records (say
in the =001 field), and you are able to extract them from the mnemonic
MarcEdit format files (say with a text editor or grep) and sort them,
then comm, a beautiful little gem of a utility, will quickly tell you
which IDs are in the source files but not in the joined file. That's
exactly what it is for, and it does its work almost instantaneously.
And of course once you discover which IDs are missing, you'll know which
records are missing. I'm not sure I could live without comm.
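
A minimal sketch of that workflow, offered as an illustration rather than a tested recipe for any particular data set. It assumes every record carries an =001 line in the mnemonic file, and the file names (source1.mrk, source2.mrk, joined.mrk) and IDs are hypothetical stand-ins created by the script itself so that it can be run as-is:

```shell
#!/bin/sh
# Sketch only: the file names and IDs below are made up for illustration.
export LC_ALL=C    # give sort and comm the same byte-order collation
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical stand-ins for two source files and the joined file.
printf '=LDR  00000nam\n=001  id0001\n\n=LDR  00000nam\n=001  id0002\n' > source1.mrk
printf '=LDR  00000nam\n=001  id0003\n\n=LDR  00000nam\n=001  id0004\n' > source2.mrk
# In this sample, the join dropped id0003 and duplicated id0002.
printf '=001  id0001\n=001  id0002\n=001  id0002\n=001  id0004\n' > joined.mrk

# Pull the =001 values out of one or more mnemonic files and sort them.
# tr strips carriage returns in case the files have Windows line endings.
extract_ids() {
    cat "$@" | tr -d '\r' | grep '^=001' | sed 's/^=001 *//' | sort
}

extract_ids source1.mrk source2.mrk > source_ids.txt
extract_ids joined.mrk > joined_ids.txt

# comm -23 keeps only the lines unique to the first file:
# IDs present in the sources but absent from the joined file.
comm -23 source_ids.txt joined_ids.txt > missing_ids.txt

# uniq -d lists each ID that occurs more than once in the sorted joined list.
uniq -d joined_ids.txt > duplicated_ids.txt

echo "Missing from joined file:";    cat missing_ids.txt      # prints id0003
echo "Duplicated in joined file:";   cat duplicated_ids.txt   # prints id0002
```

comm expects both inputs to be sorted in the same collation order, which is why the sketch fixes LC_ALL before sorting. With real data, the printf lines would be dropped and the actual .mrk file names substituted.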


P.S. I assume that a number dropped out before the words "were lost": "ten"? (6007 records, plus 8 duplicated, minus 10 lost, gives the 6005 reported.)

On Thu, 2 Sep 2021, at 12:28, Terry Reese wrote:
> It is honestly difficult to answer questions like this without seeing 
> the files.  MarcEdit itself, when processing individual files, fixes 
> structure issues on the fly.  Join does not.  My guess is that there are 
> potentially structural issues in the file.
> On Thu, Sep 2, 2021, 12:21 PM Leslie Engelson <[log in to unmask]> wrote:
> > Hello:
> > 
> > I had 8 files that had a total of 6007 records between them. None of the records were duplicates. When I joined them, 8 records were duplicated in the joined file and were lost somehow, resulting in 6005 in the joined file.
> > 
> > I can locate the duplicated records easily but can't figure out how to find which records were not included in the joined file. Can anyone help me know how to figure this out?
> > 
> > Also, why is this happening? I want to send the joined file for authority processing, but I don't want to send duplicate records and I want to send all of the unique records, so I'd like to resolve this problem.
> > 
> > Thanks for your help
> > 
> > Leslie
> >
> > Leslie Engelson
> > 
> > Metadata Librarian / Professor
> > [log in to unmask]
> > 
> > Murray State University 
> > 270-809-4818 
> > Waterfield Library 224G
> > Murray State University
> ________________________________________________________________________ This message comes to you via MARCEDIT-L, a Listserv(R) list for technical and instructional support in MarcEdit. If you wish to communicate directly with the list owners, write to [log in to unmask] To unsubscribe, send a message "SIGNOFF MARCEDIT-L" to [log in to unmask] 

Paul Schaffner  UM Library : Digital Content & Collections
[log in to unmask]

