MARCit: MARC Enhancement Service: Initial Testing

The Ex Libris MARC Enhancement Service (MARCit!) became available for our use on June 5, 2004 (our contract will run to June 5, 2005). Janet has enabled the Twin Cities local SFX instance to extract the titles and accept the records on return. Cecilia Genereux and I worked on specs for bibliographic record changes. I was able to identify fix_docs for all but two of the desired changes.

Cecilia and I identified eight targets for the initial run of the service. We chose Highwire and Synergy in particular for their title overlap potential. As advertised, only one record per object (title) was created. Each record had only one URL, but multiple 853/863 pairs, reflecting each target and its coverage. The fix_docs required some reworking, after which we achieved the desired result.

I have included testing results and sample records. I also have several questions requiring input from LEO, TS, and Ex Libris. I met with Joe Holtermann, from UMD, and we went over the results and discussed an implementation plan, which is appended at the end of this document. The exact load times are not included, since they depend on answers to my questions and on the completion of tasks prior to the loads. If feasible, I would like to coordinate the loading of new records with the clean-up of old records to create as little confusion as possible for public services.

MARCit – Test 1

Fix 1

!-!!!!!-!!-!-!!!-!!!-!!!!!-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!!!!!!!!!>

!* 05/25/2004:bhf:Created for SFX MARCit.

1 001 DELETE-FIELD

1 006 DELETE-FIELD

1 007 DELETE-FIELD

5 300 DELETE-FIELD

5 530 DELETE-FIELD

4 008 023 023 FIXED-CHANGE-VAL #,s

2 LDR ADD-FIELD 006 ,L,m|,|,|,|,d,|,|,|,|,|

3 LDR ADD-FIELD 007 ,L,cr,|,u,n,u,|,|,|,|,|,|,|

6 LDR ADD-FIELD 530 ,L,$$aOnline version of print title.

6 LDR ADD-FIELD 538 ,L,$$aMode of access: World Wide Web.

6 LDR ADD-FIELD 590 ,L,$$aejour

6 LDR ADD-FIELD 965 ,L,$$aNOEXP$$bSFX

Titles extracted from the following targets:

Synergy

OUP

Miscellaneous-ejournals

Journals @ OVID

Highwire

Ebsco Business Source Premier

ACS

IEEE Explore journals

Turnaround time from extract sent to return of records: approx. 4 minutes

Records loaded into TMN01, Friday morning, June 25

Time to load: 28 min.

No. of records: 3318

Record review:

Reviewed 20 records from set (approx. 2 from each target), compared against CONSER records with same OCLC number:

Findings (Bolded questions are here only in reference to findings. They are repeated in the Question section.):

  • Only 2 of 20 (10%) titles did not have full records … (SFX reps stated the percentage would be in the realm of 5-7%)
  • 18 fully cataloged records were for the print; 6 of the original OCLC records had online available 776 and 007; 1 of the original records was for electronic
  • LDR position 5 is marked ‘n’ as it should be
  • 003 has SFX on MARCit recs; OCoLC for OCoLC recs

006, 007 didn’t work correctly at all

008 needs a position fix

There is no 008 on brief records; should we create one?

0167s (NLM control number) stripped off SFX recs

090 (SFX sys id) on recs; should we strip it or is it necessary for match; it is part of the 035

SFX id no in 035 on MARCit recs

OCLC no in 035 on MARCit recs

049 not present on recs

  • 245 $$h appended on end of title w/o regard to punctuation; can it change, do we care?

Brief records have no first indicator

  • 300 stripped from MARCit

530 stripped from original recs, if it existed, and new one added; do we need to add, since it doesn’t apply to all?

538 stripped from original recs, if it existed, and new one added; should we delete field in case we hit an e-record?

590 added to MARCit recs; should we leave off

  • All 650s retained on MARCit records

Diacritics odd on 650 6 French subject heading; correct on OCLC rec (see 000533218TMN01)

  • 776 contains multiple blank subfields after the $$w; $$x added and dropped to its own line (tagged separately) … is the 776$$x searched by SFX in the catalog … do we need to move???
  • 852 MnU added

852 $$a provided by EL to mark SFX instance; can we/should we change to an 049 $$a

  • 89112 become 85312

89140 become 86340

MARCit pairs match the orig 891s exactly

If there are multiple 856s on orig rec; there are pairs of 85312/86340 on the MARCit recs

853/863 provide us no useful information; should we strip them?

  • 856 for SFX instance added

856, provided by SFX, now goes to the menu, BUT you see no other services; when we take off the part of URL immediately following the ISSN, we do see the services; EL has been contacted

  • 965 NOEXP added

Fix 2

!-!!!!!-!!-!-!!!-!!!-!!!!!-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!!!!!!!!!>

!* 05/25/2004:bhf:Created for SFX MARCit.

!* 07/01/2004:bhf:Updated for SFX MARCit.

1 001 DELETE-FIELD

1 006 DELETE-FIELD

1 007 DELETE-FIELD

5 300 DELETE-FIELD

5 530 DELETE-FIELD

4 008 023 023 FIXED-CHANGE-VAL #,s

2 LDR ADD-FIELD 006 ,L,m|||||||d||||||||

3 LDR ADD-FIELD 007 ,L,cr|unu|||||||

6 LDR ADD-FIELD 530 ,L,$$aOnline version of print title.

6 LDR ADD-FIELD 538 ,L,$$aMode of access: World Wide Web.

6 LDR ADD-FIELD 590 ,L,$$aejour

6 LDR ADD-FIELD 965 ,L,$$aNOEXP$$bSFX

MARCit Test 2 records

These records were not fully loaded, so the fields we added are out of order.

006, 007, 008 fixes are now correct

Record example 1

000000001 LCU L $$a20040624135252

000000001 FMT L SE

000000001 LDR L -----nas--2200457-a-4500

000000001 003 L SFX

000000001 005 L 20040324104439.0

000000001 008 L 901009c19909999enkqr-p-s-o---0----0eng-d

000000001 010 L $$a---92640104-$$zsn-90031375-

000000001 022 L $$a0955-2359

000000001 035 L $$a(OCoLC)22481516

000000001 035 L $$a(SFX)110975946727447

000000001 040 L $$aWMM$$cWMM$$dDNLM$$dMiU$$dWaU$$dMH$$dNST$$dDLC$$dNST$$dGU$$dNSDP$$dMH$$dCU-S$$dRPB$$dTNJ$$dGU$$dMH

000000001 042 L $$alc

000000001 043 L $$ae-uk---

000000001 05000 L $$aDA566$$b.A15

000000001 06000 L $$aW1$$bTW453

000000001 08200 L $$a941.082/05$$220

000000001 090 L $$a110975946727447

000000001 210 L $$aTWENTIETH CENTURY BRITISH HISTORY

000000001 24500 L $$a20 century British history.$$h[electronic resource]

000000001 24618 L $$aTwentieth century British history

000000001 2461 L $$iIssues for <2004-> have title:$$a20th century British history

000000001 260 L $$aOxford :$$bOxford University Press,$$c[1990-

000000001 310 L $$aQuarterly,$$b1999-

000000001 321 L $$aThree no. a year,$$b1990-1998

000000001 3620 L $$aVol. 1, no. 1 (1990)-

000000001 500 L $$aTitle from cover.

000000001 500 L $$aLatest issue consulted: Vol. 15, no. 1 (2004).

000000001 550 L $$aFounded by the Institute of Contemporary British History.

000000001 651 0 L $$aGreat Britain$$xHistory$$y20th century$$vPeriodicals.

000000001 650 2 L $$aHistory$$zGreat Britain$$vPeriodicals.

000000001 7102 L $$aInstitute of Contemporary British History.

000000001 77608 L $$iOnline version:$$t20 century British history (Online)$$x1477-4674$$w(DLC) 2002256058$$w(OCoLC)50024970

000000001 776 L $$x1477-4674

000000001 850 L $$aCSt$$aCStclU$$aDLC$$aDNLM$$aDeU$$aGU$$aICU$$aKyU$$aMBU$$aMH$$aMiU$$aMnU$$aNN$$aNcU$$aTxHR$$aWaU

000000001 85300 L $$81$$av.$$bno.$$u3$$vr$$i(year)$$wt

000000001 86340 L $$81.1$$a1$$b1$$i1990

000000001 85320 L $$82$$av.$$bno.$$u4$$vr$$i(year)$$wq

000000001 86341 L $$82.1$$a10$$b1$$i1999

000000001 852 L $$aMnU

000000001 856 L $$u for it!

000000001 866 L $$xOxford University Press:Full Text$$a Availability: from 2002 volume 13 issue 1

000000001 CAT L $$c20040416$$lJNL99$$h1258

000000001 SRC L $$aCONSER

000000001 006 L m|||||||d||||||||

000000001 007 L cr|unu|||||||

000000001 530 L $$aOnline version of print title.

000000001 538 L $$aMode of access: World Wide Web.

000000001 590 L $$aejour

000000001 965 L $$aNOEXP$$bSFX

Record Example 2

000000002 LCU L $$a20040624135252

000000002 FMT L SE

000000002 LDR L -----nas--2200397---4500

000000002 003 L SFX

000000002 005 L 20031014145901.0

000000002 008 L 771129d19772002ohubr1p-s-----0---a0eng-d

000000002 010 L $$a---77641061-$$zsc-78000495-

000000002 022 L $$a0149-1210

000000002 030 L $$aTHMMAG

000000002 032 L $$a459970$$bUSPS

000000002 035 L $$a(OCoLC)3451809

000000002 035 L $$a(SFX)963018022558

000000002 040 L $$aNSDP$$cNSDP$$dDLC$$dNSDP$$dAIP$$dNSDP$$dNST$$dOCoLC$$dMiU$$dOCoLC$$dNSDP

000000002 042 L $$ansdp$$alc

000000002 05000 L $$aTS300$$b.T47

000000002 08210 L $$a669/.005

000000002 090 L $$a963018022558

000000002 222 0 L $$a33 metal producing

000000002 24500 L $$a33 metal producing.$$h[electronic resource]

000000002 2463 L $$aThirty-three metal producing

000000002 24630 L $$aMetal producing

000000002 260 L $$a[Cleveland, Ohio, etc.$$bPenton Pub., etc.]

000000002 310 L $$aBimonthly,$$b2002

000000002 321 L $$aMonthly,$$bJan. 1977-

000000002 321 L $$a11 no. a year,$$b<July/Aug. 1987- >

000000002 321 L $$aMonthly,$$b<June 1988- >

000000002 3620 L $$av. 15-40, no. 6; Jan. 1977-Nov./Dec. 2002.

000000002 515 L $$aIssues for <1987- > also called v. <25- >

000000002 650 0 L $$aSteelwork$$vPeriodicals.

000000002 650 0 L $$aMetal-work$$vPeriodicals.

000000002 78000 L $$t33$$x0040-6155$$w(DLC) 73641630

000000002 78500 L $$tMetal producing & processing$$x1547-1411$$w(DLC) 2003227185$$w(OCoLC)51787000

000000002 850 L $$aCU$$aDLC$$aMiEM$$aPPi

000000002 852 L $$aMnU

000000002 856 L $$u for it!

000000002 866 L $$xBusiness Source Premier:Full Text$$a Availability: from 2003

000000002 CAT L $$c20040413$$lJNL99$$h1557

000000002 SRC L $$aCONSER

000000002 006 L m|||||||d||||||||

000000002 007 L cr|unu|||||||

000000002 530 L $$aOnline version of print title.

000000002 538 L $$aMode of access: World Wide Web.

000000002 590 L $$aejour

000000002 965 L $$aNOEXP$$bSFX
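The record listings above can be read mechanically. The following is a minimal sketch, assuming the whitespace-collapsed layout printed in this report (record number, tag plus indicators, the literal `L`, then content) rather than the fixed-column layout of a real Aleph sequential file. The names `Field`, `parse_line` and `subfields` are hypothetical helpers, not part of Aleph or SFX.

```python
from typing import List, NamedTuple, Tuple


class Field(NamedTuple):
    sysno: str    # 9-digit record number, e.g. "000000001"
    tag: str      # 3-character tag, e.g. "245", "LDR", "CAT"
    ind: str      # indicators exactly as printed (may be empty)
    content: str  # everything after the " L " marker


def parse_line(line: str) -> Field:
    """Split one record line, as printed in this report, into its parts."""
    sysno, rest = line.split(" ", 1)
    head, content = rest.split(" L ", 1)
    return Field(sysno, head[:3], head[3:], content)


def subfields(content: str) -> List[Tuple[str, str]]:
    """Return (code, value) pairs for content written with $$ delimiters."""
    return [(part[0], part[1:]) for part in content.split("$$")[1:] if part]
```

For example, `parse_line("000000001 651 0 L $$aGreat Britain$$xHistory")` yields tag `651` with indicators `" 0"`, and `subfields` on its content yields the `a` and `x` pairs in order.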

MARCit Test 3

On #3 of the MARC enhancement tool (SFX MARC Enhancement Service), I selected no active … to try to force the appearance of a full menu (not just available full text).

Results

URL still looks like this:

000000002 856 L $$u

Note: removal of &pid=serviceType=getFullTxt results in display of full menu of services

Questions:

  1. There is no 008 on brief records. Can we create one that will work for all?
  1. Should we strip the 090 (SFX sys id) or is it necessary for match?

According to correspondence from Cassandra Targett (CT), at Ex Libris, we can set the 035 to be the match point.

  • Check with CT to verify this is so
  • What if there are multiple 035s and the SFX 035 is second?
  • Check with Chris
  • Do multiple 035s pose a problem for matching? Doesn’t matter (CM)
  • Are the 090 and 003 better (more unique) match points? 035 is just fine (CM)
  1. We need to script the addition of a 130 to each record, by copying the 245$$a and adding (Online) to the end. There are at least three possible scenarios. What do we do about initial articles in the 245? There is a list that Erik’s script could use to get rid of most of them. (CM) Erik should be able to script
  • Examples:
  • 24500 L $$a20th century British history.$$h[electronic resource]
  • 1300 L $$aCanadian applied mathematics quarterly (Online) OR

1300 L $$aCanadian applied mathematics quarterly (Toronto, Ont. : Online)

24504 L $$aThe Canadian applied mathematics quarterly [electronic resource]

  • 1300 L $$aEnvironmental science & technology (Easton, Pa.)

24500 L $$aEnvironmental science & technology.$$h[electronic resource]

If we could script the following:

  • If there is no 130: copy 245$$a and append (Online) to it
  • 1300 L $$a20th century British history (Online)

24500 L $$a20th century British history.$$h[electronic resource]

  • If there is a 130 with the string (Online) leave it as it is
  • If there is a 130 without the string (Online), remove the terminal ) and add ^:^Online), where ^ stands for a space
  • 1300 L $$aEnvironmental science & technology (Easton, Pa. : Online)

24500 L $$aEnvironmental science & technology.$$h[electronic resource]
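The three 130 rules above could be scripted roughly as follows. This is only a sketch: the function name `build_130` is hypothetical, and three points are assumptions rather than decisions recorded here. First, a 130 counts as already online if it contains `Online)`, so that a qualifier such as (Toronto, Ont. : Online) is left alone. Second, a 130 with no parenthetical qualifier at all simply gets (Online) appended. Third, initial articles are skipped using the 245 second indicator (the MARC nonfiling character count) instead of the article list mentioned above.

```python
from typing import Optional


def build_130(title_245a: str,
              existing_130: Optional[str] = None,
              nonfiling: int = 0) -> str:
    """Return the 130 $$a text for an online serial record."""
    if existing_130 is None:
        # Rule 1: no 130, so copy 245 $$a (minus any initial article and
        # trailing punctuation) and append (Online).
        title = title_245a[nonfiling:].rstrip(" ./:;,")
        return f"{title} (Online)"
    heading = existing_130.rstrip()
    if "Online)" in heading:
        # Rule 2: already qualified as online, leave it as it is.
        return heading
    if heading.endswith(")"):
        # Rule 3: reopen the existing qualifier and add " : Online)".
        return heading[:-1] + " : Online)"
    # Assumption: an unqualified 130 just gets (Online) appended.
    return heading + " (Online)"
```

Called as `build_130("Environmental science & technology.", "Environmental science & technology (Easton, Pa.)")`, this gives the heading with the qualifier (Easton, Pa. : Online), matching the example above.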

  1. Can we change the 245 $$h appended on end of title w/o regard to punctuation? Do we care?

CT already queried. CT answer: “That looks like a bug caused by tacking the subfield h. We'll fix that in our next round of fixes in the Fall.” Any reason to wait until fall for this?

  • Tech Serv can live with it

Can we get brief records to have a first indicator?

CT says the recs she looked at have them. Examples were sent. CT answer: “I misunderstood the original question - the brief records are just kept from SFX, and those don't have indicators. They probably should - we'll add this to our list of fixes for Fall.”
Shall we (can we) add a first indicator ‘0’ through fix docs, if this will happen through fix … should we wait? Done through fix … changed from blank to 0

  1. We have added a 530 ‘Online version of print title.’ This is not true in all cases. Should we NOT add? Not added (CG, CM, CH).
  1. Is 538 ‘Mode of access: World Wide Web’ required or just redundant? Should we NOT add? Added
  1. We added a 590 ‘ejour’ to MARCit recs. Should we delete?

Do not add (picked up through fixed fields)

  1. Can we script the move of $$x values from 776 with (Online) as a string to an 022 $$a<value> (Online) and delete that particular 776? Does SFX search this field? (Do we need to do this, if it is only for looks?)
  • Example:

000000001 77608 L $$iOnline version:$$t20 century British history (Online)$$x1477-4674$$w(DLC) 2002256058$$w(OCoLC)50024970

000000001 776 L $$x1477-4674 change to

000000001 022 L $$a1477-4674 (Online) and remove the 776s that include the ISSN string

CT comment: “There are some ISSNs in the 776 that are in the original SFX records. When you see it in a separate field like this, it came from SFX and is being retained. I'm not sure we want to do that - I'll check it more closely.”
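If the move is scripted, it might look something like the sketch below. It assumes each record is a list of (tag plus indicators, content) pairs, that an ISSN has the usual NNNN-NNNC shape, and that the new 022 goes directly after any existing 022. The function name `move_online_issn` is hypothetical.

```python
import re
from typing import List, Tuple

Field = Tuple[str, str]  # (tag + indicators, content with $$ subfields)

ISSN_X = re.compile(r"\$\$x(\d{4}-\d{3}[\dXx])")


def move_online_issn(fields: List[Field]) -> List[Field]:
    """Turn the $$x of 776 fields marked (Online) into 022 fields."""
    # Collect ISSNs from 776 fields that carry the string (Online).
    online: List[str] = []
    for tag, content in fields:
        if tag.startswith("776") and "(Online)" in content:
            for issn in ISSN_X.findall(content):
                if issn not in online:
                    online.append(issn)
    # Drop every 776 whose $$x holds one of those ISSNs.
    kept = [
        (tag, content) for tag, content in fields
        if not (tag.startswith("776")
                and any(i in ISSN_X.findall(content) for i in online))
    ]
    new_022 = [("022", f"$$a{issn} (Online)") for issn in online]
    # Place the new 022 after the last existing 022, else at the end.
    last = max((i for i, (tag, _) in enumerate(kept) if tag.startswith("022")),
               default=-1)
    pos = last + 1 if last >= 0 else len(kept)
    return kept[:pos] + new_022 + kept[pos:]
```

Run against the two 776 lines in the example, this leaves a single new field, 022 `$$a1477-4674 (Online)`, and removes both 776s. A 776 without the string (Online) is left untouched.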

7. MARCit provides an 852$$a on the records to distinguish instance. We can change the field to a 049, which is what we normally use for holdings creation, with a Change-Field fix doc. Should we change it?

8. If there are multiple targets for any given object, MARCit creates pairs of 85312/86340 on the MARCit records. The 853/863 pairs provide us with no useful information, as we are routing to the menu and we don’t need the pair for pattern creation. Should we delete them?

9. We supply the base URL for our SFX instance to the MARCit service. The ISSN portion is added for the individual titles. However, regardless of which service we choose or if we choose none at all, we still get the appended &pid=serviceType=getFullTxt, e.g. $$u The effect is that we do not see all the services on the menu. We only see the targets offering the title and the holdings for each. EL has been queried on this. What is our position on this?

CT answer: “I checked with the SFX folks and they told me this isn't a bug at all, but the way it's designed. All the exports in SFX work this way (like the HTML export). The reason they do it this way is so the user goes directly to the Full Text (if you have the direct link enabled). If you don't want it to work this way, I guess you could strip it out - I won't be able to get a change into MARCit to strip it out until Fall at the earliest. You could also try to lobby them to change the default, but that probably wouldn't fit your timeframe either.”
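If we decide to strip the parameter ourselves, the change is a simple string removal on the 856 $$u. The sketch below assumes the appended text is always exactly `&pid=serviceType=getFullTxt`, as quoted above. The function name and the URL in the usage note are hypothetical, since the real URLs are not reproduced in this document.

```python
FULLTEXT_PARAM = "&pid=serviceType=getFullTxt"


def strip_fulltext_param(url: str) -> str:
    """Remove the full-text-only parameter so the full SFX menu displays."""
    return url.replace(FULLTEXT_PARAM, "")
```

For a made-up URL such as `http://sfx.example.edu/local?issn=0955-2359&pid=serviceType=getFullTxt`, the function returns the same URL ending at the ISSN. A URL without the parameter comes back unchanged.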

10. One other consideration for the MARCit project is workflow. When the records are returned from Ex Libris, they are placed on the SFX server. Currently, only Janet and Erik have authority to fetch the records. Do we want to make any changes here?

MARCit Implementation Plan

Task

Contact coordinate campuses to set up MARCit

Communicate w/ Cecilia & Cassandra

Talk to LEO staff about scripts and implementation

Perfect fixes

Test overlay of records

Write and test scripts

Test merging of records

Load records

Aggregators-all campus

Vendor packages – all campus

Multi-campus targets

TC single campus targets

Duluth single campus targets

Morris single campus targets

Crookston single campus targets

Aggregator targets: Ebsco Business Source Premier

Ebsco Academic Search Premier

Ebsco MasterFile

Proquest Historical Newspapers (Historical NYT)

Lexis-Nexis ??

All campus pkgs.: Elsevier Science Direct

Kluwer Online

Wiley Interscience

IEEE Electronic Library (only journals; proceedings have ISBNs and are not included)

Nature

Step 1: Export of records to ExLibris (expect a monthly cycle)

· From each SFX instance (instances for MnU, MnDuU, MnMoU, MnCrUM)

· From each administrative module set up parameters for export on the MARC Enhancement tool form in SFX. Betsy will provide screen shot (blank form attached)

· Includes:

o Provide base URL

o Output format of records (Aleph sequential)

o Targets for export

o Whether fresh or comparative export

o E-mail address for record confirmation

o File prefix (will use date of file)

o 245 $$h (electronic resource)

· Can select fields to strip (we have deleted through fixes but can change anytime)

Step 2: Records returned by ExLibris

· Records returned to SFX server (Castor/Pollux)

· Turnaround during test was 4 minutes

· LDR position 5 will indicate new (n), changed (c), deleted (d)

· Expect 4 files, one for each instance

· Records will be in Aleph sequential format
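A small sketch of how a script might read the status from LDR position 5 follows. It assumes the leader is passed as the string printed in the sample records (for example `-----nas--2200457-a-4500`) and counts positions from zero, as MARC does. The names `STATUS` and `record_status` are hypothetical.

```python
STATUS = {"n": "new", "c": "changed", "d": "deleted"}


def record_status(leader: str) -> str:
    """Map LDR position 5 to new, changed or deleted."""
    return STATUS.get(leader[5:6], "unknown")
```

Both sample records from Test 2 have `n` in this position, so they would be treated as new.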

Step 3: Move records to Tomos

· Scp from Castor/Pollux to Tomos (Betsy or Connie may need to be authorized)

· Notify Erik that file is ready

Step 4: Pre-Load Script: Script to prepare records for load (EJB)

· Prepare one deduplicated file from the 4 returned files:

o Start with MnU instance and identify/eliminate any duplicate records (Law) (no duplicate records will be created from a single instance, only 1 per title is created)

o Identify records from other 3 files that are not in MnU instance and copy records to the unified file

o For the non-unique records in MnDuU, MnMoU, MnCrUM copy the 852, 856, and 866 to the MnU record (853/863 pairs that we talked about in the meeting are from the CONSER print record and do NOT reflect what we have in our instances). Add a $$x to the 866 with the value from the 852$$a

o Any special handling for deleted records?? (weren't we going to have Erik script a STA=DELETED) What happens if a MnU deleted record is still an 'n' or 'c' in one of the other files? (any way to create an error report if the LDR values on the instance records do not match?)

· Create/edit the 130 field

· Do we want Erik to remove period before $$h in 245? (EL is supposed to fix in fall version)

· Move the data in 776 $$x field to 022 field and append with (Online)

· Add a field with current date to be used for export purposes (Chris needs to define field)

· Copy final file to umn01/scratch
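The deduplication part of Step 4 can be sketched as below. This is an illustration under several assumptions, not the script Erik will write: records are lists of (tag plus indicators, content) pairs, the match key is the (SFX) 035, every 866 gets a trailing $$x holding its own instance's 852 $$a before merging, and deleted records get no special handling. All function names are hypothetical.

```python
from typing import Dict, List, Tuple

Field = Tuple[str, str]   # (tag + indicators, content with $$ subfields)
Record = List[Field]

COPIED_TAGS = ("852", "856", "866")


def sfx_id(record: Record) -> str:
    """Return the SFX object id from the 035 $$a(SFX)... field."""
    for tag, content in record:
        if tag.startswith("035") and content.startswith("$$a(SFX)"):
            return content[len("$$a(SFX)"):]
    raise ValueError("record has no (SFX) 035")


def instance_code(record: Record) -> str:
    """Return the instance code from 852 $$a, e.g. MnU."""
    for tag, content in record:
        if tag.startswith("852") and content.startswith("$$a"):
            return content[3:].split("$$")[0]
    raise ValueError("record has no 852 $$a")


def tag_866(record: Record) -> Record:
    """Append $$x<instance code> to every 866 in the record."""
    code = instance_code(record)
    return [(t, c + "$$x" + code) if t.startswith("866") else (t, c)
            for t, c in record]


def unify(mnu: List[Record], others: List[Record]) -> List[Record]:
    """Build one deduplicated file, starting from the MnU records."""
    merged: Dict[str, Record] = {}
    for rec in mnu:
        merged.setdefault(sfx_id(rec), tag_866(rec))
    for rec in others:
        rec = tag_866(rec)
        key = sfx_id(rec)
        if key not in merged:
            merged[key] = rec                      # unique to another campus
        else:                                      # copy holdings-type fields
            merged[key] = merged[key] + [f for f in rec if f[0][:3] in COPIED_TAGS]
    return list(merged.values())
```

Here `others` is the combined list of records from the MnDuU, MnMoU and MnCrUM files. Keying on the SFX id keeps the result independent of the order in which the three files are read, apart from the order of the copied fields.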

Step 5: Run p_manage_36 to identify new, unique and multiple matches

· Create tab_match

Step 6: Run p_manage_18 to load new and changed records (different parameters for new and uniq)

· Create local field for subject that will be retained during merge (Chris)

· Create merge section for umn01/tab/tab_fix

· Create fix section for umn01/tab/tab_fix

o Create generic 006

o Create generic 007

o Create generic 008

o Strip 003

o Strip 090

o Strip 530

o Strip 590 (consult with JA) (will be deleted)

o Strip 853 and 863

o Strip existing 538

o Add new 538

o Convert data in 852 to 049

o Strip 852

Step 7: Run p_manage_50 to create holdings and items

· Create tab_hol_item_create

· Create tab_hol_item_map

Step 8: Post-Load Script for new records: Script to move data to holdings (EJB)

· Add appropriate value to 852$$a based on 852$$b in holdings record

· Add 940 field: $$a<based on 852$$b>$$bCAT$$c<date>

· Move appropriate version of 856 to holdings record (Depends on 852$$b)

· Move 866 fields to holdings (Depends on $$x) and remove $$x

· Remove 856 fields and 853/863 pairs from the bib records (p_manage_21)

Step 8a: Post-Load Script for updated records

· Add 940 field $$a<based on 852$$b>$$bUPD$$c<date>

· Move appropriate version of 856 to holdings record and replace existing version (Depends on 852$$b)

· Move 866 fields to holdings and replace existing fields (Depends on $$x) and remove $$x

· Remove 856 fields and 866 pairs from the bib records (p_manage_21)

Step 9: Extract for MetaLib

· Use new field (date stamp) created for SFX export to select records

· Run p_print_03 to export records

· Create fix routine if needed to prepare records for MetaLib

Step 10: Import records to MetaLib

· Can we overlay or do we need to drop/recreate records?

· Indexing

· Effect on A-Z list

New Fix Doc

!-!!!!!-!!-!-!!!-!!!-!!!!!-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!-!!!!!!!!!!!!!!!!!!!!>

!* 05/25/2004:bhf:Created for SFX MARCit.

!* 07/26/2004:bhf:Updated for SFX MARCit.

1 001 DELETE-FIELD

1 003 DELETE-FIELD

1 006 DELETE-FIELD

1 007 DELETE-FIELD

1 090 DELETE-FIELD

1 300 DELETE-FIELD

1 530 DELETE-FIELD

1 850 DELETE-FIELD

1 853 DELETE-FIELD

1 863 DELETE-FIELD

2 008 023 023 FIXED-CHANGE-VAL #,s

3 245 CHANGE-FIRST-IND ,0

4 852 CHANGE-FIELD 049

5 LDR ADD-FIELD 006 ,L,m|||||||d||||||||

5 LDR ADD-FIELD 007 ,L,cr|unu|||||||

5 LDR ADD-FIELD 538 ,L,$$aMode of access: World Wide Web.

5 LDR ADD-FIELD 965 ,L,$$aNOEXP$$bSFX
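To make the intended outcome of this fix doc easier to check against sample records, here is a small model of its effect. It is not how Aleph executes a fix doc. It assumes fields are (tag, two-character indicators, content) triples, that blanks may appear as spaces or as the hyphens used in the record listings in this report, that 008 position 23 is counted from zero, and that CHANGE-FIRST-IND only fills a blank first indicator (as described in the questions section). The name `apply_fix_doc` is hypothetical.

```python
from typing import List, Tuple

Field = Tuple[str, str, str]   # (tag, two-character indicators, content)

DELETE_TAGS = {"001", "003", "006", "007", "090",
               "300", "530", "850", "853", "863"}
ADD_FIELDS: List[Field] = [
    ("006", "  ", "m|||||||d||||||||"),
    ("007", "  ", "cr|unu|||||||"),
    ("538", "  ", "$$aMode of access: World Wide Web."),
    ("965", "  ", "$$aNOEXP$$bSFX"),
]
BLANKS = (" ", "-")


def apply_fix_doc(record: List[Field]) -> List[Field]:
    """Model the net effect of the fix doc on one record."""
    out: List[Field] = []
    for tag, ind, content in record:
        if tag in DELETE_TAGS:                      # DELETE-FIELD lines
            continue
        if tag == "008" and len(content) > 23 and content[23] in BLANKS:
            content = content[:23] + "s" + content[24:]   # FIXED-CHANGE-VAL #,s
        if tag == "245" and ind[:1] in BLANKS:
            ind = "0" + ind[1:]                     # CHANGE-FIRST-IND ,0
        if tag == "852":
            tag = "049"                             # CHANGE-FIELD 049
        out.append((tag, ind, content))
    return out + ADD_FIELDS                         # ADD-FIELD lines
```

The added fields land at the end of the record, which matches the out-of-order placement seen in the Test 2 listings.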