Minutes of the CJH/Metro born-digital legacy media transfer working group, 5/08/14

Present: Ginger Barna (Leo Baeck Institute Library); Chris Bentley (Leo Baeck Institute Archives); Sarah Haug (Guggenheim Museum Archives); Christine McEvilly (American Jewish Historical Society); Natalie Milbrodt (Queens Library); Margo Padilla (METRO); Henry Raine (New-York Historical Society); Jefferson Bailey (METRO); Kevin Schlottmann (Center for Jewish History).

Administrative

Two new members were introduced: Margo Padilla of METRO, who will be working with Jefferson on this project; and Ginger Barna of the Leo Baeck Institute Library, who took over for Tracey Beck of the same institution.

Minutes of the last meeting were briefly reviewed.

Survey and materials for inclusion

The representative for each participating institution described the process and results of a survey of born-digital media holdings, and highlighted materials for potential transfer by METRO. Bullet points below summarize the process and results. Discussions and ideas are marked as such.

NYHS:

*Top priority: "Here is New York" computer. Digital collection of materials related to 9/11. Computer "died", and which materials were backed up is unclear. Metadata related to the project (e.g. database of photographers) is also possibly on the computer. Goal: try to recover data from hard drive.

*LC American Memory files. Scans of large posters on Exabyte tape. METRO said transfer from this format is probably not cost-feasible in the scope of this project.

*Institutional archives. Past 20 years of materials include floppies, VHS tapes, etc. Materials are unprocessed (no institutional archivist) and individual departments hold their own materials; thus hard to survey, let alone access.

AJHS:

*Top priority: UJA floppies. 42 floppy disks (so far) from the four-year project to process the records of the United Jewish Appeal. Likely financial records of unknown file format, based on context (labels, folders where found). While these potentially duplicate paper records, the additional utility of manipulable financial data is very appealing.

*62 floppy disks from Richard Cohen Associates paper collection; potentially interesting files related to PR work done for various Jewish organizations.

*Born-digital media scattered throughout collections: floppies, a hard drive, and, e.g., a disk containing a case file in the Ethiopian Jewry collection – potential privacy issue?

While AV materials in mixed collections are tracked, there is no similar process for born-digital media found in paper collections; the survey was done by using likely search terms such as "disk" and "disc" and reviewing the results, as well as by asking long-time staff.
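A keyword survey of this kind could be sketched roughly as follows. This is only an illustration, not the process AJHS actually used: the record structure, the function name `find_media_mentions`, and every search term other than "disk" and "disc" are assumptions.

```python
import re

# "disk" and "disc" are the terms named in the survey; the others are
# hypothetical additions for illustration only.
MEDIA_TERMS = ["disk", "disc", "floppy", "diskette", "cd-rom"]

# Match whole words only (optionally plural), ignoring case.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in MEDIA_TERMS) + r")s?\b",
    re.IGNORECASE,
)

def find_media_mentions(records):
    """Return (record_id, matched_terms) for each record that mentions a media term.

    `records` is an iterable of (record_id, description) pairs, e.g. rows
    exported from a catalog or finding-aid database.
    """
    hits = []
    for record_id, description in records:
        matches = sorted({m.group(1).lower() for m in PATTERN.finditer(description)})
        if matches:
            hits.append((record_id, matches))
    return hits
```

The hits would still need manual review, as in the survey described above, since keyword matches only flag likely candidates.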

Discussion: Sampling and surveying. How do we know what we have in the collections? Do we transfer samples of large collections of floppies, for example, to try to figure out what they contain? Is it possible to change (or better apply) appraisal and accession policies to avoid taking legacy media whose contents are of little value?

Idea: Migrate-on-demand, similar to digitize-on-demand? Or, migrate media found in heavily used collections/series/folders?

LBI Archives:

*About 25 floppies from 8 collections, fairly well described. Also some USBs and DAT tapes. A sample of the floppies will be transferred, including disks from Werner Frank collection that likely contain GEDCOM files.

Survey was fairly straightforward, as media are removed from paper collections, cataloged, and added to the AV collection. The AV collection has become a catch-all and includes legacy born-digital media. Rename: AV and Digital Media collection?

Queens Library:

*Optical media (RW CDs with folklore records, CDs labeled "Roles", CDs with graphic design records from QCA photo collection); one floppy disk with WordPerfect files.

Survey conducted by searching keywords in Manuscripts Gateway; talking to manuscript and image archivists. In addition to materials they recalled, they suggested reviewing binders of accession inventories.

Discussion: Types of output – file/folder list as a first level? See below for tiered service discussion.

Idea: Start list of exotic media and formats uncovered by survey, possibly for sidebar in a publication?

LBI Library:

*Floppy disk, component of a published book. Potentially contains a dBaseIV database.

Concerns raised include: what is "obsolete"? How to address copyright concerns if making item accessible? More generally, how should a library handle files from such media?

Guggenheim Archives:

*Floppies, CDs, Zip disks. The latter possibly include records of interest such as publications, curatorial, and directorial files. Lack of intellectual control of legacy media is an issue. Priority: For purposes of this project, Zip disks, so as to have a wide array of legacy media.

Surveyed using database with keyword search; search of finding aids; inventory list of materials not found in database.

Idea: Controlled vocabulary for legacy media? Review of existing controlled vocabularies to see if any are useful?

Idea: Better description of legacy media at time of accession (e.g., not "box of disks" but "34 5.25-inch floppy disks with labels").

Idea: Include immediate transfer of optical media into regular processing workflows?

Transfer agreements and delivery expectations

METRO (Jefferson and Margo) presented a draft service agreement (attached). Emphasis on pilot nature of the process – how long do various transfer services take, what is scalable?

*Deliverables – potentially tiered service, depending on what is known of the contents?

Forensic disk image, logical disk images, or just files

File/folder lists

Format analysis – FITS + DROID reporting

DFXML

BitCurator content analysis for SSNs, PII, credit card numbers

Virus scan

Format migration to text or PDF

Photos of media

*Security

Insurance, shipping policies

*QA

What is a success?

How are errors and corrupt media handled?
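As an illustration of the simplest deliverable tier discussed above, a file/folder list could be produced along the following lines. This is a minimal sketch, not METRO's workflow: it assumes the media's files have already been copied or mounted to a directory, and the function names, CSV columns, and the choice of SHA-256 checksums are all assumptions made for the example.

```python
import csv
import hashlib
import os

def sha256_of(path, chunk_size=65536):
    """Compute a SHA-256 checksum by reading the file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Walk `root` and return one row per file: relative path, size, checksum."""
    rows = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit subfolders in a stable order
        for name in sorted(filenames):
            full_path = os.path.join(dirpath, name)
            rows.append({
                "path": os.path.relpath(full_path, root),
                "size_bytes": os.path.getsize(full_path),
                "sha256": sha256_of(full_path),
            })
    return rows

def write_manifest(rows, out_path):
    """Write the manifest rows to a CSV file (hypothetical column layout)."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["path", "size_bytes", "sha256"])
        writer.writeheader()
        writer.writerows(rows)
```

A list like this, recorded at transfer time, could also support the QA questions: checksums recomputed after delivery would show whether files arrived unchanged.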

Tasks and Next Meeting

*Review service agreement, continue to discuss deliverables by email, and provide feedback to METRO

*Final selection of media for transfer - no more than ~5 items per institution

*Future tasks: QA of transferred materials; workflows for born-digital material returned from METRO; publication of working group results.

*Next meeting @ METRO in summer (July?) to hand off media