Authors: Alex Addyman and Lara Whitelaw 26/06/2013

Library Services

OUDL-Stellar metadata implementation

Author / Alex Addyman and Lara Whitelaw
Document no. (if applicable)
Publication Date / 26/06/2013
Version no.
Status / Draft
Confidentiality / Public
Location (inc. Livelink link)
Last saved / 2013-06-26 (note – this is an automated field)

Contents

Summary Table

Content Specific Metadata

Main Texts

Collections Metadata

Supplementary Texts

Time-Based Media

Web Pages

Figure 1 - Source metadata from Voyager

Figure 2 - MARCXML to MODS and DC

Figure 3 - Legacy collections metadata example

Figure 4 - Portfolio Record

Summary Table

Content type / Metadata
Audio / Generic: OAI-DC; Audio Metadata: EBUCore mapped to the Media Ontology; Preservation: PREMIS
Collection / Generic: OAI-DC; Collections Metadata: DCCAP or MODS (mapped to EAD)
Main texts / Generic: DC, RELS-INT, RELS-EXT; Digital Resource: MODS; Preservation: PREMIS
Moving image or video / Generic: OAI-DC; AV Metadata: EBUCore mapped to the Media Ontology; Preservation: PREMIS
Still image / Generic: OAI-DC; Image Metadata: VRA Core, MIX; Preservation: PREMIS
Supplementary text / Generic: OAI-DC; Digital Resource: MODS; Preservation: PREMIS
Web page / Generic: DC, RELS-INT, RELS-EXT; Webpage Metadata: MODS; Preservation: PREMIS

Content Specific Metadata

Main Texts

The Open University Archives has digitised a range of print study materials which formed part of the S100 Science Foundation Course during its early years. These include the Main Texts, which were sent out to students to guide them through the course, as well as supplementary material such as course guides, workbooks and Tutor Marked Assignments. For the purposes of metadata, the Main Texts are treated separately from the supplementary texts.

Sources of legacy metadata

Every Main Text digitised will have an existing record in our Voyager Library Management System, encoded in the MARC 21 format. MARC field usage varies from record to record in our sample collection, but the example in Figure 1 is fairly typical.

Figure 1 - Source metadata from Voyager

MARC Code / MARC Field / Example
$001 / Control Number / 206749
$005 / Date and time of latest transaction / 20070716165119.0
008/30-31 MU (code j) / Fixed length data fields / 010508s1971 enka 000 0 eng
$008/15-17
008/07-10
008/07-11
$020 a / ISBN / 0335020321
$082 a / DDC Number / 500
$110 a / Corporate name / Open University S100/Unit 14
$245 a / Title statement / The chemistry and structure of the cell
$260 a / Place of publication / Milton Keynes
$260 b / Name of publisher / Open University
$260 c / Date of publication / 1971
$500 a / General note / Unit 14 of S100 Science foundation course
$650 a / Subject - topical / Biochemistry; Cells; Cell physiology
$650 x / Subject - general subdivision / History
$740 a / Uncontrolled Related/Analytical Title / Science foundation course
$842 a / Textual physical form designator
$852 b / Location
$852 n / Sublocation

In order to assess the suitability of MODS we took the MARC Voyager records and examined which MARC fields were most commonly used across our sample content. We then cross-walked this against MODS as shown in Figure 2. We also cross-walked against simple Dublin Core as this is a core requirement of OAI harvesting.

Figure 2 - MARCXML to MODS and DC

MARC Code / MARC Field / MODS field / DC field
$001 / Control Number / <recordIdentifier> / Identifier
$005 / Date and time of latest transaction / <recordChangeDate> with encoding="iso8601" / No crossover
008/30-31 MU (code j) / Fixed length data fields / <language><languageTerm> / Language
$008/15-17 / Fixed length data fields / <place><placeTerm> with type="code" and authority="marccountry" / No crossover
008/07-10 / Fixed length data fields / <dateIssued> with encoding="marc" / Date
008/07-11 / Fixed length data fields / <dateCreated> with encoding="marc" / Date
$020 a / ISBN / <identifier type="isbn"> / Identifier
$082 a / DDC Number / <classification authority="ddc"> / Subject
$110 a / Corporate name / <name type="corporate"> / Contributor
$245 a / Title statement / <titleInfo><title> / Title
$260 a / Place of publication / <place><placeTerm> with type="text" / Publisher
$260 b / Name of publisher / <publisher> / Publisher
$260 c / Date of publication / <dateIssued> / Date
$500 a / General note / <note> with type=appropriate name assigned / Description
$562 a / Copy and Version Identification Note / <note> with type="version identification" / No crossover
$562 b / Copy identification / <note> with type="version identification" / No crossover
$650 a / Subject - topical / <subject authority=" "> / Subject
$650 x / Subject - general subdivision / <topic> / Subject/Coverage
$740 a / Uncontrolled Related/Analytical Title / <titleInfo type="alternative"><title> / No crossover
$842 a / Textual physical form designator / Format
$852 b / Location / <physicalLocation> / No crossover
$852 n / Sublocation / <shelfLocator> / No crossover

MODS is derived from MARC 21, so it is no surprise that it is both granular enough and semantically close enough to map to every MARC field from our legacy metadata. Dublin Core alone does not offer the complexity needed to fully represent our legacy metadata.
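The crosswalk in Figure 2 can be illustrated with a short Python sketch. It is not the project's transformation: it assumes the Voyager record has already been flattened into a dictionary keyed by "tag subfield" (a hypothetical format chosen for illustration), covers only a subset of the rows in Figure 2, and uses the example values from Figure 1. The originInfo wrapper and the placement of $650 a inside subject/topic follow the general MODS structure rather than anything stated in the figure.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"

# Example values from Figure 1, keyed by "tag subfield" (hypothetical key format).
marc_record = {
    "001": "206749",
    "020 a": "0335020321",
    "082 a": "500",
    "245 a": "The chemistry and structure of the cell",
    "260 a": "Milton Keynes",
    "260 b": "Open University",
    "260 c": "1971",
    "500 a": "Unit 14 of S100 Science foundation course",
    "650 a": ["Biochemistry", "Cells", "Cell physiology"],
}

# Subset of Figure 2: MARC key -> simple Dublin Core element.
MARC_TO_DC = {
    "001": "identifier",
    "020 a": "identifier",
    "082 a": "subject",
    "245 a": "title",
    "260 a": "publisher",
    "260 b": "publisher",
    "260 c": "date",
    "500 a": "description",
    "650 a": "subject",
}


def values(record, key):
    """Return the values stored under a MARC key as a list (empty if absent)."""
    value = record.get(key, [])
    return value if isinstance(value, list) else [value]


def to_dc(record):
    """Collect simple DC values; several MARC fields may feed one DC element."""
    dc = {}
    for key, element in MARC_TO_DC.items():
        for value in values(record, key):
            dc.setdefault(element, []).append(value)
    return dc


def to_mods(record):
    """Build a small MODS tree covering some of the rows in Figure 2."""
    def q(name):
        return f"{{{MODS_NS}}}{name}"

    mods = ET.Element(q("mods"))
    for title in values(record, "245 a"):
        title_info = ET.SubElement(mods, q("titleInfo"))
        ET.SubElement(title_info, q("title")).text = title
    for isbn in values(record, "020 a"):
        ET.SubElement(mods, q("identifier"), {"type": "isbn"}).text = isbn
    for ddc in values(record, "082 a"):
        ET.SubElement(mods, q("classification"), {"authority": "ddc"}).text = ddc
    # Place, publisher and date share one <originInfo> wrapper in MODS.
    origin = ET.SubElement(mods, q("originInfo"))
    for place in values(record, "260 a"):
        place_el = ET.SubElement(origin, q("place"))
        ET.SubElement(place_el, q("placeTerm"), {"type": "text"}).text = place
    for publisher in values(record, "260 b"):
        ET.SubElement(origin, q("publisher")).text = publisher
    for date in values(record, "260 c"):
        ET.SubElement(origin, q("dateIssued")).text = date
    for topic in values(record, "650 a"):
        subject = ET.SubElement(mods, q("subject"))
        ET.SubElement(subject, q("topic")).text = topic
    return mods


if __name__ == "__main__":
    ET.register_namespace("mods", MODS_NS)
    print(to_dc(marc_record))
    print(ET.tostring(to_mods(marc_record), encoding="unicode"))
```

The sketch also shows why simple DC loses detail: the control number and the ISBN both collapse into Identifier, and place and publisher both collapse into Publisher, whereas MODS keeps them as distinct elements.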

Collections Metadata

The large and varied content which will make up OUDL requires a clear and easy-to-navigate hierarchical structure. To achieve this, collections and sub-collections will be developed to group similar content together.

Collections Characteristics

As the development of the OUDL is an iterative process, and as the number of digital items suitable for inclusion in the repository will no doubt grow with time, there are no fixed collections as yet. However, the following collection themes have been identified; their titles and contents are subject to change.

OU Study Materials Archive

Contents:

·  All the study materials (also called learning materials/course materials/teaching materials) in all formats. "Study Materials" was the preferred phrase when discussed with the Learning and Teaching team a while ago and is now the official title of the collection on the website etc.

OU Life Collection

Contents:

·  OU Historical Images (non-teaching - this could be complicated as our images are jumbled together at the moment - teaching and historical)

·  OU Historical broadcasting (Open Forum TV and radio, other OU "magazine" programmes)

·  OU Vice-Chancellor's speeches (need to talk to the VC's office about these)

·  Other collections such as Sesame, Open House and the OU web archive of social/comms material could also go in here.

OU Learning Journey Collection

This collection title would be consistent with the OU/Agreement and the title now given to this material by OMU.

Sources of legacy metadata

Collections-level metadata for digital content at the OUDL is non-existent. The closest thing we can draw on is Encoded Archival Description (EAD) metadata used to describe physical archive collections such as the Jennie Lee Collection and the Walter Perry Collection (Open University, 2006) (see Figure 3). These are important to map because they contain useful identifier metadata and because they may be preserved in the OUDL in the future.

Figure 3 - Legacy collections metadata example

Element name / Example
Reference / GB/2315/WP
Held.at / The Open University Archive
Dates of Creation / 1926 - 2003
Physical Description / 225 files
Name of creator / Walter Perry created the collection.
Title / The Walter Perry Collection
Sub-title
Author / Finding aid compiled by Miss Ruth Cammies
Publication / The Open University Library 2006 The Open University, Walton Hall, Milton Keynes, MK7 6AA, Tel: 01908 653378
Edition / 1st Edition
Creation / Finding aid encoded in EAD (Encoded Archival Description) 2002 using Altova XMLSpy by Miss Ruth Cammies, Open University Archivist, Mrs Julie Vavangas, Archive Assistant, Miss Georgina Parsons, Archive Assistant. Initial catalogue of the first deposit compiled by Beveley Hunt, Archivist, 2001.2006
Descriptive Rules / This finding aid has been created using the ISAD(G) 2nd Edition (International Standard of Archival Description (Generalised)) and Encoded Archival Description (EAD).
Language usage / Finding aid written in English
scope and content
biographical history
arrangement / The second deposit arrived at the Open University Archive with no structure or filing system. The structure has therefore been artificially created to aid access. Individual files have not been split unless clearly stated in the item record. WP/1 The Open University; WP/2 Other Educational Work; WP/3 Papers regarding Health, Science and the Environment; WP/4 Personal Files and Interests
access guidelines / To access the collection contact the Open University Archivist. All items will be monitored for personal or sensitive information before they are released to researchers. The Archivist reserves the right to restrict access if necessary. All researchers will be required to complete an access/data protection/copyright form. Access to some of the papers within the collection is restricted under the principles of the Data Protection Act 1998.
copying restrictions / Reproduction of items from the collection will be permitted according to copyright legislation and Open University Library policy.
immediate source of acquisition / The first deposit of material was largely an internal transfer of papers from the Vice Chancellor's Office of the Open University. The second deposit was transferred from the Edinburgh Regional Office of the Open University. A small selection of materials were donated by Lady Perry.
custodial history / The first deposit was transferred to the University Library by Walter Perry. The second deposit was transferred from Walter Perry's office within the Edinburgh Regional Centre for the Open University after his death in 2003. A small selection of files was also donated by Lady Perry at this time.
archivist's note / This finding aid was created in 2006.
related material
subjects / Education, Higher; Distance Education; Medicine Research; Broadcasting
subject.personal.names / Perry Walter 1921 - 2003 Lord Perry of Walton; Lee Baroness Jennie 1904 - 1988 MP; Wilson Baron of Rievaulx James Harold 1916 - 1995 Statesman; Goodman Baron Arnold Abraham 1913 - 1995 Lawyer
subject.corporate.names / Labour Party Great Britain; The Open University Great Britain
subject.geographical.names / Milton Keynes England; Edinburgh Scotland; Dundee Scotland

Dublin Core Collections-level Application Profile

The Dublin Core Collections-level Application Profile (Dublin Core Metadata Initiative, 2007) is one of the most widely used collection-level profiles. As it draws on Dublin Core standards, it is highly interoperable with other schemas and libraries. Like the simple DC schema it is somewhat limited in scope, but at the collections level this is less of an issue because collection descriptions should be very brief. DC-CAP draws on the functional model and element set defined by RSLP (UKOLN, 2000).
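To give a sense of how brief such a description can be, the following Python sketch turns a few of the legacy finding-aid values from Figure 3 into a flat collection-level description. The mapping table is an assumption made for illustration, not an agreed OUDL crosswalk: it uses only general Dublin Core property names (dc:identifier, dc:title, dcterms:extent, dc:subject, dcterms:spatial) plus a fixed dc:type of "Collection", and it reports any legacy label that has no mapping so that it can be reviewed by hand.

```python
# Values taken from the Walter Perry Collection finding aid in Figure 3.
legacy_collection = {
    "Reference": "GB/2315/WP",
    "Title": "The Walter Perry Collection",
    "Held.at": "The Open University Archive",
    "Dates of Creation": "1926 - 2003",
    "Physical Description": "225 files",
    "subjects": [
        "Education, Higher",
        "Distance Education",
        "Medicine Research",
        "Broadcasting",
    ],
    "subject.geographical.names": [
        "Milton Keynes England",
        "Edinburgh Scotland",
        "Dundee Scotland",
    ],
}

# Hypothetical mapping from finding-aid labels to Dublin Core properties.
LEGACY_TO_DC = {
    "Reference": "dc:identifier",
    "Title": "dc:title",
    "Physical Description": "dcterms:extent",
    "subjects": "dc:subject",
    "subject.geographical.names": "dcterms:spatial",
}


def describe_collection(legacy):
    """Return (description, unmapped_labels) for one legacy collection record."""
    # Every collection-level description is typed as a Collection.
    description = {"dc:type": ["Collection"]}
    unmapped = []
    for label, value in legacy.items():
        prop = LEGACY_TO_DC.get(label)
        if prop is None:
            # No mapping defined in this sketch: flag for manual review.
            unmapped.append(label)
            continue
        items = value if isinstance(value, list) else [value]
        description.setdefault(prop, []).extend(items)
    return description, unmapped


if __name__ == "__main__":
    description, unmapped = describe_collection(legacy_collection)
    for prop, items in description.items():
        print(prop, "=", "; ".join(items))
    print("Needs review:", ", ".join(unmapped))
```

Labels left unmapped here, such as "Held.at" and "Dates of Creation", would need a decision on which profile property, if any, should carry them.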

Supplementary Texts

Accompanying every Main Text produced for an Open University Unit is a series of Supplementary Texts. These include the following types[1]:

·  Assessment

·  Assignment

·  Computer Marked Assignment (CMA)

·  Calendar

·  Case Study

·  Companion

·  Computing Guide

·  End of Course Assessment (ECA)

·  Files

·  Glossary

·  Handbook

·  Media Notes

·  Module Guide

·  Musical Scores

·  Portfolio

·  Readings

·  Student Marked Assessment (SMA)

·  Specimen Exam Paper

·  Study File

·  Study Guide

·  Tutor Marked Assignment (TMA)

·  Work Book

Sources of legacy metadata

Historically, this supplementary material has not been catalogued in the way that Main Texts have been through Voyager MARC records. There are, however, two sources of metadata we can draw on. The first is PLANET, the Open University’s central planning system. Within PLANET every item produced for a course is recorded in an inventory. The metadata is minimal but will at least allow us to identify titles, identifiers and, crucially, which presentation of a course each item belongs to.

Some courses will also have more in-depth metadata profiles in Portfolio, a digital asset management system. Roughly 30% of items within Portfolio (which itself only contains items from a small proportion of courses produced by the OU) have detailed IEEE LOM records (Figure 4). Where available, these will be ported directly into OUDL.

Figure 4 - Portfolio Record

Selection of Metadata Standards

As a number of items are already described in the IEEE LOM format, it initially made sense to retain this format for all of our Supplementary Texts. LOM also appeared to be the most appropriate standard because the items in question are more clearly learning objects, whereas the Main Texts are closer to traditional bibliographic texts. We mapped the Portfolio elements back to their original LOM structure and used this to map the PLANET fields, creating a consistent profile.

Given that the PLANET metadata was not intended to be used for resource discovery, some fields are difficult to map, particularly subject fields. Where possible and appropriate, we may be able to draw these fields from the module to which the supplementary item belongs. For more granular subject descriptions, however, manual cataloguing may be required.

When we started to look at how we could transform these materials to Linked Data for the STELLAR project, we found that IEEE LOM was not available in RDF. We considered using the Learning Resource Metadata Initiative (LRMI) specification instead, which is the schema.org-based standard that will replace IEEE LOM. Although LRMI will be useful for sharing our metadata with others, we found that, like OAI-DC, it is not suitable for internal repository use. We have therefore decided to use MODS for supplementary resources as well as main texts, and to map this profile to LRMI for sharing externally.
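A rough idea of what mapping the MODS profile to LRMI could look like is sketched below in Python. Everything specific in it is an assumption: the MODS record is a made-up supplementary text (only the genre value comes from the list of types above), and the choice of schema.org properties (name, learningResourceType, datePublished, about, publisher) is one plausible selection rather than the agreed OUDL mapping.

```python
import json
import xml.etree.ElementTree as ET

NS = {"m": "http://www.loc.gov/mods/v3"}

# Hypothetical MODS record for a supplementary text; the title and values are
# illustrative and are not taken from PLANET or Portfolio.
MODS_XML = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Example assignment booklet</title></titleInfo>
  <genre>Tutor Marked Assignment (TMA)</genre>
  <originInfo>
    <publisher>Open University</publisher>
    <dateIssued>1971</dateIssued>
  </originInfo>
  <subject><topic>Biochemistry</topic></subject>
  <subject><topic>Cells</topic></subject>
</mods>"""


def mods_to_lrmi(mods_xml):
    """Map a few MODS elements to schema.org/LRMI properties (assumed mapping)."""
    root = ET.fromstring(mods_xml)

    def first(path):
        element = root.find(path, NS)
        return element.text if element is not None else None

    resource = {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": first("m:titleInfo/m:title"),
        "learningResourceType": first("m:genre"),
        "datePublished": first("m:originInfo/m:dateIssued"),
        "about": [t.text for t in root.findall("m:subject/m:topic", NS)],
    }
    publisher = first("m:originInfo/m:publisher")
    if publisher:
        resource["publisher"] = {"@type": "Organization", "name": publisher}
    # Drop properties for which the MODS record had no value.
    return {k: v for k, v in resource.items() if v not in (None, [])}


if __name__ == "__main__":
    print(json.dumps(mods_to_lrmi(MODS_XML), indent=2))
```

Keeping MODS as the internal record and generating the LRMI view on demand means the richer internal description is never reduced to what the sharing format can express.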

Time-Based Media

The OUDL will include a significant amount of video and audio content which has been made available digitally through this and previous projects. The most notable of these was the Access to Video Assets (AVA) project, which sought to “address the increasing demand for exploitation of The Open University’s rich media legacy assets” (Open University Library Services, 2008). The key output of AVA was the development of VideoFinder, a centralised repository and catalogue for OU video (which has since been extended to include audio assets).