Task Force on Vendor Statistics

Report to DACC

November 5, 2001

Members of the Task Force:

Kristin Stoklosa, Chair

Ivy Anderson

Elaine Fadden

Jeff Kosokoff

Heather McMullen


Executive Summary

This report presents the investigations and recommendations of the Task Force on Vendor Statistics, which was initiated in May 2001 by the Digital Acquisitions and Collections Committee (DACC). As charged, the Task Force reviewed usage statistics available from providers of HOLLIS electronic resources with the following outcomes: recommendations for the appropriate use of vendor statistics in HOLLIS resource monitoring and evaluation, prioritization of resources for usage statistics evaluation purposes, identification of vendors and resources for which statistics are desired but presently unavailable, identification of desired improvements in the statistics available from electronic resource providers, and recommendations for distributing information on vendor reports to the Harvard library community.

The Task Force analyzed sample reports from major vendors and apprised itself of projects undertaken by library groups to develop best practices for usage reporting. As a result, the Task Force defined, as a standard, a set of major data elements that should be included in usage reports provided by vendors. It is important to realize that basic data are often not conveyed in usage reports, and that librarians need to proactively request not only the production of vendor reports when unavailable but also the presentation of meaningful, standard data elements in reports currently received.

The principal observation of the Task Force is that reports provided to institutions by vendors are disparate in format and content.

·  Across vendors, usage statistics are inconsistent in the types of data reported.

·  Within each report, vendors often do not define what is being counted.

·  Even when defined, vendors use different baselines for counting data elements.

·  Various genres of resources require different data elements to convey resource use.

·  Report availability varies in terms of each report’s file format and delivery mechanism.

For these reasons, the Task Force observed that limited comparability is possible across vendor reports.

Locally gathered statistics are the best measures for determining matters of user identity, whether by faculty as currently analyzed by OIS or at more granular levels. It is important for the institution to retain a method of self-measurement: inconsistencies in vendor reports, such as missing data, and vendor interest in subscription revenue render the complete accuracy of vendor statistics an unsafe assumption. Usage counts gathered by OIS can serve as benchmarks for analysis of any discrepancies between OIS statistics and logins reported by vendors.

The Task Force recommends the stewardship structure developed by the Committee on Electronic Resources and Services (COERS) as the most appropriate context for the ongoing monitoring and analysis of usage reports, in tandem with the continuous resource evaluation and maintenance responsibilities of the stewards. In this report, the Task Force describes a suggested mechanism for centrally sharing information on vendor statistics, consisting of narrative analysis of observed trends in actual usage reports.

The fundamental conclusions of the Task Force are that librarians need to advocate with vendors for the inclusion of meaningful data in vendor reports and that the continual evaluation of usage reports is extremely important from the following subscription, programmatic, and collection standpoints.

·  Regarding subscriptions, analysis of vendor reports is one method for evaluating the use of the institution’s online investments: it can aid in determining the cost-benefit of online resources, provide a basis for communicating with vendors about renewal prices, and inform decisions on purchasing new resources and adjusting simultaneous user licenses.

·  Instruction programs could benefit from usage reports in the identification of both underutilized and heavily used resources needing emphasis in instruction.

·  Online materials bear strong relationships to print collections, and information on the usage of electronic formats can assist in coordinated collection building.

Report of the Task Force on Vendor Statistics

Charge and Background

The Task Force on Vendor Statistics was formed in May 2001 by the Digital Acquisitions and Collections Committee (DACC) with the following charge.

As a subgroup of COERS, the Vendor Statistics Task Force will work with the Coordinator for Digital Acquisitions to review the usage statistics available from providers of HOLLIS electronic resources and make recommendations in the following areas.

·  Recommend appropriate use of vendor statistics in the ongoing monitoring and evaluation of HOLLIS resources, including their relationship to usage statistics captured by OIS.

·  Prioritize vendors and resources for usage statistics evaluation purposes, based on criteria such as cost, importance to university curricula and research programs, and other evaluative measures.

·  Identify vendors and resources for which statistics are desired but presently unavailable.

·  Identify desired improvements in the statistics available from electronic resource providers, with particular reference to the ICOLC Guidelines, the ARL E-Metrics project, and other efforts to develop standards in this area.

·  Recommend how best to distribute or otherwise make vendor statistics available to the Harvard library community.

The Task Force commenced by analyzing reports from a sampling of twenty-four major HOLLIS vendors and resources, collating data elements generally desirable in usage reports and identifying vendors for which reports are unavailable or not yet readily available. The group also consulted with Ben Noeske, Systems Librarian, Office for Information Systems, on the types of information represented in the Access logs used to generate the statistics on HOLLIS resources that OIS produces. This report reviews and answers each section of the charge and is followed by an Appendix consisting of analyses of vendor reports examined by the Task Force.

An initial, basic observation is that vendors exhibit a wide diversity of reporting formats and types of data reported; therefore, the data provided are not always comparable across vendors. In November 1998, the International Coalition of Library Consortia (ICOLC) published guidelines, revised in December 2001, that identified important data elements for inclusion in usage reports, and some vendors have attempted to follow this standard. In addition, different data types illuminate different genres of resources; for example, the number of full-text articles viewed is appropriate for online journals but not for indexes. Therefore, rather than attempting to formulate a comparative analysis and promulgate a set standard, the Task Force concentrated on descriptive analysis of reports for major resources in order to formulate observations and conclusions in response to its charge.

Part I: Recommend appropriate use of vendor statistics in the ongoing monitoring and evaluation of HOLLIS resources, including their relationship to usage statistics captured by OIS.

Analysis of usage reports from vendors can serve purposes related to licensing and cost assessment, user behavior, and collection development. Information on usage can inform annual renewal decisions, track possible correlation between resource development and usage, provide a basis for comparison with competing products or interfaces which might have become available during the subscription year, and identify needs for resource enhancements.

Usage reports can be interpreted to assist in pricing assessment and license monitoring. By enabling the determination of cost per use or session, login data can provide a baseline for assessing the comparative value of resources. In this way, resources with different characteristics become comparable; for example, large or popular journals can be compared with journals with fewer articles available. To verify license fulfillment, the reports should provide information on the continuous online availability of resource units and titles. Any information provided in reports on the amount of downtime experienced can enable the monitoring of allowable contracted downtime. If this type of information is not explicit in reports, examination of daily and hourly data, if provided, may enable deductions about downtime for further inquiry with the vendor. Data on time of use may also inform the scheduling of user assistance in alternative reference environments such as electronic or telephone reference and may also assist OIS in scheduling system downtime.
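The cost-per-session comparison described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `ResourceUsage` record, the `cost_per_session` helper, and all figures are hypothetical and do not come from any vendor report or from OIS data.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResourceUsage:
    """Hypothetical record pairing a resource's cost with its reported usage."""
    name: str
    annual_cost: float  # subscription cost for the period, in dollars
    sessions: int       # logins or sessions reported for the same period


def cost_per_session(resource: ResourceUsage) -> Optional[float]:
    """Return annual cost divided by sessions, or None if no sessions were recorded."""
    if resource.sessions <= 0:
        return None
    return resource.annual_cost / resource.sessions


# Invented figures, for illustration only.
resources = [
    ResourceUsage("Journal package A", 12000.0, 4800),
    ResourceUsage("Index B", 3000.0, 600),
    ResourceUsage("Database C", 5000.0, 0),
]

for r in resources:
    value = cost_per_session(r)
    label = "no recorded sessions" if value is None else f"${value:.2f} per session"
    print(f"{r.name}: {label}")
```

A resource with no recorded sessions is reported separately rather than assigned a cost, since a zero count may reflect missing data rather than a lack of use.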

Usage reports may also provide information helpful to the promotion of particular resources to users. Resources in relevant subject areas with low demonstrated usage could be targeted for faculty communication to promote awareness and to determine the value of the resource. Alternatively, high usage can assist in identifying resources for instructional programs.

In terms of collection development, usage can inform selection and deselection of print and online journals. Comprehensive journal packages in which Harvard licenses access to all titles available from the publisher, such as Wiley InterScience and Kluwer Academic Publishers, present demand-driven access environments. Usage reports can measure this activity and provide a picture of user needs independent of the limitations of print collections. Selective journal packages, in which Harvard licenses a selection of titles based on print holdings, can be analyzed in terms of turn-aways, or user attempts to access journals which are not licensed, to determine interest in additional titles and adjust selections in the suite of journals licensed. Usage comparisons of titles held and not held in print can inform retention decisions for both print and online collections.
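As a rough illustration of the turn-away analysis for selective journal packages, the sketch below ranks unlicensed titles by the number of turn-aways reported. The function name `titles_to_consider`, the `minimum` threshold, and the sample counts are assumptions made for this example; actual vendor reports may present turn-aways differently or not at all.

```python
def titles_to_consider(turn_aways, licensed_titles, minimum=10):
    """Rank unlicensed titles by turn-away count, highest first.

    turn_aways: dict mapping journal title -> number of turn-aways reported
    licensed_titles: set of titles already licensed
    minimum: hypothetical cut-off below which a title is ignored
    """
    candidates = [
        (title, count)
        for title, count in turn_aways.items()
        if title not in licensed_titles and count >= minimum
    ]
    # Sort by descending count, then by title so ties have a stable order.
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Invented counts, for illustration only.
reported = {"Journal W": 50, "Journal X": 42, "Journal Y": 3, "Journal Z": 17}
licensed = {"Journal W"}

for title, count in titles_to_consider(reported, licensed):
    print(f"{title}: {count} turn-aways")
```

The resulting list is a starting point for selectors, not a decision rule; the threshold would need to be set by those who know the collection.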

Relation to OIS Data Collection

The present method of ongoing usage analysis by OIS is valuable because, unlike vendors, whose reporting emphasizes user behavior, OIS is able to collect information pertaining to user agent identity. Currently, OIS usage reports consist of the number of user logins or sessions through HOLLIS resources by faculty.

Vendor reporting, on the other hand, is useful because it can be analyzed to reveal general patterns of use among patrons, such as time and manner of use. Harvard’s proxy environment, however, precludes any benefit from vendor reporting of usage by IP subnet: although most vendor reports do not provide information on IP addresses, any such information that vendors could provide would consist only of proxy server IP addresses.

In addition to user identification information, it is important for the institution to retain a system of self-measurement for its own usage of resources: vendor interest in subscription revenue as well as inconsistencies in reports, such as missing data, render the complete accuracy of vendor statistics an unsafe assumption (please see Web of Science in the Appendix for an example of missing daily usage counts). Locally gathered OIS reports can serve as benchmarks for analysis of any discrepancies between OIS statistics and logins reported by vendors. Moreover, for vendors that are not providing usage reports, OIS statistics provide the only measure of usage.
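The use of OIS counts as a benchmark can be sketched as a simple month-by-month comparison. This is a minimal sketch under stated assumptions: the function `flag_discrepancies`, the 10 percent tolerance, and the monthly figures are hypothetical, and the two sources may not count logins in the same way, so some difference is to be expected.

```python
def flag_discrepancies(ois_logins, vendor_logins, tolerance=0.10):
    """Flag months where the vendor count is missing or diverges from the OIS count.

    ois_logins, vendor_logins: dicts mapping a month label to a login count
    tolerance: hypothetical allowed relative difference, measured against the OIS count
    """
    flagged = {}
    for month, ois_count in sorted(ois_logins.items()):
        vendor_count = vendor_logins.get(month)
        if vendor_count is None:
            flagged[month] = "missing from vendor report"
            continue
        if ois_count == 0:
            # No OIS benchmark to divide by; flag only if the vendor reports use.
            if vendor_count != 0:
                flagged[month] = "vendor reports use but OIS recorded none"
            continue
        difference = abs(vendor_count - ois_count) / ois_count
        if difference > tolerance:
            flagged[month] = f"differs by {difference:.0%}"
    return flagged


# Invented counts, for illustration only.
ois = {"2001-07": 1000, "2001-08": 1200, "2001-09": 1500}
vendor = {"2001-07": 1040, "2001-09": 1100}

for month, reason in flag_discrepancies(ois, vendor).items():
    print(f"{month}: {reason}")
```

Flagged months are candidates for inquiry with the vendor rather than evidence of error in either source.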

OIS generates its usage statistics for HOLLIS portal resources from the Access service, so the Task Force examined sample Access logs to observe the information gathered. Access logs record a number of types of information, such as the IP address of the requesting user agent, the target resource, the time of the request, the authentication method and authentication scheme, the faculty code, the patron category, the browser and operating system of the user, and the referring page URL. Patron categories are recorded for transactions authenticated by PIN but not for in-library devices. IP addresses and faculty codes are recorded for all accesses, both in-library devices authenticated by IP and requests authenticated by ID. In the former case, the in-library device maps to the faculty code of the library.

The Task Force considered whether additional useful information could be deduced or inferred from the Access logs. Presently, OIS reports faculty information for FAS at a broad level according to the faculty code for “Arts & Sciences”; the faculty codes do not convey more granular departmental or library affiliation. The reporting of department-level statistics would be of value to libraries in assessing usage of online subscriptions. Further investigation with FAS would be necessary to determine whether it is possible to map IP addresses to specific physical regions, buildings, or departments.

Sampling upon request for specific library purposes would be an appropriate means of gathering granular data inferable from IP addresses. Librarians interested in examining use from particular IP addresses and for specific resources can request from OIS copies of the Access logs for specific time periods to examine levels of access. However, the collection of data associated with IP addresses on any more general or systematic basis is not advisable because the body of substantive information would be incomplete. For example, individual libraries within HCL cannot be matched to specific IP addresses because they share a pool of dynamically assigned IP addresses. Furthermore, remote transactions via Internet service providers reflect the IP addresses of the ISP, which do not identify the user’s relationship to Harvard.

Part II: Prioritize vendors and resources for usage statistics evaluation purposes, based on criteria such as cost, importance to university curricula and research programs, and other evaluative measures.

The vendors and resources selected for priority evaluation over the next year meet criteria based on high online cost, importance to interdisciplinary research interests, low quality of current usage reports, mode of licensing related to print collections, and simultaneous use restrictions incorporated into access modes. Although this report highlights specific resources for priority treatment, the Task Force emphasizes that any HOLLIS resource may merit prioritized attention at certain junctures. Appropriate times for usage analysis of any resource include license renewal, vendor sales promotion for new components, alternative platform availability, observation of low usage in OIS logs and vendor reports, and relevance to reference and instruction services.

The Task Force has identified a number of cross-disciplinary databases and journal packages for prioritized evaluation.

Major Cross-Disciplinary Databases

·  Web of Science

·  Economist Intelligence Unit

·  Ovid databases

Web of Science was selected for prioritization due to its broad interdisciplinary relevance, high one-time and annual costs, concurrent user license, and cited reference search service. Its unique citation indexes have broad relevance across the sciences, social sciences, and humanities. As an enhancement to concurrent user licensing, ISI has reduced its connection time-out feature to ten minutes of inactivity and has implemented an extended session time-out, enabling a user to re-establish the same session within an hour of disconnection if a user port is available. With its simultaneous user parameters and the option to purchase additional users, it is important to monitor the system performance metrics provided in the vendor’s reports, such as turn-aways, length of session, and maximum actual concurrent users. However, user behavior measurements are totally absent from its reports, so Web of Science is also prioritized for the purpose of report improvement. Please see Part III and the Appendix for more information on desired improvements for Web of Science reporting.