MXI Meta-Xceed, Inc. / STANDARD OPERATING PROCEDURE / Document No.: 20-0413_00
Rev.: 0.5
Title: SAS Data Flow

I.  PURPOSE/SCOPE

The purpose of this standard operating procedure is to give an overall description of the flow of data as it applies to the use of SAS at MXI. SAS is the primary analysis and reporting tool used by Biometry to conduct analysis and create summary tables, listings, and graphs.

The flow of data is somewhat complex and encompasses several distinct activities. This document gives a high-level description of the flow and refers to several Working Practice Guidelines (WPGs) and SOPs for the details of specific activities.

II.  RESPONSIBILITY

All members of the Biometry Group are responsible for conducting their work in accordance with this procedure and with the Working Practice Guidelines referred to by this document.

III.  REFERENCES

·  Biometry Analysis Directory Structure WPG

·  DF2SAS WPG

·  Electronic Data Transfer WPG

·  SAS Data Standards WPG

·  SOP (20-0412_00) SAS Program Life Cycle

·  SOP (20-0410_00) SAS Program Risk Assessment Validation and Verification

·  SOP (20-0411_00) SAS Task Archival

IV.  DEFINITIONS

·  SAS – The statistical software package used by Biometry to perform statistical analysis.

·  Task – A set of related programs and outputs, usually for a single purpose. See the Biometry Analysis Directory Structure WPG for the definition and use of project tasks.

·  Dataset – A collection of related data values in an organized structure.

V.  MATERIALS

(N/A)

VI.  SAFETY NOTES

(N/A)

VII.  PROCEDURES

Data Flow, as it pertains to SAS, encompasses the sources of the data and the processes that change or manipulate the data to produce the final outputs.

1.  New Task

1.1.  A request for an analysis will often generate a new task. A task represents a set of related work that is usually to be completed for a specific purpose. An example would be an interim analysis for a clinical trial.

1.2.  Data and programs are organized in a hierarchical fashion on the SAS server such that the data and programs necessary to complete a task are grouped together in one logical area; see the Biometry Analysis Directory Structure WPG for further details.

2.  Data Sources

2.1.  Data files for a task may be received in a variety of formats, and may come from either internal or external sources.

2.1.1.  Data Formats: Data are accepted in any of the following formats:

·  SAS datasets, either in native SAS format or SAS transport files.

·  ASCII text including fixed format, comma separated or otherwise delimited.

·  Excel or other spreadsheet data.
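As an illustration only, the format categories in 2.1.1 could be recognized from a file's extension when source data arrive. The sketch below is written in Python rather than SAS and is not part of the MXI procedure. The extension table and the function name classify_source_file are assumptions made for this example.

```python
from pathlib import Path

# Hypothetical extension table: maps common file extensions to the
# format categories listed in 2.1.1. It is an assumption for this
# sketch, not a list defined by the SOP.
FORMAT_BY_EXTENSION = {
    ".sas7bdat": "SAS dataset (native)",
    ".xpt": "SAS dataset (transport)",
    ".csv": "ASCII text (delimited)",
    ".txt": "ASCII text",
    ".dat": "ASCII text",
    ".xls": "spreadsheet",
    ".xlsx": "spreadsheet",
}


def classify_source_file(filename: str) -> str:
    """Return the 2.1.1 format category for a file name, or 'unrecognized'."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    return FORMAT_BY_EXTENSION.get(suffix, "unrecognized")
```

A file whose extension is not in the table is reported as "unrecognized" rather than guessed at, so that it can be reviewed by hand.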

2.1.2.  Data Sources: Data may be received from any of the following sources, among others:

·  MXI’s Clinical Data Management System

·  External Client Organizations

·  Central Labs

·  Standard Coding Dictionaries

·  PK/PD Data

2.1.3.  Transfer Method: Data can be received either on physical media such as CD or diskettes, or electronically via email or FTP; see the Electronic Data Transfer WPG for specifics on electronic transfers.

2.2.  Once received, source data is stored on the SAS server in the hierarchical structure referenced above.

3.  SAS programs

3.1.  SAS programs are used throughout the Data Flow to manipulate data, create summaries or graphs, or perform statistical tests on the data for the purpose of the analysis. The SAS Program Life Cycle SOP (20-0412_00) and its associated WPGs, as well as the SAS Program Risk Assessment Validation and Verification SOP (20-0410_00), detail the standard practices that relate to SAS programs.

4.  Analysis Files

4.1.  Source datasets often need some manipulation or preprocessing before they can be used efficiently to produce analyses or reports. Our standard practice is to create a set of SAS datasets, called analysis files, in a format that can be used directly to create the required summary or graphical output.

4.2.  Analysis files employ standard naming, type and format characteristics whenever possible. The SAS Data Standards WPG summarizes the recommendations for creating these analysis files.

5.  Adverse Event and Medications coding

5.1.  In a clinical trial, Adverse Events and Medications are often recorded in unique text that describes a particular circumstance or entity.

5.2.  To provide a meaningful summary of Adverse Events or Medications, each recorded term is mapped to a generalized or standard term so that like events or medications can be grouped together. The process of mapping to the standard terms is called coding.

5.3.  Coding is performed with the aid of industry-accepted dictionaries, e.g. those published by the World Health Organization, or MedDRA. The specific process involved in coding Adverse Event or Medications data is described in the Adverse Event and Medications coding WPG.
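As an illustration only, the mapping described in 5.2 can be sketched as a lookup from recorded text to a standard term, followed by a count of the grouped terms. The sketch is written in Python rather than SAS. The dictionary entries are invented for this example and are not taken from any published coding dictionary. The function names code_term and summarize_coded_terms are likewise assumptions.

```python
from collections import Counter

# Hypothetical coding dictionary: recorded (verbatim) text -> standard term.
# These entries are invented for illustration; a real dictionary would be
# one of the industry-accepted dictionaries referred to in 5.3.
CODING_DICTIONARY = {
    "headache": "Headache",
    "head ache": "Headache",
    "pain in head": "Headache",
    "nausea": "Nausea",
    "felt sick to stomach": "Nausea",
}


def code_term(verbatim):
    """Return the standard term for a recorded term, or None if unmapped."""
    # Normalize case and whitespace before the lookup.
    key = " ".join(verbatim.lower().split())
    return CODING_DICTIONARY.get(key)


def summarize_coded_terms(verbatim_terms):
    """Count recorded terms by standard term; collect terms left uncoded."""
    counts = Counter()
    uncoded = []
    for term in verbatim_terms:
        coded = code_term(term)
        if coded is None:
            uncoded.append(term)  # left for manual review, never guessed
        else:
            counts[coded] += 1
    return counts, uncoded
```

Terms with no dictionary match are returned separately instead of being forced into a group, which keeps the summary counts limited to terms that were actually coded.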

6.  Data Outputs

6.1.  Output from analyses is usually saved as electronic files in the appropriate area on the SAS server. Output files are in PDF, RTF, or ASCII text format.

6.1.1.  PDF (Adobe Portable Document Format): The primary choice for electronic file format; it is universally readable with the Acrobat Reader and cannot be easily modified.

6.1.2.  RTF (Rich Text Format): The secondary choice; its advantage is that it is easily edited in Microsoft Word. For this reason, outputs to be incorporated in text reports or manuscripts may be output to RTF.

6.1.3.  ASCII text: Its advantage is that it can be easily read by other programs or applications.

7.  Quality Control

The integrity of data is maintained throughout the data flow using a risk-based approach. Programs determined to be of greater risk are subject to more verification than those determined to be of lesser risk. See the SAS Program Risk Assessment Validation and Verification SOP (20-0410_00).

DOCUMENT HISTORY

Rev. / Effective Date / Change Type / Description of Change(s)
1.0 / January 2006 / DCO / New SAS SOP
