Unified Terminology Governance Project

Suggested Process Description

HL7 Project

  1. Introduction

1.1.Problem Statement

HL7 currently maintains multiple terminologies (v2, v3, CDA value sets, and FHIR) but does not have a uniform vocabulary governance process or strategy across its product lines. Ongoing maintenance of these terminologies is resource intensive and opaque to much of the community. As FHIR and its associated implementation guides become more popular and as HL7 continues to grow, the problem is getting worse. The terminologies that support all of HL7’s products (v2, v3, FHIR, CDA, etc.) need to be maintained in a way that is responsive and improves quality while reducing the resources that both HL7 and its volunteers must put into the process. The process also needs to align with the community’s expectations for a more modern process built on continuous peer feedback.

The process proposed in this document would replace existing vocabulary maintenance processes (including harmonization) for HL7-maintained terminology across all HL7 product lines.

1.2.Objectives

Addressing the problems noted above comprehensively requires a large number of objectives that span the organization.

1.2.1.Minimize the number of ongoing support resources (and HL7 financial commitment) that are needed to support vocabulary maintenance as the volume of work increases

1.2.2.The Terminology governance process must effectively and efficiently fit into the HL7 balloting and publication processes so as to improve the production of the HL7 products, and not adversely impact them.

1.2.3.The tooling developed must be useful to a broad enough community to provide sustainability for ongoing maintenance and development

1.2.4.It must be an open process - proposals for change can come from anyone

Member or non-member

Familiar with committee & HL7 structure or not

Anywhere in the world

1.2.5.It must be a consensus-based process: Viewpoints are sought from all impacted stakeholders and the objective is to satisfy all those stakeholder needs as much as feasible

1.2.6.It must be a broad process: Stakeholders represent the full range of perspectives that are currently impacted by or are likely to be impacted by proposed changes - countries, product lines, implementer vs. academic, etc.

1.2.7.Tooling for authoring/viewing needs to be accessible independent of platform (Windows/Mac/Unix). Could be cloud-based, but offline editing is required. Can’t depend on “restricted” technologies (e.g. Google in China)

1.2.8.Tooling for reviewing proposals and submitted endorsements should be accessible independent of platform, and probably should support mobile platforms as well.

1.2.9.Process works across all terminology content

Different product families (v2/v3/CDA/FHIR/etc.)

Both HL7 content and content published by outside organizations in multiple

Different artifacts (code systems/value sets/concept domains/indirect bindings/concept maps/code system supplements)

May scale to other types of artifacts (for later implementation)

1.2.10.Process produces “awareness” in the community that’s impacted

Those who already use the vocabulary artifact

Those who are considering the vocabulary artifact

Those who have used it in the past and must access/process historical records

Those who have indicated a specific interest

1.2.11.Mechanically produces technically valid content with minimal manual overhead on those who apply changes

1.2.12.Encourages quality/good practice as part of creation/maintenance while not ignoring practical needs

1.2.13.Process allows for (and ensures) due diligence review

Participation from international community for artifacts with potentially international scope

Participation from those who know (and care about) good vocabulary processes

Participation from those who implement - actually use the vocabulary or might

Review by both those who are members/part of HL7 inner circle and those who are not

1.2.14.Content is protected from undue influence by participants outside of intended scope and by any particular stakeholder group within the intended scope

Internationally-scoped content is protected from undue influence by any particular realm/country

Particular realm/country-scoped content is protected from undue influence by the international community, and is identified so as to prevent inadvertent use in other realm/countries

Interested individuals and organizations are clearly identified, including individual-organization affiliations and care is ensured to balance inputs

1.2.15.Knowledge/experience/capability of participants is weighted by the process

Those who know more, have more experience, and have best skill have greatest influence

Influence accrues to those who do the work

1.2.16.Process ensures that content being reviewed is understandable by those who might perform the review

Clear what’s being changed - what the old version is, what the new version will be

Clear what the reason is for the change

Clear what (if any) the impacts will be on existing/planned artifacts

1.2.17.Process pushes the work to where the greatest bandwidth is

First choice: proposer (proposer could have an agent - agent should generally not be volunteer)

Second choice: HL7-funded resources

Last choice: volunteers

1.2.18.Process is responsive

Changes can migrate from “proposed” to “applied” in ~month

Changes are ideally continuous, not driven by specific HL7 events

Process does not depend on synchronous/multi-person meetings

1.2.19.Process reflects governance obligations

Adheres to expectations of ANSI & HL7 voting process

HL7 has ultimate authority over HL7 content

Process adheres to IP rules (not misusing copyrighted material, adhering to rules of terminology sources)

1.2.20.Process is transparent - proposers and community are aware of where proposals are in the process, what decisions have been made and why, and what steps still need to occur

1.2.21.Tooling should be multi-purpose and minimally specific to just this one terminology maintenance process. Ideally tooling should be generally available commercial products that can be configured straightforwardly to implement this process and its workflow

1.2.22.The “Terminology Objects” referenced here are HL7-curated Code Systems (and Code System Supplements?), Value Sets, Persistent Context Bindings, and Concept Domains. In the future the process may be extended for use with the RIM ontology (should the RIM shift from a structural model to an ontology), and for secondary objects such as Concept Maps.

1.2.23.Process will permit evolution over time in order to accommodate changes in organizational priorities, funding exigencies, and new standards families that may emerge in HL7 in the future.

1.2.24.Process will be sustainable with existing and anticipated resources

1.3.Relationship With Balloted Artifacts

HL7 terminology will be treated as a separate activity from the product family balloting processes, although there will be synchronization and coordination points and requirements in order to work effectively for implementers and balloters. Ballot feedback on the content of HL7 terminology artifacts will be considered as input to the terminology maintenance process, so that output from that process may be used in reconciliation and final publication of normative standards. Note that there is some HL7 terminology content which has limited or no impact on the structure of balloted model artifacts, and so can be treated in a similar manner as feedback on the content of LOINC or SNOMED. If the use of a particular code has a material impact on the use of a standard, the responsible work group may need to request a high priority terminology change in order to resolve a negative ballot. If there is no material impact, the comment may be found “not related”. If the external terminology custodian does not agree to make the change as desired by the balloter, the work group may need to re-design the artifact to use a different terminology or may need to move forward with the negative vote outstanding as per usual processes.

The process outlined in this document is proposed as the process for treating the HL7-stewarded terminology objects as if they were an ‘external’ terminology, i.e., separate from the development and approval processes for balloted HL7 standards. There are already a handful of HL7 terminology objects that are treated in this way (e.g. Table0396).

Some terminology artifacts (e.g. structural codes) may be managed directly as part of a product family’s artifacts and not be subject to this process. Each family would need clear rules for what terminology artifacts would and would not be subject to this process. These objects must be clearly identified so that it is clear to anyone in the community that wishes to make changes to these where the source of truth resides and what the exact process must be for each.

  2. Process Description and Flow

See the associated PDF document with the workflow diagrams, embedded in Annex A. This section contains the prose descriptions for those diagrams.

At a very high level, the process consists of two parallel processes:

Creating and editing proposed submissions for changes to terminology and reviewing and asserting opinions (which are here called 'endorsements') on the proposed changes;

Monitoring the endorsement process, and applying changes for those proposals that have achieved the level of consensus required for implementation in the HL7 Terminology, and monitoring/facilitating the terminology publication processes

These run in parallel and asynchronously, and are coordinated and facilitated by tooling which makes use of a centralized (or federated) HL7 Terminology Store and a broadly accessible centralized (or federated) Proposal and Endorsement Store. The process of proposals and endorsements is broad across the HL7 community, and the process of monitoring and applying changes is performed by a part-time Terminology Curator.

The descriptions below and the flow diagrams in Annex A describe these processes and their coordination in detail.

2.1.Overall Process Flow

The first page of the attached PDF file with the diagrams (Annex A) shows the design and architecture of the overall process. The second page illustrates the detail of the top level overall process flow. It incorporates a number of well-contained sub-processes which are detailed on the pages that follow. These are:

External Terminologies Process with HTA

Endorsement Acquisition Process

Create Delta Process

Endorsement Triage Process

Consensus:Controversial Resolution Process

Apply Approved Proposal Using Tooling Process

Each of these is documented on a separate page so that each flow diagram fits on a single page and can be digested easily.

  3. Major Components

There are several major components of the new process. These can be listed as:

3.1.Terminology Persistence and Maintenance Data Store

3.2.Terminology subset extraction for viewing and publishing

3.3.Proposal Persistence and Endorsement Tracking Data Store

3.4.Proposal Viewing/Creating/Editing

3.5.Proposal State Workflow Management

3.6.Terminology Curator Role

3.7.Endorsed Proposal Change Application

  4. Terminology Components

The HL7 Terminology across all the product families may be classified into three broad categories:

Structural Terminology

Domain Content Terminology

External Terminology

The so-called 'Structural Terminology' consists of HL7-defined technical coded content which is very tightly bound to the model structures and internal HL7 components within the various balloted and published standards. This includes, for example, the V2 tables of Message Structures, the code systems in V3/CDA for HL7 Realms and Datatypes, and code systems in FHIR for specific status codes. There are many of these across the product families. Because this content is tightly bound to the technical artifacts under ballot, and usually also bound to specific versions, it is felt that maintaining this content through a shared harmonized governance process would be wasteful of resources and counter-productive. The same process, tooling, and persistence mechanisms can be used to maintain it, but content of this type should bypass the time-consuming editing and consensus-building steps in the workflow and should instead go directly to application of the proposals in preparation for a ballot.

The main part of HL7 Terminology can be referred to as 'Domain Content Terminology'. This includes, for example, V2 Telecommunication Use Codes, V3/CDA ActCode and Privacy content, and FHIR ReferralCategory and VisionEyes. This content would all be subject to the suggested governance process.

Finally, there are many value sets in HL7 Standards which are built on content from external terminologies such as LOINC, NUBC terminologies, ISO terminologies, SNOMED, etc. There are special processes for changes in content for these; changes to value set definitions are handled through the suggested governance processes, but requested changes or additions to these external terminologies are subject to a separate process mediated through the HTA as dictated by our negotiated relationships with these external organizations.
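The routing implied by the three categories above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Category`, `Route`, `route_change`, and the `changes_external_codes` flag are hypothetical and are not part of any existing HL7 tooling.

```python
from enum import Enum


class Category(Enum):
    STRUCTURAL = "structural"
    DOMAIN_CONTENT = "domain content"
    EXTERNAL = "external"


class Route(Enum):
    DIRECT_APPLICATION = "apply directly in preparation for a ballot"
    GOVERNANCE = "suggested governance process"
    HTA = "separate process mediated through the HTA"


def route_change(category: Category, changes_external_codes: bool = False) -> Route:
    """Pick a route for a proposed change, following the three categories above.

    `changes_external_codes` is a hypothetical flag: True when the request asks
    the external custodian to add or change codes, False when it only touches
    an HL7 value set definition built on that external content.
    """
    if category is Category.STRUCTURAL:
        # Structural content bypasses editing and consensus building.
        return Route.DIRECT_APPLICATION
    if category is Category.EXTERNAL and changes_external_codes:
        # Changes to the external terminology itself go through the HTA.
        return Route.HTA
    # Domain content, and value set definitions over external content.
    return Route.GOVERNANCE
```

For example, `route_change(Category.EXTERNAL)` selects the governance route for a value set definition change, while `route_change(Category.EXTERNAL, changes_external_codes=True)` selects the HTA-mediated route.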

  5. Needed Tool Packages

Only a few components are currently available, or can be made available easily by tweaking or repurposing existing tools; most of the tooling to support or enable the identified components must be new. The primary tools that are needed are:

5.1.Centralized HL7 Terminology Persistence Store

There is currently no single store for all of the HL7 terminology. Version 2 is maintained in a Microsoft Access database, CDA terminology is maintained in Trifolia, V3 terminology is maintained in the coremif and MSAccess java-based tooling used in the publication preparation processes, and FHIR terminology is maintained in FHIR CodeSystem and ValueSet resources in the FHIR database maintained by Grahame Grieve.

Currently the most comprehensive set of terminology is in the FHIR database, since it has all the FHIR-specific terminology, plus the imported V2 terminology that is used, and a significant amount of the V3 and CDA terminology as well. This is the closest thing HL7 currently has to a single source of truth. But it is a *copy* of the V2, CDA, and V3 terminology, and as such currently suffers from subsetting, currency, and import error issues.

A Unified Terminology Governance process will require a unified persistence store of all terminology published in HL7 Standards.
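The subsetting and import error issues noted above for a copied terminology could be surfaced by comparing the copy against its source. The following is a minimal sketch, assuming each code system can be reduced to a simple mapping of code to display text; the function name `compare_copy` and the sample codes are made up for illustration and do not reflect real HL7 content or tooling.

```python
def compare_copy(source: dict[str, str], copied: dict[str, str]) -> dict[str, list[str]]:
    """Compare a copied code system against its source of truth.

    Both arguments map code -> display text.
    """
    shared = source.keys() & copied.keys()
    return {
        # Codes the copy dropped (a subsetting issue).
        "missing_from_copy": sorted(source.keys() - copied.keys()),
        # Codes the copy has that the source does not.
        "not_in_source": sorted(copied.keys() - source.keys()),
        # Codes present in both whose display text differs (a possible import error).
        "display_mismatch": sorted(c for c in shared if source[c] != copied[c]),
    }


# Made-up codes for illustration only; not real HL7 content.
source = {"A": "Alpha", "B": "Beta", "C": "Gamma"}
copied = {"A": "Alpha", "B": "Bta", "D": "Delta"}
report = compare_copy(source, copied)
```

A report like this only detects differences; deciding which side is correct would still fall to the governance process.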

5.2.Terminology Subset Extraction for Viewing and Publishing

In order to be of use to HL7 and its community of users, terminology in the central store must be able to be viewed and extracted for excerpt inclusion in published Implementation Guides, other Standards, and Harmonization Proposals. Currently, each product line (V2, V3, CDA, FHIR) has its own tools and mechanisms to extract terminology for publishing and for creation of artifacts for change proposals.

Tooling will be required to accomplish this against the central store. At the current time, there are several value set editors and code system content editors in various stages of capability and usability for the FHIR database, but significant investment would have to be made to bring these up to what would be required for the other product lines.

At this time, the V2 Chapter 2C format generation is done with a large number of MSAccess and VB scripts, run manually by Frank Oemig. This does not produce the final publication format, which must be manually edited (an 850-page Word document) for a v2 ballot. In addition, it is widely recognized that Word and PDF are exceptionally poor vehicles for terminology access and publication. For Version 2, a whole new publication and accessibility format must also be designed, or it must be subsumed into one of the other families.

Tooling Requirement: Chapter 2C output format generation from the central Store.

Design and Tooling Requirement: Other output/accessibility format for V2 terminology

V3 ballot generation is done using RoseTree and a large collection of publishing tools run manually by Ted Klein and Lynn Laakso. These already have bugs and issues, and can no longer be easily maintained as Woody (the original author) is no longer actively participating. Although FHIR has tools that can import V3 content from the coremif (the format in which the V3 terminology is made available), there is no current capability to extract FHIR content into the coremif format. In addition, there are a few V3 terminology components which are not part of the FHIR Core specification resources, and thus some extensions or some mappings are required in order to enable any lossless extraction from FHIR resources to V3 coremifs consumable by the ballot and Normative Edition generation tools. Note that there are other existing tools, such as the RMIM designer, which also need the coremif format for the terminology.

Tooling Requirement: Output of a complete and correct coremif format for V3 terminology

Design and Tooling Requirement: Ability to persist 100% of the V3 terminology content

CDA implementation guides are largely assembled manually by the workgroups; there is no specific tooling requirement for generation of a specific format for the CDA terminology published by HL7. However, errors are rife in the content, likely because the guides are manually assembled. Access to CDA terminology is through Trifolia, which is widely used in the CDA community. At the current time, the content is manually extracted from the published implementation guides and entered into Trifolia for accessibility.