Automating Testcase Generation for Computer Interpretable Clinical Guidelines

JaeHoon LEEa, HyunYoung KIMb, JeongAh KIMc, InSook CHOd[1], Yoon KIMe

aDepartment of Industrial Engineering, Ajou University, Suwon, Korea

bDepartment of Nursing, Eulji University, DaeJeon, Korea

cDivision of Computer Education, Kwandong University, Gangneung, Korea

dNursing Department, Inha University, Incheon, Korea

eMedical Department, Seoul National University, Seoul, Korea

Abstract. In this paper, we developed an automated testcase generation tool to verify computer interpretable clinical guidelines in a clinical decision support system. We adopted the concept of test coverage and the function of testcase generation from the software testing domain. The proposed method makes it possible to 1) automate testing functions such as testcase generation, test execution, and analysis of execution outcomes, and 2) measure the effectiveness of testing quantitatively by test coverage. A Severe Sepsis guideline was implemented to verify the computational efficiency and plausibility of the proposed method.

Keywords. CDSS, clinical guideline, software testing, validation and verification

1. Introduction

One of the success factors of Clinical Decision Support Systems (CDSSs) is assuring that their outcomes are reliable and correct. Among the factors that determine the quality of CDSS outcomes, obtaining a high-quality computer interpretable guideline (CIG) is both important and complicated. Although manual inspection by clinicians has been the traditional way to test a CIG, the increasing complexity of recent CIGs makes it too costly in time and effort. Thus, dynamic testing, which tests guidelines in a simulation environment, has been recognized as an effective approach because it can be automated.

In dynamic testing, creating proper testcases is important because the volume of execution is directly related to the cost of testing. In the recent studies by Wang et al. (2004) and Martins et al. (2006), the testcases were created empirically by physicians. This approach has two limitations: 1) it requires significant time and effort from knowledge authors, and 2) the empirically created testcases vary depending on the clinician [1][2]. To solve these problems, this study adopted 1) the concept of test coverage and 2) the function of automated testcase generation.

2. Method

The SAGE guideline model was selected to derive the elements of clinical guidelines because its key approach is to integrate guideline-based decision support comprehensively with the workflow of the care process [3]. Three coverage types were derived: paths, to ensure that each independent path is selected; conditions, to determine whether each criterion evaluates to true or false; and boundary values, to determine whether a decision satisfies a specified domain value.
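The three coverage types can be illustrated with a minimal Python sketch. It is not the uBrain tester's implementation: the graph, the criteria names, and the thresholds are hypothetical and are not taken from the Severe Sepsis guideline, and enumerating all paths of a small acyclic graph is used here as a simplification of independent-path selection.

```python
from itertools import product

# Hypothetical guideline graph (node -> successors); illustrative only.
GRAPH = {
    "start": ["decision"],
    "decision": ["action_a", "action_b"],
    "action_a": ["end"],
    "action_b": ["end"],
}

# Hypothetical decision criteria (name -> threshold); illustrative only.
CRITERIA = {"heart_rate": 90, "resp_rate": 20}


def all_paths(graph, node, end):
    """Path coverage: list every path from node to end in an acyclic graph."""
    if node == end:
        return [[node]]
    return [[node] + rest
            for nxt in graph.get(node, [])
            for rest in all_paths(graph, nxt, end)]


def condition_coverage_cases(criteria):
    """Condition coverage: every true/false combination of the criteria."""
    names = list(criteria)
    return [dict(zip(names, combo))
            for combo in product([True, False], repeat=len(names))]


def boundary_values(threshold, step=1):
    """Boundary coverage: values just below, at, and just above a threshold."""
    return [threshold - step, threshold, threshold + step]


paths = all_paths(GRAPH, "start", "end")
cases = condition_coverage_cases(CRITERIA)
print(len(paths))           # 2 paths through the hypothetical graph
print(len(cases))           # 4 combinations for 2 criteria
print(boundary_values(90))  # [89, 90, 91]
```

Because full condition coverage grows as 2^n in the number of criteria, a selection step such as the one described in this paper is needed to keep the number of executed testcases manageable.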

The testing tool was developed as a plug-in module of the uBrain CDSS, which can translate and execute SAGE-based guidelines, as depicted in Figure 1 [4]. After a CIG is authored, a test performer may set up the test level, generate and select worthwhile testcases, and prepare the expected answers. A test executor then sets up the preconditions for each testcase and triggers the CDSS. The execution outcomes are returned to the uBrain tester so that they can be analyzed to find unmatched conditions.
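The execute-and-compare loop described above can be sketched as follows. This is only an illustration under stated assumptions: `fake_cdss` is a hypothetical stand-in for the CDSS, the testcase fields (`id`, `inputs`, `expected`) are invented for the example, and the rule and threshold are not from the paper.

```python
def run_tests(testcases, execute):
    """Execute each testcase and collect outcomes that differ from the expected answer."""
    mismatches = []
    for case in testcases:
        actual = execute(case["inputs"])  # trigger the system under test
        if actual != case["expected"]:
            mismatches.append(
                {"id": case["id"], "expected": case["expected"], "actual": actual}
            )
    return mismatches


def fake_cdss(inputs):
    """Hypothetical stand-in for the CDSS: alerts only when heart_rate exceeds 90."""
    return "alert" if inputs["heart_rate"] > 90 else "no_alert"


# Boundary-value testcases; the expected answers assume the rule should be ">= 90".
testcases = [
    {"id": 1, "inputs": {"heart_rate": 89}, "expected": "no_alert"},
    {"id": 2, "inputs": {"heart_rate": 90}, "expected": "alert"},
    {"id": 3, "inputs": {"heart_rate": 91}, "expected": "alert"},
]

mismatches = run_tests(testcases, fake_cdss)
print(mismatches)  # only testcase 2 mismatches, exposing the boundary error
```

In this sketch the mismatch at the boundary value points to a rule that was encoded with the wrong comparison operator, which is the kind of unmatched condition the analysis step is meant to surface.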

Figure 1. Framework of uBrain tester

We implemented a Severe Sepsis guideline, which consists of 12 sub-guidelines with 65 rule sets. Under full condition coverage, 840 testcases were generated, of which 59 were selected. Physicians added 4 more testcases. In the first iteration, a total of 63 cases were executed and 11 mismatches were found. After the original guidelines were revised based on these errors, the tests in the second iteration resulted in 100% matches.

3. Conclusion

The proposed method makes it possible to automate testcase generation, test execution, and analysis. Knowledge authors and test performers can conduct the entire testing process in an integrated environment. In addition, testcase generation based on test coverage allows the performance of testing to be measured quantitatively by the amount of execution and the rate of detection (detected flaws / executed testcases).
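As a worked example of the rate of detection, the following sketch applies the formula to the first-iteration figures reported in Section 2 (63 executed testcases, 11 mismatches). It assumes that each mismatch counts as one detected flaw, which the paper does not state explicitly; the function name is hypothetical.

```python
def detection_rate(detected_flaws, executed_testcases):
    """Rate of detection = detected flaws / executed testcases."""
    if executed_testcases <= 0:
        raise ValueError("executed_testcases must be positive")
    return detected_flaws / executed_testcases


# First-iteration figures from Section 2; one mismatch is assumed to be one flaw.
rate = detection_rate(11, 63)
print(f"{rate:.1%}")  # 17.5%

# Share of the 840 generated testcases that were selected for execution.
selected_share = 59 / 840
print(f"{selected_share:.1%}")  # 7.0%
```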

References

[1] Wang D, Peleg M, Tu SW, et al. Design and implementation of the GLIF3 guideline execution engine. J Biomed Inform. 2004 Oct;37(5):305-18.

[2] Martins SB, Lai S, Tu S, et al. Offline testing of the ATHENA Hypertension decision support system knowledge base to improve the accuracy of recommendations. AMIA Annu Symp Proc. 2006:539-43.

[3] Tu SW, Glasgow J. SAGE Guideline Model Specification 1.65. 2006.

[4] Lee J, Kim J, Cho I, Kim Y. Integration of workflow and rule engines for clinical decision support services. Stud Health Technol Inform. 2010;160(Pt 2):811-5.

[1] Corresponding Author: Inha University, College of Nursing, 253 Yonghyun-dong, Nam-gu, Incheon, Korea,