1. Top k-gene markers for distinguishing liver cancer

Tables 1-3 give the top five one-gene, two-gene and three-gene special biomarkers of liver cancer. These genes can serve as biomarkers for liver cancer and are also effective discriminators between liver cancer and other cancer types. The average classification accuracies of the top 100 k-gene special markers are shown in Figure 1. The best 3-gene discriminators for liver cancer achieve higher classification accuracies than the best 2-gene discriminators, which in turn outperform the 1-gene discriminators.

As noted, ECM1 is involved in endochondral bone formation, angiogenesis and tumor biology, and is also associated with the metastatic potential of hepatocellular carcinoma [1]. PPP2R5A is implicated in the negative control of cell growth and division [2]. Moreover, several of the top marker genes have been reported to be cancer relevant. For example, MTHFD1 is reported to play a key role in folate metabolism and in colon cancer initiation and progression [3]. Low LY6E expression is observed in several human hepatocellular carcinomas (HCC) and is correlated with the malignant potential of HCC [4]. GLS has been suggested to play a role in the regulation of cancer cell metabolism [5]. CD34 and ANGPT1 are reported to play key roles in angiogenesis and are overexpressed in hepatocellular carcinoma [6, 7]. BGN is suggested to have a role in controlling cell growth in cancer [8]. NDST1 has been demonstrated to play a role in tumor angiogenesis and tumor growth [9].

Table 1: One-gene markers for liver cancer

Markers / Accuracy1 / Accuracy2 / Mean
ECM1 / 0.959869 / 0.807437 / 0.883653
MTHFD1 / 0.862829 / 0.754013 / 0.808421
PPP2R5A / 0.795614 / 0.799626 / 0.79762
LY6E / 0.78147 / 0.785651 / 0.78356
PTS / 0.733004 / 0.812349 / 0.772677
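
The Mean column appears to be the arithmetic mean of Accuracy1 and Accuracy2. This is an inference from the printed values, not a definition given in the text. A minimal Python sketch of that check for Table 1, where the names `TABLE_1` and `mean_accuracy` are illustrative only:

```python
# Rows copied from Table 1: (marker, Accuracy1, Accuracy2, Mean as printed).
TABLE_1 = [
    ("ECM1", 0.959869, 0.807437, 0.883653),
    ("MTHFD1", 0.862829, 0.754013, 0.808421),
    ("PPP2R5A", 0.795614, 0.799626, 0.79762),
    ("LY6E", 0.78147, 0.785651, 0.78356),
    ("PTS", 0.733004, 0.812349, 0.772677),
]


def mean_accuracy(acc1, acc2):
    """Arithmetic mean of two accuracies (assumed definition of Mean)."""
    return (acc1 + acc2) / 2.0


# Largest gap between the recomputed and the printed Mean over all rows.
max_abs_diff = max(
    abs(mean_accuracy(a1, a2) - printed) for _, a1, a2, printed in TABLE_1
)

for marker, a1, a2, printed in TABLE_1:
    print(f"{marker:8s} recomputed={mean_accuracy(a1, a2):.6f} printed={printed:.6f}")
print(f"largest absolute difference: {max_abs_diff:.1e}")
```

Under this assumption the recomputed values agree with the printed ones to within rounding of the sixth decimal place.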

Table 2: Two-gene markers for liver cancer

Markers / Accuracy1 / Accuracy2 / Mean
ECM1+MTHFD1 / 0.991119 / 0.892049 / 0.941584
ECM1+SLC25A13 / 0.997369 / 0.867643 / 0.932506
ECM1+GPD1L / 0.984869 / 0.879152 / 0.93201
ECM1+SORD / 0.997369 / 0.866252 / 0.93181
ECM1+GLS / 0.984869 / 0.859951 / 0.92241

Table 3: Three-gene markers for liver cancer

Markers / Accuracy1 / Accuracy2 / Mean
CD34+ECM1+MTHFD1 / 0.989803 / 0.926572 / 0.958187
ANGPT1+ECM1+MTHFD1 / 0.991119 / 0.924481 / 0.9578
BGN+CD34+MTHFD1 / 0.969737 / 0.94292 / 0.956328
ECM1+MTHFD1+NDST1 / 0.991119 / 0.917246 / 0.954182
ECM1+MTHFD1+OLFML2A / 0.989803 / 0.917074 / 0.953438

Figure 1: Classification accuracies by the top 100 k-gene markers
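
The trend across k can be illustrated with the values printed in Tables 1-3. The sketch below uses only the top five Mean accuracies per k for liver cancer, not the top 100 markers summarized in Figure 1, so it is an illustration of the comparison rather than a reproduction of the figure. The names `LIVER_TOP5_MEAN` and `average` are hypothetical.

```python
# Mean accuracies copied from Tables 1-3 (liver cancer), keyed by the
# number of genes k in the marker. Only the top five per k are available here.
LIVER_TOP5_MEAN = {
    1: [0.883653, 0.808421, 0.79762, 0.78356, 0.772677],
    2: [0.941584, 0.932506, 0.93201, 0.93181, 0.92241],
    3: [0.958187, 0.9578, 0.956328, 0.954182, 0.953438],
}


def average(values):
    """Plain arithmetic average of a non-empty list."""
    return sum(values) / len(values)


avg_by_k = {k: average(v) for k, v in LIVER_TOP5_MEAN.items()}

for k in sorted(avg_by_k):
    best, worst = max(LIVER_TOP5_MEAN[k]), min(LIVER_TOP5_MEAN[k])
    print(f"k={k}: average={avg_by_k[k]:.6f} best={best:.6f} worst={worst:.6f}")
```

For these rows, even the fifth-ranked marker with k+1 genes has a higher Mean than the top-ranked marker with k genes.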

2. Top k-gene markers for distinguishing lung cancer

Tables 4-6 give the top five one-gene, two-gene and three-gene special biomarkers of lung cancer. These genes can serve as biomarkers for lung cancer and are also effective discriminators between lung cancer and other cancer types. The average classification accuracies of the top 100 k-gene special markers are shown in Figure 2. The best 3-gene discriminators for lung cancer achieve higher classification accuracies than the best 2-gene discriminators, which in turn outperform the 1-gene discriminators.

As noted, LAMP3 is reported as a hypoxia-regulated gene of potential interest in hypoxia-induced therapy resistance and metastasis [10]. OLR1 has been suggested to play a role in cell proliferation, migration and inhibition of apoptosis [11]. TRIM2 has been suggested to play a role in mediating p42/p44 MAPK-dependent ubiquitination [12]. DOCK9 encodes a member of the DOCK protein family, which has been implicated in cell migration, morphogenesis and phagocytosis and whose members are important components of tumor cell movement and invasion [13]. GYG2 is implicated in determining glycogen accumulation and controlling glycogen synthesis [14]. Moreover, several of the top marker genes have been reported to be cancer relevant. For example, HYAL1 has been found to be down-regulated in non-small cell lung cancer and to be involved in tumour suppression [15]. TNNC1 is reported to be down-regulated in lung cancer and related to the calcium signaling pathway [16]. CDH13 has been suggested to be inactivated in lung cancer by a combination of deletion and hypermethylation [17].

Table 4: One-gene markers for lung cancer

Markers / Accuracy1 / Accuracy2 / Mean
LAMP3 / 0.858616 / 0.769559 / 0.814088
HYAL2 / 0.81745 / 0.798345 / 0.807898
C11orf9 / 0.829092 / 0.707824 / 0.768458
OLR1 / 0.760883 / 0.774356 / 0.76762
TNNC1 / 0.901562 / 0.63299 / 0.767276

Table 5: Two-gene markers for lung cancer

Markers / Accuracy1 / Accuracy2 / Mean
CDH13+LAMP3 / 0.879152 / 0.850031 / 0.864591
DOCK9+TRIM2 / 0.865516 / 0.849466 / 0.857491
GYG2+HYAL2 / 0.882778 / 0.822166 / 0.852472
DOCK9+TNNC1 / 0.91369 / 0.789613 / 0.851652
LAMP3+TNNC1 / 0.926359 / 0.774037 / 0.850198

Table 6: Three-gene markers for lung cancer

Markers / Accuracy1 / Accuracy2 / Mean
DOCK9+LAMP3+TRIM2 / 0.921285 / 0.861351 / 0.891318
DOCK9+IFI30+TRIM2 / 0.898373 / 0.883979 / 0.891176
DOCK9+TNNC1+TRIM2 / 0.916329 / 0.856951 / 0.88664
HYAL2+LAMP3+TRIM2 / 0.918646 / 0.850576 / 0.884611
DOCK9+SLC39A14+TRIM2 / 0.899142 / 0.866726 / 0.882934

Figure 2: Classification accuracies by the top 100 k-gene markers

3. Top k-gene markers for distinguishing colon cancer

Tables 7-9 give the top five one-gene, two-gene and three-gene special biomarkers of colon cancer. These genes can serve as biomarkers for colon cancer and are also effective discriminators between colon cancer and other cancer types. The average classification accuracies of the top 100 k-gene special markers are shown in Figure 3. The best 3-gene discriminators for colon cancer achieve higher classification accuracies than the best 2-gene discriminators, which in turn outperform the 1-gene discriminators.

As noted, the SCGN protein is thought to be involved in KCl-stimulated calcium flux and cell proliferation [18]. UGDH has been reported to play a role in signal transduction, cell migration, and cancer growth and metastasis [19]. DYRK2 has a role in the regulation of cellular growth or development [20]. The PTPRF protein is a member of the protein tyrosine phosphatase (PTP) family, which is involved in regulating a variety of cellular processes including cell growth, differentiation, the mitotic cycle and oncogenic transformation [21]. The substrates of CANT1 are suggested to be involved in several signaling functions, such as Ca2+ release [22]. ALDH4A1 is a p53-inducible gene suggested to play a protective role in cellular stress [23]. The KIF5C protein is a member of the kinesin family, which is involved in several cellular functions including mitosis, meiosis and transport of cellular cargo [24]. RUSC1 is a novel signaling adapter and is dependent on the prolonged activation of MAPK [25]. Moreover, several of the top marker genes have been reported to be cancer relevant. For example, STAP2 is reported to act as an endogenous positive regulator of cell growth in several cancers, such as breast cancer [26]. BCAS1 has been suggested to be downregulated in colorectal tumors and is located in a very unstable genomic region [27]. HOXD1 is downregulated in colon cancer by hypermethylation and silencing and plays an important role in tumorigenesis [28]. EDIL3, which plays an important role in mediating angiogenesis, has been reported as a biomarker of colon cancer [29]. SLC6A8 has been suggested as a potential marker for the detection of circulating tumor cells in peripheral blood [30]. CXCL2 is an inflammatory cytokine reported to be associated with early events in colon carcinogenesis [31].

Table 7: One-gene markers for colon cancer

Markers / Accuracy1 / Accuracy2 / Mean
STAP2 / 0.851786 / 0.843429 / 0.847607
CNNM4 / 0.858572 / 0.818782 / 0.838677
BCAS1 / 0.855833 / 0.820062 / 0.837948
HOXD1 / 0.84381 / 0.818044 / 0.830927
CNNM2 / 0.85881 / 0.802088 / 0.830449

Table 8: Two-gene markers for colon cancer

Markers / Accuracy1 / Accuracy2 / Mean
SCGN+UGDH / 0.952222 / 0.86204 / 0.907131
DYRK2+NEBL / 0.917619 / 0.882825 / 0.900222
PTPRF+SCGN / 0.937183 / 0.860706 / 0.898944
EDIL3+STAP2 / 0.910556 / 0.886143 / 0.898349
CANT1+SCGN / 0.924087 / 0.869585 / 0.896836

Table 9: Three-gene markers for colon cancer

Markers / Accuracy1 / Accuracy2 / Mean
ALDH4A1+KIF5C+STAP2 / 0.940556 / 0.928 / 0.928
HOXD1+SLC6A8+UGDH / 0.911786 / 0.918698 / 0.918698
CXCL2+HOXD1+SLC6A8 / 0.906667 / 0.918698 / 0.918698
DYRK2+NEBL+RUSC1 / 0.917341 / 0.918547 / 0.918547
CANT1+DYRK2+NEBL / 0.925119 / 0.917286 / 0.917286
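
In Table 9 as printed, the Mean column equals Accuracy2 in every row, whereas the values in Tables 1-8 suggest that Mean is the arithmetic mean of Accuracy1 and Accuracy2. The Python sketch below, written under that assumed definition, flags the affected rows. It does not determine which column is misprinted, and the names `TABLE_9`, `TOLERANCE` and `inconsistent_rows` are illustrative only.

```python
# Rows copied from Table 9: (marker, Accuracy1, Accuracy2, Mean as printed).
TABLE_9 = [
    ("ALDH4A1+KIF5C+STAP2", 0.940556, 0.928, 0.928),
    ("HOXD1+SLC6A8+UGDH", 0.911786, 0.918698, 0.918698),
    ("CXCL2+HOXD1+SLC6A8", 0.906667, 0.918698, 0.918698),
    ("DYRK2+NEBL+RUSC1", 0.917341, 0.918547, 0.918547),
    ("CANT1+DYRK2+NEBL", 0.925119, 0.917286, 0.917286),
]

# Allowance for rounding in the sixth decimal place (an assumed threshold).
TOLERANCE = 1e-5


def inconsistent_rows(rows, tol=TOLERANCE):
    """Return markers whose printed Mean differs from (Accuracy1 + Accuracy2) / 2."""
    return [
        marker
        for marker, a1, a2, printed in rows
        if abs((a1 + a2) / 2.0 - printed) > tol
    ]


flagged = inconsistent_rows(TABLE_9)
mean_equals_acc2 = all(a2 == printed for _, _, a2, printed in TABLE_9)

print("rows where Mean is not the average of the two accuracies:", flagged)
print("Mean identical to Accuracy2 in every row:", mean_equals_acc2)
```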

Figure 3: Classification accuracies by the top 100 k-gene markers

4. Top k-gene markers for distinguishing gastric cancer

Tables 10-12 give the top five one-gene, two-gene and three-gene special biomarkers of gastric cancer. These genes can serve as biomarkers for gastric cancer and are also effective discriminators between gastric cancer and other cancer types. The average classification accuracies of the top 100 k-gene special markers are shown in Figure 4. The best 3-gene discriminators for gastric cancer achieve higher classification accuracies than the best 2-gene discriminators, which in turn outperform the 1-gene discriminators.

As noted, XYLT2 is reported to play a role in initiating the biosynthesis of glycosaminoglycan chains in proteoglycans, including chondroitin sulfate, heparan sulfate, heparin and dermatan sulfate [32]. SERPING1 has been suggested to play a role in regulating several important physiological pathways, such as complement activation, blood coagulation, fibrinolysis and the generation of kinins [33]. The ABCC5 protein is involved in degradation of phosphodiesterases and possibly in an elimination pathway for cyclic nucleotides [34]. GEM has been reported to act as a regulatory protein in receptor-mediated signal transduction [35]. TYROBP acts as an activating signal transduction element [36]. Moreover, several of the top marker genes have been reported to be cancer relevant. For example, AQP1 has been reported to be differentially expressed in gastric cancer and to be involved in cell migration [37]. BPIL1 is suggested to be involved in oral squamous cell carcinomas [38]. ZNF300 is reported to play a role in promoting tumor development and stimulating cancer cell proliferation [39]. EMP1, an epithelial membrane protein, is found to be overexpressed in early gastric tumorigenesis [40]. ANXA1 has been reported to be related to gastric cancer development and progression and can be used as a biomarker for gastric cancer [41]. NR5A2, which is overexpressed in gastric cancer, has been suggested to promote the proliferation of gastric adenocarcinoma [42]. PKIB is reported to enhance the growth and mobility of prostate cancer [43]. SPAG4 has been suggested as a potential clinically relevant cancer marker [44]. SERPINI1, which is involved in the G1-S transition checkpoint, is downregulated by miR-21 in gastric cancer [45].

Table 10: One-gene markers for gastric cancer

Markers / Accuracy1 / Accuracy2 / Mean
ANKRD22 / 0.670857 / 0.770649 / 0.720753
AQP1 / 0.667905 / 0.741188 / 0.704546
TMEM161B / 0.692286 / 0.707554 / 0.69992
XYLT2 / 0.634762 / 0.750203 / 0.692482
SERPING1 / 0.610952 / 0.764477 / 0.687715

Table 11: Two-gene markers for gastric cancer

Markers / Accuracy1 / Accuracy2 / Mean
ABCC5+BPIL1 / 0.781333 / 0.768621 / 0.774977
ANKRD22+ZNF300 / 0.732952 / 0.804943 / 0.768948
ANKRD22+EMP1 / 0.708381 / 0.821649 / 0.765015
ANXA1+SERPING1 / 0.718095 / 0.808721 / 0.763408
ABCC5+GEM / 0.744 / 0.77885 / 0.761425

Table 12: Three-gene markers for gastric cancer

Markers / Accuracy1 / Accuracy2 / Mean
ABCC5+ANKRD22+TYROBP / 0.786667 / 0.824548 / 0.805607
EMP1+NR5A2+PKIB / 0.790952 / 0.817405 / 0.804179
EMP1+SERPING1+SPAG4 / 0.752952 / 0.849238 / 0.801095
ABCC5+ANKRD22+EMP1 / 0.751333 / 0.850608 / 0.800971
C7orf54+SERPING1+SERPINI1 / 0.772476 / 0.827438 / 0.799957

Figure 4: Classification accuracies by the top 100 k-gene markers

5. Top k-gene markers for distinguishing breast cancer

Tables 13-15 give the top five one-gene, two-gene and three-gene special biomarkers of breast cancer. These genes can serve as biomarkers for breast cancer and are also effective discriminators between breast cancer and other cancer types. The average classification accuracies of the top 100 k-gene special markers are shown in Figure 5. The best 3-gene discriminators for breast cancer achieve higher classification accuracies than the best 2-gene discriminators, which in turn outperform the 1-gene discriminators.

As noted, ITGB1BP1 is suggested to have a possible role in cell adhesion [46]. HS2ST1 is reported to play a key role in generating a myriad of distinct heparan sulfate fine structures [47]. TINAGL1 is reported to be a metastasis-suppressive protein [48]. Moreover, several of the top marker genes have been reported to be cancer relevant. For example, MET is reported to be downregulated in fibrocystic disease of the breast and in invasive ductal carcinoma of the breast [49]. OXTR has been suggested to contribute to prostate cancer invasion and metastasis [50]. HOXA9 is reported to be downregulated in human breast cancers and associated with tumor aggression, metastasis and patient mortality [51]. INTS6 is a candidate tumor suppressor gene located in the critical region of loss of heterozygosity [52]. LAMC1 has been found to be overexpressed in breast cancer [53].

Table 13: One-gene markers for breast cancer

Markers / Accuracy1 / Accuracy2 / Mean
MET / 0.65912 / 0.771069 / 0.715095
C3orf64 / 0.743565 / 0.65559 / 0.699577
OXTR / 0.721667 / 0.674907 / 0.698287
HOXA9 / 0.697685 / 0.692685 / 0.695185
DNAJC1 / 0.688935 / 0.692359 / 0.690647

Table 14: Two-gene markers for breast cancer

Markers / Accuracy1 / Accuracy2 / Mean
ITGB1BP1+OXTR / 0.807824 / 0.807102 / 0.807463
HS2ST1+OXTR / 0.787269 / 0.808693 / 0.797981
ASPH+OXTR / 0.834537 / 0.757913 / 0.796225
HOXA9+TBC1D16 / 0.75 / 0.835229 / 0.792615
C3orf64+TBC1D16 / 0.789352 / 0.793669 / 0.79151

Table 15: Three-gene markers for breast cancer

Markers / Accuracy1 / Accuracy2 / Mean
DNAJC1+HS2ST1+OXTR / 0.868889 / 0.833863 / 0.851376
ITGB1BP1+OXTR+TINAGL1 / 0.855556 / 0.838002 / 0.846779
INTS6+ITGB1BP1+OXTR / 0.846991 / 0.838636 / 0.842813
DNAJC1+LAMC1+ZNF507 / 0.854398 / 0.831128 / 0.842763
ATP9A+EIF2AK3+OXTR / 0.848148 / 0.83631 / 0.842229

Figure 5: Classification accuracies by the top 100 k-gene markers

6. Top k-gene markers for distinguishing thyroid cancer

Tables 16-18 give the top five one-gene, two-gene and three-gene special biomarkers of thyroid cancer. These genes can serve as biomarkers for thyroid cancer and are also effective discriminators between thyroid cancer and other cancer types. The average classification accuracies of the top 100 k-gene special markers are shown in Figure 6. The best 3-gene discriminators for thyroid cancer achieve higher classification accuracies than the best 2-gene discriminators, which in turn outperform the 1-gene discriminators.