Supplementary Document for

Extending Gene Ontology with Gene Association Networks

Jiajie Peng, Tao Wang, Jixuan Wang, Yadong Wang∗ and Jin Chen∗

Fig S1

Fig S2

Fig S3

Fig S4

Fig S5

Fig S6

Fig S7

Fig S8

Table S1

Table S2

Table S3

Table S4

Table S5

Table S6


SUPPLEMENTARY FIGURES

Fig S1. An example visualization of a predicted new term in OBO-Edit.

Fig S2. Comparison of the correctly predicted gold-standard term rate (CGR) of five GO term prediction methods in the biological process category.

Fig S3. Comparison of the correct prediction rate (A) and F1 (B) of five GO term prediction methods. All methods were applied to the 2009 version of GO in the biological process category and evaluated on the 2011 and 2013 versions of GO.

Fig S4. Comparison of precision and recall of five GO term prediction methods using the FDR-based criteria. All methods were applied to the 2009 version of GO in the biological process category and evaluated on the 2011 and 2013 versions of GO. Colors represent different methods, and shapes represent versions of GO.

Fig S5. Comparison of the correctly predicted gold-standard term rate (CGR) of five GO term prediction methods in the cellular component category.

Fig S6. Comparison of the running time between GOExtender and NEXO using different input networks.

Fig S7. Comparison of the running time between GOExtender and NEXO using different GO data.

Fig S8. The F1 scores of GOExtender using different biological networks on the 2013 version of GO in the biological process and cellular component categories.

SUPPLEMENTARY TABLES

Table S1. The correct prediction rate (CPR), correctly predicted gold-standard term rate (CGR) and F1 on the GO biological process category of January 2007, January 2009, January 2011 and January 2013.

Table S2. The correct prediction rate (CPR), correctly predicted gold-standard term rate (CGR) and F1 on the GO molecular function category of January 2007, January 2009, January 2011 and January 2013.

Table S3. The predicted new terms with multiple parents on the most recent GO biological process data (January 2015). The “parentTerm” and “parentTermName” columns give the GO identifiers and names of the parent term set, and the “geneset_name” column gives the gene set corresponding to each predicted term.

ID / parentTerm / parentTermName / geneset_name
1 / [0051301, 0051321, 0007067, 0007059] / [ cell division, meiotic cell cycle, mitotic nuclear division, chromosome segregation] / CTF3 SLK19 OKP1 MCM16 NKP2 MCM22 CTF19
2 / [0000122, 0006366, 0045944] / [ negative regulation of transcription from RNA polymerase II promoter, transcription from RNA polymerase II promoter, positive regulation of transcription from RNA polymerase II promoter] / RGT1 RGR1 SSN2 SRB8 MED8 MED1 MED7 CSE2 NUT2
3 / [0051301, 0051321, 0007067] / [ cell division, meiotic cell cycle, mitotic nuclear division] / IML3 CTF3 MCM21 CHL4
4 / [0006364, 0000398, 0008033] / [ rRNA processing, mRNA splicing, via spliceosome, tRNA processing] / LSM4 LSM5 LSM6 LSM7 LSM3 LSM8
5 / [0051301, 0007067, 0007059] / [ cell division, mitotic nuclear division, chromosome segregation] / NKP2 CNN1 SHE1 SLK19 CHL4 MCM16 CTF3 MCM22 OKP1 CTF19
6 / [0006310, 0006260, 0006281] / [ DNA recombination, DNA replication, DNA repair] / RFA3 RFA2 RFA1 MEC1
7 / [0051301, 0051321, 0030435] / [ cell division, meiotic cell cycle, sporulation resulting in formation of a cellular spore] / ADY3 SSP1 SPO21 CNM67 MPC54
8 / [0000122, 0016568, 0045944] / [ negative regulation of transcription from RNA polymerase II promoter, chromatin modification, positive regulation of transcription from RNA polymerase II promoter] / RPD3 HDA1 DEP1 SIN3
9 / [0006468, 0051301, 0007067] / [ protein phosphorylation, cell division, mitotic nuclear division] / CDC5 CDC15 CDC7 CDC28 SWE1
10 / [0016192, 0006886] / [ vesicle-mediated transport, intracellular protein transport] / APL1 CHC1 APM4 CLC1 APS2 APL3 GGA1 GGA2
11 / [0055085, 0006811] / [ transmembrane transport, ion transport] / NHA1 ENA2 ENA5 ENA1
12 / [0051301, 0007067] / [ cell division, mitotic nuclear division] / CTF19 MCM16 MCM22 CHL4
13 / [0006281, 0016568] / [ DNA repair, chromatin modification] / ARP4 YAF9 SWC4 EAF1 YNG2 EAF6 RVB1 RVB2 INO80 IES2 EAF7 ESA1 EAF5 EAF3 TRA1
14 / [0006281, 0006366] / [ DNA repair, transcription from RNA polymerase II promoter] / TFB5 SSL1 RAD6 SOH1 RPB4 NHP6B RPB9 RAD2 RAD3 TFB4 TFB2 SSL2 TFB1
15 / [0051321, 0030435] / [ meiotic cell cycle, sporulation resulting in formation of a cellular spore] / DMC1 IME1 RIM4 SPO11 MEI5 IME2 IME4 SPO1 SPO13 XRS2 SPO14 SPO12 RED1
16 / [0006260, 0055114] / [ DNA replication, oxidation-reduction process] / RNR4 RNR3 RNR1 RNR2
17 / [0006260, 0006281] / [ DNA replication, DNA repair] / MRC1 CTF4 POL2 TOF1
18 / [0016192, 0006886] / [ vesicle-mediated transport, intracellular protein transport] / SFB3 SEC24 SAR1 SFB2
19 / [0006364, 0008033] / [ rRNA processing, tRNA processing] / POP7 POP6 POP3 POP8
20 / [0008652, 0055114] / [ cellular amino acid biosynthetic process, oxidation-reduction process] / LYS1 LYS12 LYS2 LYS9
21 / [0006629, 0055114] / [ lipid metabolic process, oxidation-reduction process] / ERG24 ERG26 ERG11 ERG4 ERG27 ERG25 ERG3 ERG9
22 / [0034599, 0055114] / [ cellular response to oxidative stress, oxidation-reduction process] / TSA2 TRR1 TRR2 GLR1
23 / [0006468, 0051301] / [ protein phosphorylation, cell division] / SWE1 KCC4 CDC28 CDC5
24 / [0006281, 0051321] / [ DNA repair, meiotic cell cycle] / MRE11 SAE2 MMS4 RAD57 MUS81 RDH54 CST9 RAD50 XRS2
25 / [0008033, 0045944] / [ tRNA processing, positive regulation of transcription from RNA polymerase II promoter] / KAE1 PCC1 CGI121 BUD32 GON7
26 / [0055114, 0006811] / [ oxidation-reduction process, ion transport] / FRE2 FRE4 FET3 FRE3
27 / [0006468, 0007165] / [ protein phosphorylation, signal transduction] / SLT2 MKK2 MKK1 PKC1
28 / [0006310, 0006281] / [ DNA recombination, DNA repair] / NSE5 RAD52 RAD51 SMC5 KRE29 RFA1 RAD59 SGS1 MMS4 MUS81 MSH3 HED1 RDH54 SLX1 SLX4 SMC6 NSE1 MMS21 MSH2 NSE3 NSE4 MEC1 RFA3 RFA2
29 / [0006281, 0051301] / [ DNA repair, cell division] / RTT101 HRT1 MMS22 MMS1
30 / [0006260, 0006281] / [ DNA replication, DNA repair] / POL30 RAD27 CDC9 RRM3
31 / [0006366, 0016568] / [ transcription from RNA polymerase II promoter, chromatin modification] / TAF6 TAF1 TAF12 SPT3 HFI1 TAF5 TAF10 TAF9
32 / [0016192, 0006914] / [ vesicle-mediated transport, autophagy] / SEC17 BET3 VAM3 SEC16 TRS31 YPT32 TRS20 YPT1 YPT31 BET5 TRS33 TRS23
33 / [0016568, 0045944] / [ chromatin modification, positive regulation of transcription from RNA polymerase II promoter] / RPD3 HDA1 SIN3 ISW1 SAP30 UME1 SDS3 PHO23 DEP1
34 / [0008652, 0055114] / [ cellular amino acid biosynthetic process, oxidation-reduction process] / MET1 MET10 MET8 MET16
35 / [0006310, 0006281] / [ DNA recombination, DNA repair] / YKU70 YKU80 CDC9 LIF1
36 / [0016568, 0045944] / [ chromatin modification, positive regulation of transcription from RNA polymerase II promoter] / SGF29 TRA1 SPT8 SUS1
37 / [0000122, 0016568] / [ negative regulation of transcription from RNA polymerase II promoter, chromatin modification] / RPD3 RXT2 DEP1 SIN3
38 / [0006260, 0006281] / [ DNA replication, DNA repair] / DNA2 RFA1 RFA2 SPT16 POB3 MEC1 RFA3
39 / [0055085, 0006811] / [ transmembrane transport, ion transport] / VMA2 VMA9 VMA8 VMA10
40 / [0000122, 0045944] / [ negative regulation of transcription from RNA polymerase II promoter, positive regulation of transcription from RNA polymerase II promoter] / SIN3 DEP1 ESS1 HDA1
41 / [0000122, 0045944] / [ negative regulation of transcription from RNA polymerase II promoter, positive regulation of transcription from RNA polymerase II promoter] / RIM101 YTA7 RGT1 MED1 SPT6 MED2 SSN2 SRB8 RME1 SSN8 MED7 ROX3 SIN4 RGR1 SRB7 PGD1 MED8 GAL11 CSE2 NUT2 MED11
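Each row of Table S3 can be loaded into a small record for downstream analysis. The sketch below is illustrative only: the `PredictedTerm` class and `parse_row` function are hypothetical names, and it assumes the " / " field separator and bracketed lists exactly as printed above. Parent term names are kept as one string, because some GO names (for example "mRNA splicing, via spliceosome" in row 4) contain a comma themselves, so splitting them on commas would be unsafe.

```python
from dataclasses import dataclass


@dataclass
class PredictedTerm:
    row_id: int
    parent_ids: list[str]   # GO identifiers, e.g. "GO:0051301"
    parent_names: str       # raw name list, left unsplit (names may contain commas)
    genes: list[str]


def parse_row(line: str) -> PredictedTerm:
    """Parse one 'ID / [ids] / [names] / genes' row of Table S3."""
    row_id, ids, names, genes = [field.strip() for field in line.split(" / ")]
    # The table prints only the 7-digit numeric part; add the standard "GO:" prefix.
    parent_ids = ["GO:" + x.strip() for x in ids.strip("[]").split(",")]
    return PredictedTerm(int(row_id), parent_ids, names.strip("[] "), genes.split())


row = parse_row(
    "11 / [0055085, 0006811] / [ transmembrane transport, ion transport] "
    "/ NHA1 ENA2 ENA5 ENA1"
)
print(row.parent_ids, row.genes)
```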

Table S4. Sensitivity analysis of parameter p1, with p2 fixed at 10 and p3 fixed at 4.

p1 / CPR / CGR / F1
25 / 0.17 / 0.27 / 0.21
50 / 0.16 / 0.26 / 0.20
75 / 0.13 / 0.22 / 0.17
100 / 0.14 / 0.23 / 0.17

Table S5. Sensitivity analysis of parameter p2, with p1 fixed at 50 and p3 fixed at 4.

p2 / CPR / CGR / F1
10 / 0.16 / 0.26 / 0.20
20 / 0.16 / 0.26 / 0.20
30 / 0.13 / 0.23 / 0.16
40 / 0.12 / 0.22 / 0.16

Table S6. Sensitivity analysis of parameter p3, with p1 fixed at 50 and p2 fixed at 10.

p3 / CPR / CGR / F1
4 / 0.16 / 0.26 / 0.20
10 / 0.12 / 0.18 / 0.14
20 / 0.12 / 0.26 / 0.17
30 / 0.03 / 0.10 / 0.05
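This supplement does not restate how F1 is computed. Assuming it is the harmonic mean of CPR and CGR, F1 = 2 · CPR · CGR / (CPR + CGR), the values in Tables S4 to S6 can be cross-checked with a short script. This is a sketch under that assumption, and the names `f1_score`, `TABLES` and `max_deviation` are illustrative. Recomputed values agree with the reported ones to within 0.01, which is the precision to be expected when CPR and CGR are themselves rounded to two decimals.

```python
def f1_score(cpr: float, cgr: float) -> float:
    """Harmonic mean of CPR and CGR (assumed definition of F1)."""
    if cpr + cgr == 0:
        return 0.0
    return 2 * cpr * cgr / (cpr + cgr)


# (parameter value, CPR, CGR, reported F1), copied from Tables S4 to S6.
TABLES = {
    "S4 (p1)": [(25, 0.17, 0.27, 0.21), (50, 0.16, 0.26, 0.20),
                (75, 0.13, 0.22, 0.17), (100, 0.14, 0.23, 0.17)],
    "S5 (p2)": [(10, 0.16, 0.26, 0.20), (20, 0.16, 0.26, 0.20),
                (30, 0.13, 0.23, 0.16), (40, 0.12, 0.22, 0.16)],
    "S6 (p3)": [(4, 0.16, 0.26, 0.20), (10, 0.12, 0.18, 0.14),
                (20, 0.12, 0.26, 0.17), (30, 0.03, 0.10, 0.05)],
}


def max_deviation(tables: dict) -> float:
    """Largest absolute gap between recomputed and reported F1."""
    return max(
        abs(f1_score(cpr, cgr) - reported)
        for rows in tables.values()
        for _, cpr, cgr, reported in rows
    )


for name, rows in TABLES.items():
    for value, cpr, cgr, reported in rows:
        print(f"{name} = {value}: recomputed {f1_score(cpr, cgr):.3f}, "
              f"reported {reported:.2f}")
```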
