Table 3 Statistics of gene information

Gene number / 4711
Gene total length / 7382899 bp
Gene average length / 1567 bp
Coding sequence GC / 41%
Coding sequence average length / 1547.03 bp
Average exon number per gene / 1.09
Exon average length / 1425.12 bp
Intron average length / 237.36 bp
Intron GC / 37%
Single-exon genes / 4341
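
The average gene length reported in Table 3 can be reproduced from the gene number and the total gene length. The following is a minimal sketch of that arithmetic; the variable names are illustrative and are not taken from the original analysis, and the single-exon share is a derived figure that the table itself does not report.

```python
# Values copied from Table 3; variable names are illustrative only.
gene_number = 4711
gene_total_length_bp = 7382899
single_exon_genes = 4341

# Average gene length = total gene length / number of genes.
gene_average_length_bp = gene_total_length_bp / gene_number

# Share of predicted genes that consist of a single exon (derived, not in the table).
single_exon_fraction = single_exon_genes / gene_number

print(f"Average gene length: {gene_average_length_bp:.0f} bp")
print(f"Single-exon genes: {single_exon_fraction:.1%}")
```

The rounded average agrees with the 1567 bp listed in the table.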

Table 4 Statistics of KOG functional classification

Type / Number / Function
CELLULAR PROCESSES AND SIGNALING
M / 35 / Cell wall/membrane/envelope biogenesis
N / 1 / Cell motility
O / 361 / Post-translational modification, protein turnover, and chaperones
T / 236 / Signal transduction mechanisms
U / 267 / Intracellular trafficking, secretion, and vesicular transport
V / 25 / Defense mechanisms
W / 5 / Extracellular structures
Y / 29 / Nuclear structure
Z / 85 / Cytoskeleton
Total / 1044
INFORMATION STORAGE AND PROCESSING
A / 184 / RNA processing and modification
B / 73 / Chromatin structure and dynamics
J / 276 / Translation, ribosomal structure and biogenesis
K / 213 / Transcription
L / 161 / Replication, recombination and repair
Total / 907
METABOLISM
C / 178 / Energy production and conversion
D / 147 / Cell cycle control, cell division, chromosome partitioning
E / 196 / Amino acid transport and metabolism
F / 70 / Nucleotide transport and metabolism
G / 124 / Carbohydrate transport and metabolism
H / 77 / Coenzyme transport and metabolism
I / 115 / Lipid transport and metabolism
P / 84 / Inorganic ion transport and metabolism
Q / 59 / Secondary metabolites biosynthesis, transport, and catabolism
Total / 1050
POORLY CHARACTERIZED
R / 453 / General function prediction only
S / 226 / Function unknown
Total / 679

Table 5 KOG second-level functional classification of C. versatilis

Cellular processes and signaling / Information storage and processing / Metabolism / Poorly characterized
Type / Number / Percentage / Type / Number / Percentage / Type / Number / Percentage / Type / Number / Percentage
M / 35 / 0.95% / A / 184 / 5% / C / 178 / 4.80% / R / 453 / 12.30%
N / 1 / 0.03% / B / 73 / 1.90% / D / 147 / 4.00% / S / 226 / 6.10%
O / 361 / 9.80% / J / 276 / 7.50% / E / 196 / 5.30% / - / - / -
T / 236 / 6.40% / K / 213 / 5.80% / F / 70 / 1.90% / - / - / -
U / 267 / 7.30% / L / 161 / 4.40% / G / 124 / 3.40% / - / - / -
V / 25 / 0.68% / - / - / - / H / 77 / 2.10% / - / - / -
W / 5 / 0.14% / - / - / - / I / 115 / 3.10% / - / - / -
Y / 29 / 0.79% / - / - / - / P / 84 / 2.30% / - / - / -
Z / 85 / 2.30% / - / - / - / Q / 59 / 1.60% / - / - / -
Total / 1044 / 28% / Total / 907 / 25% / Total / 1050 / 29% / Total / 679 / 18%
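
The percentages in Table 5 appear to be each KOG count divided by the sum of all KOG assignments in Table 4 (3680). This denominator is an inference from the numbers and is not stated in the source. A minimal sketch of the calculation under that assumption, with illustrative names:

```python
# Counts copied from Table 4 (KOG type -> number of genes), grouped by category.
kog_counts = {
    "Cellular processes and signaling": {
        "M": 35, "N": 1, "O": 361, "T": 236, "U": 267,
        "V": 25, "W": 5, "Y": 29, "Z": 85,
    },
    "Information storage and processing": {
        "A": 184, "B": 73, "J": 276, "K": 213, "L": 161,
    },
    "Metabolism": {
        "C": 178, "D": 147, "E": 196, "F": 70, "G": 124,
        "H": 77, "I": 115, "P": 84, "Q": 59,
    },
    "Poorly characterized": {"R": 453, "S": 226},
}

# Per-category totals and the assumed denominator (sum over all categories).
category_totals = {name: sum(types.values()) for name, types in kog_counts.items()}
grand_total = sum(category_totals.values())

for name, types in kog_counts.items():
    print(name)
    for kog_type, count in types.items():
        print(f"  {kog_type}\t{count}\t{count / grand_total:.2%}")
    print(f"  Total\t{category_totals[name]}\t{category_totals[name] / grand_total:.0%}")
```

The category totals and their rounded shares (28%, 25%, 29%, 18%) match the table. Individual values printed to two decimals can differ slightly from the table, which seems to round most entries to one decimal place.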

Table 6 Predicted tRNAs of C. versatilis

Amino acid / Counts / Amino acid / Counts / Amino acid / Counts
Ala / 25 / Gly / 22 / His / 4
Ser / 21 / Pro / 11 / Glu / 14
Arg / 14 / Thr / 14 / Leu / 21
Lys / 24 / Val / 19 / Phe / 7
Asp / 11 / Ile / 12 / Asn / 10
Gln / 12 / Met / 10 / Cys / 5
Tyr / 7 / Trp / 6 / Total / 269

Table 7 Predicted transposons of C. versatilis

Name / Sequence begin / Sequence end / Transposon type / E-value
scaffold00001 / 575101 / 575985 / helitronORF / E=9e-28
scaffold00001 / 576328 / 576534 / helitronORF / E=4e-15
scaffold00001 / 1118129 / 1118452 / DDE_1 / E=7e-07
scaffold00001 / 1471793 / 1472416 / gypsy / E=3e-07
scaffold00002 / 37455 / 38480 / TY1_Copia / E=8e-07
scaffold00002 / 908918 / 909967 / DDE_1 / E=4e-17
scaffold00002 / 997574 / 998833 / TY1_Copia / E=4e-09
scaffold00002 / 1172986 / 1173903 / helitronORF / E=1e-29
scaffold00002 / 1174228 / 1174503 / helitronORF / E=1e-15
scaffold00003 / 826021 / 826839 / gypsy / E=5e-06
scaffold00008 / 543425 / 543547 / TY1_Copia / E=3e-07
scaffold00008 / 773974 / 774213 / TY1_Copia / E=7e-08
scaffold00009 / 345411 / 345785 / gypsy / E=1e-15
scaffold00009 / 351262 / 351729 / gypsy / E=2e-06
scaffold00009 / 847137 / 847679 / LINE / E=4e-12
scaffold00010 / 195650 / 196807 / TY1_Copia / E=2e-08
scaffold00010 / 796124 / 796816 / TY1_Copia / E=8e-11
scaffold00010 / 828017 / 828496 / gypsy / E=7e-09
scaffold00010 / 870350 / 870754 / gypsy / E=4e-06
scaffold00025 / 28856 / 29329 / TY1_Copia / E=7e-06
scaffold00029 / 3631 / 3825 / gypsy / E=4e-08
scaffold00039 / 3028 / 3108 / gypsy / E=4e-07
scaffold00039 / 3141 / 3989 / gypsy / E=1e-13
scaffold00039 / 3985 / 5010 / gypsy / E=2e-50
scaffold00050 / 2771 / 2953 / gypsy / E=3e-06
scaffold00050 / 2952 / 3365 / gypsy / E=2e-29
scaffold00059 / 2247 / 2411 / helitronORF / E=6e-06
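
The rows of Table 7 can be tallied by transposon type. The sketch below parses the rows in the same slash-separated layout used here; the function and variable names are hypothetical and are not part of the original prediction pipeline.

```python
from collections import Counter

# Rows copied from Table 7: name / begin / end / type / E-value.
TABLE_7 = """\
scaffold00001 / 575101 / 575985 / helitronORF / E=9e-28
scaffold00001 / 576328 / 576534 / helitronORF / E=4e-15
scaffold00001 / 1118129 / 1118452 / DDE_1 / E=7e-07
scaffold00001 / 1471793 / 1472416 / gypsy / E=3e-07
scaffold00002 / 37455 / 38480 / TY1_Copia / E=8e-07
scaffold00002 / 908918 / 909967 / DDE_1 / E=4e-17
scaffold00002 / 997574 / 998833 / TY1_Copia / E=4e-09
scaffold00002 / 1172986 / 1173903 / helitronORF / E=1e-29
scaffold00002 / 1174228 / 1174503 / helitronORF / E=1e-15
scaffold00003 / 826021 / 826839 / gypsy / E=5e-06
scaffold00008 / 543425 / 543547 / TY1_Copia / E=3e-07
scaffold00008 / 773974 / 774213 / TY1_Copia / E=7e-08
scaffold00009 / 345411 / 345785 / gypsy / E=1e-15
scaffold00009 / 351262 / 351729 / gypsy / E=2e-06
scaffold00009 / 847137 / 847679 / LINE / E=4e-12
scaffold00010 / 195650 / 196807 / TY1_Copia / E=2e-08
scaffold00010 / 796124 / 796816 / TY1_Copia / E=8e-11
scaffold00010 / 828017 / 828496 / gypsy / E=7e-09
scaffold00010 / 870350 / 870754 / gypsy / E=4e-06
scaffold00025 / 28856 / 29329 / TY1_Copia / E=7e-06
scaffold00029 / 3631 / 3825 / gypsy / E=4e-08
scaffold00039 / 3028 / 3108 / gypsy / E=4e-07
scaffold00039 / 3141 / 3989 / gypsy / E=1e-13
scaffold00039 / 3985 / 5010 / gypsy / E=2e-50
scaffold00050 / 2771 / 2953 / gypsy / E=3e-06
scaffold00050 / 2952 / 3365 / gypsy / E=2e-29
scaffold00059 / 2247 / 2411 / helitronORF / E=6e-06
"""


def parse_row(line):
    """Split one table row into (name, begin, end, type, e_value)."""
    name, begin, end, kind, e_value = [field.strip() for field in line.split(" / ")]
    # The E-value field is written as "E=9e-28"; keep only the number.
    return name, int(begin), int(end), kind, float(e_value.split("=")[1])


rows = [parse_row(line) for line in TABLE_7.splitlines() if line.strip()]

# Number of predicted elements per transposon type.
type_counts = Counter(row[3] for row in rows)
for kind, count in type_counts.most_common():
    print(f"{kind}\t{count}")

# Row with the smallest (most significant) E-value.
best_hit = min(rows, key=lambda row: row[4])
print("Most significant hit:", best_hit)
```

Under this tally, gypsy elements are the most frequent type among the 27 predicted entries, followed by TY1_Copia.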

Table 8 Genome information comparison between C. versatilis, Z. rouxii CBS 732 and S. cerevisiae S288c

Strain / C. versatilis / Z. rouxii CBS 732 / S. cerevisiae S288c
Genome size / 9.7 Mb / 9.76 Mb / 12.15 Mb
Sequencing coverage / 11.8× / 11.1×
Number of proteins / 4711 / 4994 / 5907
Number of scaffolds / 52 / 9 / 17
Length of N50 / 1,229,640 / 1,496,342
G+C content / 39.74% / 39.15% / 38.16%