Supplementary data Koduri et al.
Fig. S1 Amino acid sequence alignment of the P. patens chalcone synthase (CHS) superfamily.
Highlighted are the CHN catalytic triad (Cys170-His309-Asn342, PpCHS numbering) (Ferrer et al. 1999), the active-site Phe residues (Phe221 and Phe271) (Jez et al. 2000; Jez et al. 2002), the GFGPG loop (Suh et al. 2000), and the conserved Arg residues (Fukuma et al. 2007). Intron split sites are highlighted in green. Other residues discussed in the text (the N-ends of PpCHS3 and PpCHS5, etc.) are highlighted in cyan.
PpCHS ------MASAGDVTRAALPRAQPRAEG-PACVLGIGTAVPPAE 36
PpCHS1a/1c ------MASAGDVTRTALPRGQPRAEG-PACVLGIGTAVPPAE 36
PpCHS01 ------MASAGDVTRVALPRGQPRAEG-PACVLGIGTAVPPAE 36
PpCHS7 ------MASAGEVTRAALPRGQPCAEG-PACVLGIGTAVPLAE 36
PpCHS2 ------MAP-SGEVDVQGAA------TRSALPRGQPRAEG-PACVLGVGTAVPPAE 42
PpCHS3 ------MAPRAGELDVAASDEQVAAAPLVRMHAPIPRGQPRAEG-PACVLGIGTAVPPTE 53
PpCHS5 ------MAPRAGELDIAASDEQVAAAPLVRMHAPIPRGQPRAEG-PACVLGIGTAVPPTE 53
PpCHS4 ------MAP-AGEVEAEVRA------TRAVLPRGQPRAEG-PACVLGIGTAVPPTE 42
PpCHS6 ------MAPPSGESISASAEEPIALSV------LPRGQPRAEG-PASVLGIGTAVPPTE 46
PpCHS9 ------MASYVERGAVTNGHGHKVLQQPHPRLVPLPDG-PTCVFAIGTACPPTT 47
PpCHS10 MASRRVEAAFDGQAVELGATIPAANGNGTHQSIKVPGHRQVTPG-KTTIMAIGRAVPANT 59
PpCHS11 TKAGIEIKIMSDLGTESNGVAAHTNTNDIRCEGYVPYAVKLVEQRPPGILGMGTANPPHT 106
. ::.:* * *
PpCHS FLQSEYPDFFFNITNCGEKEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 95
PpCHS1a/1c FLQSEYPEFFFNITNCGEKEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 95
PpCHS01 FLQSEYPDFFFNITNCGEKEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 95
PpCHS7 FLQSEYPDFFFNITNCGEKEALKAKFKCICDKSGIKKRHMFLT-EEVLKANPGICTYMEP 95
PpCHS2 FLQSEYPDFFFNITNCGEKDALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 101
PpCHS3 FLQSEYPDFFFNITNTSEKEALHAKFKRICDKSGIRKRHMFLT-EEVLKANPSMCTYMEP 112
PpCHS5 FLQSEYPDFFFNITNTSEKEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 112
PpCHS4 FLQSDYPDFFFNITNTSDQEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 101
PpCHS6 FLQSEYPDFFFEVTKCSEKEALKAKFKRICDKSGIRKRYLFLT-KEVLEANPGIATYMEP 105
PpCHS9 IEQKTYPEKLFEMCGVGDNKPLLQKLKYMCDTSCIEKRHAFVT-EEVVKEYPEFASYSDK 106
PpCHS10 TFNDGLADHYIQEFNLQDP-VLQAKLRRLCETTTVKTRYLVVN-KEILDEHPEFLVDGAA 117
PpCHS11 YKMDEFAKILAKP-EFNGPPGAEVFVDRICKASGIKKKHTAVTADEVYAGYPNIYNFGEP 165
.. . . :*. : .:: : .*: * :
PpCHS SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 155
PpCHS1a/1c SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 155
PpCHS01 SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 155
PpCHS7 SLNVRHDVVVVLVPKLAAEAALKAIKEWCSCKSNITHIVFATTSGVNMPGADHALAKLLG 155
PpCHS2 SLNVRHDIVVVQVPKLAAEAAQRAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 161
PpCHS3 SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 172
PpCHS5 SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 172
PpCHS4 SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 161
PpCHS6 SLNVRHDIVVVQVPKLAAEAAVKAIKEWGGRKSEITHIVFATTSGVNMPGADHAMAKLLG 165
PpCHS9 SLTTRLNMANKVVPEIAVEAAMNAVQEWGRPLSDITHMVVATTSTLSIPGTDFVIARKLG 166
PpCHS10 TVSQRLAITGEAVTQLGHEAATAAIKEWGRPASEITHLVYVSSSEIRLPGGDLYLAQLLG 177
PpCHS11 SLDDRFKLFEKQGMNISIECSERALKDWGGDRSAITHLIVFSSTGMLTPAIDYRLLEALN 225
:: * : ::. *.: *:::* * ***:: ::: : *. * : . *.
PpCHS LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVASEVTAVTYR----APSENH 211
PpCHS1a/1c LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVASEVTAVTYR----APSENH 211
PpCHS01 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVASEVTAVTYR----APSENH 211
PpCHS7 LKPTVKRIMMCQTGYFGGASVLRVAKDLAENIKGARVRAVASEVTAVIYR----ASSKHH 211
PpCHS2 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVASEVTAVTYR----APSENH 217
PpCHS3 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVCSEVTAVTYR----APSENH 228
PpCHS5 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVCSEVTAVTYR----APSENH 228
PpCHS4 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVCSEVTAVTYR----APSENH 217
PpCHS6 LKPTVKRVMLYQTGCFGGATVLRVAKDLAENNKNARVLAVCSEVTAVTYR----APNENH 221
PpCHS9 LKPSVQRIFMNQVGCWGGGAVMRVGRILAESAKDARVLVIAAEANTIMNFRKPTEETFYK 226
PpCHS10 LRSDVNRVMLYMLGCYGGASGIRVAKDLAENNPGSRVLLITSECTLIGYK----SLSPDR 233
PpCHS11 LSPNVKHYFVSFLGCHGGVIGLRTACEIAEADPKHRVLIVCTELSSVQAQN--IDPAFTR 283
* . :.: :: * ** ::.. :** ** : :* .
PpCHS LDGLVGSALFGDGAGVYVVGSDPKPEVEKPLFEVHWAGETILPESDGAIDGHLTEAGLIF 271
PpCHS1a/1c LDGLVGSALFGDGAGVYVVGSDPKPEVEKPLFEVHWAGETILPESDGAIDGHLTEAGLIF 271
PpCHS01 LDGLVGSALFGDGAGVYVVGSDPKPEVEKPLFEVHWAGETILPESDGAIDGHLTEAGLIF 271
PpCHS7 LDGLVGSALFGDGVCVYVVGSDPKPEVEKLLFKKHWAGVTILPESDGAIDGHLPEAGFIF 271
PpCHS2 LDGLVGSALFGDGAGVYVVGSDPKPEVEKALFEVHWAGETILPESDGAIDGHLTEAGLIF 277
PpCHS3 LDGLVGSALFGDGAGVYVVGSDPKPEAEKALFEVHWAGESILPESDGAIDGHLTEAGLIF 288
PpCHS5 LDGLVGSALFGDGAGVYVVGSDPKPEAEKALFEVHWAGESILPESDGAIDGHLTEAGLIF 288
PpCHS4 LDGLVGSALFGDGAGVYVVGSDPKPQAEKALFEVHWAGESILPESDGAIDGHLTEAGLIF 277
PpCHS6 LDGLVGSALFGDGAAVFVVGADPKP-EEKPLFEVHWAGETILPESDGAIDGHLTEAGLIF 280
PpCHS9 VDYFLAHVTLGDGAAALILGADPKLNHERPLYEMYWSSQTAIEGSAEAIVGTFSDAGLVQ 286
PpCHS10 PYDLVGAALFGDGAAAMIMGKDPIPVLERAFFELDWAGQSFIPGTNKTIDGRLSEEGISF 293
PpCHS11 LNNIVTLTIFGDGAGAVVVGQ--PSKTEVPFFEMIRCKSTIIPNTSKSISVMITQHGLDA 341
:: . :***. . ::* * ::: . : : :* : : *:
PpCHS HLMK-DVPGLISKNIEKFLNEARKPVGSPAWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 330
PpCHS1a/1c HLMK-DVPGLISKNIEKFLNEARKPVGSPAWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 330
PpCHS01 HLMK-DVPGLISKNIEKFLNEARKPVGSPAWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 330
PpCHS7 HLMK-DVPGLIFKNIKKFLNEARKPVGSPAWNEMFWAVHPEGPAILNQVEAKLKLTKDKM 330
PpCHS2 HLMK-DVPGLISKNIEKFLNEARKCVGSPDWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 336
PpCHS3 HLMK-DVPGLISKNIEKFLNEARKCVGSPEWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 347
PpCHS5 HLMK-DVPGLISKNIEKFLNEARKCVGSPEWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 347
PpCHS4 HLMK-DVPGLISKNIEKFLNEARKCVGSPQWNDMFWAVHPGGPAILDQVEAKLKLSKDKM 336
PpCHS6 HLMK-DVPGLISKNIEKFLSEARKCVGSPDWNDMFWAVHPGGPAILDQVEAKLKLSKDKM 339
PpCHS9 SLQKNVVPDILGKHLKGLVSEGMELIGSPSPTDMFWVVHPGAYRILEVVSETMDIKKEKL 346
PpCHS10 KLGR-ELPKLIESNIQGFCDPILKRAGGLKYNDIFWAVHPGGPAILNAVQKQLDLAPEKL 352
PpCHS11 NLEK-DVPKNVSSSTGVFMKSLLDEFG-LDFASVGWAAHPGGKPILDAIEKVCGLLPDQL 399
* :* : : . * .: *..** . **: :. : :::
PpCHS QGSRDILSEFGNMSSASVLFVLDQIRHRSVKMGAS--TLGEGSEFGFFIGFGPGLTLEVL 388
PpCHS1a/1c QGSRDILSEFGNMSSASVLFVLDQIRHRSVKMGAS--TLGEGSEFGFFIGFGPGLTLEVL 388
PpCHS01 QGSRDILSEFGNMSSASVLFVLDQIRHRSVKMGAS--TLGEGSEFGFFIGFGPGLTLEVL 388
PpCHS7 QGSKDILSEFGNRSSASVLFVLDQIRHKFVKMGAS--TLGEDSEFGFFIGFGPGLTLEVL 388
PpCHS2 QGSRDILSEYGNMSSASVLFVLDQIRQRSVKMGAS--TLGEGSEFGFFIGFGPGLTLEVL 394
PpCHS3 QGSRDILSEYGNMSSSSVLFVLDQIRQRSVKMGAS--TLGEGSDFGFFIGFGPGLTLEVL 405
PpCHS5 QGSRDILSEYGNMSSSSVLFVLDQIRQRSVKMGAS--TLGEGSDFGFFIGFGPGLTLEVL 405
PpCHS4 QGSRDVLSEFGNMSSSSVLFVLDQIRHRSLEMRSS--TLGEGSEFGFFIGFGPGLTLEVL 394
PpCHS6 QGSRDVLSEFGNMSSSSVLFVLDQIRQRSMKMGAS--TTGEGNDFGFFIGFGPGLTLEVL 397
PpCHS9 QPSWDILRDFGNISSPTCLFVLDEMRKYSKRTGAA--TTGEGCEWGFLVGLGPGFNVELT 404
PpCHS10 QTARQVLRDYGNISSSTCIYVLDYMRHQSLKLKEANDNVNTEPEWGLLLAFGPGVTIEGA 412
PpCHS11 ENSRSVLENKGNMSSASVFFVLDEFRKK---------GRVAGRDWTVALGFGPGISIEGV 450
: .:* : ** :*. ::*** .*: :: :.:***..:*
PpCHS VLRAAPNSA 397
PpCHS1a/1c VLRAAPNSA 397
PpCHS01 VLRAAPNSA 397
PpCHS7 VLRAAPNSA 397
PpCHS2 VLRAAATV- 402
PpCHS3 VLRAAANV- 413
PpCHS5 VLRAAAANV 414
PpCHS4 VLRAAVNV- 402
PpCHS6 VLRSMPIV- 405
PpCHS9 LLLSVPF-- 411
PpCHS10 LLRNLC--- 418
PpCHS11 LLRNIYH-- 457
:*
Fig. S2 Putative translation initiation sites of PpCHS11.
(1) Nucleotide sequence of the 5′-end region. The four ATG codons are highlighted in color, and their surrounding sequences are highlighted in gray.
ATGTCCACACGTAGCGAGCTAGTTCGTTCATGACCTTTGGGAACTCTCCTGGGAGTGCCA 60
CTTAAATTGTGGTTGTAAAAAAGTGATATGTTTTCAGAAGCTGAGGAGGAATCGAAGCTT 120
TTGGATTAACAATGAGCTTATATTAGTCGCAAAAGGATGACCCTCTTGAGCTATAAAGAC 180
GAGCTAGCCCCTGGCGGAGGGGTCAGATGCAACTAGACTGCAGAGCCTGTGAGACCAGTA 240
CAAGCATTGAAGTCTACATGAATCCTCCGCTTCGCAGTCACATGTATATCTGCTAGTCTT 300
CATGCCATGTACACTAAGCTTCCAGTCAACTTAAATGCTGCCCATTGAAACCTTCTTTCT 360
TGTCTGCTCAAGCTCAGACTCGGTGTCTTCATTGTGACCAACCTTCGGCGAAGACAACCG 420
ACGGTCTTGAGCAGTTCCTGTTTCTTCTTCCTCTTTTTTATCTACTTTTGTTTTTCTCCA 480
ACCATTCAGTTTCTTGTATCATGGCACCAGCATCCGACTCCGCGGTGGAAGAGCCGAGTC 540
TCGCCAATACAGGTGCGGTGATGAAGTCCTTGAGTGATCTGGTGGTGCAGAACGGCAATG 600
GAGTGCATGTCCGCTGCCGAGACGATGGACTCAAGGACACCAAGGCTGGAATAGAAATCA 660
AAATCATGTCAGACTTGGGCACTGAAAGCAACGGCGTTGCGGCGCATACTAACACCAATG 720
ACATAAGGTGCGAGGGTTATGTGCCGTATGCAGTAAAACTGGTGGAGCAAAGGCCGCCTG 780
GTATACTGGGCATGGGGACTGCCAATCCTCCGCACA
(2) Deduced amino acid sequence of the 5′-end region. The four Met residues are highlighted in colors matching the ATG codons above; stop codons are shown as "-".
VHT-RASSFMTFGNSPGSAT-IVVVKK-YVFRS-GGIEAFGLTMSLY-SQKDDPLEL-RR
ASPWRRGQMQLDCRACETSTSIEVYMNPPLRSHMYIC-SSCHVH-ASSQLKCCPLKPSFL
SAQAQTRCLHCDQPSAKTTDGLEQFLFLLPLFYLLLFFSNHSVSCIMAPASDSAVEEPSL
ANTGAVMKSLSDLVVQNGNGVHVRCRDDGLKDTKAGIEIKIMSDLGTESNGVAAHTNTND
IRCEGYVPYAVKLVEQRPPGILGMGTANPPH
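The deduced sequence in (2) appears to correspond to a translation of the nucleotide sequence in (1) read from its third nucleotide (offset 2), with stop codons written as "-". A minimal Python sketch of such a translation is given here for orientation only. The function name `translate` and the choice of the standard genetic code are assumptions made for illustration and do not describe the software used to prepare the figure.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(nt, offset=0, stop="-"):
    """Translate nt from a 0-based offset; stop codons are written as `stop`."""
    nt = nt.upper()
    residues = []
    for i in range(offset, len(nt) - 2, 3):
        aa = CODON_TABLE.get(nt[i:i + 3], "X")  # X marks an unrecognised codon
        residues.append(stop if aa == "*" else aa)
    return "".join(residues)


# First 60 nt of the 5'-end region shown in (1).
first_line = "ATGTCCACACGTAGCGAGCTAGTTCGTTCATGACCTTTGGGAACTCTCCTGGGAGTGCCA"
print(translate(first_line, offset=2))  # VHT-RASSFMTFGNSPGSA
```

The printed residues match the start of the deduced sequence in (2). The four highlighted ATG codons (positions 501, 561, 666 and 792) all fall in this same reading frame.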
(3) Comparison of the putative translation initiation sites with the plant initiation consensus sequence.
Consensus sequence:        5′- caA (A/C) aA+1TGGCg
Site 1 (position at 501):  5′- GTA   T   CA+1TGGCA
Site 2 (position at 561):  5′- CGG   T   GA+1TGAAG
Site 3 (position at 666):  5′- AAA   T   CA+1TGTCA
Site 4 (position at 792):  5′- TGG   G   CA+1TGGGG
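One simple way to read the comparison in (3) is to count, for each site, how many flanking positions agree with the consensus. The Python sketch below does this using the site sequences exactly as listed in (3). Treating upper- and lower-case consensus letters as equally weighted and excluding the ATG itself from the count are simplifying assumptions, and the names `CONSENSUS_FLANKS` and `flank_matches` are hypothetical.

```python
# Allowed bases at the flanking positions of caA(A/C)aA+1TGGCg.
# The first five entries are positions -5..-1; the last three are +4..+6.
CONSENSUS_FLANKS = ["C", "A", "A", "AC", "A", "G", "C", "G"]

# Site sequences as listed in (3), written without spaces or the +1 marker.
SITES = {
    "Site 1 (501)": "GTATCATGGCA",
    "Site 2 (561)": "CGGTGATGAAG",
    "Site 3 (666)": "AAATCATGTCA",
    "Site 4 (792)": "TGGGCATGGGG",
}


def flank_matches(site):
    """Count flanking positions (ATG excluded) that agree with the consensus."""
    if site[5:8] != "ATG":
        raise ValueError("expected ATG at positions +1 to +3")
    flanks = site[:5] + site[8:]
    return sum(base in allowed for base, allowed in zip(flanks, CONSENSUS_FLANKS))


for name, seq in SITES.items():
    print(name, flank_matches(seq), "of", len(CONSENSUS_FLANKS))
```

The count weights every position equally; any position-specific weighting would have to be added separately.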
Fig. S3 Sequence analysis of PpCHS8.
(1) Nucleotide sequence of PpCHS8. The proposed exons are in black, putative intron (625 nt) in green and untranslated regions in gray. The start and stop codons are highlighted and the ATG codon that may lead to a truncated protein is underlined.
ACAACTAATGTCAGCTCCTTTCAAAGGGTAATAATGTTAAGAGGCAGACAGTTTGCTACAGGCTCATGTAGCCACTGGTGATTCACTCTCAATTGCTGGGCAATATGTTCTGAGAAGATGTATGCACAGTTCAAGGCCACATCCTTTTCATGTAAACTTGTGCATACTGAAACTCAAATCCCCTTTCAATTTATGCAAGGATGAACGTACAAAGCATGATTGATTTATGAGCAGTATTGCAGCTCATGCAGGCAACATTGCAATCAATCTCACCCTACCCTTAAAATTCAGCCAAACGTACAGAAATTCTTGGTGTAGCAAATGAATTCCTTGTCCACTCTTAACTATTCCTACTTCCACCAATCCAGTTTTGAAAATTTATTTGAAGAATTTATTTCCAAATAAAAAGTTGTCTGATGTCGAAGCGAATCGCGCCGTTGGAACTTGAAAGAAATGAAACATTCCACTTCTGTAGGAACTAGTCAGTACACATCAGTTGGCGAAACGAGCTCTGAGCTGATTTCAAATATGAGAGCCAGAGGAGCTGTATTACTCGAAATGAAAGAGAGCTTCTTATATCCGGACTCACAGCATGAGAACCTTGGGAGGAGGAAGGCTCATTGATCACTTCGGTGGGTGCCAACTGAGACCCATGGTCTTCGAGAAAATGAATAGACTTGACCGGCGGTATGTTTTACTGTTCCACAGTGGCTGCTTTGGAGGCTCGACAAGTCAACTTTGACTGGTCTTGCATCAAGTCCATTCATCTG
ATGGAGCCCCGCTCTTTGTCACTGGGGAAGATCTCACTGAGAATCTACGAACTTTTTGCAGAAGGGCCGGTCTGCAGAAGGGCCGGCGGTGATTCTTAGCATCGGAACTGCGGTGCCTCCCTATGTGCACGAGATGGGGTCGTACGCCGATTACTACTTTGACGAGACCAACTGCAACCACAAGCCAGAGCTGAAGGCGAAGTTCAAGCGCATCTGCGATAAGATGCACATTAGCAAACGTCACATGGTAGTGCGGAAGGAGCTCCTGGCGCAATACCCCTCTCTGGGAACATACCTCAACAACAGCCTGGAGGACCGCCACAAGGTGTGCATGGAATGGGTACCGAAGTTGGCAGTTGAGGCAGCTGAGAACGCAATCAAGGAATGGGGCGGGTCACTCTCACAGATTACCCATATTGTCATGGCGACTACAAGCGTCGTTAACATGCCCGGAGTAGATCTTCTCGTTGCCAAAGCGCTCGGCCTCTCACCCAAGCTGCGCCGCGTGATGATGTATCAAACAGGATGCTGGGGAGGAGCCGCGATCATTCGGGTGGCAAAAGACATCGCCGAGAATAACAAGGGGGCACGTGTTCTTGTTGTCGCAAGCGAGTGTACTGCAACCTTCTTCCGCGCCCCTAGTGAGGAATATCTCGATGGACTGGTGGGCCAGGCATTGTTCGGCGACGGCGCTGGAGCACTTGTCATCGGCGCTGATCCCAACCCCGACACTGAAAGGACTCTATACGAGATTCAGTGGAGCGGGGAGA
TGGTGGTACCTGATAGCGAGGGCGCCATCGATGGCCACATGATGGAGGCCGGCATGTACTACCACCTCAAGCCAGACATCCCGAAACTAGTCTCCCGCAGCATCGAGGAGTTCGTTTCCGACGCCACCGCTCAAGCAGGAAATGCCGACGTCAACGACTTGTTCTGGGCTGTACACCCCGGAGGCGTCGCCATTCTCAATCAGATCGAGAACCAGCTGATGCTCTCTCCTGAGAAGCTGCTCGCCAGTCGGGAGATTCTGGCCGACTATGGCAACATGGCCAGTGCGTGCGTGCTTTTCGTCTTGGATCAAGTCCGAAACTGCTCGATCAAAGCCAAAGCATCCACTACCGGGGAAGGTAGGGACTTCGGATCGCTCATCGGCATCGGCCCAGGGCTCACGATGGAATGCTGTGTGCTGAAGTCTGTTCCACTTGATAATTAGCCTTGACGTATTGCAGGAATTTTTTGGACTTTTTTTTAAAACTTAGAAAGTGATTCAGTCTCGGGGATAACTGGTGAAAGCTTCCTATAATTTGTCATCTTAAACGGCTTCAGTATAGTAAGTTTGACAGCTGCGAGGACTGGGCAGGCCATTGCCGAACTCGAGTCACCGTTGTGATGACGAACTGCCAATGCAGCTTG
(2) Deduced amino acid sequence of PpCHS8. The 26 N-end residues that show sequence similarity to other P. patens CHS enzymes are highlighted, as are the CHN triad and the GIGPG loop. The proposed exon-intron split site is marked by an arrow.
MNKGRSAEGPAVILSIGTAVPPYVHEMGSYADYYFDETNCNHKPELKAKFKRICDKMHISKRHMVVRKELLAQYPSLGTYLNNSLEDRHKVCMEWVPKLAVEAAENAIKEWGGSLSQITHIVMATTSVVNMPGVDLLVAKALGLSPKLRRVMMYQTGCWGGAAIIRVAKDIAENNKGARVLVVASECTATFFRAPSEEYLDGLVGQALFGDGAGALVIGADPNPDTERTLYEIQWSGEMVVPDSEGAIDGHMMEAGMYYHLKPDIPKLVSRSIEEFVSDATAQAGNADVNDLFWAVHPGGVAILNQIENQLMLSPEKLLASREILADYGNMASACVLFVLDQVRNCSIKAKASTTGEGRDFGSLIGIGPGLTMECCVLKSVPLDN-
Fig. S4 Sequence analysis of PpCHSpg1.
(1) Genome sequence of PpCHSpg1. The exon sequences are in black, putative intron (258 nt) in green and untranslated regions in gray. The proposed exon-intron split site is highlighted.
77988 TCGAGACAACCCCGAGTGAGCAGCTCGGACTCAACATCCATTCCGAACCCAAACGTGCCCATGGCACCAA
77918 GCGGAGAAGTCGACGTTCAGGGTGCTGCCACAAGGAGCGCGCTTCCCAGAGGCCAGCCTCGCGCTGAGGG
77848 ACCAGCATGTGTGTTGGGCGTCGGCACTGCGGTGCCTCCCGCGGAGTTCCTGCAGAGCGAGTACCCCGAC
77778 TTCTTCTTCAACATCACCAACTGCGGAAGGCCAAATTCAAGCGCATCTGTAAGTCCACCTCCACCGTGTG
77708 ACTCTCTTCGCGGCTCGTGATGTGCTCACTGTTCCCTGTGGGCCGGCCCTCACCCCCAGCCGCGCTCGAC
77638 GACGTGCTCATTGCACGGGCTGCGCGGCGCGAGCGCATTAGAATTGCGACTTGTGATTTGTTGCGGGCGT
77568 CGAGATGGGAATGCGGCGGTCGCGATGTTCCTGAAATTGGGTCGATTCCGGCCCCTTACGACGGCAGCTG
77498 ACGCTTGGCCATGAATGTCGATGCAGGTGACAAGTCGGGGATCCGCAAGCGCCACATGTTCCTCACGGAG
77428 GAGGTGCTCAAGGCTAACCCCGGCATCTGCACGTACATGGAGCCCTCCCTGAACGTCCGCCACGACATCG
77358 TCGTCGTCCAGGTCCCCAAGCTCGCCGCGGAGGCAGCCCAGAGGGCCATCAAGGAGTGGGGCGGCCGCAA
77288 GTCTGACATCACCCACATCGTGTTCGCCACCACCAGCGGCGTGAACATGCCCGGAGCCGACCACGCCCTG
77218 GCCAAGCTGCTGGGCCTGAAGCCCACGGTGAAGCGGGTCATGATGTACCAGACCGGGTGCTTTGGCGGTG
77148 CTTCCGTGCTCAGGGTGGCCAAGGATCTGGCGGAGAACAACAAGGGCGCCAGGGTGTTGGCGGTGGCCAG
77078 CGAGGTCACGGCCGTCACATACCGCGCACCCAGCGAGAACCACTTGGACGGCTTGGTGGGCTCGGCCCTG
77008 TTCGGCGATGGCGCCGGAGTGTACGTGGTGGGATCCGATCCCAAGCCGGAGGTGGAGAAAGCACTGTTCG
76938 AGGTGCACTGGGCGGGCGAGACGATCTTGCCAGAGAGTGATGGAGCCATTGATGGGCATCTGACGGAGGC
76868 GGGGCTCATCTTCCACCTCATGAAGGACGTGCCAGGGCTGATCTCCAAGAACATCGAGAAGTTCTTGAAC
76798 GAGGCCAGGAAGTGCGTCGGTTCGCCCGATTGGAACGAGATGTTCTGGGCGGTGCACCCGGGAGGCCCGG
76728 CCATTCTGGACCAGGTGGAGGCGAAGCTGAAGCTGACCAAGGACAAGATGCAGGGGAGCAGGGACATACT
76658 GTCGGAGTACGGCAACATGTCGTCGGCGTCGGTGTTGTTCGTGCTGGATCAGATTCGCCAGAGGTCGGTC
76588 AAGATGGGGGCGTCGACGCTGGGAGAGGGCAGCGAGTTTGGCTTCTTCATTGGATTCGGTCCGGGGCTCA
76518 CCCTGGAAGTGCTGGTCCTCCGGGCCGCGGCCACCGTTTGAGTTGTAGCAACGGGCACGGGCAAATCAAT
76448 ATCGTCTCCCTGGTTTTCCTTTTGCGCGCTGGAGTGGAGATCGTATGAGTGAGCGAATTATGATGATTCA
76378 TGGGTCAGAGAGCATCCAGGTTGTGGCACCAGAGGGTTTTACATTCAGGTCTGCAGCAAATGGACTTTCG
76308 ATGAAATTTGGTCAGTTGGATCAGCGCGCTA 76278
(2) Partial sequence alignment of PpCHS2a and PpCHSpg1 transcripts.
Upstream and downstream from the region shown, the sequences are identical. The proposed exon-intron split sites are highlighted.
PpCHS2a   ATGGCACCAAGCGGAGAAGTCGACGTTCAGGGTGCTGCCACAAGGAGCGCGCTTCCCAGA
PpCHSpg1  ATGGCACCAAGCGGAGAAGTCGACGTTCAGGGTGCTGCCACAAGGAGCGCGCTTCCCAGA
          ************************************************************
PpCHS2a   GGCCAGCCTCGCGCTGAGGGACCAGCATGTGTGTTGGGCGTCGGCACTGCGGTGCCTCCC
PpCHSpg1  GGCCAGCCTCGCGCTGAGGGACCAGCATGTGTGTTGGGCGTCGGCACTGCGGTGCCTCCC
          ************************************************************
PpCHS2a   GCGGAGTTCCTGCAGAGCGAGTACCCCGACTTCTTCTTCAACATCACCAACTGCGGCGAG
PpCHSpg1  GCGGAGTTCCTGCAGAGCGAGTACCCCGACTTCTTCTTCAACATCACCAACTGCGG----
          ********************************************************
PpCHS2a   AAGGACGCCCTGAAGGCCAAATTCAAGCGCATCTGTGACAAGTCGGGGATCCGCAAGCGC
PpCHSpg1  ------------AAGGCCAAATTCAAGCGCATCTGTGACAAGTCGGGGATCCGCAAGCGC
                      ************************************************
PpCHS2a   CACATGTTCCTCACGGAGGAGGTGCTCAAGGCCAACCCCGGCATCTGCACGTACATGGAG
PpCHSpg1  CACATGTTCCTCACGGAGGAGGTGCTCAAGGCTAACCCCGGCATCTGCACGTACATGGAG
          ******************************** ***************************
Fig. S5 Alignment of deduced amino acid sequences of PpCHS and PpCHSpg2. Two nonsense mutations in PpCHSpg2 are highlighted.
PpCHS     MASAGDVTRAALPRAQPRAEGPACVLGIGTAVPPAEFLQSEYPDFFFNITNCGEKEALKA
PpCHSpg2  MASAGNITRAALPRGQPRAEGPTFVLGISTAVPPAKFLQSKYPDFFFNITNCGEKEVLKA
          *****  ******* *******  **** ****** **** *************** ***
PpCHS     KFKRICDKSGIRKRHMFLTEEVLKANPGICTYMEPSLNVRHDIVVVQVPKLAAEAAQKAI
PpCHSpg2  KFRCICDKLGIPKRHMFLMEGMPKANPGICTYMEPTLNVRHDIVVVQVPKLAAEAVQKAI
          **  **** ** ****** *   ************ ******************* ****
PpCHS     KEWGGRKSDITHIVFATTSGVNMPGADHALAKLLGLKPTVKRVMMYQTGCFGGASVLRVA
PpCHSpg2  KEYGGRKSDITHIVFATTSGVNMTGADHALAKLLGLKPTVKLVLMYQTGCPGSASVLRVA
          ** ******************** ***************** * ****** * *******
PpCHS     KDLAENNKGARVLAVASEVTAVTYRAPSENHLDGLVGSALFGDGAGVYVVGSDPKPEVEK
PpCHSpg2  KDLAENNKSSRVLAVASEVTAVTYRAPSENHLDGLVGSALFGDDADVHVVGSDPKPEVEK
          ********  ********************************* * * ************
PpCHS     PLFEVHWAGETILPESDGAIDGHLTEAGLIFHLMKDVPGLISKNIEKFLNEARKPVGSPA
PpCHSpg2  PLFEVHWAGETILPESGGAIDGHLTEAGLIFHLMKDEPVLIFKNIERF-NEARKPVGSPA
          **************** ******************* * ** **** * ***********
PpCHS     WNEMFWAVHPGGPAILDQVEAKLKLTKDKMQGSRDILSEFGNMSSASVLFVLDQIRHRSV
PpCHSpg2  WNEMF-AVHLGGSAILDQVEAKLQLTKDKMQGNRDILFEFGNTSSALMLFVLDQIRRRSV
          ***** *** ** ********** ******** **** **** ***  ******** ***
PpCHS     KMGASTLGEGSEFGFFIGFGPGLTLEVLVLRAAPNSA-
PpCHSpg2  EMRVSTMGEGSKFGFLIGFGPGVVLDVLVLRVAANSA-
           *  ** **** *** ******  * ***** * ***
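The identity line under each block of Fig. S5 marks the columns in which the two sequences carry the same residue. A short Python sketch of how such a line, and a percent identity, can be derived from two aligned sequences is given below. The function names are hypothetical, and marking only exact matches with "*" while leaving "-" columns unmarked is an assumption chosen to mirror the figure.

```python
def identity_line(a, b):
    """Return '*' where aligned residues are identical, ' ' elsewhere."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return "".join("*" if x == y and x != "-" else " " for x, y in zip(a, b))


def percent_identity(a, b):
    """Identical columns as a percentage of all aligned columns."""
    line = identity_line(a, b)
    return 100.0 * line.count("*") / len(line)


# First block of the PpCHS / PpCHSpg2 alignment.
ppchs = "MASAGDVTRAALPRAQPRAEGPACVLGIGTAVPPAEFLQSEYPDFFFNITNCGEKEALKA"
ppchspg2 = "MASAGNITRAALPRGQPRAEGPTFVLGISTAVPPAKFLQSKYPDFFFNITNCGEKEVLKA"
print(identity_line(ppchs, ppchspg2))
print(round(percent_identity(ppchs, ppchspg2), 1))
```

Applied block by block, the same function can be used to check each identity line against the two sequences above it.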