Supplementary data Koduri et al.

Fig. S1 Amino acid sequence alignment of the P. patens chalcone synthase (CHS) superfamily.

Highlighted are the CHN catalytic triad (Cys170-His309-Asn342, PpCHS numbering) (Ferrer et al. 1999), the active-site Phe residues (Phe221 and Phe271) (Jez et al. 2000; Jez et al. 2002), the GFGPG loop (Suh et al. 2000), and the conserved Arg residues (Fukuma et al. 2007). Intron split sites are highlighted in green. Other residues discussed in the text (the N-termini of PpCHS3 and PpCHS5, etc.) are highlighted in cyan.

PpCHS ------MASAGDVTRAALPRAQPRAEG-PACVLGIGTAVPPAE 36

PpCHS1a/1c ------MASAGDVTRTALPRGQPRAEG-PACVLGIGTAVPPAE 36

PpCHS01 ------MASAGDVTRVALPRGQPRAEG-PACVLGIGTAVPPAE 36

PpCHS7 ------MASAGEVTRAALPRGQPCAEG-PACVLGIGTAVPLAE 36

PpCHS2 ------MAP-SGEVDVQGAA------TRSALPRGQPRAEG-PACVLGVGTAVPPAE 42

PpCHS3 ------MAPRAGELDVAASDEQVAAAPLVRMHAPIPRGQPRAEG-PACVLGIGTAVPPTE 53

PpCHS5 ------MAPRAGELDIAASDEQVAAAPLVRMHAPIPRGQPRAEG-PACVLGIGTAVPPTE 53

PpCHS4 ------MAP-AGEVEAEVRA------TRAVLPRGQPRAEG-PACVLGIGTAVPPTE 42

PpCHS6 ------MAPPSGESISASAEEPIALSV------LPRGQPRAEG-PASVLGIGTAVPPTE 46

PpCHS9 ------MASYVERGAVTNGHGHKVLQQPHPRLVPLPDG-PTCVFAIGTACPPTT 47

PpCHS10 MASRRVEAAFDGQAVELGATIPAANGNGTHQSIKVPGHRQVTPG-KTTIMAIGRAVPANT 59

PpCHS11 TKAGIEIKIMSDLGTESNGVAAHTNTNDIRCEGYVPYAVKLVEQRPPGILGMGTANPPHT 106

. ::.:* * *

PpCHS FLQSEYPDFFFNITNCGEKEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 95

PpCHS1a/1c FLQSEYPEFFFNITNCGEKEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 95

PpCHS01 FLQSEYPDFFFNITNCGEKEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 95

PpCHS7 FLQSEYPDFFFNITNCGEKEALKAKFKCICDKSGIKKRHMFLT-EEVLKANPGICTYMEP 95

PpCHS2 FLQSEYPDFFFNITNCGEKDALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 101

PpCHS3 FLQSEYPDFFFNITNTSEKEALHAKFKRICDKSGIRKRHMFLT-EEVLKANPSMCTYMEP 112

PpCHS5 FLQSEYPDFFFNITNTSEKEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 112

PpCHS4 FLQSDYPDFFFNITNTSDQEALKAKFKRICDKSGIRKRHMFLT-EEVLKANPGICTYMEP 101

PpCHS6 FLQSEYPDFFFEVTKCSEKEALKAKFKRICDKSGIRKRYLFLT-KEVLEANPGIATYMEP 105

PpCHS9 IEQKTYPEKLFEMCGVGDNKPLLQKLKYMCDTSCIEKRHAFVT-EEVVKEYPEFASYSDK 106

PpCHS10 TFNDGLADHYIQEFNLQDP-VLQAKLRRLCETTTVKTRYLVVN-KEILDEHPEFLVDGAA 117

PpCHS11 YKMDEFAKILAKP-EFNGPPGAEVFVDRICKASGIKKKHTAVTADEVYAGYPNIYNFGEP 165

.. . . :*. : .:: : .*: * :

PpCHS SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 155

PpCHS1a/1c SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 155

PpCHS01 SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 155

PpCHS7 SLNVRHDVVVVLVPKLAAEAALKAIKEWCSCKSNITHIVFATTSGVNMPGADHALAKLLG 155

PpCHS2 SLNVRHDIVVVQVPKLAAEAAQRAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 161

PpCHS3 SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 172

PpCHS5 SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 172

PpCHS4 SLNVRHDIVVVQVPKLAAEAAQKAIKEWGGRKSDITHIVFATTSGVNMPGADHALAKLLG 161

PpCHS6 SLNVRHDIVVVQVPKLAAEAAVKAIKEWGGRKSEITHIVFATTSGVNMPGADHAMAKLLG 165

PpCHS9 SLTTRLNMANKVVPEIAVEAAMNAVQEWGRPLSDITHMVVATTSTLSIPGTDFVIARKLG 166

PpCHS10 TVSQRLAITGEAVTQLGHEAATAAIKEWGRPASEITHLVYVSSSEIRLPGGDLYLAQLLG 177

PpCHS11 SLDDRFKLFEKQGMNISIECSERALKDWGGDRSAITHLIVFSSTGMLTPAIDYRLLEALN 225

:: * : ::. *.: *:::* * ***:: ::: : *. * : . *.

PpCHS LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVASEVTAVTYR----APSENH 211

PpCHS1a/1c LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVASEVTAVTYR----APSENH 211

PpCHS01 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVASEVTAVTYR----APSENH 211

PpCHS7 LKPTVKRIMMCQTGYFGGASVLRVAKDLAENIKGARVRAVASEVTAVIYR----ASSKHH 211

PpCHS2 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVASEVTAVTYR----APSENH 217

PpCHS3 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVCSEVTAVTYR----APSENH 228

PpCHS5 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVCSEVTAVTYR----APSENH 228

PpCHS4 LKPTVKRVMMYQTGCFGGASVLRVAKDLAENNKGARVLAVCSEVTAVTYR----APSENH 217

PpCHS6 LKPTVKRVMLYQTGCFGGATVLRVAKDLAENNKNARVLAVCSEVTAVTYR----APNENH 221

PpCHS9 LKPSVQRIFMNQVGCWGGGAVMRVGRILAESAKDARVLVIAAEANTIMNFRKPTEETFYK 226

PpCHS10 LRSDVNRVMLYMLGCYGGASGIRVAKDLAENNPGSRVLLITSECTLIGYK----SLSPDR 233

PpCHS11 LSPNVKHYFVSFLGCHGGVIGLRTACEIAEADPKHRVLIVCTELSSVQAQN--IDPAFTR 283

* . :.: :: * ** ::.. :** ** : :* .

PpCHS LDGLVGSALFGDGAGVYVVGSDPKPEVEKPLFEVHWAGETILPESDGAIDGHLTEAGLIF 271

PpCHS1a/1c LDGLVGSALFGDGAGVYVVGSDPKPEVEKPLFEVHWAGETILPESDGAIDGHLTEAGLIF 271

PpCHS01 LDGLVGSALFGDGAGVYVVGSDPKPEVEKPLFEVHWAGETILPESDGAIDGHLTEAGLIF 271

PpCHS7 LDGLVGSALFGDGVCVYVVGSDPKPEVEKLLFKKHWAGVTILPESDGAIDGHLPEAGFIF 271

PpCHS2 LDGLVGSALFGDGAGVYVVGSDPKPEVEKALFEVHWAGETILPESDGAIDGHLTEAGLIF 277

PpCHS3 LDGLVGSALFGDGAGVYVVGSDPKPEAEKALFEVHWAGESILPESDGAIDGHLTEAGLIF 288

PpCHS5 LDGLVGSALFGDGAGVYVVGSDPKPEAEKALFEVHWAGESILPESDGAIDGHLTEAGLIF 288

PpCHS4 LDGLVGSALFGDGAGVYVVGSDPKPQAEKALFEVHWAGESILPESDGAIDGHLTEAGLIF 277

PpCHS6 LDGLVGSALFGDGAAVFVVGADPKP-EEKPLFEVHWAGETILPESDGAIDGHLTEAGLIF 280

PpCHS9 VDYFLAHVTLGDGAAALILGADPKLNHERPLYEMYWSSQTAIEGSAEAIVGTFSDAGLVQ 286

PpCHS10 PYDLVGAALFGDGAAAMIMGKDPIPVLERAFFELDWAGQSFIPGTNKTIDGRLSEEGISF 293

PpCHS11 LNNIVTLTIFGDGAGAVVVGQ--PSKTEVPFFEMIRCKSTIIPNTSKSISVMITQHGLDA 341

:: . :***. . ::* * ::: . : : :* : : *:

PpCHS HLMK-DVPGLISKNIEKFLNEARKPVGSPAWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 330

PpCHS1a/1c HLMK-DVPGLISKNIEKFLNEARKPVGSPAWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 330

PpCHS01 HLMK-DVPGLISKNIEKFLNEARKPVGSPAWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 330

PpCHS7 HLMK-DVPGLIFKNIKKFLNEARKPVGSPAWNEMFWAVHPEGPAILNQVEAKLKLTKDKM 330

PpCHS2 HLMK-DVPGLISKNIEKFLNEARKCVGSPDWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 336

PpCHS3 HLMK-DVPGLISKNIEKFLNEARKCVGSPEWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 347

PpCHS5 HLMK-DVPGLISKNIEKFLNEARKCVGSPEWNEMFWAVHPGGPAILDQVEAKLKLTKDKM 347

PpCHS4 HLMK-DVPGLISKNIEKFLNEARKCVGSPQWNDMFWAVHPGGPAILDQVEAKLKLSKDKM 336

PpCHS6 HLMK-DVPGLISKNIEKFLSEARKCVGSPDWNDMFWAVHPGGPAILDQVEAKLKLSKDKM 339

PpCHS9 SLQKNVVPDILGKHLKGLVSEGMELIGSPSPTDMFWVVHPGAYRILEVVSETMDIKKEKL 346

PpCHS10 KLGR-ELPKLIESNIQGFCDPILKRAGGLKYNDIFWAVHPGGPAILNAVQKQLDLAPEKL 352

PpCHS11 NLEK-DVPKNVSSSTGVFMKSLLDEFG-LDFASVGWAAHPGGKPILDAIEKVCGLLPDQL 399

* :* : : . * .: *..** . **: :. : :::

PpCHS QGSRDILSEFGNMSSASVLFVLDQIRHRSVKMGAS--TLGEGSEFGFFIGFGPGLTLEVL 388

PpCHS1a/1c QGSRDILSEFGNMSSASVLFVLDQIRHRSVKMGAS--TLGEGSEFGFFIGFGPGLTLEVL 388

PpCHS01 QGSRDILSEFGNMSSASVLFVLDQIRHRSVKMGAS--TLGEGSEFGFFIGFGPGLTLEVL 388

PpCHS7 QGSKDILSEFGNRSSASVLFVLDQIRHKFVKMGAS--TLGEDSEFGFFIGFGPGLTLEVL 388

PpCHS2 QGSRDILSEYGNMSSASVLFVLDQIRQRSVKMGAS--TLGEGSEFGFFIGFGPGLTLEVL 394

PpCHS3 QGSRDILSEYGNMSSSSVLFVLDQIRQRSVKMGAS--TLGEGSDFGFFIGFGPGLTLEVL 405

PpCHS5 QGSRDILSEYGNMSSSSVLFVLDQIRQRSVKMGAS--TLGEGSDFGFFIGFGPGLTLEVL 405

PpCHS4 QGSRDVLSEFGNMSSSSVLFVLDQIRHRSLEMRSS--TLGEGSEFGFFIGFGPGLTLEVL 394

PpCHS6 QGSRDVLSEFGNMSSSSVLFVLDQIRQRSMKMGAS--TTGEGNDFGFFIGFGPGLTLEVL 397

PpCHS9 QPSWDILRDFGNISSPTCLFVLDEMRKYSKRTGAA--TTGEGCEWGFLVGLGPGFNVELT 404

PpCHS10 QTARQVLRDYGNISSSTCIYVLDYMRHQSLKLKEANDNVNTEPEWGLLLAFGPGVTIEGA 412

PpCHS11 ENSRSVLENKGNMSSASVFFVLDEFRKK------GRVAGRDWTVALGFGPGISIEGV 450

: .:* : ** :*. ::*** .*: :: :.:***..:*

PpCHS VLRAAPNSA 397

PpCHS1a/1c VLRAAPNSA 397

PpCHS01 VLRAAPNSA 397

PpCHS7 VLRAAPNSA 397

PpCHS2 VLRAAATV- 402

PpCHS3 VLRAAANV- 413

PpCHS5 VLRAAAANV 414

PpCHS4 VLRAAVNV- 402

PpCHS6 VLRSMPIV- 405

PpCHS9 LLLSVPF-- 411

PpCHS10 LLRNLC--- 418

PpCHS11 LLRNIYH-- 457

:*

Fig. S2 Putative translation initiation sites of PpCHS11.

(1) Nucleotide sequence of the 5′-end region. The four ATG codons are highlighted in color, and the surrounding sequences are also highlighted in gray.

ATGTCCACACGTAGCGAGCTAGTTCGTTCATGACCTTTGGGAACTCTCCTGGGAGTGCCA 60

CTTAAATTGTGGTTGTAAAAAAGTGATATGTTTTCAGAAGCTGAGGAGGAATCGAAGCTT 120

TTGGATTAACAATGAGCTTATATTAGTCGCAAAAGGATGACCCTCTTGAGCTATAAAGAC 180

GAGCTAGCCCCTGGCGGAGGGGTCAGATGCAACTAGACTGCAGAGCCTGTGAGACCAGTA 240

CAAGCATTGAAGTCTACATGAATCCTCCGCTTCGCAGTCACATGTATATCTGCTAGTCTT 300

CATGCCATGTACACTAAGCTTCCAGTCAACTTAAATGCTGCCCATTGAAACCTTCTTTCT 360

TGTCTGCTCAAGCTCAGACTCGGTGTCTTCATTGTGACCAACCTTCGGCGAAGACAACCG 420

ACGGTCTTGAGCAGTTCCTGTTTCTTCTTCCTCTTTTTTATCTACTTTTGTTTTTCTCCA 480

ACCATTCAGTTTCTTGTATCATGGCACCAGCATCCGACTCCGCGGTGGAAGAGCCGAGTC 540

TCGCCAATACAGGTGCGGTGATGAAGTCCTTGAGTGATCTGGTGGTGCAGAACGGCAATG 600

GAGTGCATGTCCGCTGCCGAGACGATGGACTCAAGGACACCAAGGCTGGAATAGAAATCA 660

AAATCATGTCAGACTTGGGCACTGAAAGCAACGGCGTTGCGGCGCATACTAACACCAATG 720

ACATAAGGTGCGAGGGTTATGTGCCGTATGCAGTAAAACTGGTGGAGCAAAGGCCGCCTG 780

GTATACTGGGCATGGGGACTGCCAATCCTCCGCACA

(2) Deduced amino acid sequence of the 5′-end region. The four Met residues are highlighted in colors matching the ATG codons above.

VHT-RASSFMTFGNSPGSAT-IVVVKK-YVFRS-GGIEAFGLTMSLY-SQKDDPLEL-RR

ASPWRRGQMQLDCRACETSTSIEVYMNPPLRSHMYIC-SSCHVH-ASSQLKCCPLKPSFL

SAQAQTRCLHCDQPSAKTTDGLEQFLFLLPLFYLLLFFSNHSVSCIMAPASDSAVEEPSL

ANTGAVMKSLSDLVVQNGNGVHVRCRDDGLKDTKAGIEIKIMSDLGTESNGVAAHTNTND

IRCEGYVPYAVKLVEQRPPGILGMGTANPPH
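The relationship between panels (1) and (2) can be illustrated with a minimal sketch. It assumes the standard genetic code, that stop codons are written as "-" as in the deduced sequence above, and that the reading frame begins at the third nucleotide of panel (1). The helper names (`translate`, `in_frame`) are hypothetical and are not taken from the original analysis. Only the first 60 nt of panel (1) are used.

```python
# Sketch only: standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# First 60 nt of the 5'-end region shown in panel (1).
FIRST_LINE = "ATGTCCACACGTAGCGAGCTAGTTCGTTCATGACCTTTGGGAACTCTCCTGGGAGTGCCA"

# Assumption: the frame of panel (2) starts at nucleotide 3 (0-based offset 2).
FRAME_OFFSET = 2


def translate(dna, offset=0, stop_symbol="-"):
    """Translate dna from a 0-based offset; stop codons become stop_symbol."""
    protein = []
    for i in range(offset, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        protein.append(stop_symbol if residue == "*" else residue)
    return "".join(protein)


def in_frame(position, offset):
    """True if a 1-based nucleotide position is the first base of a codon."""
    return (position - 1 - offset) % 3 == 0


peptide = translate(FIRST_LINE, FRAME_OFFSET)
atg_positions = (501, 561, 666, 792)  # site positions listed in panel (3)
all_in_frame = all(in_frame(p, FRAME_OFFSET) for p in atg_positions)
print(peptide, all_in_frame)
```

Under these assumptions the translated prefix agrees with the start of the deduced sequence in panel (2), and all four ATG positions fall in the same reading frame.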

(3) Comparison of the putative translation initiation sites with the plant initiation consensus sequence.

Consensus sequence: 5′- caA(A/C)aA+1TGGCg

Site 1 (position at 501): 5′- GTA T CA+1TGGCA

Site 2 (position at 561): 5′- CGG T GA+1TGAAG

Site 3 (position at 666): 5′- AAA T CA+1TGTCA

Site 4 (position at 792): 5′- TGG G CA+1TGGGG
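One simple way to make the comparison in panel (3) concrete is to count, for each site, how many of the eight positions flanking the ATG agree with the consensus. This is a hedged sketch, not the authors' scoring method: the unweighted count, the choice to ignore upper/lower case in the consensus, and the names `CONSENSUS`, `SITES` and `context_matches` are all assumptions made for illustration.

```python
# Allowed bases at positions -5..-1, the ATG itself, and +4..+6 of the
# consensus caA(A/C)aATGGCg. Case in the consensus is ignored here
# (an assumption of this sketch).
CONSENSUS = ["C", "A", "A", "AC", "A", "A", "T", "G", "G", "C", "G"]
ATG_COLUMNS = {5, 6, 7}  # columns holding the ATG; excluded from the count

# Site contexts copied from panel (3), written without spaces or the +1 mark.
SITES = {
    "Site 1": "GTATCATGGCA",
    "Site 2": "CGGTGATGAAG",
    "Site 3": "AAATCATGTCA",
    "Site 4": "TGGGCATGGGG",
}


def context_matches(site):
    """Count flanking positions (ATG excluded) that agree with the consensus."""
    return sum(
        1
        for column, (base, allowed) in enumerate(zip(site, CONSENSUS))
        if column not in ATG_COLUMNS and base in allowed
    )


scores = {name: context_matches(seq) for name, seq in SITES.items()}
for name, score in scores.items():
    print(f"{name}: {score}/8 flanking positions match")
```

With this unweighted count, Sites 1 and 3 each match 3 of the 8 flanking positions and Sites 2 and 4 each match 2. A scheme that weights individual positions differently could rank the sites differently.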

Fig. S3 Sequence analysis of PpCHS8.

(1) Nucleotide sequence of PpCHS8. The proposed exons are in black, putative intron (625 nt) in green and untranslated regions in gray. The start and stop codons are highlighted and the ATG codon that may lead to a truncated protein is underlined.

ACAACTAATGTCAGCTCCTTTCAAAGGGTAATAATGTTAAGAGGCAGACAGTTTGCTACAGGCTCATGTAGCCACTGGTGATTCACTCTCAATTGCTGGGCAATATGTTCTGAGAAGATGTATGCACAGTTCAAGGCCACATCCTTTTCATGTAAACTTGTGCATACTGAAACTCAAATCCCCTTTCAATTTATGCAAGGATGAACGTACAAAGCATGATTGATTTATGAGCAGTATTGCAGCTCATGCAGGCAACATTGCAATCAATCTCACCCTACCCTTAAAATTCAGCCAAACGTACAGAAATTCTTGGTGTAGCAAATGAATTCCTTGTCCACTCTTAACTATTCCTACTTCCACCAATCCAGTTTTGAAAATTTATTTGAAGAATTTATTTCCAAATAAAAAGTTGTCTGATGTCGAAGCGAATCGCGCCGTTGGAACTTGAAAGAAATGAAACATTCCACTTCTGTAGGAACTAGTCAGTACACATCAGTTGGCGAAACGAGCTCTGAGCTGATTTCAAATATGAGAGCCAGAGGAGCTGTATTACTCGAAATGAAAGAGAGCTTCTTATATCCGGACTCACAGCATGAGAACCTTGGGAGGAGGAAGGCTCATTGATCACTTCGGTGGGTGCCAACTGAGACCCATGGTCTTCGAGAAAATGAATAGACTTGACCGGCGGTATGTTTTACTGTTCCACAGTGGCTGCTTTGGAGGCTCGACAAGTCAACTTTGACTGGTCTTGCATCAAGTCCATTCATCTG

ATGGAGCCCCGCTCTTTGTCACTGGGGAAGATCTCACTGAGAATCTACGAACTTTTTGCAGAAGGGCCGGTCTGCAGAAGGGCCGGCGGTGATTCTTAGCATCGGAACTGCGGTGCCTCCCTATGTGCACGAGATGGGGTCGTACGCCGATTACTACTTTGACGAGACCAACTGCAACCACAAGCCAGAGCTGAAGGCGAAGTTCAAGCGCATCTGCGATAAGATGCACATTAGCAAACGTCACATGGTAGTGCGGAAGGAGCTCCTGGCGCAATACCCCTCTCTGGGAACATACCTCAACAACAGCCTGGAGGACCGCCACAAGGTGTGCATGGAATGGGTACCGAAGTTGGCAGTTGAGGCAGCTGAGAACGCAATCAAGGAATGGGGCGGGTCACTCTCACAGATTACCCATATTGTCATGGCGACTACAAGCGTCGTTAACATGCCCGGAGTAGATCTTCTCGTTGCCAAAGCGCTCGGCCTCTCACCCAAGCTGCGCCGCGTGATGATGTATCAAACAGGATGCTGGGGAGGAGCCGCGATCATTCGGGTGGCAAAAGACATCGCCGAGAATAACAAGGGGGCACGTGTTCTTGTTGTCGCAAGCGAGTGTACTGCAACCTTCTTCCGCGCCCCTAGTGAGGAATATCTCGATGGACTGGTGGGCCAGGCATTGTTCGGCGACGGCGCTGGAGCACTTGTCATCGGCGCTGATCCCAACCCCGACACTGAAAGGACTCTATACGAGATTCAGTGGAGCGGGGAGA

TGGTGGTACCTGATAGCGAGGGCGCCATCGATGGCCACATGATGGAGGCCGGCATGTACTACCACCTCAAGCCAGACATCCCGAAACTAGTCTCCCGCAGCATCGAGGAGTTCGTTTCCGACGCCACCGCTCAAGCAGGAAATGCCGACGTCAACGACTTGTTCTGGGCTGTACACCCCGGAGGCGTCGCCATTCTCAATCAGATCGAGAACCAGCTGATGCTCTCTCCTGAGAAGCTGCTCGCCAGTCGGGAGATTCTGGCCGACTATGGCAACATGGCCAGTGCGTGCGTGCTTTTCGTCTTGGATCAAGTCCGAAACTGCTCGATCAAAGCCAAAGCATCCACTACCGGGGAAGGTAGGGACTTCGGATCGCTCATCGGCATCGGCCCAGGGCTCACGATGGAATGCTGTGTGCTGAAGTCTGTTCCACTTGATAATTAGCCTTGACGTATTGCAGGAATTTTTTGGACTTTTTTTTAAAACTTAGAAAGTGATTCAGTCTCGGGGATAACTGGTGAAAGCTTCCTATAATTTGTCATCTTAAACGGCTTCAGTATAGTAAGTTTGACAGCTGCGAGGACTGGGCAGGCCATTGCCGAACTCGAGTCACCGTTGTGATGACGAACTGCCAATGCAGCTTG

(2) Deduced amino acid sequence of PpCHS8. The 26 N-terminal residues that show sequence similarity to other P. patens CHS enzymes are highlighted, as are the CHN triad and the GIGPG loop. The proposed exon-intron split site is marked by an arrow.

MNKGRSAEGPAVILSIGTAVPPYVHEMGSYADYYFDETNCNHKPELKAKFKRICDKMHISKRHMVVRKELLAQYPSLGTYLNNSLEDRHKVCMEWVPKLAVEAAENAIKEWGGSLSQITHIVMATTSVVNMPGVDLLVAKALGLSPKLRRVMMYQTGCWGGAAIIRVAKDIAENNKGARVLVVASECTATFFRAPSEEYLDGLVGQALFGDGAGALVIGADPNPDTERTLYEIQWSGEMVVPDSEGAIDGHMMEAGMYYHLKPDIPKLVSRSIEEFVSDATAQAGNADVNDLFWAVHPGGVAILNQIENQLMLSPEKLLASREILADYGNMASACVLFVLDQVRNCSIKAKASTTGEGRDFGSLIGIGPGLTMECCVLKSVPLDN-

Fig. S4 Sequence analysis of PpCHSpg1.

(1) Genome sequence of PpCHSpg1. The exon sequences are in black, putative intron (258 nt) in green and untranslated regions in gray. The proposed exon-intron split site is highlighted.

77988 TCGAGACAACCCCGAGTGAGCAGCTCGGACTCAACATCCATTCCGAACCCAAACGTGCCCATGGCACCAA

77918 GCGGAGAAGTCGACGTTCAGGGTGCTGCCACAAGGAGCGCGCTTCCCAGAGGCCAGCCTCGCGCTGAGGG

77848 ACCAGCATGTGTGTTGGGCGTCGGCACTGCGGTGCCTCCCGCGGAGTTCCTGCAGAGCGAGTACCCCGAC

77778 TTCTTCTTCAACATCACCAACTGCGGAAGGCCAAATTCAAGCGCATCTGTAAGTCCACCTCCACCGTGTG

77708 ACTCTCTTCGCGGCTCGTGATGTGCTCACTGTTCCCTGTGGGCCGGCCCTCACCCCCAGCCGCGCTCGAC

77638 GACGTGCTCATTGCACGGGCTGCGCGGCGCGAGCGCATTAGAATTGCGACTTGTGATTTGTTGCGGGCGT

77568 CGAGATGGGAATGCGGCGGTCGCGATGTTCCTGAAATTGGGTCGATTCCGGCCCCTTACGACGGCAGCTG

77498 ACGCTTGGCCATGAATGTCGATGCAGGTGACAAGTCGGGGATCCGCAAGCGCCACATGTTCCTCACGGAG

77428 GAGGTGCTCAAGGCTAACCCCGGCATCTGCACGTACATGGAGCCCTCCCTGAACGTCCGCCACGACATCG

77358 TCGTCGTCCAGGTCCCCAAGCTCGCCGCGGAGGCAGCCCAGAGGGCCATCAAGGAGTGGGGCGGCCGCAA

77288 GTCTGACATCACCCACATCGTGTTCGCCACCACCAGCGGCGTGAACATGCCCGGAGCCGACCACGCCCTG

77218 GCCAAGCTGCTGGGCCTGAAGCCCACGGTGAAGCGGGTCATGATGTACCAGACCGGGTGCTTTGGCGGTG

77148 CTTCCGTGCTCAGGGTGGCCAAGGATCTGGCGGAGAACAACAAGGGCGCCAGGGTGTTGGCGGTGGCCAG

77078 CGAGGTCACGGCCGTCACATACCGCGCACCCAGCGAGAACCACTTGGACGGCTTGGTGGGCTCGGCCCTG

77008 TTCGGCGATGGCGCCGGAGTGTACGTGGTGGGATCCGATCCCAAGCCGGAGGTGGAGAAAGCACTGTTCG

76938 AGGTGCACTGGGCGGGCGAGACGATCTTGCCAGAGAGTGATGGAGCCATTGATGGGCATCTGACGGAGGC

76868 GGGGCTCATCTTCCACCTCATGAAGGACGTGCCAGGGCTGATCTCCAAGAACATCGAGAAGTTCTTGAAC

76798 GAGGCCAGGAAGTGCGTCGGTTCGCCCGATTGGAACGAGATGTTCTGGGCGGTGCACCCGGGAGGCCCGG

76728 CCATTCTGGACCAGGTGGAGGCGAAGCTGAAGCTGACCAAGGACAAGATGCAGGGGAGCAGGGACATACT

76658 GTCGGAGTACGGCAACATGTCGTCGGCGTCGGTGTTGTTCGTGCTGGATCAGATTCGCCAGAGGTCGGTC

76588 AAGATGGGGGCGTCGACGCTGGGAGAGGGCAGCGAGTTTGGCTTCTTCATTGGATTCGGTCCGGGGCTCA

76518 CCCTGGAAGTGCTGGTCCTCCGGGCCGCGGCCACCGTTTGAGTTGTAGCAACGGGCACGGGCAAATCAAT

76448 ATCGTCTCCCTGGTTTTCCTTTTGCGCGCTGGAGTGGAGATCGTATGAGTGAGCGAATTATGATGATTCA

76378 TGGGTCAGAGAGCATCCAGGTTGTGGCACCAGAGGGTTTTACATTCAGGTCTGCAGCAAATGGACTTTCG

76308 ATGAAATTTGGTCAGTTGGATCAGCGCGCTA 76278

(2) Partial sequence alignment of PpCHS2a and PpCHSpg1 transcripts.

Upstream and downstream from the region shown, the sequences are identical. The proposed exon-intron split sites are highlighted.

PpCHS2a ATGGCACCAAGCGGAGAAGTCGACGTTCAGGGTGCTGCCACAAGGAGCGCGCTTCCCAGA

PpCHSpg1 ATGGCACCAAGCGGAGAAGTCGACGTTCAGGGTGCTGCCACAAGGAGCGCGCTTCCCAGA

************************************************************

PpCHS2a GGCCAGCCTCGCGCTGAGGGACCAGCATGTGTGTTGGGCGTCGGCACTGCGGTGCCTCCC

PpCHSpg1 GGCCAGCCTCGCGCTGAGGGACCAGCATGTGTGTTGGGCGTCGGCACTGCGGTGCCTCCC

************************************************************

PpCHS2a GCGGAGTTCCTGCAGAGCGAGTACCCCGACTTCTTCTTCAACATCACCAACTGCGGCGAG

PpCHSpg1 GCGGAGTTCCTGCAGAGCGAGTACCCCGACTTCTTCTTCAACATCACCAACTGCGG----

********************************************************

PpCHS2a AAGGACGCCCTGAAGGCCAAATTCAAGCGCATCTGTGACAAGTCGGGGATCCGCAAGCGC

PpCHSpg1 ------AAGGCCAAATTCAAGCGCATCTGTGACAAGTCGGGGATCCGCAAGCGC

************************************************

PpCHS2a CACATGTTCCTCACGGAGGAGGTGCTCAAGGCCAACCCCGGCATCTGCACGTACATGGAG

PpCHSpg1 CACATGTTCCTCACGGAGGAGGTGCTCAAGGCTAACCCCGGCATCTGCACGTACATGGAG

******************************** ***************************

Fig. S5 Alignment of deduced amino acid sequences of PpCHS and PpCHSpg2. Two nonsense mutations in PpCHSpg2 are highlighted.

PpCHS MASAGDVTRAALPRAQPRAEGPACVLGIGTAVPPAEFLQSEYPDFFFNITNCGEKEALKA

PpCHSpg2 MASAGNITRAALPRGQPRAEGPTFVLGISTAVPPAKFLQSKYPDFFFNITNCGEKEVLKA

***** ******* ******* **** ****** **** *************** ***

PpCHS KFKRICDKSGIRKRHMFLTEEVLKANPGICTYMEPSLNVRHDIVVVQVPKLAAEAAQKAI

PpCHSpg2 KFRCICDKLGIPKRHMFLMEGMPKANPGICTYMEPTLNVRHDIVVVQVPKLAAEAVQKAI

** **** ** ****** * ************ ******************* ****

PpCHS KEWGGRKSDITHIVFATTSGVNMPGADHALAKLLGLKPTVKRVMMYQTGCFGGASVLRVA

PpCHSpg2 KEYGGRKSDITHIVFATTSGVNMTGADHALAKLLGLKPTVKLVLMYQTGCPGSASVLRVA

** ******************** ***************** * ****** * *******

PpCHS KDLAENNKGARVLAVASEVTAVTYRAPSENHLDGLVGSALFGDGAGVYVVGSDPKPEVEK

PpCHSpg2 KDLAENNKSSRVLAVASEVTAVTYRAPSENHLDGLVGSALFGDDADVHVVGSDPKPEVEK

******** ********************************* * * ************

PpCHS PLFEVHWAGETILPESDGAIDGHLTEAGLIFHLMKDVPGLISKNIEKFLNEARKPVGSPA

PpCHSpg2 PLFEVHWAGETILPESGGAIDGHLTEAGLIFHLMKDEPVLIFKNIERF-NEARKPVGSPA

**************** ******************* * ** **** * ***********

PpCHS WNEMFWAVHPGGPAILDQVEAKLKLTKDKMQGSRDILSEFGNMSSASVLFVLDQIRHRSV

PpCHSpg2 WNEMF-AVHLGGSAILDQVEAKLQLTKDKMQGNRDILFEFGNTSSALMLFVLDQIRRRSV

***** *** ** ********** ******** **** **** *** ******** ***

PpCHS KMGASTLGEGSEFGFFIGFGPGLTLEVLVLRAAPNSA-

PpCHSpg2 EMRVSTMGEGSKFGFLIGFGPGVVLDVLVLRVAANSA-

* ** **** *** ****** * ***** * ***
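Because highlighting does not survive in plain text, the internal stop positions in PpCHSpg2 can be located from the alignment rows themselves. The sketch below assumes that "-" in the PpCHSpg2 row marks a stop codon rather than an alignment gap (the same symbol ends both sequences), and that each preceding block holds 60 PpCHS residues. The function name `internal_stops` is hypothetical. Only the two blocks of Fig. S5 that contain an internal "-" are included.

```python
# (first PpCHS residue number in the block, PpCHS row, PpCHSpg2 row),
# rows copied from the fifth and sixth blocks of the Fig. S5 alignment.
BLOCKS = [
    (
        241,
        "PLFEVHWAGETILPESDGAIDGHLTEAGLIFHLMKDVPGLISKNIEKFLNEARKPVGSPA",
        "PLFEVHWAGETILPESGGAIDGHLTEAGLIFHLMKDEPVLIFKNIERF-NEARKPVGSPA",
    ),
    (
        301,
        "WNEMFWAVHPGGPAILDQVEAKLKLTKDKMQGSRDILSEFGNMSSASVLFVLDQIRHRSV",
        "WNEMF-AVHLGGSAILDQVEAKLQLTKDKMQGNRDILFEFGNTSSALMLFVLDQIRRRSV",
    ),
]


def internal_stops(blocks, stop_symbol="-"):
    """Return PpCHS residues (e.g. 'L289') aligned to a stop symbol in PpCHSpg2."""
    hits = []
    for start, ref_row, query_row in blocks:
        ref_number = start
        for ref_res, query_res in zip(ref_row, query_row):
            if query_res == stop_symbol and ref_res != stop_symbol:
                hits.append(f"{ref_res}{ref_number}")
            if ref_res != stop_symbol:
                ref_number += 1  # advance only on real PpCHS residues
    return hits


print(internal_stops(BLOCKS))
```

Under these assumptions the two internal stop symbols in PpCHSpg2 align with Leu289 and Trp306 of PpCHS, in the PpCHS numbering used in Fig. S1.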