Supplementary Material

Characterization of cleavage intermediate and star sites of RM.Tth111II

Zhenyu Zhu*, Shengxi Guan, Derek Robinson, Hanna El Fezzazi, Aine Quimby, and Shuang-yong Xu*

New England Biolabs, Inc., 240 County Road, Ipswich, MA 01938, USA

Supplementary Figure 1.

Color-coded functional domains, secondary structure prediction, and PROMALS3D alignment (computer server). The Tth111II and TthHB27I amino acid sequences were used in the alignment. The exact boundaries of the functional domains remain to be determined by structural analysis and experimentation.

Blue=endonuclease catalytic domain (rough boundary: aa 1-146)

Green=alpha helical domain (rough boundary: aa 147-374)

Orange=N6-adenine methyltransferase group (rough boundary: aa 375-726; MTase motifs X (FYTP), I (GSGT), and VIII (VFEGAS) are the underlined aa blocks)

Dark blue=specificity domain (rough boundary: aa 727-1106)
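The rough boundaries listed above can be expressed as a simple residue-to-domain lookup. The Python sketch below is illustrative only: the names `DOMAINS` and `domain_of` are hypothetical, and it assumes the coordinates follow Tth111II numbering (the Tth111II row of the alignment ends at residue 1106). The boundaries are approximate and may shift once they are determined experimentally.

```python
# Rough domain boundaries copied from the figure legend.
# Assumption: coordinates are in Tth111II residue numbering (1-based, inclusive).
DOMAINS = [
    ("endonuclease catalytic domain", 1, 146),
    ("alpha helical domain", 147, 374),
    ("N6-adenine methyltransferase", 375, 726),
    ("specificity domain", 727, 1106),
]


def domain_of(residue: int) -> str:
    """Return the name of the predicted domain containing a residue number."""
    for name, start, end in DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the range 1-1106")


if __name__ == "__main__":
    # Residue 381 is the F of the FYTP motif in the Tth111II row numbered 336-405.
    print(domain_of(381))
```

Counting along the Tth111II rows of the alignment places the first residues of the FYTP, GSGT, and VFEGAS motifs at 381, 423, and 721, all inside the rough methyltransferase boundary.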

Secondary structure (ss) prediction: h, alpha helix; e, beta sheet. 9=high probability of the secondary structure prediction (PROMALS3D)

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 1 MLSLLTGGVFRRVKLMNWIDLYTHLKQEVPWFFNSVRLAASQAHNEAEFESRINNAIERLAQKLGVQLLF 70

Tth111II_CAARCA 1 ---------------MNWIDLYTHLKQEVPWFFNSVRLAASQAHNEAEFESRINNAIERLAQKLGVQLLF 55

Consensus_aa: ...... MNWIDLYTHLKQEVPWFFNSVRLAASQAHNEAEFESRINNAIERLAQKLGVQLLF

Consensus_ss: hhh hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh hhhhhhhhhhhhhhhh eeee

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 71 REQYTLATGRADAVYNRLVIEYEPPGSLRPNLKHSHTQHAVRQVMNYIEELSRAERHDRDRLLGVVFDGH 140

Tth111II_CAARCA 56 REQYTLATGRADAVYNRLVIEYEPPGSLRPNLKHSHTQHAVRQVMNYIEELSRAERHDRDRLLGVVFDGH 125

Consensus_aa: REQYTLATGRADAVYNRLVIEYEPPGSLRPNLKHSHTQHAVRQVMNYIEELSRAERHDRDRLLGVVFDGH

Consensus_ss: eeeeee hhhhhhhhhhhhhhhhh hhhhhhhh

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 141 YFIFVRYHEGHWIVEEPLEVNPASCERFLRSLFSLSSGRALIPENLVEDFGSQNDLSRQATRALYHALQG 210

Tth111II_CAARCA 126 YFIFVRYHEGHWIVEEPLEVNPASCERFLRSLFSLSSGRALIPENLVEDFGSQNDLSRQATRALYHALQG 195

Consensus_aa: YFIFVRYHEGHWIVEEPLEVNPASCERFLRSLFSLSSGRALIPENLVEDFGSQNDLSRQATRALYHALQG

Consensus_ss: eeeeeee eee hhhhhhhhhhhhh hhhhhhhhhhhh hhhhhhhhhhhhhhh

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 211 HTSDLTARLFVQWQIFFGETAGADAAGGELKHKSELLAFARGMGLRGSRIDMPRFLFALHTYFSFLVKNI 280

Tth111II_CAARCA 196 HTSDLTARLFVQWQIFFGETAGADAAGGELKHKSELLAFARGMGLRGSRIDMPRFLFALHTYFSFLVKNI 265

Consensus_aa: HTSDLTARLFVQWQIFFGETAGADAAGGELKHKSELLAFARGMGLRGSRIDMPRFLFALHTYFSFLVKNI

Consensus_ss: hhhhhhhhhhhhh hhhhhhhhhhhhhhh hhhhhhhhhhhhhhhhhhh

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 281 ARLVLQAYAGGGLGTTPLTTIANLEGEALRRELQNLESGGLFRTLGLKNLLEGDFFAWYLDAWNPEVEEA 350

Tth111II_CAARCA 266 ARLVLQAYAGGGLGTTPLTTIANLEGEALRRELQNLESGGLFRTLGLKNLLEGDFFAWYLDAWNPEVEEA 335

Consensus_aa: ARLVLQAYAGGGLGTTPLTTIANLEGEALRRELQNLESGGLFRTLGLKNLLEGDFFAWYLDAWNPEVEEA

Consensus_ss: hhhhhh hhh hhhhhhhhhhhh hhhh hhhhhhhhhhh hhhhhh

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 351 LRQVLARLAEYNPATVQDDPHSARDLLKKLYHYLLPRDIRHDLGEFYTPDWLAERLLNQLGEPWFIMPPG 420

Tth111II_CAARCA 336 LRQVLARLAEYNPATVQDDPHSARDLLKKLYHYLLPRDIRHDLGEFYTPDWLAERLLNQLGEPWFIMPPG 405

Consensus_aa: LRQVLARLAEYNPATVQDDPHSARDLLKKLYHYLLPRDIRHDLGEFYTPDWLAERLLNQLGEPWFIMPPG

Consensus_ss: hhhhhhhhhhhhh hhhhhhhhhhhhhhhh hhhhhhhhhhhhhhhhh

Conservation: 9999999999999999999999 99999999999999999999999999999999999999999999999

TthHB27I_CAARCA 421 NHPPRGLPDKRLLDPACGSGTFLVLAIRALKVNCFLAGFSEADTLEVILNSVVGIDLNPLAVTAARVNYL 490

Tth111II_CAARCA 406 NHPPRGLPDKRLLDPACGSGTFPVLAIRALKVNCFLAGFSEADTLEVILNSVVGIDLNPLAVTAARVNYL 475

Consensus_aa: NHPPRGLPDKRLLDPACGSGTF.VLAIRALKVNCFLAGFSEADTLEVILNSVVGIDLNPLAVTAARVNYL

Consensus_ss: eee hhhhhhhhhhhhhhhh hhhhhhhhh eeeee hhhhhhhhhhhh

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 491 LAIADLLPYRRREVEIPVYLADSILTPARGEGLFAQNRRILETAVGPLPVPEVINSRAKMERLTDLLEEY 560

Tth111II_CAARCA 476 LAIADLLPYRRREVEIPVYLADSILTPARGEGLFAQNRRILETAVGPLPVPEVINSRAKMERLTDLLEEY 545

Consensus_aa: LAIADLLPYRRREVEIPVYLADSILTPARGEGLFAQNRRILETAVGPLPVPEVINSRAKMERLTDLLEEY

Consensus_ss: hhh hhhhhhhhhhhhhh

Conservation: 999999999999999999999999999999 9 9999999999999999999999999999999999999

TthHB27I_CAARCA 561 VRGDFSTEAFLARAKKEIPDLADALHADEVLTELYERLRDLHRQGLDGIWARVLKNAFMPLFLEPFDYVV 630

Tth111II_CAARCA 546 VRGDFSTEAFLARAKKEIPDLADALHADEVITGLYERLRDLHRQGLDGIWARVLKNAFMPLFLEPFDYVV 615

Consensus_aa: VRGDFSTEAFLARAKKEIPDLADALHADEVlT.LYERLRDLHRQGLDGIWARVLKNAFMPLFLEPFDYVV

Consensus_ss: h hhhhhhhhhhhhhhhh hhhhhhhhhhhhhhhhhhhhhhhhhhh eeeee

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 631 GNPPWINWESLPQAYREQTAELWTCYGLFVHSGMDTILGKGKKDASTLMTYAVADRFLKEGGKLGFLITQ 700

Tth111II_CAARCA 616 GNPPWINWESLPQAYREQTAELWTCYGLFVHSGMDTILGKGKKDASTLMTYAVADRFLKEGGKLGFLITQ 685

Consensus_aa: GNPPWINWESLPQAYREQTAELWTCYGLFVHSGMDTILGKGKKDASTLMTYAVADRFLKEGGKLGFLITQ

Consensus_ss: hhhhhhhhhhhhhhhhh hhhhhhhhhhhhhhh eeeeee

Conservation: 999999999999999999999999999999999999999999999999999999 999999999999999

TthHB27I_CAARCA 701 SVWKTGAGQGFRRFRIGENGPHLRVLHVDDLSSLQVFEGASTRTSAFVLQKGRPTRYPVPYTYWKKTTKG 770

Tth111II_CAARCA 686 SVWKTGAGQGFRRFRIGENGPHLRVLHVDDLSSLQVFEGASTRTSAFVLQKGRPPRYPVPYTYWKKTTKG 755

Consensus_aa: SVWKTGAGQGFRRFRIGENGPHLRVLHVDDLSSLQVFEGASTRTSAFVLQKGRPsRYPVPYTYWKKTTKG

Consensus_ss: hhh hhhhhhhhhh eeeeeee eeeeeeeee eeeee

Conservation: 99999999999999999999999999999999999999999999 9999999999999999999999999

TthHB27I_CAARCA 771 EGLDYDSTLGEVMEQTKRLRFHAVPVDPDDLTSPWLTARRRALYAVRKVLGTSEYRAYEGANSGGANGIY 840

Tth111II_CAARCA 756 EGLDYDSTLGEVMEQTKRLRFHAVPVDPDDLTSPWLTARRRALYSVRKVLGTSEYRAYEGANSGGANGIY 825

Consensus_aa: EGLDYDSTLGEVMEQTKRLRFHAVPVDPDDLTSPWLTARRRALYtVRKVLGTSEYRAYEGANSGGANGIY

Consensus_ss: hhhhhhh hhhhhhh hhhhhh ee ee

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 841 WLEILAERPDGLVVVRNVTEGAKREVEGITTELEPDLLYPLLRGRDVRRWYAQPSLHILMVQDPKTRRGI 910

Tth111II_CAARCA 826 WLEILAERPDGLVVVRNVTEGAKREVEGITTELEPDLLYPLLRGRDVRRWYAQPSLHILMVQDPKTRRGI 895

Consensus_aa: WLEILAERPDGLVVVRNVTEGAKREVEGITTELEPDLLYPLLRGRDVRRWYAQPSLHILMVQDPKTRRGI

Consensus_ss: ee hhhhhhhhhh hhhhhhhh eee

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 911 DEQVLQKRYPKTWAYLKRFEAVLRERSGFRRYFTRKDRNGRMVETGPFYSMFNVGDYTFAPWKVVWRYVA 980

Tth111II_CAARCA 896 DEQVLQKRYPKTWAYLKRFEAVLRERSGFRRYFTRKDRNGRMVETGPFYSMFNVGDYTFAPWKVVWRYVA 965

Consensus_aa: DEQVLQKRYPKTWAYLKRFEAVLRERSGFRRYFTRKDRNGRMVETGPFYSMFNVGDYTFAPWKVVWRYVA

Consensus_ss: hhhhhhhhhhhhhhhhhhhhhhhh eeeee

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 981 SDFIVAVVGPASDEKPVVPNEKLMLVPVEDDNEAFYLCGVLNSSPIRFAVQSFFVQTQIAPHVLQKLCIP 1050

Tth111II_CAARCA 966 SDFIVAVVGPASDEKPVVPNEKLMLVPVEDDNEAFYLCGVLNSSPIRFAVQSFFVQTQIAPHVLQKLCIP 1035

Consensus_aa: SDFIVAVVGPASDEKPVVPNEKLMLVPVEDDNEAFYLCGVLNSSPIRFAVQSFFVQTQIAPHVLQKLCIP

Consensus_ss: eeeee eeee eeeee hhhhhhhhhhh hhhhhhhhhh ee hh

Conservation: 9999999999999999999999999999999999999999999999999999999999999999999999

TthHB27I_CAARCA 1051 RYEPNTDHQNRIAHLSRRAHELAPAAYNGDKAARAELRRVEEEIDRAAAQLWGLTEEELAEIRRSLEELR 1120

Tth111II_CAARCA 1036 RYEPNTDHQNRIAHLSRRAHELAPAAYNGDKAARAELRRVEEEIDRAAAQLWGLTEEELAEIRRSLEELR 1105

Consensus_aa: RYEPNTDHQNRIAHLSRRAHELAPAAYNGDKAARAELRRVEEEIDRAAAQLWGLTEEELAEIRRSLEELR

Consensus_ss: hhhhhhhhhhhhhhhhhhhh hhhhhhhhhhhhhhhhhhhhhh hhhhhhhhhhhhhh

Conservation: 9

TthHB27I_CAARCA 1121 G 1121

Tth111II_CAARCA 1106 G 1106

Consensus_aa: G

Consensus_ss:
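The positions at which the two sequences differ can be listed programmatically from any pair of aligned rows. The Python sketch below is a minimal illustration and was not part of the original analysis: the function name `differences` is hypothetical, and the two rows are copied from the alignment block numbered 421-490 (TthHB27I) and 406-475 (Tth111II).

```python
def differences(row_a, start_a, row_b, start_b, gap="-"):
    """List aligned columns where two rows differ.

    Returns tuples (pos_a, res_a, pos_b, res_b); a position is None
    when that row has a gap in the column.
    """
    pos_a, pos_b = start_a - 1, start_b - 1
    found = []
    for a, b in zip(row_a, row_b):
        # Advance each residue counter only on non-gap characters.
        if a != gap:
            pos_a += 1
        if b != gap:
            pos_b += 1
        if a != b:
            found.append((pos_a if a != gap else None, a,
                          pos_b if b != gap else None, b))
    return found


# Rows copied from the alignment block above (TthHB27I 421-490, Tth111II 406-475).
hb27 = "NHPPRGLPDKRLLDPACGSGTFLVLAIRALKVNCFLAGFSEADTLEVILNSVVGIDLNPLAVTAARVNYL"
t111 = "NHPPRGLPDKRLLDPACGSGTFPVLAIRALKVNCFLAGFSEADTLEVILNSVVGIDLNPLAVTAARVNYL"

diffs = differences(hb27, 421, t111, 406)

if __name__ == "__main__":
    for pos_a, res_a, pos_b, res_b in diffs:
        print(f"TthHB27I {res_a}{pos_a} / Tth111II {res_b}{pos_b}")
```

For this block the sketch reports a single substitution, L443 in TthHB27I versus P428 in Tth111II, which corresponds to the one unconserved column marked in the Conservation line of that block.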