Article title: The prediction of lipid binding regions in cytoplasmic and extracellular loops of membrane proteins as exemplified by protein translocation membrane proteins.

Journal: Journal of Membrane Biology

Author: R. C. A. Keller

As indicated in the main text, this supplement contains the following tables and figures:

Table S1: Details of the lipid binding region search of the SecYEG complex

Table S2: Comparison of TM helix positions

Table S3: Details of the lipid binding region search of the SecDFyajC complex, YidC and members of the Tat pathway

Table S4: Secondary structure prediction comparison

Figure S1: 3D structures of M. jannaschii SecG and E. coli SecG

Figure S2: Sequence alignment of various SecD proteins

Table S1: Details of the lipid binding region search of the SecYEG complex. Data from the lipid binding region search of the protein translocation membrane proteins of the SecYEG complex. The lipid binding region (LBR) is predicted by the Heliquest lipid binding discrimination factor, and the possible TM helices were determined by the Heliquest-generated Eisenberg plot approach. Surface-seeking helices identified by the Eisenberg plot methodology are marked (S). Each sequence window is flanked by the positions of its first and last residues. H, mean hydrophobicity; μH, mean hydrophobic moment; z, net charge.

Sequence / H / μH / z / TM / LBR
E.coli SecY:
9FQSAKGGLGELKRRLLFV26 / 0.389 / 0.336 / 3 / - / Y
24LFVIGALIVFRIGSFIPI41 / 1.122 / 0.168 / 1 / Y / -
85YISASIIIQLLTVVHPTL102 / 0.949 / 0.186 / 0 / Y / -
101TLAEIKKEGESGRRKISQ118 / -0.074 / 0.383 / 2 / - / Ya
122YGTLVLAIFQSIGIATGL139 / 0.835 / 0.209 / 0 / Y / -
161VVSLVTGTMFLMWLGEQI178 / 0.927 / 0.055 / -1 / Y / -
187ISIIIFAGIVAGLPPAIA204 / 1.008 / 0.190 / 0 / Y / -
217FLVLLLVAVLVFAVTFFV234 / 1.326 / 0.100 / 0 / Y / -
273GVIPAIFASSIILFPATI290 / 1.003 / 0.252 / 0 / Y / -
300WNWLTTISLYLQPGQPLY317 / 0.883 / 0.216 / 0 / MLb / -
315PLYVLLYASAIIFFCFFY332 / 1.267 / 0.058 / 0 / Y / -
375LVGALYITFICLIPEFMR392 / 1.037 / 0.350 / 0 / Y / -
397VPFYFGGTSLLIVVVVIM414 / 1.101 / 0.046 / 0 / Y / -
M.jannaschii SecY:
17PVKEITFKEKLKWTGIVL34 / 0.527 / 0.112 / 2 / - / Ya
27LKWTGIVLVLYFIMGCID44 / 1.067 / 0.187 / 0 / Y / -
54AIFEFWQTITASRIGTLI71 / 0.790 / 0.301 / 0 / MLb / -
71ITLGIGPIVTAGIIMQLL88 / 0.993 / 0.071 / 0 / Y / -
100IPENRALFQGCQKLLSII117 / 0.619 / 0.396 / 1 / - / Y
113LLSIIMCFVEAVLFVGAG130 / 1.036 / 0.239 / 1 / Y / -
138LLAFLVIIQIAFGSIILI155 / 1.264 / 0.200 / 0 / Y / -
170IGLFIAAGVSQTIFVGAL187 / 0.875 / 0.089 / 0 / Y / -
209YIAPIIGTIIVFLMVVYA226 / 1.161 / 0.302 / 0 / Y / -
257IPVILAAALFANIQLWGL274 / 1.033 / 0.147 / 0 / Y / -
313PIHAIVYMIAMIITCVMF330 / 1.175 / 0.243 / 0 / Y / -
357KGFRKSEKAIEHRLKRYI374 / 0.010 / 0.543 / 5 / - / Y
374IPPLTVMSSAFVGFLATI391 / 0.931 / 0.348 / 0 / Y / -
394FIGALGGGTGVLLTVSIV411 / 0.830 / 0.193 / 0 / Y / -
419LREKVSELHPAIAKLLNK436 / 0.299 / 0.554 / 2 / - / Y
E.coli SecE:
19WVVVVALLLVAIVGNYLY36 / 1.117 / 0.162 / 0 / Y / -
45ALAVVILIAAAGGVALLT62 / 0.899 / 0.085 / 0 / Y / -
71FAREARTEVRKVIWPTRQ88 / 0.201 / 0.276 / 3 / - / Y
94TLIVAAVTAVMSLILWGL111 / 1.053 / 0.187 / 0 / Y / -
M.jannaschii SecE:
11QLKEFIEECRRVWLVLKK28 / 0.433 / 0.614 / 2 / - / Y (S)
45ISLLGIIGYIIHVPATYI62 / 1.040 / 0.335 / 0 / Y / -
E.coli SecG:
5LLVVFLIVAIGLVGLIML22 / 1.323 / 0.111 / 0 / Y / -
46SGSGNFMTRMTALLATLF63 / 0.603 / 0.407 / 1 / - / Y (S)
56TALLATLFFIISLVLGNI73 / 1.067 / 0.332 / 0 / Y / -
M.jannaschii SecG:
10ATSAGLIRYMDETFSKIR27 / 0.329 / 0.425 / 1 / - / Y
32HVIGVTVAFVIIEAILTY49 / 0.953 / 0.093 / -1 / Y / -

a Less than 50% helical according to SOPMA. These regions are included in this overview for the sake of comparison, because of the high degree of sequence homology between the two SecY proteins.

b According to TOPDB these regions correspond to membrane loops (MLs) instead of regular TM helices.
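
The z and LBR columns can be cross-checked with a short script. The sketch below rests on two assumptions that are not stated in this supplement: that the net charge z counts Lys and Arg as +1 and Asp and Glu as -1 (His neutral), and that the Heliquest discrimination factor is D = 0.944 μH + 0.33 z, with D > 0.68 flagging a candidate lipid binding region. The function names are illustrative only.

```python
# Assumed Heliquest conventions (not stated in this supplement):
#   z = (#K + #R) - (#D + #E), with His counted as neutral
#   D = 0.944 * muH + 0.33 * z; candidate lipid binding region if D > 0.68
LBR_THRESHOLD = 0.68


def net_charge(window: str) -> int:
    """Net charge of a sequence window under the assumed convention."""
    positives = sum(window.count(aa) for aa in "KR")
    negatives = sum(window.count(aa) for aa in "DE")
    return positives - negatives


def discrimination_factor(mu_h: float, z: int) -> float:
    """Assumed Heliquest lipid binding discrimination factor."""
    return 0.944 * mu_h + 0.33 * z


def is_candidate_lbr(mu_h: float, z: int) -> bool:
    """True if the discrimination factor exceeds the assumed threshold."""
    return discrimination_factor(mu_h, z) > LBR_THRESHOLD


# Two E. coli SecY windows from Table S1: (sequence, muH)
rows = [
    ("FQSAKGGLGELKRRLLFV", 0.336),  # residues 9-26
    ("LFVIGALIVFRIGSFIPI", 0.168),  # residues 24-41
]
for seq, mu_h in rows:
    z = net_charge(seq)
    d = discrimination_factor(mu_h, z)
    print(seq, z, round(d, 3), is_candidate_lbr(mu_h, z))
```

Under these assumptions the first window gives z = 3 and D of about 1.31 (above the threshold), while the second gives z = 1 and D of about 0.49 (below it), in line with the LBR column for these two rows.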

Table S2: Comparison of TM helix positions. Comparison of the TM helix positions obtained by the Heliquest-generated Eisenberg plot approach (Keller, 2011b) with those predicted by the topology prediction program TOPCONS (Bernsel et al. 2009).

Protein / Heliquest generated Eisenberg (18 AA window) / TOPCONS (20 AA window)
E.coli SecY / 24-41 / 22-42
85-102 / 75-95
122-139 / 119-139
161-178 / 154-174
187-204 / 183-203
217-234 / 218-238
273-290 / 274-294
315-332 / 316-336
375-392 / 369-389
397-414 / 397-417
M.jannaschii SecY / 27-44 / 27-47
71-88 / 69-89
113-130 / 113-133
138-155 / 135-155
170-187 / 165-185
209-226 / 206-226
257-274 / 251-271
313-330 / 316-336
374-391 / 372-392
394-411 / 395-415
E.coli SecG / 5-22 / 3-23
56-73 / 54-74
M.jannaschii SecE / 32-49 / 32-52
Differences / none / none
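
The agreement between the two sets of positions can be quantified as the number of residues shared by each pair of ranges. The following is a minimal sketch of that calculation; the helper name `overlap` and the choice of overlap length as the measure are assumptions made for illustration, not the comparison method used in the article.

```python
def overlap(a, b):
    """Number of residues shared by two inclusive ranges given as (start, end)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# (Heliquest/Eisenberg range, TOPCONS range) for three E. coli SecY helices
pairs = [
    ((24, 41), (22, 42)),
    ((85, 102), (75, 95)),
    ((161, 178), (154, 174)),
]
for heliquest, topcons in pairs:
    shared = overlap(heliquest, topcons)
    window = heliquest[1] - heliquest[0] + 1  # 18 residues for Heliquest windows
    print(heliquest, topcons, shared, f"{shared / window:.0%}")
```

For the three pairs shown, the shared stretches are 18, 11 and 14 residues of the 18-residue Heliquest window.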

Table S3: Details of the lipid binding region search of the SecDFyajC complex, YidC and members of the Tat pathway. Data from the lipid binding region search of the protein translocation membrane proteins of the SecDFyajC complex, YidC and members of the Tat pathway.

Sequence / H / μH / z / TM / LBR
E.coli SecD
5YPLWKYVMLIVVIVIGLL22 / 1.234 / 0.276 / 1 / Y / -
225YAVQQNINILRNRVNQLG242 / 0.313 / 0.347 / 2 / - / Y
454LEACLAGLLVSILFMIIF471 / 1.189 / 0.192 / -1 / Y / -
477LIATSALIANLILIVGIM494 / 1.044 / 0.145 / 0 / Y / -
501TLSMPGIAGIVLTLAVAV518 / 0.873 / 0.131 / 0 / Y / -
553IFDANITTLIKVIILYAV570 / 0.909 / 0.245 / 0 / Y / -
578FAITTGIGVATSMFTAIV595 / 0.810 / 0.247 / 0 / Y / -
596GTRAIVNLLYGGKRVKKL613 / 0.293 / 0.282 / 5 / - / Y
T.thermophilus SecD
6LTSLFLLGVFLLALLFVW23 / 1.344 / 0.155 / 0 / Y / -
278ALIGTLAIFLLIFAYYGP295 / 1.089 / 0.171 / 0 / Y / -
296HLGLVASLGLLYTSALIL313 / 0.934 / 0.084 / 0 / Y / -
311LILGLLSGLGATLTLPGI328 / 0.945 / 0.222 / 0 / Y / -
354RAGKKLRQAIPEGFRHST371 / 0.062 / 0.430 / 4 / - / Y
372LTIMDVNIAHLLAAAALY389 / 0.799 / 0.139 / -1 / Y / -
400AVILAIGVVASVFSNLVF417 / 0.941 / 0.230 / 0 / Y / -
E.coli SecF
25WAFGISGLLLIAAIVIMG42 / 1.093 / 0.108 / 0 / Y / -
146GAMALMAALLSILVYVGF163 / 0.969 / 0.167 / 0 / Y / -
167WRLAAGVVIALAHDVIIT184 / 0.809 / 0.117 / 0 / Y / -
191FHIEIDLTIVASLMSVIG208 / 0.848 / 0.223 / -2 / Y / -
217VSDRIRENFRKIRRGTPY234 / 0.026 / 0.468 / 4 / - / Y
248TLITSGTTLMVILMLYLF265 / 1.085 / 0.144 / 0 / Y / -
273FSLTMLIGVSIGTASSIY290 / 0.815 / 0.166 / 0 / Ya / -
T.thermophilus SecF
16YVTAATLLLAALAAGVVF33 / 0.866 / 0.114 / 0 / Y / -
136AVMAVLVGLGLILLYVAF153 / 1.116 / 0.083 / 0 / Y / -
171VAIVAGMYSLLGLEFSIP188 / 0.847 / 0.223 / -1 / Y / -
185FSIPTIAALLTIVGYSIN202 / 0.875 / 0.208 / 0 / Y / -
207VSDRIRENQKLLRHLPYA224 / 0.219 / 0.407 / 2 / - / Y
227VNRSINQTLSRTVMTSLT244 / 0.353 / 0.235 / 2 / - / Y
243LTTLLPILALLFLGGSVL260 / 1.107 / 0.167 / 0 / Y / -
266AIFVGIFVGTYSSIYVVS283 / 0.902 / 0.222 / 0 / Ya / -
278SIYVVSALVVAWKNRRKA295 / 0.436 / 0.196 / 4 / - / Y
E.coli yajC
22SLILMLVVFGLIFYFMIL39 / 1.394 / 0.033 / 0 / Y / -
42QQKRTKEHKKLMDSIAKG59 / 0.134 / 0.414 / 4 / - / Y
E.coli YidC
6NLLVIALLFVSFMIWQAW23 / 1.217 / 0.117 / 0 / Y / -
338QPLFKLLKWIHSFVGNWG355 / 0.789 / 0.589 / 2 / - / Y (S)
355WGFSIIIITFIVRGIMYP372 / 1.109 / 0.213 / 1 / Yb / -
373TKAQYTSMAKMRMLQPKI390 / 0.308 / 0.181 / 4 / - / Y
431PIFLALYYMLMGSVELRQ448 / 0.839 / 0.114 / 0 / Y / -
464YYILPILMGVTMFFIQKM481 / 1.054 / 0.214 / 1 / Y / -
491QQKIMTFMPVIFTVFFLW508 / 1.079 / 0.303 / 1 / Y / -
515VLYYIVSNLVTIIQQQLI532 / 0.936 / 0.303 / 0 / Y / -
529QLIYRGLEKRGLHSREKK546 / 0.025 / 0.109 / 4 / - / Y
E.coli TatA
3GISIWQLLIIAVIVVLLF20 / 1.308 / 0.107 / 0 / Y / -
23KKLGSIGSDLGASIKGFK40 / 0.236 / 0.502 / 3 / - / Y
E.coli TatB
1MFDIGFSELLLVFIIGLV18 / 1.099 / 0.126 / -2 / Y / -
30KTVAGWIRALRSLATTVQ47 / 0.463 / 0.545 / 3 / - / Y (S)
62SLKKVEKASLTNLTPELK79 / 0.203 / 0.320 / 2 / - / Y
78LKASMDELRQAAESMKRS95 / 0.022 / 0.468 / 1 / - / Y
E.coli TatC
7QPLITHLIELRKRLLNCI24 / 0.671 / 0.515 / 2 / - / Y (S)
20LLNCIIAVIVIFLCLVYF37 / 1.388 / 0.218 / 0 / Y / -
75TFMVSLILSAPVILYQVW92 / 1.088 / 0.261 / 0 / Y / -
110LLVSSSLLFYIGMAFAYF127 / 1.047 / 0.126 / 0 / Y / -
147VSTDIASYLSFVMALFMA164 / 0.830 / 0.316 / 1 / Y / -
216TLLAIPMYCLFEIGVFFS233 / 1.091 / 0.302 / 0 / Y / -

a Less than 50% helical according to SOPMA; however, both TOPCONS and TOPDB predict these regions as regular TM helices.

b Not helical according to SOPMA, but predicted as a regular TM helix by TOPCONS.

Table S4: Secondary structure prediction comparison. Comparison of secondary structure predictions of the N-terminal part of SecG in M. jannaschii and E. coli as obtained by SOPMA (Combet et al. 2000) and I-Tasser (Zhang 2008).

M. jannaschii SecG:
SOPMA:
        10        20        30        40        50
         |         |         |         |         |
MSKREETGLATSAGLIRYMDETFSKIRVKPEHVIGVTVAFVIIEAILTYGRFL
hcccccccccchhhhhhhhhhhhhheeccccceeeeeeeeeehhhhhecccee
I-Tasser:
                  20                  40
                   |                   |
MSKREETGLATSAGLIRYMDETFSKIRVKPEHVIGVTVAFVIIEAILTYGRFL
CCCCCCCCCCCCHHHHHHHHCCHHHCSCCCCCSSSSSSHHHHHHHHHHHCCCC
E.coli SecG:
SOPMA:
        10        20        30        40        50        60        70
         |         |         |         |         |         |         |
MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLATLFFIISLVLGNINSNKTNK
hhhhhhhhhhhhhhhhheeeeecttccccceeeecccccceeecccccchhhhhhhhhhhhhhhhhhhhhhhecccccc
I-Tasser:
                  20                  40                  60                  80
                   |                   |                   |                   |
MYEALLVVFLIVAIGLVGLIMLQQGKGADMGASFGAGASATLFGSSGSGNFMTRMTALLATLFFIISLVLGNINSNKTNK
CHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCC
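
Footnote a of Tables S1 and S3 refers to windows that are less than 50% helical according to SOPMA. Given a per-residue prediction string such as those above, the helical fraction of a window could be computed along the following lines. This is a sketch that assumes the fraction is simply the share of positions assigned a helix state (h or H); the function name and the toy string are hypothetical.

```python
def helical_fraction(prediction: str, start: int, end: int) -> float:
    """Fraction of helix states ('h' or 'H') in residues start..end (1-based, inclusive)."""
    window = prediction[start - 1:end]
    if not window:
        raise ValueError("empty window")
    return sum(ch in "hH" for ch in window) / len(window)


# Hypothetical 12-residue prediction string, not taken from Table S4
toy = "cccchhhhhhcc"
print(helical_fraction(toy, 3, 10))  # 6 helix states in 8 residues -> 0.75
```

A window would then be flagged as in footnote a when the returned value is below 0.5.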

Figure S1: 3D structures of M. jannaschii SecG and E. coli SecG. Indicative 3D structures of M. jannaschii SecG and E. coli SecG as obtained with I-Tasser (Zhang 2008). The TM helices are depicted in red and the predicted lipid binding region found by the Heliquest approach is depicted in blue.

A. M. jannaschii SecG

B. E. coli SecG

Figure S2: Sequence alignment of various SecD proteins

SECD E.COL MLNRYPLWKYVMLIVVIVIGLLYALPNLFGEDPAVQITGARGVAASEQTL 50

SECD H.INF MLNRYPLWKNLMVIFIVAIGILYSLPNIYGEDPAVQISGTRGQEANTSVL 50

SECD T.THE -MNRKNLTS-LFLLGVFLLALLFVWKPWAPEEPKVRLG------ 36

SECD M.JAN -MDISKLLKDRKILILIIFVTLSVFLIVFKG------ 30

:: * . :: :. : *

SECD E.COL IQVQKTLQEEKITAKSVALEEGAILARFDSTDTQLRAREALMGVMGDKYV 100

SECD H.INF GQVQDVLKTNNLPTKSIVLENGSILARFTNTDDQLLAKDKIAERLGNNYT 100

SECD T.THE ------

SECD M.JAN ------

SECD E.COL VALNLAPATPRWLAAIHAEPMKLGLDLRGGVHFLMEVDMDTALGKLQEQN 150

SECD H.INF TALNLAPATPAWLSMFGANPMKWGLDLRGGVRFLMEVDMNATLVKRQEQL 150

SECD T.THE ------LDLKGGLRIVLEADVEN------ 53

SECD M.JAN ------LDFGIDLSGGTIIVLKAEKPMS------ 52

:** ** ::::.:

SECD E.COL IDSLRSDLREKGIPYTTVRKENNYGLSITFRDAKARDEAIAYLSKRHPDL 200

SECD H.INF QDSLRGELRKEKIQYTAIKNTEHFGTLITLANVSQRAKAERIIRQLHPTL 200

SECD T.THE ------PTL 56

SECD M.JAN ------

SECD E.COL VISSQGSNQLRAVMSDARLSEAREYAVQQNINILRNRVNQLGVAEPVVQR 250

SECD H.INF DITEPDADSINLGLSTAALNEARDLAIEQNLTILRKRVAELGVAEAVIQR 250

SECD T.THE DD------LEKARTVLENRINALGVAEPLIQI 82

SECD M.JAN ------DKEIEATIKIITERLNYNGLNDVVIYP 79

:: .:: :*: *: : ::

SECD E.COL QGADRIVVELPGI--QDTARAKEILGATATLEFRLVNTNVDQAAAASGRV 298

SECD H.INF QGAERIVIELPGV--QDTARAKEILGATATLEFRIVNQNVTADAISRNML 298

SECD T.THE QGQKRIVVELPGLSQADQDRALKLIGQRAVLEFRIVKEGATGTTVAQINQ 132

SECD M.JAN RGNDEIIVEIPKS--CDTDRIIKILKQQGVFVAKIDNITAYTGSDVQNVE 127

:* ..*::*:* * * ::: ..: :: : . :

SECD E.COL PGDSEVKQTREGQPVVLYKR-----VILTGDHITDSTSSQDEYN-QPQVN 342

SECD H.INF PADSEVKYDRQGHPVALFKR-----AVLGGEHIINSSSGLDQHSSTPQVS 343

SECD T.THE ALRENPRLNREELEKDLIKPEDLGPPLLTGADLADARAVFDQFG-RPQVS 181

SECD M.JAN LPTKIPQGETWAYGVPFELT------LEGAKKFAEVAKGKAYH-----K 165

. : : * * . : . . .

SECD E.COL ISLDSAGGNIMSNFTKDNIGKPMATLFVEYKDSGKKDANGRAVLVKQEEV 392

SECD H.INF VTLDSEGGEIMSQTTKKYYKKPMATLYVEYKDNGKKDENGKTILEKHEEV 393

SECD T.THE LTFTPEGAKKFEEVTRQNIGKRLAIVLDGR------VYTAPV 217

SECD M.JAN VELYMDGKLISAPVLSPDLADGKP------HPQQV 194

: : * . . *

SECD E.COL INIANIQSRLGNSFRITGINNPNEARQLSLLLRAGALIAPIQIVEERTIG 442

SECD H.INF INVATIQGRFGSNFQITGVDSIAEAHNLSTLLKSGALIAPIQIVEERTIG 443

SECD T.THE IRQAITGG----QAVIEGLSSVEEASEIALVLRSGSLPVPLKVAEIRAIG 263

SECD M.JAN ITVGAYPP------TKEEIDEAMAIYSALKSGALPVKLDIEYISTIS 235

* . .. ** : *::*:* . :.: :*.

SECD E.COL PTLGMQNIEQGLEACLAGLLVSILFMIIFY-KKFGLIATSALIANLILIV 491

SECD H.INF PSLGAQNVEQGINASLWGLVAVIAFMLFYY-KMFGVIASFALVINIVLLV 492

SECD T.THE PTLGQDAIQAGIRSALIGTLAIFLLIFAYYGPHLGLVASLGLLYTSALIL 313

SECD M.JAN PEFGKEFLKGTAIALLLAFIAVGIIVSIRYKQPKIAIPILITCISEVIII 285

* :* : :: : * . :. :: * :. . :::

SECD E.COL GIMSLLPGATLSMPGIAGIVLTLAVAVDANVLINERIKEELSNGRTVQQA 541

SECD H.INF GLMSILPGATLSMPGIAGIVLTLGMSVDANVLIFERIKEEIRNGRSIQQA 542

SECD T.THE GLLSGL-GATLTLPGIAGLVLTLGAAVDGNVLSFERIKEELRAGKKLRQA 362

SECD M.JAN LGFASLIDWKLDLPSIAGIIAAVGTGVDNQIVITD---EALKRG---AGK 329

:: * . .* :*.***:: ::. .** ::: : * : *

SECD E.COL IDEGYRGAFSSIFDANITTLIKVIILYAVGTGAIKGFAITTGIGVATSMF 591

SECD H.INF INEGYNGAFTSIFDANLTTILTAIILYAVGTGPIQGFAITLSLGVAISMF 592

SECD T.THE IPEGFRHSTLTIMDVNIAHLLAAAALYQYATGPVRGFAVILAIGVVASVF 412

SECD M.JAN IRASIKRAFFIIFASAATSIAAMLPLFVLGVGMLKGFAITTIAGVLIGIF 379

* . . : *: : : *: ..* ::***: ** .:*

SECD E.COL TAIVGTRAIVNLLYGGKRVKKLSI 615

SECD H.INF TAITGTRALVNALYGGKQLKKLLI 616

SECD T.THE SNLVFSRHLLERLADRGEIRPP-- 434

SECD M.JAN ITRPAFARIIEEMFKKF------ 396

::: :