Supplementary Material

Table S1. Sequences of 58 ACE inhibitory dipeptides with observed and calculated activities using GP modeling.

No. / Dipeptide / pIC50
Obsd / ISA-ECI / z-scale / DPPS
Training set (40 samples)
1 / VW / 5.80 / 5.49(0.53a) / 5.36(0.51) / 5.25(0.62)
2 / IY / 5.43 / 4.97(0.50) / 4.98(0.50) / 4.86(0.63)
3 / RW / 4.80 / 4.86(0.53) / 4.67(0.51) / 4.70(0.64)
4 / VY / 4.66 / 4.78(0.48) / 4.48(0.50) / 4.50(0.61)
5 / AY / 4.06 / 4.10(0.47) / 4.04(0.50) / 3.89(0.62)
6 / IP / 3.89 / 3.84(0.49) / 3.91(0.50) / 3.65(0.61)
7 / RP / 3.74 / 3.30(0.45) / 3.74(0.51) / 3.54(0.59)
8 / AF / 3.72 / 3.32(0.49) / 3.64(0.50) / 3.81(0.63)
9 / GY / 3.68 / 3.46(0.49) / 3.56(0.50) / 3.52(0.62)
10 / VP / 3.38 / 3.77(0.47) / 3.55(0.50) / 3.29(0.62)
11 / GP / 3.35 / 2.98(0.46) / 3.31(0.50) / 2.97(0.58)
12 / GF / 3.20 / 3.14(0.50) / 3.31(0.51) / 3.21(0.58)
13 / IF / 3.03 / 3.39(0.54) / 3.44(0.50) / 3.67(0.63)
14 / IG / 2.92 / 2.49(0.45) / 2.74(0.49) / 2.83(0.62)
15 / GI / 2.92 / 3.04(0.47) / 2.85(0.49) / 2.99(0.57)
16 / GM / 2.85 / 3.09(0.46) / 2.87(0.49) / 2.98(0.59)
17 / GA / 2.70 / 2.57(0.46) / 2.62(0.48) / 2.73(0.57)
18 / YG / 2.70 / 2.45(0.45) / 2.59(0.50) / 2.70(0.60)
19 / GL / 2.60 / 3.05(0.46) / 2.77(0.48) / 2.90(0.56)
20 / AG / 2.60 / 2.24(0.44) / 2.33(0.46) / 2.45(0.60)
21 / KG / 2.49 / 2.38(0.44) / 2.48(0.48) / 2.56(0.60)
22 / FG / 2.43 / 2.53(0.50) / 2.43(0.50) / 2.53(0.63)
23 / GK / 2.27 / 2.86(0.47) / 2.32(0.50) / 2.44(0.60)
24 / GT / 2.24 / 2.23(0.52) / 2.33(0.48) / 2.35(0.62)
25 / HG / 2.20 / 2.33(0.44) / 2.15(0.48) / 2.28(0.62)
26 / GQ / 2.15 / 2.20(0.57) / 2.29(0.49) / 2.22(0.59)
27 / GG / 2.14 / 2.08(0.45) / 2.06(0.51) / 2.09(0.60)
28 / QG / 2.13 / 2.08(0.45) / 2.13(0.46) / 2.19(0.62)
29 / SG / 2.07 / 2.08(0.45) / 2.08(0.47) / 2.05(0.62)
30 / LG / 2.06 / 2.49(0.46) / 2.31(0.49) / 2.28(0.63)
31 / TG / 2.00 / 2.23(0.44) / 2.23(0.48) / 2.08(0.62)
32 / EG / 2.00 / 2.11(0.45) / 1.99(0.47) / 1.89(0.60)
33 / DG / 1.85 / 2.07(0.45) / 1.82(0.48) / 1.88(0.63)
34 / PG / 1.77 / 2.43(0.45) / 1.88(0.49) / 1.97(0.62)
35 / KA / 3.42 / 3.13(0.46) / 3.28(0.50) / 3.18(0.60)
36 / RA / 3.34 / 2.83(0.45) / 3.31(0.50) / 3.28(0.59)
37 / YA / 3.34 / 3.24(0.47) / 3.27(0.50) / 3.25(0.60)
38 / AA / 3.21 / 2.89(0.45) / 3.09(0.50) / 2.99(0.60)
39 / FR / 3.04 / 3.10(0.57) / 3.13(0.51) / 3.26(0.65)
40 / EA / 2.00 / 2.65(0.45) / 2.25(0.50) / 2.31(0.60)
Test set (18 samples)
41 / IW / 5.70 / 5.59(0.57) / 4.85(0.70) / 4.75(0.74)
42 / AW / 5.00 / 4.98(0.52) / 4.40(0.69) / 4.23(0.73)
43 / GW / 4.52 / 4.42(0.58) / 3.88(0.73) / 3.68(0.69)
44 / VF / 4.28 / 3.42(0.51) / 3.89(0.68) / 4.44(0.76)
45 / AP / 3.64 / 3.39(0.46) / 3.46(0.65) / 3.17(0.63)
46 / RF / 3.64 / 3.29(0.49) / 4.06(0.75) / 4.35(0.79)
47 / VG / 2.96 / 2.43(0.45) / 2.49(0.66) / 2.46(0.72)
48 / GH / 2.51 / 2.65(0.48) / 2.75(0.61) / 2.72(0.64)
49 / GR / 2.49 / 2.63(0.79) / 2.65(0.71) / 2.67(0.87)
50 / GS / 2.42 / 1.78(0.61) / 2.41(0.62) / 2.04(0.68)
51 / GV / 2.34 / 2.98(0.47) / 2.68(0.69) / 2.82(0.62)
52 / MG / 2.32 / 2.45(0.45) / 2.19(0.66) / 2.53(0.77)
53 / GE / 2.27 / 2.23(0.56) / 2.29(0.57) / 2.06(0.74)
54 / WG / 2.23 / 2.52(0.48) / 2.51(0.75) / 2.83(0.80)
55 / GD / 2.04 / 2.11(0.57) / 2.56(0.68) / 1.83(0.78)
56 / LA / 3.51 / 3.27(0.48) / 3.30(0.70) / 2.87(0.68)
57 / HL / 2.49 / 3.46(0.48) / 3.12(0.76) / 3.96(0.78)
58 / DA / 2.42 / 2.56(0.46) / 2.47(0.58) / 2.42(0.69)

a The values in parentheses are the standard deviations of the calculated values (including noise).
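As a minimal sketch of how a GP predictive standard deviation "including noise" can be obtained, the example below adds the noise variance to the latent predictive variance before taking the square root. The squared-exponential kernel, the hyperparameter values, and the toy data are assumptions made for illustration only; the covariance function and settings behind the tables are not stated in this supplement.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between the rows of a and b (assumed kernel)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return signal_var * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_predict(x_train, y_train, x_new, noise_var=0.1):
    """Return the predictive mean and the standard deviation with noise included."""
    k_train = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_cross = rbf_kernel(x_new, x_train)
    k_new_diag = np.diag(rbf_kernel(x_new, x_new))
    chol = np.linalg.cholesky(k_train)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_cross @ alpha
    v = np.linalg.solve(chol, k_cross.T)
    latent_var = k_new_diag - (v ** 2).sum(axis=0)
    # Adding noise_var gives the spread expected for a new noisy observation.
    std_with_noise = np.sqrt(latent_var + noise_var)
    return mean, std_with_noise

# Toy data standing in for descriptor vectors and activities (hypothetical).
rng = np.random.default_rng(0)
x_train = rng.normal(size=(10, 3))
y_train = x_train.sum(axis=1)
x_new = rng.normal(size=(4, 3))
mean, std = gp_predict(x_train, y_train, x_new)
```

Under these assumptions the reported spread can never fall below the square root of the noise variance, which is consistent with the fairly uniform values seen across each column of the table.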

Table S2. Sequences of 31 bradykinin-potentiating pentapeptides with observed and calculated activities using GP modeling.

No. / Pentapeptide / pRAI
Obsd / ISA-ECI / z-scale / DPPS
Training set (25 samples)
1 / VESSK / 0.00 / -0.02(0.47a) / 0.10(0.36) / 0.19(0.28)
2 / VEASK / 0.20 / 0.25(0.48) / 0.20(0.36) / 0.26(0.28)
3 / VEAAK / 0.51 / 0.37(0.42) / 0.41(0.35) / 0.40(0.27)
4 / VKAAK / 0.11 / 0.05(0.41) / 0.22(0.40) / 0.17(0.28)
5 / VEWAK / 2.73 / 2.33(0.45) / 2.24(0.38) / 2.14(0.28)
6 / VEAAP / 0.18 / 0.31(0.44) / 0.30(0.40) / 0.22(0.28)
7 / VEHAK / 1.53 / 1.29(0.47) / 1.36(0.38) / 1.23(0.28)
8 / VAAAK / -0.10 / -0.04(0.45) / 0.06(0.40) / -0.04(0.28)
9 / LEAAK / 0.40 / 0.36(0.41) / 0.39(0.35) / 0.36(0.27)
10 / FEAAK / 0.30 / 0.35(0.45) / 0.33(0.39) / 0.33(0.28)
11 / VEGGK / -1.00 / -0.62(0.46) / -0.72(0.41) / -0.58(0.28)
12 / VEFAK / 1.57 / 1.56(0.47) / 1.62(0.37) / 1.42(0.28)
13 / VELAK / 0.59 / 0.70(0.47) / 0.73(0.37) / 0.88(0.28)
14 / AAYAA / 0.46 / 0.54(0.47) / 0.61(0.37) / 0.78(0.28)
15 / AAWAA / 0.75 / 1.01(0.46) / 1.0(0.37) / 1.08(0.28)
16 / VAWAA / 1.43 / 1.37(0.43) / 1.30(0.39) / 1.40(0.29)
17 / VAWAK / 1.45 / 1.47(0.45) / 1.42(0.40) / 1.42(0.28)
18 / VKWAA / 1.71 / 1.66(0.43) / 1.63(0.40) / 1.64(0.29)
19 / VWAAK / 0.04 / 0.12(0.46) / 0.11(0.41) / 0.02(0.30)
20 / VAAWK / 0.23 / 0.22(0.48) / 0.31(0.41) / 0.28(0.29)
21 / EKWAP / 1.30 / 1.40(0.47) / 1.42(0.40) / 1.46(0.29)
22 / RKWAP / 1.98 / 1.82(0.48) / 1.86(0.41) / 1.88(0.30)
23 / VEWVK / 1.71 / 1.69(0.48) / 1.59(0.40) / 1.78(0.28)
24 / FSPFR / 0.64 / 0.69(0.49) / 0.72(0.41) / 0.66(0.29)
25 / GGGGG / 0.00 / -0.11(0.49) / -0.08(0.42) / -0.35(0.30)
Test set (6 samples)
26 / VESAK / 0.28 / 0.49(0.71) / 0.34(0.37) / 0.34(0.33)
27 / GEAAK / -0.52 / 0.27(0.70) / 0.19(0.56) / 0.21(0.42)
28 / AAAAA / -0.10 / -0.06(0.61) / 0.37(0.53) / -0.05(0.33)
29 / VKWAP / 2.35 / 1.70(0.45) / 1.42(0.54) / 1.75(0.35)
30 / PGFSK / 0.90 / 0.69(0.79) / 1.01(0.67) / 1.00(0.40)
31 / RYLPT / 0.40 / 0.98(0.81) / 0.94(0.68) / 1.17(0.42)

a The values in parentheses are the standard deviations of the calculated values (including noise).
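For readers who want to summarize the test-set agreement in Table S2, the sketch below computes a root-mean-square error of prediction from the six observed pRAI values and the corresponding ISA-ECI calculated values. It assumes the usual definition (square root of the mean squared difference); the function name `rmse` is introduced here for illustration.

```python
import math

# Test-set rows 26-31 of Table S2: observed pRAI and ISA-ECI calculated values.
observed = [0.28, -0.52, -0.10, 2.35, 0.90, 0.40]
isa_eci = [0.49, 0.27, -0.06, 1.70, 0.69, 0.98]

def rmse(obs, calc):
    """Root-mean-square difference between observed and calculated values."""
    squared = [(o - c) ** 2 for o, c in zip(obs, calc)]
    return math.sqrt(sum(squared) / len(squared))

rmsp = rmse(observed, isa_eci)
print(f"RMSP over {len(observed)} test peptides: {rmsp:.3f}")
```

The same function can be applied to the z-scale or DPPS columns by swapping in the corresponding list.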

Table S3. Sequences of 101 cationic antimicrobial pentadecapeptides with observed and calculated activities using GP modeling.

No. / Pentadecapeptide / Log Potency
Obsd / ISA-ECI / z-scale / DPPS
Training set (70 samples)
1 / KWKLFLGILAVLKVL / -0.799 / -0.316(0.333a) / -0.349(0.256) / -0.457(0.208)
2 / KWKGELEIEAELKVL / -0.425 / -0.107(0.343) / -0.247(0.267) / -0.323(0.220)
3 / GWKLGLKILNVLKVL / -0.305 / 0.043(0.292) / 0.007(0.256) / -0.039(0.206)
4 / KWKLFKKNNNNNKHN / -0.303 / -0.141(0.328) / -0.173(0.270) / -0.193(0.214)
5 / KWHLFLLILAVLKVL / -0.289 / -0.192(0.333) / -0.084(0.258) / -0.074(0.206)
6 / KRGLFKKGGAVLKGL / -0.277 / -0.008(0.339) / -0.064(0.273) / -0.151(0.223)
7 / HWHLHKHRGARHKVL / -0.169 / 0.017(0.310) / -0.156(0.269) / -0.152(0.212)
8 / KWHLFLKILAVLKVL / -0.130 / 0.044(0.292) / -0.014(0.253) / -0.014(0.203)
9 / KWKLFKKHGNVRKVL / -0.113 / 0.093(0.303) / 0.065(0.258) / 0.050(0.207)
10 / KNKRNKKIGAVLKVL / -0.072 / 0.077(0.339) / 0.156(0.264) / 0.105(0.215)
11 / KHNLFKGIGAVLLVL / -0.035 / 0.088(0.343) / 0.082(0.267) / 0.041(0.214)
12 / KWKLFKKIGNRNKVL / -0.024 / 0.080(0.311) / 0.107(0.259) / 0.063(0.210)
13 / LWKLFLHILAVLKVL / -0.017 / 0.071(0.294) / 0.072(0.257) / 0.044(0.206)
14 / GWRLFRGIRAVLNVL / 0.031 / 0.117(0.343) / 0.253(0.245) / 0.225(0.208)
15 / KWGLFKNIGAVLHVN / 0.063 / 0.137(0.335) / 0.253(0.265) / 0.186(0.212)
16 / RWKLNNNIGARLKVL / 0.081 / 0.142(0.343) / 0.183(0.261) / 0.125(0.211)
17 / HWKLFKKIGHVNKRL / 0.127 / 0.163(0.306) / 0.267(0.263) / 0.241(0.211)
18 / KWKLFKKNGAVLKVL / 0.149 / 0.200(0.318) / 0.252(0.252) / 0.236(0.204)
19 / HWKRFLRIGHNLNVN / 0.175 / 0.189(0.343) / 0.070(0.273) / 0.114(0.218)
20 / GWKLFKGIRAVLNVL / 0.175 / 0.327(0.330) / 0.234(0.245) / 0.232(0.207)
21 / KWKLFKKIGGVGGVL / 0.202 / 0.238(0.321) / 0.289(0.267) / 0.231(0.217)
22 / GWKLFLKILAVLKVL / 0.204 / 0.044(0.292) / 0.159(0.253) / 0.159(0.205)
23 / GWKLFKNRGAVLKHL / 0.205 / 0.179(0.335) / 0.259(0.263) / 0.200(0.210)
24 / KWKLFNKRGAVLKVL / 0.205 / 0.207(0.343) / 0.290(0.256) / 0.269(0.208)
25 / KWKLFKKIGANLKVL / 0.316 / 0.469(0.304) / 0.377(0.253) / 0.363(0.203)
26 / KWKLFLHILAVLKVL / 0.351 / 0.071(0.294) / 0.168(0.252) / 0.202(0.201)
27 / KWKLFRKIGAVHRVL / 0.363 / 0.399(0.326) / 0.372(0.253) / 0.418(0.203)
28 / LWKLFKKHGAVLKVL / 0.384 / 0.476(0.286) / 0.379(0.257) / 0.399(0.205)
29 / KWHLNKRIHAVLKRL / 0.413 / 0.305(0.343) / 0.408(0.264) / 0.428(0.211)
30 / KWKLFRRIGAVLKHR / 0.432 / 0.314(0.343) / 0.496(0.260) / 0.525(0.210)
31 / KRKRFRKIGAVLKVL / 0.439 / 0.317(0.343) / 0.397(0.262) / 0.418(0.213)
32 / KWKLFKLRGRVRKVL / 0.459 / 0.400(0.335) / 0.317(0.266) / 0.331(0.215)
33 / KWKLFKKIGLGLGVL / 0.498 / 0.479(0.311) / 0.502(0.264) / 0.511(0.218)
34 / KWLLFKKIGAVLLNH / 0.535 / 0.555(0.296) / 0.487(0.270) / 0.514(0.215)
35 / LRKLFKKIRAVLLVR / 0.558 / 0.304(0.338) / 0.478(0.274) / 0.463(0.221)
36 / LWRLLKKILRVLKVL / 0.582 / 0.690(0.267) / 0.649(0.252) / 0.626(0.205)
37 / GWKLFKLIGAVLKVL / 0.587 / 0.546(0.300) / 0.567(0.256) / 0.591(0.206)
38 / KWKLGKKIGAVLGVL / 0.596 / 0.627(0.289) / 0.580(0.258) / 0.595(0.208)
39 / KWKLFHKILAVLKVL / 0.613 / 0.685(0.271) / 0.504(0.251) / 0.476(0.201)
40 / GWRLLKKILEVLKVL / 0.617 / 0.690(0.267) / 0.600(0.249) / 0.595(0.204)
41 / KWKLFHLIGAVLKVL / 0.620 / 0.547(0.301) / 0.527(0.254) / 0.534(0.203)
42 / KWKNFKKIGAVLKVL / 0.628 / 0.687(0.269) / 0.561(0.255) / 0.560(0.205)
43 / KWKLRKKIGAVLKVL / 0.630 / 0.690(0.267) / 0.531(0.256) / 0.548(0.209)
44 / GWKLGKKIGRVLKVL / 0.637 / 0.690(0.267) / 0.680(0.253) / 0.682(0.205)
45 / KWKLFKLIRAVLKVL / 0.656 / 0.546(0.300) / 0.575(0.256) / 0.620(0.208)
46 / GWKGFKKIGRVLKVL / 0.678 / 0.690(0.267) / 0.650(0.260) / 0.668(0.213)
47 / KWKLFKKIGAVLNRL / 0.687 / 0.622(0.291) / 0.556(0.256) / 0.587(0.207)
48 / KWGLFKKIGAVLKVL / 0.698 / 0.690(0.267) / 0.573(0.246) / 0.587(0.197)
49 / KWKLFKKVLKVLTTG / 0.724 / 0.629(0.279) / 0.602(0.265) / 0.616(0.215)
50 / GWKLFKKIGRVLKVL / 0.726 / 0.690(0.267) / 0.708(0.234) / 0.728(0.185)
51 / GWKLFKKIGRVLRVL / 0.738 / 0.694(0.275) / 0.708(0.234) / 0.723(0.186)
52 / LWKLFKKIGRVLKVL / 0.740 / 0.690(0.267) / 0.711(0.251) / 0.727(0.203)
53 / LWKLFRKIRRLLRVL / 0.741 / 0.451(0.326) / 0.757(0.238) / 0.785(0.206)
54 / GWKLGKKILRVLRVL / 0.745 / 0.694(0.275) / 0.708(0.227) / 0.708(0.178)
55 / KWKLGKKILNVLKVL / 0.746 / 0.690(0.267) / 0.604(0.254) / 0.630(0.204)
56 / GWRLGKKILRVLKVL / 0.746 / 0.690(0.267) / 0.723(0.226) / 0.736(0.177)
57 / LWKLFKKIRRVLRVL / 0.749 / 0.694(0.275) / 0.736(0.230) / 0.748(0.187)
58 / KWKLFKKIGAVLKVL / 0.757 / 0.690(0.267) / 0.686(0.230) / 0.721(0.180)
59 / GWKLGKKILRVLKVL / 0.758 / 0.690(0.267) / 0.708(0.226) / 0.714(0.176)
60 / NWKLFKKIGAVLKVL / 0.764 / 0.690(0.267) / 0.633(0.248) / 0.655(0.199)
61 / KWHLFKKIGAVLKVL / 0.764 / 0.690(0.267) / 0.640(0.233) / 0.698(0.178)
62 / GWKLFKKIGAVLKVL / 0.774 / 0.690(0.267) / 0.647(0.248) / 0.663(0.201)
63 / LWKLFKKINRVLKVL / 0.781 / 0.690(0.267) / 0.725(0.253) / 0.746(0.202)
64 / KWKLFHKIGAVLKVL / 0.783 / 0.685(0.271) / 0.599(0.248) / 0.599(0.199)
65 / LWKLFKKIRRVLKVL / 0.788 / 0.690(0.267) / 0.736(0.230) / 0.754(0.186)
66 / KWKLFKHIGAVLKVL / 0.790 / 0.679(0.279) / 0.609(0.248) / 0.648(0.198)
67 / GWKLGKHILNVLKVL / 0.791 / 0.678(0.279) / 0.623(0.255) / 0.641(0.204)
68 / KWKLGKKIGAVLKVL / 0.801 / 0.690(0.267) / 0.648(0.250) / 0.668(0.201)
69 / KWKLFKGIRAVLKVL / 0.810 / 0.522(0.320) / 0.510(0.255) / 0.517(0.206)
70 / GWRLIKKILRVFKGL / 0.824 / 0.738(0.276) / 0.716(0.258) / 0.750(0.211)
Test set (31 samples)
71 / KWNLNGNINAVLKVL / -0.680 / -0.106(0.381) / -0.215(0.336) / -0.230(0.307)
72 / KWHLRNKIGAVRNNL / -0.270 / 0.009(0.406) / -0.063(0.360) / -0.130(0.327)
73 / KHKLFKKIGAHRKRN / -0.257 / 0.202(0.413) / 0.008(0.363) / -0.018(0.320)
74 / GWELGEEILNVLKVL / -0.150 / 0.149(0.360) / 0.228(0.302) / 0.379(0.292)
75 / KNKLEKKIGAVLKVL / 0.012 / 0.079(0.340) / 0.315(0.307) / 0.286(0.290)
76 / KWKLGKGIGAVGKVL / 0.014 / 0.390(0.378) / 0.200(0.341) / 0.017(0.321)
77 / KWKLFNRIGHNRKVN / 0.021 / 0.204(0.414) / 0.226(0.333) / 0.212(0.302)
78 / KGKGGKKGGRGGKVL / 0.032 / 0.196(0.414) / 0.057(0.438) / -0.204(0.440)
79 / GWLLHRNIGNVLHRL / 0.142 / 0.197(0.413) / 0.419(0.355) / 0.455(0.310)
80 / LWHLFLKILAVLKVL / 0.180 / 0.044(0.292) / 0.098(0.303) / 0.090(0.264)
81 / RWKNFKNIRANLRVL / 0.241 / 0.157(0.378) / 0.329(0.334) / 0.221(0.291)
82 / KWKLFGKNGRNLLVL / 0.259 / 0.154(0.404) / 0.108(0.375) / 0.224(0.349)
83 / GWRLFKGIRAVLNVL / 0.262 / 0.327(0.330) / 0.248(0.245) / 0.255(0.210)
84 / KWKLFKKGAVLKVLT / 0.277 / 0.104(0.333) / 0.211(0.364) / 0.216(0.344)
85 / KWKLFKKRNAVLKVL / 0.293 / 0.318(0.303) / 0.342(0.307) / 0.334(0.269)
86 / KWKLFKRIGAVHKRL / 0.351 / 0.290(0.361) / 0.352(0.305) / 0.374(0.268)
87 / HWKLFKKIHAVRKHL / 0.351 / 0.276(0.299) / 0.325(0.331) / 0.245(0.281)
88 / KWKLFKKGIGAVLKV / 0.423 / 0.209(0.343) / 0.167(0.367) / 0.339(0.346)
89 / KWKLFKKLKVLTTGL / 0.511 / 0.389(0.305) / 0.413(0.331) / 0.383(0.312)
90 / LWRLLKHILRVLKVL / 0.513 / 0.678(0.279) / 0.655(0.298) / 0.635(0.260)
91 / KWKLFKKAVLKVLTT / 0.613 / 0.496(0.325) / 0.205(0.349) / 0.545(0.330)
92 / NWKLFHKIGAVLKVL / 0.622 / 0.685(0.271) / 0.520(0.288) / 0.554(0.243)
93 / KWKGFKKIGAVLKVL / 0.635 / 0.690(0.267) / 0.496(0.300) / 0.502(0.275)
94 / KWKLFKKIGAVLHNL / 0.656 / 0.706(0.268) / 0.471(0.298) / 0.537(0.254)
95 / LWKLFKKIRRLLKVL / 0.740 / 0.609(0.274) / 0.743(0.234) / 0.748(0.258)
96 / HWKLFKKIGAVLKVL / 0.757 / 0.520(0.267) / 0.614(0.264) / 0.665(0.239)
97 / KWKLGKKILRVLKVL / 0.767 / 0.600(0.267) / 0.612(0.295) / 0.627(0.263)
98 / GWKLGLKILRVLKVL / 0.769 / 0.444(0.292) / 0.660(0.300) / 0.568(0.266)
99 / GWKLGKKILNVLKVL / 0.781 / 0.690(0.267) / 0.563(0.290) / 0.648(0.254)
100 / KWRLFKNIGAVLKVL / 0.799 / 0.546(0.344) / 0.580(0.264) / 0.657(0.247)
101 / VWRLIKKILRVFKGL / 0.820 / 0.638(0.276) / 0.766(0.306) / 0.688(0.280)

a The values in parentheses are the standard deviations of the calculated values (including noise).

Table S4. Comparison of results obtained from this work and previous studies for the ACE inhibitory dipeptide panel.

No. / Descriptor / Method / r2 / q2 / RMSE
1 / z-scale [1] / PLS / 0.770 / 0.723 / -
2 / t-score [2] / PLS / 0.744 / - / 0.50
3 / ISA-ECI [3] / PLS / 0.700 / - / -
4 / MSW-score [4] / PLS / 0.708 / 0.637 / -
5 / HESH [5] / PLS / 0.877 / 0.838 / 0.361
6 / MEDV [6] / GA-MLR / 0.883 / 0.861 / 0.339
7 / MEDV-13 [7] / SMR-PCR / 0.895 / 0.783 / 0.32
8 / VHSE [8] / SMR-PLS / 0.770 / 0.745 / 0.48
9 / T-scale [9] / SMR-PLS / 0.845 / 0.786 / 0.39
10 / 3D-HoVAIF [10] / GA-PLS / 0.857 / 0.811 / 0.38
11 / z-scale (this work) / GP / 0.969 / 0.918 / 0.17

1. Hellberg S, Eriksson L, Jonsson J, et al. Int. J. Pept. Protein Res., 1991, 37, 414–424.

2. Cocchi M, Johansson E. Quant. Struct.-Act. Relat., 1993, 12, 1–8.

3. Collantes ER, Dunn WJ. J. Med. Chem., 1995, 38, 2705–2713.

4. Zaliani A, Gancia E. J. Chem. Inf. Comput. Sci., 1999, 39, 525–533.

5. Shu M, Mei H, Yang S, et al. QSAR Comb. Sci., 2008, DOI: 10.1002/qsar.200710169.

6. Liu S, Yin C, Wang L. J. Chem. Inf. Comput. Sci., 2002, 42, 749–756.

7. Liu S, Yin C, Cai S, et al. J. Chem. Inf. Comput. Sci., 2001, 41, 321–329.

8. Mei H, Liao Z, Zhou Y, et al. Biopolymers (Pept. Sci.), 2005, 80, 775–786.

9. Tian F, Zhou P, Li Z. J. Mol. Struct., 2007, 830, 106–115.

10. Tian F, Zhou P, Lv L, et al. J. Pept. Sci., 2007, 13, 549–566.

Table S5. Comparison of results obtained from this work and previous studies for the bradykinin-potentiating pentapeptides panel.

No. / Descriptor / Method / r2 / q2 / RMSE
1 / z-scale [1] / PLS / 0.970 / - / -
2 / ISA-ECI [2] / PLS / 0.920 / - / -
3 / Topological Indices [3] / SMR-MLR / 0.899 / 0.845 / 0.23
4 / SFED [4] / MLR / 0.750 / 0.617 / -
5 / SSIA-AM1 [5] / SMR-PLS / 0.874 / 0.786 / 0.27
6 / VHSE [6] / SMR-PLS / 0.934 / 0.861 / -
7 / ISA-ECI (this work) / GP / 0.967 / 0.846 / 0.15
  1. Hellberg S, Sjostrom M, Skagerberg B, et al. J. Med. Chem., 1987, 30, 1126–1135.
  2. Collantes ER, Dunn WJ. J. Med. Chem., 1995, 38, 2705–2713.
  3. Sung J, Wei F, Tropsha A. J. Chem. Inf. Comput. Sci., 1998, 38, 259–268.
  4. Kim J, Nam KY, Cho KH, et al. Bull. Korean Chem. Soc., 2003, 24, 1742–1750.
  5. Zhou P, Zhou Y, Wu S, et al. Chin. Sci. Bull., 2006, 51, 524–529.
  6. Mei H, Liao Z, Zhou Y, et al. Biopolymers (Pept. Sci.), 2005, 80, 775–786.

Table S6. Comparison of results obtained from this work and previous studies for the cationic antimicrobial pentadecapeptides panel.

No. / Descriptor / Method / r2 / q2 / RMSE
1 / z-scale [1] / PLS / 0.76 / 0.55 / -
2 / z-scale [1] / GA-PLS / 0.71 / 0.61 / -
3 / FASGAI [1] / SMR-PLS / 0.78 / 0.64 / -
4 / FASGAI [1] / SMR-MLR / 0.87 / 0.74 / -
5 / ‘Inductive’ descriptor [2] / ANN / 0.860 / - / -
6 / DPPS (this work) / GP / 0.918 / 0.823 / 0.11
  1. Liang G, Li Z. QSAR Comb. Sci., 2007, 26, 754–763.
  2. Cherkasov A, Jankovic B. Molecules, 2004, 9, 1034–1052.

Table S7. Modeling statistics of the ACE inhibitory dipeptide panel based on the original 124 amino acid descriptors (119 original DPPSs + 2 ISA-ECI + 3 z-scales) without PCA-processing.

Method / Num. of variables / Training set (40 samples) / Test set (18 samples)
r2 / q2a / RMSE / / RMSP / Tropsha’s statistics
/ / / k / k′
PLS / 248b / 0.813 / 0.678 / 0.379 / 0.647 / 0.612 / 0.652 / 0.617 / 0.601 / 0.948 / 0.987
ANNc / 248 / - / - / - / - / - / - / - / - / - / -
SVM / 248 / 0.927 / 0.794 / 0.259 / 0.754 / 0.509 / 0.771 / 0.722 / 0.715 / 0.967 / 0.987
GP / 248 / 0.964 / 0.816 / 0.170 / 0.775 / 0.498 / 0.786 / 0.745 / 0.720 / 0.972 / 1.013

aLeave-1/3-out cross-validation q2.

b 248 variables = 124 descriptors (119 original DPPSs + 2 ISA-ECI + 3 z-scales) × 2 AAs (per dipeptide).

c ANN could not be implemented because the number of variables is too large.
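The k and k′ columns can be read as slopes of regression lines forced through the origin. The sketch below assumes the common convention in which k is the slope for observed versus predicted values and k′ the slope for predicted versus observed values; the function name `origin_slopes` and the three-point toy data are hypothetical and are not taken from the tables.

```python
def origin_slopes(observed, predicted):
    """Slopes of the through-origin regression lines.

    k       : observed  ~ k  * predicted
    k_prime : predicted ~ k' * observed
    """
    cross = sum(o * p for o, p in zip(observed, predicted))
    k = cross / sum(p * p for p in predicted)
    k_prime = cross / sum(o * o for o in observed)
    return k, k_prime

# Toy values chosen only to make the arithmetic easy to follow.
observed = [1.0, 2.0, 3.0]
predicted = [1.1, 1.9, 3.2]
k, k_prime = origin_slopes(observed, predicted)
print(f"k = {k:.3f}, k' = {k_prime:.3f}")
```

Both slopes approach 1 as the predictions approach the observations, which is why values close to unity in these columns indicate good agreement on the test set.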

Table S8. Modeling statistics of the bradykinin-potentiating pentapeptide panel based on the original 124 amino acid descriptors (119 original DPPSs + 2 ISA-ECI + 3 z-scales) without PCA-processing.

Method / Num. of variables / Training set (25 samples) / Test set (6 samples)
r2 / q2a / RMSE / / RMSP / Tropsha’s statistics
/ / / k / k′
PLS / 620 b / 0.898 / 0.544 / 0.301 / 0.470 / 0.815 / 0.478 / 0.356 / 0.414 / 0.687 / 1.012
ANN c / 620 / - / - / - / - / - / - / - / - / - / -
SVM c / 620 / - / - / - / - / - / - / - / - / - / -
GP / 620 / 0.985 / 0.731 / 0.133 / 0.637 / 0.635 / 0.660 / 0.596 / 0.567 / 0.935 / 1.004

aLeave-1/3-out cross-validation q2.

b 620 variables = 124 descriptors (119 original DPPSs + 2 ISA-ECI + 3 z-scales) × 5 AAs (per pentapeptide).

c ANN and SVM could not be implemented because the number of variables is too large.
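A leave-1/3-out cross-validated q2 can be sketched as follows, assuming it means splitting the training set into three groups, predicting each group from the other two, and computing q2 = 1 − PRESS/SS. The ridge-regularized linear model is only a stand-in for PLS, SVM, or GP, and the random split, the function name `q2_leave_third_out`, and the toy data are all assumptions.

```python
import numpy as np

def q2_leave_third_out(x, y, n_groups=3, ridge=1e-3, seed=0):
    """Cross-validated q2 with one third of the samples held out per round."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    press = 0.0
    for held_out in np.array_split(order, n_groups):
        train = np.setdiff1d(order, held_out)
        x_mean, y_mean = x[train].mean(axis=0), y[train].mean()
        xc = x[train] - x_mean
        # Stand-in model: ridge-regularized least squares on centered data.
        gram = xc.T @ xc + ridge * np.eye(x.shape[1])
        weights = np.linalg.solve(gram, xc.T @ (y[train] - y_mean))
        pred = (x[held_out] - x_mean) @ weights + y_mean
        press += ((y[held_out] - pred) ** 2).sum()
    ss_total = ((y - y.mean()) ** 2).sum()
    return 1.0 - press / ss_total

# Toy data: 30 samples, 4 descriptors, nearly linear response (hypothetical).
rng = np.random.default_rng(1)
x = rng.normal(size=(30, 4))
y = x @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.1 * rng.normal(size=30)
q2 = q2_leave_third_out(x, y)
print(f"q2 = {q2:.3f}")
```

Because a third of the samples is withheld in each round, this estimate is typically more conservative than leave-one-out, which fits the gap between r2 and q2 reported in the tables.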

Table S9. Modeling statistics of the cationic antimicrobial pentadecapeptide panel based on the original 124 amino acid descriptors (119 original DPPSs + 2 ISA-ECI + 3 z-scales) without PCA-processing.

Method / Num. of variables / Training set (70 samples) / Test set (31 samples)
r2 / q2a / RMSE / / RMSP / Tropsha’s statistics
/ / / k / k′
PLS / 1860 b / 0.910 / 0.513 / 0.110 / 0.452 / 0.321 / 0.489 / 0.374 / 0.395 / 0.678 / 1.102
ANN c / 1860 / - / - / - / - / - / - / - / - / - / -
SVM c / 1860 / - / - / - / - / - / - / - / - / - / -
GP c / 1860 / - / - / - / - / - / - / - / - / - / -

aLeave-1/3-out cross-validation q2.

b 1860 variables = 124 descriptors (119 original DPPSs + 2 ISA-ECI + 3 z-scales) × 15 AAs (per pentadecapeptide).

c ANN, SVM and GP could not be implemented because the number of variables is too large.
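The variable counts in footnote b of Tables S7 to S9 follow from concatenating the 124 per-residue descriptors along the peptide sequence. The sketch below illustrates that bookkeeping; the descriptor table is a zero-filled placeholder, not the real DPPS, ISA-ECI, or z-scale values, and the names `DESCRIPTOR_TABLE` and `encode` are hypothetical.

```python
N_DESCRIPTORS = 119 + 2 + 3  # original DPPSs + ISA-ECI + z-scales = 124

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Placeholder table: each residue maps to 124 zeros instead of real descriptors.
DESCRIPTOR_TABLE = {aa: [0.0] * N_DESCRIPTORS for aa in AMINO_ACIDS}

def encode(peptide):
    """Concatenate the per-residue descriptor vectors in sequence order."""
    features = []
    for residue in peptide:
        features.extend(DESCRIPTOR_TABLE[residue])
    return features

# First training peptide of Tables S1, S2, and S3, respectively.
for peptide in ("VW", "VESSK", "KWKLFLGILAVLKVL"):
    print(peptide, len(encode(peptide)))
# prints:
# VW 248
# VESSK 620
# KWKLFLGILAVLKVL 1860
```

With 70 training pentadecapeptides and 1860 variables, the number of variables exceeds the number of samples by more than a factor of 25, which gives a sense of the scale referred to in footnote c.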