Supplementary material to
In silico proteomic characterization of human epidermal growth factor receptor 2 (HER-2) for the mapping of high affinity antigenic determinants against breast cancer
------
Detailed methodology for homology modeling of HER-2:
Homology modeling builds a three-dimensional model of a protein on the basis of one or more template structures. The three-dimensional structure of HER-2 was modeled using a comparative protein modeling protocol. Homology modeling of the HER-2 sequence was performed with two automated homology modeling programs (SWISS-MODEL and CPH models 3.0 server) and with MODELLER 9v7. SWISS-MODEL and the CPH models server require only sequence data as input, so the HER-2 sequence was submitted directly to these tools and 3D models of HER-2 were obtained.
For model building with MODELLER, the template was selected manually after a BLASTp search for the best template. The sequence with the highest identity score and the lowest e-value relative to the query sequence was to serve as the reference structure for building the 3D model. However, no single template satisfied these criteria. Multi-template modeling was therefore performed using the seven best templates, i.e., 1S78|A, 1S78|B, 3BE1|A, 1RPA|A, 2JWA|A, 3FUY|A and 3H3B|A.
MODELLER was used to align the target protein sequence (HER-2) with the seven templates by means of the align2d_mult.py script. The resulting alignment file was incorrect, as the residues of the template sequences were not properly aligned with the query sequence. ClustalW was therefore tried as an alternative for the multiple alignment. Because this server requires sequences in FASTA format rather than PDB files, the sequences of all selected templates were extracted with the ATM2SEQ tool at the NCBS-IWS portal. The alignment generated by ClustalW was also incorrect, as was the one produced by the commercial Geneious Pro software, which prompted us to correct the sequence alignment manually; the optimal alignment is depicted in Fig. 2. Finally, the model_mult.py script of MODELLER was run with the manually prepared alignment file as input.
It was observed that the sequence alignment was accurate and the models were generated using 100% identity.
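The template selection criterion described above (highest identity, then lowest e-value) can be illustrated with a minimal sketch. The aligned fragments, the template identifiers (`tmpl_A`, `tmpl_B`, `tmpl_C`) and all numeric values below are hypothetical and are not taken from the actual BLASTp output. Percent identity is computed here over ungapped columns only, which is one of several conventions in use and may differ from the one applied by BLASTp or MODELLER.

```python
def percent_identity(aligned_query, aligned_template, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for q, t in zip(aligned_query, aligned_template):
        if q == gap or t == gap:
            continue  # gapped columns are not counted
        compared += 1
        if q == t:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def rank_templates(hits):
    """Sort hits by identity (highest first), then by e-value (lowest first)."""
    return sorted(hits, key=lambda h: (-h["identity"], h["evalue"]))


# Hypothetical aligned fragments, not taken from the real alignment.
identity = percent_identity("TQVCTGTDM", "TQVC-GTDL")

# Hypothetical BLASTp-style hits with made-up identifiers and values.
hits = [
    {"template": "tmpl_A", "identity": 62.0, "evalue": 1e-40},
    {"template": "tmpl_B", "identity": 98.5, "evalue": 0.0},
    {"template": "tmpl_C", "identity": 98.5, "evalue": 1e-120},
]
ranked = [h["template"] for h in rank_templates(hits)]
print(identity, ranked)
```

When two hits share the same identity, the e-value acts only as a tie-breaker in this sketch; a real selection would also weigh query coverage and structure quality.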
Predicted models produced by any of the methods discussed above must be examined further to increase confidence in the structural model. The structure models obtained from the different software tools (SWISS-MODEL, CPH models server and MODELLER) were validated by inspecting the Phi/Psi Ramachandran plot obtained from PROCHECK analysis, and the compatibility of each atomic model (3D) with its own amino acid sequence (1D) was checked with VERIFY_3D. The energy of the HER-2 protein model was then minimized to release internal constraints and to increase the percentage of residues in the core region. Energy is a function of the degrees of freedom in a molecule (i.e., bonds, angles and dihedrals). Minimum-energy arrangements of the atoms correspond to stable states of the system, and energy minimization can repair distorted geometries by moving atoms to release internal constraints. Here, the constructed model was solvated and subjected to constrained energy minimization, with a harmonic constraint of 100 kJ/mol/Å² applied to all amino acids whose Phi/Psi angles lay outside the core region, using the steepest descent and conjugate gradient techniques to eliminate bad contacts between protein atoms and structural water molecules and to correct the stereochemistry of the model. Computations were carried out in vacuo with the GROMOS96 parameter set as implemented in Swiss-PdbViewer. The refined model was once again subjected to a series of tests with PROCHECK to validate its consistency and reliability.
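The idea of steepest descent minimization under a harmonic constraint can be shown on a toy one-dimensional problem. This is only a sketch under stated assumptions and not the GROMOS96 or Swiss-PdbViewer implementation: a single coordinate feels one hypothetical quadratic force-field term plus a harmonic constraint of 100 kJ/mol/Å² towards a hypothetical reference position. All other constants, and the fixed step size, are illustrative choices.

```python
def restrained_gradient(x, k_bond, x_bond, k_restraint, x_ref):
    """Gradient of 0.5*k_bond*(x - x_bond)**2 + 0.5*k_restraint*(x - x_ref)**2."""
    return k_bond * (x - x_bond) + k_restraint * (x - x_ref)


def steepest_descent(x_start, gradient, step, tol=1e-8, max_iter=10000):
    """Move against the gradient with a fixed step until the gradient is small."""
    x = x_start
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x


K_BOND = 50.0        # hypothetical force-field term, kJ/mol/Å^2
X_BOND = 1.5         # hypothetical position preferred by that term, Å
K_RESTRAINT = 100.0  # harmonic constraint strength quoted in the text
X_REF = 1.0          # hypothetical reference position of the constraint, Å

x_min = steepest_descent(
    2.0,
    lambda x: restrained_gradient(x, K_BOND, X_BOND, K_RESTRAINT, X_REF),
    step=0.005,
)

# For two quadratic terms the minimum is the force-constant-weighted mean.
x_exact = (K_BOND * X_BOND + K_RESTRAINT * X_REF) / (K_BOND + K_RESTRAINT)
print(x_min, x_exact)
```

The stronger the constraint relative to the other term, the closer the minimized coordinate stays to the reference position, which is the purpose of restraining selected residues during refinement.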
Supplementary Tables
Table 1: Final selection of epitopes (B-cell and T-cell) from various immunoinformatic tools
Source / B-cell or T-cell / HLA allele or antigenic property / Epitope / Score
IEDB/BcePred / B-cell (Continuous) / Accessibility / PETHLD / 2.1
IEDB/BcePred / B-cell (Continuous) / Accessibility / NRPEDE / 7.2
IEDB/BcePred / B-cell (Continuous) / Accessibility / HYKDPP / 5.0
IEDB/BcePred / B-cell (Continuous) / B-Turn / PLNNTTP / 1.2
IEDB/BcePred / B-cell (Continuous) / B-Turn / HKNNQLA / 1.0
IEDB/BcePred / B-cell (Continuous) / B-Turn / HHNTHLC / 1.0
IEDB/BcePred / B-cell (Continuous) / B-Turn / NCTHSCV / 1.1
IEDB/BcePred / B-cell (Continuous) / Hydrophilicity / TGASPGG / 4.7
IEDB/BcePred / B-cell (Continuous) / Hydrophilicity / TGPKHSD / 5.3
IEDB/BcePred / B-cell (Continuous) / Hydrophilicity / ECQPQNG / 5.1
IEDB/BcePred / B-cell (Continuous) / Hydrophilicity / DEEGACQ / 5.8
IEDB/BcePred / B-cell (Continuous) / Flexibility / RSLRELG / 1.0
IEDB/BcePred / B-cell (Continuous) / Flexibility / GESSEDC / 1.1
IEDB / T-cell (MHC-I) / HLA-B*1503 / TQVCTGTDM / 14.3
IEDB / MHC-I / HLA-A*3201, HLA-A*0250, HLA-A*0219, HLA-A*0216, HLA-A*0212, HLA-A*0211, HLA-A*0206, HLA-A*0201, HLA-A*0202 / QLFEDNYAL, QLFEDNYALA / 11.5, 7.5, 10.6, 33.7, 3.8, 3.7, 24.2, 10.6, 27.1
IEDB / MHC-I / HLA-A*0202, HLA-A*0250 / FLQDIQEVQ / 33.6, 7.9
IEDB / MHC-I / HLA-A*3101 / RQVPLQRLR / 26.5
IEDB / MHC-I / HLA-A*2403 / TYLPTNASL / 8.9
IEDB / MHC-I / HLA-A*0211, HLA-A*0216, HLA-A*0250 / VLIQRNPQL / 24.7, 36.9, 5.0
IEDB / MHC-I / HLA-A*3101, HLA-A*6801 / RTVCAGGCAR / 28.9, 31.6
IEDB / MHC-I / HLA-A*0203, HLA-A*6802 / YTFGASCVTA / 46.2, 16.3
IEDB / MHC-I / HLA-A*0203, HLA-A*0250 / YLSTDVGSC / 41.1, 21.5
IEDB / MHC-I / HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0206, HLA-A*0211, HLA-A*0216, HLA-A*0250, HLA-A*3001, HLA-B*1517 / KIFGSLAFL / 18.9, 5.2, 10.0, 6.4, 7.7, 21.5, 8.9, 24.3, 37.8
IEDB / MHC-I / HLA-A*1101, HLA-A*3101, HLA-A*6801 / CVNCSQFLR / 45.7, 14.5, 9.6
IEDB / MHC-I / HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0211, HLA-A*0212, HLA-A*0216, HLA-A*0219, HLA-A*0250 / ILHNGAYSL / 47.1, 40.9, 30.5, 5.9, 9.3, 39.3, 24.5, 3.7
IEDB / MHC-I / HLA-A*0203, HLA-A*0211, HLA-A*0212, HLA-A*0216, HLA-A*0219, HLA-A*0250 / HLYQGCQVV / 10.4, 5.1, 19.2, 7.4, 29.9, 2.6
IEDB / MHC-I / HLA-A*0203, HLA-A*6802 / CVARCPSGV / 35.5, 13.4
IEDB / MHC-I / HLA-B*1503, HLA-B*1517, HLA-B*5801 / LSYMPIWKF / 11.9, 3.2, 26.9
IEDB, PROPRED / MHC-I, MHC-II / HLA-A*0206, HLA-A*0211, HLA-A*0216, HLA-A*0219, HLA-A*0250, HLA-A*6901, HLA-DRB1_0305, HLA-DRB1_0309 / YVLIAHNQV / 20.1, 9.6, 19.3, 35.9, 5.0, 6.6, 3.3, 3.71
PROPRED / MHC-II / DRB1_1101, DRB1_1120, DRB1_1128, DRB1_1302, DRB1_1305, DRB1_1321 / FQNLQVIRG / 3, 3.2, 5.2, 3.2, 5.2, 5.1
PROPRED / MHC-II / HLA-DRB1_0405, HLA-DRB1_0410, HLA-DRB1_1304 / LQVFETLEE / 5.2, 6.2, 5.0
EpiToolKit / MHC-I / HLA-A*1101 / CSPMCKGSR / 18.0
EpiToolKit / MHC-I / HLA-A*0301, HLA-A*2601 / ELHCPALVTY / 24.0, 25.0
EpiToolKit / MHC-I / HLA-A*0201 / TLQGLGISWL / 26.0
EpiToolKit / MHC-I / HLA-B*1801, HLA-B*3701, HLA-B*4001, HLA-B*4402, HLA-B*4901 / GEGLACHQL / 19.0, 22.0, 23.0, 23.0, 18.0
EpiToolKit / MHC-I / HLA-A*0301 / RVLQGLPREY / 23.0
EpiToolKit / MHC-I / HLA-B*0702 / GPEADQCVA / 18.0
HLAPred / T cell [MHC-I] / HLA-A*3302 / ESMPNPEGR / 3.8
HLAPred / MHC-I / HLA-B*5102 / RNPHQALLH / 15.7
HLAPred/ProPred / MHC-II / HLA-DRB1*0813 / LQLRSLTEI / 3.8
IEDB / MHC-II / HLA-H2IEd / QDTILWKDI / 75.78
RANKPEP / MHC-I / HLA-A24 / AYIPWHIHF / 53.76
RANKPEP / MHC-I / HLA-B3909 / SRDPVHWMW / 55.424
RANKPEP / MHC-I / HLA-Cw0304 / IAMDPFMAL / 57.304
RANKPEP / MHC-I / HLA-B35 / WPHPWLHGY / 45.866
RANKPEP / MHC-I / HLA-A6601 / YCFGGGHKR / 49.723
Table 2: The docking scores of all selected T-cell epitopes for the screening of conformational epitopes
S. No. / Molecule / Dock Score
1 / YVLIAHNQV / 105.95
2 / AYIPWHIHF / 105.90
3 / CSPMCKGSR / 96.25
4 / CVARCPSGV / 101.92
5 / CVNCSQFLR / 109.66
6 / ELHCPALVTY / 104.34
7 / ESMPNPEGR / 108.97
8 / FLQDIQEVQ / 113.44
9 / FQNLQVIRG / 109.79
10 / GEGLACHQL / 102.17
11 / GPEADQCVA / 101.84
12 / HLYQGCQVV / 102.06
13 / IAMDPFMAL / 99.77
14 / ILHNGAYSL / 107.68
15 / KIFGSLAFL / 98.18
16 / LQLRSLTEI / 101.95
17 / LQVFETLEE / 106.80
18 / LSYMPIWKF / 103.81
19 / QDTILWKDI / 106.95
20 / QLFEDNYAL / 109.24
21 / QLFEDNYALA / 111.60
22 / RNPHQAAAL / 101.31
23 / RQVPLQRLR / 95.84
24 / RTVCAGGCAR / 101.86
25 / RVLQGLPREY / 100.07
26 / SRDPVHWMW / 105.02
27 / TLQGLGISWL / 100.76
28 / TQVCTGTDM / 112.34
29 / TYLPTNASL / 111.49
30 / VLIQRNPQL / 99.27
31 / WPHPHWLHGY / 107.56
32 / YCFGGGHKR / 119.56
33 / YLSTDVGSC / 106.66
34 / YTFGASCVTA / 103.55
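Screening on the basis of Table 2 amounts to sorting the epitopes by their docking score. The sketch below uses five rows copied from the table and assumes that a higher dock score indicates a more favourable interaction; this convention is not stated here and depends on the docking program used. The function name `rank_by_score` is illustrative.

```python
# Five (epitope, dock score) rows copied from Table 2.
dock_scores = [
    ("CVNCSQFLR", 109.66),
    ("FLQDIQEVQ", 113.44),
    ("RQVPLQRLR", 95.84),
    ("TQVCTGTDM", 112.34),
    ("YCFGGGHKR", 119.56),
]


def rank_by_score(rows, top_n=None):
    """Return rows sorted by score, highest first (assumed to be best)."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return ranked if top_n is None else ranked[:top_n]


top3 = rank_by_score(dock_scores, top_n=3)
for epitope, score in top3:
    print(f"{epitope}\t{score:.2f}")
```

If the docking program reports energies for which lower values are better, the same function applies with `reverse=False`.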
Supplementary Figures
Figure 1:
Fig. 1: Best multiple sequence alignment used for 3D model building. Fragments missing from 1S78|A are enclosed in red rectangles, and the corresponding residues present in the HER-2 sequence are shown in green.
Figure 2:
Fig. 2: Comparative Ramachandran plot analysis of (a) the modeled protein and (b) 1S78|A.
Figure 3A:
Fig. 3 (A): Superposition of 1S78|A (cyan) and the modeled protein (pink).
Figure 3B:
Fig. 3 (B): Superposition (structural alignment) of 1S78A (Model 1) and modeled protein (Model 2) shown in sequence alignment (1D) form.
Figure 4:
Fig. 4: DNA vaccine (pSecTag2-HER-2₆₂₄-CTLA-4₁₂₄)
Extracellular domain of human CTLA-4, used to augment the immune response:
atgcacgtggcccagcccgccgtggtgctggccagcagcagaggcatcgccagcttcgtgtgcgagtacgccagccccggcaaggccaccgaggtgagagtgaccgtgctgagacaggccgacagccaggtgaccgaggtgtgcgccgccacctacatgatgggcaacgagctgaccttcctggacgacagcatctgcaccggcaccagcagcggcaaccaggtgaacctgaccatccagggcctgagagccatggacaccggcctgtacatctgcaaggtggagctgatgtacccccccccctactacctgggcatcggcaacggcacccagatctacgtgatcgaccccgagccctgccccgacagcgac
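A coding sequence such as the one above can be checked by translating it with the standard genetic code. The following is a minimal sketch; the helper name `translate` is illustrative, and only the first 45 nucleotides of the sequence are used to keep the example short. The full sequence can be passed in the same way, provided its length is a multiple of three.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (reading frame 1) into one-letter amino acids."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First 45 nucleotides (15 codons) of the sequence given above.
prefix = "atgcacgtggcccagcccgccgtggtgctggccagcagcagaggc"
peptide = translate(prefix)
print(peptide)
```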