Li Xue

Iowa State University 1-515-450-7183(C)

PROFILE

Ph.D. student with a major in Bioinformatics and a minor in Statistics. Major research interests are data mining/machine learning applications in QSAR (Quantitative Structure–Activity Relationship) models, macromolecular binding site prediction, T cell epitope prediction, and using partner-specific interface predictions to rank conformations generated by docking programs.

SKILLS

  • Machine learning algorithms
  • Familiar with Linux
  • Proficient in Perl, MATLAB, R
  • Experience with SAS

EDUCATION

Iowa State University, Ph.D., 2012

Bioinformatics with Statistics minor, GPA: 3.76/4

Major Professors: Vasant Honavar and Drena Dobbs

Dissertation: Sequence homology based protein-protein interacting residue predictions and the applications in ranking docked conformations.

Shanghai Jiaotong University, M.S., 2003

Image Processing and Pattern Recognition, GPA: 3.93/4

Major Professors: Lixiu Yao and Jie Yang

Thesis: Data mining-based gene expression data analysis and prediction of signal peptides and its cleavage site.

Yanshan University, B.E., 1999

Electrical and Electronics Engineering, GPA: 3.55/4 (top 5%)

Major Prof. Xiuling Zhang

Thesis: The optimization of dynamic RBF neural network and its application.

HONORS & AWARDS

ISU Research Excellence Award 2012

Best Poster Award at ACM-BCB conference 2011

Women in Bioinformatics Award at ACM-BCB conference 2011

ISMB conference Travel Fellowship 2010

Exceptional Graduate 2003, 2006

Scholastic Excellent Graduation Thesis for Bachelor Degree 2003

Special Prize of Academic Excellence Scholarship (top 1%) 2002

The First and Second Prize of Academic Excellence Scholarship 1999-2001

Exceptional Student 1999-2000

JOURNAL PAPERS

Xue, L. C., Jordan, R., El-Manzalawy, Y., Dobbs, D., & Honavar, V. (2012). DockRank: Ranking docked models using partner-specific sequence homology based protein interface prediction. (To be submitted.)

Walia, R., Xue, L. C., Wilkins, K., El-Manzalawy, Y., Dobbs, D., & Honavar, V. (2012). Robust prediction of RNA-binding sites in proteins using a combination of sequence homology and machine learning methods. (To be submitted.)

Xue, L. C., Dobbs, D., & Honavar, V. (2011). HomPPI: A Class of sequence homology based protein-protein interface prediction methods. BMC Bioinformatics, 12, 244.

Zhang, G. L., Ansari, H. R., Bradley, P., Cawley, G. C., Hertz, T., Hu, X., Jojic, N., Kim, Y., Kohlbacher, O., Lund, O., Lundegaard, C., Magaret, C. A., Nielsen, M., Papadopoulos, H., Raghava, G. P., Tal, V. S., Xue, L. C., Yanover, C., Zhu, S., Rock, M. T., Crowe, J. E., Panayiotou, C., Polycarpou, M. M., Duch, W., & Brusic, V. (2011). Machine learning competition in immunology - Prediction of HLA class I binding peptides. J Immunol Methods, 374 (1-2), 1-4.

Xue, L. C., Petersen, L. K., Broderick, S., Narasimhan, B., & Rajan, K. (2010). Identifying factors controlling protein release from combinatorial biomaterial libraries via hybrid data mining methods. ACS Combinatorial Science, 13, 50-58.

Petersen, L. K., Xue, L. C., Wannemuehler, M. J., Rajan, K., & Narasimhan, B. (2009). The simultaneous effect of polymer chemistry and device geometry on the in vitro activation of murine dendritic cells. Biomaterials, 30, 5131-5142.

Lee, J. H., Hamilton, M., Gleeson, C., Caragea, C., Zaback, P., Sander, J. D., Xue, L. C., Wu, F., Terribilini, M., Honavar, V., & Dobbs, D. (2008). Striking similarities in diverse telomerase proteins revealed by combining structure prediction and machine learning approaches. Pac Symp Biocomput, 501-12.

Xue, L. C., Yang, J., & Liu, H. (2006). Multi-feature Image Segmentation using FCM algorithm. Image Technology, 1, 34-35.

Li, G. Z., Yang, J., Liu, G. P., & Xue, L. C. (2004). Feature selection for multi-class problems using support vector machines. Lecture Notes In Artificial Intelligence, 3157, 292-300.

Liu, H., Yang, J., Wang, M., Xue, L. C., & Chou, K. C. (2005). Using Fourier Spectrum Analysis and Pseudo Amino Acid Composition for Prediction of Membrane Protein Types. The Protein Journal, 24(6), 385-389.

CONFERENCE PAPERS

Xue, L. C., Jordan, R., El-Manzalawy, Y., Dobbs, D., & Honavar, V. (2011). Ranking docked models of protein-protein complexes using predicted partner-specific protein-protein interfaces: A preliminary study. In Proceedings of the International Conference On Bioinformatics and Computational Biology (ACM-BCB); Chicago, Illinois, August 1-3, 2011. (Best Poster Award; an extended version was invited to a special issue of BMC Bioinformatics.)

Xue, L. C., Walia, R., El-Manzalawy, Y., Dobbs, D., & Honavar, V. (2011). Improved prediction of protein-RNA interfaces using combined sequence homology and machine learning methods: A preliminary study. In Proceedings of the International Conference On Bioinformatics and Computational Biology (ACM-BCB); Chicago, Illinois, August 1-3, 2011.

Yao, L., Xue, L. C., & Liu, H. (2007). A novel approach predicting the signal peptides and their cleavage sites. International Conference on Bioinformatics & Biomedical Engineering, 8, 391-393.

PROJECTS

Protein-Protein Docking

  • DockRank: Rank Docked Conformations, Fall 2010 - Fall 2011

Designed and developed DockRank, a novel approach to rank docked conformations based on the degree to which the interface residues inferred from the docked conformation match the interface residues predicted by our partner-specific sequence homology based interface predictor, PS-HomPPI. Our results show that DockRank significantly outperforms several state-of-the-art energy based scoring functions, as well as variants of DockRank supplied with interfaces predicted by several state-of-the-art non-partner-specific interface predictors.

Protein Interface Prediction

  • PS-HomPPI: Partner-Specific Protein-Protein Interface Prediction, Spring 2010 - Fall 2010

Proposed a novel partner-specific measure of conservation of residues at the interface between a pair of interacting proteins among their homo-interologs. Developed PS-HomPPI, the first sequence based partner-specific interface predictor, which in our preliminary studies has been shown to be among the most reliable predictors of interface residues of a hypothetical transient complex formed by a protein A with its putative interaction partner B whenever the homo-interologs of A-B can be reliably identified.

  • NPS-HomPPI: Non-Partner Specific Homologous Sequence-Based Protein-Protein Interface Prediction, Spring 2007 - Summer 2010

Applied PCA (Principal Component Analysis), a dimension reduction technique, to study the multivariate relationship between protein interface conservation and multiple sequence similarity metrics. Developed a sequence homology based interface residue predictor, NPS-HomPPI, which does not require knowledge of binding partners and whose performance rivals more complicated methods that require structural information as input.

Computational Immunology

  • MHC Class II epitope prediction (Intern at Merck & Co., Inc.), Summer 2009

Compared several published algorithms for MHC Class II epitope prediction. Designed, developed, and tested a new epitope prediction algorithm, a modification of Hammer's matrix method, which showed improved performance compared to other methods.

QSAR

  • In Silico Analysis of Biodegradable Drug Delivery System, 2008

Developed a GA-SVR hybrid system for selecting relevant copolymer molecular descriptors for polymer film stimulation data. A Genetic Algorithm (GA) was used to select the subset of copolymer molecular descriptors that optimizes the regression performance of SVR (Support Vector Regression) on polymer film stimulation data.

Optimization, Classification, Regression

  • GA-LLE based Regression Analysis of Spinel Data, Fall 2008

Applied Genetic Algorithms (GA) to find the optimal parameters for LLE (Locally Linear Embedding), which was used to reduce the dimension of feature space for SVR (Support Vector Regression). Significantly improved the regression performance of SVR from 0.7386 (original 52-dimension space) to 0.9105 (14-dimension LLE space).

  • Netflix Competition – Movie Recommendation Systems, 2008

Led a group of three graduate students. Instead of using users’ profiles, we downloaded, extracted, and utilized many properties of the movies, such as actors, director, genres, and awards information. Designed and developed a set of similarity-based approaches, PCA-based SVM classification, and regression solutions to predict a user’s ranking of movies.

  • Classification of Signal Peptide and Prediction of Cleavage Site, 2005

Designed the classifiers using SVMs/HMM (Support Vector Machines/Hidden Markov Model); dealt with an unbalanced dataset.

  • Multi-Feature Image Segmentation using FCM Algorithm, Summer 2005

Used the fuzzy c-means (FCM) clustering algorithm to segment a picture into several meaningful areas.

  • Image Processing: Character Recognition (course project), Spring 2005

Trained BP and Hopfield neural networks (NN) on a labeled character sample set, and used the trained NN classifiers to identify noisy characters.

  • Speech Enhancement (course project), Fall 2004

Studied and implemented four basic adaptive speech enhancement algorithms based on LMS and Wavelet Decomposition.

  • Clustering Analysis of Gene Expression Profile, Fall 2004 - Spring 2005

Spectral estimation of optimal cluster numbers; dealt with incomplete datasets.

  • Optimal Design and Application of RBF Neural Network (Bachelor thesis project), Spring 2003

Designed a fuzzy neural network temperature controller (FNNC); used GA to optimize the parameters of the FNNC; used an RBF NN to simulate the temperature system to be controlled.

REFERENCES

Available upon request