Supplemental Information

Analytical and decision support tools for genomics-assisted breeding

Rajeev K. Varshney1, 2, Vikas K. Singh1, John M. Hickey3, Xu Xun4, David F. Marshall5, Jun Wang4, David Edwards2, Jean-Marcel Ribaut6

1International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, India

2School of Plant Biology, University of Western Australia, 35 Stirling Highway, Crawley, WA, Australia

3The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Easter Bush, Midlothian, Scotland, UK

4Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, China

5Information and Computational Sciences, The James Hutton Institute, Invergowrie, Dundee, United Kingdom

6Generation Challenge Program/ Integrated Breeding Platform, c/o CIMMYT, Apdo. Mexico, DF, Mexico

Corresponding author: Varshney, R.K.

Twitter: @rajvarshney

Table S1. Analytical and Decision Support Tools for Genomics-Assisted Breeding

Software / Description / Citation† / URL / Refs
Genetic diversity analysis
DAMBE (5) / Primarily used for phylogenetic analysis and data analysis in molecular biology and evolution. / 1822
(Freely available) / / [S1]
DarWin (5.0.158) / Dissimilarity Analysis and Representation for Windows (DarWin), developed for diversity and phylogenetic analysis on the basis of evolutionary dissimilarities. / 279
(Freely available) /
MEGA (5) / An integrated tool for automatic and manual sequence alignment, inferring phylogenetic trees, mining web-based databases, estimating rates of molecular evolution, and testing evolutionary hypotheses. / 25896 (v4), 22196 (v5), 3504 (v6)
(Freely available) / / [S2]
NTSYS-pc (2.2) / Numerical taxonomy and multivariate analysis system; probably the most widely used software for diversity analysis. / 11055
(Commercially available) / / [S3]
PAUP (4) / Phylogenetic analysis using parsimony, freely available tool for inferring and interpreting evolutionary trees. / 20664
(Freely available) /
SNPhylo / A powerful SNP-based pipeline to develop a phylogenetic tree from re-sequencing datasets. / 8
(Freely available) / / [S4]
Population genetic analysis
Arlequin (3.5) / An integrated software package for population genetic data analysis. / 10946 (v3), 4010 (v3.5)
(Freely available) / / [S5]
DnaSP (5) / A software package for the analysis of aligned DNA sequence data. / 53815
(Freely available) / / [S6]
GenAlEx (6.5) / Excel-based population genetic analysis software. / 6457 (v6), 1215 (v6.5)
(Freely available) / / [S7]
GENEPOP (4.2.1) / A population genetics software package for a range of analyses. / 12623
(Freely available) / / [S8]
PowerMarker (3.25) / Designed for use with SSR/SNP data in population genetics analyses. / 1873
(Freely available) / / [S9]
SMOGD (1.2.5) / A web-based application for the calculation of genetic diversity. / 447
(Freely available) / / [S10]
Linkage map construction
JoinMap (4.1) / Useful software for the construction of high-density genetic linkage maps. Best suited for constructing consensus maps; can accommodate up to 100,000 markers. / 836 (v4.1), 1607 (v3)
(Commercially available) /
LPmerge / R-based package for construction of consensus linkage maps. / 8
(Freely available) / / [S11]
Madmapper / Useful for constructing high-density linkage maps using SNPs. / 115
(Freely available) /
MapDisto / Constructs linkage maps; it can create new maps or add molecular markers to pre-existing maps. It also calculates segregation distortion. / 53
(Freely available) /
MapDraw / Microsoft Excel macro for drawing genetic linkage maps based on given genetic linkage data / 283
(Freely available) / / [S12]
Mapmaker (3) / The first, and a widely used, program for constructing genetic linkage maps. / 6563
(Freely available) / / [S13]
MSTMap / Based on Minimum Spanning Tree (MST). The algorithm implemented can handle ultra-dense maps of up to 10,000~100,000 markers. / 45
(Freely available) / / [S14]
Record / REcombination Counting and ORDering, can be used for the ordering of loci on genetic linkage maps. / 147
(Freely available) / / [S15]
SimpleMap / This pipeline can develop linkage maps of ~1000 loci in less than 10 minutes. / 0
(Freely available) /
SEG-Map / A novel software for genotype calling and genetic map construction from next-generation sequencing / 8
(Freely available) / / [S16]
QTL mapping using bi- and multi-parent mapping populations and visualization
Biomercator (V3) / Package used for genetic map compilation and QTL meta-analysis algorithms. / 203, 19 (V3)
(Freely available) / / [S17]
Mapchart (2.2) / A software for the graphical presentation of linkage maps and QTLs. / 2038
(Freely available) / / [S18]
MAPMAKER/QTL (3.0) / Allows mapping genes controlling quantitative traits to a genetic linkage map. / 420
(Freely available) /
MapManager QTX / Software for mapping both Mendelian and quantitative trait loci and useful for SMA, SIM and CIM analysis. / 1111
(Freely available) / / [S19]
MapQTL (6) / Software for mapping QTLs; can be used with re-sequencing-based datasets. / 652 (v5), 546 (v4)
(Commercially available) /
MetaQTL / Package of new computational methods for the meta-analysis of QTL mapping experiments. / 77
(Freely available) / / [S20]
MQ2 / MQ2 is a useful tool for visualizing QTL mapping results. / 0
(Freely available) / / [S21]
MultiQTL (2.5) / Integrates a broad spectrum of NGS data analysis for performing QTL analysis / 14
(Commercially available) / / [S22]
PLABQTL / Plant Breeding and Biology-QTL. Provides an overview of the input linkage map with segregation ratios and genotype frequencies of marker pairs; useful for CIM analysis. / 338
(Freely available) / / [S23]
PROC-QTL / Allows users to perform QTL mapping for continuous and discrete traits within the SAS platform. / 53
(Commercially available) / / [S24]
QGene (4.3.10) / A plug-in platform that allows execution and comparison of a variety of modern QTL-mapping methods; useful for SMA and SIM analysis. / 146
(Commercially available) / / [S25]
QTLMapper (1.6) / Software for mapping quantitative trait loci with main effects, epistatic effects and QTL × environment interactions. / 695
(Freely available) / / [S26]
QTLnetwork (2.0) / Software package for mapping QTL with epistasis and Q × E interaction. / 189
(Freely available) / / [S27]
R/qtl / R-based QTL mapping software supporting a wide range of experimental cross types. / 1293
(Freely available) / / [S28]
R/qtlcharts / R-based software providing interactive graphics for quantitative trait locus mapping. / 0
(Freely available) / / [S29]
Win QTL-Cartographer (2.5) / A user-friendly version to map quantitative traits and best suited for SIM and CIM analysis. / 1565
(Freely available) /
Genome-wide association studies for marker-trait association
BAPS (6) / A useful tool for Bayesian inference of the genetic structure of a population. / 216
3322
(Freely available) / / [S30]
EIGENSOFT / Detects and corrects for population stratification in genome-wide association studies using principal components analysis. / 3929
(Freely available) / / [S31]
GAPIT / Genome Association and Prediction Integrated Tool, an R package that performs GWAS and genome prediction or genomic selection. / 95
(Freely available) / / [S32]
GenABEL / R-based genome-wide SNP analysis package. / 701
(Freely available) / / [S33]
PLINK (1.9) / A useful tool set for whole-genome association and population-based linkage analyses. / 8863
(Freely available) / / [S34]
fastSTRUCTURE / Investigates population structure using multi-locus genotype data. / 14798
(Freely available) / / [S35]
TASSEL (5.0) / Trait Analysis by aSSociation, Evolution, and Linkage (TASSEL) is useful for association mapping of complex traits in diverse samples. / 975
(Freely available) / / [S36]
Molecular Breeding for development of superior genotypes
CSSL Finder / Software for the development of chromosomal segment substitution lines, which can be used for mapping and also as improved lines. / 45
(Freely available) /
Flapjack / This software can be used for graphical genotyping and haplotype visualization of SNP data. / 44
(Freely available) / / [S37]
GGT / Graphical GenoTypes is versatile software for visualization and analysis of genetic data. / 135
(Freely available) / / [S38]
OptiMAS / A useful program for selecting plants with a high number of desirable alleles in a segregating population for intermating. / 3
(Freely available) / / [S39]
solGS / A web-based tool for genomic selection based on RR-BLUP model. / 0
(Freely available) / / [S40]
Sampling of lines/genotypes for phenotyping
maxRec / Useful for selecting lines on the basis of recombination events. / 44
(Freely available) / / [S41]
MMA / Minimum moment aberration based method to minimize the average of all pairwise similarities between individuals in the subsample. / 71
(Freely available) / / [S42]
PowerCore / This program applies the advanced M strategy with a heuristic search for establishing core sets. / 77
(Freely available) / / [S43]
SPCLUST / The most appropriate software for selecting subsets of lines from bi- and multi-parental mapping populations. / 4
(Freely available) / / [S44]
Integrated pipelines for genomics-assisted breeding
IBP-BMS (3.0.8) / Breeding Management System, a suite of interconnected purpose-built software tools and crop breeding databases designed to support planning, data management, statistical analysis and decision-making for plant breeding. / 0
(Freely available) / / [S45]
IciMapping / QTL IciMapping constructs genetic linkage maps and maps QTL by simple interval mapping and inclusive composite interval mapping (ICIM), and can handle segregation distortion loci. / 8
(Freely available) / / [S46]
iMAS / An integrated genomic-assisted breeding platform, based on freely available and powerful software. / 0
(Freely available) / / -
ISMU (1.0) / A pipeline for analyzing NGS-based data, from preprocessing to SNP calling and finally development of SNP-based markers such as KASPar and GoldenGate assays. / 1
(Freely available) / / [S47]
ISMU (2.0) / Improved version of ISMU 1.0, with additional modules for performing genomic selection. / 0
(Freely available) / Under development / [S48]
MBDT / Molecular breeding design tool, uses graphical genotyping and integration of genotyping, phenotyping and QTL information for marker-assisted backcrossing. / 0
(Freely available) / / [S49]
Genomic tools for sequencing-based mapping/re-sequencing
CloudMap / Web-based open-source package for direct analysis of EMS-induced mutants for identification of candidate genes. / 31
(Freely available) / / [S50]
fastPHASE / Software for haplotype reconstruction and estimating missing genotypes from population data. / 114
(Freely available) / / [S51]
HaploBlockFinder / A set of computer programs for analyses of haplotype block structure. / 135
(Freely available) / / [S52]
HaploView / This software can do analysis and visualization of LD and haplotype maps. / 9343
(Freely available) / / [S53]
MutMap / NGS-based rapid detection of candidate genes in EMS-induced mutants from higher plants. / 139
(Freely available) / / [S54]
NGM / Next-generation mapping is another NGS-based approach for identification of candidate genes in EMS mutants. / 122
(Freely available) / / [S55]
QTL-seq / A BSA-based NGS approach for rapid detection of QTL from bi-parental mapping populations. / 53
(Freely available) / / [S56]
SHAPEIT / A program for haplotype estimation from SNP genotypes in large cohorts across whole chromosomes. / 153
(Freely available) / / [S57]
ShoreMap / Simultaneous mapping and mutant identification by deep sequencing; a rapid approach for identifying candidate genes in EMS mutants. / 219
(Freely available) / / [S58]
WHAP / A program that performs haplotype-based association analysis; useful for multi-allelic markers with dominant and recessive genetic models. / 156
(Freely available) / / [S59]

†Citation index was calculated using Google Scholar as of August 15, 2015.

Software versions are shown in parentheses.

Supplemental References

S1. Xia, X. (2013) DAMBE5: A comprehensive software package for data analysis in molecular biology and evolution. Mol. Biol. Evol. 30, 1720-1728.

S2. Tamura, K. et al. (2011) MEGA5: Molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731-2739.

S3. Rohlf, F.J. (1992) NTSYS-pc (Numerical Taxonomy and Multivariate Analysis System). Version 1.70. Exeter, Setauket, NY.

S4. Lee, T.H. et al. (2014) SNPhylo: a pipeline to construct a phylogenetic tree from huge SNP data. BMC Genomics 15, 162.

S5. Excoffier, L. et al. (2005) Arlequin ver. 3.0: An integrated software package for population genetics data analysis. Evol. Bioinformatics Online 1, 47-50.

S6. Librado, P. and Rozas, J. (2009) DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25, 1451–1452.

S7. Peakall, R. and Smouse, P.E. (2012) GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research-an update. Bioinformatics 28, 2537-2539.

S8. Raymond, M. and Rousset, F. (1995) GENEPOP (version 1.2): population genetics software for exact tests and ecumenicism. J. Hered. 86, 248-249.

S9. Liu, K. and Muse, S.V. (2005) PowerMarker: an integrated analysis environment for genetic marker analysis. Bioinformatics 21, 2128–2129.

S10. Crawford, N. (2009) SMOGD: software for the measurement of genetic diversity. Mol Ecol Notes 10, 556-557.

S11. Endelman, J. B. and Plomion, C. (2014) LPmerge: an R package for merging genetic maps by linear programming. Bioinformatics 30, 1623-1624.

S12. Liu, R. and Meng, J. (2003) MapDraw: a Microsoft Excel macro for drawing genetic linkage maps based on given genetic linkage data. Hereditas 25, 317–321.

S13. Lander, E. S. et al. (1987) MAPMAKER: an interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1, 174-181.

S14. Wu, Y. et al. (2008) Efficient and accurate construction of genetic linkage maps from the minimum spanning tree of a graph. PLoS Genet. 4, e1000212.

S15. van Os, H. et al. (2005) RECORD: a novel method for ordering loci on a genetic linkage map. Theor. Appl. Genet. 112, 30–40.

S16. Zhao, et al. (2010) SEG-Map: A novel software for genotype calling and genetic map construction from next-generation sequencing. Rice 3, 98–102.

S17. Sosnowski, O. et al. (2012) BioMercator V3: an upgrade of genetic map compilation and quantitative trait loci meta-analysis algorithms. Bioinformatics 28, 2082-2083.

S18. Voorrips, R.E. (2002) MapChart: software for the graphical presentation of linkage maps and QTLs. J. Hered. 93, 77–78.

S19. Manly, K.F. et al. (2001) Map Manager QTX, cross-platform software for genetic mapping. Mamm. Genome 12, 930-932.

S20. Veyrieras, J.B. et al. (2007) MetaQTL: a package of new computational methods for the meta-analysis of QTL mapping experiments. BMC Bioinformatics 8, 49.

S21. Chibon, P.Y. et al. (2013) MQ2: visualizing multi-trait mapped QTL results. Mol. Breed. 32, 981-985.

S22. Broman, K.W. (2005) The genomes of recombinant inbred lines. Genetics 169, 1133-1146.

S23. Utz, H.F. and Melchinger, A.E. (1996) PLABQTL: A program for composite interval mapping of QTL. J. Agric. Genomics 2(1).

S24. Hu, Z. and Xu, S. (2009) PROC QTL-a SAS procedure for mapping quantitative trait loci. Int. J. Plant Genomics 2009, 3.

S25. Joehanes, R. and Nelson, J. C. (2008) QGene 4.0, an extensible Java QTL-analysis platform. Bioinformatics 24, 2788-2789.

S26. Wang, G.L. et al. (1999) Mapping QTLs with epistatic effects and QTL × environment interactions by mixed linear model approaches. Theor. Appl. Genet. 99, 1255-1264.

S27. Yang, J. et al. (2008) QTLNetwork: Mapping and visualizing genetic architecture of complex traits in experimental populations. Bioinformatics 24, 721–723.

S28. Broman, K.W. et al. (2003) R/qtl: QTL mapping in experimental crosses. Bioinformatics 19, 889-890.

S29. Broman, K.W. (2014) R/qtlcharts: interactive graphics for quantitative trait locus mapping. Genetics 199, 359-361.

S30. Cheng, L. et al. (2013) Hierarchical and spatially explicit clustering of DNA sequences with BAPS software. Mol. Biol. Evol. 30, 1224-1228.

S31. Price, A.L. et al. (2006) Principal components analysis corrects for stratification in genome-wide association studies. Nat. Genet. 38, 904-909.

S32. Lipka, A.E. et al. (2012) GAPIT: genome association and prediction integrated tool. Bioinformatics 28, 2397-2399.

S33. Aulchenko, Y.S. et al. (2007) GenABEL: an R library for genome-wide association analysis. Bioinformatics 23, 1294-1296.

S34. Purcell, S. et al. (2007) PLINK: a tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81, 559-575.

S35. Pritchard, J.K. et al. (2000) Inference of population structure using multilocus genotype data. Genetics 155, 945–959.

S36. Bradbury, P.J. et al. (2007) TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633-2635.

S37. Milne, I. et al. (2010) Flapjack-graphical genotype visualization. Bioinformatics 26, 3133-3134.

S38. Van Berloo, R. (2008) Computer note: GGT 2.0: versatile software for visualization and analysis of genetic data. J. Hered. 99, 232-236.

S39. Valente, F. et al. (2013) OptiMAS: A decision support tool for marker-assisted assembly of diverse alleles. J. Hered. 104, 586-590.

S40. Tecle, I.Y. et al. (2014) solGS: a web-based tool for genomic selection. BMC Bioinformatics 15, 398.

S41. Jannink, J.L. (2005) Selective phenotyping to accurately map quantitative trait loci. Crop Sci. 45, 901-908.

S42. Jin, C. et al. (2004) Selective phenotyping for increased efficiency in genetic mapping studies. Genetics 168, 2285-2293.

S43. Kim, K.W. et al. (2007) PowerCore: A program applying the advanced M strategy with a heuristic search for establishing core sets. Bioinformatics 23, 515-526.

S44. Huang, B.E. et al. (2012) Selecting subsets of genotyped experimental populations for phenotyping to maximize genetic diversity. Theor. Appl. Genet. 126, 379-388.

S45. Delannay, X.G. et al. (2012) Fostering molecular breeding in developing countries. Mol. Breed. 29, 857-873.

S46. Wang, J. et al. (2011) Users’ manual of QTL IciMapping v3.1. Institute of Crop Science, CAAS, Beijing, and Crop Research Informatics Lab, Mexico.

S47. Azam, S. et al. (2014) An integrated SNP mining and utilization (ISMU) pipeline for next generation sequencing data. PLoS One 9, e101754.

S48. Rathore, A. et al. (2015) ISMU 2.0: A multi-algorithm pipeline for genomic selection. Presented at the Plant and Animal Genome XXII, January 11-15, San Diego, CA.

S49. Jayashree, B. et al. (2009) Bioinformatics: A platform for ICRISAT's global research needs. Documentation. International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India.

S50. Minevich, G. et al. (2012) CloudMap: a cloud-based pipeline for analysis of mutant genome sequences. Genetics 192, 1249-1269.

S51. Scheet, P. and Stephens, M. (2006) A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78, 629-644.

S52. Zhang, K. and Jin, L. (2003) HaploBlockFinder: haplotype block analyses. Bioinformatics 19, 1300–1301.

S53. Barrett, J.C. et al. (2005) Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 21, 263–265.

S54. Abe, A. et al. (2012) Genome sequencing reveals agronomically important loci in rice using MutMap. Nature Biotech. 30, 174–178.

S55. Austin, R.S. et al. (2011) Next-generation mapping of Arabidopsis genes. Plant J. 67, 715-725.

S56. Takagi, H. et al. (2013) QTL-seq: rapid mapping of quantitative trait loci in rice by whole genome resequencing of DNA from two bulked populations. Plant J. 74, 174-183.

S57. Delaneau, O. et al. (2013) Improved whole chromosome phasing for disease and population genetic studies. Nat. Methods 10, 5–6.

S58. Schneeberger, K. et al. (2009) SHOREmap: simultaneous mapping and mutation identification by deep sequencing. Nat. Methods 6, 550-551.

S59. Purcell, S. et al. (2007) WHAP: haplotype-based association analysis. Bioinformatics 23, 255–256.
