Final Scientific Report (years 1-3)

Project acronym: EUROSPAN

Project full title: European Special Populations Research Network: Quantifying and Harnessing Genetic Variation for Gene Discovery

Contract no.: LSHG-CT-2006-018947

Period covered: from 1st March 2006 to 31st August 2009

Date of preparation: 26th October 2009

Project coordinator: Professor Harry Campbell

Project coordinator organization: University of Edinburgh

Project partners:

Participant no. / Participant organisation name / Participant organisation short name / Scientific team leader / Town / Country
1 (Coordinator) / University of Edinburgh / Edinburgh / Professor H Campbell / Edinburgh / UK
2 / UK Medical Research Council Human Genetics Unit / MRC / Professor A Wright / Edinburgh / UK
3 / Erasmus University / Erasmus / Professor C van Duijn / Rotterdam / Netherlands
4 / Uppsala University / Uppsala / Professor U Gyllensten / Uppsala / Sweden
5 / National Research Center for Environment and Health / GSF / Professor T Meitinger / Munich / Germany
6 / University of Zagreb Medical School / Zagreb / Professor I Rudan / Zagreb / Croatia
7 / IntegraGen / IntegraGen / Dr P Brooks / Paris / France
8 / European Research Academy / EURAC / Dr P Pramstaller / Bolzano / Italy

Executive Summary

Progress – year 1

All partners met twice in Programme Coordination Committee meetings to agree a scientific plan for achieving the aims and deliverables of the project. Appropriate administrative arrangements were established to ensure proper project management and communications. EUROSPAN partners planned to measure 50 new phenotypes (principally in lipidomics) in a coordinated manner with identical study procedures, which would facilitate joint analysis. A major new SNP genotyping facility was established by one of the partners (GSF) to serve EUROSPAN genotyping requirements. Purchase of genotyping arrays was timed to coincide with falls in commercial prices for this product, and the EUROSPAN order was combined with other orders to achieve maximum value for money (by attracting the minimum available unit price, which is offered only to very high volume orders). Thus EUROSPAN aimed to deliver 1,268 million SNP genotypes compared with approximately 22 million outlined in the EUROSPAN contract. This was to be achieved within budget but with some virement of funds from phenotyping and staffing (with no threat to the achievement of other project aims and deliverables). Joint analysis plans were agreed and implemented, covering data sharing, development of new software by two partners (GENABEL – Erasmus and University of Edinburgh) and sharing of a range of data handling and quality control software. Plans were agreed for division of responsibilities and leadership arrangements on specified topics so that joint analysis could proceed by multiple groups according to an agreed overall plan. A review of the relevant literature on ethical and social aspects of genetic research in isolated communities was conducted, completed and ongoing research on these topics was noted, and plans were made for a subgroup meeting on this topic at the next programme coordination meeting in year 2.
A number of scientific papers were published, and poster and oral presentations were accepted at international meetings to disseminate early findings.

Progress – year 2

All partners met several times in Programme Coordination Committee meetings and in many smaller, more focused meetings of specific working groups. In addition, phone conferences were held regularly, at least once a month. Administrative personnel successfully dealt with the two main administrative tasks jointly agreed for the second year, which used up the majority of the EUROSPAN budget: (i) purchase of arrays for the genome-wide scan from Illumina, Inc; and (ii) sub-contracting the laboratory in Regensburg, Germany, led by Professor Schmitz, to perform several hundred sophisticated lipidomics analyses in 4200 EUROSPAN recruits. In the second part of year 2, genotyping of most EUROSPAN subjects (all but the Erasmus samples) was successfully conducted at one of the partners (GSF). During this period, lipidomics analyses were performed and completed in Regensburg, thus delivering several hundred highly specific phenotypes shared and standardized between all partners (instead of the approximately 40 initially proposed). Joint analysis of all data to search for genes underlying lipidomics traits, using genome-wide association studies and dense SNP-based genome-wide scans, was begun in Rotterdam, with all partners contributing their standardized analyses to a joint server. New software for these analyses was developed by two partners (GENABEL – Erasmus and University of Edinburgh). In parallel, all groups worked on linkage analyses of designated traits that were divided between the partners. These analyses used STR markers, followed up by association studies under the peak using SNP markers. A number of scientific papers were published, and poster and oral presentations were accepted at international meetings to disseminate early findings. EUROSPAN initiated collaborations on a number of publications with European consortia conducting meta-analyses of several studies (e.g. lipids, uric acid) to maximise study power across Europe.

Progress – year 3

EUROSPAN has made good progress in year 3. All partners have met several times in Programme Coordination Committee meetings and in many smaller, more focused meetings of specific working groups. In addition, phone conferences were held regularly, typically every week or every 2 weeks depending on the volume of work at that time. Administrative personnel have successfully dealt with the main administrative tasks for year 3: (i) coordination of analysis activity across partners with submission of manuscripts for publication; and (ii) formation of collaborative arrangements with a wide variety (>25) of international consortia to lead or contribute to meta-analyses of EUROSPAN data within a larger international grouping in order to increase overall study power (and thus to maximise the potential scientific achievements of the EUROSPAN project).

In year 3, EUROSPAN data were subjected to an agreed quality control procedure and held together for joint analysis. New software tools developed by EUROSPAN were used to analyse the very large and complex family-based GWAS dataset. These data comprise a wide variety of >300 phenotypes or quantitative traits (see Annex 2 to this report), very substantially more than described in the EUROSPAN contract (38 phenotypes are measured in 3 or more populations), and a total of 1,268 million SNP genotypes, again very substantially more than the approximately 22 million outlined in the original EUROSPAN contract. This was achieved within budget but with some virement of funds from phenotyping and staffing to genotyping (with no compromise to the achievement of project aims and deliverables).

Joint analysis of all data to search for genes underlying lipidomics traits, using genome-wide association studies and dense SNP-based genome-wide scans, was conducted under the leadership of the Rotterdam partner, with all partners contributing their standardized analyses to a joint server. New software for these analyses was developed by two partners (GENABEL – Erasmus and University of Edinburgh). A substantial number of scientific papers have been published (see list of publications), and poster and oral presentations have been accepted at international meetings to disseminate early findings. At the time of writing this report there are 43 articles published or in press in international journals, with a further 9 submitted and awaiting final decisions by journals in year 3. These include the following major joint publications by EUROSPAN partners:

  • Aulchenko YS et al. Loci influencing lipid levels and coronary heart disease risk in 16 European population cohorts. Nat Genet. 2009;41:47-55. [22 EUROSPAN authors]
  • Dupuis J et al. New genetic loci implicated in fasting glucose homeostasis and their impact on type 2 diabetes risk. Nat Genet. 2009: in press. [16 EUROSPAN authors]
  • Repapi E et al. Genome-wide association study identifies five new loci associated with lung function. Nat Genet. 2009: in press. [15 EUROSPAN authors]
  • Pfeufer A et al. Common variants at ten loci modulate the QT interval duration in the QTSCD Study. Nat Genet. 2009;41:407-14. [9 EUROSPAN authors]
  • Saxena R et al. Genetic variation in gastric inhibitory polypeptide receptor (GIPR) impacts the glucose and insulin responses to an oral glucose challenge. Nat Genet. 2009: in press. [5 EUROSPAN authors]
  • Benjamin EJ et al. Variants in ZFHX3 are associated with atrial fibrillation in individuals of European ancestry. Nat Genet. 2009 Aug;41(8):879-81. [4 EUROSPAN authors]
  • Willer CJ et al. Genetic Investigation of ANthropometric Traits Consortium. Six new loci associated with body mass index highlight a neuronal influence on body weight regulation. Nat Genet. 2009;41:25-34. [3 EUROSPAN authors]
  • Kong A et al. (EUROSPAN authors within the DIAGRAM Consortium). Parental origin of sequence variants associated with complex diseases. Nature. 2009: in press. [9 EUROSPAN authors]
  • Kolz M et al. Meta-analysis of 28,141 individuals identifies common variants within five new loci that influence uric acid concentrations. PLoS Genet. 2009;5:e1000504. [10 EUROSPAN authors]
  • Hicks AA et al. Genetic determinants of circulating sphingolipid concentrations. PLoS Genet. 2009: in press. [31 EUROSPAN authors]
  • Heard-Costa NL et al. NRXN3 is a novel locus for waist circumference: a genome-wide association study from the CHARGE Consortium. PLoS Genet. 2009;5:e1000539. [10 EUROSPAN authors]
  • Lindgren CM et al. Genome-wide association scan meta-analysis identifies three loci influencing adiposity and fat distribution. PLoS Genet. 2009;5:e1000508. [4 EUROSPAN authors]
  • Van Duijn C et al. A genomic study of lipidomic profiles: a genome wide association analysis of circulating phospholipids. Nat Genet. 2009: submitted for publication. [33 EUROSPAN authors]

Topic areas include descriptive analyses of the EUROSPAN dataset, discovery of common genetic variants underlying disease-related traits, ethical aspects of the EUROSPAN study and population genetics studies (such as selection / drift effects).

EUROSPAN studies have made major contributions to a number of international GWAS consortia and have achieved their aims relating to discovery of common variants influencing disease traits through the large sample sizes achievable by co-ordination with other international research groups. The major role played by EUROSPAN in the meta-analysis consortium papers in the list of publications is evidenced by the number of EUROSPAN authors on the publications (median 15; range 9 – 26). EUROSPAN members are currently actively working on or leading 25 ongoing GWAS meta-analyses within international consortia and participating in sub-group studies on GG / GE interactions, rarer variants or pathways analyses within these consortia. This work is likely to continue throughout 2010, thus continuing to extend the final scientific achievements of EUROSPAN.

Virement of activity to support some initial sequencing activity was agreed at the Programme Coordinating Committee meeting in year 2 and presented in the Year 2 Report. This work is ongoing (due to the time consuming nature of the sequencing analysis) and will lead to future publications in 2009 / 2010. A progress report is given in Annex 3 of this report.

1. Project Objectives and Major Achievements

WP1: Harmonization of phenotypes – year 1

  • Phenotypes in all study populations were entered into a study spreadsheet
  • Details of phenotyping methods were published on the EUROSPAN website and guided groups in adopting the same procedures for new phenotypes (such as digital ECG measures)
  • Phenotyping continued to take place in all 5 study populations, leading to a larger number of phenotypes that can be analysed jointly across EUROSPAN and to about 1000 new recruits in year 1
  • Arrangements were made to collaborate with Prof Schmitz, University of Regensburg (EU lipidomics initiative), to measure about 40 novel lipidomics traits in all EUROSPAN populations by the same methods and in the same laboratory

WP1: Harmonization of phenotypes – year 2

  • Prof Schmitz, from the University of Regensburg (EU lipidomics initiative), measured several hundred novel lipidomics traits in all EUROSPAN populations by the same methods and in the same laboratory;
  • University of Regensburg also analysed basic biochemistry in sera of EUROSPAN recruits to make the measurements comparable and ready for joint analysis using genome-wide association studies;
  • Some phenotyping continued to take place in all 5 study populations, leading to a larger number of phenotypes that can be analysed jointly across EUROSPAN; these traits include cognitive traits, eye phenotypes and bone mineral density

WP1: Harmonization of phenotypes – year 3

  • Additional phenotyping was carried out to achieve the maximum number of common phenotypes for joint analysis. Annex 2 lists details of presence in each EUROSPAN cohort (and in selected other cohorts available to EUROSPAN partners) for 321 quantitative traits. The number of common phenotypes across EUROSPAN populations greatly exceeds the target in the EUROSPAN contract (38 phenotypes are measured in 3 or more populations)
  • Quality control procedures and description of QT distribution analyses are detailed in the publications on each individual trait (see list of publications)
  • University of Regensburg delivered 165 lipidomic traits (GWAS data analysis of which resulted in a PLoS Genetics publication and a manuscript currently submitted to Nature Genetics)
  • The resulting large number of phenotypes were analysed, or are still being analysed, jointly across EUROSPAN and in collaboration with other consortia (see Annex 1 for list of all completed and ongoing consortia analyses)

WP2: Genomics – molecular technology: microsatellite genome wide screening – year 1

  • A scientific review identified the clear case to pursue high throughput SNP genotyping and this was agreed at the Programme Coordination Committee meeting
  • A tendering exercise resulted in a contract to purchase Hap300 genotyping arrays from Illumina
  • GSF Germany commissioned an Illumina genotyping platform and established related standard operating procedures for specimen transport and genotyping
  • Genotyping of EUROSPAN samples started

WP2: Genomics – molecular technology: microsatellite genome wide screening – year 2

  • Genome-wide microsatellite screens are available in 4 populations, while the fifth (Rotterdam) performed their genome-wide screen with SNP markers of comparable density and power; linkage analysis for all 11 traits shared between all partners was performed using the same method which was jointly developed by Rotterdam and Edinburgh partners;
  • Meta-analysis of all 5 linkage analyses was performed centrally (Rotterdam), and the significant peaks are being followed up by an association approach using dense SNP markers that cover the genome region under the significant peaks;
  • Papers on joint linkage / association analysis are due to be submitted for publication in July 2008 on glucose / insulin; lipids; weight/ height/ BMI; ECG parameters; and creatinine.

WP2: Genomics – molecular technology: genome wide screening – year 3

  • Virement of genotyping funding in year 1 enabled EUROSPAN resources to be deployed to generate a total of 1,268 million SNP genotypes, very substantially more than the approximately 22 million outlined in the original EUROSPAN contract. The emphasis was thus on GWAS analyses rather than genome-wide microsatellite screens for linkage analysis
  • Quality control reports for the individual analyses are contained within the many individual publications, including several in high impact journals (see list of publications). Limited linkage analyses were conducted and published (see list of publications).

WP3: Genomics – molecular technology: SNP selection / high density genotyping – year 1

  • A scientific review identified the clear case to pursue high throughput SNP genotyping and this was agreed at the Programme Coordination Committee meeting
  • A tendering exercise resulted in a contract to purchase Hap300 genotyping arrays from Illumina
  • GSF Germany commissioned an Illumina genotyping platform and established related standard operating procedures for specimen transport and genotyping
  • Genotyping of EUROSPAN samples started

WP3: Genomics – molecular technology: SNP selection / high density genotyping – year 2

  • Hap300 genotyping arrays were purchased from Illumina Inc
  • GSF Germany commissioned an Illumina genotyping platform and performed genome-wide scans in 4 populations; Rotterdam partner performed genome-wide scan in their population; genotyping for 4 of the 5 population samples is completed

WP3: Genomics – molecular technology: SNP selection / high density genotyping – year 3

  • As noted above for WP2, the original plan to genotype SNPs in a set of candidate genes was replaced by GWAS, which delivered approximately 60 times the volume of genotyping within the fixed EUROSPAN budget
  • The GWAS genotyping was all completed successfully by the GSF Germany partner (4 populations) and by the Rotterdam partner (1 population)
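
As a rough check of the "approximately 60 times" figure, the totals quoted in this report (1,268 million SNP genotypes delivered against approximately 22 million outlined in the contract) give:

```latex
\[
\frac{1268 \times 10^{6}}{22 \times 10^{6}} \approx 57.6
\]
```

This rounds to a roughly 60-fold increase. Because the contract figure is itself stated only approximately, the ratio is indicative rather than exact.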

WP4: Statistical / quantitative genetics: data analysis of genome wide screen data and of data from genetic variants known to cause disease – year 1

  • A successful analysis workshop attended by all partners was held in Rotterdam
  • Software for data handling (re-formatting, splitting pedigrees, and description of complex pedigrees), quality control and data analysis (pedigree-based association and genome-wide association) was identified or developed and made available to EUROSPAN partners
  • Plans for joint analyses were made and lead partners for specific analysis topics were agreed

WP4: Statistical / quantitative genetics: data analysis of genome wide screen data and of data from genetic variants known to cause disease – year 2

  • Joint analysis of all data to identify genes underlying biochemical and lipidomics traits, using genome-wide association studies and dense SNP-based genome-wide scans, has been conducted under the leadership of Erasmus, with all partners contributing their standardized analyses to a joint server.
  • We have discovered numerous novel genetic variants influencing these traits (reaching genome-wide significance, p < 10^-8, across 5 populations) and are currently preparing the first publication for submission to a high impact journal in May 2008.

WP4: Statistical / quantitative genetics: data analysis of genome wide screen data and of data from genetic variants known to cause disease – year 3

  • Joint analysis of all data to identify genes underlying biochemical and lipidomics traits, using genome-wide association studies and dense SNP-based genome-wide scans, has been conducted under the leadership of Erasmus, with all partners contributing their standardized analyses to a joint server.
  • We have discovered numerous novel genetic variants influencing these traits (reaching genome-wide significance, p < 10^-8, across 5 populations) and these are detailed in the list of publications. Analyses are continuing for further traits as noted in Annex 1.
  • A major additional deliverable, not described in the EUROSPAN contract, is the successful and significant contribution which EUROSPAN studies have made to a large number of GWAS meta-analysis consortium papers listed in Annex 1. This contribution has been substantial, as evidenced by the number of EUROSPAN authors on the publications (median 15; range 9 – 26). EUROSPAN members are currently actively working on or leading several ongoing GWAS meta-analyses within international consortia and participating in sub-group studies on GG / GE interactions, rarer variants or pathways analyses within these consortia. This work is likely to continue throughout 2010, thus continuing to extend the final scientific achievements of EUROSPAN. Although not included in the original EUROSPAN contract and budget, this additional work has been achieved within the fixed budget.

WP5: Statistical / quantitative genetics: development of new analytic approaches – year 1

  • A successful analysis workshop attended by all partners was held in Rotterdam
  • Software for data handling (re-formatting, splitting pedigrees, and description of complex pedigrees), quality control and data analysis (pedigree-based association and genome-wide association) was identified or developed and made available to EUROSPAN partners
  • Plans for joint analyses were made and lead partners for specific analysis topics were agreed

WP5: Statistical / quantitative genetics: development of new analytic approaches – year 2

  • Five successful meetings attended by all partners were held in Rotterdam, London, Dubrovnik, Bolzano and Munich;
  • Software for data handling (re-formatting, splitting pedigrees, and description of complex pedigrees), quality control and data analysis (pedigree-based association and genome-wide association) was identified or developed and made available to EUROSPAN partners;
  • The algorithm for the developed software has been published

WP5: Statistical / quantitative genetics: development of new analytic approaches – year 3

  • Successful meetings attended by all partners were held in Uppsala, Edinburgh and Munich, at which data analysis approaches were discussed;
  • The software for data handling (re-formatting, splitting pedigrees, and description of complex pedigrees), quality control and data analysis (pedigree-based association and genome-wide association), which was identified or developed and made available to EUROSPAN partners in year 2, was employed in the data analyses leading to the publications in the list of publications
  • The parallel computing facility established by the Rotterdam partner was utilised in the data handling and analysis of the large and complex EUROSPAN datasets

WP6: Social and ethical aspects – year 1