a consortium for Type 2 Diabetes Genetic Exploration by Next-generation sequencing in multi-Ethnic Samples

T2D-GENES is a large collaborative effort to find genetic variants that influence risk of type 2 diabetes. With funding from NIDDK, the group pursued three projects as listed below. The T2D-GENES Consortium also funded and guided the construction of the Portal.

Project 1: Deep whole-exome sequencing in 10,000 people from five ethnicities

The goal of Project 1 was to discover how variation in the protein-coding portion of the genome contributes to type 2 diabetes risk. The project’s dataset is unusually large and diverse, with exomes from 10,000 people across five ethnicities, including 1,000 type 2 diabetes cases and 1,000 controls from each:

  • African-American (samples from Wake Forest University and the Jackson Heart Study)
  • South Asian (UK LOLIPOP; Singapore)
  • East Asian (Korea; Singapore)
  • Hispanic (Starr County; San Antonio)
  • European (Finns (METSIM); Ashkenazim)

This diversity of ancestries allows scientists to find new genetic variants in populations that have otherwise been under-studied. The project also examined exomes to identify the transcripts most likely to be involved in type 2 diabetes pathogenesis. In addition, T2D-GENES researchers closely examined genomic locations that have been implicated in single-gene and syndromic forms of type 2 diabetes (such as MODY), evaluating them for association with traits that are related to the disease (such as fasting glucose levels). Ultimately, Project 1 is intended to answer major questions about the genetic architecture of type 2 diabetes and how natural selection has shaped it, and to spur the development of new statistical and analytical methods that can be used in genomic studies of other diseases.

Project 2: Deep whole-genome sequencing of 600 individuals selected from extended Mexican American pedigrees

Project 2 aimed to identify low-frequency and rare variants (those seen in less than 5 percent and less than 0.05 percent of the population, respectively) that influence type 2 diabetes risk. The project's dataset includes whole-genome sequence information on 1,043 people from 20 Mexican-American extended families in which type 2 diabetes is unusually common. The research participants were selected from two studies: the San Antonio Family Heart Study (SAFHS) and the San Antonio Family Diabetes/Gallbladder Study (SAFDGS), collectively referred to as the San Antonio Mexican American Family Studies (SAMAFS). About 600 participants underwent high-quality whole-genome sequencing, with an average of 50x coverage. The remaining participants (roughly 440) had genotypes imputed genome-wide based on their family members' information and data from the 1000 Genomes Project.

Large, complex pedigrees such as those in Project 2 are especially well suited to the study of rare variants. Finding rare variants in the population at large requires extremely large sample sizes; often, a variant is seen only once, making it difficult to reliably determine its effect on phenotype. By studying large pedigrees, however, scientists increase their chances of finding multiple individuals who carry the same rare variant, because that variant runs in the family. Project 2's approach therefore provides a way to identify low-frequency and rare variants, both at known GWAS signals and at novel genomic loci, that may contribute to type 2 diabetes risk.
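The sampling argument above can be made concrete with a small calculation. The following is only an illustrative sketch, not the consortium's analysis: the function names are hypothetical, the population calculation assumes Hardy-Weinberg proportions and independent sampling, and the pedigree calculation assumes a single heterozygous founder whose descendants each inherit the variant only through that one lineage.

```python
from typing import Sequence


def carrier_probability(allele_freq: float) -> float:
    """Chance that a random person carries at least one copy of the variant.

    Assumes Hardy-Weinberg proportions (an assumption of this sketch).
    """
    return 1.0 - (1.0 - allele_freq) ** 2


def prob_at_least_two_carriers(allele_freq: float, sample_size: int) -> float:
    """Probability that a random population sample contains two or more carriers."""
    q = carrier_probability(allele_freq)
    p_none = (1.0 - q) ** sample_size
    p_one = sample_size * q * (1.0 - q) ** (sample_size - 1)
    return 1.0 - p_none - p_one


def expected_pedigree_carriers(descendants_per_generation: Sequence[int]) -> float:
    """Expected number of carriers among the descendants of one carrier founder.

    Element i is the number of descendants in generation i + 1 (children,
    grandchildren, ...). Each transmission succeeds with probability 1/2.
    """
    return sum(
        count * 0.5 ** generation
        for generation, count in enumerate(descendants_per_generation, start=1)
    )


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    print(prob_at_least_two_carriers(allele_freq=0.0005, sample_size=1000))
    print(expected_pedigree_carriers([6, 18]))  # 6 children, 18 grandchildren
```

Under these simplifying assumptions, a random sample of 1,000 people has well under an even chance of containing two carriers of a variant at 0.05 percent frequency, whereas a single carrier founder with a few dozen descendants is expected to have several carrier relatives.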

Project 3: Trans-ethnic fine-mapping "mega-meta-analysis"

Genome-wide association studies (GWAS) have implicated dozens of genomic regions in type 2 diabetes risk, but in many cases it is unclear which variants in those regions actually influence the underlying biology of disease, and which variants are merely near the disease-causing variants but do not themselves contribute to pathophysiology. The goal of Project 3 was to precisely identify the causal variants by (1) focusing on genomic regions previously implicated in type 2 diabetes risk; (2) inferring the existence of surrounding low-frequency variants by imputing relevant data from the 1000 Genomes Project; and (3) using a range of statistical approaches to determine which of these variants are most likely to cause disease.

A "mega-meta-analysis," Project 3 involved data from 26,488 type 2 diabetes patients and 83,964 non-diabetic controls, with follow-up in 21,491 patients and 55,647 controls. The project was a collaboration among five consortia with research participants from different continental ancestry groups: AGEN-T2D (East Asians), DIAGRAM (Europeans), SAT2D (South Asians), MAT2D (Mexican Americans), and MEDIA (African Americans).

This diversity of ancestries is especially important to the study design. Variants that lie close together and are tightly correlated (in strong linkage disequilibrium) in some ancestry groups (e.g., Europeans) may be inherited more independently in other groups (e.g., African Americans). Therefore, examining data from many different groups can help distinguish true causal variants from those that are merely along for the ride.
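The idea that two variants can be correlated in one ancestry group but not in another can be illustrated with the standard r-squared measure of linkage disequilibrium. This is a minimal sketch on made-up haplotypes, not Project 3 data or the project's actual statistical method; the function name and the two toy groups are assumptions introduced for illustration.

```python
from typing import Sequence


def ld_r_squared(variant_a: Sequence[int], variant_b: Sequence[int]) -> float:
    """Squared correlation (r^2) between two variants across haplotypes.

    Each sequence holds 0/1 allele indicators, one entry per haplotype.
    Returns 0.0 if either variant does not vary in the sample.
    """
    n = len(variant_a)
    if n == 0 or n != len(variant_b):
        raise ValueError("inputs must be non-empty and of equal length")
    freq_a = sum(variant_a) / n
    freq_b = sum(variant_b) / n
    freq_ab = sum(a * b for a, b in zip(variant_a, variant_b)) / n
    d = freq_ab - freq_a * freq_b  # disequilibrium coefficient D
    denom = freq_a * (1 - freq_a) * freq_b * (1 - freq_b)
    return 0.0 if denom == 0 else d * d / denom


if __name__ == "__main__":
    # Made-up haplotypes: in group 1 the two variants always travel together,
    # so association data alone cannot tell which one is causal.
    group1_a = [1, 1, 1, 0, 0, 0, 0, 0]
    group1_b = [1, 1, 1, 0, 0, 0, 0, 0]
    # In group 2 the same two variants occur independently of each other,
    # so their association signals can be separated.
    group2_a = [1, 1, 0, 0, 1, 1, 0, 0]
    group2_b = [1, 0, 1, 0, 1, 0, 1, 0]
    print(ld_r_squared(group1_a, group1_b))
    print(ld_r_squared(group2_a, group2_b))
```

In the toy data, group 1 gives an r-squared of 1 (the variants are indistinguishable) and group 2 gives 0, which is the kind of contrast that trans-ethnic fine-mapping exploits.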