AlzGene - Methods

Updated 5 May 2010

1. Overview

The goal of the AlzGene database is to serve as a comprehensive, unbiased, publicly available and regularly updated field-synopsis of published genetic association studies performed on AD phenotypes. Eligible publications are identified following systematic searches of scientific literature databases, as well as the tables of contents of journals in genetics, neurology, and psychiatry. Data selected for display summarize key characteristics of the investigated study cohorts (e.g., gene overview), as well as genotype distributions in cases and controls (e.g., polymorphism details). For eligible polymorphisms with genotype data in at least four case-control samples, continuously updated random-effects meta-analyses are presented (see meta-analysis methods). Note that data obtained from family-based studies are not included in the meta-analyses, as crude odds ratios cannot be readily calculated from overall genotype distributions. However, these studies and their qualitative results are still listed on the gene-summary pages of the AlzGene website (see Table 2 for example).

To ensure the highest degree of scientific objectivity, only studies published in peer-reviewed journals available in English are considered for inclusion in the database. In particular, this precludes the inclusion of data presented only in abstract form, e.g. at scientific meetings. We encourage authors of original reports fulfilling the above criteria to submit their data as soon as their work is accepted for publication.

For more details on inclusion criteria, literature searches, data-management procedures, statistical analyses, and online database structure, please see Bertram et al. (2007) and Allen et al. (2008).

2. The "Top Results" List

In an effort to facilitate the identification of the most promising meta-analysis results available in AlzGene, a continuously updated list displaying the most strongly associated genes ("Top Results") has been added to the AlzGene homepage. The list includes genes/loci which contain at least one variant showing a nominally significant summary OR in the analysis of all studies (“All”), or in analyses limited to samples of a specific ethnicity (e.g. “Caucasian”). The nominally significant meta-analyses are then graded based on the interim guidelines ("Venice criteria") for grading the epidemiological credibility of genetic association studies, recently developed by the Human Genome Epidemiology Network (HuGENet; Ioannidis et al., 2008).

In the "Top Results" list, genes are ranked based on the genetic variant with the best overall HuGENet/Venice grade; for genes with identical grades, ranking is based on P-value; for genes with identical grade and P-value, ranking is based on effect size (OR).
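The three-level ranking rule can be sketched as a single sort key. This is only an illustration, not the AlzGene implementation: the `TopResult` record, its field names, and the letter encoding of the overall grade are hypothetical, and measuring effect size as the distance of the OR from 1 on the log scale (so that protective and risk effects are treated symmetrically) is an assumption the text does not spell out.

```python
import math
from dataclasses import dataclass


@dataclass
class TopResult:
    # Hypothetical record for the best-graded variant of one gene.
    gene: str
    grade: str        # overall grade, assumed encoded as "A" (best), "B", or "C"
    p_value: float    # P-value of the summary OR
    odds_ratio: float # summary OR


def effect_size(odds_ratio):
    # Assumption: distance from the null (OR = 1) on the log scale,
    # so OR = 0.5 and OR = 2.0 count as equally strong effects.
    return abs(math.log(odds_ratio))


def rank_top_results(results):
    # Sort by grade first, then by P-value (smaller is better),
    # then by effect size (larger is better, hence the minus sign).
    return sorted(
        results,
        key=lambda r: (r.grade, r.p_value, -effect_size(r.odds_ratio)),
    )
```

With this key, two genes that share a grade and a P-value are separated only by how far their summary OR lies from 1.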

Note that an earlier version of the “Top Results” list (before July 13, 2009) determined the ranking of nominally significant results by summary OR alone.

HuGENet "Venice criteria"

We rate overall epidemiological credibility as ‘strong’ if associations received three A grades, ‘moderate’ if they received at least one B grade but no C grades, and ‘weak’ if they received a C grade in any of the three assessment criteria. While we believe that this list represents an up-to-date summary of particularly promising AD candidate genes that warrant follow-up with high priority, we note that many of these may still represent false-positive findings.
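The mapping from the three criterion grades to an overall credibility rating can be written as a small function. This is a minimal sketch of the rule as stated above; the function name and the string encoding of grades are illustrative choices, not part of AlzGene.

```python
def overall_credibility(amount, consistency, bias):
    """Combine the three criterion grades ("A", "B", or "C") into one rating."""
    grades = (amount, consistency, bias)
    if any(g not in ("A", "B", "C") for g in grades):
        raise ValueError("each grade must be 'A', 'B', or 'C'")
    if "C" in grades:
        return "weak"      # a C grade in any criterion
    if "B" in grades:
        return "moderate"  # at least one B grade but no C grades
    return "strong"        # three A grades
```

For example, `overall_credibility("A", "B", "A")` yields `"moderate"`, while a single C grade in any position yields `"weak"`.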

Briefly, each meta-analyzed association in AlzGene is graded on the basis of the amount of evidence, consistency of replication, and protection from bias. For amount of evidence, we assign the grade ‘A’ when the total number of minor alleles of cases and controls combined in the meta-analyses exceeds 1,000, ‘B’ when it is between 100 and 1,000, and ‘C’ when it is less than 100. For consistency of replication, we assign the grade ‘A’ for I² point estimates <25%, ‘B’ for I² values of 25–50%, and ‘C’ for I² values >50%. Note that this criterion does not apply to meta-analyses with a P-value <1×10⁻⁷ after exclusion of the initial study or studies, as described in Khoury et al. (2009).

For protection from bias, the guidelines propose consideration of various potential sources of bias, including errors in phenotypes, genotypes, confounding (population stratification) and errors or biases at the meta-analysis level (publication and other selection biases). A grade A implies that there is probably no bias that can affect the presence of the association, grade B that there is no demonstrable bias but important information is missing for its appraisal, and grade C that there is evidence for potential or clear bias that can invalidate the association. Errors and biases are also considered in the framework of the observed summary OR. Whenever the summary OR deviates less than 1.15-fold from the null in meta-analyses based on published data, we acknowledge that occult publication and selective reporting biases alone may invalidate the association, regardless of the presence or absence of other biases, and therefore assign a grade of C. When the summary OR deviates more than 1.15-fold from the null, we assign a grade of C when the modified regression test (Harbord et al., 2006) or the excess test suggests the possibility of publication bias or significance-chasing bias, or when the association is no longer nominally statistically significant upon exclusion of the initial study or of studies violating Hardy-Weinberg equilibrium (HWE).
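The numeric thresholds in these criteria can be sketched as follows. The function and parameter names are hypothetical. Returning `None` when the consistency criterion does not apply is an assumption about how that exemption might be represented, and the handling of an OR lying exactly 1.15-fold from the null is likewise an assumption, since the text does not cover that boundary. The bias function covers only the rules that automatically lead to a grade of C; choosing between A and B requires the qualitative appraisal described above and is not modeled here.

```python
import math


def grade_amount_of_evidence(total_minor_alleles):
    # Minor alleles of cases and controls combined across the meta-analysis.
    if total_minor_alleles > 1000:
        return "A"
    if total_minor_alleles >= 100:
        return "B"
    return "C"


def grade_consistency(i2_percent, p_without_initial=None):
    # Assumption: None signals "criterion does not apply", i.e. the P-value
    # after excluding the initial study or studies is below 1e-7.
    if p_without_initial is not None and p_without_initial < 1e-7:
        return None
    if i2_percent < 25:
        return "A"
    if i2_percent <= 50:
        return "B"
    return "C"


def bias_rules_force_c(summary_or, small_study_effect, excess_significance,
                       significant_after_exclusions, fold=1.15):
    # Rule 1: summary OR within 1.15-fold of the null (between 1/1.15 and 1.15).
    if abs(math.log(summary_or)) < math.log(fold):
        return True
    # Rule 2: a flag from the modified regression test or the excess test,
    # or loss of nominal significance in the sensitivity analyses.
    return (small_study_effect or excess_significance
            or not significant_after_exclusions)
```

A summary OR of 0.90 falls inside the 1.15-fold band (1/1.15 is roughly 0.87) and is therefore flagged, whereas an OR of 0.80 is flagged only if one of the other conditions holds.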

APOE and related effects in the "Top Results" List

Genetic variants for which a significant meta-analysis result is likely due to linkage disequilibrium with the APOE-ε4 allele (e.g. in APOC1) are not listed separately as “Top Results”. However, meta-analyses of these genes and polymorphisms are still available via the specific gene-summary pages. See also the information below on how APOE-ε4 itself is handled in AlzGene.

3. Database Organization and Methods

Meta-Analysis Methods

For all polymorphisms with minor allele frequencies in healthy controls >1%, and for which case-control genotype data are available in four or more independent samples, crude odds ratios (ORs) and 95 percent confidence intervals (CIs) are calculated from the reported allele distributions for each study. Summary ORs and 95 percent CIs are calculated using the DerSimonian and Laird (1986) random-effects model (using the 'rmeta' package in R). This procedure is performed including all studies irrespective of ethnicity (denoted by "All Studies" on the meta-analysis figures), and separately for each ethnic group with independent genotype data in at least three populations. Whenever applicable, the results of a number of sensitivity analyses are also displayed, e.g. after exclusion of the initial study, or after exclusion of studies in which a violation of Hardy-Weinberg Equilibrium (HWE; calculated using the 'HardyWeinberg' package in R) was detected. Overlapping samples (of which usually only the largest is included), studies with missing data, and control samples deviating from HWE are indicated on the meta-analysis graphs. Please note that when only a few studies are included in the meta-analyses (i.e. fewer than ~10), the random-effects model may yield summary ORs and confidence bounds that are slightly anti-conservative.
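For readers who want to see the arithmetic, the following is a minimal Python sketch of a crude allelic OR and a DerSimonian and Laird random-effects summary. AlzGene itself uses the 'rmeta' package in R; this re-implementation is only illustrative. The function names and the returned dictionary layout are hypothetical, the 95 percent CI uses the usual normal approximation on the log-OR scale, and zero allele counts are not handled (no continuity correction is applied).

```python
import math

Z95 = 1.959964  # normal quantile for a two-sided 95 percent interval


def crude_log_or(case_minor, case_major, ctrl_minor, ctrl_major):
    """Log OR and its variance from allele counts (all counts must be > 0)."""
    log_or = math.log((case_minor * ctrl_major) / (case_major * ctrl_minor))
    variance = (1 / case_minor + 1 / case_major
                + 1 / ctrl_minor + 1 / ctrl_major)
    return log_or, variance


def dersimonian_laird(effects):
    """Pool a list of (log_or, variance) pairs with a random-effects model."""
    if len(effects) < 2:
        raise ValueError("need at least two studies")
    df = len(effects) - 1
    w = [1 / v for _, v in effects]               # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * y for wi, (y, _) in zip(w, effects)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, (y, _) in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0   # between-study variance
    w_star = [1 / (v + tau2) for _, v in effects]     # random-effects weights
    sws = sum(w_star)
    pooled = sum(ws * y for ws, (y, _) in zip(w_star, effects)) / sws
    se = math.sqrt(1 / sws)
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # I-squared in percent
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - Z95 * se),
        "ci_high": math.exp(pooled + Z95 * se),
        "tau2": tau2,
        "i2": i2,
    }
```

A typical use would compute `crude_log_or` once per study from the published allele counts and pass the resulting list to `dersimonian_laird`; the `i2` value it returns is the heterogeneity estimate used for the consistency grade.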

To allow a visual assessment of the change in summary OR over time, cumulative meta-analyses are displayed for each of the polymorphisms eligible for meta-analysis. Cumulative meta-analyses are only displayed for the ethnic subgroup (e.g. 'All' or 'Caucasian') with the overall best-ranking summary OR by random-effects meta-analysis.
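A cumulative meta-analysis simply re-runs the pooling each time a study is added in chronological order. The sketch below illustrates the idea under stated assumptions: studies are given as hypothetical `(year, log_or, variance)` tuples, ordering is by publication year only, and the pooling helper is a compact DerSimonian and Laird estimator written for this example rather than the 'rmeta' routine used by AlzGene.

```python
import math


def pool_random_effects(effects):
    """DerSimonian and Laird pooled log OR for (log_or, variance) pairs."""
    w = [1 / v for _, v in effects]
    sw = sum(w)
    fixed = sum(wi * y for wi, (y, _) in zip(w, effects)) / sw
    q = sum(wi * (y - fixed) ** 2 for wi, (y, _) in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    df = len(effects) - 1
    # With a single study there is no between-study variance to estimate.
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    w_star = [1 / (v + tau2) for _, v in effects]
    return sum(ws * y for ws, (y, _) in zip(w_star, effects)) / sum(w_star)


def cumulative_meta_analysis(studies):
    """Return (year, summary OR) after each study is added in year order."""
    ordered = sorted(studies, key=lambda s: s[0])
    steps = []
    for n in range(1, len(ordered) + 1):
        subset = [(log_or, var) for _, log_or, var in ordered[:n]]
        steps.append((ordered[n - 1][0], math.exp(pool_random_effects(subset))))
    return steps
```

Plotting the second element of each returned pair against the first gives the kind of summary-OR-over-time view described above.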

Inclusion of Genome-wide Association Studies (GWAS)

For the systematic inclusion of data from GWAS and other large-scale studies, we have devised the following step-wise protocol, which we believe allows us to capture the most relevant genetic information without the need to include every data-point from these studies. Please visit this page to see a summary of all published large-scale studies currently included in AlzGene.

Stage I: Represents the inclusion of genes and polymorphisms “featured” or highlighted by the authors of the GWAS or other large-scale study, usually because they show some degree of genetic association after completion of all analyses, e.g. testing multiple independent samples. These genes and polymorphisms probably represent the most important findings of each GWAS and are therefore included in AlzGene with highest priority. Genomic loci that do not map within any known gene are represented by a surrogate name specifying the cytogenetic location (e.g. “GWA_1q25.12”). Markers in linkage disequilibrium with APOE ε2/3/4 (e.g. variants located in APOC1) are summarized under "APOE" as the featured gene.

For GWAS that have made their genotype data publicly available, we will also make use of “non-featured” genotype distributions, i.e. of polymorphisms not believed to be strongly associated with AD in the original publications:

Stage II: Will add GWAS and large-scale study genotype data for polymorphisms already available in AlzGene, i.e. usually derived from candidate gene studies. GWAS data for such overlapping polymorphisms will be added to the gene-specific entries and, if applicable, included and displayed in the meta-analyses. This stage adds valuable information to the existing AlzGene meta-analyses as it is derived from assessments that are largely unbiased with respect to gene function, in contrast to most conventional candidate gene studies. Note that genotype data from large-scale association studies using a "pooled" genotyping approach will not be considered for these analyses due to the sometimes substantial variability of genotype and allele frequencies when compared to subject-level genotyping.

Stage III: Focuses on published meta-analyses of the existing GWAS datasets. Genes and loci resulting from these analyses are treated equivalently to the "featured genes" of Stage I (above). Genotype data on the gene-specific pages will be extracted from the primary GWAS (if their data are publicly available) and displayed alongside a "GWAS meta-analysis" entry. This stage also entails the inclusion of more complex genetic analyses, e.g. those jointly analyzing large numbers of polymorphisms at different loci based on assumptions regarding the functional interconnection of these loci, e.g. in the form of "pathways". To the degree that it can be achieved in this context, these pathway-based results are labeled as such and stored in a separate "unmapped" section of the database.

Association studies on mitochondrial genes

Studies assessing a potential association between AD and genetic variants in the mitochondrial (mt) genome are subject to the same inclusion criteria as studies investigating markers from the nuclear genome, and are displayed on a separate "chromosome graph" (which is adapted from imagery on the "Mito Map" website [http://www.mitomap.org/]). Owing to the specific characteristics of human mt-inheritance (e.g. its multicopy nature and the high frequency of somatic mutation events) and the innate heterogeneity of mt-association studies, however, genotype data from these studies are not included on AlzGene and therefore not subject to meta-analysis.

Association of APOE Polymorphisms with AD

In contrast to essentially all other association findings in AD, the risk effect of APOE-ε4 has been consistently replicated in a large number of studies across many ethnic groups (Saunders, 1993; Farrer, 1997). Many studies have also observed a more modest protective effect for the minor allele, ε2. Because of the established role of the ε4 allele, we did not seek to catalog every APOE-ε2/3/4 result in the published literature. Instead, as a proof of concept, we only considered the 43 samples included in the previous meta-analysis by Farrer et al. (1997).

Note that there are several differences in the approaches taken by Farrer et al. (1997) and AlzGene to derive summary risk estimates for the ε2/3/4 polymorphism in APOE: 1) Farrer et al. included data from every available study (including meeting reports, studies not available in English, family-based data, personally communicated unpublished genotype data, studies without published control genotypes, etc.). Following the same procedure as for all other genes represented in AlzGene, five such studies were excluded here (i.e., refs. 43, 51, 55, 57, and 62 in the Farrer paper). 2) Farrer et al. included case-control and family-based samples in their meta-analyses. For each of the family studies, only one individual (generally the proband) was included in the pooled analyses, but none of the unaffected individuals. Three such studies are excluded from the AlzGene analyses because crude ORs cannot be calculated in these cases (i.e., refs. 26, 42, and 49 in the Farrer paper). 3) Farrer et al. had access to the raw genotype data from all studies, which allowed OR estimates from pooled genotypes, as well as several additional analyses incorporating co-variables such as onset age, gender, and years of education. AlzGene uses a traditional meta-analytic approach based on crude ORs calculated separately for each study from the published genotype tables (see meta-analysis methods). Despite these analytic and conceptual differences, both approaches generate very similar effect size estimates and confidence intervals (see APOE meta-analysis results, and Bertram et al. (2007) for details).

Please note that the restrictions in inclusion criteria for the ε2/ε3/ε4 variants in APOE do not apply to any of the other published APOE polymorphisms tested for association with AD (e.g., those in the promoter region). For those, we have attempted to sample and analyze every study fulfilling general AlzGene inclusion and exclusion criteria (see above).

4. Early-onset Familial AD Genes

AlzGene provides summaries of studies that use genetic association methods on common polymorphisms (minor allele frequency in controls >1%) by either case-control or family-based designs. For a summary of rare mutations in early-onset familial AD genes (APP, PSEN1, and PSEN2) please visit the Alzheimer Disease & Frontotemporal Dementia Mutation Database of the Department of Molecular Genetics, University of Antwerp, Belgium. See also the Alzforum Mutations Directory.

References

Allen NC, Bagade S, McQueen MB, Ioannidis JP, Kavvoura FK, Khoury MJ, Tanzi RE, Bertram L (2008) "Systematic meta-analyses and field synopsis of genetic association studies in schizophrenia: the SzGene database." Nat Genet 40(7):827-34.

Bertram L, McQueen MB, Mullin K, Blacker D, Tanzi RE (2007) "Systematic meta-analyses of Alzheimer disease genetic association studies: the AlzGene database." Nat Genet 39(1):17-23.

Bertram L, Tanzi RE (2008) "Thirty years of Alzheimer's disease genetics: the implications of systematic meta-analyses." Nature Reviews Neuroscience 9(10):768-78.

Bertram L, Tanzi RE (2004) "Alzheimer's disease: one disorder, too many genes?" Hum Mol Genet 13(Spec No 1):R135-41.

DerSimonian R, Laird N (1986) "Meta-analysis in clinical trials." Control Clin Trials 7(3):177-88.

Farrer LA, et al. (1997) "Effects of age, sex, and ethnicity on the association between apolipoprotein E genotype and Alzheimer disease. A meta-analysis. APOE and Alzheimer Disease Meta Analysis Consortium." JAMA 278:1349-1356.

Harbord RM, Egger M, Sterne JA (2006) "A modified test for small-study effects in meta-analyses of controlled trials with binary endpoints." Stat Med 25:3443–3457.

Ioannidis JP, Boffetta P, Little J, O'Brien TR, Uitterlinden AG, Vineis P, Balding DJ, Chokkalingam A, Dolan SM, Flanders WD, Higgins JP, McCarthy MI, McDermott DH, Page GP, Rebbeck TR, Seminara D, Khoury MJ (2008) "Assessment of cumulative evidence on genetic associations: interim guidelines." Int J Epidemiol 37(1):120-32.

Khoury MJ, Bertram L, Boffetta P, Butterworth AS, Chanock SJ, Dolan SM, Fortier I, Garcia-Closas M, Gwinn M, Higgins JP, Janssens AC, Ostell J, Owen RP, Pagon RA, Rebbeck TR, Rothman N, Bernstein JL, Burton PR, Campbell H, Chockalingam A, Furberg H, Little J, O'Brien TR, Seminara D, Vineis P, Winn DM, Yu W, Ioannidis JP (2009) "Genome-wide association studies, field synopses, and the development of the knowledge base on genetic variation and human diseases." Am J Epidemiol 170(3):269-79.

Martin ER, et al. (2000) "SNPing away at complex diseases: analysis of single-nucleotide polymorphisms around APOE in Alzheimer disease." Am J Hum Genet 67(2):383-94.

Saunders AM, et al. (1993) "Association of apolipoprotein E allele epsilon 4 with late-onset familial and sporadic Alzheimer's disease." Neurology 43:1467-72.

Takei et al. (2009) "Genetic association study on in and around the APOE in late-onset Alzheimer disease in Japanese." Genomics 93(5):441-8.

Yu CE, Seltman H, Peskind ER, Galloway N, Zhou PX, Rosenthal E, Wijsman EM, Tsuang DW, Devlin B, Schellenberg GD (2007) "Comprehensive analysis of APOE and selected proximate markers for late-onset Alzheimer's disease: Patterns of linkage disequilibrium and disease/marker association." Genomics. Apr 12 (Epub ahead of print).
