A service of the U.S. National Library of Medicine®
Next steps in studying the human genome
What are the next steps in genomic research?
Discovering the sequence of the human genome was only the first step in understanding how the instructions coded in DNA lead to a functioning human being. The next stage of genomic research will begin to derive meaningful knowledge from the DNA sequence. Research studies that build on the work of the Human Genome Project are under way worldwide.
The objectives of continued genomic research include the following:
For more information about the genomic research following the Human Genome Project:
The National Human Genome Research Institute supports research in many of the areas described above. The Institute provides detailed information about its research initiatives at
What are single nucleotide polymorphisms (SNPs)?
Single nucleotide polymorphisms, frequently called SNPs (pronounced “snips”), are the most common type of genetic variation among people. Each SNP represents a difference in a single DNA building block, called a nucleotide. For example, a SNP may replace the nucleotide cytosine (C) with the nucleotide thymine (T) in a certain stretch of DNA.
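The C-to-T example above can be sketched in a few lines of Python. This is only an illustration: the two sequences are made up, the function name is hypothetical, and the sequences are assumed to be already aligned and of equal length. A single difference between two sequences is a variant; it is called a SNP only when it is found commonly among people, which this sketch does not check.

```python
def find_single_nucleotide_differences(reference, sample):
    """Return (position, reference_base, sample_base) for each mismatch.

    Both sequences are assumed to be aligned and the same length.
    Positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be the same length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Made-up stretch of DNA in which one cytosine (C) is replaced by thymine (T).
reference = "AAGCCTA"
sample = "AAGCTTA"

differences = find_single_nucleotide_differences(reference, sample)
print(differences)  # [(5, 'C', 'T')]
```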
SNPs occur normally throughout a person’s DNA. They occur once in every 300 nucleotides on average, which means there are roughly 10 million SNPs in the human genome. Most commonly, these variations are found in the DNA between genes. They can act as biological markers, helping scientists locate genes that are associated with disease. When SNPs occur within a gene or in a regulatory region near a gene, they may play a more direct role in disease by affecting the gene’s function.
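The estimate of 10 million SNPs follows from simple division. The sketch below assumes a genome length of about 3 billion nucleotides, a commonly cited round figure that is not stated in the text above; the variable names are illustrative.

```python
# Assumption: the human genome is about 3 billion nucleotides long.
GENOME_LENGTH = 3_000_000_000

# From the text: one SNP in every 300 nucleotides on average.
NUCLEOTIDES_PER_SNP = 300

estimated_snps = GENOME_LENGTH // NUCLEOTIDES_PER_SNP
print(f"{estimated_snps:,}")  # 10,000,000
```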
Most SNPs have no effect on health or development. Some of these genetic differences, however, have proven to be very important in the study of human health. Researchers have found SNPs that may help predict an individual’s response to certain drugs, susceptibility to environmental factors such as toxins, and risk of developing particular diseases. SNPs can also be used to track the inheritance of disease genes within families. Future studies will work to identify SNPs associated with complex diseases such as heart disease, diabetes, and cancer.
For more information about SNPs:
An audio definition of
A detailed overview of SNPs and their association with cancer risk can be found in the National Cancer Institute’s Understanding Cancer Series: Genetic Variation
For people interested in more technical data, several databases of known SNPs are available:
What are genome-wide association studies?
Genome-wide association studies are a relatively new way for scientists to identify genes involved in human disease. This method searches the genome for small variations, called single nucleotide polymorphisms or SNPs (pronounced “snips”), that occur more frequently in people with a particular disease than in people without the disease. Each study can look at hundreds of thousands of SNPs at the same time. Researchers use data from this type of study to pinpoint genes that may contribute to a person’s risk of developing a certain disease.
Because genome-wide association studies examine SNPs across the genome, they represent a promising way to study complex, common diseases in which many genetic variations contribute to a person’s risk. This approach has already identified SNPs related to several complex conditions including diabetes, heart abnormalities, Parkinson disease, and Crohn disease. Researchers hope that future genome-wide association studies will identify more SNPs associated with chronic diseases, as well as variations that affect a person’s response to certain drugs and influence interactions between a person’s genes and the environment.
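The comparison at the heart of a genome-wide association study, whether a variant is more frequent in people with a disease than in people without it, can be sketched as a chi-square test on allele counts. The SNP names and counts below are invented for illustration, and the cutoff of 3.84 (the standard 5 percent critical value for one degree of freedom) is used only to keep the example small. Real studies test so many SNPs at once that they apply far stricter thresholds and additional quality checks.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    total = a + b + c + d
    row_1, row_2 = a + b, c + d
    col_1, col_2 = a + c, b + d
    if 0 in (row_1, row_2, col_1, col_2):
        return 0.0  # no variation to test
    return total * (a * d - b * c) ** 2 / (row_1 * row_2 * col_1 * col_2)


# Hypothetical allele counts per SNP:
# (variant allele in cases, other allele in cases,
#  variant allele in controls, other allele in controls)
allele_counts = {
    "snp_A": (60, 40, 40, 60),
    "snp_B": (50, 50, 50, 50),
    "snp_C": (55, 45, 50, 50),
}

CUTOFF = 3.84  # illustrative only; see the note above

flagged = [
    name
    for name, counts in allele_counts.items()
    if chi_square_2x2(*counts) > CUTOFF
]
print(flagged)  # ['snp_A']
```

Only snp_A differs enough between the two groups to be flagged; a flagged SNP marks a region of the genome for follow-up and does not by itself show that the variant causes the disease.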
For more information about genome-wide association studies:
The National Human Genome Research Institute provides a detailed explanation of genome-wide association
You can also search for clinical trials of genome-wide association studies online.
For people interested in more technical information, the NCBI’s Database of Genotype and Phenotype
What is the International HapMap Project?
The International HapMap Project is an international scientific effort to identify common genetic variations among people. This project represents a collaboration of scientists from public and private organizations in six countries. Data from the project is freely available to researchers worldwide. Researchers can use the data to learn more about the relationship between genetic differences and human disease.
The HapMap (short for “haplotype map”) is a catalog of common genetic variants called single nucleotide polymorphisms or SNPs (pronounced “snips”). Each SNP represents a difference in a single DNA building block, called a nucleotide. These variations occur normally throughout a person’s DNA. When several SNPs cluster together on a chromosome, they are inherited as a block known as a haplotype. The HapMap describes haplotypes, including their locations in the genome and how common they are in different populations throughout the world.
The human genome contains roughly 10 million SNPs. It would be difficult, time-consuming, and expensive to look at each of these changes and determine whether it plays a role in human disease. Using haplotypes, researchers can sample a selection of these variants instead of studying each one. The HapMap will make carrying out large-scale studies of SNPs and human disease (called genome-wide association studies) cheaper, faster, and less complicated.
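The idea of sampling a few variants in place of many can be illustrated with a toy example. The four haplotypes below are invented, each SNP is assumed to have only two possible nucleotides, and the function name is hypothetical. SNPs whose nucleotides always change together across the haplotypes are grouped, and one representative per group is kept. Real haplotype analysis relies on statistical measures of how strongly SNPs are inherited together, so this is a simplified sketch.

```python
def group_snps_inherited_together(haplotypes):
    """Group SNP positions that split the haplotypes in the same way.

    Each haplotype is a string with one nucleotide per SNP position.
    Assumes every SNP has only two possible nucleotides.
    """
    groups = {}
    for position in range(len(haplotypes[0])):
        column = [haplotype[position] for haplotype in haplotypes]
        # Record which haplotypes carry the same nucleotide as the first one.
        pattern = tuple(allele == column[0] for allele in column)
        groups.setdefault(pattern, []).append(position)
    return list(groups.values())


# Invented haplotypes covering six SNP positions (0 to 5).
haplotypes = ["ACGTAG", "ACGCGG", "GTATAG", "GTACGC"]

groups = group_snps_inherited_together(haplotypes)
representatives = [group[0] for group in groups]

print(groups)           # [[0, 1, 2], [3, 4], [5]]
print(representatives)  # [0, 3, 5]
```

In this toy data, reading three SNPs is enough to infer all six, which is the kind of saving the paragraph above describes.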
The main goal of the International HapMap Project is to describe common patterns of human genetic variation that are involved in human health and disease. Additionally, data from the project will help researchers find genetic differences that can help predict an individual’s response to particular medicines or environmental factors (such as toxins).
For more information about the International HapMap Project:
The National Human Genome Research Institute provides an overview of the project in its International HapMap Project fact
Detailed information about the project, as well as project data, is available from the International HapMap Project web
You can also search for clinical trials involving haplotypes or associated with the International HapMap Project.
What is the Encyclopedia of DNA Elements (ENCODE) Project?
The ENCODE Project was planned as a follow-up to the Human Genome Project. The Human Genome Project sequenced the DNA that makes up the human genome; the ENCODE Project seeks to interpret this sequence. Coinciding with the completion of the Human Genome Project in 2003, the ENCODE Project began as a worldwide effort involving more than 30 research groups and more than 400 scientists.
The approximately 20,000 genes that provide instructions for making proteins account for only about 1 percent of the human genome. Researchers embarked on the ENCODE Project to figure out the purpose of the remaining 99 percent of the genome. Scientists discovered that more than 80 percent of this non-gene component of the genome, which was once considered “junk DNA,” actually has a role in regulating the activity of particular genes (gene expression).
Researchers think that changes in the regulation of gene activity may disrupt protein production and cell processes and result in disease. A goal of the ENCODE Project is to link variations in the expression of certain genes to the development of disease.
The ENCODE Project has given researchers insight into how the human genome functions. As researchers learn more about the regulation of gene activity and how genes are expressed, the scientific community will be able to better understand how the entire genome can affect human health.
For more information about the ENCODE Project:
The University of California at Santa Cruz provides detailed information about the findings of the ENCODE
Published research findings are available through Nature Magazine’s Nature Encode
What is pharmacogenomics?
Pharmacogenomics is the study of how genes affect a person’s response to drugs. This relatively new field combines pharmacology (the science of drugs) and genomics (the study of genes and their functions) to develop effective, safe medications and doses that will be tailored to a person’s genetic makeup.
Many drugs that are currently available are “one size fits all,” but they don’t work the same way for everyone. It can be difficult to predict who will benefit from a medication, who will not respond at all, and who will experience negative side effects (called adverse drug reactions). Adverse drug reactions are a significant cause of hospitalizations and deaths in the United States. With the knowledge gained from the Human Genome Project, researchers are learning how inherited differences in genes affect the body’s response to medications. These genetic differences will be used to predict whether a medication will be effective for a particular person and to help prevent adverse drug reactions.
The field of pharmacogenomics is still in its infancy. Its use is currently quite limited, but new approaches are under study in clinical trials. In the future, pharmacogenomics will allow the development of tailored drugs to treat a wide range of health problems, including cardiovascular disease, Alzheimer disease, cancer, HIV/AIDS, and asthma.
For more information about pharmacogenomics:
The National Institute of General Medical Sciences offers a list of Frequently Asked Questions about
Additional information about
The National Genetics and Genomics Education Centre of the National Health Service (UK) provides information about predicting the effects of
A list of clinical trials involving
What advances are being made in DNA sequencing?
Determining the order of DNA building blocks (nucleotides) in an individual’s genetic code, called DNA sequencing, has advanced the study of genetics and is one method used to test for genetic disorders.
New technologies that allow rapid sequencing of large amounts of DNA are being developed. The original sequencing technology, called Sanger sequencing (named after the scientist who developed it, Frederick Sanger), was a breakthrough that helped scientists determine the human genetic code, but it is time-consuming and expensive. The Sanger method has been automated to make it faster and is still used in laboratories today to sequence short pieces of DNA, but it would take years to sequence all of a person’s DNA (known as the person’s genome) this way. Several newer technologies, collectively called next-generation sequencing (or next-gen sequencing), have sped up the process, taking only days to weeks to sequence a human genome, while also reducing the cost.
With next-generation sequencing, it is now feasible to sequence large amounts of DNA, for instance all the pieces of an individual’s DNA that provide instructions for making proteins. These pieces, called exons, are thought to make up 1 percent of a person’s genome. Together, all the exons in a genome are known as the exome, and the method of sequencing them is known as whole exome sequencing. This method allows variations in the protein-coding region of any gene to be identified, rather than a select few genes. Because most known mutations that cause disease occur in exons, whole exome sequencing is thought to be an efficient method to identify possible disease-causing mutations.
However, researchers have found that DNA variations outside the exons can affect gene activity and protein production and lead to genetic disorders. Whole exome sequencing would miss these variations. Another method, called whole genome sequencing, determines the order of all the nucleotides in an individual’s DNA and can determine variations in any part of the genome.
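The difference in coverage between the two methods can be sketched by checking which variants fall inside exons. The exon coordinates and variant positions below are invented for illustration and do not refer to any real gene; the function name is hypothetical.

```python
# Hypothetical exon boundaries (start, end), inclusive.
exons = [(100, 200), (500, 650)]

# Hypothetical positions at which a person's DNA differs from a reference.
variant_positions = [150, 320, 510, 900]


def is_in_exon(position, exons):
    """Return True if the position falls inside any exon interval."""
    return any(start <= position <= end for start, end in exons)


# Whole exome sequencing only reads the exons.
found_by_exome = [p for p in variant_positions if is_in_exon(p, exons)]
missed_by_exome = [p for p in variant_positions if not is_in_exon(p, exons)]

# Whole genome sequencing reads every position.
found_by_genome = list(variant_positions)

print(found_by_exome)   # [150, 510]
print(missed_by_exome)  # [320, 900]
```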
While many more genetic changes can be identified with whole exome and whole genome sequencing than with select gene sequencing, the significance of much of this information is unknown. Because not all genetic changes affect health, it is difficult to know whether identified variants are involved in the condition of interest. Sometimes, an identified variant is associated with a different genetic disorder that has not yet been diagnosed (these are called incidental or secondary findings).
In addition to being used in the clinic, whole exome and whole genome sequencing are valuable methods for researchers. Continued study of exome and genome sequences can help determine whether new genetic variations are associated with health conditions, which will aid disease diagnosis in the future.
For more information about DNA sequencing technologies and their use:
Genetics Home Reference discusses whether all genetic changes affect health and development.
A scientist at the Genome Institute at Washington University describes the different sequencing
An illustration of the decline in the cost of DNA
The American College of Medical Genetics and Genomics (ACMG) has laid out its policies regarding whole exome and whole genome
The PHG Foundation (UK) provides an overview of whole genome
The Mount Sinai School of Medicine Genomics Core Facility describes the techniques used in whole exome