Welcome to kuyez | Welcome My Forum

Hoca · 2025-06-14T14:57:41+0100

June 26, 2025, is the 25th anniversary of the White House announcement of the first sequencing of the human genome, and July 28, 2025, marks the 30th anniversary of the publication of the first sequenced genome of a living species, Haemophilus influenzae.1 These anniversaries are more closely linked than might be imagined.

In 1994, Ham Smith and I submitted a grant application to the NIH genome center to use our new idea of whole genome shotgun sequencing to rapidly sequence a bacterial genome. The reviewers and NIH genome leadership were certain that the approach would never work, and the grant was not funded.2 At the time, the NIH and DOE were funding a 7-year project to sequence the E. coli genome from hundreds of mapped small clones.

The Human Genome Project (HGP) was following the same distributed clone sequencing approach, based on the view that genomes were too complex to sequence whole and had to be broken down into a large number of smaller projects distributed around the world. We surprised and shocked both the genome community and the broader scientific community with our publication of the H. influenzae genome in Science on July 28, 1995, based on a single whole genome shotgun sequencing effort.

I was certain that our approach would work to sequence the human genome, but I had to argue with the editors of Science to include the final sentence in the paper, “Finally, this strategy has the potential to facilitate the sequencing of the human genome.” 1

However, at the time, there were only a small handful of scientists, at best, who might have agreed with me. Fortunately, one of them was Mike Hunkapiller, PhD, of Applied Biosystems, now part of Thermo Fisher Scientific, who was developing a new capillary DNA sequencer and offered me $300 million to start a company (Celera Genomics) to sequence the human genome using my method and his sequencer.

History demonstrates that it was a smart bet; my team sequenced the first human genome in less than one year, and Applied Biosystems made a fortune. I knew that Celera was substantially ahead of the HGP, and we had plans to announce our success when President Clinton asked me to consider making the announcement from the White House together with the HGP, declaring the genome race a tie in order to end the public acrimony between Celera and the HGP.

I made the controversial decision to agree to this plan so that the Celera success would not harm the public funding of science. On June 26, 2000, almost exactly 5 years after the publication of the first genome, Celera and the HGP announced, with President Clinton and British Prime Minister Tony Blair, the first versions of the human genome sequence, which were published the following year.3,4

The good news

In the quarter-century since Celera and the HGP delivered the first human genome sequences, the world has witnessed a profound and pervasive genomic revolution. What began as a bold scientific quest—sometimes hyped as the key to “life’s blueprint”—has matured into concrete outcomes that touch many aspects of society. The global biotechnology industry has been transformed, growing exponentially and spawning technologies that were unimaginable in 2000.

Drug development has been made smarter and more efficient, yielding therapies informed by our genes and even cures for diseases long deemed incurable. Medicine has become more personalized, with genomics empowering doctors and patients to predict and prevent illness in ways that improve outcomes and quality of life. Innovative companies and business models have risen (and some fallen), all learning how to create value from the code of life.

Meanwhile, governments and international bodies have crafted new policies to support innovation and protect individuals, ensuring that genomic advances proceed ethically and equitably.

Genomics caused what I call a silent revolution that changed both basic science and pharmaceutical development. You would have had to be working in science before the 1990s to even remember how slow progress was before you could just search a database for a gene or protein of interest. Projects usually took a decade or longer to isolate a protein and eventually clone the corresponding gene. Nobel Prizes were given out a gene at a time.

Expressed sequence tags (ESTs), rapid gene discovery,5 and genomics reduced those decades to a few seconds of computer time. The pharmaceutical industry was almost instantly awash with new potential therapeutic targets, and the world changed from searching for drug targets to validating them.

Money for research and investment capital went from a trickle before 2000 to a flood after the White House announcement. The economic impact of genomics has been enormous. In the first 20 years (1990–2010) an estimated $800 billion in economic activity was generated in the U.S. alone. By 2019, human genomics was contributing around $250 billion per year to the U.S. economy and supporting close to one million jobs.

This prosperity wasn’t confined to one country—it reflects a global industry transformation, evidenced by the proliferation of large-scale genome initiatives across continents.

Many countries built national biobanks and sequencing programs (e.g., U.K. Biobank’s 500,000 genomes; France’s “Médecine Génomique 2025”; China’s Precision Medicine Initiative). These programs drive local biotech growth and ensure that genomics is truly a global enterprise, not just a U.S. effort.

The genomics sector’s value is now measured in the trillions: the global biotechnology market (much of it genomics-driven) was valued around $1.3–1.7 trillion in the mid-2020s. In short, the first human genome sequences triggered a seismic expansion of the biotech industry worldwide—launching new companies, creating jobs, and training a generation of genomics experts.

Genomics became integral to the fabric of biomedical research, medical practice, and society, moving beyond the lab to commercial and clinical sectors.

The not-so-good news

Several countries launched big-budget programs, but China’s multibillion-dollar investment in its Precision Medicine Initiative, announced in 2016, leverages the country’s vast sequencing capacity and population and dwarfs U.S. funding.

Perhaps due in part to budget limitations, planning at the NIH and other agencies was shortsighted, and as a result, after 25 years, our understanding of the human genome has progressed far less than it could have. We still have a limited understanding of how our genetic code has produced more than 8 billion unique individuals.

In my view, this slow progress can be attributed to three factors.

1. Short read sequencing technology

Ironically, the falling cost of sequencing genomes and the development of faster, cheaper technologies, while democratizing DNA sequencing, have had critical unintended consequences. The first two versions of the human genome published in 2001 were sequenced using Sanger sequencing, which was slow and very costly. Celera’s genome cost about $100 million and the HGP around $6 billion. Sequencing large numbers of humans was not going to be feasible at these costs.

Major reductions in genome cost came from new technologies largely driven by Illumina. Sequencing costs were driven down dramatically, to less than $500 per genome by 2024. However, the new sequencing technology produced short reads of only 100–200 bp, making independent genome assembly effectively impossible.

As a result, the definition of a genome sequence changed from an independently produced “sequence” to a sufficient number of short reads that could be layered onto a reference genome to discover SNP variations between the reads and the reference.
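The idea of layering short reads onto a reference can be sketched in a few lines of Python. This is a toy illustration only, not how production aligners or variant callers work: the reference string, the reads, the helper names `best_offset` and `call_snps`, and the `min_depth` threshold are all hypothetical, and each read is simply placed at the offset with the fewest mismatches.

```python
from collections import Counter


def best_offset(read: str, reference: str) -> int:
    """Return the reference offset where the read has the fewest mismatches."""
    def mismatches(offset: int) -> int:
        window = reference[offset:offset + len(read)]
        return sum(a != b for a, b in zip(read, window))

    return min(range(len(reference) - len(read) + 1), key=mismatches)


def call_snps(reference: str, reads: list[str], min_depth: int = 2) -> dict[int, str]:
    """Pile reads onto the reference and report positions whose majority base differs."""
    pileup = [Counter() for _ in reference]
    for read in reads:
        start = best_offset(read, reference)
        for i, base in enumerate(read):
            pileup[start + i][base] += 1

    snps = {}
    for pos, counts in enumerate(pileup):
        if sum(counts.values()) < min_depth:
            continue  # too few reads cover this position to make a call
        base, _ = counts.most_common(1)[0]
        if base != reference[pos]:
            snps[pos] = base
    return snps


# Hypothetical 20-base reference and three overlapping reads from a sample
# that carries a G->T change at 0-based position 10.
reference = "ACGTTGCAAGGCTTACGATC"
reads = ["TTGCAAGTCT", "GCAAGTCTTA", "AAGTCTTACG"]
print(call_snps(reference, reads))
```

Note what the sketch cannot do: it never builds a sequence of its own, so anything absent from the reference, or too different from it to place a read, is simply not seen.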

Independent assembly of actual genome sequences only restarted recently with the tremendous advances in long reads from single molecule sequencing developed by PacBio and an independent approach by Oxford Nanopore.

In 2007, the Venter Institute completed the first diploid human genome sequence.6 This project, called the Homo sapiens Reference Genome Project, produced a high-quality genome sequence that included both phased chromosome sets, one inherited from each parent. The phasing was accomplished by sequencing a number of individual sperm (haploid) cells. For the record, this was my genome.

This was a major milestone following the first two human genome efforts, which produced a composite (haploid) reference genome assembled from multiple individuals rather than representing a single, complete diploid genome.

The first diploid genome was significant because it revealed the genetic variations between the two chromosome sets, highlighting the importance of sequencing diploid genomes to fully capture individual genetic diversity. The diploid genome was sequenced using Sanger sequencing and cost an estimated $40 million. Even though substantial genetic variation was not in SNPs but in larger insertions and deletions, the cost led it to be largely ignored other than as the reference to align the new short read sequences.

With short-read sequences layered on a reference genome, allele-specific effects were lost because maternal and paternal alleles were collapsed, generating a sequence that does not exist in nature and that obscured and complicated variant interpretation.

For example, in compound heterozygotes, where a different variant of a gene is inherited from each parent, collapsing the reads creates an artificial construct showing both variants on a single protein sequence that does not exist in reality. In addition, knowing which parent a trait was inherited from is critical in risk assessments.
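A small Python sketch makes the loss of information concrete. The variant names `var1` and `var2` and the helper functions are hypothetical, and the model is deliberately simplistic (a gene copy counts as intact only if it carries none of the listed variants). It compares two people who look identical once their haplotypes are collapsed, although only one of them is a compound heterozygote.

```python
def collapse(maternal: set[str], paternal: set[str]) -> dict[str, int]:
    """Collapse two haplotypes into unphased per-variant allele counts (1 or 2)."""
    return {v: (v in maternal) + (v in paternal) for v in sorted(maternal | paternal)}


def intact_copies(maternal: set[str], paternal: set[str]) -> int:
    """Count gene copies that carry none of the damaging variants."""
    return (not maternal) + (not paternal)


# Hypothetical person A: both variants on the maternal copy (cis); paternal copy is intact.
cis = ({"var1", "var2"}, set())
# Hypothetical person B: one variant on each copy (trans), a compound heterozygote.
trans = ({"var1"}, {"var2"})

# Unphased calls are identical: each variant is seen once in both people.
print(collapse(*cis), collapse(*trans))
# Phased data tells them apart: A keeps one intact copy, B has none.
print(intact_copies(*cis), intact_copies(*trans))
```

Only the phased representation distinguishes the person with one working copy of the gene from the person with none.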

2. Missing heritability

This issue finally came to a head when examination of short-read genomes revealed that the genetic variants underlying up to 50% of known heritable traits were missing from SNP data.7 Heritability estimates from twin or family studies suggest that traits like height, BMI, or schizophrenia are 40–80% heritable.

Common genome-wide association study SNPs typically explain only 10–50% of total heritability depending on the trait. This should not have been a surprise based on the first diploid genome that showed around one quarter of genome variation was in insertions and deletions of greater than a single nucleotide and that there were more total base pairs in the structural variations than in all the SNPs.
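As a rough worked example of the arithmetic, the sketch below combines the two ranges quoted above. The pairings are illustrative assumptions rather than estimates for any particular trait, and the function name `missing_heritability` is hypothetical.

```python
def missing_heritability(h2_family: float, snp_fraction: float) -> float:
    """Share of trait variance that is heritable but not explained by SNPs.

    h2_family: heritability from twin or family studies (0 to 1).
    snp_fraction: fraction of that heritability explained by GWAS SNPs (0 to 1).
    """
    if not (0.0 <= h2_family <= 1.0 and 0.0 <= snp_fraction <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    return h2_family * (1.0 - snp_fraction)


# Illustrative corner cases of the ranges quoted in the text.
for h2 in (0.4, 0.8):
    for frac in (0.1, 0.5):
        gap = missing_heritability(h2, frac)
        print(f"h2={h2:.1f}, SNPs explain {frac:.0%} of it: missing {gap:.2f}")
```

Even in the most favorable pairing shown, where SNPs explain half of an 80% heritable trait, 40% of the variance in the trait is heritable but unaccounted for.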

Although the NIH-led telomere-to-telomere (T2T) Consortium proclaimed in April 2022 that the human genome sequence was now completed, the first publication of a complete, phased, T2T diploid human genome actually appeared in July 2023 in Cell Research by a team led by researchers from the Chinese Academy of Sciences.8 This genome assembly represents the first publicly available instance where both parental haplotypes of a human genome were fully resolved from telomere to telomere. The researchers utilized advanced sequencing technologies, including PacBio HiFi and Oxford Nanopore ultra-long reads, combined with Hi-C data, to achieve this comprehensive assembly.

3. Lack of phenotype data

Many seemed to think that just sequencing large numbers of genomes would make deep understanding and new knowledge fall into place. While that has been true for ancestry tracing and population genetics, without detailed comprehensive phenotype data to accompany genome sequencing little true progress will be made. Much of the genetic data derived from the genome is misleading or just wrong.

For example, the APOE4 allele has been claimed to be diagnostic for predicting Alzheimer’s disease. I am an APOE4 heterozygote, but my brain MRI and amyloid PET were both completely negative. As a result, I started Human Longevity to do comprehensive imaging and phenotyping along with genome sequencing.

After close to 10,000 individuals had been screened, not a single APOE heterozygote had any Alzheimer’s indications, and 20% of homozygotes, including some in their mid-90s, also had no Alzheimer’s indications. Similar findings occurred with breast and ovarian cancers, where family history is a much stronger predictor than existing genetic markers.

To me this means the causal mutations are part of the missing heritability. We did however find that about 50% of “healthy” individuals had a major tumor or disease that they were unaware of.9

[Image: Doctor using pharmacogenomics in drug discovery. Credit: Leo Wolfort / iStock / Getty Images Plus]

Looking ahead

After 25 years, the field of human genomics is now starting over with the right technology to produce full diploid phased genomes. We are at the point where we can actually have, and test, genome changes in place of just a collection of short snippets of DNA.

The phasing will reveal which parent each trait was inherited from, enabling true genealogy of disease and traits. Since I started Human Longevity, hundreds of similar centers have opened to help with pre-symptomatic screening, while at the same time creating comprehensive phenotyping datasets to relate back to the genome sequence data.

This must be the future of genome research if we are going to make progress in truly understanding the role our genetic code plays in helping to determine our phenotypes and diseases.

References

1. Fleischmann RD, Adams MD, White O, et al. Whole-Genome Random Sequencing and Assembly of Haemophilus influenzae Rd. Science. 1995;269(5223):496-512.

2. Venter JC. A Life Decoded: My Genome, My Life. New York: Viking; 2007.

3. Venter JC, et al. The Sequence of the Human Genome. Science. 2001;291(5507):1304-1351.

4. Lander ES, Linton LM, Birren B, et al. Initial sequencing and analysis of the human genome. Nature. 2001;409:860-921.

5. Adams MD, Kelley JM, Gocayne JD, et al. Complementary DNA Sequencing: Expressed Sequence Tags and Human Genome Project. Science. 1991;252(5013):1651-1656.

6. Levy S, Sutton G, Ng PC, et al. The diploid genome sequence of an individual human. PLoS Biol. 2007;5(10):e254.

7. Eichler EE, Flint J, Gibson G, et al. Missing heritability and strategies for finding the underlying causes of complex disease. Nat Rev Genet. 2010;11(6):446-450.

8. Yang C, Zhou Y, Song Y, et al. The Complete and Fully-Phased Diploid Genome of a Male Han Chinese. Cell Research. 2023;33(10):1-17.

9. Hou YCC, Yu HC, Martin R, et al. Precision Medicine Integrating Whole-Genome Sequencing, Comprehensive Metabolomics, and Advanced Imaging. PNAS. 2020;117(6):3053-3062.

The post J. Craig Venter Describes a Human Genomics Revolution Still In Progress appeared first on GEN - Genetic Engineering and Biotechnology News.
