Genome sequencing has taken off in recent years, and large-scale projects are leading the way. This review looks at efforts around the world.

Gene sequencing has proved its usefulness as a diagnostic and prognostic tool. Its use in the identification of BRCA1 mutations is already a gold standard in cancer research. Thanks to personalized medicine trends and collaborations between industry and regulatory authorities, whole genome sequencing (WGS) could become common practice faster than originally expected.

The next generation sequencing (NGS) market, including but not limited to WGS, was valued at €4.6Bn in 2015 and is expected to reach €19Bn by 2020. Many private companies, such as Illumina, Roche, Life Technologies and Pacific Biosciences, are rushing into NGS to answer the rising sequencing needs.

Thanks to this competition, the technology has improved quickly in recent years. Nowadays, you can easily order a gene screening test for a few hundred to a few thousand dollars, depending on the provider and what you are looking for.

The genomic sequence of a cytogenetically aberrant human cancer cell line

Back in 2003, the International Human Genome Sequencing Consortium kicked off the genome analysis race by sequencing a complete human genome after years of worldwide collaboration and billions of dollars of investment. Roughly a decade later, the price of WGS reached $1,000.

According to Illumina, it is “expected one day” that whole genome sequencing will cost less than $100. For now, however, pocket-sized NGS is not yet practical or economical, and most available equipment is still quite expensive. For example, Illumina’s NovaSeq 5000 costs around €800,000 and a NovaSeq 6000 reaches almost €1M.

New partnerships, like the one between 23andMe and Roche’s Genentech in 2015, are trying to capitalize on the wealth of data: this partnership aims to obtain whole genome sequencing data from 3,000 people with Parkinson’s disease. (Fun fact: Google invested in 23andMe, and one of Google’s co-founders married 23andMe’s CEO.)

However, sequencing genomes to generate data is only part of the job. Quality checks, preprocessing of sequenced reads and mapping to a reference genome still require powerful computing facilities, efficient algorithms and, of course, experienced staff. It is a time-consuming process.
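To give a feel for what the quality-check step involves, here is a minimal sketch in Python that keeps only reads whose average base quality is above a threshold. It assumes reads in the common four-line FASTQ layout with Phred+33 quality encoding. The function names, the threshold of 20 and the two sample reads are made up for illustration; real pipelines use dedicated, heavily optimized tools.

```python
from io import StringIO
from typing import Iterator, List, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, sequence, quality) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of input
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header[1:], sequence, quality  # drop the leading '@'


def mean_phred(quality: str) -> float:
    """Average Phred score, assuming Phred+33 encoding (score = ASCII code - 33)."""
    return sum(ord(char) - 33 for char in quality) / len(quality)


def filter_reads(handle: TextIO, min_mean_quality: float = 20.0) -> List[Tuple[str, str]]:
    """Keep reads whose mean quality reaches the (hypothetical) threshold."""
    return [
        (name, sequence)
        for name, sequence, quality in read_fastq(handle)
        if quality and mean_phred(quality) >= min_mean_quality
    ]


# Two invented reads: 'I' encodes a score of 40, '!' encodes a score of 0.
SAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nACGT\n+\n!!!!\n"
kept = filter_reads(StringIO(SAMPLE))
print(kept)  # only read1 passes the filter
```

Even this toy filter has to touch every base of every read, which hints at why processing hundreds of gigabytes per genome calls for serious computing power.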

“Everybody talks about the $1,000 genome, but they don’t talk about the $2,000 mapping problem behind the $1,000 genome,” says Peter Tonellato, Professor of Biostatistics at the University of Wisconsin.

Moreover, WGS generates huge amounts of data, which poses a challenge for data storage.

Tremendous Data Storage

The Broad Institute in Cambridge, Massachusetts, said that “during the month of October, it decoded the equivalent of one human genome every 32 minutes. That translated to about 200 terabytes of raw data.” Even if that quantity is smaller than what is handled daily by internet companies, it exceeds anything biologists and hospitals have ever dealt with.
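The Broad Institute figures can be turned into a rough per-genome data volume. The sketch below assumes the 32-minute rate held for all 31 days of October and that a terabyte means 1,000 gigabytes; both are assumptions, not details from the quote.

```python
MINUTES_PER_GENOME = 32   # from the Broad Institute quote
RAW_DATA_TB = 200         # raw data for the month, from the same quote
DAYS_IN_OCTOBER = 31
GB_PER_TB = 1000          # assumption: decimal units

minutes_in_month = DAYS_IN_OCTOBER * 24 * 60
genomes_decoded = minutes_in_month / MINUTES_PER_GENOME
gb_per_genome = RAW_DATA_TB * GB_PER_TB / genomes_decoded

print(f"Genome equivalents decoded: {genomes_decoded:.0f}")
print(f"Raw data per genome: {gb_per_genome:.0f} GB")
```

Under these assumptions the month works out to about 1,395 genome equivalents and roughly 143 GB of raw data each.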

Amazon and Google understand this need and already offer to keep a copy of any genome for €24 ($25) a year. Since a file is commonly between 100 and 400GB, that translates to roughly €0.005 to €0.02 per GB per month. In 2014, the National Cancer Institute said that it would pay €18M ($19M) to move copies of the 2.6 petabyte Cancer Genome Atlas into the cloud.
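The per-gigabyte price follows directly from the yearly fee and the file size. Here is a small sketch of that arithmetic, using only the €24 fee and the 100 to 400GB range quoted above; the function name is made up for illustration.

```python
ANNUAL_FEE_EUR = 24.0  # yearly price per stored genome, as quoted above


def cost_per_gb_month(file_size_gb: float) -> float:
    """Monthly storage cost per GB for one genome file of the given size."""
    return ANNUAL_FEE_EUR / 12 / file_size_gb


for size_gb in (100, 400):
    print(f"{size_gb} GB file: EUR {cost_per_gb_month(size_gb):.3f} per GB per month")
```

A 100GB file comes out at €0.02 per GB per month, while a 400GB file comes out at €0.005, so the larger the file, the cheaper each gigabyte under a flat yearly fee.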

“Our bird’s eye view is that if I were to get lung cancer in the future, doctors are going to sequence my genome and my tumor’s genome, and then query them against a database of 50 million other genomes,” said Deniz Kural, whose company, Seven Bridges, stores genome data using Amazon’s cloud system.

Genome Sequencing Around the World

The UK was the first in Europe to launch a program dedicated to whole genome sequencing. Genomics England aims to sequence up to 100,000 whole genomes from patients with rare diseases, their families, and cancer patients from 11 Genomic Medicine Centres. Ten companies including GSK, AstraZeneca and Roche have signed up to be part of the GENE Consortium, giving them access to 5,000 sequenced genomes.

These collaborations can raise concerns regarding access to private health data, but there is no doubt that such a massive project would not be possible without private funding. Genomics England’s community management is impressive, with frequent updates and campaigns to raise public awareness. According to a counter that is updated monthly, almost 20,000 genomes have been sequenced so far!

Following a similar framework, Australia is currently working on the €290M (AU$400M), 4-year 100,000 Genomes Project (100KGP), sequencing patients with rare diseases and cancer to create a massive database for R&D.

Estonia proposed an ambitious personalized medicine program in June 2000 and thus became an unexpected pioneer. The ‘‘Estonian Genome Project Foundation’’ aimed at collecting 100,000 randomly selected samples before 2007. As of February 2014, the project had collected data from 52,000 adult donors including only a few hundred WGS.

In the USA, the Precision Medicine Initiative (PMI), with its 1-million-volunteer health study, will gather a large database of health data including genetics and lifestyle factors. To make a long story short, the Mayo Clinic will analyze and store one million blood and DNA samples.

As in the UK, some of the anonymized data will probably be made available to researchers and industry in order to stimulate the project, which started in 2016 with €52M ($55M) from the NIH to build the foundational partnerships and infrastructure needed to launch the program.

In 2016, France announced the “France Médecine Génomique 2025” program, aiming to open 12 sequencing centers and deliver 235,000 WGS a year. The French government is planning to inject €670M into this program, whose main aim is to use WGS as a diagnostic tool.

Many other western countries such as Ireland and Iceland have launched their own programs. However, when it comes to personalized medicine, taking into account genetic variability between populations is a prerequisite. Western medicine has historically targeted western populations but, nowadays, western medicine is a worldwide practice.

“There is a massive bias in medical research; Europeans have been developing drugs for Europeans without asking how compatible these pharmaceuticals are for the rest of the world,” says Stephan Schuster, Chair of the GenomeAsia 100K consortium.

Based on this observation, the non-profit consortium GenomeAsia 100K decided to generate genomic data for Asian populations. Supporters of the initiative include genomics companies Macrogen in Korea and MedGenome in India, as well as Illumina. According to the PHG Foundation, at least 50,000 DNA samples have already been collected, and initial work will focus on creating suitable reference genome sequences for key populations in countries such as Malaysia, India, Japan and Thailand.

With the same purpose, the Qatar Genome Program aims to establish the Qatari Reference Genome Map by sequencing 3,000 whole genomes, which accounts for around 1% of the Qatari population.

Last but not least, China has been an unbeatable leader in genome sequencing for years now. In 2010, the BGI genomics institute in Shenzhen was probably hosting a higher sequencing capacity than that of the entire United States. China’s sequencing program is not just aiming for thousands but rather one million human genomes and will include subgroups of 50,000 people, each with specific conditions such as cancer or metabolic disease. There will also be cohorts from different regions of China “to look at the different genetic backgrounds of subpopulations.”

How to Handle the Ethical Implications?

It is difficult to anticipate the impact of WGS on modern medicine, but ethical issues regarding the privacy of health data have already emerged. It is obvious that no one would like to see GAFA (Google, Apple, Facebook, Amazon) selling genome data as they are probably already doing with personal data from their users.

A key challenge is that the ethical, legal, and social concerns raised by the most innovative technologies, including cell and gene therapy as well as sequencing, differ significantly between regions. This definitely gives a certain advantage to countries with less restrictive laws, which are usually not western countries. For example, in Europe, transparency about the purpose of sample collection and about protocols is mandatory before any research is conducted.

Although it is easier said than done, regulators should be proactive and set up an appropriate framework for these promising but challenging approaches while ensuring it does not hinder R&D.


Images via whiteMocca /Shutterstock; Clark MJ et al. (2010), PLoS Genet 6(1): e1000832; GenomicsEngland; National Institutes of Health
