A new era for biodiversity genomics begins

The Vertebrate Genomes Project (VGP) today announces its flagship study, along with associated publications focused on genome assembly quality and standardization for the field of genomics.

The international consortium, led by a research team from Rockefeller University (USA), with the participation of the Institute of Evolutionary Biology (IBE), a joint center of the CSIC and Pompeu Fabra University (UPF), publishes in Nature a first proof-of-concept study with 16 complete, high-quality reference genomes of vertebrates (such as the zebra finch or Anna's hummingbird). These results will enable the study and conservation of species on an unprecedented scale.

Thanks to 10 years of work by the scientific community of the Genome 10K (G10K) project to sequence the genomes of 10,000 vertebrate species, and to other comparative genomics efforts around the world, the VGP has been able to take advantage of recent improvements in sequencing to begin producing high-quality reference genome assemblies for the more than 70,000 living vertebrate species.

“This massive comparative genomics project represents a new era of innovation in genome science, developing and using next-generation sequencing, assembly and annotation techniques in new ways, with implications for addressing fundamental questions in comparative biology, genetics, and biodiversity conservation,” comments Tomàs Marquès-Bonet, principal investigator of the Comparative Genomics group at the IBE and member of the VGP Steering Committee.

The consortium will also serve as a model for other coordinated genomics projects, such as the Catalan Initiative for the Earth Biogenome Project (CBP), which can take advantage of the VGP's extensive infrastructure and knowledge. Since the project began, the VGP has involved hundreds of international scientists from more than 50 institutions in 12 countries.


Anna's hummingbird (Calypte anna). / Adobe Stock

Towards the improvement of genome assembly

In a special issue of Nature, with complementary articles published simultaneously in other scientific journals, the VGP details numerous technological improvements in genome assembly.

In the lead article, the VGP demonstrates the feasibility of setting and achieving high-quality metrics for the reference genomes of almost all species. With their new approach, the international team has succeeded in combining long-range automated reading of genomes with the use of new algorithms to put the pieces of the genomic puzzle together in each case with almost no errors.

“When I was asked to take over the leadership of the G10K in 2015, I emphasized the need to bring together more partners and work on approaches that would produce the highest quality data possible, as it took months for students and postdocs in my own group to correct the structure of each gene in genome sequences for their experiments,” says Erich Jarvis, head of the VGP sequencing center at Rockefeller University, coordinator of the G10K and researcher at the Howard Hughes Medical Institute. “For me, this was not only a practical mission, but a moral one,” he adds.

The first genomes analyzed have already led to new discoveries with implications for characterizing biodiversity and contributing to conservation and human health. In particular, the first high-quality reference genomes from six species of bats, generated with the Bat1K consortium, revealed the selection and loss of immunity-related genes that are directly relevant to research on emerging infectious diseases, such as the current COVID-19.

As the first large-scale project to produce high-quality eukaryotic reference genomes, the VGP has also become the working model for other large consortia, including the Earth BioGenome Project, the Darwin Tree of Life project, the Catalan Initiative for the Earth Biogenome Project (CBP) and the European Reference Genome Atlas (ERGA), among others.

So far, the VGP consortium has generated more than a hundred genomes that represent the most complete versions available for these species. The genomic data have been generated mainly in three sequencing centers that have committed to the mission of the VGP: the Vertebrate Genome Laboratory of Rockefeller University (New York, USA), supported in part by the Howard Hughes Medical Institute, the Wellcome Sanger Institute (UK) and the Max Planck Institute (Germany).

The next step for the VGP will be to continue networking around the world and with other consortia until phase 1 of the project is complete. Phase 1 will consist of the analysis of approximately 260 species, with one representative species for each order of vertebrates separated by a minimum of 50 million years from a common ancestor with other species.

The VGP also intends to create genomic resources that make it possible to relate these 260 species to one another, including complete genomes that provide a means to understand their evolutionary history in great detail. Phase 2 will focus on analyzing representative species of each vertebrate family; samples are currently being identified and funds raised for it.

Proposal for a new nomenclature

In another study within the framework of the Vertebrate Genomes Project, also published today in the journal Nature, Rockefeller University, together with the University of Barcelona, has analyzed and compared the genomes of 35 species from the main lineages of vertebrates.

Their results show that oxytocin and arginine vasotocin (or arginine vasopressin), two hormones of the endocrine system that also act as neurotransmitters and regulate a wide range of biological functions in vertebrates, such as bonding or blood pressure, are encoded by the same family of genes, which derive from a common ancestral gene.

Because biochemists of the pre-genomic era gave different names in different animal species to the genes that contain the information needed to synthesize these hormones, the researchers now propose a new universal nomenclature based on the genes' evolutionary history.

According to the authors of the study, the proposal makes it possible to unify the different names used in vertebrates, both for the genes that code for hormones and for those that code for their receptors, thus facilitating comparative research between species.


Rhie, A. et al. “Towards complete and error-free genome assemblies of all vertebrate species” (2021). Nature

Theofanopoulou, C. et al. “Universal nomenclature for oxytocin-vasotocin ligand and receptor families”, Nature

Source: IBE (CSIC-UPF), UB

Rights: Creative Commons.