The PDB Archive Reaches a Significant Milestone
With this week's update, the PDB archive has passed the 150,000-entry milestone and now contains 150,145 entries.
Established in 1971, this central, public archive has reached this milestone thanks to the efforts of structural biologists throughout the world, who collectively contribute a wealth of experimentally determined protein and nucleic acid structure data. These data are made available to researchers worldwide, across many different disciplines.
Four wwPDB data centers support online access to three-dimensional structures of biological macromolecules that help researchers understand many facets of biomedicine, agriculture, and ecology, from protein synthesis to health and disease to biological energy. The archive is large, containing more than 1.9 million files related to these PDB entries and requiring more than 512 gigabytes of storage.
The archive reached the landmark of 100,000 entries in 2014, the International Year of Crystallography. Since that record was set, the PDB has continued to grow rapidly, both in the number of deposited structures and in the complexity of the data. This growth has been supported by the launch of OneDep, a common global system for deposition, validation, and biocuration of PDB data for supported experimental methods. The OneDep system and the underlying PDBx/mmCIF archive format enable the PDB archive to adapt over time to meet the challenges posed by developments in structural biology. More than 41,000 structures that were deposited, annotated, and validated using OneDep have now been released into the PDB archive, and many more entries have been updated to ensure consistency of the archive.
With this week's regular update, the PDB welcomes 262 new structures into the archive. These structures join others vital to research and education in fundamental biology, biomedicine, and bioenergy. Since its inception, the size of the archive has increased tenfold roughly every 10-15 years: the PDB reached 100 released entries in 1982, 1,000 entries in 1993, and 10,000 in the year 2000. Now that the 150,000th entry has been made available, more than half of the archive has been released in the past ten years.
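The tenfold-growth figures can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses just the milestone years quoted in this article (including the 100,000-entry mark in 2014), and the names `milestones` and `tenfold_intervals` are made up for this example. With these dates, the first and last steps fall within the 10-15 year range, while the step from 1,000 to 10,000 entries comes out shorter.

```python
# Milestones as stated in this article: entry count -> year reached.
milestones = {100: 1982, 1_000: 1993, 10_000: 2000, 100_000: 2014}


def tenfold_intervals(points):
    """Return (from_count, to_count, years) for each consecutive milestone pair."""
    counts = sorted(points)
    return [(a, b, points[b] - points[a]) for a, b in zip(counts, counts[1:])]


for start, end, years in tenfold_intervals(milestones):
    print(f"{start:>7,} -> {end:>7,} entries: {years} years")
```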
The scientific community eagerly awaits the next 150,000 structures and the invaluable knowledge these new data will bring. However, the increasing number, size, and complexity of the biological data being deposited in the PDB, together with the emergence of hybrid structure determination methods, pose major challenges for the management and representation of structural data. wwPDB will continue to work with the community to meet these challenges and ensure that the archive maintains the highest possible standards of quality, integrity, and consistency.
The development and future of the PDB archive and the wwPDB organization are described in the new reference publication for the PDB archive, Protein Data Bank: the single global archive for 3D macromolecular structure data (Nucleic Acids Res., 2019), and in many other papers, including Protein Data Bank (PDB): The Single Global Macromolecular Structure Archive (Methods in Molecular Biology, 2017), How community has shaped the Protein Data Bank (Structure, 2013), and Creating a Community Resource for Protein Science (Protein Science, 2012). A full list is available.