
Implementation of PDB Entry Versioning and Better Revision History to Improve PDB Archive Management

10/03 wwPDB News

A new FTP repository, ftp://ftp-versioned.wwpdb.org/, now hosts versioned structural model files in PDBx/mmCIF and PDBML formats. As announced on May 17, 2017, wwPDB has introduced a versioning system to enable depositor-initiated or wwPDB-initiated updates to previously released PDB entries while retaining the same PDB accession code. Updates to the atomic coordinates, polymer sequence, or chemical description in a PDB coordinate file trigger a major version increment. All other changes are classified as minor. All major versions of each PDB structure are retained in the new FTP archive. In the 2018 phase of the project, wwPDB will enable depositor-initiated updates of coordinates.

File names in the versioned FTP archive conform to a new naming scheme, which allows users to easily see the major and minor version numbers:

<PDB_ID>_<content_type>_v<major_version>-<minor_version>.<file_format_type>.<file_compression_type>

The familiar 4-character PDB accession code is extended to 8 characters by left-padding with zeros and is prefixed with “pdb_”. Thus the PDB accession code for entry 1abc becomes pdb_00001abc. This new accession code format will be included in the model files at a later date. For example, the initial release of PDB entry 1abc would have the following name under the new file-naming scheme:

pdb_00001abc_xyz_v1-0.cif.gz

where xyz denotes coordinate content, cif indicates the PDBx/mmCIF file format, and gz indicates gzip compression.
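The naming scheme above can be sketched in a few lines of Python. This is an illustrative helper only, not a wwPDB tool: the function names extended_accession and versioned_file_name are hypothetical, and lowercasing the ID is an assumption based on the lowercase examples in this announcement.

```python
def extended_accession(pdb_id: str) -> str:
    """Turn a 4-character PDB ID such as '1abc' into 'pdb_00001abc'."""
    code = pdb_id.strip().lower()  # assumption: file names use lowercase IDs
    if len(code) != 4:
        raise ValueError(f"expected a 4-character PDB ID, got {pdb_id!r}")
    # Left-pad with zeros to 8 characters, then add the 'pdb_' prefix.
    return "pdb_" + code.rjust(8, "0")


def versioned_file_name(pdb_id: str, major: int, minor: int,
                        content_type: str = "xyz",
                        file_format: str = "cif",
                        compression: str = "gz") -> str:
    """Build <PDB_ID>_<content_type>_v<major>-<minor>.<format>.<compression>."""
    accession = extended_accession(pdb_id)
    return (f"{accession}_{content_type}_v{major}-{minor}"
            f".{file_format}.{compression}")


if __name__ == "__main__":
    print(versioned_file_name("1abc", 1, 0))
```

Keeping the accession conversion separate from the file-name builder makes it easy to reuse the former when constructing directory paths.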

The first minor revision of PDB entry 1abc would then have the following name:

pdb_00001abc_xyz_v1-1.cif.gz

If PDB entry 1abc then had a major update, it would have the following name:

pdb_00001abc_xyz_v2-0.cif.gz (N.B.: The minor update number will be reset to zero every time a new major update is made.)
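The increment rules described in this announcement (a major bump for changes to coordinates, polymer sequence, or chemical description; a minor bump otherwise; minor reset to zero on each major bump) can be expressed as a small function. The following is a minimal sketch: the change-type labels and the names Version and next_version are hypothetical and are not part of any wwPDB software.

```python
from typing import NamedTuple

# Hypothetical labels for the change types that trigger a major increment.
MAJOR_CHANGES = {"coordinates", "polymer_sequence", "chemical_description"}


class Version(NamedTuple):
    major: int
    minor: int


def next_version(current: Version, change: str) -> Version:
    """Return the version that follows `current` after a change of this type."""
    if change in MAJOR_CHANGES:
        # Major update: increment major and reset minor to zero.
        return Version(current.major + 1, 0)
    # Any other change is a minor update.
    return Version(current.major, current.minor + 1)


if __name__ == "__main__":
    v = Version(1, 0)                   # initial release
    v = next_version(v, "citation")     # minor revision
    v = next_version(v, "coordinates")  # major update, minor reset to zero
    print(f"v{v.major}-{v.minor}")
```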

The versioned data files for a particular entry are stored in a single directory, which sits under a 2-character hash directory named after the third- and second-from-last characters of the PDB accession code (for example, ab for pdb_00001abc):

../pdb_versioned/data/entries/<two-letter-hash>/<pdb_accession_code>/<entry_data_File_names>

For example, the file for major version 1, minor version 2 of entry 1ABC would have the following path:

../pdb_versioned/data/entries/ab/pdb_00001abc/pdb_00001abc_xyz_v1-2.cif.gz
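As a sketch of how the directory layout above could be derived programmatically, the snippet below builds the path for a given file. The function names are hypothetical, the default root is simply the relative path shown in this announcement, and the hash is taken from the third- and second-from-last characters of the accession code, as in the example ab for pdb_00001abc.

```python
import posixpath


def entry_directory(accession: str,
                    root: str = "pdb_versioned/data/entries") -> str:
    """Return <root>/<two-character-hash>/<accession> for an extended code."""
    # Two-character hash: third- and second-from-last characters,
    # e.g. 'ab' for 'pdb_00001abc'.
    two_char_hash = accession[-3:-1]
    return posixpath.join(root, two_char_hash, accession)


def entry_file_path(accession: str, file_name: str,
                    root: str = "pdb_versioned/data/entries") -> str:
    """Return the full relative path of one versioned file of an entry."""
    return posixpath.join(entry_directory(accession, root), file_name)


if __name__ == "__main__":
    print(entry_file_path("pdb_00001abc", "pdb_00001abc_xyz_v1-2.cif.gz"))
```

posixpath is used rather than os.path so that the separators are always forward slashes, as FTP paths require, regardless of the local operating system.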

Different views of the repository, organized by content type and format, are provided as a convenience for users. In these views, wwPDB provides links to the latest version of each file, as well as to the latest minor version of each major version held in the entries directories.

For example, users can access the latest version of the coordinate mmCIF file for an entry (e.g. 1ABC) via the following path, which links to the corresponding file in the entries directory:

../pdb_versioned/views/latest/coordinates/mmcif/ab/pdb_00001abc/pdb_00001abc_xyz.cif.gz
→../pdb_versioned/data/entries/ab/pdb_00001abc/pdb_00001abc_xyz_v2-0.cif.gz

Alternatively, users can access the latest minor version of every major version of the coordinate mmCIF file for entry 1ABC via the following paths:

../pdb_versioned/views/all/coordinates/mmcif/ab/pdb_00001abc/pdb_00001abc_xyz_v1.cif.gz
→../pdb_versioned/data/entries/ab/pdb_00001abc/pdb_00001abc_xyz_v1-2.cif.gz

../pdb_versioned/views/all/coordinates/mmcif/ab/pdb_00001abc/pdb_00001abc_xyz_v2.cif.gz
→../pdb_versioned/data/entries/ab/pdb_00001abc/pdb_00001abc_xyz_v2-0.cif.gz
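The selection behind these views (the highest minor version within each major version, and the highest version overall) can be reproduced from a plain listing of an entry directory. The following is a minimal sketch that assumes file names follow the scheme described in this announcement; the regular expression and the function names are illustrative and are not taken from wwPDB software.

```python
import re
from typing import Dict, Iterable, Optional

# Assumed pattern: pdb_<8 chars>_<content>_v<major>-<minor>.<format>.<compression>
NAME_RE = re.compile(
    r"^pdb_[0-9a-z]{8}_[a-z]+_v(?P<major>\d+)-(?P<minor>\d+)\.\w+\.\w+$"
)


def latest_per_major(file_names: Iterable[str]) -> Dict[int, str]:
    """Map each major version to its file with the highest minor version."""
    best = {}  # major -> (minor, file name)
    for name in file_names:
        match = NAME_RE.match(name)
        if match is None:
            continue  # skip names that do not follow the scheme
        major, minor = int(match["major"]), int(match["minor"])
        if major not in best or minor > best[major][0]:
            best[major] = (minor, name)
    return {major: name for major, (_, name) in best.items()}


def latest_overall(file_names: Iterable[str]) -> Optional[str]:
    """Return the file with the highest major (then minor) version, if any."""
    per_major = latest_per_major(file_names)
    if not per_major:
        return None
    return per_major[max(per_major)]


if __name__ == "__main__":
    listing = [
        "pdb_00001abc_xyz_v1-0.cif.gz",
        "pdb_00001abc_xyz_v1-1.cif.gz",
        "pdb_00001abc_xyz_v1-2.cif.gz",
        "pdb_00001abc_xyz_v2-0.cif.gz",
    ]
    print(latest_per_major(listing))
    print(latest_overall(listing))
```

Version numbers are compared as integers rather than as strings, so that a hypothetical v1-10 would sort after v1-2.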

Data files in the current archive location ftp://ftp.wwpdb.org/pub/pdb/data/structures/ will continue to use the familiar naming style and will continue to contain only the latest version for every entry.
