PDB Statistics: Growth in Number of Unique Protein Sequences in Released PDB Structures (Cumulative) at Identity 30%

This chart shows the annual and cumulative numbers of unique protein sequences in released PDB structures since the beginning of the PDB archive. It can be viewed at several levels of sequence identity; this page shows the 30% level. The cumulative bars represent the growth in unique protein sequences (number of polymeric entities) over the history of the archive. The yearly bars (dark blue) show how many new protein sequences were added in each year.

Note: The total number of sequence clusters in the statistics table links to the sequence cluster group search result page. For performance reasons, the statistics are calculated with a default precision threshold, so the count shown here may differ slightly from the actual non-redundant group search result once the result count approaches or exceeds 10,000. The group search result page provides the accurate count; the statistics page provides the trend.


Sequence cluster level: 30% sequence identity

Year    Number of New Protein Sequences    Total Number of Protein Sequences
1976    11                                 11
1977    11                                 22
1978    3                                  25
1979    1                                  26
1980    2                                  28
1981    7                                  35
1982    17                                 52
1983    5                                  57
1984    9                                  66
1985    7                                  73
1986    7                                  80
1987    7                                  87
1988    16                                 103
1989    27                                 130
1990    27                                 157
1991    35                                 192
1992    46                                 238
1993    131                                369
1994    279                                648
1995    227                                875
1996    261                                1,136
1997    389                                1,525
1998    454                                1,979
1999    610                                2,589
2000    725                                3,314
2001    725                                4,039
2002    772                                4,811
2003    1,045                              5,856
2004    1,485                              7,341
2005    1,565                              8,906
2006    1,801                              10,707
2007    1,955                              12,662
2008    1,876                              14,538
2009    1,871                              16,409
2010    1,869                              18,278
2011    1,666                              19,944
2012    1,761                              21,705
2013    1,842                              23,547
2014    2,184                              25,731
2015    1,982                              27,713
2016    2,121                              29,834
2017    2,267                              32,101
2018    2,285                              34,386
2019    2,398                              36,784
2020    2,881                              39,665
2021    2,327                              41,992
2022    2,914                              44,906
2023    233                                45,139
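
In the table, each year's total is the running sum of the yearly counts up to and including that year. The following Python sketch illustrates that relationship using the 1976 to 1985 rows. The function name `cumulative_totals` and the dictionary layout are assumptions made for this example and are not part of any PDB tool or API.

```python
from itertools import accumulate

# Yearly counts of new protein sequences for 1976-1985,
# copied from the table (30% sequence identity level).
new_per_year = {
    1976: 11, 1977: 11, 1978: 3, 1979: 1, 1980: 2,
    1981: 7, 1982: 17, 1983: 5, 1984: 9, 1985: 7,
}


def cumulative_totals(yearly):
    """Return {year: running total} for a {year: new count} mapping.

    Hypothetical helper for illustration only.
    """
    years = sorted(yearly)
    running = accumulate(yearly[year] for year in years)
    return dict(zip(years, running))


totals = cumulative_totals(new_per_year)
print(totals[1985])  # 73, the total listed for 1985 in the table
```

The same check can be extended to later rows to confirm that the two numeric columns stay consistent with each other.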