RCSB PDB Help

CATH

What is CATH?

CATH is a free, publicly available, hierarchical classification of protein domain structures that clusters proteins at four major levels: Class (C), Architecture (A), Topology (T), and Homologous superfamily (H). It was created in the 1990s and provides information on the evolutionary relationships of protein domains.

In this classification scheme, experimentally determined protein structures from the PDB are classified into CATH groups by a mixture of manual and automated methods. For example, Class is derived from secondary structure content and is assigned automatically for more than 90% of protein structures. Architecture, which describes the gross orientation of secondary structures independent of connectivity, is currently assigned manually. The Topology level clusters structures according to their topological connections and numbers of secondary structures. The Homologous superfamily level clusters proteins with highly similar structures and functions. Assignments of structures to topology families and homologous superfamilies are made by sequence and structure comparisons.

Why Browse by CATH classification?

The CATH classification groups protein structures by structurally similar domains that may have functional and evolutionary relationships. You can use the CATH browser to explore proteins that have similar shapes and functions. It can be used to identify and explore highly conserved regions that may contain functionally significant amino acids. Such structures may also be useful as starting models for phasing (in X-ray experiments), for modeling into EM volumes (in EM experiments), for simulations, and/or for hypothesis generation and experimental design.

How to use the CATH Browser?

The CATH browser allows users to type a protein name in the search box and select from the options in the autocomplete list. Alternatively, you can enter a CATH ID to find structures of interest. A CATH ID consists of four numbers separated by dots, one for each level of the classification (for example, 1.10.490.10).
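The four-number format can be checked before an ID is typed or pasted into the search box. The following is a minimal sketch, not part of the CATH browser itself; the names `CathId` and `parse_cath_id` are hypothetical, and the sketch assumes that each of the four parts is a plain non-negative integer.

```python
import re
from typing import NamedTuple

# Assumption: a full CATH ID is exactly four dot-separated integers,
# e.g. "1.10.490.10" (Class.Architecture.Topology.Homologous superfamily).
_CATH_ID_PATTERN = re.compile(r"(\d+)\.(\d+)\.(\d+)\.(\d+)")


class CathId(NamedTuple):
    """Hypothetical container for the four CATH levels."""
    class_: int
    architecture: int
    topology: int
    superfamily: int


def parse_cath_id(text: str) -> CathId:
    """Split a four-number CATH ID into its C, A, T and H parts."""
    match = _CATH_ID_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a four-number CATH ID: {text!r}")
    return CathId(*(int(part) for part in match.groups()))


if __name__ == "__main__":
    print(parse_cath_id("1.10.490.10"))
```

An input with fewer or more than four numbers, such as "1.10.490", raises a `ValueError` in this sketch.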

After locating the CATH group of interest in the browser, users can view the number of PDB structures in this group. Clicking the number listed next to the group name launches a search for the PDB structures that contain the CATH domain of interest.

Example

Browse the PDB for structures that have a globin fold as follows:

Navigate through the tree and its branches for "mainly alpha" >> "orthogonal bundles" >> "globin-like" and "Globins", OR
Type Globin in the search box at the top of the page and select "globin-like" from the options, OR
Type the four-number CATH ID 1.10.490.10 in the search box at the top of the page
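The tree path and the CATH ID in this example describe the same location in the hierarchy. The sketch below makes that correspondence explicit by expanding an ID into the chain of nodes from Class down to Homologous superfamily. The function name `cath_lineage` is hypothetical, the sketch assumes that a shortened ID such as 1.10 denotes the parent node at that level, and the node names are simply those quoted in the browsing steps above.

```python
from typing import List


def cath_lineage(cath_id: str) -> List[str]:
    """Return the IDs of every node on the path to the given CATH ID."""
    parts = cath_id.split(".")
    # Build "1", "1.10", "1.10.490", ... by joining successively longer prefixes.
    return [".".join(parts[: depth + 1]) for depth in range(len(parts))]


# Names as quoted in the browsing example; the pairing with partial IDs is an assumption.
EXAMPLE_NAMES = ["mainly alpha", "orthogonal bundles", "globin-like", "Globins"]

if __name__ == "__main__":
    for node_id, name in zip(cath_lineage("1.10.490.10"), EXAMPLE_NAMES):
        print(f"{node_id:<12} {name}")
```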

Please report any broken links to info@rcsb.org

Last updated: 6/25/2024