Membrane Protein Resources
What are membrane proteins?
Membranes define cellular and organellar boundaries and are composed of phospholipid bilayers. Membrane proteins are either embedded in or associated with the phospholipid bilayer. Membrane proteins are crucial for cell survival and communication across membranes, serving as molecular transporters, signal receptors, ion channels, and enzymes. Although membrane proteins are encoded by roughly one fifth of human genes, they account for half of all drug targets.
Unlike soluble globular proteins, membrane proteins have hydrophobic amino acid side chains exposed on their surfaces so that they can associate with and embed in hydrophobic phospholipid bilayers. However, when these proteins are separated from lipids they often tend to aggregate and/or precipitate. As a result, experimentally determined structures of membrane proteins are underrepresented in the PDB archive. However, recent improvements in experimental design (e.g., use of cryo-electron microscopy and inclusion of detergents, lipid molecules, vesicles, and nanodiscs) provide a wealth of new possibilities for membrane protein structure determination.
Why is it important to learn about membrane proteins?
Since the membrane-associated and membrane-traversing regions of membrane proteins have distinct properties, recognizing these regions in the protein can help us understand the stability and functions of membrane proteins. For example, learning which parts of the protein face the cytosol and which parts are extracellular can help us recognize the ligand-binding and signaling domains of a membrane receptor. In general, membrane proteins can be classified based on their different spatiotemporal characteristics:
- Classifications based on duration of membrane association:
- Integral membrane proteins are permanently attached to a lipid bilayer, either embedded in or anchored to the membrane.
- Peripheral membrane proteins form transient complexes with the membrane or with integral membrane proteins.
- Classifications based on mode of membrane association:
- Transmembrane proteins traverse the membrane layer at least once: bitopic proteins cross it once, and polytopic proteins cross it multiple times.
- Monotopic membrane proteins are attached to a single side of the lipid bilayer. They can be covalently bound to lipid molecules or interact with the membrane via amphipathic alpha helices, hydrophobic loops, or electrostatic interactions.
How are membrane protein structures identified in the PDB?
Although membrane proteins are composed of helices and sheets, they have some unique features and properties that distinguish them from soluble proteins. PDB structures of these proteins often do not include a lipid bilayer, so the membrane-binding regions have to be manually and/or programmatically annotated by experts. Following membrane protein annotation, these proteins can be organized in various ways based on their membrane-associated regions.
Four external resources are used to annotate entries in the PDB archive as membrane proteins:
- OPM: Orientations of Proteins in Membranes database (Lomize, 2012): This classification is derived from SCOP and the Transporter Classification Database (TCDB). A custom transfer energy function is used to determine the position of the lipid bilayer.
- PDBTM: Protein Data Bank of Transmembrane Proteins (Kozma, 2012): This resource is based on the TMDET algorithm (Tusnády, 2004) that detects membrane proteins by their 3D structure.
- MemProtMD: A database of membrane proteins embedded in lipid bilayers (Newport, 2019): This automatic annotation pipeline identifies integral alpha-helical domains as well as beta barrels based on sequence features and then determines protein-lipid interactions using molecular dynamics simulations.
- mpstruc: Membrane Proteins of Known Structures (White, 2009): The structures in this browser are manually curated based on literature surveys.
Each of these resources uses a variety of measures and strategies for annotating membrane proteins (reviewed in Shimizu, 2018). Some membrane protein structures are unanimously annotated by all four external resources. Depending on the annotation strategy used, some membrane protein structures are only annotated by a single resource. For example, OPM covers a substantially higher fraction of peripheral membrane proteins than are represented in other resources. By contrast, MemProtMD specifically considers integral membrane proteins.
Different types of membrane protein information are derived from each of the four resources (see Table 1). All four external membrane classification resources provide PDB entry-level information. These data are mapped to individual entities in these entries through cross-referencing with relevant UniProt features or keywords (‘transmembrane region’, ‘intramembrane region’, ‘signal peptide’, ‘transit peptide’, ‘membrane’, and ‘cell membrane’). The OPM and mpstruc resources provide detailed classifications of membrane proteins, and OPM and PDBTM provide information about the specific sequence segments that are membrane-associated. Both the classifications and membrane protein segment annotations are integrated into rcsb.org and can be accessed there. In addition, OPM, PDBTM, and MemProtMD provide their own visualization of the membrane layer relative to the protein.
Table 1: Types of information provided by OPM, PDBTM, MemProtMD, and mpstruc are summarized in the table below.
|Type of Information|OPM|PDBTM|MemProtMD|mpstruc|
|---|---|---|---|---|
|Detailed Classification of Annotated Membrane Proteins|Integrated into rcsb.org; presented as Browse Trees|-|-|Integrated into rcsb.org; presented as Browse Trees|
|Membrane-Associated Sequence Segments|Integrated into rcsb.org; presented as entity-level information|Integrated into rcsb.org; presented as entity-level information|-|-|
|3D Visualization of Membrane Layer in the Context of the Protein|Available in external resource|Available in external resource|Available in external resource|-|
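The entry-to-entity mapping described above can be illustrated with a small Python sketch. This is not the RCSB pipeline: the function name `membrane_entities`, the input layout, and the example entity IDs are hypothetical, and only the six feature/keyword terms are taken from the text.

```python
# Feature types and keywords named in the text above (lowercased for comparison).
MEMBRANE_TERMS = {
    "transmembrane region",
    "intramembrane region",
    "signal peptide",
    "transit peptide",
    "membrane",
    "cell membrane",
}


def membrane_entities(entry_is_membrane, entity_terms):
    """Return entity IDs that inherit an entry-level membrane annotation.

    entity_terms maps a (hypothetical) entity ID to the UniProt feature
    types or keywords attached to that entity.
    """
    if not entry_is_membrane:
        return []
    selected = []
    for entity_id, terms in entity_terms.items():
        normalized = {t.strip().lower() for t in terms}
        # An entity qualifies if at least one of its terms is membrane-related.
        if normalized & MEMBRANE_TERMS:
            selected.append(entity_id)
    return sorted(selected)


# Hypothetical entry with three entities; the values are illustrative only.
example = {
    "1": ["Kinase", "Cytoplasm"],
    "2": ["Transmembrane region", "Receptor"],
    "3": ["Cell membrane"],
}
print(membrane_entities(True, example))
```

In this toy input, only the entities carrying at least one of the listed terms are selected, and nothing is selected when the entry itself is not annotated as a membrane protein.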
How to access membrane protein information?
Various annotations and classifications from external resources can help one identify membrane proteins in the PDB, learn about the membrane-associated sequence segments, and visualize in 3D the position of the membrane with reference to the protein(s).
Use the Advanced Search system to search for OPM, PDBTM, MemProtMD, or mpstruc annotations and display all entries annotated as membrane proteins by one of these resources.
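For programmatic use, a similar search might be expressed as a JSON query for the RCSB Search API. The sketch below only builds the request body and does not contact any server. The attribute name `rcsb_polymer_entity_annotation.type` and the use of the resource names as exact-match values are assumptions that should be checked against the current Search API documentation; the helper `build_membrane_query` is hypothetical.

```python
import json

# Resource names as they appear in this document.
RESOURCES = ("OPM", "PDBTM", "MemProtMD", "mpstruc")

# ASSUMPTION: attribute name; confirm against the Search API attribute list.
ANNOTATION_ATTRIBUTE = "rcsb_polymer_entity_annotation.type"


def build_membrane_query(resource):
    """Build a request body asking for entries annotated by one resource."""
    if resource not in RESOURCES:
        raise ValueError(f"unknown resource: {resource!r}")
    return {
        "query": {
            "type": "terminal",
            "service": "text",
            "parameters": {
                "attribute": ANNOTATION_ATTRIBUTE,
                "operator": "exact_match",
                "value": resource,
            },
        },
        # Ask for PDB entry IDs rather than entity or assembly IDs.
        "return_type": "entry",
    }


# Serialize the body; sending it to the search endpoint is left to the reader.
payload = json.dumps(build_membrane_query("OPM"), indent=2)
print(payload)
```

Keeping the query construction separate from the network call makes it easy to inspect or adjust the body before it is submitted.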
Browsing Membrane Protein trees
The ‘Browse Annotations’ feature allows the identification of membrane proteins in the PDB using annotations and classifications from the OPM and mpstruc resources. Hierarchies from both of these resources are also presented on the Annotations pages of classified membrane proteins.
Membrane annotations on the Structure Summary Page
The structure summary page contains a special 'Membrane Protein' remark if an entry is a membrane protein. This link points to the Annotations page of the entry. For each membrane protein entity in the structure, a dedicated ‘Membrane Entity’ tag is featured in the macromolecules section of the page. External links to OPM, PDBTM, and MemProtMD are available via the annotations page and provide details on the corresponding membrane protein entries.
Membrane annotations in the 1D Sequence View
OPM and PDBTM provide information on membrane-associated segments at the sequence level. This information is available in the Protein Feature View, together with annotations from UniProt on transmembrane and intramembrane regions.
Annotations might be missing from the Protein Feature View if an external resource uses non-standard chain identifiers (as is the case for OPM and PDB ID 6k33). Minor differences between sequence positions reported by UniProt, OPM, and PDBTM are expected.
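One way to quantify such differences is to compare the residue positions covered by two sets of segments. The following sketch computes a Jaccard overlap; the segment boundaries are invented for illustration and do not come from any real entry, and the helper names are hypothetical.

```python
def residues(segments):
    """Expand inclusive (start, end) segments into a set of positions."""
    positions = set()
    for start, end in segments:
        positions.update(range(start, end + 1))
    return positions


def jaccard(segments_a, segments_b):
    """Fraction of annotated positions shared by both segment lists."""
    a, b = residues(segments_a), residues(segments_b)
    if not a and not b:
        return 1.0  # two empty annotations agree trivially
    return len(a & b) / len(a | b)


# Hypothetical membrane segments from two resources for the same chain.
segments_one = [(10, 30), (45, 66)]
segments_two = [(12, 31), (44, 65)]
print(round(jaccard(segments_one, segments_two), 3))
```

A value close to 1 indicates that the two annotations differ only at the segment boundaries, which is the kind of minor discrepancy described above.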
Predicting and visualizing membrane position in the 3D view (in Mol*)
The structure summary page for a membrane protein includes a link to a special Mol* visualization of its predicted membrane location and orientation as calculated from its 3D structure using the ANVIL (Assignment aNd VIsualization of the Lipid bilayer, Postic, 2016) algorithm. ANVIL is a simplified version of the TMDET algorithm (Tusnády, 2004), the algorithm used by PDBTM. Mol* incorporates an implementation of the ANVIL algorithm that simulates the membrane as the space between two translucent circular plane segments. It is important to note that this is a prediction which is not backed by experimental data.
ANVIL classifies the 20 canonical amino acids based on their hydrophobicity, which determines their propensity to be embedded in a membrane. The algorithm focuses on 'exposed' residues that would interact with solvent or membrane, identified and filtered according to their solvent-accessible surface area. An optimal membrane location is assumed to be one that embeds the maximum number of exposed hydrophobic residues while excluding the maximum number of exposed hydrophilic residues. Based on this assumption, ANVIL iteratively optimizes its prediction of membrane location and thickness.
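The optimization idea can be illustrated with a deliberately simplified Python sketch. It is not the ANVIL implementation or the Mol* code: it assumes a fixed membrane normal (the z axis), takes each exposed residue as a hypothetical `(z, is_hydrophobic)` pair, and uses a plain count as the score instead of the objective function defined by the published algorithm.

```python
def slab_score(residues, center, thickness):
    """Count exposed residues that are placed 'correctly' by a slab.

    A hydrophobic residue counts when it lies inside the slab; a
    hydrophilic residue counts when it lies outside.
    """
    half = thickness / 2.0
    score = 0
    for z, hydrophobic in residues:
        inside = abs(z - center) <= half
        if inside == hydrophobic:
            score += 1
    return score


def best_slab(residues, centers, thicknesses):
    """Grid-search slab center and thickness; return (score, center, thickness)."""
    best = None
    for thickness in thicknesses:
        for center in centers:
            s = slab_score(residues, center, thickness)
            if best is None or s > best[0]:
                best = (s, center, thickness)
    return best


# Toy data (hypothetical): hydrophobic residues cluster around z = 0.
toy_residues = (
    [(z, True) for z in (-12, -8, -4, 0, 4, 8, 12)]
    + [(z, False) for z in (-25, -20, 20, 25, 30)]
)
print(best_slab(toy_residues, range(-10, 11, 5), (20, 30, 40)))
```

A real implementation would also search over membrane orientations and refine the thickness iteratively, as described above.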
One shortcoming of the algorithm is that it only considers polymer residues and ignores lipids modeled in the structure, leading to imperfect predictions such as that for PDB ID 2xtv. The prediction by ANVIL also tends to be flawed for proteins determined by NMR (e.g., PDB ID 5x29); in such cases, switching to a different NMR model may provide better results. Another interesting case occurs for bacterial efflux pumps (e.g., PDB ID 5v5s) that can traverse more than one membrane (see coloring of its PDBTM entry).
The OPM, PDBTM, and MemProtMD resources also provide their own membrane orientation data and visualizations. Users can visit those external resources for details.
The PDB entry 3sn6 features the human chimeric β2-adrenergic receptor in complex with G-proteins and a nanobody. The receptor itself is a chimeric molecule (meaning there is a bacterial protein fused to the human protein at the N-terminus to increase solubility and stability). However, the focus of the study (and discussion here) is the human receptor protein, which has transmembrane helices (see Figure 1). A variety of information related to membrane associations can be accessed:
- Clicking on the hyperlinked "Yes" next to the Membrane Proteins row at the top of the page opens the Annotations tab of the structure summary page, where additional details from the membrane-related external resources are provided (Figure 3). Clicking on the hyperlinked "Yes" in the Macromolecules section for specific entities opens the same Annotations tab (see Figure 2).
- Clicking on any of the orange boxes with the names of external resources opens the specific resource page with additional information.
- Clicking on the hyperlinked words "Predict Membrane" below the image of the structure (see Figure 5) opens a dedicated 3D visualization page with the membrane position marked (Figure 6).
|Figure 1: Header of the Structure Summary page of PDB ID 3sn6 showing, in red outlined boxes, options for learning more about the membrane annotations of this structure.|
On the Structure Summary page the receptor protein is Entity 4 and shows a dedicated 'Membrane Entity' property (Figure 2). This indicates that this entity is annotated as transmembrane or membrane-associated by OPM, PDBTM, MemProtMD, or mpstruc.
|Figure 2: Information for Entity 4 which indicates the presence of annotations from OPM, PDBTM, MemProtMD, or mpstruc and links to them in the Annotations tab of the Structure Summary page.|
Clicking the hyperlinked 'Yes' next to the "Membrane Protein" row at the top of the Structure Summary page opens the Annotations tab of the structure summary page (Figure 3). This view provides details on the membrane protein annotations available for the entry (as well as all other annotations).
Clicking on the Sequence tab on the Structure Summary page and selecting Entity 4 allows viewing of the amino acid sequence of this entity with residue-level annotations derived from a variety of resources. Membrane segment annotations from OPM and PDBTM are included for classified proteins.
Any entry annotated as a Membrane Protein includes a link to a special Mol* viewer that predicts and visualizes membrane location and orientation. The membrane location prediction can be visualized by clicking on the Predict Membrane option below the thumbnail image of the structure (Figure 5). Please note that the link will be specific to the displayed assembly. Use the arrows at the top to switch to another assembly (if applicable) or to the asymmetric unit.
|Figure 5: Option to visualize predicted membrane position in a Membrane protein entity: Click on the highlighted button on the Structure Summary page.|
Clicking on the "Predict Membrane" button opens the Mol* view of the selected asymmetric unit or Biological Assembly and visualizes the membrane orientation predicted by the ANVIL algorithm as two translucent circular plane segments (see Figure 6). The space between the two membrane planes represents the location of the membrane. The amino acids are colored by their hydrophobicity value. Hovering over an individual residue reports its identity in the bottom right corner and also highlights it in the sequence panel at the top of the 3D canvas. Click the ‘Membrane Orientation’ component to focus on or hide the membrane visuals.
It is important to note that the membrane position shown here is a prediction which is not backed by experimental data. No prediction is available for monotopic/peripheral proteins that do not traverse the membrane layer or for tiny structures that contain fewer than 15 residues.
|Figure 6: A dedicated 3D visualization page in Mol* with the predicted membrane position marked in the PDB entry, shown by two transparent circular plane segments. Visual parameters can be changed using the Mol* options.|
- Bittrich, S., Rose, Y., Segura, J., Lowe, R., Westbrook, J. D., Duarte, J. M., & Burley, S. K. (2021). RCSB Protein Data Bank: improved annotation, search and visualization of membrane protein structures archived in the PDB. Bioinformatics, btab813, doi: 10.1093/bioinformatics/btab813.
- White, S. H. (2009). Biophysical dissection of membrane proteins. Nature, 459(7245), 344-346, doi: 10.1038/nature08142.
- Newport, T. D., Sansom, M. S. P., & Stansfeld, P. J. (2019). The MemProtMD database: a resource for membrane-embedded protein structures and their lipid interactions. Nucleic acids research, 47(D1), D390-D397, doi: 10.1093/nar/gky1047.
- Lomize, M. A., Pogozheva, I. D., Joo, H., Mosberg, H. I., & Lomize, A. L. (2012). OPM database and PPM web server: resources for positioning of proteins in membranes. Nucleic acids research, 40(D1), D370-D376, doi: 10.1093/nar/gkr703.
- Kozma, D., Simon, I., & Tusnady, G. E. (2012). PDBTM: Protein Data Bank of transmembrane proteins after 8 years. Nucleic acids research, 41(D1), D524-D529, doi: 10.1093/nar/gks1169.
- Shimizu, K., Cao, W., Saad, G., Shoji, M., & Terada, T. (2018). Comparative analysis of membrane protein structure databases. Biochimica et Biophysica Acta (BBA)-Biomembranes, 1860(5), 1077-1091, doi: 10.1016/j.bbamem.2018.01.005.
- Postic, G., Ghouzam, Y., Guiraud, V., & Gelly, J. C. (2016). Membrane positioning for high-and low-resolution protein structures through a binary classification approach. Protein Engineering, Design and Selection, 29(3), 87-92, doi: 10.1093/protein/gzv063.
- Tusnády, G. E., Dosztányi, Z., & Simon, I. (2004). Transmembrane proteins in the Protein Data Bank: identification and classification. Bioinformatics, 20(17), 2964-2972, doi: 10.1093/bioinformatics/bth340.