RCSB PDB Help

Basic Search


Video: Simple Searches for Experimental Structures and Computed Structure Models (CSMs)


Overview

What is Basic Search?

Located in the top bar of the website, Basic Search finds macromolecular structures (experimental and integrative structures from the PDB archive, and Computed Structure Models or CSMs from AlphaFold DB and ModelArchive) that match the keywords or phrases you enter, much like a standard search engine.

Why use Basic Search?

Use this search option to launch a text-based search for a biomolecular structure, or a sequence-based search, and find matches in PDB entries. Basic Search may be run on both PDB structures and CSMs (when Include CSM is enabled). Searches may be launched with:

  • names of proteins, genes, authors, ligands, keywords, etc. that are present in the structure.
  • specific identifiers related to the structure of interest (e.g., PDB IDs), gene/protein sequences (i.e., GenBank and UniProt IDs), and ligands (e.g., chemical component or BIRD molecule IDs). Note: When “Include CSM” is enabled, CSM identifiers (e.g., AlphaFold IDs, ModelArchive IDs) available from the literature or other public data resources may also be used here.
  • a polymer sequence that has >25 residues (e.g., amino acids, nucleotides).

Documentation

Here are some tips for executing a Basic Search:

  • Full-text search:
    • You can type a word or phrase into the top-bar search box and click the Search icon or press Enter. This runs a full-text search across multiple fields in the database to find entries that match your query.
    • The Basic Search is designed to be broad and inclusive. It searches text-based information across multiple fields in the archival PDBx/mmCIF data, as well as annotations and metadata from external resources mapped to PDB structures.
    • The search looks for your terms across many different fields, so a word may match in more than one place. This can lead to results that are related in different ways. For example, searching for citrate may return enzymes with “citrate” in their names (e.g., citrate synthase) as well as structures that contain the citrate small molecule.
    • Basic Search supports a simple full-text query language that lets you use Boolean operators such as AND, OR, and NOT to refine your results:
      • By default, multiple keywords are combined using AND. For example, searching for Citrate Synthase will return entries that contain both terms. You can also use the plus sign (+) to require that both terms appear. For example, Citrate + Synthase. You can use the pipe symbol (|) to search with OR. For example, Citrate | Synthase will find entries that contain either term in any searchable field. Note: Typing the word OR in the search box is treated as a regular search term, not as a Boolean operator.
      • Putting words in quotation marks searches for the exact phrase. For example, "Citrate Synthase" will return entries where these words appear together in that order in one or more fields.
      • Placing a minus sign (-) before a word excludes it from the results. For example, -Citrate will return entries that do not contain that word.
      • Using parentheses ( ) lets you control the order in which search terms are applied. For example:
        • (Citrate + Synthase) | Ligase – finds entries that contain both Citrate and Synthase or entries that contain Ligase.
  • Auto-suggestion lists:
    • As you type query word(s) or phrases in the top-bar search box, a list of suggestions appears in a box below. Suggestions are grouped by attribute or field name, which indicates the specific field in which the search term was found.
    • Click on any term from the auto-suggest list to execute a search where the selected term matches the specified attribute.
    • Basic Search can produce a long list of auto-suggestions. Within each group, suggestions are organized alphabetically and only a few top matches are listed. Completing the word(s) in the query can shorten the lists and show more relevant matches. See also Advanced Search options to refine the query results.
  • Advanced Query Builder options:

A tabular summary of the symbols that can be used to combine search terms with Boolean operators:

Action                          Operator                Description                                                                        Example
OR                              Multiple keywords, |    Will find entries containing either Word1 or Word2.                                Citrate Synthase; Citrate | Synthase
AND                             + or plus sign          Will find entries containing both Word1 and Word2 anywhere in the entry.           Citrate + Synthase
NOT                             - or minus sign         Will find entries where Word1 is not found anywhere in the entry.                  -Citrate
Indicate order of search terms  ( ) or parentheses      Placing parentheses around search terms will indicate the order of the search.     -(Citrate+Synthase); -(Citrate | Synthase)
Search for a phrase             " " or quotation marks  Using quotes around a search term will find entries containing that exact phrase.  "Citrate Synthase"

Note: Searching for "-Citrate" with quotes will return entries containing the phrase -Citrate.
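
The operators summarized above can also be combined programmatically before pasting a query into the search box. The following is a minimal sketch of a string builder; the helper names (phrase, all_of, any_of, exclude) are hypothetical and are not part of any RCSB PDB tool. It only assembles text and does not contact the website.

```python
def phrase(text: str) -> str:
    """Wrap text in straight double quotes so it is searched as an exact phrase."""
    return f'"{text}"'


def all_of(*terms: str) -> str:
    """Join terms with + (AND) and wrap the group in parentheses."""
    return "(" + " + ".join(terms) + ")"


def any_of(*terms: str) -> str:
    """Join terms with | (OR) and wrap the group in parentheses."""
    return "(" + " | ".join(terms) + ")"


def exclude(term: str) -> str:
    """Prefix a term or parenthesized group with - (NOT)."""
    return f"-{term}"


# Entries that contain both Citrate and Synthase, or entries that contain Ligase.
query = any_of(all_of("Citrate", "Synthase"), "Ligase")
print(query)

# Exclude a whole group, and build an exact-phrase query.
print(exclude(any_of("Citrate", "Synthase")))
print(phrase("Citrate Synthase"))
```

Each group helper always adds its own parentheses, so nested expressions keep an explicit evaluation order at the cost of a few redundant brackets.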

Here are some tips for executing a Basic search:

  • This search employs an implied “contains words” or “contains phrase” strategy. If you enter a word or list of words and click the Search icon, the search is processed as “contains words” and returns results containing any of the words in the webpage, file, or metadata associated with it.
  • If a phrase is selected from the auto-suggest list, or entered within quotes (e.g., "set of words"), the search is processed as “contains phrase”.
  • Note that if no documents/pages match the query phrase, the query is automatically changed into a “contains words” search.
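
The phrase-then-words fallback described in these tips can be sketched with a toy in-memory search. This is an illustration of the logic only, assuming simple whitespace tokenization and made-up documents; it is not how the website is implemented.

```python
def contains_phrase(text: str, phrase: str) -> bool:
    """True if the phrase occurs verbatim (case-insensitive) in the text."""
    return phrase.lower() in text.lower()


def contains_words(text: str, words: list[str]) -> bool:
    """True if any of the words occurs as a whole token in the text."""
    tokens = set(text.lower().split())
    return any(word.lower() in tokens for word in words)


def basic_search(docs: dict[str, str], query: str) -> list[str]:
    """Run a phrase search for quoted queries, falling back to a words search."""
    query = query.strip()
    quoted = len(query) >= 2 and query.startswith('"') and query.endswith('"')
    if quoted:
        phrase = query[1:-1]
        hits = [doc_id for doc_id, text in docs.items() if contains_phrase(text, phrase)]
        if hits:
            return hits
        # No document matched the phrase: fall back to "contains words".
        words = phrase.split()
    else:
        words = query.split()
    return [doc_id for doc_id, text in docs.items() if contains_words(text, words)]


# Made-up documents for illustration; these are not PDB entries.
docs = {
    "doc1": "citrate synthase from pig heart",
    "doc2": "ATP citrate lyase",
    "doc3": "glutamine synthetase",
}

print(basic_search(docs, '"citrate synthase"'))   # phrase match only
print(basic_search(docs, '"synthase citrate"'))   # no phrase match, falls back to words
```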

Search results

The search results are listed as structures, entities, assemblies, or molecular definitions that match the query. By default, the search results are ordered by a relevance score calculated for the query.

Relevancy Scoring

The text-search functionality is powered by Elasticsearch, open-source software that enables the construction and execution of highly customizable and complex queries to retrieve specific results relevant to the research question. By default, search results are sorted by a "relevancy score" that Elasticsearch calculates. The score takes into account how often the search term(s) occur in different fields of each result (e.g., whether the query word or phrase appears in the title, description, or organism), along with how closely the search term(s) match the terms in those fields. The output of this scoring process is a ranked set of results in which those with higher relevancy scores are listed first. More details about how this search algorithm works may be found in this Elastic blog post (specifically, see the section titled "How documents are ranked in Elasticsearch").
In addition to relevancy scoring, several other options to reorder the results are available, e.g., by release date, by structure quality, or by showing experimental structures first or last. Note that depending on the sorting option selected, CSMs may be listed at the top of the results page. Scroll through all the results and/or adjust your query and sorting criteria to identify the structures that meet your needs.
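
To give a feel for how term frequency can drive a relevancy ranking, here is a minimal sketch of BM25-style scoring, the family of formulas Elasticsearch uses by default. It is a simplified single-field illustration: the documents are made up, the parameter values k1 = 1.2 and b = 0.75 are common defaults assumed here, and field weighting, text analysis, and the actual RCSB PDB configuration are not modeled.

```python
import math
from collections import Counter

K1 = 1.2   # term-frequency saturation (assumed default)
B = 0.75   # strength of document-length normalization (assumed default)


def bm25_scores(docs: dict[str, str], query: str) -> dict[str, float]:
    """Score every document against the query with a basic BM25 formula."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized.values()) / n_docs
    scores = {doc_id: 0.0 for doc_id in tokenized}
    for term in query.lower().split():
        # Rare terms (found in few documents) get a larger weight.
        doc_freq = sum(1 for tokens in tokenized.values() if term in tokens)
        idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
        for doc_id, tokens in tokenized.items():
            tf = Counter(tokens)[term]
            # Longer-than-average documents are penalized via the norm term.
            norm = tf + K1 * (1 - B + B * len(tokens) / avg_len)
            scores[doc_id] += idf * tf * (K1 + 1) / norm
    return scores


# Made-up documents for illustration; these are not PDB entries.
docs = {
    "doc1": "citrate synthase",
    "doc2": "citrate synthase complexed with citrate and coenzyme a",
    "doc3": "glutamine synthetase",
}

scores = bm25_scores(docs, "citrate synthase")
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy data set the short document that consists only of the query terms ranks above the longer one, even though the longer one mentions "citrate" twice, because the formula normalizes for document length.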

Examples

  1. Basic search for “allosteric regulator”.
  2. Basic search for “allosteric regulator” (PDB structures and CSMs).
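
The two example searches could also be expressed as request bodies for the RCSB PDB Search API. The sketch below only builds and prints the JSON; it does not send anything. The endpoint URL and the field names (service "full_text", "return_type", "results_content_type") are assumptions based on the publicly documented Search API and should be checked against its current documentation before use. The helper name full_text_payload is hypothetical.

```python
import json

# Assumed endpoint of the RCSB PDB Search API; verify before use.
SEARCH_URL = "https://search.rcsb.org/rcsbsearch/v2/query"


def full_text_payload(text: str, include_csm: bool = False) -> dict:
    """Build a full-text query body, optionally including CSMs."""
    content_types = ["experimental"]
    if include_csm:
        content_types.append("computational")
    return {
        "query": {
            "type": "terminal",
            "service": "full_text",          # assumed service name
            "parameters": {"value": text},
        },
        "return_type": "entry",              # assumed: return structure-level hits
        "request_options": {"results_content_type": content_types},
    }


# Example 1: PDB structures only. Example 2: PDB structures and CSMs.
pdb_only = full_text_payload("allosteric regulator")
with_csm = full_text_payload("allosteric regulator", include_csm=True)
print(json.dumps(with_csm, indent=2))
```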


Please report any encountered broken links to info@rcsb.org
Last updated: 12/9/2025