Documentation

This help page explains what each page of the site is designed for and how you can navigate through it.

Search
Propensity
Integration
Download
About

Search

On this page you can search for proteins using a UniProt-based Protein Identifier or a UniProt-based Accession Number. Upon accessing the page, you will be presented with this screen:

Enter the identifier or accession number in [1], or click on either of the examples in [2]. As you type, the search bar suggests similar identifiers fetched from our database. Once you enter an ID that is present in our database, the following information is shown below:

First, general information about the input protein is displayed. [1] and [2] are hyperlinks to their respective UniProt pages. The following information is displayed:

UniProt Protein Identifier (UniProt ID)
UniProt Accession Number (UniProt AC)
Protein Name
Gene Name
Organism
Sequence Length
Subcellular Localizations
Protein Function
Protein Sequence

Scrolling down, you will see the following information:

[1], [2], and [3] respectively show the number of experimentally verified PTM sites for the protein, the number of unique PTMs experimentally observed on this protein, and the total number of literature references recorded across all PTMs. [4] shows the overall summary of all PTMs observed on all residues for the input protein. Further down, detailed information about the PTMs of the protein is given, starting with the PTM sequence:

This shows some UniProt-enabled information about the protein sequence with the option to download the PTM data as a JSON file (see [1]). The actual retrieved sequence is displayed just below the general sequence information:

The PTMs shown on the sequence are colour-coded for convenience (as shown in [1]). Additionally, you can use the checkboxes ([2]) to hide or show certain PTMs in the list. By default, all of them will be enabled.

You can hover over a highlighted amino acid/residue (like [1]) to see what PTM is occurring on it. When you click on the highlighted residue, detailed information about the PTM will be shown in [3]:

A localised sequence of 21 amino acids is shown in [1], with the position of the center amino acid given in [2]. Below them is the dbPTM-annotated PTM information about the amino acid. If applicable, a RESID Database ID is also provided alongside the PTM type ([3]). For the PTM, evidence identifiers ([4]) are provided as hyperlinks to their respective PubMed articles. The Log Sum and Log Log Product scores (if applicable) are shown as well (see [6]). If you want to know how these scores are calculated, click on the question mark ([5]), which leads to a new page explaining the calculation (see the Propensity page info below).
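The 21-residue localised sequence described above can be sketched as a window of 10 residues on either side of the selected position. This is only an illustration: the function name `local_window` and the use of `-` as padding near the ends of the protein are assumptions, and the site may handle sequence termini differently.

```python
def local_window(sequence: str, position: int, flank: int = 10, pad: str = "-") -> str:
    """Return the residues within `flank` positions of a 1-based `position`.

    Positions that fall outside the sequence are filled with `pad`
    (an assumption made for this sketch).
    """
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    centre = position - 1  # convert the 1-based position to a 0-based index
    window = []
    for i in range(centre - flank, centre + flank + 1):
        window.append(sequence[i] if 0 <= i < len(sequence) else pad)
    return "".join(window)


# Example with a made-up sequence: the window is always 2 * flank + 1 long.
example = local_window("ACDEFGHIKLMNPQRSTVWYACDEF", 11)
print(len(example), example[10])
```

With the default `flank` of 10, the center residue always sits at index 10 of the returned string.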

This is the base information gathered for the proteins. Extra information is also provided by scrolling down:

We also provide secondary structure predictions by JPred to give an idea of how and why a PTM is occurring on the amino acid. Since JPred handles requests as scheduled jobs, we submit the request when you search for the protein and display a notice asking you to wait until JPred's server responds. Once the job is completed, we fetch and process the results.

Since JPred also provides hyperlink-embedded alignments, we have shown them as a separate page ([1]). As for the prediction, JPred's own format is followed. Upon clicking on a PTM ([2]), the entire column is highlighted to make it easier for the user to see the secondary structure, any coils, the residue burial percentage, and the confidence scores. Finally, all of the aforementioned information can be downloaded as a JSON file ([3]).

We also present predicted and experimentally verified structures for the protein, retrieved from AlphaFoldDB and the RCSB database respectively:

[1] and [2] are hyperlinks to the respective databases for more extensive information. We use 3Dmol.js to view the PDB structures. Buttons are provided to cycle through three different styles, clear all selections, and re-center the render should you lose sight of the structure. You can interact with the structure by clicking on a residue, which displays a label (see [3]) containing the position of the residue in the sequence, the 3-letter code for the residue, all PTMs that have been observed on the residue, the Solvent Accessible Surface Area (SASA) in square Angstroms (Å²), calculated using the Shrake-Rupley algorithm, and the DSSP-calculated secondary structure. The letter after DSSP denotes the simplified structure, while the letter inside the parentheses denotes the detailed structure of the residue, if any. Additionally, for the RCSB database, since all PDBs are queried using the UniProt Accession Number, you can select one of the possibly many matching RCSB PDBs in [4].

Finally, the information can also be viewed in a tabular form:

The sequence, simplified DSSP calculations, detailed DSSP calculations, and the Shrake-Rupley SASA values are given for both the AlphaFoldDB PDB and the RCSB PDB. These can be downloaded individually as a JSON file.

This covers the search page functionality.

Propensity

We offer an interface to calculate the Propensity, in other words, the tendency of a residue to undergo a certain PTM based on its neighbouring residues. This tool provides a calculator that estimates the chances of a PTM occurring on a residue, backed by thousands of experimentally verified observations of PTMs.

Here is the Propensity page interface:

In [1], enter the protein subsequence for which you want to calculate the Propensity. The length of this subsequence must be between 13 and 21 and must be odd, so that the center residue has an equal number of upstream and downstream residues. The counter ([3]) on the right side of the subsequence input shows how long your subsequence is.

In [2], enter the PTM type for which you want to calculate the Propensity. The list of PTMs is offered as suggestions while you type.

Once you have entered both required inputs, click on the "Calculate" button below. You will be presented with something like this:
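The length rule above can be expressed as a small check. This is a minimal sketch; the function name `validate_subsequence` is hypothetical and is not part of the site's code.

```python
def validate_subsequence(subsequence: str) -> int:
    """Check the length rule and return the 0-based index of the center residue."""
    length = len(subsequence)
    if not 13 <= length <= 21:
        raise ValueError("subsequence length must be between 13 and 21")
    if length % 2 == 0:
        raise ValueError("subsequence length must be odd")
    # An odd length leaves the same number of residues on each side of the center.
    return length // 2


# A 15-residue subsequence has 7 residues upstream and 7 downstream of index 7.
print(validate_subsequence("ACDEFGHSKLMNPQR"))
```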

[1] shows the input subsequence, with the residue in red denoting the center of the subsequence, i.e. the residue for which the Propensity is being calculated. [2] shows the Log Sum and the Log Log Product Propensity scores. These give an estimate of the tendency of a residue to undergo a PTM given its neighbouring residues. How these values are calculated is shown in [3]. The vector constructed in [4] is passed through the equation for each score. The vector in turn is constructed from the table in [5]. The redder a cell is, the higher its probability, and green cells denote the values that were picked for the residues at their positions relative to the PTM site.

For Log Sum, every value in [4] is added together, skipping any value that is -inf. This is a straightforward equation designed to show empirically how strong the tendency of a residue to undergo a PTM can be.
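As a minimal sketch of the Log Sum rule, assuming the vector is available as a list of floats in which missing entries are stored as negative infinity (the function name `log_sum` and the sample values are made up for illustration):

```python
import math


def log_sum(vector):
    """Add every entry of the vector, skipping entries equal to -inf."""
    return sum(value for value in vector if value != -math.inf)


# Made-up values: the -inf entry is ignored, the other three are added.
print(log_sum([-1.5, -math.inf, -0.5, -2.0]))  # -4.0
```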

For Log Log Product, the process is more involved than for Log Sum. First, the longest vector that contains no -inf values is taken. This vector must contain at least 13 values; otherwise, a NIL value is shown instead. Next, all of the values (excluding that of the residue whose Propensity we wish to calculate) are multiplied together, and the reciprocal of the product is taken (1 divided by the product). Finally, the natural logarithm of the resulting value gives the Log Log Product.
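The following sketch shows one possible reading of the Log Log Product steps; it is not the site's actual implementation. It assumes that the "longest vector" is the largest window centered on the PTM site that contains no -inf values, that NIL is represented by `None`, and that a non-positive product (for which the logarithm is undefined) also yields `None`. The name `log_log_product` is hypothetical.

```python
import math
from typing import Optional, Sequence

MIN_WINDOW = 13  # shortest window for which a score is reported


def log_log_product(vector: Sequence[float]) -> Optional[float]:
    """Return ln(1 / product of the flanking values), or None for NIL."""
    if len(vector) % 2 == 0:
        raise ValueError("vector must have an odd length")
    centre = len(vector) // 2
    flank = centre
    # Assumption: shrink the window symmetrically until it holds no -inf values.
    while flank >= 0 and any(
        value == -math.inf for value in vector[centre - flank:centre + flank + 1]
    ):
        flank -= 1
    if 2 * flank + 1 < MIN_WINDOW:
        return None  # NIL: fewer than 13 usable values
    window = vector[centre - flank:centre + flank + 1]
    # Multiply every value except the one at the center position.
    product = math.prod(value for i, value in enumerate(window) if i != flank)
    if product <= 0:
        return None  # assumption: the logarithm is undefined here
    return math.log(1.0 / product)


# Twelve flanking values of -1.0 multiply to 1.0, so the score is ln(1) = 0.0.
print(log_log_product([-1.0] * 13))
```

Because the window has an odd length and the center is excluded, an even number of values is multiplied, so a vector of negative values still produces a positive product.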

Integration

The Integration page covers the RESTful API side of the website and provides descriptions and examples for each API call. Five API calls are provided, each documented with the input it accepts, how it should be called, what it returns, and an example of how to execute it. However, to use the API, you must first create an account, which is paired with an authorization token. This is done for security purposes. We store only your username, your password hashed with the SHA256 algorithm, and the account's token. You must use this token for making API calls.

To access the API integration, first create your own account by clicking on the Login tab at the top of the page ([1]):

At the top, you can log in with your current user handle and password. At the bottom, you can sign up for an account using a unique username. When a new user is created, it is automatically assigned a new token. This token is valid for only 5 days from the time it was generated.

After logging in, you will see this at the top of the page:

Upon clicking on your username, you will be presented with multiple options:

You can click on "Copy Token to Clipboard" to copy the token assigned to your account and use it as a Bearer Authorization token. For more details on how to implement it, please refer to the Python starter code on making API requests. If your authorization token expires or you wish to reset it manually (it is also reset automatically), simply click on "Reset Token". Finally, the third button logs you out of your account.
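As a generic illustration of Bearer authorization (not a copy of the site's starter code), the token is sent in the `Authorization` header of each request. The URL below is a placeholder, not a real PTMKB endpoint, and `build_request` is a hypothetical helper; the actual routes and parameters are listed on the Integration page.

```python
import urllib.request

# Placeholder URL (an assumption): substitute a real endpoint from the Integration page.
API_URL = "https://example.org/api/placeholder"


def build_request(url: str, token: str) -> urllib.request.Request:
    """Attach the account token as a Bearer Authorization header."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


request = build_request(API_URL, "YOUR_TOKEN")
print(request.get_header("Authorization"))
# Passing the request to urllib.request.urlopen(request) would send it;
# that step is left out here because the URL above is only a placeholder.
```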

Download

This page gives you the option to download one of many Post-Translational Modification (PTM) positional matrices. Built from a large number of experimentally verified PTMs on proteins, these matrices let users find the probability of a PTM occurring on an amino acid. The data is scraped from the experimentally verified database provided by dbPTM and calculated for each PTM on each residue, with the relative positions of other residues accounted for.

As of the dbPTM 2025 update, 72 PTMs with experimentally supported protein sequences are available:
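To give a rough idea of what a positional matrix holds, the sketch below builds a generic position-specific frequency matrix from equal-length windows centered on a modified residue. The exact calculation used for the downloadable matrices is not described on this page, so the use of natural-log relative frequencies, the -inf value for unobserved residues, the function names, and the three toy windows are all assumptions made for illustration.

```python
import math
from collections import Counter


def positional_log_matrix(windows):
    """Map each relative position to {residue: ln(relative frequency)}."""
    length = len(windows[0])
    if length % 2 == 0 or any(len(w) != length for w in windows):
        raise ValueError("windows must share one odd length")
    flank = length // 2
    matrix = {}
    for offset in range(-flank, flank + 1):
        # Count which residues appear at this offset from the modified site.
        counts = Counter(w[offset + flank] for w in windows)
        total = sum(counts.values())
        matrix[offset] = {res: math.log(n / total) for res, n in counts.items()}
    return matrix


def lookup(matrix, offset, residue):
    """Return the log frequency, or -inf if the residue was never observed."""
    return matrix[offset].get(residue, -math.inf)


# Toy data: three made-up windows centered on a modified lysine (K).
toy = positional_log_matrix(["AKS", "GKS", "AKT"])
print(lookup(toy, 0, "K"), lookup(toy, 1, "W"))
```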

Acetylation	ADP-ribosylation	AMPylation	Amidation
Biotinylation	Blocked amino end	Butyrylation	Carbamidation
Carboxyethylation	Carboxylation	Cholesterol ester	Citrullination
C-linked Glycosylation	Crotonylation	Deamidation	Deamination
Decanoylation	Decarboxylation	Dephosphorylation	D-glucuronoylation
Disulfide bond	Farnesylation	Formation of an isopeptide bond	Formylation
Gamma-carboxyglutamic acid	Geranylgeranylation	Glutarylation	Glutathionylation
GPI-anchor	Hydroxyceramide ester	Hydroxylation	Iodination
Lactoylation	Lactylation	Lipoylation	Malonylation
Methylation	Myristoylation	N-carbamoylation	Neddylation
Nitration	N-linked Glycosylation	N-palmitoylation	Octanoylation
O-linked Glycosylation	O-palmitoleoylation	O-palmitoylation	Oxidation
Phosphatidylethanolamine amidation	Phosphorylation	Propionylation	Pyrrolidone carboxylic acid
Pyrrolylation	Pyruvate	S-archaeol	S-carbamoylation
S-Cyanation	S-cysteinylation	S-diacylglycerol	Serotonylation
S-linked Glycosylation	S-nitrosylation	S-palmitoylation	Stearoylation
Succinylation	Sulfation	Sulfhydration	Sulfoxidation
Sumoylation	Thiocarboxylation	Ubiquitination	UMPylation

Upon selecting a PTM, a residue, and the type of table, the table is displayed along with the option to download it as a CSV, PDF, PNG, SVG, or JSON file.

About

You can view the people working at the prestigious Biomedical Informatics & Engineering Research Laboratory (BIRL) and contact us with queries regarding PTMKB.