Results

There are three ways of viewing your results: the traditional score view, the taxonomy view and the domain architecture view. In the score view, the matched sequences are listed from best to worst scoring. In the taxonomy view, the matched sequences are arranged according to the taxonomic lineage of the source organism(s). In the domain architecture view, significant matches are grouped by their constituent Pfam domains. You can switch between the result views using the navigation buttons at the top of the page.

Navigation buttons example

Score view (default)

Sequence Matches

Searches can result in many thousands of matches. Returning large numbers of results across the web and rendering them as a table consumes a great deal of time and memory. For this reason, only the first 100 matches are returned by default, allowing immediate analysis of the top matches. The remaining results can be viewed by clicking on the pagination links found above and below the table. The range of matches currently selected is shown in the bottom right corner of the table. Rows in the sequence match table that have a yellow background indicate sequences that score above the reporting thresholds, yet below the inclusion or significance thresholds. All hits on such a sequence are therefore deemed insignificant, even if they score above the hit significance threshold. Rows that have a red background indicate sequences that score above the sequence significance/inclusion threshold, but where no single match exceeds the domain significance/inclusion thresholds.

The dark red line in the table provides a visual clue as to where the threshold lies in the results.
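The row highlighting rules described above can be summarised as a small decision function. This is only a sketch: the `SequenceHit` class, the function name and the threshold values are illustrative assumptions rather than the server's actual code, and every hit passed in is assumed to have already met the reporting threshold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SequenceHit:
    name: str
    seq_evalue: float            # E-value for the whole sequence
    domain_evalues: List[float]  # E-value of each individual match (domain)


def row_highlight(hit: SequenceHit,
                  seq_incl_e: float = 0.01,   # assumed sequence inclusion threshold
                  dom_incl_e: float = 0.03    # assumed domain inclusion threshold
                  ) -> Optional[str]:
    """Return 'yellow', 'red' or None (no highlight) for a reported hit."""
    # Reported, but the sequence itself is not significant.
    if hit.seq_evalue > seq_incl_e:
        return "yellow"
    # Sequence is significant, but no single match is.
    if not any(e <= dom_incl_e for e in hit.domain_evalues):
        return "red"
    return None
```

Lower E-values are better, so "scoring above a threshold" corresponds here to an E-value at or below the cut-off.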

Sequence matches

Clicking on the right-facing arrows (>) in the very first column of the table will reveal the alignment. The "show all" link in the table footer displays all hit alignments for the sequences shown on the page. (This is limited to tables of 100 rows or fewer.)

Alignments

At the end of each row in the sequence hit table there is a "show" link. Clicking on this link displays the maximum expected accuracy (MEA) alignment between the query and the target. For each hit between the query and targets there are five rows in the alignment:

Above the alignment the match details are presented:

There are then two E-values for the domain:


sequence alignment

There can be multiple hits per sequence because HMMER performs local-local searches (meaning any subsequence of the query model can align to any subsequence of the target sequence). These are shown sequentially, according to the position on the sequence. An alignment with a yellow background indicates a reported domain/hit that falls below the domain/hit significance threshold.

Note: In the case of hmmscan the query and target lines correspond to different data. The second line (previously query) is the 'Model' and the fourth line (previously target) is the 'query'.

Jackhmmer iterations

Iteration summary

After each jackhmmer iteration, rather than proceeding to the results page, you are taken to a summary page, which gives an overview of the number of gained, lost or dropped sequences. Gained sequences are those that are new compared to the previous iterations and score above the significance threshold. Lost sequences were previously significant but are no longer reported in the results. Dropped sequences were previously significant and have fallen below the threshold, but are still reported.
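The three categories can be expressed as set operations on the sequence identifiers from two consecutive iterations. The sketch below is an interpretation of the definitions above, not the server's implementation; the function name and arguments are hypothetical, and "gained" is taken to mean significant now but not significant in the previous iteration.

```python
from typing import Dict, Iterable, Set


def summarize_iteration(prev_significant: Iterable[str],
                        curr_significant: Iterable[str],
                        curr_reported: Iterable[str]) -> Dict[str, Set[str]]:
    """Classify sequences as gained, lost or dropped between two iterations.

    curr_reported is assumed to include every sequence in curr_significant.
    """
    prev_sig = set(prev_significant)
    curr_sig = set(curr_significant)
    curr_rep = set(curr_reported) | curr_sig

    gained = curr_sig - prev_sig                 # newly significant
    lost = prev_sig - curr_rep                   # no longer reported at all
    dropped = (prev_sig & curr_rep) - curr_sig   # still reported, not significant
    return {"gained": gained, "lost": lost, "dropped": dropped}
```

For example, with `{"A", "B", "C"}` previously significant, `{"A", "D"}` now significant and `{"A", "B", "D"}` now reported, D is gained, C is lost and B is dropped.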

Jackhmmer Summary

From this table it is possible to view the results of all previous iterations. Thus, if you decide that you want to re-run the latest iteration, you can simply go back one and add/remove sequences. Alternatively, if you are happy with the way searches are proceeding, trigger the next search, which will take all significant hits for the next iteration. If your job converges before 5 iterations (which is the current maximum), the table will be updated to indicate convergence and the run next iteration button will be removed.

Jackhmmer results

The results for jackhmmer are much the same as described above for phmmer, with a few additions. The first is the inclusion of some navigation at the top of the page. The "lost matches" link will show a table of the sequences that have been completely lost compared to the previous iteration. There are links to the first new match and to the page of results where the threshold appears. There are also grey buttons in this block that allow you to move between iterations.

Jackhmmer Navigation

Another difference is that each row in the results has a check box, which allows sequences to be either removed or added to the results (a checked box denotes that they will be used in the next iteration). This allows you to modify which sequences are included in successive rounds of jackhmmer. A button at the top and bottom of each page will allow you to start the next iteration.

Jackhmmer Matches

New sequences in the results are denoted with a green background behind the target accession/identifier. Sequences that have dropped below threshold compared to the previous iteration are shown with a red background behind the target accession/identifier.

HMM logos

Below the results table for hmmsearch and jackhmmer (after the first iteration, if started with a single sequence), you will find an HMM logo. This is a graphical representation of the profile HMM, with large letters representing more probable/conserved amino acids.

Customization of Results

The default sequence match table contains four information columns: Target (accessions and/or identifiers), Description (functional annotations), Species and E-value. Additional columns can be added by clicking on the "Customize" link at the top right of the table. This will reveal a form (shown below) that offers a range of custom display options.

Custom Form

Select Visible Columns

The columns that can be selected are:

Row Count
Numbers the rows of the table.
Secondary Accessions & Ids
Additional identifiers that the sequence may also be known as in the literature and other databases.
Description
The sequence description.
Species
Shows the species to which this sequence belongs and provides a link to the NCBI taxonomy Browser.
Kingdom
Shows the kingdom to which this sequence belongs.
Known Structure (PDB)
Shows whether a structure has been deposited in the PDB for some or all of the sequence, based on SIFTS.
Identical Sequences
As most of the target sequence databases contain some redundancy, we collapse identical sequences into a single row of the table. The redundant sequence information (accessions, description and species) is accessible by clicking the number found in the [ Identical Seqs ] column. This produces a pop-up table like the one shown below.
Number of Hits
The number of regions that score above the reporting threshold.
Number of Significant Hits
The number of regions that score above the inclusion threshold.
Bit Score
A bit score in HMMER is the base-2 log of the ratio of the sequence's probability according to the profile (the homology hypothesis) to the null model probability (the non-homology hypothesis).
Hit Positions

A graphical representation showing the location of the matches of the query sequence to the target. Below is an example of a query sequence (top) that has 2 regions matching 4 regions in the target sequence (bottom). Note that there are 3 hits colored red. These hits are all the same color because they are found in an overlapping region of the query sequence. The fourth hit is colored differently because it does not overlap any of the other hits. The query and target images are scaled relative to each other, so the query may scale differently from row to row in the table.
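As a worked illustration of the Bit Score column described in the list above, the log-odds ratio can be computed directly from the two probabilities. This is a minimal sketch: the function name is hypothetical, and taking natural-log probabilities as input is a choice made here to avoid numerical underflow, not something stated by the documentation.

```python
import math


def bit_score(log_p_profile: float, log_p_null: float) -> float:
    """Log-odds score in bits.

    Both arguments are natural-log probabilities of the same sequence:
    one under the profile (homology hypothesis) and one under the null
    model (non-homology hypothesis).
    """
    # Difference of logs equals the log of the ratio; divide by ln(2) for bits.
    return (log_p_profile - log_p_null) / math.log(2)
```

A sequence that is 8 times more probable under the profile than under the null model therefore scores 3 bits, and a sequence that is equally probable under both scores 0.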

Rows Per Page

In addition to column selection, you can also choose the number of rows to be displayed per page. The default value is currently set to 100 rows per page, which shows a reasonable amount of information without overloading your browser. While an "All" option is provided, it is recommended that an initial limit be set, as some searches produce a large number of results, which may crash your browser while the page is rendered.

The ability to show all hit alignments is disabled when more than 100 results are shown on the page.
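The relationship between the rows-per-page setting, the range of matches displayed and the "show all alignments" limit can be sketched as follows. The function names are hypothetical and the code only mirrors the behaviour described in this section.

```python
from typing import Optional, Tuple


def visible_range(total_hits: int, page: int,
                  rows_per_page: Optional[int] = 100) -> Tuple[int, int]:
    """Return the 1-based, inclusive range of matches shown on a page.

    rows_per_page=None stands for the "All" option.
    """
    if total_hits <= 0:
        return (0, 0)
    if rows_per_page is None:
        return (1, total_hits)
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * rows_per_page + 1
    if start > total_hits:
        raise ValueError("page is beyond the last match")
    end = min(start + rows_per_page - 1, total_hits)
    return (start, end)


def show_all_alignments_enabled(rows_on_page: int) -> bool:
    """Showing all alignments is only offered for pages of 100 rows or fewer."""
    return rows_on_page <= 100
```

With 250 matches and the default setting, the third page covers matches 201 to 250.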

Identical Sequences

As most of the target sequence databases contain some redundancy, identical sequences are collapsed into a single row of the table. The redundant sequence information (accessions, description and species) is accessible by clicking the number found in the [ Identical Seqs ] column. This will reveal a table like the one below, which shows information about the other identical sequences.

Duplicates

When more than 20 identical sequences are present, the "Next" link allows navigation through the list of redundant sequences.
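Collapsing redundant entries amounts to grouping records by their sequence string. The sketch below shows one way to do this; the record layout and the choice of the first record seen as the representative row are assumptions made for illustration, since the help text does not say how the representative is chosen.

```python
from typing import Dict, Iterable, List, Tuple

# (accession, description, species, sequence)
Record = Tuple[str, str, str, str]


def collapse_identical(records: Iterable[Record]) -> List[dict]:
    """Group records with identical sequences into a single table row each."""
    groups: Dict[str, List[Tuple[str, str, str]]] = {}
    for accession, description, species, sequence in records:
        groups.setdefault(sequence, []).append((accession, description, species))

    rows = []
    for sequence, members in groups.items():
        rows.append({
            "representative": members[0],   # first record seen (assumption)
            "identical": members[1:],       # shown when the count is clicked
            "sequence": sequence,
        })
    return rows
```

The length of the `identical` list corresponds to the number displayed in the [ Identical Seqs ] column.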

Profile HMM Matches

This table differs slightly from the Query Match table above. As one sequence is being compared to a profile HMM database, we just report the domain hits.

domain hits - simple

This table is shown automatically for hmmscan searches and can be revealed on phmmer searches by clicking on the "Show hit details" link under the domain graphic. This gives the basic list of matches to Pfam domains, including the Pfam identifier, accession, clan accession and short description. The start/end positions in the basic view relate to the domain envelope. Finally, the domain conditional and independent E-values (described above) are shown. As before, rows in the match table that have a yellow background indicate matches that score above the reporting thresholds, yet below the inclusion or significance thresholds.

The alignment start/end positions (that indicate the position of maximum alignment accuracy), HMM start/end positions, as well as the bit score can be obtained by clicking on the advanced option in the top right of the table heading row.

domain hits - advanced

As with the sequence hits, the show link reveals the alignment. This produces a similarly formatted pairwise alignment. Notice that the query is now in the bottom row, because the sequence is compared to a profile rather than converted into a profile as with phmmer.

domain hits - alignment

Domain Graphic

By default, a search using hmmscan is run when running a phmmer search. This will indicate the presence of any known Pfam domains on your query sequence. As with Pfam, we present the hits graphically as shown below:

domain graphic

In this example, there are two domains on the sequence. The second domain is labeled SH2; the first domain is an SH3 domain. You can reveal which domain the first representation is by mousing over the graphic or by viewing the table of domain hits. Note that the number of domains in the table and in the graphic may differ due to Pfam Clans, where multiple HMMs are used to represent large, divergent families. We apply the same post-processing as Pfam to remove overlaps when producing the graphic but, unlike Pfam, we show all matches in the table.

Hit Graph

When the target is a sequence database (phmmer or hmmsearch), we produce a graph to show the distribution of matches. This can be found just above the 'Query Matches' table. The x-axis shows hits that have been binned (grouped) by E-value, and the y-axis shows the number of hits in each bin. An example is shown below:

hit distribution graph

The columns of the graph link to the table containing the sequence hits. Thus, to view hits with a higher E-value, click on one of the bins closer to the right side of the graph and the table will be scrolled to that position. Furthermore, each bar in the graph is broken down according to the taxonomic kingdom to which the source organism belongs. This makes it simple to assess the taxonomic range of sequence matches to the query sequence.
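The data behind such a graph can be sketched by binning hits on the order of magnitude of their E-value and counting kingdoms within each bin. The bin width of one decade, the floor used for E-values of zero and the function name are all assumptions for illustration; the server's actual binning scheme is not described here.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Tuple

MIN_EXPONENT = -300  # assumed floor for E-values reported as 0


def bin_hits(hits: Iterable[Tuple[float, str]]) -> Dict[int, Counter]:
    """Group (evalue, kingdom) pairs into decade bins keyed by exponent.

    A hit with E-value 3e-50 falls in bin -50; each bin holds a Counter
    of kingdom -> number of hits, i.e. the segments of one stacked bar.
    """
    bins: Dict[int, Counter] = {}
    for evalue, kingdom in hits:
        if evalue <= 0:
            exponent = MIN_EXPONENT
        else:
            exponent = max(MIN_EXPONENT, math.floor(math.log10(evalue)))
        bins.setdefault(exponent, Counter())[kingdom] += 1
    return bins
```

Sorting the bin keys in ascending order places the most significant hits on the left, matching the layout in which higher E-values appear towards the right of the graph.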

Under each table, there is a row of two links.

Downloading

The downloads section is accessed by clicking on the download link below the results table. The formats listed below are offered. Availability differs between the search algorithms (phmmer, hmmsearch, hmmscan and jackhmmer), as does whether the file is gzipped.

FASTA
A single file containing all the regions matched in your hits in FASTA format.
Full Length FASTA
The same format as the FASTA option, but the full length sequences for significant search hits are returned.
Aligned FASTA
Significant search hits returned in the aligned FASTA format.
STOCKHOLM
Significant search hits returned in the STOCKHOLM format. Useful if you wish to use your results with the command line version of HMMER.
ClustalW
Significant search hits returned in the ClustalW format.
PSI-BLAST
Significant search hits returned in PSI-BLAST format.
PHYLIP
Significant search hits returned in PHYLIP format.
Plain text
Designed to be human readable and contains less information compared to the other formats.
XML
Machine readable and contains all the output data from HMMER.
JSON
All the same data as the XML format, but in JavaScript Object Notation.
HMM
A profile HMM generated from the uploaded multiple sequence alignment. LogoMat-M can be used to generate a graphical representation of the profile HMM.
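Once a FASTA download has been saved, it can be read with a few lines of code. The following is a minimal sketch of a generic FASTA reader; it makes no assumptions about the header layout used in the downloaded file, and the sample headers in the usage note are invented.

```python
from typing import List, Tuple


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA-formatted text into (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

For example, `parse_fasta(">hit1 example\nMKV\nLAA\n>hit2\nMTT\n")` yields two records, with the sequence lines of the first joined into `MKVLAA`. A gzipped download can be read as text with `gzip.open(path, "rt")` from the Python standard library before being passed to this function.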

Search details

The search details section provides the exact time at which the search was performed on our servers, the complete command used to perform the search and the database searched against. If the database has a version associated with it, this will be documented, as well as the date on which we downloaded the database. An example of the provenance data is shown here:

We also include your query sequence in FASTA format, where applicable. Should you have bookmarked or performed multiple searches and lost track of which job ID corresponds to which search, this provides a way of identifying it. You should also double check that this sequence is the same as the one you submitted.

Taxonomy view

Tree Graphic

The first item on the Taxonomy view page is the taxonomic tree graphic. This shows all the sequence hits distributed across a tree derived from the NCBI taxonomy database. The tree starts on the left side with 'All' sequences and each step to the right divides the data further until the species level is reached. Each node in the tree contains the classification name and the count of all hits from that point down. There is also a small hit distribution graphic located below each node, which indicates the proportion of significant hits found within that taxonomic group. Directly above the tree there is a directory like listing, which indicates all the parent nodes of the currently selected node. Clicking on one of the parents allows you to traverse back up to that level of the tree.
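The per-node counts in the tree can be derived from the lineage of each hit: every node on the path from 'All' down to the species is incremented once per hit. The sketch below illustrates this; the function name and the representation of a lineage as a root-first list of names are assumptions.

```python
from collections import Counter
from typing import Iterable, Sequence, Tuple


def count_hits_per_node(lineages: Iterable[Sequence[str]]) -> Counter:
    """Count hits at every node of the taxonomic tree.

    Each lineage is a root-first list of classification names for one hit,
    ending at the species. Nodes are keyed by their full path from 'All',
    so the key doubles as the list of parent nodes shown above the tree.
    """
    counts: Counter = Counter()
    for lineage in lineages:
        path: Tuple[str, ...] = ("All",)
        counts[path] += 1
        for name in lineage:
            path = path + (name,)
            counts[path] += 1
    return counts
```

Keying nodes by their full path keeps identically named groups in different branches separate.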

taxonomic tree example

Species Distribution

The "Species Distribution" table is linked to the Tree graphic and displays all the species in which a hit occurred. As you descend down the tree, the number of species listed in the table will be reduced to show only those species that are found within the current top-level node. Along with each name we also show the number of hits that were found against sequences from the species. The last column is a link back to the score page that will provide more details on the hits associated with that species.

Downloading

This section is exactly the same as the Downloading section for the Score view.

Search details

This is exactly the same as the Search details section for the Score view.

Domain Architecture view

The 'Domain Architecture' view is designed to group all significant sequence matches based on their constituent Pfam domains. The Pfam domains are defined using the Pfam curated gathering thresholds and cannot be altered by search parameters. The results of a search are then displayed with the most frequently occurring architectures first.

Domain Graphic (Query)

This section is only available when running phmmer. An hmmscan is run against the Pfam database for the query sequence. Domains found on that sequence are represented graphically, as shown in the example below. This graphic is exactly the same as the one found on the score view page, if the hmmscan was run as part of the original query. If not, an hmmscan is run using the default Pfam gathering thresholds. This allows the query sequence domain architecture to be compared to those found on the matched target sequences. Below this graphic, there is a link that will take you to the architecture that matches the query sequence architecture, if it is found in the set of target sequences.

Example domain graphic

Domain Architecture list

The domain architecture list is a breakdown of all the sequences found by your search according to the Pfam domains found within each sequence. Sequences with identical domain architectures are grouped together, and the groups are ordered with the most frequently occurring first. Note that a sequence with no domains on it is also considered to have an architecture. Each architecture group is represented on the page by a row in the table, and each row can be divided into four subsections. An example is shown below:

Row Subsections

Sequence Count
This is the number of sequences that share the domain architecture. Clicking on this count will reveal the domain architecture graphics for all of the sequences in this group. If there are more than 40 sequences with the same architecture, the results are paginated in sets of 40. The 'Show More' link will reveal the next set of matching sequences.
Example
Here you are shown the name and order of each domain found in the architecture.
Graphic
A graphical representation of the example sequence. This shows all the domains that were found for that architecture and can be used like the domain graphics for the query. The black line(s) along the bottom of the image indicate where your query aligned to the target sequence. Hovering over the black line will reveal a pop-up with the alignment coordinates of the hit.
View Scores
Clicking this link will take you back to the score view and restrict the results shown to only those that have the selected architecture.
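The grouping that produces the domain architecture list can be sketched as follows: each sequence is keyed by its ordered tuple of domain names, and the groups are sorted by size. The function name and input layout are hypothetical; the domain names in the usage note are for illustration only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def group_by_architecture(
    hits: Iterable[Tuple[str, Sequence[str]]]
) -> List[Tuple[Tuple[str, ...], List[str]]]:
    """Group sequence identifiers by domain architecture.

    Each hit is (sequence_id, domain names in order along the sequence).
    A sequence with no domains gets the empty architecture ().
    Returns (architecture, sequence_ids) pairs, most frequent first.
    """
    groups: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for seq_id, domains in hits:
        groups[tuple(domains)].append(seq_id)
    # sorted() is stable, so equally sized groups keep first-seen order.
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)
```

For instance, two sequences carrying SH3 followed by SH2, one carrying only SH2 and one with no domains produce three rows, with the SH3, SH2 architecture listed first and a sequence count of 2.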

Downloading

This section is exactly the same as the Downloading section for the Score view.

Search details

This is exactly the same as the Search details section for the Score view.