Weight Matrix plugin

The Weight Matrix plugin is a tool for sequence annotation. Like SITECON, its main use case is the recognition of potential transcription factor binding sites, based on data about conserved conformational and physicochemical properties revealed by analysis of binding site sets.

The UGENE Weight Matrix module includes a large collection of position frequency matrices (PFMs) and position weight matrices (PWMs, also known as position-specific scoring matrices, PSSMs). The matrices come from two well-known open archives: JASPAR, which contains frequency matrices, and UniPROBE, which contains weight matrices.

The Weight Matrix plugin also provides a tool for creating custom position frequency and weight matrices from an existing alignment or a multisequence. A matrix created this way can be used as a search profile, just like the JASPAR and UniPROBE ones.

To search for transcription factor binding sites in a DNA sequence, select the "Analyze->Search TFBS with matrices" context menu item:

Weight matrix menu

In the search dialog, you must specify a file containing a PWM or PFM. To do so, press the browse button (1) and select the file.
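
To give an idea of what such a matrix file holds, here is a minimal sketch that reads a frequency matrix from text. It assumes the plain four-row layout (one row of counts each for A, C, G and T, in that order); the files shipped with UGENE may use a different layout, and the function name `parse_pfm` is hypothetical.

```python
def parse_pfm(text):
    """Parse a frequency matrix given as four whitespace-separated rows.

    Assumption: rows are in the order A, C, G, T, one column per position.
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    if len(rows) != 4:
        raise ValueError("expected four rows (A, C, G, T)")
    if len({len(row) for row in rows}) != 1:
        raise ValueError("rows differ in length")
    # Map each base to its list of per-position counts.
    return {base: [float(v) for v in row] for base, row in zip("ACGT", rows)}


example = """\
8 0 1
0 9 1
1 1 0
1 0 8
"""
pfm = parse_pfm(example)
```

Each column of the result describes one position of the binding site, and each row describes how often one nucleotide was observed there.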

Alternatively, you can choose a JASPAR matrix through a dedicated interface by pressing "Search JASPAR database" (2):

Weight matrix menu
Here the matrices are grouped into categories, and you can read detailed information about each matrix, presented as a list of its properties. This can help you choose an appropriate matrix.
Note that the matrices bundled with UGENE are located in the "$UGENE/data/position_weight_matrix" folder.

An alternative way to specify the position weight or frequency matrix is to create one from an alignment or a multisequence with the "Build new matrix" tool. Its usage is described at the end of this document.

After the profile (the matrix) is loaded, you can adjust the threshold value. The threshold (3) sets the minimum score a result must reach to be reported. The higher a result's score, the more closely the matched region resembles the profile. By raising the threshold you can filter out low-scoring results.
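
The effect of the threshold can be illustrated with a small sketch. It slides a weight matrix along a sequence and keeps only the windows whose score, rescaled to a 0-100 range between the lowest and highest score the matrix can produce, reaches the threshold. This rescaling is an assumption made for illustration; the plugin's exact scoring is not described here. The matrix values and the names `scan` and `score_window` are hypothetical.

```python
BASES = "ACGT"


def score_window(pwm, window):
    """Sum the matrix weights of the bases found at each position."""
    return sum(pwm[base][i] for i, base in enumerate(window))


def scan(sequence, pwm, threshold_percent):
    """Return (start, end, score_percent) for windows at or above the threshold.

    Coordinates are 1-based and inclusive.
    """
    length = len(pwm["A"])
    min_score = sum(min(pwm[b][i] for b in BASES) for i in range(length))
    max_score = sum(max(pwm[b][i] for b in BASES) for i in range(length))
    if max_score == min_score:
        raise ValueError("matrix cannot discriminate between sequences")
    hits = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        if any(base not in BASES for base in window):
            continue  # skip windows with ambiguous symbols such as N
        raw = score_window(pwm, window)
        percent = 100.0 * (raw - min_score) / (max_score - min_score)
        if percent >= threshold_percent:
            hits.append((start + 1, start + length, round(percent, 1)))
    return hits


# Toy three-position matrix that favours the motif ACT.
pwm = {
    "A": [2, -3, -1],
    "C": [-3, 2, -1],
    "G": [-1, -1, -3],
    "T": [-1, -3, 2],
}
hits = scan("GGACTTTACTG", pwm, threshold_percent=85)
```

With this toy matrix only the two exact ACT occurrences pass an 85 percent threshold; lowering the threshold lets weaker matches through.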

If the loaded matrix is a position frequency matrix, you must also select the algorithm used to build the corresponding position weight matrix, which will represent the transcription factor being searched for. Four algorithms are available.
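
As an illustration of what such a conversion does, the sketch below turns a frequency matrix into a weight matrix with a plain log-odds formula. This is a widely used general technique and is not claimed to match any of the four algorithms offered in the dialog. The uniform background, the pseudocount and the function name `pfm_to_pwm` are assumptions.

```python
import math

BASES = "ACGT"


def pfm_to_pwm(pfm, background=None, pseudocount=1.0):
    """Convert per-position counts into log2 odds against a background.

    pfm maps each base to a list of counts, one per position.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}  # assumed uniform background
    length = len(pfm["A"])
    pwm = {b: [] for b in BASES}
    for i in range(length):
        total = sum(pfm[b][i] for b in BASES) + pseudocount
        for b in BASES:
            # The pseudocount is shared out by background frequency,
            # which keeps zero counts from producing log(0).
            freq = (pfm[b][i] + pseudocount * background[b]) / total
            pwm[b].append(math.log2(freq / background[b]))
    return pwm


pfm = {
    "A": [8, 0, 1],
    "C": [0, 9, 1],
    "G": [1, 1, 0],
    "T": [1, 0, 8],
}
pwm = pfm_to_pwm(pfm)
```

Bases that occur more often than the background expects get positive weights, and rare bases get negative ones, so a window's summed weight reflects how well it fits the profile.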

Weight matrix menu

The remaining options are standard sequence search options: the strand and the sequence region in which to search for matches.
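
Searching the complementary strand amounts to scanning the reverse complement of the sequence and mapping the hit coordinates back to the direct strand. A minimal sketch of both steps follows; the helper names are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def to_direct_coords(start, end, seq_length):
    """Map a hit found on the reverse complement to direct-strand coordinates."""
    return seq_length - end + 1, seq_length - start + 1


sequence = "AAGTCCGGTA"
rc = reverse_complement(sequence)
# A hit at positions 1..3 of the reverse complement covers the last
# three bases of the direct strand.
direct = to_direct_coords(1, 3, len(sequence))
```

Restricting the search to a region works the same way: scan only the chosen slice and add the region's offset to each reported coordinate.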

After specifying the necessary options, press "Search". The results will appear in the dialog table, with the score of each result in the "Score" column.

Weight matrix search dialog

The regions found by the Weight Matrix algorithm can be saved as annotations to the DNA sequence in GenBank format by pressing "Save as annotations".
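
For readers unfamiliar with GenBank annotations, the sketch below formats one found region as a feature table entry. It follows the general GenBank layout (feature key in column 6, location in column 22, qualifiers on the following lines). The feature key `misc_feature` and the `/note` qualifier are assumptions chosen for illustration; the key and qualifiers that UGENE actually writes are not stated in this document.

```python
def genbank_feature(start, end, strand, score, key="misc_feature"):
    """Format one hit as a GenBank feature table entry.

    start and end are 1-based inclusive; strand is "+" or "-".
    """
    location = f"{start}..{end}"
    if strand == "-":
        location = f"complement({location})"
    lines = [
        # Five spaces, then the key padded to 16 characters,
        # so that the location starts in column 22.
        f"     {key:<16}{location}",
        # Qualifier lines are indented by 21 spaces.
        f"{'':21}/note=\"score={score}\"",
    ]
    return "\n".join(lines)


entry = genbank_feature(3, 5, "-", 100.0)
```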

Weight matrix menu

After saving, the file with resulting annotations will be automatically added to the current project, and the annotations will be added to the original sequence.

Note that if a JASPAR or UniPROBE matrix was selected, the resulting annotations will contain the properties of that matrix.
Weight matrix menu
In general, working with the Weight Matrix plugin is similar to working with SITECON and should be easy for any SITECON user.

To create a position weight or frequency matrix from an alignment or a multisequence, press "Build new matrix" in the "Weight matrix search" dialog, or select the "Tools->Weight Matrix->Build weight matrix" item in the main menu.

Weight matrix menu

"Input file" is the alignment or multisequence to build the matrix from. "Output file" specifies the file the resulting matrix will be saved to. Both files must be specified.

After the input file is defined, its coloured "Alignment Logo" appears at the bottom of the dialog. It gives a visual representation of the selected alignment.
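
A logo of this kind is commonly drawn by scaling each letter according to how conserved its column is. The sketch below computes the standard sequence-logo measure for one alignment column: the information content in bits (2 minus the column's Shannon entropy for DNA) and the resulting height of each letter. Whether UGENE uses exactly this formula is an assumption, and no small-sample correction is applied.

```python
import math

BASES = "ACGT"


def column_information(column):
    """Return (information in bits, letter heights) for one alignment column.

    Symbols other than A, C, G, T (for example gaps) are ignored.
    """
    counts = {b: column.count(b) for b in BASES}
    total = sum(counts.values())
    if total == 0:
        return 0.0, {b: 0.0 for b in BASES}
    entropy = 0.0
    for count in counts.values():
        if count:
            p = count / total
            entropy -= p * math.log2(p)
    info = 2.0 - entropy  # log2(4) = 2 bits is the maximum for DNA
    heights = {b: (counts[b] / total) * info for b in BASES}
    return info, heights


conserved_info, conserved_heights = column_information("AAAA")
mixed_info, _ = column_information("ACGT")
```

A fully conserved column reaches the maximum of 2 bits, while a column with all four bases in equal proportion carries no information and is drawn flat.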

You can also specify the statistic type and the matrix type.
The statistic type defines how the statistics are collected. The mononucleic option generally suits small alignments, while the dinucleic option should give better results for larger alignments.
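
The difference between the two statistic types can be sketched as follows, assuming that "mononucleic" means counting single nucleotides in each column and "dinucleic" means counting pairs of adjacent nucleotides. This reading of the option names is an assumption, and the function names are hypothetical.

```python
from collections import Counter


def mono_counts(alignment):
    """Count single nucleotides in every column of the alignment."""
    length = len(alignment[0])
    return [Counter(row[i] for row in alignment) for i in range(length)]


def di_counts(alignment):
    """Count adjacent nucleotide pairs starting at every column but the last."""
    length = len(alignment[0])
    return [Counter(row[i:i + 2] for row in alignment) for i in range(length - 1)]


alignment = ["ACGT", "ACGA", "TCGT"]
mono = mono_counts(alignment)
di = di_counts(alignment)
```

A dinucleotide model has 16 possible values per position instead of 4, so it needs more sequences before its counts become reliable, which is consistent with the advice to prefer it for larger alignments.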

The matrix type defines the type of the resulting matrix.
If the frequency type is selected, a frequency matrix is created and saved to the output file.
If the weight type is selected, an intermediate frequency matrix is created and then transformed into a weight matrix using the selected "Weight algorithm". The weight matrix is then saved to the output file.

To start the operation, press "Start". The matrix will be created and saved. If the "Build weight or frequency matrix" dialog was opened from the "Weight matrix search" dialog, the matrix will also be selected as the current profile.
