Weight Matrix plugin

The Weight Matrix plugin is a tool for annotating sequences. Like SITECON, its main use case is recognizing potential transcription factor binding sites based on data about conserved conformational and physicochemical properties revealed by analyzing sets of binding sites. The UGENE Weight Matrix module includes a large collection of position frequency matrices (PFMs) and position weight matrices (PWMs, also known as position-specific scoring matrices, PSSMs). The matrices come from two well-known open archives: JASPAR, which contains frequency matrices, and UniPROBE, which contains weight matrices. The plugin also provides a tool for creating custom position frequency and weight matrices from an existing alignment or a multisequence. A matrix created this way can be used as a search profile in the same way as the JASPAR and UniPROBE matrices.

To search for transcription factor binding sites in a DNA sequence, activate the "Analyze->Search TFBS with matrices" context menu item:
![]()

In the search dialog you must specify a file with a PWM or PFM. To do so, press the browse button (1) and select the file. Alternatively, press "Search JASPAR database" (2) to choose a JASPAR matrix through a dedicated interface:

![]()

Note that the matrices shipped with UGENE are located in the "$UGENE/data/position_weight_matrix" folder. Another way to specify the position weight/frequency matrix is to create one from an alignment or a multisequence with the "Build new matrix" tool. Its usage is described at the end of this document.

After the profile (the matrix) is loaded, you can adjust the threshold value. The threshold (3) sets the minimum score a result must reach to be reported. The higher a result's score, the more closely the matched region resembles the profile. By changing the threshold you can filter out low-scoring results.

If the loaded matrix is a position frequency matrix, you must also specify the algorithm used to build the corresponding position weight matrix, which will represent the sought-for transcription factor. There are four algorithms available.

![]()

The rest of the options are standard sequence search options: the strand and the sequence region to search for matches. After specifying the necessary options, press "Search". The results will appear in the dialog table, with their scores shown in the "Score" column.

![]()

The regions found by the Weight Matrix algorithm can be saved as annotations to a DNA sequence in Genbank format by pressing "Save as annotations".

![]()

After saving, the file with the resulting annotations is automatically added to the current project, and the annotations are added to the original sequence. Note that if a JASPAR or UniPROBE matrix was selected, the resulting annotations will contain the properties of that matrix.

![]()

In general, working with the Weight Matrix plugin is similar to working with SITECON and should be easy for every SITECON user.
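The search described above can be illustrated outside UGENE with a minimal Python sketch. It is not UGENE's implementation: the log-odds conversion with a pseudocount is only one common way to derive a PWM from a PFM and is not claimed to be one of the four algorithms offered in the dialog. The percentage normalization of the score, the function names `pfm_to_pwm` and `scan`, and the example matrix are all assumptions made for illustration. Only the direct strand is scanned.

```python
import math

BASES = "ACGT"

def pfm_to_pwm(pfm, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log-odds weights (assumed method)."""
    pwm = []
    for counts in pfm:
        total = sum(counts[b] for b in BASES) + pseudocount * len(BASES)
        pwm.append({
            b: math.log2(((counts[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def scan(sequence, pwm, threshold_percent):
    """Return (start, percent_score) for windows scoring at or above the threshold."""
    width = len(pwm)
    # Lowest and highest raw scores the matrix can produce, used to
    # express each window score as a percentage (an assumed convention).
    lowest = sum(min(col.values()) for col in pwm)
    highest = sum(max(col.values()) for col in pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows with ambiguous symbols such as N
        raw = sum(col[base] for col, base in zip(pwm, window))
        percent = 100.0 * (raw - lowest) / (highest - lowest)
        if percent >= threshold_percent:
            hits.append((start, round(percent, 1)))
    return hits

# Hypothetical 4-column PFM (counts from 10 aligned sites), not a real JASPAR entry.
example_pfm = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]
example_pwm = pfm_to_pwm(example_pfm)
hits = scan("GGTATACC", example_pwm, threshold_percent=85.0)
print(hits)
```

With this toy matrix only the window starting at offset 2 (`TATA`) passes an 85% threshold. Lowering the threshold admits partial matches, which mirrors how the threshold filters low-scoring results in the dialog.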
To create a position weight or frequency matrix from an alignment or a multisequence, press "Build new matrix" in the "Weight matrix search" dialog, or activate the "Tools->Weight Matrix->Build weight matrix" item in the global program menu.

![]()

The "Input file" is the alignment or multisequence to build the matrix from. The "Output file" is the file the resulting matrix will be saved to. Both files must be specified. After the input file is defined, its coloured "Alignment Logo" appears at the bottom of the dialog, giving a visual representation of the selected alignment. You can also specify the statistic type and the matrix type.
The matrix type defines whether the resulting matrix is a frequency matrix or a weight matrix.
To start the operation, press "Start". The matrix will be created and saved.
If the "Build weight or frequency matrix" dialog was invoked from the "Weight matrix search" dialog, the matrix will also be selected as the current profile.
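To make the idea of building a frequency matrix from an alignment concrete, here is a minimal Python sketch. It is not the code UGENE runs. It assumes single-nucleotide counts, equally long gap-free DNA sequences, and a hypothetical helper name `build_pfm`. The example sites are invented.

```python
from collections import Counter

BASES = "ACGT"

def build_pfm(aligned_sequences):
    """Count each base at each column of equally long aligned sequences."""
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    width = len(aligned_sequences[0])
    if any(len(seq) != width for seq in aligned_sequences):
        raise ValueError("all sequences must have the same length")
    pfm = []
    for column in zip(*aligned_sequences):
        counts = Counter(base.upper() for base in column)
        # Gaps and ambiguous symbols are simply ignored in this sketch.
        pfm.append({b: counts.get(b, 0) for b in BASES})
    return pfm

# Invented aligned sites, used only to show the shape of the result.
sites = ["TATAA", "TATAT", "TACAA", "TATAA"]
pfm = build_pfm(sites)
for position, counts in enumerate(pfm):
    print(position, counts)
```

Each row of the result holds the counts for one alignment column. A weight matrix would then be derived from these counts by one of the conversion algorithms.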