Command line interface

Overview

Since version 1.7, UGENE has been a powerful command line tool. In fact, you can run any Workflow Designer schema from the command line.

While working on UGENE's command line interface, we were guided by the following principles:

  • Make it as easy to use as popular shell commands.

  • Expose all of UGENE's significant features through the command line interface.

  • Allow users to add their own commands to this interface.

Make sure that the path to the 'ugene' executable is added to your PATH environment variable. If your PATH is set correctly, typing 'ugene --help' in a console shows the following:

[Screenshot: UGENE help message]
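
Before scripting against the commands on this page, it can help to confirm that the executable is discoverable. The following is a minimal Python sketch, assuming the executable is named 'ugene' as above; the helper name find_ugene is hypothetical and not part of UGENE.

```python
import shutil
import subprocess
from typing import Optional


def find_ugene(executable: str = "ugene") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_ugene()
    if path is None:
        print("'ugene' was not found on PATH; add its directory to PATH first.")
    else:
        # Same check as typing 'ugene --help' in a console.
        subprocess.run([path, "--help"], check=False)
```

If the helper returns None, fix PATH before trying any of the commands below.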

With the current version of UGENE, you can perform each of the following tasks by running a single command:


Align sequences using MUSCLE3

Command: ugene align (or ugene --task=align)

This command aligns the input MSA using the MUSCLE3 tool and writes the output in CLUSTALW format.

Options:
in - path to input msa file
out - path to output file

Example: ugene align --in=14-3-3.sto --out=14-3-3_aligned.aln

On success you will see:

[Screenshot: Align from command line]
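
When calling UGENE from a script, the two documented spellings of the align command can be produced by one small helper. This is a sketch only: build_command is a hypothetical name, it merely builds an argument list and does not run UGENE, and the '--task=' form is used here only for align because that is the only command for which this page shows it.

```python
from typing import Dict, List


def build_command(task: str, options: Dict[str, str],
                  use_task_flag: bool = False) -> List[str]:
    """Build an argument list for the 'ugene' executable without running it."""
    head = ["ugene", f"--task={task}"] if use_task_flag else ["ugene", task]
    return head + [f"--{name}={value}" for name, value in options.items()]


# File names are taken from the example above.
opts = {"in": "14-3-3.sto", "out": "14-3-3_aligned.aln"}
short_form = build_command("align", opts)
long_form = build_command("align", opts, use_task_flag=True)

print(" ".join(short_form))  # ugene align --in=14-3-3.sto --out=14-3-3_aligned.aln
print(" ".join(long_form))   # ugene --task=align --in=14-3-3.sto --out=14-3-3_aligned.aln
```

Either list can be handed to subprocess.run(..., check=True) once 'ugene' is on PATH.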

Convert sequences from one format to another

Command: ugene convert-seq

This command converts sequences from one format to another.

Options:
in - path to input sequence file
format - format name of output file. Currently, the following formats are supported:

    fasta
    genbank
    raw
out - path to output file

Example: ugene convert-seq --in=human_T1.fa --format=genbank --out=human_T1.gbk

On success you will see:

[Screenshot: Convert sequence from command line]
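
Because convert-seq handles one input file per call, converting several files means issuing one command per file. The sketch below prepares those commands in Python and rejects a format that is not in the list above. The function name convert_seq_commands, the second input file name and the choice of output suffix are illustrative assumptions, not part of UGENE.

```python
from pathlib import Path
from typing import List

# Format names as listed for the 'format' option above.
SUPPORTED_SEQ_FORMATS = ("fasta", "genbank", "raw")


def convert_seq_commands(inputs: List[str], fmt: str, suffix: str) -> List[List[str]]:
    """Return one 'ugene convert-seq' argument list per input file."""
    if fmt not in SUPPORTED_SEQ_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    commands = []
    for name in inputs:
        # The output name reuses the input name with a caller-chosen suffix.
        out = str(Path(name).with_suffix(suffix))
        commands.append(["ugene", "convert-seq", f"--in={name}",
                         f"--format={fmt}", f"--out={out}"])
    return commands


# 'sample2.fa' is a made-up file name used only for illustration.
for cmd in convert_seq_commands(["human_T1.fa", "sample2.fa"], "genbank", ".gbk"):
    print(" ".join(cmd))
```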

Convert multiple sequence alignments from one format to another

Command: ugene convert-msa

This command converts multiple sequence alignments (MSAs) from one format to another.

Options:
in - path to input msa file
format - format name of output file. Currently, the following formats are supported:

    clustal
    stockholm
    msf
    srfasta
out - path to output file

Example: ugene convert-msa --in=CBS.sto --format=clustal --out=CBS.aln

On success you will see:

[Screenshot: Convert msa from command line]

Find Open Reading Frames

Command: ugene find-orfs

This command finds Open Reading Frames (ORFs) in the supplied sequence and writes the results as annotations.

Options:
in - path to input sequence file
name - name of annotated regions (default: "ORF")
min-length - ignore ORFs shorter than this number (default: 100)
require-stop-codon - ignore boundary ORFs that extend beyond the search region, i.e. have no stop codon within the range (default: false)

    true
    false
require-init-codon - whether ORFs must start with an initiation codon; when false, ORFs starting with any codon other than a terminator are allowed (default: true)
    true
    false
allow-alternative-codons - allow ORFs starting with alternative initiation codons, according to the current translation table (default: false)
    true
    false
out - path to output file with annotations

Example: ugene find-orfs --in=human_T1.fa --out=human_orfs.gbk --require-init-codon=false

On success you will see:

[Screenshot: Find ORFs]
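
The boolean options of find-orfs take the literal values true and false, which differ from Python's True and False. The sketch below shows one way to render them when building a command from a script. The names format_option and find_orfs_command are hypothetical, and the min-length value of 300 is an arbitrary illustration.

```python
from typing import Dict, List, Union

OptionValue = Union[str, int, bool]


def format_option(name: str, value: OptionValue) -> str:
    """Render one option as --name=value, writing booleans as true/false."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        value = "true" if value else "false"
    return f"--{name}={value}"


def find_orfs_command(options: Dict[str, OptionValue]) -> List[str]:
    """Build a 'ugene find-orfs' argument list without running it."""
    return ["ugene", "find-orfs"] + [format_option(k, v) for k, v in options.items()]


cmd = find_orfs_command({
    "in": "human_T1.fa",
    "out": "human_orfs.gbk",
    "min-length": 300,
    "require-init-codon": False,
})
print(" ".join(cmd))
```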

Find repeats

Command: ugene find-repeats

This command finds repeats in the supplied sequence and writes the results as annotations.

Options:
in - path to input sequence file
name - name of annotated regions (default: "repeat_unit")
min-length - Minimum length of repeats (default: 5)
identity - Repeat identity, in percent (default: 100)
min-distance - Minimum distance between repeats (default: 0)
max-distance - Maximum distance between repeats (default: 5000)
inverted - Search for inverted repeats (default: false)

    true
    false
out - path to output file with annotations

Example: ugene find-repeats --in=murine.gb --out=murine_repeats.gbk --identity=99

On success you will see:

[Screenshot: Find repeats]

Find pattern using Smith-Waterman algorithm

Command: ugene find-sw

This command finds the given pattern in the supplied sequence and writes the results as annotations.

Options:
in - path to input sequence file
name - name of annotated regions (default: "misc_feature")
ptrn - A subsequence pattern to look for (e.g. AGGCCT)
score - similarity to the pattern, in percent (default: 90)
matrix - The scoring matrix (default: Auto)

    Auto
    blosum62
    dna
    rna
    dayhoff
    gonnet
    pam250
filter - Result filtering strategy (default: filter-intersections)
    filter-intersections
    none
out - path to output file with annotations

Example: ugene find-sw --in=human_T1.fa --out=human_T1_sw.gbk --ptrn=TGCT --filter=none


Build HMMER2 HMM profile

Command: ugene hmm2-build

This command builds and/or calibrates an HMM profile using the HMMER2 tools.

Options:
in - path to input msa file
name - Descriptive name of the HMM profile (default: <empty>)
calibrate - Enables/disables profile calibration (default: true)

seed - The random seed, a positive integer (default: 0)
out - path to output file with HMM profile

Example: ugene hmm2-build --in=CBS.sto --out=CBS.hmm --calibrate=true

On success you will see:

[Screenshot: HMM2 build]

Search HMM signals using HMMER2

Command: ugene hmm2-search

This command searches the input sequence for significantly similar matches to the given profile HMM using the HMMER2 tools.

Options:
seq - input sequence to search in
hmm - input HMM profile to search for
name - name of the result annotations (default: "hmm_signal")
e-val - upper E-value filter, given as a positive integer; results are filtered at 1e-<number> (default: 1)
score - lower score filter, given as a float (default: -1000000000)
out - path to output file with annotations

Example: ugene hmm2-search --seq=CBS_seq.fa --hmm=CBS.hmm --out=CBS_hmm.gbk --e-val=2

On success you will see:

[Screenshot: HMM2 search]
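
The two HMMER2 commands chain naturally: the profile written by hmm2-build is the 'hmm' input of hmm2-search. The sketch below prepares both commands and also spells out the e-val convention, under the assumption that "1e-number" means the cutoff 1e-n for an integer n (so e-val=2 would correspond to 0.01). The names evalue_threshold and hmm2_pipeline are hypothetical, and nothing here runs UGENE.

```python
from typing import List, Tuple


def evalue_threshold(e_val: int) -> float:
    """Translate the integer e-val option into the cutoff 1e-<e_val>.

    This reflects one reading of the option description, not UGENE's code.
    """
    if e_val < 1:
        raise ValueError("e-val must be a positive integer")
    return float(f"1e-{e_val}")


def hmm2_pipeline(msa: str, seq: str, profile: str, out: str,
                  e_val: int = 1) -> Tuple[List[str], List[str]]:
    """Return the build and search commands; the profile file links the two."""
    build = ["ugene", "hmm2-build", f"--in={msa}", f"--out={profile}"]
    search = ["ugene", "hmm2-search", f"--seq={seq}", f"--hmm={profile}",
              f"--out={out}", f"--e-val={e_val}"]
    return build, search


# File names are taken from the hmm2-build and hmm2-search examples.
build, search = hmm2_pipeline("CBS.sto", "CBS_seq.fa", "CBS.hmm", "CBS_hmm.gbk", e_val=2)
print(" ".join(build))
print(" ".join(search))
print(evalue_threshold(2))
```

The build command should finish successfully before the search command is started, since the search reads the profile file that the build writes.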

Query remote database

Command: ugene remote-request

This command searches for a sequence in a remote database and writes the results as annotations.

Options:
in - path to input file with sequence(s)
out - path to output file in Genbank format
db - database for search:

    "ncbi-blastn" for nucleotide sequence,
    "ncbi-cdd" for amino,
    "ncbi-blastp" for amino
eval - statistical significance threshold for reporting matches against database sequences (default: 10)
hits - maximum number of hits to show (default: 10)
name - name of the result annotations. If not set, the name is taken from the CDD result or the BLAST result
short - whether to optimize the search for short sequences (default: false)
    true
    false
blast-output - path to a file with NCBI-BLAST output (only for ncbi-blastp and ncbi-blastn databases)

Example: ugene remote-request --in=seq.fa --db=ncbi-blastp --hits=100 --blast-output=blast.xml --out=res.gb

This command searches for the sequence from seq.fa in the BLASTP database, writes the result annotations to res.gb, and writes the BLAST output to blast.xml.
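
The option list above carries one constraint that is easy to miss in scripts: blast-output applies only to the ncbi-blastp and ncbi-blastn databases. The sketch below checks that constraint before building the command. The function name remote_request_command is hypothetical, and the check is done on the script side; it makes no claim about how UGENE itself reacts to an invalid combination.

```python
from typing import List, Optional

# Database names as listed for the 'db' option above.
DATABASES = ("ncbi-blastn", "ncbi-cdd", "ncbi-blastp")
BLAST_DATABASES = ("ncbi-blastn", "ncbi-blastp")


def remote_request_command(seq_file: str, out_file: str, db: str,
                           hits: int = 10,
                           blast_output: Optional[str] = None) -> List[str]:
    """Build a 'ugene remote-request' argument list without running it."""
    if db not in DATABASES:
        raise ValueError(f"unknown database: {db!r}")
    if blast_output is not None and db not in BLAST_DATABASES:
        raise ValueError("blast-output is only documented for ncbi-blastp and ncbi-blastn")
    cmd = ["ugene", "remote-request", f"--in={seq_file}", f"--db={db}", f"--hits={hits}"]
    if blast_output is not None:
        cmd.append(f"--blast-output={blast_output}")
    cmd.append(f"--out={out_file}")
    return cmd


# Reproduces the example command shown above.
cmd = remote_request_command("seq.fa", "res.gb", "ncbi-blastp",
                             hits=100, blast_output="blast.xml")
print(" ".join(cmd))
```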


Any other user-defined workflow designer schema

Read Workflow designer documentation

To add your own schema to the command line interface, follow these steps:

  • Create a schema in Workflow Designer.
    [Screenshot: Create schema]
  • Click the 'Configure command line aliases' button on the toolbar.
    [Screenshot: Click on configure aliases button]
  • Set aliases for the attributes you want to use on the command line, and click the 'Ok' button.
    [Screenshot: Configure aliases]
  • Click the 'Save schema' button on the toolbar. Fill in the schema name and filename, and click the 'Ok' button. NOTE: the schema filename will be used as the name of the command in the command line interface.
    [Screenshot: Configure aliases]
  • That's all! Now you can run your schema. In our example it would be: ugene sitecon [parameters...]
