SignalX: Search for a common signal in a given set of sequences

The server implements a MEME-like algorithm.

Features of the server:

  1. The expected type of site symmetry can be specified.
  2. The site need not be present in every sequence.
  3. The program tunes the site width and the number of sites using rank statistics (see Theory).
  4. The server presents a p-value for every solution.

Algorithm

  1. Scan all sequences:
    1. Select a word and create a profile from that single word.
    2. Iterate:
      1. Find the best hit of the current profile in every sequence.
      2. Sort the hits (words) by score.
      3. Using rank statistics, define a threshold and select the significant subset of words.
      4. Using rank statistics, select the positions that should be included in the profile (tune the word length).
      5. Create a new profile from the selected words.
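
The iteration above can be illustrated with a minimal Python sketch. It is not the server's code. The function names, the uniform background, the pseudocount, and the fixed score cutoff `min_score` are all assumptions made for illustration. In particular, the cutoff is only a placeholder for the rank-statistics threshold (step 3), and the tuning of the word length (step 4) and site symmetry are omitted.

```python
import math

ALPHABET = "ACGT"


def build_profile(words, pseudocount=1.0):
    """Build a log-odds profile (one dict per position) from equal-length words."""
    width = len(words[0])
    profile = []
    for i in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for word in words:
            counts[word[i]] += 1
        total = sum(counts.values())
        # Log-odds against a uniform background (an assumption of this sketch).
        profile.append({b: math.log((counts[b] / total) / 0.25) for b in ALPHABET})
    return profile


def score(profile, word):
    """Sum the per-position profile scores of a word."""
    return sum(column[base] for column, base in zip(profile, word))


def best_hit(profile, seq):
    """Return (score, word) for the best-scoring window of the sequence."""
    width = len(profile)
    windows = (seq[i:i + width] for i in range(len(seq) - width + 1))
    return max(((score(profile, w), w) for w in windows), key=lambda hit: hit[0])


def refine(seed, sequences, min_score=0.0, max_iter=20):
    """Iterate: best hit per sequence, sort by score, select words, rebuild profile."""
    usable = [s for s in sequences if len(s) >= len(seed)]
    words = [seed]
    for _ in range(max_iter):
        profile = build_profile(words)
        hits = sorted((best_hit(profile, s) for s in usable),
                      key=lambda hit: hit[0], reverse=True)
        # Placeholder for the rank-statistics threshold: a fixed score cutoff.
        selected = [word for hit_score, word in hits if hit_score > min_score]
        if not selected or selected == words:
            break  # nothing significant, or the training set has converged
        words = selected
    return build_profile(words), words


profile, training_set = refine("ACGTG", ["TTTACGTGTT", "GGACGTGCC", "AAAAAAAAAA"])
```

In this toy run the third sequence has no window scoring above the cutoff, so it contributes no word to the training set, which mirrors the feature that the site need not be present in every sequence.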

Presentation of the results

The program returns several candidate results, each consisting of a training set and a profile. The same word (site) in the sequences may appear in more than one result, because training sets may overlap.

The results are presented as a table: for every profile, the user can see the site set, a graphical logo, and the position weight matrix.
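
As a rough illustration of how a site set relates to the other two views, the Python sketch below derives a per-position count matrix from a set of aligned sites, together with the information content (in bits) that a sequence logo conventionally uses as the column height. The server's exact definition of the position weight matrix is not stated here, so the function names and the choice of raw counts are assumptions.

```python
import math
from collections import Counter


def count_matrix(sites, alphabet="ACGT"):
    """One row per site position; each row holds the counts of A, C, G, T."""
    return [[Counter(column)[base] for base in alphabet] for column in zip(*sites)]


def information_content(counts):
    """Information content of one column in bits: log2(alphabet size) minus entropy."""
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return math.log2(len(counts)) - entropy


sites = ["ACGT", "ACGA", "ACGT", "TCGT"]  # hypothetical site set
matrix = count_matrix(sites)
bits = [information_content(row) for row in matrix]
```

A fully conserved position reaches the maximum of 2 bits for a four-letter alphabet, while a position with mixed bases scores lower.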

Author

Andrey Mironov

Department of Bioengineering and Bioinformatics, MSU, Moscow, Russia

Mail to Andrey Mironov