An algorithm for sorting signed permutations by reversals

This page contains an applet demonstrating the fastest known algorithm for sorting signed permutation by reversals. The algorithm was developed by Haim Kaplan, Ron Shamir and Robert E. Tarjan.The implementation was written by Itsik Mantin, a B. Sc. student in Math and Computer Science in Tel Aviv university, in a workshop in Computational Biology, supervised by Ron Shamir.

The available sections of this page are:

  1. Definition of the problem
  2. Description of the algorithm
  3. What this tool can do
  4. Instructions
  5. Run it now!!
  6. Implementation notes
The genome rearrangement problem
The problem of Genome Rearrangement can be represented as the following pure computational problem:
Let p be a signed permutation of the integers 1, .., n (e.g., 4, -2, 1, 3, -6, 5 when n=6).
A reversal is an operation that reverses a specified section of the permutation in place and flips the signs of the elements in this section flip (e.g., the permutation p' = 4, -1, 2, 3, -6, 5 can be achieved from p by reversing the 2 elements section from -2 to 1).
The reversal distance of a permutation is defined as the leastt number of reversals that can turn the permutation to the identity permutation.

The problem is to find:

  1. The reversal distance of the permutation - d
  2. A sequence of d reversals that turns the permutation to the identity permutation.

The Algorithm
For a full description of the algorithm, previous work and terminology, see the paper by Kaplan, Shamir and Tarjan, "Faster and Simpler Algorithm for Sorting Signed Permutations by Reversals" (to appear in Siam Journal of Computing). A preliminary version is avialable here.

The algorithm has three main stages:

1. Pre-process the permutation. This pre-processing contains 3 sub stages:
    1a. Unsign the permutation, e.g., p will be unsigned to the permutation 0, (7,8), (4,3), (1,2), (5,6), (12,11), (9,10), 13.
    1b. Define the Overlap graph of the permutation
    1c. Find the connected components of the overlap graph
2. Clear the hurdles. A hurdle is a problematic connected component of the overlap graph. In this stage each reversal merges two hurdles in distinct connected components into one non hurdle component.
3. Generate a sequence of safe reversals. A safe reversal is defined as a reversal that reduces  b-c (the number of breakpoints minus the number of cycles) without creating new hurdles.

What this tool can do
There are two working modes of the algorithm:
1. Regular mode - Invoke the algorithm on a defined permutation. There are 4 optional ways for defining the permutation:
    a. Write the permutation manually.
    b. Define a size and get a random signed permutation of that size.
    c. Define a size and a reversals number and get a signed permutation that was created by making this number of random reversals on the unit permutation. This way we have an upper bound on the reversal distance of the new permutation.
    d. The last permutation you have worked on. This option is available for
        reanalysis of something interesting you had recognized.

    The options of this mode are the following:
    a. Run the algorithm all way to the end and get a report of the reversals it had found
    b. Run one step of the algorithm
    c. Define your reversal and perform it on the current permutation
    d. Get a detailed analysis of the current permutation. This analysis contains general information about the permutation and about the overlap graph, including a drawingof the breakpoints graph (for permutations of limited size). The drawing of the grap contains different colors for different kinds of components of the overlap graph (a super hurdle, a simple hurdle, an oriented component, an unoriented component and an adjacency) and for different kinds of edges (oriented or unoriented), which are vertices of the overlap graph).

2. Statistics mode - This mode provides statistics on some parameters like breakpoints
    number, cycles number, hurdles number, etc. In this mode you can characterize a
    randomization of a permutation and get the statistics on user defined number of
    instances of this characterization. It is recommended to try it for the first times
    on low number (100-1000) of instances for evaluation of the time.

There are two main stages in using the program:
1. Initialization of the program. This stage contains:
    a. Choosing the working mode (Regular/Statistics)
    b. Choosing the permutation kind. The list of available options depends on the
        chosen mode.
    c. Inserting the needed parameters for the chosen mode and permutation kind.
    d. Press the "Submit" button.
2. In case the mode is Statistics mode this stage is auto executed. In case the modeis the Regular mode you can use the operations buttons to analyze the progress of the algorithm. The options are:
    a. Run one step of the algorithm - Next button
    b. Run the algorithm to the end (and get a report) - Run button
    c. Choose your own (valid) reversal and act it on the current permutation - User Reversal button
    d. Analyze the current permutation - Analyze button (Note: This mode may be a bit slow because of the way the redrawing applet is implemented)
    e. Clear the permutation and start with a new one - Clear button

Recommended Scenario
Since permutation with hurdles are rare it is recommended to choose the User Permutation and press the Submit button (there is a default permutation of size 10). Then you can press the Analyze button to see a hurdled graph. Then you can use the  Next button in the analysis frame to run the following stages of the algorithm.

Genome rearrangement algorithm applet

Implementation notes