Implementation notes

Future development


Why Applet
I chose to implement this tool as an applet because it enables easy access of anyone from anywhere to this tool. This is the first implementation of this algorithm (as far as I know) and since this algorithm is the fastest for this problem, I consider this easy access as one of the most important properties of this tool. The main disadvantages of applets (related to applications) are the security limitations as file access (no file access) or committing dos command lines. For this reason I added the statistical mode in order to enable good estimation of the reversal distance of a permutation as a function of its size.

Why java
The implementation was in Java and the code is available here. The main reasons for choosing Java is the comfortable GUI implementation and the simplicity of generating an applet. The most problematic aspect of Java, the efficiency, became almost irrelevant by the efficiency of the algorithm and the (relatively) small size of the expected input (not genomic pure data of millions or billions of base pairs but hundreds or thousands of genomic regions represented as a signed permutation). The running time of the algorithm on these input sizes is measured in seconds (for the fast mode).

Graphical User Interface classes

These classes aren't very interesting. They contain the GUI implementation and are large and simple)

Algorithm and its auxiliary classes
  Generic classes Not in any other group classes Future development

Batch mode
As I stressed before, the tool can't run in batch mode because of the security restrictions that applets have (after all, no one wants that something (an applet in our case) that exists in a web page that he accidentally loaded, will run the command line "del C:\*.*" or even read files and send them to its creator). The way to convert this tool to a batch mode tool (that, for example, reads 10,000 signed permutations in a specific format and run the algorithm on each one of them) is not very complicated. It contains the following stages:

  1. Build a reader class with the following methods: Reader(String filename) that opens the file for reading, Readline() that reads the line next to the one that was read in the last call and close() that closes the file. The file can be built by a simple script in any script language.
  2. Understand the interaction of the GR_applet class with GR class. This is the most important stage of this conversion. GR supplies a good interface and GR_applet code uses this interface. Getting a reversals sequence is easy by constructing GR object and invoking GR::run(). Getting properties of a permutation can be done by some simple methods of Permutation class. Most of the available methods are used in the Statistics class (with GR::getPermutation and then Permutation::getBreakpointsNumber, Permutation::getCyclesNumber, ...) and it is recommended to take a look on this section too.
  3. Write a Java application that gets at least one parameter (the file name) and contains a main loop that gets a permutation (through the Reader) and handles it. This handling usually will contain running the algorithm (with GR::run()) and get the Reversals arrays or analysis of the properties of the overlap graph (with GR::getPermutation and then Permutation::getBreakpointsNumber, Permutation::getCyclesNumber, ...).
Future development
The tool supplies all the reasonable functionality I could think of. Most of the possible improvements are in the GUI aspect and some in the efficiency aspect.
As always when building a program without specific targets most of the functionality of the tool was added in the last stage of the programming. Since the program shield was well formed (I spent most of the time building and improving it) it became pretty easy to improve the functionality in short time, especially in the GUI side. Unfortunately, I didn't have an unlimited amount of time and despite improving the project became easy the project entered its submission stage I made only a part of the possible improvement I could think of.

Possible improvements: