A Brief Guide to EEEP ***Invoking EEEP EEEP is invoked with the following command line: EEEP reference-tree-file test-tree-file support-threshold [-np] [-rr | -rR | -rt | -rT] [-uc | -UC] [-lx] [-pt] The reference and test tree files must conform to the Nexus format as described below. A support threshold must be provided to EEEP, but if you wish to treat the test tree as completely resolved, then a threshold of 0 will not collapse any nodes, even if no support values are indicated in the test tree file. The other flags have the following effects: -np: Do not partition into regions of discordance. -rr: Use a permissive reference tree ratchet. -rR: Use a strict reference tree ratchet. -rt: Use a permissive test tree ratchet. -rT: Use a strict test tree ratchet. -uc: Use only weak time constraints. -UC: Use no time constraints. -lx: Force the algorithm to perform at least x edits in searching for a solution. This option is useful if a ratchet is finding time-unconstrained paths of length n that have no legal permutations under a time constraint. Setting x > n forces the ratchet to search for solutions with more edits, that are more likely to yield paths which are legal in time. -pt: Suppress the printing of the input trees in the output generated by EEEP. This can save storage space, but the output trees are necessary for the Perl script that generates edit path visualizations. ***Input EEEP requires two Nexus format tree files as input. A sample Nexus tree is shown below: tree myT (A:0.1,B:0.1,(C:0.5,(D:0.7,(E:0.3,F:0.1)0.5:0.4)1.0:0.9)1.0:0.4); 'tree' identifies the entity as a tree. 's006' is the specific name of the tree. The final entity in the line is a parenthetical representation of the tree, with taxon names (indicated with letters above), branch lengths (which occur after a colon), and bootstrap or posterior probability values (which occur after each closing parenthesis). The above tree is unrooted because it has a trifurcation at the base (between A, B, and the partition containing C,D,E, and F), so it would be suitable as a test tree for EEEP. The reference tree must be rooted, and thus have only a bifurcation at the base. One way to root the tree above would be to put parentheses around 'A:0.1,B:0.1'. When EEEP is invoked, it calls the script 'convertNewick.pl' which converts the reference and test tree files into bipartition representations, which are saved with the suffix '.tempParts'. EEEP then reads these files and performs the edit path analysis. ***Output EEEP writes to standard output, which can easily be redirected to a file by appending '> filename' to the EEEP command line in Win32 or UNIX. A sample command line and output are shown below: //////////////////// ./EEEP MRP_095_rooted.txt s020-g03-t0.51-6651.testParts 0.95 -rR Test tree: s020-g03-t0.51-6651 Discordant regions: 2 Equivalence classes for region 0: (206,222) (223,224) (230,232) (233,235,236) (238,240,241,242,243) Equivalence classes for region 1: (250) (251) (252) (253) (254) (255) (256) (257) (258) Permuting region 0 1 edits: !**!** 0 topologies after 1 edits. 2 successful edits for region 0. Permuting region 1 1 edits: 8 topologies after 1 edits. 2 edits: !!!!!!!!!!!!!!!!!!!!!! 1 topologies after 2 edits. 22 successful edits for region 1. ---------------------------------------------- Unpermuted paths in region 0 * R,223>233 * R,233>223 ---------------------------------------------- Unpermuted paths in region 1 * R,251>253,(251)P(253)>258 * R,251>253,251>258 * R,251>253,253>258 * R,251>253,258>(251)P(253) * R,251>257,251>252M(257) * R,251>257,252M(257)>251 * R,251>257,253>251 * R,251>257,258>251 * R,251>258,(251)P(258)>253 * R,251>258,253>(251)P(258) * R,251>258,258>253 * R,252>257,251>252M(257) * R,252>257,252M(257)>251 * R,253>251,(253)P(251)>258 * R,253>251,253>258 * R,253>251,258>(253)P(251) * R,253>258,(253)P(258)>251 * R,253>258,251>(253)P(258) * R,258>251,(258)P(251)>253 * R,258>251,253>(258)P(251) * R,258>253,(258)P(253)>251 * R,258>253,251>(258)P(253) ---------------------------------------------- Trying to find legal permutations: ********************** ---------------------------------------------- 2 legal edit paths for region 0: 223>233 :: 1; 233>223 :: 1; ---------------------------------------------- ---------------------------------------------- 10 legal edit paths for region 1: 251>253,251>258 :: 1,2; 2,1; 251>253,253>258 :: 1,2; 251>257,253>251 :: 1,2; 251>257,258>251 :: 1,2; 251>258,258>253 :: 1,2; 253>251,253>258 :: 1,2; 2,1; 258>251,258>253 :: 1,2; 2,1; ---------------------------------------------- Minimum edit distance for family s020-g03-t0.51-6651: 3 Total path count: 20 Unique path count: 14 ------------------- END OF RUN -------------------- tree MRP095rooted ((((144,(66,(1,(94,93)))),(97,98)),((68,(67,69)),((47,(50,46)),(5,(37,(48,49)))))),((4,100),(((((61,(6,(7,(116,115)))),(((44,43),(40,(126,(41,((85,((90,(88,(89,91))),(84,83))),(86,87)))))),(82,(79,(81,80))))),(57,(104,(55,(131,(56,54)))))),(35,(96,(26,(25,27))))),(((42,(101,10)),(24,(8,135))),((127,((99,(95,60)),((136,141),(138,137)))),(((30,((9,(((124,(28,29)),(51,(130,(52,53)))),(140,92))),(103,102))),(134,((121,(122,(22,(23,21)))),(20,19)))),((17,(143,(129,(39,38)))),(((18,(11,((78,(2,3)),(45,(13,12))))),(71,72)),(((132,(70,(119,(117,118)))),(123,(58,59))),(((109,110),(111,112)),(125,((63,(65,64)),(76,(((((133,(113,114)),((31,((33,34),(32,(139,77)))),(75,(74,73)))),((108,120),(14,(16,15)))),(128,(62,36))),((106,(142,107)),105))))))))))))))); tree s020-g03-t0.51-6651 (32:0.004,(((114:0.002,113:0.002)1.00:0.112,((11:0.300,(8:0.613,((92:0.081,140:0.055)1.00:0.193,(29:0.171,124:0.187)1.00:0.420)1.00:0.237)1.00:0.192)1.00:0.155,(65:0.279,(109:0.051,110:0.030)1.00:0.249)1.00:0.087)1.00:0.144)1.00:0.146,(75:0.003,(73:0.002,74:0.002)0.56:0.003)1.00:0.025)1.00:0.040,((31:0.004,(77:0.002,139:0.002)1.00:0.007)0.73:0.003,(33:0.002,34:0.002)0.43:0.002)0.96:0.005); Solved: 1 Unsolved: 0 Untouched: 0 ///////////////////// The first line printed out contains the name of the test tree under consideration, and the number of discordant regions to be reconciled independently from one another. For each region of discordance, equivalence classes are shown. Equivalence classes arise when the test tree is a projection of the reference tree, in which case two or more edges in the reference tree may split the exact same set of taxa in the test tree (see the manuscript for more details). Any edges that are contained within the same set of parentheses are completely equivalent to one another, and only one alternative will be considered by EEEP and printed in the solution set. EEEP then tries to perform a series of SPR operations within each region of discordance to try and achieve reconciliation between the reference and test trees. Asterisks indicate the progress of the run, and exclamation marks are printed when a solution is found. The total number of legal paths found for each region are shown when reconciliation has been achieved. Since a ratchet was specified with -rR in this run, the paths shown as 'Unpermuted paths' were found in the absence of a time constraint. The next step is to permute these paths to see if solutions can be found that do not violate time constraints. If legal permutations are found for each region, the count of legal paths will be printed, followed by the actual paths obtained. In this case, Region 1 required two successive edits to reconcile the two trees, and in three cases the order in which edits are applied [(A,B) or (B,A)] is not important. To avoid excessive output, edit paths that are merely permutations of one another are printed on a single line, followed by all of the orderings of the set of edits that were found. The way to interpret an example such as A>B,C>D,E>F :: 2,3,1 is to number the three edits in their presented order (so C>D is #2), then reorder them in the manner specified by the numbered list, in this case yielding C>D,E>F,A>B. Following the legal edit paths, the minimum edit distance for the whole trees is printed. The total number of legal paths, and the total number of unique paths (e.g., counting paths that are permutations of each other only once). Since -pt was not specified, the input trees are printed to the output file. Finally, the count of test trees that were solved, unsolved (no legal paths found), and untouched (edit distance at the specified threshold = 0) is indicated.