3 commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
obache
|
8470a558d6 |
Update phylip to 3.69.
Based on PR#43388 by Wen Heping.

version 3.69 (September, 2009)
* If there are more than about 50 species in the tree, Treedist can fail to compute distances among the trees. This is due to an overflow problem inadvertently introduced in version 3.68. There is no workaround with the 3.68 executable, but if you can recompile you can fix it by replacing line 1179 of treedist.c, which is currently
      maxgrp = pow(2,tip_count);
  by
      maxgrp = 100000;
  This is fixed in version 3.69. Versions prior to 3.68 will not have this problem.
* In Dnacomp, Pars, and Dollop, if the Shimodaira-Hasegawa test is performed and there are trees perfectly tied with the best tree, the P values were incorrect (being 0 instead of 1).
* A team from Iowa State University noticed that time was being wasted in calculations in Dnapenny in the bound calculations. This has now been remedied and it should be noticeably faster.
* In the molecular likelihood programs, ancestral state probabilities were being incorrectly calculated for user trees that had internal multifurcations. This has been corrected.

version 3.68 (August, 2008)
* We received some reports that Dnaml was freezing on some data sets in the Windows executables. This seems to have been because of incorrect handling of small increases in the log-likelihood, causing the algorithm to fall into loops. It was temporarily cured in version 3.67 by changing the compiler optimization level, downwards from -O3 to -O1. Now the underlying problem of small differences of log-likelihood has been addressed too, so you should use the new Windows executables (3.68) to avoid having these problems on Windows systems.
* We found that the .DMG (disk image) archive for Mac OS X contained executables for the Intel Mac but not universal binaries that would work on both Intel Mac and PowerPC systems. Oops. We recompiled and reposted the archives (on 23 August 2007). They should work on both kinds of systems now.
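The Treedist overflow described in the 3.69 notes comes from letting a limit grow as 2^tip_count. The following is only a minimal sketch of why a fixed cap avoids the problem. It assumes the limit is held in an integer type, which the changelog does not state. `capped_maxgrp` and `MAXGRP_CAP` are hypothetical names, not identifiers from treedist.c, and the upstream fix simply assigns the constant.

```c
/* Hypothetical illustration; not code from treedist.c. */
#define MAXGRP_CAP 100000L  /* the constant quoted in the changelog */

/* Returns 2^tip_count while that stays at or below MAXGRP_CAP,
   and MAXGRP_CAP otherwise, so the result can never overflow. */
long capped_maxgrp(int tip_count)
{
    long groups = 1;
    int i;

    for (i = 0; i < tip_count; i++) {
        if (groups > MAXGRP_CAP / 2)
            return MAXGRP_CAP;   /* doubling again would exceed the cap */
        groups *= 2;
    }
    return groups;
}
```

With this helper, a 10-species tree gives 1024, while anything from 17 species upward is held at the cap instead of growing without bound.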
* We were told that on a Linux computer with a 64-bit Intel Itanium chip the bootstrapping program Seqboot creates blatantly wrong bootstrap samples with characters sampled too many times (or none). On a 64-bit AMD processor the program works fine. The problem is in the random number function "randum" in phylip.c. It seems to be a problem with optimization on the GCC compiler. It is cured by dropping the compiler optimization level from -O3 to -O2.
* In Protdist the program would blow up if it computes a distance greater than 100.0. This is owing to a subscript error in the code that writes out the distances, in line 1874 where
      else if (d[j][k] < 1000.0)
  should have been
      else if (d[i][j-1] < 1000.0)
  If you have this problem and cannot upgrade to version 3.68 or recompile the program with this change, and your data comes from bootstrapping, try omitting just that replicate, or else rerunning the bootstrapping with a different random number seed (which might not happen to drop as many of the sites that caused these two sequences to be so distant).
* When Dnadist is used and the lower-triangular output format is chosen, the resulting file has headers at the top of columns and is human-readable but is not machine-readable. The (temporary) solution is not to use this option for the time being.
* In Mac OS X, Drawgram produces some alarming lines of text at the top of its terminal window when it first runs. These are just scripting commands that were not erased because we do not clear the screen at the right moment. The workaround is simply to ignore these commands.

version 3.67 (July, 2007)
* We had our first reports on the behavior of PHYLIP Windows executables on Windows Vista. The programs work fine. The only thing that did not work is the self-extraction program that unpacks the archives. For some reason it did not work on Vista.
The work-around was that, after you got an archive file like phylipwx.exe onto your system, you had to change the file extension from "exe" to "zip". Then you had to click on the file. You were presented with options including "Extract all files". If you chose that, the archive was unpacked. The programs would then work. Although we provided "zip" archive versions of the package, we have now got a new version of WinZip which is supposed to have a self-extractor that works on Windows Vista, and it was used to produce the self-extracting archive since 27 August 2007.
* On Mac OS X systems, if our distributed executables are placed in a folder whose path contains a name with an internal blank, such as
      /Users/ianr/the files/
  then the script that causes each of our programs to run when you click on the corresponding icon does not work, and there is an error message. This is a scripting error in our Mac OS X setup, and it was corrected in version 3.67. In the meantime, if you have this problem, the solution is to put PHYLIP in a folder whose path does not have any folder that has a blank in its name. In the above example, all that would be necessary is to rename the folder "the files" to "the_files".
* We are still getting reports of stickiness of the tree, and occasionally of negative branch lengths, in Dnamlk and Promlk which do not do as good a job of searching for best trees as they should. This has turned out to be an issue of nodes getting stuck when they collide in moving them on the "time" scale. Some major changes were made to the code in the 3.67 release to eliminate this stickiness and give a good search.
* An error was made in putting together the matrices for the PAM mutation model in Protdist, Proml, and Promlk. These programs will give PAM calculations inconsistent with earlier (v3.65 and before) versions, and with other programs. The matrices were corrected in version 3.67. This does not affect JTT or PMB models.
* The W (within-species variation) option of CONTRAST uses somewhat incorrect equations to infer within-species covariances and phylogenetic covariances. These were corrected in version 3.67. Anyone severely impacted by the problem in the meantime should contact me.
* Protdist sometimes results in distances greater than or equal to 100.000. When this happens, the distance can run together with the previous number in the output file. For example, a distance of 0.31766 followed by one which is 127.43986 might look like this: "0.31766127.43986". This causes trouble in any program that tries to use this distance matrix. One symptom of this may be the program reporting that two distances which are expected to be equal are unequal -- but then printing them both out, and they appear to be equal! In this case it would print out a message warning you that 0.31766 was not equal to 0.31766. It is doing so because one of them is actually seen by it as 0.31766127 and the other 0.31766. In all future versions, there will be a blank printed between the two numbers. For the present, use an editor to find them and insert the blank by hand. If this is difficult, a Sed script (which can be used on Linux or Unix machines) has been written by Doug Scofield, and is available from him at: this link. Many thanks to him for this. As you can see, this problem is the result of us not thinking of what happens when the distances are big, and the fix in the code is trivial -- just ensuring that there is at least one blank between successive distances.
* Contml, with gene frequencies, has a bug in the transformation to variables that have approximate Brownian motion as their evolutionary process. This can lead to weird trees. It might be preferable to go back to the 3.5c version if you need to use Contml for this. We believe that this will be correctly fixed in the 3.67 version.
If people can recompile the source code, they can replace the function transformgfs with this one and recompile (you should be able to save it from your browser using the Save As choice in its File menu).

version 3.66 (August, 2006)
* Program Treedist was found to compute the Branch Score Distance incorrectly. It will, in most cases, get the branch lengths in terminal branches incorrect and then be likely to find a nonzero distance between trees when they are really identical, and incorrect distances when they are not identical. Alas, there is no workaround to avoid this. All distances done with this option before version 3.66 should be regarded as incorrect unless all terminal branches have the same length, or unless the order of species in the tree is the same as in the first tree in the file. The Symmetric Difference option, which does not use branch lengths, works properly.
* Program Dnamlk, when run on Linux or Windows systems, sometimes gave negative branch lengths for some branches on the tree. This is bad. Although we at first thought that this was a compiler bug, it seems to be a lack of initialization of some pointers. Program Promlk may have the same problem, as they share code. If you have this problem you can work around it by not using the Global menu option when running Dnamlk (or Promlk). If you need more extensive tree search the J (Jumble) option may be your best bet.
* On Windows (at least, on Windows XP), our executables for version 3.65 produce output files (outfile) and output tree files (outtree) that have end-of-line characters that result in their being hard to read in the Notepad editor. They appear as one big line. If you use the Wordpad editor, or Microsoft Word itself, the files will be readable. This is an end-of-line compiler setting we got wrong when compiling the programs.
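The Protdist run-together problem from the 3.67 notes ("0.31766127.43986") is easy to reproduce with fixed-width output fields. The sketch below is not taken from protdist.c: `format_row` is a hypothetical helper and the field width of 8 is an assumption chosen only to show the effect. It formats a row of distances either with bare fixed-width fields or with an explicit blank in front of each field, which is the kind of fix the notes describe.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper; the "%8.5f" field width is an assumption,
   not the format actually used by Protdist.
   Writes n distances into buf. With force_blank == 0 the fields are
   only width-padded, so a value wider than the field touches its
   neighbour. With force_blank != 0 a literal blank precedes every
   field. Returns the number of characters written, or -1 if the
   buffer is too small. */
int format_row(char *buf, size_t size, const double *d, int n,
               int force_blank)
{
    size_t used = 0;
    int i;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (i = 0; i < n; i++) {
        int w = snprintf(buf + used, size - used,
                         force_blank ? " %8.5f" : "%8.5f", d[i]);
        if (w < 0 || (size_t)w >= size - used)
            return -1;           /* output would not fit */
        used += (size_t)w;
    }
    return (int)used;
}
```

Calling it on the two distances from the notes with `force_blank` set to 0 yields a single fused token, while setting it to 1 keeps the two numbers separated by a blank that a downstream parser can rely on.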
* Programs Dnaml and Proml sometimes failed to iterate branch lengths in trees enough -- this can result in them failing to find as good a tree as the molecular clock versions Dnamlk and Promlk, a phenomenon that is not supposed to occur. The problem results from the iteration code in function makenewv giving up too easily when branch lengths are very short. The resulting branches get "stuck" at length 0 when they should not. If you can recompile the programs, the problem can be solved by the following changes:
  o In file phylip.h change the value of the constant iterations to 8 instead of 4.
  o In files dnaml.c and proml.c, change function makenewv to replace
        done = fabs(y-yold) < epsilon;
    by
        done = fabs(y-yold) < 0.1*epsilon;
  o In dnaml.c, in function makenewv, also replace
        if (yold < epsilon) yold = epsilon;
    by
        if (y < epsilon) y = epsilon;
  We think these fix the problem. Some more thorough fixes are implemented in the 3.66 code.
* The Mac OS X archives (in .dmg form) appeared at first sight not to have any executables directory in the package. This is owing to strange placement of icons once we package the files. The OS X executables are there -- their folder is just way down the window. Use the scroll bar to look for them. You should be able to use the View/Rearrange menus to make the folder icons appear in a more reasonable place. (Or this can be done once all of the contents of the .dmg archive are copied out to another folder).
* Programs Dnaml and Proml (but not Dnamlk or Promlk), from version 3.64 on, crashed if the Categories (C) option is used, even if all categories are given the same rate of change. This unpleasant behavior does not occur if the menu option for "Speedier but rougher analysis" is changed to "No, not rough". That slows down the run but allows it to succeed.
The fix turns out to be that all instances in dnaml.c of calls to function copynode (or all instances in proml.c of calls to prot_copynode) that involve an argument lrsaves should have the third argument be rcategs instead of categs.
* In Seqboot, when menu item J is set to Permute species within characters it is impossible to change menu item W (character weights). This is a glitch in the menuing code. If you can change the source code and recompile, change at line 215 of seqboot.c:
      ((permute || ild || lockhart) && (strchr("ACDEFSJPRXNI%1.20",ch) != NULL)) ||
  to be:
      (permute && (strchr("ACDEFSJPRWXNI%1.20",ch) != NULL)) ||
      ((ild || lockhart) && (strchr("ACDEFSJPRXNI%1.20",ch) != NULL)) ||
  If you are stuck with our executables and need this feature, you can also work around it in the following devious way:
  1. Set menu item J to some other setting where menu item W appears in the menu, such as Bootstrap,
  2. Change menu item W
  3. Then change item J to Permute species within characters
* Our Makefile for Unix had some problem finding some of the X-windows libraries on Mac OS X systems on Intel Macs. This prevented the compilation of Drawtree and Drawgram. You might have had to use those two programs by using their PowerMac Mac OS X executables. All the other programs did compile and run correctly on Intel Macs.

version 3.65 (August, 2005)
* Protpars sometimes gave the result "0 trees found" or else simply hung and did not complete its run. This was a bug. The program should always get at least one tree -- if it does not, that is a bug and not a judgement on your data, provided the data file is in our format!
* Proml and Restml, and maybe some others, seg-faulted when run on enough multiple data sets, as in bootstrapping. If you have a version that has this problem and can recompile the programs, here is a fix for Proml and Restml.
In function "inputdata", replace the lines
      makeweights();
      if ( firstset )
        alloclrsaves();
      else
        resetlrsaves();
  by
      if ( !firstset )
        freelrsaves();
      makeweights();
      alloclrsaves();
  and you can also eliminate the now-unnecessary function "resetlrsaves". (Thanks to Jacques Rougemont for this).

version 3.64 (July, 2005)
* Treedist had trouble on Windows systems reading trees. This was due to problems with the ftell command on CygWin. It has been fixed by having the files read as binary files.
* Trees with branch lengths compared using Treedist may have incorrect distances when evaluated as unrooted trees, owing to miscalculation of branch lengths for the bottommost branches.
* Runs of Seqboot on Mac OS X systems with gene frequencies data have shown incorrect results -- wrong numbers of loci sampled, for example. This is due to bad code generated by the Metrowerks Codewarrior compiler when set to higher levels of optimization (our source code is OK). We will recompile the program at a lower level of optimization in the next bug-fixing release. If you can follow our compiling instructions and have this compiler, you can produce a correctly working executable. Alternatively you can use the gcc compiler and use our Unix Makefile to recompile this program (by typing "make seqboot"). This is quite easy to do and all Mac OS X releases have the gcc compiler in them -- it only needs to be installed.
* In runs of Proml, Dnaml or Restml with user trees, if one puts in a user tree with an internal multifurcation and asks the program to re-estimate the branch lengths for that tree, the branch lengths in only two of the furcs will be re-estimated if they already have branch lengths. This is due to a bug in the function "initrav" causing it to fail to enter one or more of the subtrees. A workaround until the next release is as follows: Use Retree to remove all branch lengths on the tree. The tree's branch lengths will then all be re-estimated when it is used as a user tree.
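The Proml/Restml fix for multiple data sets in the 3.65 notes replaces "reset the old storage" with "free it, then allocate again for the new data set". The sketch below shows that pattern in isolation. `scratch_t` and `scratch_prepare` are hypothetical stand-ins; the changelog does not show the real lrsaves structures or the bodies of alloclrsaves and freelrsaves.

```c
#include <stdlib.h>

/* Hypothetical stand-in for per-data-set scratch storage;
   not the actual lrsaves structures from PHYLIP. */
typedef struct {
    double *values;
    size_t  count;
} scratch_t;

/* Mirrors the order in the quoted fix: release the previous data
   set's storage unless this is the first set, then allocate storage
   sized for the current set. The caller must pass firstset != 0 only
   when s holds no live allocation.
   Returns 0 on success, -1 on failure. */
int scratch_prepare(scratch_t *s, int firstset, size_t count)
{
    if (!firstset) {
        free(s->values);
        s->values = NULL;
        s->count = 0;
    }
    if (count == 0)
        return -1;
    s->values = calloc(count, sizeof *s->values);
    if (s->values == NULL)
        return -1;
    s->count = count;
    return 0;
}
```

Because the storage is rebuilt for every data set, a later set that is larger than the first one gets a buffer of its own size instead of reusing one sized for an earlier set.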
* The example output in the Treedist documentation gives distances computed by version 3.62 or earlier, in which the tree distance is not square-rooted.

version 3.63 (December, 2004)
* The DNA and protein likelihood programs could have problems with underflow if very large numbers of sequences were analyzed. Underflow protection code was needed to make this much less likely to happen.
* A number of programs had the problem that when M (multiple data set) runs are done, if the data sets differ in the number of characters from data set to data set, they only allocate enough memory for the first data set, and then can crash on subsequent, larger, data sets. For bootstrap and permutation runs this should not be a problem, but for jackknife runs it might be. One work-around until we fixed this was to move the data set with the most characters to the front, so that enough space is allocated. The programs we think had this problem are: Clique, Dnacomp, Proml, Promlk, Protdist, Dollop, Gendist, Pars, Restml, and Restdist.
* When the Branch Score distances are computed in program Treedist, the sum of squares of differences between branches was not square-rooted, as the documentation web page says it is.
* Fitch and Contml may die when asked to do Jumbling, in some cases.
* Dnaml had inconsistencies in results when branch lengths of a user tree were estimated, and when the same numbers were provided in the user tree.
* Trees fed into Contrast could cause trouble if they contained unifurcations (forks with only one descendant). The program did not complain about this, as it should have.
* End-of-line characters in input files in certain cases caused trouble in Mac OS X (for example when the files came over from Windows).
* When printing a rooted tree out in Kitsch, the root was not placed intermediate between its two descendants.
* The variable numtrees was sometimes used when still uninitialized in Pars.
* Restdist had a site-aliasing bookkeeping bug that could lead to incorrect results.
* Restml would not allow site lengths greater than 8, because an array was of fixed size when it should have been dynamically allocated.
* The variable name howmany conflicts with predefined names in some older Sun compilers. It will henceforth be deliberately misspelled to avoid this.
* With larger data sets being analyzed, Proml, Promlk, Dnaml, and Dnamlk have had to have underflow protection code installed, as likelihoods were getting too small.
* Treedist was giving wrong answers when asked to compute all distances between trees in two files that had unequal numbers of trees. This was a bookkeeping error.
* The variable scanned was uninitialized in the Drawtree and Drawgram programs, which could sometimes cause problems.
* The lack of initialization of a variable, delta, in Dnadist meant that different results could be obtained from interactive runs than were obtained in runs under the control of a command file.
* Dnadist was sometimes stopping when encountering sequences that had an infinite or indeterminate distance (i.e. when the sequences were too different or when they had no sites in common), when it should have printed out "-1" and continued. When it was supposed to print "-1" in some recent versions of PHYLIP it printed "1.0000" instead.

version 3.62 (September, 2004)
* The ftp link used by our "Get Me PHYLIP" page to fetch the version 3.62 Linux gzip'ed sources and documentation archive was incorrect until recently (I hadn't updated it to fetch version 3.62). If you had trouble fetching this archive in version 3.62, please try one more time. It will work now.
* A number of people have found, with Fitch and with Contml, that version 3.61 crashes on multiple Jumbling (option J) or on bootstrap runs. This is fairly serious. It does not happen with versions of these programs earlier than 3.6 (such as 3.6a3 or 3.573c). This release fixes these problems. |
||
joerg
|
3d0d19d8ea | Don't depend on hard-coded /usr/X11R6 and honor CFLAGS. Bump revision. | ||
ben
|
ee36ecf16f |
Update phylip to version 3.61. No ChangeLog included. However there is a
very long list of bug fixes since 3.57 at: http://evolution.genetics.washington.edu/phylip/bugs.html |