It’s an all-too-common situation: you would like to infer a phylogeny for a set of organisms, you try a few different methods, and you end up with many different trees. Even with the most careful choice of software, settings, and tree priors, and the most beautifully converged Bayesian analysis, you may find that the maximum clade credibility (MCC) tree has low posterior support for certain deep clades.


Anole MCC tree with posterior supports, from Geneva et al. [1]

Tree inference is very complicated, particularly for species trees, and is hampered by factors including the vast size of tree space, conflicting signals from different genetic loci, confusing signals from convergent evolution, and non-tree-like evolution (recombination, hybridisation, etc.). Geneva et al. experienced just this sort of difficulty when they performed a comprehensive Bayesian phylogenetic analysis of the distichus group of trunk ecomorph anoles [1]. Their MCC tree is reproduced here, and the posterior support values show uncertainty in the branching structure of various deep clades. These uncertain splits can be resolved in many different combinations. We wanted to see which alternative trees were supported by the data.

In our recent paper [2] we present a method for handling phylogenetic uncertainty and incongruence. It takes a set of trees and “maps” them into a simple plot where similar trees are grouped together and dissimilar trees are placed further apart. Where many similar trees are clustered together, contour lines indicate the density of points in that region. We began the development of our method theoretically, making sure we had a robust mathematical definition of tree distance which would correspond to biological intuition and lend itself to good-quality map projections. Then, working closely with biologists, we fine-tuned our method for specific applications with real data and wrote the R package treescape [3] so that anyone can use it. There’s even a handy web app version which requires no knowledge of R.


treescape MDS plot: each point represents a tree, and proximity of points represents similarity of trees. 1000 trees are plotted here, many of which are identical, so contour lines indicate the density of points. Colours correspond to clusters of similar trees.
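
To make the idea of a tree distance concrete, here is a minimal Python sketch of a topology-only distance in the spirit of the metric in [2]. It is an illustration under simplifying assumptions, not the treescape implementation: trees are toy nested tuples, the function names are hypothetical, and branch lengths are ignored. Each rooted tree is turned into a vector that records, for every pair of tips, how many edges lie between the root and that pair's most recent common ancestor (MRCA). The distance between two trees on the same tips is the Euclidean distance between their vectors.

```python
from itertools import combinations
from math import sqrt


def mrca_depths(tree):
    """Map each pair of tips to the number of edges from the root to their MRCA.

    `tree` is a toy representation (an assumption of this sketch):
    a tip is a string label, an internal node is a tuple of children.
    """
    depths = {}

    def walk(node, depth):
        # Returns the list of tip labels found below `node`.
        if isinstance(node, str):
            return [node]
        child_tips = [walk(child, depth + 1) for child in node]
        # Tips that sit under different children have `node` as their MRCA.
        for left, right in combinations(child_tips, 2):
            for a in left:
                for b in right:
                    depths[frozenset((a, b))] = depth
        return [tip for tips in child_tips for tip in tips]

    walk(tree, 0)
    return depths


def tree_vector(tree, tips):
    """Vector of root-to-MRCA depths, one entry per pair of tips in a fixed order."""
    depths = mrca_depths(tree)
    return [depths[frozenset(pair)] for pair in combinations(sorted(tips), 2)]


def tree_distance(tree_a, tree_b, tips):
    """Euclidean distance between the two trees' vectors (topology only)."""
    va, vb = tree_vector(tree_a, tips), tree_vector(tree_b, tips)
    return sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))


tips = ["a", "b", "c", "d"]
t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
t3 = (("a", "b"), ("c", "d"))

print(tree_distance(t1, t2, tips))  # sqrt(2), about 1.414
print(tree_distance(t1, t3, tips))  # 2.0
```

A matrix of such pairwise distances is the kind of input that a projection method like MDS needs in order to place each tree as a point on a map. Swapping two neighbouring tips (t1 versus t2) gives a smaller distance than rearranging the tree near the root (t1 versus t3).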

When we applied our method to the trees from the analysis of Geneva et al. [4], we found that there were distinct “clusters” of equally likely tree topologies. It is reassuring that the MCC tree belongs to the largest of these clusters (highlighted on the plot by a yellow triangle), but clearly it cannot represent all of the likely tree shapes on its own. By taking a representative tree from each of the six or so tight clusters, we obtain a more thorough summary of the range of trees supported by the analysis. Such representative trees, taken from the geometric “centre” of each cluster, are credible summary trees with real branch lengths, unlike trees from other summary methods which can suffer from strange behaviour such as negative branch lengths.
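
One simple way to read “geometric centre” is sketched below in Python. It assumes each tree in a cluster has already been converted to a numeric vector, and it picks the tree whose vector lies closest to the coordinate-wise mean of the cluster. The function name is hypothetical, and this is an illustration of the idea rather than a description of what treescape does internally.

```python
from math import dist


def representative_index(vectors):
    """Index of the vector closest to the coordinate-wise mean of `vectors`.

    `vectors` is a non-empty list of equal-length numeric sequences,
    one per tree in a single cluster (an assumption of this sketch).
    """
    n = len(vectors)
    centre = [sum(column) / n for column in zip(*vectors)]
    # Ties are broken in favour of the earliest tree in the list.
    return min(range(n), key=lambda i: dist(vectors[i], centre))


cluster = [
    [2, 1, 0, 1, 0, 0],
    [2, 1, 0, 1, 0, 0],
    [1, 2, 0, 1, 0, 0],
]
print(representative_index(cluster))  # 0
```

Because the result is always one of the trees in the cluster, and not an average constructed from them, it keeps the branch lengths that tree was sampled with.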

We find that there are alternative placements of certain taxa, particularly the ocior, distichus, dominicensis2 clade, and (in our supplement) we explore some of the knock-on effects of using these different tree shapes when analysing the evolution of the anoles, specifically their geographical origins and transitions in their dewlap colour. For instance, we show here a representative tree from each of two different clusters on the map. The trees support ocior, distichus, and dominicensis2 being more closely related to anoles from the East of Hispaniola (the North paleo-island) or the South-West (the South paleo-island) respectively. Both evolutionary histories are supported by the data; in the absence of further research, there is no reason to exclude any of the alternative representative trees identified by our method.

Representative tree from top left cluster


Representative tree from top right cluster


[1] Geneva, A. J., Hilton, J., Noll, S. and Glor, R. E. (2015). Multilocus phylogenetic analyses of Hispaniolan and Bahamian trunk anoles (distichus species group). Molecular Phylogenetics and Evolution, 87:105-117.

[2] Kendall, M. and Colijn, C. (2016). Mapping phylogenetic trees to reveal distinct patterns of evolution. Molecular Biology and Evolution, first published online June 24, 2016. DOI: 10.1093/molbev/msw124

[3] Jombart, T., Kendall, M., Almagro-Garcia, J. and Colijn, C. (2015). treescape: statistical exploration of landscapes of phylogenetic trees. R package version 1.9.17.

[4] Geneva, A. J., Hilton, J., Noll, S. and Glor, R. E. (2015). Data from: Multilocus phylogenetic analyses of Hispaniolan and Bahamian trunk anoles (distichus species group). Dryad Digital Repository.

Michelle Kendall