1 Theoretical Population Biology 73 (2008) Testing for phylogenetic signal in phenotypic traits: New matrices of phylogenetic proximities Sandrine Pav...

Author:
Carmel Matthews

0 downloads 22 Views 1MB Size

Theoretical Population Biology 73 (2008) 79–91 www.elsevier.com/locate/tpb

Testing for phylogenetic signal in phenotypic traits: New matrices of phylogenetic proximities Sandrine Pavoinea,, Se´bastien Ollierb, Dominique Pontiera, Daniel Chessela a

Laboratoire de Biome´trie et Biologie Evolutive, UMR CNRS 5558, Universite´ de Lyon, Universite´ Lyon 1, France b Universite´ Paris-Sud 11, Faculte´ des Sciences d’Orsay, Unite´ Ecologie, Syste´matique et Evolution, De´partement Biodiversite´ UMR-8079 UPS-CNRS-ENGREF, France Received 2 March 2007 Available online 12 October 2007

Abstract Abouheif adapted a test for serial independence to detect a phylogenetic signal in phenotypic traits. We provide the exact analytic value of this test, revealing that it uses Moran’s I statistic with a new matrix of phylogenetic proximities. We introduce then two new matrices of phylogenetic proximities highlighting their mathematical properties: matrix A which is used in Abouheif test and matrix M which is related to A and biodiversity studies. Matrix A uniﬁes the tests developed by Abouheif, Moran and Geary. We discuss the advantages of matrices A and M over three widely used phylogenetic proximity matrices through simulations evaluating power and type-I error of tests for phylogenetic autocorrelation. We conclude that A enhances the power of Moran’s test and is useful for unresolved trees. Data sets and routines are freely available in an online package and explained in an online supplementary ﬁle. r 2007 Elsevier Inc. All rights reserved. Keywords: Autocorrelation; Computer simulation; Cyclic permutation; Cyclic ordering; Geary’s c statistic; Moran’s I statistic; Permutation test; Phenotypic trait; Phylogenetic diversity; Phylogenetic signal

1. Introduction In the last decades, phylogeny has been more and more often recognized as a potential confounding factor when comparing the states of traits among several species. It is now widely accepted that given the phylogenetic links among species, species values may not be independent data so that the phylogenetic context should be taken into account when assessing the statistical signiﬁcance of crossspecies patterns (Martins and Hansen, 1997). In this context, Abouheif (1999) introduced a diagnostic test for phylogenetic signal in comparative data. It derives from a test for serial independence (TFSI) developed originally by von Neumann et al. (1941) in a non-

Corresponding author. Present address: Sandrine Pavoine, UMR 5173 MNHN-CNRS-P6 ‘Conservation des espe`ces, restauration et suivi des populations’ Muse´um National d’Histoire Naturelle, CRBPO, 55 rue Buffon, 75005 Paris, France. Fax: +33 1 4079 3835. E-mail address: [email protected] (S. Pavoine).

0040-5809/$ - see front matter r 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.tpb.2007.10.001

phylogenetic context. The TFSI detects dependences in a sequence of continuous observations by comparing the average squared differences between two successive observations to the sum of all successive squared differences. Abouheif (1999) proposed an adaptation of this test for phylogenies, by remarking that any single-tree topology can be represented in T different ways. Each representation is obtained by rotating the nodes within a phylogenetic topology, that is to say by permuting the branches connected to nodes in the original phylogeny (Figs. 1A and B). Such a rotation procedure results in changes in the ranking of the tips without changing the topology. Each rotation of the nodes results in a speciﬁc sequence of the species and thus in a speciﬁc sequence of the values taken by these species for a phenotypic trait under study providing therefore a sequence of continuous observations. The TFSI’s statistic can thus be calculated for each rotation. Abouheif’s (1999) test consists of taking Cmean as a statistic, the mean of TFSI’s statistics calculated on all (or merely a random subset of all) the T possible representations of the tree topology.

ARTICLE IN PRESS 80

S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

Fig. 1. Description of Abouheif’s statistic: (A) Hypothetical phylogenetic tree with four species together with the values taken by the four species for a theoretical trait. These values are given by a Cleveland (1994) dot plot. The scale is horizontal; (B) The set of equivalent representations of the tree topology with the corresponding Ci values; (C) The matrix A associated with the hypothetical phylogeny; (D) Values taken by the TFSI’s statistic Cmean and Moran’s statistic applied to A. These two indices are equal.

In this paper, we revisit the test introduced by Abouheif (1999), demonstrating that its corresponding statistic uses a Moran’s (1948) I statistic, initially developed for spatial analyses but introduced in phylogenetic analyses by Gittleman and Kot (1990). After the background on the

analyses of spatial autocorrelation, we demonstrate that Abouheif’s (1999) test uniﬁes two schools of thought developed around Moran’s (1948) and Geary’s (1954) work (Cliff and Ord, 1981). We propose redeﬁning Abouheif’s statistic for positioning it among other measures developed

ARTICLE IN PRESS S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

in other contexts especially in conservation biology. In addition to a mathematical formalism, this redeﬁnition allows us to provide a biological interpretation of this statistic, previously based on node rotations, a process leading to an approximate value of a more explicit statistic. We introduce then a new phylogenetic proximity matrix (Matrix A) derived from Abouheif’s statistic. It does not rely on branch lengths of the phylogeny; rather, it focuses on topology. We deﬁne this matrix analytically for all phylogenies (resolved or unresolved), independently of the trait under study. It has excellent statistical features that are presented and discussed compared to three phylogenetic proximity matrices previously used in comparative studies. Once again the analytic deﬁnition of this matrix allows us to place it among other measures developed in both evolutionary biology and conservation biology. This formalisation leads us to introduce a second matrix (M), based on May (1990)’s propositions for measuring the taxonomic distinctiveness of a set of species. The performances of Moran’s test with A and M in terms of power and Type-I error are compared with the performances of Moran’s test used with previously deﬁned matrices of phylogenetic proximities. 2. Material and methods 2.1. Data First a simple, theoretical data set (Fig. 1) contains four hypothetical species and will serve to facilitate the description of Abouheif’s statistics. A total of 22 trees were used for computer simulations. We ﬁrst deﬁned trees according to the following models: the symmetric model which provides trees with equal branch length among nodes and 2n tips, where n is the number of bifurcations; the comb-like model which generates trees in which the tips are spread out like the teeth of a comb; Yule model which assumes that all species are equally likely to speciate. Trees have been generated from the three models with the following numbers of species: 8, 16, 32, 64, 128, 256. Four real trees have also been added to the simulations. The objective was to obtain a variety of tree shapes. All both are available in the packages ade4 and ape from the R Core Development Team 2007. We considered three small sized phylogenies: a 17-taxa maple phylogeny (data ‘maples’ in ade4 obtained from Ackerly and Donoghue (1998)), a 18-taxa lizard phylogeny (data ‘lizards’ in ade4 obtained from Bauwens and Dı´ az-Uriarte (1997)), a 19-taxa bird phylogeny (data ‘procella’ in ade4 obtained from Bried et al. (2003)). Finally we applied simulations to a 137-taxa bird phylogeny to study statistical performance on large sample sizes (data ‘bird.families’ in ape obtained from Sibley and Ahlquist (1990)). Phenotypic trait data were generated using an Ornstein– Uhlenbeck (OU) process with functions available in the package ouch from the R Core Development Team 2007. The phenotype is held near a ﬁxed optimum by a force

81

measured by a parameter named a. When aE0, the OU process approximates the Brownian motion model. When a increases, the data become more and more independent. We scaled branch lengths on all phylogenies so that the maximum length from root to tips is equal to 1, and let a vary from 0, 1, 2 to 10 (Diniz-Filho, 2001). The OU model involves two other parameters. The ﬁrst one (y) is the value of the optima. We ﬁxed y ¼ 0. The second parameter (s) measures the standard deviation expected at each generation due to random evolution along the branches of the phylogeny. We ﬁxed s ¼ 1. For the comb-like model, the simulations were performed on the ultrametric tree (pendant edges regularly increases along the comb leading to the alignment of the tips on one line). 2.2. Background: discrepencies between Moran’s and Geary’s statistics in spatial analyses Several tests for phylogenetic signal use methods created for the analysis of spatial data (Cheverud and Dow, 1985; Cheverud et al., 1985; Gittleman and Kot, 1990; Rohlf, 2001). Several measures of spatial autocorrelation exist in the literature. They quantify the degree to which the value of a quantitative trait in a location is correlated to its value in neighboring locations. Here we focus on two widely used statistics: Moran’s (1948) I and Geary’s (1954) c. We present below these two statistics and show how they can be used to measure phylogenetic autocorrelation. In the measurement of spatial autocorrelation, the ﬁrst step is the description of the neighboring relationships by means of a graph. The next step is the translation of these deﬁned relationships into a matrix of neighboring relationships D, whose general term dij is equal to 1 if individuals i and j are neighbors and 0 if they are not. Let R ¼ diag {r1, r2,y, P rn} be the diagonal matrix containing the row sums ri ¼ nj¼1 dij for D. Denote x ¼ (x1, x2,y, xn)t the values taken by a given trait X for each of the individuals, x¯ the average value and z ¼ (z1, z2,y, zn)t the corresponding standardized values of the trait X: ,sﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ n 1X zi ¼ ðxi xÞ ðxk xÞ ¯ ¯ 2. n k¼1 Moran’s I and Geary’s c are deﬁned as zt Dz I ¼ Pn Pn i¼1

j¼1 dij

and

c¼

n 1 zt ðR DÞz Pn Pn . n i¼1 j¼1 dij

In a phylogenetic context, instead of individuals we have taxa, say species. Because the phylogenetic links among species depends on ancestry and evolutionary time, a matrix of phylogenetic proximity is usually not reduced to binary values (1 ¼ neighbor, 0 ¼ not neighbor, Gittleman and Kot, 1990). In fact deﬁning such binary values would mean to deﬁne a ﬁnite number of clades in the trees, and declare that all species from a single clade are neighbor and two species belonging to two different clades are not neighbors, which would considerably reduce the

ARTICLE IN PRESS S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

82

possibilities for deﬁning phylogenetic proximities. Therefore, we use the generalized version of Moran’s and Geary’s statistics (Upton and Fingleton, 1985) replacing the binary matrix D with a proximity matrix W ¼ [wij], where wijX0 and wij ¼ wji. We will test the null hypothesis (H0) of no phylogenetic autocorrelation under the assumption that the observations are random independent drawings from one (or separate identical) population(s) with unknown distribution function. A non-parametric test is deﬁned. For a given trait X, the observed values (x1, x2,y, xn) are randomly permuted around the species, while W is kept unchanged. It is assumed that each species is equally likely to receive a value from (x1, x2,y, xn). For each permutation, the statistic I (respectively c) is calculated. The proportion of randomized I (respectively c) higher (respectively lower) than the observed I (respectively c) indicates whether the observed I (respectively c) is improbable enough to reject the null hypothesis that there is no phylogenetic autocorrelation in the data. The choice of matrix W is not without consequences. It implies a model of evolution and will inﬂuence the power of the test. These two statistics I and c either in their original D or in their generalized W form are at the core of two schools of thought. The ﬁrst school advocates the advantages of Geary’s c statistic, stating for example that c is more sensitive to the absolute differences between pairs of values, whereas I is more sensitive to extreme values (Jumar et al., 1977). The second school advocates the advantages of Moran’s I statistic, mainly because Cliff and Ord (1981) demonstrated that I is more powerful than c, that is to say the probability of rejecting H0 given that H1 is effectively true is higher when I is used rather than c. Actually, the segregation in two distinct schools relies more on mathematical properties than on their consequences for the results of the tests. When applied to real data, the results obtained either by I or by c are very similar (Cliff and Ord, 1981, p. 170; Upton and Fingleton, 1985). Thioulouse et al. (1995) provided some reconciliation between these two statistics, by applying two modiﬁcations. First Geary’s statistic considers variance computed with 1/(n1) rather than 1/n whereas Moran’s statistic uses the latter. Therefore Thioulouse et al. suggested to unify the use of variances by dividing Geary’s index by (n1)/n leading to zt ðR WÞz cn ¼ Pn Pn , i¼1 j¼1 wij while I is unchanged. Secondly they introduced a vector of neighborhood weights to standardize the data: let ri ¼ .P P Pn n n j¼1 wij i¼1 j¼1 wij be the weight attributed to species i, ﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ !,v !2ﬃ u n n n uX X X t z ¼ x r x r x r x i

i

j

j¼1

j

k

k¼1

k

j

j

j¼1

leading to I* and cn . In a phylogenetic context, we will call the weights ri ‘‘relatedness weights’’. Thioulouse et al.

(1995) demonstrated that these two modiﬁcations unify Moran’s and Geary’s concepts with the simple relationship I þ cn ¼ 1. The consequence is that the tests based on either I* or cn are identical and the two measures are complementary: I* measures local correlation while cn measures local variation. However, the introduction of these relatedness weights ri is not without consequences for the results of the tests. Indeed, the weights ri are higher for species close in average from all others in the tree, which is characterized by more internal nodes, speciation events between them and the root node, that is to say, species belonging to species-rich clades. As a result, the isolated couples of species in a phylogenetic tree have much less inﬂuence on test results, while, because they are isolated from the rest of the tree, their similarities with one another together with their deferences from the rest of the species would reinforce a phylogenetic signal. We will show in the three next sections that a formal deﬁnition of Abouheif’s (1999) test uniﬁes Abouheif’s, Moran’s, and Geary’s tests, this time giving all species equal weights. 2.3. Presentation of Abouheif’s test Abouheif’s (1999) Cmean statistic is deﬁned as PT 1 Ci T C mean ¼ 1 Pn i¼1 , 2 j¼1 ðxj xÞ ¯ 2 2 P where C i ¼ n1 j¼1 xK i ðjÞ xK i ðjþ1Þ . In this formula xK i ðjÞ denotes the observed phenotypic trait for the species Ki (j). Imagine that the tree topology is displayed with all the tips aligned with one column as in Figs. 1A and B. The function Ki represents the tips’ order for a given representation i of the tree topology from the top of the column to its bottom. For example, on the ﬁrst topology of Fig. 1B at the top lefthand corner, the ﬁrst species, ‘‘species a’’, is placed at the third position in the column of tips’ letters, that is to say K1(3) ¼ 1. In the 12th which is the last topological representation at the bottom right-hand corner of this Fig. 1B, ‘‘species a’’ is placed at the second position, that is to say K12(2) ¼ 1. When the number of tips increases, it quickly becomes impractical to manage all the possible representations. Abouheif (1999) considers then an approximate solution by sampling a subset of 2999 random rotations of the topology (with rotated nodes), using a program called ‘Phylogenetic Independence’ developed by J. Reeve and E. Abouheif and available at http:// biology.mcgill.ca/faculty/abouheif/. The statistic (Cmean) serves to test for the null hypothesis of no phylogenetic autocorrelation (hypothesis H0). The non-parametric test of H0 proposed by Abouheif (1999) is close to those proposed for Moran’s and Geary’s statistics (Cliff and Ord, 1981). It consists of randomly permuting the original values (x1, x2,y, xn) 999 times so that the species’ values are randomly placed on the tips of the original phylogenetic topology. For each permutation, the statistic Cmean is

ARTICLE IN PRESS S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

calculated. This procedure is done repeatedly until a distribution of Cmean is obtained. The number of randomized Cmean higher than the observed Cmean indicates whether the observed Cmean is improbable enough to reject the null hypothesis that there is no phylogenetic autocorrelation in the data.

83

As a result, Cmean is equal to Moran’s I when one choose matrix A as a proximity matrix (see Appendix A for more explanations). Because A is doubly stochastic, using A in Moran’s I leads to I+cn ¼ 1. 2.5. Description of the new matrix of phylogenetic proximity (A)

2.4. Abouheif’s test turns out to be a Moran’s test We discovered that Abouheif’s test turns out to be an application of Moran’s test with a special phylogenetic proximity matrix A (Fig. 1C and D). We show below that Cmean can be rewritten as a Moran’s I statistic. We demonstrate that the test proposed by Abouheif leads to a new matrix of phylogenetic proximity. Indeed, the Cmean can be rewritten as Pn Pn 2 i¼1 j¼1 aij ðxi xj Þ , C mean ¼ 1 Pn 2 i¼1 ðxi xÞ ¯ 2 where aij is the general term of a phylogenetic proximity matrix A. Each off-diagonal term aij is equal to the frequency of rotations of the nodes which put species j just behind species i. The values of the diagonal terms of A do not change Cmean because aii(xixi)2 ¼ 0, whatever aii. We are thus free to choose the diagonal values that we think most appropriate. We choose that each diagonal term aii is equal to the frequency of representations which put species i at one extremity of the sequence of the P tips, i.e. after all the other species. This choice leads to nj¼1 aij ¼ 1. Matrix A is thus symmetric and each row, as well as each column, has a sum equal to 1. With matrix A the relatedness weights are all equal to ri ¼ 1=n so that means and variances in Abouheif’s test are unweighted. This matrix A has the following interesting properties. It is a n nPmatrix of components aij satisfying aij ¼ aji, aij40 and nj¼1 aij ¼ 1. By verifying this property, matrix A is said to be ‘‘doubly stochastic’’. The row associated with A are thus all equal to P weights P 1/n and ni¼1 nj¼1 aij ¼ n. Consequently, it can be shown that Pn Pn 2 i¼1 j¼1 aij ðzi zj Þ C mean ¼ 1 2n Pn 2 Pn Pn i¼1 zi i¼1 j¼1 aij zi zj ¼1 n and thus zt ðIn AÞz zt ðIn Þz zt Az ¼1 þ n n n zt ðIn Þz zt Az ¼1 þ Pn Pn , n i¼1 j¼1 aij

C mean ¼ 1

where In denotes the identity matrix. Given that zt(In)z ¼ n, t

C mean

z Az ¼ Pn Pn i¼1

j¼1 aij

.

For unrooted trees, the rotation of nodes in a tree is called cyclic permutation. For a given cyclic permutation the arrangement of the set of leaves is called ‘‘cyclic ordering’’. Cyclic permutations are studied to compare trees and to analyze tree metrics (functions deﬁning distances between tips, Semple and Steel, 2004). We can show now that matrix A has quite a simple analytic expression for all phylogenies, whether resolved or not (demonstration in the Appendix B): aij ¼ Q

1 p2Pij dd p

;

for

iaj

and

aii ¼ Q

1 p2PiRoot dd p

.

For a species i, aii is therefore the inverse of the product of the number of branches descending from each node from the species to the root. For a couple (i, j), aij is the inverse of the product of the number of branches descending from each node in the path connecting i and j. Now that A has been analytically resolved, the test proposed by Abouheif can be computed without referring to the cyclic permutation. Furthermore, we presented above the diagonal terms aii as the residuals of a processus to render matrix A doubly stochastic, but they are more than that. They have a biological meaning: they measure originality sensu Pavoine et al. (2005). The originality of a species provides a singlespecies measure of cladistic distinctiveness. It measures how evolutionarily isolated a species is relative to other members (tips) of a phylogenetic tree. The more a species is in average distant from the other, the more it is original. May’s (1990) provided such an index of originality whose formula is exactly aii except the product is replaced by a sum: mii ¼ P

1

p2PiRoot dd p

.

The ﬁrst consequence of this relation between matrix A and the indices of phylogenetic originality is that matrix A contains elements developed for ecological studies, evolutionary studies and conservation biology, when the need for establishing bridges between these three disciplines has been pointed out as urgently necessary. The second consequence is that we can now deﬁne a second matrix (M) of phylogenetic proximities, where mij ¼ P

1 p2Pij dd p

;

for

iaj

and

mii ¼ P

Note that M is not doubly stochastic.

1 p2PiRoot dd p

.

ARTICLE IN PRESS S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

84

We provide then a formalization of Abouheif’s test both in mathematical and biological terms. Redefinition of Abouheif Test: The Abouheif’s test is a Moran’s test with a speciﬁed matrix of phylogenetic proximity A:

The non-diagonal values of A, aij, measure the proximity between two species i and j and are equal to the inverse of the product of the number of branches descending from each interior node in the path connecting i to j. The diagonal values of A, aii, measure the originality of species i as the inverse of the product of the number of branches descending from each interior node in the path connecting i to the root of the tree.

Semple and Steel (2004) introduced a calculation under the same realms as matrix A but for unrooted trees: for unrooted trees, the proportion of circular orderings for Q which j immediately follows i is lij ¼ ðdd p Þ1 for i6¼j, p2Qij

where Qij is the set of interior nodes in the path connecting i and j. Fixing lii ¼ 0, we obtain a matrix of phylogenetic proximity for unrooted tree. Denote K the matrix [lij]. By deﬁnition, K is doubly stochastic. Semple and Steel (2004) discovered an interesting property for K: let’s dij be the sum of the branch lengths in the path connecting i and j, then X PD ¼ lij d ij ij

is the sum of all the branch lengths on the tree, which is a measure of phylogenetic diversity (Faith, 1992), used in conservation biology. P Proposition. Consider d ij ¼ d ij for i6¼j and P d ii¼ jjPij ¼ fRootgajj d ij dd Root ðdd Root 1Þ, then PD ¼ ij aij d ij . (Proof in Appendix C). Roughly speaking, d ii concerns for species i ‘‘what happens on the other side of the root’’. Once again, this result contributes to ﬁlling in the gap between ecology, evolutionary biology and conservation biology. 2.6. Estimating the effect of the matrix of phylogenetic proximity in Moran’s test For each simulated tree, we calculated the matrix A. We chose to compare the use of matrix A in Moran’s test with four other matrices of proximities among pairs of species, which were proposed for trees with equal branch lengths. The ﬁrst one is the new matrix M. With the second one, named B, the proximity between the two species is the number of internal nodes, or taxonomic levels from the ﬁrst common node to the root. When branch lengths are available and summed rather than counting the numbers of nodes, B is the matrix of phylogenetic variance–covariance given a Brownian motion model of character changes

(e.g. Felsenstein, 1985). For the third matrix, denoted C, Cheverud and Dow (1985) ﬁrst deﬁned the 4-point metric distance dij between the two species by the number of internal nodes connecting the two species to a common ancestor. According to Cheverud and Dow (1985) and Cheverud et al. (1985), we should consider in the phylogenetic tree a level at which species are said unrelated, and if two species are unrelated, a zero is entered in matrix C. We considered that the species connected only at the root node are unrelated, but other choices could be made, for example Cheverud et al. (1985) truncated the tree at the family level in a taxonomy. The proximity is then deﬁned as 1/dij if dij6¼0, and 0 elsewhere. The particularity of this matrix is the zero values on the diagonal, while the proximity of a given species with itself should be maximum. Matrix C was used by Cheverud and Dow (1985) in the phylogenetic autoregressive method which they developed for distinguishing between the phylogenetic effect and the speciﬁc effect on variation in trait values: y ¼ rCy+e, where y is the normalized vector of observed data, r is called the ‘‘autoregressive coefﬁcient’’. For the fourth matrix called D, we apply the exponent a proposed by Gittleman and Kot (1990; see Martins and Hansen, 1996) to Cheverud and Dow (1985) proximity matrix. In the phylogenetic autoregressive method, the values of both a and r were obtained by maximum likelihood or more correctly by least squares (Rohlf, 2001). The objective of using a was to improve the performance of the phylogenetic autoregressive method developed by Cheverud and Dow (1985) and Cheverud et al. (1985) in distinguishing between the phylogenetic effect (rCy) and the speciﬁc effect (e) on variation in trait values. Owing to Martins and Hansen (1996) another objective was to best stretch and shrink the phylogeny. Here, we used the value of a which maximizes Mantel (1967) correlation among A and D. One of our objectives is to highlight the high congruence among A and D. We also studied the effect of a by varying its value from 0.1 to 3 using a step equal to 0.1, from 3 to 10 using a step of 1 and from 10 to 100 using a step of 10. Moran’s tests were performed on the 22 trees, containing four real trees, six symmetric trees, six comb-like trees and six yule-model trees. We performed ﬁrst Type I error tests. For a tree with n species, 1000 data sets of size n were independently and randomly drawn from a normal distribution N(0,1). We applied a Moran test for each simulated data set and measured the type I error as the percentage of signiﬁcant tests at the nominal 5% level. We performed then a series of power tests. For a tree with n species, and a ﬁxed value of the constraining force, 1000 data sets of size n were simulated from a OU process, as indicated in Section 2.1. The power was then measured as the percentage of signiﬁcant tests at the nominal 5% level. For each tree and each simulated trait, Moran’s tests are realized with the statistic I* including species relatedness weights, and with 1000 random permutations of the values (x1, x2,y, xn) around the species. All computations and graphical displays were carried out using R Core

ARTICLE IN PRESS S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

Development Team 2007, with both pre-programmed and personal routines. The data and routines for performing Moran’s test are available in the ‘ade4’ package at http:// lib.stat.cmu.edu/R/CRAN/ (Chessel et al., 2004). Instruction guidelines are available in the supplementary ﬁle. 3. Results We use the following notation for the rest of the text: MTA, MTM, MTB, MTC, MTD denotes Moran’s test when I is measured with matrices A, M, B, C, D, respectively. What emerges from these simulations (Fig. 2) is that except for the comb-like model, the use of matrix B in Moran’s I provides a less powerful test. With the Yule model, our simulations highlight a far lower power of I when it is used with B. With 256 species and a constraining force a ¼ 10 for the OU process, the power of MTB is estimated equal to 0.368, while the estimated power of the four other tests varies from 0.962 to 0.985. With most of the trees, the power of MTA and MTD are very close, and for most of the trees except the comb-like trees, the power of Moran’s test increases in the following order: MTBoMTMoMTCo (MTAEMTD). For the comb-like model, the power of Moran’s test increases in the inverse order: MTAo MTDoMTCoMTMoMTB, but the power of the ﬁve tests are in that case very close.(Fig. 3) The type I error of all tests are close to the nominal 5% level (Table 1). The ﬁve matrices differ in how they value a high proximity and a low proximity and the gap between them. The coefﬁcients of variation (CV) for non-diagonal values of these ﬁve matrices are in average (over the 22 trees): 3.56 for matrix A, 2.58 for D, 1.40 for C, 0.97 for B, and 0.93 for M. Matrix D and, above all, matrix A provides the most contrasting values (high CV). Their CV highly increases with the number of species (Fig. 4). This difference in CV appears in the graphical representations of the matrices (Fig. 5). Furthermore, one can observe in Fig. 5 that the values near the diagonal for A and D (which are the estimates of the proximities among close species) are not identical but very close to each others. Despite D provides more contrast between very close species and less related species, only matrix A also provides clear distinctions among the proximities of less related species (values far from the diagonal). Note also that one of the differences between A, M and B, C, D is that A and M never consider that species connected only at the root node of the tree are not related. Regarding the effect of an exponent on C, let MTCa be the Moran’s test used with Ca. For all trees studied, the power of MTCa is ﬁrst enhanced when a increases from 0.1 to reach a maximum for a value of a comprised between 1 and 2 in all our examples (Fig. 3). Then the power regularly decreases. In all cases, D is at or very close to the maximum power of MTCa over a. We highlighted that the diagonal terms of matrix A are measures of originality (single-species measure of cladistic

85

distinctiveness) sensu Pavoine et al. (2005). For all our 22 trees (Fig. 6), the rank correlation between the diagonal values aii of matrix A and May (1990)’s index (diagonal values of matrix M), one of the main indices of species originality, is equal to 1. The difference between the formulas of the aii and the mii is the use of a product instead of a sum. The consequence is that the aii decreases more quickly with the number of nodes between i and the root. The advantage of this steeper decrease is that the most original species are more emphasized (Fig. 6). By deﬁnition, aii is equal to the frequency of representations which put species i at one extremity of the sequence of the tips, i.e. after all the other species. another Pn Consequently, P n advantage of the aii is that a ¼ 1 while ii i¼1 i¼1 mii depends on the tree shape and size. 4. Discussion 4.1. Rediscovering the link between Moran’s I and Geary’s c Because A is a doubly stochastic matrix, when using A as a proximity matrix, Moran’s I is equal to Abouheif’s Cmean and to one minus Geary’s cn, thus the three statistical methods are brought back together in the same theoretical framework. In that case, the two statistics I and cn provide complementary information. The statistic I measures the local autocorrelation, which is the degree to which related species are close from each other in a given trait, and the statistic cn measures the local variability, that is to say the degree to which related species differ from each other. 4.2. Improving Abouheif’s test and giving it new purposes We improved Abouheif’s test by providing its exact analytical value, while it was previously calculated by an approximate algorithm. We centered the test on a new matrix of phylogenetic proximities (A). This mathematical formalization leads to a clear biological deﬁnition of the Abouheif test: Abouheif test measures the proximity between two taxa i and j as the inverse of the number of branches descending from each interior node in the path connecting i to j the root of the tree. The proximity depends thus on the interior nodes in the path connecting i to j, and was previously approximated by a technical approach referring to the percentage of time i and j were found next to each other in the set of all cyclic permutations of the tree. The reference to cyclic permutations was thus unnecessary and can be thought as technical, even makeship job and disturbing because a cyclic permutation does not change topology. Despite that, cyclic permutations were here at the foundation of the three matrices A, M and K. Research on cyclic permutations has already proved useful for studies on tree metric and tree reconstitutions (Semple and Steel, 2004). A tree is mechanically embedded in a plane which implicitly demands to choose a way of arranging tips (a circular ordering of the tips). The existence of a ﬁnite number of

ARTICLE IN PRESS 86

S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

Fig. 2. Power tests for (A) the symmetric model, (B) the comb-like model, (C) the Yule model and (D) the observed, real trees. Simulations were done separately for each sample size from n ¼ 8 to 256 species. The legends for the line symbols, drawing and color (black or gray) for the ﬁve matrices of phylogenetic proximity (A, M, B, C and D) are given in the box on the bottom left-hand corner. The power is given in the ordinate axis as a function of alpha, the constraining force of the OU process. Note that the powers obtained with A and D are very close so that the curves for A and D are often superimposed.

distinct circular orderings for a tree (Semple and Steel, 2004) constitutes one of the properties of the tree object which merits our attention. Abouheif’s intuition can

reinforce an underestimated interest of the research on cyclic permutations for evolutionary and biological conservation studies.

ARTICLE IN PRESS S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

87

Fig. 3. Effect of the exponent g in Cg proposed by Gittleman and Kot (1990) on the power of Moran’s test used with Cg for (A) the symmetric model, (B) the comb-like model and (C) the Yule model. The black square indicates the position of D ¼ Cb where corðCb ; AÞ ¼ maxg ½corðCg ; AÞ. The open circle indicates the position of C. The graphs are given for different values of the constraining force from a ¼ 0 to 10. The precise value of a for a given graph is given on the bottom right-hand corner of the graph.

Table 1 Average type I error for Moran’s test at the nominal 5% level (with standard deviation in brackets)

A M B C D

Symmetric

Comb-like

Yule

4.68 4.32 4.83 4.78 4.83

4.53 5.35 4.25 4.93 5.15

4.65 5.15 4.57 4.77 4.57

(0.44) (0.59) (0.32) (0.41) (0.63)

(0.22) (1.21) (0.96) (0.88) (0.72)

Real-trees (0.61) (0.43) (0.68) (0.76) (0.55)

4.44 4.88 5.05 4.68 4.75

(0.59) (0.35) (0.99) (0.81) (0.83)

All values are given as percentages. The matrix used with Moran’s test is given in the ﬁrst column. For the symmetric, the comb-like and the Yule models, values are averaged over the six sample sizes. For the real trees, values are averaged over the four phylogenies considered.

4.3. Comparison between A, M and currently used proximity matrices The following kinds of methods have been recommended when suspecting phylogenetic dependence. A ﬁrst group of methods aims to test for a phylogenetic signal in each of the phenotypic traits under study, whatever the shape of this signal for example by measuring a correlation among sister-species (Gittleman and Kot, 1990). The second group of methods aims to describe the link between a phylogenetic tree and the states of one or several traits, either by searching at which level(s) in a phylogenetic tree, one can detect phylogenetic signal with methods such as

ARTICLE IN PRESS 88

S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

Fig. 4. Coefﬁcients of variation (CV) for the ﬁve matrices A, M, B, C and D for (A) the symmetric model, (B) the comb-like model and (C) the Yule model. The legends for the line symbols, drawing and color (black or gray) for the ﬁve matrices of phylogenetic proximity (A, M, B, C and D) are given in the boxes on the right side of the ﬁgure. Each graph provides CV as a function of an index of the number of species on a logarithmic scale.

Fig. 5. Differences among the ﬁve matrices of phylogenetic proximities for (A) the symmetric model, (B) the comb-like model and (C) the Yule model. The name of the matrices is given in boxes on the right-hand corner of the ﬁgures. Each value in the matrix is represented by a square. The larger the square the higher is the value. Species are ordered by rows and columns according to the phylogenetic tree which is given.

nested ANOVA (Crook, 1965; Clutton-Brock and Harvey, 1977, 1979, 1984), correlogram (Gittleman and Kot, 1990; Rohlf, 2001, 2006), and orthonormal transform (Giannini, 2003; Ollier et al., 2006) or by modeling the evolution of the trait supposing that the real causes involving phylogenetic inertia and adaptation are precisely known (Martins

and Hansen, 1996; Blomberg et al., 2003; Bonsall and Mangel, 2004; Bonsall, 2006). A critical step in all these approaches is the proper speciﬁcation of a phylogenetic proximity matrix. Indeed, many possibilities exist depending on whether branch lengths are known and on the model of macroevolution used, for examples pure neutral model

ARTICLE IN PRESS S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

89

Fig. 6. The diagonal values of matrix A are measures of phylogenetic originality: (A) Representation by Cleveland (1994) dot plots of the links between the diagonal values of A and the diagonal values of M (May’s (1990) index) for the comb-like model, the Yule model, and the four real trees considered in the text. The congruence among the diagonal values of matrix A and May’s index is high but the differences in the originality values among species are higher for the diagonal values of matrix A than for May’s index, which leads in (B) to a parabolic shape for the relationship among the diagonal values of matrix A and May’s index. Note that the diagonal values of A and the values obtained from May’s (1990) index may be small but never null.

(Brownian motion), stabilizing selection (Ornstein–Uhlenbeck), directional selection, accelerating and decelerating Brownian evolution (ACDC) (Blomberg et al., 2003) and more complex models (Mooers et al., 1999) and it is difﬁcult to choose among them (Hansen and Martins, 1996). In this context, matrix A constitutes a useful alternative. In this paper, we give the exact analytic expression of this matrix. We prove that it is a symmetric and doubly stochastic proximity matrix and because it is doubly stochastic, it uniﬁes Moran’s and Geary’s statistics. We showed through the power tests, that there may be a strong effect of the choice of the proximity matrix in Moran’s non-parametric test. Two main criticisms were raised to Abouheif’s test: it does not rely on branch length and it is used due to technical reasons rather than because it relies on a precise model of character changes. By formalizing Abouheif’s test we redeﬁned it and clariﬁed its biological foundations. Regarding the absence of branch length, if branch lengths are available, one should use them and have higher power, to the extent that they are accurate. Even if focusing on nodes means using a rather unlikely model in which branch lengths are assumed to be equal, the tree topology is one out of the two key components of this history. Unlike B, the matrices A and M we introduced and the matrices C and D suggested by Cheverud and Dow (1985) and Gittleman and Kot (1990) are specially designed for topologies. The advantage of matrices A and D over B, C

and M is that they provide more contrasted species proximity estimates, which enhances the power of the tests. Our results showed that most of the time matrix A ﬁts data sets better than existing matrices of phylogenetic proximities and when it does not ﬁt better, it ﬁts almost as well as other matrices. Matrix A and M also have the advantage of correcting proximities for unresolved trees. Indeed, for these two matrices the product or sum of the number of branches descending from nodes are counted instead of the simple number of nodes (see, May, 1990). Gittleman and Kot (1990) proposed to add an exponent to matrix C to best-ﬁt data. This led us to introduce matrix D with an exponent which maximizes its correlation with matrix A. We studied the effect of the exponent on C. Let MTCa be the Moran’s test used with Ca. For all trees studied, D was at or very close to the maximum power of MTCa over a. This result reinforces the interest of A as a powerful matrix. 4.4. Matrix A, M and conservation biology We emphasized the statistical advantages gained by using matrix A as a matrix of phylogenetic proximities among species. Actually the advantages gained by using matrix A are both statistical and biological, because matrix A has a biological meaning which has not been explored with previous matrices of phylogenetic proximity. The diagonal values of matrix A are important for two reasons: ﬁrst because they give the doubly stochastic property to the

ARTICLE IN PRESS S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

90

whole matrix A, unifying then Moran’s, Geary’s and Abouheif’s tests, and second because they are measures of cladistic originality, that is to say aii is a measure of how evolutionarily isolated species i is relative to other members of the phylogenetic tree under study. As we highlighted, the diagonal values of matrix A have the same capacity as May’s index (diag(M)) to measure cladistic originality. They are even more segregating, giving a larger range of values from the less to the most original species. Consequently, the diagonal values of matrix A can be used as a powerful alternative to current indices of cladistic originality when designing conservation priorities. Moran’s test only uses the non-diagonal values of A and M. The challenge now is to evaluate the use of these complete two matrices in more complex analyses, such as multivariate analysis (cf. Thioulouse et al. (1995) for ordination under spatial autocorrelation) and comparative analyses. In conclusion, the appealing qualities of matrix A are that it uniﬁes Abouheif’s, Moran’s and Geary’s tests; it increases the power of Moran’s test; its diagonal values measures cladistic originality and it contributes to ﬁlling in the gap between evolutionary biology and conservation biology. The interest of matrix A for a wide variety of taxa and traits, and for a larger range of evolutionary and ecological issues has still to be proved but our ﬁrst results are encouraging. Acknowledgments The authors would like to thank Michael Bonsall (University of Oxford, UK), Margaret Evans (University of Yale, USA), Se´bastien Devillard, Cle´ment Calenge, Ste´phane Dray and Thibaut Jombart (University of Lyon, France), Patrick James Degeorges (French ministery of ecology and sustainable development) for their helpful comments on a ﬁrst draft of this paper. Appendix A. Abouheif’s test is a Moran’s test We demonstrated in the text that zt Az C mean ¼ Pn Pn i¼1

j¼1 aij

.

Let us decompose matrix A into two additive components: KA contains the diagonal values of A, and MA the non-diagonal values (with 0 on the diagonal); hence A ¼ KA+MA. Moran’s statistics was deﬁned for matrices with zeros down the principal diagonal. Consider S A ¼ Pn P Pn1 P n n a and S ¼ 2 ij M i¼1 j¼1 i¼1 j4i aij , then A C mean ¼

zt K A z S M A zt MA z þ . SA SA SM A

During the randomization test, when the observed values (x1, x2,y, xn) are randomly permuted around the species and S M A S A is a constant. Consequently,

Cmean depends on two components: one linked to the originalities of the species and one linked to the proximities of couples of species. This demonstration is still true for matrix M. Appendix B. Analytical expression of the proximity matrix A We propose to give explicitly the analytical expression of matrix A. These assertions will be illustrated by way of a simple theoretical example (Fig. 1). We will consider the following terminology concerning the phylogenetic tree. The root is the common ancestor to the sets L and N of all the l contemporary taxons (OTUs: operational taxonomic units) and n interior nodes, respectively (HTUs: hypothetical taxonomic units). Branches emanate from the root and nodes, tracing the course of evolution. The deﬁnition of A is independent of branch length and depends only on the topology of the tree. The minimum spanning path between two taxonomic units (nodes or tips) deﬁnes an ordered set of nodes. For example in Fig. 1 A, the set Pab ¼ {A} is associated with the path (a, A, b) that spans species a and b. Similarly, the set PaRoot ¼ {A, B} deﬁnes the hypothetical ancestors associated to the path (a, A, B, Root) between species a and the root of the tree. We denote DDi the set of direct descendants for node i, including hypothetical descendants at interior nodes and species at tips. For example, in Fig. 1A, DDA ¼ {a, b} and DDB ¼ {A, c, d}. Let ddi ¼ card(DDi) be the number of direct descendants for node i (ddA ¼ 2 and ddB ¼ 3). The total number of consistent representations of the tree topology is deﬁned by the Q product of rotations associated with each node: T¼ dd i !, where ddi denotes the number of direct i2N

descendants of each node iAN. Among the set of equivalent representations of the phylogeny (Fig. 1B), there is at least one representation that puts tip j just behind tip i. Consider such a representation and observe what happens if a rotation occurs at one node pAN. If pePij, all the ddp! rotations associated with that node do not disturb the respective position of i and j. On the other hand, if pAPij, only (ddp1)! out of the ddp! possible rotations do not disturb the respective positions of both tips. As the set of node rotations that put tip j just behind tip i is equal to the set of equivalent representations that put tip i just behind j, we deduce the analytical expression aij of the matrix A (Fig. 1C): Q Q I ij 1 p2Pij ðdd p 1Þ! p2NPij dd p ! Q ¼Q . aij ¼ ¼ T dd ! p p2N p2Pij dd p From this expression, we can deduce the diagonal values: aii ¼ 1

n X j¼1;jai

aij .

ARTICLE IN PRESS S. Pavoine et al. / Theoretical Population Biology 73 (2008) 79–91

Therefore, the Cmean statistic is equal to Moran’s statistic applied to the so-deﬁned matrix A (Fig. 1D). Appendix C. Proof for PD ¼

P

ij aij d ij

Because of the presence of a root in the tree, if Pij afRootg, aij ¼ lij, and if Pij ¼ fRootg aij ¼ lij/ddRoot and aij ¼ aiiajjddRoot. X PD ¼ lij d ij ij

¼

X

ijjPij afRootg

¼

X X X ij

X

aij ðdd Root 1Þd ij

ijjPij ¼fRootg

X

aij d ij þ

ij

¼

aij dd Root d ij

ijjPij ¼fRootg

aij d ij þ

ij

¼

X

aij d ij þ

aii ajj dd Root ðdd Root 1Þd ij

ijjPij ¼fRootg

aij d ij þ

X i

aii

X

ajj d jj dd Root ðdd Root 1Þ.

jjPij ¼fRootg

for i6¼j and Consider d ij ¼ d ij P d ii ¼ fRootgajj d ij dd Root ðdd Root 1Þ then PD ¼ ij aij d ij For dichotomous trees, X X X PD ¼ aij d ij þ aii 2aij d ij ij

i

P

jjPij ¼

jjPij ¼fRootg

Appendix D. Supplementary materials Supplementary data associated with this article can be found in the online version at doi:10.1016/j.tpb. 2007.10.001.

References Abouheif, E., 1999. A method for testing the assumption of phylogenetic independence in comparative data. Evol. Ecol. Res. 1, 895–909. Ackerly, D.D., Donoghue, M.J., 1998. Leaf size, sappling allometry, and Corner’s rules: phylogeny and correlated evolution in Maples (Acer). Am. Nat. 152, 767–791. Bauwens, D., Dı´ az-Uriarte, R., 1997. Covariation of life-history traits in lacertid lizards: a comparative study. Am. Nat. 149, 91–111. Blomberg, S.P., Garland, T., Ives, A.R., 2003. Testing for phylogenetic signal in comparative data: behavioral traits are more liable. Evolution 57, 717–745. Bonsall, M.B., 2006. The evolution of anisogamy: the adaptative signiﬁcance of damage, repair and mortality. J. Theor. Biol. 238, 198–210. Bonsall, M.B., Mangel, M., 2004. Life-history trade-offs and ecological dynamics in the evolution of longevity. Proc. Roy. Soc. Lond. Ser. B 271, 1143–1150. Bried, J., Pontier, D., Jouventin, P., 2003. Mate ﬁdelity in monogamous birds: a re-examination of the Procellariiformes. Anim. Behav. 65, 235–246. Chessel, D., Dufour, A.-B., Thioulouse., J., 2004. The ade4 package -I- One-table methods. R News 4, 5–10. Cheverud, J.M., Dow, M.M., 1985. An autocorrelation analysis of genetic variation due to lineal ﬁssion in social groups of rhesus macaques. Am. J. Phys. Anthrop. 67, 113–121.

91

Cheverud, J.M., Dow, M.M., Leutenegger, W., 1985. The quantitative assessment of phylogenetic constraints in comparative analyses: sexual dimorphism in body weight among primates. Evolution 39, 1335–1351. Cleveland, W.S., 1994. The Elements of Graphing Data. AT&T Bell Laboratories, Murray Hill, New Jersey. Cliff, A.D., Ord, J.K., 1981. Spatial Processes. Model & Applications. Pion, London, UK. Clutton-Brock, T.H., Harvey, P.H., 1977. Primate ecology and social organization. J. Zool. Lond. 183, 1–39. Clutton-Brock, T.H., Harvey, P.H., 1979. Comparison and adaptation. Proc. Natl. Acad. Sci. USA 205, 547–565. Clutton-Brock, T.H., Harvey, P.H., 1984. Comparative approaches to investigating adaptation. In: Krebs, J.R., Davies, N.B. (Eds.), Behavioral Ecology: An Evolutionary Approach. Blackwell Press, Oxford, England, pp. 7–29. Crook, J.H., 1965. The adaptive signiﬁcance of avian social organization. Symp. Zool. Soc. Lond. 14, 181–218. Diniz-Filho, J.A.F., 2001. Phylogenetic autocorrelation under distinct evolutionary processes. Evolution 55, 1104–1109. Faith, D.P., 1992. Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61, 1–10. Felsenstein, J., 1985. Phylogenies and the comparative method. Am. Nat. 125, 1–15. Geary, R.C., 1954. The contiguity ratio and statistical mapping. Inc. Stat. 5, 115–145. Giannini, N.P., 2003. Canonical phylogenetic ordination. Syst. Biol. 52, 684–695. Gittleman, J.L., Kot, M., 1990. Adaptation: statistics and a null model for estimating phylogenetic effects. Syst. Zool. 39, 227–241. Hansen, T.F., Martins, E.P., 1996. Translating between microevolutionary process and macroevolutionary patterns: the correlation structure of interspeciﬁc data. Evolution 50, 1404–1417. Jumar, P.A., Thistle, D., Jones, M.L., 1977. Detecting two-dimensional spatial structure in biological data. Oecologia 28, 109–123. Mantel, N., 1967. The detection of disease clustering and a generalized regression approach. Cancer Res. 27, 209–220. Martins, E.P., Hansen, T.F., 1996. The statistical analysis of interspeciﬁc data: a review and evaluation of phylogenetic comparative methods. In: Martins, E.P. (Ed.), Phylogenies and the Comparative Method in Animal Behavior. Oxford University Press, Oxford, pp. 22–27. Martins, E.P., Hansen, T.F., 1997. Phylogenies and the comparative method: a general approach to incorporating phylogenetic information into the analysis of interspeciﬁc data. Am. Nat. 149, 646–667. May, R.M., 1990. Taxonomy as destiny. Nature 347, 129–130. Mooers, A.O., Vamosi, S.M., Schluter, D., 1999. Using phylogenies to test macroevolutionary hypotheses of trait evolution in cranes (Gruinae). Am. Nat. 154, 249–259. Moran, P.A.P., 1948. The interpretation of statistical maps. J. Roy. Stat. Soc. B 10, 243–251. Ollier, S., Couteron, P., Chessel, D., 2006. Orthonormal transform to decompose the variance of a life-history trait across a phylogenetic tree. Biometrics 62, 417–477. Pavoine, S., Ollier, S., Dufour, A.B., 2005. Is the originality of a species measurable? Ecol. Lett. 8, 579–586. Rohlf, F.J., 2001. Comparative methods for the analysis of continuous variables: geometric interpretations. Evolution 55, 2143–2160. Semple, C., Steel, M., 2004. Cyclic permutations and evolutionary trees. Advances in Applied Mathematics 32, 669–680. Sibley, C.G., Ahlquist, J.E., 1990. Phylogeny and Classiﬁcation of Birds: A Study in Molecular Evolution. Yale University Press, New Haven. Thioulouse, J., Chessel, D., Champely, S., 1995. Multivariate analysis of spatial patterns: a uniﬁed approach to local and global structures. Environ. Ecol. Stat. 2, 1–14. Upton, G., Fingleton, B., 1985. Spatial Data Analysis by Example. Wiley, Chichester. Von Neumann, J., Kent, R.H., Bellinson, H.R., Hart, B.I., 1941. The mean square successive difference. Ann. Math. Stat. 12, 153–162.

Our partners will collect data and use cookies for ad personalization and measurement. Learn how we and our ad partner Google, collect and use data. Agree & Close