3. The reconstruction of phylogeny
Darwin's first principle states that every phylogenetic tree has one common ancestor.
Phylogenetic analysis is the study of taxonomic relationships among lineages.
Willi Hennig (1913-1976)
Phylogenetic systematics
Cladistics (Greek κλάδος: branch)
Numerical taxonomy
Robert Sokal (1927-)
http://www.eol.org/
http://tolweb.org/tree/phylogeny.html
http://www.faunaeur.org/
[Figure: cladogram rooted in an ancestor; the characters a to f are marked on the branches where they arise]
The cladistic methodology
A B C D
Apomorphies are common derived characters.
Autapomorphies are characters that are restricted to single lineages.
Plesiomorphies are ancestral characters.
adf ade abc abd
b: Synapomorphy of lineage C+D
d: Plesiomorphy of lineage A. It is a symplesiomorphy.
a: Apomorphy of the whole tree. It is the ancestral state.
e: Autapomorphy of lineage D
The collective set of plesiomorphies defines the ground plan of a phylogenetic tree.
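[Figure: cladogram rooted in an ancestor; the characters a, b, d, e and f are marked on the branches, with d arising twice]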
A B C
adf ade abd
C is the sister taxon of A and B.
Character a in lineages A, B, and C is homologous because it is a synapomorphy.
Character d in lineages A, B, and C is not homologous because it was derived twice. It is homoplasious.
[Figure: cladogram with lineages A to E and characters b, d, e, f, illustrating a monophyletic taxon, a paraphyletic taxon and a polyphyletic taxon]
The ultimate aim of taxonomy is to group higher taxa into monophyletic subtaxa.
For this task we have to infer autapomorphies.
Autapomorphy defines monophyly.
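Whether a group is monophyletic can be checked mechanically once a tree is given: the group must consist of exactly the tips below one node. The following is a minimal sketch; the nested-tuple tree and its topology are assumptions made for illustration and are not the tree drawn on the slide.

```python
# Hypothetical rooted tree written as nested tuples (assumed topology).
TREE = (("A", "B"), ("C", ("D", "E")))

def leaves(node):
    """Return the set of tip names below a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))

def is_monophyletic(tree, taxa):
    """True if some node of the tree has exactly `taxa` as its tips."""
    taxa = set(taxa)
    if leaves(tree) == taxa:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, taxa) for child in tree)

print(is_monophyletic(TREE, {"D", "E"}))       # a clade of the assumed tree
print(is_monophyletic(TREE, {"C", "D"}))       # leaves out E
print(is_monophyletic(TREE, {"A", "C"}))       # tips from two separate branches
```

The check only tests monophyly; it does not distinguish paraphyletic from polyphyletic groups.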
[Figure: cladogram of Actinopterygia, Dipnoi, Anura, Urodela, Mammalia, Squamata, Therosauria and Aves, with the groups Tetrapoda, Amniota, Reptilia (paraphyletic) and Archosauria marked]
Common ancestor
Lungs: plesiomorph
Tetrapod limbs: apomorph
Amnion: apomorph
Mammae: autapomorph
Feathers: apomorph
Loss of tail: apomorph
The evolutionary change within a lineage is called anagenesis
The diversification of an evolutionary tree is called cladogenesis
Linnean systematics and cladistics
Linnean approach
Hierarchical encaptic system
Phenomenological method based on similarity
It uses grades (groups of similar body plan)
Different taxonomies are possible
There is no clear decision instrument for taxonomies
The number of higher taxa is rather small (Pisces, Amphibia, Reptilia, Aves, Mammalia)
It does not assume common evolutionary history
It does not reconstruct evolution
Taxonomy is independent of evolution
Hennigean approach
Hierarchical encaptic system
Analytical method based on lineage branching
It uses clades (groups of identical root)
Only one taxonomic solution is allowed
Autapomorphies decide about taxonomic position
The number of higher taxa is large (Pisces, Amphibia, Reptilia are not valid taxa)
It is based on common evolutionary history
It does reconstruct evolution
Taxonomy is a part of evolutionary theory
Low resolution trees
High resolution trees
Phylogenetic tree of winged insect orders
[Figure: tree of the winged insect orders on a time axis running Devonian, Carboniferous, Permian, Triassic, Jurassic, Cretaceous, Paleogene to recent. Orders shown: Palaeodictyoptera, Odonata, Ephemeroptera, Dictyoptera, Plecoptera, Zoraptera, Embioptera, Isoptera, Grylloblatodea, Dermaptera, Phasmida, Orthoptera, Mallophaga, Psocoptera, Thysanoptera, Heteroptera, Hymenoptera, Neuroptera, Coleoptera, Mecoptera, Siphonaptera, Diptera, Lepidoptera, Trichoptera. The figure marks a Devonian origin, two radiations and a part of low resolution.]
In the Triassic period all extant taxa already existed
The tree lacks 9 orders that went extinct by the end of the Permian
Rhyniognatha hirsti
The principle of maximum parsimony (Occam’s razor) holds that we should accept the phylogenetic tree that can be constructed with the least number of morphological changes.
The construction of phylogenetic trees from numerical methods
The raw data
Species   Characters
          1  2  3  4  5  6
A         1  1  0  1  1  1
B         1  1  1  1  1  1
C         0  1  0  0  1  0
D         0  0  1  1  0  1
E         1  0  1  1  0  1

Distance matrix
Species   A  B  C  D  E
A         0  1  3  4  3
B         1  0  4  3  2
C         3  4  0  5  6
D         4  3  5  0  1
E         3  2  6  1  0
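The distance between two species is the number of characters in which they differ. A minimal sketch of that count, using the raw data from the slide (the function and variable names are chosen here for illustration):

```python
# Raw data from the slide: five species, six binary characters.
RAW = {
    "A": "110111",
    "B": "111111",
    "C": "010010",
    "D": "001101",
    "E": "101101",
}

def differences(x, y):
    """Number of positions in which two character strings differ."""
    return sum(a != b for a, b in zip(x, y))

def distance_matrix(data):
    """Pairwise difference counts as a nested dictionary."""
    names = sorted(data)
    return {p: {q: differences(data[p], data[q]) for q in names} for p in names}

DIST = distance_matrix(RAW)
for name, row in DIST.items():
    print(name, *row.values())
```

Counted this way the matrix is symmetric, and the distance between B and C comes out as 4.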
We are looking for the tree that minimizes the sum of distances.
[Figure: two alternative trees for the species A to E with the character states written at the nodes; the first tree requires 8 changes, the second 7 changes]
Outgroup
How to define the root?
Parsimony analysis
To find the most parsimonious tree we have to cross all combinations of lineages (trees) with all character combinations at the root.
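For one fixed tree, the minimum number of changes can be counted character by character. The sketch below uses Fitch's algorithm, a standard method that the slides do not name; the two topologies are assumptions chosen for illustration and need not be the trees drawn on the slide.

```python
# Raw data from the slide: five species, six binary characters.
RAW = {
    "A": "110111",
    "B": "111111",
    "C": "010010",
    "D": "001101",
    "E": "101101",
}

def fitch_changes(tree, states):
    """Return (possible states at this node, changes below it) for one character."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    lset, lcount = fitch_changes(left, states)
    rset, rcount = fitch_changes(right, states)
    common = lset & rset
    if common:                      # children agree: no extra change needed
        return common, lcount + rcount
    return lset | rset, lcount + rcount + 1

def tree_length(tree, data):
    """Total number of changes over all characters on a rooted binary tree."""
    n_chars = len(next(iter(data.values())))
    total = 0
    for i in range(n_chars):
        states = {species: seq[i] for species, seq in data.items()}
        total += fitch_changes(tree, states)[1]
    return total

# Two assumed topologies written as nested tuples.
TREE_1 = (("A", "B"), ("C", ("D", "E")))
TREE_2 = ("C", ("A", ("B", ("D", "E"))))

print(tree_length(TREE_1, RAW), tree_length(TREE_2, RAW))
```

A full parsimony analysis would repeat this count for every possible tree and keep the shortest one.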
The number of possible trees
Species   Number of trees
2         1
3         3
4         15
5         105
6         945
7         10395
8         135135
9         2027025
10        34459425

For S species the number N of possible trees is
$N = \frac{(2S-2)!}{2^{S-1}\,(S-1)!}$
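The formula for the number of possible trees, N = (2S-2)! / (2^(S-1) (S-1)!), can be evaluated directly. A minimal sketch (the function name is chosen here for illustration):

```python
from math import factorial

def number_of_trees(s):
    """Number of possible trees for s species: (2s-2)! / (2^(s-1) * (s-1)!)."""
    return factorial(2 * s - 2) // (2 ** (s - 1) * factorial(s - 1))

for s in range(2, 11):
    print(s, number_of_trees(s))
```

The count grows so fast that an exhaustive search over all trees is only feasible for small numbers of species.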
Neighbour joining
[Figure: three steps of neighbour joining for the elements A to F around a root; first a new node X is introduced, then a second node Y]
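Neighbour joining is particularly used to generate phylogenetic trees.
Dissimilarities
You need dissimilarities (phylogenetic distances) δ(X,Y) between all n elements X and Y.
Calculate
$\Delta(X) = \sum_{i=1}^{n} \delta(X, Y_i)$
Calculate
$Q(X,Y) = (n-2)\,\delta(X,Y) - \Delta(X) - \Delta(Y)$
Select the pair (A,B) with the lowest value of Q and join it in a new node $U_{AB}$.
Calculate the distances from the new node
$\delta(A,U_{AB}) = \frac{(n-2)\,\delta(A,B) + \Delta(A) - \Delta(B)}{2(n-2)}$
$\delta(B,U_{AB}) = \frac{(n-2)\,\delta(A,B) - \Delta(A) + \Delta(B)}{2(n-2)}$
Calculate new dissimilarities
$\delta(X,U_{AB}) = \frac{\delta(X,A) + \delta(X,B) - \delta(A,B)}{2}$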
Distance matrix
              Mouse  Raven  Octopus  Lumbricus
Mouse         0      0.2    0.6      0.7
Raven         0.2    0      0.6      0.8
Octopus       0.6    0.6    0        0.5
Lumbricus     0.7    0.8    0.5      0
Delta values  1.5    1.6    1.7      2

Q-values
Mouse/Raven        -2.7
Mouse/Octopus      -2
Mouse/Lumbricus    -2.1
Raven/Octopus      -2.1
Raven/Lumbricus    -2
Octopus/Lumbricus  -2.7
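Distance matrix
              Mouse  Raven  Protostomia
Mouse         0      0.2    0.4
Raven         0.2    0      0.45
Protostomia   0.4    0.45   0
Delta values  0.6    0.65   0.85

Q-values
Mouse/Raven        -1.05
Mouse/Protostomia  -1.05
Raven/Protostomia  -1.05
With three elements (n = 3) all Q-values are equal, so any pair can be joined. Here Mouse and Raven are joined as Vertebrata.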
Distance matrix
             Vertebrata  Protostomia
Vertebrata   0           0.325
Protostomia  0.325       0
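The steps of neighbour joining can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names and the dictionary layout are chosen here, and the branch-length function uses the standard neighbour joining formula.

```python
NAMES = ["Mouse", "Raven", "Octopus", "Lumbricus"]
D = {
    ("Mouse", "Raven"): 0.2, ("Mouse", "Octopus"): 0.6, ("Mouse", "Lumbricus"): 0.7,
    ("Raven", "Octopus"): 0.6, ("Raven", "Lumbricus"): 0.8, ("Octopus", "Lumbricus"): 0.5,
}

def dist(d, x, y):
    """Dissimilarity between x and y, independent of the key order."""
    if x == y:
        return 0.0
    return d[(x, y)] if (x, y) in d else d[(y, x)]

def delta(d, names, x):
    """Delta value: sum of the dissimilarities of x to all elements."""
    return sum(dist(d, x, y) for y in names)

def q_values(d, names):
    """Q(X,Y) = (n-2) d(X,Y) - Delta(X) - Delta(Y) for every pair."""
    n = len(names)
    return {(x, y): (n - 2) * dist(d, x, y) - delta(d, names, x) - delta(d, names, y)
            for i, x in enumerate(names) for y in names[i + 1:]}

def branch_lengths(d, names, a, b):
    """Distances of a and b from the new node that joins them (needs n > 2)."""
    n = len(names)
    to_a = dist(d, a, b) / 2 + (delta(d, names, a) - delta(d, names, b)) / (2 * (n - 2))
    return to_a, dist(d, a, b) - to_a

def join(d, names, a, b, new_name):
    """Replace a and b by a new node; return the reduced dissimilarities and names."""
    rest = [x for x in names if x not in (a, b)]
    new_d = {(x, y): dist(d, x, y) for i, x in enumerate(rest) for y in rest[i + 1:]}
    for x in rest:
        new_d[(x, new_name)] = (dist(d, x, a) + dist(d, x, b) - dist(d, a, b)) / 2
    return new_d, rest + [new_name]

Q1 = q_values(D, NAMES)
# Mouse/Raven and Octopus/Lumbricus tie for the lowest Q; the second pair is joined first.
D2, NAMES2 = join(D, NAMES, "Octopus", "Lumbricus", "Protostomia")
Q2 = q_values(D2, NAMES2)
print(Q1)
print(D2)
print(Q2)
```

With only three elements left, every pair has the same Q-value, so the last join does not depend on the choice of pair.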
[Figure: the trees after each step: first the star tree of Mouse, Lumbricus, Raven and Octopus; then Mouse, Raven and Protostomia; finally Vertebrata and Protostomia]
Assumptions of the numerical methods
Characters (or transitions) have to be independent.
Impossible character states have to be excluded.
[Figure: the character states scales (fish), hairs (mammals) and feathers (birds); the states loss of feathers and loss of hairs are marked as incompatible]
Characters are assumed to have equal importance. In reality transitions are not comparable.
To overcome this problem you give character weights. Technically you multiply the occurrence of a character in a distance matrix. Here C enters twice, as C1 and C2.
      A  B  C1  C2  D
A     1  0  1   1   6
B     1  2  4   4   2
C1    4  5  2   2   1
C2    4  5  2   2   1
D     1  0  3   3   2
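Entering a character twice has the same effect as giving it the weight 2 when differences are counted. The following sketch illustrates this equivalence; the character strings and the weights are hypothetical examples, not values taken from the table.

```python
def weighted_distance(x, y, weights):
    """Sum of the weights of all characters in which x and y differ."""
    return sum(w for a, b, w in zip(x, y, weights) if a != b)

def duplicate_character(seq, index):
    """Return the character string with the character at `index` entered twice."""
    return seq[:index + 1] + seq[index] + seq[index + 1:]

# Hypothetical example: two species, six binary characters, character 3 has weight 2.
species_1 = "110111"
species_2 = "111111"
weights = [1, 1, 2, 1, 1, 1]

by_weight = weighted_distance(species_1, species_2, weights)
by_duplication = weighted_distance(
    duplicate_character(species_1, 2),
    duplicate_character(species_2, 2),
    [1] * 7,
)
print(by_weight, by_duplication)
```

Weights have the advantage that they need not be whole numbers.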
http://evolution.genetics.washington.edu/phylip/software.html
Trees from molecular data
Species   Sequence
A         A A T T A A C C C A A T A
B         C A T T A A C C C A A T A
C         C G T T T G G A A T G A C
D         C G T G T G G A A T A A A
E         G G T G T G C C C A A T A

Distance matrix
Species   A   B   C   D   E
A         0   1   11  10  5
B         1   0   10  9   5
C         11  10  0   3   9
D         10  9   3   0   6
E         5   5   9   6   0
Linus Pauling (1901-1994)
Motoo Kimura (1924-1994)
Emile Zuckerkandl (1922-)
Evolutionary time scales
The molecular clock
Numbers of amino acid substitutions, and therefore the respective numbers of nucleotide substitutions, are for many proteins and genomes approximately proportional to time.
Hence, numbers of substitutions are a measure of the time of divergence from the latest common ancestor.
Substitutions alone provide a relative time scale.
An appropriate calibration adds the absolute time scale.
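If substitutions are proportional to time, one dated divergence is enough to turn differences into absolute times. A minimal sketch of such a calibration; all numbers are hypothetical and only serve to show the arithmetic.

```python
def calibrate_rate(differences, known_time):
    """Differences accumulated per unit of time, from one dated divergence."""
    return differences / known_time

def estimate_time(differences, rate):
    """Divergence time implied by a number of differences at a constant rate."""
    return differences / rate

# Hypothetical calibration point: 20 differences between two lineages
# whose divergence is dated to 250 million years ago.
rate = calibrate_rate(20, 250.0)

# Hypothetical undated pair with 50 differences.
print(estimate_time(50, rate))
```

The estimate is only as good as the assumption of a constant rate and the quality of the calibration point.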
[Figure: Superoxide dismutase. Number of amino acid differences (0 to 80) plotted against the paleontological divergence estimate (0 to 1000)]
Tomoko Ohta (1933-)
Errors
[Figure: tree from an ancestor to the lineages A, B, C and D with the segments numbered 1 to 4]
The length of a tree segment is a measure of the duration of a lineage.
Is it possible to convert numbers of character changes into evolutionary time scales?
Sequence 1: T→C T C A→G A G→C→G T G→C→A A A C G
Sequence 2: T T C A→G A G T G C C C T
Single substitution (T→C)
Parallel substitution (A→G in both sequences)
Back substitution (G→C→G)
Multiple substitution (G→C→A)
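The Jukes Cantor model assumes that the probability λ of any transition within these 4 nucleotides is the same.
[Figure: the four nucleotides A, T, C and G; each transition between two of them has the probability λ/3]
The transition probability is assumed to be time independent (every period has the same transition probability). The probability distribution follows an Arrhenius model.
$p_{trans} = 1 - e^{-\lambda t}$

Applying the molecular clock
A→T: $\frac{1}{4}(1 - e^{-\lambda t})$
A→G: $\frac{1}{4}(1 - e^{-\lambda t})$
A→C: $\frac{1}{4}(1 - e^{-\lambda t})$
A→A: $1 - \frac{3}{4}(1 - e^{-\lambda t}) = \frac{1}{4}(1 + 3e^{-\lambda t})$
$p_{trans} = 3 \left( \frac{1}{4}(1 - e^{-\lambda t}) \right) = \frac{3}{4}(1 - e^{-\lambda t})$
What is the probability to get exactly x differences out of n possible?
We apply the binomial, where p is the probability of a difference at one site:
$L(x;t) = \binom{n}{x} p^{x} (1-p)^{n-x}$
$\ln L(x;t) = \ln \binom{n}{x} + x \ln\left(\frac{3}{4} - \frac{3}{4}e^{-\lambda t}\right) + (n-x) \ln\left(1 - \left(\frac{3}{4} - \frac{3}{4}e^{-\lambda t}\right)\right)$
We apply the principle of maximum likelihood.
We are interested in the time that maximizes this function. Hence we need the root of the first derivative:
$t = -\frac{1}{\lambda} \ln\left(1 - \frac{4x}{3n}\right)$
The distances t are now used in distance matrices to construct the phylogenetic tree.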
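The maximum likelihood solution λt = -ln(1 - 4x/(3n)) corrects the observed number of differences for substitutions that hit the same site more than once. A minimal sketch of this correction; it returns the product λt because the rate λ itself is not known from the sequences, and the function name is chosen here for illustration.

```python
from math import log

def jukes_cantor_distance(x, n):
    """Return lambda*t = -ln(1 - 4x / (3n)) for x differences among n sites."""
    argument = 1.0 - 4.0 * x / (3.0 * n)
    if argument <= 0.0:
        # With 3/4 or more of the sites different the estimate is undefined.
        raise ValueError("too many differences: the sequences are saturated")
    return -log(argument)

# Example: 5 differences among 13 sites.
print(jukes_cantor_distance(5, 13))
print(5 / 13)  # uncorrected proportion of differences, for comparison
```

The corrected distance is always at least as large as the uncorrected proportion x/n, and it cannot be computed once the sequences differ at three quarters of their sites or more.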
Paleontological versus molecular timescales
[Figure: morphological change and genetical change plotted along a time axis]
Molecular divergence of placental orders (120-140 mya)
First fossils of placental orders (65 mya)
Eomaia (125 mya)
[Figure: morphological change and genetical change plotted along a time axis]
Molecular divergence (4-5 mya)
First fossils of erect hominids(6-7 mya)
Gene flow up to 2 mya
Molecular estimates frequently point to much more ancient divergences of lineages than estimates based on the fossil record.
The reason is the different speed of morphological and genetic change.
Changes in genetic constitution accumulate to a point where basic regulatory elements are involved.
Changes in genetic constitution involve first basic regulatory elements.
Matching of molecular and paleontological timescales in Echinodermata
For the majority of Echinoderm subtaxa molecular divergence estimates are higher than
the paleontological estimates.
Taxon | First record (species) | Age of first record | Duration of record | Missing records
Euechinoidea | Serpianotiaris coaeva | 235–240 | 45 | 15
Acroechinoidea | Diademopsis serialis | 205–210 | 0 | 0
 | Acrosalenia chartroni Lambert | 200–205 | 0 | 0
Diadematoida | Gymnotiara varusense | 190–195 | 10 | 5
 | Plesiechinus hawkinsi Jesionek | 195–200 | 5 | 5
Irregularia | Plesiechinus hawkinsi | 195–200 | 15 | 0
Microstomata | Galeropygus sublaevis | 180–185 | 0 | 0
Neognathostomata | Galeropygus sublaevis | 180–185 | 80 | 10
Cassiduloida | Hungaresia ovum | 85–90 | 90 | 15
Clypeasteroida | Nucleopygus angustatus | 100–105 | 50 | 0
Scutellina | Eoscutum doncieuxi | 50–55 | 0 | 0
Laganiformes | Sismondia logotheti | 50–55 | 0 | 0
Scutelliformes | Eoscutum doncieuxi | 50–55 | 25 | 0
Atelostomata | Hyboclypus ovalis | 175–180 | 25 | 0
Spatangoida | Disaster moeschi | 160–165 | 65 | 5
Paleopneustina | Polydesmaster fourtaui | 90–95 | 0 | 5
Brissidea | Micraster distinctus | 95–100 | 45 | 0
 | Meoma antiqua Arnold | 40–45 | 0 | 0
 | Eupatagus haburiensis Khanna | 50–55 | 15 | 0
Stirodonta | Atlasaster jeanneti | 195–200 | 30 | 0
Camarodonta | Glyptocyphus difficilis | 115–120 | 0 | 0
Echinoida | Pseudarbacia archaici | 90–95 | 65 | 65
Echinoida | Lytechinus axiologus | 45–50 | 0 | 5
Cidaroidea | Eotiaris keyserlingi | 250–255 | 255 | 0
Echinothurioida | Pelanechinus oolithicum | 170–175 | 175 | 45
Pedinoida | Hemipedina hudsoni | 205–210 | 210 | 0
Aspidodiadematidae | Gymnotiara varusense | 190–195 | 195 | 35
Diadematidae | Farquharsonia crenulata | 165–170 | 170 | 0
Echinoneoida | Pygopyrina icaunensis | 160–165 | 165 | 5
Cassidulidae | Rhyncholampas macari | 65–70 | 70 | 30
Echinolampadidae | Hungaresia ovum | 85–90 | 90 | 35
Clypeasterina | Clypeaster calzadai | 40–45 | 45 | 20
Fibularidae | Echinocyamus gurnahensis | 50–55 | 55 | 0
Laganidae | Sismondia logotheti | 50–55 | 55 | 10
Mellitidae | Encope ciae | 20–25 | 25 | 0
Astriclypeidae | Amphiope duffi | 25–30 | 30 | 0
Holasteroida | Collyrites ellipticus | 165–170 | 170 | 5
Schizasteridae | Periaster elatus | 90–95 | 95 | 0
Paleopneustidae | Polydesmaster fourtaui | 90–95 | 95 | 0
Archaeopneustids | Heterobrissus salvae | 40–45 | 45 | 0
Brissidae | Meoma antiqua | 40–45 | 45 | 0
Spatangidae | Granopatagus lonchophorus | 35–40 | 40 | 0
Loveniidae | Hemimaretia subrostrata | 35–40 | 40 | 0
Arbacioida | Atopechinus cellensis | 165–170 | 170 | 0
Somopneustids | Phymechinus mirabilis | 155–160 | 160 | 0
Temnopleuridae | Zeuglopleurus costulatus | 95–100 | 100 | 0
Echinidae | Psammechinus dubius | 15–20 | 20 | 0
Strongylocentrotidae | Strongylocentrotus antiquus | 20–25 | 25 | 0
Echinometridae | Plagiechinus priscus | 25–30 | 30 | 10
Toxopneustidae | Lytechinus axiologus | 45–50 | 50 | 5
Trigonocidaridae | Arbacina monilis | 15–20 | 20 | 30
Sum | | | 3210 | 360
[Figure: molecular divergence estimate (0 to 300) plotted against the paleontological divergence estimate (0 to 300)]
Data from Smith et al. (2006)
Data from Qun et al. (2007)
Divergences                 Earliest fossil record   Molecular estimates
Placental-marsupials        175–145                  185–161
Amniotes-amphibians         310                      375–345
Myriapods-chelicerates      530                      705–579
Mosses-vascular plants      450                      899–515
Crustaceans-insects         530                      726–539
Echinoderms-chordates       <530                     1001–586
Spiralian-Ecdysozoans       560–540                  643–544
Protostomes-deuterostomes   560–540                  678–556
Arthropods-chordates        560–540                  1200–588
Cnidaria-bilaterians        <600                     724–615
Sponges-chordates           <600                     1350–592
Do all phylogenetic trees have a single root?
Darwin’s first principle: All species of a given taxon have a common ancestor.
Parsimony analysis cannot answer this question. A brush would always have a lower number of character changes.
[Figure: scala naturae. The scale of organization plotted against time, starting from the spontaneous origin of simple life forms]
A brush means:
• No speciation.
• If we accept that extinction occurs this would mean a constant decrease in the number of species.
• Character change within whole species.
• No genetic (character) variability within populations.
• Extreme longevity of lineages.
Theory of Lamarck
But horizontal gene transfer might, at least in bacteria, result in networks and rings!
Evolution and development (EvoDevo)
August Weismann (1834-1914)
The soma - germ line distinction
makes it impossible to transmit acquired characters
to the next generation
Ernst Haeckel(1834-1919)
Theory of recapitulation
The ontogeny of advanced species
recapitulates respective stages in ancestral
forms.
In fact, only basic genetic programs are conserved and modifications at all stages of ontogenesis
appear. Haeckel’s rule is only a crude approximation.
Today’s reading
Phylogenetic systematics: http://evolution.berkeley.edu/evolibrary/article/phylogenetics_01
Cladistics: http://en.wikipedia.org/wiki/Cladistics
Ernst Haeckel: Kunstformen der Natur (Internet exhibition of original drawings: http://caliban.mpiz-koeln.mpg.de/~stueber/haeckel/kunstformen/liste.html)
The modern molecular clock: http://awcmee.massey.ac.nz/people/dpenny/pdf/BromhamPenny_2003.pdf