(Garrard 2009)


    Cognitive archaeology: Uses, methods, and results

    Peter Garrard

    University of Southampton School of Medicine, Division of Clinical Neurosciences, Southampton General Hospital,

    LD69 South Path and Lab Block, Southampton SO16 6YD, UK

    Received 9 April 2008; received in revised form 25 July 2008; accepted 29 July 2008

    Abstract

    The earliest stages of cognitive decline in cases of slowly progressive dementia are difficult to pinpoint,

    yet detection of the preclinical period of the illness is likely to be of significant importance to

    understanding Alzheimer's disease and other slowly progressive dementias at both clinical and biological levels.

    A number of authors have used retrospective analysis to describe preclinical linguistic decline in written

    texts and spoken language samples. This paper reviews the methods available for classifying and

    comparing such samples, and presents some exploratory analyses of historical texts derived from verbatim

    records of preclinical spoken activity. Change in the nature of the language used by Harold Wilson (Prime

    Minister of the United Kingdom 1964–1970 and 1974–1976) is quantified in the light of a later diagnosis

    of probable Alzheimer's disease and historical uncertainties about his final months in office.

    © 2008 Elsevier Ltd. All rights reserved.

    Keywords: Alzheimer's disease; Mild cognitive impairment (MCI); Textual analysis; Digital stylometry

    1. Introduction

    Functional reserve is a property of many biological systems whose performance depends on

    the cumulative effects of populations of similarly-structured subunits. Mammals are, for

    example, endowed with pairs of organs (lungs, kidneys, adrenal glands, and gonads), whose

    physiological effects under normal conditions are not detectably changed if one of the pair is

    lost or otherwise rendered inoperative, and the other partially compromised. Unpaired organs,

    such as the liver and heart, will also continue to meet circulatory and metabolic/digestive

    E-mail address: [email protected]

    0911-6044/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.

    doi:10.1016/j.jneuroling.2008.07.006

    Journal of Neurolinguistics 22 (2009) 250–265, www.elsevier.com/locate/jneuroling


    demands in the face of marked depletion of their constituent cells. Given the obvious adaptive

    advantages of this redundancy to creatures competing for reproductive success in a Hobbesian

    environment, it would be surprising if the organ system that mediates behaviour, learning,

    perception, planning, decision making, and communication was not similarly endowed.

    In its mature state, the analogy with other paired organs is not applicable to the hemispheric structure of the brain (though the similarity may hold during development (de Bode & Curtiss,

    2000; Vicari et al., 2000)). Moreover, the demonstrable heterogeneity of function within

    different regions of the cerebral cortex means that the effects of a focal insult are seldom

    completely ameliorated by compensatory activity in undamaged regions. Nonetheless, the

    brain's capacity to support normal levels of cognitive activity in the face of gradual decline in

    the structural and functional integrity of its constituent elements implies that a degree of

    redundancy is indeed built into the system's architecture, a point that is strikingly illustrated by

    the not uncommon finding of marked cerebral atrophy on CT or MRI brain scans of cognitively

    normal elderly subjects (Matsubayashi et al., 1992).

    The existence of a functional reserve capacity in the brain (Fig. 1) is supported, and can to

    a limited extent be quantified, by postmortem studies of the nigrostriatal systems of aged brains.

    Neuronal depletion within these paired mid-brain structures produces the classical, idiopathic

    form of Parkinson's disease (a syndrome of progressive motor dysfunction characterized by the

    emergence of tremor, rigidity and loss of dexterity), whose earliest effects are often relieved by

    the pharmacological supplementation of the neurotransmitter dopamine. Postmortem

    examination of the striatum has revealed cases with marked degrees of dopamine depletion associated

    with only mild motor symptoms at the time of death, suggesting a significant, if variable,

    functional reserve capacity inherent in the system (Bernheimer et al., 1973).

    The notion of a reserve capacity for more global measures of cognitive function has also been upheld by postmortem studies focusing on the common causes of late-onset dementia

    syndromes. The most pervasive of these is Alzheimer's disease (AD), which gives rise to

    [Figure 1: plot of % integrity (0 to 100) against years (0, 5, 10, ?), showing curves for neuronal function and cognitive function, with points A, B, and C marked.]

    Fig. 1. Illustrative representation of the course of a neurodegenerative condition at the functional and neuronal levels.

    The diagram includes three key points in the clinical evolution of dementia: the onset of the earliest symptoms (point

    A); the diagnosis of the condition (point B); and death (point C). The duration of the period to the left of point A is

    unknown.


    a progressive and irreversible decline in a range of cognitive abilities, typically beginning with

    episodic memory (Galton et al., 2000). The clinical features of AD normally begin in the sixties

    or seventies, with diagnosis depending on the recognition of the typical pattern of

    symptomatology and the exclusion of other, more unusual causes of late-life cognitive decline.

    Verification of the diagnosis, however, requires the demonstration of amyloid plaques (AP) and neurofibrillary tangles (NFT) within the substance of the brain, normally at autopsy.

    Although postmortem examination is carried out in only a minority of cases, careful

    correlations between clinical and pathological findings have revealed a complex relationship between

    these two descriptive levels. The seminal studies of Braak and Braak (1995) traced a characteristic

    sequence of NFT formation that followed a neuroanatomical pathway through entorhinal, limbic,

    and finally isocortical stages. Tomlinson, Blessed, and Roth (1970) demonstrated that cognitive

    function could be preserved in the presence of established degenerative disease, suggesting that

    clinical dementia, like Parkinsonism, may occur when pathological change exceeds a certain

    threshold level. More recent community-based surveys of autopsy findings in an unselected

    sample of elderly people in the United Kingdom found evidence of vascular and/or degenerative

    changes in almost 80% (Neuropathology Group of MRC CFAS, 2001), a figure that sits in striking

    contrast to estimated clinically-defined dementia prevalence rates among the oldest old of

    around 25% (Fichter et al., 1995). Perhaps most striking of all was a cross-sectional

    neuropathological study of incidental Alzheimer changes in postmortem brains across a wide age spectrum,

    which suggested that a disease whose clinical manifestations typically appear in the seventh or

    eighth decade of life may begin to develop in early adulthood (Ohm et al., 1995).

    The ability to identify and measure the earliest phase of AD, ie after the earliest

    pathological changes but before the patient meets diagnostic criteria for dementia (ie anywhere to the

    left of point B in Fig. 1), could provide important insights into the phenomenon of cognitive

    reserve. Since the duration of this presymptomatic period reflects the capacity of the

    reserve, any marked degree of variability would clearly be of further interest (as well as

    enormous socioeconomic importance) if any environmental factors (eg diet, education,

    intellectual engagement in later life) could be shown to be positively or negatively correlated

    with it. Before discussing existing and future attempts to acquire this information, however, the

    clinical characteristics of patients with established AD will be briefly reviewed.

    Neuropsychological studies of AD have revealed a cumulative pattern of deficits mapping on to

    the anatomical pattern of progression described by Braak and Braak (1995): episodic memory

    deficits (attributable to mesial temporal and limbic involvement) usually occur in the earliest stages,

    while effects on semantic memory, visuospatial skills, word production, and executive function (indicating disruption of neocortical regions) emerge later. In a minority of (usually younger onset)

    cases, bimanual praxis is the earliest and clinically dominant feature (biparietal variant AD)

    (Galton et al., 2000; Ross et al., 1996). Given the language system's complexity and dependence on

    multiple and widespread cortical regions, it is perhaps not surprising that detailed studies of

    language processing in AD have provided some of the most valuable contributions at the

    neuropsychological level. Analyses of individuals and groups have demonstrated disruption in

    production and comprehension at both word and sentence level (Croot, Hodges, & Patterson, 1999; Croot

    et al., 2000; Kempler et al., 1998), and disintegration of semantic memory (Garrard et al., 1998).

    2. Measuring cognitive reserve

    Neuropsychological data is not normally acquired until there are already clinical grounds for

    making a diagnosis of AD (ie some way down the cognitive function slope illustrated in Fig. 1),


    and is therefore unable to tell us anything about the trajectory of the line prior to point B.

    Moreover, many studies have employed assessment techniques based on standardised tasks

    such as word fluency and picture naming, all of which may be subject to effortful

    compensation, or to ceiling or floor effects depending on premorbid educational level (O'Carroll &

    Ebmeier, 1995). Finally, because theoretically motivated methods of evaluation are designed to test hypotheses about functional organisation, such tests tend to be sensitive to a relatively

    narrow subset of deficits.

    The importance of the earlier, prediagnostic, period of patients' cognitive histories has given

    rise to the concept of mild cognitive impairment (MCI), a state in which a patient may be

    aware of and report symptoms of cognitive dysfunction, but which is at too early a stage of

    progression to justify a diagnosis of dementia (between points A and B in Fig. 1)

    (Bruscoli & Lovestone, 2004). As we have seen, diagnosing AD is probabilistic (ie a

    judgement of the future likelihood of finding plaque and tangle pathology in the brain), and the

    recognition of MCI adds a further layer of uncertainty, namely the need to distinguish

    between an essentially stable state of mild impairment (eg due to anxiety, depression, or the

    ageing process) and one that is destined to deteriorate at some future time (ie those in the

    earliest stages of AD).

    It is perhaps not surprising that patients with MCI are neuropsychologically heterogeneous

    (Nordlund et al., 2005), nor that the proportion of MCI cases who go on to develop dementia

    within a year is highly variable, in some samples lower than 10% (Bruscoli & Lovestone,

    2004). Consequently, using the duration of MCI as a surrogate for the cognitive reserve

    capacity is difficult to justify, and highlights the need for a retrospective approach that allows

    a reliable index of the duration of the preclinical period to be reproducibly obtained.

    A number of studies have already demonstrated how this might be effectively achieved using archived language samples dating back years, or even decades, before the onset of cognitive

    symptoms. Such outputs are free from the distorting effects that knowledge or suspicion of

    incipient cognitive decline might have on performance, and are interpreted under three basic

    assumptions: 1) that the material in question is reliably datable; 2) that there are measurable

    differences between the characteristics of such samples from individuals with normal and

    disordered cognition; and 3) that these differences become more pronounced with progression

    of the disease.

    If these conditions are met, then the onset of any relevant change in a text corpus should be

    identifiable. This will, in turn, allow objective and reproducible estimates of the duration of the

    presymptomatic and preclinical phases of the disease to be made. If obtained from large enough cohorts of affected individuals, such measurements could provide insights into the factors

    determining variations in preclinical states, and suggest strategies for optimizing them. Progress

    towards this goal has come from retrospective analyses of language samples that have been

    recorded or archived for various reasons.

    2.1. The Nun study

    This ongoing longitudinal study traces incident dementia among members of a religious

    order using interval neuropsychological assessment and postmortem examination. The study

    has also examined premorbid linguistic data produced by participants as many as fifty years before the appearance of the earliest symptoms of AD (Snowdon, 2003). Between 1931 and

    1943, at ages of between 18 and 32 years, subjects were required to write their autobiographies

    on entry into the order. When, many years later, these texts were analysed for measures of


    syntactic complexity and idea density, lower scores on both dimensions predicted poorer

    performance on memory and other cognitive tests many decades later. Intriguingly, a subgroup

    analysis identified reliably lower initial idea density scores in individuals in whom Alzheimer

    pathology was demonstrated at postmortem. The latter finding was interpreted as suggesting the

    existence of common factors underpinning both neurocognitive development and susceptibility to AD (Snowdon et al., 1996).

    Updated autobiographies, written by a subset of the original entrants in the late 1950s and

    again in the late 1980s, were used for a longitudinal comparison, which supported an apparently

    linear decline on both measures over the course of the lifespan (Kemper et al., 2001).

    Surprisingly, the rate of decline did not differ between those who were later diagnosed with

    dementia and those who remained cognitively healthy into later life, though a similar study using

    data from more regular language assessments of a different group of volunteers did demonstrate

    a difference in the rate of decline in idea density (Kemper, Thompson, & Marquis, 2001).

    Several aspects of the Nun Study data are clearly relevant to the question of accurately

    estimating cognitive reserve: the first is that the retrospective linguistic data employed (ie the

    written diary entries) were not only naturalistic but free from any compensatory biases that

    might derive from an awareness of being tested. A second is the uniformity, over a number of

    behavioural and demographic dimensions, of the participants themselves. As Kemper notes

    (Kemper et al., 2001):

    Participants in the Nun Study have led relatively homogeneous adult lives. Participants

    have the same reproductive and marital histories, have similar social activities and

    support throughout their adult lives, have similar occupations and incomes, have equal

    access to preventative health and medical care, and do not smoke or drink alcohol

    excessively [p. 238].

    Because some or all of these lifestyle factors are likely to be important in determining the

    robustness of the cognitive reserve, however, informative variations in the linguistic data may

    be missed when study participants are well matched. Moreover, those subjects who were

    followed up over their entire lifetime were observed on at most three occasions, perhaps

    insufficient to detect subtle changes in language samples predating the clinical onset of

    dementia. Although Kemper et al. assumed that the decline seen in both the demented and

    non-demented groups was linear, a larger number of observations might have demonstrated

    a departure from linearity: for example, a longer maintenance period followed by a precipitate

    decline in successfully aging subjects, and a more gradual decline in those who developed dementia (see Fig. 2). Such longitudinal differences would be compatible with variations in the

    cognitive reserve.

    2.2. The Iris Murdoch study

    The celebrated English novelist and philosopher Iris Murdoch (1919–1999) was diagnosed

    with Alzheimer's disease in 1997, following deterioration in her cognitive abilities, particularly

    marked in the domain of language. Postmortem examination of her brain later confirmed the

    presence of diagnostic amyloid plaques and neurofibrillary tangles as the dominant pathological

    feature. Murdoch may have been the first to notice her own decline; in an interview published in The Observer, she commented on an uncharacteristic writer's block that had plagued her while

    she was writing her final novel, Jackson's Dilemma, in 1995. To look for prediagnostic

    Alzheimer-like characteristics in the language of this work, Garrard et al. (2005) studied stylistic,


    syntactic and lexical attributes, comparing them with works composed at earlier periods in her

    four-and-a-half decade long writing career. To enhance the power of the analyses, digitised

    versions of the complete texts were used. Concordance software (Watt, 2002) generated word

    lists, type-to-token ratios, and collocation (pairs of word types occurring at fixed intervals from

    one another) statistics. Word and character counts were used to derive sentence length distributions, as an indirect index of syntactic complexity (Rosenberg & Abbeduto, 1987).
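
    As a rough illustration of the lexical and sentence-length measures just described, the Python sketch below computes a type-to-token ratio, its cumulative form, adjacent-word collocation counts, and a sentence length distribution. It is not the concordance software used in the study: the tokenizer, the naive sentence splitter, the sample text, and all function names are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case word tokens; apostrophes are kept inside words."""
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def type_token_ratio(tokens):
    """Number of distinct word types divided by the number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def cumulative_ttr(tokens, step):
    """Type-to-token ratio over the first step, 2*step, ... tokens."""
    return [type_token_ratio(tokens[:end])
            for end in range(step, len(tokens) + 1, step)]


def collocations(tokens, gap=1):
    """Counts of word pairs occurring `gap` positions apart (gap >= 1)."""
    return Counter(zip(tokens, tokens[gap:]))


def sentence_lengths(text):
    """Words per sentence, splitting naively on '.', '!' and '?'."""
    lengths = (len(tokenize(s)) for s in re.split(r"[.!?]+", text))
    return [n for n in lengths if n > 0]


# Invented sample text, used only to exercise the functions.
sample = "The sea was calm. The sea was grey and the sky was grey."
tokens = tokenize(sample)
print(type_token_ratio(tokens))        # 7 types over 13 tokens
print(cumulative_ttr(tokens, step=4))
print(collocations(tokens).most_common(2))
print(sentence_lengths(sample))        # [4, 9]
```

    A falling cumulative ratio is expected in any text as it grows longer, so comparisons between works are only meaningful at matched token counts.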

    Stylistic and syntactic analyses revealed no detectable differences between the three works.

    Whether this reflected a relative preservation of syntactic ability in the language disorder of

    Alzheimer's disease, at least in some cases (Croot et al., 1999), or, perhaps more likely, the

    insensitivity of the methods used to assess them (Bates et al., 1995; Kempler et al., 1998), is

    unclear. The comparison did, however, detect differences in the lexical characteristics of the

    three books. Data indicated an initial phase of enhancement over the first twenty years of IM's

    career, as measured by both the variety (cumulative type-to-token ratios) and frequency (using

    published norms (Francis, 1967)) of the vocabulary. This was followed by a later decline on

    both measures over the last twenty-five years of her life. All these findings (the absence of any structural variation, and the marked difference in word frequency without a similar effect of

    word length, coupled with a more repetitive and higher frequency vocabulary) mirrored the

    changes that have been consistently documented in the spontaneous spoken language of early

    Alzheimer's disease sufferers (Croisile et al., 1995; Croisile et al., 1996; Garrard et al., 1998;

    Kremin et al., 2001).

    Indirectly, the Murdoch data also implied the existence of a detectable gradient in abnormal

    linguistic characteristics, as formal neuropsychological testing two years after AD was

    diagnosed demonstrated a similar, though more severe, impoverishment of vocabulary, semantic

    impairment, frequency-dependent anomia, and a surface dysgraphia (Garrard et al., 2005).

    A more decisive demonstration of this putative gradient, however, would clearly require a comparison of like with like, and a more extensive survey of the impressive literary output

    from the last fifteen years of Murdoch's working life (Table 1) is likely to yield information

    about its nature and temporal characteristics.

    [Figure 2: plot of % integrity (0 to 100) against years (0 to 20), showing curves for neuronal function, rapid cognitive decline, and 'successful aging'.]

    Fig. 2. Hypothetical course of two dementia sufferers with the same rate of neuronal degradation: the slowly progressive

    case possesses a larger cognitive reserve than the more rapidly progressive, and therefore enjoys more symptom free

    years.


    2.3. Studies of archived spoken language

    Spoken output generally requires a greater degree of spontaneity than written, offers fewer opportunities for off-line revision, and may therefore be a more sensitive indicator of change.

    One of the earliest retrospective language studies was carried out by Brian Butterworth, using

    televised speeches of the former U.S. President Ronald Reagan. Occurrence rates for errors in

    both content and syntax, and for abnormally long word-finding pauses, were significantly

    higher during Reagan's debates against Walter Mondale in 1984 than during similar events in

    1980, when he was campaigning against the incumbent President Carter [unpublished data].

    Reagan was famously diagnosed with Alzheimer's disease in 1994, five years after the end of

    his second term as President. The announcement of the diagnosis, and the implications of its

    progressive nature, gave rise to speculation about Reagan's mental performance while he was

    still in office. The journalist Lesley Stahl, for example, describes an interview with the

    President in which a vacant Reagan barely seemed to realize anyone else was in the room (Stahl,

    1999). Regardless of the significance of such one-off anecdotal observations, however, the

    similarities between Butterworth's language error data and the language problems characteristic

    of Alzheimer's disease (Schwartz & Moscovitch, 1990) would suggest that the earliest

    cognitive effects of the disease were detectable at least ten years before a diagnosis was made.

    3. Automated discourse analysis

    The techniques used to define differences between texts and between samples of continuous discourse that have been described so far (such as deriving measures of syntactic complexity

    and idea density, and comparing lexical frequency rates using published databases) have for

    the most part been normative, top-down methods. Yet this general approach has obvious

    disadvantages: the most obvious is its labour intensiveness, which inevitably limits the size of

    the text samples to which it can be applied; for informative studies to be conducted on large,

    longitudinal written and spoken samples, this is clearly impractical. A second difficulty is that

    reliance on lexical frequency norms alone as an index of linguistic change is subject to error,

    because i) low frequency words tend to be under-represented in the available databases, and ii)

    word usage is subject to prevailing fashion and other transient influences that may not have

    been current when the norms were compiled.

    Automated, data-driven methods of analysis offer a potential solution to all these difficulties,

    and considerable progress has been made in the field of text classification over the past decade

    (Feldman & Sanger, 2007; Forsyth, 1999). Various techniques have been validated in the fields

    Table 1

    Titles and years of publication of Iris Murdoch's last eight novels

    Title                           Year published
    The Sea, The Sea                1978
    Nuns and Soldiers               1980
    The Philosopher's Pupil         1983
    The Good Apprentice             1985
    The Book and the Brotherhood    1988
    The Message to the Planet       1990
    The Green Knight                1994
    Jackson's Dilemma               1995


    of authorship attribution (Love, 2002), genre analysis (Stamatatos, Fakotakis, & Kokkinakis,

    2000) and topic identification (Clifton, Cooley, & Rennie, 2004), providing the basis for a range

    of methods for specifying differences between texts, all of which can be rapidly implemented

    using digital text samples as input. A comprehensive survey of these methods is beyond the

    scope of the present article, but in view of the potential usefulness to the enterprise of presymptomatic discourse analysis in cognitive ageing, they will be briefly reviewed.

    3.1. Digital stylometry

    Burrows pioneered a method for quantifying differences between texts based on the means

    and standard deviations of the proportional frequencies of the n commonest words across

    a corpus of contemporary texts of the same genre (Burrows, 2004). The mean of the

    z-transformed values associated with each word in the target texts yields a summary statistic (Delta),

    the magnitude of which varies inversely with similarity. Burrows showed that pairs of texts

    originating from male and female authors, from Northern and Southern hemispheres, and from

    the 19th and 20th centuries all yielded higher Delta values in between- than within-group

    comparisons (Burrows, 2003).

    Burrows' Delta depends on the frequency distributions of the commonest word types, the

    majority of which are grammatical (function) words. Similar measures based on mid- or

    low-frequency usages (which include more lexical, or content words) are differentially sensitive to

    texts of differing length (Burrows, 2006).
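
    The calculation behind Delta can be sketched in a few lines of Python. The version below takes the n commonest words of a reference corpus, z-transforms each word's proportional frequency in two target texts against the corpus mean and standard deviation, and returns the mean absolute difference between the two sets of z-scores. Using the absolute difference is the commonly cited formulation and is an assumption relative to the summary above; the function names and the toy corpus are likewise illustrative only.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_frequencies(tokens, vocabulary):
    """Proportional frequency of each vocabulary word in one text."""
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for an empty text
    return {word: counts[word] / total for word in vocabulary}


def burrows_delta(text_a, text_b, corpus, n=30):
    """Mean absolute difference between the z-scores of two texts.

    text_a, text_b: lists of word tokens to compare.
    corpus: list of token lists that sets the word list, means and SDs.
    n: number of commonest corpus words to use.
    """
    pooled = Counter(token for document in corpus for token in document)
    vocabulary = [word for word, _ in pooled.most_common(n)]
    corpus_freqs = [relative_frequencies(doc, vocabulary) for doc in corpus]
    freqs_a = relative_frequencies(text_a, vocabulary)
    freqs_b = relative_frequencies(text_b, vocabulary)

    differences = []
    for word in vocabulary:
        values = [freqs[word] for freqs in corpus_freqs]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # a word used identically everywhere cannot discriminate
        z_a = (freqs_a[word] - mu) / sigma
        z_b = (freqs_b[word] - mu) / sigma
        differences.append(abs(z_a - z_b))
    return mean(differences) if differences else 0.0


# Toy corpus, invented for the example; real use needs far longer texts.
corpus = [
    "the cat and the dog".split(),
    "the bird and a fish and a frog".split(),
    "a cat and a dog and the bird".split(),
]
print(burrows_delta(corpus[0], corpus[1], corpus, n=3))
```

    Larger values indicate less similar texts, and a text compared with itself scores zero.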

    3.2. N-gram analysis

    An extension of the word-count method is to use the frequencies of any recurring feature of

    a text from letters upwards, and compare the occurrences of each across samples. N-grams

    above the letter level can be flexibly defined in terms of words, parts of speech (using

    automated parsing routines), and letter or word collocations, allowing rapid automated comparison

    of texts over a range of different dimensions of interest. In the field of forensic linguistics the

    method has proved sensitive to differences at lexical, syntactic, and stylistic levels (Chaski,

    2004). The approach has also proved successful as a basis for authorship attribution and topic

    identification (Peng, Schuurmans, & Wang, 2004).
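
    A minimal Python sketch of the idea follows: the same routine extracts n-grams from a string (character n-grams) or from a list of words (word n-grams), and two texts are compared through their relative-frequency profiles. The distance used here, a sum of absolute differences, is simply one convenient choice for illustration and is not taken from the studies cited above; all function names are assumptions.

```python
from collections import Counter


def ngrams(items, n):
    """All runs of n consecutive items (characters or tokens)."""
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def ngram_profile(items, n):
    """Relative frequency of each n-gram in one sample."""
    grams = ngrams(items, n)
    if not grams:
        return {}
    counts = Counter(grams)
    return {gram: count / len(grams) for gram, count in counts.items()}


def profile_distance(profile_a, profile_b):
    """Sum of absolute frequency differences over n-grams seen in either."""
    keys = set(profile_a) | set(profile_b)
    return sum(abs(profile_a.get(k, 0.0) - profile_b.get(k, 0.0)) for k in keys)


# Character bigrams from a string, word bigrams from a token list.
print(ngrams("abcd", 2))
print(ngrams(["the", "cat", "sat"], 2))

early = ngram_profile("the sea the sea the sea".split(), 2)
late = ngram_profile("the sea and the sky".split(), 2)
print(profile_distance(early, late))
```

    The distance is 0 for identical profiles and reaches 2 when the two samples share no n-grams at all.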

    3.3. Entropy

    Juola (2003) has proposed a method for estimating the inherent redundancy in a piece of

    continuous discourse or text. In the framework of information theory (Shannon & Weaver,

    1949) entropy is proportional to the number of binary decisions required to determine an

    unknown value. Where the values in question are letters of the alphabet, successful discovery of

    an unknown could be achieved heuristically by asking sequentially in which half of the

    alphabet, which half of that half, and so forth, the target letter is located. This algorithm would

    be needed to identify any member of a truly random sequence of letters, but the multiplicity of

    constraints that apply to connected discourse greatly reduces the candidate letters that may

    complete a fragment of text. A method for arriving at a comprehensive estimate of similarity between two documents based on one such constraint (ie the tendency for similar strings of

    characters to recur), is to determine the average number of consecutive characters in one

    document that matches all possible character n-grams within the other (Wyner, 1996). In the


    extreme case in which the two texts are identical, the value would be simply determined by the

    number of characters they (it) contained. In other pairs, higher values would result from more

    frequent usages of larger combinations of words. The method would therefore be suitable for

    estimating differences between an index text and multiple subsequent outputs by the same

    author to identify and timestamp the onset of progressive degrees of deviation.

    For these methods to be accepted as appropriate to the analysis of text or discourse passages

    in the field of cognitive ageing, they must first be shown to be reliably associated with the

    presence of underlying cerebral pathology (Garrard, in press). If they can, then in individuals

    who have left behind a datable record of spontaneous verbal activity spanning the

    presymptomatic, preclinical and symptomatic periods of disease, it should be possible robustly to

    identify the earliest vestiges of cognitive change. The usefulness of such a marker to the study

    of variations in cognitive reserve has already been discussed, but the detection and dating of

    AD-like changes in archived language may also prove important in other spheres. By way of an

    illustrative case-study I will outline the methods, and some preliminary results, from work

    currently in progress relating to a British Prime Minister, Harold Wilson, and the reasons for his

    sudden and unexplained resignation from office.

    4. The Harold Wilson project

    The political sphere provides a source of spoken language samples, faithfully transcribed

    and saved for posterity since the early 19th century, when Thomas Curson Hansard introduced

    the Official Report (usually referred to simply as Hansard, after its founder). Hansard contains

    transcripts of all spoken activity in the two Houses of Parliament. Although it cannot and does

    not always report every word said by a Member, departures from verbatim are seldom noticeable, and typically reflect deletions of repeated words, fillers and particles, as well as

    corrections of departures from grammatical convention. To illustrate this point, a recent extract

    from the Hansard version (A) and a verbatim transcript (B) taken from a live recording, with

    altered segments underlined, is reproduced below.

    [A]

    The Prime Minister: I thank my hon. Friend for taking up the cause of veterans in her

    constituency. She is absolutely right; last week the Health Secretary announced that

    veterans would be accorded priority treatment in the national health service, as they

    should be. He also announced that there will be a new community-based veterans' mental health care service, which will run for the next two years with independent evaluation.

    There are 150 mental health professionals working throughout defence, employed by the

    Ministry of Defence, and we are determined to do what we can to support not only our

    veterans but all those in our armed forces who do an outstanding job and to whom we owe

    a debt of gratitude and a duty of care. (Hansard, 2007).

    [B]

    The Prime Minister: Let let let me thank my honourable Friend for taking up the cause of

    vet- veterans in her constituency and she's absolutely right that last week the Health

    Secretary announced that er veterans would be accorded priority treatment in the national health service, as they should be. He has also announced that there'll be a new

    community-based veterans' mental healthcare service, and that will run for the next two

    years, with independent evaluation. There are hundred and fifty mental health

    professionals working across defence, er through employment by the MOD, and we are

    determined to do what we can to support not only our veterans, but all those in our armed

    forces who do an outstanding job and to whom we owe a debt of gratitude and a duty of

    care.

    Naturally, some parliamentary speeches recorded in Hansard would have been read from

    texts that may not even have been written by the speaker, so sampling of the archive should be

    limited to sessions in which verbal exchanges are less carefully planned. Prime Minister's

    Questions (PMQs), a twice-weekly¹ opportunity for members to interrogate the Prime Minister

    (or a deputy, if he is absent) on a range of matters, would seem to meet these requirements.

    Although the questions asked at PMQs are prepared in advance, these can easily be eliminated

    from the text to be analysed. The Prime Minister's responses, follow-up questions, and

    subsequent exchanges are, for the most part, unscripted. Indeed, the practice of speaking from

    a prepared script during PMQs attracts disapproval, if not derision².

    Longitudinal analysis of the speeches of one celebrated victim of late-life cognitive decline

    has the potential to contribute to the resolution of a longstanding historical dispute. Harold

    Wilson (HW) is one of the most fascinating characters to have appeared on the British political

    stage in recent times, and the motive for his unexpected resignation during a third term as Prime

    Minister in 1976 remains one of the great unsolved mysteries of British politics. HW was noted

    for his intellectual gifts and academic precocity, his prodigious memory, astute political sense,

    and razor-sharp wit in debate (Pimlott, 1993). His resignation has been variously attributed

    to an alleged involvement with the

    KGB (Mitrokhin, 2000), the impact of negative propaganda spread by rogue elements within

    the security services (Wright, 1987), and even a plot to replace him forcibly with an emergency

    administration headed by Lord Louis Mountbatten. A more prosaic explanation, however, is

    that in the months leading up to March 1976, HW was becoming aware of a progressive mental

    blunting which, much later, would turn out to have been the preclinical phase of a progressive

    degenerative dementia, very probably Alzheimer's disease (Pimlott, 1993).

    The existence of precisely dated language samples from HW and his contemporaries

    therefore raises the historically significant possibility that the time course of this preclinical

    period could be determined retrospectively. For a textual analysis on so large a scale,

    top-down methods would certainly be impractical. I will therefore present some

    preliminary results obtained from the Hansard archive using methods broadly similar to those

    described above under the heading of Digital stylometry.

    Transcripts of PMQs that were held while HW was Prime Minister (ie firstly between

    October 1964 and June 1970 and secondly between March 1974 and April 1976) were obtained

    and converted to ASCII format using optical character recognition software. Markers were

    added to identify the date at each change of year and month, while the identity and party

    affiliation of every speaker were recorded at the beginning of any speech or contribution to

    debate. The texts of questions themselves were omitted because they had been prepared in

    advance and would in some cases have been read from a script. Unattributable comments and

    interjections from the floor were also removed, as were entries that recorded the reading of

    a report or communique.
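
The preparation steps described above amount to a filter over attributed transcript entries. A minimal Python sketch follows. The record layout (`Contribution`), the `kind` labels and the sample entries are hypothetical illustrations, since the paper does not describe its tooling beyond OCR conversion and the addition of markers.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Contribution:
    """One transcript entry (hypothetical layout, not the study's own format)."""
    day: date
    speaker: Optional[str]  # None when a remark cannot be attributed
    party: Optional[str]
    kind: str               # assumed labels, see EXCLUDED_KINDS below
    text: str


# Entry kinds dropped before analysis: prepared questions, interjections
# from the floor, and readings of a report or communique.
EXCLUDED_KINDS = {"tabled_question", "interjection", "report"}


def keep_for_analysis(entries: List[Contribution]) -> List[Contribution]:
    """Keep attributable entries whose kind is not excluded."""
    return [
        e for e in entries
        if e.speaker is not None and e.kind not in EXCLUDED_KINDS
    ]


# Invented sample entries, for illustration only.
sample = [
    Contribution(date(1975, 5, 6), "Member A", "Con", "tabled_question", "..."),
    Contribution(date(1975, 5, 6), "HW", "Lab", "reply", "..."),
    Contribution(date(1975, 5, 6), "Member A", "Con", "supplementary", "..."),
    Contribution(date(1975, 5, 6), None, None, "interjection", "..."),
]
kept = keep_for_analysis(sample)
print([(e.speaker, e.kind) for e in kept])
```

Keeping the exclusions in a single set makes it easy to see, and to change, which kinds of entry are treated as scripted.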

    ¹ Until 1997 PMQs were held on Tuesday and Thursday mornings. Current practice is for the event to be held once

    a week, on Wednesdays.

    ² Judging by regular entries in the record such as "Honourable Members: Reading!".

    Three twelve-month epochs were selected for analysis: April 1965–March 1966; April

    1969–March 1970; and April 1975–March 1976 (the month immediately preceding HW's

    resignation announcement). These periods contain over 200 separate PMQs sessions, and

    HW answered in person at all but 16 (when the deputy leader or a senior Cabinet minister

    responded in his absence). If HW's resignation was, as hypothesised, influenced by his growing

    awareness of incipient cognitive decline (or, to use modern terminology, the emergence of the

    pre-Alzheimer MCI state), then we should expect consistent differences in his

    output across epochs that would not be detectable in the records of other speakers.
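
The comparison set up above needs word occurrence rates computed separately for each epoch and for each speaker group (HW versus everyone else). The sketch below shows one way this could be done in Python. The tokenizer, the `(day, speaker, text)` input layout and the sample speeches are assumptions made for illustration, and no lemma grouping (for example, pooling all forms of BE) is attempted.

```python
import re
from collections import Counter, defaultdict
from datetime import date

# The three twelve-month epochs named in the text (inclusive bounds).
EPOCHS = {
    "1965-1966": (date(1965, 4, 1), date(1966, 3, 31)),
    "1969-1970": (date(1969, 4, 1), date(1970, 3, 31)),
    "1975-1976": (date(1975, 4, 1), date(1976, 3, 31)),
}


def epoch_of(day):
    """Return the label of the epoch containing `day`, or None."""
    for label, (start, end) in EPOCHS.items():
        if start <= day <= end:
            return label
    return None


def tokenize(text):
    """Crude tokenizer (an assumption): upper-cased runs of letters/apostrophes."""
    return re.findall(r"[A-Z']+", text.upper())


def occurrence_rates(speeches, target="HW"):
    """Map (epoch, group) to {word: percentage of all words in that cell}."""
    counts = defaultdict(Counter)
    for day, speaker, text in speeches:
        label = epoch_of(day)
        if label is None:
            continue  # outside the sampled epochs
        group = "HW" if speaker == target else "others"
        counts[(label, group)].update(tokenize(text))
    rates = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        rates[key] = {w: 100.0 * n / total for w, n in counter.items()}
    return rates


# Invented speeches, for illustration only.
speeches = [
    (date(1965, 5, 4), "HW", "The House will be aware of the matter"),
    (date(1965, 5, 4), "Member A", "Is the Prime Minister aware of the report"),
    (date(1972, 1, 1), "HW", "This falls outside every epoch"),
]
rates = occurrence_rates(speeches)
print(rates[("1965-1966", "HW")]["THE"])
```

Expressing each count as a percentage of the words in its own cell keeps the rates comparable even though HW and the other speakers contribute very different amounts of text.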

    As before, Concordance (Watt, 2002) was used to generate word lists: the corpus contained

    537,932 word tokens and 12,993 unique word types (proper nouns included). Words with

    a frequency of 1 (hapax legomena) accounted for 4765 items. There were 1854 words with

    frequency 2, and 1033 with frequency 3, resulting in a heavily skewed distribution. The mean

    word frequency was 42.2, with standard deviation 588.9. Pearson analysis of a subset of these

    Table 2

    The 30 most frequently used words in the entire text sample, and their overall occurrence rates by epoch (expressed as

    a percentage of the total number of words in the epoch) in utterances made by HW (right-hand columns) and by all other

    speakers (left-hand columns)

                                  All other speakers                 HW
                                  Percentage of all words used in:   Percentage of all words used in:
    Word or lemma                 1965–1966  1969–1970  1975–1976    1965–1966  1969–1970  1975–1976
    THE                           7.58       7.6        8.33         7.07       7.23       7.95
    BE (all grammatical forms)    4.63       4.25       4.06         4.84       3.55       4.28
    OF                            3.45       3.55       3.49         6.30       3.35       3.33
    TO                            3.15       3.09       3.09         3.25       3.06       2.84
    THAT                          2.75       2.52       2.64         2.57       2.32       2.30
    IN                            2.12       2.36       2.2          2.10       2.48       2.19
    A or AN                       1.96       1.89       1.71         1.95       1.82       1.68
    I or ME                       1.89       1.73       1.86         2.65       2.36       2.83
    AND                           1.82       1.86       1.81         1.89       1.96       1.86
    HAVE (all grammatical forms)  1.73       1.78       1.66         2.03       2.10       2.09
    HONOURABLE                    1.42       1.65       1.57         1.48       1.84       1.80
    WILL or WOULD                 1.33       1.35       1.43         1.07       1.22       1.05
    HE or HIM                     0.99       0.99       1.1          0.54       0.70       0.60
    NOT                           1.14       0.99       0.9          1.06       0.91       0.96
    IT                            1.1        0.97       0.88         1.17       1.01       0.99
    FOR                           0.87       0.96       1            0.87       1.02       0.99
    RIGHT                         0.92       0.96       0.94         0.79       0.90       0.85
    THIS                          1.1        0.96       0.75         1.14       0.99       0.89
    MY                            0.63       0.88       0.93         0.62       0.93       1.03
    WE or US                      1.1        0.81       0.83         1.45       0.99       0.99
    AS                            0.77       0.7        0.68         1.41       0.74       0.82
    WHICH                         0.66       0.77       0.62         0.69       0.82       0.73
    DO (all grammatical forms)    0.68       0.64       0.71         0.62       0.59       0.64
    WITH                          0.64       0.74       0.64         0.72       0.82       0.75
    FRIEND(S)                     0.49       0.66       0.72         0.50       0.77       0.84
    BY                            0.54       0.59       0.58         0.55       0.67       0.65
    MINISTER(S)                   0.58       0.48       0.48         0.20       0.14       0.18
    PRIME                         0.51       0.44       0.46         0.08       0.08       0.08
    GOVERNMENT(S)                 0.42       0.44       0.44         0.40       0.38       0.44
    THERE                         0.5        0.41       0.38         0.57       0.46       0.43

    values together with their published frequency norms (Brown, Kucera and Francis, and

    Thorndike-Lorge (Brown, 1984; Francis, 1967; Thorndike & Lorge, 1944)) did not reveal any

    significant correlations between the internally derived and published values (R = 0.13 [Brown];

    R = 0.29 [K & F]; R = 0.19 [T-L]), supporting the suggestion made earlier that word frequency

    Table 3

    The 30 most frequently used content words in the entire text sample, and their overall occurrence rates by epoch

    (expressed as a percentage of the total number of words in the epoch) in utterances made by HW (right-hand columns)

    and by all other speakers (left-hand columns)

                          All other speakers                 HW
                          Proportion of all words in:        Proportion of all words in:
    Word or lemma         1965–1966  1969–1970  1975–1976    1965–1966  1969–1970  1975–1976
    AGREE                 0.18       0.13       0.17         0.16       0.07       0.12
    ANSWER                0.17       0.14       0.13         0.17       0.15       0.15
    AWARE                 0.25       0.24       0.23         0.08       0.15       0.12
    BRITISH               0.11       0.10       0.11         0.08       0.07       0.08
    COUNTRY               0.12       0.15       0.18         0.08       0.14       0.17
    FRIEND                0.50       0.67       0.73         0.42       0.62       0.70
    GENTLEMAN/GENTLEMEN   0.55       0.47       0.33         0.59       0.65       0.34
    GOVERNMENT            0.42       0.45       0.45         0.37       0.38       0.39
    HOUSE                 0.34       0.41       0.42         0.32       0.43       0.49
    LAST                  0.19       0.18       0.19         0.22       0.21       0.21
    MANY                  0.12       0.13       0.14         0.11       0.14       0.13
    MATTER                0.17       0.24       0.22         0.17       0.27       0.29
    MEMBER                0.20       0.21       0.23         0.23       0.22       0.29
    MINISTER              0.58       0.49       0.48         0.11       0.14       0.10
    MORE                  0.16       0.17       0.19         0.16       0.16       0.16
    OPPOSITION            0.07       0.09       0.16         0.07       0.07       0.21
    ORDER                 0.16       0.22       0.11         0.05       0.05       0.03
    PART                  0.09       0.11       0.08         0.09       0.14       0.10
    PARTY                 0.07       0.06       0.17         0.06       0.06       0.18
    PEOPLE                0.09       0.10       0.15         0.05       0.06       0.08
    POINT                 0.11       0.16       0.11         0.09       0.09       0.08
    POLICY                0.15       0.11       0.19         0.13       0.10       0.16
    PRIME                 0.52       0.44       0.47         0.08       0.08       0.08
    PUBLIC                0.06       0.07       0.15         0.05       0.06       0.11
    QUESTION              0.44       0.46       0.33         0.49       0.56       0.40
    SECRETARY             0.13       0.12       0.15         0.09       0.12       0.12
    STATE                 0.07       0.09       0.15         0.06       0.06       0.11
    STATEMENT             0.13       0.08       0.11         0.14       0.09       0.11
    THINK/THOUGHT         0.35       0.26       0.16         0.51       0.36       0.21
    TIME                  0.20       0.20       0.20         0.23       0.20       0.21

    Table 4

    Values of W for pairwise comparisons (using Wilcoxon's signed rank test) between language attributable to HW and all

    other speakers during each of the three epochs studied. Comparisons reaching statistical significance are marked with an asterisk

                 HW 69–70   HW 75–76   All 65–66   All 69–70   All 75–76
    HW 65–66     1.10       1.52       -2.02*      1.75        -2.59*
    HW 69–70                1.35       0.96        1.44        -2.41*
    HW 75–76                           0.46        0.42        1.44
    All 65–66                                      0.01        0.95
    All 69–70                                                  0.42
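
For readers unfamiliar with the test named in Table 4, the following Python sketch computes a signed rank statistic for paired samples and standardises it with the usual normal approximation. It is an illustration under stated assumptions, not a reconstruction of the study's procedure: the surviving text does not say how the tabulated values were derived or which word rates were paired, and the function name `wilcoxon_signed_rank_z` is hypothetical. Zero differences are discarded and no tie or continuity correction is applied. The sample inputs are the first five rows of Table 3 for 1965–1966, expressed in hundredths of a percent, which is far too few pairs for a meaningful test.

```python
import math


def wilcoxon_signed_rank_z(xs, ys):
    """Normal-approximation z for the Wilcoxon signed rank test on pairs."""
    # Paired differences; zero differences carry no sign and are dropped.
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks across ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Sum of ranks attached to positive differences.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)

    # Mean and standard deviation of W+ under the null hypothesis.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd


# AGREE, ANSWER, AWARE, BRITISH, COUNTRY (Table 3, 1965-1966),
# scaled to integers so that ties compare exactly.
hw = [16, 17, 8, 8, 8]
others = [18, 17, 25, 11, 12]
z = wilcoxon_signed_rank_z(hw, others)
print(round(z, 2))
```

In practice an established routine such as SciPy's `scipy.stats.wilcoxon` would be a safer choice, since it also handles small samples and ties more carefully.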

    complexity, are essentially top-down methods, which use the theoretical assumptions of

    neuropsychological models to characterize a piece of discourse. Although these methods have the

    advantage of an empirical basis that allows one to know what to look for as well as how and

    why to look for it, they are limited in scope and are frequently dependent on data (eg lexical

    frequency) that may not be universally applicable across languages, cultures and time periods.

    By contrast, metrics such as cumulative type–token ratios and the relative distributions of

    lexical types will vary as a result of the lexical choices of the speaker or writer. It could be

    argued that such choices are likely to be highly individual-specific rather than reflections of

    group membership or neuropsychological condition. Stylometric analysis has certainly been

    successful in distinguishing the work of two individuals, but its sensitivity to distinctions at

    group level, such as century, sex and (English-speaking) country of birth, attests to collective

    as well as individual influences.
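
As a concrete illustration of one of the metrics mentioned above, the Python sketch below traces a cumulative type–token ratio through a text: the number of distinct word types seen so far divided by the number of tokens seen so far. The function name `cumulative_ttr`, the sampling `step` and the simple tokenizer are assumptions made for the example. The sample phrase is taken from extract [A].

```python
import re


def cumulative_ttr(text, step=5):
    """Return [(tokens_seen, types_seen / tokens_seen), ...].

    The ratio is sampled every `step` tokens and once more at the end.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    seen = set()
    curve = []
    for i, token in enumerate(tokens, start=1):
        seen.add(token)
        if i % step == 0 or i == len(tokens):
            curve.append((i, len(seen) / i))
    return curve


phrase = "we owe a debt of gratitude and a duty of care"
for tokens_seen, ratio in cumulative_ttr(phrase):
    print(tokens_seen, round(ratio, 3))
```

The ratio falls as words begin to repeat, so curves are only comparable between samples at the same token count.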

    Of course it does not follow automatically from this that the presence or absence of

    degenerative neuropathology delineates a group in the same sense, though this is an empirical

    question that remains to be resolved. It also remains to be seen whether the power that

    automated stylometric analysis derives from being applied to literary texts many thousands of words

    in length is sufficient to deal with the very much smaller samples that are usually produced in

    the course of day-to-day life. If degenerative cognitive decline does have a stylometric

    signature and descriptive methods are available to detect it, then the scope for further insights

    into the origins and natural history of these common and devastating disorders will be

    considerably enhanced.

    References

    Bates, E., Harris, C., Marchman, V., Wulfeck, B., & Kritchevsky, M. (1995). Production of complex syntax in normal

    aging and Alzheimer's disease. Language and Cognitive Processes, 10(5), 487–539.

    Bernheimer, H., Birkmayer, W., Hornykiewicz, O., Jellinger, K., & Seitelberger, F. (1973). Brain dopamine and the

    syndromes of Parkinson and Huntington. Clinical, morphological and neurochemical correlations. Journal of the

    Neurological Sciences, 20(4), 415–455.

    de Bode, S., & Curtiss, S. (2000). Language after hemispherectomy. Brain and Cognition, 43(1–3), 135–138.

    Braak, H., & Braak, E. (1995). Staging of Alzheimer's disease-related neurofibrillary changes. Neurobiology of Aging,

    16(3), 271–278.

    Brown, G. D. A. (1984). A frequency count of 190,000 words in the London-Lund Corpus of English Conversation.

    Behavioural Research Methods Instrumentation and Computers, 16, 502–532.

    Bruscoli, M., & Lovestone, S. (2004). Is MCI really just early dementia? A systematic review of conversion studies.

    International Psychogeriatrics, 16(2), 129–140.

    Burrows, J. (2003). Questions of authorship: attribution and beyond – a lecture delivered on the occasion of the Roberto

    Busa Award ACH-ALLC 2001, New York. Computers and the Humanities, 37(1), 5–32.

    Burrows, J. (2004). Textual analysis. In S. Schreibman, R. Siemens, & J. Unsworth (Eds.), A companion to digital

    humanities (pp. 323–347). Oxford: Blackwell.

    Burrows, J. (2006). All the way through: testing for authorship in different frequency strata. Literary and Linguistic

    Computing, fqi067.

    Chaski, C. E. (2004). Forensic linguistics: an introduction to language, crime and the law. International Journal of

    Speech Language and the Law, 11(2), 298–303.

    Clifton, C., Cooley, R., & Rennie, J. (2004). TopCat: data mining for topic identification in a text corpus. IEEE

    Transactions on Knowledge and Data Engineering, 16(8), 949–964.

    Croisile, B., Adelein, P., Carmoi, T., Aimard, G., & Trillet, M. (1995). Evaluation of spelling in Alzheimer's disease.

    Revue De Neuropsychologie, 5(1), 23–51.

    Croisile, B., Ska, B., Brabant, M.-J., Duchenne, A., Lepage, Y., Aimard, G., et al. (1996). Comparative study of oral and

    written picture description in patients with Alzheimer's disease. Brain and Language, 53(1), 1–19.

    Croot, K., Hodges, J. R., & Patterson, K. (1999). Evidence for impaired sentence comprehension in early Alzheimer's

    disease. Journal of the International Neuropsychological Society, 5(5), 393–404.

    Stahl, L. R. (1999). Reporting live. New York: Simon and Schuster.

    Stamatatos, E., Fakotakis, N., & Kokkinakis, G. (2000). Automatic text categorisation in terms of genre and author.

    Computational Linguistics, 26, 471–495.

    Thorndike, E. L., & Lorge, I. (1944). The teacher's word book of 30,000 words. New York: Teachers College, Columbia

    University.

    Tomlinson, B. E., Blessed, G., & Roth, M. (1970). Observations on the brains of demented old people. Journal of the

    Neurological Sciences, 11(3), 205–242.

    Vicari, S., Albertoni, A., Chilosi, A. M., Cipriani, P., Cioni, G., & Bates, E. (2000). Plasticity and reorganization during

    language development in children with early brain injury. Cortex, 36(1), 31–46.

    Watt, R. J. C. (2002). Concordance. Dundee.

    Wright, P. (1987). Spycatcher. Heinemann.

    Wyner, A. J. (1996). Entropy estimation and patterns. In Workshop on Information Systems and Information Theory.

    Haifa, Israel.
