Metodo Automatizado Analisis Contenido Psicoterapia


  • 8/10/2019 Metodo Automatizado Analisis Contenido Psicoterapia


    This article was downloaded by: [Universidad De Concepcion]
    On: 06 October 2014, At: 18:42
    Publisher: Routledge
    Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

    Psychotherapy Research
    Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/tpsr20

    Automated method of content analysis: A device for psychotherapy process research
    Sergio Salvatore (a), Alessandro Gennaro (a), Andrea Francesco Auletta (a), Marco Tonti (a) & Mariangela Nitti (a)
    (a) Department of Pedagogy, Psychology, and Teaching Science, University of Salento, Lecce, Italy

    Published online: 16 Jan 2012.

    To cite this article: Sergio Salvatore, Alessandro Gennaro, Andrea Francesco Auletta, Marco Tonti & Mariangela Nitti (2012) Automated method of content analysis: A device for psychotherapy process research, Psychotherapy Research, 22:3, 256-273, DOI: 10.1080/10503307.2011.647930

    To link to this article: http://dx.doi.org/10.1080/10503307.2011.647930


    Automated method of content analysis: A device for psychotherapy

    process research

    SERGIO SALVATORE, ALESSANDRO GENNARO, ANDREA FRANCESCO AULETTA,

    MARCO TONTI, & MARIANGELA NITTI

    Department of Pedagogy, Psychology, and Teaching Science, University of Salento, Lecce, Italy

    (Received 4 October 2010; revised 4 November 2011; accepted 28 November 2011)

    Abstract

    The work presents a computer-aided method of content analysis applicable to verbatim transcripts of psychotherapy: the Automated Co-occurrence Analysis for Semantic Mapping (ACASM). ACASM is able to perform a context-sensitive strategy of analysis aimed at mapping the meanings of the text through a trans-theoretical procedure. The paper is devoted to the presentation of the method and to testing its validity. To the latter end we have compared ACASM and independent blind human coders on two tasks of content analysis: (a) estimating the semantic similarity between two utterances; (b) the semantic classification of a set of utterances. Results highlight that: (a) ACASM's estimates of semantic similarity are consistent with the corresponding estimates provided by coders; (b) coders' agreement and coder-ACASM agreement on the task of semantic classification have the same magnitude. Results lead to the conclusion that the content analysis produced by ACASM is indistinguishable from that performed by human coders.

    Keywords: qualitative research methods; technology in psychotherapy research and training; content analysis;

    meaning

    Introduction

    Consistent with Freud's definition of psychotherapy as the "talking cure", psychotherapy process research has since its very beginning commonly focused on the communicative exchange unfolding within sessions. Many methods of process analysis have been developed for investigating such an exchange (e.g., Colli & Lingiardi, 2009; Dahl, Kächele, & Thomä, 1988; Dimaggio & Semerari, 2004; Gonçalves, Matos, & Santos, 2009; Greenberg & Pinsof, 1986; Luborsky & Crits-Christoph, 1990; Mergenthaler, 1996a; Perry, 1991; Salvatore, Gelo, Gennaro, Manzo, & Al Radaideh, 2010). A good proportion of these methods of process analysis are based on verbatim transcripts of sessions, either exclusively or together with other kinds of data (e.g., data concerning non-verbal behaviour). Consequently, improving the efficacy and efficiency of methods of textual analysis is worth considering as a major task for psychotherapy process research (Mergenthaler, 1996b). This study intends to contribute to such development, through the presentation of a bottom-up automated method of content analysis of texts.

    Semantic Analysis: Top-Down Versus Bottom-Up Methods

    The method presented in this study belongs to the family of models focusing on the semantic level of text (henceforth: semantic analysis). These methods are aimed at mapping the content of the text, namely the meaning it conveys. Semantic analysis is essential for psychotherapy process research. Psychotherapy is an exchange of meanings (Angus & McLeod, 2004; Dimaggio & Semerari, 2004; Hermans & Hermans-Jansen, 1995; McNamee & Gergen, 1992; Salvatore et al., 2010; Salvatore & Venuleo, 2008; Santos, Gonçalves, Matos, & Salvatore, 2009) and therefore it is hard to consider deepening our understanding of it without taking into account the content of what patient and therapist say.

    Within semantic analysis it is worth differentiating between top-down methods and bottom-up methods. Top-down methods are based on pre-defined coding systems according to which units of text are categorized. The Core Conflictual Relational Theme (Luborsky & Crits-Christoph, 1990), the Defence Mechanism Rating Scale (Perry, 1991), the Collaborative Interactions Scale (Colli & Lingiardi, 2009) and the Innovative Moments Coding System (Gonçalves, Ribeiro, Mendes, Matos, & Santos, 2011; Gonçalves, Ribeiro, Matos, Santos, & Mendes, 2010) are examples of top-down semantic methods. In general terms, they consist of a repertoire of content categories working as a coding system and of a set of rules for applying the categories to the text. Bottom-up methods pursue the same aim of mapping the meaning of the text, but they do not adopt a pre-defined coding system. Rather, following the logic of Grounded Theory (Glaser & Strauss, 1967; Rennie, 2000), these methods start from the text and define the coding categories together with the mapping of the textual content, through an iterative interpretative procedure. Task analysis (Greenberg & Pascual-Leone, 2001; Pascual-Leone, Greenberg, & Pascual-Leone, 2009) is an example of this iterative way of working. It starts from a set of theoretical assumptions that are deliberately used for orienting the extrapolation of sequences of events of change. In turn, observed sequences can lead to the modification of the original theoretical assumptions and therefore to further observations.

    Correspondence concerning this article should be addressed to Alessandro Gennaro, University of Salento, Department of Pedagogy, Psychology, and Teaching Science, via Stampacchia, Lecce, 73100 Italy. Email: [email protected]

    Psychotherapy Research, May 2012; 22(3): 256-273
    ISSN 1050-3307 print/ISSN 1468-4381 online © 2012 Society for Psychotherapy Research
    http://dx.doi.org/10.1080/10503307.2011.647930

    The Contextuality of Meaning: Implications for Semantic Analysis

    The meaning of a linguistic sign (a word, a sentence) is inherently dynamic and contextual (Salvatore, 2011, 2012; Valsiner, 2007; for a discussion of this general tenet in the field of psychotherapy, see Gennaro, Al-Radaideh, Gelo, Manzo, Nitti, & Salvatore, 2010; Greenberg & Pinsof, 1986; Salvatore et al., 2010; Salvatore, Gennaro, Auletta, Grassi, & Rocco, 2011). It is not a fixed, pre-established content (e.g., an idea, an image, a concept) held in the sign itself; rather, it emerges from the way the linguistic signs combine with each other in the contingency of the talk (Linell, 2009; Salvatore & Valsiner, 2011; Wittgenstein, 1953/1958). Thus, understanding the meaning of a sign a means mapping which other signs a occurs with, in the specific context of its use.

    This pragmatic, dynamic and contextual definition of meaning provides a way to appreciate the inherent multidimensionality and fuzziness of meaning. In the concrete circumstance of communication, signs always occur within an array of connections with many other signs; therefore, meaning depends on how the interpreter selects some of these connections as pertinent, leaving others in the background. In sum, meaning is not in the text, but in the constructive, hermeneutic relationship between text and interpreter.

    Semantic analysis of text, therefore, cannot be performed in terms of the application of context-blind rules of coding (namely: if the word x occurs, then this means that content A has occurred); rather, inferential reconstruction of the linguistic and/or extra-linguistic context of the text is required. In other words, the specific interconnections that words create within that particular text must be taken into account: word x in the context of its connection with words y and z means A, but in the context of its connection with words m and n it means B.
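The contrast between the two kinds of rule can be made concrete with a toy sketch. The function names and the placeholder tokens (x, y, z, m, n, A, B) are illustrative assumptions taken from the wording of the paragraph above, not part of any actual coding system.

```python
def context_blind(words):
    """Context-blind rule: word x always signals content A."""
    return "A" if "x" in words else None


def context_sensitive(words):
    """Context-sensitive rule: the meaning of x depends on its co-occurring words."""
    if "x" not in words:
        return None
    if {"y", "z"} <= words:   # x together with y and z
        return "A"
    if {"m", "n"} <= words:   # x together with m and n
        return "B"
    return None               # x alone is left uncoded in this toy rule


unit = {"x", "m", "n"}
blind_code = context_blind(unit)          # codes A regardless of context
sensitive_code = context_sensitive(unit)  # codes B because of m and n
```

The same unit receives different codes under the two rules, which is the point the paragraph makes about context-blind coding.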

    Thus far, automated procedures of semantic analysis have not proved able to take the contextuality of meaning into account effectively, and this has prevented the spread of this kind of procedure within psychotherapy research. As a result, the semantic methods adopted in psychotherapy research are currently based on human judgment. Yet, the use of human coders raises several methodological, metric and organizational problems that place a considerable constraint on the heuristic potentialities of this kind of method.

    First of all, semantic analysis is usually very labour-intensive and time-demanding work: it requires time, people, and hours and hours of work. This hinders the possibility of generalizing the application of semantic methods across cases and researchers. We are led to consider the methodological fragmentation of contemporary process research as related to this constraint: the work required for developing the competence for applying a coding system, and for reaching a satisfactory agreement among coders, entails a level of commitment that can often be expressed only by the group of researchers working on developing the coding system itself.

    Secondly, the coders' inferences will always and in any case be endowed with an irreducible subjective valence that cannot but have negative consequences on the levels of reliability, and therefore on the semantic methods' power of revealing significant relationships. On the other hand, in the case of semantic analysis, the problem of reliability cannot be considered merely in terms of error of measurement; rather it reflects the inherent multidimensionality of meaning: the variability among coders stems from the fact that the text is open to many different levels of interpretation. Consequently, increasing the reliability of semantic analysis requires clarifying and sharing the hermeneutic criteria according to which the coders reduce the multidimensionality of meaning. In this way a specific semantic map of the text is constructed. In accordance with this perspective, many efforts have been put into making the rules of coding clearer and more specific and forcing the coders to use procedures of consensual validation (Lambert & Ogles, 2009; Lutz & Hill, 2009); yet, given the high level of inference inherently implied in these methods, these solutions cannot fully resolve the problem. And above all, they make the semantic methods even more labour-intensive and time-consuming.

    The above considerations lead us to conclude that an alternative way is worth pursuing: the development of bottom-up procedures of semantic analysis based on explicit, invariant rules of coding and yet able to take the contextuality of meaning into account. Procedures of this kind would represent a highly significant contribution to the growth of psychotherapy process research. On the one hand, they would allow the automated implementation of the semantic analyses. On the other hand, they would provide a shared ground supporting and constraining the (at least to date) indispensable human inferential judgments, so as to increase the inter-coder agreement as well as the comparability among textual analyses.

    Purpose of the Study and Hypothesis

    This study intends to present an automated bottom-up procedure of semantic analysis, Automated Co-occurrence Analysis for Semantic Mapping (ACASM), and to provide a first test of its validity. ACASM constructs a map of the text in terms of the thematic nuclei active in it. It works through invariant, explicit, yet context-sensitive procedures, defined in terms of computational algorithms. Due to these characteristics, ACASM's procedures are: (a) implementable through automated routines carried out by computer; (b) reliably reproducible across analyses and analysts; (c) able to produce a valid representation of the textual data (Lancia, 2002).

    The current paper pursues two complementary aims. First, the ACASM method is presented together with an exemplification of its application to a case of psychotherapy. Second, an initial empirical test of ACASM's validity is performed. As concerns the latter point, we adopt a Turing-like criterion of validity (for similar logic, see Rosenberg, Schnurr & Oxmann, 1990; Steinbach, Karypis & Kumar, 2000). Following this criterion, ACASM could be considered a valid semantic method if and only if the analysis it produces cannot be distinguished from those produced by expert human coders. We adopted this criterion because in the case of bottom-up semantic analysis it is not possible to refer to an external, objective normative criterion against which to evaluate the validity of the analysis in absolute terms. Meaning is multidimensional and therefore any text permits many representations of its semantic content. Consequently, we assumed that for an automated bottom-up procedure of semantic analysis to be considered valid, it has to produce a map of the text whose level of agreement with the maps produced by expert coders is comparable with the level of agreement that coders show with each other.

    Our hypothesis is that ACASM passes the Turing-

    like test of validity.
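The logic of the Turing-like criterion can be illustrated with a small sketch that compares coder-coder agreement with coder-machine agreement. Cohen's kappa is used here only as an assumed, commonly used agreement index; this section does not state which statistic the study adopts, and the labels below are invented toy data, not results.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two classifications of the same units."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if the two raters assigned labels independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:          # both raters used one and the same single label
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented classifications of six utterances (hypothetical data).
coder_1 = ["work", "work", "family", "health", "family", "work"]
coder_2 = ["work", "family", "family", "health", "family", "work"]
machine = ["work", "work", "family", "health", "health", "work"]

kappa_coders = cohen_kappa(coder_1, coder_2)    # human-human agreement
kappa_machine = cohen_kappa(coder_1, machine)   # human-machine agreement
```

Under the criterion described above, the automated analysis would count as valid when `kappa_machine` is of the same magnitude as `kappa_coders`; how "same magnitude" is tested statistically is left open in this sketch.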

    Method

    ACASM's Conceptual Framework

    ACASM is an example of a bottom-up method of semantic analysis. This is so because it does not start with a pre-established repertoire of thematic contents in accordance with which the units of analysis are classified. Rather, the repertoire of thematic contents working as a coding system is produced by the analysis itself.

    ACASM belongs to a set of methods focused on the co-occurrence of words (Carli & Paniccia, 2007; Lancia, 2002; Reinert, 1986), that is, the way the words combine with each other within the same unit of analysis into which the text is segmented (generally, the unit of analysis consists of an utterance or a group of a few utterances). The co-occurrence of words is taken as a criterion of similarity for clustering the units of text. That is, the units of analysis are clustered in accordance with the words co-occurring within them: units of text holding the same co-occurring words are considered similar and therefore grouped. The rationale is that a set of co-occurring words marks a specific thematic content (also called a thematic nucleon). Therefore, units having a certain set of co-occurring words in common share the thematic content marked by such a set. In this way, the procedure of semantic analysis is able to provide a fine level of semantic representation, coding each unit of analysis in terms of a specific content, namely the one marked by the set of co-occurring words according to which the unit has been clustered.

    From a conceptual point of view, the reference to co-occurrence of words within the same unit of analysis can be considered a way of taking into account the linguistic level of the contextuality of meaning, namely the level consisting of the way the words are combined within the text.
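As a minimal illustration of what "co-occurrence within the same unit of analysis" means operationally, the sketch below counts how often each pair of words appears together in a unit. The function name and the toy units are assumptions made for illustration; this is not the routine used by any specific software.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(units):
    """Count, for every pair of words, the number of units containing both."""
    counts = Counter()
    for words in units:
        # Each pair is counted once per unit, in alphabetical order.
        for pair in combinations(sorted(set(words)), 2):
            counts[pair] += 1
    return counts


# Three invented units of analysis, already reduced to content words.
units = [
    ["work", "boss", "angry"],
    ["work", "boss", "tired"],
    ["mother", "home"],
]
pairs = cooccurrence_counts(units)
```

Here the pair ("boss", "work") occurs in two units, so the first two units would be treated as similar, while the third shares no pair with them.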

    ACASM's Procedure of Analysis

    ACASM is performed in terms of invariant algorithms implemented automatically by ad hoc software (e.g., Alceste, T-LAB) on the basis of parameters of analysis established by the researcher. We adopted the procedure implemented by the software T-LAB (Lancia, 2002), in the version T-LAB PRO_XL2. T-LAB PRO_XL2 is able to analyse textual data in various languages (English, Italian, Spanish, Portuguese, German).

    ACASM is implemented through four steps, which take about 1 hour of work and can be performed by a single researcher (the size of the textual dataset affects the duration of the procedure only marginally).

    Step 1. Segmentation of transcripts. ACASM works on the textual dataset (henceforth: corpus) as defined by the researcher in accordance with the aim of the study. The corpus may consist of the verbatim transcript of the patient's and/or therapist's talk, covering all or only sampled sessions. ACASM divides the corpus into units of analysis, each of them called an elementary context unit (ECU). An ECU consists of a group of a few contiguous utterances.

    The division of the text into ECUs has to find a point of equilibrium between two requirements dialectically linked to each other: interpretability and specificity. On the one hand, the segments have to be long enough to be interpretable in terms of thematic content. On the other hand, the longer the segments are, the greater the likelihood is that each segment may not be associated with a specific thematic content. The point of equilibrium between interpretability and specificity is an empirical issue (varying according to the language). After a series of trials and simulations, we have arrived (to date) at the following criterion (for the English language): (a) each ECU begins with the character immediately following the last character of the previous ECU; (b) each ECU ends with the first punctuation mark (".", "!", or "?") occurring after the 250th character from the first character (i.e., punctuation marks occurring before the 250th character are not considered for closing the ECU); (c) in any case the ECU's length must not exceed 500 characters; therefore, the ECU ends with the last word remaining within this limit, even if no punctuation mark has occurred.

    As one can note, the criterion is formulated in terms of characters. This is so because ACASM's algorithm adopts characters as the basic computational unit: lexical units are defined as the strings of characters enclosed between two blank characters (spaces). Nevertheless, previous application of this criterion to psychotherapy transcripts (Salvatore et al., 2010) has shown that it leads to units of text endowed with semantic meaningfulness.
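The three segmentation rules (a)-(c) can be sketched as follows. This is a minimal reading of the criterion as stated above, not the T-LAB implementation; the function name, its parameter names, and the handling of the final fragment of the text are assumptions.

```python
def segment_ecus(text, min_len=250, max_len=500, marks=".!?"):
    """Split a text into elementary context units (ECUs), rules (a)-(c)."""
    ecus = []
    start = 0                      # rule (a): next ECU starts where the last ended
    n = len(text)
    while start < n:
        limit = min(start + max_len, n)
        end = None
        # Rule (b): first closing mark located after the 250th character.
        for i in range(start + min_len, limit):
            if text[i] in marks:
                end = i + 1
                break
        if end is None:
            if limit == n:
                end = n            # assumption: the tail of the text forms the last ECU
            else:
                # Rule (c): no mark within 500 characters, cut after the last
                # whole word; fall back to a hard cut if there is no space.
                cut = text.rfind(" ", start, limit)
                end = cut + 1 if cut > start else limit
        ecus.append(text[start:end])
        start = end
    return ecus


sample = "a" * 260 + ". " + "b" * 10 + "."
sample_ecus = segment_ecus(sample)   # two ECUs; the first closes at the first "."
```

Because each ECU starts exactly where the previous one ends, concatenating the ECUs returns the original text, which is a convenient sanity check when adapting the thresholds to another language.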

    Step 2. Selection of the lexical forms and construction of the dictionary. Depending on its size, a textual corpus can hold even several thousands of lexical forms. Lexical forms play the role of variables in the ACASM procedure (see step 3). Consequently, it is necessary to reduce them to a number suitable for the constraints of the following multidimensional analysis (see step 4), which requires a reduction in the dispersion of the data matrix. This task is performed through two sequential sub-steps.

    Firstly, the procedure singles out all the lexical forms present in the text and categorizes them according to the lemma they belong to. A lemma is the citation form (namely, the headword) used in a language dictionary to refer to a lexeme (i.e., a set of word forms having the same lexical root and meaning). For example, word forms such as "go", "goes", "going" and "went" have "go" as their lemma; "child" and "children" have "child" as their lemma. The output of this sub-step is the list of lemmas present in the textual corpus.

    The second sub-step is the selection of a subset of lemmas within the list of lemmas. This subset constitutes the dictionary the following analysis will be based on. To this end, 10% of the whole list of lemmas is selected. Selected lemmas are the most frequent ones; yet the 5% highest-frequency lemmas are excluded from the ACASM dictionary. The exclusion is motivated by the fact that the higher the frequency of a lemma, the less it contributes to discriminating among the ECUs: high-frequency lemmas (e.g., words like "and", "to", "of") tend to be present in too many ECUs and therefore enter too many patterns of co-occurrence. This criterion of exclusion has been determined through preliminary empirical work of approximation; however, it is consistent with the lexical-statistical logic grounding several methods of textual analysis (Bolasco, 1999).

    It is worth noting that, because of the high frequency of the most commonly used words, the 10% of lemmas included in the ACASM dictionary corresponds to the level of coverage of the text considered acceptable in the literature, namely about 70-85% of the occurrences as a whole, depending on the size of the textual corpus (Bolasco, 1999; Lancia, 2002).
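The dictionary selection can be sketched as a frequency ranking followed by two cuts. The text leaves open whether the selected 10% is counted before or after dropping the top 5%; the sketch assumes the top 5% of lemmas are skipped first and the next 10% of the full list are kept. The function and parameter names are hypothetical, and the input is assumed to be already lemmatized.

```python
from collections import Counter


def build_dictionary(lemmas, keep_share=0.10, skip_share=0.05):
    """Select the dictionary: drop the most frequent lemmas, keep the next ones."""
    counts = Counter(lemmas)
    # Lemmas ranked from most to least frequent.
    ranked = [lemma for lemma, _ in counts.most_common()]
    n_skip = round(len(ranked) * skip_share)   # highest-frequency lemmas to exclude
    n_keep = round(len(ranked) * keep_share)   # size of the resulting dictionary
    return ranked[n_skip:n_skip + n_keep]


# Toy corpus: 100 distinct lemmas, lemma "l0" being the most frequent.
toy_lemmas = [f"l{i}" for i in range(100) for _ in range(100 - i)]
toy_dictionary = build_dictionary(toy_lemmas)   # keeps "l5" ... "l14"
```

Both shares are exposed as parameters because the text describes them as empirically determined and therefore open to adjustment.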

    Step 3. Digital representation of the text. The reduction of the original text into ECUs and the identification of the lemmas active in the corpus allow the text to be transformed into a digital matrix representing the distribution of lemmas in ECUs (in binary terms: present/absent). The matrix has all ECUs displayed in rows and the lemmas in columns; the value 1 in the generic cell x_ij represents the presence of the j-th lemma in the i-th ECU, the value 0 its absence (Table I).
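A minimal sketch of the presence/absence matrix, using the two-sentence example of Table I. The tiny lemma lookup table is a hypothetical stand-in for a real lemmatizer, and the function names are illustrative.

```python
# Hypothetical lemma lookup covering only the example sentences.
LEMMA_OF = {
    "i": "I", "went": "go", "home": "home",
    "kate": "Kate", "was": "be", "still": "still", "there": "there",
}


def lemmatize(ecu):
    """Return the set of lemmas occurring in one ECU (toy lookup only)."""
    words = ecu.lower().replace(".", " ").split()
    return {LEMMA_OF.get(word, word) for word in words}


def presence_matrix(ecus, dictionary):
    """Rows are ECUs, columns are lemmas; a cell is 1 if the lemma is present."""
    rows = []
    for ecu in ecus:
        present = lemmatize(ecu)
        rows.append([1 if lemma in present else 0 for lemma in dictionary])
    return rows


ecus = ["I went home.", "Kate was still there."]
dictionary = ["I", "go", "home", "Kate", "be", "still", "there"]
matrix = presence_matrix(ecus, dictionary)
# matrix == [[1, 1, 1, 0, 0, 0, 0],
#            [0, 0, 0, 1, 1, 1, 1]]
```

In a real corpus the matrix is very sparse, which is the reason the previous step restricts the number of columns.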

    Step 4. Identification of clusters of ECUs/co-occurring lemmas and classification of the ECUs. A Cluster Analysis (CA; Aldenderfer & Blashfield, 1984) is applied to the matrix. Incidentally, note that CA incorporates a preliminary procedure of Multidimensional Lexical Correspondence Analysis, which transforms the binary variables of the original data matrix into continuous classificatory dimensions. Cluster Analysis groups the ECUs using the co-occurrence of lemmas as the criterion of similarity: the higher the number of lemmas shared by two ECUs, the higher the probability that these two ECUs are grouped in the same cluster. Therefore, in the final analysis, each cluster obtained is a set of utterances (i.e., of ECUs) that share many lemmas among them. According to this criterion of similarity, ACASM considers a given cluster the marker of a thematic content which is active in the text and semantically characterizes the ECUs grouped in that cluster (see below, the section Semantic Interpretation of the ACASM Output). The number of clusters into which the text is segmented is defined by an iterative algorithm; the clustering procedure stops when further partitions do not produce a significant improvement of the inter/intra cluster ratio, which means that increasing the number of clusters does not produce an appreciable increment of information.

    A complementary output of the Cluster Analysis is the assignment of each ECU to the cluster with which it has the highest index of association. In this way, each ECU is marked with the most representative cluster representing one of the thematic contents extrapolated by the Cluster Analysis. (Table III shows the most representative ECUs, in English translation, of the 14 clusters defined in the case analysed in the current study, together with their interpretation.)
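The idea of grouping ECUs by shared lemmas can be illustrated with a deliberately simplified stand-in. The sketch below is not the T-LAB procedure: it omits the correspondence analysis, uses Jaccard similarity with average linkage, and replaces the inter/intra cluster ratio stopping rule with a fixed similarity threshold. All names and the toy data are assumptions.

```python
def jaccard(a, b):
    """Share of lemmas two ECUs have in common (0 = none, 1 = identical sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_ecus(lemma_sets, min_similarity=0.3):
    """Greedy average-linkage clustering of ECUs given as sets of lemmas."""
    clusters = [[i] for i in range(len(lemma_sets))]

    def linkage(c1, c2):
        # Mean similarity over all pairs of ECUs drawn from the two clusters.
        pairs = [(i, j) for i in c1 for j in c2]
        return sum(jaccard(lemma_sets[i], lemma_sets[j]) for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        best, a, b = max(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if best < min_similarity:   # stand-in for the stopping rule
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]             # safe because a < b
    return clusters


# Four invented ECUs, already reduced to dictionary lemmas.
toy_ecus = [
    {"work", "boss", "angry"},
    {"work", "boss", "tired"},
    {"mother", "home", "sad"},
    {"mother", "home", "angry"},
]
toy_clusters = cluster_ecus(toy_ecus)   # ECUs 0-1 and ECUs 2-3 end up together
```

Each ECU index appears in exactly one cluster, which corresponds to the assignment of every ECU to a single thematic content described above.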

    Before concluding the presentation of the method, it is worth noting that though ACASM's computational rules (i.e., the operative criteria according to which the text is segmented, lemmatized and thematically clustered) are invariant, they can be modified in accordance with the aim of the researcher. For instance, if the researcher is interested in analysing a patient's feelings concerning the marital couple, she may find it useful to distinguish two lemmas for any word denoting a feeling: one lemma for the word when associated with the marital couple and the other for the word when used outside that domain.

    Semantic Interpretation of the ACASM Output

    The interpretation is provided by the researcher. Since each cluster represents a subset of ECUs sharing lemmas tending to co-occur in the same utterances, it can be understood as a thematic nucleon made up of a set of words whose aggregation reflects the shared presence of certain semantic traits (Lancia, 2005). It is worth noting that the words composing the set may have various kinds and degrees of semantic relationship among them (e.g., they may be synonymous, as in "much" and "a lot", antonymous, as in "good" and "bad", connected functionally, as in "car" and "trip", and so forth). The interpretation of the content of the set is based on the identification of such a network of semantic relationships.

    Characteristics of ACASM

    Before concluding the presentation of ACASM, it is

    worth pointing out three peculiar characteristics of

    the method.

    1. Though the process of human comprehension of texts is a highly debated issue (Kintsch, 1988; Landauer & Dumais, 1997; Visetti & Cadiot, 2002), in general terms one can assume that human bottom-up semantic analysis requires the implementation of two basic complementary functions. Firstly, semantic analysis consists of the evaluation of semantic similarity between the units of analysis (e.g., groups of words, utterances, groups of utterances, and so on) into which the text is segmented. Thus, utterances considered to have a similar semantic content are grouped together and this leads to the identification of a semantic/thematic nucleon. For instance, utterances concerning "trouble at work", "conflicts within the family" and "health issues" can be clustered in terms of their sharing of the content "undesirable, problematic events". Secondly, semantic analysis implies an operation of categorization: utterances are attributed to the semantic nucleon that is the most representative of their content. ACASM too performs these two basic functions of human coders' bottom-up semantic analysis. It does so through context-sensitive computational rules, namely the multidimensional analysis of the distribution of co-occurrences across ECUs.

    2. We do not claim that ACASM's parameters and computational rules are the same as those used by human coders. On this point we keep an open position, though some studies lead one to think that human comprehension of text is also based on computational rules similar to multidimensional analysis (Landauer & Dumais, 1997; Visetti & Cadiot, 2002). What we maintain is that, given their context-sensitiveness, ACASM's computational rules are functionally equivalent to human coders' procedures: ACASM reproduces the same basic functions (evaluation of semantic similarity and classification) as human coders' bottom-up semantic analysis.

    3. ACASM is assumed to be functionally equivalent to a model of human bottom-up semantic analysis based on commonsense, namely to a human coder interpreting the textual content guided by no specific theoretical criterion, but based on the basic cultural and linguistic competence in terms of which she/he communicates, understands and interprets in daily life (Garfinkel, 1967; Valsiner, 2007).

    Table I. A hypothetical example of digital representation of the text "I went home. Kate was still there" in terms of the ECU/lemma matrix

    ECU / Lemma            I   Go  Home  Kate  Be  Still  There
    I went home            1   1   1     0     0   0      0
    Kate was still there   0   0   0     1     1   1      1

    Data Source

    The present study concerns a sample of verbatim transcripts extracted from a good-outcome,
    Italian-language, 124-session psychotherapy (the Katja case). Katja received a
    Cognitive-Constructivist Therapy for Narcissistic Disorder (Dimaggio, Hermans, & Lysaker,
    2010; Semerari, Dimaggio, Nicolo, Procacci, & Carcione, 2007). The treatment lasted three and
    a half years; according to several independent analyses, Katja's therapy was considered a
    good-outcome therapy (for details, see the review proposed by Nicolo & Salvatore, 2007). The
    good outcome proved to be maintained at follow-up a year later (Dimaggio & Semerari, 2001).

    Analysis was performed on the transcripts of 48 sessions of the third and last stage of the
    psychotherapy (from session 74 to session 121, corresponding to the last year and a half of
    psychotherapy; note that the last three sessions were left out because subjects other than the
    therapeutic dyad took part in them). We decided to concentrate our analysis on just the last
    part of the psychotherapy because one can expect the patient-therapist talk to undergo a
    process of specialization in the use of words, namely certain combinations of words become
    progressively more probable while others become progressively more improbable; empirical
    evidence supporting this hypothesis on the same case is provided by Salvatore, Tebaldi and
    Potì (2009). Therefore, given that our analysis is a first test of the validity of the method,
    we preferred to focus on the last portion of the clinical dialogue, where patterns of
    co-occurrences should be more differentiated and therefore more efficiently distinguishable in
    clusters.

    Following a dialogical clinical approach (Gennaro et al., 2010), the whole transcript of the
    sessions, encompassing both patient and therapist talk, was included in the analysis.

    Design

    Analysis of the thematic contents and their temporal evolution. First, we applied the
    ACASM procedure to the textual corpus and interpreted the clusters thus defined in terms of
    their thematic content. Second, in order to take into account the temporal evolution of the
    thematic contents, we divided the period of therapy analysed into three sub-periods
    (sub-period A, sessions 74-89; sub-period B, sessions 90-105; sub-period C, sessions
    106-121). The incidence of each thematic

    Table II. Descriptive parameters of the textual corpus subjected to analysis (Katja case)

    Descriptive parameters                                     Amount
    Sessions                                                   48
    Number of elementary context units (ECUs) of the text      5548
    Number of elementary context units (ECUs) clustered        5054
    Number of occurrences in the text (tokens)                 146673
    Number of lemmas in the text (types)                       7258
    Number of lemmas in analysis                               726
    Number of clusters extracted                               14


    Table III. Katja case's ACASM output: clusters' most representative ECUs, and their semantic interpretation

    ECU Thematic interpretation

    1. Own vs. others' point of view

    or so so then on Sunday morning, so see the time he poor thing was trying to give me that
    freedom I took advantage of it in an extreme way but because he was giving me the freedom
    exactly but in his opinion it was not permission from his view point he just took a decision
    yes from my point of view, it was (74; 136.562)

    It's just that I have to be able not to care about it or at least I have to understand his
    Katja's view point both or if I understand my view point and the other person's view point
    then I say that the other person is right (83; 81.255)

    Yes exactly, probably, and this is from your viewpoint, we know, we know each other quite
    well, and from this point of view you have to take the responsibility and if you look
    sincerely at your thoughts you can't think that it's the other person that have to notice it,
    here you have to discipline yourself, you know, you feel it, two points (79; 73.475)

    I don't want to be balanced no, it's just that I also understand my viewpoint, now the
    difficulty is understanding both of them, let's say mediating so as to perform actions that
    somehow are good for yourself without making you feel too guilty or anyway, without knowing my
    viewpoint (83; 54.308)

    2. Differences in perspectives

    it's clear that changing this perspective changes the way of seeing the defects of others, and
    of yourself and all, and also the relationship and this is a change in the vision of yourself,
    of Katja and of the relationship, and the general vision of your issues in other things, the
    vision of yourself (75; 61.268)

    This inner torment continually between the choice and continuing to have that perspective
    which however was not confined to the view of other things but really of emotions and feelings
    linked to ... and it's what we were saying about challenging the choice each two or three
    seconds because if I have another perspective linked to a different sensation (92; 24.638)

    it's not like buying a Ferrari because it's one thing to buy a car and another to buy a
    Ferrari, in sum between one million six hundred and the thirty million that a Ferrari costs
    there is a huge difference, but between one million six hundred and thirty million which is
    the price of a car there's also a difference, but it's always less than that between a million
    six hundred and a Ferrari (86; 24.107)

    it's true in the sense that as it were, your story is like that, but with dad I understand,
    yes I agree, but what we were saying last time, it's one thing if someone doesn't understand
    me, I had some problems too, then when I had problems I wasn't able to explain myself, I mean,
    when I explain myself, when I say one plus one equals two and then if you want to do as you
    like it's as she said (114; 23.364)

    3. Concerns for relational problems

    I mean I tried to explain some things to him, to tell him after that episode of the bloodiness
    and so ... I actually see that it's all pointless, all quite pointless, but he insists on a
    specific topic that is the daughter who needs to be treated, no, no, not on the topic of the
    daughter that needs to be treated (109; 148.387)

    Yes, the couple doesn't work but not only the couple doesn't work, I don't work and neither
    do you, that means it becomes a false happiness because children are absolutely are like pet
    animals so they feel what happens (98; 92.132)

    in those aspects you are highlighting, the relationship with your parents, now it seems silly,
    you are highlighting some daughter-like aspects, that is, those are aspects related to
    dependence, the car, upkeep, and obviously in this position you feel bad, as soon I'm in this
    position I can't stand it (94; 90.888)

    I'm calming down but if you assume a more stable identity people trust it too much and thus
    you get bored because nothing happens (87; 88.162)

    4. Exchange of presents

    A. and I gave our parents some presents, for my mother we bought a pair of shoes, we gave the
    same present each other, more or less the same gift ... because between us we never give any
    gifts ... how come between Alberta and me? (75; 126.676)

    his, mine that is because when I went to get it he was very kind, he said here it is, I wanted
    to give it to you for your birthday, it's an engagement gift, it's nice he was very kind to
    try to connect the gift with what you are feeling (77; 110.891)

    But silly example if someone knows that on my birthday I like to receive flowers, I tell you
    this the first year, the second year, do I have to tell you the third year? Or do I say please
    will you give me some flowers? Ah, what did you give them to me for? (98; 79.356)

    in this hour we did some window shopping do you like this bag? I said, yes, nice, sure,
    well, I was thinking he was going to give me a present and so, you know, that was my birthday
    present (121; 68.016)

    5. Experience of feelings in the relationship

    more than attacked I felt misunderstood but also not respected yes is it more important to be
    not respected or to be misunderstood? Well, I think that it's a result, that is, maybe
    misunderstood in the sense that there is no effort to relate to someone else and therefore to
    understand them and try to respect what has been understood (79; 50.439)

    You are imagining yourself in our work? And you are saying: I'm imagining, let's do a
    meta-thought, right, ok let's make a meta-thought and you are saying there is the risk that I
    could be here, pay attention exactly to the way you described yourself before (118; 31.874)

    moustache I think that I will need his help to continue my therapy, that is the therapy goes
    on by itself, but we need his help for the therapy. Let's think how it could be presented, and
    then actually also the sense of our proposal, where he can help us or where I could convince
    him to come? (116; 23.848)

    we are talking about a sensation of emptiness which you are describing eh but the silence of
    last time in the way you described it to me may help me to understand ... that is I'm trying
    to visualize the scene, the inner scene in this moment it's more or less like this, there is
    an inner feeling about something inside that, isn't there? (100; 16.484)

    6. Experience of difficulty in the relationship

    I mean even if I had some difficulties understanding those signals of attention, that doesn't
    concern these episodes, it's about something else, different circumstances, but not these
    episodes even if I have some difficulties reading the attention signals that are given eh,
    anyway I don't care about it (98; 88.81)

    guys eh of course broadly speaking your difficulty is to admit to yourself that you're
    involved in a relationship, somehow it was really I still have difficulties, but if this is
    the difficulty, as I said, have more (119; 88.767)

    maybe it's a difficulty related to being able to live inside the world of others, able to
    move, you see? In the world, in general even without the relationship that's under way it's a
    difficulty in directing this energy which is anyway activated, that is it's somewhere (100;
    83.576)

    I'm trying to check with you and with everyone and this is a difficulty that also belongs to
    you: P: the difficulty is being afraid that the emotions could be too big to be constrained,
    controlled or anyway felt, I don't know (118; 42.205)

    7. Work activities

    I'm leaving again and that's all, anyway on Wednesday I decided to take a day off because that
    friend I study with, lucky her, passed the written part of the Police entrance exam to become
    an officer and now she has the oral exam and she said obviously I want to become a
    magistrate (111; 52.298)

    so I don't have any working identity simply because I don't work because I'm doing that
    public exam so it's pointless because I have or I don't have difficulties, that is, the issue
    of working identity doesn't exist, if it arises when I'm working the difficulty or the limit
    will exist but at the moment it doesn't (94; 43.806)

    the following days I wanted to sleep in the morning but I couldn't either on Tuesday or
    Wednesday because I had to go to work because there was something to do, so I didn't relax
    on Tuesday or Wednesday but in the afternoon I did my stuff, I had a wax, I went out with a
    friend of mine that's (89; 42.793)

    another thing that I must say now I realized that I was led astray by you insisting so much on
    making choices without following your nature, which meant working three times harder S but
    visibly working hard for my magistrate's exam I thought you were referring to that kind of
    fatigue in the sense that it's one thing to (96; 38.153)

    8. Account of negative feelings

    Well, some things happened so you couldn't not link together the frame of mind with what
    happened V well, but you could have connect the frame of mind to what happened in general
    with your dad, which probably created your frame of mind, a basic sadness, yes but it was
    also, you see, (104; 189.949)

    it was due to the suffering that I feel each time I meet or I hear my dad, the frame of mind
    it's that frame of mind, I didn't cry because I'm narcissistic, but anyway sometimes I shed a
    tear, yeah, very often, but this is not the point because anyway what this meeting gave you
    (115; 125.425)

    it's a period that it seems to me that I've been living with this struggle for ages, fighting,
    improving myself, but I don't enjoy myself, I've had enough! It's a drag, I get bored,
    laughter, I don't understood your frame of mind K., I'm sorry my frame of mind (79; 110.369)

    because obviously I'll become bad, I'll be bad, I'll be bad, I don't know, I don't feel as if
    I'm bad, but at times when he says he's sorry, it seems to me that he expresses his upset, but
    others see it as being bad (88; 100.742)

    9. Tolerance of negative feelings

    And so you also have to accept all the consequences on you, this lack of sensibility from your
    dad, that is if I accept that others can make mistakes, I also accept, I don't understand I
    don't think it's so easy for you to accept that people in general could be more or less
    sensitive could have a different degrees of sensitivity, why can't these ones? (114; 56.695)

    I can't do it, I can't feel this connection, that is to do it mentally, I can do that, I can
    do it, anyway you don't want to accept that it's an aspect of you, yes but I accept it, but
    what would it mean in concrete terms? Accepting that I'm working _t hard, I accept that, I
    feel it, but I can't understand which part is holding off at a distance, and at a distance
    from what? (87; 39.952)

    Yes, it's like saying that there is an aspect of loving, of taking care, I mean that's ok,
    accepting that it has a limit, accepting that there is a degree of suffering on the part of
    someone else that we can't do anything about, no? And accepting that a degree of guilt
    feeling, ok could I suggest something you could read? (84; 37.911)

    going into a closed agency has no sense because it makes me waste time that I could use
    differently doing more interesting things, so yes it troubles me, ok, so you are telling me
    that now you are able to manage your troubles in a more natural way, that is we can say (93;
    32.784)

    10. Leisure

    This? No ah, I don't have to but the first, the first moment _no, this Saturday and Sunday he
    has An and I don't but Monday is a holiday yes Monday is a holiday and we will be together,
    but it doesn't count as a weekend, Monday really I will not make it weigh, this Saturday and
    Sunday he is with An, and Katja? (105; 72.368)

    anyway, it's better than before, that's all, then it's always the same struggle with money he
    complained all winter and he still goes on I won't be able to have any holidays, my god, my
    god, my god, but in fact he goes away every weekend, now he's leaving for ten days I don't
    know where in the mountains and then maybe he'll go to his relatives in France, down in YY
    (97; 38.19)

    alone, and yes laugh he is going to the gym, poor him, yes nice I like him, no it's something
    I like, he's nice, then slowly slowly in the following days, in the following days, in the
    following days uh. Wednesday and that's all because on Thursday we've decided to go away not
    to go away (77; 29.37)

    we had a weekend alone on 4 September because An went to OO to a child's birthday party,
    so we now at 4 October, and October has 4 weekends and I say see if you can take one, good,
    otherwise anyway I'm living a life where I get up every day at 8.00 including Saturdays and
    Sundays (102; 24.718)

    ah, no no, so I had to phone to get information the day after this marriage was too much, the
    day after the marriage we had to leave for two days in NN but the weather wasn't good on
    Sunday and so the ferries weren't leaving (97; 22.734)

    11. Adherence to others' expectations

    relatively it's _not that I'm leaving but goodbye and thank you from a view point obviously I
    mean goodbye and thank you it's not directed to my parents, goodbye and thank you no, no of
    course not the fact of being a daughter, not the fact of being daughter, of being maintained
    it's one of the aspects connected to being a daughter, but not the whole thing I imagine, no,
    it's a very important part of it (94; 94.047)

    there isn't a dialogue so it's impossible and so he needs to be surrounded by people or
    someone to say yes, yes, the partner or whoever say yes, of course he's smart, yes, he's good,
    he's good, but what's he's good at? (103; 88.427)

    in the sense that she has some shortcomings of her own but she is nice in the relationship
    because she doesn't smother, she isn't, anyway she's good, while my dad isn't, not at all, so
    he's there, so that's why I think _ that more or less_ there is a better balance from that
    point of view, then obviously until I start working and earning, goodbye and thank you, well
    (94; 85.956)

    my goals, so that if he wants to give me something yes, no, but more than good or bad which
    anyway is all relative, it's the fact that I felt at ease which is much harder than being good
    or not, a person may be good but feel like this and that's a quality that, you've seen I
    think, that comes quite naturally to me, no? (91; 77.94)

    12. Refusal of dependency

    and also it's hard, hard in the sense that everything is hard, a hard perspective the way it's
    managed yes, my dad, then it depends on the period because when I want to see him often,
    certainly in this period (continues less certain) the less I see him the better I feel, but I
    went to his home, I saw the lights, I saw a whole difficult period with dad, (94; 134.537)

    maybe I'm wrong or maybe it's because I'm used to it for so long, I never depended on you,
    you depended on me the difference is maybe quite considerable if it will make you more feel
    better in a few years you will depend on us, we'll be equal, I really don't think so, I'd
    prefer to shoot myself rather than depend on you (120; 108.038)

    Also very long periods when I was happy. I was calm, I felt good with that person name period
    and period, I can also forget name, I remember him, well, name, ok ... (98; 82.972)


    content was calculated as the percentage of ECUs

    associated with it.
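
    As an illustration of this incidence measure, the sketch below computes, for each sub-period, the percentage of ECUs assigned to each cluster. It assumes that percentages are taken within each sub-period and that every clustered ECU is available as a (session, cluster) pair; the data and the function names are hypothetical, not taken from the study.

```python
from collections import Counter

# Sub-period boundaries as stated in the Design section.
SUB_PERIODS = {"A": range(74, 90), "B": range(90, 106), "C": range(106, 122)}


def sub_period(session):
    """Return the label of the sub-period that contains the session."""
    for name, sessions in SUB_PERIODS.items():
        if session in sessions:
            return name
    raise ValueError(f"session {session} is outside the analysed period")


def incidence(ecus):
    """ecus: iterable of (session, cluster) pairs.
    Returns {sub_period: {cluster: percentage of that sub-period's ECUs}}."""
    counts = {name: Counter() for name in SUB_PERIODS}
    for session, cluster in ecus:
        counts[sub_period(session)][cluster] += 1
    result = {}
    for name, counter in counts.items():
        total = sum(counter.values())
        result[name] = {c: 100.0 * n / total for c, n in counter.items()}
    return result


# Hypothetical toy data: (session number, cluster label).
toy_ecus = [(74, 1), (75, 2), (80, 1), (89, 1), (95, 4), (100, 4), (110, 7)]
incidence_table = incidence(toy_ecus)
print(incidence_table)
```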

    Analysis of validity. We have translated the Turing-like criterion of validity (see section
    "Purpose of the Study and Hypothesis") into two complementary hypotheses, each of them
    concerning one of the two basic functions implemented by bottom-up semantic analyses:
    evaluation of similarity and classification (see section "Characteristics of ACASM", point 1).
    More specifically, we expect to find that: (a) ACASM's evaluation of the similarity of ECUs is
    consistent with the evaluation of semantic similarity produced by blind expert human coders
    (hypothesis 1); (b) the ACASM classification of ECUs is consistent with those provided by
    blind human coders on the basis of the ECUs' semantic content (hypothesis 2); needless to say,
    hypothesis 2 is only exploratory, being expressed in terms of confirmation of the null
    hypothesis.

    In order to test these two hypotheses, we subjected the ACASM output to the following two
    analyses, based on judgments performed by independent blind expert human coders.

    Analysis 1. Association of ACASM's Assessment of Similarity and Human Coders' Measure of Semantic Similarity

    The aim of this comparison is to estimate the consistency between the evaluation of ECUs'

    Table III (Continued)

    ECU Thematic interpretation

    I was thinking of the wheel breaking, things more like this getting stuck in the middle of the
    road yes yes, but also an aggression could happen no, no, usually those episodes can happen to
    girls alone at night, I know usually, it happens I know, actually it's always happening, I've
    got some girl friends who always get someone to take them home into the house, oh well, (76;
    77.202)

    13. Attitude towards the other

    it's not important that it's good for you or not but knowing that if it hurt you or it's good
    for you it's in your hands yes, it certainly is, but as I was saying before, it's not that as
    you said now it hurt you or it's not good for you, it's the same, it's like two levels before,
    it's not that you don't know there's a level below a level, which, I mean, (89; 84.752)

    Yes, he is a good guy, he understands you, he's improved, he really loves you, but you, I
    mean, are you in love or not? I don't answer myself and he doesn't give any answer and I don't
    answer myself and if I want to give an answer yes but not yes, but it isn't an answer, it's an
    attempt at an answer (98; 73.914)

    because they are not equal on me because for sure I feel better because for sure I know that
    before the answer might have been dictated by an aggressive attitude and so the answer was
    aggressive, and now it's not like that anymore, that is, the answer is the answer and that's
    all, that is it's not linked to something that I do, it's his choice and thus if one feels
    like that eh, (102; 64.278)

    that, it's always what we said last time, that it seems obvious to me to make certain requests
    and wait for certain answers and instead different ones arrive, I don't know, about the photo,
    on the furniture, then uh (101; 52.244)

    14. To communicate

    I still don't have it, ok, I won't read them, I won't read them if you want to wait it would
    be better then I will also let you read other things that I'm writing, probably, in fact I
    will also ask you for advice about things, about as it were, about the relevance of what I've
    written to what you've experienced here (103; 93.922)

    obviously in some way the fact of being noticed the two things are exchanged, you wear
    something of someone else no? The goal is to be noticed to write a story R: together but
    belonging to someone else, not to me, that is I didn't have to write it was her that needed to
    write and asked me please, please (87; 68.722)

    because I need to be noticed because I have to write, I need to be noticed by that writer
    because I have to write I don't know, a biography, or something, in sum he had to write
    something and I was trying to say no but, (56.995)

    And what did he write on the card, uh ... he wrote well now I don't remember exactly the
    words it was like you are my grumbling love but you are wonderful, I'd never change you with
    anyone else that's all. I think it could correspond to reality, what was on the card? (89;
    46.646)

    Note. Translation from the original in Italian.

    The first number in parentheses indicates the session in which the ECU occurred; the second
    number is a measure of the level of the ECU's representativeness of the corresponding cluster
    (chi-square metric).


    semantic similarity by human coders and the evaluation of similarity provided by ACASM. To
    this end, we adopted the following five-step procedure.

    First, we selected 70 ECUs, the five most representative ones from each of the 14 clusters
    defined by the application of ACASM to the corpus. As the criterion of representativeness we
    adopted the chi-square-derived parameter computed by Cluster Analysis for each ECU (the output
    of CA is reported in the section Results; cf. also Tables II and III). This parameter is based
    on the number of a cluster's words co-occurring within the ECU: the more of the cluster's
    words occur in the ECU, the greater the representativeness of the ECU for that cluster.

    Second, two blind coders (PhD students) with experience in content analysis for psychosocial
    research were separately asked to evaluate the semantic similarity of the 2415 pairs of ECUs
    produced by the combination of the 70 selected ECUs (each ECU was compared with all the
    others; the number of pairs is given by the formula k(k - 1)/2, where k = number of
    elements = 70; therefore: 70(70 - 1)/2 = 2415 pairs). Consistent with the commonsense criteria
    of coding (see section "Characteristics of ACASM", point 3), we chose to use coders without
    clinical expertise and not to provide them with any specific, theory-oriented semantic rules
    and criteria for coding. The coders received 2 hours of preliminary training, aimed at
    clarifying the task. Moreover, coders were informed that the ECUs had been extracted from the
    verbatim transcript of a psychotherapy and were asked to use a 5-point Likert scale, from 1,
    indicating "very different thematic content", to 5, meaning "same thematic content". No
    further information on the aim of the task was provided to them; coders were blind to the
    ECUs' membership of ACASM clusters. The ECUs were presented in random order, the same for both
    coders. In this way, 2415 similarity judgments were obtained from each coder. It is worth
    noting that we did not implement any consensus procedure, often adopted in semantic analysis
    for the sake of increasing inter-coder convergence (e.g., Stiles, Elliott, Llewelyn,
    Firth-Cozens, Margison, Shapiro, & Hardy, 1990). Thus, the comparison between ACASM and human
    coders is limited to the basic level of functioning of semantic analysis, namely not
    encompassing the post-coding process of increasing reliability.
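
    The pair count quoted above can be checked by direct enumeration. The snippet is only a worked check of the k(k - 1)/2 formula; the integer ECU identifiers are placeholders.

```python
from itertools import combinations
from math import comb

k = 70                        # number of selected ECUs (5 per cluster x 14 clusters)
ecu_ids = range(k)            # placeholder identifiers for the selected ECUs

# Every unordered pair of distinct ECUs is rated once by each coder.
pairs = list(combinations(ecu_ids, 2))
n_pairs = len(pairs)

# The enumeration, the closed formula and the binomial coefficient agree.
print(n_pairs, k * (k - 1) // 2, comb(k, 2))
```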

    Third, in order to make the matrix thus obtained suitable for parametric analysis, the Likert
    scores were transformed into a metric scale, following the procedure proposed by Ciavolino and
    Dahlgaard (2009), which is based on the probability associated with the relative frequency of
    each level of similarity.

    Fourth, we calculated an ACASM rate of similarity for all the 2415 pairs of ECUs. To this
    end, we used the Euclidean distance as the ACASM measure of similarity between two ECUs. In
    order to understand this parameter, one has to consider that each ECU corresponds to a point
    in the multidimensional factorial space resulting from the multidimensional lexical
    correspondence analysis performed as the first step of the procedure of Cluster Analysis (see
    above, Method section, ACASM step 4). The Euclidean distance is the metric distance between
    two points in this space. The closer the two points, the smaller the Euclidean distance, and
    the more similar the ECUs they represent (Lancia, 2002). In formal terms, the distance between
    every pair of ECUs was calculated as:

    $$d(P, Q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \dots + (p_n - q_n)^2} = \sqrt{\sum_{k=1}^{n} (p_k - q_k)^2}$$

    with $P = (p_1, p_2, \dots, p_n)$ and $Q = (q_1, q_2, \dots, q_n)$ representing the coordinates
    in the n-dimensional factorial space of the two generic ECUs whose distance is computed. In
    the case of our analysis, we used the first 10 factorial dimensions defined by the
    multidimensional lexical correspondence analysis applied to the corpus (i.e., n = 10).
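
    A minimal sketch of this distance computation follows. The coordinates are made up for illustration (they are not factorial coordinates from the study), and the function name is hypothetical.

```python
import math


def euclidean_distance(p, q):
    """Euclidean distance between two ECUs given their coordinates
    on the same n factorial dimensions."""
    if len(p) != len(q):
        raise ValueError("both ECUs need coordinates on the same dimensions")
    return math.sqrt(sum((pk - qk) ** 2 for pk, qk in zip(p, q)))


# Hypothetical coordinates on n = 10 factorial dimensions.
ecu_p = [0.0] * 10
ecu_q = [3.0, 4.0] + [0.0] * 8

distance = euclidean_distance(ecu_p, ecu_q)
print(distance)
```

    The standard library function `math.dist` (Python 3.8 and later) returns the same value and could be used instead.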

    Finally, we compared the values of Euclidean distance (as ACASM's measure of dissimilarity)
    and the human coders' judgments of semantic similarity. The comparisons were performed on all
    the 2415 pairs of ECUs for each coder. Given the structure of the ECU sample (five ECUs for
    each of the 14 clusters), most of the pairs of ECUs had a low level of similarity.
    Consequently, most of the 2415 pairs were rated 1 by both coders (coder A: point 1
    corresponded to 91% of judgments; mean = 1.1085; SD = 0.37506; kurtosis = 18.216;
    skewness = 4.042; coder B: point 1 corresponded to 77.8% of judgments; mean = 1.3102;
    SD = 0.65622; kurtosis = 4.304; skewness = 2.203). For this reason, we adopted a nonparametric
    index of correlation, Spearman's rho.

    According to the first hypothesis, we expect to find a significant negative correlation
    between the Euclidean distance and the average human coders' evaluations of similarity; the
    negative direction of the correlation is due to the fact that the Euclidean distance is a
    measure of dissimilarity, rather than similarity. Moreover, we expect to find that this level
    of correlation is not distinguishable from the level of association between the two coders.
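
    The expected negative association can be illustrated with a small, self-contained computation of Spearman's rho. This is a sketch under stated assumptions: tied values receive average ranks (relevant here, since most pairs were rated 1), the coefficient is obtained as the Pearson correlation of the ranks, and the distances and ratings are invented for the example.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: distance between two ECUs and a coder's 1-5 similarity rating.
distances = [0.2, 0.5, 1.1, 1.4, 2.0, 2.3]
ratings = [5, 4, 2, 2, 1, 1]
rho = spearman_rho(distances, ratings)
print(round(rho, 3))
```

    A strongly negative rho on data of this kind means that pairs lying close together in the factorial space tend to be the ones the coder judged most similar.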


    Analysis 2. Level of Agreement Between ACASM and Human Coders' Classification of ECUs

    This analysis is aimed at comparing the ACASM classification with those based on human
    interpretation of the semantic content of the ECUs. The analysis is based on the same set of
    70 ECUs adopted for analysis 1. The ECUs were arranged in random order, the same for all the
    coders, to prevent their order of presentation from being related to cluster membership. Three
    blind coders, different from those involved in the previous analysis yet similar in level and
    type of competence (i.e., PhD students skilled in content analysis for psychosocial research
    and lacking clinical expertise), were separately asked to group the 70 ECUs into 14 groups of
    five ECUs on the basis of their thematic similarity. We specified 14 partitions in order to
    make the human coders' classification directly comparable with ACASM's. In this case too, the
    coders were given 2 hours of preliminary training in order to make the task clear to them.
    Moreover, coders were informed that the ECUs had been extracted from the verbatim transcript
    of a psychotherapy. No further information on the aim of the task was provided to them; coders
    were blind to the ECUs' membership of ACASM clusters. As in analysis 1, no theory-oriented
    semantic criterion of classification was provided to coders, and no consensus procedure was
    implemented.

    Finally, Cohen's K was calculated as a measure of

    agreement between each pair of the four

    classifications (i.e., those carried out by the three

    coders and the one produced by ACASM). This yields

    six Cohen's K values: three comparing the coders with

    each other and three comparing each coder with ACASM.
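To make the pairwise scheme concrete, the following Python sketch computes Cohen's K for every pair among four classifications, which gives the six values mentioned above. It is an illustration under stated assumptions, not the procedure actually used in the study: the function name `cohen_kappa`, the toy labels, and the premise that group labels have already been matched across classifiers (a step the text does not describe) are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of category labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed proportion of units placed in the same category by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both raters use one identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data (NOT the study's data): 4 classifiers, 6 units, 3 groups.
# Assumption: group labels are already aligned across classifiers.
classifications = {
    "coder_1": [0, 0, 1, 1, 2, 2],
    "coder_2": [0, 0, 1, 2, 2, 2],
    "coder_3": [0, 1, 1, 1, 2, 2],
    "ACASM": [0, 0, 1, 1, 2, 1],
}

# Four classifications give C(4, 2) = 6 pairwise comparisons.
pairwise = {
    (a, b): cohen_kappa(classifications[a], classifications[b])
    for a, b in combinations(classifications, 2)
}

for pair, kappa in pairwise.items():
    print(pair, round(kappa, 3))
```

Three of the six pairs involve only human coders and three involve ACASM, mirroring the two sets of comparisons described in the text.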

    According to the second operative hypothesis, we

    expect to find that the level of ACASM-human

    coders agreement is at least of the same degree as

    the level of agreement between human coders.

    Results

    Interpretation of the Thematic Contents and

    their Incidence

    The application of ACASM to the corpus (cf. Table II

    for statistics describing it) produced 5548 ECUs and a

    list of 7258 lemmas, from which we sampled 726

    lemmas, following the procedure described above (see

    step 2 of the ACASM procedure). Therefore, the

    Cluster Analysis (step 4 of the ACASM procedure)

    was performed on the matrix defined by 5548 ECUs

    (rows) × 726 lemmas (columns). Cluster Analysis was

    able to group 5054 out of 5548 ECUs (91.1%, cf.

    Table II). It provided 14 clusters as the optimal

    partition. A sample of the most representative ECUs

    for each cluster, together with the cluster's

    interpretation in terms of thematic content, is provided

    in Table III. Table IV shows the number of ECUs grouped

    in each cluster. Exchange of presents (10.74%), Differences

    in perspective (9.36%), Adherence to others' expectations

    (8.71%), Leisure (8.43%) and Tolerance of negative

    feelings (8.23%) are the most frequent

    clusters/thematic contents, while the least frequent are

    Own vs. others' point of view (4.45%), Experience of

    difficulty in relationship (4.71%), Concerns for relational

    problems (5.05%) and Account of negative feelings (5.1%).
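The shares quoted above can be re-derived from the cluster sizes listed in Table IV. The short Python sketch below does so; the counts are those reported in the table and the corpus size is the one given in the text, whereas the variable names are purely illustrative.

```python
# Cluster sizes as reported in Table IV (number of ECUs per cluster).
cluster_sizes = {
    "Own vs. others' point of view": 225,
    "Difference in perspectives": 473,
    "Concerns for relational problems": 255,
    "Exchange of presents": 543,
    "Experience of feelings in the relationship": 353,
    "Experience of difficulty in relationship": 238,
    "Work activities": 332,
    "Account of negative feelings": 258,
    "Tolerance of negative feelings": 416,
    "Leisure": 426,
    "Adherence to others' expectations": 440,
    "Refusal of dependency": 380,
    "Attitude toward the other": 405,
    "To communicate": 310,
}

total_ecus = 5548  # ECUs produced by segmenting the corpus (from the text)
total_classified = sum(cluster_sizes.values())

# Share of the corpus that the cluster analysis was able to classify.
coverage = 100 * total_classified / total_ecus

# Share of the classified ECUs falling in each cluster, most frequent first.
shares = {name: 100 * n / total_classified for name, n in cluster_sizes.items()}
ranked = sorted(shares.items(), key=lambda item: item[1], reverse=True)

print(f"classified: {total_classified} of {total_ecus} ({coverage:.2f}%)")
for name, pct in ranked:
    print(f"{name:45s} {pct:5.2f}%")
```

Note that the percentages are computed on the classified ECUs rather than on the whole corpus.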

    The frequency of the 14 clusters changes significantly

    across the three sub-periods (Chi-square =

    132.684; df = 26; p < .001). Nevertheless, visual

    inspection of the distribution of the clusters shows

    that all clusters tend to be spread among the three

    periods, that is, they occur in every sub-period

    (cf. Figure 1).
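The 26 degrees of freedom follow from crossing 14 clusters with 3 sub-periods: (14 - 1) x (3 - 1) = 26. The sketch below shows how a Pearson chi-square statistic and its degrees of freedom are obtained from a contingency table. The per-period counts are not reported in this section, so the 3 x 3 table is invented toy data, and `chi_square_independence` is a hypothetical helper written on the assumption that a standard Pearson test of independence was used.

```python
def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom) for a rows x columns count table.

    Assumes every row and column total is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical toy counts: 3 themes (rows) x 3 sub-periods (columns).
toy_table = [
    [30, 20, 10],
    [20, 20, 20],
    [10, 20, 30],
]
stat, dof = chi_square_independence(toy_table)
print(f"toy table: chi-square = {stat:.3f}, df = {dof}")

# Degrees of freedom for the design described in the text:
# 14 clusters crossed with 3 sub-periods.
dof_study = (14 - 1) * (3 - 1)
print(f"14 x 3 design: df = {dof_study}")
```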

    Table IV. Partition of ECUs in the clusters

    Cluster/Thematic content                        Number of ECUs   Percentage
    1. Own vs. others' point of view                           225        4.45%
    2. Difference in perspectives                              473        9.36%
    3. Concerns for relational problems                        255        5.05%
    4. Exchange of presents                                    543       10.74%
    5. Experience of feelings in the relationship              353        6.98%
    6. Experience of difficulty in relationship                238        4.71%
    7. Work activities                                         332        6.57%
    8. Account of negative feelings                            258        5.10%
    9. Tolerance of negative feelings                          416        8.23%
    10. Leisure                                                426        8.43%
    11. Adherence to others' expectations                      440        8.71%
    12. Refusal of dependency                                  380        7.52%
    13. Attitude toward the other                              405        8.01%
    14. To communicate                                         310        6.13%

    Analysis of validity. As concerns analysis 1,

    the ACASM measure of similarity (the Euclidean

    distance) and the human coders' judgment of thematic

    similarity were significantly correlated in both

    cases (ACASM-coder A: Rho = -.125, p < .01;

    ACASM-coder B: Rho = -.121, p < .01). The correlation

    between the two coders (coder A-coder B:

    Rho = .162, p < .01) is of the same magnitude as

    the coder-ACASM correlations.

    Table V shows the Cohen's K measures of inter-coder

    agreement concerning the classification of the

    70 ECUs into 14 partitions (analysis 2). The

    magnitudes of K are quite similar among the six

    scores; all comparisons lie within the range 0.34 to

    0.43 (according to Landis & Koch, 1977, this

    corresponds to a fair to moderate level of agreement).

    The levels of agreement between human

    coders and between human coders and ACASM

    substantially overlap: the average K

    concerning the agreement between coders is .383

    (sd = .034); the average K concerning the agreement

    between human coders and ACASM is .378

    (sd = .045). The highest K (.427) concerns the

    agreement between coder 3 and

    ACASM.
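The summary figures can be checked against the six values listed in Table V. The sketch below assumes that the reported sd is the sample standard deviation (n - 1 in the denominator), an assumption not stated in the text; the variable names are illustrative. Under that assumption both sets of comparisons average roughly .38.

```python
from statistics import mean, stdev

# Cohen's K values as reported in Table V.
between_coders = [0.400, 0.407, 0.344]  # coder 1-2, coder 1-3, coder 2-3
coder_vs_acasm = [0.338, 0.369, 0.427]  # coder 1, coder 2, coder 3 vs. ACASM

for label, values in (
    ("coder-coder", between_coders),
    ("coder-ACASM", coder_vs_acasm),
):
    # stdev() is the sample standard deviation (assumed to match the paper).
    print(f"{label}: mean K = {mean(values):.3f}, sd = {stdev(values):.3f}")
```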

    Figure 1. Distribution of the thematic contents in the three sub-periods of psychotherapy.

    Table V. Cohen's K between coders and ACASM classification

               Coder 2   Coder 3   ACASM
    Coder 1     .400      .407     .338
    Coder 2               .344     .369
    Coder 3                        .427

    Discussion

    ACASM has mapped the transcripts' content in terms

    of 14 clusters, each of them being interpretable in

    terms of thematic content. From a quantitative

    standpoint, all thematic contents prove to be specific,

    in the sense that every cluster encompasses only a

    limited portion of the therapeutic exchange (the

    most frequent thematic content concerns about 10%

    of the classified text), but not marginal (no cluster

    represents less than about 5% of the classified text).

    Moreover, though the overall distribution of thematic

    contents has been shown to change significantly

    through time, all thematic contents are present in a

    non-marginal way in all three sub-periods. Taken

    together, these results lead us to conclude that each of

    the 14 thematic contents mapped by ACASM represents

    a systematic semantic area of the clinical

    exchange analysed, namely a line of discourse which

    is present in varying degrees through sessions, but

    which runs through the whole treatment.

    Interestingly enough, the most frequent thematic

    contents concern the account of positive circumstances,

    associated with the patient's experience of

    relational engagement (Exchange of presents,

    Differences in perspectives, Adherence to others'

    expectations, Leisure) and/or her inner states and

    feelings (Tolerance of negative feelings), while the

    least frequent refer to negative issues, in terms of

    negative feelings (Account of negative feelings) or

    relational disengagement (Concerns for relational

    problems, Experience of difficulty in relationship, Own

    vs. others' point of view). Moreover, one can observe

    that some thematic contents seem to be stable

    through the three sub-periods into which the

    period of therapy examined has been divided, in

    particular Differences in perspectives, Exchange of

    presents, Leisure, Experience of difficulty in

    relationship, and Own vs. others' point of view.

    If one considers that the period analysed consists

    of the last year of the three and a half years of a

    good-outcome therapy, this result lends itself to

    being interpreted as a marker of the positive evolution

    of the therapeutic dialogue: namely, of the fact

    that, in the final segment of the psychotherapy,

    patient and therapist have focused on the patient's

    more positive personal and relational experiences,

    leaving conflictual and problematic issues partially

    in the background. Needless to say, given the

    illustrative purpose of this analysis,

    such an interpretation has to be considered in merely

    descriptive terms, namely as a picture of the

    content of the clinical dialogue between Katja

    and her therapist which is consistent with the

    good outcome of the psychotherapy.

    As concerns the analysis of ACASM's validity,

    findings are consistent with both the hypotheses we

    subjected to test.

    Analysis 1. Evaluation of Similarity

    Results of analysis 1 highlight that ACASM provides

    a measure of the similarity of the units of text (in

    ACASM terms: ECUs) which is associated with the

    evaluation of thematic similarity provided by two

    blind coders with average experience in semantic

    analysis. More specifically, we have found a

    significant negative correlation between

    ACASM's measure of similarity between pairs of

    ECUs (Euclidean distance) and the human coders'

    evaluation of thematic similarity. Hence, ACASM's

    way of representing the relationship of (dis)similarity

    among the units of text tends to agree with that

    produced by human coders.
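The comparison can be illustrated with a small Python sketch that computes the Euclidean distance between the lemma profiles of paired ECUs and then a rank correlation with the human similarity ratings. Everything in it is hypothetical: the binary lemma vectors, the 1 to 5 ratings and the helper names are invented for illustration, the representation of ECUs as binary lemma vectors is an assumption, and so is the reading of rho as Spearman's rank correlation. A negative rho means that ECUs judged more similar by the coder lie closer together in the lemma space.

```python
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_ranks(values):
    """1-based ranks; tied values share the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical pairs of ECUs as binary lemma vectors, each with an invented
# human similarity rating (1 = very different, 5 = very similar).
ecu_pairs = [
    ([1, 1, 0, 0], [1, 1, 0, 0], 5),
    ([1, 1, 0, 0], [1, 0, 0, 0], 3),
    ([1, 1, 0, 0], [1, 0, 1, 0], 4),
    ([1, 1, 0, 0], [0, 0, 1, 0], 2),
    ([1, 1, 0, 0], [0, 0, 1, 1], 1),
]

distances = [euclidean(u, v) for u, v, _ in ecu_pairs]
ratings = [rating for _, _, rating in ecu_pairs]
rho = spearman_rho(distances, ratings)
print(f"rho = {rho:.3f}")
```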

    The level of correlation is not high for either of the

    comparisons (rho = -.125; rho = -.150); yet it is

    similar in magnitude to that between the two coders

    (rho = .162). As concerns this quite low level of

    the correlational indexes, we are led to think that it

    depends on two convergent factors. First of all, a

    role could have been played by the structure of the

    data. As observed, the distribution of the evalua-

    tion of similarity inevitably proved to have a

    limited variability and this has an inherent negative

    impact on the calculation of correlation. Secondly,

    the limited agreement between the two coders

    reflects the data-driven bottom-up logic of the

    task given to the coders. Each coder was asked to

    evaluate the thematic similarity between ECUs,

    without providing her/him further indications

    about the criterion of similarity which had to be

    used. Therefore, the coders' low level of agreement

    could reflect the multidimensionality of the seman-

    tic content: two utterances may be thematically

    similar from a certain point of view but different

    from many others. Take for example the following

    two sentences:

    We hope to be able to convince the readers of the utility of ACASM (1)

    We hope to be able to enjoy ourselves with ACASM (2)

    Now, if one considers them from the perspective of

    the fact that both of them concern a wish related to

    ACASM, they are quite similar; on the other hand, if

    one considers them from the point of view of the

    content of the desire, they can be considered quite

    different, with (1) oriented to a third party (the readers)

    and (2) to the subject of the sentence; moreover, (1)

    concerns the scientific evaluation of ACASM, (2) the

    use of it, and so on.

    Obviously, bottom-up methods of semantic ana-

    lysis can be endowed with constraints increasing the

    level of agreement among coders, in accordance with

    the specific aim of the analysis. However, the same

    can be done with ACASM, for instance through

    working on the choice of lemmas to be selected for

    analysis. Yet, given that extending the comparison

    with human coders to this further level of ACASM's

    functioning would have required a different design,

    according to the initial aim of this study, we have

    decided not to include these further constraints,

    limiting the analysis to the extent of potential

    agreement at the level of basic data-driven bottom-up

    analysis.

    Analysis 2. Classification Task

    Results from analysis 2 confirm the picture provided

    by analysis 1, from the complementary point of view

    concerning the task of classification. Here we have

    found that the agreement between the ACASM

    classification and the coders' classifications is of the

    same extent as the agreement among human coders

    who are expert in content analysis for psychosocial

    research. At the same time, however, analysis 2 also

    highlights how the extent of agreement among

    classifications, regardless of whether they are performed

    by human coders or ACASM, is rather low:

    from fair to moderate. This double finding requires

    some comments.

    Preliminarily, in order to appreciate it, one has to

    take into account the very large degrees of freedom

    associated with the task of classification at stake. As a

    matter of fact, the probability of ordering 70 ECUs

    (n) into 14 groups (g) of five items (k) is:

    $P(n,g,k) = \left(\frac{k!\,(n-k)!}{n!}\right)^{g} = \left(\frac{5!\,(70-5)!}{70!}\right)^{14} \approx 6.91 \times 10^{-100}$.

    This infinitesimal probability of chance agreement

    can be considered an assessment of the difficulty of

    semantic classification tasks, and in the current

    study the classification task is a rather simple

    example, compared to those usually addressed in

    semantic analysis. Thus, even if the level of agree-

    ment is not high in absolute terms, it is more

    appreciable as one takes into account that it has

    been reached in the context of a task having to deal

    with a very high level of uncertainty.
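The expression can be evaluated exactly with integer arithmetic. The sketch below takes the formula as given in the text, (k!(n-k)!/n!)^g, which equals (1/C(n,k))^g, and evaluates it for n = 70, g = 14 and k = 5; the function name `chance_agreement` is illustrative. Evaluating it directly gives a value on the order of 10^-100, the order of magnitude discussed above.

```python
from fractions import Fraction
from math import comb


def chance_agreement(n, g, k):
    """Evaluate (k! (n-k)! / n!)**g exactly; k!(n-k)!/n! equals 1 / C(n, k)."""
    return Fraction(1, comb(n, k)) ** g


# Values from the text: 70 ECUs, 14 groups, 5 ECUs per group.
p = chance_agreement(n=70, g=14, k=5)

print(f"C(70, 5) = {comb(70, 5)}")
print(f"P(70, 14, 5) is approximately {float(p):.2e}")
```

Using `Fraction` keeps the result exact until the final conversion to a float, which avoids underflow or rounding in the intermediate factorials.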

    Needless to say, the coders might have made some

    mistakes in classifying the ECUs; yet, given their

    level of expertise, measurement error could

    help only marginally in explaining the modest

    level of Cohen's K. Just as for the evaluation of

    similarity, in the classification task the partial diver-

    gence among coders also needs to be considered in

    the light of the multidimensionality of meaning.

    Texts do not hold a pre-established, fixed meaning;

    rather, they define the constraints within which the

    reader constructs the interpretation (Eco, 1979).

    Hence, no ECU has a single true meaning that could

    define normatively which is the right

    classification and, complementarily, qualify all the

    other classifications as errors. On the contrary, any

    unit of text is open to a multiplicity of interpretations.

    Consequently, the divergence among classifications

    that we have found depends on the fact that

    coders may classify the ECUs in accordance with a

    plurality of hermeneutic criteria, each of them

    grounded on a certain component of the meaning

    at stake and made pertinent by the coder's specific

    point of view and interpretative plan (Salvatore,

    2011). In sum, the moderate-fair level of agreement

    has to be considered in the light of the inherent

    interpretative autonomy of the coder. Anyway, we

    recognize that our results do not allow us to exclude

    the alternative interpretation, namely that the

    moderate-fair level of agreement (as well as the low level

    of correlation shown by analysis 1) is a matter of

    error of measurement. Further analyses are required

    for arriving at a conclusive statement on this point.

    From a complementary point of view, the similarity

    of the levels of agreement among the three pairs

    of coders provides food for thought. In order to

    interpret this aspect of results, one has to take into

    account that coders were asked to classify the ECUs

    in terms of commonsense (see section: Design). One

    can thus conclude that the convergence among

    coders reflects the fact that they share some implicit

    semantic criteria rooted in their common cultural-

    linguistic membership. Incidentally, the statement

    just made is not contradicted by the fact that the

    agreement documented by analysis 2 is only of

    moderate-fair extent. This is so because common-

    sense guides the interpretations through texts in a

    variable way: according to their semantic, syntactic

    and lexical characteristics, some units of text are

    more conventionalized (Bartlett, 1932) and hence more sensitive to

    the influence of commonsense, while others are less

    affected by this semantic attractor (Rommetveit,

    1992; Valsiner, 2007). To summarize, we consider

    the agreement between the classifications performed

    by the independent coders as the effect of the

    commonsense ground shared by the coders and as

    such guiding them to converge with each other. On

    the other hand, the intermediate extent of the

    agreement shows that this common ground put

    some constraint on the interpreters' autonomy (on

    the more conventionalized part of the text) but it

    did not cancel it.

    The homogeneity of the levels of agreement

    between the set of the three inter-coder comparisons

    and the set of the three coder-ACASM comparisons

    allows us to draw the following double conclusion.

    Firstly, as expected by hypothesis 2, the ACASM

    classification reaches a level of agreement with those

    carried out by human coders, which is consistent

    with the level of agreement the coders are able to

    reach with each other. In the final analysis, this

    means that, as the Turing-like criterion requires,

    an external observer blind to the nature of the

    classifier could not distinguish among the four

    classifications (the three provided by coders and

    the one by ACASM). Hence, analysis 2 shows that

    ACASM satisfies the Turing-like criterion as far as

    the classification task is concerned. Secondly, the

    level of agreement between ACASM and coders is

    comparable to the level of agreement that human

    coders reach with each other on the basis of the

    commonsense competence they share as members of

    a given cultural-linguistic community. Therefore,

    though this does not necessarily mean that ACASM

    performs the same job carried out by human coders

    on the basis of commonsense (i.e., computational

    equivalence), it means that it does a job at least

    quantitatively equivalent to that (i.e., functional

    equivalence).

    Methodological Limits of the Study

    Before concluding, some major limits of our study

    have to be underlined, for the sake of clarifying how

    the results discussed above have to be interpreted.

    Firstly, two issues concerning the design have to

    be highlighted. On the one hand, the comparison

    between human coders and ACASM is based on a

    non-random sample of units of analysis: we selected

    the units of text in accordance with the ACASM

    output, sampling the most representative ECUs for

    each cluster defined by the automated method. We

    adopted this mode of sampling in order to reduce a

    potential source of variability and focus the compar-

    ison on the parts of text that are more clearly and

    reliably interpretable from the perspective of

    ACASM output. On the other hand, for the sake

    of making the human and ACASM classification

    homogeneous, and therefore immediately compar-

    able, we asked coders to classify the units of text in

    the same number of classes as those produced by the

    automated method (14). We recognize that these two

    choices weakened the Turing-like criterion, because

    they made the terms of the comparison (i.e., the

    ACASM output and the human coders' performance)

    non-independent. Thus, even though the

    design adopted might have improved the reliability

    and power of the analysis, it did so at the cost of

    reducing its external validity: our study leaves open

    the question of whether the indistinguishability

    between the performance of human coders and

    ACASM would have been retained if a random

    sample of units of text had been used and no

    constraints had been put on the number of classes

    human coders adopt for the sake of classifying.

    Secondly, we compared ACASM and human

    coders just on the two basic functions of similarity

    and classification. Yet, human coders perform such

    functions on the basis of a preliminary operation of

    selection of the pertinent part of the text. In order to

    code, human coders first have to select certain parts

    of the text as relevant, thereby defining the units of

    analysis to be subjected to coding. And it is evident

    that the output of any semantic analysis strongly

    depends on how (in terms of which criteria) perti-

    nentization is carried out. For instance, according to

    the Narrative Process Coding System (NPCS;

    Angus, Levitt, & Hardtke, 1999; Angus & Hardtke,

    1994; Angus, Hardtke, & Levitt, 1996), coders

    take as the unit of analysis the thematic nuclei

    (according to the terminology of the method:

    content areas); once this construction of the unit of

    analysis has been performed, they code them in

    terms of narrative categories (External Narrative

    Process Sequences, Internal Narrative Process

    Sequences, and Reflexive Narrative Process

    Sequences). Likewise, consider methods like the Core

    Conflictual Relationship Theme (CCRT; Luborsky &

    Crits-Christoph, 1990) and the Innovative Moments

    Coding System (IMCS; Goncalves et al., 2009;

    Goncalves et al., 2010), whose systems of coding

    are applied only after the selection of the units of text

    considered pertinent (Narrative Episode in CCRT;

    Innovative Moments in IMCS). As concerns

    ACASM, it adopts a data-driven bottom-up proce-

    dure of pertinentization, as implemented by the

    method's step 1. According to this procedure, all

    the text is selected, and the pertinentization concerns

    the length of the segments of text. However, the

    non-selective, data-driven character of the ACASM

    procedure of pertinentization does not mean that it

    is a neutral operation. Rather, through its specific

    way of segmenting, ACASM constructs a peculiar

    version of the textual corpus (e.g., a partition of

    groups of sentences) as the object of coding: its

    thematic map cannot but reflect and move within the

    limits defined by such a version. Consequently, we

    have to conclude that the validity of our comparison

    between human coders and ACASM is limited to a

    model of a human coder adopting ACASM's

    version of text as object of coding. However, we do

    not consider this limitation a reason for invalidating

    the results of the current study. Meaning does not

    have its own length and place in the text: one may

    segment units of analysis at many gradients of

    length (words, sentences, groups of sentences, as

    well as larger partitions of texts) and will

    nonetheless create a version of text that is semantically

    interesting. Thus, there is no privileged way of

    pertinentizing: any system of coding entails a

    definition of the units of analysis, in accordance with its

    aim and theoretical framework as well as with the

    computational requirements for implementing it.

    Consequently, further studies have to verify whether

    the ability of ACASM to satisfy the Turing-like test

    within the constraints of the current study is the basis

    for the more general capability of ACASM to provide

    a thematic map that is meaningful in itself and usable

    for clinical purposes, in integration with other

    methods too. As concerns the latter point, we already

    have some promising evidence: Nitti, Ciavolino,

    Salvatore, & Gennaro (2010) have applied ACASM

    as the first phase of a more articulated method

    (Discourse Flow Analysis) aimed at analysing the

    way contents connect with each other within the

    communicational flow of the psychotherapy. In so

    doing, they were able to show that the way contents

    are related to each other changes through the

    psychotherapy process, and that this change is a

    valid marker, thanks to which one can discriminate

    the clinical quality of sessions.

    Conclusion

    This study has presented an automated method of

    data-driven bottom-up semantic analysis,

    ACASM, providing a first test of its validity. Results

    have shown that ACASM produces a meaningful,

    systematic map of the thematic content of verbatim

    psychotherapy transcripts, which is consistent with

    the one produced by expert human coders.

    Needless to say, this study is just a first step in the

    direction of ACASM validat