Metodo Automatizado Analisis Contenido Psicoterapia


This article was downloaded by: [Universidad De Concepcion]
On: 06 October 2014, At: 18:42
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Psychotherapy Research
Publication details, including instructions for authors and subscription information:

http://www.tandfonline.com/loi/tpsr20

Automated method of content analysis: A device for psychotherapy process research

Sergio Salvatore (a), Alessandro Gennaro (a), Andrea Francesco Auletta (a), Marco Tonti (a) & Mariangela Nitti (a)

(a) Department of Pedagogy, Psychology, and Teaching Science, University of Salento, Lecce, Italy

Published online: 16 Jan 2012.

To cite this article: Sergio Salvatore, Alessandro Gennaro, Andrea Francesco Auletta, Marco Tonti & Mariangela Nitti (2012) Automated method of content analysis: A device for psychotherapy process research, Psychotherapy Research, 22:3, 256-273, DOI: 10.1080/10503307.2011.647930

To link to this article: http://dx.doi.org/10.1080/10503307.2011.647930


Taylor & Francis makes every effort to ensure the accuracy of all the information (the Content) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions

Automated method of content analysis: A device for psychotherapy

process research

SERGIO SALVATORE, ALESSANDRO GENNARO, ANDREA FRANCESCO AULETTA,

MARCO TONTI, & MARIANGELA NITTI

Department of Pedagogy, Psychology, and Teaching Science, University of Salento, Lecce, Italy

(Received 4 October 2010; revised 4 November 2011; accepted 28 November 2011)

Abstract

The work presents a computer-aided method of content analysis applicable to verbatim transcripts of psychotherapy: the Automated Co-occurrence Analysis for Semantic Mapping (ACASM). ACASM is able to perform a context-sensitive strategy of analysis aimed at mapping the meanings of the text through a trans-theoretical procedure. The paper is devoted to the presentation of the method and testing its validity. To the latter end we have compared ACASM and independent blind human coders on two tasks of content analysis: (a) estimating the semantic similarity between two utterances; (b) the semantic classification of a set of utterances. Results highlight that: (a) ACASM's estimates of semantic similarity are consistent with the corresponding estimates provided by coders; (b) coders' agreement and coder-ACASM agreement on the task of semantic classification have the same magnitude. Results lead to the conclusion that the content analysis produced by ACASM is indistinguishable from that performed by human coders.

Keywords: qualitative research methods; technology in psychotherapy research and training; content analysis;

meaning

Introduction

Consistent with Freud's definition of psychotherapy as the "talking cure", psychotherapy process research has since its very beginning commonly focused on the communicative exchange unfolding within sessions. Many methods of process analysis have been

developed for investigating such an exchange (e.g.,

Colli & Lingiardi, 2009; Dahl, Kachele, & Thoma,

1988; Dimaggio & Semerari, 2004; Goncalves,

Matos, & Santos, 2009; Greenberg & Pinsof, 1986;

Luborsky & Crits-Christoph, 1990; Mergenthaler,

1996a; Perry, 1991; Salvatore, Gelo, Gennaro,

Manzo, & Al Radaideh, 2010). A good proportion

of these methods of process analysis is based on

verbatim transcripts of sessions, exclusively or together with other kinds of data (e.g., data concerning non-verbal behaviour). Consequently, the development of the efficacy and efficiency of methods of

textual analysis is worth considering as a major task

for psychotherapy process research (Mergenthaler,

1996b). This study intends to contribute to such

development, through the presentation of a bottom-

up automated method of content analysis of texts.

Semantic Analysis: Top-Down Versus

Bottom-Up Methods

The method presented in this study belongs to the

family of models focusing on the semantic level of

text (henceforth: semantic analysis). These methods are aimed at mapping the content of the text, namely

the meaning it conveys. Semantic analysis is essential

for psychotherapy process research. Psychotherapy is

an exchange of meanings (Angus & McLeod, 2004;

Dimaggio & Semerari, 2004; Hermans & Hermans-

Jansen, 1995; McNamee & Gergen, 1992; Salvatore

et al., 2010; Salvatore & Venuleo, 2008; Santos,

Goncalves, Matos, & Salvatore, 2009) and therefore

it is hard to consider deepening our understanding of

it without taking into account the content of what

patient and therapist say.

Within semantic analysis it is worth differentiating between top-down methods and bottom-up methods.

Top-down methods are based on pre-defined coding

systems according to which units of texts

are categorized. The Core Conflictual Relational

Theme (Luborsky & Crits-Christoph, 1990), the

Defence Mechanism Rating Scale (Perry, 1991), the

Correspondence concerning this article should be addressed to Alessandro Gennaro, University of Salento, Department of Pedagogy,

Psychology, and Teaching Science, via stampacchia, Lecce, 73100 Italy. Email: [email protected]

Psychotherapy Research, May 2012; 22(3): 256-273

ISSN 1050-3307 print/ISSN 1468-4381 online © 2012 Society for Psychotherapy Research

http://dx.doi.org/10.1080/10503307.2011.647930

Collaborative Interactions Scale (Colli & Lingiardi, 2009) and the Innovative Moments Coding System (Goncalves, Ribeiro, Mendes, Matos, & Santos, 2011; Goncalves, Ribeiro, Matos, Santos, & Mendes, 2010) are examples of top-down semantic methods. In general terms, they consist of a repertoire of categories of content working as a coding system and of a set of rules for applying the categories to the text. Bottom-up methods pursue the same aim of mapping the meaning of the text, but they do not

adopt a pre-defined coding system. Rather, like the

logic of Grounded Theory (Glaser & Strauss, 1967;

Rennie, 2000), these methods start from the text

and define the coding categories together with the

mapping of the textual content, through an iterative

interpretative procedure. Task analysis (Greenberg &

Pascual-Leone, 2001; Pascual-Leone, Greenberg,

& Pascual-Leone, 2009) is an example of this iterative

way of working. It starts from a set of theoretical

assumptions that are deliberately used for orienting

the extrapolation of sequences of events of change. In

turn, observed sequences can lead to the modification of the original theoretical assumptions and therefore

to further observations.

The Contextuality of Meaning: Implications for

Semantic Analysis

The meaning of a linguistic sign (a word, a sentence) is

inherently dynamic and contextual (Salvatore, 2011,

2012; Valsiner, 2007; for a discussion of this general

tenet in the field of psychotherapy, see Gennaro,

Al-Radaideh, Gelo, Manzo, Nitti, & Salvatore, 2010;

Greenberg & Pinsof, 1986; Salvatore et al., 2010; Salvatore, Gennaro, Auletta, Grassi, & Rocco, 2011). It

is not a fixed, pre-established content (e.g., an idea, an

image, a concept) held in the sign itself; rather, it

emerges from the way the linguistic signs combine with

each other in the contingency of the talk (Linell, 2009;

Salvatore & Valsiner, 2011; Wittgenstein, 1953/1958).

Thus, understanding the meaning of a sign a means mapping which other signs a occurs with, in the specific context of its use.

This pragmatic, dynamic and contextual definition of meaning provides a way to appreciate the inherent multidimensionality and fuzziness of meaning. In the concrete circumstance of communication, signs always occur within an array of connections with many other signs; therefore, meaning depends on how the interpreter selects some of these connections as pertinent, leaving others in the background.

In sum, meaning is not in the text, but in the

constructive, hermeneutic relationship between text

and interpreter.

Semantic analysis of text, therefore, cannot be performed by applying context-blind coding rules of the form "if the word x occurs, then content A has occurred"; rather, inferential reconstruction of the linguistic and/or extra-linguistic context of the text is required. In other words, the specific interconnections that words create within that particular text must be taken into account: word x in the context of its connection with words y and z means A, but in the context of its connection with words m and n it means B.

Thus far, automated procedures of semantic analysis have not proved able to take the contextuality of meaning into account effectively, and this has prevented the spread of this kind of procedure within psychotherapy research. As a result, the semantic methods adopted in psychotherapy research are currently based on human judgment. Yet, the use of human coders raises several methodological, metric and organizational problems that place a considerable constraint on the heuristic potentialities of this kind of method.

First of all, semantic analysis is usually very labour-intensive and time-demanding work: it requires time, people, and hours and hours of work. This hinders the possibility of generalizing the application of semantic methods across cases and researchers. We are led to consider the methodological fragmentation of contemporary process research as related to this constraint: the work required for developing the competence for applying a coding system, and for reaching a satisfying agreement among coders, entails a level of commitment that can often be expressed only by the group of researchers working on developing the coding system itself.

Secondly, the coders' inferences will always be endowed with an irreducible subjective valence, which cannot but have negative consequences for the level of reliability, and therefore for the semantic methods' power to reveal significant relationships. On the other hand, in the case of semantic analysis, the problem of reliability cannot be considered merely in terms of error of measurement; rather, it reflects the inherent multidimensionality of meaning: the variability among coders stems from the fact that the text is open to many different levels of interpretation. Consequently, increasing the reliability of semantic analysis requires clarifying and sharing the hermeneutic criteria according to which the coders reduce the multidimensionality of meaning. In this way a specific semantic map of the text is constructed. In accordance with this perspective, many efforts have been put into making the rules of coding clearer and more specific and forcing the coders to use procedures of consensual validation (Lambert & Ogles, 2009; Lutz & Hill, 2009); yet,


given the high level of inference inherently implied in these methods, these solutions cannot fully resolve the problem. And above all, they make the semantic methods even more labour-intensive and time-consuming.

The above considerations lead us to conclude that an alternative way is worth pursuing: the development of bottom-up procedures of semantic analysis based on explicit, invariant rules of coding and yet able to take the contextuality of meaning into account. Procedures of this kind would represent a highly significant contribution to the growth of psychotherapy process research. On the one hand, they would allow the automated implementation of the semantic analyses. On the other hand, they would provide a shared ground supporting and constraining the human inferential judgments that (at least to date) cannot be dispensed with, so as to increase the inter-coder agreement as well as the comparability among textual analyses.

Purpose of the Study and Hypothesis

This study intends to present an automated bottom-

up procedure of semantic analysis, Automated Co-

occurrence Analysis for Semantic Mapping

(ACASM), and to provide a first test of its validity.

ACASM constructs a map of the text in terms of the thematic nuclei active in it. It works through invariant, ostensible, yet context-sensitive procedures, defined in terms of computational algorithms. Due to these characteristics, ACASM's procedures are: (a) implementable through automated routines carried out by computer; (b) reliably reproducible across analyses and analysts; (c) able to produce a valid representation of the textual data (Lancia, 2002).

The current paper pursues two complementary

aims. First, the ACASM method is presented

together with an exemplification of its application

to a case of psychotherapy. Second, an initial

empirical test of ACASM validity is performed. As

concerns the latter point, we adopt a Turing-like

criterion of validity (for similar logic, see Rosenberg,

Schnurr & Oxmann, 1990; Steinbach, Karypis &

Kumar, 2000). Following this criterion, ACASM

could be considered a valid semantic method if and

only if the analysis it produces cannot be distinguished from those produced by expert human

coders. We adopted this criterion because in the

case of bottom-up semantic analysis it is not possible

to refer to an external, objective normative criterion

in accordance with which to evaluate the validity of

the analysis in absolute terms. Meaning is multidimensional and therefore any text permits many representations of its semantic content. Consequently, we assumed that in order that an automated

bottom-up procedure of semantic analysis could be

considered valid, such a procedure has to produce a

map of the text whose level of agreement with the

maps produced by expert coders is comparable with

the level of agreement that coders show with each

other.
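The comparison this criterion calls for can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Cohen's kappa is used as the agreement index because it is a common choice for categorical codings, but the passage above does not name the index, and the function names `cohen_kappa` and `agreement_summary` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two codings of the same units.

    Assumption: kappa stands in for whatever agreement index is actually used.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both raters assigned labels independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (observed - expected) / (1 - expected)


def agreement_summary(coders, machine):
    """Return (mean coder-coder kappa, mean coder-machine kappa)."""
    human = [cohen_kappa(a, b) for a, b in combinations(coders, 2)]
    mixed = [cohen_kappa(c, machine) for c in coders]
    return sum(human) / len(human), sum(mixed) / len(mixed)
```

Under the Turing-like criterion, the automated coding would count as valid when the second value returned by `agreement_summary` is of the same magnitude as the first.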

Our hypothesis is that ACASM passes the Turing-

like test of validity.

Method

ACASM's Conceptual Framework

ACASM is an example of a bottom-up method of

semantic analysis. This is so because it does not start

with a pre-established repertoire of thematic contents in accordance with which the units of analysis

are classified. Rather, the repertoire of thematic

contents working as a coding system is produced

by the analysis itself.

ACASM belongs to a set of methods focused on the co-occurrence of words (Carli & Paniccia, 2007; Lancia, 2002; Reinert, 1986), that is, the way the words combine with each other within the same unit of analysis into which the text is segmented (generally, the unit of analysis consists of an utterance or a group of a few utterances). The co-occurrence of words is taken as a criterion of similarity for clustering the units of text. That is, the units of analysis are clustered in accordance with the words co-occurring within them: units of text holding the same co-occurring words are considered similar and therefore grouped. The rationale is that a set of co-occurring words marks a specific thematic content (also called a thematic nucleon). Therefore, units having a certain set of co-occurring words in common share the thematic content marked by such a set. In this way, the procedure of semantic analysis is able to provide a fine level of semantic representation, coding each unit of analysis in terms of a specific content, namely the one marked by the set of co-occurring words according to which the unit has been clustered.

From a conceptual point of view, the reference to

co-occurrence of words within the same unit of

analysis can be considered a way of taking into

account the linguistic level of the contextuality of meaning, namely the level consisting of the way the words are combined within the text.

ACASM's Procedure of Analysis

ACASM is performed by means of invariant algorithms implemented automatically by ad hoc software (e.g., Alceste, T-LAB) on the basis of parameters of analysis established by the researcher.


We adopted the procedure implemented by the

software T-LAB (Lancia, 2002), in the version T-

LAB PRO_XL2. T-LAB PRO_XL2 is able to

analyse textual data of various languages (English,

Italian, Spanish, Portuguese, German).

ACASM is implemented through four steps, which take about 1 hour of work and can be performed by a single researcher (the size of the textual dataset affects the duration of the procedure only marginally).

Step 1. Segmentation of transcripts. ACASM works on the textual dataset (henceforth: corpus) as defined by the researcher in accordance with the aim of the study. The corpus may consist of the verbatim transcript of the patient's and/or therapist's talk, concerning all or only sampled sessions. ACASM divides the corpus into units of analysis, each of them called an elementary context unit (ECU). An ECU consists of a group of a few contiguous utterances.

The dividing of the text into ECUs has to find a

point of equilibrium between two requirements

dialectically linked to each other: interpretability

and specificity. On the one hand, the segments

have to be long enough to be interpretable in terms

of thematic content. On the other hand, the longer

the segments are, the greater the likelihood is that

each segment may not be associated with a specific

thematic content. The point of equilibrium between

interpretability and specificity is an empirical issue

(varying according to the language). After a series of trials and simulations, we have so far settled on the following criterion (for the English language): (a) each ECU begins with the character immediately following the last character of the previous ECU; (b) each ECU ends with the first punctuation mark (".", "!" or "?") occurring after the 250th character from the first character (i.e., punctuation marks occurring before the 250th character are not considered for closing the ECU); (c) in any case the ECU's length must not exceed 500 characters; therefore, the ECU ends with the last word remaining within this limit, even if no punctuation mark has occurred.
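Rules (a) to (c) can be expressed as a short Python routine. The following is a minimal sketch, not the T-LAB implementation: the function name `segment_into_ecus` and its parameters are hypothetical, and treating the last blank space inside the 500-character limit as the end of "the last word remaining within this limit" is an assumption.

```python
MIN_LEN = 250   # closing marks before this offset do not end an ECU (rule b)
MAX_LEN = 500   # ceiling on ECU length, in characters (rule c)
CLOSERS = ".!?"


def segment_into_ecus(text, min_len=MIN_LEN, max_len=MAX_LEN):
    """Split a transcript into elementary context units (ECUs)."""
    ecus = []
    start = 0                      # rule (a): each ECU starts where the last ended
    n = len(text)
    while start < n:
        hard_end = min(start + max_len, n)
        end = None
        # Rule (b): first closing mark after the min_len-th character.
        for i in range(start + min_len, hard_end):
            if text[i] in CLOSERS:
                end = i + 1
                break
        if end is None:
            if hard_end == n:
                end = n            # remaining text fits within the ceiling
            else:
                # Rule (c): cut after the last blank inside the ceiling,
                # so that no word is split (assumed reading of the rule).
                cut = text.rfind(" ", start, hard_end)
                end = cut + 1 if cut > start else hard_end
        ecus.append(text[start:end])
        start = end
    return ecus
```

Joining the returned ECUs reproduces the input text exactly, so the segmentation loses no characters.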

As one can note, the criterion is expressed in terms of characters. This is so because ACASM's algorithm adopts characters as its basic computational unit: the lexical units are defined as the strings of characters enclosed between two blank characters. Nevertheless, previous application of this criterion to psychotherapy transcripts (Salvatore et al., 2010) has shown that it leads to units of text endowed with semantic meaningfulness.

Step 2. Selection of the lexical forms and construction of the dictionary. Depending on its size, a textual corpus can hold several thousand lexical forms. Lexical forms play the role of variables in the ACASM procedure (see step 3). Consequently, it is necessary to reduce them to a number suitable for the constraints of the following multidimensional analysis (see step 4), which requires a reduction in the dispersion of the data matrix.

This task is performed through two sequential

sub-steps.

Firstly, the procedure singles out all the lexical

forms present in the text and categorizes them

according to the lemma they belong to. A lemma is

the citation form (namely, the headword) used in a

language dictionary to refer to a lexeme (i.e., a set of

word forms having the same lexical root and meaning). For example, word forms such as "go", "goes", "going" and "went" have "go" as their lemma; "child" and "children" have "child" as their lemma. The output of this sub-step is the list of lemmas present in the textual corpus.
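As a rough illustration of this first sub-step, the sketch below maps word forms to lemmas with a lookup table and counts the resulting lemmas. The tiny lexicon `LEMMA_OF` and the function `lemma_counts` are hypothetical; a real analysis relies on the full language dictionary shipped with the software, and stripping punctuation from the edges of each token is an assumption.

```python
from collections import Counter

# Hypothetical mini-lexicon; forms not listed are treated as their own lemma.
LEMMA_OF = {
    "goes": "go", "going": "go", "went": "go",
    "children": "child",
    "was": "be", "were": "be", "is": "be",
}


def lemma_counts(text, lexicon=LEMMA_OF):
    """Return a Counter of lemmas found in the text."""
    counts = Counter()
    for token in text.split():              # lexical units: strings between blanks
        form = token.strip(".,;:!?").lower()
        if form:
            counts[lexicon.get(form, form)] += 1
    return counts
```

For the sentence "She goes. He went; the children were going", the three forms of "go" are counted under a single lemma.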

The second sub-step is the selection of a subset of

lemmas within the list of lemmas. This sub-set

constitutes the dictionary the following analysis will

be based on. To this end, 10% of the whole list of lemmas is selected. Selected lemmas are the most frequent ones, yet the 5% highest-frequency lemmas are excluded from the ACASM dictionary. The exclusion is motivated by the fact that the higher the frequency of a lemma, the less it contributes to discriminating among the ECUs: high-frequency lemmas (e.g., words like "and", "to", "of") tend to be present in too many ECUs, and therefore enter too many patterns of co-occurrence. This criterion of exclusion has been determined through preliminary empirical work of approximation; however, it is consistent with the lexical-statistical logic grounding several methods of textual analysis (Bolasco, 1999).

It is worth noting that, because of the high frequency of the most commonly used words, the 10% of lemmas included in the ACASM dictionary corresponds to the level of coverage of the text considered acceptable in the literature, namely about 70-85% of the occurrences as a whole, depending on the size of the textual corpus (Bolasco, 1999; Lancia, 2002).
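The selection rule of the second sub-step can be sketched as follows. The text leaves some room for interpretation, so the sketch assumes one reading: lemmas are ranked by frequency, the top 5% of the list is skipped, and the next 10% of the list is kept. This reading is consistent with Table II, where 726 of 7258 lemmas enter the analysis, but it remains an assumption, as are the function name `select_dictionary` and its parameters.

```python
def select_dictionary(lemma_counts, keep_pct=10, skip_pct=5):
    """Pick the lemmas that form the analysis dictionary.

    lemma_counts: mapping lemma -> number of occurrences in the corpus.
    Assumed reading: skip the skip_pct% most frequent lemmas, then keep
    the following keep_pct% of the whole list.
    """
    # Rank by descending frequency; ties are broken alphabetically.
    ranked = sorted(lemma_counts, key=lambda lemma: (-lemma_counts[lemma], lemma))
    n_skip = round(len(ranked) * skip_pct / 100)
    n_keep = round(len(ranked) * keep_pct / 100)
    return ranked[n_skip:n_skip + n_keep]
```

With 7258 lemmas this returns 726 entries, matching the size of the dictionary reported for the Katja case.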

Step 3. Digital representation of the text. The reduction of the original text into ECUs and the identification of the lemmas active in the corpus allow the text to be transformed into a digital matrix representing the distribution of lemmas in ECUs (in binary terms: present/absent). The matrix has the ECUs in rows and the lemmas in columns; the value 1 in the generic cell x_ij represents the presence of the jth lemma in the ith ECU, the value 0 its absence (Table I).
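A minimal sketch of this presence/absence matrix, using the two-ECU example of Table I, could look as follows. The function name `ecu_lemma_matrix` is hypothetical, and the ECUs are assumed to be already lemmatized.

```python
def ecu_lemma_matrix(ecus_as_lemmas, dictionary):
    """Build the binary ECU x lemma matrix.

    ecus_as_lemmas: one list of lemmas per ECU (rows).
    dictionary: ordered list of lemmas retained for the analysis (columns).
    Cell [i][j] is 1 when lemma j occurs in ECU i, otherwise 0.
    """
    matrix = []
    for ecu in ecus_as_lemmas:
        present = set(ecu)
        matrix.append([1 if lemma in present else 0 for lemma in dictionary])
    return matrix


dictionary = ["I", "go", "home", "Kate", "be", "still", "there"]
ecus = [["I", "go", "home"], ["Kate", "be", "still", "there"]]
matrix = ecu_lemma_matrix(ecus, dictionary)
# matrix[0] is the row for "I went home", matrix[1] for "Kate was still there"
```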

Step 4. Identification of clusters of ECUs/co-occurring lemmas and classification of the ECUs. A Cluster Analysis (CA; Aldenderfer & Blashfield, 1984) is applied to the matrix. Incidentally, note that CA incorporates a preliminary procedure of Multidimensional Lexical Correspondence Analysis, which transforms the binary variables of the original data matrix into continuous classificatory dimensions. Cluster Analysis groups the ECUs using the co-occurrence of lemmas as the criterion of similarity: the higher the number of lemmas shared by two ECUs, the higher the probability that these two ECUs are grouped in the same cluster. Therefore, in the final analysis, each cluster obtained is a set of utterances (i.e., of ECUs) that share many lemmas. According to this criterion of similarity, ACASM considers a given cluster the marker of a thematic content which is active in the text and semantically characterizes the ECUs grouped in that cluster (see below, the section Semantic Interpretation of the ACASM Output). The number of clusters into which the text is segmented is defined by an iterative algorithm; the clustering procedure stops when further partitions no longer produce a significant improvement of the inter/intra cluster ratio, which means that increasing the number of clusters does not produce an appreciable increment of information.
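The grouping logic of Step 4 can be illustrated with a deliberately simplified stand-in. The sketch below does not reproduce the T-LAB procedure: it skips the correspondence-analysis stage, measures similarity with the Jaccard index on shared lemmas, merges clusters by average linkage, and replaces the inter/intra cluster ratio stopping rule with a fixed similarity threshold. The names `jaccard`, `cluster_ecus` and `min_similarity` are all hypothetical.

```python
def jaccard(a, b):
    """Share of lemmas two ECUs have in common (0 = none, 1 = identical sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_similarity(c1, c2, sets):
    """Mean pairwise Jaccard similarity between the ECUs of two clusters."""
    total = sum(jaccard(sets[i], sets[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def cluster_ecus(matrix, min_similarity=0.25):
    """Group rows of a binary ECU x lemma matrix; returns lists of row indices.

    Simplified stand-in: merging stops when the best available merge falls
    below min_similarity, not when the inter/intra cluster ratio levels off.
    """
    sets = [{j for j, v in enumerate(row) if v} for row in matrix]
    clusters = [[i] for i in range(len(sets))]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = average_similarity(clusters[a], clusters[b], sets)
                if s > best:
                    best, best_pair = s, (a, b)
        if best < min_similarity:
            break
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Each returned list holds the ECUs assigned to one cluster, so the same output also provides the classification of every ECU.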

A complementary output of the Cluster Analysis is the assignment of each ECU to the cluster with which it has the highest index of association. In this way, each ECU is marked with its most representative cluster, which stands for one of the thematic contents extrapolated by the Cluster Analysis. (Table III shows the most representative ECUs, in English translation, of the 14 clusters defined in the case analysed in the current study, together with their interpretation.)

Before concluding the presentation of the method, it is worth noting that although ACASM's computational rules (i.e., the operative criteria according to which the text is segmented, lemmatized and thematically clustered) are invariant, they can be modified in accordance with the aim of the researcher. For instance, a researcher interested in analysing patients' feelings concerning the marital couple could find it useful to distinguish two lemmas for any word denoting a feeling: one lemma for the word when associated with the marital couple and the other for the word when used outside such a domain.

Semantic Interpretation of the ACASM Output

The interpretation is provided by the researcher.

Since each cluster represents a subset of ECUs

sharing lemmas tending to co-occur in the same

utterances, it can be understood as a thematic

nucleon made up of a set of words whose aggregation

reflects the shared presence of certain semantic traits

(Lancia, 2005). It is worth noting that the words composing the set may have various kinds and degrees of semantic relationship among them (e.g., they may be synonymous, as in "much" and "a lot", antonymous, as in "good" and "bad", connected functionally, as in "car" and "trip", and so forth).

The interpretation of the content of the set is based

on the identification of such a network of semantic

relationships.

Characteristics of ACASM

Before concluding the presentation of ACASM, it is

worth pointing out three peculiar characteristics of

the method.

1. Though the process of human comprehension of texts is a highly debated issue (Kintsch, 1988; Landauer & Dumais, 1997; Visetti & Cadiot, 2002), in general terms one can assume that human bottom-up semantic analysis requires the implementation of two basic complementary functions. Firstly, semantic analysis consists of the evaluation of semantic similarity between the units of analysis (e.g., groups of words, utterances, groups of utterances, and so on) into which the text is segmented. Thus, utterances considered to have a similar semantic content are grouped together and this leads to the identification of a semantic/thematic nucleon. For instance, utterances concerning "trouble at work", "conflicts within the family" and "health issues" can be clustered in terms of their sharing of the content "undesirable, problematic events". On the other hand, semantic analysis implies an operation of categorization: utterances are attributed to the semantic nucleon that is the most representative of their content. ACASM performs the same two basic functions as human coders' bottom-up semantic analysis. It does so through context-sensitive computational rules, namely the multidimensional analysis of the distribution of co-occurrences across ECUs.

2. We do not claim that ACASM's parameters and computational rules are the same as those used by human coders. On this point we keep an open position, though some studies lead one to think that human comprehension of text is also based on computational rules similar to multidimensional analysis (Landauer & Dumais, 1997; Visetti & Cadiot, 2002). What we maintain is that, given their context-sensitiveness, ACASM's computational rules are functionally equivalent to human coders' procedures: ACASM reproduces the same basic functions (evaluation of semantic similarity and classification) as human coders' bottom-up semantic analysis.

3. ACASM is assumed to be functionally equivalent to a model of human bottom-up semantic analysis based on commonsense, namely to a human coder interpreting the textual content guided by no specific theoretical criterion, but based on the basic cultural and linguistic competence in terms of which she/he communicates, understands and interprets in daily life (Garfinkel, 1967; Valsiner, 2007).

Table I. A hypothetical example of digital representation of the text "I went home. Kate was still there" in terms of the ECU/lemma matrix

ECU / Lemmas            I   Go  Home  Kate  Be  Still  There
I went home             1   1   1     0     0   0      0
Kate was still there    0   0   0     1     1   1      1

Data Source

The present study concerns a sample of verbatim transcripts extracted from a good-outcome, Italian-language, 124-session psychotherapy (the Katja case). Katja received a Cognitive-Constructivist Therapy for Narcissistic Disorder (Dimaggio, Hermans, & Lysaker, 2010; Semerari, Dimaggio, Nicolo, Procacci, & Carcione, 2007). The treatment lasted three and a half years; according to several independent analyses Katja's therapy was considered a good-outcome therapy (for details, see the review proposed by Nicolo & Salvatore, 2007). The good outcome proved to be maintained a year later at follow-up (Dimaggio & Semerari, 2001). Analysis was performed on the transcripts of 48 sessions of the third and last stage of the psychotherapy (from session 74 to session 121, corresponding to the last year and a half of psychotherapy; note that the last three sessions were left out because subjects other than the therapeutic dyad took part in them). We decided to concentrate our analysis on just the last part of the psychotherapy because one can expect that the patient-therapist talk is subjected to a process of specialization in the use of words, namely certain combinations of words become progressively more probable while others become progressively less probable; empirical evidence supporting this hypothesis on the same case is provided by Salvatore, Tebaldi and Pot (2009). Therefore, given that our analysis is a first test of validity of the method, we preferred to focus on the last portion of clinical dialogue, where patterns of co-occurrences should be more differentiated and therefore more efficiently distinguishable in clusters.

Following a dialogical clinical approach (Gennaro

et al., 2010) the whole transcript of the sessions,

encompassing both patient and therapist talk, was

inserted into the analysis.

Design

Analysis of the thematic contents and their temporal evolution. First, we applied the ACASM procedure to the textual corpus and interpreted the resulting clusters in terms of their thematic content. Second, in order to take into account the temporal evolution of the thematic contents, we divided the period of therapy analysed into three sub-periods (sub-period A, sessions 74-89; sub-period B, sessions 90-105; sub-period C, sessions 106-121). The incidence of each thematic

Table II. Descriptive parameters of the textual corpus subjected to analysis (Katja case)

Descriptive parameters                                   Amount
Sessions                                                 48
Number of elementary context units (ECUs) of the text    5548
Number of elementary context units (ECUs) clustered      5054
Number of occurrences in the text (tokens)               146673
Number of lemmas in the text (types)                     7258
Number of lemmas in analysis                             726
Number of clusters extracted                             14




Table III. Katja cases ACASM output: clusters most representative ECUs, and their semantic interpretation

ECU Thematic interpretation

or so so then on Sunday morning, so see the time he poor thing was trying to give me that

freedom I took advantage of it in an extreme way but because he was giving me the freedom

exactly but in his opinion it was not permission from his view point he just took a decision yes

from my point of view, it was (74; 136.562)

1. Own vs. others' point of view

It's just that I have to be able not to care about it or at least I have to understand his Katja's view point both or if I understand my view point and the other person's view point then I say that the other person is right (83; 81.255)

Yes exactly, probably, and this is from your viewpoint, we know, we know each other quite well, and from this point of view you have to take the responsibility and if you look sincerely at your thoughts you can't think that it's the other person that have to notice it, here you have to discipline yourself, you know, you feel it, two points (79; 73.475)

I dont want to be balanced no, it s just that I also understand my viewpoint, now the difficulty

is understanding both of them, lets say mediating so as to perform actions that somehow are

good for yourself without making you feel too guilty or anyway, without knowing my

viewpoint (83; 54.308)

its clear that changing this perspective changes the way of seeing the defects of others, and of

yourself and all, and also the relationship and this is a change in the vision of yourself, of Katja

and of the relationship, and the general vision of your issues in other things, the vision of

yourself (75; 61.268)

2. Differences in perspectives

This inner torment continually between the choice and continuing to have that perspective which however was not confined to the view of other things but really of emotions and feelings linked to . . . and it's what we were saying about challenging the choice each two or three seconds because if I have another perspective linked to a different sensation (92; 24.638)

its not like buying a Ferrari because its one thing to buy a car and another to buy a Ferrari, in

sum between one million six hundred and the thirty million that a Ferrari costs there is a huge

difference, but between one million six hundred and thirty million which is the price of a car

theres also a difference, but its always less than that between a million six hundred and a

Ferrari (86; 24.107)

its true in the sense that as it were, your story is like that, but with dad I understand, yes I

agree, but what we were saying last time, its one thing if someone doesnt understand me, I

had some problems too, then when I had problems I wasnt able to explain myself, I mean,

when I explain myself, when I say one plus one equals two and then if you want to do as you

like its as she said (114; 23.364)

I mean I tried to explain some things to him, to tell him after that episode of the bloodiness

and so . . . I actually see that its all pointless, all quite pointless, but he insists on a specific

topic that is the daughter who needs to be treated, no, no, not on the topic of the daughter that

needs to be treated (109; 148.387)

3. Concerns for relational

problems

Yes, the couple doesnt work but not only the couple doesnt work, I dont work and neither

do you, that means it becomes a false happiness because children are absolutely are like pet

animals so they feel what happens (98; 92.132)

in those aspects you are highlighting, the relationship with your parents, now it seems silly,

you are highlighting some daughter-like aspects, that is, those are aspects related to

dependence, the car, upkeep, and obviously in this position you feel bad, as soon I m in this

position I cant stand it (94; 90.888)

Im calming down but if you assume a more stable identity people trust it too much and thus

you get bored because nothing happens (87; 88.162)

A. and I gave our parents some presents, for my mother we bought a pair of shoes, we gave the

same present each other, more or less the same gift . . .because between us we never give any

gifts. . . how come between Alberta and me? (75; 126.676)

4. Exchange of presents

his, mine that is because when I went to get it he was very kind, he said here it is, I wanted to give it to you for your birthday, it's an engagement gift, it's nice, he was very kind to try to connect the gift with what you are feeling (77; 110.891)

But silly example if someone knows that on my birthday I like to receive flowers, I tell you this the first year, the second year, do I have to tell you the third year? Or do I say please will you give me some flowers? Ah, what did you give them to me for? (98; 79.356)

in this hour we did some window shopping do you like this bag? I said, yes, nice, sure, well, I was thinking he was going to give me a present and so, you know, that was my birthday present (121; 68.016)

more than attacked I felt misunderstood but also not respected yes is it more important to be

not respected or to be misunderstood? Well, I think that its a result, that is, maybe

misunderstood in the sense that there is no effort to relate to someone else and therefore to

understand them and try to respect what has been understood (79; 50.439)

5. Experience of feelings in the

relationship




Table III (Continued)


You are imagining yourself in our work? And you are saying: Im imagining, lets do a meta-

thought, right, ok lets make a meta-thought and you are saying there is the risk that I could be

here, pay attention exactly to the way you described yourself before (118; 31.874)

moustache I think that I will need his help to continue my therapy, that is the therapy goes on

by itself, but we need his help for the therapy. Lets think how it could be presented, and then

actually also the sense of our proposal, where he can help us or where I could convince him to

come? (116; 23.848)

we are talking about a sensation of emptiness which you are describing eh but the silence of

last time in the way you described it to me may help me to understand. . .that is Im trying to

visualize the scene, the inner scene in this moment its more or less like this, there is an inner

feeling about something inside that, isnt there? (100; 16.484)

I mean even if I had some difficulties understanding those signals of attention, that doesn t

concern these episodes, its about something else, different circumstances, but not these

episodes even if I have some difficulties reading the attention signals that are given eh, anyway

I dont care about it (98; 88.81)

6. Experience of difficulty in the

relationship

guys eh of course broadly speaking your difficulty is to admit to yourself that you re involved

in a relationship, somehow it was really I still have difficulties, but if this is the difficulty, as I

said, have more (119; 88.767)

maybe its a difficulty related to being able to live inside the world of others, able to move, you

see? In the world, in general even without the relationship thats under way its a difficulty in

directing this energy which is anyway activated, that is it s somewhere (100; 83.576)

Im trying to check with you and with everyone and this is a difficulty that also belongs to you:

P: the difficulty is being afraid that the emotions could be too big to be constrained, controlled

or anyway felt, I dont know (118; 42.205)

Im leaving again and thats all, anyway on Wednesday I decided to take a day off because that

friend I study with, lucky her, passed the written part of the Police entrance exam to become

an officer and now she has the oral exam and she said obviously I want to become a

magistrate (111; 52.298)

7. Work activities

so I dont have any working identity simply because I dont work because Im doing that

public exam so its pointless because I have or I dont have difficulties, that is, the issue of

working identity doesnt exist, if it arises when Im working the difficulty or the limit will exist

but at the moment it doesnt (94; 43.806)

the following days I wanted to sleep in the morning but I couldnt either on Tuesday or

Wednesday because I had to go to work because there was something to do, so I didn t relax

on Tuesday or Wednesday but in the afternoon I did my stuff, I had a wax, I went out with a

friend of mine thats (89; 42.793)

another thing that I must say now I realized that I was led astray by you insisting so much on

making choices without following your nature, which meant working three times harder S but

visibly working hard for my magistrates exam I thought you were referring to that kind of

fatigue in the sense that its one thing to (96; 38.153)

Well, some things happened so you couldnt not link together the frame of mind with what

happened V well, but you could have connect the frame of mind to what happened in general

with your dad, which probably created your frame of mind, a basic sadness, yes but it was also,

you see, (104; 189.949)

8. Account of negative feelings

it was due to the suffering that I feel each time I meet or I hear my dad, the frame of mind it s

that frame of mind, I didnt cry because Im narcissistic, but anyway sometimes I shed a tear,

yeah, very often, but this is not the point because anyway what this meeting gave you (115;

125.425)

its a period that it seems to me that Ive been living with this struggle for ages, fighting,

improving myself, but I dont enjoy myself, Ive had enough! Its a drag, I get bored, laughter,

I dont understood your frame of mind K., Im sorry my frame of mind (79; 110.369)

because obviously Ill become bad, Ill be bad, Ill be bad, I dont know, I dont feel as if Im

bad, but at times when he says hes sorry, it seems to me that he expresses his upset, but others

see it as being bad (88; 100.742)

And so you also have to accept all the consequences on you, this lack of sensibility from your

dad, that is if I accept that others can make mistakes, I also accept, I dont understand I dont

think its so easy for you to accept that people in general could be more or less sensitive could

have a different degrees of sensitivity, why cant these ones? (114; 56.695)

9. Tolerance of negative feelings






I cant do it, I cant feel this connection, that is to do it mentally, I can do that, I can do it,

anyway you dont want to accept that its an aspect of you, yes but I accept it, but what would

it mean in concrete terms? Accepting that Im working _t hard, I accept that, I feel it, but I

cant understand which part is holding off at a distance, and at a distance from what? (87;

39.952)

Yes, its like saying that there is an aspect of loving, of taking care, I mean thats ok, accepting

that it has a limit, accepting that there is a degree of suffering on the part of someone else that

we cant do anything about, no? And accepting that a degree of guilt feeling, ok could I

suggest something you could read? (84; 37.911)

going into a closed agency has no sense because it makes me waste time that I could use

differently doing more interesting things, so yes it troubles me, ok, so you are telling me that

now you are able to manage your troubles in a more natural way, that is we can say (93;

32.784)

This? No ah, I dont have to but the first, the first moment _no, this Saturday and Sunday he

has An and I dont but Monday is a holiday yes Monday is a holiday and we will be together,

but it doesnt count as a weekend, Monday really I will not make it weigh, this Saturday and

Sunday he is with An, and Katja? (105; 72.368)

10. Leisure

anyway, its better than before, thats all, then its always the same struggle with money he

complained all winter and he still goes on I wont be able to have any holidays, my god, my

god, my god, but in fact he goes away every weekend, now he s leaving for ten days I dont

know where in the mountains and then maybe hell go to his relatives in France, down in YY

(97; 38.19)

alone, and yes laugh he is going to the gym, poor him, yes nice I like him, no it s something I

like, hes nice, then slowly slowly in the following days, in the following days, in the following

days uh. Wednesday and thats all because on Thursday weve decided to go away not to go

away (77; 29.37)

we had a weekend alone on 4 September because An went to OO to a child s birthday party,

so we now at 4 October, and October has 4 weekends and I say see if you can take one, good,

otherwise anyway Im living a life where I get up every day at 8.00 including Saturdays and

Sundays (102; 24.718)

ah, no no, so I had to phone to get information the day after this marriage was too much, the

day after the marriage we had to leave for two days in NN but the weather wasn t good on

Sunday and so the ferries werent leaving (97; 22.734)

relatively its _not that Im leaving but goodbye and thank you from a view point obviously I

mean goodbye and thank you its not directed to my parents, goodbye and thank you no, no of

course not the fact of being a daughter, not the fact of being daughter, of being maintained its

one of the aspects connected to being a daughter, but not the whole thing I imagine, no, its a

very important part of it (94; 94.047)

11. Adherence to others

expectations

there isnt a dialogue so its impossible and so he needs to be surrounded by people or

someone to say yes, yes, the partner or whoever say yes, of course he s smart, yes, hes good,

hes good, but whats hes good at? (103; 88.427)

in the sense that she has some shortcomings of her own but she is nice in the relationship

because she doesnt smother, she isnt, anyway shes good, while my dad isnt, not at all, so

hes there, so thats why I think _ that more or less_ there is a better balance form that point of

view, then obviously until I start working and earning, goodbye and thank you, well (94;

85.956)

my goals, so that if he wants to give me something yes, no, but more than good or bad which

anyway is all relative, its the fact that I felt at ease which is much harder than being good or

not, a person may be good but feel like this and thats a quality that, youve seen I think, that

comes quite naturally to me, no? (91; 77.94)

and also its hard, hard in the sense that everything is hard, a hard perspective the way it s

managed yes, my dad, then it depends on the period because when I want to see him often,

certainly in this period*continues less certain*the less I see him the better I feel, but I went

to his home, I saw the lights, I saw a whole difficult period with dad, (94; 134.537)

12. Refusal of dependency

maybe Im wrong or maybe its because Im used to it for so long, I never depended on you,

you depended on me the difference is maybe quite considerable if it will make you more feel

better in a few years you will depend on us, well be equal, I really dont think so, Id prefer to

shoot myself rather than depend on you (120; 108.038)

Also very long periods when I was happy. I was calm, I felt good with that person name period

and period, I can also forget name, I remember him, well, name, ok . . .. (98; 82.972)




content was calculated as the percentage of ECUs

associated with it.

Analysis of validity. We translated the Turing-like criterion of validity (see section Purpose of the Study and Hypothesis) into two complementary hypotheses, each concerning one of the two basic functions implemented by bottom-up semantic analyses: evaluation of similarity and classification (see section Characteristics of ACASM, point 1). More specifically, we expect to find that: (a) ACASM's evaluation of the similarity of ECUs is consistent with the evaluation of semantic similarity produced by blind expert human coders (hypothesis 1); (b) ACASM's classification of ECUs is consistent with the classifications that blind human coders provide on the basis of the ECUs' semantic content (hypothesis 2). Needless to say, hypothesis 2 is only exploratory, since it is expressed in terms of confirmation of the null hypothesis.

In order to test these two hypotheses, we subjected the ACASM output to the following two analyses, both based on judgments made by independent blind expert human coders.

Analysis 1. Association of ACASM's Assessment of Similarity and Human Coders' Measure of Semantic Similarity

The aim of this comparison is to estimate the consistency between the evaluation of the ECUs'



I was thinking of the wheel breaking, things more like this getting stuck in the middle of the

road yes yes, but also an aggression could happen no, no, usually those episodes can happen to

girls alone at night, I know usually*it happens I know, actually its always happening, Ive got

some girl friends who always get someone to take them home into the house, oh well, (76;

77.202)

its not important that its good for you or not but knowing that if it hurt you or it s good for

you its in your hands yes, it certainly is, but as I was saying before, its not that as you said

now it hurt you or its not good for you, its the same, its like two levels before, its not that

you dont know theres a level below a level, which, I mean, (89; 84.752)

13. Attitude towards the other

Yes, he is a good guy, he understands you, hes improved, he really loves you, but you, I mean,

are you in love or not? I dont answer myself and he doesnt give any answer and I dont

answer myself and if I want to give an answer yes but not yes, but it isnt an answer, its an

attempt at an answer (98; 73.914)

because they are not equal on me because for sure I feel better because for sure I know that

before the answer might have been dictated by an aggressive attitude and so the answer was

aggressive, and now its not like that anymore, that is, the answer is the answer and thats all,

that is its not linked to something that I do, its his choice and thus if one feels like that eh,

(102; 64.278)

that, its always what we said last time, that it seems obvious to me to make certain requests

and wait for certain answers and instead different ones arrive, I dont know, about the photo,

on the furniture, then uh (101; 52.244)

I still dont have it, ok, I wont read them, I wont read them if you want to wait it would be

better then I will also let you read other things that I m writing, probably, in fact I will also ask

you for advice about things, about as it were, about the relevance of what Ive written to what

youve experienced here (103; 93.922)

14. To communicate

obviously in some way the fact of being noticed the two things are exchanged, you wear

something of someone else no? The goal is to be noticed to write a story R: together but

belonging to someone else, not to me, that is I didn t have to write it was her that needed to

write and asked me please, please (87; 68.722)

because I need to be noticed because I have to write, I need to be noticed by that writer

because I have to write I dont know, a biography, or something, in sum he had to write

something and I was trying to say no but, (56.995)

And what did he write on the card, uh . . . he wrote well now I dont remember exactly the

words it was like you are my grumbling love but you are wonderful, Id never change you with

anyone elsethats all. I think it could correspond to reality, what was on the card? (89;

46.646)

Note. Translation from the original in Italian. The first number in parentheses indicates the session in which the ECU occurred; the second number measures how representative the ECU is of the corresponding cluster (chi-square metric).




semantic similarity by human coders and the evaluation of similarity provided by ACASM. To this end, we adopted the following five-step procedure.

First, we selected 70 ECUs: the five most representative ones from each of the 14 clusters defined by the application of ACASM to the corpus. As the criterion of representativeness we adopted the chi-square-derived parameter computed by the Cluster Analysis for each ECU (the output of the CA is reported in the Results section; cf. also Tables II and III). This parameter is based on the number of a cluster's words co-occurring within the ECU: the more of the cluster's words the ECU contains, the more representative the ECU is of that cluster.

Second, two blind coders (PhD students) with experience in content analysis for psychosocial research were separately asked to evaluate the semantic similarity of the 2415 pairs of ECUs produced by the combination of the 70 selected ECUs (each ECU was compared with all the others; the number of pairs is given by the formula k(k - 1)/2, where k = number of elements = 70; therefore, 70(70 - 1)/2 = 2415 pairs). Consistent with the commonsense criteria of coding (see section Characteristics of ACASM, point 3), we chose coders without clinical expertise and did not provide them with any specific, theory-oriented semantic rules or criteria for coding. The coders received 2 hours of preliminary training, aimed at clarifying the task. Moreover, coders were informed that the ECUs had been extracted from the verbatim transcript of a psychotherapy and were asked to use a 5-point Likert scale, from 1, indicating very different thematic content, to 5, meaning same thematic content. No further information on the aim of the task was provided to them; coders were blind to the ECUs' membership of ACASM clusters. The ECUs were presented in random order, the same for both coders. In this way, 2415 similarity judgments were obtained from each coder. It is worth noting that we did not implement any consensus procedure of the kind often adopted in semantic analysis to increase inter-coder convergence (e.g., Stiles, Elliott, Llewelyn, Firth-Cozens, Margison, Shapiro & Hardy, 1990). Thus, the comparison between ACASM and human coders is limited to the basic level of functioning of semantic analysis, that is, it does not encompass the post-coding process of increasing reliability.
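The pair count used in this step can be checked with a minimal Python sketch. The ECU identifiers below are placeholders introduced only for illustration (an assumption); they are not the identifiers used in the study.

```python
from itertools import combinations


def count_pairs(k: int) -> int:
    """Number of unordered pairs among k elements: k(k - 1)/2."""
    return k * (k - 1) // 2


# Hypothetical identifiers standing in for the 70 selected ECUs.
ecu_ids = [f"ecu_{i:02d}" for i in range(1, 71)]

# Every ECU is compared once with every other ECU.
pairs = list(combinations(ecu_ids, 2))

print(count_pairs(70), len(pairs))
```

Both the closed formula and the explicit enumeration should give the same number of judgments requested from each coder.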

Third, in order to make the matrix thus obtained suitable for parametric analysis, the Likert scores were transformed into a metric scale, following the procedure proposed by Ciavolino and Dahlgaard (2009), which is based on the probability associated with the relative frequency of each level of similarity.

Fourth, we calculated an ACASM rate of similarity for all the 2415 pairs of ECUs. To this end, we used the Euclidian distance as the ACASM measure of (dis)similarity between two ECUs. In order to understand this parameter, one has to consider that each ECU corresponds to a point in the multidimensional factorial space resulting from the multidimensional lexical correspondence analysis performed as the first step of the Cluster Analysis procedure (see above, Method section, ACASM step 4). The Euclidian distance is the metric distance between two points in this space. The closer the two points, the smaller the Euclidian distance and the more similar the ECUs they represent (Lancia, 2002). In formal terms, the distance between every pair of ECUs was calculated as:

$$d(P, Q) = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_n - q_n)^2} = \sqrt{\sum_{k=1}^{n} (p_k - q_k)^2}$$

with $P = (p_1, p_2, \ldots, p_n)$ and $Q = (q_1, q_2, \ldots, q_n)$ representing the coordinates in the n-dimensional factorial space of the two generic ECUs whose distance is computed. In our analysis, we used the first 10 factorial dimensions defined by the multidimensional lexical correspondence analysis applied to the corpus (i.e., n = 10).
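As an illustration of this distance, the following Python sketch computes it for two points on n = 10 dimensions. The coordinates are hypothetical values chosen for easy checking; they are not factorial coordinates taken from the Katja corpus.

```python
import math
from typing import Sequence


def euclidean_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Square root of the summed squared coordinate differences."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((pk - qk) ** 2 for pk, qk in zip(p, q)))


# Hypothetical coordinates of two ECUs on the first 10 factorial dimensions.
ecu_p = [0.0] * 10
ecu_q = [3.0, 4.0] + [0.0] * 8

print(euclidean_distance(ecu_p, ecu_q))  # 5.0
```

A distance of zero means that the two ECUs occupy the same point in the factorial space; larger values mean greater dissimilarity.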

Finally, we compared the values of the Euclidian distance (as ACASM's measure of dissimilarity) with the human coders' judgments of semantic similarity. The comparisons were performed on all the 2415 pairs of ECUs for each coder. Given the structure of the ECU sample (five ECUs for each of the 14 clusters), only 14 × 10 = 140 pairs (about 6%) join two ECUs from the same cluster, so most of the pairs of ECUs had a low level of similarity. Consequently, most of the 2415 pairs were rated 1 by both coders (coder A: point 1 corresponded to 91% of judgments; mean = 1.1085; SD = .37506; kurtosis = 18.216; skewness = 4.042; coder B: point 1 corresponded to 77.8% of judgments; mean = 1.3102; SD = .65622; kurtosis = 4.304; skewness = 2.203). For this reason, we adopted a nonparametric index of correlation, Spearman's rho.

According to the first hypothesis, we expect to find a significant negative correlation between the Euclidian distance and the average human coders' evaluations of similarity; the negative direction of the correlation is due to the fact that the Euclidian distance is a measure of dissimilarity rather than similarity. Moreover, we expect to find that this level of correlation is not distinguishable from the level of association between the two coders.
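A rank correlation of this kind can be sketched in plain Python as the Pearson correlation of average ranks, which handles the many tied ratings expected here. This is an illustrative implementation, not the software used in the study, and the distances and ratings below are hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: distances between ECU pairs and Likert similarity ratings.
distances = [0.2, 0.5, 1.1, 1.4, 2.0, 2.3]
ratings = [5, 4, 2, 1, 1, 1]

rho = spearman_rho(distances, ratings)
print(round(rho, 3))
```

With these made-up values, larger distances go with lower similarity ratings, so rho comes out negative, which is the direction predicted by the first hypothesis.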




Analysis 2. Level of Agreement Between ACASM and Human Coders' Classification of ECUs

This analysis is aimed at comparing the ACASM classification with those based on human interpretation of the semantic content of the ECUs. The analysis is based on the same set of 70 ECUs adopted for analysis 1. The ECUs were arranged in random order, the same for all the coders, to prevent their order of presentation from being related to cluster membership. Three blind coders, different from those involved in the previous analysis yet similar in level and type of competence (i.e., PhD students skilled in content analysis for psychosocial research and lacking clinical expertise), were separately asked to group the 70 ECUs into 14 groups of five ECUs on the basis of their thematic similarity. We prescribed 14 partitions in order to make the human coders' classifications directly comparable with ACASM's. In this case too, the coders were given 2 hours of preliminary training in order to make the task clear to them. Moreover, coders were informed that the ECUs had been extracted from the verbatim transcript of a psychotherapy. No further information on the aim of the task was provided to them; coders were blind to the ECUs' membership of ACASM clusters. As in analysis 1, no theory-oriented semantic criterion of classification was provided to coders, and no consensus procedure was implemented.

Finally, Cohen's K was calculated as a measure of inter-coder agreement among the four classifications (i.e., those carried out by the three coders and the one produced by ACASM); thus, we calculated six Cohen's K values: three comparing the coders with each other and three comparing each coder with ACASM.

According to the second operative hypothesis, we

expect to find that the level of ACASM-human

coders agreement is at least of the same degree as

the level of agreement between human coders.
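The agreement index used in this analysis can be sketched as follows. This is a minimal illustration under one explicit assumption: the group labels of the different classifications have already been matched to one another (the text does not describe how coders' groups were aligned with ACASM clusters). The rater names and labels are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Hashable, Sequence


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's K: chance-corrected agreement between two classifications."""
    if len(a) != len(b) or not a:
        raise ValueError("classifications must be non-empty and equally long")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    freq_a, freq_b = Counter(a), Counter(b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        return 1.0  # both raters used one single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of six units into three (already aligned) groups.
classifications = {
    "coder_1": [1, 1, 2, 2, 3, 3],
    "coder_2": [1, 1, 2, 3, 3, 2],
    "coder_3": [1, 1, 2, 2, 3, 3],
    "acasm": [1, 2, 2, 2, 3, 3],
}

# Four classifications give six pairwise comparisons.
kappas = {
    (r1, r2): cohen_kappa(classifications[r1], classifications[r2])
    for r1, r2 in combinations(classifications, 2)
}
print(len(kappas), round(kappas[("coder_1", "coder_2")], 3))
```

Three of the six values compare human coders with each other and three compare each coder with the automated classification, mirroring the design described above.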

Results

Interpretation of the Thematic Contents and

their Incidence

The application of ACASM to the corpus (cf. Table II for statistics describing it) produced 5548 ECUs and a list of 7258 lemmas, from which we sampled 726 lemmas, following the procedure described above (see step 2 of the ACASM procedure). Therefore, the Cluster Analysis (step 4 of the ACASM procedure) was performed on the matrix defined by 5548 ECUs (rows) × 726 lemmas (columns). The Cluster Analysis was able to group 5054 out of 5548 ECUs (91.095%; cf. Table II) and yielded 14 clusters as the optimal partition. A sample of the most representative ECUs for each cluster, together with the cluster's interpretation in terms of thematic content, is provided in Table III. Table IV shows the number of ECUs grouped in each cluster. Exchange of presents (10.74%), Differences in perspectives (9.36%), Adherence to others' expectations (8.71%), Leisure (8.43%) and Tolerance of negative feelings (8.23%) are the most frequent clusters/thematic contents, while the least frequent are Own vs. others' point of view (4.45%), Experience of difficulty in relationship (4.71%), Concerns for relational problems (5.05%) and Account of negative feelings (5.1%).
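The percentages reported for each cluster can be recomputed from the ECU counts in Table IV. The short Python sketch below does so; the dictionary keys are the cluster numbers of Table IV, and the variable names are illustrative.

```python
# Number of ECUs per cluster, keyed by cluster number (from Table IV).
cluster_counts = {
    1: 225, 2: 473, 3: 255, 4: 543, 5: 353, 6: 238, 7: 332,
    8: 258, 9: 416, 10: 426, 11: 440, 12: 380, 13: 405, 14: 310,
}

total_clustered = sum(cluster_counts.values())

# Incidence of each thematic content as a percentage of the clustered ECUs.
incidence = {
    cluster: round(100 * count / total_clustered, 2)
    for cluster, count in cluster_counts.items()
}

most_frequent = max(incidence, key=incidence.get)
least_frequent = min(incidence, key=incidence.get)
print(total_clustered, most_frequent, least_frequent)
```

The counts sum to the number of clustered ECUs given in Table II, and the largest and smallest shares correspond to cluster 4 (Exchange of presents) and cluster 1 (Own vs. others' point of view).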

The frequency of the 14 clusters changes significantly across the three sub-periods (chi-square = 132.684; df = 26; p < .001). Nevertheless, visual inspection of the distribution of the clusters shows that all clusters tend to be spread among the three periods, that is, they occur in every sub-period (cf. Figure 1).
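A test of this kind rests on a clusters × sub-periods contingency table. As a hedged sketch, the Python function below computes the chi-square statistic and its degrees of freedom for any such table; the small table used here is hypothetical, since the per-period counts are not reported in the text, and the p-value is deliberately left out.

```python
from typing import List, Sequence, Tuple


def chi_square_independence(table: Sequence[Sequence[float]]) -> Tuple[float, int]:
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals: List[float] = [sum(row) for row in table]
    col_totals: List[float] = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical 2 x 2 table, used only to illustrate the computation.
stat, df = chi_square_independence([[10, 20], [20, 10]])
print(round(stat, 3), df)

# A 14 x 3 table (14 clusters, 3 sub-periods) has (14 - 1)(3 - 1) = 26 df.
_, df_clusters = chi_square_independence([[1, 1, 1] for _ in range(14)])
print(df_clusters)
```

The second call only confirms that a table with 14 rows and 3 columns yields the 26 degrees of freedom reported above.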

Analysis of validity. As concerns analysis 1,

the ACASM measure of similarity (the Euclidian

Table IV. Partition of ECUs in the clusters

Cluster/Thematic content                           Number of ECUs   Percentage
 1. Own vs. others' point of view                             225        4.45%
 2. Differences in perspectives                               473        9.36%
 3. Concerns for relational problems                          255        5.05%
 4. Exchange of presents                                      543       10.74%
 5. Experience of feelings in the relationship                353        6.98%
 6. Experience of difficulty in relationship                  238        4.71%
 7. Work activities                                           332        6.57%
 8. Account of negative feelings                              258        5.10%
 9. Tolerance of negative feelings                            416        8.23%
10. Leisure                                                   426        8.43%
11. Adherence to others' expectations                         440        8.71%
12. Refusal of dependency                                     380        7.52%
13. Attitude towards the other                                405        8.01%
14. To communicate                                            310        6.13%




distance) and the human coders' judgments of thematic similarity were significantly correlated in both cases (ACASM-coder A: rho = -.125, p < .01; ACASM-coder B: rho = -.121, p < .01). The correlation between the two coders (coder A-coder B: rho = .162; p < .01) is of the same magnitude as the coder-ACASM correlations.

Table V shows the Cohen's K measures of inter-coder agreement concerning the classification of the 70 ECUs into 14 partitions (analysis 2). The magnitudes of K are quite similar among the six scores; all comparisons lie within the range 0.34–0.43 (according to Landis & Koch, 1977, this corresponds to a fair to moderate level of agreement). The levels of agreement between human coders and between human coders and ACASM substantially overlap: the average K for the agreement between coders is .383 (SD = .034); the average K for the agreement between human coders and ACASM is .378 (SD = .045). The highest K (.427) concerns the agreement between coder 3 and ACASM.
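The two averages can be recomputed from the six values in Table V. The sketch below assumes the sample standard deviation (n - 1 denominator), which is an assumption about how the reported dispersion was obtained.

```python
from statistics import mean, stdev

# Cohen's K values from Table V.
human_vs_human = [0.400, 0.407, 0.344]  # coder 1-2, coder 1-3, coder 2-3
human_vs_acasm = [0.338, 0.369, 0.427]  # coder 1, 2 and 3 against ACASM

for label, values in (("human-human", human_vs_human),
                      ("human-ACASM", human_vs_acasm)):
    print(label, round(mean(values), 4), round(stdev(values), 4))
```

Under that assumption, the recomputed means and standard deviations agree with the reported figures up to rounding in the third decimal place, and the difference between the two means is far smaller than either standard deviation.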

Discussion

ACASM has mapped the transcript's content in terms of 14 clusters, each of them interpretable in terms of thematic content. From a quantitative standpoint, all thematic contents prove to be specific, in the sense that every cluster encompasses only a limited portion of the therapeutic exchange (the most frequent thematic content covers about 10% of the classified text), but not marginal (no cluster represents less than about 4.5% of the classified text). Moreover, though the overall distribution of thematic contents has been shown to change significantly through time, all thematic contents are present in a

Figure 1. Distribution of the thematic contents in the three sub-periods of psychotherapy.

Table V. Cohen's K between coders and ACASM classification

            Coder 2   Coder 3   ACASM
Coder 1        .400      .407    .338
Coder 2                  .344    .369
Coder 3                          .427




non-marginal way in all three sub-periods. Taken together, these results lead us to conclude that each of the 14 thematic contents mapped by ACASM represents a systematic semantic area of the clinical exchange analysed, namely a line of discourse which is present in varying degrees across sessions but which runs through the whole treatment.

Interestingly enough, the most frequent thematic

contents concerns the account of positive circum-stances, associated with the patients experience of

relational engagement (Exchange of presents,

Differences in perspectives, Adherence to others

expectations, Leisure) and/or her inner states and

feelings (Tolerance of negative feelings), while the

least frequent refer to negative issues*in terms of

negative feelings (Account of negative feelings) or

relational disengagement (Concerns for relational

problems, Experience of difficulty in relationship, Own

vs. others' point of view). Moreover, one can observe

that some thematic contents seem to be stable

through the three sub-periods into which the

period of therapy examined has been divided, in particular Differences in perspectives, Exchange of

presents, Leisure, Experience of difficulty in

relationship, Own vs. others' point of view.

If one considers that the period analysed consists

of the last year of the three and a half years of good-outcome

therapy, this result lends itself to be

interpreted as a marker of the positive evolution

of the therapeutic dialogue, namely of the fact

that, in the final segment of the psychotherapy,

patient and therapist have focused on the patient's

more positive personal and relational experiences,

leaving conflictual and problematic issues partially

in the background. Needless to say, given the

exemplificative purpose of the analysis at stake,

such interpretation has to be considered in merely

descriptive terms, namely as a picture of the

content of the clinical dialogue between Katja

and her therapist which is consistent with the

good outcome of the psychotherapy.

As concerns the analysis of ACASM's validity,

findings are consistent with both the hypotheses we

subjected to test.

Analysis 1. Evaluation of Similarity

Results of analysis 1 highlight that ACASM provides

a measure of the similarity of the units of text (in

ACASM terms: ECUs) which is associated with the

evaluation of thematic similarity provided by two

blind coders with average experience in semantic

analysis. More specifically, we have found a

significant negative correlation between the

ACASM's measure of similarity between pairs of

ECUs (Euclidean distance) and the human coders'

evaluation of thematic similarity. Hence, ACASM's

way of representing the relationship of (dis)similarity

among the units of text tends to agree with that

produced by human coders.
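The logic of this comparison can be sketched in a few lines of Python: for every pair of ECUs, a Euclidean distance between their vector representations is set against a human rating of thematic similarity, and the two series are compared through a rank correlation. Everything in the sketch is an illustrative assumption: the four toy vectors, the ratings on a 1 to 5 scale, and the function names are invented, and the actual ACASM vectors are not reproduced here. A negative rho is expected, since greater distance should go with lower rated similarity.

```python
from itertools import combinations
from math import sqrt

def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical ECUs as lemma presence/absence vectors (invented data).
ecus = {
    "A": [1, 1, 0, 0],
    "B": [1, 1, 1, 0],
    "C": [0, 0, 1, 1],
    "D": [0, 1, 1, 1],
}
# Hypothetical human ratings of thematic similarity (1 = low, 5 = high).
ratings = {("A", "B"): 5, ("A", "C"): 1, ("A", "D"): 2,
           ("B", "C"): 2, ("B", "D"): 3, ("C", "D"): 4}

pairs = list(combinations(ecus, 2))
distances = [euclidean(ecus[a], ecus[b]) for a, b in pairs]
similarity = [ratings[pair] for pair in pairs]

rho = spearman_rho(distances, similarity)
print(f"rho = {rho:.3f}")
```

The toy data are deliberately clean, so the resulting rho is far stronger than the one observed in the study; the sketch shows only the direction of the expected association.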

The level of correlation is not high for either of the

comparisons (rho = .125; rho = .150); yet it is

similar to that between the two coders

(rho = .162). As concerns this quite low level of

the correlational indexes, we are led to think that it depends on two convergent factors. First of all, a

role could have been played by the structure of the

data. As observed, the distribution of the evalua-

tion of similarity inevitably proved to have a

limited variability and this has an inherent negative

impact on the calculation of correlation. Secondly,

the limited agreement between the two coders

reflects the data-driven bottom-up logic of the

task given to the coders. Each coder was asked to

evaluate the thematic similarity between ECUs,

without providing her/him further indications

about the criterion of similarity which had to be

used. Therefore, the coders' low level of agreement

could reflect the multidimensionality of the semantic

content: two utterances may be thematically

similar from a certain point of view but different

from many others. Take for example the following

two sentences:

We hope to be able to convince the readers of the utility of ACASM (1)

We hope to be able to enjoy ourselves with ACASM (2)

Now, if one considers them from the perspective of

the fact that both of them concern a wish related to ACASM, they are quite similar; on the other hand, if

one considers them from the point of view of the

content of the desire, they can be considered quite

different, with (1) oriented to a third party (the readers)

and (2) to the subject of the sentence; moreover, (1)

concerns the scientific evaluation of ACASM, (2) the

use of it, and so on.

Obviously, bottom-up methods of semantic analysis

can be endowed with constraints increasing the

level of agreement among coders, in accordance with

the specific aim of the analysis. However, the same

can be done with ACASM,

for instance, through working on the choice of lemmas to be selected for

analysis. Yet, given that extending the comparison

with human coders to this further level of ACASM's

functioning would have required a different design,

according to the initial aim of this study, we have

decided not to include these further constraints,

limiting the analysis to the extent of potential

agreement at the level of basic data-driven bottom-up

analysis.




Analysis 2. Classification Task

Results from analysis 2 confirm the picture provided

by analysis 1, from the complementary point of view

concerning the task of classification. Here we have

found that the agreement between the ACASM

classification and the coders' classifications is of the

same extent as the agreement among human coders

who are expert in content analysis for psychosocial

research. At the same time, however, analysis 2 also

highlights how the extent of agreement among

classifications, regardless of whether they are performed

by human coders or ACASM, is rather low:

from fair to moderate. This double finding requires

some comments.
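As a reminder of what the index measures, Cohen's K compares the observed proportion of agreement (p_o) with the proportion expected by chance given each classifier's category frequencies (p_e): K = (p_o - p_e) / (1 - p_e). The Python sketch below illustrates the computation on invented labels. The function name, the toy data, and the assumption that the categories of the two classifications have already been matched one-to-one are ours; they are not taken from the study.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's K for two classifications of the same units.

    Assumes both sequences use the same category labels, i.e. that the
    categories of the two classifiers have already been matched.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    # Observed agreement: share of units given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal proportions, per category.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1:
        raise ValueError("K is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)

# Invented example: six units sorted into three categories by two classifiers.
classifier_1 = [1, 1, 2, 2, 3, 3]
classifier_2 = [1, 1, 2, 3, 3, 2]

k = cohens_kappa(classifier_1, classifier_2)
print(f"K = {k:.3f}")
```

In this toy case four of the six units receive the same label (p_o = 4/6) while chance alone would yield p_e = 1/3, so that K falls well below the raw agreement.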

Preliminarily, in order to appreciate it, one has to

take into account the very large degrees of freedom

associated with the task of classification at stake. As a

matter of fact, the probability of ordering 70 ECUs

(n) into 14 groups (g) of five items (k) is:

$P(n,g,k) = \left(\frac{k!\,(n-k)!}{n!}\right)^{g} = \left(\frac{5!\,(70-5)!}{70!}\right)^{14} = 6.19 \times 10^{-100}$.

This infinitesimal value of probability of chance agreement can be considered an assessment of the difficulty of

semantic classification tasks, and in the current

study the classification task is a rather simple

example, compared to those usually addressed in

semantic analysis. Thus, even if the level of agree-

ment is not high in absolute terms, it is more

appreciable as one takes into account that it has

been reached in the context of a task having to deal

with a very high level of uncertainty.
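The order of magnitude of this probability can be checked with exact integer arithmetic. The Python sketch below reads the expression in the text as (k!(n-k)!/n!)^g, that is, the reciprocal of the binomial coefficient C(n, k) raised to the power g; this reading, like the function name, is our assumption.

```python
from fractions import Fraction
from math import comb, floor, log10

def chance_probability(n: int, g: int, k: int) -> Fraction:
    """Return (k! (n-k)! / n!)**g, i.e. (1 / C(n, k))**g, as an exact fraction."""
    return Fraction(1, comb(n, k)) ** g

# 70 ECUs (n) sorted into 14 groups (g) of five items (k).
p = chance_probability(n=70, g=14, k=5)

# Convert to float only to read off the order of magnitude.
magnitude = floor(log10(float(p)))

print(f"C(70, 5) = {comb(70, 5)}")
print(f"P is on the order of 10^{magnitude}")
```

Keeping the value as a Fraction avoids any loss of precision until the final conversion, which is used here only to confirm that the probability is vanishingly small.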

Needless to say, the coders might have made some

mistakes in classifying the ECUs; yet, given their

level of expertise, the error of measurement could

help marginally at best in explaining the modest

level of Cohen's K. Just as for the evaluation of

similarity, in the classification task the partial divergence

among coders also needs to be considered in

the light of the multidimensionality of meaning.

Texts do not hold a pre-established, fixed meaning;

rather, they define the constraints within which the

reader constructs the interpretation (Eco, 1979).

Hence, no ECU has a single true meaning, as

such able to define normatively which is the right

classification and, complementarily, to qualify all the

other classifications as errors. On the contrary, any

unit of text is open to a multiplicity of interpretations. Consequently, the divergence among classifications

that we have found depends on the fact that

coders may classify the ECUs in accordance with a

plurality of hermeneutic criteria, each of them

grounded on a certain component of the meaning

at stake and made pertinent by the coder's specific

point of view and interpretative plan (Salvatore,

2011). In sum, the moderate-fair level of agreement

has to be considered in the light of the inherent

interpretative autonomy of the coder. Anyway, we

recognize that our results do not allow us to exclude

the alternative interpretation, namely that the moderate-fair

level of agreement (as well as the low level

of correlation shown by analysis 1) is a matter of

error of measurement. Further analyses are required

for arriving at a conclusive statement on this point.

From a complementary point of view, the similarity

of the levels of agreement among the three pairs of coders provides food for thought. In order to

interpret this aspect of results, one has to take into

account that coders were asked to classify the ECUs

in terms of commonsense (see section: Design). One

can thus conclude that the convergence among

coders reflects the fact that they share some implicit

semantic criteria rooted in their common cultural-

linguistic membership. Incidentally, the statement

just made is not contradicted by the fact that the

agreement documented by analysis 2 is only of

moderate-fair extent. This is so because commonsense

guides the interpretations through texts in a

variable way: according to their semantic, syntactic and lexical characteristics, some units of text are

more conventionalized (Bartlett, 1932) and thus more sensitive to

the influence of commonsense, while others are less

affected by this semantic attractor (Rommetveit,

1992; Valsiner, 2007). To summarize, we consider

the agreement between the classifications performed

by the independent coders as the effect of the

commonsense ground shared by the coders and as

such guiding them to converge with each other. On

the other hand, the intermediate extent of the

agreement shows that this common ground put

some constraint on the interpreters' autonomy (on

the more conventionalized part of the text), but it

did not cancel it.

The homogeneity of the levels of agreement

between the set of the three inter-coder comparisons

and the set of the three coder-ACASM comparisons

allows us to draw the following double conclusion.

Firstly, as expected by hypothesis 2, the ACASM

classification reaches a level of agreement with those

carried out by human coders, which is consistent

with the level of agreement the coders are able to

reach with each other. In the final analysis, this

means that, as the Turing-like criterion requires,

an external observer blind to the nature of the classifier could not distinguish among the four

classifications (the three provided by coders and

the one by ACASM). Hence, analysis 2 shows that

ACASM satisfies the Turing-like criterion as far as

the classification task is concerned. Secondly, the

level of agreement between ACASM and coders is

comparable to the level of agreement that human

coders reach with each other on the basis of the

commonsense competence they share as members of




a given cultural-linguistic community. Therefore,

though this does not necessarily mean that ACASM

performs the same job carried out by human coders

on the basis of commonsense (i.e., computational

equivalence), it means that it does a job at least

quantitatively equivalent to that (i.e., functional

equivalence).

Methodological Limits of the Study

Before concluding, some major limits of our study

have to be underlined, for the sake of clarifying how

the results discussed above have to be interpreted.

Firstly, two issues concerning the design have to

be highlighted. On the one hand, the comparison

between human coder and ACASM is based on a

non-random sample of units of analysis: we selected

the units of text in accordance with the ACASM

output, sampling the most representative ECUs for

each cluster defined by the automated method. We

adopted this modality of sampling in order to reduce a

potential source of variability and focus the comparison

on the parts of text that are more clearly and

reliably interpretable from the perspective of

ACASM output. On the other hand, for the sake

of making the human and ACASM classification

homogeneous, and therefore immediately compar-

able, we asked coders to classify the units of text in

the same number of classes as those produced by the

automated method (14). We recognize that these two

choices weakened the Turing-like criterion, because

they made the terms of the comparison (i.e., the

ACASM output and the human coders' performance)

non-independent. Thus, even though the design adopted might have improved the reliability

and power of the analysis, it did so at the cost of

reducing its external validity: our study leaves open

the question of whether the indistinguishability

between the performance of human coders and

ACASM would have been retained if a random

sample of units of text had been used and no

constraints had been put on the number of classes

human coders adopt for the sake of classifying.

Secondly, we compared ACASM and human

coders just on the two basic functions of similarity

and classification. Yet, human coders perform such

functions on the basis of a preliminary operation of selection of the pertinent part of the text. In order to

code, human coders first have to select as relevant

some parts of the text, thereby defining the units of

analysis to be subjected to coding. And it is evident

that the output of any semantic analysis strongly

depends on how (in terms of which criteria) pertinentization

is carried out. For instance, according to

the Narrative Process Coding System (NPCS; Angus,

Levitt, & Hardtke, 1999; Angus & Hardtke,

1994; Angus, Hardtke, & Levitt, 1996), coders

assume as unit of analysis the thematic nuclei

(according to the terminology of the method: content

areas); once this construction of the unit of

analysis has been performed, they code them in

terms of narrative categories (External Narrative

Process Sequences, Internal Narrative Process Sequences,

and Reflexive Narrative-Process Sequences).

Consider also methods like the Core Conflictual Relationship Theme (CCRT; Luborsky &

Crits-Christoph, 1990) and the Innovative Moments

Coding System (IMCS; Goncalves et al., 2009;

Goncalves et al., 2010), whose systems of coding

are applied only after the selection of the units of text

considered pertinent (Narrative Episode in CCRT;

Innovative Moments in IMCS). As concerns

ACASM, it adopts a data-driven bottom-up procedure

of pertinentization, as implemented by the

method's step 1. According to this procedure, all

the text is selected, and the pertinentization concerns

the length of the segments of text. However, the non-selective,

data-driven character of the ACASM procedure of pertinentization does not mean that it

is a neutral operation. Rather, through its specific

way of segmenting, ACASM constructs a peculiar

version of the textual corpus (e.g., a partition of

groups of sentences) as the object of coding: its

thematic map cannot but reflect and move within the

limits defined by such a version. Consequently, we

have to conclude that the validity of our comparison

among human coders and ACASM is limited to a

model of a human coder adopting ACASM's

version of text as object of coding. However, we do

not consider this limitation a reason for invalidating

the results of the current study. Meaning does not

have its own length and place in the text: one may

segment units of analysis at many gradients of

length (words, sentences, groups of sentences, as

well as larger partitions of texts) and will nonetheless

create a version of text that is semantically

interesting. Thus, there is not a preferential way of

pertinentizing: any system of coding entails a definition

of the units of analysis, in accordance with its

aim and theoretical framework as well as with the

computational requirements for implementing it.

Consequently, further studies have to verify whether

the ability of ACASM to satisfy the Turing-like test within the constraints of the current study is the basis

for the more general capability of ACASM to provide

a thematic map that is meaningful in itself and usable

for clinical purposes, in integration with other

methods too. As concerns the latter point, we already

have some promising evidence: Nitti, Ciavolino,

Salvatore, & Gennaro (2010) have applied ACASM

as the first phase of a more articulated method

(Discourse Flow Analysis) aimed at analysing the




way contents connect with each other within the

communicational flow of the psychotherapy. In so

doing, they were able to show that the way contents

are related to each other changes through the

psychotherapy process, and that this change is a

valid marker, thanks to which one can discriminate

the clinical quality of sessions.

Conclusion

This study has presented an automated method of

data-driven bottom-up semantic analysis,

ACASM, providing a first test of its validity. Results

have shown that ACASM produces a meaningful,

systematic map of the thematic content of verbatim

psychotherapy transcripts, which is consistent with

the one produced by expert human coders.

Needless to say, this study is just a first step in the

direction of ACASM validat
