
    THE STATISTICAL CHARACTERISTICS OF EVENT DATA

    Philip A. Schrodt

    Department of Political Science

    University of Kansas

    Lawrence, KS 66045

    voice: 913-864-3523

    fax: 913-864-5208

    Internet: [email protected]

    August, 1991

    revised February 1994

    An earlier version of this paper was presented at the International Studies Association,

    St. Louis, March 1988. This research was supported in part by National Science

    Foundation Grant SES89-10738. My thanks to Hayward Alker for comments on an

    earlier version of this paper.


    ABSTRACT

    This paper explores event data as an abstract statistical object. It briefly traces the

    historical development of event data, with particular attention to how nominal events

    have come to be used primarily in interval-level studies. A formal definition of event

    data and its stochastic error structure is presented. From this definition, some concrete

    suggestions are made for statistically compensating for misclassification and censoring

    errors in frequency-based studies. The paper argues for returning to the analysis of 

    events as discrete structures. This type of analysis was not possible when event data

    were initially developed, but electronic information processing capabilities have

    improved dramatically in recent years and many new techniques for generating and

    analyzing event data may soon be practical.

    Key words: Event data, statistical analysis, machine coding


    INTRODUCTION

    Event data — nominal or ordinal codes recording the interactions between

    international actors — are one of the most common types of information used in

    quantitative international relations research. McGowan et al (1988) note that the

    CREON, WEIS and COPDAB event data sets are collectively comparable in numbers of

    citations to the World Handbook1, itself second only to the Correlates of War data set.

    The aforementioned event data sets plus the event data portions of the World Handbook

    and Gurr data sets account for 41% of all institutional requests for international relations

    data sets from the ICPSR (McGowan et al, 1988: 111). The advantages and

    disadvantages of event data have been extensively discussed elsewhere (e.g. Andriole

    and Hopple 1984; Azar and Ben-Dak 1975; Burgess and Lawton 1972; Gaddis 1987;

    Howell et al 1986; International Studies Quarterly 1983; Laurance 1990; Merritt 1987;

    Merritt, Muncaster and Zinnes, 1993; Munton, 1978; Peterson 1975; Schrodt 1994) and

    while the approach is not without problems, there is every reason to believe these data

    will be in use for some time to come.

    Event data come in a variety of forms and coding schemes. The most widely cited

    sources are Azar's (1982) COPDAB (Conflict and Peace Data Bank) and McClelland's

    WEIS (World Events Interaction Survey). Both are comprehensive: they attempt to

    code all interactions by all states and some non-state actors for a period of time.

    In contrast, specialized event data sets such as Hermann's (1973) CREON (Comparative

    Research on the Events of Nations) and Leng's (1987) BCOW (Behavioral Correlates of 

    War) focus on specific subsets of behavior, foreign policy and crises respectively. A

    variety of domestic and international event data collections, usually focusing on a

    limited set of actions such as uses of force, domestic violence, or changes of

    government, are embedded in other data sets such as Rummel's DON (Dimensionality

    of Nations), the World Handbook , and various Gurr data sets2. In addition to the data

    sets in the public domain, event data have also been collected by governmental


    agencies such as the Department of Defense and various intelligence agencies in the United

    States (see Andriole and Hopple, 1984; Hopple, 1984; Daly and Andriole 1980; Laurance

    1988) and private consulting firms such as CACI Inc. and Third Point Systems (Howell

    et al 1986).

    Despite their prevalence in contemporary quantitative studies, event data are odd

    statistical objects: they are nominal random variables occurring at irregular intervals

    over time subject to non-random selection bias. The conventional statistical repertoire

    has almost no techniques explicitly designed for such data and, as Achen (1987) points

    out, virtually no original statistical work has been undertaken to fill these gaps. As a

    consequence, event data are commonly aggregated and then analyzed using interval-

    level statistical techniques.

    This article will provide an overview of the statistical issues inherent in event data

    and provide some suggestions as to how those issues might be addressed. The

    discussion is necessarily somewhat abstract, and as such differs substantially from most

    of the event data literature, which tends to emphasize concrete issues such as source

    biases in the New York Times, whether there is a cooperation-conflict dimension, and

    methods of improving intercoder reliability3. In contrast, I focus on the theoretical

    what, not the how, of event data generation and analysis. I assume a familiarity with the

     basic concepts and implementations of event data and basic statistical terminology; the

    discussion is largely oriented towards the two comprehensive event data collections,

    WEIS and COPDAB, though most is also relevant to specialized data sets such as

    CREON and BCOW.

    STATISTICAL ANALYSIS IN THE EARLY DEVELOPMENT OF EVENT DATA

    The development of event data sets is well documented: Merritt (1987) provides an

    excellent overview; Azar, Brody and McClelland (1972) provide a series of papers

    coming out of Azar's Michigan State University event data conferences during 1969 -

    1971. The early theoretical development of WEIS is thoroughly discussed in a series of 


    papers by McClelland (1967a, 1967b, 1968a, 1968b, 1969, 1970)4; Azar's early

    development of COPDAB is also fairly well documented (see, for example, Azar and

    Ben-Dak 1975; Azar et al 1972; Azar and Sloan, 1975).

    In these early discussions, neither the statistical characteristics of event data nor the

    formal specification of the models for which the data were appropriate is commonly

    mentioned. Instead, most of the discussion focused, as it does today, on issues such as

    the definition of an event, coding reliability, event categories and coding scheme, and

    source bias and language problems. Issues of statistical modeling are almost always

    implicit rather than explicit in the event data literature. As Achen (1987) notes, this

    distinguishes event data research from most of the statistical social sciences (e.g.

    econometrics and survey research) where the statistical characteristics of data have

     been exhaustively specified and researched.

    McClelland (1970) provides one of the earliest and most theoretically sophisticated

    discussions of the role of events in the study of international behavior. For McClelland,

    event data form a bridge between the then-prevalent general systems theories of 

    international behavior, and the empirical basis of our understanding of that behavior

    through textual history.

    ...International conduct, expressed in terms of event data, is the chief dependent

    variable of international relations research. ... It is interesting that a starting point is

    provided as readily by the ordering principle of classical diplomatic history as by the

     basic concepts of general system analysis. Thus, we may assert that the prime

    intellectual task in the study of international relations is to account for actions and

    responses of states in international politics by relating these to the purposes of 

    statecraft or, alternatively, we can say that the problem is to account for the

    relations among components of the international system by analyzing the

    characteristics of the various components of that system by tracing recurring

    processes within these components, by noting systematically the structure and


    processes of exchange among the components, and by explaining, finally, the

    linkages of within-component and between-component phenomena. Obviously the

    classical definition of diplomatic history is less ponderous and more literary than the

    general system definition of the task but both, as we shall seek to show next, carry

    about the same information and involve nearly the same range of choices of inquiry

    and analysis. (McClelland, 1970:6)

    McClelland did not adopt the approach of diplomatic history completely. He would

    explicitly reject the utility of motivational information, which is necessarily inferred, and

    also reject the "Ranke tradition of painstaking research and recovery of all extant

    evidence" (McClelland, 1968b:163) in favor of focusing on overt actions recorded in asingle source. However, McClelland accepted most of the standards of theory and

    evidence from diplomatic history as a starting point and wanted to supersede these

    with the more rigorous and systematic approaches of general systems theory.

    McClelland, in short, was proposing a Lakatosian, rather than Kuhnian, transition from

    traditional to behavioral models; an evolution rather than a revolution.

    In McClelland's assessment, however, this transition failed. After some years of 

    work with event data focusing on several crises, he concluded:

    It proved relatively easy to discern event patterns and sequences intuitively. We

    found we could follow the successions of action and response in flow diagram form.

    Stages of crisis and the linkage of event types to temporary status quo  situations

    also were amenable to investigation. We were defeated, however, in the attempt to

    categorize and measure event sequences. This was an early expectation that was

    disappointed by the data which showed too few significant sequences to support

    quantitative or systematic treatment. (McClelland, 1970:33)

    As a consequence of this failure, McClelland's "World News Index" project, published in

    the mid-1970s, used continuous variables in its measures. With the hindsight of two


    decades, the failure of a discrete event approach appears due to a paucity of data and

    processing capability. McClelland writes of analyzing hundreds or at most thousands

    of events; a contemporary event data researcher has available hundreds of thousands

    of events and computer power sufficient to work with millions.

    After this early definition of international politics as event sequences, the field of

    quantitative IR moved rapidly to analyzing events with interval-level techniques. This

    change was probably due to the general shift in quantitative IR in the late 1960s away

    from historical approaches towards theories based on the model of the physical sciences

    and economics. By the 1970s a Kuhnian split was underway in international relations

    with the traditionalist and behavioralist camps proudly speaking totally different

    languages, whereas when McClelland's 1961 World Politics article was written, this split

    was not apparent.  For example, Rummel (1972) proposes a science of international

    politics similar to meteorology, using interval-level metaphors such as field theory and

    interval-level techniques such as factor analysis. Azar, while using WEIS as the model

    for COPDAB, abandons McClelland's nominal categories in favor of an interval-level

    measure and approaches coding as a scaling problem. Azar and Sloan (1975) consists

    entirely of interval-level data and Azar (1980:150) emphasizes

    quantitative aggregations, called here 'analytic data', [which] are summaries of the

    weighted frequencies of interactions. They describe the amount of conflict or

    cooperation exchanged between or within nation-states over some unit of time.

    This conversion of discrete entities to interval-level data is somewhat puzzling from

    a statistical standpoint. The reason probably is due as much to paradigmatic

    developments in quantitative international relations as to the nature of the data. During

    the 1970s, data-based studies of international behavior saw the ascendancy of 

    correlational analyses, particularly regression. The mathematics behind these

    techniques had been fully developed by econometricians and could be easily applied to


    international relations data using SPSS and other statistical packages; comparable tools

    were not available for sequence analysis. The successful formal theories were

    continuous-variable models such as the Richardson arms race model and DYNAMO-

    like global models. Rational choice models preserved some discrete variables,

    particularly in game theory, but even these models used continuous variables in

    expected utility calculations5.

    The emphasis on crisis, initiated by McClelland and expanded in a large event-based

    crisis management literature (e.g. Hopple, Andriole and Freedy 1984; Azar et al 1977)

    also contributed to the emphasis on interval-level variables. Implicit in most of the

    crisis models is either a simple distinction between crisis and non-crisis or a

    unidimensional ordinal set of "steps" to crisis (e.g. Hopple, 1984). Crisis forecasting is

    reduced to a problem of monitoring some activity—usually some aggregated

    measure—to ascertain when the system is going to change states. The crisis literature

    also frequently emphasizes the concept of the "intensity" of events, another interval

    measure.

    A STATISTICAL DEFINITION OF EVENT DATA

    The study of the statistical characteristics of event data must start with a formal

    definition of those data in mathematical terms; this section will provide one such

    definition. I will also deal with the issue of what an event coding system seeks to

    achieve.

    International events are reported through natural language text. Coders work in

    the peace and quiet of an office creating data from secondary sources, rather than

    coding from direct experience6. Between the occurrence of an international event and

    the datum being entered into a computer, information passes through a textual phase.

    These text strings form a very large—but still quite finite—set S , the set of all

    possible texts one might conceivably code as events. S is not the same for all event

    coding schemes: an event in BCOW is not necessarily an event in COPDAB. Every


    event data coding scheme has a finite set C of discrete event codes. In COPDAB these

    are the integers 1 to 15 combined with the issue and area codes; in WEIS they are the 63

    WEIS nominal categories. The set C is dramatically smaller than the set S.

    An event data coding scheme is simply a function E that maps text strings into

    codes:

    E : S -> Pow(C)

    where Pow(C) is the power set of C (all possible subsets of C)7. Because S is very large,

    the function E is not explicitly specified in human-coded data. E is instead implicit in the

    coding manual and training of the coders; each coder should be meticulously trained to

    emulate the ideal E function. In a machine-coding system, E is explicitly specified in the

    coding algorithm.
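
    To make the definition concrete, the following sketch implements a toy version of E in
    Python. The codes and phrase patterns are hypothetical stand-ins for a real scheme such
    as WEIS; an actual machine-coding system would use a far larger dictionary and a parser
    rather than simple substring matching.

    # A minimal sketch of an event coding function E : S -> Pow(C).
    # The codes and patterns below are hypothetical, not a real scheme.
    CODES = {"ACCUSE", "PROTEST", "PROMISE", "AGREE", "FORCE"}

    PATTERNS = {          # implicit specification of E as a dictionary
        "accus": "ACCUSE",
        "protest": "PROTEST",
        "promis": "PROMISE",
        "agree": "AGREE",
        "troops": "FORCE",
    }

    def E(text: str) -> set[str]:
        """Map a text string to a subset of C; the empty set means
        'this is not an event' (the null-set filter of footnote 7)."""
        lowered = text.lower()
        return {code for stem, code in PATTERNS.items() if stem in lowered}

    print(E("Baghdad accuses Kuwait of overproducing oil"))  # {'ACCUSE'}
    print(E("Weather mild over the Gulf"))                   # set(): a nonevent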

    Text strings are also mapped into a set of actors and targets. This set and the

    coding rules relevant to it may differ substantially between coding schemes, particularly

    when dealing with quasi-international actors such as national liberation movements.

    While generally the problems of identifying actors are less difficult than those involving

    events, a similar implicit function exists for actor coding, and the statistical problems are

    similar.

    A specific event data set is dependent on the choice of the source text and the choice

    of a particular coding system; these, in turn, are determined by the theory or set of 

    theories that a researcher wishes to explore. The purpose of event data research is to

    find statistical regularities and this differentiates the event data approach from the

    diplomatic history approaches advocated by Gaddis (1987). If one is not interested in

    statistical regularities—and Gaddis (1987), for example, clearly is not—then event data

    are useless, but in statistical studies, the approach can be quite serviceable.

    To use an analogy, the works of Tom Wolfe, Norman Mailer and Hunter S.

    Thompson provide one approach to understanding life in the United States circa 1970;

    the 1970 U.S. Census and the 1970 American National Election Study provide another.


     Just as one would be ill-advised to attempt to fathom the angst of the urban liberals

    from the Census, so it is unwise to attempt to ascertain the subtle nuances of diplomatic

    strategy through event data. However, the Census and NES are far more useful in

    studying economic and demographic change in Arkansas in 1970 than Wolfe, Mailer or

    Thompson, and in a similar fashion event data can reveal statistical patterns not

    apparent in conventional history.8

    Gaddis, having criticized event data as shallow in 1987, observed five years later

    (Gaddis 1992) that none of the major theoretical approaches to IR correctly predicted the

    most dramatic political transformation of the last quarter of the 20th century—the fall

    of communism in Eastern Europe. Event data analysis has problems, but the traditional

    approaches it seeks to supplement are not exactly flawless.

    A good event coding system leads to theoretically-informed statistical regularities; a

    coding system must therefore be closely linked to a theory or set of theories about

    international behavior. The WEIS and COPDAB codes, for example, were constructed

    in a realist milieu that placed primary emphasis on diplomatic and military behavior;

    CREON focuses on the elements of foreign policy behavior identified by the theories

    developed in James Rosenau's Inter-University Comparative Foreign Policy Project (see

    Hermann et al 1973:8-15); none of these are particularly useful in studying

    contemporary international economic or environmental issues. The detailed reports of 

    the event data collection efforts (for example Azar, Brody and McClelland 1972; Azar

    and Ben-Dak 1975; Burgess and Lawton 1972; Hermann et al 1973; Merritt, Muncaster

    and Zinnes, 1993)—though not necessarily the codebooks for those data—clearly

    indicate a deep awareness of the linkage between theory, coding and data collection.

    The choice of the source text to be coded affects the resulting data set, and the fact

    that event data are coded from a finite number of sources9 has been criticized by Alker

    (1988), among others, as privileging some interpretations of history over others. While

    this may be true, it is no more or less the case than the situation facing traditional


    studies. Short of descending into a deconstructionist quagmire, any political analysis

    must assume that certain events, conditions, motivations and coalitions occurred, and

    others did not. The traditional method of constructing these accounts of political

    activities using a variety of documentary and autobiographical sources is one way of

    doing this; the process by which text is selected for event coding is another. Each

    method is subject to selection bias and varying interpretation.

    STOCHASTIC ELEMENTS IN THE MEASUREMENT OF EVENT DATA

    As noted above, any event data analysis assumes, implicitly or explicitly, that the

    coded events reflect underlying statistical regularities in the political system; the nature

    of those regularities is specified by a theory. Since political systems are very complex,

    regularities will almost inevitably appear to be stochastic, but additional random

    variation is introduced into the data between the occurrence of an event and the

    assignment of the event code. The three major sources of variation are noise events,

    data censoring, and misclassification; Figure 1 summarizes this process schematically.

    [FIGURE 1 ABOUT HERE]

    Noise Events

    Any event stream contains noise in the form of random events that appear to be

    endogenous to the process being studied but which in fact have been generated by

    other processes or are irrelevant to the model. The distinction between endogenous

    and exogenous events depends on the theory being studied. What is noise to one

    model may be the signal of another, just as the variation in growth of a soybean plant

    due to insect damage is noise in a study of fertilizers but signal to a study of insecticides.

    In correlational studies, noise is usually assumed to be normally distributed. In an

    event data study, the default model for noise would be statistical independence and a

    Poisson distribution since the Poisson is the only temporal probability distribution that

    is memory-free.10   Ideally, one could ascertain the Poisson intensity parameter for the


    distribution of noise affecting each event code and use that information in statistical

    studies, just as the mean and variance of normally distributed errors are used in

    correlational studies.11   As in correlational studies, the actual noise will often not be

    distributed according to the assumptions of the model, and might exhibit non-Poisson behavior such as cyclicity or statistical interdependence.

    Censoring

    Censoring means that an event occurs in the system and does not appear in the

    data. The term is meant in the statistical, rather than political, sense: while overt

    censoring of information is certainly a factor in event data, far more problematic are the

    editing and coverage biases introduced in journalistic and historical sources; these have

     been discussed exhaustively in the event data literature.

    Censoring occurs nonuniformly. Any event data set has a vector dC (i.e. a vector

    indexed on the set of classification codes C)12 that is the ratio of the frequency of codes

    in the observed data set to their frequency in an ideal data set where all events

    occurring in the system were reported. These ratios can be quite low—it is unlikely that

    existing data sets capture more than a few percent of all political activity except for

    extreme events such as the outbreak of war. To the extent that some events are more

    important than others in determining international behavior, censoring is probably

    inversely proportional to importance: the more important an event, the more likely it

    will be reported.

    The converse of censoring is "disinformation": strings in the source text concerning

    events that did not actually occur. These might be introduced deliberately—for

    example the deception campaigns that preceded the US ground offensive against Iraq in

    1991—but they are more commonly generated by rumors and second-hand

    information. While disinformation from deception and rumor is probably not a major

    component of event data, it is more problematic than pure noise because it is very non-


    random and is specifically designed to make the system appear as though it is operating

    under a different process than that actually occurring.13

    Misclassification

    Misclassification occurs when the code assigned to a text string does not correspond

    to the code that was intended when the coding scheme was designed. Any event data

    set has an error matrix M where eij gives the probability of misclassifying vj as vi. This

    error matrix incorporates errors due to both validity and reliability problems.

    Misclassification does not occur randomly: a trade agreement may be confused with a

    cultural exchange but it is unlikely to be confused with a war. In all likelihood M has a

    block structure—one can arrange the rows and columns of M in a fashion where non-

    zero entries are most likely to occur in adjacent cells with the remainder of the matrix

     being zero.
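
    A minimal sketch of such a block structure, with four hypothetical codes and invented
    probabilities (column j gives the coding probabilities for true event j), might look
    like this:

    # Sketch of a block-structured misclassification matrix M, where
    # M[i][j] = probability of coding true event j as event i.  The
    # codes and probabilities are invented; the point is that confusion
    # stays within blocks of similar behavior (cooperation vs. conflict).
    import numpy as np

    codes = ["AGREE", "PROMISE", "PROTEST", "FORCE"]
    M = np.array([
        # true: AGREE PROMISE PROTEST FORCE
        [0.90,  0.15,  0.00,  0.00],   # coded AGREE
        [0.10,  0.85,  0.00,  0.00],   # coded PROMISE
        [0.00,  0.00,  0.95,  0.05],   # coded PROTEST
        [0.00,  0.00,  0.05,  0.95],   # coded FORCE
    ])
    assert np.allclose(M.sum(axis=0), 1.0)  # every true event gets some code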

    STATISTICAL CORRECTIONS FOR CODING AND CENSORING ERRORS

    The overall error structure of an event data set can be specified by combining the

    elements of noise, censoring and misclassification. Let r be the true frequency of the

    codes—the frequency generated from an ideal uncensored data source coded using E

    without misclassification errors. Let n be the frequency of the noise (both random

    noise and disinformation); let

    D = diag(dC)

    that is, the matrix with the elements of dC on the diagonal. Then x, the observed

    frequency of events in the data set, is

    x = MD (r + n)

    Using this relationship, it is possible to make several statistical corrections to an event

    set if one is interested only in the aggregate frequency of events; frequency is the key

    concern in most correlational and descriptive studies.


    Assume that the misclassification matrix M can be estimated and let t = (r + n) be

    the true event frequency vector. If there is no censoring, then the observed frequency

    vector x is simply

    x = M t

    One can correct for misclassification and get an improved estimate of t by adjusting x

    using

    x* = M^-1 x

    Under certain circumstances, M is very straightforward to compute. Suppose one

    has two coding systems, one very slow but accurate (e.g. coding by a principal

    investigator or well-trained and well-motivated graduate student coders), and the other

    fast but less accurate (e.g. machine coding or poorly trained, supervised or motivated

    work study students). Assuming that both coding processes are consistent, then M can

     be estimated by comparing two coding results on a suitably large and representative

    set of texts.

    If the censoring vector d were also known, the correction can be extended further:

    x* = (MD)^-1 x

    d is less likely to be known with any degree of confidence than M , though efforts could

     be made to approximate it by comparing multiple sources.14
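
    The following sketch pulls the two corrections together; the misclassification matrix,
    the censoring vector and the true frequencies are all invented for illustration (in
    practice one observes x, not t, and must estimate M and d as described above):

    # Sketch of the frequency corrections x* = M^-1 x and x* = (MD)^-1 x.
    # M, d and t below are hypothetical.
    import numpy as np

    M = np.array([[0.90, 0.15, 0.00, 0.00],
                  [0.10, 0.85, 0.00, 0.00],
                  [0.00, 0.00, 0.95, 0.05],
                  [0.00, 0.00, 0.05, 0.95]])
    d = np.array([0.05, 0.04, 0.20, 0.60])   # assumed reporting ratios
    D = np.diag(d)

    t = np.array([4000.0, 2500.0, 800.0, 100.0])  # true frequencies r + n
    x = M @ D @ t                                 # what the data set records

    t_hat = np.linalg.solve(M @ D, x)  # (MD)^-1 x without forming the inverse
    print(np.round(t_hat))             # recovers [4000. 2500.  800.  100.]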

    The standard technique for estimating the size of an unknown population is to

    sample the population twice and compare the number of cases captured in both

    samples.15 Let N be the true population size, n1 and n2 the sizes of two independent

    random samples from N, and m the number of cases that occur in both random

    samples. The probability of a case being in sample 1 is p1 = n1/N; the probability of a

    case being in sample 2 is p2 = n2/N, so the expected overlap is

    m = p1 p2 N = (n1 n2)/N, which gives N = (n1 n2)/m


    Once N is known, the number of cases being censored can be estimated by comparing

    the number of events found in a source to the estimated population size N.
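
    A sketch of this estimator in Python, with hypothetical counts standing in for the
    events found in two news sources:

    # Sketch of the two-sample (capture-recapture) population estimate
    # N = n1*n2/m.  The counts are invented for illustration.
    def estimate_population(n1: int, n2: int, m: int) -> float:
        """Estimate total events from two samples and their overlap."""
        if m == 0:
            raise ValueError("no overlap between samples")
        return n1 * n2 / m

    n1, n2, m = 420, 380, 150          # events in source 1, source 2, both
    N = estimate_population(n1, n2, m)
    print(N)                           # 1064 events estimated in the system
    print(n1 / N)                      # implied reporting ratio for source 1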

    The weakness in this approach is the requirement that the two samples be random.

    All text sources of event data are biased to report certain events while ignoring others,

    rather than randomly sampling from all possible events. However, comparing two

    sources that are attempting to provide equivalent coverage—for example the New York

    Times and the Los Angeles Times—would provide a rough estimate of the censoring

    probabilities. This technique could also be used to ascertain which event categories are

    more frequently censored.

    The final issue that can be analyzed is the tradeoff between quantity and quality. If 

    one is doing event data coding with finite resources, the greater the effort devoted to

    coding reliability, the fewer events can be coded. Consider the situation where event

    data are used to estimate a value ô (e.g. a hostility score) by taking the mean of a

    sample of observations:

    ô = X̄ = (1/n) Σ xi

    Assume that each xi can be decomposed into three parts

    xi = µ + ei + mi

    where µ = true value; ei = true error (in other words, deviation due to the intrinsic

    variability of the underlying random variable) and mi = measurement error. Assuming

    ei and mi have mean zero,16 from the Central Limit Theorem, we know that ô will be

    distributed

    Normal(µ, [Var(e) + Var(m)]/n)

    whatever the underlying distributions of ei and mi.

    Suppose there are two measurement instruments A and B which have sample sizes

    Na and Nb and measurement error variances va and vb respectively. Assume that Na >


    Nb and va > vb; in other words, A has a larger number of observations but also has a

    higher measurement error. Let sa and sb be the error variances of ô measured using A

    and B. Assuming without loss of generality that Var(e) = 1, a bit of algebra will show

    that sa < sb provided

    va < vb(Na/Nb) + (Na/Nb - 1)

    Since Na > Nb, the second term is greater than zero, so sa can be smaller than sb even

    when va > vb: a larger but noisier sample can yield a more precise estimate than a

    smaller, more reliable one.
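
    A small numerical sketch of the tradeoff, with invented sample sizes and measurement
    error variances for a large, noisy instrument A (say, machine coding) and a small,
    careful instrument B:

    # Sketch: error variance of the mean is (Var(e) + Var(m)) / n, with
    # Var(e) normalized to 1.  All figures are assumptions for illustration.
    def error_variance(n: int, v: float) -> float:
        return (1.0 + v) / n

    s_a = error_variance(n=10000, v=0.50)  # many events, noisier coding
    s_b = error_variance(n=500, v=0.10)    # few events, careful coding
    print(s_a, s_b, s_a < s_b)             # 0.00015 0.0022 True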


    A variety of formal techniques for analyzing event sequences have been developed in

    biology, linguistics and computer science; Mefford (1985, 1991) and Schrodt (1984,

    1990, 1991) provide direct applications of event sequences to the study of

    international relations.

    Two examples of event sequence structures will provide a flavor for how this type

    of analysis works. Mefford (1991) provides an example of how Edward Luttwak's

    general principles for implementing a coup d'etat could be rendered as a set of rules;

    Mefford's structure (Figure 2) can as easily be conceptualized as a set of events.

    [FIGURE 2 ABOUT HERE]

    When a political actor has a particular objective, the actor knows that certain actions

    must be taken in order to meet that objective. These actions are partially ordered and

    disjunctive. Partial ordering means that certain actions must be taken in a strict

    temporal order but others can be done in parallel and can be completed in any order.

    For example, in the Luttwak model, both the police and bureaucracy must be

    neutralized before the state is considered neutralized, but the exact order of that

    process would usually be irrelevant.17

    The sequence is disjunctive because a variety of different actions can serve the same

    purpose in a sequence. This is equivalent to the concept of "foreign policy

    substitutability" discussed in detail by Most and Starr (1984). For example in Luttwak's

    model, "consultation with allies" could occur in a variety of different ways—meeting

    with ambassadors in Washington, meeting with the foreign ministry in the ally's capital,

    discussions at an allied summit meeting—that would be equivalent for the purpose of

    satisfying this objective.
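
    A minimal sketch of how such a partially ordered, disjunctive structure might be
    represented and matched against an observed event stream follows; the step names,
    substitution sets and prerequisite relation are hypothetical simplifications of the
    Luttwak model, not Mefford's actual rule set:

    # Sketch of a partially ordered, disjunctive event structure.
    # Each step lists (a) a substitution set of events that satisfy it
    # and (b) the steps that must be completed first.
    PLAN = {
        "consult_allies": ({"MEET_AMBASSADOR", "MEET_MINISTRY", "SUMMIT"}, set()),
        "neutralize_police": ({"ARREST_CHIEFS", "CONFINE_POLICE"}, set()),
        "neutralize_bureaucracy": ({"SEIZE_MINISTRIES"}, set()),
        # partial order: both neutralizations must precede this step
        "seize_radio": ({"OCCUPY_STATION"},
                        {"neutralize_police", "neutralize_bureaucracy"}),
    }

    def satisfied(observed: list[str]) -> set[str]:
        """Steps satisfied, in observed order, respecting prerequisites."""
        done: set[str] = set()
        for event in observed:
            for step, (options, prereqs) in PLAN.items():
                if step not in done and event in options and prereqs <= done:
                    done.add(step)
        return done

    print(satisfied(["MEET_MINISTRY", "ARREST_CHIEFS",
                     "SEIZE_MINISTRIES", "OCCUPY_STATION"]))  # all four steps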

    Another example of an event structure is Lebow's (1981) concept of a "Justification

    of Hostility Crisis". Lebow originally developed this model by studying a number of

    pre-1970 crises, but it fits the 1990 Iraq-Kuwait crisis almost perfectly.

    [TABLE 1 ABOUT HERE]


    Much of international politics can be conceptualized as event sequences. A foreign

    policy bureaucracy has a certain objective in mind—an image of the final state of the

    world they wish to create through foreign policy initiatives. Actions are undertaken to

    gradually piece together that sequence—the entire sequence might take literally

    decades to complete, such as the Bismarckian reorganization of Europe, or it might be

    done in a few hours, such as an international response to a fall in the value of the dollar.

    At any given time, a variety of parallel efforts are underway to lay the groundwork for

    subsequent actions; when those prior actions are completed, the plan moves ahead.

    False starts are common, but because the overall plan is disjunctive, when one sequence

    hits a dead end, it can be abandoned and another sequence pursued.

    International actions are not taken in a vacuum but rather against rational

    opponents who, depending on their interests, will seek to enhance or thwart the

    successful completion of the plan. Hence international politics is analogous to two

    children playing with a single set of blocks, one child trying to build a model of a house

    and the other a model of a bulldozer, each cleverly trying to build on the model the

    other has created but also tearing out the unnecessary parts when the other is

    distracted or working on subassemblies. Equivalently, to use one of the oldest

    metaphors for international politics, international politics is a game of chess or go: each

    player has envisioned complex sequences of actions ending in a victory, but must

    continually alter those plans to counter the moves of the other player.

    Several attributes of traditional international relations analysis can be

    conceptualized using the event sequence analysis (ESA) approach. First, human

    understanding of an event sequence—that intuitive comprehension of event patterns

    mentioned by McClelland—is simply the associative recall of an event structure (or set

    of event structures) satisfying an observed sequence of events. Second, history—the

    fascination of traditional theorists—provides the empirical evidence for event structures

    implemented in the past, and furnishes templates for future structures. Third, event


    structures provide a simple error-correction mechanism for missing data: if one assumes

    that a particular structure is being followed, and actions are observed which are

    predicated on events not observed, and there are reasons to assume that the missing

    action might not be detected (e.g. it was secret or an event not likely to be reported),

    then one can assume that the event occurred unobserved, rather than rejecting the

    model. Finally, this approach provides a formal means of characterizing motivation

     based on observed events: the motivation for a particular sequence is its end state or

    objective.

    ESA requires a huge information base and very substantial information processing

    capability in terms of associative memory and pattern recognition. The human brain is

    intrinsically very good at the latter two tasks, and through formal education an expert

    in international affairs painstakingly acquires the information base. Doing ESA by machine is more

    difficult, and McClelland in 1970 clearly did not have the machine capabilities

    commensurate with this task.

    In the 1990s, however, enhanced computer processing capabilities and algorithms

    provide considerably greater pattern recognition capabilities. While ESA is less

    developed than correlational and frequency analysis, three things are likely to be

    characteristic of the approach. First, ESA requires a greater amount of data than do

    frequency and correlational analyses. Second, because of substitutability, there is a

    closer interplay between the coding and theory in ESA than in correlational studies.

    Third, short sequences of international behavior are more easily studied than long

    sequences, and ordinary behavior is more easily studied than extraordinary behavior.

    More Data

    Frequency-based correlational studies are subject to the law of large numbers and

    hence there is a diminishing return on the information provided by additional data. In

    addition, most intensity-based coding schemes—for example COPDAB—have an

    inverse relationship between the magnitude of the code and the probability of 


    censoring: the events with the highest numerical values are those least likely to be

    missing from the data.

    This is less likely to be the case in ESA because of substitutability, and more

    generally because the structures themselves are complex. Substitutability makes

    modeling a given set of data easier, but it is correspondingly more difficult to induce the

    underlying structure.18

    Closer Linkage Between Coding Scheme and Theory

    Event structures allow multiple events to serve the same function. Coding errors

    within these substitution sets will have no effect on the fit of the structure, since all

    events within the set are equivalent. The existing coding schemes of COPDAB and

    WEIS implicitly use a high degree of substitution, since they map many distinct text

    strings into the same event code. However, the substitution mapping is done at the

    coding stage rather than the modeling stage.

    If one were dealing with a set of models where the substitution sets were always

    the same—if a pair of events v1 and v2 were found in one substitution set they would

    always be found in a substitution set whenever one or the other were present—then

    the problem of determining the details of a coding scheme would be solved. Again,

    existing event sets implicitly do this already. However, the substitution sets probably

    vary across models. For example, the event

    [move an aircraft carrier from the Mediterranean to the Gulf]

    and the event

    [move an aircraft carrier from the Pacific Ocean to the Gulf]

    are roughly equivalent as far as US relations with Iraq are concerned, but have very

    different implications for relations between the US and Italy. The specific event

    structure being analyzed—the context of the event—is important.

    When used inductively, however, ESA techniques are likely to be robust against the

    most common coding errors: those caused by the intrinsic ambiguity of the event as


    interpreted by a human observer. The events most likely to be miscoded by a human,

    such as a protest coded as an accusation, or a promise coded as a grant, are those most

    likely to be equivalent in an event structure. Consequently, regularities could be

    detected in an inductive analysis despite miscoding, though the models would be less

    specific than those constructed in the absence of coding errors.

    Short Term Regularity

    Fully completed complex event sequences are rare in international politics. Even

    when a long sequence is completed it provides only a single instantiation of the possible

    ways that the sequence might have developed. Complex structures can be induced

    only when multiple instances of the same structure manifest themselves, a slow process

    in the empirical world. Thus, for example, while there are probably very real parallels

     between the US involvement in Vietnam and the Soviet involvement in Afghanistan,

    the entire event structure for "Superpower intervening to support weak and unstable

    allied government fighting an unpopular guerrilla war where the superpower has

    limited national interests" still only has an N = 2 in the post-WWII period. In some

    theoretical contexts, one could go further afield to try to make comparisons with, for

    example, the Syracuse campaign during the Peloponnesian War, or to deal with

    contemporary regional powers such as Israel in Lebanon or India in Sri Lanka, but in

    the Cold War context, the sample size will remain permanently at 2.

    In general, the shorter the structure examined, the more likely one will be able to

    find sufficient instances to study it in detail; "short" refers to the number of events in the

    sequence rather than calendar time. Short sequences are also more likely to be

    successfully completed as part of a larger task, even though the larger task itself was

    not successfully completed.

    Event sequences may exhibit a hierarchical structure, with large sequences

    constructed of identifiable subsequences which may themselves have subsequences.

    The short inner structures may be easier to study than the longer structures because of 


    their frequency. The comparison between Vietnam and Afghanistan may be due less to

    the macro-level, N=2 comparison than to the large number of substructure parallels, such

    as holding the cities while bombing the countryside, civilian and conscript discontent,

    disagreements between civilian and military leadership and so forth. If substructures

    are important, micro-level coding is important, certainly to temporal levels more finely

    grained than a month and political levels more detailed than the nation-state.

    Machine Coding

    Advances in data storage and the development of machine coding techniques are

    likely to increase the amount of event data available to researchers by two or three

    orders of magnitude in the near future. In the past, event data have been produced by

    coders working with paper copies of source documents, usually books and periodicals.

    This process is very labor intensive and coders are rarely able to generate more than a

    dozen or so events per hour. Recoding an existing large data set has also been

    impossible due to the labor required, which leads to event data sets being used in

    situations perilously remote from their original theoretical context.

    Advances in storage technology and data distribution over the past decade have

    changed this situation substantially. Commercial on-line databases such as Dialog and

    NEXIS provide international news wires such as UPI, Agence France Presse and

    Reuters, as well as the full text of newspapers such as the New York Times, Washington

    Post, Wall Street Journal and several of their non-English-language counterparts. Table 2

    shows the Reuters "lead" sentences for the key events leading up to the 2 August 1990

    invasion; there is clearly sufficient information here for a human analyst to match the

    sequence of events to Lebow's "justification of hostility" sequence.

    [TABLE 2 ABOUT HERE]

    The on-line databases are still relatively expensive, but these are likely to be

    replaced by CD-ROM in the near future. Facts on File is currently available on CD-

    ROM for 1980 to the present; the full text of the New York Times will be available from


    1991 onward. CD-ROM prices are currently around $100 for about 500 megabytes of 

    information and with CD-ROM, the computationally-intensive searching process can be

    done by the individual user rather than at a centralized site, reducing the cost of using

    the resources.

    The substitution of electronic search and display capabilities for paper can generate

    substantial increases in coder productivity, but even greater economies are achieved by

    totally automating the coding process. The automated categorization of natural

    language text is an important research topic in computer science, and a variety of 

    research projects are underway that deal with problems at least as complex as event

    coding (see for example Lehnert and Sundheim, 1991). Gerner et al (1994) describe a

    Macintosh computer program that can duplicate the 3-digit WEIS coding of a single

    human with around 80% - 90% accuracy in either English or German; validation tests

    comparing machine-coded data for the Middle East to a human-coded data set show no

    systematic differences except those due to the higher density of the machine-coded data

    (Schrodt and Gerner 1994).

    Machine coding from machine-readable source material means that event data sets

    can be maintained in real time, simplifies the implementation of alternative coding

    schemes, provides a complete and unambiguous record of the actual coding rules,

    removes the cultural and perceptual biases introduced by human coders,19 and can

    handle languages other than English more easily than can English-speaking human

    coders. With these improved data processing capabilities, event data sets which

    currently contain several thousand events per year for the entire international system

    might easily be increased to several million events per year without significant

    additional labor costs. These denser data sets, in turn, may provide sufficient detail

    about international interactions to allow the use of analytical techniques such as ESA

    which were not effective on the earlier, sparser data sets.


    CONCLUSION

    The appropriateness of event data analysis ultimately depends on the theoretical

    model to which those events will be applied. Because of the labor involved in the

    collection and coding of the early event data sets, much effort was spent in carefully

    constructing and implementing coding schemes, and training coders, so that high

    intercoder reliability could be achieved. With the advent of inexpensive machine-

    readable text sources and machine coding, the cost of generating event data will drop

    dramatically, intercoder reliability will cease to be an issue,20 and considerably greater

    attention can be devoted to developing coding schemes that correspond to

    contemporary theoretical approaches.

    The use of event data in correlational studies will undoubtedly continue, whatever

    the success of the ESA approach. Correlational analyses are the lingua franca of the

    social sciences, interval-level data can be processed with standardized and thoroughly

    tested statistical programs with known properties, and the frequency/correlation

    approach makes fairly efficient use of the available data. However, the use of event

    data in correlational studies is quite unsophisticated, and as noted in this article,

    straightforward statistical corrections can be made for many source and coder reliability

    issues. With fundamental changes occurring in the information processing capabilities

    available to the average researcher, analyses of textual and event data are now possible

    which could not be done when event data were first developed. The past is a very poor

    guide to the future of event data and what was practically impossible a decade ago may

     be trivial a decade from now.


    FOOTNOTES

    1 This is an underestimate: since the World Handbook itself contains events data, the total

    citations to events data are probably greater than those to the non-event parts of the

    World Handbook.

    2 See McGowan et al 1988 and Merritt 1987 for a general discussion.

    3 Howell et al (1986) provides a good discussion of these issues, as do most of the pre-

    1980 works cited in the bibliography.

    4 I am indebted to Harold Guetzkow for an extensive collection of the early WEIS

    memoranda.

    5 Another factor favoring continuous formulations may be the ease with which one can

    draw a line representing a continuous variable and call it a "pattern" (for example

    Brody, 1972) whereas discrete sequences are more difficult to visualize and explain

    (McClelland, 1961).

    6 In contrast, a sociologist coding the events of a child's play group, an ethnobiologist

    coding the behavior of chimpanzees or Alger's (1968) study of interaction in a UN

    committee involve event coding directly from observations.

    7 An alternative approach to nonevents is to note that Pow(C) by definition includes

    the null set, and many text strings simply map to "this is not an event"; the function E

    therefore serves to filter all text strings. This is effectively the scheme used in most

    machine-coding systems, whereas in human coding systems the decision that a text

    string constitutes an event may be separate from the assignment of a code.

    8 For example, Schrodt and Mintz (1988) use a conditional probability analysis of the

    COPDAB data set to study interactions between six Middle Eastern states: Jordan, Syria,

    Saudi Arabia, Kuwait, Iraq and Iran. In retrospect the most interesting finding of the

    study was the prominent role of Kuwait:


    The implication [of the study] is that when some interaction occurs with Kuwait,

    this interaction disproportionately sets off other interactions in the system. This

    initially seems counterintuitive because Kuwait is the least powerful of the states

    we are studying, though that status may be the reason Kuwait is so important. If 

    this characteristic holds generally, we may find that minor powers are more

    important in determining interaction interdependence than major powers. (pp. 227-

    228)

    This was written in 1984, six years before Iraq's invasion of Kuwait led to the massive

    international intervention known in the United States as Desert Storm. The importance

    of Kuwait was deduced exclusively from the event data itself, rather than from a

    traditional political analysis, and when Mintz and I first presented this paper at the

    International Studies Association, the discussant concluded that the Kuwait results were

    clearly incorrect and indicated a weakness in the approach.

    9 For example, WEIS was coded from the New York Times, which greatly over-samples

    events in Europe and the Middle East compared to events in Latin America, Africa, and

    Asia. However, Azar (1980: 146) states that COPDAB is based on "events reported in

    over 70 sources"; CREON is based on Deadline Data on World Affairs, which abstracts 46

    international sources (Hermann et al 1973:18); and the BCOW codebook (Leng, 1987)

    lists dozens of periodical and historical sources. While the single-source criticism may

     be valid for WEIS, it certainly is not intrinsic to the methodology.

    10  Equivalently, exponentially distributed inter-arrival times.

    11  King (1989) shows a variety of techniques for doing this.

    12  In Figure 1, d is placed prior to the source text because that is where most censoring

    occurs in the real world — events occur but are not reported. For reasons that will be


    clear momentarily, it is more convenient to index d on event codes rather than original

    events.

    13  In particular, rumors must convey plausible patterns of human behavior or they will

    not be voluntarily transmitted. For example, in the week following the June 1989

    Tiananmen Square massacre in Beijing, many rumors circulated about pro-democracy

    military forces preparing to move against the city. These eventually proved completely

    groundless but were credible: when a comparable situation occurred in Rumania in

    December 1989, military units did turn against the government. The nature of rumors

    and story-telling means that a credible sequence of events is generated; if information is

    missing from the story, it will be provided in a fashion that makes sense to the teller

    and listener, rather than generated randomly. Rumors, as a consequence, may be more

    likely to fit models of political behavior than will actual events. Note that if regularity

    did not exist in the international system, disinformation and strategic deception could

    not exist since the entity being deceived must be able to fit the information to a pre-

    existing model of international behavior.

    14  Achen (1987) points out another means of handling censored data in a regression

    framework, citing Heckman (1979). Achen more generally calls attention to the fact

    that many problems of event data have statistical solutions in cognate fields.

    15  This method is typically used to estimate fish or insect populations; it has also been

    used to estimate the undercount in the U.S. Census.

    16  This is a weak restriction since any known bias can be adjusted by subtracting a

    constant from the biased estimator.

    17  Obviously there exist occasions when the order would be important — for example

    in many states a neutralized military will vastly simplify the task of neutralizing other

    components of the state — but there are many facets of political activity where


    ordering is either unimportant or, more commonly, many tasks are undertaken in

    parallel.

    18  This is similar to the difference between checking the syntactic correctness of a

    sentence and deriving the grammar of a language. If one knows the grammatical rules

    of a language, then the correctness of most sentences can be ascertained relatively easily;

    for example grammar checking programs can do this. Deriving a complete grammar,

    however, can be exceedingly difficult, even if a great deal of grammatically correct text

    is available.

    19  The biases in the coding system itself remain, but in machine coding these are at least

    explicit and reproducible.

    20  Machine coding systems make mistakes, but do so consistently and reproducibly so

    that statistical corrections (and recoding if egregious errors are found) are

    straightforward. Human coders, in contrast, are an amalgam of biases, misperceptions,

    sloth, confusion and neglect who graduate and vanish at the end of the year. Neither a

    human nor a machine coding system can perfectly implement the researcher's coding

    scheme, but from a statistical standpoint, the errors generated by the machine will

    usually be easier to deal with.


    BIBLIOGRAPHY

    Achen, Christopher. 1987. "Statistical Models for Event Data: A Review of Errors-in-

    Variables Theory". Paper presented at the Data Development in International

    Relations Conference on Event Data, Columbus, OH.

    Alger, Chadwick F. 1968. "Interaction in a Committee of the United Nations General

    Assembly." In J. David Singer (ed.) Quantitative International Politics. New York: Free

    Press.

    Alker, Hayward R. 1987. "Fairy Tales, Tragedies and World Histories." Behaviormetrika

    21:1-28.

    ___________. 1988. "Emancipatory Empiricism: Toward the Renewal of Empirical Peace

    Research." In Peter Wallensteen (ed.) Peace Research: Achievements and Challenges.

    Boulder, CO: Westview Press.

    Andriole, Stephen J. and Gerald W. Hopple. 1984. "The Rise and Fall of Events Data:

    From Basic Research to Applied Use in the U.S. Department of Defense." International

    Interactions 11:293-309.

    Azar, Edward E. 1982. The Codebook of the Conflict and Peace Data Bank (COPDAB).

    College Park, MD: Center for International Development, University of Maryland.

    Azar, Edward E. 1980. "The Conflict and Peace Data Bank (COPDAB) Project." Journal of 

    Conflict Resolution 24:143-152.

    Azar, Edward E., Stanley H. Cohen, Thomas O. Jukam and James McCormick. 1972.

    "Making and Measuring the International Event as a Unit of Analysis." In Edward E.

    Azar, Richard A. Brody and Charles A. McClelland (eds.) International Events

    Interaction Analysis: Some Research Considerations. Beverly Hills: Sage Publications.

    Azar, Edward E., Richard A. Brody and Charles A. McClelland (eds.) 1972. International

    Events Interaction Analysis: Some Research Considerations.  Beverly Hills: Sage

    Publications.


    Azar, Edward E. and Joseph Ben-Dak. 1975. Theory and Practice of Events Research.

    New York: Gordon and Breach.

Azar, Edward E. and Thomas Sloan. 1975. Dimensions of Interaction. Pittsburgh: University Center for International Studies, University of Pittsburgh.

Azar, Edward E., R.D. McLaurin, Thomas Havener, Craig Murphy, Thomas Sloan and Charles H. Wagner. 1977. "A System for Forecasting Strategic Crises: Findings and Speculations About Conflict in the Middle East." International Interactions 3:193-222.

    Brody, Richard A. 1972. "International Events: Problems in Measurement and

    Analysis." In Edward E. Azar, Richard A. Brody and Charles A. McClelland (eds.)

    International Events Interaction Analysis: Some Research Considerations. Beverly Hills:

    Sage Publications.

    Burgess, Philip M. and Raymond W. Lawton. 1972. Indicators of International Behavior:

    An Assessment of Events Data Research.  Beverly Hills: Sage Publications.

    Daly, Judith Ayres and Stephen J. Andriole. 1980. "The Use of Events/Interaction

    Research by the Intelligence Community." Policy Sciences 12:215-236.

    Gaddis, John Lewis. 1987. "Expanding the Data Base: Historians, Political Scientists and

    the Enrichment of Security Studies." International Security 12:3-21.

    ____________. 1992. "International Relations Theory and the End of the Cold War."

    International Security 17:5-58.

Gerner, Deborah J., Philip A. Schrodt, Ronald A. Francisco, and Judith L. Weddle. 1994. "The Machine Coding of Events from Regional and International Sources." International Studies Quarterly 38:91-119.

Heckman, James J. 1979. "Sample Selection Bias as a Specification Error." Econometrica 47:153-161.

Heise, David. 1988a. "Modeling Event Structures." Journal of Mathematical Sociology 13:138-168.

    Heise, David. 1988b. "Computer Analysis of Cultural Structures." Social Science

    Computer Review 6:183-196.


    Hermann, Charles, Maurice A. East, Margaret G. Hermann, Barbara G. Salmore, and

    Stephen A. Salmore. 1973. CREON: A Foreign Events Data Set.  Beverly Hills: Sage

    Publications.

Hopple, Gerald W. 1984. "Computer-Based Early Warning: A Staircase Display Option for International Affairs Crisis Projection and Monitoring." In Gerald W. Hopple, Stephen J. Andriole and Amos Freedy (eds.) National Security Crisis Forecasting and Management. Boulder: Westview Press.

Hopple, Gerald W., Stephen J. Andriole and Amos Freedy (eds.) 1984. National Security Crisis Forecasting and Management. Boulder: Westview Press.

Howell, Llewellyn D., Sheree Groves, Erin Morita, and Joyce Mullen. 1986. "Changing Priorities: Putting the Data Back into Event Data Analysis." Paper presented at the International Studies Association, Anaheim.

International Studies Quarterly. 1983. "Symposium: Event Data Collections." International Studies Quarterly 27.

King, Gary. 1989. "Event Count Models for International Relations: Generalizations and Applications." International Studies Quarterly 33:123-148.

Laurance, Edward J. 1990. "Events Data and Policy Analysis." Policy Sciences 23:111-132.

    Lebow, Richard Ned. 1981. Between Peace and War: The Nature of International Crises.

    Baltimore: Johns Hopkins.

    Lehnert, Wendy and Beth Sundheim. 1991. "A Performance Evaluation of Text

    Analysis." AI Magazine 12: 81-94.

Leng, Russell J. 1987. Behavioral Correlates of War, 1816-1975. (ICPSR 8606). Ann Arbor: Inter-University Consortium for Political and Social Research.

McClelland, Charles A. 1961. "The Acute International Crisis." World Politics 14:184-204.

_________. 1967a. "Event-Interaction Analysis in the Setting of Quantitative International Relations Research." mimeo, University of Southern California, February, 1967.


_________. 1967b. "World-Event-Interaction-Survey: A Research Project on the Theory and Measurement of International Interaction and Transaction." mimeo, University of Southern California, March, 1967.

_________. 1968a. "International Interaction Analysis: Basic Research and Some Practical Uses." mimeo, University of Southern California, November, 1968.

_________. 1968b. "Access to Berlin: The Quantity and Variety of Events, 1948-1963." In J. David Singer (ed.) Quantitative International Politics. New York: Free Press.

    _________. 1969. "International Interaction Analysis in the Predictive Mode." mimeo,

    University of Southern California, January, 1969.

_________. 1970. "Some Effects on Theory from the International Event Analysis Movement." mimeo, University of Southern California, February, 1970.

    _________. 1976. World Event/Interaction Survey Codebook. (ICPSR 5211). Ann Arbor:

    Inter-University Consortium for Political and Social Research.

    McGowan, Patrick, Harvey Starr, Gretchen Hower, Richard L. Merritt and Dina A.

    Zinnes. 1988. "International Data as a National Resource." International Interactions

    14:101-113.

Mefford, Dwain. 1985. "Formulating Foreign Policy on the Basis of Historical Programming." In Michael Don Ward and Urs Luterbacher (eds.) Dynamic Models of International Conflict. Boulder, CO: Lynne Rienner Publishers.

    __________. 1991. "Steps Toward Artificial Intelligence: Rule-Based, Case-Based and

    Explanation-Based Models of Politics." In Valerie Hudson (ed.) Artificial Intelligence

    and International Politics. Boulder: Westview.

Merritt, Richard L., Robert G. Muncaster and Dina A. Zinnes (eds.) 1993. International Event Data Developments: DDIR Phase II. Ann Arbor: University of Michigan Press.

    Merritt, Richard. 1987. "Event Data: A Synoptic View." Paper presented at the Data

    Development in International Relations Conference on Event Data, Columbus, OH.


Most, Benjamin A. and Harvey Starr. 1984. "International Relations Theory, Foreign Policy Substitutability and 'Nice' Laws." World Politics 36:383-406.

Munton, Donald. 1978. Measuring International Behavior: Public Sources, Events and Validity. Dalhousie University: Centre for Foreign Policy Studies.

Peterson, Sophia. 1975. "Research on Research: Events Data Studies, 1961-1972." In Patrick J. McGowan (ed.) Sage International Yearbook on Foreign Policy Studies, 3. Beverly Hills: Sage Publications.

    Rummel, Rudolph J. 1972. The Dimensions of Nations.  Beverly Hills: Sage Publications.

    Sankoff, David and Joseph B. Kruskal (eds.) 1983. Time Warps, String Edits and

     Macromolecules: The Theory and Practice of Sequence Comparison.  New York: Addison-

    Wesley.

    Schrodt, Philip A. 1984. "Artificial Intelligence and International Crisis: An Application

    of Pattern Recognition." Paper presented at the American Political Science

    Association, Washington.

    Schrodt, Philip A. 1990. "Parallel Event Sequences in International Crises." Political

    Behavior 12:97-123.

Schrodt, Philip A. 1991. "Pattern Recognition in International Event Sequences: A Machine Learning Approach." In Valerie M. Hudson (ed.) Artificial Intelligence and International Politics. Boulder: Westview Press.

    Schrodt, Philip A. 1993. "Machine Coding of Event Data." In Richard L. Merritt, Robert

    G. Muncaster and Dina A. Zinnes (eds.) International Event Data Developments: DDIR

    Phase II.  Ann Arbor: University of Michigan Press.

Schrodt, Philip A. 1994. "Event Data in Foreign Policy Analysis." In Patrick J. Haney, Laura Neack, and Jeanne A.K. Hey (eds.) Foreign Policy Analysis: Continuity and Change. New York: Prentice-Hall.


Schrodt, Philip A. and Alex Mintz. 1988. "A Conditional Probability Analysis of Regional Interactions in the Middle East." American Journal of Political Science 32:217-230.

Schrodt, Philip A. and Deborah J. Gerner. 1994. "Statistical Patterns in a Dense Event Data Set for the Middle East, 1982-1992." American Journal of Political Science 38.

    Sigler, John H., John O. Field and Murray L. Adelman. 1972. Applications of Event Data

    Analysis: Cases, Issues and Programs in International Interaction. Beverly Hills: Sage

    Publications.


    Table 1

    Justification of Hostility Crises

Lebow sequence | July 1914 Crisis | Iraq-Kuwait 1990

Exploit provocation | Assassination in Serbia, 28 June | Iraq accuses Kuwait of disregarding OPEC quotas, 17 July

Make unacceptable demands | Austrian ultimatum to Serbia, 23 July | Iraq demands Kuwait forgive loans, make reparations, $25 oil price, 23-25 July

Understate objectives | Austria claims it does not wish to destroy Serbia | Iraq assures Egypt it will not invade Kuwait

Use rejection of demands as casus belli | Serbia rejects some Austrian demands, 25 July | Breakdown of Jeddah talks, 1 August

War | Austria declares war, 28 July | Iraq invades, 2 August


    Table 2

    Reuters Chronology of 1990 Iraq-Kuwait Crisis

July 17, 1990: RESURGENT IRAQ SENDS SHOCK WAVES THROUGH GULF ARAB STATES
Iraq President Saddam Hussein launched an attack on Kuwait and the United Arab Emirates (UAE) Tuesday, charging they had conspired with the United States to depress world oil prices through overproduction.

July 23, 1990: IRAQ STEPS UP GULF CRISIS WITH ATTACK ON KUWAITI MINISTER
Iraqi newspapers denounced Kuwait's foreign minister as a U.S. agent Monday, pouring oil on the flames of a Persian Gulf crisis Arab leaders are struggling to stifle with a flurry of diplomacy.

July 24, 1990: IRAQ WANTS GULF ARAB AID DONORS TO WRITE OFF WAR CREDITS
Debt-burdened Iraq's conflict with Kuwait is partly aimed at persuading Gulf Arab creditors to write off billions of dollars lent during the war with Iran, Gulf-based bankers and diplomats said.

July 24, 1990: IRAQ, TROOPS MASSED IN GULF, DEMANDS $25 OPEC OIL PRICE
Iraq's oil minister hit the OPEC cartel Tuesday with a demand that it must choke supplies until petroleum prices soar to $25 a barrel.

July 25, 1990: IRAQ TELLS EGYPT IT WILL NOT ATTACK KUWAIT
Iraq has given Egypt assurances that it would not attack Kuwait in their current dispute over oil and territory, Arab diplomats said Wednesday.

July 27, 1990: IRAQ WARNS IT WON'T BACK DOWN IN TALKS WITH KUWAIT
Iraq made clear Friday it would take an uncompromising stand at conciliation talks with Kuwait, saying its Persian Gulf neighbor must respond to Baghdad's "legitimate rights" and repair the economic damage it caused.

July 31, 1990: IRAQ INCREASES TROOP LEVELS ON KUWAIT BORDER
Iraq has concentrated nearly 100,000 troops close to the Kuwaiti border, more than triple the number reported a week ago, the Washington Post said in its Tuesday editions.

August 1, 1990: CRISIS TALKS IN JEDDAH BETWEEN IRAQ AND KUWAIT COLLAPSE
Talks on defusing an explosive crisis in the Gulf collapsed Wednesday when Kuwait refused to give in to Iraqi demands for money and territory, a Kuwaiti official said.

August 2, 1990: IRAQ INVADES KUWAIT, OIL PRICES SOAR AS WAR HITS PERSIAN GULF
Iraq invaded Kuwait, ousted its leaders and set up a pro-Baghdad government Thursday in a lightning pre-dawn strike that sent oil prices soaring and world leaders scrambling to douse the flames of war in the strategic Persian Gulf.

    Source: Reuters
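Leads such as these are the raw material for machine coding. The following sketch conveys the general flavor of dictionary-based coding applied to Table 2: actors are matched against a name dictionary and the event against simple verb patterns. The dictionaries, the WEIS-style codes, and the code_event function are all invented here and are far cruder than any working system.

```python
# An illustrative sketch of dictionary-based event coding applied to
# headlines like those in Table 2; all dictionaries and codes are invented.
ACTORS = {"IRAQ": "IRQ", "KUWAIT": "KUW", "EGYPT": "EGY"}
VERBS = {                      # verb -> hypothetical WEIS-style cue code
    "INVADES": "223",          # military force
    "DEMANDS": "150",          # demand
    "WARNS":   "160",          # warn
    "ASSURES": "054",          # assurance
}

def code_event(headline):
    """Return a (source, event code, target) triple, or None if no match."""
    tokens = [t.strip(",.'\"") for t in headline.upper().split()]
    actors = [ACTORS[t] for t in tokens if t in ACTORS]
    verbs = [VERBS[t] for t in tokens if t in VERBS]
    if len(actors) >= 2 and verbs:
        return actors[0], verbs[0], actors[1]  # first actor acts on second
    return None

print(code_event("IRAQ INVADES KUWAIT, OIL PRICES SOAR AS WAR HITS PERSIAN GULF"))
# ('IRQ', '223', 'KUW')
```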


    Figure 1

     


    Figure 2