Lexical shifts, substantive changes, and continuity in State of the Union discourse, 1790–2014
Alix Rule^a, Jean-Philippe Cointet^b, and Peter S. Bearman^a,1
^aInterdisciplinary Center for Innovative Theory and Empirics (INCITE), Columbia University, New York, NY 10025; and ^bInstitut National de la Recherche Agronomique–Laboratoire Interdisciplinaire Sciences Innovations Sociétés, Université Paris-Est, Marne-la-Vallée, F-77454 Marne-la-Vallée, France
This contribution is part of the special series of Inaugural Articles by members of the National Academy of Sciences elected in 2014.
Contributed by Peter S. Bearman, June 30, 2015 (sent for review May 21, 2015; reviewed by Ronald L. Breiger and John Mohr)
This study reveals that the entry into World War I in 1917 indexed the decisive transition to the modern period in American political consciousness, ushering in new objects of political discourse, a more rapid pace of change of those objects, and a fundamental reframing of the main tasks of governance. We develop a strategy for identifying meaningful categories in textual corpora that span long historic durées, where terms, concepts, and language use change. Our approach is able to account for the fluidity of discursive categories over time, and to analyze their continuity by identifying the discursive stream as the object of interest.
State of the Union | text analysis | networks | natural language processing | American history
When did modern political discourse emerge in the United States? What is distinctive about basic understandings of the tasks of governance today, in contrast to those that organized the politics of an earlier period? Can the origins of contemporary political understandings be located in the discourse of the past? The annual State of the Union address (hereafter, SoU), in which the US president reports broadly on the progress and challenges of his administration, provides a singular standpoint from which to address the evolution of the tasks of governance. It can thus be used to investigate old questions like those above using network-based text analysis strategies.
This study reveals that the entry into World War I (WWI) in 1917 indexed the decisive transition to the modern period in American political consciousness, ushering in new objects of political discourse, a more rapid pace of change of those objects, and a fundamental reframing of the main tasks of governance. At the same time, this study demonstrates that discourse distinctive to modern politics, although it later crystallized around the liberal welfare state, in fact emerged before the transition to the modern period.
We offer a unique view of American political history, which tracks the articulation of the major tasks of governance in American political and social discourse. To do so, we develop a strategy for identifying meaningful categories in textual corpora that span long historic durées. We are able to account for the fluidity of discursive categories over time, and to analyze their continuity by identifying the discursive stream as the object of interest. The methodological approach developed in this article can be used to meaningfully analyze texts produced over very long historical periods, where terms, concepts, and language use change; to our knowledge, this problem has not been satisfactorily solved.
Historical Background
The SoU address is delivered annually by the president to a joint session of Congress, a tradition with its basis in the US Constitution, where it is mandated that the president "shall from time to time give to the Congress information of the [SoU], and recommend to their Consideration such Measures as he shall judge necessary and expedient." Since George Washington's first presidential address in 1790, the SoU has been given every year, with only one exception in 1933, when incoming president Franklin Roosevelt did not give a speech. The country's first two presidents appeared in person before Congress to deliver the SoU. Thomas Jefferson, judging that this constituted an imperial gesture, set the precedent of delivering the address to the legislature in written form, a practice that endured until Woodrow Wilson took office in 1913. The latter is sometimes credited with having transformed the address into a direct appeal to the US populace, although presidents who immediately followed him sometimes reverted to written delivery. The SoU was broadcast on radio for the first time in 1923 and was first televised in 1947; in 1965, Johnson became the first president to cater to a television-viewing audience by delivering the speech in the evening rather than at midday (1).
Research attests to the SoU's significance in political agenda setting and the reciprocal influence of public opinion on the content of the address. The SoU reflects opinion regarding the salience of issues, while also creating it (2–4). Thanks both to its persistence and its prominence as an institution in US national politics, the SoU has been of perennial interest to researchers seeking to understand various facets of the country's history (5–9). The main focus of this work has been to attribute changes in political discourse to the influence of particular presidents and thus stands in contrast to the focus of this article, which is to represent continuity and change in the structure and content of American social and political thought.
To summarize, as a corpus, the text of the SoU mirrors contemporary public understanding of what issues were important. It is nearly unique in the certainty and consistency of its provenance, produced at regular intervals by an individual occupying a well-defined social role, that of the US chief executive. Despite strong a priori reasons for doing so, we do not simply assume that the speech constitutes a stable cultural form, but rather demonstrate that this is the case empirically. The SoU thus provides a unique vantage point from which to reconsider arguments about the timing and nature of critical transition points in US political consciousness.
Revealing the evolution of political discourse requires appreciating how its contents change over time. The method we present
A synoptic picture of the evolution of American politics is presented, based on analysis of the corpus of presidents' State of the Union addresses, 1790–2014. The paper presents a strategy for automated text analysis that can identify meaningful categories in textual corpora that span long durées, where terms, concepts, and language use change, and the evolution of topical structure is a priori unknown. Discourse streams identified as river networks reveal how change in contents masks continuity in the articulation of the major tasks of governance over US history.
Author contributions: A.R. and P.S.B. designed research; A.R., J.-P.C., and P.S.B. performedresearch; J.-P.C. contributed new reagents/analytic tools; A.R., J.-P.C., and P.S.B. analyzeddata; and A.R. and P.S.B. wrote the paper.
Reviewers: R.L.B., University of Arizona; and J.M., University of California, Santa Barbara.
The authors declare no conflict of interest.
Freely available online through the PNAS open access option.
^1To whom correspondence should be addressed. Email: firstname.lastname@example.org.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1512221112/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1512221112 PNAS Early Edition | 1 of 8
relies on the straightforward idea that words acquire meaning through their relations with other words (10). Consequently, we focus on co-occurrence, extracting the local ties between terms in paragraphs to induce categories of discourse from the resulting network structure. By recognizing that the relations between words arise in time, and appropriately defining the period over which co-occurrence is considered, we approximate the semantic standpoint of contemporary observers. We thus consider the categorical structure of discourse over successive, delimited time periods to uncover and analyze continuity and change in social and political thought. Clarifying these methodological points and identifying the insights into American social and political discourse that they permit is the focus of this article.
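The paragraph-level co-occurrence idea described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `cooccurrence_by_period`, the fixed, non-overlapping 20-year windows, and the toy paragraphs are all assumptions introduced here for illustration only. The sketch counts, for each period, how many paragraphs contain each unordered pair of terms, yielding one weighted term network per period.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_by_period(paragraphs, period_length=20):
    """Count paragraph-level term co-occurrences within each time period.

    paragraphs: iterable of (year, terms) pairs, where terms is a list of
        already-extracted terms for one paragraph (hypothetical input format).
    period_length: window width in years (an assumed value, not the paper's).

    Returns {period_start_year: Counter({(term_a, term_b): n_paragraphs})}.
    """
    networks = {}
    for year, terms in paragraphs:
        # Assign the paragraph to a non-overlapping window by its start year.
        start = year - (year % period_length)
        counts = networks.setdefault(start, Counter())
        # Each unordered pair of distinct terms in a paragraph is one tie.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return networks


# Toy input, invented for illustration; these are not SoU data.
toy_paragraphs = [
    (1801, ["treaty", "commerce", "nation"]),
    (1805, ["treaty", "commerce"]),
    (1921, ["economy", "employment"]),
]
networks = cooccurrence_by_period(toy_paragraphs, period_length=20)
```

Keeping a separate network per period means a term's ties are only ever compared with those of its contemporaries. Categories would then have to be induced from each period's network in a further step, for example with a community detection routine, which this sketch leaves out.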
Methodological Background
Our analysis strategy falls into a class of text analysis methods broadly characterized as co-occurrence approaches (11), which induce categories by relying on terms' joint appearance over a particular unit of text (12). The central aim of our approach is to parsimoniously identify relevant and interpretable higher-level units of meaning endogenously, and to track their coevolution through time.
The core problem for analysts of text produced over very long historical periods is that key terms change, but for different reasons (language use shifts, new inventions join the world, concepts are recast and reorganized), making it difficult to distinguish meaningful from meaningless change. In general, canonical approaches to text analysis have not been sensitive to the fluidity of meaning over time, either on the level of individual terms or of higher-level context, conceived as categories, topics, classes, or discussions. Fig. 1 illustrates the two main reasons that a co-occurrence approach is uniquely well suited to analysis of the SoU and other historical corpora. First, in contexts where the reasons for changing word use are unclear and hard to disentangle, attention to the relationships between words is crucial for understanding the significance of such changes. Second, the co-occurrence structure, an abstraction of the changing context of use, is itself directly interpretable. In this sense, a frontal approach like co-occurrence analysis is preferable to other methods that identify categories in text, but require additional steps to make those categories accessible to int