Clark - Metadata is the Message

The Metadata is the Message: Assessing, Curating and Publishing Data for the Humanities

Ashley M. Clark

Digital Scholarship Group

Northeastern University Libraries

Cultures of Reception

● Initiative to transcribe and publish texts responding to early modern works by women

● Collected documents include:

○ Reviews (theatrical and literary)

○ Extracts

○ Essays

○ Biographies

● As part of the transcription process, encoders:

○ Note the original source of the document;

○ Identify the work(s) by women mentioned or reviewed;

○ Identify the women creators mentioned or reviewed;

○ Tag the document with relevant themes, formats, and genres; and

○ Classify the reception of the main work on a positive-negative scale.

Background

● Cultures of Reception was funded by the National Endowment for the Humanities

● The Women Writers Project (WWP) began work on this initiative in 2010 at Brown University

● In 2013, the WWP moved to Northeastern University Libraries' Digital Scholarship Group (DSG)

● Transcription for Cultures of Reception continued at Northeastern, with an entirely new group of encoders

The State of the Data

● Transcription records were created by encoders in a web interface

● Records stored in CouchDB as JSON objects

● Pain points for WWP:

○ Transcription accomplished with the help of buttons to insert basic XML tags, but:

■ no pretty printing,

■ no well-formedness checks

○ CouchDB requires JavaScript knowledge for querying

○ Inconsistent names and titles, for example:

■ "Horace Juvenal" is the same person as

■ "Mary Darby Robinson" is the same person as

■ "Mary Robinson"
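
The missing well-formedness check listed among the pain points above could be approximated outside the web interface. The following is a minimal sketch in Python, assuming each record is a JSON object that holds its transcription as a string of XML; the field names `_id` and `transcription` and the sample markup are hypothetical, not the WWP's actual schema.

```python
import json
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML once wrapped in a single root.

    Transcriptions may contain several sibling elements, so the fragment is
    wrapped in a throwaway root element before parsing.
    """
    try:
        ET.fromstring(f"<wrapper>{fragment}</wrapper>")
        return True
    except ET.ParseError:
        return False


# Hypothetical record: the field names are illustrative assumptions.
record = json.loads(
    '{"_id": "example-1", "transcription": "<p>A <title>National Tale</p>"}'
)

# The <title> element is never closed, so the check reports a problem.
if not is_well_formed(record["transcription"]):
    print(f"{record['_id']}: transcription is not well-formed XML")
```

A check like this only tests well-formedness; it says nothing about whether the tags are valid against a schema such as TEI.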

Transcription Interface

Sample JSON Record

An Example from the Corpus

1. The Women Writers Project's transcription of

2. "Art. VII. The Wild Irish Girl; a National Tale. By Miss Owenson..." in volume ns 57 of The Monthly Review; or Literary Journal, which reviews

3. The London edition of Lady Morgan's The Wild Irish Girl, published in 1806 by Sir Richard Phillips.

Publication Challenges

● 690 transcriptions of "reviews"

● Users are unlikely to want to find any one particular review

● Users are very likely to want to explore reviews by

○ Reviewed or mentioned author,

○ Reviewed or mentioned work,

○ Source book or periodical,

○ Publication location,

○ Tags (e.g. theme)
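
Exploring reviews by author, work, source, location, or tag amounts to building an index from each facet value to the reviews that carry it. The sketch below illustrates that idea in Python under stated assumptions: the record shape, the facet names (`authors`, `tags`), and all identifiers are placeholders, not the project's data model.

```python
from collections import defaultdict


def build_facet_index(reviews, facet):
    """Map each value of one facet to the ids of the reviews that carry it."""
    index = defaultdict(list)
    for review in reviews:
        # A review may have several values per facet, or none at all.
        for value in review.get(facet, []):
            index[value].append(review["id"])
    return dict(index)


# Placeholder records: ids and facet values are invented for illustration.
reviews = [
    {"id": "review-a", "authors": ["author-1"], "tags": ["theme-x"]},
    {"id": "review-b", "authors": ["author-1", "author-2"], "tags": []},
]

by_author = build_facet_index(reviews, "authors")
print(by_author["author-1"])  # every review that mentions author-1
```

An index like this only works when every review refers to an author or work by the same identifier, which is why inconsistent names and titles stand in the way of publication.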

Data Cleanup

● For longevity within the WWP:

○ Exported records into TEI-encoded XML;

○ Placed records under version control;

○ Created descriptive filenames for reference and display

● For findability:

○ Created canonical metadata entries for authors, works, and sources;

○ Ensured transcription records included identifier references to the canonical entries of subjects of interest

● For readability:

○ Created descriptive transcription record names

○ Created shorthand titles of works and sources

○ Minimally tidied TEI encoding
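
The findability step above, canonical entries plus identifier references, can be sketched as a lookup from name variants to a single identifier. This is an illustrative Python sketch only: the identifier `person-robinson` and the structure of the canonical entries are assumptions, while the three name variants are the ones cited earlier as referring to the same person.

```python
def normalize(name: str) -> str:
    """Lower-case a name and collapse runs of whitespace before comparison."""
    return " ".join(name.lower().split())


# Hypothetical canonical entry: one identifier with its known name variants.
CANONICAL_PEOPLE = {
    "person-robinson": ["Mary Robinson", "Mary Darby Robinson", "Horace Juvenal"],
}


def build_lookup(entries):
    """Invert the canonical entries into a variant -> identifier table."""
    lookup = {}
    for identifier, variants in entries.items():
        for variant in variants:
            lookup[normalize(variant)] = identifier
    return lookup


def resolve(name, lookup):
    """Return the canonical identifier for a name, or None if it is unknown."""
    return lookup.get(normalize(name))


lookup = build_lookup(CANONICAL_PEOPLE)
print(resolve("Horace  Juvenal", lookup))
print(resolve("mary darby robinson", lookup))
```

Names that resolve to None would still need human review; pseudonyms in particular cannot be matched by string comparison alone.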

Sample Bibliography Entry

Referencing from Reviews

Women Writers in Review

● The web publication for the Cultures of Reception corpus

● Emphasis on discovery and exploration through links and faceting

● Powered by the same API available to researchers for data access

● Future plans:

○ Incorporate visualizations

○ Highlight temporal and geographic shifts

○ Clean up XML encoding for share-ability

Thank you!

as.clark@northeastern.edu

wwp.northeastern.edu/review

wwp.northeastern.edu/blog

@Nuwwp
