The Metadata is the Message
Assessing, Curating and Publishing Data for the Humanities
Ashley M. Clark
Digital Scholarship Group
Northeastern University Libraries
Cultures of Reception
● Initiative to transcribe and publish texts responding to early modern works by women
● Collected documents include:
○ Reviews (theatrical and literary)
○ Extracts
○ Essays
○ Biographies
● As part of the transcription process, encoders:
○ Note the original source of the document;
○ Identify the work(s) by women mentioned or reviewed;
○ Identify the women creators mentioned or reviewed;
○ Tag the document with relevant themes, formats, and genres; and
○ Classify the reception of the main work on a positive-negative scale.
Background
● Cultures of Reception was funded by the National Endowment for the Humanities
● The Women Writers Project (WWP) began work on this initiative in 2010 at Brown University
● In 2013, the WWP moved to Northeastern University Libraries' Digital Scholarship Group (DSG)
● Transcription for Cultures of Reception continued at Northeastern, with an entirely new group of encoders
The State of the Data
● Transcription records were created by encoders in a web interface
● Records stored in CouchDB as JSON objects
● Pain points for WWP:
○ Transcription accomplished with the help of buttons to insert basic XML tags, but:
■ no pretty printing,
■ no well-formedness checks
○ CouchDB requires JavaScript knowledge for querying
○ Inconsistent names and titles, for example:
■ "Horace Juvenal" is the same person as
■ "Mary Darby Robinson", who is the same person as
■ "Mary Robinson"
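One of the pain points above, the lack of well-formedness checks in the transcription interface, can be illustrated with a minimal Python sketch. This is not the WWP's tooling: the function name `check_well_formed` and the sample fragments are hypothetical, and the check only tests whether a fragment parses as XML, not whether it is valid TEI.

```python
import xml.etree.ElementTree as ET


def check_well_formed(fragment: str) -> tuple[bool, str]:
    """Return (True, "") if the fragment parses as XML, else (False, reason)."""
    # Wrap the fragment in a dummy root element so that a transcription
    # with several top-level elements still parses as one document.
    wrapped = f"<root>{fragment}</root>"
    try:
        ET.fromstring(wrapped)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


# Hypothetical fragments, for illustration only.
good = "<p>A review of <title>The Wild Irish Girl</title>.</p>"
bad = "<p>A review of <title>The Wild Irish Girl</p>"  # <title> never closed

for label, fragment in (("good", good), ("bad", bad)):
    ok, reason = check_well_formed(fragment)
    print(label, "well-formed" if ok else f"not well-formed: {reason}")
```

A check like this could run whenever a record is saved, so that tagging mistakes surface during transcription instead of during later cleanup.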
Transcription Interface
Sample JSON Record
An Example from the Corpus
1. The Women Writers Project's transcription of
2. "Art. VII. The Wild Irish Girl; a National Tale. By Miss Owenson..." in volume ns 57 of The Monthly Review; or Literary Journal, which reviews
3. The London edition of Lady Morgan's The Wild Irish Girl, published in 1806 by Sir Richard Phillips.
Publication Challenges
● 690 transcriptions of "reviews"
● Users are unlikely to want to find any one particular review
● Users are very likely to want to explore reviews by
○ Reviewed or mentioned author,
○ Reviewed or mentioned work,
○ Source book or periodical,
○ Publication location,
○ Tags (e.g. theme)
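The exploration paths listed above amount to faceting: grouping review records by each value of a field. A minimal Python sketch follows. The record layout, field names, and placeholder values are all hypothetical and are not taken from the Cultures of Reception data.

```python
from collections import defaultdict

# Placeholder records; the field names ("authors", "works", "source",
# "tags") are assumptions made for this example.
REVIEWS = [
    {"id": "review-001", "authors": ["Author A"], "works": ["Work X"],
     "source": "Periodical P", "tags": ["theme-1"]},
    {"id": "review-002", "authors": ["Author B"], "works": ["Work Y"],
     "source": "Periodical P", "tags": ["theme-1", "theme-2"]},
    {"id": "review-003", "authors": ["Author A", "Author B"],
     "works": ["Work X", "Work Y"], "source": "Periodical Q", "tags": []},
]


def build_facets(records, fields):
    """Map each field to {value: [ids of the records carrying that value]}."""
    facets = {field: defaultdict(list) for field in fields}
    for record in records:
        for field in fields:
            value = record.get(field, [])
            # Treat single-valued fields (e.g. "source") like one-item lists.
            values = value if isinstance(value, list) else [value]
            for item in values:
                facets[field][item].append(record["id"])
    return facets


facets = build_facets(REVIEWS, ["authors", "works", "source", "tags"])
print(facets["authors"]["Author A"])
print(facets["source"]["Periodical P"])
```

An index of this shape answers "which reviews mention this author?" directly, which is the kind of question users are expected to ask more often than a lookup of one particular review.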
Data Cleanup
● For longevity within the WWP:
○ Exported records into TEI-encoded XML;
○ Placed records under version control;
○ Created descriptive filenames for reference and display
● For findability:
○ Created canonical metadata entries for authors, works, and sources;
○ Ensured transcription records included identifier references to the canonical entries of subjects of interest
● For readability:
○ Created descriptive transcription record names;
○ Created shorthand titles of works and sources;
○ Minimally tidied TEI encoding
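The findability step above, canonical entries plus identifier references, can be sketched in Python as a lookup from name variants to one identifier. The identifier `person:robinson-mary`, the dictionary layout, and the function names are hypothetical; only the two name forms come from the slides.

```python
# One hypothetical canonical entry. The identifier format is invented
# for this example and is not the WWP's scheme.
CANONICAL_PEOPLE = {
    "person:robinson-mary": {
        "name": "Mary Robinson",
        "variants": ["Mary Darby Robinson"],
    },
}


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def build_lookup(entries: dict) -> dict:
    """Map every normalized name form to its canonical identifier."""
    lookup = {}
    for identifier, entry in entries.items():
        for name in [entry["name"], *entry["variants"]]:
            lookup[normalize(name)] = identifier
    return lookup


def resolve(name: str, lookup: dict):
    """Return the canonical identifier for a name, or None if unknown."""
    return lookup.get(normalize(name))


lookup = build_lookup(CANONICAL_PEOPLE)
print(resolve("Mary Darby Robinson", lookup))
print(resolve("mary  robinson", lookup))
print(resolve("Someone Else", lookup))
```

Storing the identifier in each transcription record, instead of the name string as typed, means that a later correction to the canonical entry does not require editing every review that mentions the person.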
Sample Bibliography Entry
Referencing from Reviews
Women Writers in Review
● The web publication for the Cultures of Reception corpus
● Emphasis on discovery and exploration through links and faceting
● Powered by the same API available to researchers for data access
● Future plans:
○ Incorporate visualizations
○ Highlight temporal and geographic shifts
○ Clean up XML encoding for share-ability