
Page 1: 20140106 qu seminar

Digital Libraries: History, Technology, R&D

Edward A. Fox
Professor, Computer Science, Virginia Tech
Blacksburg, VA 24061 USA
[email protected]    http://fox.cs.vt.edu

6 Jan. 2014

Page 2: 20140106 qu seminar

Outline

Acknowledgments
Introduction
History
Technology
Research
Development
Summary and Discussion

Page 3: 20140106 qu seminar

Sponsored by Qatar University & Qatar National Library

Funding provided through the ELISQ project: Electronic Library Institute - SeerQ

HTTP://WWW.QU.EDU.QA/    HTTP://qnl.qa

HTTP://WWW.TAMU.EDU/    HTTP://WWW.PSU.EDU/    HTTP://WWW.VT.EDU/

Page 4: 20140106 qu seminar

ELISQ Project Team

Qatar University, Qatar: Mohammed Samaka (Ph.D., Co-Lead PI); Sumaya Ali S A Al-Maadeed (Ph.D., PI); Myrna Tabet; Asad Nafees; Tahseena Moideen
Virginia Tech, USA: Edward Fox (Ph.D., Lead-PI); Tarek Kanan
Penn. State University, USA: C. Lee Giles (Ph.D., PI); Sagnik Ray Choudhury
Texas A&M, USA: Richard Furuta (Ph.D., PI); Hamed Alhoori
Qatar National Library, Qatar: Claudia Lux (PI); Krishna Roy Chowdhury; Postdoc - TBA
Consultants: John Impagliazzo (Ph.D., Key Investigator); Susan Lukesh (Ph.D.); Carole Thompson

This project was made possible by NPRP Grant # 4-029-1-007 from the Qatar National Research Fund (a member of Qatar Foundation).

Page 5: 20140106 qu seminar

Acknowledgements

Dr. Mazen Hasna, VP and Chief Academic Officer, Qatar University
Dr. Rashid Alammari, Dean, College of Engineering, Qatar University
Dr. Moumen Hasnah, Director of Academic Research, Qatar University
Dr. Claudia Lux, Qatar National Library Director
Dr. Imad Bachir, Qatar University Library Director
Dr. Munir Tag, Acting Director Technical, ICT Program Manager (QNRF)
Ms. Krishna Roy Chowdhury, Associate Director for Library IT, Qatar National Library
Prof. Sebti Foufou, Head of Department of Computer Science and Engineering, Qatar University

Page 6: 20140106 qu seminar

Additional Thanks

Qscience, providing collection: Christopher J. Leonard, Editorial Director; Paul Coyne, CTO

US National Science Foundation (recent and current grants to Fox): IIS-1319578, IIS-0916733, DUE-0840719, OCI-1032677, plus those to PSU, TAMU

Page 8: 20140106 qu seminar

Introduction

Reasons to be here
o Interested
o Find what to do with your content
o Find how to help your user community

http://www.morganclaypool.com/toc/icr/1/1
1. DL Introduction, 5S framework (2012)
2. DL Quality, Integration (2013)
3. DL Technologies (in press)
4. DL Applications (in press)

Page 13: 20140106 qu seminar

DLs Shorten the Chain to

[Figure: diagram with nodes Author, Reader, Digital Library, Editor, Reviewer, Teacher, Learner, Librarian]

Page 14: 20140106 qu seminar

Digital Library Content

Content Types:
o Text Documents: Articles, Reports, Books
o Video, Audio: Speech, Music
o Geographic Information: (Aerial) Photos
o Models, Simulations
o Software, Programs
o Bio Information: Genome (Human, animal, plant)
o Images and Graphics: 2D, 3D, VR, CAT

Page 15: 20140106 qu seminar

Content Based Information Retrieval

Page 17: 20140106 qu seminar

Digital  Library  Reference  Model  1.0  p.  30  of  234  

Page 18: 20140106 qu seminar

Informal 5S DL Definitions

DLs are complex systems that:
o help satisfy info needs of users (societies)
o provide info services (scenarios)
o organize info in usable ways (structures)
o present info in usable ways (spaces)
o communicate info with users (streams)

Page 19: 20140106 qu seminar

Information Life Cycle

[Figure: cycle diagram with the stages below]
o Authoring, Modifying
o Organizing, Indexing
o Storing, Retrieving
o Distributing, Networking
o Retention / Mining
o Accessing, Filtering
o Using
o Creating

Page 20: 20140106 qu seminar

Services

Infrastructure Services
o Repository-Building, Creational: Acquiring, Cataloging, Crawling (focused), Describing, Digitizing, Federating, Harvesting, Purchasing, Submitting
o Repository-Building, Preservational: Conserving, Converting, Copying/Replicating, Emulating, Renewing, Translating (format)
o Add Value: Annotating, Classifying, Clustering, Evaluating, Extracting, Indexing, Measuring, Publicizing, Rating, Reviewing (peer), Surveying, Translating (language)

Information Satisfaction Services
o Browsing, Collaborating, Customizing, Filtering, Providing access, Recommending, Requesting, Searching, Visualizing

Page 21: 20140106 qu seminar

SeerSuite is Not Google

Metadata (as in library catalogs) as well as content
Sets of collections, rather than the Web as a whole
o Provided by a curator (e.g., publisher, museum)
o Provided by user submissions
o Or collected by focused ‘crawling’
Tailored services, rather than the same for everyone
o Browsing using categories, preserving, adding value
o Based on studying user requirements, e.g., chemists
Working with entities, rather than just words
o Citations, tables, figures, names, chemical formulas
o Using knowledge bases, machine learning, artificial intelligence

Page 23: 20140106 qu seminar

History Overview

1991, esp. from Information Retrieval
Connecting computer, library, and information science communities
NSF DL Initiative 1 in 1994 included funding for Stanford, where Google was prototyped
International conferences in the Americas (JCDL), as well as Europe (TPDL, by DELOS), Asia (ICADL)
Publishers: ACM, …
DOIs, (Institutional) Repositories
Spinoffs: content & courseware management systems
Recently including (linked) data

Page 24: 20140106 qu seminar

www.nsdl.org

Page 26: 20140106 qu seminar

Institutional Repositories

“Institutional repositories are digital collections that capture and preserve the intellectual output of a single university or a multiple institution community of colleges and universities.”

Crow, R. “Institutional repository checklist and resource guide”, SPARC, Washington, D.C., USA
www.arl.org/sparc/IR/IR_Guide_v1.pdf

Page 27: 20140106 qu seminar

NDLTD: www.ndltd.org

Networked Digital Library of Theses and Dissertations (NDLTD)

Vision: Every thesis and dissertation in the world is:
o Devised to take advantage of the most helpful electronic publishing methods
o Shared globally and easily found
o Supported by a suite of digital library services to aid authors, researchers, learners, universities
o Preserved and migrated permanently

Page 28: 20140106 qu seminar

Crisis, Tragedy, and Recovery (CTR) Network / Integrated Digital Event Archive & Library (IDEAL)

Human tragedies that result from man-made and natural events affect humans and communities significantly.
During and after a tragic event, there is a series of needs that have to be addressed.
o Compounded by communication failures and a confusing plethora of data and information

Page 29: 20140106 qu seminar

CTRnet (Crisis, Tragedy & Recovery Net)
Disaster Locations

Page 30: 20140106 qu seminar

CTRnet (Crisis, Tragedy & Recovery Net)
Word Clouds of Japan Earthquake and Libya Revolution (using tweets)

[Figure: word clouds for the Libya Revolution and the Japan Earthquake, Tsunami Disaster; updated every 10 minutes]
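A word cloud is driven by term frequencies, so the counting step behind a display like this can be sketched in a few lines. This is only an illustration, not CTRnet's code: the function name `term_counts`, the tiny stopword list, and the sample tweets are all invented for the example.

```python
import re
from collections import Counter

# Assumption: a very small stopword list, just for the illustration.
STOPWORDS = {"the", "a", "in", "of", "and", "to", "is"}

def term_counts(tweets, top_n=5):
    """Count non-stopword terms across tweets and return the top_n."""
    counts = Counter()
    for tweet in tweets:
        # Lowercase, then keep runs of letters and apostrophes as words.
        for word in re.findall(r"[a-z']+", tweet.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts.most_common(top_n)

# Invented sample tweets, not real data.
sample_tweets = [
    "Earthquake hits Japan",
    "Tsunami warning after the earthquake in Japan",
    "Japan earthquake relief",
]
top_terms = term_counts(sample_tweets, top_n=2)
```

Re-running such a count on a schedule over newly collected tweets would give a periodically refreshed cloud; the rendering itself is left out here.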

Page 31: 20140106 qu seminar

CTR stakeholders

Page 32: 20140106 qu seminar

CINET: Network Science Middleware

Page 33: 20140106 qu seminar

CINET: Network Science Middleware

Netviz: Course project that aims to develop a visualization component for CINET, which contains large network graphs. The visualization service will get networks from CINET, convert them from Galib to GEXF format, then visualize the graphs using Gephi.

[Figure: CINET network displayed using Gephi]
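The conversion step can be illustrated with a small sketch that writes a GEXF document Gephi can open. The Galib input format is not described in these slides, so the sketch assumes the network has already been read into a plain list of (source, target) pairs; the function name `edges_to_gexf` is hypothetical and this is not the Netviz implementation.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"

def edges_to_gexf(edges):
    """Serialize an undirected edge list as a minimal GEXF 1.2 string."""
    gexf = ET.Element("gexf", xmlns=GEXF_NS, version="1.2")
    graph = ET.SubElement(gexf, "graph", defaultedgetype="undirected")
    nodes_el = ET.SubElement(graph, "nodes")
    edges_el = ET.SubElement(graph, "edges")
    # Every endpoint that appears in an edge becomes a node.
    node_ids = sorted({node for edge in edges for node in edge})
    for node in node_ids:
        ET.SubElement(nodes_el, "node", id=str(node), label=str(node))
    for i, (source, target) in enumerate(edges):
        ET.SubElement(edges_el, "edge", id=str(i),
                      source=str(source), target=str(target))
    return ET.tostring(gexf, encoding="unicode")

# Assumed input: edges already parsed from the source format.
gexf_text = edges_to_gexf([("a", "b"), ("b", "c")])
```

Writing `gexf_text` to a `.gexf` file gives something that can be loaded for layout and display.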

Page 35: 20140106 qu seminar

Web Archiving

Introduction: Web archiving is the process of
o gathering up data recorded on the World Wide Web,
o storing it,
o ensuring the data is preserved in an archive, and
o making the collected data available for future research.

The Internet Archive and several national libraries initiated Web archiving practices in 1996.

Page 36: 20140106 qu seminar

Crawler (Heritrix) (for search engines & Web archives)

A Web crawler starts with a list of URLs to visit, called the seeds.
On those pages, it
o identifies all the hyperlinks
o adds them to the list of URLs to visit
o recursively visits pages pointed to
o according to a set of policies.
Prioritizes its downloads, since some pages change often.
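The seed-and-frontier loop described on this slide can be sketched as follows. This is a minimal illustration, not Heritrix: it walks a hypothetical in-memory link table (`TOY_WEB`) instead of fetching pages, and the only "policy" is an assumed `max_pages` limit.

```python
from collections import deque

def crawl(seeds, get_links, max_pages=100):
    """Breadth-first crawl: visit the seeds, then the pages they link to."""
    frontier = deque(seeds)   # URLs waiting to be visited
    seen = set(seeds)         # URLs already queued, to avoid revisits
    visited = []              # URLs in the order they were visited
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in get_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited

# Hypothetical link table standing in for real HTTP fetches and HTML parsing.
TOY_WEB = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": [],
}

order = crawl(["a"], lambda url: TOY_WEB.get(url, []))
```

A production crawler would replace the plain queue with a prioritized one, so that pages known to change often are revisited sooner.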

Page 37: 20140106 qu seminar

Focused Crawlers

For a particular topic or event
o to build a Web collection focused in that area
Start with URLs of interest, viewed as seeds to grow from
o Expand in a ‘smart’ way to get all and only what is relevant
Use information retrieval / artificial intelligence / machine learning
o Require ‘knowledge bases’ and/or human training examples
Nevertheless, there is a tradeoff between the resulting
o Recall (i.e., coverage of what is out there)
o Precision (i.e., freedom from noise in what is collected)
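The two measures in the tradeoff above have standard definitions, shown here as a small sketch. The function name `precision_recall` and the page identifiers are made up for the example; in practice the set of relevant pages is not fully known and has to be estimated.

```python
def precision_recall(collected, relevant):
    """Return (precision, recall) for a crawled collection.

    precision = relevant pages collected / all pages collected
    recall    = relevant pages collected / all relevant pages
    """
    collected, relevant = set(collected), set(relevant)
    hits = collected & relevant
    precision = len(hits) / len(collected) if collected else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall

# Invented example: 4 pages collected, 3 pages truly relevant, 2 in common.
p, r = precision_recall({"p1", "p2", "p3", "p4"}, {"p1", "p2", "p5"})
```

Here half of what was collected is relevant, and two of the three relevant pages were found. Crawling more aggressively tends to raise recall while lowering precision.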

Page 38: 20140106 qu seminar

SeerSuite Instantiations

CiteSeerX   http://citeseerx.ist.psu.edu   A scientific literature digital library and search engine
ChemXSeer   http://chemxseer.ist.psu.edu   Portal for researchers in environmental chemistry integrating the scientific literature with experimental, analytical, and simulation results and tools
ArchSeer   http://archseer.ist.psu.edu/   Archeology literature
TableSeer   Any field with tables

Page 39: 20140106 qu seminar

CiteSeerX   http://citeseerx.ist.psu.edu

3 M documents
Millions of files
60 M citations
3 to 6 M authors
2 to 4 M hits per day
100K documents added monthly
800K individual users
Several Tbytes

CiteSeerX crawls researcher homepages on the web for scholarly papers, formerly in computer science
Converts PDF to text
Automatically extracts OAI metadata and other data
Automatic citation indexing, links to cited documents, creation of document page, author disambiguation
Software is open source and can be used to build other such tools

Page 42: 20140106 qu seminar

SeerSuite

Tool kit used to build search engines and digital libraries
o CiteSeerX, MyCiteSeerX, ChemXSeer, ArchSeer, AlgoSeer, AckSeer, BizSeer, CSSeer, CollabSeer, RefSeer, GrantSeer, SeerSeer, YouSeer, etc.
o Built on commercial grade open source tools (Solr/Lucene)
o Penn State expertise: automated specialized metadata extraction

Supports research in
o Indexing and search
o Data mining & structures
o Information and knowledge extraction
o Social networks: Name/entity disambiguation
o Scientometrics/informetrics
o Systems engineering
o User interface design (HCI = human-computer interaction)
o Software engineering and management
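Search engines such as those built on Lucene rest on an inverted index, which maps each term to the documents that contain it. The toy sketch below shows the idea only; it does not use Solr or Lucene, and the names `build_index` and `search` and the three sample documents are invented for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercased term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every term in the query (AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Invented sample documents.
docs = {
    1: "digital library search",
    2: "citation indexing for digital libraries",
    3: "web search engine",
}
index = build_index(docs)
```

With this index, `search(index, "digital search")` returns only document 1. A real engine adds tokenization, stemming, and relevance ranking on top of the same structure.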

Page 43: 20140106 qu seminar

ChemXSeer Highlights

Portal for academic researchers in chemistry which integrates the scientific literature with experimental, analytical and simulation results and tools
Provides unique metadata extraction, indexing and searching pertinent to the chemical literature by using heuristics combined with machine learning
o Chemical formulae and names
o Tables
o Figures
o Publication functions as in CiteSeerX
o Expert and expertise search
After extraction, data is stored as API-accessible XML for users.
Hybrid repository: serves as a federated information interoperational system
o Scientific papers crawled and indexed from the web
o User submitted papers and datasets (e.g. Excel worksheets, Gaussian and CHARMM toolkit outputs)
o Scientific documents and metadata from publishers, web or archives
Access control for proprietary provided content and user-submitted experiment data
Takes advantage of in-house open source projects such as CiteSeerX/SeerSuite.

Page 44: 20140106 qu seminar

Example Formula Search

Page 46: 20140106 qu seminar

Users -- TAMU

Requirements (content, services)
Practices (scholarly, information seeking)
Social framework (collaboration, recommendation)
Interviews, surveys
Evaluations: usability, benefits

Page 47: 20140106 qu seminar

Infrastructure -- PSU

Computers, software, launching infrastructure at:
o QU: powerful server, now crawling, and ready to help any group interested in curating a collection
o VT, QNL (postdoc), QCRI (Prof. Mitra), …
Adapt to disciplines, interesting parts of documents
Adapt to each collection
o Develop knowledge base and heuristics for the collection
o Change document parser
o Change database to match what occurs
o Change extractors: document -> database

Page 48: 20140106 qu seminar

Arabic -- VT

Handle Arabic text documents
Obtain a suitable category/classification system
Have people provide ‘training set’
Use machine learning to automatically classify future Arabic text documents
Support cross-language information retrieval
o Arabic question against English documents
o English question against Arabic documents
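The train-then-classify workflow on this slide can be illustrated with a small naive Bayes sketch. The slides do not say which learning method the project uses, so naive Bayes is an assumption chosen for brevity; the two categories and the four whitespace-tokenized training texts are invented, and real Arabic text would first need normalization and stemming.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Build naive Bayes counts from (text, label) training pairs."""
    word_counts = defaultdict(Counter)  # label -> token frequencies
    label_counts = Counter()            # label -> number of training texts
    vocab = set()
    for text, label in examples:
        tokens = text.split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab

def classify(model, text):
    """Return the label with the highest smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.split():
            # Add-one (Laplace) smoothing so unseen words do not zero out a label.
            score += math.log((word_counts[label][token] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented training set: two sports texts and two economy texts.
training_set = [
    ("مباراة فريق كرة", "sports"),
    ("فريق كرة", "sports"),
    ("سوق بنك أسهم", "economy"),
    ("بنك سوق", "economy"),
]
model = train(training_set)
```

Calling `classify(model, "كرة فريق")` picks the sports label, because both words appear only in the sports training texts.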

Page 49: 20140106 qu seminar

Arabic Handwriting -- QU

Images of historic documents
Arabic text extracted
Mapping from a part of the text to the corresponding part of the image
Special tools for
o Those processing the original documents
o Those doing research with the collection
Will allow work on non-textual collections too, e.g., museum images, set of photos for teaching architecture
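One simple way to represent the text-to-image mapping mentioned above is a list of regions, each tying a character range in the transcription to a bounding box on the page image. The sketch below is an assumed data layout for illustration only; the `Region` fields, the pixel values, and `region_for_offset` are all hypothetical and do not describe the project's tools.

```python
from dataclasses import dataclass

@dataclass
class Region:
    start: int   # offset of the first character in the transcription
    end: int     # offset one past the last character
    box: tuple   # (x, y, width, height) in image pixels

def region_for_offset(regions, offset):
    """Return the region whose character range contains offset, or None."""
    for region in regions:
        if region.start <= offset < region.end:
            return region
    return None

# Invented values: two words on one line of a page image.
regions = [
    Region(start=0, end=5, box=(400, 10, 120, 40)),
    Region(start=6, end=11, box=(260, 10, 130, 40)),
]
hit = region_for_offset(regions, 7)
```

Given a selection in the extracted text, the returned box says which part of the image to highlight; the reverse lookup (from a clicked pixel to text) would scan the boxes instead.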

Page 50: 20140106 qu seminar

Accessible Collections in Qatar -- QNL

What collections have the highest priority?
What special handling is needed for each class, for each subclass of collection type?
How do DLs best fit into the activities of the National Library?
Can .qa be fully archived for Wayback Machine use?

Page 52: 20140106 qu seminar

DL Curriculum Framework

Course structure:
o Semester 1: DL collections: development/creation
o Semester 2: DL services and sustainability

Core DL topics:
o Digitization, Storage, Interchange
o Digital objects, Composites, Packages
o Metadata, Cataloging, Author submission
o Naming, Repositories, Archives
o Spaces (conceptual, geographic, 2/3D, VR)
o Architectures (agents, buses, wrappers/mediators), Interoperability
o Services (searching, linking, browsing, etc.)
o Intellectual property rights mgmt., Privacy, Protection (watermarking)
o Archiving and preservation, Integrity

Related topics:
o Documents, E-publishing, Markup
o Info. Needs, Relevance, Evaluation, Effectiveness
o Thesauri, Ontologies, Classification, Categorization
o Bibliographic information, Bibliometrics, Citations
o Routing, Filtering, Community filtering
o Search & search strategy, Info seeking behavior, User modeling, Feedback
o Info summarization, Visualization
o Multimedia streams/structures, Capture/representation, Compression/coding
o Content-based analysis, Multimedia indexing
o Multimedia presentation, rendering

Page 53: 20140106 qu seminar

Modules

http://en.wikiversity.org/wiki/Curriculum_on_Digital_Libraries
o Table 1: Core DL Modules
o Table 2: Information Retrieval Packages
o Table 3: Big Data
o Table 4: Multimedia Software
Like lesson plans, for a training session or lecture
Can be used for self-study, refreshers

Page 54: 20140106 qu seminar

http://curric.dlib.vt.edu/modDev/modDev.html

Page 55: 20140106 qu seminar

ELISQ Audience (Users)

http://elisq.qu.edu.qa/

Primary:
o Librarians and libraries in Qatar
o Researchers and academics
o Government organizations
o Non-Governmental organizations (such as http://www.fsd.org.qa/)

Secondary:
o University / School Students
o Teachers / Faculty
o Managers
o Qatari citizens
o Other stakeholders

Page 56: 20140106 qu seminar

ELISQ Project (1 of 2)

Project Objectives/Aims

A. Research and prototype digital library systems and infrastructure for Qatar, focusing initially on Qatari information related to government and scholarly activities.

Leverage the crawling engine from Penn State's SeerSuite software infrastructure, and extend it beyond its current focus on English to support Arabic-English collections, and to cover a broad range of scholarly disciplines, and all types of government information.

Page 57: 20140106 qu seminar

ELISQ Project (2 of 2)

Project Objectives/Aims (continued)

B. Research and build the digital library community in Qatar, supporting digital library use, services, collection development, tailored systems, and advancing toward a Knowledge Society.

Study scholarly activities, and engage in community building in Qatar, so DLs can be tailored to specific domains and to the unique needs of Qatar. Through workshops, a consulting center at the proposed Institute, and collaborative efforts with libraries and museums in Qatar, we will identify particular needs and uses, and tailor collections, systems, and services, to lead toward the Qatari Knowledge Society.

Page 58: 20140106 qu seminar

Significance to Librarians, Corporations, and Governmental Agencies

The need to preserve cultural and historical heritage =>
o Collections of fragile and precious artifacts =>
o Libraries, museums, and archives developing digital collections =>
o Users from all over the world accessing and studying
A one stop search of:
o Information about Qatar
o Information to preserve the culture of Qatar
Deep indexing, analysis, and retrieval of:
o Resources, reports, statistics, and other types of information
o Information in the Arabic language as well as in English

Page 59: 20140106 qu seminar

ELISQ Content

Metadata, data, and many types of documents (including full text)
Qatari resources that first appeared in digital form - ‘born’ digital
At a later stage the project will include:
o Digital versions of material already existing in print
o Multimedia (image, audio, video) forms
Free and open as well as content with limited access

Page 60: 20140106 qu seminar

ELISQ Focus

Community in Qatar
o Identify interested stakeholders, to tailor to needs
o Train next generation of digital librarians, archivists, and curators
o Partners helping with additional collection development
Advanced Technology for Enhanced Access
o “Low hanging fruit” by crawling Qatar-related Web
o Improved analysis (citations, tables, chemicals, …)
o Support for both Arabic and English

Page 62: 20140106 qu seminar

Summary (some highlights)

Introduction to digital libraries: 5S, any content
History: since 1991, Google, repositories
Technology: SeerSuite, Heritrix, Solr, HCI
o Initial collections: Qscience, news, …
Research: extend SeerSuite; Arabic
o Adapt other tools for handwriting collection, non-text collections
Development: consulting center (addressing needs)

Page 63: 20140106 qu seminar

Questions for You

What communities should be served?
What collections should be made accessible?
What services are required?
What are the priorities in the above?
Can you help us find suitable partners, content owners, curators, user groups?

Page 64: 20140106 qu seminar

Questions for Us?

http://elisq.qu.edu.qa/
[email protected]
http://fox.cs.vt.edu