Analytics @ Lancaster University Library IGeLU 2014 John Krug, Systems and Analytics Manager, Lancaster University Library http://www.slideshare.net/jhkrug/igelu-analytics-2014


Lancaster University, the Library and Alma

• We are in Lancaster, in the North West of the UK.
• ~12,000 FTE students, ~2,300 FTE staff
• Library has 55 FTE staff; building refurbishment in progress
• University aims to be 10, 100 – Research, Teaching, Engagement
• Global outlook with partnerships in Malaysia, India, Pakistan and a new Ghana campus
• Alma implemented January 2013 as an early adopter.
• I am Systems and Analytics Manager, at LUL since 2002 to implement Aleph – systems background, not library
• How can library analytics help?

Alma Analytics reporting and dashboards

• Following implementation of Alma, analytics dashboards were rapidly developed for common reporting tasks
• Ongoing work in this area, refining existing reports and developing new ones

Results

Fun with BLISS

347 lines of this!

B Floor 9AZ (B)

Projects & Challenges

• LDIV – Library Data, Information & Visualisation
• ETL experiments done using PostgreSQL and Python
• Data from Aleph, Alma, Ezproxy, etc.
• Smaller projects:
• e.g. Re-shelving performance – requires using Alma Analytics returns data along with the number of trolleys re-shelved daily.
• Challenges – infrastructure, skills, time
• Lots of new skills/knowledge needed for analytics. For us: Alma Analytics (OBIEE), Python, Django, PostgreSQL, Tableau, nginx, OpenResty, Lua, JSON, XML, XSL, statistics, data preparation, ETL, etc.

Alma analytics data extraction

• Requires using a SOAP API (thankfully a RESTful API is now available for Analytics)
• SOAP support in Python is not very good; REST is much better supported. Currently using the suds Python library with a few bug fixes for compression, ‘&’ encoding, etc.
• A script, get_analytics, invokes the required report, manages collection of multiple ‘gets’ if the data is large and produces a single XML file as the result.
• Needs porting from SOAP to REST.
• Data extraction from Alma Analytics is straightforward, especially with REST
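A minimal sketch, in Python, of how a REST port of get_analytics might merge several ‘gets’ into a single XML result. It assumes each response carries Row elements plus ResumptionToken and IsFinished elements; that is an assumption about the Alma Analytics REST response, not something stated on the slides. The names collect_rows and fetch_page are hypothetical, and fetch_page stands in for whatever performs the HTTP GET.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def collect_rows(fetch_page):
    """Call fetch_page(token) until the report is finished.

    fetch_page is a hypothetical callable: it receives None on the first
    call and the resumption token afterwards, and returns one page of
    the report as XML text. All Row elements are gathered under a single
    <rows> root element.
    """
    combined = ET.Element("rows")
    token = None
    while True:
        root = ET.fromstring(fetch_page(token))
        finished = True  # stop if the page does not say otherwise
        for element in root.iter():
            name = local_name(element.tag)
            if name == "Row":
                combined.append(element)
            elif name == "ResumptionToken" and element.text:
                token = element.text.strip()
            elif name == "IsFinished":
                finished = (element.text or "").strip().lower() == "true"
        # Without a token there is no way to ask for the next page.
        if finished or token is None:
            return combined
```

Writing the result with ET.ElementTree(combined).write("report.xml") would give the single-file output the current script produces.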

Data from other places

• Ezproxy logs
• Enquiry/exit desk query statistics
• Re-shelving performance data
• Shibboleth logs, hopefully soon. We are dependent on central IT services
• Library building usage counts
• Library PC usage statistics
• JUSP & USTAT aggregate usage data
• University faculty and department data
• Social networking
• New Alma Analytics subject areas, especially uResolver data

Gaps in the electronic resource picture

• Currently we have aggregate data from JUSP, USTAT
• Partial off-campus picture from ezproxy, but web orientated rather than resource orientated
• Really want the data from Shibboleth and uResolver
• Why the demand for such low-level data about individuals?

The library and learner analytics

• Learner analytics a growth field
• Driven by a mass of data from VLEs and MOOCs … and libraries
• Student satisfaction & retention
• Intervention(?)
• if low(library borrowing) & low(e-resource access) & high(rate of near-late or late submissions) & low_to_middling(grades) then do_something()
• The library can’t do all that, but the university could/can
• Library can provide data
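The intervention rule above could be sketched in Python roughly as follows. This is illustrative only: the StudentActivity fields, the needs_attention name and every threshold are hypothetical, chosen to show the shape of the rule rather than any values in use.

```python
from dataclasses import dataclass


@dataclass
class StudentActivity:
    """Hypothetical per-student summary joined from several data sources."""
    loans: int                   # physical loans in the period
    eresource_sessions: int      # e-resource sessions in the period
    late_submission_rate: float  # fraction of near-late or late submissions
    mean_grade: float            # mean mark, 0-100


def needs_attention(student, *, max_loans=2, max_sessions=5,
                    min_late_rate=0.3, grade_ceiling=55.0):
    """Return True when all four conditions of the rule hold.

    All thresholds are made-up defaults for illustration.
    """
    return (
        student.loans <= max_loans
        and student.eresource_sessions <= max_sessions
        and student.late_submission_rate >= min_late_rate
        and student.mean_grade <= grade_ceiling
    )
```

The library could supply only the first two fields; the rest would have to come from elsewhere in the university, which is the point of the last two bullets.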

The library as data provider

• LAMP – Library Analytics & Metrics Project from JISC
• http://jisclamp.mimas.ac.uk
• We will be exporting loan and anonymised student data for use by LAMP.
• They are experimenting with dashboards and applications
• Prototype application later this year.
• Overlap with our own project LDIV
• The Library API
• For use by analytics projects within the university
• Planning office, Student Services and others

The Library API

• Built using OpenResty, nginx, Lua
• REST-like API interface
• e.g. Retrieve physical loans for a patron

• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml (or json)

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <record>
    <call_no>AZKF.S75 (H)</call_no>
    <loan_date>2014-07-10 15:44:00</loan_date>
    <num_renewals>0</num_renewals>
    <bor_status>03</bor_status>
    <rowid>3212</rowid>
    <returned_date>2014-08-15 10:16:00</returned_date>
    <collection>MAIN</collection>
    <rownum>1</rownum>
    <material>BOOK</material>
    <patron>b3ea5253dd4877c94fa9fac9</patron>
    <item_status>01</item_status>
    <call_no_2>B Floor Red Zone</call_no_2>
    <bor_type>34</bor_type>
    <key>000473908000010-200208151016173</key>
    <due_date>2015-06-19 19:00:00</due_date>
  </record>
</response>

[
  {
    "rownum": 1,
    "key": "000473908000010-200208151016173",
    "patron": "b3ea5253dd4877c94fa9fac9",
    "loan_date": "2014-07-10 15:44:00",
    "due_date": "2015-06-19 19:00:00",
    "returned_date": "2014-08-15 10:16:00",
    "item_status": "01",
    "num_renewals": 0,
    "material": "BOOK",
    "bor_status": "03",
    "bor_type": "34",
    "call_no": "AZKF.S75 (H)",
    "call_no_2": "B Floor Red Zone",
    "collection": "MAIN",
    "rowid": 3212
  }
]
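A hedged sketch of how a consumer inside the university might use this endpoint from Python. The URL shape and field names come from the example above; the function names ploans_url and parse_loans and the default values for start and number are hypothetical.

```python
import json
from datetime import datetime
from urllib.parse import urlencode

BASE_URL = "http://lib-ldiv.lancs.ac.uk:8080"  # host and port as shown in the example

DATE_FIELDS = ("loan_date", "due_date", "returned_date")


def ploans_url(patron, start=1, number=10, fmt="json"):
    """Build the physical-loans URL; the defaults are illustrative only."""
    query = urlencode({"start": start, "number": number, "format": fmt})
    return f"{BASE_URL}/ploans/{patron}?{query}"


def parse_loans(json_text):
    """Decode a JSON response and turn the date strings into datetimes."""
    loans = json.loads(json_text)
    for loan in loans:
        for field in DATE_FIELDS:
            if loan.get(field):
                loan[field] = datetime.strptime(loan[field], "%Y-%m-%d %H:%M:%S")
    return loans
```

Fetching is left out on purpose: any HTTP client can GET the URL from ploans_url and pass the response body to parse_loans.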

How does it work?

• GET http://lib-ldiv.lancs.ac.uk:8080/ploans/0010215?start=45&number=1&format=xml

• Nginx configuration maps REST url to database query

location ~ /ploans/(?<patron>\w+) {
    ## collect and/or set default parameters
    rewrite ^ /ploans_paged/$patron:$start:$nrows.$fmt;
}

location ~ /ploans_paged/(?<patron>\w+):(?<start>\d+):(?<nrows>\d+)\.json {
    postgres_pass database;
    rds_json on;

    postgres_query HEAD GET "
        select * from ploans
        where patron = $patron
          and row >= $start and row < $start + $nrows";
}
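To make the mapping easier to follow, here is the same idea sketched in Python: a regular expression picks the patron, start row and row count out of the rewritten path and turns them into a paged query. This is an illustration, not the production setup. The names ROUTE and query_for are hypothetical, the row column is written as rownum to match the field in the sample output, and the query uses "?" placeholders (sqlite3 style) so the patron value is passed as a parameter rather than spliced into the SQL.

```python
import re

# Mirrors the second nginx location: /ploans_paged/<patron>:<start>:<nrows>.json
ROUTE = re.compile(
    r"^/ploans_paged/(?P<patron>\w+):(?P<start>\d+):(?P<nrows>\d+)\.json$"
)

PAGED_SQL = (
    "select * from ploans "
    "where patron = ? and rownum >= ? and rownum < ?"
)


def query_for(path):
    """Map a rewritten path to (sql, params), or None if it does not match."""
    match = ROUTE.match(path)
    if match is None:
        return None
    start = int(match["start"])
    nrows = int(match["nrows"])
    # Half-open range [start, start + nrows), as in the nginx query.
    return PAGED_SQL, (match["patron"], start, start + nrows)
```

With a DB-API connection, conn.execute(*query_for(path)) would run the query; a PostgreSQL driver such as psycopg2 would need %s placeholders instead of "?".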

Proxy for making Alma Analytics API requests

• e.g. Analytics report which produces
• nginx configuration

• So users of our API can get data directly from Alma Analytics, while we manage the interface they use and shield them from any API changes at Ex Libris.

location /aa/patron_count {
    set $b "api-na.hosted.exlibri … lytics/reports";
    set $p "path=%2Fshared%2FLancas … tron_count";
    set $k "apikey=l7xx6c0b1f6188514e388cb361dea3795e73";
    proxy_pass https://$b?$p&$k;
}
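The same shielding idea, sketched in Python under stated assumptions: callers ask for a short report name, and only the proxy knows the upstream URL, report path and API key. The REPORTS table, the report path, the base URL and the upstream_url function are all hypothetical placeholders; the real values are elided on the slide and are not reconstructed here.

```python
from urllib.parse import urlencode

# Hypothetical registry: short public name -> report path at the upstream service.
REPORTS = {
    "patron_count": "/shared/Example/Reports/patron_count",
}


def upstream_url(name, base_url, api_key, reports=REPORTS):
    """Translate a short report name into the full upstream request URL.

    Raises KeyError for names that are not published through the proxy.
    """
    if name not in reports:
        raise KeyError(f"unknown report: {name}")
    query = urlencode({"path": reports[name], "apikey": api_key})
    return f"{base_url}?{query}"
```

If the upstream API changes, only the registry and this one function need to follow; the short names that users call stay the same.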

Re-thinking approaches

• Requirements workshops
• Application development
• Data provider via API interfaces
• RDF/SPARQL capability
• LDIV – Library Data, Information and Visualisation
• Still experimenting
• Imported data from ezproxy logs, GeoIP databases, student data, primo logs, a small amount of Alma data
• Really need Shibboleth and uResolver data
• Tableau as the dashboard to these data sets
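As an example of the preparation step for one of these sources, here is a small Python sketch that counts requests per target host in ezproxy-style log lines. It assumes an NCSA common-log layout with the full URL in the request field; the actual log format depends on local configuration, and the names LOG_LINE and hosts_requested are hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed layout: host ident user [time] "METHOD URL PROTOCOL" status size
LOG_LINE = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def hosts_requested(lines):
    """Count requests per target hostname, skipping lines that do not parse."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue
        hostname = urlsplit(match["url"]).hostname
        if hostname:
            counts[hostname] += 1
    return counts
```

The resulting counts could then be loaded into a database table and used as a dashboard data source.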

Preliminary results

More at http://public.tableausoftware.com/profile/john.krug#!/

• First UK Analytics SIG meeting Oct 14 following EPUG-UKI AGM

• Questions?