Relative Trends in Scientific Terms on Twitter
Victoria Uren, Aba-Sah Dadzie
The OAK Group, Dept. of Computer Science, The University of Sheffield
altmetrics11: Tracking scholarly impact on the social Web

Introduction
• scientific research traditionally disseminated via journals, books, scientific conferences
• new form of discourse (online social media): a suitable forum for disseminating scientific research?
  – do scientists engage with online social media?
  – are there sufficient amounts of information on scientific topics?
• are there suitable metrics for measuring scientific impact online?
  – between scientists?
  – for public engagement?
• are these new measures comparable to formal metrics?

Outline
• Aims / Introduction
• Related Work
• Experiment
  – Data
  – Analysis & Results
• Conclusions
• Next Steps
• Acknowledgements

Related Work
• Garfield, E. (from 1950s)
  – father of scientometrics
• Priem et al. (2010)
  – Scientometrics 2.0 as a new metric for measuring scholarly impact on the social web
• Lane (2010)
  – need to improve metrics used to measure scientific impact
• Michel et al. (2011)
  – Google NGrams to analyse culture
  – among others, recognised fame for scientists low …
• Cheong et al. (2009)
  – H1N1 spike (trend) detected on Twitter during flu pandemic (May 2009)
• Rowe et al. (2011)
  – influence of content and author features on prediction of active, long-term discussions on the social web
• Kinsella et al. (2011)
  – using hyperlinked metadata to aid categorisation of topics discussed in online social media

Experiment
• exploratory experiment
  – to determine frequency of occurrence of scientific term usage in online social media
• dataset
  – three sets of (scientific) terms selected from UNESCO thesaurus
  – Google Books NGrams corpus used as a baseline
  – 300 tweets collected in each sample, using Twitter API, for selected terms
• frequency / usage analysis

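The frequency analysis named on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: it assumes the tweets are already available as plain strings (collection through the Twitter API is not shown), and the function names and the whole-word, case-insensitive matching rule are assumptions made for the example.

```python
import re
from collections import Counter

# The nine terms listed on the slides. The matching rule used below
# (case-insensitive, whole word) is an assumption, not the authors' method.
TERMS = [
    "ionization", "electromagnetism", "crystallography",  # Physical Sciences
    "phosphorus", "alkalinity", "microchemistry",         # Chemical Sciences
    "permafrost", "lithosphere", "glaciology",            # Earth Sciences
]

def count_term_tweets(tweets, terms=TERMS):
    """Return, per term, the number of tweets that mention it at least once."""
    counts = Counter()
    for text in tweets:
        words = set(re.findall(r"[a-z]+", text.lower()))
        for term in terms:
            if term in words:
                counts[term] += 1
    return counts

def relative_frequency(counts, terms=TERMS):
    """Share of each term among all term mentions in the sample."""
    total = sum(counts[t] for t in terms)
    return {t: (counts[t] / total if total else 0.0) for t in terms}

# Illustrative input only: three short strings, not one of the 300-tweet samples.
sample = [
    "Permafrost Melt Spews Combustible Methane",
    "The proper total alkalinity for your pool is 100 ppm.",
    "36 inches of permafrost still",
]
counts = count_term_tweets(sample)
print(relative_frequency(counts))
```

Counting tweets rather than raw word occurrences keeps a single repetitive tweet from inflating a term; whether the original analysis made the same choice is not stated on the slides.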
UNESCO Thesaurus 1Gram Terms
Topic               Terms
Physical Sciences   Ionization, Electromagnetism, Crystallography
Chemical Sciences   Phosphorus, Alkalinity, Microchemistry
Earth Sciences      Permafrost, Lithosphere, Glaciology
• selection criteria
  – minimisation of noise due to polysemy
  – avoidance of scientific terms with other common / colloquial usage
  – terms unique to a particular topic
  – words with a single stem
  – 1Grams only

Baseline Dataset – Google 1Grams
• obtained from Google Books NGrams corpus [1]
• total NGrams by year for three sets of terms
  – 2006: 116,029
  – 2007: 126,206
  – 2008: 111,417
• annual variation by topic (of total NGrams baseline dataset)
  – Chemical Sciences 50-60%
  – Physical Sciences 30-40%
  – Earth Sciences ~10%
• [1] http://ngrams.googlelabs.com/datasets

Twitter Dataset
Sample ID   Collection Period                                             Elapsed Time (h)
T-300-1     Tue Mar 01 20:56:43 GMT 2011 – Thu Mar 03 14:22:18 GMT 2011   41
T-300-2     Fri Mar 04 02:35:55 GMT 2011 – Sun Mar 06 18:38:05 GMT 2011   64
T-300-3     Mon Mar 07 20:31:11 GMT 2011 – Wed Mar 09 16:21:36 GMT 2011   44
• three samples collected, containing 300 consecutive tweets each
• ~0.003% of total tweets over collection period

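The elapsed times in the table can be reproduced from the collection timestamps. The sketch below assumes the timestamp layout shown on the slide and an English locale for day and month names; `FMT`, `PERIODS` and `elapsed_hours` are names introduced here for illustration.

```python
from datetime import datetime

# Timestamp layout as printed on the slide, e.g. "Tue Mar 01 20:56:43 GMT 2011".
# "GMT" is matched as literal text; parsing %a/%b assumes an English locale.
FMT = "%a %b %d %H:%M:%S GMT %Y"

PERIODS = {
    "T-300-1": ("Tue Mar 01 20:56:43 GMT 2011", "Thu Mar 03 14:22:18 GMT 2011"),
    "T-300-2": ("Fri Mar 04 02:35:55 GMT 2011", "Sun Mar 06 18:38:05 GMT 2011"),
    "T-300-3": ("Mon Mar 07 20:31:11 GMT 2011", "Wed Mar 09 16:21:36 GMT 2011"),
}

def elapsed_hours(start, end):
    """Elapsed time between two timestamps, rounded to the nearest hour."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return round(delta.total_seconds() / 3600)

for sample_id, (start, end) in PERIODS.items():
    print(sample_id, elapsed_hours(start, end))
```

Rounded to the nearest hour, the three periods come out at 41, 64 and 44 hours, matching the Elapsed Time column.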
Twitter cf. Google NGrams
• higher variation in distribution for Twitter sample
  – however largely in line with Google NGrams
• can Google NGrams serve as a suitable baseline?
  – need to examine variation more closely …
• notable peaks in Twitter sample for three terms
  – Permafrost (Earth Sciences)
  – Alkalinity (Chemical Sciences)
  – Phosphorus (Chemical Sciences)
• are these potential trends?

Twitter cf. Google NGrams
• Permafrost
  – 17% and 15% in Twitter samples (T-300-1 & 2), cf. 5% in G-2006-2008
  – 41 out of 113 tweets (36%) used in scientific context
  – large number of tweets referred to
    • online game server [1]
    • designer case for iPhone
• Alkalinity
  – none found to have scientific content
  – mostly used in pseudo-scientific health advice
  – peak in T-300-2 (31 out of 60 tweets, ~50%)
    • dominated by pH measures in swimming pools & fish tanks
    • influence probably due to collection period (weekend, engagement in leisure activities)
• [1] http://www.everquest2.com/Permafrost

Example Tweets – Permafrost
• advert / chat
  – @HDNinjacp go to Permafrost Its never full : Fri Mar 04 05:21:02 GMT 2011
  – @Riffy8888 hey Could you Come to my Party birthday Party on CP March 13 Server Permafrost Dock 6:00 PST : Sun Mar 06 04:37:13 GMT 2011
  – Party Server Permafrost Dock Please Go It's An Early Birthday Party For me : Thu Mar 03 01:38:28 GMT 2011
• cold
  – 36 inches of permafrost still, I want to stake my bird condo b4 the squirrals knock it down again.. bastids.. all of 'm : Sat Mar 05 01:51:48 GMT 2011
• science
  – Fire and Ice: Permafrost Melt Spews Combustible Methane http://tiny.ly/be8q : Fri Mar 04 16:43:10 GMT 2011
  – (retweeted) - Experts Monitor Methane Release from Permafrost: Over the past few years, methane levels around the world have b... http://bit.ly/hvVEJX : Wed Mar 02 12:27:25 GMT 2011
  – RT @NetNewsBuzz: Permafrost Melt Soon Irreversible Without Major Fossil Fuel Cuts http://tinyurl.com/5w8w2oh #oil #climate #CO2 #fossilfuels : Thu Mar 03 02:57:48 GMT 2011

Example Tweets – Alkalinity T-300-2
• Chemistry Help Needed! pH, concentration of carbonate species and alkalinity... just got published: http://bit.ly/hUCpz7
  – URL points to the question on “My Chemistry Tutor” (homework?)
• retweeted
  – The proper total alkalinity for your pool is 100 ppm. http://su.pr/8hrxCE : Fri Mar 04 19:02:20 GMT 2011
  – If the Total Alkalinity in your swimming pool is low, your pH will be low. http://su.pr/8hrxCE : Fri Mar 04 20:34:11 GMT 2011
• spam / adverts (including retweets)
  – @Poet_Carl_Watts: some foods create acidity or alkalinity after they're metabolized... http://ping.fm/GQTvA #KnowledgeIsPower! : Sat Mar 05 02:38:55 GMT 2011
  – RT @CourtneyPool: Green juice, oh Liquid Emerald Elixir of Life and Alkalinity! Course through my BODY! #juicing : Sun Mar 06 18:34:29 GMT 2011

Twitter cf. Google NGrams: Phosphorus
Sample ID   Total   Legislation   Nutrition   Other Sciences   Industry   White Phosphorus
T-300-1     129     46            16          29               4          5
T-300-2     119     4             26          35               9          5
T-300-3     171     12            23          37               42         19
• Twitter trends for Phosphorus in sample T-300-3
  – Industry
    • takeover of a Brazilian company by the Indian firm United Phosphorus
  – White Phosphorus
    • 17 retweets of an emotive message (relation to Middle East wars)

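To make the T-300-3 peaks easier to see, the category counts in the table can be turned into shares of each sample's Phosphorus tweets. This is a rough sketch under two assumptions that the slides do not confirm: that the Total column counts all Phosphorus tweets in a sample, and that the five categories do not overlap. `category_shares` and `uncategorised` are illustrative names.

```python
CATEGORIES = ["Legislation", "Nutrition", "Other Sciences", "Industry", "White Phosphorus"]

# Counts copied from the table: sample -> (Total, counts per category in CATEGORIES order).
SAMPLES = {
    "T-300-1": (129, [46, 16, 29, 4, 5]),
    "T-300-2": (119, [4, 26, 35, 9, 5]),
    "T-300-3": (171, [12, 23, 37, 42, 19]),
}

def category_shares(total, counts):
    """Percentage of a sample's Phosphorus tweets falling in each category."""
    return {cat: 100.0 * n / total for cat, n in zip(CATEGORIES, counts)}

def uncategorised(total, counts):
    """Tweets outside the five categories (assumes the categories do not overlap)."""
    return total - sum(counts)

for sample_id, (total, counts) in SAMPLES.items():
    shares = category_shares(total, counts)
    summary = ", ".join(f"{cat} {pct:.1f}%" for cat, pct in shares.items())
    print(f"{sample_id}: {summary}; uncategorised {uncategorised(total, counts)}")
```

Under these assumptions, Industry accounts for roughly a quarter of the Phosphorus tweets in T-300-3, compared with a few percent in the other two samples.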
Twitter cf. Google NGrams: Phosphorus
• usage largely with scientific content
  – with relationships, among others, to legal, nutritional & economic context
  – five main categories identified
• Legislation
  – limits to use in fertiliser, soap
• Nutrition
  – phosphorus content
• Other Science
  – peak phosphorus, pollution
  – discovery of arsenic replacing phosphorus in a microbe
  – tweets about new paper on Redfield ratio in organisms
• Industry
  – mergers, prices of Phosphorus-containing goods
• White Phosphorus
  – use in Middle East wars

Example Tweets – Phosphorus
• Legislation
  – RT @YarnPlayCafe: The fact that he wants to repeal the phosphorus ban and kill the Madison lakes is, by itself, enough to #killthisbill... : Tue Mar 08 02:04:49 GMT 2011
• Nutrition
  – Big, wet snowflakes drift over the farm. To warm up, I try some Horlicks, a wheat/barley/whey drink with lots of calcium & phosphorus. Mmmm. : Tue Mar 01 20:56:43 GMT 2011
  – Vitamin D acts as an hormone and plays a controlling role in the metabolism of calcium and phosphorus : Sun Mar 06 12:12:36 GMT 2011
• Other Science
  – [java] 129: Greater Phosphorus Efficiency http://bit.ly/iehsmK #agriculture : Wed Mar 02 14:36:21 GMT 2011
• Industry
  – #stocks #bse #nse Buy United Phosphorus - positive move to tap largest Latin American market; Edelweiss http://dlvr.it/JdSpV : Tue Mar 08 17:22:55 GMT 2011
  – Enshi: Wugang develops technique to handle high-phosphorus iron ore - Steel Business Briefing (subscri http://uxp.in/30538045 : Tue Mar 08 09:33:05 GMT 2011
• White Phosphorus
  – Dear America, your white phosphorus and depleted uranium cannot stop the growth of Iraq's future. Iraq Will Rise. : Wed Mar 02 07:49:21 GMT 2011
  – @Remroum so first they steal our land, now they want our "tactics" i.e. poetry? i guess the white phosphorus just isn't cutting it anymore. : Sat Mar 05 03:42:44 GMT 2011
• ???
  – @p_kojo - Phosphorus Potassium - Pinocchio, I'm so glad we found each other nw we can hav lots of fun :) : Sun Mar 06 13:43:10 GMT 2011

Conclusions – Experiment
• recognised challenges
  – baseline corpus for online social media difficult to obtain
    • very small (relatively) samples found in Twitter stream
    • difficult to obtain representative samples
    • more effective methods required to extract lower frequency terms
  – difficulty reproducing experiments
  – reliability, ethical & privacy issues, due to user-created content
• what is a suitable, publicly available baseline corpus?
  – Google NGrams?
    • different information collection methods from online social media
      – coverage of topics may see large variation between corpora
  – any others?
    • Wikipedia / DBpedia? TREC?

Engagement with the Web?
• why do scientists not tweet (or engage much in other social media)?
  – is the web not seen to enforce sufficient scientific rigour?
  – do scientists not view the web as a potential audience?
    • is the web audience a suitable peer reviewer?
• why do scientists hesitate to disseminate information online?
  – potential for ideas to be stolen?
  – trust: how to differentiate between valid science and pseudo-science, spam and adverts?
• social media largely driven by personal interest, sentiment, opinion
  – may explain low scientific content
  – more colloquial use of what is traditionally scientific terminology

Implications for Altmetrics
• however, some level of scientific discourse on Twitter
  – e.g., Phosphorus identified as a potential Twitter trend
• online social media may still have potential to serve as an altmetric for measuring impact of science
• starting from scientometrics, which looks at author features, e.g.,
  – co-citation
  – affiliation – relationship to reputation
• corresponding features in online social media
  – followers
  – retweets – relationship to trust?

Next Steps
• replicate experiments with larger samples over longer period
  – more detailed analysis
    • e.g., hashtag analysis; URLs within tweets
    • focus on terms with more trending potential, e.g., nanostructures, nanosilver
• consider specific tweets
  – from scientific media and journals
  – posted during scientific conferences, congresses
• comparison with other independent baseline datasets
• compare Twitter use within different disciplines
  – influence of interdisciplinary collaboration on use of online social media?
• create new benchmarks data & experiments
  – define alt-metric for scientific term usage in online social media

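The hashtag and URL analysis proposed among these next steps might start from something like the sketch below. It is an assumption-laden illustration rather than a planned implementation: the regular expressions are deliberately simple, tweets are assumed to be plain strings, and `extract_features` and `hashtag_counts` are hypothetical helper names.

```python
import re
from collections import Counter

# Simple patterns chosen for illustration; real tweets may need more careful rules.
HASHTAG_RE = re.compile(r"(?<!\S)#(\w+)")  # '#' at start of text or after whitespace
URL_RE = re.compile(r"https?://\S+")

def extract_features(text):
    """Return the lower-cased hashtags and the URLs found in one tweet."""
    return {
        "hashtags": [tag.lower() for tag in HASHTAG_RE.findall(text)],
        "urls": URL_RE.findall(text),
    }

def hashtag_counts(tweets):
    """Count hashtag occurrences across a collection of tweets."""
    counts = Counter()
    for text in tweets:
        counts.update(extract_features(text)["hashtags"])
    return counts

# One of the example tweets from the Permafrost slide, with word spacing restored.
tweet = ("RT @NetNewsBuzz: Permafrost Melt Soon Irreversible Without Major "
         "Fossil Fuel Cuts http://tinyurl.com/5w8w2oh #oil #climate #CO2 #fossilfuels")
features = extract_features(tweet)
print(features["hashtags"])
print(features["urls"])
```

Lower-casing the hashtags treats `#CO2` and `#co2` as the same tag, which seems a reasonable default for trend counting but is a design choice rather than something the slides specify.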
Acknowledgements
• Elizabeth Cano for discussions on collection and use of data from Twitter streams
• V.S. Uren & A.-S. Dadzie funded by:
  – European Commission 7th Framework Programme project SmartProducts (grant no. 231204)

References
• Garfield bib - http://garfield.library.upenn.edu/pub.html
• Matthew Rowe, Sofia Angeletou and Harith Alani. (2011) Predicting Discussions on the Social Semantic Web, Proc., ESWC (2) 2011: 405-420
• Sheila Kinsella, Mengjiao Wang, John Breslin and Conor Hayes. (2011) Improving Categorisation in Social Media using Hyperlinks to Structured Data Sources, Proc., ESWC (2) 2011: 390-404
• others in paper references, see http://altmetrics.org/altmetrics11/uren-v0