Digital Transformation: Big Data and Data Science Learning Path

Chula Data Science: Center of Excellence in Multi-Disciplinary Big Data Analytics

Big Data and Data Science Learning Path

Digital Transformation #แบ่งปัน (#Sharing)

Head of Department, Dept. of Computer Engineering, Faculty of Engineering, Chulalongkorn University

natawut.n@chula.ac.th | @natawutn | http://natawutn.wordpress.com | http://www.slideshare.net/natawutnupairoj

Asst. Prof. Natawut Nupairoj, Ph.D.

Data Science = Sensors + Big Data + Data Analytics

The New Equation

Data Analytics Simplified

• Descriptive: “A. Natawut drinks about 1 cup of coffee a day”

• Diagnostic: “The number of cups that A. Natawut drinks depends on the number of meetings he has each day”

• Predictive: “Tomorrow, A. Natawut has 2 meetings, so it is very likely that he will drink 2 cups tomorrow”

• Prescriptive: “Inform the secretary to prepare 1 cup in the morning and 1 in the afternoon for A. Natawut”
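
As a rough illustration of the four levels above, the sketch below runs each one on a small coffee log. The log is invented for this example, and the function names and the use of a least-squares line for the predictive step are assumptions; the slides do not prescribe any particular method.

```python
from math import ceil, sqrt
from statistics import mean

# Hypothetical daily log of (meetings, cups of coffee); invented for illustration.
log = [(0, 0), (1, 1), (2, 2), (1, 1), (3, 3), (0, 0), (1, 1), (0, 0)]
meetings = [m for m, _ in log]
cups = [c for _, c in log]


def descriptive(ys):
    """Descriptive: what happened? Average cups per day."""
    return mean(ys)


def diagnostic(xs, ys):
    """Diagnostic: why? Pearson correlation between meetings and cups."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares line cups = slope * meetings + intercept."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = cov / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def predictive(xs, ys, meetings_tomorrow):
    """Predictive: what will happen? Expected cups, rounded to whole cups."""
    slope, intercept = fit_line(xs, ys)
    return max(0, round(slope * meetings_tomorrow + intercept))


def prescriptive(expected_cups):
    """Prescriptive: what should we do? Split the cups across the day."""
    morning = ceil(expected_cups / 2)
    return {"morning": morning, "afternoon": expected_cups - morning}


tomorrow = predictive(meetings, cups, meetings_tomorrow=2)
plan = prescriptive(tomorrow)
print(descriptive(cups), diagnostic(meetings, cups), tomorrow, plan)
```

Each function answers a different question about the same data, which is the point of the slide: the further down the list, the closer the output is to an action.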

Sensors = App / IoT / Social Network

Big Data = Processing Capabilities

Data Analytics = Domain-Oriented Machine Learning

Introducing FDA-Approved Ingestible Sensors in Pills

http://www.forbes.com/sites/singularity/2012/08/09/no-more-skipping-your-medicine-fda-approves-first-digital-pill/

Case study: Predictive Policing

Used by 60 cities in the US, e.g. Atlanta and LA.

Source: http://www.forbes.com/sites/ellenhuet/2015/02/11/predpol-predictive-policing

NHK Documentary: Disaster Big Data - Key to Recovery

Key Question

“How many people still reside in each area?”

Challenges

• How to process big data?
  • 122M subscribers + 2.5 years of data = 200 TB to 300 TB

• How to analyze data?
  • What is the definition of “residence”?
  • How to sample mobile subscribers correctly?

• How can we understand the results?
  • How to visualize data?
  • How to tell the story?
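
To put the data volume above in perspective, the back-of-the-envelope sketch below converts the slide's figures into bytes per subscriber per day. It assumes decimal terabytes (10^12 bytes) and 365-day years; neither is stated in the slides.

```python
SUBSCRIBERS = 122_000_000   # from the slide
YEARS = 2.5                 # from the slide
DAYS_PER_YEAR = 365         # assumption: leap days ignored
TB = 10**12                 # assumption: decimal terabytes


def bytes_per_subscriber_day(total_tb):
    """Average stored bytes per subscriber per day for a given total volume."""
    subscriber_days = SUBSCRIBERS * YEARS * DAYS_PER_YEAR
    return total_tb * TB / subscriber_days


low = bytes_per_subscriber_day(200)
high = bytes_per_subscriber_day(300)
print(f"{low:.0f} to {high:.0f} bytes per subscriber per day")
```

Under these assumptions the total works out to roughly 1.8 KB to 2.7 KB per subscriber per day, so the difficulty lies in the number of subscriber-days rather than in the size of any single record.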

“Data Science is a Team Sport” – DJ Patil

[Figure: Venn diagram of data scientist skills. Circles: Domain Knowledge, Math & Statistics, Computer Science. Overlaps: Statistical Research, Data Processing, Machine Learning. Center: Data Scientist.]

Data Scientist Skills in the Context of the NHK Documentary

[Figure: skills diagram (Domain Knowledge, Math & Statistics, Computer Science; Statistical Research, Data Processing, Machine Learning), annotated with the questions below.]

• How to store 300 TB of data?
• How to process 300 TB effectively?
• How about data cleansing?
• How to visualize data?

• How to sample data correctly?
• How to turn geolocation into structured data?
• How to predict population accurately?

• How to define “residence”?
• How to tell local people apart from workers?
• How to utilize these results?
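
Two of the questions above, turning geolocation into structured data and defining “residence”, can be made concrete with a small sketch. It snaps each location ping to a grid cell and treats a subscriber as a resident of the cell where they were seen at night on the most distinct dates. The grid size, the night-time window, the minimum number of nights, and the record layout are all hypothetical choices for illustration; this is not the method used in the documentary.

```python
from collections import Counter, defaultdict
from datetime import datetime
from math import floor

CELL_DEG = 0.01               # hypothetical grid size in degrees
NIGHT_HOURS = range(0, 6)     # hypothetical "at home" window: 00:00 to 05:59
MIN_NIGHTS = 3                # hypothetical threshold for calling a cell "home"


def to_cell(lat, lon, size=CELL_DEG):
    """Structured form of a location: integer grid-cell indices."""
    return (floor(lat / size), floor(lon / size))


def home_cells(pings):
    """Map each subscriber to a home cell, if one qualifies.

    pings: iterable of (subscriber_id, timestamp, lat, lon).
    """
    nights = defaultdict(set)  # (subscriber, cell) -> dates seen there at night
    for sid, ts, lat, lon in pings:
        if ts.hour in NIGHT_HOURS:
            nights[(sid, to_cell(lat, lon))].add(ts.date())

    best = {}  # subscriber -> (cell, number of nights)
    for (sid, cell), dates in nights.items():
        if len(dates) >= MIN_NIGHTS and len(dates) > best.get(sid, (None, 0))[1]:
            best[sid] = (cell, len(dates))
    return {sid: cell for sid, (cell, _) in best.items()}


def residents_per_cell(pings):
    """Count residents per grid cell under the definition above."""
    return Counter(home_cells(pings).values())


# Invented example: "a" sleeps in cell X and works in cell Y,
# "b" only visits X during the day, "c" is seen in X on too few nights.
X, Y = (38.265, 140.875), (38.305, 140.915)
pings = []
for day in (1, 2, 3):
    pings.append(("a", datetime(2020, 1, day, 2, 0), *X))
    pings.append(("a", datetime(2020, 1, day, 14, 0), *Y))
    pings.append(("b", datetime(2020, 1, day, 10, 0), *X))
for day in (1, 2):
    pings.append(("c", datetime(2020, 1, day, 3, 0), *X))

print(residents_per_cell(pings))
```

Changing any of the three constants changes who counts as a resident, which is why the definition is a domain question as much as a technical one.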

Modern Data Science Team

Source: http://www.slideshare.net/continuumio/why-open-data-science-matters-gartner-bi-analytics-summit-16

Understanding / Preparation / Modeling / Evaluation / Deployment

http://nirvacana.com/thoughts/becoming-a-data-scientist/

Most In-Demand Skills for Data Scientists in 2016

Source: https://www.crowdflower.com/what-skills-should-data-scientists-have-in-2016/

Final Thoughts

• A good data scientist communicates effectively to business users
• A good data scientist knows your business
• A good data scientist understands statistical phenomena
• A good data scientist makes efficient predictions
• A good data scientist provides production-ready solutions
• A good data scientist can work on a mass scale

https://blog.dataiku.com/2013/11/10/the-six-core-skills-of-a-data-scientist
