Author
fares-al-qunaieer
Networking
Data transportation
Data security
Data privacy
Data storage
Database Systems
Data Quality and Governance
High Performance Computing (HPC)
"If you know the enemy and know yourself, your victory will not stand in doubt; if you know Heaven and know Earth, you may make your victory complete." (Sun Tzu)
"War is 90% information." (Napoleon Bonaparte)
"What gets measured gets managed." (Peter Drucker)
"A Data Warehouse is like a tea bag; you never know how strong it is until you are in hot water." (Eleanor Roosevelt)
Data is a collection of facts, such as numbers, words, measurements, observations, or even just descriptions of things.
Information in raw or unorganized form that refers to, or represents, conditions, ideas, or objects.
DIKW pyramid (Data, Information, Knowledge, Wisdom)
Software Engineering: the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software
Machine Learning: a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed
Data Mining: the practice of examining large pre-existing databases in order to generate new information
Data Analyst: Data analysts generally analyze well-defined sets of data using an arsenal of different tools to answer tangible business needs
Data Engineer: The data engineer gathers and collects the data, stores it, does batch processing or real-time processing on it, and serves it via an API to a data scientist who can easily query it
Data Scientist: Data Scientists estimate the unknown by asking questions, writing algorithms, and building statistical models
Business intelligence (BI): a technology-driven process for analyzing data and presenting actionable information to help corporate executives, business managers, and other end users make more informed business decisions.
Data Science
The term "Data Science" was coined by William S. Cleveland (2001)
It is often attributed to Jeff Hammerbacher and DJ Patil, of Facebook and LinkedIn (2008)
2010-now: exploded
Collection
Integration
Organizing (e.g., rows and columns)
Missing values
Outliers
Normalization
Segmentation
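Three of the preparation steps listed above (missing values, outliers, normalization) can be sketched in a few lines of Python. This is only an illustration under assumptions not stated in the slides: the sample readings are made up, missing entries are filled with the median, outliers are flagged with the common 1.5 × IQR rule, and normalization is min-max scaling. Other choices are equally valid.

```python
from statistics import median, quantiles

def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    mid = median(observed)
    return [mid if v is None else v for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

def min_max_normalize(values):
    """Rescale values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical readings: one missing entry and one suspicious spike.
raw = [12.0, 15.0, None, 14.0, 13.0, 95.0, 16.0]
filled = fill_missing(raw)
outliers = iqr_outliers(filled)
clean = [v for v in filled if v not in outliers]
scaled = min_max_normalize(clean)
```

Whether an outlier should be dropped, capped, or kept depends on the data and the question being asked; the sketch simply drops it.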
Statistics
Programming/Analytics platform
Exploratory analysis
Visualization
Machine learning
Regression
Classification
Clustering
Recommendation/decision support systems
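To make two of the machine learning tasks above concrete, here is a minimal sketch of regression (fitting a straight line by least squares) and classification (labeling a point by its nearest labeled example). The function names and the toy data are hypothetical and chosen only for illustration; real projects would normally rely on a statistics or machine learning library.

```python
import math

def fit_line(xs, ys):
    """Regression: least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def nearest_neighbor_label(point, examples):
    """Classification: return the label of the closest labeled example."""
    closest = min(examples, key=lambda example: math.dist(point, example[0]))
    return closest[1]

# Toy regression data lying exactly on y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])

# Toy labeled examples: (features, label).
examples = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 8.0), "large"),
    ((9.0, 7.5), "large"),
]
label_a = nearest_neighbor_label((2.0, 1.5), examples)
label_b = nearest_neighbor_label((7.0, 9.0), examples)
```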
Quantitative data
Quantitative data deals with numbers and things you can measure objectively: dimensions such as height, width, and length; temperature and humidity; prices; area and volume.
Discrete data is a count that can't be made more precise.
Continuous data, on the other hand, can be divided and reduced to finer and finer levels.
Qualitative data
Qualitative data deals with characteristics and descriptors that can't be easily measured but can be observed subjectively, such as smells, tastes, textures, attractiveness, and color.
Sentiment analysis
Topic modeling
Document categorization
Text summarization
Spam filtering
Information retrieval
Machine translation
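As one example of the text tasks above, information retrieval can be sketched with a bag-of-words model: each text becomes a word-count vector, and documents are ranked by cosine similarity to the query. The documents, the query, and the helper names are invented for illustration, and the tokenizer is deliberately naive (lowercase letters only, no stemming).

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    """Count lowercase word occurrences in a text."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_documents(query, documents):
    """Return the documents ordered from most to least similar to the query."""
    query_vector = bag_of_words(query)
    scored = [(cosine_similarity(query_vector, bag_of_words(doc)), doc)
              for doc in documents]
    return [doc for _, doc in sorted(scored, key=lambda pair: pair[0], reverse=True)]

# Hypothetical mini-collection.
documents = [
    "Cholera cases were mapped across London",
    "The army crossed the river in winter",
    "Temperature and humidity readings from sensors",
]
ranking = rank_documents("cholera map London", documents)
```

Because the tokenizer does no stemming, "map" does not match "mapped"; the first document still ranks highest through "cholera" and "London".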
Signals
A signal is a function that "conveys information about the behavior or attributes of some phenomenon".
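The idea of a signal as a function can be illustrated with a sine wave: a function of time that is then sampled at regular intervals to produce the numbers a computer actually stores. The frequency, sample rate, and function names below are arbitrary assumptions made for this sketch.

```python
import math

def sine_signal(frequency_hz, amplitude=1.0):
    """Build a signal: a function mapping time in seconds to a value."""
    def signal(t):
        return amplitude * math.sin(2 * math.pi * frequency_hz * t)
    return signal

def sample(signal, sample_rate_hz, duration_s):
    """Evaluate the signal at evenly spaced instants."""
    count = int(sample_rate_hz * duration_s)
    return [signal(i / sample_rate_hz) for i in range(count)]

# A 1 Hz tone sampled 8 times per second for one second.
tone = sine_signal(frequency_hz=1.0)
samples = sample(tone, sample_rate_hz=8, duration_s=1.0)
```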
Time series
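A time series is a sequence of values ordered in time, and a common first step when analyzing one is smoothing. The sketch below uses a simple moving average; the readings and the window size are made-up values, and the slides do not prescribe this particular technique.

```python
def moving_average(series, window):
    """Average each run of `window` consecutive values in the series."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

# Hypothetical temperature readings taken at regular intervals.
readings = [20.0, 22.0, 21.0, 25.0, 24.0, 26.0]
smoothed = moving_average(readings, window=3)
```

A larger window gives a smoother curve but reacts more slowly to real changes in the data.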
Graph analytics / Network analysis
Relationship analysis
Path analysis
Connectivity analysis
Centrality analysis
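Three of the analyses named above can be sketched on a small undirected graph stored as an adjacency mapping: path analysis (fewest-hops path via breadth-first search), connectivity analysis (connected components), and centrality analysis (degree centrality). The graph and the function names are hypothetical; dedicated graph libraries offer many more measures.

```python
from collections import deque

# Hypothetical undirected network: node -> set of neighbors.
graph = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
    "F": set(),
}

def shortest_path(graph, start, goal):
    """Path analysis: breadth-first search for a fewest-hops path."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None  # goal is unreachable from start

def connected_components(graph):
    """Connectivity analysis: group nodes that can reach each other."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components

def degree_centrality(graph):
    """Centrality analysis: share of the other nodes each node touches."""
    others = len(graph) - 1
    return {node: len(neighbors) / others for node, neighbors in graph.items()}

path = shortest_path(graph, "A", "E")
components = connected_components(graph)
centrality = degree_centrality(graph)
```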
Creating images, diagrams, or animations to:
Communicate a message
Understand data
Understand concepts
John Snow, 1854: cholera cases in London
Charles Minard: Russian campaign of 1812
6 types of data in 2 dimensions:
the number of Napoleon's troops
distance traveled
temperature
latitude and longitude
direction of travel
location relative to specific dates
Information visualization concentrates on the use of computer-supported tools to explore large amounts of abstract data.
Scientific visualization is the transformation, selection, or representation of data from simulations or experiments, with an implicit or explicit geometric structure, to allow the exploration, analysis, and understanding of the data.
Educational visualization
Scientific visualization