Data Processing and Analysis: A Comprehensive Overview

Dr. Faris bin Saleh Al-Qunaieer, Assistant Research Professor, National Center for Computer Technology and Applied Mathematics, King Abdulaziz City for Science and Technology


Networking, data transportation, data security, data privacy, data storage, database systems, data quality and governance, high performance computing (HPC)

"If you know the enemy and know yourself, your victory will not stand in doubt; if you know Heaven and know Earth, you may make your victory complete." (Sun Tzu)

"War is 90% information." (Napoleon Bonaparte)

"What gets measured gets managed." (Peter Drucker)

"A Data Warehouse is like a tea bag; you never know how strong it is until you are in hot water." (Eleanor Roosevelt)

Data is a collection of facts, such as numbers, words, measurements, observations, or even just descriptions of things.

Information in raw or unorganized form that refers to, or represents, conditions, ideas, or objects.

DIKW pyramid

Software Engineering: the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software

Machine Learning: a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed

Data Mining: the practice of examining large pre-existing databases in order to generate new information

Data Analyst: Data analysts generally analyze well-defined sets of data using an arsenal of different tools to answer tangible business needs

Data Engineer: The data engineer gathers and collects the data, stores it, does batch processing or real-time processing on it, and serves it via an API to a data scientist who can easily query it

Data Scientist: Data Scientists estimate the unknown by asking questions, writing algorithms, and building statistical models

Business Intelligence (BI): a technology-driven process for analyzing data and presenting actionable information to help corporate executives, business managers, and other end users make more informed business decisions

Data Science

The term "Data Science" was coined by William S. Cleveland (2001).
It is often attributed to Jeff Hammerbacher and DJ Patil, of Facebook and LinkedIn (2008).
2010 to now: exploded.

Data Science

Data Science +

Collection, integration, organizing (e.g., rows and columns), missing values, outliers, normalization, segmentation
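Three of the preparation steps listed above (missing values, outliers, normalization) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the readings are hypothetical, missing values are filled with the median, outliers are dropped with Tukey's 1.5 x IQR rule, and normalization is min-max scaling. The slides do not prescribe any of these particular choices.

```python
from statistics import median, quantiles  # quantiles needs Python 3.8+


def fill_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def drop_outliers(values, k=1.5):
    """Keep only values within k * IQR of the first and third quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if low <= v <= high]


def min_max_normalize(values):
    """Rescale values linearly so the smallest is 0.0 and the largest is 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical sensor readings: one missing value and one suspicious spike.
raw = [10.0, 12.0, None, 11.0, 13.0, 95.0]
filled = fill_missing(raw)
cleaned = drop_outliers(filled)
scaled = min_max_normalize(cleaned)
print(scaled)
```

The order matters here: filling gaps first gives the quartile computation a complete list, and removing the spike before scaling keeps one extreme value from compressing all other points toward zero.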

Statistics, programming/analytics platform, exploratory analysis, visualization, machine learning, regression, classification, clustering, recommendation/decision support systems
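As one concrete instance of the regression item above, the following is a minimal sketch of simple linear regression fitted by ordinary least squares. The function names and the sample points are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Estimate y for a new x using the fitted line."""
    return slope * x + intercept


# Hypothetical observations that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, predict(slope, intercept, 5.0))
```

Classification and clustering follow the same pattern of fitting a model to observed data, but predict a category or a group instead of a number.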

Quantitative data

Quantitative data deals with numbers and things you can measure objectively: dimensions such as height, width, and length; temperature and humidity; prices; area and volume. Discrete data is a count that can't be made more precise. Continuous data, on the other hand, can be divided and reduced to finer and finer levels.

Qualitative data

Qualitative data deals with characteristics and descriptors that can't be easily measured but can be observed subjectively, such as smells, tastes, textures, attractiveness, and color.

Sentiment analysis, topic modeling, document categorization, text summarization, spam filtering, information retrieval, machine translation

Signals

A signal is a function that "conveys information about the behavior or attributes of some phenomenon."
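To make the "signal as a function" idea concrete, the sketch below models a signal as a Python function of time and samples it at a fixed rate. The sine wave, the 1 Hz frequency, and the 4 Hz sampling rate are assumptions picked for illustration, not values taken from the slides.

```python
import math


def sine(frequency_hz, amplitude=1.0):
    """Return a signal: a function mapping time in seconds to a value."""
    def signal(t):
        return amplitude * math.sin(2 * math.pi * frequency_hz * t)
    return signal


def sample_signal(signal, sample_rate_hz, duration_s):
    """Evaluate the signal at evenly spaced times starting at t = 0."""
    count = int(round(sample_rate_hz * duration_s))
    return [signal(i / sample_rate_hz) for i in range(count)]


# One second of a 1 Hz sine wave, sampled four times per second.
samples = sample_signal(sine(1.0), sample_rate_hz=4, duration_s=1.0)
print([round(s, 6) for s in samples])
```

Sampling turns the continuous function into a finite list of numbers, which is the form most analysis tools work with.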

Time series

Graph analytics / Network analysis

Relationship analysis, path analysis, connectivity analysis, centrality analysis
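Three of the analysis types named above can be illustrated on a small undirected graph using only the Python standard library. The sketch assumes a hypothetical edge list and uses breadth-first search for path analysis, connected components for connectivity analysis, and normalized degree for centrality analysis. Relationship analysis is not covered, and other measures (for example weighted paths or betweenness centrality) would fit equally well.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Path analysis: fewest-hops path via BFS, or None if unreachable."""
    if start not in graph or goal not in graph:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:  # walk parent links back to start
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


def connected_components(graph):
    """Connectivity analysis: groups of nodes that can reach each other."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        component, queue = set(), deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            component.add(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def degree_centrality(graph):
    """Centrality analysis: fraction of the other nodes each node touches."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


# Hypothetical network: one cluster of four nodes and one separate pair.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D"), ("E", "F")]
graph = build_graph(edges)
print(shortest_path(graph, "A", "D"))
print(connected_components(graph))
print(degree_centrality(graph)["B"])
```

In this example node B has the highest degree centrality because it links directly to three of the five other nodes, while E and F form a component that no path from A can reach.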


Creating images, diagrams, or animations to communicate a message, understand data, and understand concepts

John Snow, 1854: cholera cases, London

Charles Minard: Russian campaign of 1812. 6 types of data in 2 dimensions: the number of Napoleon's troops, distance traveled, temperature, latitude and longitude, direction of travel, and location relative to specific dates

Information visualization concentrates on the use of computer-supported tools to explore large amounts of abstract data.
Scientific visualization is the transformation, selection, or representation of data from simulations or experiments, with an implicit or explicit geometric structure, to allow the exploration, analysis, and understanding of the data.
Educational visualization

Scientific visualization
