Boole Prize 2012 Mark Moriarty University College Cork MATHEMATICS ONLINE Data-Mining, Predictive Analytics, Clustering, A.I., Machine Learning… and where to learn all this.

Mathematics online: some common algorithms

Brief overview of some basic algorithms used online and across data-mining, and a word on where to learn them. Prepared specially for UCC Boole Prize 2012.

3 SECTIONS:

• 1 - Overview of some applications of Maths online.

• 2 - Sample algorithms.

• 3 - Recommended online Maths courses.

SECTION 1 (MOTIVATION): MATHEMATICS IN ACTION

• User Clustering.

• Recommender Systems. Movie recommendations.

• Shopper analytics – send relevant coupons.

• Voice recognition. Machine Learning.

• Spam detection.

• Fraud detection.

• Facebook Feed.

• Google’s PageRank.

• DNA sequencing.

• Health analytics.

• Intelligent ad displays.

• etc.

AWKS…

“My daughter got this in the mail! She’s still in high school, and you’re sending her coupons for baby clothes and cribs? Are you trying to encourage her to get pregnant?!”

HOW TARGET FIGURED OUT A TEEN GIRL WAS PREGNANT BEFORE HER FATHER DID

As Pole’s computers crawled through the data, he was able to identify about 25 products that, when analyzed together, allowed him to assign each shopper a “pregnancy prediction” score. More important, he could also estimate her due date to within a small window, so Target could send coupons timed to very specific stages of her pregnancy.

Take a fictional Target shopper who is 23, and in March bought cocoa-butter lotion, a purse large enough to double as a diaper bag, zinc and magnesium supplements and a bright blue rug. There’s, say, an 87% chance that she’s pregnant and that her delivery date is sometime in late August.

HOW KHAN ACADEMY IS USING MACHINE LEARNING TO ASSESS STUDENT MASTERY

Old method: To determine when a student has finished a certain exercise, they awarded proficiency to a user who has answered at least 10 problems in a row correctly — known as a streak.

New metric for accuracy…

What do I mean by accuracy? Now define it as

$$\text{accuracy} = P(\text{next problem correct} \mid \text{just gained proficiency})$$

which is just notation desperately trying to say “Given that we just gained proficiency, what’s the probability of getting the next problem correct?”

NETFLIX PRIZE

The $1 million top prize went to team BellKor’s Pragmatic Chaos for their verified submission on July 26, 2009, achieving the winning RMSE (root-mean-square error) of 0.8567 on the test subset. This represents a 10.06% improvement over Cinematch’s score on the test subset at the start of the contest.
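For readers who have not met RMSE before, here is a minimal sketch of how it is computed: take the difference between each predicted and actual rating, square it, average, and take the square root. The function name and the ratings below are made up for illustration and are not Netflix data.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical star ratings (illustrative only).
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4, 4, 1, 5]
score = rmse(predicted, actual)
print(round(score, 4))
```

A lower RMSE means the predicted ratings sit closer to what users actually gave, which is why the contest was scored on it.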

PANDORA & THE MUSIC GENOME PROJECT®

• On January 6, 2000 a group of musicians and music-loving technologists came together with the idea of creating the most comprehensive analysis of music ever.

• Together we set out to capture the essence of music at the most fundamental level. We ended up assembling literally hundreds of musical attributes or "genes" into a very large Music Genome.

FACEBOOK NEWS FEED & MACHINE LEARNING

FACEBOOK NEWS FEED

The default wall setting is “Top News”. EdgeRank is there to do the customizing for you, based on how each item scores in the algorithm. The three main criteria for an item's algorithm score are:

1. Affinity: How often you and your friends interact on the platform

2. Weight: Each type of content is weighted differently, based on the past interactions of that type of content

3. Time: How old the published item is
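As a toy illustration of how three such criteria could combine into one score, the sketch below multiplies affinity, content weight, and a time-decay factor. The product form, the half-life decay, and every number are assumptions made for this example; this is not Facebook's actual algorithm.

```python
def item_score(affinity, weight, age_hours, half_life_hours=24.0):
    """Toy score: affinity x content weight x time decay (all hypothetical)."""
    decay = 0.5 ** (age_hours / half_life_hours)  # halves every half_life_hours
    return affinity * weight * decay


# Made-up feed items for illustration.
items = [
    {"name": "old link from close friend", "affinity": 0.9, "weight": 1.0, "age_hours": 96},
    {"name": "status from acquaintance", "affinity": 0.2, "weight": 1.0, "age_hours": 1},
    {"name": "photo from close friend", "affinity": 0.9, "weight": 1.5, "age_hours": 24},
]

ranked = sorted(
    items,
    key=lambda i: item_score(i["affinity"], i["weight"], i["age_hours"]),
    reverse=True,
)
for item in ranked:
    print(item["name"])
```

With these numbers, the day-old photo from a close friend outranks the fresh status from an acquaintance, and the four-day-old link drops to the bottom.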

AD PLACEMENT

MACHINE LEARNING IS EVERYWHERE

Mario learns to survive: http://www.youtube.com/watch?v=m0tJLTXNT0A

SECTION 2: SOME ALGORITHMS, BROKEN DOWN

• Recommender Systems

• Logistic Regression

• K nearest neighbours

• K-means clustering

• Naïve Bayes Classifiers

RECOMMENDER SYSTEMS

[CONTENT-BASED EXAMPLE HERE:]

CONTENT-BASED VS COLLABORATIVE

LOGISTIC REGRESSION

• At the most basic level, for one input variable, linear regression is simply “fitting a line to some data”.

• Let’s look at it in the sample case of the Khan Academy:

LOGISTIC REGRESSION ALGORITHM

• vector x = the values of the input features (e.g. % correct).

• vector w = how much each feature makes it more likely that the user is proficient.

• We can write this compactly as a linear algebra dot product:

$$z = w^{T} x = \sum_{i} w_i x_i$$

Already, you can see that the higher z is, the more likely the user is to be proficient. To obtain our probability estimate, all we have to do is “shrink” z into the interval (0, 1). We can do this by plugging z into a sigmoid function:

$$h(z) = \frac{1}{1 + e^{-z}}$$
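The two steps described above (a dot product, then a sigmoid) can be sketched in a few lines. The feature layout, the weights, and the function names are hypothetical values chosen for illustration, not Khan Academy's trained model.

```python
import math


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proficiency(w, x):
    """Probability estimate: sigmoid of the dot product of w and x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return sigmoid(z)


# Hypothetical features: [constant 1.0 for the bias, fraction correct, streak / 10].
w = [-3.0, 4.0, 2.0]        # made-up weights
x_strong = [1.0, 0.9, 0.8]  # mostly correct, long streak
x_weak = [1.0, 0.4, 0.1]    # mostly wrong, short streak

p_strong = predict_proficiency(w, x_strong)
p_weak = predict_proficiency(w, x_weak)
print(round(p_strong, 3), round(p_weak, 3))
```

In practice the weights w are learned from past student data; here they are simply picked by hand so that the stronger student gets the higher probability.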

LOGISTIC REGRESSION RESULTS

From http://david-hu.com/2011/11/02/how-khan-academy-is-using-machine-learning-to-assess-student-mastery.html

K-NEAREST NEIGHBOUR

Tarring you with the same brush as your k nearest peers.
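A minimal sketch of the idea, assuming Euclidean distance and a simple majority vote among the k closest labelled points. The data, labels, and function name are invented for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train: list of (point, label) pairs. Returns the majority label
    among the k training points closest to query."""
    nearest = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Two made-up groups of 2-D points.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

print(knn_predict(train, (1.1, 1.0)))
print(knn_predict(train, (5.0, 4.8)))
```

There is no training step at all: the "model" is just the stored data, and all the work happens when a new point asks who its neighbours are.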

K-MEANS CLUSTERING

A personal favourite

K-MEANS ALGORITHM SUBSECTION:

• Introduction

• K-means Algorithm

• Example

• K-means Demo

• Relevant Issues

• Conclusion

K-MEANS: INTRODUCTION

• Partitioning Clustering Approach

  • a typical clustering analysis approach via partitioning the data set iteratively

  • construct a partition of a data set to produce several non-empty clusters (usually, the number of clusters is given in advance)

  • in principle, partitions are achieved by minimising the sum of squared distances in each cluster:

$$E = \sum_{i=1}^{K} \sum_{x \in C_i} \lVert x - m_i \rVert^{2}$$

  (here C_i is the i-th cluster and m_i is its centroid)

• Given a K, find a partition of K clusters to optimise the chosen partitioning criterion

• K-means algorithm: each cluster is represented by the centroid of the cluster, and the algorithm converges to stable centres of clusters.

K-MEANS ALGORITHM

• Given the cluster number K, the K-means algorithm is carried out in three steps after initialisation:

Initialisation: set K seed points

1) Assign each object to the cluster with the nearest seed point

2) Compute new seed points as the centroids of the clusters of the current partition (the centroid is the centre, i.e., mean point, of the cluster)

3) Go back to Step 1); stop when there are no more new assignments
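The steps above can be sketched in plain Python. This is a minimal illustration, assuming Euclidean distance and seed points drawn at random from the data; the function names, the fixed random seed, and the sample points are choices made for this example only.

```python
import math
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Return (centres, assignment) for a list of equal-length tuples."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initialisation: k data points as seeds
    assignment = None
    for _ in range(max_iter):
        # Step 1: assign each point to the cluster with the nearest centre.
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centres[j])) for p in points
        ]
        if new_assignment == assignment:  # Step 3: stop when nothing changes
            break
        assignment = new_assignment
        # Step 2: move each centre to the centroid (mean) of its points.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its old centre
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centres, assignment


def sum_squared_error(points, centres, assignment):
    """The criterion E: squared distance of each point to its own centre."""
    return sum(math.dist(p, centres[a]) ** 2 for p, a in zip(points, assignment))


# Two made-up, well-separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centres, assignment = kmeans(points, k=2)
print(assignment, round(sum_squared_error(points, centres, assignment), 3))
```

Which cluster gets called 0 and which 1 depends on the random seeds, so only the grouping is meaningful, not the label numbers.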

K-MEANS DEMO

1. The user sets the number of clusters they’d like. (e.g. K=5)

2. Randomly guess K cluster centre locations

3. Each data point finds out which centre it’s closest to. (Thus each centre “owns” a set of data points)

4. Each centre finds the centroid of the points it owns

5. …and jumps there

6. …Repeat until terminated!

Credit to Ke Chen for the example graphics used in this demo.

RELEVANT ISSUES

• Efficient in computation

  • O(tKn), where n is the number of objects, K is the number of clusters, and t is the number of iterations. Normally, K, t << n.

• Local optimum

  • sensitive to initial seed points

  • may converge to a local optimum that is an unwanted solution

• Other problems

  • Need to specify K, the number of clusters, in advance

  • Unable to handle noisy data and outliers (see the K-Medoids algorithm)

  • Not suitable for discovering clusters with non-convex shapes

  • Applicable only when the mean is defined; what about categorical data? (see the K-Modes algorithm)

RELEVANT ISSUES

• Cluster Validity

  • With different initial conditions, the K-means algorithm may result in different partitions for a given data set.

  • Which partition is the “best” one for the given data set?

  • In theory, there is no answer to this question, as there is no ground truth available in unsupervised learning

  • Nevertheless, there are several cluster validity criteria to assess the quality of clustering analysis from different perspectives

  • A common cluster validity criterion is the ratio of the total between-cluster to the total within-cluster distances

    • Between-cluster distance (BCD): the distance between the means of two clusters

    • Within-cluster distance (WCD): the sum of all distances between data points and the mean in a specific cluster

  • A large ratio of BCD:WCD suggests good compactness inside clusters and good separability among different clusters!
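One way to compute this ratio is sketched below, assuming Euclidean distance, with the total BCD summed over every pair of cluster means and the total WCD summed over every cluster. The function names and sample points are made up, and the ratio is undefined when every point coincides with its cluster mean (total WCD of zero).

```python
import math
from itertools import combinations


def centroid(cluster):
    """Mean point of a list of equal-length tuples."""
    return tuple(sum(c) / len(cluster) for c in zip(*cluster))


def bcd_wcd_ratio(clusters):
    """Total between-cluster distance divided by total within-cluster distance."""
    means = [centroid(c) for c in clusters]
    total_bcd = sum(math.dist(a, b) for a, b in combinations(means, 2))
    total_wcd = sum(math.dist(p, m) for c, m in zip(clusters, means) for p in c)
    return total_bcd / total_wcd


# The same six made-up points, partitioned sensibly and then badly.
good = [[(1.0, 1.0), (1.5, 2.0), (1.0, 0.5)], [(8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]]
bad = [[(1.0, 1.0), (8.0, 8.0), (1.0, 0.5)], [(1.5, 2.0), (9.0, 9.5), (8.5, 9.0)]]

good_ratio = bcd_wcd_ratio(good)
bad_ratio = bcd_wcd_ratio(bad)
print(good_ratio > bad_ratio)
```

Running K-means several times with different seeds and keeping the partition with the largest ratio is one simple way to put this criterion to use.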

CONCLUSION

• K-means algorithm is a simple yet popular method for clustering analysis

• Its performance is determined by the initialisation and the choice of an appropriate distance measure

• There are several variants of K-means to overcome its weaknesses

  • K-Medoids: resistance to noise and/or outliers

  • K-Modes: extension to categorical data clustering analysis

END OF K-MEANS SUBSECTION

• Nearly there now…

ALGORITHM: NAÏVE BAYES

• What is a classifier?

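NAÏVE BAYES ALGORITHM

• Want:

$$P(\text{spam} \mid \text{words})$$

• Use Bayes Rule:

$$P(\text{spam} \mid \text{words}) = \frac{P(\text{words} \mid \text{spam})\,P(\text{spam})}{P(\text{words})}$$

$$P(\text{words}) = P(\text{words} \mid \text{spam})\,P(\text{spam}) + P(\text{words} \mid \text{good})\,P(\text{good})$$

• In English:

$$\text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{evidence}}$$

• Assume independence: the probability of each word is independent of the others, given the class

$$P(\text{words} \mid \text{spam}) = P(\text{word}_1 \mid \text{spam})\,P(\text{word}_2 \mid \text{spam}) \cdots P(\text{word}_n \mid \text{spam})$$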
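The formulas above can be turned into a tiny spam filter. This is a minimal sketch: the training messages and function names are invented, and the add-one smoothing is an extra assumption (not on the slide) so that a word never seen in one class does not force the whole product to zero.

```python
from collections import Counter


def train(messages):
    """messages: list of (list_of_words, label), label is 'spam' or 'good'."""
    word_counts = {"spam": Counter(), "good": Counter()}
    class_counts = Counter()
    for words, label in messages:
        class_counts[label] += 1
        word_counts[label].update(words)
    return word_counts, class_counts


def p_spam_given_words(words, word_counts, class_counts):
    """P(spam | words) via Bayes rule with the independence assumption."""
    vocab = set(word_counts["spam"]) | set(word_counts["good"])
    total_msgs = sum(class_counts.values())
    joint = {}
    for label in ("spam", "good"):
        prior = class_counts[label] / total_msgs            # P(label)
        total_words = sum(word_counts[label].values())
        likelihood = 1.0                                    # P(words | label)
        for w in words:
            # Add-one smoothing: an assumption made for this sketch.
            likelihood *= (word_counts[label][w] + 1) / (total_words + len(vocab))
        joint[label] = likelihood * prior
    # The denominator is P(words), summed over both classes.
    return joint["spam"] / (joint["spam"] + joint["good"])


# Made-up training messages.
messages = [
    (["win", "money", "now"], "spam"),
    (["free", "money", "offer"], "spam"),
    (["meeting", "at", "noon"], "good"),
    (["lunch", "at", "noon", "tomorrow"], "good"),
]
word_counts, class_counts = train(messages)
p_offer = p_spam_given_words(["free", "money"], word_counts, class_counts)
p_meeting = p_spam_given_words(["meeting", "tomorrow"], word_counts, class_counts)
print(round(p_offer, 2), round(p_meeting, 2))
```

For long messages, real implementations usually add up logarithms of the probabilities instead of multiplying them, to avoid the product underflowing to zero.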

SECTION 3: TAKE FREE TOP-CLASS ONLINE MATH COURSES

• ml-class.org

• Udacity.com

• http://mitx.mit.edu/

FREE STANFORD CLASSES, SPRING 2012

SOME OFFER A STATEMENT OF ACCOMPLISHMENT

UDACITY.COM

ITUNES U

For philosophy lectures, I recommend Dreyfus or Searle. -Mark

REFERENCES

• “One Learning Hypothesis” image from http://www.ml-class.org

• Khan Academy discussion from http://david-hu.com/2011/11/02/how-khan-academy-is-using-machine-learning-to-assess-student-mastery.html

• K-Means images from http://www.cs.manchester.ac.uk/ugt/COMP24111/materials/slides/K-means.ppt

• Word equation for Naïve Bayes: http://www.wikipedia.org

• K nearest neighbours image from http://mlpy.sourceforge.net/docs/3.0/_images/knn.png

• Recommender Systems image from http://holehouse.org/mlclass/16_Recommender_Systems.html

QUESTIONS?

2012-22-02 UCC Boole Prize