B2B Travel technology
Machine Learning Introduction
October 20th 2016
Agenda
What’s Machine Learning?
Usage examples
Complexity
Algorithm families
Let’s go!
Troubleshoot
Tech insights
Next steps
Conclusion
Machine learning
Introduction
What’s Machine Learning?
Software that does something without being explicitly programmed to, just by learning from examples
The same software can be used for various tasks
It learns from experience with respect to some task and performance measure, and improves at that task with experience
Usage examples (1/2)
Some typical usage examples
Use cases: MyLittleAdventure (2/2)
Language detection
Clustering
Anomaly detection
Recommendation
Choice of parameters
MyLittleAdventure usage
Complexity
"""Tests for convolution related functionality in tensorflow.ops.nn."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf


class Conv2DTransposeTest(tf.test.TestCase):

  def testConv2DTransposeSingleStride(self):
    with self.test_session():
      strides = [1, 1, 1, 1]

      # Input, output: [batch, height, width, depth]
      x_shape = [2, 6, 4, 3]
      y_shape = [2, 6, 4, 2]

      # Filter: [kernel_height, kernel_width, output_depth, input_depth]
      f_shape = [3, 3, 2, 3]

      x = tf.constant(1.0, shape=x_shape, name="x", dtype=tf.float32)
      f = tf.constant(1.0, shape=f_shape, name="filter", dtype=tf.float32)
      output = tf.nn.conv2d_transpose(x, f, y_shape, strides=strides,
                                      padding="SAME")
      value = output.eval()

      # We count the number of cells being added at the locations in the output.
      # At the center, #cells=kernel_height * kernel_width
      # At the corners, #cells=ceil(kernel_height/2) * ceil(kernel_width/2)
      # At the borders, #cells=ceil(kernel_height/2)*kernel_width or
      #                        kernel_height * ceil(kernel_width/2)

      for n in xrange(x_shape[0]):
        for k in xrange(f_shape[2]):
          for w in xrange(y_shape[2]):
            for h in xrange(y_shape[1]):
              target = 4 * 3.0
              h_in = h > 0 and h < y_shape[1] - 1
              w_in = w > 0 and w < y_shape[2] - 1
              if h_in and w_in:
                target += 5 * 3.0

"""GradientDescent for TensorFlow."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from tensorflow.python.framework import ops
from tensorflow.python.ops import math_ops
from tensorflow.python.training import optimizer
from tensorflow.python.training import training_ops


class GradientDescentOptimizer(optimizer.Optimizer):
  """Optimizer that implements the gradient descent algorithm.

  @@__init__
  """

  def __init__(self, learning_rate, use_locking=False, name="GradientDescent"):
    """Construct a new gradient descent optimizer.

    Args:
      learning_rate: A Tensor or a floating point value. The learning
        rate to use.
      use_locking: If True use locks for update operations.
      name: Optional name prefix for the operations created when applying
        gradients. Defaults to "GradientDescent".
    """
    super(GradientDescentOptimizer, self).__init__(use_locking, name)
    self._learning_rate = learning_rate

  def _apply_dense(self, grad, var):
    return training_ops.apply_gradient_descent(
        var,
        math_ops.cast(self._learning_rate_tensor, var.dtype.base_dtype),
        grad,
        use_locking=self._use_locking).op

  def _apply_sparse(self, grad, var):
    delta = ops.IndexedSlices(
        grad.values *

"""Tests for tensorflow.ops.linalg_grad."""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import tensorflow as tf


class ShapeTest(tf.test.TestCase):

  def testBatchGradientUnknownSize(self):
    with self.test_session():
      batch_size = tf.constant(3)
      matrix_size = tf.constant(4)
      batch_identity = tf.tile(
          tf.expand_dims(
              tf.diag(tf.ones([matrix_size])), 0), [batch_size, 1, 1])
      determinants = tf.matrix_determinant(batch_identity)
      reduced = tf.reduce_sum(determinants)
      sum_grad = tf.gradients(reduced, batch_identity)[0]
      self.assertAllClose(batch_identity.eval(), sum_grad.eval())


class MatrixUnaryFunctorGradientTest(tf.test.TestCase):
  pass  # Filled in below


def _GetMatrixUnaryFunctorGradientTest(functor_, dtype_, shape_, **kwargs_):

  def Test(self):
    with self.test_session():
      np.random.seed(1)
      m = np.random.uniform(low=-1.0,
                            high=1.0,
                            size=np.prod(shape_)).reshape(shape_).astype(dtype_)
      a = tf.constant(m)
      b = functor_(a, **kwargs_)

      # Optimal stepsize for central difference is O(epsilon^{1/3}).
      epsilon = np.finfo(dtype_).eps
      delta = 0.1 * epsilon**(1.0 / 3.0)
      # tolerance obtained by looking at actual differences using
      # np.linalg.norm(theoretical-numerical, np.inf) on -mavx build
Complex algorithm before
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
And Machine learning now…
Machine learning
Algorithm families
Supervised algorithms
Classification
Regression
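As a rough illustration of the two supervised families, the sketch below trains one classifier (discrete label) and one regressor (continuous value). It assumes scikit-learn, which the slides do not name at this point; the toy data and variable names are hypothetical.

```python
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

# Classification: predict a discrete label (hypothetical toy data).
# Features: [weight in g, width in cm]; labels: 0 = mandarin, 1 = apple.
X_cls = [[86, 6.2], [90, 6.0], [178, 7.1], [192, 8.4]]
y_cls = [0, 0, 1, 1]
classifier = DecisionTreeClassifier(random_state=0).fit(X_cls, y_cls)
predicted_label = classifier.predict([[185, 7.5]])[0]

# Regression: predict a continuous value (hypothetical toy data, y = 2x + 1).
X_reg = [[0.0], [1.0], [2.0], [3.0]]
y_reg = [1.0, 3.0, 5.0, 7.0]
regressor = LinearRegression().fit(X_reg, y_reg)
predicted_value = regressor.predict([[4.0]])[0]

print("predicted class:", predicted_label)
print("predicted value:", round(predicted_value, 3))
```

In both cases the training data carries the expected answer (the label or the target value), which is what makes the problem supervised.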
Unsupervised algorithms
Clustering
Anomaly detection
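A small sketch of both unsupervised tasks, with no labels provided. The clustering part assumes scikit-learn's KMeans; the anomaly part uses a simple z-score rule, which is only one of many possible techniques and is not taken from the slides. All data is hypothetical.

```python
from statistics import mean, pstdev

from sklearn.cluster import KMeans

# Clustering: group unlabeled points (hypothetical 2-D data forming two blobs).
points = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
          [8.0, 8.2], [8.1, 7.9], [7.8, 8.0]]
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
cluster_ids = list(kmeans.labels_)

# Anomaly detection: flag values that sit far from the mean (z-score rule).
readings = [10, 11, 9, 10, 12, 50]
mu, sigma = mean(readings), pstdev(readings)
anomalies = [r for r in readings if abs(r - mu) / sigma > 2.0]

print("cluster ids:", cluster_ids)
print("anomalies:", anomalies)
```

The cluster ids themselves are arbitrary; what matters is which points end up grouped together.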
Machine learning
Let’s go!
Recipe
Collect training data: files, database, cache, data flow
Train algorithm: selection of model and (hyper)parameters
Make predictions: use or store your trained estimator
Measure: accuracy, precision
Collect training data
Get high-quality data
Get some samples
Don’t collect data for months before trying anything. Go fast and try things.
Weight (g) Width (cm) Height (cm) Label
192 8.4 7.3 Granny smith apple
86 6.2 4.7 Mandarin
178 7.1 7.8 Braeburn apple
162 7.4 7.2 Cripps pink apple
118 6.1 8.1 Unidentified lemons
144 6.8 7.4 Turkey orange
362 9.6 9.2 Spanish jumbo orange
… … … …
What about the data?
Fruit identification example
Prepare your data
Convert your features and labels to numbers
Put them on the same scale (normalization)?
Weight (g) Width (cm) Height (cm) Label
192 8.4 7.3 1
86 6.2 4.7 2
178 7.1 7.8 3
162 7.4 7.2 5
118 6.1 8.1 10
144 6.8 7.4 8
362 9.6 9.2 9
… … … …
We need to have some tests
Training set: learning phase (60% - 80%)
Test set: analytics phase (20% - 40%)
Prepare your data (code)
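The code from this slide did not survive extraction. A minimal sketch of the three preparation steps (numeric labels, normalization, train/test split) could look like the following. It uses only the standard library; the helper names, the min-max scaling choice, and the 70/30 ratio are assumptions, not the author's code.

```python
import random

# Fruit samples from the slide: (weight g, width cm, height cm, label name).
samples = [
    (192, 8.4, 7.3, "Granny smith apple"),
    (86, 6.2, 4.7, "Mandarin"),
    (178, 7.1, 7.8, "Braeburn apple"),
    (162, 7.4, 7.2, "Cripps pink apple"),
    (118, 6.1, 8.1, "Unidentified lemons"),
    (144, 6.8, 7.4, "Turkey orange"),
    (362, 9.6, 9.2, "Spanish jumbo orange"),
]

# 1. Convert labels to numbers: map each distinct name to an integer code.
label_names = sorted({row[3] for row in samples})
label_to_id = {name: i for i, name in enumerate(label_names)}
features = [list(row[:3]) for row in samples]
labels = [label_to_id[row[3]] for row in samples]


# 2. Put the features on the same scale (min-max normalization to [0, 1]).
def min_max_scale(rows):
    columns = list(zip(*rows))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(value - low) / (high - low)
         for value, low, high in zip(row, lows, highs)]
        for row in rows
    ]


# 3. Shuffle, then split into a training set and a test set.
def train_test_split(xs, ys, train_ratio=0.7, seed=42):
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    cut = int(len(indices) * train_ratio)
    train_idx, test_idx = indices[:cut], indices[cut:]
    return ([xs[i] for i in train_idx], [ys[i] for i in train_idx],
            [xs[i] for i in test_idx], [ys[i] for i in test_idx])


scaled = min_max_scale(features)
X_train, y_train, X_test, y_test = train_test_split(scaled, labels)
print(len(X_train), "training samples,", len(X_test), "test samples")
```

The integer codes are arbitrary identifiers and need not match the ones shown on the slide. In a real project the scaling bounds would usually be computed on the training set only, then reused for the test set.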
Train algorithm
Choose a classifier
Fit the decision tree
Weight (g) Width (cm) Height (cm) Label
192 8.4 7.3 1
86 6.2 4.7 2
178 7.1 7.8 3
162 7.4 7.2 5
118 6.1 8.1 10
144 6.8 7.4 8
362 9.6 9.2 9
… … … …
We need to choose an estimator
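A possible version of this step, assuming scikit-learn's DecisionTreeClassifier as the estimator (the slide mentions a decision tree but not the library). The training rows are the ones from the table above; the new fruit is a hypothetical sample.

```python
from sklearn.tree import DecisionTreeClassifier

# Training table from the slide: [weight g, width cm, height cm] -> label id.
X_train = [
    [192, 8.4, 7.3],
    [86, 6.2, 4.7],
    [178, 7.1, 7.8],
    [162, 7.4, 7.2],
    [118, 6.1, 8.1],
    [144, 6.8, 7.4],
    [362, 9.6, 9.2],
]
y_train = [1, 2, 3, 5, 10, 8, 9]

# Choose an estimator, then fit it on the training data.
classifier = DecisionTreeClassifier(random_state=0)
classifier.fit(X_train, y_train)

# The fitted estimator can now label a new fruit (hypothetical measurements).
new_fruit = [[90, 6.0, 4.9]]
predicted_label = classifier.predict(new_fruit)[0]
print("predicted label id:", predicted_label)
```

Swapping in another estimator mostly means changing the line that constructs `classifier`; the `fit` and `predict` calls stay the same.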
Make predictions
What do our predictions look like?
Test set
Weight (g) Width (cm) Height (cm) Label
192 8.4 7.3 1
86 6.2 4.7 2
178 7.1 7.8 3
Predictions
Weight (g) Width (cm) Height (cm) Label
192 8.4 7.3 1
86 6.2 4.7 2
178 7.1 7.8 1
Measure (1/2)
Evaluate on a dataset that your model has never seen during training
Accuracy
Correct predictions / total predictions
Gives a simple confidence score of our performance level
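Applied to the test set and predictions shown on the "Make predictions" slide, the accuracy formula can be sketched in a few lines of plain Python. The function name is an illustrative choice.

```python
def accuracy(y_true, y_pred):
    """Correct predictions divided by total predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for actual, predicted in zip(y_true, y_pred)
                  if actual == predicted)
    return correct / len(y_true)


# Labels of the test set and the model's predictions, as shown on the slide.
y_test = [1, 2, 3]
y_pred = [1, 2, 1]

score = accuracy(y_test, y_pred)
print(f"accuracy = {score:.2f}")
```

Two of the three fruits are labelled correctly, so the score is 2/3.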
Measure (2/2)
Try to visualize and analyze your data, and know what you want
Confusion matrix
                  Actual true      Actual false
Predicted true    True positive    False positive
Predicted false   False negative   True negative
Skewed classes
Precision = True positives / #predicted positives
Recall = True positives / #actual positives
F1 score (trade-off) = 2 * (precision * recall) / (precision + recall)
Measure and prediction (code)
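The code from this slide was lost in extraction. As a stand-in, here is a plain-Python sketch of the confusion-matrix metrics on a hypothetical skewed dataset (10 positives among 1000 samples). F1 is computed as the harmonic mean, 2 * precision * recall / (precision + recall); the function names and data are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (true positives, false positives, false negatives, true negatives)."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1, returning 0.0 when a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical skewed classes: 10 positives among 1000 samples.
y_true = [True] * 10 + [False] * 990
# A model that finds 6 of the 10 positives and raises 4 false alarms.
y_pred = [True] * 6 + [False] * 4 + [True] * 4 + [False] * 986

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"accuracy={accuracy:.3f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Accuracy looks excellent here because negatives dominate, while precision and recall show that the model misses a good share of the positives. That gap is why skewed classes call for more than accuracy.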
Machine learning
Troubleshoot
Troubleshoot (1/4)
Under/Overfitting situation
Troubleshoot (2/4)
Underfitting
Add / create more features
Use a more sophisticated model
Adding more samples alone will not help
Decrease regularization
Overfitting
Use fewer features
Use a simpler model
Use more samples
Increase regularization
What are the different options ?
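To make the two situations concrete, the sketch below fits polynomials of increasing degree to noisy data and compares training and validation error. It assumes NumPy; the sine-wave data, the chosen degrees, and the helper name are hypothetical and not from the slides.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

# Hypothetical data: a noisy sine wave, split into training and validation sets.
x = np.sort(rng.uniform(0.0, 1.0, 60))
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, x.size)
x_train, y_train = x[::2], y[::2]
x_val, y_val = x[1::2], y[1::2]


def mean_squared_error(model, xs, ys):
    """Average squared gap between the model's predictions and the targets."""
    return float(np.mean((model(xs) - ys) ** 2))


# Model complexity is the polynomial degree: low degrees tend to underfit,
# high degrees tend to overfit.
errors = {}
for degree in (1, 3, 9):
    model = Polynomial.fit(x_train, y_train, degree)
    errors[degree] = (
        mean_squared_error(model, x_train, y_train),
        mean_squared_error(model, x_val, y_val),
    )

for degree, (train_error, val_error) in errors.items():
    print(f"degree={degree}: train MSE={train_error:.3f}, "
          f"validation MSE={val_error:.3f}")
```

Typically the degree-1 model shows a high error on both sets (underfitting), while a high-degree model drives the training error down without a matching gain on the validation set (overfitting). The exact numbers depend on the random data.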
Troubleshoot (3/4)
Underfitting Overfitting
Using the learning curves…
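The plots from this slide are not available, but the numbers behind a learning curve can be sketched as follows: train on growing subsets and record training and validation error each time. This assumes NumPy and a polynomial model on hypothetical data; the function name and subset sizes are illustrative.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)

# Hypothetical data: 80 training points and 40 validation points.
x = rng.uniform(0.0, 1.0, 120)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, x.size)
x_train, y_train = x[:80], y[:80]
x_val, y_val = x[80:], y[80:]


def learning_curve(degree, sizes):
    """Return (n, training MSE, validation MSE) for each training size n."""
    rows = []
    for n in sizes:
        model = Polynomial.fit(x_train[:n], y_train[:n], degree)
        train_error = float(np.mean((model(x_train[:n]) - y_train[:n]) ** 2))
        val_error = float(np.mean((model(x_val) - y_val) ** 2))
        rows.append((n, train_error, val_error))
    return rows


curve = learning_curve(degree=3, sizes=[10, 20, 40, 80])
for n, train_error, val_error in curve:
    print(f"n={n:3d}: train MSE={train_error:.3f}, validation MSE={val_error:.3f}")
```

As a rule of thumb, two curves that converge to a high error suggest underfitting, while a persistent gap between a low training error and a higher validation error suggests overfitting.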
Troubleshoot: Model choice (4/4)
Machine learning
Tech insights
Platforms: easy-peasy
You don’t even have to code to build something (*wink wink* business developers)
Built-in models
Data munging
Model management by UI
PaaS
Very high-level solutions
Languages
For understanding and prototyping implementations
Most valuable languages: comfortable for prototyping, yet powerful for industrialisation
For bigger companies and projects, and fine-tuned software
Matlab, Octave, Go, Python, Java, C++
Which language for which purpose?
Libraries
Built-in models
Data munging
Fine-tuning
Full integration to your product
You will have great power using a library
Golearn
Machine learning
Next steps…
Next steps
Split your data into 3: training / cross-validation / test set
Know the top algorithms
Explore advanced techniques and optimizers (online learning, stacking)
Deep and reinforcement learning
Partial and semi-supervised learning
Transfer learning
How to store and analyse big data? How do we scale?
Try it! Find your best tools and have some fun
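For the first of these next steps, a three-way split can be sketched with the standard library alone. The 60/20/20 proportions, the function name, and the fixed seed are assumptions chosen for illustration.

```python
import random


def three_way_split(items, train=0.6, validation=0.2, seed=0):
    """Shuffle items, then cut them into training, validation and test sets."""
    if train <= 0 or validation <= 0 or train + validation >= 1:
        raise ValueError("train and validation must be positive and sum to < 1")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_validation = round(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_validation],
        shuffled[n_train + n_validation:],
    )


# Hypothetical dataset: 100 sample ids.
train_set, validation_set, test_set = three_way_split(range(100))
print(len(train_set), len(validation_set), len(test_set))
```

The cross-validation set is used to compare models and tune hyperparameters, so the test set stays untouched until the final measurement.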
Conclusion
Try it and let’s get in touch!
Machine learning is not just a buzzword
Difficulties are not always what we think!
Machine learning is more about experiments and tests than just algorithms
There is no single perfect solution
There are plenty of easy-to-use solutions for beginners
Machine learning
One more thing!
TensorFlow
TensorFlow Learn
Thank you
Questions ?