Machine Learning Overview
In this lesson, we'll take a look at some of the uses of Machine Learning (ML), some ML algorithms that can be used to provide predictions, and some ML platforms we'll be using for the rest of this course.
Machine Learning Uses
There are a variety of different uses for machine learning. We'll introduce a few of them here.
It's one thing to classify text and images, but what about creating something entirely new? The MuseNet project uses machine learning to try to generate new songs based on composition styles and a series of notes.
Below is an example of one such piece generated from combining the style of rock musician Bon Jovi and the notes of famous pianist Chopin.
Text Adventure Games
Writing all of the text for a game can be a lot of work. AI Dungeon uses a machine learning text generation algorithm to create a text adventure game for you based on your input.
This model, named GPT-2, was designed with one simple task in mind: predict the next word in a text. GPT-2 is accurate enough to not only predict the next word, but to take an entire prompt and write an article based on it. The model itself is not available for public use, but you can see some of the stories and articles it's written below.
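To make the idea of "predict the next word" concrete, here is a minimal sketch that simply counts which word most often follows another in a sample text. This is not how GPT-2 works internally (GPT-2 is a large neural network); the counting approach, the function names `train_bigrams` and `predict_next`, and the sample sentence are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it in the text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the word seen most often after `word`, or None if unseen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# Hypothetical training text, far smaller than anything a real model uses.
model = train_bigrams("the cat sat on the mat and the cat slept")
print(predict_next(model, "the"))  # "cat" follows "the" twice, "mat" once
```

A model like this only ever looks one word back, which is why its output quickly stops making sense; larger models consider much more of the preceding text.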
Does this face look normal to you?
Perhaps, but what's going on on the left side? It's a mistake. This entire image was generated by an ML algorithm available at thispersondoesnotexist.com, which generates a new face every time the page loads.
Google QUICK, DRAW!
This example uses image recognition to detect strokes and interpret them as handwriting, or in a more refined sense, drawings.
Not only can an algorithm read a drawing, it can also draw with you! SKETCH-RNN looks at what strokes the user makes and attempts to continue the drawing itself.
Using a Generative Adversarial Network (GAN), researchers at NVIDIA were able to create a model that recognizes faces on animals and then generates different animals, such as a bear or lynx, making that same face!
GauGAN, named after post-Impressionist painter Paul Gauguin, creates photorealistic images from segmentation maps, which are labeled sketches that depict the layout of a scene.
This model, which is written in TensorFlow, takes edges and generates images based on them, such as cats, buildings, and shoes.
This game uses neural networks to simulate interactions between players as they learn to trust and distrust each other.
NPR featured an episode studying Facebook's translation software. After building dictionary maps of English and French, they compared the maps to see how they would line up. You can read or listen to what they found here.
Before we dive into implementing machine learning, it's important to know some of the common algorithms used in machine learning.
Regression is a simple way to create a prediction. Based on a number of points, you create a line that shows the trend of those points. Then, if you know any x value, you can estimate what the y value will be.
Linear Regression in the public domain
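As a rough illustration of fitting a trend line, the sketch below uses the ordinary least squares formulas for a single x variable. The function name `fit_line` and the sample points are assumptions made up for this example.

```python
def fit_line(points):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # How x and y vary together, divided by how much x varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points that lie exactly on y = 2x + 1.
points = [(0, 1), (1, 3), (2, 5), (3, 7)]
slope, intercept = fit_line(points)

# Estimate y for an x value that was not in the data.
estimate = slope * 4 + intercept
print(slope, intercept, estimate)  # 2.0 1.0 9.0
```

Real data rarely falls exactly on a line, so the fitted line is usually the one that comes closest to all of the points at once.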
Decision trees split up a complicated decision into smaller decisions with multiple groups.
Decision Tree Model in the public domain
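One way to picture a decision tree in code is as nested questions that end in an answer. The tree below is hand-built rather than learned from data, and the "play outside" scenario, the `TREE` structure, and the `decide` function are all hypothetical names used only for this sketch.

```python
# Internal nodes ask one small question; leaves hold the final answer.
TREE = {
    "question": lambda day: day["outlook"] == "rainy",
    "yes": {"answer": "stay in"},
    "no": {
        "question": lambda day: day["humidity"] > 70,
        "yes": {"answer": "stay in"},
        "no": {"answer": "play"},
    },
}

def decide(node, day):
    """Walk from the root to a leaf, answering one question per level."""
    while "answer" not in node:
        node = node["yes"] if node["question"](day) else node["no"]
    return node["answer"]

print(decide(TREE, {"outlook": "sunny", "humidity": 50}))  # play
print(decide(TREE, {"outlook": "sunny", "humidity": 90}))  # stay in
```

A machine learning library would choose the questions and thresholds automatically from training data instead of having them written by hand.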
Support Vector Machines
Support Vector Machines classify data by trying to split it into groups. They look for the dividing line that maximizes the distance between those groups, and then use that line to perform the classification.
Support Vector Machine, by Qluong2016, licensed under CC BY-SA 4.0
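The idea of maximizing the distance between groups is easiest to see in one dimension, where the best boundary sits halfway between the two closest points of opposite groups. The sketch below covers only that simplified case; a real Support Vector Machine solves an optimization problem over many dimensions. The names `max_margin_threshold` and `classify` and the sample values are assumptions for illustration.

```python
def max_margin_threshold(low_group, high_group):
    """Return (boundary, margin) for two 1-D groups that do not overlap."""
    closest_low = max(low_group)    # point of the lower group nearest the gap
    closest_high = min(high_group)  # point of the upper group nearest the gap
    if closest_low >= closest_high:
        raise ValueError("groups overlap; no clean boundary exists")
    boundary = (closest_low + closest_high) / 2
    margin = (closest_high - closest_low) / 2
    return boundary, margin

def classify(x, boundary):
    return "high" if x > boundary else "low"

# Hypothetical measurements for two groups.
boundary, margin = max_margin_threshold([1.0, 2.0, 3.0], [7.0, 8.0, 9.5])
print(boundary, margin)         # 5.0 2.0
print(classify(4.0, boundary))  # low
```

Only the two points nearest the gap (3.0 and 7.0 here) determine the boundary; these are the "support vectors" that give the method its name.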
This algorithm considers each feature of an item separately, and determines the impact each feature has on the probability of a classification.
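The description above fits the naive Bayes family of classifiers. Assuming that is the algorithm meant here, the sketch below scores each label by multiplying a separate probability for every word in a message (done with logarithms so the numbers stay manageable). The spam/ham data, the function names, and the add-one smoothing are assumptions chosen for this example.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count how often each label and each (label, word) pair appears."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for label, words in examples:
        label_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return label_counts, word_counts, vocabulary

def classify(words, model):
    """Pick the label with the highest combined log-probability."""
    label_counts, word_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total_examples)  # prior for this label
        total_words = sum(word_counts[label].values())
        for word in words:
            # Each word contributes on its own; add-one smoothing avoids zeros.
            p = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training messages.
model = train([
    ("spam", ["win", "money", "now"]),
    ("spam", ["win", "prize"]),
    ("ham", ["meeting", "now"]),
    ("ham", ["lunch", "meeting", "tomorrow"]),
])
print(classify(["win", "money"], model))      # spam
print(classify(["meeting", "lunch"], model))  # ham
```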
kNN (K Nearest Neighbors)
This algorithm classifies a new result by looking at the labels of the nearest results that are already classified. You pick k, the number of closest neighbors it considers before making a decision. In the image below, picking a k of 3 would put the green circle in the red group, while picking a k of 5 would put it in the blue group.
Image, by Antti Ajanki, licensed under CC BY-SA 3.0
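A minimal sketch of the voting step might look like the following. The labeled points are hypothetical and were arranged so that the answer changes with k; the function name `knn_classify` is also an assumption. It relies on `math.dist`, which is available in Python 3.8 and later.

```python
import math
from collections import Counter

def knn_classify(query, labeled_points, k):
    """Label `query` by majority vote among its k nearest labeled points."""
    nearest = sorted(labeled_points, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled points, listed from nearest to farthest from (0, 0).
labeled_points = [
    ((1.0, 0.0), "red"),
    ((0.0, 1.5), "blue"),
    ((-2.0, 0.0), "red"),
    ((0.0, -2.5), "blue"),
    ((3.0, 0.0), "blue"),
]

print(knn_classify((0.0, 0.0), labeled_points, k=3))  # red  (2 red vs 1 blue)
print(knn_classify((0.0, 0.0), labeled_points, k=5))  # blue (3 blue vs 2 red)
```

Odd values of k are a common choice when there are two groups, because they rule out tied votes.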
With the k-means algorithm, you pick a certain number of clusters, and the algorithm tries to figure out how to split the data into those clusters over a number of iterations.
K-means Convergence, by Chire, licensed under CC BY-SA 4.0
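The iterations can be sketched as a loop that alternates two steps: assign every point to its nearest cluster center, then move each center to the average of its points. In this sketch the starting centers are simply the first k points, which is an assumption made to keep the example repeatable (libraries usually pick starting centers more carefully). The function name `k_means` and the sample points are also hypothetical.

```python
import math

def k_means(points, k, max_iterations=100):
    """Group points into k clusters; returns (centers, clusters)."""
    centers = list(points[:k])  # assumption: seed with the first k points
    for _ in range(max_iterations):
        # Step 1: assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centers[i]))
            clusters[nearest].append(point)
        # Step 2: move each center to the mean of its assigned points.
        new_centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # nothing moved, so the result is stable
            break
        centers = new_centers
    return centers, clusters

# Hypothetical points forming two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centers, clusters = k_means(points, k=2)
print(centers)
print(clusters)
```

With this data the loop settles after a few passes, with one center near the lower-left group and one near the upper-right group.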
The random forest algorithm uses multiple decision trees. Rather than putting everything into one decision tree, it trains many trees, each on a different random sample of the training data. Then, to come up with a final classification, every tree makes its own prediction and the most common prediction is used.
Image, by Venkata Jagannath, licensed under CC BY-SA 4.0
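Below is a minimal sketch of the multiple-trees idea. Each "tree" here is only a single yes/no question (often called a stump) trained on a random resample of the data, and the forest combines the trees by majority vote, which is the usual way random forests produce a classification. The function names, the sample data, and the fixed random seed are assumptions for illustration.

```python
import random
from collections import Counter

def train_stump(rows, rng):
    """Fit a one-question tree on one randomly chosen feature."""
    feature = rng.randrange(len(rows[0][0]))
    majority = Counter(label for _, label in rows).most_common(1)[0][0]
    best = None  # (errors, threshold, left_label, right_label)
    values = sorted({features[feature] for features, _ in rows})
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = Counter(label for f, label in rows if f[feature] <= threshold)
        right = Counter(label for f, label in rows if f[feature] > threshold)
        left_label = left.most_common(1)[0][0]
        right_label = right.most_common(1)[0][0]
        errors = len(rows) - left[left_label] - right[right_label]
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:  # every sampled row had the same value for this feature
        return lambda features: majority
    _, threshold, left_label, right_label = best
    return lambda features: left_label if features[feature] <= threshold else right_label

def train_forest(rows, n_trees=25, seed=0):
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # resample with replacement
        forest.append(train_stump(sample, rng))
    return forest

def predict(forest, features):
    """Let every tree vote and return the most common answer."""
    votes = Counter(tree(features) for tree in forest)
    return votes.most_common(1)[0][0]

# Hypothetical rows of (features, label).
rows = [
    ((1, 1), "small"), ((2, 1), "small"), ((1, 2), "small"),
    ((2, 2), "small"), ((3, 2), "small"),
    ((7, 8), "large"), ((8, 7), "large"), ((8, 8), "large"),
    ((9, 8), "large"), ((8, 9), "large"),
]
forest = train_forest(rows)
print(predict(forest, (0, 0)))    # small
print(predict(forest, (10, 10)))  # large
```

Because each tree sees slightly different data, a few trees may be wrong on a given input, but the vote tends to cancel out those individual mistakes.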
There are a number of platforms with toolkits specifically for data science and machine learning:
The scikit-learn Python module has many algorithms already implemented, so you don't need to know a lot of math before trying out machine learning algorithms on a problem. The downside is that it is not built for deep learning, so it is best suited to simpler problems.
Scikit Logo, by the scikit developers, licensed under
The TensorFlow machine learning toolkit was developed by Google. It supports deep learning by building models out of sequences of layers.
TensorFlow Logo, by Begoon, licensed under
The PyTorch machine learning toolkit was developed by Facebook. It handles inputs differently from TensorFlow, which allows it to be more dynamic, but it is a newer framework and doesn't have as much support.
PyTorch logo, by Soumith Chintala, licensed under CC BY-SA 4.0