Machine Learning (ML)

Machine Learning is the practice of using algorithms to analyze data, learn from that data, and then make a determination or prediction about new data. This notebook will introduce Deep Learning, which is a subset of machine learning that attempts to build algorithms called neural networks, which model the structure of the human brain.

Machine Learning vs. Traditional Programming

You may be wondering what the difference is between ML and standard programming. For example, if we are building an AI assistant such as Siri, why can't we just use if-statements and other logic to answer questions?

If we think about it a bit more, we realize that there are so many different questions you can ask Siri that it would be impossible to program an answer to every one of them. Traditional programming would not work here, so a new approach is needed, and Machine Learning can be that approach.

Example: Analyzing the sentiment of a popular media outlet and classifying that sentiment as positive or negative

This figure explains the relationship between AI, ML, and Deep Learning. As we can see, AI is the overarching concept, ML is a large part of AI, and Deep Learning is in turn a subset of ML.

Deep Learning (DL)

This lesson will cover the most basic idea of what deep learning is and how it’s used. In later lessons, we’ll cover more detailed concepts, terms, and tools within the field of deep learning.

Definition:

Supervised Learning occurs when your deep learning model learns and makes inferences from data that has already been labeled. A label is basically an explanation of what the data is. If we are trying to classify pictures of cats and dogs, each picture of a dog would carry the label "dog" and each picture of a cat the label "cat."

Unsupervised Learning occurs when the model learns and makes inferences from unlabeled data. This means we only have data points, with no labels for them, which makes learning from this data much more challenging.

Semi-Supervised Learning occurs when the model has data points with labels and data points without labels. This is a mix of supervised and unsupervised learning because we need supervised methods to use the labeled data and unsupervised methods to use the unlabeled data.

Labeled vs. Unlabeled example
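
As a rough illustration of the difference, here is a minimal sketch in plain Python. The measurements, the variable names, and the use of None to mark a missing label are all made up for this example and are not taken from any particular dataset or library.

```python
# Hypothetical toy data: each sample is (height_cm, weight_kg).

# Labeled data: every sample comes with a label telling us what it is.
labeled = [
    ((25.0, 4.0), "cat"),
    ((60.0, 30.0), "dog"),
    ((23.0, 3.5), "cat"),
]

# Unlabeled data: the same kind of samples, but with no labels attached.
unlabeled = [(58.0, 28.0), (24.0, 4.2)]

# Semi-supervised data mixes both; here None marks a missing label.
semi_supervised = labeled + [(sample, None) for sample in unlabeled]

num_labeled = sum(1 for _, label in semi_supervised if label is not None)
num_unlabeled = sum(1 for _, label in semi_supervised if label is None)
print(num_labeled, "labeled samples,", num_unlabeled, "unlabeled samples")
```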

Artificial Neural Networks

Artificial neural networks are deep learning models that are based on the structure of the brain's neural networks.

Artificial Neural Networks (ANNs)

Artificial neural networks are computing systems that are inspired by the brain’s neural networks.

Keras Sequential Model

The Sequential model is a linear stack of layers. The following network we coded has 3 layers, the first with 10 nodes, the second with 32, and the last with 2.

Layers in an ANN

Artificial neural networks are typically organized in layers. There are several different types of layers, which you will learn about later.

There are different types of layers because each node and each layer performs a different type of transformation on its inputs. Some are better suited to particular tasks, such as convolutional layers for images or recurrent layers for time series data.

Each node in the input layer represents an individual feature of each sample within our dataset. Each input is connected to each node in the hidden layer, where the input is transformed into an output.

Each connection has a weight. This is just a number that represents the strength of the connection between the two nodes. Each input is multiplied by its weight, the weighted sum of all the inputs to a neuron is calculated, and that sum is passed to an activation function that transforms the result, for example into a number between 0 and 1. This happens on a per-neuron basis.
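
The computation done by a single neuron can be sketched in a few lines of plain Python. This is only an illustration: the input and weight values are made up, sigmoid is chosen as the activation because it squashes its input into the range 0 to 1, and the bias term that real networks usually add is left out because it has not been introduced here.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights):
    """Multiply each input by its weight, sum the results, then activate."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(weighted_sum)

# Hypothetical values: three inputs arriving over three weighted connections.
inputs = [0.5, 0.1, 0.9]
weights = [0.4, 0.7, 0.2]

output = neuron_output(inputs, weights)
print(output)
```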

That result is then passed on to the neurons in the next layer, and this occurs over and over until it reaches the output layer.

The model learns how to adjust these weights so that it is optimized and can make the best predictions.

The last layer contains as many nodes as there are classes. For example, if the model were predicting which digit 0-9 a drawing shows, there would be 10 output nodes, and if it were predicting whether a picture is a cat or a dog, there would be two.
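
Putting the pieces together, a full pass from input to output might look like the sketch below. It is written in plain Python rather than Keras, and everything specific in it is an assumption made for illustration: the layer sizes (4 inputs, 3 hidden nodes, 2 output classes), the random starting weights, the sigmoid activation, and the helper names make_layer and forward.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_layer(num_inputs, num_nodes, rng):
    """Return one list of weights per node, filled with random starting values."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(num_inputs)]
            for _ in range(num_nodes)]

def forward(sample, layers):
    """Pass a sample through each layer in turn and return the final outputs."""
    activations = sample
    for layer in layers:
        # Every node takes the weighted sum of the previous layer's outputs.
        activations = [
            sigmoid(sum(x * w for x, w in zip(activations, node_weights)))
            for node_weights in layer
        ]
    return activations

rng = random.Random(0)   # fixed seed so the run is repeatable
layer_sizes = [4, 3, 2]  # hypothetical: 4 features, 3 hidden nodes, 2 classes
layers = [make_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

sample = [0.2, 0.7, 0.1, 0.9]
outputs = forward(sample, layers)

# The predicted class is the output node with the largest value.
predicted_class = outputs.index(max(outputs))
print(outputs, predicted_class)
```

Because the weights here are random and untrained, the prediction is meaningless. Training is what adjusts them into useful values.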

Activation Functions

In an artificial neural network, the activation function of a neuron defines the output of that neuron given a set of inputs.
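
Two widely used activation functions are sketched below in plain Python for comparison. Sigmoid maps any input to a value between 0 and 1, while ReLU passes positive inputs through unchanged and turns negative inputs into 0. ReLU is not discussed elsewhere in this lesson and is included only as a second example.

```python
import math

def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    """Return z for positive inputs and 0 for everything else."""
    return max(0.0, z)

# Compare the two functions on a negative, a zero, and a positive input.
for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), relu(z))
```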

Pre-process data for training

Data isn't always going to be given to you in a proper format. You will have to process the data and standardize it, which means making all the samples the same size and format, along with various other tasks. If this is challenging, do not worry, because it isn't technically machine learning or AI. It is rather part of data science, which goes hand in hand with AI/ML. It can be tedious, but you will learn more syntax and strategies as you practice.
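
One common preprocessing step is rescaling numeric values so that they all fall in the same range. The sketch below does this in plain Python, assuming a hypothetical list of ages. The function name min_max_scale and the sample values are made up for illustration, and in practice a library routine would usually do this job.

```python
def min_max_scale(values):
    """Rescale a list of numbers so the smallest becomes 0 and the largest 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no range to scale over.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical raw feature: ages of five people.
ages = [13, 65, 40, 100, 22]
scaled = min_max_scale(ages)
print(scaled)
```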

Example data:

Training

Solving an optimization problem

Learning

Learning Rate

The learning rate is a small number that scales the gradient so that each update makes a small, incremental adjustment. It is basically the factor by which we change our weights in order to optimize them.

d(loss)/d(weight) * lr, with lr = 0.001

To reduce your error, you have to take steps. The learning rate basically sets the size of those steps.

y = d(loss)/d(weight) * lr. Then you replace the current weight with weight - y.

You have to test and tune to find the best learning rate, but the guideline is to set it between 0.01 and 0.0001. Setting it too high can make the update overshoot and pass the minimum of the loss function. With a lower learning rate, however, it will take a long time to decrease the loss.
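
The update rule and the effect of the learning rate can be tried out on a toy problem. The sketch below assumes a made-up loss function, (weight - 3)^2, whose minimum sits at weight = 3. The function names and the three learning rates are chosen only for illustration.

```python
def loss(w):
    """Hypothetical loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def gradient(w):
    """d(loss)/d(weight) for the loss above."""
    return 2.0 * (w - 3.0)

def train(weight, lr, steps):
    """Repeat the update: y = gradient * lr, then weight = weight - y."""
    for _ in range(steps):
        y = gradient(weight) * lr
        weight = weight - y
    return weight

tiny_lr = train(weight=0.0, lr=0.0001, steps=100)  # barely moves
small_lr = train(weight=0.0, lr=0.01, steps=100)   # steadily approaches 3
large_lr = train(weight=0.0, lr=1.1, steps=100)    # overshoots further each step

print(tiny_lr, small_lr, large_lr)
```

With the same number of steps, the tiny rate leaves the weight close to where it started, the moderate rate gets it near the minimum, and the large rate jumps past the minimum by a bigger margin each time.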

Loss

An overall loss is computed from the difference between the prediction the model gave and the actual label value.

error = output - true = 0.25 - 0 = 0.25

One example is Mean Squared Error (MSE): square the error for each piece of data, then average the squared errors, and that average is the loss. There are lots of different loss functions with different equations and calculations, but they don't really matter right now.
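
Mean Squared Error is short enough to write out by hand. The sketch below is plain Python. The first prediction and label match the error example above (0.25 and 0), and the other two pairs are made up.

```python
def mean_squared_error(outputs, labels):
    """Square each error (output - true), then average the squares."""
    errors = [output - true for output, true in zip(outputs, labels)]
    return sum(e ** 2 for e in errors) / len(errors)

# Hypothetical model outputs and their true labels.
outputs = [0.25, 0.9, 0.6]
labels = [0, 1, 1]

mse = mean_squared_error(outputs, labels)
print(mse)
```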

Data Sets

The train, validation, and test sets are all parts of the entire dataset.

A training set is used for the model to learn from. You give it a piece of data, say an image, and it will predict an output, compare that output with the label, and then learn how it needs to adjust.

A validation set is used after each pass over the entire training data. Once all the training data has been given to the model, it makes predictions, based on what it has learned, on data it didn't see during training. This gives a good gauge of the accuracy of the model on data it hasn't seen. The weights of the model are not updated based on the validation set.

The test set is unlabeled, and it is used to determine how your model is doing after it is completely trained.
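
A simple way to carve one dataset into the three sets is to shuffle it and slice it. The sketch below is plain Python. The 70/15/15 proportions, the function name split_dataset, and the stand-in data are assumptions for illustration, not a rule.

```python
import random

def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle the samples, then slice them into train, validation, and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever is left over
    return train, val, test

# Hypothetical dataset: 100 samples, represented here by their index numbers.
data = list(range(100))
train_set, val_set, test_set = split_dataset(data)
print(len(train_set), len(val_set), len(test_set))
```

Shuffling before slicing matters when the original data is ordered, for example all cats first and all dogs second, because otherwise one set could end up with only one class.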

This is an array of data; the values have been converted into decimals.

This is a binary classification problem. 1 means a positive label and 0 means a negative label.

Predicting

Batch Size

Supervised Learning