Before we apply machine learning to cyber security, let’s start with the basics of machine learning, from basic definitions to advanced topics.
In this post, we will talk about what machine learning is and what machine learning algorithms are.
The content of this series follows Andrew Ng’s machine learning course on Coursera. The course is free for everyone to attend.
What is machine learning?
Two definitions of machine learning are offered. Arthur Samuel described it as: “the field of study that gives computers the ability to learn without being explicitly programmed.”
Another definition comes from Tom Mitchell: “A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.”
Example: playing checkers.
E = the experience of playing many games of checkers
T = the task of playing checkers
P = the probability that the program will win the next game
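To make E, T and P a bit more concrete, here is a minimal sketch in Python. It is a toy problem that is not taken from the course: the task T is labelling numbers from 0 to 99 as "small" or "large", the experience E is a growing list of labelled examples, and the performance P is the accuracy over all 100 numbers. The hidden threshold, the function names and the midpoint rule are all assumptions made up for this illustration.

```python
TRUE_THRESHOLD = 50  # hypothetical hidden rule; the learner only sees labelled examples


def true_label(x):
    """Ground truth for task T: numbers >= 50 are 'large', the rest 'small'."""
    return "large" if x >= TRUE_THRESHOLD else "small"


def learn_threshold(examples):
    """Learner: guess the midpoint between the largest 'small' and smallest 'large' seen."""
    smalls = [x for x, label in examples if label == "small"]
    larges = [x for x, label in examples if label == "large"]
    return (max(smalls) + min(larges)) / 2


def accuracy(threshold):
    """Performance measure P: fraction of the numbers 0..99 labelled correctly."""
    correct = sum(
        ("large" if x >= threshold else "small") == true_label(x)
        for x in range(100)
    )
    return correct / 100


# Experience E: labelled examples arriving one after another.
experience = [(x, true_label(x)) for x in (0, 60, 20, 40)]

# Measure P after the learner has seen 2, 3 and 4 examples.
scores = [accuracy(learn_threshold(experience[:n])) for n in range(2, len(experience) + 1)]
print(scores)  # [0.8, 0.9, 1.0]
```

With this particular sequence of examples, P goes up as E grows, which is exactly what Mitchell's definition asks for. A different sequence could improve more slowly or stall for a while.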
In general, any machine learning problem can be assigned to one of two broad classifications:
- Supervised learning
- Unsupervised learning
Others: reinforcement learning, recommender systems
So, let’s talk about them.
In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.
Supervised learning problems are categorized into regression and classification problems.
- Regression: we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function.
E.g.: Given a picture of a person, we have to predict their age on the basis of the given picture.
- Classification: we are instead trying to predict results in a discrete output, meaning that we are trying to map input variables into discrete categories.
E.g.: Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.
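The difference between the two kinds of supervised problems can be sketched in a few lines of Python. This is only an illustration under assumptions: the data points are made up, the regression part uses ordinary least squares for a straight line, and the classification part uses a simple nearest-class-mean rule on a single hypothetical feature (tumor size). Neither method is prescribed by the course text above.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_class_means(samples):
    """Classification: compute the mean feature value of each labelled class."""
    sums, counts = {}, {}
    for value, label in samples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def classify(class_means, value):
    """Predict the label whose class mean is closest to the given value."""
    return min(class_means, key=lambda label: abs(value - class_means[label]))


# Regression: continuous output (made-up inputs and targets).
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope * 5 + intercept)  # 11.0

# Classification: discrete output (made-up tumor sizes with known labels).
training = [
    (1.0, "benign"), (1.5, "benign"), (2.0, "benign"),
    (4.0, "malignant"), (4.5, "malignant"), (5.0, "malignant"),
]
means = fit_class_means(training)
print(classify(means, 1.8))  # benign
print(classify(means, 4.2))  # malignant
```

In both cases the correct outputs are known for the training data. The only difference is that the regression model returns a number, while the classifier returns one of a fixed set of labels.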
Unsupervised learning allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don’t necessarily know the effect of the variables.
We can derive this structure by clustering the data based on relationships among the variables in the data. With unsupervised learning there is no feedback based on the prediction results.
* Clustering: Take a collection of 1,000,000 different genes, and find a way to automatically group these genes into groups that are somehow similar or related by different variables, such as lifespan, location, roles and so on.
* Non-clustering: The “Cocktail Party Algorithm” allows you to find structure in a chaotic environment, like identifying individual voices and music from a mesh of sounds at a cocktail party.
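As a rough sketch of what clustering looks like in code, here is a tiny one-dimensional k-means in Python. The choice of k-means is an assumption, since the text above does not name a specific clustering algorithm. The data values and the naive initialization (the first k points become the starting centers) are also made up for the example. Notice that no labels are given to the algorithm.

```python
def kmeans_1d(points, k, max_iters=100):
    """Group 1-D points into k clusters; returns (labels, centroids)."""
    centroids = list(points[:k])  # naive, deterministic start: the first k points
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: abs(p - centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we are done
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = sum(members) / len(members)
    return labels, centroids


# Unlabelled measurements (hypothetical values) that form two visible groups.
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
labels, centroids = kmeans_1d(data, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
print(centroids)
```

The algorithm finds the two groups only from how close the values are to each other. Real clustering problems, like the gene example, use many variables per item and need a more careful initialization.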
In reinforcement learning, a computer program interacts with a dynamic environment in which it must achieve a certain goal (such as driving a vehicle or playing a game against an opponent). The program is provided feedback in terms of rewards and punishments as it navigates its problem space.
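The reward-and-punishment loop described above can be sketched with tabular Q-learning, one common reinforcement learning method (the text does not name a specific algorithm, so this choice is an assumption). The environment is a made-up corridor of five cells: the program starts in cell 0, gets a reward of +1 for reaching cell 4, and a small punishment of -0.01 for every other move. All names and parameter values here are hypothetical.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, 1)   # move left or move right
ACTION_NAMES = {-1: "left", 1: "right"}


def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True    # reward for reaching the goal
    return next_state, -0.01, False     # small punishment for any other move


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values Q(state, action) by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap on the length of one episode
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state, reward, done = step(state, action)
            future = 0.0 if done else gamma * max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move the estimate toward reward + discounted future value.
            q[(state, action)] += alpha * (reward + future - q[(state, action)])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action in each non-goal cell."""
    return [
        ACTION_NAMES[max(ACTIONS, key=lambda a: q[(s, a)])]
        for s in range(N_STATES - 1)
    ]


print(greedy_policy(train()))
```

Nobody tells the program that "right" is the correct move. It only sees rewards and punishments, and after enough episodes the learned values should favor moving right in every cell.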
So, in this article we talked about the definition of machine learning. In the next one, we are going to talk about the model and cost function. Hope you enjoy it!