My Machine Learning Notes


There are two broad categories of learning:

Supervised and Unsupervised.

In supervised learning, extracting features is the first step. A labeled training data set is then used to test a hypothesis and fit a model. The model is then applied to a larger data set to make predictions. The predicted and actual results are compared, for example using mean squared error, and the model is refined.
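To make the workflow above concrete, here is a minimal sketch in plain Python. The data values and the helper names (fit_line, predict, mean_squared_error) are made up for illustration, and a single-feature least-squares line stands in for whatever model you would actually use.

```python
# Toy supervised-learning workflow: fit on labeled training data,
# then compare predicted and actual values on held-out data.
# All data values below are hypothetical.

def fit_line(xs, ys):
    """Ordinary least squares for one feature: y ~ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, xs):
    """Apply the fitted line to new feature values."""
    slope, intercept = model
    return [slope * x + intercept for x in xs]

def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Labeled examples, roughly following y = 2x + 1 with some noise.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [3.1, 4.9, 7.2, 9.0, 10.8]
test_x = [6.0, 7.0]
test_y = [13.1, 14.9]

model = fit_line(train_x, train_y)
test_mse = mean_squared_error(test_y, predict(model, test_x))
print("slope, intercept:", model)
print("test MSE:", test_mse)
```

If the test error is too high, that is the signal to go back and refine the features or the model.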

Supervised learning typically falls into the segments below:

Classification and Regression.

"....from the O'Reilly book Learning Spark, the authors share the following...
Classification and regression are two common forms of supervised learning, where algorithms attempt to predict a variable from features of objects using labeled training data (i.e., examples where we know the answer). The difference between them is the type of variable predicted: in classification, the variable is discrete (i.e., it takes on a finite set of values called classes); for example, classes might be spam or nonspam for emails, or the language in which the text is written. In regression, the variable predicted is continuous (e.g., the height of a person given her age and weight).
..........................

Linear regression is one of the most common methods for regression, predicting the output variable as a linear combination of the features. 

Logistic regression is a binary classification method that identifies a linear separating plane between positive and negative examples.
.........."

Identifying spam is supervised learning.
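As a rough illustration of spam detection as binary classification, the sketch below trains a tiny logistic regression with batch gradient descent in plain Python. The two features (count of suspicious words, count of links), the data, the learning rate and the epoch count are all assumptions chosen for the example, not values from the book.

```python
import math

def sigmoid(z):
    """Map a real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(features, labels, lr=0.1, epochs=2000):
    """Batch gradient descent on the logistic (cross-entropy) loss."""
    n_features = len(features[0])
    n = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y  # gradient of the loss with respect to the score
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

def classify(weights, bias, x):
    """Return 1 (spam) if the predicted probability is at least 0.5, else 0."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0

# Hypothetical emails: [suspicious word count, link count]; label 1 = spam.
emails = [[5, 3], [6, 4], [4, 5], [7, 2], [0, 0], [1, 0], [0, 1], [1, 1]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic(emails, labels)
print("weights:", weights, "bias:", bias)
print("new email [9, 6] ->", classify(weights, bias, [9, 6]))
print("new email [0, 0] ->", classify(weights, bias, [0, 0]))
```

The learned weights and bias define the linear separating plane mentioned in the quote: emails on one side are labeled spam, those on the other nonspam.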

In unsupervised learning, you don't have a labeled training data set. Instead, you explore the available data, try to group it, build a model, get feedback and improve.

Unsupervised learning typically has the segment below:

Clustering.

In clustering, a common algorithm is K-means.
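A minimal K-means sketch in plain Python, to show the assign-then-update loop. The points are made up, and using the first k points as starting centroids is a simplification for repeatability; real implementations usually pick starting centroids randomly or with k-means++.

```python
def kmeans(points, k, iterations=10):
    """Basic K-means: assign each point to its nearest centroid, then recompute centroids."""
    # Assumption: naive initialization from the first k points.
    centroids = [points[i] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its assigned points.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
    return centroids, clusters

# Hypothetical 2-D points forming two visible groups; note there are no labels.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 8.5), (1.0, 0.5), (8.5, 9.0)]

centroids, clusters = kmeans(points, k=2)
print("centroids:", centroids)
print("clusters:", clusters)
```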
For collaborative filtering and recommendations, a commonly used algorithm is Alternating Least Squares (ALS).
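To show the "alternating" idea, here is a deliberately simplified sketch in plain Python that uses a single latent factor per user and per item (rank 1), so each least-squares step has a closed form. The ratings, the function name als_rank1, and the regularization value are assumptions for illustration; library implementations use higher ranks and solve a small linear system at each step.

```python
def als_rank1(ratings, n_users, n_items, lam=0.01, iterations=50):
    """Rank-1 ALS: approximate each rating as user_factor * item_factor.

    ratings maps (user, item) -> observed rating; missing pairs are unknown.
    """
    user_f = [1.0] * n_users
    item_f = [1.0] * n_items
    for _ in range(iterations):
        # Hold item factors fixed; solve the regularized least squares for each user.
        for user in range(n_users):
            seen = [(i, r) for (u, i), r in ratings.items() if u == user]
            num = sum(r * item_f[i] for i, r in seen)
            den = lam + sum(item_f[i] ** 2 for i, _ in seen)
            user_f[user] = num / den
        # Hold user factors fixed; solve the regularized least squares for each item.
        for item in range(n_items):
            seen = [(u, r) for (u, i), r in ratings.items() if i == item]
            num = sum(r * user_f[u] for u, r in seen)
            den = lam + sum(user_f[u] ** 2 for u, _ in seen)
            item_f[item] = num / den
    return user_f, item_f

# Hypothetical ratings: user 1 rates everything twice as high as user 0.
# The pair (user 1, item 2) is unobserved and is what we want to recommend on.
ratings = {
    (0, 0): 1.0, (0, 1): 2.0, (0, 2): 2.5,
    (1, 0): 2.0, (1, 1): 4.0,
}

user_f, item_f = als_rank1(ratings, n_users=2, n_items=3)
predicted = user_f[1] * item_f[2]
print("predicted rating for user 1, item 2:", predicted)
```

Since the observed ratings follow an exact pattern, the predicted value should land close to 5; real rating data is noisier and needs more factors.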


