Thompson sampling is a heuristic, probabilistic algorithm for choosing actions that addresses the exploration-exploitation dilemma in the multi-armed bandit problem. It consists of choosing the action that maximizes the expected reward with respect to a randomly drawn belief.

Daniel Campos
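
As a minimal sketch of the idea, assume each action (arm) pays a reward of 0 or 1 and that our belief about each arm's success rate is a Beta distribution. The function name `thompson_sampling`, the true success rates and the number of rounds are all hypothetical choices made for illustration.

```python
import random

def thompson_sampling(true_rates, n_rounds=2000, seed=0):
    """Run Thompson sampling on Bernoulli arms; return pulls per arm."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(n_rounds):
        # Randomly drawn belief: one sampled success rate per arm,
        # taken from a Beta(successes + 1, failures + 1) posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Choose the action that is best under the sampled belief.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Simulated environment (hypothetical true rates).
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return [s + f for s, f in zip(successes, failures)]

pulls = thompson_sampling([0.2, 0.5, 0.8])
print(pulls)
```

Early on the sampled beliefs are wide, so every arm gets tried (exploration). As evidence accumulates, the samples concentrate and the best arm is chosen most of the time (exploitation).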

Machine Learning A-Z

Thoughts and flashcards derived from Udemy’s course Machine Learning A-Z.

Machine Learning

Tensorflow

Python

Data Science

Daniel Campos

Preprocessing consists of making the dataset suitable for Machine Learning algorithms so that the models generated from it can provide more accurate results.

Daniel Campos

1. Getting the dataset: the data is the most important part of any Machine Learning project and it must reflect reality.

Daniel Campos

2. Handle missing data: we can delete the entry, use the column average, or use a predictive model to estimate the missing value.

Daniel Campos

3. Handle categorical data: if a variable is not numerical, it is categorical and must be handled accordingly. We can assign a numerical value to each category or create dummy variables, one per category, set to 1 if the category is present and 0 otherwise.

Daniel Campos

When using dummy variables for categorical data, we must remove one dummy variable per categorical feature to avoid the Dummy Variable Trap: the full set of dummies is perfectly collinear (they always sum to 1), which harms linear models.

Daniel Campos

4. Split the dataset into Training and Testing sets: a common percentage used is 70% for Training and 30% for Testing.

Daniel Campos

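5. Feature Scaling: many algorithms, especially those based on distances or gradient descent, perform poorly if variables are not on comparable scales. Some library implementations scale features internally, but many do not, so check the documentation before relying on it.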

Daniel Campos
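
The preprocessing steps 2 to 5 can be sketched as follows. This is an illustration using only the Python standard library so that each step stays visible; the toy dataset and the helper names (`impute_mean`, `dummy_encode`, `standardize`) are hypothetical. In practice, pandas and scikit-learn offer equivalents such as `SimpleImputer`, `get_dummies`, `train_test_split` and `StandardScaler`.

```python
import random
from statistics import mean, pstdev

# Hypothetical toy dataset: (country, age, salary); None marks a missing value.
rows = [
    ("France", 44, 72000.0), ("Spain", 27, 48000.0),
    ("Germany", 30, 54000.0), ("Spain", 38, 61000.0),
    ("Germany", 40, None), ("France", 35, 58000.0),
    ("Spain", None, 52000.0), ("France", 48, 79000.0),
    ("Germany", 50, 83000.0), ("France", 37, 67000.0),
]

def impute_mean(values):
    """Step 2: replace missing entries with the column average."""
    avg = mean(v for v in values if v is not None)
    return [avg if v is None else v for v in values]

def dummy_encode(values):
    """Step 3: one 0/1 column per category, dropping the first
    category to avoid the Dummy Variable Trap."""
    kept = sorted(set(values))[1:]
    return [[1 if v == c else 0 for c in kept] for v in values]

def standardize(train_col, test_col):
    """Step 5: scale both sets with statistics from the training set only."""
    mu, sigma = mean(train_col), pstdev(train_col)
    return ([(v - mu) / sigma for v in train_col],
            [(v - mu) / sigma for v in test_col])

ages = impute_mean([r[1] for r in rows])
salaries = impute_mean([r[2] for r in rows])
dummies = dummy_encode([r[0] for r in rows])

# Step 4: shuffle the row indices, then keep 70% for training.
indices = list(range(len(rows)))
random.Random(0).shuffle(indices)
cut = round(0.7 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

train_ages, test_ages = standardize(
    [ages[i] for i in train_idx], [ages[i] for i in test_idx]
)
```

Note that the scaling statistics come from the training rows only, so no information from the test set leaks into the model.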

NumPy is a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

Daniel Campos

Pandas is a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series.

Daniel Campos

Scikit-learn is a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms.

Daniel Campos

Regression analysis is a set of statistical processes for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables.

Daniel Campos

Linear Regression is a linear approach for modelling the relationship between a scalar dependent variable y and one or more independent variables denoted X. The case of one independent variable is called Simple Linear Regression. For more than one independent variable, the process is called Multiple Linear Regression.

Daniel Campos
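
For Simple Linear Regression, the least-squares line has a closed form, sketched below. The function name and the data points are hypothetical and only illustrate the calculation.

```python
from statistics import mean

def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # slope = covariance(x, y) / variance(x)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical data: one independent variable x, one dependent variable y.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
intercept, slope = fit_simple_linear_regression(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f} * x")
```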

In Multiple Linear Regression, one of the most used strategies to identify the best variables to use is Backward Elimination.

Daniel Campos

For both R Squared and Adjusted R Squared, the closer to 1, the better.

Daniel Campos

R Squared could become biased when you add a new independent variable whereas Adjusted R Squared penalizes independent variables that don't improve the overall model.

Daniel Campos

R Squared never decreases when you add new independent variables, even if they are irrelevant.

Daniel Campos

The adjusted R-squared is a modified version of R-squared that has been adjusted for the number of predictors in the model.

Daniel Campos
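
The usual adjustment can be written as a small function. The sketch below assumes the standard formula 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors; the numbers passed in are hypothetical.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r_squared) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Hypothetical model: 50 observations, 5 predictors, R^2 = 0.9000.
before = adjusted_r_squared(0.9000, n_samples=50, n_predictors=5)

# Adding a sixth, nearly useless predictor nudges R^2 up to 0.9005,
# but the penalty for the extra predictor outweighs the gain.
after = adjusted_r_squared(0.9005, n_samples=50, n_predictors=6)

print(before, after)
```

Here R Squared rises while Adjusted R Squared falls, which is the signal used to drop a variable.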

When evaluating a Multiple Linear Regression model, it's better to look at the Adjusted R Squared value as you perform Backward Elimination.

Daniel Campos

Polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an nth degree polynomial in x. It's been used to describe nonlinear phenomena such as the progression of disease epidemics.

Daniel Campos

The idea of Support Vector Regression (SVR) is based on the computation of a linear regression function in a high-dimensional feature space where the input data are mapped via a nonlinear function.

Daniel Campos

A Decision Tree builds regression or classification models in the form of a tree structure. It breaks down a dataset into smaller and smaller subsets while at the same time an associated decision tree is incrementally developed. The final result is a tree with decision nodes and leaf nodes.

Daniel Campos

Random Forests are an ensemble learning method for classification, regression and other tasks, that operate by constructing a multitude of decision trees at training time and outputting the class that is the mode of the classes (classification) or mean prediction (regression) of the individual trees.

Daniel Campos
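
The aggregation step can be sketched in a few lines. No trees are trained here: the per-tree outputs are hypothetical values standing in for the predictions of already-fitted trees.

```python
from statistics import mean, mode

def aggregate_classification(tree_votes):
    """Classification: return the most common class among the trees."""
    return mode(tree_votes)

def aggregate_regression(tree_predictions):
    """Regression: return the average of the trees' predictions."""
    return mean(tree_predictions)

# Hypothetical outputs of five classification trees and four regression trees.
forest_class = aggregate_classification(["spam", "ham", "spam", "spam", "ham"])
forest_value = aggregate_regression([10.0, 12.0, 11.0, 13.0])
print(forest_class, forest_value)
```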

Linear Regression Pros: Works on any size of dataset, gives information about the relevance of features.

Daniel Campos

Linear Regression Cons: Relies on the Linear Regression assumptions, e.g. linearity.

Daniel Campos

Polynomial Regression Pros: Works on any size of dataset, works very well on nonlinear problems.

Daniel Campos

Polynomial Regression Cons: Need to choose the right polynomial degree for a good bias/variance tradeoff.

Daniel Campos

Support Vector Regression (SVR) Pros: Easily adaptable, works very well on nonlinear problems, not biased by outliers.

Daniel Campos

Support Vector Regression (SVR) Cons: Mandatory to apply feature scaling, not well known, more difficult to understand.

Daniel Campos

Decision Tree Regression Pros: Interpretability, no need for feature scaling, works on both linear / nonlinear problems.

Daniel Campos

Decision Tree Regression Cons: Poor results on too small datasets, overfitting can easily occur.

Daniel Campos

Random Forest Regression Pros: Powerful and accurate, good performance on many problems, including nonlinear ones.

Daniel Campos

Random Forest Regression Cons: No interpretability, overfitting can easily occur, need to choose the number of trees.

Daniel Campos

Logistic Regression is a regression model where the dependent variable is categorical and binary, that is, where the output can take only two values, '0' and '1', which represent outcomes such as pass/fail, win/lose, alive/dead or healthy/sick.

Daniel Campos

Logistic Regression is a linear classifier: it separates the two categories with a straight line (a hyperplane in higher dimensions) and uses a Sigmoid function to turn a linear combination of the features into a probability between 0 and 1.

Daniel Campos
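
A minimal sketch of how the Sigmoid function produces a binary prediction. The weights and bias below are hypothetical, as if they had already been learned; no training is shown.

```python
import math

def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1 / (1 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of class 1: sigmoid of a linear combination of features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def predict(features, weights, bias, threshold=0.5):
    """Class 1 if the probability reaches the threshold, else class 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical learned parameters for two features.
weights, bias = [0.8, -0.5], -1.0

print(predict([3.0, 1.0], weights, bias))  # linear score is positive
print(predict([0.0, 2.0], weights, bias))  # linear score is negative
```

With a threshold of 0.5, the decision boundary is exactly where the linear score equals zero, which is why the classifier is linear.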

In a KNN (K-Nearest Neighbours) algorithm, a test sample is assigned the class of the majority of its nearest neighbours. In plain words, if you are similar to your neighbours, then you are one of them.

Daniel Campos

KNN can be used for both classification and regression predictive problems. However, it is more widely used in classification problems in the industry.

Daniel Campos

The 'K' in the KNN algorithm is the number of nearest neighbours whose votes we count. A commonly used value for 'K' is 5.

Daniel Campos

KNN uses a distance measurement to classify new elements. The most used distance formula is Euclidean distance.

Daniel Campos
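
The KNN cards above can be sketched as a short classifier using Euclidean distance and a majority vote. The function name `knn_predict` and the training points are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=5):
    """Classify `query` by majority vote among its k nearest neighbours."""
    # Euclidean distance from the query to every training point.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Hypothetical training data: two small clusters in two dimensions.
points = [(1, 1), (1, 2), (2, 1), (5, 5), (6, 5), (5, 6)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (1.5, 1.5)))
print(knn_predict(points, labels, (5.5, 5.5)))
```

Because the vote depends on raw distances, features on very different scales should be scaled first, otherwise the largest-valued feature dominates.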

A Support Vector Machine (SVM) is a supervised machine learning algorithm that can be employed for both classification and regression purposes. However, SVMs are more commonly used in classification problems.

Daniel Campos

SVMs are based on the idea of finding a hyperplane that best divides a dataset into two classes.

Daniel Campos

Support vectors are the data points nearest to the hyperplane, the points of a data set that, if removed, would alter the position of the dividing hyperplane. Because of this, they can be considered the critical elements of a data set.

Daniel Campos

SVM is used for text classification tasks such as category assignment, detecting spam and sentiment analysis. It is also commonly used for image recognition challenges, performing particularly well in aspect-based recognition and color-based classification.

Daniel Campos

SVM isn’t suited to larger datasets as the training time with SVMs can be high and it's less effective on noisier datasets with overlapping classes.

Daniel Campos

The scikit-learn implementation of SVC is based on libsvm. The fit time complexity is more than quadratic in the number of samples, which makes it hard to scale to datasets with more than a few tens of thousands of samples.

Daniel Campos

Kernel SVM: a kernel is a shortcut that helps us do certain calculations faster, which would otherwise involve computations in a higher-dimensional space. Typical kernel functions include Gaussian RBF, Sigmoid and Polynomial.

Daniel Campos
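
To make the shortcut concrete, here is a sketch for one specific case: the degree-2 polynomial kernel K(x, y) = (x · y)² on two-dimensional inputs. The explicit feature map `phi` is the standard one for this kernel; the input vectors are hypothetical.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def polynomial_kernel(x, y):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return dot(x, y) ** 2

def phi(x):
    """Explicit mapping of a 2-D point into the 3-D feature space."""
    x1, x2 = x
    return (x1 ** 2, math.sqrt(2) * x1 * x2, x2 ** 2)

x, y = (1.0, 2.0), (3.0, 4.0)

shortcut = polynomial_kernel(x, y)   # one dot product in 2-D, then a square
explicit = dot(phi(x), phi(y))       # map both points to 3-D, then dot product

print(shortcut, explicit)
```

Both routes give the same value up to floating-point rounding, but the kernel never builds the higher-dimensional vectors. For kernels such as the Gaussian RBF, the implicit feature space is infinite-dimensional, so the shortcut is the only practical route.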

Naive Bayes is a collection of classification algorithms based on Bayes' Theorem. It is not a single algorithm but a family of algorithms that all share a common principle: every feature being classified is assumed to be independent of the value of any other feature, given the class.

Daniel Campos

In a nutshell, the algorithm allows us to predict a class from a set of features using probability. For example, we could predict whether a fruit is an apple, orange or banana (class) based on its colour, shape, etc. (features).

Daniel Campos
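
The fruit example can be sketched as a small categorical Naive Bayes classifier. The training data, the function names and the use of add-one (Laplace) smoothing are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train_naive_bayes(samples, labels):
    """Count classes and, per class, the values seen for each feature."""
    class_counts = Counter(labels)
    # feature_counts[label][feature_index][value] -> number of occurrences
    feature_counts = defaultdict(lambda: defaultdict(Counter))
    for features, label in zip(samples, labels):
        for i, value in enumerate(features):
            feature_counts[label][i][value] += 1
    return class_counts, feature_counts

def predict_class(model, features, n_values):
    """Return the class with the highest prior * product of likelihoods."""
    class_counts, feature_counts = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(class)
        for i, value in enumerate(features):
            # P(feature value | class) with add-one smoothing; the features
            # are treated as independent given the class.
            seen = feature_counts[label][i][value]
            score *= (seen + 1) / (count + n_values[i])
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training data: (colour, shape) -> fruit.
samples = [
    ("red", "round"), ("green", "round"), ("red", "round"),
    ("orange", "round"), ("orange", "round"),
    ("yellow", "long"), ("yellow", "long"), ("green", "long"),
]
labels = ["apple", "apple", "apple", "orange", "orange",
          "banana", "banana", "banana"]

model = train_naive_bayes(samples, labels)
n_values = [4, 2]  # 4 distinct colours, 2 distinct shapes

print(predict_class(model, ("yellow", "long"), n_values))
print(predict_class(model, ("red", "round"), n_values))
```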

A Decision Tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences. A decision tree is also a way of visually representing an algorithm.

Daniel Campos