Machine learning with R


If you want to find the structure hidden behind your data, this is the right class for you. You will learn how to group similar observations using Clustering, how to “naturally” aggregate your variables using Dimensionality Reduction, and how to predict outcomes using Regression and Classification (LMs, GLMs, Trees, Random forests, Neural networks). In other words, you will get a full immersion in the world of Data Mining and Machine Learning, using R.


This course is for anyone who already uses R and wants an overview of machine learning techniques with R. Some background in theoretical statistics, probability, and linear and logistic regression is required.


8 attendees max.

Course organization

The first day opens with an introduction to the main machine learning issues, followed by a review of regression methods for prediction and of techniques for variable selection, collinearity reduction, and obtaining the best predictions from a regression model. After that, classification techniques are presented: first we’ll go through popular supervised learning techniques such as classification trees and random forests, and then we will present further classifiers such as nearest neighbours and support vector machines.

The second day begins with a review of methods for finding “natural subgroups” within data (hierarchical and non-hierarchical Cluster Analysis). As dimensionality reduction is often an issue, we will cover the most popular techniques for data/dimensionality reduction; such techniques (Multidimensional Scaling, Principal Components Analysis, Correspondence Analysis) allow the analyst to “extract” the most relevant information from data, reducing the number of variables to analyze. Finally, Neural Networks are presented as a powerful tool for extracting patterns and detecting trends that are too complex to be noticed by other computational techniques.


  • Introduction
  • Regression techniques
  • Classification techniques (LDA, CLASS, KNN)
  • Clustering (HC, NHC)
  • Dimensionality reduction (MDS, PCA, CA)
  • Neural networks


The cost of the 2-day course is 800 + VAT per person. This includes lunch, comprehensive course materials, and 1 hour of individual online post-course support for each student within 30 days of the course date.


We offer an academic discount for those engaged in full-time studies or research and for private attendees. For them, the cost of the 2-day course is 500 + VAT.




Date to be announced.



Via Vitruvio 1
20124 Milano


Enrico Pegoraro
Enrico Pegoraro works in R training and consulting, with a special focus on Six Sigma, industrial statistical analysis and corporate training courses. Enrico graduated in Statistics from the University of Padua.
He has taught statistical models and R for hundreds of hours in specialized and applied courses at universities, in master's programmes and in companies.