Data Mining with R

 

This course introduces some of the most important and popular data-mining techniques and their application in R.
Data mining is the computational process of discovering patterns in large data sets.
During the two-day course we will review a wide variety of techniques for extracting information from large amounts of data: dimensionality reduction, clustering, classification and prediction will be presented through examples and explored in depth.

Audience

Anyone who already uses R and wants an overview of data-mining techniques with R. Some background in theoretical statistics, probability, and linear and logistic regression is required.

Attendees

6 attendees max.

Course organization

The course will start with an introduction to basic methods for data description. After that, we will review the most popular techniques for data/dimensionality reduction, such as Multidimensional Scaling, Principal Components Analysis and Correspondence Analysis. Next, we will focus on methods for finding “natural subgroups” within data, such as hierarchical and non-hierarchical Cluster Analysis and Gaussian Mixture Models.

The end of the first day and the beginning of the second day will cover techniques for classification analysis (Linear/Quadratic Discriminant Analysis, Logistic Regression, K-Nearest Neighbors, …).

Finally, in the remaining part of the second day, we will review some techniques for variable selection, collinearity reduction, and improving prediction in regression models (PCA regression, Ridge Regression, Lasso Regression, Elastic-Net regression, …).

 

Outline

  • Univariate Descriptive Statistics
  • Reduction of Data Dimensions (MDS, PCA and EFA, CA)
  • Clustering (HC, NHC, GMM)
  • Classification (LDA, KNN) 
  • Prediction (Several techniques to model data)

Cost

The cost of the 2-day course is 800 + VAT per person, which includes lunch, comprehensive course materials, and 1 hour of individual online post-course support for each student within 30 days of the course date.

Discounts

We offer an academic discount for those engaged in full-time studies or research. Please contact us for further information.

 

Date

The next session will be in Spring; the dates will be available soon. For any further information, you can contact us here.

Location

Quantide premises
Corso Italia, 85
20025 Legnano, MI
Italy

Teacher

Enrico Pegoraro
Enrico Pegoraro works in R training and consulting, with a special focus on Six Sigma, industrial statistical analysis and corporate training courses. Enrico graduated in Statistics from the University of Padua.
He has taught statistical models and R for hundreds of hours in specialized and applied courses at universities, in master's programs and in companies.