R for big data


This course presents the latest techniques for working with big data within the R environment: manipulating, analyzing, and visualizing data structures that exceed a single computer's capacity, in true R style. The vast amount of data available today is a tangled and largely hidden source of knowledge: the ability to quickly and effectively extract high-value information from it is a powerful driver of success in today's competitive market.


This course is suitable for those who already use R. No previous knowledge of big data technologies is required.


6 attendees max.

Course organization

During this course you will become familiar with the basic IT infrastructure behind big data, the R toolbox for accessing and manipulating big data structures, the Spark ML libraries for out-of-memory data modeling, and ad hoc techniques for big data visualization.


  • What big data is
  • Principles of big data analysis
  • The Hadoop ecosystem
  • Big data manipulation with sparklyr
  • Distributed machine learning with the Spark ML libraries
  • Data reduction, in-memory analysis, and visualization
  • Introduction to big databases and the big data ecosystem
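As a small taste of the sparklyr workflow covered in the course, the sketch below connects R to a local Spark instance and manipulates a Spark table with familiar dplyr verbs. It assumes Spark is installed locally (sparklyr's `spark_install()` can download it); the dataset and aggregation are illustrative only, not course material.

```r
library(sparklyr)
library(dplyr)

# Connect to a local Spark instance
# (run spark_install() first if Spark is not yet installed)
sc <- spark_connect(master = "local")

# Copy a sample data frame into Spark; it becomes a remote table
mtcars_tbl <- copy_to(sc, mtcars)

# Standard dplyr verbs are translated to Spark SQL and run
# out of R's memory; collect() brings the small result back into R
mtcars_tbl %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg)) %>%
  collect()

spark_disconnect(sc)
```

The key idea is that the heavy lifting happens inside Spark: only the aggregated result is pulled back into the R session.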


The cost of the 2-day course is 1.000 + VAT per person, which includes lunch, comprehensive course materials, and 1 hour of individual online post-course support for each student within 30 days of the course date.


We offer an academic discount for those engaged in full-time study or research. Please contact us for further information.


The next session will take place in spring; the date will be announced soon. For any further information, you can contact us here.


Quantide premises
Corso Italia, 85
20025 Legnano, MI


Andrea Spanò
Andrea Spanò is an RStudio-certified instructor who has worked as an R trainer and consultant for over 20 years. He runs the consulting firm Quantide and teaches in the postgraduate course on Big Data Management at Luiss University.

Andrea Melloncelli
Andrea holds a degree in Physics. He has solid experience in R, C, C++, and Python programming and development, along with extensive skills in Unix system administration, IT automation tools, cloud technologies, and big data platforms such as Hadoop and Spark.