HPC, Parallel and Distributed Computing with R
This class introduces the main constraints of parallel programming and describes the main parallelization paradigms and how they differ (shared memory, distributed computation, map-reduce, future, …). We will work through exercises with several R packages (“future”, “parallel”, “foreach”, and others) and measure the resulting speedup.
Topics include
- Introduction to HPC, parallel and distributed computing
- Shared Memory and Distributed Computation
- Map and Reduce Paradigm
- The parallel, foreach, and doSNOW packages
- Future
- doSNOW Cluster
- Understand the parallelization overhead
- Integrating C++ with R
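As a rough illustration of the map-reduce and timing topics above, the following sketch uses only the parallel package that ships with R. The function slow_square, the two-worker cluster size, and the 0.05-second sleep are assumptions made for the example, not course material.

```r
library(parallel)

# Hypothetical workload: a deliberately slow function applied to each input.
slow_square <- function(x) {
  Sys.sleep(0.05)  # stand-in for a real computation
  x^2
}

inputs <- 1:8

# Sequential baseline: map with lapply, then reduce with Reduce.
t_seq <- system.time({
  seq_total <- Reduce(`+`, lapply(inputs, slow_square))
})

# Parallel version: the map step runs on a small socket cluster
# (2 workers is an assumption; adjust to the available cores).
cl <- makeCluster(2)
t_par <- system.time({
  par_total <- Reduce(`+`, parLapply(cl, inputs, slow_square))
})
stopCluster(cl)

# Both versions must agree on the result; only the elapsed time differs.
print(seq_total == par_total)
print(c(sequential = unname(t_seq["elapsed"]),
        parallel   = unname(t_par["elapsed"])))
```

The elapsed times depend on the machine, so no particular speedup should be expected; cluster start-up is kept outside the timed block here, which hides part of the real overhead.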
What you will be able to do
- Understand how to rewrite your code in a parallelizable way.
- Use different parallelization paradigms and libraries.
- Recognize bottlenecks and benchmark your code.
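A minimal sketch of what these outcomes might look like in practice: a loop is rewritten so that each iteration depends only on its index, and the sequential and parallel versions are then timed. The task function, the problem size n, and the two-worker cluster are hypothetical choices for illustration.

```r
library(parallel)

n <- 200

# Loop form: each iteration writes into a preallocated result vector.
res_loop <- numeric(n)
for (i in seq_len(n)) {
  res_loop[i] <- mean(sqrt(seq_len(i)))
}

# Map form: each iteration is a function of its index only,
# so iterations can run in any order or on any worker.
task <- function(i) mean(sqrt(seq_len(i)))
res_map <- vapply(seq_len(n), task, numeric(1))

# Benchmark the sequential map against a parallel map.
cl <- makeCluster(2)  # assumed worker count
t_seq <- system.time(vapply(seq_len(n), task, numeric(1)))
t_par <- system.time(res_par <- unlist(parLapply(cl, seq_len(n), task)))
stopCluster(cl)

print(isTRUE(all.equal(res_map, res_par)))
print(c(sequential = unname(t_seq["elapsed"]),
        parallel   = unname(t_par["elapsed"])))
```

Because each task here is very cheap, the parallel run may well be no faster than the sequential one: the cost of sending work to the workers and collecting results can outweigh the computation itself, which is the kind of bottleneck benchmarking is meant to reveal.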
Duration
1 day.
Prerequisites
Good working knowledge of R programming, including loops and functions.
Audience
This is an advanced course for professionals and researchers who need to improve algorithm performance by parallelizing computations to take full advantage of the available hardware. The course explains the basics of parallelization so that participants can identify where and how it delivers a real performance improvement.