This vignette explains the functionality of the RcppThread package:
- checkUserInterrupt() and Rcout,
- Thread: an interruptible thread class that otherwise behaves like std::thread,
- ThreadPool: a Thread-based class implementing the thread pool pattern for easy and flexible parallelism.
Calling multi-threaded C++ code from R is a bit tricky because R is single-threaded. Using R's C API from multiple threads can crash your R session or cause other unexpected behavior. In particular, communication between your C++ code and R is problematic. We can neither check for user interruptions during long computations nor print messages to the R console.
It is possible to resolve this, but not without effort. RcppThread relieves us of that burden.
It is not safe to call R's C API from multiple threads. It is safe, however,
to call it from the master thread. That's the idea behind RcppThread's
checkUserInterrupt() and Rcout. They behave almost like their Rcpp versions, but
only communicate with R when called from the master thread. Here's an example
of their use with std::thread:
#include <RcppThread.h>
#include <chrono> // std::chrono::seconds
#include <thread> // C++11 threads

// [[Rcpp::export]]
void pyjamaParty()
{
    // some work that will be done in separate threads
    auto work = [] {
        auto id = std::this_thread::get_id();
        std::this_thread::sleep_for(std::chrono::seconds(1));
        RcppThread::Rcout << id << " slept for one second" << std::endl;
        // Rcpp::checkUserInterrupt();     // R would crash
        RcppThread::checkUserInterrupt();  // does not crash
        std::this_thread::sleep_for(std::chrono::seconds(1));
        RcppThread::Rcout << id << " slept for another second" << std::endl;
    };

    // create two new threads
    std::thread t1(work);
    std::thread t2(work);

    // wait for threads to finish work
    t1.join();
    t2.join();
}
RcppThread::checkUserInterrupt() and RcppThread::Rcout are thread safe, no
matter what threading framework is used. But be careful: they only communicate
with R from the master thread. When called from another thread,
- checkUserInterrupt() does nothing. This drastically limits its usefulness if it is only called from non-master threads. This is usually not an issue with OpenMP, but it is with many other frameworks such as std::thread or TinyThread.
- Rcout only stores the messages, but doesn't print them to the R console. This is less of a problem: calling RcppThread::Rcout << "" once from the master thread after the parallel computations have finished releases all messages left in the buffer.
Because of this, the above example is not interruptible and prints nothing to the R console. Conveniently, RcppThread also provides two threading classes that resolve these issues, see below.
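To picture what "only stores the messages" means, here is a minimal sketch in plain C++11. It is not RcppThread's implementation: the class BufferedPrinter and its members are hypothetical names, and a std::ostream stands in for the R console. Messages from any thread go into a buffer, and the buffer is released only when print() is called from the thread that created the object.

```cpp
#include <cstddef>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>

// Hypothetical illustration, not RcppThread's actual code:
// messages from any thread are stored; only the main thread prints them.
class BufferedPrinter {
public:
    // The constructing thread is treated as the "main" thread.
    explicit BufferedPrinter(std::ostream& out)
        : out_(out), mainId_(std::this_thread::get_id()) {}

    // Store a message; release the buffer only when called from the main thread.
    void print(const std::string& msg) {
        std::lock_guard<std::mutex> lock(mtx_);
        buffer_ << msg;
        if (std::this_thread::get_id() == mainId_) {
            out_ << buffer_.str();
            buffer_.str("");  // empty the buffer after releasing it
        }
    }

    // Number of characters currently waiting in the buffer.
    std::size_t pending() {
        std::lock_guard<std::mutex> lock(mtx_);
        return buffer_.str().size();
    }

private:
    std::ostream& out_;
    std::thread::id mainId_;
    std::mutex mtx_;
    std::ostringstream buffer_;
};
```

In this sketch, calling print("") from the main thread after the workers have been joined plays the role of RcppThread::Rcout << "" described above.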
If you want to do some clean-up before interrupting, isInterrupted() returns a
bool with the interruption status.
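The clean-up pattern might look like the following sketch. To keep it self-contained, a std::atomic<bool> stands in for the interruption status; the function fillUntilStopped and the flag are hypothetical names chosen for illustration. In code using RcppThread, the check would query isInterrupted() instead.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Appends up to n values to `scratch` and returns how many were added.
// Before each step it checks the stop flag (a stand-in for the
// interruption status); if the flag is set, it discards all contents of
// `scratch` as a clean-up step and returns early.
std::size_t fillUntilStopped(std::vector<double>& scratch, std::size_t n,
                             const std::atomic<bool>& stopRequested)
{
    for (std::size_t i = 0; i < n; ++i) {
        if (stopRequested.load()) {
            scratch.clear();  // clean-up before stopping
            return i;
        }
        scratch.push_back(static_cast<double>(i));
    }
    return n;
}
```

The point of the design is that the worker decides where stopping is safe, so resources are never left in a half-finished state.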
As of C++11, the C++ standard library provides the class std::thread
for executing code in parallel. RcppThread's Thread class is an R-friendly
wrapper around std::thread.
Instances of class Thread behave just like instances of std::thread, with
one exception: Whenever other threads are doing some work, the master thread
periodically synchronizes with R. When the user interrupts a threaded
computation, any thread will stop as soon as it encounters a
checkUserInterrupt().
// [[Rcpp::export]]
void pyjamaParty2()
{
    // some work that will be done in separate threads
    auto work = [] {
        auto id = std::this_thread::get_id();
        std::this_thread::sleep_for(std::chrono::seconds(1));
        RcppThread::Rcout << id << " slept for one second" << std::endl;
        RcppThread::checkUserInterrupt();
        std::this_thread::sleep_for(std::chrono::seconds(1));
        RcppThread::Rcout << id << " slept for another second" << std::endl;
    };

    // create two new threads
    RcppThread::Thread t1(work);
    RcppThread::Thread t2(work);

    // wait for threads to finish work
    t1.join();
    t2.join();
}
Result after interrupting:
> pyjamaParty2()
139862637561600 slept for one second
139862645954304 slept for one second
Error in pyjamaParty2() : C++ call interrupted by user
A thread pool is an abstraction for executing tasks in parallel. A thread pool consists of a fixed number of threads that wait for incoming tasks. Whenever a new task arrives, a waiting worker fetches the task and does the work.
Besides being easy to use, a thread pool is helpful when
The class ThreadPool implements the thread pool pattern using Thread
objects and, thus, plays nicely with RcppThread::checkUserInterrupt() and
RcppThread::Rcout.
// [[Rcpp::export]]
void pool()
{
    // set up thread pool with 3 threads
    RcppThread::ThreadPool pool(3);

    // each task prints the id of the thread that fetched it
    auto task = [] () {
        std::this_thread::sleep_for(std::chrono::seconds(1));
        RcppThread::checkUserInterrupt();
        RcppThread::Rcout <<
            "Task fetched by thread " << std::this_thread::get_id() << "\n";
    };

    // push 10 tasks to the pool
    for (int i = 0; i < 10; i++)
        pool.push(task);

    // wait for pool to finish work and join all threads
    pool.join();
}
Result after interrupting:
> pool()
Task fetched by thread 140201245763328
Task fetched by thread 140201262548736
Task fetched by thread 140201254156032
Error in pool() : C++ call interrupted by user
You can also wait for all tasks to be done and re-use the thread pool by calling
pool.wait() instead of pool.join() and then pushing new jobs to the pool.
A thread pool created with zero threads will do all work in the main thread.
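For readers curious about the mechanics, the pattern described in this section can be sketched with the standard library alone. This is an illustration under simplifying assumptions, not the code behind RcppThread::ThreadPool: the class name SimplePool is hypothetical, tasks must not throw, nothing may be pushed after join(), and there is no interruption support.

```cpp
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Illustrative sketch of the thread pool pattern (hypothetical class).
class SimplePool {
public:
    // Start a fixed number of workers that wait for incoming tasks.
    explicit SimplePool(std::size_t nThreads) {
        for (std::size_t i = 0; i < nThreads; ++i)
            workers_.emplace_back([this] { workerLoop(); });
    }

    ~SimplePool() { join(); }

    // Queue a task; with zero workers, run it right away in the caller.
    void push(std::function<void()> task) {
        if (workers_.empty()) {
            task();
            return;
        }
        {
            std::lock_guard<std::mutex> lock(mtx_);
            tasks_.push(std::move(task));
            ++unfinished_;
        }
        cvTask_.notify_one();
    }

    // Block until every pushed task has finished; the pool stays usable.
    void wait() {
        std::unique_lock<std::mutex> lock(mtx_);
        cvDone_.wait(lock, [this] { return unfinished_ == 0; });
    }

    // Let the workers finish all queued tasks, then join them.
    void join() {
        {
            std::lock_guard<std::mutex> lock(mtx_);
            stopped_ = true;
        }
        cvTask_.notify_all();
        for (auto& w : workers_)
            if (w.joinable())
                w.join();
    }

private:
    void workerLoop() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lock(mtx_);
                cvTask_.wait(lock, [this] { return stopped_ || !tasks_.empty(); });
                if (tasks_.empty())
                    return;  // stopped and nothing left to do
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run the task outside the lock
            {
                std::lock_guard<std::mutex> lock(mtx_);
                --unfinished_;
            }
            cvDone_.notify_all();
        }
    }

    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex mtx_;
    std::condition_variable cvTask_, cvDone_;
    std::size_t unfinished_ = 0;
    bool stopped_ = false;
};
```

As with the class described above, wait() leaves the workers alive so that more tasks can be pushed, join() ends the pool, and a pool constructed with zero threads simply runs each task in the calling thread.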