There are a number of ways to achieve parallel computing in R. Here I provide some simple comparisons of the performance of different parallelization methods. The methods being compared include:
foreach: provided in the foreach package.
mclapply and parLapply: functions provided in the parallel package. mclapply is simpler: it uses the operating system's underlying fork() functionality to achieve parallelization, but it does not work on Windows, which has no fork(). parLapply is slightly more complicated but more flexible: it requires users to create and manage a cluster, but it is supposed to run on all systems, even across different machines within a network.
bplapply: provided in the BiocParallel package, used here with different BiocParallelParam settings: MulticoreParam, SnowParam with type = "SOCK", and SnowParam with type = "FORK".
In this comparison, the task is to run many linear regressions using 4 cores (numCores = 4 below). I run everything on my MacBook Pro laptop with a 2.5GHz Intel Core i7 CPU and 16GB of RAM.
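Before fixing the number of workers, it can help to check how many cores the machine actually exposes; a minimal sketch using the parallel package:
library(parallel)
detectCores()                   # logical cores (hyper-threaded cores count double)
detectCores(logical = FALSE)    # physical cores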
I first use the following code to simulate the data.
Nlm  = 10000   # number of linear regressions
nobs = 100     # number of observations per regression
X = matrix(rnorm(nobs * 3), ncol = 3)      # covariates, shared across all regressions
Y = matrix(rnorm(Nlm * nobs), nrow = Nlm)  # one response vector per row
So there will be 10,000 linear regressions; each one has 100 observations and 3 covariates.
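All of the worker functions below extract the coefficient of the first covariate with $coef[2] (position 1 is the intercept). As a quick illustration on a single row of Y:
fit = lm(Y[1, ] ~ X)
coef(fit)       # (Intercept), X1, X2, X3
coef(fit)[2]    # slope of the first covariate, which each function collects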
I then create the following functions for the different types of parallelism.
## functions to run many lm
## serial loop, kept as a baseline
runManyLM.loop <- function(X, Y) {
  N = nrow(Y)
  beta = rep(0, N)
  for (i in 1:N) {
    beta[i] = lm(Y[i, ] ~ X)$coef[2]
  }
  beta
}
## use foreach
runManyLM.foreach <- function(X, Y, numCores) {
  registerDoParallel(numCores)
  N = nrow(Y)
  beta = foreach(i = 1:N, .combine = c) %dopar% {
    lm(Y[i, ] ~ X)$coef[2]
  }
  beta
}
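Note that registerDoParallel(numCores) forks worker processes on Unix-alikes. On Windows, one would instead create and register an explicit cluster; a sketch (not benchmarked here):
cl <- makeCluster(numCores)   # PSOCK cluster; also works on Windows
registerDoParallel(cl)
## ... run the foreach() %dopar% loop ...
stopCluster(cl)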
## use mclapply
runManyLM.mclapply <- function(X, Y, numCores) {
  N = nrow(Y)
  foo <- function(i, Y, X) {
    lm(Y[i, ] ~ X)$coef[2]
  }
  beta = mclapply(1:N, foo, Y = Y, X = X, mc.cores = numCores)
  unlist(beta)
}
## use parLapply
runManyLM.parLapply <- function(X, Y, numCores) {
  N = nrow(Y)
  foo <- function(i, Y, X) {
    lm(Y[i, ] ~ X)$coef[2]
  }
  cl <- makeCluster(numCores, type = "FORK")
  beta <- parLapply(cl, 1:N, foo, Y = Y, X = X)
  stopCluster(cl)
  unlist(beta)
}
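The FORK cluster lets the workers inherit foo, X, and Y from the parent process. With the portable PSOCK type (the makeCluster() default), the data would have to be shipped to each worker explicitly; roughly (a sketch, assuming X and Y live in the global environment):
cl <- makeCluster(numCores, type = "PSOCK")
clusterExport(cl, c("X", "Y"))   # copy the data to every worker
beta <- parLapply(cl, 1:nrow(Y), function(i) lm(Y[i, ] ~ X)$coef[2])
stopCluster(cl)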
## use bplapply; the number of workers is carried by BPPARAM
runManyLM.bplapply <- function(X, Y, numCores, BPPARAM) {
  N = nrow(Y)
  foo <- function(i, Y, X) {
    lm(Y[i, ] ~ X)$coef[2]
  }
  beta = bplapply(1:N, foo, Y = Y, X = X, BPPARAM = BPPARAM)
  unlist(beta)
}
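BiocParallel can also register a default backend, so that BPPARAM does not have to be passed explicitly on every call; a minimal sketch:
library(BiocParallel)
register(MulticoreParam(workers = 4))  # becomes the default returned by bpparam()
res <- bplapply(1:4, sqrt)             # runs on the registered backend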
I then benchmark the performance of these functions.
library(parallel)
library(foreach)
library(doParallel)
library(microbenchmark)
library(BiocParallel)
numCores = 4
mParam = MulticoreParam(workers=numCores)
snowSOCK <- SnowParam(workers = numCores, type = "SOCK")
snowFORK <- SnowParam(workers = numCores, type = "FORK")
result <- microbenchmark(
  runManyLM.foreach(X, Y, numCores),
  runManyLM.mclapply(X, Y, numCores),
  runManyLM.parLapply(X, Y, numCores),
  runManyLM.bplapply(X, Y, numCores, mParam),
  runManyLM.bplapply(X, Y, numCores, snowSOCK),
  runManyLM.bplapply(X, Y, numCores, snowFORK),
  times = 50
)
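microbenchmark() reports several quantile columns by default; a compact table like the one below can be obtained by subsetting its summary, e.g. (a sketch; the times appear to be in seconds):
summary(result)[, c("expr", "min", "mean", "max")]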
The results are shown below:
## expr min mean max
## 1 runManyLM.foreach(X, Y, numCores) 3.027713 3.297434 3.923210
## 2 runManyLM.mclapply(X, Y, numCores) 1.636211 1.744566 2.213691
## 3 runManyLM.parLapply(X, Y, numCores) 1.914132 2.059862 2.566976
## 4 runManyLM.bplapply(X, Y, numCores, mParam) 2.285717 2.525748 3.212917
## 5 runManyLM.bplapply(X, Y, numCores, snowSOCK) 6.487970 6.927471 9.762698
## 6 runManyLM.bplapply(X, Y, numCores, snowFORK) 2.311750 2.596356 3.429641
Based on the results, mclapply provides the best performance, with parLapply slightly slower and foreach at about half the speed of mclapply. Among the three bplapply variants, SnowParam with type = "SOCK" is by far the slowest. I am not sure why, but one likely factor is that SOCK workers are fresh R processes that must receive the data by serialization, whereas forked workers share memory with the parent process. The other two variants provide similar performance to each other.