This document compares the performance of several approaches to parallel computing in R. There are a number of ways to achieve parallelism in R; the methods compared here include:
- foreach: provided in the foreach package.
- mclapply and parLapply: functions provided in the parallel package.
  - mclapply is simpler. It uses the operating system's fork() functionality to achieve parallelization. However, it doesn't run in parallel on Windows, since Windows does not have fork().
  - parLapply is slightly more complicated but more flexible. It requires users to create a cluster and pass it to the call. In return, it is supposed to run on all systems, and even across different machines within a network.
- bplapply from the BiocParallel package, with different BiocParallelParam objects, including:
  - MulticoreParam
  - SnowParam with type = "SOCK"
  - SnowParam with type = "FORK"
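To make the difference between the two functions from the parallel package concrete, here is a minimal sketch that uses only base R. The toy vector `x` and the square-root task are made up for illustration and are not part of the benchmark. The point it tries to show is that forked workers see the parent's variables, while socket workers are fresh R processes that need the data passed to them.

```r
library(parallel)

x <- c(1, 4, 9, 16)  # toy data, for illustration only

# Forked workers (Unix-alikes only): the children inherit the parent's
# workspace, so the function can refer to `x` directly.
# On Windows mclapply() only accepts mc.cores = 1, so fall back to that.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
res_fork <- mclapply(seq_along(x), function(i) sqrt(x[i]), mc.cores = cores)

# Socket cluster (portable): the workers start with an empty workspace,
# so `x` is passed as an extra argument to the worker function.
cl <- makeCluster(2)  # default type is "PSOCK"
res_sock <- parLapply(cl, seq_along(x), function(i, x) sqrt(x[i]), x)
stopCluster(cl)

identical(unlist(res_fork), unlist(res_sock))
```

Both calls return a list with one element per input, so `unlist()` is needed to get a plain numeric vector.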
In this comparison, the task is to run many linear regressions in parallel on multiple cores (the benchmark code sets numCores = 4). I ran everything on my MacBook Pro laptop with a 2.5 GHz Intel Core i7 CPU and 16 GB of RAM.
I first use the following code to simulate the data.
Nlm = 10000
nobs = 100
X = matrix( rnorm(nobs*3), ncol=3 )
Y = matrix(rnorm(Nlm*nobs), nrow=Nlm)
So there will be 10,000 linear regressions, each with 100 observations and 3 covariates. All regressions share the same design matrix X; only the response (one row of Y) changes.
I then created the following functions, one for each type of parallelism.
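## functions to run many lm
## serial baseline: a plain for loop
runManyLM.loop <- function(X, Y) {
  N <- nrow(Y)
  beta <- numeric(N)
  for (i in seq_len(N)) {
    ## keep element 2: the coefficient of the first covariate (after the intercept)
    beta[i] <- coef(lm(Y[i, ] ~ X))[2]
  }
  beta  # return the estimates
}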
## use foreach
runManyLM.foreach <- function(X, Y, numCores) {
  registerDoParallel(numCores)  # register the doParallel backend for %dopar%
  N <- nrow(Y)
  beta <- foreach(i = seq_len(N), .combine = c) %dopar% {
    coef(lm(Y[i, ] ~ X))[2]
  }
  beta  # a numeric vector of length N
}
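## use mclapply
runManyLM.mclapply <- function(X, Y, numCores) {
  N <- nrow(Y)
  foo <- function(i, Y, X) {
    coef(lm(Y[i, ] ~ X))[2]
  }
  ## Y and X are passed positionally through `...`; naming them (X = X)
  ## would clash with mclapply's own first argument, which is also called X
  beta <- mclapply(seq_len(N), foo, Y, X, mc.cores = numCores)
  beta  # a list of length N
}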
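## use parLapply
runManyLM.parLapply <- function(X, Y, numCores) {
  N <- nrow(Y)
  foo <- function(i, Y, X) {
    coef(lm(Y[i, ] ~ X))[2]
  }
  ## registerDoParallel() is not needed here: it only sets the foreach
  ## backend, while parLapply takes the cluster object directly
  ## a FORK cluster is only available on Unix-alikes
  cl <- makeCluster(numCores, type = "FORK")
  on.exit(stopCluster(cl))  # shut the workers down even if an error occurs
  beta <- parLapply(cl, seq_len(N), foo, Y, X)
  beta  # a list of length N
}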
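## use bplapply
## numCores is unused here; the number of workers comes from BPPARAM
runManyLM.bplapply <- function(X, Y, numCores, BPPARAM) {
  N <- nrow(Y)
  foo <- function(i, Y, X) {
    coef(lm(Y[i, ] ~ X))[2]
  }
  beta <- bplapply(seq_len(N), foo, Y, X, BPPARAM = BPPARAM)
  beta  # a list of length N
}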
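Before timing anything, it is worth checking that a parallel version returns the same estimates as a serial one. The sketch below does this on a much smaller simulated data set, using only the base parallel package. The helper name `fit_one`, the seed and the sizes (20 regressions, 50 observations) are assumptions chosen for illustration, not part of the benchmark.

```r
library(parallel)

set.seed(1)  # arbitrary seed, for reproducibility of the sketch only
nobs <- 50
X <- matrix(rnorm(nobs * 3), ncol = 3)
Y <- matrix(rnorm(20 * nobs), nrow = 20)

# Hypothetical helper: coefficient of the first covariate for regression i.
fit_one <- function(i, Y, X) unname(coef(lm(Y[i, ] ~ X))[2])

# Serial reference result as a plain numeric vector.
serial <- vapply(seq_len(nrow(Y)), fit_one, numeric(1), Y, X)

# Forked version; mclapply() returns a list, so flatten it for comparison.
# On Windows mclapply() only accepts mc.cores = 1.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
forked <- unlist(mclapply(seq_len(nrow(Y)), fit_one, Y, X, mc.cores = cores))

all.equal(serial, forked)
```

Because the regressions involve no random numbers once the data are fixed, the serial and forked results should agree up to floating point tolerance.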
I then benchmark the performance of these functions, running each one 50 times.
library(parallel)        # mclapply, parLapply, makeCluster
library(foreach)
library(doParallel)
library(microbenchmark)
library(BiocParallel)
numCores = 4
mParam = MulticoreParam(workers = numCores)
snowSOCK <- SnowParam(workers = numCores, type = "SOCK")
snowFORK <- SnowParam(workers = numCores, type = "FORK")
result <- microbenchmark(runManyLM.foreach(X, Y, numCores),
                         runManyLM.mclapply(X, Y, numCores),
                         runManyLM.parLapply(X, Y, numCores),
                         runManyLM.bplapply(X, Y, numCores, mParam),
                         runManyLM.bplapply(X, Y, numCores, snowSOCK),
                         runManyLM.bplapply(X, Y, numCores, snowFORK),
                         times = 50)
The results are shown below:
##   expr                                              min     mean      max
## 1 runManyLM.foreach(X, Y, numCores)            3.027713 3.297434 3.923210
## 2 runManyLM.mclapply(X, Y, numCores)           1.636211 1.744566 2.213691
## 3 runManyLM.parLapply(X, Y, numCores)          1.914132 2.059862 2.566976
## 4 runManyLM.bplapply(X, Y, numCores, mParam)   2.285717 2.525748 3.212917
## 5 runManyLM.bplapply(X, Y, numCores, snowSOCK) 6.487970 6.927471 9.762698
## 6 runManyLM.bplapply(X, Y, numCores, snowFORK) 2.311750 2.596356 3.429641
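The mean column is easier to read as a slowdown relative to the fastest method. The following sketch simply copies the means from the table above and divides them by the mclapply mean; the short labels are made up for readability. The ratios do not depend on the time unit of the table.

```r
# Mean run times copied from the benchmark table above.
means <- c(foreach            = 3.297434,
           mclapply           = 1.744566,
           parLapply          = 2.059862,
           bplapply_multicore = 2.525748,
           bplapply_snow_sock = 6.927471,
           bplapply_snow_fork = 2.596356)

# Slowdown relative to mclapply, rounded to two decimals.
relative <- round(means / means[["mclapply"]], 2)
sort(relative)
```

On these numbers parLapply takes about 1.18 times as long as mclapply, foreach about 1.89 times, the two fork-based bplapply variants about 1.45 to 1.49 times, and the SOCK variant about 3.97 times.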
Based on the results, mclapply provides the best performance. parLapply is slightly slower. foreach runs at about half the speed of mclapply. Among the three bplapply variants, SnowParam with type = "SOCK" is very slow, and I am not sure why. The other two provide similar performance.
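One way to start investigating the slow socket result is to separate worker startup from the actual computation. The sketch below does this with a plain socket cluster from the base parallel package rather than with BiocParallel, so it only illustrates the idea and does not reproduce the benchmark; the `work` function and the input size are made up. Whether startup cost explains the SOCK timing above is an untested assumption.

```r
library(parallel)

work <- function(i) sqrt(i)  # trivial task, for illustration only

# Cluster created inside the timed region: startup cost is included.
t_with_startup <- system.time({
  cl <- makeCluster(2)  # "PSOCK" by default: launches fresh R processes
  r1 <- parLapply(cl, 1:100, work)
  stopCluster(cl)
})[["elapsed"]]

# Cluster created beforehand: only the computation is timed.
cl <- makeCluster(2)
t_reused <- system.time(r2 <- parLapply(cl, 1:100, work))[["elapsed"]]
stopCluster(cl)

c(with_startup = t_with_startup, reused = t_reused)
```

If the first timing is much larger than the second, then most of the cost lies in launching the workers rather than in the work itself, and the same check would be worth repeating with the benchmark functions.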