### Introduction

This document compares different ways of doing parallel computing in R. There are a number of ways to achieve parallelism in R, for example:

• SNOW (Simple Network Of Workstations). One of the earliest packages for parallel computing in R.
• parallel package. Developed by the R core team; it contains most of the functionality of SNOW.
• foreach package. Provides a looping construct whose iterations can be executed in parallel via %dopar%.
• BiocParallel. The “official” parallel computing package in Bioconductor. It aims to provide a unified interface to existing parallel infrastructure, so that code can easily be executed in different environments.

Here I provide some simple comparisons of the performance of different parallel approaches in R. The methods being compared include:

• foreach: provided in foreach package.

• mclapply and parLapply: functions provided in parallel package.

• mclapply is simpler. It uses the operating system’s fork() functionality to achieve parallelization. However, it does not work on Windows, since Windows has no fork().
• parLapply is slightly more complicated but more flexible. It requires users to create and register a cluster, but it is supposed to run on all systems, and can even distribute work across different machines within a network.
• bplapply from BiocParallel package, with different BiocParallelParam, including

• MulticoreParam
• SnowParam with type = "SOCK"
• SnowParam with type = "FORK"
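Before the benchmark, the setup difference between the two parallel interfaces can be sketched on a toy task (my own minimal example; the squaring function and worker count are arbitrary):

```r
library(parallel)

## mclapply: fork-based, no setup required (not available on Windows)
res1 <- mclapply(1:8, function(x) x^2, mc.cores = 2)

## parLapply: create a cluster explicitly, and remember to stop it
cl <- makeCluster(2)
res2 <- parLapply(cl, 1:8, function(x) x^2)
stopCluster(cl)

unlist(res1)  # 1 4 9 16 25 36 49 64
```

Both return a list with one element per input, so unlist() is needed to recover a plain vector.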

### Comparison of running many linear regressions.

In this comparison, the task is to run many linear regressions, using 4 cores. I ran everything on my MacBook Pro laptop with a 2.5 GHz Intel Core i7 CPU and 16 GB RAM.

I first use the following code to simulate the data.

## simulate data: a shared design matrix X and 10,000 response vectors
Nlm = 10000
nobs = 100
X = matrix( rnorm(nobs*3), ncol=3 )
Y = matrix( rnorm(Nlm*nobs), nrow=Nlm )

So there will be 10,000 linear regressions; each one has 100 observations and 3 covariates.
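As a quick sanity check of the indexing used below (my own addition): with a matrix on the right-hand side of the formula, coef[1] is the intercept, so coef[2] is the slope of the first covariate.

```r
set.seed(1)
Xs = matrix(rnorm(100*3), ncol=3)  # small stand-in for the X above
ys = rnorm(100)
fit = lm(ys ~ Xs)
names(coef(fit))  # "(Intercept)" "Xs1" "Xs2" "Xs3"
coef(fit)[2]      # the quantity collected for each regression below
```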

I then create the following functions for the different types of parallelism.

## functions to run many lm

## plain loop
runManyLM.loop <- function(X, Y) {
  N = nrow(Y)
  beta = rep(0, N)
  for(i in 1:N) {
    beta[i] = lm(Y[i,]~X)$coef[2]
  }
  beta
}

## use foreach
runManyLM.foreach <- function(X, Y, numCores) {
  registerDoParallel(numCores)
  N = nrow(Y)
  beta = foreach(i = 1:N, .combine=c) %dopar% {
    lm(Y[i,]~X)$coef[2]
  }
  beta
}

## use mclapply
runManyLM.mclapply <- function(X, Y, numCores) {
  N = nrow(Y)
  foo <- function(i, Y, X) {
    lm(Y[i,]~X)$coef[2]
  }
  beta = mclapply(1:N, foo, Y, X, mc.cores = numCores)
  beta
}

## use parLapply
runManyLM.parLapply <- function(X, Y, numCores) {
  N = nrow(Y)
  foo <- function(i, Y, X) {
    lm(Y[i,]~X)$coef[2]
  }
  cl <- makeCluster(numCores, type="FORK")
  beta <- parLapply(cl, 1:N, foo, Y, X)
  stopCluster(cl)
  beta
}

## use bplapply
runManyLM.bplapply <- function(X, Y, numCores, BPPARAM) {
  N = nrow(Y)
  foo <- function(i, Y, X) {
    lm(Y[i,]~X)$coef[2]
  }
  beta = bplapply(1:N, foo, Y, X, BPPARAM = BPPARAM)
  beta
}

I then benchmark the performance of these functions.

library(foreach)
library(doParallel)
library(microbenchmark)
library(BiocParallel)
numCores = 4
mParam = MulticoreParam(workers=numCores)
snowSOCK <- SnowParam(workers = numCores, type = "SOCK")
snowFORK <- SnowParam(workers = numCores, type = "FORK")

result <- microbenchmark(runManyLM.foreach(X,Y, numCores),
runManyLM.mclapply(X,Y, numCores),
runManyLM.parLapply(X,Y, numCores),
runManyLM.bplapply(X,Y, numCores, mParam),
runManyLM.bplapply(X,Y, numCores, snowSOCK),
runManyLM.bplapply(X,Y, numCores, snowFORK),
times=50)

The results are shown below:

##                                           expr      min     mean      max
## 1            runManyLM.foreach(X, Y, numCores) 3.027713 3.297434 3.923210
## 2           runManyLM.mclapply(X, Y, numCores) 1.636211 1.744566 2.213691
## 3          runManyLM.parLapply(X, Y, numCores) 1.914132 2.059862 2.566976
## 4   runManyLM.bplapply(X, Y, numCores, mParam) 2.285717 2.525748 3.212917
## 5 runManyLM.bplapply(X, Y, numCores, snowSOCK) 6.487970 6.927471 9.762698
## 6 runManyLM.bplapply(X, Y, numCores, snowFORK) 2.311750 2.596356 3.429641

Based on the results, mclapply provides the best performance, with parLapply slightly slower and foreach at roughly half the speed of mclapply. Among the three bplapply variants, SnowParam with type = "SOCK" is by far the slowest; a likely explanation is that SOCK workers are fresh R processes, so the data must be serialized and sent to each worker over sockets, whereas fork-based workers share the parent process's memory. The other two variants perform similarly to each other.
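The serialization cost behind the SOCK slowdown can be illustrated with a small sketch (my own example, not part of the benchmark; the object name and sizes are arbitrary): PSOCK/SOCK workers start as empty R processes and only see data that is explicitly shipped to them, while FORK workers inherit the parent's memory.

```r
library(parallel)
x_big <- rnorm(1e6)

## FORK workers inherit x_big from the parent process (Unix-alikes only)
cl_fork <- makeCluster(2, type = "FORK")
r_fork <- parLapply(cl_fork, 1:2, function(i) mean(x_big))
stopCluster(cl_fork)

## PSOCK workers start empty: x_big must be serialized and shipped first,
## which is the kind of overhead that shows up in the SOCK timings above
cl_sock <- makeCluster(2, type = "PSOCK")
clusterExport(cl_sock, "x_big")
r_sock <- parLapply(cl_sock, 1:2, function(i) mean(x_big))
stopCluster(cl_sock)
```

Forgetting the clusterExport() call in the PSOCK case would produce an "object 'x_big' not found" error on the workers, which makes the difference between the two worker types easy to see.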