A Quick Guide for the pbdMPI Package

Wei-Chen Chen (1), George Ostrouchov (1,2,3), Drew Schmidt (1), Pragneshkumar Patel (1,3), Hao Yu (4)

(1) pbdR Core Team
(2) Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
(3) National Institute for Computational Sciences, University of Tennessee, Knoxville, TN, USA
(4) University of Western Ontario, London, Ontario, Canada

Contents

Acknowledgement
1. Introduction
   1.1. System Requirements
   1.2. Installation and Quick Start
   1.3. Basic Steps
   1.4. More Examples
2. Performance
3. SPMD in Examples from package parallel
4. Long Vector and 64-bit for MPI
5. Simple Input and Output
6. Simple Pairwise Evaluation
7. Windows Systems (MS-MPI)
   7.1. Install from Binary
   7.2. Build from Source
8. FAQs
   8.1. General
   8.2. Programming
   8.3. MPI Errors
   8.4. Other Errors
References

© 2012-2016 pbdR Core Team. Permission is granted to make and distribute verbatim copies of this vignette and its source provided the copyright notice and this permission notice are preserved on all copies. This publication was typeset using LaTeX.
Acknowledgement

Chen was supported in part by the project "Bayesian Assessment of Safety Profiles for Pregnant Women From Animal Study to Human Clinical Trial" funded by the U.S. Food and Drug Administration, Office of Women's Health. The project was supported in part by an appointment to the Research Participation Program at the Center for Biologics Evaluation and Research administered by the Oak Ridge Institute for Science and Education through an interagency agreement between the U.S. Department of Energy and the U.S. Food and Drug Administration.

Chen was supported in part by the Department of Ecology and Evolutionary Biology at the University of Tennessee, Knoxville, and a grant from the National Science Foundation (MCB-1120370).

Chen and Ostrouchov were supported in part by the project "Visual Data Exploration and Analysis of Ultra-large Climate Data" funded by the U.S. DOE Office of Science under Contract No. DE-AC05-00OR22725.

Ostrouchov, Schmidt, and Patel were supported in part by the project "NICS Remote Data Analysis and Visualization Center" funded by the Office of Cyberinfrastructure of the U.S. National Science Foundation under Award No. ARRA-NSF-OCI-0906324 for the NICS-RDAV center.

This work used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This work also used resources of the National Institute for Computational Sciences at the University of Tennessee, Knoxville, which is supported by the Office of Cyberinfrastructure of the U.S. National Science Foundation under Award No. ARRA-NSF-OCI-0906324 for the NICS-RDAV center. This work used resources of the Newton HPC Program at the University of Tennessee, Knoxville.

We thank our colleagues from the Scientific Data Group, Computer Science and Mathematics Division, Oak Ridge National Laboratory (Hasan Abbasi, Jong Youl Choi, Scott Klasky, and Norbert Podhorszki) for discussing Windows MPI systems, compiler issues, and dynamic libraries, and for generally improving our knowledge of MPI performance issues. We also thank Brian D. Ripley, Kurt Hornik, Uwe Ligges, and Simon Urbanek from the R Core Team for discussing package release issues and helping us solve portability problems on different platforms.

Warning: The findings and conclusions in this article have not been formally disseminated by the U.S. Department of Health & Human Services nor by the U.S. Department of Energy, and should not be construed to represent any determination or policy of any university, agency, administration, or national laboratory.

This document explains the main functions of pbdMPI (Chen et al. 2012), version 0.3-0. Every effort will be made to ensure future versions are consistent with these instructions, but features in later versions may not be explained in this document. Information about the functionality of this package, and any changes in future versions, can be found on the "Programming with Big Data in R" website at http://r-pbd.org/ (Ostrouchov et al. 2012).

1. Introduction

Our intent is to bring the most common parallel programming model from supercomputing, Single Program Multiple Data (SPMD), to R and enable distributed handling of truly large data. Consequently, pbdMPI is intended for batch mode programming with big data (pbd). Unlike Rmpi (Yu 2002), snow (Tierney et al. 2012), or parallel (R Core Team 2012), interactive mode is not supported. We think that interaction with a large distributed parallel computing platform is better handled with a client/server relationship, and we are developing other packages in this direction.

pbdMPI simplifies MPI interaction, but leaves low and mid level functions available for advanced programmers. For example, it is easy to hand communicators to pbdMPI from other applications through MPI array pointers. This is intended for integration with other, possibly non-R, parallel software.

Under the SPMD parallel programming model, the identical program runs on every processor but typically works on different parts of a large data set, while communicating with other copies of itself as needed. Differences in execution stem from comm.rank, which is typically different on every processor. While on the surface this sounds complicated, after some experience and a new mindset, programming is surprisingly simple. There is no master. There is only cooperation among the workers. Although we target very large distributed computing platforms, SPMD works well even on small multicore platforms.

In the following, we list the main features of pbdMPI.

1. Under the SPMD batch programming model, a single program is written, which is spawned by mpirun. No spawning and broadcasting from within R are required.
2. S4 methods are used for most collective functions, so it is easy to extend them for general R objects.
3. Default methods (like the Robj functions in Rmpi) have homogeneous checking for data type, so they are safe for general users.
4. The API in all functions is simplified, with all default arguments in control objects.
5. Methods for array or matrix types are implemented without serialization and unserialization, resulting in faster communication than Rmpi.
6. Basic data types of integer, double, and raw in pbdMPI are communicated without further checking. This is risky but fast for advanced programmers.
7. The character data type is serialized and communicated as the raw type.

System requirements and installation of pbdMPI are described next. Section 2 gives a short example comparing the performance of pbdMPI and Rmpi (Yu 2002). In Section 8, a few quick answers to common questions are given. Section 7 provides settings for Windows environments. In Section 3, two examples from parallel are shown as SPMD pbdMPI programs. Section 4 discusses long vector support and communication in pbdMPI as an extension of R. Finally, in Section 5, some simple input and output methods between regular text/csv/csv2 files and data.frame are introduced.

1.1. System Requirements

pbdMPI requires MPI (http://en.wikipedia.org/wiki/Message_Passing_Interface). The package is mainly developed and tested under OpenMPI (http://www.open-mpi.org/) in xubuntu 11.04 (http://xubuntu.org/). The package should also work with MPICH2 (http://www.mcs.anl.gov/research/projects/mpich2/) and Microsoft MPI or MS-MPI (http://msdn.microsoft.com/en-us/library/bb524831(v=vs.85).aspx). In addition to unix, pbdMPI should also run under other operating systems, such as Mac OS X with OpenMPI or Windows 7 with MS-MPI, if MPI is installed and launched properly, although we have not tested on multiple machines yet. Please let us know about your experience.

For normal installation, see Sec. 1.2. To build as a static library, which may be required on some large systems, use

Shell Command
./configure --enable-static --prefix=${MPI_ROOT}
make
make install

where --enable-static builds a static library (optional), and ${MPI_ROOT} is the path to the MPI root. Note that the static library is not necessary for pbdMPI but may avoid dynamic loading problems.

To make sure your MPI system is working, test with

Shell Command
mpiexec -np 2 hostname

This should list two host names where MPI jobs are running. Note to use hostname.exe, with the extension, on a Windows system.

1.2. Installation and Quick Start

One can download pbdMPI from CRAN at https://cran.r-project.org, and the installation can be done with the following commands (using the OpenMPI library)

Shell Command
tar zxvf pbdMPI_0.1-0.tar.gz
R CMD INSTALL pbdMPI

Further configure arguments include

Argument               Default
--with-mpi-type        OPENMPI
--with-mpi-include     ${MPI_ROOT}/include
--with-mpi-libpath     ${MPI_ROOT}/lib
--with-mpi             ${MPI_ROOT}

where ${MPI_ROOT} is the path to the MPI root. For non-default and unusual installations of MPI systems, the commands may be

Shell Command
### Under command mode
R CMD INSTALL pbdMPI \
  --configure-args="--with-mpi-type=OPENMPI \
  --with-mpi=/usr/local"

R CMD INSTALL pbdMPI \
  --configure-args="--with-mpi-type=OPENMPI \
  --with-mpi-include=/usr/local/ompi/include \
  --with-mpi-libpath=/usr/local/ompi/lib"

See the package source file pbdMPI/configure for details.

One can get started quickly with pbdMPI by learning from the following six examples.

Shell Command
### At the shell prompt, run the demo with 2 processors by
### (use Rscript.exe for Windows systems)
mpiexec -np 2 Rscript -e "demo(allgather, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript -e "demo(allreduce, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript -e "demo(bcast, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript -e "demo(gather, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript -e "demo(reduce, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript -e "demo(scatter, 'pbdMPI', ask=F, echo=F)"
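Beyond the shipped demos, a first script of your own can be very small. The following is a minimal sketch (it is not one of the package demos, and the file name hello_spmd.r is only illustrative) that prints every rank and sums a rank-dependent value across all processors.

R Script
### Save as "hello_spmd.r" and run with: mpiexec -np 2 Rscript hello_spmd.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()                                      # initialize MPI

### Every rank runs the same code; comm.rank() differs per process.
my.rank <- comm.rank()
comm.cat("Hello from rank", my.rank, "of", comm.size(), "\n", all.rank = TRUE)

### A tiny collective: sum (rank + 1) over all ranks.
total <- allreduce(my.rank + 1, op = "sum")
comm.print(total)                           # printed once, by rank 0

finalize()                                  # shut down MPI

With 2 processors this should print one greeting per rank and the value 3 (that is, 1 + 2).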
1.3. Basic Steps

In the SPMD world, every processor is a worker, every worker knows about all the others, and each worker does its own job, possibly communicating with the others. Unlike the manager/workers style, SPMD is more likely to fully use the computer resources. The following shows the typical basic steps of using pbdMPI.

1. Initialize. (init)
2. Read your portion of the data.
3. Compute. (send, recv, barrier, ...)
4. Communicate results among workers. (gather, allgather, reduce, allreduce, ...)
5. Finalize. (finalize)

In a given application, the Compute and Communicate steps may be repeated several times for intermediate results. The Compute and Communicate steps are more general than the "map" and "reduce" steps of the map-reduce paradigm, but similar in spirit. One big difference is that the Communicate step may place the "reductions" on all processors rather than just one (the manager for map-reduce) for roughly the same time cost.

With some experience, one can easily convert existing R scripts and quickly parallelize serial code. pbdMPI tends to reduce programming effort, avoid complicated MPI techniques, and gain computing performance. The major communication functions of pbdMPI and the corresponding similar functions of Rmpi are listed in the following; a short sketch that puts the basic steps together is given after the table.

pbdMPI (S4)    Rmpi
allgather      mpi.allgather, mpi.allgatherv, mpi.allgather.Robj
allreduce      mpi.allreduce
bcast          mpi.bcast, mpi.bcast.Robj
gather         mpi.gather, mpi.gatherv, mpi.gather.Robj
recv           mpi.recv, mpi.recv.Robj
reduce         mpi.reduce
scatter        mpi.scatter, mpi.scatterv, mpi.scatter.Robj
send           mpi.send, mpi.send.Robj
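The sketch below walks through the five steps on a toy problem: each rank generates its own "portion" of data in place of a real read step, computes a local summary, and combines the results with allreduce() and allgather(). It is an illustrative sketch rather than a package example; the file name basic_steps.r is made up.

R Script
### Save as "basic_steps.r" and run with: mpiexec -np 2 Rscript basic_steps.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()                                          # Step 1: initialize

### Step 2: "read" this rank's portion of the data (simulated here).
set.seed(comm.rank() + 1)
x.local <- rnorm(1000)

### Step 3: compute on the local portion.
local.sum <- sum(x.local)
local.n <- length(x.local)

### Step 4: communicate results among workers.
global.mean <- allreduce(local.sum, op = "sum") / allreduce(local.n, op = "sum")
all.sums <- allgather(local.sum, unlist = TRUE) # one value per rank

comm.print(global.mean)
comm.print(all.sums)

finalize()                                      # Step 5: finalize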
1.4. More Examples

The package source files provide several examples based on pbdMPI, such as

Directory                                   Examples
pbdMPI/inst/examples/test_spmd/             main SPMD functions
pbdMPI/inst/examples/test_rmpi/             comparison to Rmpi
pbdMPI/inst/examples/test_parallel/         comparison to parallel
pbdMPI/inst/examples/test_performance/      performance testing
pbdMPI/inst/examples/test_s4/               S4 extension
pbdMPI/inst/examples/test_cs/               client/server examples
pbdMPI/inst/examples/test_long_vector/      long vector examples

where test_long_vector/ requires recompiling with the setting

pkg_constant.h
#define MPI_LONG_DEBUG 1

in pbdMPI/src/pkg_constant.h. See Section 4 for details.

Further examples can be found in "Introduction to distributed computing with pbdR at the UMBC High Performance Computing Facility" (Technical Report, 2013) (Raim 2013).

2. Performance

There are more examples for testing performance in pbdMPI/inst/examples/test_rmpi. Here, we only show a simple comparison of pbdMPI to Rmpi. The two scripts are equivalent for pbdMPI and Rmpi. We run them with two processors and obtain the computing times listed below.

Save the following script in demo_spmd.r and run it with two processors by

Shell Command
mpiexec -np 2 Rscript demo_spmd.r

to see the computing time on your platform.

pbdMPI R Script
### Save this script in "demo_spmd.r".
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

time.proc <- list()
time.proc$default <- system.time({
  for(i in 1:1000) y <- allgather(list(x = 1:10000))
  barrier()
})
time.proc$matrix <- system.time({
  for(i in 1:1000) y <- allgather(matrix(1:10000, nrow = 100))
  barrier()
})

comm.print(time.proc, quiet = TRUE)
finalize()

Save the following script in demo_rmpi.r and run it with two processors by

Shell Command
mpiexec -np 2 Rscript demo_rmpi.r

to see the computing time on your platform.

Rmpi R Script
### Save this script in "demo_rmpi.r".
library(Rmpi)
invisible(mpi.comm.dup(0, 1))

time.proc <- list()
time.proc$Robj <- system.time({
  for(i in 1:1000) y <- mpi.allgather.Robj(list(x = 1:10000))
  mpi.barrier()
})
time.proc$matrix <- system.time({
  for(i in 1:1000) y <- mpi.allgather.Robj(matrix(1:10000, nrow = 100))
  mpi.barrier()
})

if(mpi.comm.rank(1) == 0) print(time.proc)
mpi.quit()

The following shows the computing time of the above two scripts on a single machine with two processors (Intel(R) Core(TM) i5-2410M CPU @ 2.30 GHz), a xubuntu 11.04 system, and OpenMPI 1.6. pbdMPI is more efficient than Rmpi with both list and matrix/array data structures.

R Output
>> Output from demo_spmd.r
$default
   user  system elapsed
  1.680   0.030   1.706
$matrix
   user  system elapsed
  0.950   0.000   0.953

>> Output from demo_rmpi.r
$Robj
   user  system elapsed
  2.960   0.090   3.041
$matrix
   user  system elapsed
  3.120   0.030   3.147
3. SPMD in Examples from package parallel

We demonstrate how a simple script from parallel can be written in batch by using pbdMPI. Each time, we first give the version using parallel, followed by the version using pbdMPI. All code is available in pbdMPI/inst/examples/test_parallel/.

Example 1: (mclapply() originates in multicore (Urbanek 2011))

Save the following script in a file and run with

Shell Command
Rscript 01_mclapply_par.r

to see the computing time on your platform.

multicore R Script
### File Name: 01_mclapply_par.r
library(parallel)
system.time(
  unlist(mclapply(1:32, function(x) sum(rnorm(1e7))))
)

Now save this script in a file and run with

Shell Command
mpirun -np 2 Rscript 01_mclapply_spmd.r

to see the computing time on your platform.

SPMD R Script
### File Name: 01_mclapply_spmd.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

time.proc <- system.time({
  id <- get.jid(32)
  ret <- unlist(lapply(id, function(i) sum(rnorm(1e7))))
  ret <- allgather(ret, unlist = TRUE)
})

comm.print(time.proc)
finalize()

The following shows the computing time of the above code on a single local machine with two cores (Intel(R) Core(TM) i5-2410M CPU @ 2.30 GHz), a xubuntu 11.04 system, and OpenMPI 1.6. There is not much communication latency in this example since all computations are on one "node", which is also a limitation of parallel.

R Output
>> Test ./01_mclapply_par.r
   user  system elapsed
 16.800   0.570  17.419

>> Test ./01_mclapply_spmd.r
COMM.RANK = 0
   user  system elapsed
 17.130   0.460  17.583

Example 2: (parMM() originates in snow (Tierney et al. 2012))

Save the following code in a file and run with two processors

Shell Command
Rscript 02_parMM_par.r

to see the computing time on your platform.

snow R Script
### File Name: 02_parMM_par.r
library(parallel)
cl <- makeCluster(2)

splitRows <- function(x, ncl){
  lapply(splitIndices(nrow(x), ncl), function(i) x[i, , drop = FALSE])
}
parMM <- function(cl, A, B){
  do.call(rbind, clusterApply(cl, splitRows(A, length(cl)), get("%*%"), B))
}

set.seed(123)
A <- matrix(rnorm(1000000), 1000)
system.time(replicate(10, A %*% A))
system.time(replicate(10, parMM(cl, A, A)))

stopCluster(cl)

Now save this script in a file and run with

Shell Command
mpirun -np 2 Rscript 02_parMM_spmd.r

to see the computing time on your platform.

SPMD R Script
### File Name: 02_parMM_spmd.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

set.seed(123)
x <- matrix(rnorm(1000000), 1000)

parMM.spmd <- function(x, y){
  id <- get.jid(nrow(x))
  do.call(rbind, allgather(x[id, ] %*% y))
}

time.proc <- system.time(replicate(10, parMM.spmd(x, x)))
comm.print(time.proc)
finalize()

The following shows the computing time of the above code on a single machine with two processors (Intel(R) Core(TM) i5-2410M CPU @ 2.30 GHz), a xubuntu 11.04 system, and OpenMPI 1.6. The first two timings are the serial multiplication and the snow parMM() call from 02_parMM_par.r; pbdMPI performs better than snow in this example even without communication over a network.

R Output
>> Test ./02_parMM_par.r
   user  system elapsed
 12.460   0.170  12.625
   user  system elapsed
  1.780   0.820  10.095

>> Test ./02_parMM_spmd.r
COMM.RANK = 0
   user  system elapsed
   8.84    0.42    9.26
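Both SPMD conversions above rely on get.jid() to decide which jobs each rank owns. A quick way to inspect the split is to print the job ids on every rank; this is a small illustrative sketch rather than one of the package examples.

R Script
### Run with: mpiexec -np 2 Rscript show_jid.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

### get.jid(n) returns the indices in 1:n assigned to the calling rank,
### so the union over all ranks covers every job exactly once.
id <- get.jid(32)
comm.print(id, all.rank = TRUE)   # print each rank's assignment

finalize()

With two ranks one typically sees a near-even split of 1:32, e.g. 16 job ids per rank.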
4. Long Vector and 64-bit for MPI

Since version 0.2-1, pbdMPI has added support for long vectors and the corresponding MPI communication.

4.1. Long Vector for MPI

The current R (3.1.0) uses a C structure to extend the 32-bit length limitation (2^31 - 1 = 2147483647, defined as R_SHORT_LEN_MAX) to 52-bit lengths (2^52 = 4503599627370496, defined as R_XLEN_T_MAX). In general, this is more portable and extensible for when 128-bit integers arrive (who knows when that day comes). However, a vector with more than 2^31 - 1 elements needs extra effort to be accessed in R; see "R Internals" for details. The reason is that an integer is 4 bytes on both x86_64 (64-bit) and i386 (32-bit) systems. Given the capacity of current machines and performance considerations, there is no benefit to using 8 bytes for integers.

On x86_64 systems, computers or compilers use either long or long long for pointer addresses, which is size_t for unsigned addresses or ptrdiff_t for signed addresses. For example, in GNU C (gcc), the flag -m64 uses 4 bytes for int and 8 bytes for long on x86_64 systems. (Is there a way to have an 8-byte integer? The answer depends on the compiler.) Therefore, the question is: what are the differences between 64-bit and 32-bit systems? One of them is the pointer size, which is 8 bytes on an x86_64 machine and 4 bytes on an i386 machine. This allows the computer to address more memory and disk space. Note that addresses are indexed by long or long long, which does not conflict with the integer size, and a 4-byte integer is efficient and safe enough for general purposes. For example, double *a is a pointer (a) pointing to a real scalar (*a), but the pointer's address (&a) is of size size_t (long or long long), which is 8 bytes on an x86_64 system and 4 bytes on an i386 system.

To deal with long vectors, pbdMPI uses the same framework as R to build up MPI collective functions. pbdMPI follows R's standard and assumes a vector normally has length smaller than R_SHORT_LEN_MAX, which can be handled by most 32-bit functions. If the vector length is greater than R_SHORT_LEN_MAX, then R calls it a long vector, whose maximum length is R_XLEN_T_MAX. The vector length is stored in the type R_xlen_t. R_xlen_t is long if LONG_VECTOR_SUPPORT is defined; otherwise it is int. R provides several C macros to check, access, and manipulate vectors in VECSXP or general SEXP; see Rinternals.h for details.

pbdMPI first checks whether the data size for communication is greater than SPMD_SHORT_LEN_MAX. If the data is a long vector, then pbdMPI invokes collective functions to send/receive chunks of data partitioned by SPMD_SHORT_LEN_MAX until all chunks have been received/sent. For some MPI collective functions, such as allgather() and gather(), extra space may be allocated for receiving chunks; the chunks are then copied from the extra space to the right memory address in the receiving buffer according to the rank within the communicator. The reason is that most MPI collective functions rely on arguments for indexing buffer types and counting buffer sizes, where the types and sizes are both int. SPMD_SHORT_LEN_MAX is defined in pbdMPI/src/spmd.h and is usually equal to R_SHORT_LEN_MAX. Developers may want to use a shorter length (such as SPMD_INT8_LEN_MAX, which is 2^7 - 1 = 127) for testing without a large-memory machine, or for debugging without recompiling R with a shorter R_SHORT_LEN_MAX.

In pbdMPI, the MPI collective functions implemented for long vectors are bcast(), allreduce(), reduce(), send(), recv(), isend(), irecv(), allgather(), gather(), and scatter(). The other MPI collective functions are *not* implemented, due to the complexity of memory allocation for long vectors; these include allgatherv(), gatherv(), scatterv(), sendrecv(), and sendrecv.replace().

Further, pbdMPI provides a way to mimic long vector support. Users can set

pkg_constant.h
#define MPI_LONG_DEBUG 1

in pbdMPI/src/pkg_constant.h to turn on debugging mode and recompile pbdMPI. Then, run the examples in pbdMPI/inst/examples/test_long_vector/ to see how the mimicked long vectors are communicated between processors. Users can also adjust the length limit of the mimicked long vectors (the buffer size) by changing

spmd.h
#define SPMD_SHORT_LEN_MAX R_SHORT_LEN_MAX

in pbdMPI/src/spmd.h.

4.2. 64-bit for MPI

The remaining question is whether the MPI library supports 64-bit systems. The answer is yes, but users may need to recompile their MPI libraries for 64-bit support. In the same way that R enables 64-bit support, MPI libraries may use 8-byte pointers in order to address larger memory or disk space (see http://wiki.chem.vu.nl/dirac/index.php/How_to_build_MPI_libraries_for_64-bit_integers). For example, OpenMPI provides the following to check whether a 64-bit build is used.

Shell Command
ompi_info -a | grep 'int.*size:'

If the output is

Shell Command
        C int size: 4
    C pointer size: 8
 Fort integer size: 8
Fort integer1 size: 1
Fort integer2 size: 2
Fort integer4 size: 4
Fort integer8 size: 8
Fort integer16 size: -1

then OpenMPI supports 64-bit systems (the C integer is still 4 bytes rather than 8 bytes). Otherwise, users may use the following to reinstall OpenMPI

Shell Command
./configure --prefix=/path_to_openmpi \
  CFLAGS=-fPIC \
  FFLAGS="-m64 -fdefault-integer-8" \
  FCFLAGS="-m64 -fdefault-integer-8" \
  CFLAGS=-m64 \
  CXXFLAGS=-m64

and remember to reinstall pbdMPI as well. Note that 64-bit pointers may only provide larger data sizes, but may hugely degrade other computing. In general, communicating a large amount of data is a very bad idea. Try to redesign algorithms to communicate lightly, such as via sufficient statistics, or to rearrange and load large data partially or evenly across all processors.
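The chunked communication scheme of Section 4.1 is implemented in pbdMPI's C code, but the idea can be sketched at the R level. The function below is purely conceptual: it is not the package's implementation, the helper name send.in.chunks() is made up, and the destination rank is left to pbdMPI's default control object.

R Script
### Conceptual sketch of chunked communication; not pbdMPI's actual code.
send.in.chunks <- function(x, len.max = 2^31 - 1){
  n <- length(x)
  start <- 1
  while(start <= n){
    end <- min(start + len.max - 1, n)   # each chunk length fits in an int
    send(x[start:end])                   # destination/tag come from the defaults
    start <- end + 1
  }
  invisible(NULL)
}

A matching receiver would call recv() in a loop until all chunks arrive; in the package itself this logic lives in C and is keyed on SPMD_SHORT_LEN_MAX rather than an R-level argument.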
5. Simple Input and Output

Since version 0.2-2, pbdMPI has added simple data input and output support for basic CSV and text files. Two quick demos explain how a dataset can be read and written via the pbdMPI functions comm.write.table() and comm.read.table(). The first is

Shell Command
### Run the demo with 4 processors by
mpiexec -np 4 Rscript -e "demo(simple_io, 'pbdMPI', ask=F, echo=F)"

The demo utilizes the iris data (Fisher 1936) to show the simple input and output functions of pbdMPI and is summarized next. The 150 rows of iris are divided among 4 processors, which own 37, 37, 38, and 38 rows of iris in a gbd row-block format, i.e. rank 0 owns rows 1 to 37, rank 1 owns rows 38 to 74, and so on. A text file "iris.txt" is dumped via comm.write.table(), which sequentially appends the processor-owned row blocks. comm.read.table() then reads the text file back into memory, again in a gbd row-block format.

Note that comm.read.table() may read the first few lines to predetermine how many lines of the file each rank should read. This is an approximation and results in unbalanced data across processors. In particular, either the highest-order rank may own the largest portion of the whole dataset, or several higher-order ranks may own zero rows. So, a call to comm.load.balance() within comm.read.table() moves rows across processors if necessary. Basically, the reading steps are as follows.

1. If the file size is less than 5MB, then rank 0 reads in the whole file and scatters rows to the other ranks.
2. If the file size is larger than 5MB, then rank 0 reads in the first 500 lines and estimates the total number of records in the file. All ranks then sequentially read in their designated records.
3. Call comm.load.balance() to balance the data.

The file size limit is controlled by .pbd_env$SPMD.IO$max.file.size, and the first-few-lines limit is controlled by .pbd_env$SPMD.IO$max.test.lines. Further, users can pass the options nrows and skip to comm.read.*() to manually read the file, and call comm.load.balance() later if needed.

There are several ways to distribute or balance data among processors. Currently pbdMPI supports 3 formats: block, block0, and block.cyclic. In the above demo, the 150 rows are mainly distributed as (37, 37, 38, 38), which is a block format. The second demo shows how to load balance between the different formats:

Shell Command
### Run the demo with 4 processors by
mpiexec -np 4 Rscript -e "demo(simple_balance, 'pbdMPI', ask=F, echo=F)"

In block0, iris is distributed as (38, 38, 37, 37) row blocks across the processors. In block.cyclic, iris is distributed as (38, 38, 38, 36) row blocks, i.e. each cycle has 38 rows and there is one cycle per processor. See the pbdDEMO vignette (Schmidt et al. 2013) for more details about "block-cyclic" and "gbd".
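The write/read round trip of the first demo can be sketched in a few lines. This is an illustrative sketch rather than the demo's exact code, and it assumes that comm.write.table() and comm.read.table() accept a file argument and the usual write.table()/read.table() options.

R Script
### Run with: mpiexec -np 4 Rscript simple_io_sketch.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

### Each rank takes its own block of iris rows (gbd row-block format).
id <- get.jid(nrow(iris))
X.gbd <- iris[id, ]

### All ranks write their row blocks sequentially to one file,
### then read the file back as a distributed data.frame.
comm.write.table(X.gbd, file = "iris.txt")
Y.gbd <- comm.read.table("iris.txt")   # options such as header/sep may need to match

comm.print(dim(Y.gbd), all.rank = TRUE)   # per-rank row counts after reading
finalize()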
6. Simple Pairwise Evaluation

Since version 0.2-3, pbdMPI has provided some utilities for pairwise evaluation. Evaluating a function on every pair of data points is a common problem, arising in distance computations, pairwise comparisons, and multiple testing. Useful functions for these problems are:

comm.as.gbd(): This function turns a common matrix (replicated on all ranks) into a gbd matrix in row-major blocks. For example, one may read in data on one rank, then utilize this function to redistribute the data with load balance across all ranks. This is an alternative to the approach of Section 5, but more efficient for small data.

comm.allpairs(): This function mainly provides the indices for all pairs of N data points. It returns a two-column (i, j) gbd matrix in row-major blocks. For example, one may want to evaluate all N^2 pairs of the N data points; however, in the distance context, it provides only the indices of the lower-triangular matrix (ordered by row major).

comm.dist(): This function computes the distances (lower triangular only) of N data points, as the usual dist() function does, but evaluated on a gbd matrix in row-major blocks. The return value can be a common distance matrix (only good for small datasets), or a three-column gbd matrix in row-major blocks, where the columns are i, j, and the value of pair (i, j).

comm.pairwise(): This function is a general extension of the three functions above that allows users to provide a function FUN to evaluate on pairs of data. For example, the distance between two data points x and y can be computed via the original dist() function, so it can be wrapped as

R Script
dist.pair <- function(x, y, ...){
  as.vector(dist(rbind(x, y), ...))
}

for the FUN option of comm.pairwise(). This function is also useful for cases where the measure of pair (i, j) differs from that of pair (j, i), i.e. a non-symmetric measure. If order matters, then FUN can be evaluated via the option pairid.gbd, which can be user defined, or simply symmetric = FALSE. We also provide some examples in the man pages.

A demo verifies these functions in different ways:

Shell Command
### Run the demo with 4 processors by
mpiexec -np 4 Rscript -e "demo(simple_pairs, 'pbdMPI', ask=F, echo=F)"

See the pbdDEMO vignette (Schmidt et al. 2013) for more statistical examples.
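Putting the pieces together, a typical pairwise workflow might look like the sketch below. It is illustrative only: apart from FUN and symmetric, which are mentioned above, the exact argument names and defaults of comm.as.gbd() and comm.pairwise() should be checked against the man pages.

R Script
### Run with: mpiexec -np 4 Rscript pairwise_sketch.r
suppressMessages(library(pbdMPI, quietly = TRUE))
init()

### A small dataset replicated on every rank.
set.seed(123)
X <- matrix(rnorm(20 * 3), ncol = 3)

### Redistribute as a gbd matrix in row-major blocks.
X.gbd <- comm.as.gbd(X)

### Pairwise Euclidean distances via the dist.pair() wrapper above;
### the result is expected as a gbd matrix with columns i, j, and the pair value.
dist.pair <- function(x, y, ...) as.vector(dist(rbind(x, y), ...))
d.gbd <- comm.pairwise(X.gbd, FUN = dist.pair)

comm.print(head(d.gbd))
finalize()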
7. Windows Systems (MS-MPI)

Originally, pbdMPI (later than version 0.2-3, but only up to version 0.3-1) supported Windows with Microsoft MPI or MS-MPI (http://msdn.microsoft.com/en-us/library/bb524831(v=vs.85).aspx). pbdMPI was built with the 'HPC Pack 2012 R2 MS-MPI Redistributable Package', which is available at http://www.microsoft.com/en-us/download/. The installation (MSMPISetup.exe) is easily done with a few clicks, provided some service packs and the Visual C++ runtime are installed correctly. The default environment and path are recommended for installation.

Currently, pbdMPI (later than version 0.3-2) supports Windows with Microsoft MPI or MS-MPI version 7.1 (https://www.microsoft.com/en-us/download/details.aspx?id=52981). Note that this is only an SDK development library which does not contain any MPI executable file such as mpiexec.exe; it is only for compiling and linking pbdMPI with the MPI library. You will still need the 'MS-MPI Redistributable Package' to have an mpiexec.exe to run MPI programs or pbdMPI scripts. The differences in default installation between the SDK library and the 'Redistributable Package' are the location of the MPI header file and the location of the default installation. The include path is changed from (Redistributable)

Shell Command
MPI_INCLUDE = ${MPI_ROOT}Inc/

to (SDK)

Shell Command
MPI_INCLUDE = ${MPI_ROOT}Include/

These are used by pbdMPI/src/Makevars.win. The default installation is changed from (Redistributable)

Shell Command
SET MPI_HOME=C:\Program Files\Microsoft MPI\

to (SDK)

Shell Command
SET MPI_HOME=C:\Program Files (x86)\Microsoft SDKs\MPI\

These are supposed to be set in a batch file. For running MPI and R, users need to set PATH to mpiexec.exe and Rscript.exe. By default,

Shell Command
### Under command mode, or save in a batch file.
SET R_HOME=C:\Program Files\R\R-3.0.1\
SET MPI_HOME=C:\Program Files\Microsoft MPI\
SET PATH=%MPI_HOME%bin\;%R_HOME%bin\;%PATH%

Note that the installation (MSMPISetup.exe) may set several environment variables, including MSMPI_BIN for mpiexec.exe and other executable files, MSMPI_INC for header files such as mpi.h, MSMPI_LIB32 for 32-bit static libraries such as msmpi.lib, and MSMPI_LIB64 for 64-bit static libraries. These are useful to verify via the R command Sys.getenv().

7.1. Install from Binary

The binary packages of pbdMPI are available on the "Programming with Big Data in R" website at http://r-pbd.org/ or on CRAN at https://cran.r-project.org/package=pbdMPI. Note that different MPI systems require different binaries. The binary can be installed by

Shell Command
R CMD INSTALL pbdMPI_0.2-3.zip

As on Unix systems, one can start quickly with pbdMPI by learning from the following demos. There are six basic examples.

Shell Command
### Run the demo with 2 processors by
mpiexec -np 2 Rscript.exe -e "demo(allgather, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript.exe -e "demo(allreduce, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript.exe -e "demo(bcast, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript.exe -e "demo(gather, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript.exe -e "demo(reduce, 'pbdMPI', ask=F, echo=F)"
mpiexec -np 2 Rscript.exe -e "demo(scatter, 'pbdMPI', ask=F, echo=F)"

Warning: Note that spacing inside demo() does not work on Windows systems, and Rscript.exe should be invoked rather than Rscript.

7.2. Build from Source

Warning: This section is only for building binaries on 32- and 64-bit Windows systems. A more general way can be found in the file pbdMPI/INSTALL.

Make sure that R, Rtools, and MinGW are in the PATH. See details on the "Building R for Windows" website at https://cran.r-project.org/bin/windows/Rtools/. The environment variable MPI_HOME needs to be set for building binaries. For example, the minimum requirement (for Rtools32 or earlier) may be

Shell Command
### Under command mode, or save in a batch file.
SET R_HOME=C:\Program Files\R\R-3.0.1\
SET RTOOLS=C:\Rtools\bin\
SET MINGW=C:\Rtools\gcc-4.6.3\bin\
SET MPI_HOME=C:\Program Files\Microsoft MPI\
SET PATH=%MPI_HOME%bin;%R_HOME%;%R_HOME%bin;%RTOOLS%;%MINGW%;%PATH%

For example, the minimum requirement (for Rtools33 or later) may be

Shell Command
### Under command mode, or save in a batch file.
SET R_HOME=C:\Program Files\R\R-3.4.0\
SET RTOOLS=C:\Rtools\bin\
SET MPI_HOME=C:\Program Files\Microsoft MPI\
SET PATH=%MPI_HOME%bin;%R_HOME%;%R_HOME%bin;%RTOOLS%;%PATH%

Note that gcc and the other tools within Rtools will be detected by Windows R, so the installation path of Rtools should be exactly C:/Rtools. With a correct PATH, one can use the following commands to build and install pbdMPI:

Shell Command
### Under command mode, build and install the binary.
tar zxvf pbdMPI _ 0.2 -3. tar . gz R CMD INSTALL -- build pbdMPI R CMD INSTALL pbdMPI _ 0.2 -3. zip Warning: For other pbdR packages, it is possible to compile without further changes of configurations. However, only pbdMPI is tested regularly before any release. 8. FAQs 8.1. General 1. Q: Do I need MPI knowledge to run pbdMPI? A: Yes, but only the big picture, not the details. We provide several examples in pbdMPI/inst/examples/test_spmd/ to introduce essential methods for learning MPI communication. 2. Q: Can I run pbdMPI on my laptop locally? A: Sure, as long as you have an MPI system. You even can run it on 1 CPU. 3. Q: Does pbdMPI support Windows clusters? A: Yes, the released binary currently supports MS-MPI. Currently, pbdMPI is built with ’HPC Pack 2012 R2 MS-MPI Redistributable Package’ which is available at http: //http://www.microsoft.com/en-us/download/. For other MPI systems, users have to compile from source. 4. Q: Can I run pbdMPI in OpenMPI and MPICH2 together? A: No, you can have both OpenMPI and MPICH2 installed in your OS, but you are only allowed to run pbdMPI with one MPI system. Just pick one. 5. Q: Does pbdMPI support any interactive mode? A: No, but yes. Since pbdMPI version 0.3-0, there are two additional packages pbdZMQ (Chen and Schmidt 2015) and pbdCS (Schmidt and Chen 2015) which provide servers-client interaction building upon pbdMPI for parallel computing. Originally, pbdMPI only considers batch execution and aims for programming with big data that do not fit on desktop platforms. We think that interaction with big data on a big machine is better handled with a client/server interface, where the server runs SPMD codes on big data and the client operates with reduced data representations. If you really need an interactive mode, such as for debugging, you can utilize pbdMPI scripts inside Rmpi. Rmpi mainly focuses on Manager/Workers computing environments, but can run SPMD codes on workers only with a few adjustments. See the “Programming with Big Data in R” website for details at http://r-pbd.org/. Note that pbdMPI uses communicators different from Rmpi. Be sure to free the memory correctly for both packages before quitting. finalize(mpi.finalize = FALSE) can free the memory allocated by pbdMPI, but does not terminate MPI before calling mpi.quit of Rmpi. 16 6. Q: Can I write my own collective functions for my own data type? A: Yes, S4 methods allow users to add their own data type, and functions. Quick examples can be found in pbdMPI/inst/examples/test_s4/. 7. Q: Does pbdMPI support long vector or 64-bit integer? A: See Section 4. 8. Q: Does pbdMPI support Amazon Web Services (AWS EC2)? A: See http://snoweye.github.io/pbdr/aws_ec2.html for setting a cluster on AWS EC2. 9. Q: Does pbdMPI support multiple nodes in VirtualBox? A: See http://snoweye.github.io/pbdr/multiple_nodes.html for setting a cluster with two nodes in VirtualBox. It is extensible to multiple nodes by linked or full cloning with a few network modifications. A pure text file multiple_nodes.txt contains detail steps for the setting. 10. Q: A simple pbdMPI testing code hangs but simple MPI pure C code is working? A: If your VirtualBox has multiple adapters (for example, eth0 for NAT/host, eth1 using internal 192.168.*.* for MPI communication), then you may consider to bring down eth0 next. Shell Command sudo ip link set eth0 down Further, you may also consider to consult network experts for IP and routing table configurations when multiple adapters are required. 
R/Rscript may not know multiple adapters nor how networking or routing table is setting up. It is just easier for MPI to use a single adapter, open all INPUT/OUTPUT/FORWARD ports, stop all firewall, etc. MPI is designed for high performance computing, so don’t put too much extra stuffs to decline the performance. (Thanks for Alba Martı́nez-Ruiz and Cristina Montañola in Universidad Católica de la Ssma. Concepción, chil providing errors and issues.) 11. Q: (Linux/Unix/Mac) Can I install and run OpenMPI or MPICH locally without root permission? A: Yes. You don’t need root permission to install or run MPI applications. For general installation of MPI libraries, please see pbdMPI/INSTALL first. For example, you may install OpenMPI version 1.8.2 under any private user account by R Script tar zxvf openmpi -1.8.2. tar . gz cd openmpi -1.8.2 . / configure \ -- prefix = / home / user . id / work - my / local / ompi \ CFLAGS = - fPIC make make install The MPI library and binary will be installed at /home/user.id/work-my/local/ompi/. Then, you may add this path to system environment PATH by 17 R Script export OMPI = / home / user . id / work - my / local / ompi export PATH = $ OMPI / bin : $ PATH 8.2. Programming 1. Q: What are pbdMPI’s high level back-ends for embarrassingly parallel calculations? A: See man pages and examples of pbdLapply(), pbdSapply(), pbdApply(), and task.pull() for more details. Some options of those functions, such as pbd.mode, may be also useful for different data distribution in embarrassingly parallel calculations. 2. Q: Can I run task jobs by using pbdMPI? A: Yes, it is relatively straightforward for parallel tasks. Neither extra automatic functions nor further command/data communication is required. In other words, SPMD is easier for Monte Carlo, bootstrap, MCMC simulation and statistical analysis for ultralarge datasets. A more efficient way, such as task pull parallelism, can be found in next Q&A. Example 1: SPMD R Script suppressMessages ( library ( pbdMPI , quietly = TRUE ) ) init () id <- get . jid ( total . tasks ) # ## Using a loop . for ( i in id ) { # ## Put independent task i script here . } # ## Or using apply - like functions . lapply ( id , function ( i ) { # ## Put independent task i script here . }) finalize () Note that id gets different values on different processors, accomplishing total.tasks across all processors. Also note that any data and partial results are not shared across the processors unless communicated. Example 2: SPMD R Script suppressMessages ( library ( pbdMPI , quietly = TRUE ) ) init () # ## Directly using a loop . 18 for ( i in 1: total . tasks ) { if ( i %% comm . size () == comm . rank () ) { # ## Put independent task i script here . } } # ## Or using apply - like function . lapply (1: total . tasks , function ( i ) { if ( i %% comm . size () == comm . rank () ) { # ## Put independent task i script here . } }) finalize () 3. Q: Can I use unblocked send functions, such as isend()? Or, does isend() truly unblocked? A: The answer is no for pbdMPI earlier than version 0.2-2, but it is changed since version 0.2-3. A temporary buffer list SPMD.NB.BUFFER is used to store all objects being sent by isend(). The buffer is created and cumulated in .pbd_env, but released as wait() is called. Although this may take some performance and space, this can avoid gc() and memory overwrite before actual sending is done. 4. Q: Can I run un-barrier task jobs, such as task pull parallelism, by using pbdMPI? 
A: Yes, it is relatively straightforward via pbdMPI API function task.pull() in SPMD. For example, the next is available in demo which has a user defined function FUN() run on workers, and master (rank 0) controls the task management. Shell Command mpiexec - np 4 Rscript -e " demo ( task _ pull , ' pbdMPI ' , ask =F , echo = F ) " SPMD R Script (task pull) # ## Initial . suppressMessages ( library ( pbdMPI , quietly = TRUE ) ) # ## Examples . FUN <- function ( jid ) { Sys . sleep (1) jid * 10 } ret <- task . pull (1:10 , FUN ) comm . print ( ret ) if ( comm . rank () == 0) { ret . jobs <- unlist ( ret ) ret . jobs <- ret . jobs [ names ( ret . jobs ) == " ret " ] print ( ret . jobs ) } 19 # ## Finish . finalize () 5. Q: What if I want to run task push or pull by using pbdMPI? A: No problem. As in the two proceeding examples, task push or pull can be done in the same way by using rank 0 as the manager and the other ranks as workers. However, we do not recommend it except perhaps for inhomogeneous computing environments and independent jobs. 6. Q: Are S4 methods more efficient? A: Yes and No. S4 methods are a little less efficient than using switch ... case ... in C, but most default methods use raw with un- and serialize which may cost 3-10 times more than using integer or double. Instead of writing C code, it is easier to take advantage of S4 methods to extend to general R objects (matrix, array, list, data.frame, and class ...) by communicating with basic data types (integer and double) and avoiding serialization. 7. Q: Can I disable the MPI initialization of pbdMPI when I call library(pbdMPI)? A: Yes, you can set a hidden variable .__DISABLE_MPI_INIT__ in the .GlobalEnv before calling library(pbdMPI). For example, SPMD R Script assign ( " . _ _ DISABLE _ MPI _ INIT _ _ " , TRUE , envir = . GlobalEnv ) library ( pbdMPI ) ls ( all . names = TRUE ) init () ls ( all . names = TRUE ) finalize ( mpi . finalize = FALSE ) Note that we are *NOT* supposed to kill MPI in the finalize step if MPI is initialized by external applications. But some memory allocated by pbdMPI has to be free, mpi.finalize = FALSE is set above. To avoid some initialization issues of MPI, pbdMPI uses a different way than Rmpi. pbdMPI allows you to disable initializing communicators when loading the library, and later on you can call init to initialize or obtain communicators through .__MPI_APTS__ as in the next question. 8. Q: Can pbdMPI take or export communicators? A: Yes, the physical memory address is set to the variable .__MPI_APTS__ in the .GlobalEnv through a call to init(). The variable points to a structure containing MPI structure arrays preallocated while pbdMPI is loaded. pbdMPI/src/pkg_* provides a mechanism to take or export external/global variables at the C language level. 8.3. MPI Errors 1. Q: If compilation successful, but load fails with segfault 20 Error Message * * testing if installed package can be loaded sh : line 1: 2905 Segmentation fault ' / usr / local / R / 3.0.0 / intel13 / lib64 / R / bin / R ' --no - save -- slave 2>&1 < / tmp / RtmpGkncGK / file1e541c57190 ERROR : loading failed * * * caught segfault * * * address ( nil ) , cause ' unknown ' A: Basically, pbdMPI and all pbdR are tested and have stable configuration in GNU environment. However, other compilers are also possible such as Intel compiler. 
This message may come from the system of login node does not have a MPI system, MPI system is only allowed to be loaded in computing node, or MPI shared library is not loaded correctly and known to R. The solution is to use extra flag to R CMD INSTALL -no-test-load pbdMPI*.tar.gz, and use export LD_PRELOAD=... as the answer to the next question. 2. Q: If installation fails with Error Message Error in dyn . load ( file , DLLpath = DLLpath , ...) : unable to load shared object ' / ... / pbdMPI / libs / pbdMPI . so ' : libmpi . so : cannot open shared object file : No such file or directory A: OpenMPI may not be installed in the usual location, so the environment variable LD_LIBRARY_PATH should be set to the libmpi.so path, such as Shell Command export LD _ LIBRARY _ PATH = / usr / local / openmpi / lib : $ LD _ LIBRARY _ PATH where /usr/local/openmpi/lib should be replaced by the path to libmpi.so. Or, use export LD_PRELOAD=... to preload the MPI library if the library name is not conventional, such as Shell Command export LD _ PRELOAD = / usr / local / openmpi / lib / libmpi . so : $ LD _ PRELOAD Another solution may be to use the unix command ldconfig to setup the correct path. 3. Q: pbdMPI installs successfuly, but fails at initialization when calling the function init() with error message Error Message / usr / lib / R / bin / exec / R : symbol lookup error : / usr / lib / openmpi / lib / openmpi / mca _ paffinity _ linux . so : undefined symbol : mca _ base _ param _ reg _ int 21 A: The linked library at installation may be different from the runtime library, especially when your system has more than one MPI systems. Since the library at installation is detected by autoconf (configure) and automake (Makevars), it can be linked with OpenMPI library, but MPICH2 or LAM/MPI is searched before OpenMPI according to $PATH. Solutions: Check which MPI system is your favorite to call. If you use OpenMPI, then you have to link with OpenMPI. Similarly, for MPICH2. Or, only kepp the MPI system you do like and drop others. Use --with-mpi-type to specify the MPI type. Use --with-mpi-include and --with-mpi-libpath to specify the right version. 4. Q: (Mac) If installs successfully, but fails at initialization with Error Message Library not loaded : / usr / lib / libmpi .0. dylib A: Please make sure the GNU compiler, R, OpenMPI, and pbdMPI are all built and installed under unified conditions, such as 64-bits environment. 32-bits R may not be able to load 64-bits OpenMPI nor pbdMPI. 5. Q: (Linux) If OpenMPI mpiexec fails with Error Message mca : base : component _ find : unable to open / ... / openmpi / lib / openmpi / mca _ paffinity _ hwloc : / ... / openmpi / lib / openmpi / mca _ paffinity _ hwloc . so : undefined symbol : opal _ hwloc _ topology ( ignored ) ... mca : base : component _ find : unable to open / ... / openmpi / lib / openmpi / mca _ carto _ auto _ detect : / ... / openmpi / lib / openmpi / mca _ carto _ auto _ detect . so : undefined symbol : opal _ carto _ base _ graph _ get _ host _ graph _ fn ( ignored ) ... A: The linked MPI library libmpi.so may be missing or have a different name. OpenMPI builds shared/dynamic libraries by default and the target file libmpi.so is used by pbdMPI/src/spmd.c through #includeand dlopen(...) in the file pbdMPI/src/pkg_dl.c. Solutions: Check if the path and version of libmpi.so are correct. In particular, one may have different MPI systems installed. When linking with libmpi.so in OpenMPI, one must run/load pbdMPI with OpenMPI’s libmpi.so. 
The same for LAM/MPI and MPICH2. Use export LD_PRELOAD=$PATH_TO_libmpi.so.* in command mode. 22 Use the file /etc/ld.so.conf and the command ldconfig to manage personal MPI installation. Or, recompile OpenMPI with a static library, and use libmpi.a instead. 6. Q: (Windows) If OpenMPI mpiexec fails with Error Message ORTE _ ERROR _ LOG : Error in file ..\..\..\ openmpi -1 .6\ orte \ mca \ ess \ hnp \ ess _ hnp _ module . c at line 194 ... ORTE _ ERROR _ LOG : Error in file ..\..\..\ openmpi -1 .6\ orte \ runtime \ orte _ init . c at line 128 ... A: Check if the network is unplugged, the network should be “ON” even on a single machine. At least, the status of network interface should be correct. 7. Q: (Windows) If MPICH2 mpiexec fails with Error Message c :\ > " C :\ Program Files \ MPICH2 \ bin \ mpiexec . exe " - np 2 Rscript C :\ my _ script . r launch failed : CreateProcess ( Rscript C :\ my _ script . r ) on failed , error 2 - The system cannot find the file specified . A: Please try to use Rscript.exe in windows system. 8. Q: For MPICH2 users, if installation fails with Error Message / usr / bin / ld : libmpich . a ( comm _ get _ attr . o ) : relocation R _ X86 _ 64 _ 32 against ` MPIR _ ThreadInfo ' can not be used when making a shared object ; recompile with - fPIC libmpich . a : could not read symbols : Bad value collect2 : ld returned 1 exit status A: MPICH2 by default does not install a shared library which means libmpich.so is missing and pbdMPI trys to link with a static library libmpich.a instead. Try to recompile MPICH2 with a flag --enable-shared and reinstall pbdMPI again. 9. Q: For MPICH2 and MPICH3 users, if installation fails with Error Message / usr / bin / ld : cannot find - lopa collect2 : error : ld returned 1 exit status make : * * * [ pbdMPI . so ] Error 1 ERROR : compilation failed for package ' pbdMPI ' A: By default, -lopa is required for some systems. However, some systems may not have it and can be disable with a configuration flag when install pbdMPI, such as R CMD INSTALL pbdMPI*.tar.gz -configure-args="-disable-opa". 23 10. Q: (MacOS 10.9.4 + OpenMPI 1.1.8) If compilation successful, but test load fails with MCA errors such as “Symol not found” Error Message * * installing vignettes ` pbdMPI - guide . Rnw ' * * testing if installed package can be loaded [??. ??.??.? ?.??:??] mca : base : component _ find : unable to open / ... / open - mpi / 1.8.1 / lib / openmpi / mca _ allocator _ basic : dlopen ( / ... / open - mpi / 1.8.1 / lib / openmpi / mca _ allocator _ basic . so , 9) : Symbol not found : _ ompi _ free _ list _ item _ t _ class Referenced from : / ... / open - mpi / 1.8.1 / lib / openmpi / mca _ allocator _ basic . so Expected in : flat namespace in / ... / open - mpi / 1.8.1 / lib / openmpi / mca _ allocator _ basic . so ( ignored ) A: The potential problem here is that mpicc -showme provides extra information, such as multiple include and library paths, and configure is not able to parse correctly. Therefore, it is easier to manually specify correct paths via -configure-args to R. (Thanks for Eilidh Troup in University of Edinburgh, Scotland providing errors and solutions.) R Script $ mpicc -- showme : compile -I / usr / local / Cellar / open - mpi / 1.8.1 / include $ mpicc -- showme : link -L / usr / local / opt / libevent / lib -L / usr / local / Cellar / open - mpi / 1.8.1 / lib - lmpi $ R CMD INSTALL pbdMPI _ 0.2 -4. tar . 
gz \ -- configure - args = " -- with - mpi - type = OPENMPI \ -- with - mpi - include = / usr / local / Cellar / open - mpi / 1.8.1 / include \ -- with - mpi - libpath = / usr / local / Cellar / open - mpi / 1.8.1 / lib " Note that ACX_MPI is also a good solution to fix configure.ac, however, it may screw up other platforms, such as Solaris, and upset CRAN. Anyone is welcome to submit a thoughful solution. 11. Q: (Windows) If OpenMPI mpiexec fails with Error Message d : / Compiler / gcc -4.9.3 / mingw _ 32 / bin / gcc -I " D : / RCompile / recent /R -3.3.1 / include " - DNDEBUG -I " C : / Program Files / Microsoft MPI / Inc / " - DMPI2 - DWIN - DMSMPI _ NO _ DEPRECATE _ 20 -I " d : / Compiler / gcc -4.9.3 / local330 / include " - O3 - Wall - std = gnu99 - mtune = core2 -c comm _ errors . c -o comm _ errors . o 24 d : / Compiler / gcc -4.9.3 / mingw _ 32 / bin / gcc -I " D : / RCompile / recent /R -3.3.1 / include " - DNDEBUG -I " C : / Program Files / Microsoft MPI / Inc / " - DMPI2 - DWIN - DMSMPI _ NO _ DEPRECATE _ 20 -I " d : / Compiler / gcc -4.9.3 / local330 / include " - O3 - Wall - std = gnu99 - mtune = core2 -c comm _ sort _ double . c -o comm _ sort _ double . o In file included from spmd . h :7:0 , from comm _ api . h :7 , from comm _ sort _ double . c :1: pkg _ global . h :16:17: fatal error : mpi . h : No such file or directory # include < mpi .h > ^ compilation terminated . make : * * * [ comm _ sort _ double . o ] Error 1 A: The C:/Program Files/Microsoft MPI/Inc/ may not exist for the MS-MPI v7.1 SDKs. The header file may in a different installation directory at C:/Program Files (x86)/Microsoft SDKS/MPI/. See Section 7 for details. 12. Q: (Windows) If pbdMPI fails with Error Message > library ( pbdMPI ) Loading required package : rlecuyer Error : . onLoad failed in loadNamespace () for ' pbdMPI ' , details : call : inDL (x , as . logical ( local ) , as . logical ( now ) , ...) error : unable to load shared object ' C : / Users / ... / pbdMPI / libs / x64 / pbdMPI . dll ' : LoadLibrary failure : The specified module could not be found . or with a system error like Error Message The program can ' t start because msmpi . dll is missing from your computer . Try reinstalling the program to fix this problem . A: Make sure MS-MPI is installed correctly and the msmpi.dll is accessible from PATH before RGui is launched. Double check with Sys.getenv("PATH") and make sure something like C:/Program Files/Microsoft MPI/Bin/ is included in it. See Section 7 for details. 8.4. Other Errors 1. Q: pbdMPI is linked with pbdPROF (Chen et al. 2013) and mpiP (Vetter and McCracken 2001). (i.e. --enable-pbdPROF is used in pbdMPI and --with-mpiP is used in pbdPROF.) If pbdMPI compilation successful, but load fails with 25 Error Message Error : . onLoad failed in loadNamespace () for ' pbdMPI ' , details : call : dyn . load ( file , DLLpath = DLLpath , ...) error : unable to load shared object ' pbdMPI . so ' : pbdMPI / libs / pbdMPI . so : undefined symbol : _ Ux86 _ 64 _ getcontext A: Some prerequisite packages by mpiP is installed incorrectly. Reinstall mpiP by R Script . / configure -- disable - libunwind CPPFLAGS = " - fPIC -I / usr / lib / openmpi / include " LDFLAGS = " -L / usr / lib / openmpi / lib - lmpi " and followed by reinstall pbdPROF and pbdMPI. 26 References Chen W-C Schmidt D, Sehrawat G, Patel P, Ostrouchov G (2013). “pbdPROF: Programming with Big Data – MPI Profiling Tools.” R Package, URL https://cran.r-project.org/ package=pbdPROF. 
Chen WC, Ostrouchov G, Schmidt D, Patel P, Yu H (2012). "pbdMPI: Programming with Big Data – Interface to MPI." R Package, URL https://cran.r-project.org/package=pbdMPI.

Chen WC, Schmidt D (2015). "pbdZMQ: Programming with Big Data – Interface to ZeroMQ." R Package, URL https://cran.r-project.org/package=pbdZMQ.

Fisher R (1936). "The use of multiple measurements in taxonomic problems." Annals of Eugenics, 2, 179-188.

Ostrouchov G, Chen WC, Schmidt D, Patel P (2012). "Programming with Big Data in R." URL http://r-pbd.org/.

R Core Team (2012). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.r-project.org/.

Raim A (2013). Introduction to distributed computing with pbdR at the UMBC High Performance Computing Facility (Technical report HPCF-2013-2). UMBC High Performance Computing Facility, University of Maryland, Baltimore County.

Schmidt D, Chen WC (2015). "'pbdR' Client/Server Utilities." R Package, URL https://cran.r-project.org/package=pbdCS.

Schmidt D, Chen WC, Patel P, Ostrouchov G (2013). Speaking Serial R with a Parallel Accent. R Vignette, URL https://cran.r-project.org/package=pbdDEMO.

Tierney L, Rossini AJ, Li N, Sevcikova H (2012). "snow: Simple Network of Workstations." R package (v:0.3-9), URL https://cran.r-project.org/package=snow.

Urbanek S (2011). "multicore: Parallel processing of R code on machines with multiple cores or CPUs." R package (v:0.1-7), URL https://cran.r-project.org/package=multicore.

Vetter JS, McCracken MO (2001). "Statistical scalability analysis of communication operations in distributed applications." In Proceedings of the Eighth ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming, PPoPP '01, pp. 123-132. ACM, New York, NY, USA. ISBN 1-58113-346-4. doi:10.1145/379539.379590. URL http://doi.acm.org/10.1145/379539.379590.

Yu H (2002). "Rmpi: Parallel Statistical Computing in R." R News, 2(2), 10-14. URL https://cran.r-project.org/doc/Rnews/Rnews_2002-2.pdf.