MPI Performance on Jazz over Myrinet
using LLCBench
LLCBench (Low-Level Characterization
Benchmarks) is composed of three components: MPBench, CacheBench, and
BLASBench. These are some of the MPBench results on
Jazz. There are two primary types of tests: point-to-point tests
and global tests. The point-to-point tests measure bandwidth and
latency as a function of message size between two processors. The
global tests measure bandwidth as a function of message size for
the global operations: broadcast, reduce, allreduce, and
all-to-all. The global tests were run on 16 processors, since
that is an average job size on Jazz.
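As a rough illustration of how latency and bandwidth relate to a timed message exchange, here is a minimal Python sketch. It assumes the common ping-pong convention, where one-way latency is half the round-trip time and bandwidth is the message size divided by the one-way time. This is not necessarily the exact formula MPBench uses, and the function names are hypothetical.

```python
def one_way_latency(round_trip_seconds):
    """Estimate one-way latency as half of a measured round-trip time.

    Assumption: a symmetric ping-pong exchange between two processors.
    """
    return round_trip_seconds / 2.0


def bandwidth_mb_per_s(message_bytes, one_way_seconds):
    """Bandwidth in MB/s (1 MB = 10**6 bytes) for one message of the given size."""
    return message_bytes / one_way_seconds / 1e6
```

Calling bandwidth_mb_per_s(size, one_way_latency(rtt)) for each message size would give one point on a bandwidth curve.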
MPBench measures messages from 4 bytes to 2^16 bytes, in powers of two,
with 100 iterations per message size. Each test is run a single time before
timing begins to allow for cache setup and routing. The cache is then
flushed, and is flushed again before each new message size is
tested. The cache is not flushed between iterations of the same
message size, so each message size is tested under ideal conditions.
The final result for a message size is the average over its 100
iterations. The timer is read outside the 100-iteration loop to avoid
the cost and problems of calling system timers that frequently.
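The procedure above can be sketched in Python as follows. This is only an illustration of the loop structure (power-of-two sizes, one untimed warm-up run, timer read outside the loop, average over the iterations), not MPBench's actual code. The copy_buffer function is a hypothetical stand-in for a real MPI transfer, and the cache flush is not modeled.

```python
import time


def message_sizes(min_bytes=4, max_bytes=2**16):
    """Yield message sizes from min_bytes to max_bytes in powers of two."""
    size = min_bytes
    while size <= max_bytes:
        yield size
        size *= 2


def time_operation(op, size, iterations=100):
    """Return the average seconds per call of op(size).

    One untimed warm-up call is made first. The timer is read only twice,
    outside the loop, so timer overhead is not paid on every iteration.
    """
    op(size)  # warm-up run, not timed
    start = time.perf_counter()
    for _ in range(iterations):
        op(size)
    elapsed = time.perf_counter() - start
    return elapsed / iterations


def copy_buffer(size):
    """Hypothetical stand-in for a message transfer: copy `size` bytes."""
    return bytes(bytearray(size))


if __name__ == "__main__":
    for size in message_sizes():
        avg = time_operation(copy_buffer, size)
        print(f"{size:6d} bytes  {avg * 1e6:10.3f} us/iteration")
```

Reading the timer once per iteration instead would add the timer's own cost to every measurement, which matters most for the smallest messages.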
I chose to run these tests to get an idealized
performance spec for Myrinet on Jazz. Generally, real
applications fall short of these ideals, often because of
non-ideal message buffering and caching in the application.