Supported Engines

This section describes the engines available in ADIOS2 and the engine-specific parameters that give the user additional control. Parameters are passed as key-value pairs for:

  1. Engine specific parameters

  2. Engine supported transports and parameters

Parameters are passed at:

  1. Compile time IO::SetParameters (adios2_set_parameter in C, Fortran)

  2. Compile time IO::AddTransport (adios2_set_transport_parameter in C, Fortran)

  3. Runtime Configuration Files in the ADIOS component.
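The key-value pairs described above are plain strings on both sides. The following is a minimal sketch of how such pairs could be assembled in C++ without depending on ADIOS2. It assumes that `adios2::Params` has the same shape as the `Params` alias defined here; `makeEngineParams` and `makeFileTransportParams` are hypothetical helper names, and the keys are taken from the BP4 section below.

```cpp
#include <cassert>
#include <map>
#include <string>

// String-to-string map, the shape used for key-value parameters.
// Assumption: adios2::Params is an alias for this same type.
using Params = std::map<std::string, std::string>;

// Hypothetical helper: engine-specific parameters for a BP4 writer.
// Every value is passed as a string, including numbers.
Params makeEngineParams(int threads, const std::string &initialBufferSize)
{
    assert(threads >= 1);
    Params p;
    p["Threads"] = std::to_string(threads);
    p["InitialBufferSize"] = initialBufferSize; // e.g. "16Kb", "10Mb"
    p["Profile"] = "Off";
    return p;
}

// Hypothetical helper: parameters for a File transport.
Params makeFileTransportParams()
{
    return {{"Library", "POSIX"}};
}
```

With ADIOS2 available, maps like these would typically be handed to `IO::SetParameters` and to `IO::AddTransport("File", ...)` respectively.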

BP4

The BP4 Engine writes and reads files in ADIOS2 native binary-pack (bp version 4) format. This is a new format for ADIOS 2.x which improves on the metadata operations of the older BP3 format. Compared to the older format, BP4 provides three main advantages:

  • Fast and safe appending of multiple output steps into the same file. Better performance than writing new files each step. Existing steps cannot be corrupted by appending new steps.

  • Streaming through files (i.e. online processing). Consumer apps can read existing steps while the Producer is still writing new steps. Reader’s loop can block (with timeout) and wait for new steps to arrive. Same reader code can read the entire data in post or in situ. No restrictions on the Producer.

  • Burst buffer support for writing data. It can write the output to a local file system on each compute node and drain the data to the parallel file system in a separate asynchronous thread. Streaming reads from the target file system are still supported when data goes through the burst buffer. Appending to an existing file on the target file system is NOT supported currently.
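The blocking reader loop mentioned in the streaming bullet above can be pictured as polling with a deadline. The sketch below is a model of that behavior written against the C++ standard library only; it is not ADIOS2 code, and `waitForStep`, `WaitResult` and the `stepAvailable` callback are hypothetical names.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

enum class WaitResult { Ready, TimedOut };

// Poll `stepAvailable` every `pollInterval` until it returns true or until
// `timeout` has elapsed. This mirrors a reader that blocks (with timeout)
// while the producer is still writing new steps.
WaitResult waitForStep(const std::function<bool()> &stepAvailable,
                       std::chrono::milliseconds timeout,
                       std::chrono::milliseconds pollInterval)
{
    assert(pollInterval.count() > 0);
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    while (true)
    {
        if (stepAvailable())
            return WaitResult::Ready;
        if (std::chrono::steady_clock::now() >= deadline)
            return WaitResult::TimedOut;
        std::this_thread::sleep_for(pollInterval);
    }
}
```

In an ADIOS2 reader the equivalent role is played by `BeginStep` called with a timeout, which reports whether a step is ready, not ready yet, or whether the stream has ended.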

BP4 files have the following structure given a “name” string passed as the first argument of IO::Open:

io.SetEngine("BP4");
adios2::Engine bpFile = io.Open("name", adios2::Mode::Write);

will generate:

% BP4 datasets are always a directory
name.bp/

% data and metadata files
name.bp/
        data.0
        data.1
        ...
        data.M
        md.0
        md.idx
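As an illustration of the layout above, the following sketch lists the paths for a given name and highest data sub-file index. `bp4Layout` is a hypothetical helper that only reproduces the pattern shown in this section; the number of data files actually produced depends on the aggregation settings described further down.

```cpp
#include <cassert>
#include <string>
#include <vector>

// List the paths shown above for a dataset opened as `name`, with data
// sub-files numbered 0..lastDataIndex, followed by the two metadata files.
std::vector<std::string> bp4Layout(const std::string &name, int lastDataIndex)
{
    assert(lastDataIndex >= 0);
    const std::string dir = name + ".bp/";
    std::vector<std::string> paths;
    for (int i = 0; i <= lastDataIndex; ++i)
        paths.push_back(dir + "data." + std::to_string(i));
    paths.push_back(dir + "md.0");
    paths.push_back(dir + "md.idx");
    return paths;
}
```

For example, `bp4Layout("name", 1)` yields the two data files followed by `name.bp/md.0` and `name.bp/md.idx`.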

Note

BP4 file names are compatible with the Unix (/) and Windows (\) file system naming conventions for directories and files.

This engine allows the user to fine tune the buffering operations through the following optional parameters:

  1. Profile: turns ON/OFF profiling information right after a run

  2. ProfileUnits: set profile units according to the required measurement scale for intensive operations

  3. Threads: number of threads provided from the application for buffering, use this for very large variables in data size

  4. InitialBufferSize: initial memory provided for buffering (minimum is 16Kb)

  5. BufferGrowthFactor: exponential growth factor for initial buffer > 1, default = 1.05.

  6. MaxBufferSize: maximum allowable buffer size (must be larger than 16Kb). If too large adios2 will throw an exception.

  7. FlushStepsCount: users can select how often to produce the more expensive collective metadata file in terms of steps: default is 1. Increase to reduce adios2 collective operations footprint, with the trade-off of reducing checkpoint frequency. Buffer size will increase until first steps count if MaxBufferSize is not set.

  8. NumAggregators (or SubStreams): Users can select how many sub-files (M) are produced during a run. The value ranges between 1 and the number of MPI processes from MPI_Size (N); adios2 will internally aggregate data buffers (N-to-M) to output the required number of sub-files. Default is 0, which lets adios2 group processes per shared-memory access (i.e. one per compute node) and use one process per node as an aggregator. If NumAggregators is larger than the number of processes then it will be set to the number of processes.

  9. AggregatorRatio: An alternative option to NumAggregators to pick every Nth process as aggregator. An integer divider of the number of processes is required, otherwise a runtime exception is thrown.

  10. OpenTimeoutSecs: (Streaming mode) Reader may want to wait for the creation of the file in io.Open(). By default the Open() function returns with an error if file is not found.

  11. BeginStepPollingFrequencySecs: (Streaming mode) Reader can set how frequently to check the file (and file system) for new steps. Default is 1 second, which may be stressful for the file system and unnecessary for the application.

  12. StatsLevel: Turn on/off calculating statistics for every variable (Min/Max). Default is On. It has some cost to generate this metadata so it can be turned off if there is no need for this information.

  13. StatsBlockSize: Calculate Min/Max for a given size of each process output. Default is one Min/Max per writer. More fine-grained min/max can be useful for querying the data.

  14. NodeLocal or Node-Local: For distributed file system. Every writer process must make sure the .bp/ directory is created on the local file system. Required when writing to local disk/SSD/NVMe in a cluster. Note: the BurstBuffer* parameters are newer and should be used for using the local storage as temporary instead of this parameter.

  15. BurstBufferPath: Redirect output file to another location and drain it to the original target location in an asynchronous thread. This requires the ability to launch one thread per aggregator (see SubStreams) on the system. This feature can be used on machines that have local NVMe/SSDs on each node to accelerate the output writing speed. On Summit at OLCF, use “/mnt/bb/<username>” for the path where <username> is your user account name. Temporary files on the accelerated storage will be automatically deleted after the application closes the output and ADIOS drains all data to the file system, unless draining is turned off (see the next parameter). Note: at this time, this feature cannot be used to append data to an existing dataset on the target system.

  16. BurstBufferDrain: To write only to the accelerated storage but to not drain it to the target file system, set this flag to false. Data will NOT be deleted from the accelerated storage on close. By default, setting the BurstBufferPath will turn on draining.

  17. BurstBufferVerbose: Verbose level 1 will cause each draining thread to print a one line report at the end (to standard output) about where it has spent its time and the number of bytes moved. Verbose level 2 will cause each thread to print a line for each draining operation (file creation, copy block, write block from memory, etc).

  18. StreamReader: By default the BP4 engine parses all available metadata in Open(). An application may turn this flag on to parse a limited number of steps at once, and update metadata when those steps have been processed. If the flag is ON, reading only works in streaming mode (using BeginStep/EndStep); file reading mode will not work as there will be zero steps processed in Open().

Key                            Value Format         Default and Examples
-----------------------------  -------------------  ------------------------------------------------------------
Profile                        string On/Off        On, Off
ProfileUnits                   string               Microseconds, Milliseconds, Seconds, Minutes, Hours
Threads                        integer > 1          1, 2, 3, 4, 16, 32, 64
InitialBufferSize              float+units >= 16Kb  16Kb, 10Mb, 0.5Gb
MaxBufferSize                  float+units >= 16Kb  at EndStep, 10Mb, 0.5Gb
BufferGrowthFactor             float > 1            1.05, 1.01, 1.5, 2
FlushStepsCount                integer > 1          1, 5, 1000, 50000
NumAggregators                 integer >= 1         0 (one file per compute node), MPI_Size/2, … , 2, (N-to-1) 1
AggregatorRatio                integer >= 1         not used unless set, MPI_Size/N must be an integer value
OpenTimeoutSecs                float                0, 10.0, 5
BeginStepPollingFrequencySecs  float                1, 10.0
StatsLevel                     integer, 0 or 1      1, 0
StatsBlockSize                 integer > 0          a very big number, 1073741824 for blocks with 1M elements
NodeLocal                      string On/Off        Off, On
Node-Local                     string On/Off        Off, On
BurstBufferPath                string               “”, /mnt/bb/norbert, /ssd
BurstBufferDrain               string On/Off        On, Off
BurstBufferVerbose             integer, 0-2         0, 1, 2
StreamReader                   string On/Off        On, Off
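To make the buffer parameters more concrete, here is a small sketch of how a "float+units" value could be converted to bytes and how InitialBufferSize, BufferGrowthFactor and MaxBufferSize interact. It rests on two assumptions that this section does not state: that Kb, Mb and Gb are powers of 1024, and that the buffer grows by repeated multiplication with the growth factor. `parseSize` and `growthSteps` are hypothetical helpers, not ADIOS2 functions.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Convert a "float+units" value such as "16Kb" or "0.5Gb" to bytes.
// Assumption: units are powers of 1024 and are matched case-insensitively.
std::size_t parseSize(const std::string &text)
{
    std::size_t pos = 0;
    const double number = std::stod(text, &pos); // throws if not numeric
    std::string unit = text.substr(pos);
    for (char &c : unit)
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    double scale = 0.0;
    if (unit == "kb")
        scale = 1024.0;
    else if (unit == "mb")
        scale = 1024.0 * 1024.0;
    else if (unit == "gb")
        scale = 1024.0 * 1024.0 * 1024.0;
    else
        throw std::invalid_argument("unknown unit: " + unit);
    return static_cast<std::size_t>(number * scale);
}

// Count how many times a buffer of `initial` bytes must grow by `factor`
// to hold `needed` bytes. Returns -1 when `needed` exceeds `maxSize`.
int growthSteps(std::size_t initial, double factor, std::size_t maxSize,
                std::size_t needed)
{
    assert(initial > 0 && factor > 1.0);
    if (needed > maxSize)
        return -1;
    double size = static_cast<double>(initial);
    int steps = 0;
    while (size < static_cast<double>(needed))
    {
        size *= factor;
        ++steps;
    }
    return steps;
}
```

With the default factor of 1.05 a buffer grows slowly, so a larger InitialBufferSize avoids many small reallocations when the per-step data volume is known in advance.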

Only file transport types are supported. Optional parameters for IO::AddTransport or in runtime config file transport field:

Transport type: File

Key      Value Format  Default and Examples
-------  ------------  -------------------------------------------
Library  string        POSIX (UNIX), FStream (Windows), stdio, IME

The IME transport directly reads and writes files stored on DDN’s IME burst buffer using the IME native API. To use the IME transport, IME must be available on the target system and ADIOS2 needs to be configured with ADIOS2_USE_IME. By default, data written to the IME is automatically flushed to the parallel filesystem at every EndStep() call. You can disable this automatic flush by setting the transport parameter SyncToPFS to OFF.

BP3

The BP3 Engine writes and reads files in ADIOS2 native binary-pack (bp) format. BP files are backwards compatible with ADIOS1.x and have the following structure given a “name” string passed as the first argument of IO::Open:

adios2::Engine bpFile = io.Open("name", adios2::Mode::Write);

will generate:

% collective metadata file
name.bp

% data directory and files
name.bp.dir/
            name.bp.0
            name.bp.1
            ...
            name.bp.M

Note

BP3 file names are compatible with the Unix (/) and Windows (\) file system naming conventions for directories and files.

Caution

The default BP3 engine checks whether the first argument of IO::Open ends with the .bp extension. If it does not, the engine adds .bp for the metadata file and .bp.dir for the data directory.
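The naming rule in the caution above can be sketched as follows. `bp3Names` is a hypothetical helper that only mirrors the rule as stated here: the extension is added when it is missing, and the data directory name is derived from the resulting file name.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Returns {collective metadata file, data directory} for a BP3 output name.
// ".bp" is appended only when the caller did not already supply it.
std::pair<std::string, std::string> bp3Names(const std::string &name)
{
    assert(!name.empty());
    const std::string ext = ".bp";
    const bool hasExt =
        name.size() >= ext.size() &&
        name.compare(name.size() - ext.size(), ext.size(), ext) == 0;
    const std::string file = hasExt ? name : name + ext;
    return {file, file + ".dir"};
}
```

Both `bp3Names("name")` and `bp3Names("name.bp")` give `name.bp` and `name.bp.dir`.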

This engine allows the user to fine tune the buffering operations through the following optional parameters:

  1. Profile: turns ON/OFF profiling information right after a run

  2. ProfileUnits: set profile units according to the required measurement scale for intensive operations

  3. CollectiveMetadata: turns ON/OFF forming collective metadata during run (used by large scale HPC applications)

  4. Threads: number of threads provided from the application for buffering, use this for very large variables in data size

  5. InitialBufferSize: initial memory provided for buffering (minimum is 16Kb)

  6. BufferGrowthFactor: exponential growth factor for initial buffer > 1, default = 1.05.

  7. MaxBufferSize: maximum allowable buffer size (must be larger than 16Kb). If too large adios2 will throw an exception.

  8. FlushStepsCount: users can select how often to produce the more expensive collective metadata file in terms of steps: default is 1. Increase to reduce adios2 collective operations footprint, with the trade-off of reducing checkpoint frequency. Buffer size will increase until first steps count if MaxBufferSize is not set.

  9. NumAggregators (or SubStreams): Users can select how many sub-files (M) are produced during a run. The value ranges between 1 and the number of MPI processes from MPI_Size (N); adios2 will internally aggregate data buffers (N-to-M) to output the required number of sub-files. Default is 0, which lets adios2 group processes per shared-memory access (i.e. one per compute node) and use one process per node as an aggregator. If NumAggregators is larger than the number of processes then it will be set to the number of processes.

  10. AggregatorRatio: An alternative option to NumAggregators to pick every Nth process as aggregator. An integer divider of the number of processes is required, otherwise a runtime exception is thrown.

  11. Node-Local: For distributed file system. Every writer process must make sure the .bp/ directory is created on the local file system. Required for using local disk/SSD/NVMe in a cluster.

Key                 Value Format         Default and Examples
------------------  -------------------  ------------------------------------------------------------
Profile             string On/Off        On, Off
ProfileUnits        string               Microseconds, Milliseconds, Seconds, Minutes, Hours
CollectiveMetadata  string On/Off        On, Off
Threads             integer > 1          1, 2, 3, 4, 16, 32, 64
InitialBufferSize   float+units >= 16Kb  16Kb, 10Mb, 0.5Gb
MaxBufferSize       float+units >= 16Kb  at EndStep, 10Mb, 0.5Gb
BufferGrowthFactor  float > 1            1.05, 1.01, 1.5, 2
FlushStepsCount     integer > 1          1, 5, 1000, 50000
NumAggregators      integer >= 1         0 (one file per compute node), MPI_Size/2, … , 2, (N-to-1) 1
AggregatorRatio     integer >= 1         not used unless set, MPI_Size/N must be an integer value
Node-Local          string On/Off        Off, On
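The two aggregation rules in the parameter list above (clamping NumAggregators to the process count, and requiring AggregatorRatio to divide the process count) can be modeled in a few lines. This is a sketch with hypothetical helper names; in particular, the choice of ranks 0, ratio, 2*ratio, ... as aggregators is an assumption made for illustration, and the default of 0 (one aggregator per compute node) is not modeled because it needs node topology.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// NumAggregators rule: a request above the process count is clamped to it.
int effectiveAggregators(int requested, int numProcesses)
{
    assert(requested >= 1 && numProcesses >= 1);
    return requested > numProcesses ? numProcesses : requested;
}

// AggregatorRatio rule: every `ratio`-th rank aggregates, and the ratio
// must divide the process count evenly or an exception is thrown.
std::vector<int> aggregatorRanks(int numProcesses, int ratio)
{
    if (ratio < 1 || numProcesses < 1 || numProcesses % ratio != 0)
        throw std::invalid_argument(
            "AggregatorRatio must be an integer divider of the process count");
    std::vector<int> ranks;
    for (int r = 0; r < numProcesses; r += ratio)
        ranks.push_back(r);
    return ranks;
}
```

With 8 processes, a ratio of 4 selects two aggregators, while a ratio of 3 is rejected.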

Only file transport types are supported. Optional parameters for IO::AddTransport or in runtime config file transport field:

Transport type: File

Key      Value Format  Default and Examples
-------  ------------  -------------------------------------------
Library  string        POSIX (UNIX), FStream (Windows), stdio, IME

HDF5

In ADIOS2, the default engine for reading and writing HDF5 files is called “HDF5”. To use this engine, you can either specify it in your XML config file with the tag <engine type=HDF5>, or set it in client code. For example, here is how to create an HDF5 reader:

adios2::IO h5IO = adios.DeclareIO("SomeName");
h5IO.SetEngine("HDF5");
adios2::Engine h5Reader = h5IO.Open(filename, adios2::Mode::Read);

In addition, with an HDF5 distribution of version 1.11 or later, one can use the HDF5Mixer engine to write files with the VDS (virtual dataset) feature of HDF5. The corresponding tag in the XML file is: <engine type=HDF5Mixer>

and a sample code for VDS writer is:

adios2::IO h5IO = adios.DeclareIO("SomeName");
h5IO.SetEngine("HDF5Mixer");
adios2::Engine h5Writer = h5IO.Open(filename, adios2::Mode::Write);

To read back the h5 files generated with VDS to ADIOS2, one can use the HDF5 engine. Please make sure you are using the HDF5 library that has version greater than or equal to 1.11 in ADIOS2.

The h5 file generated by ADIOS2 has two levels of groups: the top group, /, and its subgroups Step0 … StepN, where N is the number of steps. All datasets belong to the subgroups.

Any other h5 file can be read back to ADIOS as well. To be consistent, when reading back to ADIOS2, we assume a default Step0, and all datasets from the original h5 file belong to that subgroup. The full path of a dataset (from the original h5 file) is used when represented in ADIOS2.

We can pass options to the HDF5 API from the ADIOS XML configuration. Currently we support collective I/O (the H5CollectiveMPIO key in the examples, default false) and chunk specifications. The chunk specification uses spaces to separate values. By default, if a valid H5ChunkDim exists, it applies to all variables, unless H5ChunkVar is specified. Examples:

<parameter key="H5CollectiveMPIO" value="yes"/>
<parameter key="H5ChunkDim" value="200 200"/>
<parameter key="H5ChunkVar" value="VarName1 VarName2"/>

We suggest reading the HDF5 documentation before applying these options.
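The space-separated chunk values and the "applies to all variables unless H5ChunkVar is specified" rule can be sketched as follows. The helpers `splitOnSpaces`, `parseChunkDims` and `chunkApplies` are hypothetical and only model how such values might be interpreted; they are not part of ADIOS2 or HDF5.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a space-separated value such as "200 200" or "VarName1 VarName2".
std::vector<std::string> splitOnSpaces(const std::string &value)
{
    std::istringstream in(value);
    std::vector<std::string> tokens;
    for (std::string t; in >> t;)
        tokens.push_back(t);
    return tokens;
}

// Interpret an H5ChunkDim value as one chunk extent per dimension.
std::vector<std::size_t> parseChunkDims(const std::string &value)
{
    assert(!value.empty());
    std::vector<std::size_t> dims;
    for (const std::string &t : splitOnSpaces(value))
        dims.push_back(static_cast<std::size_t>(std::stoull(t)));
    return dims;
}

// A chunk specification applies to every variable unless an H5ChunkVar
// value names a subset; an empty value means "not specified".
bool chunkApplies(const std::string &variable, const std::string &chunkVarValue)
{
    const std::vector<std::string> names = splitOnSpaces(chunkVarValue);
    if (names.empty())
        return true;
    for (const std::string &n : names)
        if (n == variable)
            return true;
    return false;
}
```

Note that a chunk specification with two values, as in the example, only makes sense for two-dimensional variables, which is one reason to restrict it with H5ChunkVar.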

SST Sustainable Staging Transport

In ADIOS2, the Sustainable Staging Transport (SST) is an engine that allows direct connection of data producers and consumers via the ADIOS2 write/read APIs. This is a classic streaming data architecture where the data passed to ADIOS on the write side (via Put() deferred and sync, and similar calls) is made directly available to a reader (via Get(), deferred and sync, and similar calls).

SST is designed for use in HPC environments and can take advantage of RDMA network interconnects to speed the transfer of data between communicating HPC applications; however, it is also capable of operating in a Wide Area Networking environment over standard sockets. SST supports full MxN data distribution, where the number of reader ranks can differ from the number of writer ranks. SST also allows multiple reader cohorts to get access to a writer’s data simultaneously.

To use this engine, you can either specify it in your XML config file with the tag <engine type=SST>, or set it in client code. For example, here is how to create an SST reader:

adios2::IO sstIO = adios.DeclareIO("SomeName");
sstIO.SetEngine("SST");
adios2::Engine sstReader = sstIO.Open(filename, adios2::Mode::Read);

and a sample code for SST writer is:

adios2::IO sstIO = adios.DeclareIO("SomeName");
sstIO.SetEngine("SST");
adios2::Engine sstWriter = sstIO.Open(filename, adios2::Mode::Write);

The general goal of ADIOS2 is to ease the conversion of a file-based application to instead use a non-file streaming interconnect, for example, data producers such as computational physics codes and consumers such as analysis applications. However, there are some uses of ADIOS2 APIs that work perfectly well with the ADIOS2 file engines, but which will not work or will perform badly with streaming. For example, SST is based upon the “step” concept and ADIOS2 applications that use SST must call BeginStep() and EndStep(). On the writer side, the Put() calls between BeginStep and EndStep are the unit of communication and represent the data that will be available between the corresponding Begin/EndStep calls on the reader.

Also, it is recommended that SST-based applications not use the ADIOS2 Get() sync method unless there is only one data item to be read per step. This is because SST implements MxN data transfer (and avoids having to deliver all data to every reader), by queueing data on the writer ranks until it is known which reader rank requires it. Normally this data fetch stage is initiated by PerformGets() or EndStep(), both of which fulfill any pending Get() deferred operations. However, unlike Get() deferred, the semantics of Get() sync require the requested data to be fetched from the writers before the call can return. If there are multiple calls to Get() sync per step, each one may require a communication with many writers, something that would have only had to happen once if Get() deferred were used instead. Thus the use of Get() sync is likely to incur a substantial performance penalty.

On the writer side, depending upon the chosen data marshaling option there may be some (relatively small) performance differences between Put() sync and Put() deferred, but they are unlikely to be as substantial as between Get() sync and Get() deferred.

Note that SST readers and writers do not necessarily move in lockstep, but depending upon the queue length parameters and queueing policies specified, differing reader and writer speeds may cause one or the other side to wait for data to be produced or consumed, or data may be dropped if allowed by the queueing policy. However, steps themselves are atomic and no step will be partially dropped, delivered to a subset of ranks, or otherwise divided.

The SST engine allows the user to customize the streaming operations through the following optional parameters:

1. RendezvousReaderCount: Default 1. This integer value specifies the number of readers for which the writer should wait before the writer-side Open() returns. The default of 1 implements an ADIOS1/flexpath style “rendezvous”, in which an early-starting reader will wait for the writer to start, or vice versa. A number >1 will cause the writer to wait for more readers and a value of 0 will allow the writer to proceed without any readers present. This value is interpreted by SST Writer engines only.

2. RegistrationMethod: Default “File”. By default, SST reader and writer engines communicate network contact information via files in a shared filesystem. Specifically, the "filename" parameter in the Open() call is interpreted as a path which the writer uses as the name of a file to which contact information is written, and from which a reader will attempt to read contact information. As with other file-based engines, file creation and access is subject to the usual considerations (directory components are interpreted, but must exist and be traversable, writer must be able to create the file and the reader must be able to read it). Generally the file so created will exist only for as long as the writer keeps the stream Open(), but abnormal process termination may leave “stale” files in those locations. These stray “.sst” files should be deleted to avoid confusing future readers. SST also offers a “Screen” registration method in which writers and readers send their contact information to, and read it from, stdout and stdin respectively. The “screen” registration method doesn’t support batch mode operations in any way, but may be useful when manually starting jobs on machines in a WAN environment that don’t share a filesystem. A future release of SST will also support a “Cloud” registration method where contact information is registered to and retrieved from a network-based third-party server so that both the shared filesystem and interactivity can be avoided. This value is interpreted by both SST Writer and Reader engines.

3. QueueLimit: Default 0. This integer value specifies the number of steps which the writer will allow to be queued before taking specific action (such as discarding data or waiting for readers to consume the data). The default value of 0 is interpreted as no limit. This value is interpreted by SST Writer engines only.

4. QueueFullPolicy: Default “Block”. This value controls what policy is invoked if a non-zero QueueLimit has been specified and new data would cause the queue limit to be reached. Essentially, the “Block” option ensures data will not be discarded and if the queue fills up the writer will block on EndStep until the data has been read. If there is one active reader, EndStep will block until data has been consumed off the front of the queue to make room for newly arriving data. If there is more than one active reader, data is only removed from the queue when it has been read by all readers, so the slowest reader will dictate progress. NOTE THAT THE NO READERS SITUATION IS A SPECIAL CASE: If there are no active readers, new timesteps are considered to have completed their active queueing immediately upon submission. They may be retained in the “reserve queue” if the ReserveQueueLimit is non-zero. However, if that ReserveQueueLimit parameter is zero, timesteps submitted when there are no active readers will be immediately discarded.

Besides “Block”, the other acceptable value for QueueFullPolicy is “Discard”. When “Discard” is specified, and an EndStep operation would add more than the allowed number of steps to the queue, some step is discarded. If there are no current readers connected to the stream, the oldest data in the queue is discarded. If there are current readers, then the newest data (i.e. the just-created step) is discarded. (The differential treatment is because SST sends metadata for each step to the readers as soon as the step is accepted and cannot reliably prevent the use of that data without a costly all-to-all synchronization operation. Discarding the newest data instead is less satisfying, but has a similar long-term effect upon the set of steps delivered to the readers.) This value is interpreted by SST Writer engines only.

5. ReserveQueueLimit: Default 0. This integer value specifies the number of steps which the writer will keep in the queue for the benefit of late-arriving readers. This may consist of timesteps that have already been consumed by any readers, as well as timesteps that have not yet been consumed. In some sense this is target queue minimum size, while QueueLimit is a maximum size. This value is interpreted by SST Writer engines only.

6. DataTransport: Default varies. This string value specifies the underlying network communication mechanism to use for exchanging data in SST. Generally this is chosen by SST based upon what is available on the current platform. However, specifying this engine parameter allows overriding SST’s choice. Current allowed values are “RDMA” and “WAN”. (ib and fabric are accepted as equivalent to RDMA and evpath is equivalent to WAN.) Generally both the reader and writer should be using the same network transport, and the network transport chosen may be dictated by the situation. For example, the RDMA transport generally operates only between applications running on the same high-performance interconnect (e.g. on the same HPC machine). If communication is desired between applications running on different interconnects, the Wide Area Network (WAN) option should be chosen. This value is interpreted by both SST Writer and Reader engines.

7. WANDataTransport: Default sockets. If the SST DataTransport parameter is “WAN”, this string value specifies the EVPath-level data transport to use for exchanging data. The value must be a data transport known to EVPath, such as “sockets”, “enet”, or “ib”. Generally both the reader and writer should be using the same EVPath-level data transport. This value is interpreted by both SST Writer and Reader engines.

8. ControlTransport: Default tcp. This string value specifies the underlying network communication mechanism to use for performing control operations in SST. SST can be configured to use standard TCP sockets, which are very reliable and efficient, but which are limited in their scalability. Alternatively, SST can use a reliable UDP protocol that is more scalable, but as of ADIOS2 Release 2.4.0 still suffers from some reliability problems. (sockets is accepted as equivalent to tcp, and udp, rudp, and enet are equivalent to scalable.) Generally both the reader and writer should be using the same control transport. This value is interpreted by both SST Writer and Reader engines.

9. NetworkInterface: Default NULL. In situations in which there are multiple possible network interfaces available to SST, this string value specifies which should be used to generate SST’s contact information for writers. Generally this should NOT be specified except for narrow sets of circumstances. It has no effect if specified on Reader engines. If specified, the string value should correspond to a name of a network interface, such as are listed by commands like “netstat -i”. For example, on most Unix systems, setting the NetworkInterface parameter to “lo” (or possibly “lo0”) will result in SST generating contact information that uses the network address associated with the loopback interface (127.0.0.1). This value is interpreted only by the SST Writer engine.

10. ControlInterface: Default NULL. This value is similar to the NetworkInterface parameter, but only applies to the SST layer which does messaging for control (open, close, flow and timestep management, but not actual data transfer). Generally the NetworkInterface parameter can be used to control this, but that also applies to the Data Plane. Use ControlInterface in the event of conflicting specifications.

11. DataInterface: Default NULL. This value is similar to the NetworkInterface parameter, but only applies to the SST layer which does messaging for data transfer, not control (open, close, flow and timestep management). Generally the NetworkInterface parameter can be used to control this, but that also applies to the Control Plane. Use DataInterface in the event of conflicting specifications. In the case of the RDMA data plane, this parameter controls the libfabric interface choice.

12. FirstTimestepPrecious: Default FALSE. FirstTimestepPrecious is a boolean parameter that affects the queueing of the first timestep presented to the SST Writer engine. If FirstTimestepPrecious is TRUE, then the first timestep is effectively never removed from the output queue and will be presented as a first timestep to any reader that joins at a later time. This can be used to convey run parameters or other information that every reader may need despite joining later in a data stream. Note that this queued first timestep does count against the QueueLimit parameter above, so if a QueueLimit is specified, it should be a value larger than 1. Further note that while specifying this parameter guarantees that the preserved first timestep will be made available to new readers, other reader-side operations (like requesting the LatestAvailable timestep in Engine parameters) might still cause the timestep to be skipped. This value is interpreted only by the SST Writer engine.

13. AlwaysProvideLatestTimestep: Default FALSE. AlwaysProvideLatestTimestep is a boolean parameter that affects which of the available timesteps will be provided to the reader engine. If AlwaysProvideLatestTimestep is TRUE, then if there are multiple timesteps available to the reader, older timesteps will be skipped and the reader will see only the newest available upon BeginStep. This value is interpreted only by the SST Reader engine.

14. OpenTimeoutSecs: Default 60. OpenTimeoutSecs is an integer parameter that specifies the number of seconds SST is to wait for a peer connection on Open(). Currently this is only implemented on the Reader side of SST, and is a timeout for locating the contact information file created by Writer-side Open, not for completing the entire Open() handshake. Currently this value is interpreted only by the SST Reader engine.

15. SpeculativePreloadMode: Default AUTO. In some circumstances, SST eagerly sends all data from writers to every reader without first waiting for read requests. Generally this improves performance if every reader needs all the data, but can be very detrimental otherwise. The value AUTO for this engine parameter instructs SST to apply its own heuristic for determining if data should be eagerly sent. The value OFF disables this feature and the value ON causes eager sending regardless of heuristic. Currently SST’s heuristic is simple. If the size of the reader cohort is less than or equal to the value of the SpecAutoNodeThreshold engine parameter (Default value 1), eager sending is initiated. Currently this value is interpreted only by the SST Reader engine.

16. SpecAutoNodeThreshold: Default 1. If the size of the reader cohort is less than or equal to this value and the SpeculativePreloadMode parameter is AUTO, SST will initiate eager data sending of all data from each writer to all readers. Currently this value is interpreted only by the SST Reader engine.
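The interaction of QueueLimit and QueueFullPolicy described in the list above can be summarized as a small decision function. This is a simplified model with hypothetical names, not SST code. It treats the queue as full when one more step would exceed QueueLimit, and it deliberately leaves out the reserve queue and the special handling of steps submitted while no readers are active.

```cpp
#include <cassert>

enum class QueueFullPolicy { Block, Discard };
enum class EndStepOutcome { Queued, WriterBlocks, DropOldest, DropNewest };

// Decide what happens when EndStep submits one more step to the queue.
// A queueLimit of 0 means "no limit".
EndStepOutcome onEndStep(int queuedSteps, int queueLimit,
                         QueueFullPolicy policy, bool readersConnected)
{
    assert(queuedSteps >= 0 && queueLimit >= 0);
    if (queueLimit == 0 || queuedSteps + 1 <= queueLimit)
        return EndStepOutcome::Queued;
    if (policy == QueueFullPolicy::Block)
        return EndStepOutcome::WriterBlocks; // until readers consume a step
    // Discard: readers may already hold metadata for older steps, so the
    // just-created step is dropped whenever readers are connected.
    return readersConnected ? EndStepOutcome::DropNewest
                            : EndStepOutcome::DropOldest;
}
```

The asymmetry in the last branch follows the explanation given for the “Discard” policy: dropping an older step is only safe when no reader has been told about it.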

Key                          Value Format  Default and Examples
---------------------------  ------------  -------------------------------------
RendezvousReaderCount        integer       1
RegistrationMethod           string        File, Screen
QueueLimit                   integer       0 (no queue limits)
QueueFullPolicy              string        Block, Discard
ReserveQueueLimit            integer       0 (no queue limits)
DataTransport                string        default varies by platform, RDMA, WAN
WANDataTransport             string        sockets, enet, ib
ControlTransport             string        TCP, Scalable
NetworkInterface             string        NULL
ControlInterface             string        NULL
DataInterface                string        NULL
FirstTimestepPrecious        boolean       FALSE, true, no, yes
AlwaysProvideLatestTimestep  boolean       FALSE, true, no, yes
OpenTimeoutSecs              integer       60
SpeculativePreloadMode       string        AUTO, ON, OFF
SpecAutoNodeThreshold        integer       1
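The eager-sending heuristic controlled by SpeculativePreloadMode and SpecAutoNodeThreshold is simple enough to state as code. The function below is a sketch of the rule as described in the parameter list, with hypothetical names; it is not taken from the SST sources.

```cpp
#include <cassert>

enum class SpeculativePreloadMode { Auto, On, Off };

// Returns true when all data would be sent eagerly to every reader.
// In Auto mode this happens only for reader cohorts no larger than
// the SpecAutoNodeThreshold value (default 1).
bool eagerSend(SpeculativePreloadMode mode, int readerCohortSize,
               int specAutoNodeThreshold = 1)
{
    assert(readerCohortSize >= 1);
    switch (mode)
    {
    case SpeculativePreloadMode::On:
        return true;
    case SpeculativePreloadMode::Off:
        return false;
    case SpeculativePreloadMode::Auto:
        return readerCohortSize <= specAutoNodeThreshold;
    }
    return false;
}
```

With the defaults, a single-rank reader receives data eagerly, while a reader running on several ranks does not unless the threshold is raised or the mode is set to ON.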

SSC Strong Staging Coupler

The SSC engine is designed specifically for strong code coupling. Currently SSC only supports a fixed I/O pattern, which means that once the first step is finished, users are not allowed to write or read a data block with a start and count that have not been written or read in the first step. SSC uses a combination of one-sided and two-sided MPI methods. In all cases, all user applications are required to be launched within a single mpirun or mpiexec command, using the MPMD mode.
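The fixed-pattern restriction can be checked on the application side before data reaches the engine. The class below is a sketch of such a check under the rule stated above: the blocks used in the first step define the allowed set, and later steps may only reuse them. `FixedPatternChecker` is a hypothetical name and is not part of SSC.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using Dims = std::vector<std::size_t>;

class FixedPatternChecker
{
public:
    // Call once per block written or read; `step` is the zero-based step.
    // Blocks seen in step 0 are recorded. In later steps the call returns
    // false for any (start, count) pair that step 0 did not use.
    bool access(std::size_t step, const Dims &start, const Dims &count)
    {
        assert(start.size() == count.size());
        const std::pair<Dims, Dims> block(start, count);
        if (step == 0)
        {
            m_firstStepBlocks.insert(block);
            return true;
        }
        return m_firstStepBlocks.count(block) > 0;
    }

private:
    std::set<std::pair<Dims, Dims>> m_firstStepBlocks;
};
```

A separate checker instance would be needed per variable, since the same start and count may be valid for one variable and unused for another.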

The SSC engine takes the following parameters:

  1. OpenTimeoutSecs: Default 10. Timeout in seconds for opening a stream. The SSC engine’s open function will block until the RendezvousAppCount is reached, or timeout, whichever comes first. If it reaches the timeout, SSC will throw an exception.

  2. MpiMode: Default TwoSided. MPI communication modes to use. Besides the default TwoSided mode using two sided MPI communications, MPI_Isend and MPI_Irecv, for data transport, there are four one sided MPI modes: OneSidedFencePush, OneSidedPostPush, OneSidedFencePull, and OneSidedPostPull. Modes with Push are based on the push model and use MPI_Put for data transport, while modes with Pull are based on the pull model and use MPI_Get. Modes with Fence use MPI_Win_fence for synchronization, while modes with Post use MPI_Win_start, MPI_Win_complete, MPI_Win_post and MPI_Win_wait.

  3. Threading: Default False. SSC will use threads to hide the time cost for metadata manipulation and data transfer when this parameter is set to true. SSC will check if MPI is initialized with multi-thread enabled, and if not, then SSC will force this parameter to be false. Please do NOT enable threading when multiple I/O streams are opened in an application, as it will cause unpredictable errors.

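===============  ============  ====================================================
Key              Value Format  Default and Examples
===============  ============  ====================================================
OpenTimeoutSecs  integer       10, 2, 20, 200
MpiMode          string        TwoSided, OneSidedFencePush, OneSidedPostPush, OneSidedFencePull, OneSidedPostPull
Threading        bool          false, true
===============  ============  ====================================================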
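
As a minimal sketch of how the three SSC parameters might be assembled, the snippet below builds a string-to-string map of the same shape as adios2::Params. The helper name MakeSscParams is hypothetical and not part of ADIOS2; it only checks the MpiMode name against the five modes listed above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>

// Same shape as adios2::Params (a string-to-string map).
using Params = std::map<std::string, std::string>;

// Hypothetical helper, not part of ADIOS2: collects the SSC parameters
// described above and rejects MpiMode names outside the documented five.
Params MakeSscParams(int openTimeoutSecs, const std::string &mpiMode,
                     bool threading)
{
    assert(openTimeoutSecs >= 0); // a negative timeout is a programming error

    static const std::set<std::string> validModes = {
        "TwoSided", "OneSidedFencePush", "OneSidedPostPush",
        "OneSidedFencePull", "OneSidedPostPull"};
    if (validModes.count(mpiMode) == 0)
    {
        throw std::invalid_argument("unknown SSC MpiMode: " + mpiMode);
    }

    return {{"OpenTimeoutSecs", std::to_string(openTimeoutSecs)},
            {"MpiMode", mpiMode},
            {"Threading", threading ? "true" : "false"}};
}

// With ADIOS2 available, the map would be applied before Open():
//   io.SetEngine("SSC");
//   io.SetParameters(MakeSscParams(20, "OneSidedFencePush", false));
```

Validating the mode name up front is a design choice of this sketch; it surfaces a misspelled mode in the application rather than relying on how the engine treats unknown values.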

DataMan for Wide Area Network Data Staging

The DataMan engine is designed for data staging over the wide area network. It is intended for cases where a few writers send data to a few readers over long distances.

DataMan supports compression operators such as ZFP lossy compression and BZip2 lossless compression. Please refer to the operator section for usage.

The DataMan engine takes the following parameters:

  1. IPAddress: No default value. The IP address of the host where the writer application runs. This parameter is compulsory in wide area network data staging.

  2. Port: Default 50001. The port number on the writer host that will be used for data transfers.

  3. Timeout: Default 5. Timeout in seconds to wait for every send / receive operation. Packages not sent or received within this time are considered lost.

  4. RendezvousReaderCount: Default 1. This integer value specifies the number of readers for which the writer should wait before the writer-side Open() returns. By default, an early-starting writer will wait for the reader to start, or vice versa. A number >1 will cause the writer to wait for more readers, and a value of 0 will allow the writer to proceed without any readers present. This value is interpreted by DataMan Writer engines only.

  5. Threading: Default true for reader, false for writer. Whether to use threads for send and receive operations. Enabling threading will cause extra overhead for managing threads and buffer queues, but will improve the continuity of data steps for readers, and help overlap data transfers with computations for writers.

  6. TransportMode: Default fast. Only DataMan writers take this parameter. Readers are automatically synchronized at runtime to match the writers’ transport mode. The fast mode is optimized for latency-critical applications. It forces readers to receive only the latest step, so when writers are faster than readers, readers will skip some data steps. The reliable mode ensures that all steps are received by readers, at the cost of lower performance than the fast mode.

  7. MaxStepBufferSize: Default 128000000. In order to bring down the latency in wide area network staging use cases, DataMan uses a fixed receiver buffer size. This saves an extra communication operation to sync the buffer size for each step, before sending actual data. The default buffer size is 128 MB, which is sufficient for most use cases. However, in case 128 MB is not enough, this parameter must be set correctly, otherwise DataMan will fail.

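=====================  ============  ==================================
Key                    Value Format  Default and Examples
=====================  ============  ==================================
IPAddress              string        N/A, 22.195.18.29
Port                   integer       50001, 22000, 33000
Timeout                integer       5, 10, 30
RendezvousReaderCount  integer       1, 0, 3
Threading              bool          true for reader, false for writer
TransportMode          string        fast, reliable
MaxStepBufferSize      integer       128000000, 512000000, 1024000000
=====================  ============  ==================================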
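
Because MaxStepBufferSize is fixed rather than negotiated per step, it can help to estimate the size of one step before choosing a value. The sketch below is one way to do that. The helper names and the 1 MB headroom for metadata are assumptions made for illustration, not documented ADIOS2 figures.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Documented default for MaxStepBufferSize, in bytes.
constexpr std::size_t kDefaultMaxStepBufferSize = 128000000;

// Size of one variable's contribution to a step.
struct VariableShape
{
    std::size_t elements;        // number of elements written per step
    std::size_t bytesPerElement; // e.g. sizeof(double)
};

// Sums the payload of all variables written in a single step.
std::size_t StepPayloadBytes(const std::vector<VariableShape> &variables)
{
    std::size_t total = 0;
    for (const auto &v : variables)
    {
        total += v.elements * v.bytesPerElement;
    }
    return total;
}

// Hypothetical helper: keeps the default unless payload plus headroom
// exceeds it. The headroom value is an assumption, not a documented number.
std::size_t SuggestMaxStepBufferSize(std::size_t payloadBytes,
                                     std::size_t headroomBytes = 1000000)
{
    assert(headroomBytes <=
           std::numeric_limits<std::size_t>::max() - payloadBytes);
    return std::max(payloadBytes + headroomBytes, kDefaultMaxStepBufferSize);
}

// With ADIOS2 available, the result would be passed as a string:
//   io.SetParameter("MaxStepBufferSize", std::to_string(size));
```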

Inline for zero-copy

The Inline engine provides in-process communication between the writer and reader, avoiding the copy of data buffers.

This engine is focused on the N-to-N case: N writers share a process with N readers, and the analysis happens ‘inline’ without writing the data to a file or copying to another buffer. It is similar to the streaming SST engine, since analysis must happen per step.

To use this engine, you can either add <engine type="Inline"> to your XML config file, or set it in your application code:

adios2::IO io = adios.DeclareIO("ioName");
io.SetEngine("Inline");
adios2::Engine inlineWriter = io.Open("inline_write", adios2::Mode::Write);
adios2::Engine inlineReader = io.Open("inline_read", adios2::Mode::Read);

Notice that unlike other engines, the reader and writer share an IO instance. Both the writer and reader must be opened before either tries to call BeginStep()/PerformPuts()/PerformGets(). There must be exactly one writer, and exactly one reader.

For successful operation, the writer will perform a step, then the reader will perform a step in the same process. When the reader starts its step, the only data it has available is that written by the writer in its process. The reader then can retrieve whatever data was written by the writer by using the double-pointer Get call:

void Engine::Get<T>(Variable<T>, T**) const;

This version of Get is only used for the inline engine. See the example below for details.

Note

Since the inline engine does not copy any data, the writer should avoid changing the data before the reader has read it.

Typical access pattern:

// ... Application data generation

inlineWriter.BeginStep();
inlineWriter.Put(var, in_data); // always use deferred mode
inlineWriter.EndStep();
// Unlike other engines, data should not be reused at this point (since ADIOS
// does not copy the data), though ADIOS cannot enforce this.
// Must wait until reader is finished using the data.

inlineReader.BeginStep();
double* out_data = nullptr;
inlineReader.Get(var, &out_data);
// out_data now points at the writer's buffer (no copy was made).
inlineReader.EndStep();
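
The aliasing hazard described in the note above can be shown without ADIOS2. The toy class below is NOT the ADIOS2 API. It is an illustrative stand-in that, like the double-pointer Get, hands back the address of the writer's buffer instead of a copy.

```cpp
#include <cassert>
#include <vector>

// Toy stand-in for the Inline engine (not ADIOS2): Put records the address
// of the caller's buffer, and Get returns that same address.
class ToyInlineChannel
{
public:
    void Put(const double *data) { m_Data = data; }

    void Get(const double **data) const
    {
        assert(data != nullptr);
        *data = m_Data;
    }

private:
    const double *m_Data = nullptr;
};

// Returns true when the reader's pointer aliases the writer's buffer, so a
// write made by the producer after Put is visible through the reader.
bool ReaderSeesLaterWrite()
{
    std::vector<double> in_data = {1.0, 2.0, 3.0};
    ToyInlineChannel channel;
    channel.Put(in_data.data());

    const double *out_data = nullptr;
    channel.Get(&out_data);

    in_data[0] = 42.0; // producer reuses its buffer too early

    return out_data == in_data.data() && out_data[0] == 42.0;
}
```

Since the reader observes the overwritten value, the producer has to leave its buffer untouched until the reader has finished its step.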

InSitu MPI

Coming soon…

Null

The Null Engine bypasses any heavy I/O operations that other Engines might execute, for example memory allocations, buffering, and transport data movement. Calls to the Null engine return immediately without doing any actual work.

The overall goal is to provide a mechanism to isolate an application's behavior from the ADIOS 2 footprint. Use this engine to estimate the overhead of using a certain ADIOS 2 Engine in an application (similar to writing to /dev/null).

Supported Virtual Engine Names

This section describes the Virtual Engines that can be used to set up an actual Engine with specific parameters. These virtual names help new users by simplifying the selection of an engine and its parameters. The following I/O use cases are supported by virtual engine names:

  1. File: File I/O (Default engine).

    This sets up the I/O for files. If the file name passed in Open() ends with “.bp”, then the BP4 engine will be used starting in v2.5.0. If it ends with “.h5”, the HDF5 engine will be used. For old .bp files (BP version 3 format), the BP3 engine will be used for reading (v2.4.0 and below).

  2. FileStream: Online processing via files.

    This allows a Consumer to read the data concurrently while the Producer is writing new output steps into it. The Consumer will wait for the file itself to appear in Open() (for up to one hour) and for new steps to appear in the file (in BeginStep(), up to the timeout specified in that function).

  3. InSituAnalysis: Streaming data to another application.

    This sets up ADIOS for transferring data from a Producer to a Consumer application. The Producer and Consumer are synchronized at Open(). The Consumer will receive every single output step from the Producer, therefore, the Producer will block on output if the Consumer is slow.

  4. InSituVisualization: Streaming data to another application without waiting for consumption.

    This sets up ADIOS for transferring data from a Producer to a Consumer without ever blocking the Producer. The Producer will throw away all output steps that are not immediately requested by a Consumer. It will also not wait for a Consumer to connect. This kind of streaming is great for an interactive visualization session where the user wants to see the most current state of the application.

  5. CodeCoupling: Streaming data between two applications for code coupling.

    The Producer and Consumer wait for each other in Open(), and every step must be consumed. Currently, this is the same as InSituAnalysis.
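
The file-name rule described for the File virtual engine can be sketched as a small lookup. The helper names are hypothetical, and the empty-string result for other extensions is a choice of this sketch, since the text above only covers the “.bp” and “.h5” cases.

```cpp
#include <cassert>
#include <string>

// True when name ends with suffix.
bool EndsWith(const std::string &name, const std::string &suffix)
{
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(),
                        suffix) == 0;
}

// Hypothetical helper mirroring the File rule: ".bp" selects BP4 and ".h5"
// selects HDF5. Other names return an empty string because the text does
// not say which engine is chosen for them.
std::string EngineForFileName(const std::string &fileName)
{
    assert(!fileName.empty());
    if (EndsWith(fileName, ".bp"))
    {
        return "BP4";
    }
    if (EndsWith(fileName, ".h5"))
    {
        return "HDF5";
    }
    return "";
}
```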

These virtual engine names are used to select a specific engine and its parameters. In practice, after selecting the virtual engine name, one can modify the settings by adding or overwriting parameters. Eventually, a seasoned user would use the actual Engine names and parameterize them for the specific run.

These are the actual settings in ADIOS when a virtual engine is selected. The parameters below can be modified before the Open call.

  1. File. Refer to the parameter settings of the BP4, BP3 and HDF5 engines earlier in this section.

  2. FileStream. The engine is BP4. The parameters are set to:

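=============================  ============  ================================================
Key                            Value Format  Default and Examples
=============================  ============  ================================================
OpenTimeoutSecs                float         3600 (wait for up to an hour)
BeginStepPollingFrequencySecs  float         1 (poll the file system with 1 second frequency)
StreamReader                   bool          On (process metadata in streaming mode)
=============================  ============  ================================================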

  3. InSituAnalysis. The engine is SST. The parameters are set to:

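===========================  ============  ===============================================
Key                          Value Format  Default and Examples
===========================  ============  ===============================================
RendezvousReaderCount        integer       1 (Producer waits for the Consumer in Open)
QueueLimit                   integer       1 (only buffer one step)
QueueFullPolicy              string        Block (wait for the Consumer to get every step)
FirstTimestepPrecious        bool          false (SST default)
AlwaysProvideLatestTimestep  bool          false (SST default)
===========================  ============  ===============================================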

  4. InSituVisualization. The engine is SST. The parameters are set to:

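===========================  ============  ===================================================
Key                          Value Format  Default and Examples
===========================  ============  ===================================================
RendezvousReaderCount        integer       0 (Producer does NOT wait for Consumer in Open)
QueueLimit                   integer       3 (buffer first step + last two steps)
QueueFullPolicy              string        Discard (slow Consumer will miss out on steps)
FirstTimestepPrecious        bool          true (First step is kept around for late Consumers)
AlwaysProvideLatestTimestep  bool          false (SST default)
===========================  ============  ===================================================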

  5. CodeCoupling. The engine is SST. The parameters are set to:

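===========================  ============  ===============================================
Key                          Value Format  Default and Examples
===========================  ============  ===============================================
RendezvousReaderCount        integer       1 (Producer waits for the Consumer in Open)
QueueLimit                   integer       1 (only buffer one step)
QueueFullPolicy              string        Block (wait for the Consumer to get every step)
FirstTimestepPrecious        bool          false (SST default)
AlwaysProvideLatestTimestep  bool          false (SST default)
===========================  ============  ===============================================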
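
The settings listed for the virtual engine names can be summarized as a lookup from virtual name to actual engine and parameters. The sketch below is illustrative only: EngineSetup and ResolveVirtualEngine are hypothetical names, not ADIOS2 API, and the File case is omitted because its engine depends on the file name.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Same shape as adios2::Params (a string-to-string map).
using Params = std::map<std::string, std::string>;

// Hypothetical pairing of an actual engine name with its preset parameters.
struct EngineSetup
{
    std::string engine;
    Params params;
};

// Hypothetical lookup reproducing the settings listed in this section.
EngineSetup ResolveVirtualEngine(const std::string &name)
{
    assert(!name.empty());
    if (name == "FileStream")
    {
        return {"BP4",
                {{"OpenTimeoutSecs", "3600"},
                 {"BeginStepPollingFrequencySecs", "1"},
                 {"StreamReader", "On"}}};
    }
    if (name == "InSituAnalysis" || name == "CodeCoupling")
    {
        // Blocking stream: every step must be consumed.
        return {"SST",
                {{"RendezvousReaderCount", "1"},
                 {"QueueLimit", "1"},
                 {"QueueFullPolicy", "Block"},
                 {"FirstTimestepPrecious", "false"},
                 {"AlwaysProvideLatestTimestep", "false"}}};
    }
    if (name == "InSituVisualization")
    {
        // Non-blocking stream: a slow Consumer misses steps.
        return {"SST",
                {{"RendezvousReaderCount", "0"},
                 {"QueueLimit", "3"},
                 {"QueueFullPolicy", "Discard"},
                 {"FirstTimestepPrecious", "true"},
                 {"AlwaysProvideLatestTimestep", "false"}}};
    }
    throw std::invalid_argument("not covered by this sketch: " + name);
}

// With ADIOS2 available, the equivalent explicit setup would be:
//   io.SetEngine(setup.engine);
//   io.SetParameters(setup.params);
```

Individual entries can then be overwritten before Open(), for example `setup.params["QueueLimit"] = "5";`, which corresponds to modifying a virtual engine's presets as described above.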