Interface Components

Components Overview

Note

If you are doing simple tasks where performance is a non-critical aspect please go to the High-Level APIs section for a quick start. If you are an HPC application developer or you want to use ADIOS2 functionality in full please read this chapter.

The simple way to understand the big picture for the ADIOS2 unified user interface components is to map each class to the actual definition of the ADIOS acronym.

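=========  =========  ==========================
Component  Acronym    Function
=========  =========  ==========================
ADIOS      ADaptable  Set MPI comm domain
                      Set runtime settings
                      Own other components
IO         I/O        Set engine
                      Set variables/attributes
                      Set compile-time settings
Engine     System     Execute heavy IO tasks
                      Manage system resources
=========  =========  ==========================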

ADIOS2’s public APIs use the natural idiom of each supported language to represent each ADIOS2 component and its interaction with application data types. Thus,

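==============  ========================  ===============================
Language        Component API             Application Data
==============  ========================  ===============================
C++ (11/newer)  objects/member functions  pointers/references/std::vector
C               handler/functions         pointers
Fortran         handler/subroutines       arrays up to 6D
Python          objects/member functions  numpy arrays
==============  ========================  ===============================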

The following section provides a common overview to all languages based on the C++11 APIs. For each specific language go to the Full APIs section, but it’s highly recommended to read this section as components map 1-to-1 in other languages.

The following figure depicts the components hierarchy from the application’s point of view.

Figure: https://i.imgur.com/y7bkQQt.png

  • ADIOS: the ADIOS component is the initial contact point between an application and the ADIOS2 library. Applications provide:
    1. the scope of the ADIOS object through the MPI communicator,

    2. an optional runtime configuration file (in XML format) to allow changing settings without recompiling.

    The ADIOS component serves as a factory of adaptable IO components. Each IO must have a unique name within the scope of the ADIOS class object that created it with the DeclareIO function.

  • IO: the IO component is the bridge between application-specific settings and transports. It also serves as a factory of:
    1. Variables

    2. Attributes

    3. Engines

  • Variable: Variables are the link between self-describing representation in the ADIOS2 library and data from applications. Variables are identified by unique names in the scope of the particular IO that created them. When the Engine API functions are called, a Variable must be provided along with the application data.

  • Attribute: Attributes add extra information to the overall variables dataset defined in the IO class. They can be single or array values.

  • Engine: Engines define the actual system executing the heavy IO tasks at Open, BeginStep, Put, Get, EndStep and Close. Due to polymorphism, new IO system solutions can be developed quickly reusing internal components and reusing the same API. If IO.SetEngine is not called, the default engine is the binary-pack bp file reader and writer: BPFile.

  • Operator: (under development) this component defines possible operations to be applied on adios2 self-describing data. This higher level abstraction is needed to provide support for: Callback functions, Transforms, Analytics functions, Data Models functionality, etc. Any required task will be executed within the Engine. One or many operators can be associated with any of the adios2 objects or a group of them.

ADIOS

The adios2::ADIOS component is the initial contact point between an application and the ADIOS2 library. Applications can be classified as MPI and non-MPI based. We start by focusing on MPI applications, as their non-MPI equivalent just removes the MPI-specific communicator.

/** ADIOS class factory of IO class objects */
adios2::ADIOS adios("config.xml", MPI_COMM_WORLD);

This component is created by passing:

  1. Runtime config file (optional): ADIOS2 xml runtime config file, see Runtime Configuration Files.

  2. MPI communicator: determines the scope of the ADIOS library components in an application.

  3. Debug mode flag: turns on/off exceptions triggered by user inputs.

Note

Debug mode is deprecated and will not affect ADIOS2 behavior. Unexpected system failures and runtime errors are always checked by throwing std::runtime_error. Keep in mind that Segmentation Faults are NOT runtime exceptions. We try to keep user interactions as friendly as possible; please report any bugs on GitHub: https://github.com/ornladios/ADIOS2/issues

adios2::ADIOS objects can be created in MPI and non-MPI (serial) mode. Optionally, a runtime configuration file can be passed to the constructor by giving its full file path, name and extension. This results in the following constructors:

Constructors for MPI applications

/** Constructors */

// version that accepts an optional runtime adios2 config file
adios2::ADIOS(const std::string configFile,
              MPI_Comm mpiComm = MPI_COMM_SELF,
              const bool debugMode = adios2::DebugON);

adios2::ADIOS(MPI_Comm mpiComm = MPI_COMM_SELF,
              const bool debugMode = adios2::DebugON);

/** Examples */
adios2::ADIOS adios(MPI_COMM_WORLD);
adios2::ADIOS adios("config.xml", MPI_COMM_WORLD, adios2::DebugOFF);

Constructors for non-MPI (serial) applications

/** Constructors */
adios2::ADIOS (const std::string configFile,
               const bool debugMode = adios2::DebugON );

adios2::ADIOS(const bool debugMode = adios2::DebugON);

/** Examples */
adios2::ADIOS adios("config.xml");
adios2::ADIOS adios; // Do not use () for empty constructor.

Tip

adios2::DebugON and adios2::DebugOFF are aliases to true and false, respectively. Use them for code clarity.

Factory of IO components: Multiple IO components (IO tasks) can be created from within the scope of an ADIOS object by calling the DeclareIO function:

/** Signature */
adios2::IO ADIOS::DeclareIO(const std::string ioName);

/** Examples */
adios2::IO bpWriter = adios.DeclareIO("BPWriter");
adios2::IO bpReader = adios.DeclareIO("BPReader");

This function returns a reference to an existing IO class object that lives inside the ADIOS object that created it. The ioName identifier input must be unique for each IO. Trying to declare an IO object with the same name twice will throw an exception. IO names are used to identify IO components in the runtime configuration file; see Runtime Configuration Files.
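The unique-name rule can be pictured with a small stand-in factory. This is only a sketch built on the standard library: `MockADIOS`, `MockIO`, `Count`, and the choice of `std::invalid_argument` as the exception type are assumptions made for illustration, not the ADIOS2 implementation.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for an IO component; not the ADIOS2 class.
struct MockIO
{
    std::string name;
};

// Hypothetical stand-in for the ADIOS factory: it owns every IO it declares.
class MockADIOS
{
public:
    // Creates an IO under a unique name; a repeated name is rejected.
    MockIO &DeclareIO(const std::string &ioName)
    {
        auto result = m_IOs.emplace(ioName, MockIO{ioName});
        if (!result.second)
        {
            throw std::invalid_argument("IO " + ioName + " already declared");
        }
        return result.first->second;
    }

    std::size_t Count() const { return m_IOs.size(); }

private:
    // std::map keeps references to its elements valid across insertions,
    // so handles returned earlier stay usable while the factory is alive
    std::map<std::string, MockIO> m_IOs;
};
```

Because the factory owns the objects, the returned references are only meaningful while the factory itself is in scope, which mirrors the ownership note in this section.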

As shown in the diagram below, each resulting IO object is self-managed and independent, thus providing an adaptable way to perform different kinds of I/O operations. Users must be careful not to create conflicts between system level unique I/O identifiers: file names, IP address and port, MPI Send/Receive message rank and tag, etc.

[Diagram: ADIOS creates IO_1 ... IO_N, one per DeclareIO call]

Tip

The ADIOS component is the only one whose memory is owned by the application. Thus applications must decide on its scope. Any other component of the ADIOS2 API refers to a component that lives inside the ADIOS component (e.g. IO, Operator) or indirectly in the IO component (Variable, Engine).

IO

The IO component is the connection between how applications set up their input/output options by selecting an Engine and its specific parameters, subscribing variables to self-describe the data, and setting supported transport modes to a particular Engine. Think of IO as a control panel for all the user-defined parameters that specific applications would like to be able to fine tune. None of the IO operations are heavyweight until the Open function that generates an Engine is called. Its API allows for:

  • Being a factory of Variable and Attribute components containing self-describing information about the overall data in the input output process

  • Setting Engine-specific parameters and adding supported modes of transport

  • Being a factory of (same type) Engine to execute the actual IO tasks

Note

If two different engine types are needed (e.g. BPFile, SST) define two IOs. Also, at reading always define separate IOs to avoid name-clashing of Variables.

Setting a Particular Engine and its Parameters

Engines are the actual components executing the heavy operations in ADIOS2. Each IO must select a type of Engine depending on the application needs through the SetEngine function. The default is the BPFile engine, for writing and reading bp files, if SetEngine is not called.

/** Signature */
void adios2::IO::SetEngine( const std::string engineType );

/** Example */
bpIO.SetEngine("BPFile");

Each Engine allows the user to fine tune execution of buffering and output tasks through passing parameters to the IO object that will then propagate to the Engine. For a list of parameters allowed by each engine see Available Engines.

Note

adios2::Params is an alias to std::map<std::string,std::string> to pass parameters as key-value string pairs, which can be initialized with curly-brace initializer lists.

/** Signature */
/** Passing several parameters at once */
void SetParameters(const adios2::Params& parameters);
/** Passing one parameter key-value pair at a time */
void SetParameter(const std::string key, const std::string value);

/** Examples */
io.SetParameters( { {"Threads", "4"},
                    {"ProfilingUnits", "Milliseconds"},
                    {"MaxBufferSize","2Gb"},
                    {"BufferGrowthFactor", "1.5" },
                    {"FlushStepsCount", "5" }
                  } );
io.SetParameter( "Threads", "4" );
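Since parameters travel as string key-value pairs, numeric settings have to be written as strings and converted on the receiving side. The sketch below shows that idea with a local `Params` alias of the same shape as the one described above; `GetIntParameter` and `MakeExampleParameters` are hypothetical helpers, not ADIOS2 functions.

```cpp
#include <map>
#include <string>

// Local stand-in with the same shape as the alias described above.
using Params = std::map<std::string, std::string>;

// Reads an integer-valued parameter, falling back to a default when the key
// is absent. Hypothetical helper for illustration only.
int GetIntParameter(const Params &parameters, const std::string &key,
                    const int fallback)
{
    const auto it = parameters.find(key);
    return (it == parameters.end()) ? fallback : std::stoi(it->second);
}

// Builds a parameter set with a curly-brace initializer list.
Params MakeExampleParameters()
{
    Params parameters = {{"Threads", "4"}, {"FlushStepsCount", "5"}};
    // in a std::map, assigning to an existing key overwrites its value
    parameters["Threads"] = "8";
    return parameters;
}
```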

Adding Supported Transports with Parameters

The AddTransport function returns an unsigned int handler for each transport that can be used with the Engine Close function at different times. AddTransport must provide library specific settings that the low-level system library interface allows. These options are expected to become more complex as new modes of transport are allowed beyond files (e.g. RDMA).

/** Signature */
unsigned int AddTransport( const std::string transportType,
                           const adios2::Params& parameters );

/** Examples */
const unsigned int file1 = io.AddTransport( "File",
                                            { {"Library", "fstream"},
                                              {"Name","file1.bp" }
                                            } );

const unsigned int file2 = io.AddTransport( "File",
                                            { {"Library", "POSIX"},
                                              {"Name","file2.bp" }
                                            } );

const unsigned int wan = io.AddTransport( "WAN",
                                          { {"Library", "Zmq"},
                                            {"IP","127.0.0.1" },
                                            {"Port","80"}
                                          } );

Defining, Inquiring and Removing Variables and Attributes

The template function DefineVariable<T> allows subscribing self-describing data into ADIOS2 by returning a reference to a Variable class object whose scope is the same as the IO object that created it. The user must provide a unique name (among Variables), the dimensions (MPI global: shape; MPI local: start and count), optionally a flag indicating that dimensions are known to be constant, and a data pointer if defined in the application. Note: actual data is not passed at this stage. This is done by the Engine functions Put/Get for Variables. See the Variable section for supported types and shapes.

Tip

adios2::Dims is an alias to std::vector<std::size_t>, while adios2::ConstantDims is an alias to bool true. Use them for code clarity.

/** Signature */
adios2::Variable<T>
    DefineVariable<T>(const std::string name,
                      const adios2::Dims &shape = {}, // Shape of global object
                      const adios2::Dims &start = {}, // Where to begin writing
                      const adios2::Dims &count = {}, // Local extent (elements) to write
                      const bool constantDims = false);

/** Example */
/** global array of floats with constant dimensions */
adios2::Variable<float> varFloats =
    io.DefineVariable<float>("bpFloats",
                             {size * Nx},
                             {rank * Nx},
                             {Nx},
                             adios2::ConstantDims);
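The three dimension arguments in the example follow directly from the MPI decomposition. As a minimal sketch, assuming an even 1D split in which every rank owns Nx contiguous elements, they can be computed as follows; `Block1D` and `Decompose1D` are hypothetical names, and `Dims` is a local alias with the same shape as adios2::Dims.

```cpp
#include <cstddef>
#include <vector>

using Dims = std::vector<std::size_t>; // same shape as the adios2::Dims alias

// Dimensions of one rank's block in a 1D global array (hypothetical helper).
struct Block1D
{
    Dims shape; // global size across all ranks
    Dims start; // offset of this rank's block inside the global array
    Dims count; // number of elements owned by this rank
};

// Even decomposition: every rank owns Nx contiguous elements.
Block1D Decompose1D(const std::size_t rank, const std::size_t size,
                    const std::size_t Nx)
{
    return Block1D{{size * Nx}, {rank * Nx}, {Nx}};
}
```

For rank 2 of 4 with Nx = 10, this yields shape {40}, start {20} and count {10}, so each block satisfies start + count <= shape.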

Attributes are extra information associated with the current IO object. The function DefineAttribute<T> allows for defining single value and array attributes. Keep in mind that Attributes apply to all Engines created by the IO object and, unlike Variables which are passed to each Engine explicitly, their definition contains their actual data.

/** Signatures */

/** Single value */
adios2::Attribute<T> DefineAttribute(const std::string &name,
                              const T &value);

/** Arrays */
adios2::Attribute<T> DefineAttribute(const std::string &name,
                              const T *array,
                              const size_t elements);

When a variable or attribute has been previously defined but is no longer at hand, either because 1) the variable/attribute reference went out of scope, or 2) it is being read from an incoming stream, IO can inquire the current variables and attributes and return a handle acting as a reference. If the inquired variable/attribute is not found, the returned handle is empty (equivalent to nullptr) and evaluates to false.

/** Signature */
adios2::Variable<T> InquireVariable<T>(const std::string &name) noexcept;
adios2::Attribute<T> InquireAttribute<T>(const std::string &name) noexcept;

/** Example */
adios2::Variable<float> varPressure = io.InquireVariable<float>("pressure");
if( varPressure ) // it exists
{
  ...
}

Note

The reason for returning a pointer when inquiring, unlike references when defining, is that nullptr is a valid state (e.g. a variable hasn’t arrived in a stream, wasn’t previously defined, or wasn’t written to a file).

Always check for nullptr in the pointer returned by InquireVariable<T> or InquireAttribute<T>.

Caution

Since the Inquire functions are templates, both the name and the type must match the variable/attribute you are looking for.
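The name-and-type rule can be illustrated with a small typed registry. This is a sketch under stated assumptions: `Registry`, `Handle`, and their members are hypothetical stand-ins written with the standard library, and they only mimic the lookup behavior described above (an empty result when either the name or the type does not match).

```cpp
#include <map>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical handle: empty when the lookup failed, like a wrapped nullptr.
template <class T>
class Handle
{
public:
    explicit Handle(const std::string *name = nullptr) : m_Name(name) {}
    explicit operator bool() const noexcept { return m_Name != nullptr; }
    const std::string &Name() const { return *m_Name; }

private:
    const std::string *m_Name;
};

// Hypothetical registry keyed by name that also remembers each entry's type.
class Registry
{
public:
    // Registers name with type T; a repeated name is ignored in this sketch.
    template <class T>
    void Define(const std::string &name)
    {
        m_Types.emplace(name, std::type_index(typeid(T)));
    }

    // Returns an empty handle unless both the name and the type T match.
    template <class T>
    Handle<T> Inquire(const std::string &name) const
    {
        const auto it = m_Types.find(name);
        if (it == m_Types.end() || it->second != std::type_index(typeid(T)))
        {
            return Handle<T>();
        }
        return Handle<T>(&it->first);
    }

private:
    std::map<std::string, std::type_index> m_Types;
};
```

With "pressure" defined as float, inquiring it as double gives an empty handle even though the name exists, which is why the result should always be checked before use.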

Removing Variables and Attributes can be done one at a time or by removing all existing variables or attributes in IO.

/** Signature */
bool IO::RemoveVariable(const std::string &name) noexcept;
void IO::RemoveAllVariables( ) noexcept;

bool IO::RemoveAttribute(const std::string &name) noexcept;
void IO::RemoveAllAttributes( ) noexcept;

Caution

Remove functions must be used with caution as they generate dangling Variable/Attributes pointers or references if they didn’t go out of scope.

Tip

It is good practice to check the bool flag returned by RemoveVariable or RemoveAttribute.

Opening an Engine

The IO::Open function creates a new derived object of the abstract Engine class and returns a reference handler to the user. A particular Engine type is set to the current IO component with the IO::SetEngine function. Engine polymorphism is handled internally by the IO class, which allows subscribing future derived Engine types without changing the basic API.

Note

Currently only adios2::Mode::Write and adios2::Mode::Read are supported; adios2::Mode::Append is under development.

/** Signatures */
/** Provide a new MPI communicator other than from ADIOS->IO->Engine */
adios2::Engine &adios2::IO::Open( const std::string &name,
                                  const adios2::Mode mode,
                                  MPI_Comm mpiComm );

/** Reuse the MPI communicator from ADIOS->IO->Engine, or non-MPI serial mode */
adios2::Engine &adios2::IO::Open(const std::string &name,
                                 const adios2::Mode mode);


/** Examples */

/** Engine derived class, spawned to start Write operations */
adios2::Engine bpWriter = io.Open("myVector.bp", adios2::Mode::Write);

if(bpWriter) // good practice
{
  ...
}


/** Engine derived class, spawned to start Read operations on rank 0 */
if( rank == 0 )
{
    adios2::Engine bpReader = io.Open("myVector.bp",
                                       adios2::Mode::Read,
                                       MPI_COMM_SELF);
    if(bpReader) // good practice
    {
     ...
    }
}

Tip

It is good practice to always check the validity of each ADIOS object before operating on it using the explicit bool operator. if( engine ){ }

Caution

Always pass MPI_COMM_SELF if an Engine lives in only one MPI process. Open and Close are collective operations.

Variable

Self-describing Variables are the atomic unit of data representation in the ADIOS2 library when interacting with applications. Thus, the Variable component is the link between a piece of data coming from an application and its self-describing information or metadata. This component handles all application variables classified by data type and shape type.

Each IO holds its own set of Variables, and each Variable is identified by a unique name. They are created using the reference from IO::DefineVariable<T> or retrieved using the pointer from the IO::InquireVariable<T> function.

Variables Data Types

Currently, only primitive types are supported in ADIOS 2. Fixed-width types from <cinttypes> and <cstdint> should be preferred when writing portable code. ADIOS 2 maps primitive “natural” types to their equivalent fixed-width types (e.g. int -> int32_t). Acceptable values for the type T in Variable<T> (this is C++ only, see below for other bindings), along with their preferred fixed-width equivalents on 64-bit platforms:

Data types supported by ADIOS2 Variable<T>

std::string (only used for global and local values, not arrays)
char                      -> int8_t or uint8_t depending on compiler flags
signed char               -> int8_t
unsigned char             -> uint8_t
short                     -> int16_t
unsigned short            -> uint16_t
int                       -> int32_t
unsigned int              -> uint32_t
long int                  -> int32_t or int64_t (Linux)
long long int             -> int64_t
unsigned long int         -> uint32_t or uint64_t (Linux)
unsigned long long int    -> uint64_t
float                     -> always 32-bit = 4 bytes
double                    -> always 64-bit = 8 bytes
long double               -> platform dependent
std::complex<float>       -> always  64-bit = 8 bytes = 2 * float
std::complex<double>      -> always 128-bit = 16 bytes = 2 * double

Tip

It’s recommended to be consistent when using types for portability. If data is defined as a fixed-width integer, define variables in ADIOS2 using a fixed-width type, e.g. for int32_t data types use DefineVariable<int32_t>.
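The portability concern can be checked at compile time. The sketch below is not part of ADIOS2; it only uses the standard library to show which sizes are fixed and which depend on the platform. `LongWidthInBits` is a hypothetical helper, and the 4-byte and 8-byte assertions assume the usual 8-bit byte.

```cpp
#include <climits>
#include <complex>
#include <cstdint>

// Fixed-width types have the same size wherever they are provided
// (assuming 8-bit bytes, which holds on mainstream platforms).
static_assert(sizeof(std::int32_t) == 4, "int32_t is 4 bytes");
static_assert(sizeof(std::uint64_t) == 8, "uint64_t is 8 bytes");

// std::complex<T> is laid out as two consecutive T values.
static_assert(sizeof(std::complex<float>) == 2 * sizeof(float),
              "complex<float> = 2 * float");
static_assert(sizeof(std::complex<double>) == 2 * sizeof(double),
              "complex<double> = 2 * double");

// "Natural" types vary: long is narrower on some platforms than on others.
// Hypothetical helper reporting the width of long on the current platform.
inline int LongWidthInBits()
{
    return static_cast<int>(sizeof(long) * CHAR_BIT);
}
```

Code that stores a long on one platform and reads it as a long on another may therefore disagree on the width, which is the reason for preferring the fixed-width names.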

Note

C, Fortran APIs: the enum and parameter adios2_type_XXX only provides fixed-width types

Note

Python APIs: use the equivalent fixed-width types from numpy. If dtype is not specified, ADIOS 2 would handle numpy defaults just fine as long as primitive types are passed.

Variables Shape Types

Note

As of beta release version 2.2.0, local variable reads are not yet supported. This is work in progress. Please use global arrays and values as a workaround.

ADIOS2 is designed out-of-the-box for MPI applications. Thus different application data shape types must be covered depending on their scope within a particular MPI communicator. The shape type is defined at creation from the IO object by providing the dimensions: shape, start, count in the IO::DefineVariable<T> template function. The supported Variables by shape types can be classified as:

  1. Global Single Value: only the name is required in their definition. These variables are helpful for storing global information, preferably managed by only one MPI process, that may or may not change over steps: e.g. total number of particles, collective norm, number of nodes/cells, etc.

    if( rank == 0 )
    {
       adios2::Variable<unsigned int> varNodes = io.DefineVariable<unsigned int>("Nodes");
       adios2::Variable<std::string> varFlag = io.DefineVariable<std::string>("Nodes flag");
       // ...
       engine.Put( varNodes, nodes );
       engine.Put( varFlag, "increased" );
       // ...
    }
    

    Note

    Variables of type string are defined just like global single values. In the current adios2 version multidimensional strings are supported for fixed size strings through variables of type char.

  2. Global Array: the most common shape used for storing self-describing data used for analysis that lives in several MPI processes. The image below illustrates the definitions of the dimension components in a global array: shape, start, and count.

    Figure: https://i.imgur.com/MKwNe5e.png

    Warning

    Be aware of data ordering in your language of choice (Row-Major or Column-Major) as depicted in the above figure. Data decomposition is done by the application based on their requirements, not by adios2.

    Start and Count local dimensions can be later modified with the Variable::SetSelection function if it is not a constant dimensions variable.

  3. Local Value: single value-per-rank variables that are local to the MPI process. They are defined by passing the adios2::LocalValueDim enum as follows:

    adios2::Variable<int> varProcessID =
          io.DefineVariable<int>("ProcessID", {adios2::LocalValueDim});
    //...
    engine.Put<int>(varProcessID, rank);
    
  4. Local Array: single array variables that are local to the MPI process. These are more commonly used to write Checkpoint data, that is later read for Restart. Reading, however, needs to be handled differently: each process’ array has to be read separately, using SetSelection per rank. The size of each process selection should be discovered by the reading application by inquiring per-block size information of the variable, and allocate memory accordingly.

    Figure: https://i.imgur.com/XLh2TUG.png

  5. Joined Array (NOT YET SUPPORTED): in certain circumstances every process has an array that is different only in one dimension. ADIOS2 allows users to present them as a global array by joining the arrays together. For example, if every process has a table with a different number of rows, and one does not want to do a global communication to calculate the offsets in the global table, one can just write the local arrays and let ADIOS2 calculate the offsets at read time (when all sizes are known by any process).

    adios2::Variable<double> varTable = io.DefineVariable<double>(
          "table", {adios2::JoinedDim, Ncolumns}, {}, {Nrows, Ncolumns});
    

    Note

    Only one dimension can be joinable, every other dimension must be the same on each process.
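To make the offset calculation for joined arrays concrete: once the row count of every process is known, each local table's position in the joined table is an exclusive prefix sum of those counts. The sketch below shows only that arithmetic; `JoinedRowOffsets` is a hypothetical helper and says nothing about how ADIOS2 would implement the feature.

```cpp
#include <cstddef>
#include <vector>

// Given the number of rows each process wrote, computes where each local
// table starts in the joined global table (an exclusive prefix sum).
// Hypothetical helper for illustration only.
std::vector<std::size_t>
JoinedRowOffsets(const std::vector<std::size_t> &rowsPerProcess)
{
    std::vector<std::size_t> offsets(rowsPerProcess.size(), 0);
    std::size_t total = 0;
    for (std::size_t p = 0; p < rowsPerProcess.size(); ++p)
    {
        offsets[p] = total;         // rows written by all earlier processes
        total += rowsPerProcess[p]; // running size of the joined dimension
    }
    return offsets;
}
```

For three processes with 3, 5 and 2 rows, the local tables start at rows 0, 3 and 8 of a joined table with 10 rows.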

Note

Constants are not handled separately from step-varying values in ADIOS2. Simply write them only once from one rank.

Attribute

Attributes are extra information associated with a particular IO component. They can be thought of as a very simplified version of a Variable, but with the goal of adding extra metadata. The most common use is the addition of human-readable information available when producing data (e.g. "experiment name", "date and time", "04,27,2017", or a schema).

Currently, ADIOS2 supports single values and arrays of primitive types (excluding complex<T>) for the template type in the IO::DefineAttribute<T> and IO::InquireAttribute<T> functions (in C++).

Data types Attributes supported by ADIOS2:

std::string
char
signed char
unsigned char
short
unsigned short
int
unsigned int
long int
long long int
unsigned long int
unsigned long long int
float
double
long double

The returned object (DefineAttribute or InquireAttribute) only serves the purpose to inspect the current Attribute<T> information within code.

Engine

The Engine abstraction component serves as the base interface to the actual IO Systems executing the heavy-load tasks performed when Producing and Consuming data.

Engine functionality works around two concepts from the application’s point-of-view:

  1. Self-describing variables are published and consumed in “steps” in either “File” random-access (all steps are available) or “Streaming” (steps are available as they are produced in a step-by-step fashion).

  2. Self-describing variables are published (Put) and consumed (Get) using a “sync” or “deferred” (lazy evaluation) policy.

Caution

The ADIOS 2 “step” is a logical abstraction that means different things depending on the application context. Examples: “time step”, “iteration step”, “inner loop step”, or “interpolation step”, “variable section”, etc. It only indicates how the variables were passed into ADIOS 2 (e.g. I/O steps) without the user having to index this information on their own.

Tip

Publishing/Consuming data can be seen as a round-trip in ADIOS 2. Put and Get APIs for write/append and read modes aim to be “symmetric”. Hence, reusing similar functions, objects, semantics as much as possible.

The rest of the section explains the important concepts.

BeginStep

Begins a logical step and returns the status (via an enum) of the stream to be read/written. In streaming engines, BeginStep is where the receiver tries to acquire a new step in the reading process. The full signature allows for mode and timeout parameters. See Supported Engines for more information on what each engine allows. A simplified signature allows each engine to pick reasonable defaults.

// Full signature
StepStatus BeginStep(const StepMode mode,
                     const float timeoutSeconds = -1.f);

// Simplified signature
StepStatus BeginStep();

EndStep

Ends the logical step and flushes to transports, depending on IO parameters and engine default behavior.

Tip

To write portable code for a step-by-step access across adios2 engines (file and streaming engines) use BeginStep and EndStep.

Danger

Accessing random steps in read mode (e.g. Variable<T>::SetStepSelection in file engines) will create a conflict with BeginStep and EndStep and will throw an exception. In file engines, data is either consumed in a random-access or step-by-step mode, but not both.
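The step-by-step access pattern has the same loop shape regardless of engine: acquire a step, consume it, release it, and stop when no step is available. The following sketch mimics that loop with local stand-ins. `MockReader`, its `Get`, and `SumAllSteps` are hypothetical, and the enumerator names `OK` and `EndOfStream` are assumptions, since the text above only says that the status is returned via an enum.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Local stand-in for the status enum; enumerator names are assumed.
enum class StepStatus
{
    OK,
    EndOfStream
};

// Hypothetical reader that serves pre-recorded steps one at a time.
class MockReader
{
public:
    explicit MockReader(std::vector<int> steps) : m_Steps(std::move(steps)) {}

    // Acquires the next step, or reports that the stream has ended.
    StepStatus BeginStep() const
    {
        return (m_Next < m_Steps.size()) ? StepStatus::OK
                                         : StepStatus::EndOfStream;
    }

    // Value stored in the current step; valid only after BeginStep is OK.
    int Get() const { return m_Steps[m_Next]; }

    // Releases the current step so the next BeginStep can advance.
    void EndStep() { ++m_Next; }

private:
    std::vector<int> m_Steps;
    std::size_t m_Next = 0;
};

// Step-by-step consumption: loop until BeginStep stops returning OK.
int SumAllSteps(MockReader &reader)
{
    int sum = 0;
    while (reader.BeginStep() == StepStatus::OK)
    {
        sum += reader.Get();
        reader.EndStep();
    }
    return sum;
}
```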

Close

Close current engine and underlying transports. Engine object can’t be used after this call.

Tip

C++11: despite the fact that we use RAII, always use Close for explicit resource management and guarantee that the Engine data transport operations are concluded.

Put: modes and memory contracts

Put is the generalized abstract function for publishing data in adios2 when an Engine is created using Write, or Append, mode at IO::Open.

The most common signature passes a Variable<T> object for the metadata, a const piece of contiguous memory for the data, and a mode: either Deferred (data is collected until EndStep/PerformPuts/Close) or Sync (data is reusable immediately).

  1. Deferred (default) or Sync mode, data is contiguous memory

    void Put(Variable<T> variable, const T* data, const adios2::Mode = adios2::Mode::Deferred);
    

Optionally, adios2 Engines can provide direct access to their buffer memory using an overload that returns a piece of memory for a variable block, which is essentially zero-copy. Variable<T>::Span is based on a subset of the upcoming C++20 std::span, a non-owning, typed view of contiguous memory (it helps to review what std::span is, formerly known as array_view). Spans act as a 1D memory container meant to be filled out by the application. A span can be used safely like any other STL sequence container, with iterators begin() and end(), operator[] and at(), while also providing data() and size() functions to manipulate the internal pointer.

Variable<T>::Span is helpful in situations in which temporaries are needed to create contiguous pieces of memory from non-contiguous pieces (e.g. tables, arrays without ghost-cells), or just to save memory as the returned Variable<T>::Span can be used for computation, thus avoiding an extra copy from user memory into the adios buffer. Variable<T>::Span combines a hybrid Sync and Deferred mode, in which the initial value and memory allocations are Sync, while data population and metadata collection are done at EndStep/PerformPuts/Close. Memory contracts are explained later in this chapter followed by examples.

The following Variable<T>::Span signatures are available:

  1. Return a span setting a default T() value into a default buffer

    Variable<T>::Span Put(Variable<T> variable);
    
  2. Return a span setting an initial fill value into a certain buffer. If span is not returned then the fillValue is fixed for that block.

    Variable<T>::Span Put(Variable<T> variable, const size_t bufferID, const T fillValue);
    

Warning

As of version 2.4.0 only the default BP3 engine using the C++11 bindings supports Variable<T>::Span Put signatures. We plan to support this feature and add this to streaming Engines.

In summary, the following are the current Put signatures for publishing data in ADIOS 2:

  1. Deferred (default) or Sync mode, data is contiguous memory put in an adios2 buffer

    void Put(Variable<T> variable, const T* data, const adios2::Mode = adios2::Mode::Deferred);
    
  2. Return a span setting a default T() value into a default adios2 buffer. If span is not returned then the default T() is fixed for that block (e.g. zeros).

    Variable<T>::Span Put(Variable<T> variable);
    
  3. Return a span setting an initial fill value into a certain buffer. If span is not returned then the fillValue is fixed for that block.

    Variable<T>::Span Put(Variable<T> variable, const size_t bufferID, const T fillValue);
    

The following table summarizes the memory contracts required by adios2 engines between Put signatures and the data memory coming from an application:

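========  ===========  ==================================================
Put       Data Memory  Contract
========  ===========  ==================================================
Deferred  Pointer      do not modify until PerformPuts/EndStep/Close
          Contents     consumed at PerformPuts/EndStep/Close
Sync      Pointer      modify after Put
          Contents     consumed at Put
Span      Pointer      modified by new Spans, updated span iterators/data
          Contents     consumed at PerformPuts/EndStep/Close
========  ===========  ==================================================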

Note

In Fortran (array) and Python (numpy array) avoid operations that modify the internal structure of an array (size) to preserve the address.

Each Engine gives a concrete meaning to each function signature, but all of them must follow the same memory contracts for the “data pointer” (the memory address itself) and the “data contents” (the memory bits, i.e. the values).

  1. Put in Deferred or lazy evaluation mode (default): this is the preferred mode as it allows Put calls to be “grouped” before potential data transport at the first encounter of PerformPuts, EndStep or Close.

    Put(variable, data);
    Put(variable, data, adios2::Mode::Deferred);
    

    Deferred memory contracts:

    • “data pointer” do not modify (e.g. resize) until first call to PerformPuts, EndStep or Close.

    • “data contents” consumed at first call to PerformPuts, EndStep or Close. It’s recommended practice to set all data contents before Put.

    Usage:

    // recommended use:
    // set "data pointer" and "data contents"
    // before Put
    data[0] = 10;
    
    // Puts data pointer into adios2 engine
    // associated with current variable metadata
    engine.Put(variable, data);
    
    // valid but not recommended
    // risk of changing "data pointer" (e.g. resize)
    data[1] = 10;
    
    // "data contents" must be ready
    // "data pointer" must be the same as in Put
    engine.EndStep();
    //engine.PerformPuts();
    //engine.Close();
    
    // now data pointer can be reused or modified
    

    Tip

    It’s recommended practice to set all data contents before Put in deferred mode to minimize the risk of modifying the data pointer (not just the contents) before PerformPuts/EndStep/Close.

  2. Put in Sync mode: this is the special case in which the data pointer becomes reusable right after Put. Only use it if absolutely necessary (e.g. a memory-bound application, or out-of-scope or temporary data).

    Put(variable, data, adios2::Mode::Sync);
    

    Sync memory contracts:

    • “data pointer” and “data contents” can be modified after this call.

    Usage:

    // set "data pointer" and "data contents"
    // before Put in Sync mode
    data[0] = 10;
    
    // Puts data pointer into adios2 engine
    // associated with current variable metadata
    engine.Put(variable, data, adios2::Mode::Sync);
    
    // data pointer and contents can be reused
    // in application

  3. Put returning a Span: signature that allows access to adios2 internal buffer.

    Use cases:

    • population from non-contiguous memory structures

    • memory-bound applications

    Limitations:

    • does not allow operations (compression)

    • must keep engine and variables within scope of span usage

    Span memory contracts:

    • “data pointer”: provided by the engine and returned by span.data(). It might change when a new span is created, following the iterator invalidation rules of std::vector. Use span.data() or the iterators span.begin() and span.end() to always obtain an up-to-date data pointer.

    • “data contents”: published at the first call to PerformPuts, EndStep or Close.

    Usage:

    // return a span into a block of memory
    // set memory to default T()
    adios2::Variable<int32_t>::Span span1 = engine.Put(var1);
    
    // just like with std::vector::data()
    // iterator invalidation rules
    // dataPtr might become invalid
    // always use span1.data() directly
    int32_t* dataPtr = span1.data();
    
    // set memory value to -1 in buffer 0
    adios2::Variable<float>::Span span2 = engine.Put(var2, 0, -1);
    
    // not returning a span just sets a constant value
    engine.Put(var3);
    engine.Put(var4, 0, 2);
    
    // fill span1
    span1[0] = 0;
    span1[1] = 1;
    span1[2] = 2;
    
    // fill span2
    span2[1] = 1;
    span2[2] = 2;
    
    // here collect all spans
    // they become invalid
    engine.EndStep();
    //engine.PerformPuts();
    //engine.Close();
    
    // var1 = { 0, 1, 2 };
    // var2 = { -1., 1., 2.};
    // var3 = { 0, 0, 0};
    // var4 = { 2, 2, 2};
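
The span pointer rule above mirrors what happens with a plain std::vector. The following sketch uses only the standard library, not ADIOS2, and the function names are hypothetical. It assumes that growing a vector past its capacity is a fair analogy for an engine buffer growing when a new span is created: a pointer cached before the growth no longer refers to the storage, while a pointer requested again afterwards does.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns true if growing the vector past its capacity moved its storage.
// Addresses are compared as integers so no stale pointer is dereferenced.
bool StorageMovedAfterGrowth()
{
    std::vector<int> buffer(3, 0);
    const auto before = reinterpret_cast<std::uintptr_t>(buffer.data());

    // Force a reallocation, used here as an analogy for a new span
    // making the engine buffer grow.
    buffer.resize(buffer.capacity() + 1);

    const auto after = reinterpret_cast<std::uintptr_t>(buffer.data());
    return before != after;
}

// Safe pattern: request the pointer again right before using it,
// as recommended for span.data().
int WriteThroughFreshPointer()
{
    std::vector<int> buffer(3, 0);
    buffer.resize(buffer.capacity() + 1); // storage may have moved
    int *fresh = buffer.data();           // re-query after the growth
    fresh[0] = 42;
    return buffer[0];
}
```

In application code the same habit translates to calling span.data(), span.begin() or span.end() at the point of use instead of keeping a raw pointer across the creation of other spans.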
    

PerformPuts

Executes all pending Put calls in deferred mode and collects span data.
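
The deferred contract can be pictured with a toy model. The sketch below is not the ADIOS2 implementation: ToyDeferredEngine and DeferredDemo are hypothetical names, and the class only records the address given to Put and copies the values when PerformPuts runs. Under that assumption it shows why contents changed after Put are still picked up, and why the pointer has to stay valid until then.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an engine (NOT ADIOS2 code): Put stores the address,
// PerformPuts reads the values behind every stored address.
class ToyDeferredEngine
{
public:
    // Deferred Put: remember where the data lives, copy nothing yet.
    void Put(const double *data, std::size_t count)
    {
        m_Pending.push_back({data, count});
    }

    // Consume all pending requests: "data contents" are read here.
    void PerformPuts()
    {
        for (const Request &request : m_Pending)
        {
            m_Written.insert(m_Written.end(), request.data,
                             request.data + request.count);
        }
        m_Pending.clear();
    }

    const std::vector<double> &Written() const { return m_Written; }

private:
    struct Request
    {
        const double *data;
        std::size_t count;
    };
    std::vector<Request> m_Pending;
    std::vector<double> m_Written;
};

// Returns what the toy engine "wrote" when a value changes after Put.
std::vector<double> DeferredDemo()
{
    std::vector<double> data = {1.0, 2.0, 3.0};
    ToyDeferredEngine engine;

    engine.Put(data.data(), data.size());
    data[1] = 10.0;       // contents changed after Put: still picked up
    // data.resize(1000); // would move the buffer and leave a dangling pointer

    engine.PerformPuts(); // contents are consumed at this point
    return engine.Written();
}
```

Calling DeferredDemo() yields the values 1, 10, 3: the modification made between Put and PerformPuts ends up in the output, which is why setting all contents before Put is the recommended practice.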

Get: modes and memory contracts

Get is the generalized abstract function for consuming data in adios2 when an Engine is created using Read mode at IO::Open. The semantics of the ADIOS2 Put and Get APIs are as symmetric as possible, considering that they are opposite operations (e.g. Put passes a const T*, while Get populates a non-const T*).

The following are the current Get signatures:

  1. Deferred (default) or Sync mode, data is contiguous pre-allocated memory

    Get(Variable<T> variable, T* data, const adios2::Mode = adios2::Mode::Deferred);
    
  2. C++11 only, dataV is automatically resized by adios2 based on Variable selection

    Get(Variable<T> variable, std::vector<T>& dataV, const adios2::Mode = adios2::Mode::Deferred);
    

The following table summarizes the memory contracts required by adios2 engines between Get signatures and the pre-allocated (except when using C++11 std::vector) data memory coming from an application:

Get       Data Memory   Contract
--------  ------------  ----------------------------------------------
Deferred  Pointer       do not modify until PerformGets/EndStep/Close
          Contents      populated at PerformGets/EndStep/Close
Sync      Pointer       modify after Get
          Contents      populated at Get

  1. Get in Deferred or lazy evaluation mode (default): this is the preferred mode as it allows Get calls to be “grouped” before potential data transport at the first encounter of PerformGets, EndStep or Close.

    Get(variable, data);
    Get(variable, data, adios2::Mode::Deferred);
    

    Deferred memory contracts:

    • “data pointer”: do not modify (e.g. resize) until the first call to PerformGets, EndStep or Close.

    • “data contents”: populated at the first call to PerformGets, EndStep or Close.

    Usage:

    std::vector<double> data;
    
    // resize memory to expected size
    data.resize(varBlockSize);
    // valid if all memory is populated
    // data.reserve(varBlockSize);
    
    // Gets data pointer to adios2 engine
    // associated with current variable metadata
    engine.Get(variable, data.data() );
    
    // optionally pass data std::vector
    // leave resize to adios2
    //engine.Get(variable, data);
    
    // "data pointer" must be the same as in Get
    // "data contents" are populated here
    engine.EndStep();
    //engine.PerformGets();
    //engine.Close();
    
    // now data pointer can be reused or modified
    

    Caution

    Use uninitialized memory at your own risk (e.g. vector reserve, new, malloc). Accessing uninitialized memory is undefined behavior.

  2. Get in Sync mode: this is the special case in which the data pointer becomes reusable right after Get. Only use it if absolutely necessary (e.g. a memory-bound application, or out-of-scope or temporary data).

    Get(variable, data, adios2::Mode::Sync);
    

    Sync memory contracts:

    • “data pointer” and “data contents” can be modified after this call.

    Usage:

    std::vector<double> data;
    
    // resize memory to expected size
    data.resize(varBlockSize);
    // valid if all memory is populated
    // data.reserve(varBlockSize);
    
    // Gets data pointer to adios2 engine
    // associated with current variable metadata
    engine.Get(variable, data.data(), adios2::Mode::Sync);
    
    // "data contents" are ready
    // "data pointer" can be reused by the application

Note

As of v2.4 Get doesn’t support returning spans. This is future work required in streaming engines if the application wants a non-owning view into the data buffer for a particular variable block.

PerformGets

Executes all pending Get calls in deferred mode.

Engine usage example

The following example illustrates the basic API usage in write mode for data generated at each application step:

adios2::Engine engine = io.Open("file.bp", adios2::Mode::Write);

for( size_t i = 0; i < steps; ++i )
{
   // ... Application *data generation

   engine.BeginStep(); //next "logical" step for this application

   engine.Put(varT, dataT, adios2::Mode::Sync);
   // dataT memory already consumed by engine
   // Application can modify dataT address and contents

   // deferred functions return immediately (lazy evaluation),
   // dataU, dataV and dataW pointers must not be modified
   // until PerformPuts, EndStep or Close.
   // 1st batch
   engine.Put(varU, dataU);
   engine.Put(varV, dataV);

   // in this case adios2::Mode::Deferred is redundant,
   // as this is the default option
   engine.Put(varW, dataW, adios2::Mode::Deferred);

   // effectively dataU, dataV, dataW are "deferred"
   // until the first call to PerformPuts, EndStep or Close.
   // Application MUST NOT modify the data pointer (e.g. resize memory).
   engine.PerformPuts();

   // dataU, dataV, dataW pointers/values can now be reused

   // ... Application modifies dataU, dataV, dataW

   //2nd batch
   engine.Put(varU, dataU);
   engine.Put(varV, dataV);
   engine.Put(varW, dataW);
   // Application MUST NOT modify dataU, dataV and dataW pointers (e.g. resize),
   // optionally data can be modified, but not recommended
   dataU[0] = 10;
   dataV[0] = 10;
   dataW[0] = 10;
   engine.PerformPuts();

   // dataU, dataV, dataW pointers/values can now be reused

   // Puts a varP block of zeros
   adios2::Variable<double>::Span spanP = engine.Put<double>(varP);

   // Not recommended mixing static pointers,
   // span follows
   // the same pointer/iterator invalidation
   // rules as std::vector
   double* p = spanP.data();

   // Puts a varMu block of 1e-6
   adios2::Variable<double>::Span spanMu = engine.Put<double>(varMu, 0, 1e-6);

   // p might be invalidated
   // by a new span, use spanP.data() again
   foo(spanP.data());

   // Puts a varRho block with a constant value of 1.225
   engine.Put<double>(varRho, 0, 1.225);

   // it's preferable to start modifying spans
   // after all of them are created
   foo(spanP.data());
   bar(spanMu.begin(), spanMu.end());


   engine.EndStep();
   // spanP, spanMu are consumed by the library
   // end of current logical step,
   // default behavior: transport data
}

engine.Close();
// engine is unreachable and all data should be transported
...

Tip

Prefer the default Deferred (lazy evaluation) functions, as they have the potential to group several variables, with the trade-off of not being able to reuse the pointer's memory space until EndStep, PerformPuts, PerformGets, or Close. Only use Sync if you really have to (e.g. to reuse the memory space behind a pointer). ADIOS2 prefers step-based IO in which everything is known ahead of time when writing an entire step.

Danger

The default behavior of adios2 Put and Get calls IS NOT synchronized, but rather deferred. It’s actually the opposite of MPI_Put and more like MPI_Rput. Do not assume the data pointer is usable after a Put or Get before EndStep, Close or the corresponding PerformPuts/PerformGets. Avoid using TEMPORARIES, r-values, and out-of-scope variables in Deferred mode; use adios2::Mode::Sync if required.

Available Engines

A particular engine is set within the IO object that creates it with the IO::SetEngine function, which treats the engine name in a case-insensitive manner. If the SetEngine function is not invoked, the default engine is BPFile, which writes and reads self-describing bp (binary-pack) files.

Application              Engine   Description
-----------------------  -------  --------------------------------------------
File                     BP4      DEFAULT write/read ADIOS2 native bp files
                         HDF5     write/read interoperability with HDF5 files
Wide-Area-Network (WAN)  DataMan  write/read TCP/IP streams
Staging                  SST      write/read to a “staging” area: e.g. RDMA

Engine Polymorphism has a two-fold goal:

  1. Each Engine implements an orthogonal IO scenario targeting a use case (e.g. Files, WAN, InSitu MPI, etc) using a simple, unified API.

  2. Allow developers to build their own custom system solution based on their particular requirements in their own playground space. Reusable toolkit objects are available inside ADIOS2 for common tasks: bp buffering, transport management, transports, etc.

A class that extends Engine must be thought of as a solution to a range of IO applications. Each engine must provide a list of supported parameters, set in the IO object creating this engine using IO::SetParameters, IO::SetParameter, and supported transports (and their parameters) in IO::AddTransport. Each Engine’s particular options are documented in Supported Engines.
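
The idea of one interface with interchangeable engines, selected by a case-insensitive type name, can be sketched with plain C++11. Everything below is hypothetical and standard-library only: SketchEngine, its two subclasses, ToLower and MakeSketchEngine are illustration names, not ADIOS2 classes, and the mapping of "bp4" to a file target and "sst" to a staging target is an assumption taken from the engine table in this section.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical base class: a single interface for every IO scenario.
class SketchEngine
{
public:
    virtual ~SketchEngine() = default;
    virtual std::string Target() const = 0;
};

// One subclass per orthogonal IO scenario.
class SketchFileEngine : public SketchEngine
{
public:
    std::string Target() const override { return "file"; }
};

class SketchStagingEngine : public SketchEngine
{
public:
    std::string Target() const override { return "staging"; }
};

// Lower-cases the requested type so the lookup ignores case.
std::string ToLower(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return text;
}

// Picks the concrete engine from its type name; unknown names throw.
std::unique_ptr<SketchEngine> MakeSketchEngine(const std::string &type)
{
    const std::string key = ToLower(type);
    if (key == "bp4")
    {
        return std::unique_ptr<SketchEngine>(new SketchFileEngine());
    }
    if (key == "sst")
    {
        return std::unique_ptr<SketchEngine>(new SketchStagingEngine());
    }
    throw std::invalid_argument("unknown engine type: " + type);
}
```

Application code written against SketchEngine does not change when the type string changes, which is the property the unified Engine API aims for.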

Operator

The Operator abstraction allows ADIOS2 to act upon the user application data, either from an adios2::Variable or a set of Variables in an adios2::IO object. Currently supported operations are:

  1. Data compression/decompression, lossy and lossless.

  2. Callback functions (C++11 bindings only) supported by specific engines.

ADIOS2 enables the use of third-party libraries to execute these tasks.

Warning

Make sure your ADIOS2 library installation used for writing and reading was linked with a compatible version of a third-party dependency when working with operators. ADIOS2 will issue an exception if an operator library dependency is missing.

Runtime Configuration Files

ADIOS2 supports passing an optional runtime configuration file to the ADIOS component constructor (adios2_init in C, Fortran).

This file contains key-value pairs equivalent to the compile-time IO::SetParameters (adios2_set_parameter in C, Fortran), and IO::AddTransport (adios2_set_transport_parameter in C, Fortran).

Each Engine and Operator must provide a set of available parameters as described in the Supported Engines section. Versions before v2.6.0 support only XML; v2.6.0 and later support both XML and YAML.

Warning

Configuration files must have the corresponding format extension .xml, .yaml: config.xml, config.yaml, etc.

XML

<?xml version="1.0"?>
<adios-config>
  <io name="IONAME_1">

    <engine type="ENGINE_TYPE">

      <!-- Equivalent to IO::SetParameters-->
      <parameter key="KEY_1" value="VALUE_1"/>
      <parameter key="KEY_2" value="VALUE_2"/>
      <!-- ... -->
      <parameter key="KEY_N" value="VALUE_N"/>

    </engine>

    <!-- Equivalent to IO::AddTransport -->
    <transport type="TRANSPORT_TYPE">
      <!-- Transport parameters, passed to IO::AddTransport -->
      <parameter key="KEY_1" value="VALUE_1"/>
      <parameter key="KEY_2" value="VALUE_2"/>
      <!-- ... -->
      <parameter key="KEY_N" value="VALUE_N"/>
    </transport>
  </io>

  <io name="IONAME_2">
    <!-- ... -->
  </io>
</adios-config>
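
As an illustration, the template above might be filled in as follows for a file-based writer. The IO name SimulationOutput is hypothetical. The parameter keys Threads and InitialBufferSize, and the File transport with the POSIX library, are assumed to be options accepted by the BP4 engine in the installed version; the Supported Engines section remains the reference for valid keys and values.

```xml
<?xml version="1.0"?>
<adios-config>
  <!-- "SimulationOutput" is a hypothetical name; it must match the
       string passed to DeclareIO in the application -->
  <io name="SimulationOutput">

    <engine type="BP4">
      <!-- assumed BP4 parameters, same effect as IO::SetParameters -->
      <parameter key="Threads" value="2"/>
      <parameter key="InitialBufferSize" value="1Mb"/>
    </engine>

    <!-- same effect as IO::AddTransport("File", {{"Library", "POSIX"}}) -->
    <transport type="File">
      <parameter key="Library" value="POSIX"/>
    </transport>

  </io>
</adios-config>
```

Keeping these settings in the file rather than in code lets the engine type or buffer size change between runs without recompiling the application.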

YAML

Starting with v2.6.0, ADIOS2 supports YAML configuration files. The syntax follows strict use of the YAML node keywords mapping to the ADIOS2 components hierarchy. If a keyword is unknown, ADIOS2 simply ignores it. For an example file refer to adios2 config file example in our repo.

---
# adios2 config.yaml
# IO YAML Sequence (-) Nodes to allow for multiple IO nodes
# IO name referred in code with DeclareIO is mandatory

- IO: "IOName"

  Engine:
     # If Type is missing or commented out, default Engine is picked up
     Type: "BP4"
     # optional engine parameters
     key1: value1
     key2: value2
     key3: value3

  Variables:

      # Variable Name is Mandatory
    - Variable: "VariableName1"
      Operations:
          # Operation Type is mandatory (zfp, sz, etc.)
        - Type: operatorType
          key1: value1
          key2: value2

    - Variable: "VariableName2"
      Operations:
          # Operations sequence of maps
        - {Type: operatorType, key1: value1}
        - {Type: z-checker, key1: value1, key2: value2}

  Transports:
      # Transport sequence of maps
    - {Type: file, Library: fstream}
    - {Type: rdma, Library: ibverbs}

  ...

Caution

YAML is case sensitive: make sure the node identifiers strictly follow the keywords IO, Engine, Variables, Variable, Operations, Transports, Type.

Tip

Run a YAML validator or use a YAML editor to make sure the provided file is YAML compatible.