Command Line Utilities
ADIOS 2 provides a set of command line utilities for quick data exploration and
manipulation that build on top of the library. They are located in the adios2-install-location/bin
directory after a make install.
Tip
Optionally, the adios2-install-location/bin directory can be added to your PATH to avoid typing absolute paths when using the ADIOS 2 command line utilities.
Currently supported tools are:
bpls : exploration of bp/hdf5 files data and metadata in human readable formats
adios_reorganize
adios2-config
sst_conn_tool : SST staging engine connectivity diagnostic tool
bpls : Inspecting Data
The bpls utility is for examining and pretty-printing the content of ADIOS output files (BP and HDF5 files).
By default, it lists the variables in the file including the type, name, and dimensionality.
Let’s assume we run the Heat Transfer tutorial example and produce the output by
$ mpirun -n 12 ./heatSimulation sim.bp 3 4 5 4 3 1
Process decomposition : 3 x 4
Array size per process : 5 x 4
Number of output steps : 3
Iterations per step : 1
$ mpirun -n 3 ./heatAnalysis sim.bp a.bp 3 1
$ bpls a.bp
double T 3*{15, 16}
double dT 3*{15, 16}
In our example, we have two arrays, T and dT. Both are 2-dimensional double arrays, their global size is 15x16, and the file contains 3 output steps of these arrays.
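To make the layout of a listing line concrete, the following minimal Python sketch splits a line such as the ones above into type, name, number of steps, and global shape. The function name parse_listing_line and the regular expression are illustrative assumptions, not part of ADIOS 2, and the pattern only covers the array form shown in this example.

```python
import re

# Matches lines of the form "<type> <name> <steps>*{<dim>, <dim>, ...}".
# The pattern is an assumption based on the listing shown above.
LISTING_RE = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)\*\{([^}]*)\}")


def parse_listing_line(line):
    """Return (type, name, steps, shape) for one array line of a bpls listing."""
    match = LISTING_RE.match(line)
    if match is None:
        raise ValueError(f"not an array listing line: {line!r}")
    dtype, name, steps, dims = match.groups()
    shape = tuple(int(d) for d in dims.split(","))
    return dtype, name, int(steps), shape


print(parse_listing_line("double T 3*{15, 16}"))
# ('double', 'T', 3, (15, 16))
```

Scalars and other listing forms are not handled by this sketch and would raise a ValueError.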
Note
bpls is written in C++ and therefore sees the dimensions in row-major order. If the data was written from Fortran in column-major order, the dimension order appears flipped when listing with bpls, just as a code written in C++ or Python would see the data.
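The flip described in this note can be illustrated with a small pure-Python sketch. The Fortran shape (16, 15) is a hypothetical declaration chosen so that the row-major view matches the 15x16 arrays of this example, the helper names are made up for illustration, and indices are zero-based.

```python
def as_seen_by_bpls(fortran_shape):
    """Reverse a column-major (Fortran) shape into the row-major order bpls lists."""
    return tuple(reversed(fortran_shape))


def row_major_offset(index, shape):
    """Flat offset of an index when the last dimension varies fastest."""
    offset = 0
    for i, n in zip(index, shape):
        offset = offset * n + i
    return offset


def column_major_offset(index, shape):
    """Flat offset of an index when the first dimension varies fastest."""
    return row_major_offset(tuple(reversed(index)), tuple(reversed(shape)))


fortran_shape = (16, 15)                 # hypothetical Fortran array T(16, 15)
c_shape = as_seen_by_bpls(fortran_shape)
print(c_shape)                           # (15, 16)

# The same element sits at the same position in memory under both views.
print(column_major_offset((3, 7), fortran_shape))   # 115
print(row_major_offset((7, 3), c_shape))            # 115
```

The data itself is not transposed; only the order in which the dimensions are reported changes.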
Here is the description of the most used options (use bpls -h to print help on all options for this utility).
-l
Print the min/max of the arrays and the values of scalar variables.
$ bpls -l a.bp
double T 3*{15, 16} = 0 / 200
double dT 3*{15, 16} = -53.1922 / 49.7861
-a
-A
List the attributes along with the variables. -A will print the attributes only.
$ bpls a.bp -la
double T 3*{15, 16} = 0 / 200
string T/description attr = "Temperature from simulation"
string T/unit attr = "C"
double dT 3*{15, 16} = -53.1922 / 49.7861
string dT/description attr = "Temperature difference between two steps calculated in analysis"
pattern, -e
Select which variables/attributes to list or dump. By default the pattern(s) are interpreted as shell file patterns.
$ bpls a.bp -la T*
double T 3*{15, 16} = 0 / 200
Multiple patterns can be defined in the command line.
$ bpls a.bp -la T/* dT/*
string T/description attr = "Temperature from simulation"
string T/unit attr = "C"
string dT/description attr = "Temperature difference between two steps calculated in analysis"
If the -e option is given, all the patterns will be interpreted as regular expressions.
$ bpls a.bp -la T.* -e
double T 3*{15, 16} = 0 / 200
string T/description attr = "Temperature from simulation"
string T/unit attr = "C"
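The difference between the two pattern modes can be sketched in Python. This is not the bpls implementation: it assumes that in shell-pattern mode the wildcards do not match the / separator, and that in regular-expression mode the expression must match the whole name. Both assumptions were chosen only because they reproduce the selections shown in the examples above.

```python
import re

NAMES = ["T", "T/description", "T/unit", "dT", "dT/description"]


def glob_to_regex(pattern):
    """Translate a shell-style pattern in which '*' and '?' do not cross '/'."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("[^/]*")
        elif ch == "?":
            parts.append("[^/]")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)


def select(names, patterns, use_regex=False):
    """Return the names matched by at least one pattern (whole-name match)."""
    compiled = [re.compile(p if use_regex else glob_to_regex(p)) for p in patterns]
    return [n for n in names if any(c.fullmatch(n) for c in compiled)]


print(select(NAMES, ["T*"]))                    # ['T']
print(select(NAMES, ["T/*", "dT/*"]))           # ['T/description', 'T/unit', 'dT/description']
print(select(NAMES, ["T.*"], use_regex=True))   # ['T', 'T/description', 'T/unit']
```

Under these assumptions the shell pattern T* stops at the / separator, while the regular expression T.* also picks up the attributes of T.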
-D
Print the decomposition of a variable. In the BP file, the data blocks written by different writers are stored separately and have their own size info and min/max statistics. This option is useful during code development to check whether the output file is written as intended.
$ bpls a.bp -l T -D
double T 3*{15, 16} = 0 / 200
  step 0:
    block 0: [ 0: 4, 0:15] = 3.54199e-14 / 200
    block 1: [ 5: 9, 0:15] = 58.3642 / 200
    block 2: [10:14, 0:15] = 0 / 200
  step 1:
    block 0: [ 0: 4, 0:15] = 31.4891 / 153.432
    block 1: [ 5: 9, 0:15] = 68.2107 / 180.184
    block 2: [10:14, 0:15] = 31.4891 / 161.699
  step 2:
    block 0: [ 0: 4, 0:15] = 48.0431 / 135.225
    block 1: [ 5: 9, 0:15] = 74.064 / 170.002
    block 2: [10:14, 0:15] = 48.0431 / 147.87
In this case we find 3 blocks per output step and 3 output steps. We can see that the variable T was decomposed in the first (slow) dimension. In the above example, the T variable in the simulation output (sim.bp) had 12 blocks per step, but the analysis code was running on 3 processes, effectively reorganizing the data into fewer larger blocks.
-d
Dump the data content of a variable. For pretty-printing, one should use the additional -n and -f options. For selecting only a subset of a variable, one should use the -s and -c options.
By default, six values are printed per line, using C-style %g formatting for floating point values.
$ bpls a.bp -d T
double T 3*{15, 16}
(0, 0, 0) 124.925 124.296 139.024 95.2078 144.864 191.485
(0, 0, 6) 139.024 140.814 124.925 109.035 110.825 58.3642
(0, 0,12) 104.985 154.641 110.825 125.553 66.5603 65.9316
...
(2,14, 4) 105.918 116.842 111.249 102.044 93.3121 84.5802
(2,14,10) 75.3746 69.782 80.706 93.5492 94.7595 95.0709
For pretty-printing, use the additional -n and -f options.
$ bpls a.bp -d T -n 16 -f "%3.0f"
double T 3*{15, 16}
(0, 0, 0) 125 124 139 95 145 191 139 141 125 109 111 58 105 155 111 126
(0, 1, 0) 67 66 81 37 86 133 81 82 67 51 52 0 47 96 52 67
(0, 2, 0) 133 133 148 104 153 200 148 149 133 118 119 67 114 163 119 134
...
(2,13, 0) 98 98 96 96 115 132 124 109 97 86 71 63 79 98 97 95
(2,14, 0) 96 96 93 93 106 117 111 102 93 85 75 70 81 94 95 95
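The effect of the two formatting options can be previewed with a short Python sketch. Python's % operator follows the C printf conventions for these conversions; whether bpls hands the -f string directly to printf is an assumption here, and format_rows is a hypothetical helper, with per_line playing the role of -n and fmt the role of -f.

```python
def format_rows(values, per_line=6, fmt="%g"):
    """Format values with a printf-style format, per_line values on each row."""
    rows = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        rows.append(" ".join(fmt % v for v in chunk))
    return rows


# First six values of T from the dump above.
values = [124.925, 124.296, 139.024, 95.2078, 144.864, 191.485]

print(format_rows(values))
# ['124.925 124.296 139.024 95.2078 144.864 191.485']
print(format_rows(values, per_line=3, fmt="%3.0f"))
# ['125 124 139', ' 95 145 191']
```

A fixed-width format such as "%3.0f" pads short values, which keeps the columns aligned when many values are printed per line.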
For selecting a subset of a variable, use the -s and -c options. These options are N+1 dimensional for N-dimensional arrays with more than one step. The first element of -s selects the starting step and the first element of -c selects the number of steps to print.
The following example dumps a small 4x4 subset from the center of the array, one step from the second (middle) step:
$ bpls a.bp -d T -s "1,6,7" -c "1,4,4" -n 4
double T 3*{15, 16}
slice (1:1, 6:9, 7:10)
(1,6, 7) 144.09 131.737 119.383 106.787
(1,7, 7) 145.794 133.44 121.086 108.49
(1,8, 7) 145.794 133.44 121.086 108.49
(1,9, 7) 144.09 131.737 119.383 106.787
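The relation between the start and count arguments and the slice line in the output can be written down in a few lines of Python. The helper slice_ranges is hypothetical; it only reproduces the arithmetic visible in the example, where each range runs from start to start + count - 1, inclusive.

```python
def slice_ranges(start, count):
    """Turn comma-separated start and count strings into inclusive ranges."""
    starts = [int(s) for s in start.split(",")]
    counts = [int(c) for c in count.split(",")]
    if len(starts) != len(counts):
        raise ValueError("start and count must have the same number of elements")
    return [(s, s + c - 1) for s, c in zip(starts, counts)]


ranges = slice_ranges("1,6,7", "1,4,4")
print("slice (" + ", ".join(f"{lo}:{hi}" for lo, hi in ranges) + ")")
# slice (1:1, 6:9, 7:10)
```

The first pair is the step range, so "1" and "1" select exactly the second output step.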
-y
--noindex
Data can be dumped in a format that is easier to import later into other tools, like Excel. The leading array indexes can be omitted by using this option. Non-data lines, like the variable and slice info, are printed with a leading ; character.
$ bpls a.bp -d T -s "1,6,7" -c "1,4,4" -n 4 --noindex
; double T 3*{15, 16}
; slice (1:1, 6:9, 7:10)
144.09 131.737 119.383 106.787
145.794 133.44 121.086 108.49
145.794 133.44 121.086 108.49
144.09 131.737 119.383 106.787
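As an illustration of why this format is easy to import, the following Python sketch reads such a dump into a list of rows. It assumes only what the example shows: lines starting with ; carry no data, and the remaining lines hold whitespace-separated numbers. The function name parse_noindex is made up for this sketch.

```python
def parse_noindex(text):
    """Parse a --noindex dump into a list of rows of floats."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and the variable/slice info lines
        rows.append([float(token) for token in line.split()])
    return rows


dump = """\
; double T 3*{15, 16}
; slice (1:1, 6:9, 7:10)
144.09 131.737 119.383 106.787
145.794 133.44 121.086 108.49
145.794 133.44 121.086 108.49
144.09 131.737 119.383 106.787
"""

rows = parse_noindex(dump)
print(len(rows), len(rows[0]))   # 4 4
print(rows[0])                   # [144.09, 131.737, 119.383, 106.787]
```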
Note
HDF5 files can also be dumped with bpls if ADIOS was built with HDF5 support. Note that the HDF5 files do not contain min/max information for the arrays and therefore bpls always prints 0 for them:
$ bpls -l a.h5
double T 3*{15, 16} = 0 / 0
double dT 3*{15, 16} = 0 / 0
adios_reorganize
adios_reorganize and adios_reorganize_mpi are generic ADIOS tools to read in ADIOS streams and output the same data into another ADIOS stream. The two tools are for serial and MPI environments, respectively. They can be used for:
converting files between ADIOS BP and HDF5 format
using separate compute nodes to stage I/O from/to disk to/from a large scale application
reorganizing the data blocks for a different number of blocks
Let’s assume we run the Heat Transfer tutorial example and produce the output by
$ mpirun -n 12 ./heatSimulation sim.bp 3 4 5 4 3 1
Process decomposition : 3 x 4
Array size per process : 5 x 4
Number of output steps : 3
Iterations per step : 1
$ bpls sim.bp
double T 3*{15, 16}
In our example, we have an array, T. It is a 2-dimensional double array, its global size is 15x16, and the file contains 3 output steps of this array. The array is composed of 12 separate blocks coming from the 12 producers in the application.
Convert BP file to HDF5 file
If ADIOS is built with HDF5 support, this tool can be used to convert between the two file formats.
$ mpirun -n 2 adios_reorganize_mpi sim.bp sim.h5 BPFile "" HDF5 "" 2 1
$ bpls sim.h5
double T 3*{15, 16}
$ h5ls -r sim.h5
/ Group
/Step0 Group
/Step0/T Dataset {15, 16}
/Step1 Group
/Step1/T Dataset {15, 16}
/Step2 Group
/Step2/T Dataset {15, 16}
Stage I/O through extra compute nodes
If writing data to disk is a bottleneck for the application, it may be worthwhile to use extra nodes to receive the data quickly from the application and then write it to disk while the application continues computing. Similarly, data can be staged in from disk into extra nodes and made available for fast read-in by an application. One can use one of the staging engines in ADIOS to perform this data staging (SST, SSC, DataMan).
Assuming that the heatSimulation is using SST instead of file I/O in a run (set in its adios2.xml configuration file), staging to disk can be done this way:
Make sure adios2.xml sets SST for the simulation:
<io name="SimulationOutput">
    <engine type="SST">
    </engine>
</io>
$ mpirun -n 12 ./heatSimulation sim.bp 3 4 5 4 3 1 : \
         -n 2 adios_reorganize_mpi sim.bp staged.bp SST "" BPFile "" 2 1
$ bpls staged.bp
double T 3*{15, 16}
Data is staged to the extra 2 cores, which write the data to disk while the heatSimulation calculates the next step. Note that this staging is only useful if the tool can write all data to disk before the application produces the next output step. Otherwise, it will still block the application for I/O.
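The condition in the last two sentences can be illustrated with a toy timing model in Python. The model is an assumption made for this sketch, not a description of how SST buffers data: the application can hand off a step only after the staging process has finished writing the previous one, and the transfer itself takes no time.

```python
def total_stall(compute_time, write_time, steps):
    """Time the application waits when it can hand off a step only after
    the staging process has finished writing the previous step."""
    app_clock = 0.0
    stager_free_at = 0.0
    stall = 0.0
    for _ in range(steps):
        app_clock += compute_time                  # application computes one step
        handoff = max(app_clock, stager_free_at)   # wait if the stager is still writing
        stall += handoff - app_clock
        app_clock = handoff
        stager_free_at = handoff + write_time      # stager writes in the background
    return stall


print(total_stall(compute_time=2.0, write_time=1.0, steps=3))   # 0.0
print(total_stall(compute_time=1.0, write_time=2.0, steps=3))   # 2.0
```

In this model the application never waits while writing a step is faster than computing one; once writing is slower, every step after the first adds the difference to the wait time.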
Reorganizing the data blocks in file for a different number of blocks
In the above example, the application writes the array from 12 processes, but then adios_reorganize_mpi reads the global arrays on 2 processes. The output file on disk will therefore contain the array in 2 blocks. This reorganization of the array may be useful if reading is too slow for a dataset created by a very large number of processes. One may want to reorganize a file written by tens or hundreds of thousands of processes if one wants to read the content more than one time and the read time proves to be a bottleneck in one’s workflow.
$ mpirun -n 12 ./heatSimulation sim.bp 3 4 5 4 3 1
$ bpls sim.bp -D
double T 3*{15, 16}
  step 0:
    block 0: [ 0: 4, 0: 3]
    block 1: [ 5: 9, 0: 3]
    block 2: [10:14, 0: 3]
    block 3: [ 0: 4, 4: 7]
    block 4: [ 5: 9, 4: 7]
    block 5: [10:14, 4: 7]
    block 6: [ 0: 4, 8:11]
    block 7: [ 5: 9, 8:11]
    block 8: [10:14, 8:11]
    block 9: [ 0: 4, 12:15]
    block 10: [ 5: 9, 12:15]
    block 11: [10:14, 12:15]
  step 1:
    block 0: [ 0: 4, 0: 3]
    block 1: [ 5: 9, 0: 3]
    block 2: [10:14, 0: 3]
    block 3: [ 0: 4, 4: 7]
    block 4: [ 5: 9, 4: 7]
    block 5: [10:14, 4: 7]
    block 6: [ 0: 4, 8:11]
    block 7: [ 5: 9, 8:11]
    block 8: [10:14, 8:11]
    block 9: [ 0: 4, 12:15]
    block 10: [ 5: 9, 12:15]
    block 11: [10:14, 12:15]
  step 2:
    block 0: [ 0: 4, 0: 3]
    block 1: [ 5: 9, 0: 3]
    block 2: [10:14, 0: 3]
    block 3: [ 0: 4, 4: 7]
    block 4: [ 5: 9, 4: 7]
    block 5: [10:14, 4: 7]
    block 6: [ 0: 4, 8:11]
    block 7: [ 5: 9, 8:11]
    block 8: [10:14, 8:11]
    block 9: [ 0: 4, 12:15]
    block 10: [ 5: 9, 12:15]
    block 11: [10:14, 12:15]
$ mpirun -n 2 adios_reorganize_mpi sim.bp reorg.bp BPFile "" BPFile "" 2 1
$ bpls reorg.bp -D
double T 3*{15, 16}
  step 0:
    block 0: [ 0: 6, 0:15]
    block 1: [ 7:14, 0:15]
  step 1:
    block 0: [ 0: 6, 0:15]
    block 1: [ 7:14, 0:15]
  step 2:
    block 0: [ 0: 6, 0:15]
    block 1: [ 7:14, 0:15]
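The block bounds in the two listings can be reproduced with a short Python sketch. The splitting rule and the rank numbering below are assumptions chosen because they match the printed bounds; they are not taken from the source of the simulation or of adios_reorganize_mpi, and the function names are made up for illustration.

```python
def split_1d(n, parts, rank):
    """Inclusive (first, last) range of one piece when n indices are split
    into `parts` nearly equal contiguous pieces."""
    first = (n * rank) // parts
    last = (n * (rank + 1)) // parts - 1
    return first, last


def decompose_2d(shape, grid):
    """Blocks of a 2D array on a grid[0] x grid[1] process grid.

    Ranks are numbered with the first grid dimension varying fastest, an
    assumption that matches the block numbering in the sim.bp listing.
    """
    blocks = []
    for rank in range(grid[0] * grid[1]):
        i, j = rank % grid[0], rank // grid[0]
        blocks.append((split_1d(shape[0], grid[0], i),
                       split_1d(shape[1], grid[1], j)))
    return blocks


# 12 writers on a 3 x 4 grid, as in sim.bp
sim_blocks = decompose_2d((15, 16), (3, 4))
print(sim_blocks[3])    # ((0, 4), (4, 7))
print(sim_blocks[11])   # ((10, 14), (12, 15))

# 2 readers on a 2 x 1 grid, as in reorg.bp
for rank, ((r0, r1), (c0, c1)) in enumerate(decompose_2d((15, 16), (2, 1))):
    print(f"block {rank}: [{r0}:{r1}, {c0}:{c1}]")
# block 0: [0:6, 0:15]
# block 1: [7:14, 0:15]
```

Since 15 rows do not divide evenly by 2, the two blocks of reorg.bp have 7 and 8 rows, which is what the listing shows.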
adios2-config
adios2-config is provided to aid with non-CMake builds (e.g. a manually written Makefile). Running adios2-config --help from the adios2-install-dir/bin directory prints the following usage information:
./adios2-config --help
adios2-config (--help | [--c-flags] [--c-libs] [--cxx-flags] [--cxx-libs] [-fortran-flags] [--fortran-libs])
--help Display help information
-c Both compile and link flags for the C bindings
--c-flags Preprocessor and compile flags for the C bindings
--c-libs Linker flags for the C bindings
-x Both compile and link flags for the C++ bindings
--cxx-flags Preprocessor and compile flags for the C++ bindings
--cxx-libs Linker flags for the C++ bindings
-f Both compile and link flags for the F90 bindings
--fortran-flags Preprocessor and compile flags for the F90 bindings
--fortran-libs Linker flags for the F90 bindings
Please refer to the "From non-CMake build systems" section for more information on how to use this command.
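As a rough illustration of how the flags might be consumed by a hand-written build script, the following Python sketch assembles a compile-and-link command from the output of --cxx-flags and --cxx-libs. The compiler name mpic++, the file names, and the canned flag values returned by fake_run are all hypothetical; fake_run stands in for the real tool so that the sketch runs without ADIOS 2 installed.

```python
import shlex
import subprocess


def query_flags(option, tool="adios2-config", run=subprocess.check_output):
    """Run `tool option` and split its output into a list of flags."""
    return shlex.split(run([tool, option], text=True))


def build_command(source, output, compiler="mpic++", **query_kwargs):
    """Assemble a compile-and-link command line for one C++ source file."""
    cxx_flags = query_flags("--cxx-flags", **query_kwargs)
    cxx_libs = query_flags("--cxx-libs", **query_kwargs)
    return [compiler, *cxx_flags, source, "-o", output, *cxx_libs]


def fake_run(cmd, text=True):
    """Stand-in for adios2-config; the flag values are made up."""
    canned = {
        "--cxx-flags": "-I/opt/adios2/include\n",
        "--cxx-libs": "-L/opt/adios2/lib -ladios2_cxx11\n",
    }
    return canned[cmd[1]]


print(build_command("reader.cpp", "reader", run=fake_run))
# ['mpic++', '-I/opt/adios2/include', 'reader.cpp', '-o', 'reader',
#  '-L/opt/adios2/lib', '-ladios2_cxx11']
```

With adios2-config on the PATH, omitting the run argument makes query_flags call the real tool instead of the stand-in.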
sst_conn_tool : SST network connectivity tool
The sst_conn_tool utility exposes some aspects of SST network
connectivity parameters and activity in order to allow debugging of
SST connections.
In its simplest usage, it just lets you test an SST connection (between two runs of the program) and tells you the network information it’s trying, i.e. what IP address and port it determined to use for listening, and, if it’s connecting somewhere, what those parameters are. For example, you’d first run sst_conn_tool in one window and its output would look like this:
bash-3.2$ bin/sst_conn_tool
Sst writer is listening for TCP/IP connection at IP 192.168.1.17, port 26051
Sst connection tool waiting for connection...
To try to connect from another window, you run sst_conn_tool with the -c or --connect option:
bash-3.2$ bin/sst_conn_tool -c
Sst reader at IP 192.168.1.17, listening UDP port 26050
Attempting TCP/IP connection to writer at IP 192.168.1.17, port 26051
Connection success, all is well!
bash-3.2$
Here, it has found the contact information file, tried and succeeded in making the connection and has indicated that all is well. In the first window, we get a similar message about the success of the connection.
In the event that there is trouble with the connection, there is a “-i” or “--info” option that will provide additional information about the network configuration options. For example:
bash-3.2$ bin/sst_conn_tool -i
ADIOS2_IP_CONFIG best guess hostname is "sandy.local"
ADIOS2_IP_CONFIG Possible interface lo0 : IPV4 127.0.0.1
ADIOS2_IP_CONFIG Possible interface en0 : IPV4 192.168.1.56
ADIOS2_IP_CONFIG Possible interface en5 : IPV4 192.168.1.17
ADIOS2_IP_CONFIG best guess IP is "192.168.1.17"
ADIOS2_IP_CONFIG default port range is "any"
The following environment variables can impact ADIOS2_IP_CONFIG operation:
ADIOS2_IP - Publish the specified IP address for contact
ADIOS2_HOSTNAME - Publish the specified hostname for contact
ADIOS2_USE_HOSTNAME - Publish a hostname preferentially over IP address
ADIOS2_INTERFACE - Use the IP address associated with the specified network interface
ADIOS2_PORT_RANGE - Use a port within the specified range "low:high",
or specify "any" to let the OS choose
Sst writer is listening for TCP/IP connection at IP 192.168.1.17, port 26048
Sst connection tool waiting for connection...
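The environment variables listed in this output can be set from a launcher script. The following Python sketch only builds the environment dictionary; the helper sst_environment is hypothetical, and the interface name and port numbers are example values taken from or modelled on the output above. The variable names and the "low:high" format come from the tool's own description.

```python
import os


def sst_environment(interface=None, port_range=None, base=None):
    """Return a copy of the environment with SST network settings applied."""
    env = dict(os.environ if base is None else base)
    if interface is not None:
        env["ADIOS2_INTERFACE"] = interface
    if port_range is not None:
        low, high = port_range
        if not (0 < low <= high <= 65535):
            raise ValueError(f"invalid port range: {low}:{high}")
        env["ADIOS2_PORT_RANGE"] = f"{low}:{high}"
    return env


# base={} keeps the printed result small; leave it unset in practice so that
# PATH and the other inherited variables are preserved.
env = sst_environment(interface="en5", port_range=(26000, 26100), base={})
print(env)
# {'ADIOS2_INTERFACE': 'en5', 'ADIOS2_PORT_RANGE': '26000:26100'}

# To apply the settings, pass the dictionary to the process that is launched,
# for example: subprocess.run(["sst_conn_tool", "-i"], env=sst_environment(...))
```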
Full options for sst_conn_tool:
Operational Modes:
-l
--listen
Display connection parameters and wait for an SST connection (default)
-c
--connect
Attempt a connection to an already-running instance of sst_conn_tool
Additional Options:
-i
--info
Display additional networking information for this host
-f
--file
Use file-based contact info sharing (default). The SST contact file is created in the current directory
-s
--screen
Use screen-based contact info sharing, SST contact info is displayed/entered via terminal
-h
--help
Display this message usage and options