Campaign Management
Campaign Management is a separate toolkit, and is used for collecting basic information and metadata about a collection of ADIOS2/HDF5 output files, images and text files, from a single application run or multiple runs, on one or multiple hosts. The campaign archive is a single file (.ACA) that can be transferred to other locations and shared with others interested in reading the collected data.
The .ACA campaign file can be opened by ADIOS2 and all the metadata can be processed (including the values of GlobalValue and LocalValue variables, or min/max of each Arrays at each step and decomposition/min/max of each block in an Array at each step). However, Get() operations will only succeed to read actual data of the arrays, if the data belonging to the campaign is either local or some mechanism for remote data access to the location of the data is set up in advance.
Warning
Campaign Management is fairly new, currently at version 0.7. It will change substantially in the future and campaign files produced by this version will have to be updated to newer versions. Make sure to use a compatible versions of ADIOS2 and hpc-campaign.
Requirements
The Campaign Reader engine uses SQlite3 and ZLIB for its operations and have to be turned on at configuration of ADIOS2 (-DADIOS2_USE_Campaign=ON for cmake). Check bpls -Vv to see if CAMPAIGN is in the list of “Available features”.
Caching data requires having a Redis key-value database running and the hiredis API available when building ADIOS2. Add the location of the hiredis library to -DCMAKE_PREFIX_PATH for cmake.
Limitations
The Campaign Reader engine only supports ReadRandomAccess mode, not step-by-step reading. Campaign management will need to change in the future to support sorting the steps from different outputs to a coherent order.
Example
The Gray-Scott example, that is included with ADIOS2, in examples/simulation/gray-scott, has two programs, Gray-Scott and PDF-Calc. The first one produces the main output gs.bp which includes the main 3D variables U and V, and a checkpoint file ckpt.bp with a single step in it. PDF-Calc processes the main output and produces histograms on 2D slices of U and V (U/bins and U/pdf) in pdf.bp. A campaign can include all the three output files as they logically belong together.
Assuming we are in
/lustre/orion/csc143/proj-shared/demo/gray-scott
on a machine that we named OLCF in our Campaign hostname in ~/.config/hpc-campaign/config.yaml
our campaignpath is set to /lustre/orion/csc143/proj-shared/adios-campaign-store/demoproject
# run application as usual
$ mpirun -n 4 adios2_simulations_gray-scott settings-files.json
$ ls -d *.bp
ckpt.bp gs.bp
$ ACA=demoproject/frontier_gray-scott_100
$ hpc_campaign manager $ACA --truncate data gs.bp --name gs
$ hpc_campaign manager $ACA data ckpt.bp --name checkpoint
$ mpirun -n 3 adios2_simulations_gray-scott_pdf-calc gs.bp pdf.bp 1000
$ ls -d *.bp
ckpt.bp gs.bp pdf.bp
$ hpc_campaign manager $ACA data pdf.bp --name pdf
$ hpc_campaign manager $ACA text settings-files.json --store --name input/settings.json
$ hpc_campaign manager $ACA info
=============================
ADIOS Campaign Archive, version 0.7, created on Mar 20 08:19
Hosts and directories:
OLCF longhostname = frontier05341.frontier.olcf.ornl.gov
1. /lustre/orion/csc143/proj-shared/demo/gray-scott
Other Datasets:
0fce4b1173f432f7ae5d2282df9077a6 ADIOS Sep 10 14:25 gs
3a4bf0b14cc33424a470862bd67ed007 ADIOS Sep 10 14:25 checkpoint
b42d0da4a0793adca341ace1ff6e628d ADIOS Sep 10 14:28 pdf
85a0b724b22f37a4a79ad8a0cf1127d1 TEXT Sep 10 14:24 input/settings.json
# The campaign archive is small compared to the data it points to
$ du -sh *bp
263K ckpt.bp
385M gs.bp
104K pdf.bp
$ du -sh /lustre/orion/csc143/proj-shared/adios-campaign-store/$ACA
97K /lustre/orion/csc143/proj-shared/adios-campaign-store/demoproject/frontier_gray-scott_100.aca
# ADIOS can list the content of the campaign archive
$ bpls -l $ACA
double checkpoint/U {4, 34, 34, 66} = 0.171103 / 1
double checkpoint/V {4, 34, 34, 66} = 1.71086e-19 / 0.438921
int32_t checkpoint/step scalar = 700
double gs/U 100*{64, 64, 64} = 0.0908114 / 1
double gs/V 100*{64, 64, 64} = 0 / 0.674804
int32_t gs/step 100*scalar = 10 / 1000
double pdf/U/bins 100*{1000} = 0.0908235 / 1
double pdf/U/pdf 100*{64, 1000} = 0 / 4096
double pdf/V/bins 100*{1000} = 0 / 0.67413
double pdf/V/pdf 100*{64, 1000} = 0 / 4096
int32_t pdf/step 100*scalar = 10 / 1000
char input/settings.json {440} = A / Z
# scalar over steps is available in metadata
$ bpls -l $ACA -d pdf/step -n 10
int32_t pdf/step 10*scalar = 100 / 1000
( 0) 10 20 30 40 50 60 70 80 90 100
(10) 110 120 130 140 150 160 170 180 190 200
(20) 210 220 230 240 250 260 270 280 290 300
(30) 310 320 330 340 350 360 370 380 390 400
(40) 410 420 430 440 450 460 470 480 490 500
(50) 510 520 530 540 550 560 570 580 590 600
(60) 610 620 630 640 650 660 670 680 690 700
(70) 710 720 730 740 750 760 770 780 790 800
(80) 810 820 830 840 850 860 870 880 890 900
(90) 910 920 930 940 950 960 970 980 990 1000
# Array decomposition including min/max are available in metadata
$ bpls -l $ACA -D gs/V
double gs/V 10*{64, 64, 64} = 8.24719e-63 / 0.515145
step 0:
block 0: [ 0:63, 0:31, 0:31] = 0 / 0.600691
block 1: [ 0:63, 32:63, 0:31] = 0 / 0.600691
block 2: [ 0:63, 0:31, 32:63] = 0 / 0.600691
block 3: [ 0:63, 32:63, 32:63] = 0 / 0.600691
...
step 99:
block 0: [ 0:63, 0:31, 0:31] = 3.99938e-09 / 0.441838
block 1: [ 0:63, 32:63, 0:31] = 3.99946e-09 / 0.441802
block 2: [ 0:63, 0:31, 32:63] = 3.99966e-09 / 0.44183
block 3: [ 0:63, 32:63, 32:63] = 3.99955e-09 / 0.441833
# Array data is only available if data is local
$ bpls -l $ACA -d pdf/U/bins -n 10 -c "1,-1"
double pdf/U/bins 100*{1000} = 0.0908235 / 1
slice (0:0, 0:999)
(0, 0) 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992
...
(0,990) 1 1 1 1 1 1 1 1 1 1
$ bpls -l $ACA -d pdf/U/bins -n 10 -s "-1,0" -c "1,-1"
double pdf/U/bins 100*{1000} = 0.0908235 / 1
slice (99:99, 0:999)
(0, 0) 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992 0.999992
...
(0,990) 1 1 1 1 1 1 1 1 1 1
# TEXT data can be dumped if it is local, or stored in the ACA file itself (see --store option)
$ bpls -l $ACA -dSy input/settings.json
; char input/settings.json {440} = A / Z
"{
"L": 64,
...
"mesh_type": "image"
}
"
Remote access
For now, we have one way to access data, through SSH port forwarding and running a remote server program to read in data on the remote host and to send back the data to the local ADIOS program. adios2_remote_server is included in the adios installation. You need to use the one built on the host.
Assuming the campaign archive was synced to a local machine’s campaign store under csc143/demoproject, now we can look at some of the content:
$ hpc_campaign list gray-scott
csc143/demoproject/frontier_gray-scott_100.aca
$ bpls -l csc143/demoproject/frontier_gray-scott_100.aca
double ckpt/U {4, 34, 34, 66} = 0.171103 / 1
...
char input/settings.json {440} = A / Z
# metadata stored inside the campaign can be read without remote access
$ bpls -l csc143/demoproject/frontier_gray-scott_100.aca -d pdf.bp/step
int32_t pdf/step 10*scalar = 100 / 1000
( 0) 10 20 30 40 50 60 70 80 90 100
...
(90) 910 920 930 940 950 960 970 980 990 1000
# text (and image) files stored (embedded) in the ACA file can be read without remote access
$ bpls -l csc143/demoproject/frontier_gray-scott_100.aca -dyS input/settings.json
; char input/settings.json {440} = A / Z
"{
"L": 64,
...
"mesh_type": "image"
}
"
To read array data though, we need to set up remote data access. On the local machine set up ~/.config/hpc-campaign/hosts.yaml so that the campaign connector can find how to connect to OLCF.
Assuming that
I am user user007 at OLCF
installed adios2 into ~/dtn/sw/adios2
$ cat ~/.config/hpc-campaign/hosts.yaml
OLCF:
dtn-ssh:
protocol: ssh
host: dtn.olcf.ornl.gov
user: user007
authentication: passcode
serverpath: ~/dtn/sw/adios2/bin/adios2_remote_server
args: -background -report_port_selection -v -v -l ~/dtn/log.adios2_remote_server -t 16
verbose: 1
First, we need to launch the hpc_campaign connector, specifying to load the host configuration, and to listen on port 30000 for the requests for connections.
$ hpc_campaign connector -c ~/.config/hpc-campaign/hosts.yaml -p 30000
SSH Tunnel Server: 127.0.0.1 30000
Assuming the campaign archive was synced to a local machine’s campaign store under csc143/demoproject, now we can retrieve data:
# array data is requested from the remote server
# read 16 values (4x4x4) from U from last step, from offset 30,30,30
$ bpls -l csc143/demoproject/frontier_gray-scott_100.aca -d gs/U -s "-1,30,30,30" -c "1,4,4,4" -n 4
double gs/U 100*{64, 64, 64} = 0.0908114 / 1
slice (99:99, 30:33, 30:33, 30:33)
(99,30,30,30) 0.891887 0.899848 0.899847 0.891884
(99,30,31,30) 0.899851 0.908275 0.908275 0.899849
(99,30,32,30) 0.899852 0.908276 0.908276 0.89985
(99,30,33,30) 0.89189 0.899851 0.899851 0.891889
(99,31,30,30) 0.899848 0.908273 0.908272 0.899845
(99,31,31,30) 0.908275 0.916976 0.916975 0.908273
(99,31,32,30) 0.908276 0.916977 0.916976 0.908274
(99,31,33,30) 0.899851 0.908275 0.908275 0.899849
(99,32,30,30) 0.899847 0.908272 0.908271 0.899844
(99,32,31,30) 0.908275 0.916976 0.916975 0.908272
(99,32,32,30) 0.908275 0.916976 0.916976 0.908273
(99,32,33,30) 0.89985 0.908274 0.908274 0.899848
(99,33,30,30) 0.891886 0.899846 0.899845 0.891882
(99,33,31,30) 0.89985 0.908274 0.908273 0.899847
(99,33,32,30) 0.89985 0.908275 0.908274 0.899848
(99,33,33,30) 0.891888 0.899849 0.899849 0.891886
This array data should be listed after the connection manager pops up a window asking for the passcode to login to OLCF, and logs on screen activity similar to this:
$ hpc_campaign connector -c ~/.config/hpc-campaign/hosts.yaml -p 30000
SSH Tunnel Server: 127.0.0.1 30000
Client 127.0.0.1:
Request : /run_service?group=OLCF&service=dtn-ssh
Parsed Request: {'group': ['OLCF'], 'service': ['dtn-ssh']}
Remote service request: {'group': ['OLCF'], 'service': ['dtn-ssh']}
...
Connecting to remote server dtn.olcf.ornl.gov:22 ...
Service command: ~/dtn/sw/adios2/bin/adios2_remote_server -background -report_port_selection -v -v -l ~/dtn/log.adios2_remote_server -t 16
Parsing service response...
LINE: port:58547;msg:no_error;cookie:0xd93d91e3643c9869
SERVICE DATA: {'port': '58547', 'msg': 'no_error', 'cookie': '0xd93d91e3643c9869'}
Service data: {'port': '58547', 'msg': 'no_error', 'cookie': '0xd93d91e3643c9869'}
Checking if port 28000 is available.
Opening tunnel for local port 28000 to dtn.olcf.ornl.gov:58547
Got the forward server
Starting.
Connected! Tunnel open ('127.0.0.1', 50492) -> ('160.91.195.184', 22) -> ('dtn.olcf.ornl.gov', 58547)