
Parallel Segments (pseg) Demo

This demo ships with its own clavrx_options and file_list.
You can change these however you like; the input doesn't even need to be GOES.
I recommend first testing your configuration by running ./clavrxorb on its own.

This demo runs as fast as possible, meaning that there is no limit on the number of workers created. You should therefore only run it on a powerful machine with lots of memory (>50GB).
Also try tuning the segment size (scan_lines) in clavrx_options. Smaller segments mean more workers running in parallel, but also more memory used.
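To get a feel for how scan_lines affects the worker count, here is a minimal sketch. It assumes one worker per segment, as described above; the total line count (5424) and the candidate segment sizes are hypothetical values chosen for illustration, not values taken from this demo's configuration.

```python
import math


def segment_count(total_scan_lines: int, scan_lines_per_segment: int) -> int:
    """Number of segments (and so worker clones) needed for one input file."""
    if scan_lines_per_segment <= 0:
        raise ValueError("scan_lines_per_segment must be positive")
    return math.ceil(total_scan_lines / scan_lines_per_segment)


# Hypothetical image height and candidate scan_lines settings.
TOTAL_SCAN_LINES = 5424
for scan_lines in (200, 400, 800):
    workers = segment_count(TOTAL_SCAN_LINES, scan_lines)
    print(f"scan_lines={scan_lines}: about {workers} workers")
```

Halving scan_lines roughly doubles the number of workers, so memory pressure should be checked again after every change.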

Dependencies

This demo is very light on dependencies, but requires a non-ancient Python version (>=3.7) and numpy.

The other major dependency is clavrx, which must support tracing and reopening the netcdf file for every segment. This was implemented in commit c1809117028f on 2022-11-14, so any newer version of clavrx is supported.

This demo also requires the nm tool to read the symbol table of the executable, though the values could be hardcoded if necessary.
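As an illustration of what reading the symbol table involves, the sketch below runs nm and builds a name-to-address map. This is not the code from run_pseg.py; the function names and the sample symbol tracer_stop_point_ are hypothetical, and the parser only assumes nm's default three-column output (address, type, name).

```python
import subprocess


def parse_nm(output: str) -> dict:
    """Map symbol name -> address from default `nm` output."""
    symbols = {}
    for line in output.splitlines():
        parts = line.split()
        # Undefined symbols ("U name") have no address column; skip them.
        if len(parts) != 3:
            continue
        address, _kind, name = parts
        symbols[name] = int(address, 16)
    return symbols


def read_symbols(executable: str) -> dict:
    """Run `nm` on an executable and return its defined symbols."""
    result = subprocess.run(
        ["nm", executable], capture_output=True, text=True, check=True
    )
    return parse_nm(result.stdout)


# Hypothetical nm output, used here so the example runs without an executable.
SAMPLE = """\
                 U puts
0000000000401136 T main
0000000000404028 B tracer_stop_point_
"""
print(parse_nm(SAMPLE))
```

If nm is unavailable, the resulting dictionary is the kind of value that would have to be hardcoded instead.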

Invocation

Parallel Segments

python run_pseg.py is a wrapper that executes ./clavrxorb

On SSEC machines, the basic miniconda environment satisfies all requirements:

module load miniconda/3.7-base

/usr/bin/time -v python run_pseg.py

Pseg will redirect clavrx output to files in the cwd and periodically display the process tree to help monitor progress.

Normal clavrx

The clavrx executable can be used normally too

./clavrxorb

Performance

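Note that memory usage is difficult to measure in the parallel case. You need cgroups to accurately measure the proportional set size (PSS), which is what slurm checks. As an approximation, you can add up the resident set size (RSS) of all the concurrent clavrxorb processes. Keep in mind that the "Maximum resident set size" reported by /usr/bin/time -v below is the peak of the largest single process, not the total across all workers, so the values for the two runs are not directly comparable.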
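A rough way to add up the RSS of the running workers on Linux is to read /proc directly. The sketch below is an assumption about how one might do this, not part of the demo; the function names are hypothetical, and it relies only on the VmRSS line of /proc/<pid>/status and the process name in /proc/<pid>/comm.

```python
import os


def rss_kb_from_status(status_text: str) -> int:
    """Extract VmRSS (in kB) from the contents of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0  # kernel threads have no VmRSS line


def total_rss_kb(process_name: str = "clavrxorb", proc_root: str = "/proc") -> int:
    """Sum VmRSS over every live process whose name matches process_name."""
    total = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() != process_name:
                    continue
            with open(os.path.join(proc_root, entry, "status")) as f:
                total += rss_kb_from_status(f.read())
        except OSError:
            continue  # the process exited while we were reading it
    return total


if os.path.isdir("/proc"):
    print(f"clavrxorb RSS total: {total_rss_kb()} kB")
```

Because forked processes share copy-on-write pages, this sum counts shared pages once per process and so tends to overestimate real usage, which is why PSS is the more accurate figure.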

Parallel Segments

        Command being timed: "python run_pseg.py"
        User time (seconds): 2097.66
        System time (seconds): 326.85
        Percent of CPU this job got: 660%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 6:06.94
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 7512888
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 4
        Minor (reclaiming a frame) page faults: 91394621
        Voluntary context switches: 35157
        Involuntary context switches: 4777
        Swaps: 0
        File system inputs: 15332856
        File system outputs: 649152
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0

Normal

        Command being timed: "./clavrxorb"
        User time (seconds): 1850.83
        System time (seconds): 35.67
        Percent of CPU this job got: 96%
        Elapsed (wall clock) time (h:mm:ss or m:ss): 32:28.49
        Average shared text size (kbytes): 0
        Average unshared data size (kbytes): 0
        Average stack size (kbytes): 0
        Average total size (kbytes): 0
        Maximum resident set size (kbytes): 12181464
        Average resident set size (kbytes): 0
        Major (requiring I/O) page faults: 0
        Minor (reclaiming a frame) page faults: 10085835
        Voluntary context switches: 10582
        Involuntary context switches: 1286
        Swaps: 0
        File system inputs: 4638016
        File system outputs: 568936
        Socket messages sent: 0
        Socket messages received: 0
        Signals delivered: 0
        Page size (bytes): 4096
        Exit status: 0
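The two reports above can be compared with a little arithmetic. The sketch below uses only the figures printed by /usr/bin/time; the helper name to_seconds is hypothetical.

```python
def to_seconds(elapsed: str) -> float:
    """Convert GNU time's 'h:mm:ss' or 'm:ss' elapsed format to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


# Elapsed (wall clock) times from the two runs above.
parallel_wall = to_seconds("6:06.94")
normal_wall = to_seconds("32:28.49")
speedup = normal_wall / parallel_wall

# User + system CPU seconds from the two runs above.
parallel_cpu = 2097.66 + 326.85
normal_cpu = 1850.83 + 35.67
cpu_ratio = parallel_cpu / normal_cpu

print(f"wall-clock speedup: {speedup:.2f}x")
print(f"total CPU time ratio: {cpu_ratio:.2f}x")
```

In this run the parallel version finishes roughly 5.3 times sooner while spending about 1.3 times as much total CPU time, most of the extra being system time.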

How it works

If clavrx is executed with the environment variable CLAVRX_ENABLE_TRACER=1, it runs in "tracing mode": at certain points during processing it stops and waits for another process to signal it with SIGCONT. A second environment variable, CLAVRX_TRACER_CLONES=N, makes clavrx create N clones for each segment. The clones are created after the L1b data is read in, but before processing starts. So, with clavrx in tracing mode and set to make a single clone per segment, we can dedicate a single clone to reading the L1b and fork a worker clone for each segment, with the workers running in parallel.
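The stop-and-continue handshake can be illustrated with a toy example. This is only a sketch of the general POSIX mechanism, not clavrx's tracer or run_pseg.py: it assumes the traced process pauses by sending itself SIGSTOP, which the description above does not confirm, and the function name and exit code are made up for the demonstration.

```python
import os
import signal


def run_stop_and_continue() -> int:
    """Fork a child that pauses itself, then resume it with SIGCONT."""
    pid = os.fork()
    if pid == 0:
        # Child: pause at a "trace point" until another process sends SIGCONT.
        os.kill(os.getpid(), signal.SIGSTOP)
        # Execution resumes here after SIGCONT; exit with a recognizable code.
        os._exit(7)

    # Parent: block until the child reports that it has stopped.
    _, status = os.waitpid(pid, os.WUNTRACED)
    if not os.WIFSTOPPED(status):
        raise RuntimeError("child did not stop as expected")

    # Let the child continue, then collect its exit status.
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


print("child exit code:", run_stop_and_continue())
```

A controller built on this pattern can decide when each stopped process is allowed to proceed, which is the role the wrapper plays for clavrx in tracing mode.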

Slides

Parallel Segments with Clavrx Tracer
How the CLAVR-x Tracer System Works

More Notes

  • the responsibility of writing the netcdf has returned to clavrx