RAIN Software Architecture
This repository (MetObsCommon) is the dumping ground for all functionality that may be shared between the software for the rooftop, SPARC, or any other instrument that is managed by the RAIN team. Below are some example executions of commands in this module that may be useful.
Note that some of the commands below may call code outside of this package, such as code in the AossTower or MendotaBuoy packages. For more details, up-to-date examples, and command line options, see the documentation for those specific packages. In the future, more examples and information will be provided in this GitLab project's Wiki.
Real world code can be found in the 'crontab' of the 'metobs' user on the rain01 and rain02 servers.
Note
The below commands call a generic python. When run in the real world these should use a python with the mentioned packages installed and use the full path to that environment's python executable.
Initialize InfluxDB v2 Database
This should only be needed once when setting up a server with a fresh InfluxDB database.
Prerequisites
Follow the installation instructions for InfluxDB v2.x:
https://docs.influxdata.com/influxdb
For operational use we currently install InfluxDB as a systemd service. At the time of writing this means running:
```bash
sudo yum install influxdb2
```
We'll also need the influx CLI to initialize the database:
https://docs.influxdata.com/influxdb/v2.6/reference/cli/influx/
At the time of writing this means running:
```bash
sudo yum install influxdb2-cli
```
Setup database
These commands should probably be run as the metobs user so that the resulting influx CLI configuration is stored under that account.
influx setup -u metobs -p PASSWORD -o metobs -b metobs_raw --retention 0 -n metobs_operator -f
Where PASSWORD should be replaced by the password in the /home/metobs/influxdb_operator_password.txt file. This creates a user "metobs" and an organization named "metobs". It creates an initial bucket called "metobs_raw". The retention period for this bucket is set to 0 (infinite) initially but will be changed later once data has been added and aggregated to "realtime" buckets. This configuration is saved under the profile "metobs_operator". It is named this because this is an operator account that has permissions to all buckets on all organizations. It should not be used for normal usage. Lastly, the -f flag stops you from being prompted for confirmation.
Since no token was specified in this command, InfluxDB will generate one for us and store it in ~/.influxdbv2/configs.
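As a rough illustration of how the setup call could be assembled without typing the password by hand, the sketch below reads the password file and builds the argument list for the command shown above. The helper name `build_setup_command` is hypothetical and not part of this package; the flags are copied from the command above.

```python
from pathlib import Path


def build_setup_command(password_file):
    """Return the argv list for the `influx setup` call described above.

    Hypothetical helper for illustration; not part of metobscommon.
    """
    # Strip the trailing newline that editors usually leave in the file.
    password = Path(password_file).read_text().strip()
    if not password:
        raise ValueError(f"no password found in {password_file}")
    return [
        "influx", "setup",
        "-u", "metobs",           # initial user
        "-p", password,           # password read from the file
        "-o", "metobs",           # organization
        "-b", "metobs_raw",       # initial bucket
        "--retention", "0",       # 0 = keep data forever
        "-n", "metobs_operator",  # CLI config profile name
        "-f",                     # skip the confirmation prompt
    ]
```

The returned list could then be handed to `subprocess.run(..., check=True)` on a server where the influx CLI is installed.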
Note
The password file should be read-only and accessible only to the "metobs" user on the server.
Create realtime bucket
influx bucket create -o metobs -n metobs_realtime -d "Aggregated data for realtime displays" -r 0
This way we have a "metobs_raw" bucket for all full resolution data and a "metobs_realtime" bucket for all aggregated/averaged data. List the buckets to get the bucket IDs to be used when creating tokens.
influx bucket ls
Create operational tokens
A read-only token for the metobs_raw and metobs_realtime buckets:
influx auth create -o metobs --read-bucket abcd --read-bucket efgh -d "Read-only access to metobs buckets"
This creates a token (printed to the terminal) in the "metobs" organization that has read access to the buckets with IDs "abcd" and "efgh". Replace these IDs with the IDs for the raw and realtime buckets from the previous influx bucket ls command.
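Copying bucket IDs by hand is easy to get wrong, so here is a minimal sketch of looking them up by name instead. It assumes the bucket listing is requested as JSON (for example with the CLI's `--json` flag) and that each entry carries `name` and `id` fields; check this against the installed CLI version. The helper names are hypothetical, and "abcd"/"efgh" are the placeholder IDs used above.

```python
import json


def bucket_ids(bucket_list_json, names):
    """Map each requested bucket name to its ID.

    Assumes a JSON array of objects with "name" and "id" keys.
    """
    buckets = {b["name"]: b["id"] for b in json.loads(bucket_list_json)}
    missing = [n for n in names if n not in buckets]
    if missing:
        raise KeyError(f"buckets not found: {missing}")
    return {n: buckets[n] for n in names}


def read_only_token_command(ids):
    """Build the read-only `influx auth create` argv shown above."""
    cmd = ["influx", "auth", "create", "-o", "metobs"]
    for bucket_id in ids.values():
        cmd += ["--read-bucket", bucket_id]
    cmd += ["-d", "Read-only access to metobs buckets"]
    return cmd


# Placeholder listing using the example IDs from this document.
listing = json.dumps([
    {"id": "abcd", "name": "metobs_raw"},
    {"id": "efgh", "name": "metobs_realtime"},
])
ids = bucket_ids(listing, ["metobs_raw", "metobs_realtime"])
print(" ".join(read_only_token_command(ids)))
```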
A read-write token for ingest purposes:
influx auth create -o metobs --read-bucket abcd --read-bucket efgh --write-bucket abcd --write-bucket efgh --read-tasks --write-tasks -d "Read-write access to metobs buckets"
Make sure to note the tokens from these commands. They will not be readable anywhere else.
Store the read-only token in /home/metobs/influxdbv2_metobs_ro.txt and the read-write token in /home/metobs/influxdbv2_metobs_rw.txt.
Change the permissions for these two files to read-only for the metobs user:
chmod 400 /home/metobs/influxdbv2_metobs_ro.txt
chmod 400 /home/metobs/influxdbv2_metobs_rw.txt
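A script that consumes one of these token files could verify the permissions before reading it. The following is only a sketch under the assumption of a POSIX file system; `is_owner_read_only` and `read_token` are hypothetical helpers, not functions provided by this package.

```python
import os
import stat


def is_owner_read_only(path):
    """True if the file mode is exactly 0400 (owner read, nothing else)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode == stat.S_IRUSR


def read_token(path):
    """Return the token stored in `path`, refusing loosely protected files."""
    if not is_owner_read_only(path):
        raise PermissionError(f"{path} should have mode 0400")
    with open(path) as f:
        # Drop the trailing newline so the token can be passed as-is.
        return f.read().strip()
```

For example, `read_token("/home/metobs/influxdbv2_metobs_rw.txt")` would return the read-write token once the `chmod 400` commands above have been run.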
Create recurring Tasks
Install averaging tasks to create average fields in the metobs_realtime bucket:
python -m metobscommon.influxdb --influxdb-token <READ_WRITE_TOKEN> create_tasks
Backfill InfluxDB Database
Insert data from an old tower file:
python -m aosstower.level_00.influxdb --influxdb-token <READ_WRITE_TOKEN> -vvv --bulk 5000 /data1/raw/aoss/tower/2018/05/08/aoss_tower.2018-05-08.ascii
The above command sends data in blocks of 5000 records. This performs better than sending one record at a time to InfluxDB. A bulk value of 5000-10000 is preferred.
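To illustrate what sending data "in blocks" means, here is a generic batching sketch. It is not the aosstower implementation, just one common way to group an iterable of records into fixed-size blocks; the name `batched` is chosen for this example.

```python
from itertools import islice


def batched(records, size):
    """Yield lists of up to `size` records from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(records)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


# 12000 records with a bulk size of 5000 give two full blocks and a remainder.
sizes = [len(block) for block in batched(range(12000), 5000)]
print(sizes)  # [5000, 5000, 2000]
```

Each block would then be written to the database in a single request rather than one request per record.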
Compute the averages for the 5-second tower data:
python -m metobscommon.influxdb --influxdb-token <READ_WRITE_TOKEN> -vvv run_manual_average --stations aoss.tower -s 2018-05-07T00:00:00 -e 2018-05-08T22:00:00 -d 1m 5m 1h
Note that the above computes the 1m, 5m, and 1h averages. The time range (-s/-e) must start and end on whole multiples of every averaging interval specified; otherwise partial averages will be written. Any existing data points in InfluxDB are overwritten by the new data points created during the execution of this script. Also note that long time ranges can take a while to process and may use a large amount of memory.
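A quick way to check a time range before running the averages is sketched below. It assumes the timestamps are UTC and that averaging windows start on whole multiples of the interval counted from midnight; both are assumptions about the averaging code, not something this document confirms. The helper names are hypothetical.

```python
from datetime import datetime, timezone

# Interval labels used on the command line, in seconds.
INTERVAL_SECONDS = {"1m": 60, "5m": 300, "1h": 3600}


def is_aligned(timestamp, interval):
    """True if an ISO timestamp falls on a whole multiple of the interval."""
    # Assumption: naive timestamps are interpreted as UTC.
    dt = datetime.fromisoformat(timestamp).replace(tzinfo=timezone.utc)
    whole_seconds = int(dt.timestamp())
    return dt.microsecond == 0 and whole_seconds % INTERVAL_SECONDS[interval] == 0


def misaligned_intervals(start, end, intervals):
    """Return the intervals for which start or end would cut a window short."""
    return [i for i in intervals
            if not (is_aligned(start, i) and is_aligned(end, i))]


# The range from the example above is aligned for all three intervals.
print(misaligned_intervals("2018-05-07T00:00:00", "2018-05-08T22:00:00",
                           ["1m", "5m", "1h"]))  # []
# Starting at 00:03:00 would produce partial 5m and 1h averages.
print(misaligned_intervals("2018-05-07T00:03:00", "2018-05-08T22:00:00",
                           ["1m", "5m", "1h"]))  # ['5m', '1h']
```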
To insert a series of tower files:
find /data1/raw/aoss/tower/2018/ -name "*.ascii" -print0 | sort -z | xargs -r0 -n1 python -m aosstower.level_00.influxdb -vvv --bulk 5000 --influxdb-token <READ_WRITE_TOKEN>
The above command sorts the files by name, which is important for the best performance.
Note
For the Buoy instrument there is typically more than one ASCII file per day (metdata and limnodata). Both of these files should be added before the averaging command is run.
A bash one-liner for processing one year at a time and running the averages one month at a time (to avoid overwhelming the database):
# AOSS Tower
for YEAR in `seq 2023 -1 2006`; do echo ${YEAR}; time find /data1/raw/aoss/tower/${YEAR} -name "*.ascii" -print0 | sort -z | xargs -r0 -n1 /opt/metobs/aoss_tower/bin/python -m aosstower.level_00.influxdb -v --sleep-interval 5.0 --bulk 10000 --influxdb-token $(</home/metobs/influxdbv2_metobs_rw.txt); for month in `seq -f %02.0f 1 12`; do /opt/metobs/aoss_tower/bin/python -m metobscommon.influxdb -vvv --influxdb-token $(</home/metobs/influxdbv2_metobs_rw.txt) run_manual_average --stations aoss.tower -s ${YEAR}-${month}-01T00:00:00 -e $(date -d "${YEAR}-${month}-01 + 1 month" +%Y-%m-%dT%H:%M:%S); sleep 5.0; done; echo "Done with $YEAR"; sleep 60; done
# Mendota Buoy
for YEAR in `seq 2023 -1 2013`; do echo ${YEAR}; time find /data1/raw/mendota/buoy/${YEAR} \( -name "*limnodata.*ascii" -o -name "*metdata.*.ascii" \) -print0 | sort -z | xargs -r0 -n1 /opt/metobs/mendota_buoy/bin/python -m mendotabuoy.level_00.influxdb -v --sleep-interval 5.0 --bulk 10000 --influxdb-token $(</home/metobs/influxdbv2_metobs_rw.txt); for month in `seq -f %02.0f 2 11`; do /opt/metobs/mendota_buoy/bin/python -m metobscommon.influxdb -vvv --influxdb-token $(</home/metobs/influxdbv2_metobs_rw.txt) run_manual_average --stations mendota.buoy -s ${YEAR}-${month}-01T00:00:00 -e $(date -d "${YEAR}-${month}-01 + 1 month" +%Y-%m-%dT%H:%M:%S); sleep 5.0; done; echo "Done with $YEAR"; sleep 60; done
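The month-by-month time ranges that the one-liners above build with `seq` and `date` can also be generated in Python, which may be easier to read when adapting the backfill to other instruments. This is a sketch only; `month_ranges` is a hypothetical helper and not part of this package.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def month_ranges(year, first_month=1, last_month=12):
    """Yield (start, end) strings, one pair per calendar month."""
    for month in range(first_month, last_month + 1):
        start = datetime(year, month, 1)
        # The end of December is the first day of the following year.
        if month == 12:
            end = datetime(year + 1, 1, 1)
        else:
            end = datetime(year, month + 1, 1)
        yield start.strftime(TIME_FORMAT), end.strftime(TIME_FORMAT)


# Tower: all twelve months. Buoy: months 2 through 11, as in the loop above.
tower = list(month_ranges(2023))
buoy = list(month_ranges(2023, first_month=2, last_month=11))
print(tower[0])   # ('2023-01-01T00:00:00', '2023-02-01T00:00:00')
print(tower[-1])  # ('2023-12-01T00:00:00', '2024-01-01T00:00:00')
```

Each pair corresponds to the `-s` and `-e` arguments of one `run_manual_average` call.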