Sophia Reiner authored 1 year ago

AR Grids: Docs

Goal:

Extract data from netCDF files and reformat it into simple binary files, separated by variable and pressure level.

Setup:

To run these scripts, you need to install a few Python packages.

I prefer to use conda, but you can use another package manager such as pip if you wish. To install conda, see the conda installation documentation.

Initialize the conda environment using environment.yml:

conda env create -f environment.yml
conda activate argrids

Scripts:

WRF grids:

nc_to_binary_wrf.py

This script reads in WRF data (it is not currently in use). Because the data grids are large and high resolution, they are downsampled: every grid is sampled down by a factor of 2 in both line and element. The number of elements in each grid is also a multiple of 4.
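As a rough illustration of the downsampling described above, here is a minimal NumPy sketch. It assumes plain subsampling (keeping every second line and element, not averaging), and it assumes the multiple-of-4 element count is reached by trimming trailing elements. Neither detail is confirmed by this README, and `downsample` and its parameters are hypothetical names.

```python
import numpy as np

def downsample(grid, factor=2, element_multiple=4):
    """Keep every `factor`-th line and element, then trim trailing elements
    so the element count is a multiple of `element_multiple` (assumption)."""
    small = grid[::factor, ::factor]
    n_elements = small.shape[1] - small.shape[1] % element_multiple
    return small[:, :n_elements]

# Placeholder grid of 10 lines by 14 elements.
grid = np.arange(10 * 14, dtype=np.float32).reshape(10, 14)
small = downsample(grid)
print(small.shape)  # (5, 4)
```

Striding keeps the original values untouched, which matters if the fields are later compared against the full-resolution data; block averaging would be an alternative if smoothing is acceptable.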

The variables which were extracted into separate binary files include:

XLAT, XLONG, CTOPHT, CTOPHT_TOT, CLRNIDX, QVAPOR

variables

Property	Type	Description
XLAT	float	Latitude, south is negative
XLONG	float	Longitude, west is negative
CTOPHT	float	Cloud top height
CTOPHT_TOT	float	Cloud top height resolved + unresolved
CLRNIDX	float	Clearness index
QVAPOR	float	levels 200, 250, 300, 400, 500, 600, 700, 850

QVAPOR:

For QVAPOR, we’re considering only the levels [200, 250, 300, 400, 500, 600, 700, 850].

These correspond to the indices [55, 59, 63, 70, 75, 80, 87, 91]
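The level-to-index pairing above can be kept in one lookup table so the two lists cannot drift apart. The sketch below assumes the QVAPOR array for one time step is ordered (level, line, element); the array shape and the name `select_levels` are placeholders, not taken from the script.

```python
import numpy as np

LEVELS = [200, 250, 300, 400, 500, 600, 700, 850]
LEVEL_INDICES = [55, 59, 63, 70, 75, 80, 87, 91]
LEVEL_TO_INDEX = dict(zip(LEVELS, LEVEL_INDICES))

def select_levels(qvapor):
    """Return {level: 2-D field} from a (level, line, element) array."""
    return {level: qvapor[idx] for level, idx in LEVEL_TO_INDEX.items()}

# Placeholder array standing in for one time step of QVAPOR.
qvapor = np.zeros((100, 6, 8), dtype=np.float32)
qvapor[70] = 1.0  # mark the slice that should map to level 400
fields = select_levels(qvapor)
print(sorted(fields))  # [200, 250, 300, 400, 500, 600, 700, 850]
```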

We convert the QVAPOR fields from floating point to the byte range 0 to 255, so each level needs its own slope and offset for the conversion. Several days of files will eventually be processed, but the slope and offset for each level stay constant over the entire time range.

For each pressure level in the first file, determine the range of QVAPOR values. As a first guess, expand that range by 20% and compute the slope and offset that convert floats to bytes (0 to 255). For example, if the QVAPOR range at one of the levels is 0.2 to 1.0 (a width of 0.8), a 20% expansion adds 0.16 (widening the range from 0.8 to 0.96), split evenly between the two ends, so the new min and max values will be

min = 0.2 - (1/2)(0.16) = 0.12

max = 1.0 + (1/2)(0.16) = 1.08
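The arithmetic of this worked example can be checked with a few lines of plain Python. `expand_range` is a hypothetical helper name; the 20% figure and the even split between both ends come from the text above.

```python
def expand_range(vmin, vmax, fraction=0.20):
    """Widen [vmin, vmax] by `fraction` of its width, half on each side."""
    pad = 0.5 * fraction * (vmax - vmin)
    return vmin - pad, vmax + pad

new_min, new_max = expand_range(0.2, 1.0)
print(round(new_min, 2), round(new_max, 2))  # 0.12 1.08
```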

Finally, we apply a log10 scale and then convert to the range 0 to 255.
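One way to read the steps above is sketched here: take log10 of the field and of the expanded min and max, then derive a slope and offset that map the log range onto 0 to 255. The exact order of operations and the way the script defines slope and offset are assumptions, and `to_bytes_log10` is a hypothetical name. The sketch also requires the expanded minimum to be positive, since log10 is undefined otherwise.

```python
import numpy as np

def to_bytes_log10(field, vmin, vmax):
    """Map `field` to 0..255 on a log10 scale; vmin and vmax must be > 0."""
    log_min, log_max = np.log10(vmin), np.log10(vmax)
    slope = 255.0 / (log_max - log_min)
    offset = -slope * log_min
    scaled = slope * np.log10(field) + offset
    # Round to the nearest integer and clip anything outside the range.
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

# Values spanning the expanded range from the worked example.
field = np.array([0.12, 0.36, 1.08])
encoded = to_bytes_log10(field, 0.12, 1.08)
print(encoded.dtype, encoded[0], encoded[-1])  # uint8 0 255
```

Because the slope and offset depend only on the expanded min and max, they can be computed once per level from the first file and reused for every later file, which is what keeps the encoding constant over the whole time range.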

A separate script, nc_to_binary_test.py, handles the processed WRF winds.

variables

Property	Type	Description
XLAT	float	Latitude, south is negative
XLONG	float	Longitude, west is negative
umet	float
vmet	float

These scripts aren’t optimized. In the future they should probably be changed to use xarray/dask for faster data loading and access.

ERA5 grids:

nc_to_binary_era5.py

Note: you need to change the input and output directories in the script before running it.

variables

Property	Description
xlat	Latitude
xlong	Longitude
cc	fraction of cloud cover
r	Relative humidity
u	U component of wind
v	V component of wind
level	pressure levels in the data
q	Specific humidity

cc: original values 0 to 1, so multiply by 100 and write out as 1-byte ints. No stretching.

r: original values 0 to 100, write out as 1-byte ints. No stretching.

q: this is the only grid that gets the stretch with the 20% range expansion. Apply a log scale, then scale to 0 to 255 and write out as 1-byte ints (same as QVAPOR for WRF above).

u and v: written out directly as binary floats, with no scaling.
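The unstretched encodings above can be sketched as follows. The details are assumptions rather than a description of nc_to_binary_era5.py: values are rounded to the nearest integer, clipped to 0 to 255 as a safeguard against wrap-around, and the winds are stored as 32-bit floats in native byte order. The function names are hypothetical.

```python
import numpy as np

def encode_cc(cc):
    """Cloud cover fraction 0..1 -> percent 0..100 as unsigned bytes."""
    return np.clip(np.rint(cc * 100.0), 0, 255).astype(np.uint8)

def encode_r(r):
    """Relative humidity in percent -> unsigned bytes, no stretching."""
    return np.clip(np.rint(r), 0, 255).astype(np.uint8)

def encode_wind(w):
    """Wind component -> unscaled 32-bit floats (float size is assumed)."""
    return w.astype(np.float32)

cc = np.array([0.0, 0.25, 1.0])
u = np.array([-3.5, 12.25])
payload = encode_cc(cc).tobytes() + encode_wind(u).tobytes()
print(len(payload))  # 3 one-byte values + 2 four-byte floats = 11
```

To write a grid to disk, the encoded array can be passed to NumPy's `ndarray.tofile` with an output path of your choice, which writes the raw values with no header.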

Warnings:

A runtime warning is raised when log10 is applied to negative values. Those values are just masked, which I think is fine.
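If the warning becomes noisy, one option is to silence it locally and mask the invalid results explicitly. This is a sketch of that idea using NumPy's `errstate` and `ma.masked_invalid`, not a description of what the scripts currently do; the sample values are made up.

```python
import numpy as np

q = np.array([1e-3, -2e-5, 4e-4])  # made-up values, one of them negative

# Suppress the warnings only around the log, then mask NaN/inf results.
with np.errstate(invalid="ignore", divide="ignore"):
    log_q = np.ma.masked_invalid(np.log10(q))

print(log_q.mask.tolist())  # [False, True, False]
```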