Tests and CI
Testing is tricky. Ideally we’d test against actual files, but that is almost 1 GB of data that we don’t want to put into this repo. To get started, I’d ignore testing on real data for now and just pick a function or two and write a simple unit test for each.
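As a rough sketch of what "a simple unit test for a function or two" could look like, here is a test for a small helper. `wrap_longitude` is a hypothetical function made up for illustration, not something from this repo, and the plain `assert`-style test functions assume a pytest-like runner would collect them.

```python
import math


def wrap_longitude(lon_deg):
    """Hypothetical helper: map any longitude in degrees into [-180, 180)."""
    return (lon_deg + 180.0) % 360.0 - 180.0


def test_in_range_value_is_unchanged():
    assert math.isclose(wrap_longitude(-75.5), -75.5)


def test_values_past_the_dateline_wrap_around():
    assert math.isclose(wrap_longitude(190.0), -170.0)
    assert math.isclose(wrap_longitude(-190.0), 170.0)


def test_upper_bound_is_exclusive():
    # 180 degrees east is reported as -180 so the range stays half-open.
    assert math.isclose(wrap_longitude(180.0), -180.0)
```

Tests like these need no data files at all, which is why they make a reasonable starting point.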
Once there are some tests, add CI to the Git project.
- Simulate smaller data? For instance, a VIIRS geolocation file has 3200x3232 lat/lons. But since we will use libraries like xarray to read the file, and will write the code so that it works with arbitrary lat/lon dimensions, there is no reason we couldn’t make a geolocation granule that is only 10x10. Think of picking an interesting feature (an area with variation in the output data). If we did create this tiny file for testing, why not write the “truth” (what we believe is the correct answer) into that tiny file, so our tests can run and verify that the output matches it? Over time I’d imagine we will find legitimate cases where our truth wasn’t correct and we are confident in a new one. In that case I’d use an issue to document the initial truth and why it was wrong, and then update to the new truth.
- I really like the idea of simulating small quantities of data for testing, but there are also advantages to running on full granules. In particular, our scientists typically think in terms of granules, so using full granules for the test cases might make more sense to them.
- Git LFS? I’ve never managed to make this work well with tests. That’s not to say it can’t work, but I’m not describing it here because I’ve never made it work myself.
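The tiny-granule idea above could be sketched roughly as follows. To keep the sketch dependency-free, a JSON file and nested lists stand in for the NetCDF file and xarray arrays the real code would presumably use. `compute_product`, `make_tiny_granule`, `check_against_truth`, and the file name are all hypothetical names, and the formula inside `compute_product` is a placeholder rather than a real algorithm.

```python
import json
import math
import tempfile
from pathlib import Path

N_ROWS, N_COLS = 10, 10  # tiny stand-in for a 3200x3232 geolocation granule


def compute_product(lat, lon):
    """Hypothetical stand-in for the real algorithm: one value per pixel."""
    return [
        [math.cos(math.radians(la)) * math.sin(math.radians(lo))
         for la, lo in zip(lat_row, lon_row)]
        for lat_row, lon_row in zip(lat, lon)
    ]


def make_tiny_granule(path):
    """Write a small lat/lon grid plus the currently accepted truth."""
    lat = [[30.0 + 0.1 * r for _ in range(N_COLS)] for r in range(N_ROWS)]
    lon = [[-90.0 + 0.1 * c for c in range(N_COLS)] for _ in range(N_ROWS)]
    granule = {"lat": lat, "lon": lon, "truth": compute_product(lat, lon)}
    Path(path).write_text(json.dumps(granule))


def check_against_truth(path, tol=1e-9):
    """Recompute the product from the stored lat/lon and compare to truth."""
    granule = json.loads(Path(path).read_text())
    result = compute_product(granule["lat"], granule["lon"])
    return all(
        math.isclose(got, want, abs_tol=tol)
        for got_row, want_row in zip(result, granule["truth"])
        for got, want in zip(got_row, want_row)
    )


# Example run in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    granule_path = Path(tmp) / "tiny_granule.json"
    make_tiny_granule(granule_path)
    assert check_against_truth(granule_path)
```

In practice the tiny file would be generated once and committed, so that a later change in the algorithm shows up as a mismatch against the stored truth. Regenerating it on every run, as the example at the bottom does for brevity, would hide such changes.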