Eva Schiffer / UW-Glance / Issues / #35

Closed
Open
Created Feb 05, 2019 by Alan De Smet (@adesmet, Developer)

Excessive memory usage

A NetCDF4 byte-sized variable of size 5424×5424×33 takes about 0.9GB in memory, or 3.6GB if unpacked to 32-bit integers. Running glance stats on such a variable needs about 44GB, which is far more than expected. Similarly large files can use seemingly excessive amounts of memory when plotting.
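The size estimates above can be checked with a little arithmetic. This is only a sketch of the raw array footprint for a few element types; `footprint_gib` is a hypothetical helper, the float64 row is an added assumption for comparison (the issue does not say which types glance uses internally), and sizes are in binary gigabytes (GiB), which is what matches the 0.9 and 3.6 figures.

```python
import numpy as np

# Shape of the variable described in this issue.
shape = (5424, 5424, 33)
n_elements = int(np.prod(shape, dtype=np.int64))


def footprint_gib(dtype):
    """Size in GiB of one in-memory copy of the variable with this dtype."""
    return n_elements * np.dtype(dtype).itemsize / 2**30


# int8 and int32 are the cases named in the issue; float64 is an assumption
# added only to show the size of one double-precision temporary.
for name in ("int8", "int32", "float64"):
    print(f"{name:8s} {footprint_gib(name):6.2f} GiB")
```

For scale, one float64 copy comes to roughly 7.2 GiB, so a handful of full-size temporaries alive at once would already reach tens of gigabytes.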

Here's a simple test case: test.nc (6.7K). Despite the small file size, it contains a 5424×5424×33 variable. glance stats peaks at about 44GB on my system.

If you set a 32GB virtual memory limit, it fails on my system:

$ ulimit -v 33554432
$ glance stats ./test.nc  ./test.nc 
--------------------------------
testvar

Traceback (most recent call last):
  File "/home/adesmet/bin/glance", line 11, in <module>
    load_entry_point('uwglance', 'console_scripts', 'glance')()
  File "/home/adesmet/src/glance/pyglance/glance/compare.py", line 1613, in main
    rc = lower_locals[args[0].lower()](*args[1:])
  File "/home/adesmet/src/glance/pyglance/glance/compare.py", line 1314, in stats
    output_channel=toPrintTo)
  File "/home/adesmet/src/glance/pyglance/glance/compare.py", line 1083, in stats_library_call
    variable_stats = statistics.StatisticalAnalysis.withSimpleData(aData, bData, amiss, bmiss, epsilon=epsilon)
  File "/home/adesmet/src/glance/pyglance/glance/stats.py", line 883, in withSimpleData
    new_object._create_stats(diffInfo)
  File "/home/adesmet/src/glance/pyglance/glance/stats.py", line 910, in _create_stats
    self.comparison   = NumericalComparisonStatistics(diffInfoObject)
  File "/home/adesmet/src/glance/pyglance/glance/stats.py", line 756, in __init__
    self.correlation                = delta.compute_correlation(aData, bData, valid_in_both)  if not noData else np.nan
  File "/home/adesmet/src/glance/pyglance/glance/delta.py", line 122, in compute_correlation
    toReturn = compute_r_function(good_x_data, good_y_data)[0]
  File "/usr/lib/python2.7/dist-packages/scipy/stats/stats.py", line 3018, in pearsonr
    xm, ym = x - mx, y - my
MemoryError
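The traceback ends inside scipy's pearsonr at `xm, ym = x - mx, y - my`, a line that builds two full-size temporary arrays. One possible way to avoid that is to accumulate the correlation in chunks. The following is a minimal sketch under that assumption, not glance's or scipy's code; `chunked_pearson` and `chunk_size` are hypothetical names, and it assumes both inputs are already free of missing values.

```python
import numpy as np


def chunked_pearson(x, y, chunk_size=1_000_000):
    """Pearson r computed in two passes over fixed-size chunks.

    Only chunk-sized float64 temporaries are created, instead of
    full-size centered copies of x and y.
    """
    x = np.ravel(x)  # a view for contiguous input; may copy otherwise
    y = np.ravel(y)
    if x.size != y.size:
        raise ValueError("x and y must have the same number of elements")
    n = x.size

    # Pass 1: accumulate sums in float64 to get the means.
    sum_x = 0.0
    sum_y = 0.0
    for start in range(0, n, chunk_size):
        sl = slice(start, start + chunk_size)
        sum_x += x[sl].sum(dtype=np.float64)
        sum_y += y[sl].sum(dtype=np.float64)
    mean_x = sum_x / n
    mean_y = sum_y / n

    # Pass 2: accumulate centered sums of squares and cross products.
    sxx = 0.0
    syy = 0.0
    sxy = 0.0
    for start in range(0, n, chunk_size):
        sl = slice(start, start + chunk_size)
        dx = x[sl].astype(np.float64) - mean_x
        dy = y[sl].astype(np.float64) - mean_y
        sxx += np.dot(dx, dx)
        syy += np.dot(dy, dy)
        sxy += np.dot(dx, dy)

    return sxy / np.sqrt(sxx * syy)
```

With this approach the extra memory is bounded by `chunk_size` rather than by the variable size, at the cost of reading the data twice.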

I ran into this on real AIT_Framework GOES-16 output data. Running glance reportGen on my 64GB RAM laptop failed (and wedged the laptop in the process).

There are more examples (but without the output data!) at #34 (closed), where I did my initial investigation.

My system is Ubuntu 18.04.1 LTS (64-bit), running on a 4-core x86-64 Intel CPU with 64GB of memory.

Edited Apr 23, 2019 by Alan De Smet