CAM raw reader is HORRIBLE in parallel
This issue was created automatically from an original Mantis Issue. Further discussion may take place here.
The CAM raw reader is horrible in parallel. It is horrible from a memory perspective as well as a performance perspective.
My test is as follows:
- Read famip5_ne240b.cam.h3.2001-09-09-03600.nc. This is a parallel reader.
- Enable opacity mapping for surfaces. (I don't think this is the issue).
- Plane (offset left side of screen). Texture map. Clip.
- Plane (offset right side of screen). Texture map. Clip.
Running with a remote server, performance is as follows:
- 1 core - 14 seconds final render - 1.58 GB/core
- 2 cores - ?? seconds final render - 1.62 GB/core
- 4 cores - ?? seconds final render - 1.24 GB/core
- 8 cores - 26 seconds final render - 1.04 GB/core (8.45 total)
So, 8 cores run at about half the speed of 1 core (26 seconds vs. 14 seconds) and take more than 5 times as much total memory (8.45 GB vs. 1.58 GB). Ouch!!
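For anyone who wants to redo the arithmetic, here is a minimal sketch in Python. The numbers are copied from the list above; the dictionary layout and the function names `total_gb` and `slowdown` are only illustrative and not part of any ParaView tooling. The 2- and 4-core timings were not recorded, so they stay as `None`. Totals are computed from the rounded per-core figures, so the 8-core total comes out slightly below the 8.45 GB reported above.

```python
# Measurements copied from the list above. Timings for 2 and 4 cores were
# not recorded in the report, so they are left as None rather than guessed.
runs = {
    1: {"seconds": 14, "gb_per_core": 1.58},
    2: {"seconds": None, "gb_per_core": 1.62},
    4: {"seconds": None, "gb_per_core": 1.24},
    8: {"seconds": 26, "gb_per_core": 1.04},
}


def total_gb(cores):
    """Total memory across all cores, from the rounded per-core figure."""
    return cores * runs[cores]["gb_per_core"]


def slowdown(cores, baseline=1):
    """Render time relative to the baseline run, or None if not measured."""
    t = runs[cores]["seconds"]
    t0 = runs[baseline]["seconds"]
    if t is None or t0 is None:
        return None
    return t / t0


def memory_ratio(cores, baseline=1):
    """Total memory relative to the baseline run."""
    return total_gb(cores) / total_gb(baseline)


if __name__ == "__main__":
    for cores in sorted(runs):
        s = slowdown(cores)
        s_text = "n/a" if s is None else f"{s:.2f}x"
        print(f"{cores} cores: slowdown {s_text}, "
              f"total {total_gb(cores):.2f} GB, "
              f"memory ratio {memory_ratio(cores):.2f}x")
```

With these inputs the 8-core run works out to a little under 2x the render time and a little over 5x the total memory of the 1-core run.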
Please don't work on this until it has been discussed at a bi-weekly meeting, or you talk to Utkarsh or me.
I have a few more observations and will include them in an additional post.