SDFITS Memory Leak Report
I tried to identify the library or application responsible for the memory leak occurring in sdfits.
This wiki page describes many, but not all, of the steps in the investigation of the sdfits memory leak problem.
This documentation serves to:
- Document how I reached my conclusion, which I never actually proved.
- Document the steps I've taken in troubleshooting the problem, so that either I or someone else can build on the investigative process in the upcoming cycle.
- List the solutions that were attempted and their results.
- List various solution strategies with estimated man-hours and the risks associated with each strategy.
The memory leak appears to be a fitsio SWIG issue, but I can't honestly say that I can pinpoint the exact cause(s).
I've come to this conclusion by excluding other causes.
There are several options we can pursue to fix the issue. The most promising options are unfortunately also the most man-hour intensive.
I removed from sdfits all SWIG library function calls that write the binary data. I kept the code that reads the data from the astronomy fits files. Sdfits uses several classes in sparrow to perform its tasks (Project, Scan etc.), and all of that code was still running. This removed the substantial memory leak, thus exonerating sparrow.
- Exonerating the non-swigged cfitsio library
The spectrometer and sdfits use the same cfitsio library function to write the Spectrometer binary data. Of course, sdfits uses our swigged version and the spectrometer uses the C version. I tested the spectrometer's fits writing capabilities using the simulator. My setup created very large fits files in a very short time frame. No memory leak occurred in the Spectrometer, thus exonerating the non-swigged cfitsio library.
I added deletes of the returned objects from the fitsio library calls. This had no effect.
At this point I had hoped to pinpoint the exact SWIGged function causing the leak. Unfortunately, this turned out to be a wild goose chase.
You can really only see the memory leak when large arrays are written. Thus, for the spectrometer data, the memory leak rears its ugly head in the write col float function. But it is not necessarily the only function that has a memory leak in our swigged cfitsio. I wrote these same columns as doubles using a different library call, and the memory leak still occurred.
The leak did not occur when I used the string cfitsio function (but that was incredibly slow!).
Result: I verified that there is a memory leak in at least 2 of the swigged cfitsio calls. Big deal! What I really needed to do was replace or rewrite the swigged cfitsio functions used.
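A check of this kind can be sketched as a small harness that calls a write routine repeatedly and samples the process's peak memory. This is only an illustration, not the procedure actually used in the investigation: `fake_write` is a hypothetical stand-in for a swigged cfitsio column write, and `sample_peak_rss` is a name invented here. It relies on the standard `resource` module, so it assumes a Unix-like system.

```python
import resource


def peak_rss():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def sample_peak_rss(write_once, iterations, sample_every=10):
    """Call write_once() repeatedly, recording peak RSS every few calls."""
    samples = []
    for i in range(1, iterations + 1):
        write_once()
        if i % sample_every == 0:
            samples.append(peak_rss())
    return samples


def fake_write():
    # Hypothetical stand-in: a real test would call the swigged
    # column-write function here with a large array.
    buf = [0.0] * 10000
    return len(buf)


samples = sample_peak_rss(fake_write, iterations=50, sample_every=10)
growth = samples[-1] - samples[0]
print("peak RSS samples:", samples)
print("growth over run:", growth)
```

With a leaking call in place of `fake_write`, the samples would be expected to keep climbing for as long as the loop runs; a well-behaved call should level off after the first few iterations.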
- I tried to reload the cfitsio module in sdfits after every scan, hoping that a reload would cause the memory to be released.
No improvement
- There are many web messages and pages concerning swig memory leaks. I attempted to use some of the suggested fixes in our swig files. Most of these suggestions involved adding deletes to the swigged files.
Either the files did not compile or no improvement was seen.
- I downloaded pCFITSIO, which is a swigged version of fitsio for python. I compared "our" swig .i files with those of pCFITSIO. They are considerably different. Most interesting is how they handle the callocs and mallocs from within the files. I decided to build this version and modify SDFits to use it for all the fits writes performed within sdfits. The sparrow libraries would still use the original swigged cfitsio code.
Not so simple!
pCFITSIO uses a newer version of numarray than the current McPython/sparrow build. It requires a dll called libnumeric, found only in the newer versions of numarray. I downloaded the new version of numarray and installed it into my own sparrow directory, adding directories to my python path etc. I tried to make pCFITSIO use the new version of numarray while all other python programs use the McPython numarray. I could not get this to work because numarray is included as part of python, not as an external package, so my PYTHONPATH modifications did not affect the directory in which numarray was found.
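When debugging this sort of path problem, it helps to ask the interpreter directly which file a module name resolves to. The sketch below does that in a current Python using `importlib.util.find_spec`; it is not what was available in the McPython build of the time, and `module_origin` is a helper name invented here. The standard `json` package stands in for numarray.

```python
import importlib.util
import sys


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Directories are searched in this order; the first one that provides
# the module wins.
for entry in sys.path:
    print("search path entry:", entry)

# "json" is only a stand-in for the module under investigation.
print("json resolves to:", module_origin("json"))
print("missing module resolves to:", module_origin("no_such_module_xyz"))
```

If the reported location is not the newly installed copy, the path changes are not taking effect for that module.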
As part of the python/sparrow build project, a new McPython, called llama, was built with new versions of all libraries.
I've built the new swigged cfitsio module as myfitsio. I have made several attempts at using the llama python build with my new swigged cfitsio. I'm still working through module and path issues. I think this is promising but do not have the time to pursue it this cycle.
1. There is a new c/fortran fits library that we could download and try with our existing swig setup.
Estimated time: 1 day
Risk: I looked at the documentation for the fits modifications for the new package and did not see any changes that appeared to be relevant. It probably won't fix the problem.
2. Work on getting pCFITSIO to work and modify sdfits to use it for writing fits. It might be umbrella'd under the McPython/sparrow project.
I've started this work already.
Estimate: 1 week
Risk: This version may have memory leaks as well
3. We know the memory leak is not in the C fits code. We could modify the sdfits python program to either:
- a. Run the sdfits program from within a C++ program. Sdfits would still use the sparrow libraries to get the fits data from the observing fits file, but all writes would be done directly through C++. Paul has embedded python code in C++ before, in the AntennaCharacterization and Inclinometer. He thinks this will require considerable effort.
Estimate 2 weeks
Risk: Kludgy; it's a workaround, not a fix
- b. Write a C++ fits daemon. The sdfits program could get the data through the existing methods, namely the sparrow libraries, but instead of writing the data, sdfits would send the data to be written to the C++ fits daemon via sockets or some other mechanism.
Estimate 2 weeks but probably not as time intensive as (a)
Risk: Kludgy; it's a workaround, not a fix (though not as kludgy as (a))
Network clogging potential
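As a rough idea of what the sdfits side of option 3b might involve, the sketch below frames an array of floats with a length prefix and sends it over a socket. Everything here is an assumption made for illustration: the wire format (a 4-byte count followed by 32-bit floats in network byte order) and the function names are invented, and the receiving end, which in the real design would be the C++ daemon, is played by Python over a local `socket.socketpair()`.

```python
import socket
import struct

# Assumed header: a 4-byte big-endian count of the floats that follow.
HEADER = struct.Struct("!I")


def send_floats(sock, values):
    """Send one length-prefixed message of 32-bit floats."""
    payload = struct.pack("!%df" % len(values), *values)
    sock.sendall(HEADER.pack(len(values)) + payload)


def recv_exact(sock, nbytes):
    """Read exactly nbytes, since recv() may return fewer than asked for."""
    chunks = []
    remaining = nbytes
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_floats(sock):
    """Receive one message written by send_floats()."""
    (count,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    data = recv_exact(sock, 4 * count)
    return list(struct.unpack("!%df" % count, data))


# Local demonstration: one end plays sdfits, the other plays the daemon.
writer_end, daemon_end = socket.socketpair()
send_floats(writer_end, [1.0, 2.5, -3.25])
received = recv_floats(daemon_end)
writer_end.close()
daemon_end.close()
print(received)
```

A real design would also need to carry the column and row being written, and the large spectrometer arrays are where the network clogging risk noted above would show up.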
4. Compare the pCFITSIO swig .i files with ours and modify ours as appropriate to try to get rid of the memory leak in our swigged cfitsio.
I briefly made an attempt at this but had build issues.
Estimate: 1 week
Risk: This version may have memory leaks as well
5. Become more conversant in swig and fix the damn problem in our swigged version!
Estimate: 3 weeks
Risk: Failure!
6. Use pyfits, which is a python fits writing module. It is currently used by turtle to write the GO fits file. Mark says Eric thinks pyfits is slow.
Estimate: 1.5 weeks
Risk: Possible big performance hit in sdfits!
-- MelindaMello - 16 Mar 2005
Revision r1.4 - 19 Mar 2005 - 19:37 GMT - MelindaMello Parents: PlanOfRecordC22005