Version 7 (modified by Torben Dannhauer, 14 years ago)

Short Introduction to Debugging and Hunting Memory Leaks on Windows and Linux

This introduction addresses beginners but assumes they are familiar with at least basic debugging techniques (breaking and continuing, watching values, etc.). If you are looking for advanced tips, please consult the MSDN and Valgrind documentation.

Windows

The Windows debugging HowTo is based on Microsoft Visual Studio 2008 Professional. Using another MSVC version should be possible without modification (no guarantee).

In debug builds, MSVC reports memory leaks automatically. Unfortunately, by default the message contains only the number of lost bytes, not filenames or line numbers. Additionally, MSVC stops the memory accounting too early: at that point the static elements of OSG have not yet been freed, which results in a lot of false memory leak reports. This behaviour can only be corrected with crude workarounds such as linking to MFC.

At least MSVC can be configured to report all memory leaks that originate in the source files, with filenames and line numbers, to an output file. This makes it possible to identify some of the "true" memory leaks and to fix them on the Windows platform already.

The best practice on Windows is to use MSVC to eliminate all obvious memory leaks and then to continue with Valgrind on Linux for the final checks.

What do we need to hunt memory leaks? A report of every leak with the following information:

  • Filename where the leak was caused.
  • Function name where the leak was caused.
  • Line number where the leak was caused.

To achieve this, we have to perform two steps:

  • Modify main.cpp to start the memory leak detection.
  • Add some definitions to each source file to replace the new operator with a debug version which outputs the required information.

main.cpp

To enable the debug memory allocation, the following lines of code are required before the main function:

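#ifdef _DEBUG
        #ifdef WIN32
                // Define this before the CRT headers are included, e.g. in a common header.
                #define _CRTDBG_MAP_ALLOC
                #include <stdlib.h>
                #include <crtdbg.h>
                #include <windows.h>    // CreateFile, HANDLE (used in main() below)
        #endif 
#endif 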

Inside the main() function we have to specify in detail how to track the memory. Because the MSVC debug window is quite slow for large outputs, we write the output to the file mem_log.txt in the working directory of the application:

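#ifdef _DEBUG
        #ifdef WIN32
                #include <leakDetection.h>      // Must be included inside main(). For classes, including it inside the class declaration is sufficient.

                // Create the log file. CreateFileA accepts the narrow string in Unicode builds as well.
                HANDLE log_file = CreateFileA("mem_log.txt", GENERIC_WRITE, FILE_SHARE_WRITE,
                        NULL, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);

                // Asserts and errors go to the file, a message window and the debugger output.
                // Warnings (this includes the leak dump) go to the file and the debugger output only.
                _CrtSetReportMode(_CRT_ASSERT, _CRTDBG_MODE_FILE | _CRTDBG_MODE_WNDW |
                        _CRTDBG_MODE_DEBUG);
                _CrtSetReportMode(_CRT_WARN, _CRTDBG_MODE_FILE | _CRTDBG_MODE_DEBUG);
                _CrtSetReportMode(_CRT_ERROR, _CRTDBG_MODE_FILE | _CRTDBG_MODE_WNDW |
                        _CRTDBG_MODE_DEBUG);

                // Route all three report types to the log file.
                _CrtSetReportFile(_CRT_ASSERT, log_file);
                _CrtSetReportFile(_CRT_WARN, log_file);
                _CrtSetReportFile(_CRT_ERROR, log_file);

                // Read the current flags, then enable allocation tracking, delayed freeing
                // (freed blocks stay in the heap list) and the automatic leak dump at program exit.
                int tmp_flag = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);
                tmp_flag |= _CRTDBG_ALLOC_MEM_DF;
                tmp_flag |= _CRTDBG_DELAY_FREE_MEM_DF;
                tmp_flag |= _CRTDBG_LEAK_CHECK_DF;

                _CrtSetDbgFlag(tmp_flag);
        #endif 
#endif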

To enable debug output with line numbers, we have to redefine the new operator. Because preprocessor #define directives work on a per-file basis, we have to repeat the redefinition in each header file (inside the class). The header file leakDetection.h contains the following code:

#ifdef _DEBUG
        #ifdef WIN32
                #ifndef DBG_NEW
                        #define DBG_NEW new ( _NORMAL_BLOCK , __FILE__ , __LINE__ )
                        #define new DBG_NEW
                #endif
        #endif
#endif

This include file has to be included in the header file of each class, because the redefinition of new is file-based. The following lines show how a class header is prepared with the header file above:

class visual_core : public osg::Referenced
{
        #include <leakDetection.h>

public:
        visual_core(osg::ArgumentParser& arguments_);
....
};
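One caveat of the new redefinition: while the macro is active, placement new no longer compiles, because new (storage) T expands to two argument lists. A minimal sketch of one possible workaround is shown below. It temporarily disables the macro with #pragma push_macro / pop_macro, which MSVC, GCC and Clang support. The names Sample and constructInPlace are hypothetical and only serve the illustration.

```cpp
#include <new>      // placement new

// Assumption: leakDetection.h (or an equivalent "#define new DBG_NEW")
// may already be active at this point. Save the macro and switch it off.
#pragma push_macro("new")
#undef new

// Hypothetical example type, not part of osgVisual.
struct Sample
{
    explicit Sample(int v) : value(v) {}
    int value;
};

// Hypothetical helper: constructs a Sample in caller-provided storage.
// This line would not compile while the debug "new" macro is active.
inline Sample* constructInPlace(void* storage, int value)
{
    return new (storage) Sample(value);
}

// Restore the previous definition of "new" (or leave it undefined
// if it was not defined before).
#pragma pop_macro("new")
```

Code after the pop_macro line is tracked with filename and line number again, provided the macro was defined before.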

The output in the logfile looks like:

Detected memory leaks!
Dumping objects ->
{4436425} normal block at 0x3135E588, 24 bytes long.
 Data: <    p 51     *  > 80 80 10 01 70 E2 35 31 80 80 10 01 10 2A 00 00 
{4436424} normal block at 0x3135E530, 24 bytes long.
 Data: <  51  51  51l   > 10 E1 35 31 B8 E0 35 31 D8 E4 35 31 6C F2 02 00 
{4436423} normal block at 0x3135E4D8, 24 bytes long.
 Data: <    0 51     E  > 80 80 10 01 30 E5 35 31 80 80 10 01 B4 45 04 00 
{4436422} normal block at 0x3135E480, 24 bytes long.

...

{142} normal block at 0x011059A8, 192 bytes long.
 Data: < ?4^            > D4 3F 34 5E 00 00 00 00 01 00 00 00 00 00 00 00 
Object dump complete.

The number in braces in front of every entry is the allocation number. It can be used to identify the allocation and to set a breakpoint on exactly this operation.
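As a sketch of how such a breakpoint could be set from code: the debug CRT offers _CrtSetBreakAlloc(), which breaks into the debugger when the allocation with the given number is made. The wrapper breakOnAllocation below is a hypothetical helper, and the guard macros are an assumption so that the code also compiles without the debug CRT.

```cpp
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#endif

// Hypothetical helper: arms a debugger break for the allocation with the
// given number (the value in braces in the leak report).
// Returns true if the break was armed, false if no debug CRT is available.
inline bool breakOnAllocation(long allocationNumber)
{
#if defined(_MSC_VER) && defined(_DEBUG)
    _CrtSetBreakAlloc(allocationNumber);
    return true;
#else
    (void)allocationNumber;     // unused without the debug CRT
    return false;
#endif
}
```

Calling breakOnAllocation(3668) at the very beginning of main() would stop at the allocation numbered 3668. This only helps if the program allocates in the same order on every run, otherwise the numbers shift between runs.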

If a true memory leak inside the code is found, it can be identified by the source file and line number, as in the second entry of this excerpt:

{3756} normal block at 0x012F4AE8, 32 bytes long.
 Data: <osgPlugins-2.9.9> 6F 73 67 50 6C 75 67 69 6E 73 2D 32 2E 39 2E 39 
.\src\core\visual_core.cpp(50) : {3668} normal block at 0x012EFBE8, 8000 bytes long.
 Data: <                > CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD 

Memory leaks are only reported with line numbers if you redefine the new operator via the include file.
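To narrow the report down to a single code section and keep the false reports from OSG statics out of it, the debug CRT can also compare two heap snapshots. This is a different technique from the exit-time dump described above. The following is only a sketch: the class ScopedLeakCheck is hypothetical, while _CrtMemCheckpoint, _CrtMemDifference and _CrtMemDumpAllObjectsSince are debug CRT functions. The guard macros are an assumption so that the code also compiles without the debug CRT.

```cpp
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#endif

// Hypothetical helper: takes a heap snapshot on construction and can later
// tell whether blocks allocated since then are still alive.
class ScopedLeakCheck
{
public:
    ScopedLeakCheck()
    {
#if defined(_MSC_VER) && defined(_DEBUG)
        _CrtMemCheckpoint(&_before);
#endif
    }

    // Returns true if the heap differs significantly from the snapshot and
    // dumps the surviving blocks to the configured _CRT_WARN target.
    // Always returns false when the debug CRT is not available.
    bool leaked() const
    {
#if defined(_MSC_VER) && defined(_DEBUG)
        _CrtMemState after, diff;
        _CrtMemCheckpoint(&after);
        if (_CrtMemDifference(&diff, &_before, &after))
        {
            _CrtMemDumpAllObjectsSince(&_before);
            return true;
        }
#endif
        return false;
    }

private:
#if defined(_MSC_VER) && defined(_DEBUG)
    _CrtMemState _before;
#endif
};
```

A possible use is to create a ScopedLeakCheck before a suspicious section and to call leaked() after all objects of that section have been released. Memory that is intentionally kept alive, for example in caches, shows up as a difference as well.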

Linux

To debug osgVisual on Linux, make sure you have compiled and installed OSG with debug symbols (this does not interfere with the release build). Build osgVisual as Debug (osgVisuald) and start it with Valgrind:

valgrind --error-limit=no --leak-check=full ./osgVisuald <yourParameters> /path/to/database.ive

Valgrind reports errors in osgVisual as well as in the system libraries involved, such as libGL or X11.

For memory leaks, the "HEAP SUMMARY" section is the most interesting one. Here Valgrind lists all possibly and definitely lost memory. For each incident it prints a call stack (12 frames in the example below), which allows you to follow the program path down to the function that allocated the lost memory. For all libraries built with debug symbols it also shows the line number, so it is quite easy to identify the relevant code. The following example shows a memory leak on the heap:

==4281== HEAP SUMMARY:
==4281==    in use at exit: 32,183 bytes in 413 blocks
==4281==   total heap usage: 3,086,029 allocs, 3,085,616 frees, 1,582,088,128 bytes allocated
==4281==

...

==4281== 128 (24 direct, 104 indirect) bytes in 1 blocks are definitely lost in loss record 52 of 77
==4281==    at 0x4C2596C: operator new(unsigned long) (vg_replace_malloc.c:220)
==4281==    by 0x6F85405: __gnu_cxx::new_allocator<std::_List_node<osg::ref_ptr<osg::Texture::TextureObject> > >::allocate(unsigned long, void const*) (new_allocator.h:89)
==4281==    by 0x6F8411D: std::_List_base<osg::ref_ptr<osg::Texture::TextureObject>, std::allocator<osg::ref_ptr<osg::Texture::TextureObject> > >::_M_get_node() (stl_list.h:316)
==4281==    by 0x6F834BE: std::list<osg::ref_ptr<osg::Texture::TextureObject>, std::allocator<osg::ref_ptr<osg::Texture::TextureObject> > >::_M_create_node(osg::ref_ptr<osg::Texture::TextureObject> const&) (stl_list.h:461)
==4281==    by 0x6F828AE: std::list<osg::ref_ptr<osg::Texture::TextureObject>, std::allocator<osg::ref_ptr<osg::Texture::TextureObject> > >::_M_insert(std::_List_iterator<osg::ref_ptr<osg::Texture::TextureObject> >, osg::ref_ptr<osg::Texture::TextureObject> const&) (stl_list.h:1407)
==4281==    by 0x6F81831: std::list<osg::ref_ptr<osg::Texture::TextureObject>, std::allocator<osg::ref_ptr<osg::Texture::TextureObject> > >::push_back(osg::ref_ptr<osg::Texture::TextureObject> const&) (stl_list.h:920)
==4281==    by 0x6F7A638: osg::Texture::TextureObjectSet::orphan(osg::Texture::TextureObject*) (Texture.cpp:641)
==4281==    by 0x6F7B14F: osg::Texture::TextureObjectManager::releaseTextureObject(osg::Texture::TextureObject*) (Texture.cpp:839)
==4281==    by 0x6F7B89F: osg::Texture::releaseTextureObject(unsigned int, osg::Texture::TextureObject*) (Texture.cpp:930)
==4281==    by 0x6F7D104: osg::Texture::dirtyTextureObject() (Texture.cpp:1103)
==4281==    by 0x6F7C2CA: osg::Texture::~Texture() (Texture.cpp:991)
==4281==    by 0x6F719C4: osg::Texture2D::~Texture2D() (Texture2D.cpp:50)

...

==4281== LEAK SUMMARY:
==4281==    definitely lost: 432 bytes in 6 blocks
==4281==    indirectly lost: 208 bytes in 2 blocks
==4281==      possibly lost: 0 bytes in 0 blocks
==4281==    still reachable: 31,223 bytes in 404 blocks
==4281==         suppressed: 0 bytes in 0 blocks
==4281== Reachable blocks (those to which a pointer was found) are not shown.
==4281== To see them, rerun with: --leak-check=full --show-reachable=yes
==4281==
==4281== For counts of detected and suppressed errors, rerun with: -v
==4281== ERROR SUMMARY: 692992 errors from 2368 contexts (suppressed: 7 from 7)
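The "direct" and "indirect" figures of a loss record can be reproduced with a small test program. The following is a minimal sketch with hypothetical names (Node, makeChain, freeChain), not code from OSG or osgVisual. It builds a linked chain on the heap in which only the head is referenced from outside.

```cpp
#include <cstddef>

// Hypothetical singly linked node, used only for this illustration.
struct Node
{
    Node* next;
    int   payload;
};

// Builds a chain of `count` heap-allocated nodes and returns its head.
// The last node created becomes the head of the chain.
inline Node* makeChain(std::size_t count)
{
    Node* head = nullptr;
    for (std::size_t i = 0; i < count; ++i)
    {
        Node* node = new Node;
        node->next = head;
        node->payload = static_cast<int>(i);
        head = node;
    }
    return head;
}

// Frees every node of the chain and returns how many nodes were released.
inline std::size_t freeChain(Node* head)
{
    std::size_t released = 0;
    while (head != nullptr)
    {
        Node* next = head->next;
        delete head;
        head = next;
        ++released;
    }
    return released;
}
```

If a caller discards the pointer returned by makeChain(3) without calling freeChain(), Valgrind should report the head node as definitely lost (the direct bytes) and the two nodes reachable only through it as indirectly lost. Fixing the direct leak normally removes the indirect one as well.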

Tips & Troubleshooting

  • Because debugging large programs is very slow, debug osgVisual only with small databases. Sometimes, it is even sufficient to debug osgVisual without any terrain database loaded.