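It has been almost a year since I graduated. I have worked on two projects, both built on top of other people's code, and it has been painful: there are simply too many bugs. I spent more than half of this year fixing other people's bugs, and I picked up some experience along the way that I would like to share. Most of my methods come from the book Debugging Windows Programs (Chinese edition: 《Windows程序调试》). The book covers many techniques; I only pick the ones that helped me the most: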
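1. Debugging memory corruption.
This kind of bug shows up as crashes at unpredictable times and in unpredictable places. I have run into it twice, and each time it took a long while to track down, especially the second time, which cost the team three days. The cause was a corrupted heap.
My approach is to add a heap check wherever the problem might be, using the following code: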
#include <crtdbg.h>

// Get the current state of the debug-heap flags
// and store it in a temporary variable
int tmpFlag = _CrtSetDbgFlag( _CRTDBG_REPORT_FLAG );
// Turn on (OR) _CRTDBG_CHECK_ALWAYS_DF so that _CrtCheckMemory
// is called at every allocation and deallocation request
tmpFlag |= _CRTDBG_CHECK_ALWAYS_DF;
// Set the new state for the flag
_CrtSetDbgFlag( tmpFlag );
// Force one allocation and one free so the heap is checked right here
int *nn = new int;
delete nn;
// Turn off (AND) the flag again: checking the heap at every
// allocation request costs a lot of time
tmpFlag &= ~_CRTDBG_CHECK_ALWAYS_DF;
// Set the new state for the flag
_CrtSetDbgFlag( tmpFlag );
If the heap was already corrupted before this point, the program (Debug build) will break at the memory allocation, which here is int *nn = new int;
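To narrow down where the corruption happens, it can help to wrap a heap check in a small helper and call it at several suspect points. The following is only a minimal sketch, not code from the book: the name HeapLooksIntact is hypothetical, and it assumes the MSVC debug CRT, where _CrtCheckMemory validates the heap on demand. On other compilers, or in Release builds, it compiles to a stub that always reports success.

```cpp
#include <cstdio>

#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#endif

// Hypothetical helper (not from the original post): returns true when the
// debug heap passes validation at this point in the program.
inline bool HeapLooksIntact(const char* where)
{
#if defined(_MSC_VER) && defined(_DEBUG)
    // _CrtCheckMemory walks the debug heap and returns nonzero if it is consistent.
    const bool ok = (_CrtCheckMemory() != 0);
#else
    // Assumption: without the MSVC debug CRT there is nothing to check.
    const bool ok = true;
#endif
    if (!ok)
        std::fprintf(stderr, "heap corruption detected before: %s\n", where);
    return ok;
}
```

Calling it before and after a suspect function, with a different label each time, brackets the code that damages the heap without paying for a check on every allocation.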
The code that corrupted the heap the first time looked like this:
typedef struct aa
{
    int a;
} AA;
AA s[n];
int i = 0;
for (i = 0; i < n; i++)              // outer loop uses i
{
    …
    …
    for (i = 0; i < n; i++) {…}      // BUG: inner loop reuses i
    …
    …
    s[i].a = 0;
}
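Both the inner and the outer loop use i as the loop variable. When the inner loop finishes, i equals n, so s[i].a = 0 writes one element past the end of the array, and that is how the heap got corrupted.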
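One way to rule out this class of bug is to give each loop its own variable, declared in the for statement itself. The sketch below is illustrative only: the function name ResetAll and the use of std::vector are my assumptions, and the elided work from the original code is left out.

```cpp
#include <cstddef>
#include <vector>

struct AA
{
    int a;
};

// Sketch of the corrected loop structure (hypothetical function name).
inline void ResetAll(std::vector<AA>& s)
{
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        // The inner loop has its own variable j, so it cannot disturb i.
        for (std::size_t j = 0; j < s.size(); ++j)
        {
            // (inner work elided in the original)
        }
        s[i].a = 0;  // i is still within [0, s.size())
    }
}
```

Because i and j are scoped to their own loops, the compiler rejects any accidental use of j after the inner loop ends.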
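The second case was more subtle. I first used the map file to find the crash location, but it turned out to be a Windows API, so I dismissed it. Later I fell back on the method above, and it pointed at the same API: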
GetFileAttributesEx(szFile, GetFileExInfoStandard, &attributes);
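It turned out that the szFile argument was the problem: it was not a valid file name. After I added a validity check on the file name before calling this API, the crashes went away. So it really was this API. It seems we cannot expect Microsoft to do everything for us.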
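The post does not show the check that was actually added, so the following is only a sketch of what such a pre-check might look like. The name IsPlausibleFileName is hypothetical, and the rules it applies are assumptions on my part: a narrow-character path, the classic MAX_PATH limit of 260, and the characters that Windows reserves in file names.

```cpp
#include <cstddef>
#include <string>

// Hypothetical pre-check to run before GetFileAttributesEx
// (not the check used in the original project, which is not shown).
inline bool IsPlausibleFileName(const std::string& path)
{
    const std::size_t kMaxPath = 260;  // value of MAX_PATH in the Windows headers
    if (path.empty() || path.size() >= kMaxPath)
        return false;
    for (unsigned char c : path)
    {
        if (c < 32)  // control characters are not allowed in file names
            return false;
    }
    // '\\', '/' and ':' are accepted because they are legitimate in a full path.
    return path.find_first_of("<>\"|?*") == std::string::npos;
}
```

The caller would skip the GetFileAttributesEx call, and report an error instead, whenever this function returns false.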
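2. Finding memory leaks
You can track down a memory leak through its memory allocation number. The method is as follows (these are notes I wrote earlier, combined with material from MSDN):
If the app has memory leaks, you will see output like the following in the VC Output window: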
Detected memory leaks!
Dumping objects ->
{18} normal block at 0x00780E80, 64 bytes long.
Data: < > CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD
Object dump complete.
If you cannot see the dump information in VC, you can add the following code to the file crtdbg.h:
#ifdef _DEBUG
#define _CRTDBG_MAP_ALLOC
#define _INC_MALLOC
#include <stdlib.h>
// custom functions declaration (ATL + BETA version problems)
extern "C"
{
void * __cdecl _alloca(size_t);
#define alloca _alloca
}
#endif
As you can see, _CrtDumpMemoryLeaks gives you much more useful information when _CRTDBG_MAP_ALLOC is defined. Without _CRTDBG_MAP_ALLOC defined, the display shows you:
- the memory allocation number (inside the curly braces).
- the type of block (normal, client, or CRT).
- the memory location in hexadecimal form.
- the size of the block in bytes.
- the contents of the first 16 bytes (also in hexadecimal).
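For a program that does not already produce this report, the debug CRT can be asked to dump leaks automatically at exit. The sketch below is an assumption-laden illustration rather than part of the original notes: EnableLeakReportAtExit and kCrtDebugAvailable are hypothetical names, the code only does real work under MSVC in a Debug build, and _CRTDBG_MAP_ALLOC is defined in the program's own source file before the CRT headers.

```cpp
#if defined(_MSC_VER) && defined(_DEBUG)
#define _CRTDBG_MAP_ALLOC   // must be defined before the CRT headers are included
#include <stdlib.h>
#include <crtdbg.h>
const bool kCrtDebugAvailable = true;
#else
const bool kCrtDebugAvailable = false;  // stub for other compilers / Release builds
#endif

// Hypothetical helper: request an automatic leak dump when the program exits.
// Returns whether the debug CRT is available in this build.
inline bool EnableLeakReportAtExit()
{
#if defined(_MSC_VER) && defined(_DEBUG)
    int flags = _CrtSetDbgFlag(_CRTDBG_REPORT_FLAG);        // read current flags
    flags |= _CRTDBG_ALLOC_MEM_DF | _CRTDBG_LEAK_CHECK_DF;  // track allocations, dump at exit
    _CrtSetDbgFlag(flags);
#endif
    return kCrtDebugAvailable;
}
```

Calling it once at the start of the program is enough; the report then appears in the Output window when the process ends.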
If you run your program twice in the same way, the memory allocation number of the leaked block is the same both times. You can therefore use the allocation number to find the leak: put simply, you can make the program break at a given memory allocation number.
The following is taken from MSDN (Detecting and Isolating Memory Leaks Using Microsoft Visual C++):
Setting a Breakpoint on a Memory Allocation Number
The file name and line number in the memory leak report tell you where leaked memory is allocated, but knowing where the memory is allocated is not always sufficient to identify the problem. Often an allocation will be called many times during a run of the program, but it may leak memory only on certain calls. To identify the problem, you must know not only where the leaked memory is allocated but also the conditions under which the leak occurs. The piece of information that makes it possible for you to do this is the memory allocation number. This is the number that appears in curly braces, after the file name and line number when those are displayed. For example, in the following output, “18” is the memory allocation number. It means the leaked memory is the 18th block of memory allocated in your program.
Detected memory leaks!
Dumping objects ->
C:\PROGRAM FILES\VISUAL STUDIO\MyProjects\leaktest\leaktest.cpp(20) : {18}
normal block at 0x00780E80, 64 bytes long.
Data: < > CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD CD
Object dump complete.
The CRT library counts all memory blocks allocated during a run of the program, including memory allocated by the CRT library itself or by other libraries, such as MFC. Therefore, an object with allocation number n will be the nth object allocated in your program but may not be the nth object allocated by your code. (In most cases, it will not be.)
You can use the allocation number to set a breakpoint at the location where memory is allocated. To do this, set a location breakpoint near the start of your program. When your program breaks at that point, you can set such a memory allocation breakpoint from the QuickWatch dialog box or the Watch window. In the Watch window, for example, type the following expression in the Name column:
_crtBreakAlloc
If you are using the multithreaded dynamic-link library (DLL) version of the CRT library (the /MD option), you must include the context operator, as shown here:
{,,msvcrtd.dll}_crtBreakAlloc
Now, press RETURN. The debugger evaluates the call and places the result in the Value column. This value will be –1 if you have not set any breakpoints on memory allocations. Replace the value in the Value column with the allocation number of the memory allocation where you want to break—for example, 18 to break at the allocation shown in the output earlier.
After you set breakpoints on the memory allocations in which you are interested, you can continue debugging. Be careful to run the program under the same conditions as the previous run so the allocation order does not change. When your program breaks at the specified memory allocation, you can look at the Call Stack window and other debugger information to determine the conditions under which the memory was allocated. If necessary, you can continue execution of the program from that point to see what happens to the object and perhaps determine why it is not properly deallocated. (Setting a Data breakpoint on the object may be helpful.)
Although it is usually easier to set memory allocation breakpoints in the debugger, you can set them in your code, if you prefer. To set a memory allocation breakpoint in your code, add a line like this (for the 18th memory allocation):
_crtBreakAlloc = 18;
As an alternative, you can use the _CrtSetBreakAlloc function, which has the same effect:
_CrtSetBreakAlloc(18);
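To keep such a call from leaking into Release builds or other compilers, it can be wrapped in a guarded helper. This is a minimal sketch under my own assumptions: ArmAllocationBreakpoint is a hypothetical name, and the stub branch simply returns -1, the value the text above gives for "no allocation breakpoint set".

```cpp
#if defined(_MSC_VER) && defined(_DEBUG)
#include <crtdbg.h>
#endif

// Hypothetical wrapper: break when the allocation with the given number is made.
// Returns the previously configured allocation number (-1 means none was set).
inline long ArmAllocationBreakpoint(long allocationNumber)
{
#if defined(_MSC_VER) && defined(_DEBUG)
    // _CrtSetBreakAlloc returns the previous break-allocation number.
    return _CrtSetBreakAlloc(allocationNumber);
#else
    // Assumption: outside the MSVC debug CRT this is a no-op.
    (void)allocationNumber;
    return -1L;
#endif
}
```

The call has to run before the allocation in question happens, so it belongs as early in the program as possible, for example ArmAllocationBreakpoint(18) at the top of main.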
For your app, you must rename the file msvcrtd.dll so that the app uses the DLL file in the System32 folder. Then add {,,msvcrtd.dll}_crtBreakAlloc to the Watch window, start the program with Step Into, and set the allocation breakpoint.
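3. Cross-process debugging.
Suppose there are two programs, APP1 and APP2, where APP1 launches APP2, and you want to debug APP2. You can do it like this: at a suitable place in APP2's InitInstance(), add the line ASSERT(FALSE); to make APP2 stop. Then attach to it from VC, and you can open APP2's cpp files and set breakpoints there.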
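An alternative to ASSERT(FALSE), which is not what the original describes, is to have APP2 pause until a debugger attaches. The sketch below assumes the Win32 functions IsDebuggerPresent and Sleep; the helper name WaitForDebuggerAttach and the 100 ms polling interval are hypothetical. On non-Windows platforms it compiles to a stub that returns false.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: poll until a debugger attaches or the timeout expires.
// Returns true if a debugger is attached when the function returns.
inline bool WaitForDebuggerAttach(unsigned long timeoutMs)
{
#ifdef _WIN32
    const unsigned long kPollMs = 100;  // assumed polling interval
    unsigned long waited = 0;
    while (!IsDebuggerPresent())
    {
        if (waited >= timeoutMs)
            return false;               // nobody attached in time; carry on normally
        Sleep(kPollMs);
        waited += kPollMs;
    }
    return true;
#else
    (void)timeoutMs;                    // stub: no debugger detection here
    return false;
#endif
}
```

Placed early in APP2's InitInstance(), for example as WaitForDebuggerAttach(30000), it gives you about thirty seconds to attach from VC, after which breakpoints set in APP2's source files are hit as usual.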
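Debugging Windows Programs covers many more techniques, such as advanced breakpoints, remote debugging, and map files. They are all very useful, but listing every one of them here would be too long-winded. For the underlying principles, I hope you will read the book and MSDN. Above all, please write high-quality code: one extra check may save someone else three days.
If anything here is wrong, please point it out.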