If programming special effects is the best part of game programming, debugging is surely the worst. It's the most time-consuming, aggravating, niggling, nit-picking, unforgiving, tedious aspect of game programming. It will slow your progress and cause you to miss deadlines, age prematurely, and lose your sense of equanimity, if not your breakfast. In short, it is a ghastly process that we all would avoid if we could.
Unfortunately, we can't avoid debugging games. Debugging is part of the development process. Bugs are a given--I have never known a program to be developed without introducing at least one bug. I am sure, at some point, even Hello World was once spelled Hello Wolrd.
We can't avoid bugs; all we can do is find ways to deal with them. Our goal, when debugging games, should be to minimize the debugging process. That is, we should strive to get it done (and get it done right) in the shortest amount of time possible. This could still be a considerable amount of time, so be warned! The gremlins that plague us are sneaky and persistent, and as we gird our loins and march into battle, we must prepare ourselves for a long siege.
Bugs can occur practically anyplace in the development process. Let's begin by looking at the process itself and examine the places where bugs are often introduced:
Ideally, you want your game program to be executed from mode 3 (text mode), set the video mode to mode 20 (Mode X graphics), execute for a while, then return to mode 3 and exit. In an ideal world, the mode you end up with is the mode you started with. But in an ideal world, we wouldn't have any bugs, would we?
If a program terminates abnormally, you could end up in mode 20, or you could end up in some other video mode. If your screen looks strange upon exit, you probably had an abnormal termination. You may even have a DOS error message on the screen (such as "Integer divide by 0") that you can't read. To read this message, turn on your printer, redirect DOS output to your printer (Ctrl+Print Screen ought to do it), and hopefully your error message will print on your printer and shed some light on why you had an abnormal termination.
To set your video environment back to mode 3, use the DOS MODE command, as in "MODE CO80". This will probably fix the problem, but if your abnormal termination was serious, something else in your system memory may have been clobbered, and you'll have to reboot.
An abnormal termination, as we just explored, is a good indication your code has been clobbered. This happens when the copy of the program that is in RAM is somehow overwritten or corrupted. It no longer works the way you expect it to, because it is no longer capable of working. Something has caused it to break.
When your code gets clobbered, your program may not be the only code affected. Other parts of memory may be damaged as well. Your TSRs may no longer work, you may lose some of your environment variables, and you can even (horrors!) overwrite your CMOS. When this happens, you may reboot and get the message "fixed disk C: not found". Don't panic! Your hard disk has most likely not been damaged; only the chip that keeps track of it has been overwritten. You can run your system's SETUP program to get your hard disk back.
This is one of the worst things that can happen when your data is out of control. Try to avoid this kind of bug if you can. Let's look at some ways code can be clobbered.
This most often happens when you write beyond the end of an array. An incrementing problem will do it. When the array has been filled, and data continues to be written, it has to go somewhere. Very often, data will be written to the code segment. Whole functions can be wiped out this way. A data overflow in one part of your program can destroy a function in a completely unrelated part of your program. This is what makes this kind of bug so hard to find: You have to look in places other than in the function in which the bug appears to happen.
Even one byte out of bounds can destroy a whole function, and it is nearly impossible to predict which function will be affected. This is a bug that is difficult to avoid and difficult to find!
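One defensive habit against this kind of overflow is to funnel array writes through a small helper that checks the index first. The following is only a sketch of that idea; the names MAXOBJECTS, object_x, and store_object_x() are hypothetical and do not come from the game source.

```c
#define MAXOBJECTS 8               /* hypothetical array size */

static int object_x[MAXOBJECTS];   /* hypothetical array of x positions */

/* Store a value only if the index lies inside the array.
   Returns 1 on success, 0 if the write would have gone out of bounds. */
int store_object_x(int index, int value)
{
    if (index < 0 || index >= MAXOBJECTS)
        return 0;                  /* refuse to write past either end */
    object_x[index] = value;
    return 1;
}
```

The check costs a little speed, so a helper like this may be better suited to debugging builds than to the innermost loops of a game.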
Null pointers happen when an item, such as an array or a structure, is declared but space is not allocated for it. For example, the following code will result in a null pointer error:
char *p;
p[0] = 0;
When a pointer is declared at file scope, it begins life holding the value 0, which is called NULL. (A pointer declared inside a function is not even that well behaved: until you assign to it, it holds whatever garbage happened to be in memory.) This thing, NULL, is supposed to be nothing. That is, by definition, a pointer pointing to NULL points to nothing. Except that it doesn't. It points to something. I don't know exactly what it points to, but whatever it is, you can write to it and mess up your program.
My Microsoft C compiler manual says a null pointer points to the NULL segment, which is a special area in low memory that is normally not used. Writing to this segment will trigger a null pointer assignment error message. I suspect this is compiler specific, though. I would not expect other compilers' null pointers to point to the same thing Microsoft's null pointers point to.
Whatever your null pointer points to, don't write to that area. Instead, allocate space somewhere else and point your pointer to it. Here are two better ways to write the above code:
char p[ARRAYSIZE];
p[0] = 0;
Note that the space for the array is allocated at compile time. In this example, space for the array is allocated at runtime:
char *p;
p = malloc(ARRAYSIZE);
p[0] = 0;
Either method is fine, as long as space is allocated for the data and the pointer is directed to point to that space. Until this happens, the pointer does not point to any memory you are allowed to use. When working with structures, null pointers are especially easy to incorporate into your code. In Chapter 12, we discussed declaring structures and allocating space for them. You may recall that this was a rather complicated process. If your code has been clobbered, examine your structures carefully. They are a very likely candidate for a null pointer assignment.
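For structures, one way to keep allocation and initialization from drifting apart is to do both in a single function and to check the result of malloc() before touching the memory. This is a minimal sketch under assumed names: the SPRITE structure and new_sprite() are hypothetical stand-ins, not the structures from Chapter 12.

```c
#include <stdlib.h>

typedef struct
{
    int x, y;
} SPRITE;                          /* hypothetical structure */

/* Allocate one SPRITE and fill it in.
   Returns NULL if no memory is available, so the caller can check. */
SPRITE *new_sprite(int x, int y)
{
    SPRITE *s = (SPRITE *)malloc(sizeof(SPRITE));
    if (s == NULL)
        return NULL;               /* never write through a null pointer */
    s->x = x;
    s->y = y;
    return s;
}
```

A caller would test the returned pointer against NULL before using it, and pass it to free() when the sprite is no longer needed.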
Dangling nodes occur when you are working with a linked list of structures. We saw an example of this in Chapter 14. The code shown here may result in a dangling node:
/* do the action functions for all the objects */
for (node=bottom_node; node!=(OBJp)NULL; node=node->next)
{
   node->action(node);
}
This code looks simple enough, but a problem occurs when the action function for the object deletes a node from the list. When the node is deleted, so is the pointer to the next node. The result is that node=node->next doesn't know what to point to.
How this bug shows up depends on the compiler, and the results are unpredictable. I found the code executed perfectly with the Microsoft C compiler, and hung tighter than a drum with the Borland compiler. Worse, I had no idea where to look for the source of the bug. It took me days to track this bug down and fix it. This is a bad bug! Avoid it if you can.
Dangling nodes also occur when you remove or reassign the pointer to a node before you free it. Improperly freed nodes will fill up RAM over a period of time. Your program will run fine for a while, then it will just crash. Don't you just hate when that happens?
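One common way to make such a loop safe is to save the pointer to the next node before calling the action function, so the loop never reads from a node that may already have been freed. The sketch below assumes a simplified, doubly linked OBJ structure; add_object(), kill_object(), idle_action(), and dying_action() are hypothetical helpers written for this illustration and are not the code from Chapter 14.

```c
#include <stdlib.h>

typedef struct objstruct
{
    struct objstruct *next;
    struct objstruct *prev;
    void (*action)(struct objstruct *);
} OBJ, *OBJp;                      /* simplified stand-in structure */

OBJp bottom_node = NULL;

/* add a new node at the bottom of the list */
OBJp add_object(void (*action)(OBJp))
{
    OBJp node = (OBJp)malloc(sizeof(OBJ));
    if (node == NULL)
        return NULL;
    node->action = action;
    node->prev = NULL;
    node->next = bottom_node;
    if (bottom_node != NULL)
        bottom_node->prev = node;
    bottom_node = node;
    return node;
}

/* unlink a node from the list first, then free it */
void kill_object(OBJp node)
{
    if (node->prev != NULL)
        node->prev->next = node->next;
    else
        bottom_node = node->next;
    if (node->next != NULL)
        node->next->prev = node->prev;
    free(node);
}

void idle_action(OBJp node)  { (void)node; }        /* does nothing */
void dying_action(OBJp node) { kill_object(node); } /* deletes itself */

/* do the action functions for all the objects */
void do_actions(void)
{
    OBJp node, next_node;
    for (node = bottom_node; node != (OBJp)NULL; node = next_node)
    {
        next_node = node->next;    /* save it before the node can be freed */
        node->action(node);
    }
}
```

This protects against an action function that deletes its own node. It does not help if an action function deletes the node that follows it, because the saved pointer would then be the dangling one.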
Mice introduce their own special problems into a program. Fortunately, we don't use a mouse in our game. Unfortunately, we do use a mouse in our game editor. Therefore, we will have to deal with mouse bugs.
Mouse droppings occur when fragments of the mouse cursor are left on the screen when the mouse moves. The most common cause of this is failing to turn the mouse cursor off when writing to the screen. The best solution is to always turn the mouse cursor off when writing anything to any part of the screen, or doing anything else to video memory, like a page flip. This is a rule. Everybody has to do this. I am not making this up. It is a big pain, and it causes your mouse to flicker, but you still have to do it, every single time.
There are exceptions. It is possible to constrain mouse motion to one part of the screen while writing to another part of the screen. It is also possible to fix the mouse in one position, and constrain screen writes to areas that do not touch the mouse cursor. These solutions are often more trouble than they are worth, but occasionally they may be useful.
You may also turn the mouse cursor off altogether, and use your own cursor, as we did in the level editor in Chapter 6.
In the interest of fast, optimized code, Fastgraph does very little error checking. That means functions are not automatically clipped at the edge of the screen. You may draw off the edge of the screen with unpredictable results (usually an image will wrap around and end up on a different part of the screen). Fastgraph also does not check arguments that are passed to functions. You can pass nonsensical values to Fastgraph functions, and the functions will execute to the best of their ability, sometimes with disastrous results. For example, I once called fg_paint() and passed (x,y) coordinates to it that were outside the closed polygon area I wanted to paint. The resulting flood fill filled to the edge of the screen, over the edge of the screen, and kept right on filling. It filled video memory, it filled RAM, and then it filled my CMOS! That was bad! By the way, since then Ted has written a version of fg_paint() called fg_flood() that checks for clipping limits (which, by default, are the screen extents). The fill function with clipping is a bit slower than the function without it. When speed is critical, error checking must be left out of the low-level code, and it is up to the programmer to check for errors at the high level. Here are some examples of common coordinate errors to check for.
Remember, screen coordinates start at 0 and go to a value one less than the width or height of the screen. So don't try to draw a rectangle by calling:
fg_rect(0,320,0,200);
That is a mistake! You have gone too far! What you really wanted was:
fg_rect(0,319,0,199);
Similarly, remember the width of a rectangle is x2-x1+1. Don't forget the +1! If you want to transfer 20 rows and 20 columns starting at x = 100, y = 100, the proper code is
fg_transfer(100,119,100,119,0,19,0,0);
not:
fg_transfer(100,120,100,120,0,20,0,0);
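If the +1 and -1 keep getting lost, the arithmetic can be written down once and reused. The two macros below are a sketch of that idea; LAST_COORD and SPAN are hypothetical names and are not part of Fastgraph.

```c
/* last coordinate of a run that starts at 'first' and covers 'count' pixels */
#define LAST_COORD(first, count)   ((first) + (count) - 1)

/* number of pixels covered by the inclusive range first..last */
#define SPAN(first, last)          ((last) - (first) + 1)
```

With these, 20 columns starting at x = 100 end at LAST_COORD(100,20), which is 119, and a 320-pixel-wide screen ends at LAST_COORD(0,320), which is 319.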
Parallel dimensions are something you find in the Twilight Zone--a coordinate system that appears logical, but is actually unrelated to the reality you are working in. You enter a parallel dimension when you try to address screen coordinates in a manner that does not describe them as they actually are.
My point is: Fastgraph addresses rectangles on the screen in an (X1,X2,Y1,Y2) sequence. Other graphics libraries may address the screen in an (X1,Y1,X2,Y2) sequence. The first method addresses the edges of the rectangle, the second method addresses the corners. Neither strategy is "right" or "wrong," they are just different. Problems arise when you try to mix them, the biggest problem being you forget which is which. It is quite easy to mangle your x and y coordinates, especially when you have developed the habit of working in one system and you must change to the other. Sometimes a simple conversion function or a macro will help.
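A conversion function of the kind mentioned above might look like the following sketch. The CORNERS and EDGES structures and the corners_to_edges() function are hypothetical names invented for this illustration. As an extra assumption, the function also sorts each pair of coordinates, so it does not matter which corner is given first.

```c
typedef struct { int x1, y1, x2, y2; } CORNERS;        /* corner order */
typedef struct { int minx, maxx, miny, maxy; } EDGES;  /* edge order   */

/* Convert a rectangle given as two corners into edge order. */
EDGES corners_to_edges(CORNERS c)
{
    EDGES e;
    e.minx = (c.x1 < c.x2) ? c.x1 : c.x2;
    e.maxx = (c.x1 < c.x2) ? c.x2 : c.x1;
    e.miny = (c.y1 < c.y2) ? c.y1 : c.y2;
    e.maxy = (c.y1 < c.y2) ? c.y2 : c.y1;
    return e;
}
```

With Fastgraph linked in, the result could then be passed along in the order Fastgraph expects, as in fg_rect(e.minx,e.maxx,e.miny,e.maxy).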
Have you ever exited a program that left your computer in such a mangled state, you had to reboot, but found you could not? That is, a soft reboot (pressing Ctrl+Alt+Delete) failed to work. You may be able to press a reset key on your computer to reboot, or you may have to turn your computer off and on again to get a reboot. (Warning! When powering down your computer, give the hard disk a little break. Let it spin down and come to a stop before restarting the system. The extra few seconds this takes may save wear and tear on your hard drive.) What causes this inability to reboot?
One likely cause is abnormal termination of your game. Remember, we have done some tricky things to the system. For example, we have installed a low-level keyboard handler. The purpose of this is to trap and process all keystrokes before the BIOS keyboard handler gets them. That includes Ctrl+Alt+Delete. The BIOS won't initiate the reboot, because it never sees the keystroke combination.
We have also re-vectored the timer interrupt to increase the clock tick rate. If the program terminates abnormally, this may not be restored to the default 18.2 clock ticks per second. Since disk controllers depend on the 18.2 tick per second rate, you may find you have trouble reading and writing files. The best thing to do after an abnormal program termination is power down.
We've discussed a number of bugs and their causes, but the question is, how do you find them and fix them? Understanding the nature of bugs does not solve this problem. You still have to develop a debugging process, and develop methods for tracking down and killing bugs.
This is not as easy as it sounds. There are few hard-and-fast rules when it comes to debugging. Experienced programmers develop a feel for the debugging process. They go through certain steps, but they may not even be sure what those steps are. As their debugging skills improve, they find themselves looking first for obvious potential problems, then for non-obvious potential problems, then for unique and unusual bugs they have never seen before. Less experienced programmers tend to focus on the obvious sources for bugs, and it takes them weeks to work their way through to the non-obvious ones. It is important to remember, the source of your bug may be completely unrelated to what you think is causing your bug! Don't waste too much time focusing on the same lines of code over and over. You may be missing the point altogether.
Here are some suggested steps in the debugging process.
If you can isolate a bug, you can fix it. Isolating a bug consists of identifying the function or functions the bug occurs in, then finding which lines of code in those functions are responsible for the bug. This may involve reducing the code to the smallest amount of code needed to reproduce the problem. If you have a program with 10,000 lines of code, you may have trouble isolating a bug. One solution is to write another program. That is, if you think your problem is in one function, try writing a small program with just that function and function main() that calls it. Does the same bug appear? If it does, then you were right, the bug is in that function. If not, your bug may very well be in some other part of the code you have not examined yet. I find the code reduction method is most helpful with very tricky bugs. It is time consuming, but it will give you consistent results.
The first step in isolating a bug is being able to reproduce it. That means you run the program to the same point, do the same thing, and the same bug occurs every time. This isn't always easy. Some bugs appear to be perfectly random, and efforts to reproduce them are futile. Keep trying, though. Usually there is a sequence of events that triggers the bug; you just haven't discovered it yet.
When I was debugging Tommy's Adventures, I had many bugs relating to redrawing tiles after scrolling. I knew I wanted to adjust the layout array and redraw the proper tiles that had been covered by sprites, but for some reason I had many problems. For example, I mistakenly adjusted the layout array for the visual page when I should have adjusted the layout array for the hidden page. In retrospect, that mistake seems obvious, but during development I found it quite a hard bug to find. The symptoms of the bug were kind of cute--when Tommy ran around, he would leave remnants of little red tennis shoes all over the screen. The long, long hours of tracking down the source of the bug were not so cute, though.
One way I isolated the problem was by dumping values to a file. I dumped all the values I could think of-- Tommy's x and y position, the coordinates of the tiles he covered, their tile attributes, the tile origins, the layout array, and so on. I finally discovered the values in the layout array were not the values I expected them to be, and I was able to isolate and fix the problem.
To dump values to a file, I open a file called DEBUG.DAT for writing in text mode. Since I use this file so often, I code it into my program. I use a preprocessor directive to define a term called debug, and I only open the file if the term has been defined, as follows:
#define debug
#ifdef debug
   /* text file used for debugging purposes */
   dstream = fopen("debug.dat","wt");
#endif
I leave this code in the program even when I am not debugging. It makes it convenient for me to open a file and dump values to it when I need to. If I don't want to open the debug file, I comment out the definition, like this:
/* #define debug */
Alternatively, you can use conditional compilation to turn debugging on and off. For example, with some compilers you can use a flag such as "/Ddebug" when compiling.
Sometimes I want to be able to dump values to a file at the press of a keystroke. I press the D key to debug. This code, if placed somewhere in the activate_level() function, will trap the keystroke, pause program execution, and dump values to the debug file:
#ifdef debug
   /* if needed, dump a bunch of debug information to a file */
   if (fg_kbtest(KB_D)) /* press 'd' for debug */
   {
      /* unload keyboard handler, slow down clock rate */
      fg_kbinit(0);
      set_rate(0);
      fprintf(dstream,"Tommy is at x = %d y = %d\n",player->x,player->y);
      tile_x = (player->x)/16;
      tile_y = (player->y+1)/16;
      fprintf(dstream,"tile_x = %d tile_y = %d \n",tile_x,tile_y);
      /* tile number */
      tile_num = (int)backtile[tile_x][tile_y];
      fprintf(dstream,"tile_num = %d \n",tile_num);
      /* tile attributes */
      fprintf(dstream,"tile attributes: ");
      for (i = 0; i < 8; i++)
         fprintf (dstream,"%d ", test_bit(background_attributes[tile_num],i));
      fprintf(dstream,"\n");
      /* bounding box info */
      fprintf(dstream,"bound_x = %d bound_width = %d\n",
         player->image->bound_x,player->image->bound_width);
      /* flush the output file */
      fflush(dstream);
      /* wait for a keystroke */
      fg_waitkey();
      /* restore keyboard handler and clock rate */
      fg_kbinit(1);
      set_rate(8);
   }
#endif
Notice that I set the clock rate to the normal 18.2 ticks per second before writing to the file. Also notice that I disable the low-level keyboard handler. This makes it easy to wait for a keystroke using Fastgraph's fg_waitkey() function. I print out information about the sprite, about the tiles, and about the return values of certain functions including can_move_left(), etc. After waiting for a keypress, the clock rate and low-level keyboard handler are restored to the state they need to be in for game play, and the game continues. I can then exit the game, and look at the contents of DEBUG.DAT, which will contain conveniently labeled and formatted debugging information.
Dumping values to a file is great if your bug is not the sort of bug that causes your system to hang. If your system hangs, you may discover your debug file is empty. That is because disk I/O is buffered. The output information is only written to a file when a certain amount of it has accumulated, perhaps 256 or 512 bytes. Unfortunately, your system may hang before that much data has accumulated.
The solution is to flush the output buffer before continuing. This is simple to do. A call to C's fflush() function will handle the job for you.
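To avoid forgetting the flush, the write and the fflush() call can be wrapped together in one function. The following is a sketch only: debug_open(), debug_printf(), and debug_close() are hypothetical helper names, and the file is opened with the portable mode "w" rather than the "wt" mode accepted by DOS compilers.

```c
#include <stdio.h>
#include <stdarg.h>

static FILE *dstream = NULL;       /* debug output file */

/* open the debug file; returns 1 on success, 0 on failure */
int debug_open(const char *filename)
{
    dstream = fopen(filename, "w");
    return dstream != NULL;
}

/* write formatted text to the debug file and flush it right away */
void debug_printf(const char *format, ...)
{
    va_list args;
    if (dstream == NULL)
        return;                    /* debug file is not open */
    va_start(args, format);
    vfprintf(dstream, format, args);
    va_end(args);
    fflush(dstream);               /* empty the C library's output buffer now */
}

/* close the debug file when the program is finished with it */
void debug_close(void)
{
    if (dstream != NULL)
    {
        fclose(dstream);
        dstream = NULL;
    }
}
```

A call such as debug_printf("tile_num = %d\n",tile_num) then hands each line to the operating system as soon as it is written. Flushing after every write is slow, so this is something to do while hunting a bug, not during normal game play.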
In an effort to make a programmer's life easier, many companies have developed tools to help you debug. Some of these are useful. Whether you choose to use them is a matter of personal preference. Some people swear by one tool or another, others forge ahead without them. Personally, I rarely use debugging tools, but I think they have value. Briefly, here are some suggestions.
An interactive debugger such as Borland's Turbo Debugger or Microsoft's CodeView will let you step through your code one line or one function at a time, set break points, examine the contents of memory locations or registers, and in general watch everything your code does as it executes. This sounds wonderful, and in fact it is. There are only a few problems. For one thing, they are time-consuming to master. By the time you install them and learn how to use them, you could have (arguably) already found your bug. The biggest problem, though, seems to be in swapping from a graphics video mode to a text video mode. In general, debuggers are not aware of Mode X, and even if they were, they tend to wipe out whatever was in video memory when they take over the screen to display your debugging information. That means your pages and tiles are gone! It is theoretically possible to swap video graphics to RAM (or extended or expanded memory) and then swap it back to video memory as needed, but I have never tried that, and to be perfectly honest, I don't want to. It sounds like a very difficult and time-consuming solution.
Debuggers are sometimes run in a dual-monitor mode, and I know programmers who swear by this method. It involves installing two video cards in your system, one a monochrome text card and the other a regular VGA card. The debugger information is displayed on the monochrome screen, and the program runs on the VGA screen. The reason I have never tried this is because I don't have enough room on my desk for another monitor. I think it sounds like a good idea, though, and it seems to work for many people. I have heard it tends to slow down your program. That is, the monochrome display will slow down what is happening on the VGA display. This is not necessarily a bad thing. Sometimes slowing down the action of a game while debugging allows you to see what is happening more clearly.
A source code checker such as Gimpel's PC-lint will check your code for a variety of problems and inconsistencies. You may be surprised at what you will find when you run your source code through PC-lint.
A memory checking program such as Bounds-Checker by Nu-Mega Technologies or MemCheck by StratosWare Corporation will help you locate problems such as array overflows. I have not tried either of these programs, but again, some programmers swear by them. If you think you need help in this area, I encourage you to check them out.
The best debugging happens before bugs occur. If you design and write your code carefully, develop good coding habits, and test your code thoroughly, you can minimize your debugging problems.
I feel like your mother, nagging you to do what you already know you should do. Be consistent in your indentations! Use meaningful variable names! Include lots of comments! Keep your elbows off the table! You know how to write clean code. Just do it.
I have heard people say there is no such thing as a bug-free program. I don't believe that. My background is in mathematics, where there is always a correct solution to any problem. I believe, with any program, there will be one correct way to write it. There may also be 100,000 incorrect ways. Some of the incorrect ways involve major bugs, and some of them involve bugs that are so obtuse they may not surface for years.
Beta testers are your first line of defense against both the obvious and the obtuse bugs. Use them liberally. You may notice, not all beta testers are created equal. There are certain beta testers that find more than their share of bugs. I don't know why this is. Perhaps they spend more time with the program, or they are more experimental with keystroke combinations, or perhaps they are more aggressive in their quest to find your mistakes. (Their attitude may be "Aha! Gotcha! You may be a brilliant game programmer, but I found your big mistake!") When it comes to finding bugs, perhaps some people are just lucky that way. Use and appreciate the people that can find your bugs. Give them free copies of your game, and if possible, mention their name somewhere in your documentation. It doesn't cost much to keep a beta tester happy, and it is worth every penny. Every bug found before your release will save you a lot of pain after your release.
Have you ever accidentally put an fopen() call inside a loop? This would cause the same file to be opened over and over, allocating a 512 byte buffer each time, so eventually your program would run out of room and crash. That's bad! Don't do that!
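The cure is to open the file once before the loop and close it once after. Here is a sketch of that pattern; write_levels() and the text it writes are hypothetical and serve only to show where the calls belong.

```c
#include <stdio.h>

/* Write 'count' lines to a file, opening the file only once.
   Returns the number of lines written, or -1 if the file cannot be opened. */
int write_levels(const char *filename, int count)
{
    FILE *stream;
    int i;

    stream = fopen(filename, "w");     /* open ONCE, outside the loop */
    if (stream == NULL)
        return -1;

    for (i = 0; i < count; i++)
        fprintf(stream, "level %d\n", i);

    fclose(stream);                    /* one fclose() for the one fopen() */
    return count;
}
```

Checking the return value of fopen() matters here too: if the open fails, every later write would go through a null pointer.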
Have you ever used C's memcpy() function to copy one array into another array, without noticing one array was declared int and the other was declared char? That's dangerous! Don't do that either!
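The danger is that memcpy() copies raw bytes and knows nothing about element types, so several char values end up packed into each int, and the byte count is easy to get wrong as well. When the element types differ, a plain loop that converts one element at a time is a safer choice. A minimal sketch, with NTILES and copy_tiles() as hypothetical names:

```c
#define NTILES 4                   /* hypothetical number of elements */

/* Copy NTILES values from a char array into an int array.
   Each element is converted individually, which memcpy() cannot do. */
void copy_tiles(int *dest, const unsigned char *src)
{
    int i;
    for (i = 0; i < NTILES; i++)
        dest[i] = src[i];
}
```

If both arrays really do have the same element type, memcpy() is fine, as long as the byte count is computed from the size of the array being written to.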
Some bugs just defy description. Even if you are an experienced C programmer and you think you have already encountered every bug known to man, a new one will crop up that will leave you baffled. The worst part about it is, when you finally figure it out, you will not feel relieved or triumphant, you will probably just feel dumb.
Debugging is a painful, but necessary, part of the development cycle. I hope this chapter gives you some ideas to make the process a little easier for you. The only other thing I can offer you is a few words of encouragement--hang in there! It's hard, but you can do it!
Copyright © 1998 Ted Gruber Software Inc. All Rights Reserved.