Debugging MPI programs with GDB
Debugging programs that run in parallel can be really difficult. It is hard to keep track of state across multiple processes and observe bugs that are timing-dependent, such as deadlocks. This post will not solve any of these problems, but at least it will show you how to easily attach a debugger to a C++ program that uses the Message Passing Interface (MPI).
This is a quick tip in two parts.
Part 1: The code snippet
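#include <fstream>
#include <iostream>
#include <unistd.h>
using namespace std;

void attach_debugger(bool condition) {
    if (!condition) return;
    // volatile: the flag is only ever changed from the debugger, so the
    // compiler must not assume the loop condition below never changes
    volatile bool attached = false;

    // also write PID to a file
    ofstream os("/tmp/mpi_debug.pid");
    os << getpid() << endl;
    os.close();

    cout << "Waiting for debugger to be attached, PID: "
         << getpid() << endl;
    // wait until the debugger sets attached to true
    while (!attached) sleep(1);
}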
This snippet checks whether the condition is true (such as rank == 0), and if so writes its process ID (PID) to both standard output and a temporary file. After that, it enters an endless loop with seemingly no means of escape, if it weren’t for Part 2!
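If you would rather not recompile every time you want to debug a different rank, the condition can come from the environment. The following is only a sketch: the helper should_attach and the variable name MPI_DEBUG_RANK are made up for this post and are not part of MPI.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true when `rank` matches the rank named in
// the MPI_DEBUG_RANK environment variable (a name invented for this sketch).
bool should_attach(int rank) {
    const char* value = std::getenv("MPI_DEBUG_RANK");
    if (value == nullptr) return false;  // variable unset: never wait
    try {
        return std::stoi(value) == rank;
    } catch (const std::exception&) {
        return false;                    // value does not start with a number
    }
}
```

You would then call attach_debugger(should_attach(rank)) right after obtaining rank from MPI_Comm_rank, and start the program with something like MPI_DEBUG_RANK=1 mpirun -n 4 ./app, assuming your launcher forwards the environment variable to all ranks.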
Part 2: The debugger
I typically use the GNU Debugger (GDB), since it can be used easily from the command line.
When you run an MPI program that contains the above snippet, it prompts you to attach a debugger.
Now, you could type gdb -p xyz, where xyz is the PID the program just printed, change the attached flag, and resume program execution, but that gets tedious if you have to do it twice, or twenty times in a row!
Instead, put the following snippet into ~/.gdbinit:
define dbg
    select-frame function attach_debugger
    set var attached = true
end
Then you can simply run gdb -p $(</tmp/mpi_debug.pid), type dbg and press Enter, and voilà, you are ready to search for those pesky bugs. Set your breakpoints, then type continue to let the program leave the waiting loop.
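One limitation of a single /tmp/mpi_debug.pid is that it gets overwritten if the condition is true on more than one rank. A possible variation, sketched here with hypothetical helper names and a made-up file naming scheme, writes one PID file per rank:

```cpp
#include <fstream>
#include <string>
#include <unistd.h>

// Hypothetical naming scheme: one PID file per rank,
// e.g. /tmp/mpi_debug.3.pid for rank 3.
std::string pid_file_for_rank(int rank) {
    return "/tmp/mpi_debug." + std::to_string(rank) + ".pid";
}

// Write this process's PID to the file for the given rank.
// Returns false if the file could not be opened or written.
bool write_pid_file(int rank) {
    std::ofstream os(pid_file_for_rank(rank));
    os << getpid() << '\n';
    os.close();
    return !os.fail();
}
```

With this in place of the fixed file name, gdb -p $(</tmp/mpi_debug.1.pid) would attach to rank 1. Keep in mind that /tmp is usually local to each machine, so this only helps for ranks running on the node where you start GDB.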
Happy bug hunting!