Version 8 (modified by davea, 12 years ago)

Debugging server components

A grab-bag of techniques for debugging BOINC server software:

Log files

Each server component (scheduler, feeder, transitioner, etc.) has its own log file. These files are in the log_HOSTNAME subdirectory of the project directory. Most error conditions are reported in the log files.

If you're interested in the history of a particular job, grep for WU#12345 or RESULT#12345 (where 12345 represents the ID) in the log files. The html/ops pages also provide an interface for this.
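The log search described above can be sketched as a small shell function. This is only an illustration: the function name job_history, the *.log file pattern, and the use of the hostname command to derive the log_HOSTNAME directory are assumptions, not part of BOINC. The pattern requires a non-digit after the ID so that WU#12345 does not also match WU#123456.

```shell
#!/bin/sh
# job_history KIND ID [LOG_DIR]   (hypothetical helper, not a BOINC tool)
#   KIND    - WU or RESULT
#   ID      - numeric workunit or result ID
#   LOG_DIR - log directory; defaults to log_$(hostname), which assumes
#             HOSTNAME matches the output of the hostname command
job_history() {
    kind=$1
    id=$2
    log_dir=${3:-log_$(hostname)}
    # -H prints the file name, -n the line number.
    # ([^0-9]|$) stops WU#12345 from matching WU#123456.
    grep -H -n -E "${kind}#${id}([^0-9]|\$)" "$log_dir"/*.log
}
```

Run from the project directory, for example as: job_history RESULT 12345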

To control the verbosity of the log files:

Debugging the scheduler

Start by setting the appropriate logging options and examining the log file.

If this doesn't reveal the problem, or if the scheduler is crashing, you can run it under a debugger like gdb. The scheduler is a CGI program; it reads a request from stdin and writes a reply to stdout. So you can debug it as follows:

  • Copy the "scheduler_request_X.xml" file from a client to the machine running the scheduler. (X = your project URL)
  • Run the scheduler under the debugger, giving it this file as stdin, i.e.:
    gdb cgi
    (set a breakpoint if desired)
    r < scheduler_request_X.xml
    
  • You may have to doctor the database as follows to keep the scheduler from rejecting the request (N is the host's ID):
    update host set rpc_seqno=0, rpc_time=0 where id=N;
    

This is useful for figuring out why your project is generating 'no work available' messages.
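The replay procedure above can also be run non-interactively. The following is a sketch under stated assumptions: the function names are hypothetical, the scheduler binary is assumed to be ./cgi in the current directory, the request file name is assumed to contain no spaces, and the host table's key column is assumed to be id.

```shell
#!/bin/sh
# reset_host_rpc_sql HOSTID   (hypothetical helper)
# Prints the SQL that resets a host's RPC bookkeeping. Rejects anything
# that is not a plain number so the statement cannot be altered.
reset_host_rpc_sql() {
    case $1 in
        ''|*[!0-9]*)
            echo "host ID must be a number" >&2
            return 1
            ;;
    esac
    printf 'update host set rpc_seqno=0, rpc_time=0 where id=%s;\n' "$1"
}

# replay_request FILE   (hypothetical helper)
# Runs the scheduler under gdb in batch mode with FILE as stdin, then
# prints a backtrace (gdb reports "No stack." if the run ended normally).
replay_request() {
    gdb -batch -ex "run < $1" -ex bt ./cgi
}
```

For example, reset_host_rpc_sql 42 | mysql PROJECT_DB (where PROJECT_DB stands for your project's database name) followed by replay_request scheduler_request_X.xml. Batch mode is convenient for capturing a crash backtrace; for breakpoints, the interactive session shown earlier is still the better fit.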

Alternatively, edit sched/handle_request.cpp and add a call to debug_sched("debug_sched"); just before sreply.write(fout, sreq);. After recompiling, create a file named 'debug_sched' in the project root directory (for example with touch). While this file exists, transcripts of all scheduler requests and replies are written to the cgi-bin/ directory, with a separate small file for each request. The files are named sched_request_H_R and sched_reply_H_R, where H is the host ID and R is the RPC sequence number. To turn this off, delete the 'debug_sched' file.
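Once transcripts are being written, it helps to view one host's files in RPC order. A minimal sketch, assuming the file naming described above; the function name list_sched_transcripts and the default cgi-bin directory argument are illustrative only.

```shell
#!/bin/sh
# list_sched_transcripts HOSTID [CGI_DIR]   (hypothetical helper)
# Lists sched_request_H_R and sched_reply_H_R files for one host,
# ordered by RPC sequence number R, each request before its reply.
list_sched_transcripts() {
    host=$1
    dir=${2:-cgi-bin}
    # Anchoring the pattern keeps host 7 from matching host 70.
    # Fields split on "_" are: sched, request|reply, H, R.
    ls "$dir" \
        | grep -E "^sched_(request|reply)_${host}_[0-9]+\$" \
        | sort -t_ -k4,4n -k2,2r
}
```

Typical use from the project root: touch debug_sched, wait for the client to contact the scheduler, run list_sched_transcripts 42, then rm debug_sched.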

To get core files for scheduler crashes, uncomment the following line in sched/sched_main.cpp, and recompile:

#define DUMP_CORE_ON_SEGV 1

MySQL interfaces

You should become familiar with MySQL tools such as

Database query tracing

If you run server components with -d 4, their database queries will be logged. This is verbose but extremely useful for tracking down database-level problems.