The BOINC Wrapper
Any existing application (or sequence of applications) can be run under BOINC using a wrapper program supplied by BOINC. The wrapper runs the applications as subprocesses, and handles all communication with the core client (e.g., to report CPU time and fraction done).
The source code of wrapper is in boinc/samples. You can get pre-compiled versions here:
The job description file
The wrapper reads a file with logical name 'job.xml'. This file has the format:
    <job_desc>
        [ <parallel/> ]
        <task>
            <application>worker</application>
            [ <stdin_filename>stdin_file</stdin_filename> ]
            [ <stdout_filename>stdout_file</stdout_filename> ]
            [ <stderr_filename>stderr_file</stderr_filename> ]
            [ <command_line>--foo bar</command_line> ]
            [ <weight>X</weight> ]
            [ <checkpoint_filename>filename</checkpoint_filename> ]
            [ <fraction_done_filename>filename</fraction_done_filename> ]
            [ <exec_dir>dirname</exec_dir> ]
            [ <setenv>VARNAME=VAR_VALUE</setenv> ]
            [ <daemon/> ]
            [ <append_cmdline_args/> ]
        </task>
        [ ... ]
    </job_desc>
The job file describes a sequence of tasks. Include the <parallel/> element if any of the tasks is multi-threaded (see Multi-Thread Apps). The descriptor for each task includes:
- application
- The logical name of the application, or 'worker program'.
- stdin_filename, stdout_filename, stderr_filename
- The logical names of the files to which stdin, stdout, and stderr are to be connected (if any).
- command_line
- Command-line arguments to be passed to the worker program. The wrapper itself may be passed command-line arguments (specified in the input template); these are passed to each worker program after those specified in the job file.
- weight
- The contribution of each task to the overall fraction done is proportional to its weight (floating-point, default 1). For example, if your job has tasks A and B, and A uses 100 times more CPU time than B, set A.weight=100 and B.weight=1.
- checkpoint_filename
- The name of the checkpoint file used by the app, if any. When this file is modified, the wrapper assumes that a checkpoint has been completed and notifies the core client.
- fraction_done_filename
- The name of a file to which the app periodically writes its fraction done (0 to 1). The wrapper uses this to report overall fraction done.
- exec_dir
- The directory in which to start the application (relative to the slot directory, or use the $PROJECT_DIR macro).
- setenv
- An environment variable needed for the application's run-time environment. You can have more than one <setenv> entry; use the VARNAME=VAR_VALUE form, e.g. LD_LIBRARY_PATH=$PROJECT_DIR:$LD_LIBRARY_PATH.
- daemon
- Denotes that this task is a 'daemon' process that should run in the background asynchronously while the other tasks are run sequentially. The wrapper shuts down this daemon when the last task has exited.
- append_cmdline_args
- If set, the wrapper's command-line arguments are passed to the program (after those in <command_line>).
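To make the effect of the weight element concrete, here is a minimal Python sketch of how an overall fraction done could be derived from per-task weights. It is an illustration only: the function name and the exact formula (finished weight plus the running task's partial weight, divided by total weight) are assumptions, not the wrapper's actual code.

```python
def overall_fraction_done(weights, completed, current_fraction):
    """Estimate overall progress for a sequence of weighted tasks.

    weights: per-task weights in execution order (the default weight is 1).
    completed: number of tasks that have already finished.
    current_fraction: fraction done (0 to 1) of the task now running.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weight of all finished tasks ...
    done = sum(weights[:completed])
    # ... plus the partial contribution of the running task, if any.
    if completed < len(weights):
        done += weights[completed] * current_fraction
    return done / total


# Tasks A and B from the text: A uses 100 times more CPU time than B.
print(overall_fraction_done([100.0, 1.0], completed=0, current_fraction=0.5))
print(overall_fraction_done([100.0, 1.0], completed=1, current_fraction=0.0))
```

With these weights, task A being half done corresponds to roughly half of the whole job, and finishing A brings the job to about 99%.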
The job file can specify multiple tasks. This is useful for two purposes:
- To handle jobs that involve multiple steps (e.g., pre-processing and post-processing).
- To break a long job up into smaller pieces. This provides a form of checkpointing: wrapper does checkpointing at the task level, so that lost CPU time is limited even if the legacy applications themselves are not restartable.
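As an illustration of a multi-step job, the following Python sketch generates a job file with three tasks. The element names come from the format described above; the logical application names 'preprocess' and 'postprocess', the weights, and the helper name build_job_desc are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_job_desc(tasks):
    """Return job.xml text for a list of task dicts.

    Each dict maps a task element name (e.g. 'application') to its text.
    A value of None produces an empty element such as <daemon />.
    """
    root = ET.Element("job_desc")
    for task in tasks:
        task_el = ET.SubElement(root, "task")
        for name, value in task.items():
            child = ET.SubElement(task_el, name)
            if value is not None:
                child.text = str(value)
    if hasattr(ET, "indent"):  # pretty-printing is available in Python 3.9+
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Hypothetical three-step job: the middle step dominates the CPU time.
job_xml = build_job_desc([
    {"application": "preprocess", "weight": 1},
    {"application": "worker", "command_line": "10", "weight": 100},
    {"application": "postprocess", "weight": 1},
])
print(job_xml)
```

Because the wrapper checkpoints at the task level, a restart of such a job would resume at the first unfinished task rather than at the beginning.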
Notes:
- Normally the job file is part of the application version (it's the same between workunits). Alternatively, it can be part of the workunit (e.g. if its command line elements differ between workunits). This requires that you use the same worker program logical names for all platforms.
- Files opened directly by a worker program must have the <copy_file/> tag. This requires version 5.5 or higher of the BOINC core client (you can specify this limit at either the application or project level).
- Worker programs must exit with zero status; nonzero values are interpreted as errors by the wrapper.
- If you run wrapper in standalone mode (while debugging), you must provide input files with the proper logical, not physical, names.
- The job file may differ slightly between platforms (i.e. app_versions) because of directory requirements (exec_dir) and environment variables (setenv). You will therefore want to create and track a separate version for each app_version you support.
Physical file management
You can use the wrapper together with physical file management, where you directly access files in your project directory. For example, you could create a job whose first task unpacks a zip file into the project directory, and whose subsequent tasks access these files.
The support for this is:
- If a worker program name begins with "$PROJECT_DIR", that substring is replaced with the project directory, and the name is treated as a physical name.
- In task command lines, "$PROJECT_DIR" is replaced with the project directory.
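The two substitution rules above can be sketched in Python as follows. This only illustrates the described behaviour, assuming a plain textual replacement; the function names and the example paths are hypothetical and do not mirror the wrapper's source.

```python
PROJECT_DIR_MACRO = "$PROJECT_DIR"


def resolve_application(name, project_dir):
    """Return (path, is_physical) for a worker program name.

    A name that begins with $PROJECT_DIR is expanded and treated as a
    physical name; any other name is left alone as a logical name.
    """
    if name.startswith(PROJECT_DIR_MACRO):
        return project_dir + name[len(PROJECT_DIR_MACRO):], True
    return name, False


def expand_command_line(command_line, project_dir):
    """Replace every $PROJECT_DIR in a task command line."""
    return command_line.replace(PROJECT_DIR_MACRO, project_dir)


# Hypothetical project directory and file names.
print(resolve_application("$PROJECT_DIR/unpacked/tool", "/boinc/projects/example"))
print(resolve_application("worker", "/boinc/projects/example"))
print(expand_command_line("--data $PROJECT_DIR/unpacked/data.bin", "/boinc/projects/example"))
```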
Graphics
You can include a graphics app with a wrapper-based application. If you do this, pass the --graphics option to wrapper.
Example
Here's an example that shows how to use the wrapper. We assume that you have already created a project with root directory PROJECT/, and that you have an executable program for a particular target platform (say "worker_windows_intelx86_0.exe" for Win32).
- Download the wrapper for the target platform (see links above) to your server.
- Create an application named 'worker' and a corresponding directory 'PROJECT/apps/worker'. In this directory, create a directory 'wrapper_windows_intelx86_22420.exe'. Put the files 'wrapper_windows_intelx86_22420.exe' and 'worker_windows_intelx86_0.exe' there. Rename the latter file to 'worker=worker_windows_intelx86_0.exe' (this gives it the logical name 'worker').
- In the same directory, create a file 'job.xml=job_1.12.xml' (1.12 is a version number) containing:
    <job_desc>
        <task>
            <application>worker</application>
            <stdin_filename>stdin</stdin_filename>
            <stdout_filename>stdout</stdout_filename>
            <command_line>10</command_line>
        </task>
    </job_desc>
This file (which has logical name 'job.xml' and physical name 'job_1.12.xml') is read by 'wrapper'; it tells the wrapper the name of the worker program, which files to connect to its stdin/stdout, and the command line.
- In the 'PROJECT/templates' directory create a workunit template file called 'worker_wu':
    <file_info>
        <number>0</number>
    </file_info>
    <file_info>
        <number>1</number>
    </file_info>
    <workunit>
        <file_ref>
            <file_number>0</file_number>
            <open_name>in</open_name>
            <copy_file/>
        </file_ref>
        <file_ref>
            <file_number>1</file_number>
            <open_name>stdin</open_name>
        </file_ref>
        <rsc_fpops_bound>1000000000000</rsc_fpops_bound>
        <rsc_fpops_est>1000000000000</rsc_fpops_est>
    </workunit>
and a result template file called 'worker_result':
    <file_info>
        <name><OUTFILE_0/></name>
        <generated_locally/>
        <upload_when_present/>
        <max_nbytes>5000000</max_nbytes>
        <url><UPLOAD_URL/></url>
    </file_info>
    <file_info>
        <name><OUTFILE_1/></name>
        <generated_locally/>
        <upload_when_present/>
        <max_nbytes>5000000</max_nbytes>
        <url><UPLOAD_URL/></url>
    </file_info>
    <result>
        <file_ref>
            <file_name><OUTFILE_0/></file_name>
            <open_name>out</open_name>
            <copy_file/>
        </file_ref>
        <file_ref>
            <file_name><OUTFILE_1/></file_name>
            <open_name>stdout</open_name>
        </file_ref>
    </result>
- Run bin/update_versions to create an app version and to copy the application files to the 'PROJECT/download' directory.
- Run bin/start to start the daemons.
- Run a script like
    #!/bin/sh
    cp download/input `bin/dir_hier_path input`
    cp download/input2 `bin/dir_hier_path input2`
    bin/create_work -appname worker -wu_name worker_nodelete \
        -wu_template templates/worker_wu \
        -result_template templates/worker_result \
        input input2
to generate a workunit. The input files in the 'create_work' command must be in the same order as in the workunit template file (worker_wu). Otherwise the client will generate errors when processing the workunit.
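The ordering requirement can be checked before submitting work. The Python sketch below pairs each input file with the open name it would receive, based on the file numbers in a workunit template. It is a hypothetical helper, not part of BOINC; the template text is a condensed copy of worker_wu, and the dummy <root> wrapper is needed only because the template has several top-level elements.

```python
import xml.etree.ElementTree as ET


def template_open_names(template_text):
    """Return the open names of a workunit template, ordered by file number."""
    # The template has several top-level elements, so wrap it in a dummy root.
    root = ET.fromstring("<root>" + template_text + "</root>")
    refs = root.findall("workunit/file_ref")
    refs.sort(key=lambda ref: int(ref.findtext("file_number")))
    return [ref.findtext("open_name") for ref in refs]


def pair_inputs(template_text, input_files):
    """Map each open name to the input file given in the same position."""
    open_names = template_open_names(template_text)
    if len(open_names) != len(input_files):
        raise ValueError(
            f"template expects {len(open_names)} input files, got {len(input_files)}"
        )
    return dict(zip(open_names, input_files))


WORKER_WU = """
<file_info><number>0</number></file_info>
<file_info><number>1</number></file_info>
<workunit>
    <file_ref><file_number>0</file_number><open_name>in</open_name><copy_file/></file_ref>
    <file_ref><file_number>1</file_number><open_name>stdin</open_name></file_ref>
</workunit>
"""

print(pair_inputs(WORKER_WU, ["input", "input2"]))
# prints {'in': 'input', 'stdin': 'input2'}
```

Swapping the two arguments would connect 'input2' to 'in' and 'input' to 'stdin', which is the kind of mismatch the warning above is about.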
To understand how all this works: at the beginning of execution, the file layout is:
Project directory | Slot directory |
---|---|
input | in (copy of project/input) |
job_1.12.xml | job.xml (link to project/job_1.12.xml) |
input2 | stdin (link to project/input2) |
worker_nodelete_0 | stdout (link to project/worker_nodelete_0) |
worker_5.10_windows_intelx86.exe | worker (link to project/worker_5.10_windows_intelx86.exe) |
wrapper_5.10_windows_intelx86.exe | wrapper_5.10_windows_intelx86.exe (link to project/wrapper_5.10_windows_intelx86.exe) |
The wrapper program executes the worker, connecting its stdin to project/input2 and its stdout to project/worker_nodelete_0.
The worker program opens 'in' for reading and 'out' for writing.
When the worker program finishes, the wrapper sees this and exits. Then the BOINC core client copies slot/out to project/worker_nodelete_1.
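The 'logical=physical' file naming used when setting up the application determines the names that appear in the slot directory. A minimal Python sketch of that naming rule follows; it assumes that a file name without '=' uses the same name on both sides, and the helper name is hypothetical.

```python
def split_logical_name(filename):
    """Split 'logical=physical' into (logical, physical).

    A name without '=' is assumed to serve as both names.
    """
    logical, sep, physical = filename.partition("=")
    if not sep:
        return filename, filename
    return logical, physical


# File names as placed in the application directory in the example.
APP_FILES = [
    "worker=worker_windows_intelx86_0.exe",
    "job.xml=job_1.12.xml",
    "wrapper_windows_intelx86_22420.exe",
]

for name in APP_FILES:
    logical, physical = split_logical_name(name)
    print(f"slot/{logical} -> project/{physical}")
```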
GenWrapper: A more general BOINC wrapper
When the functionality of the BOINC wrapper is not enough, GenWrapper offers a more general solution: jobs are described with POSIX-like shell scripts instead of the XML config file. This allows complex control flow (loops, branches, etc.), but remember: "with great power must also come -- great responsibility!"