[[PageOutline]]
= Generating work =
To submit a job:
* Write XML 'template files' that describe the job's input and output files (typically the same template files will be used for many jobs).
* Create the job's input file(s) in the download directory.
* Invoke a BOINC function or script that submits the job.
Once this is done, BOINC takes over: it creates one or more instances of the job, distributes them to client hosts, collects the output files, finds a canonical instance, assimilates the canonical instance, and deletes files.
During the testing phase of a project, you can use the [http://boinc.berkeley.edu/busy_work.php make_work daemon] to replicate a given workunit as needed to maintain a constant supply of work. This is useful while testing and debugging the application.
== Input and output template files ==
An input template file has the form
{{{
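<file_info>
    <number>0</number>
    [ <sticky/>, other attributes]
</file_info>
[ ... ]
<workunit>
    <file_ref>
        <file_number>0</file_number>
        <open_name>NAME</open_name>
    </file_ref>
    [ ... ]
    [ <command_line>-flags xyz</command_line> ]
    [ <rsc_fpops_est>x</rsc_fpops_est> ]
    [ <rsc_fpops_bound>x</rsc_fpops_bound> ]
    [ <rsc_memory_bound>x</rsc_memory_bound> ]
    [ <rsc_disk_bound>x</rsc_disk_bound> ]
    [ <delay_bound>x</delay_bound> ]
    [ <min_quorum>x</min_quorum> ]
    [ <target_nresults>x</target_nresults> ]
    [ <max_error_results>x</max_error_results> ]
    [ <max_total_results>x</max_total_results> ]
    [ <max_success_results>x</max_success_results> ]
    [ <credit>X</credit> ]
</workunit>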
}}}
The components are:
 <file_info>, <file_ref>::
   Each pair describes an [http://boinc.berkeley.edu/files.php#file input file] and [http://boinc.berkeley.edu/files.php#file_ref the way it's referenced].
 <command_line>::
   The command-line arguments to be passed to the main program.
 <credit>::
   The amount of credit to be granted for successful completion of this workunit. Use this only if you know in advance how many FLOPs it will take. Your [http://boinc.berkeley.edu/validate_simple.php validator] must use get_credit_from_wu() as its compute_granted_credit() function.
 Other elements::
   [http://boinc.berkeley.edu/work.php Work unit attributes]
Workunit database records include a field, 'xml_doc', that is an XML-format description of the workunit's input files. This is derived from the workunit template as follows:
* Within a <file_info> element, <number>x</number> identifies the order of the file. It is replaced with elements giving the filename, download URL, MD5 checksum, and size.
* Within a <file_ref> element, <file_number>x</file_number> is replaced with an element giving the filename.
An output template file has the form
{{{
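<file_info>
    <name><OUTFILE_0/></name>
    <generated_locally/>
    <upload_when_present/>
    <max_nbytes>32768</max_nbytes>
    <url><UPLOAD_URL/></url>
</file_info>
<result>
    <file_ref>
        <file_name><OUTFILE_0/></file_name>
        <open_name>result.sah</open_name>
    </file_ref>
</result>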
}}}
Result database records include a field, 'xml_doc_in', that is an XML-format description of the result's output files. This is derived from the result template as follows:
* <OUTFILE_n/> is replaced with a string of the form 'wuname_resultnum_n' where wuname is the workunit name and resultnum is the ordinal number of the result (0, 1, ...).
* <UPLOAD_URL/> is replaced with the upload URL.
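The naming pattern for output files can be illustrated with a small sketch. The helper `outfile_name` below is hypothetical (it is not part of BOINC); it only shows how a name of the form 'wuname_resultnum_n' is assembled from its three parts.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not a BOINC function: builds an output file name
// of the form 'wuname_resultnum_n'.
//   wuname    - the workunit name
//   resultnum - ordinal number of the result (0, 1, ...)
//   n         - index of the output file within the result
std::string outfile_name(const std::string& wuname, int resultnum, int n) {
    assert(resultnum >= 0 && n >= 0);  // ordinals and indices start at 0
    return wuname + "_" + std::to_string(resultnum) + "_" + std::to_string(n);
}
```

For a workunit named 'wu', result 1, output file 2, this yields 'wu_1_2'.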
== Moving input files to the download directory ==
If you're using a flat download directory, just put input files in that directory. If you're using [http://boinc.berkeley.edu/hier_dir.php hierarchical upload/download directories], you must put each input file in the appropriate directory; the directory is determined by the file's name. To find this directory, call the C++ function
{{{
dir_hier_path(
    const char* filename,
    const char* root,       // root of download directory
    int fanout,             // from config.xml
    char* result,           // path of file in hierarchy
    bool create_dir=false   // create dir if it's not there
);
}}}
If you're using scripts, you can invoke the program
{{{
dir_hier_path filename
}}}
It prints the full pathname and creates the directory if needed. Run this in the project's root directory. For example:
{{{
cp test_workunits/12ja04aa `bin/dir_hier_path 12ja04aa`
}}}
copies an input file from the test_workunits directory to the download directory hierarchy.
== Creating workunit records ==
Workunits can be created using either a script (using the create_work program) or a program (using the create_work() function). The input files must already be in the download hierarchy.
The utility program is
{{{
create_work
    -appname name                   // application name
    -wu_name name                   // workunit name
    -wu_template filename           // WU template filename
                                    // relative to project root; usually in templates/
    -result_template filename       // result template filename
                                    // relative to project root; usually in templates/
    [ -batch n ]
    [ -priority n ]

    // The following may be passed in the WU template,
    // or as command-line arguments to create_work,
    // or not passed at all (defaults will be used)

    [ -command_line "-flags foo" ]
    [ -rsc_fpops_est x ]
    [ -rsc_fpops_bound x ]
    [ -rsc_memory_bound x ]
    [ -rsc_disk_bound x ]
    [ -delay_bound x ]
    [ -min_quorum x ]
    [ -target_nresults x ]
    [ -max_error_results x ]
    [ -max_total_results x ]
    [ -max_success_results x ]
    [ -additional_xml 'x' ]

    infile_1 ... infile_m           // input files
}}}
The program must be run in the project root directory. The workunit parameters are documented [http://boinc.berkeley.edu/work.php here]. The -additional_xml argument can be used to supply, for example, <credit>12.4</credit>.
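When many jobs are submitted from a driver program, it can help to assemble the create_work command line in one place. The sketch below is an illustration only: `create_work_command` is a hypothetical helper, it covers just the four required options plus the input files, and it does no shell quoting, so the arguments are assumed to contain no spaces or shell metacharacters.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of BOINC: builds a create_work command
// line from the required options and the list of input files.
// No quoting is applied; arguments are assumed to be shell-safe.
std::string create_work_command(
    const std::string& appname,
    const std::string& wu_name,
    const std::string& wu_template,      // relative to project root
    const std::string& result_template,  // relative to project root
    const std::vector<std::string>& infiles
) {
    assert(!appname.empty() && !wu_name.empty());
    std::string cmd = "create_work";
    cmd += " -appname " + appname;
    cmd += " -wu_name " + wu_name;
    cmd += " -wu_template " + wu_template;
    cmd += " -result_template " + result_template;
    for (const std::string& f : infiles) {
        cmd += " " + f;  // input files come last
    }
    return cmd;
}
```

The resulting string would be run from the project root directory, since the template paths are relative to it.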
BOINC's library (backend_lib.C,h) provides the function:
{{{
int create_work(
    DB_WORKUNIT&,
    const char* wu_template,                  // contents, not path
    const char* result_template_filename,     // relative to project root
    const char* result_template_filepath,     // absolute or relative to current dir
    const char** infiles,                     // array of input file names
    int ninfiles,
    SCHED_CONFIG&,
    const char* command_line = NULL,
    const char* additional_xml = NULL
);
}}}
create_work() creates a workunit. The arguments are similar to those of the utility program; some of the information is passed in the DB_WORKUNIT structure, namely the following fields:
{{{
name
appid
}}}
The following may be passed either in the DB_WORKUNIT structure or in the workunit template file:
{{{
rsc_fpops_est
rsc_fpops_bound
rsc_memory_bound
rsc_disk_bound
batch
delay_bound
min_quorum
target_nresults
max_error_results
max_total_results
max_success_results
}}}
== Examples ==
=== Making one workunit ===
Here's a program that generates one workunit (error-checking is omitted for clarity):
{{{
#include <cstring>
#include "backend_lib.h"

int main() {
    DB_APP app;
    DB_WORKUNIT wu;
    char wu_template[LARGE_BLOB_SIZE];
    const char* infiles[] = {"infile"};
    SCHED_CONFIG config;

    config.parse_file();    // reads config.xml from the current directory
    boinc_db.open(config.db_name, config.db_host, config.db_passwd);
    app.lookup("where name='myappname'");

    wu.clear();     // zeroes all fields
    strcpy(wu.name, "test_wu");     // workunit name; "test_wu" is a placeholder
    wu.appid = app.id;
    wu.min_quorum = 2;
    wu.target_nresults = 2;
    wu.max_error_results = 5;
    wu.max_total_results = 5;
    wu.max_success_results = 5;
    wu.rsc_fpops_est = 1e10;
    wu.rsc_fpops_bound = 1e11;
    wu.rsc_memory_bound = 1e8;
    wu.rsc_disk_bound = 1e8;
    wu.delay_bound = 7*86400;   // one week, in seconds

    // create_work() takes the template contents, not its path
    read_filename("templates/wu_template.xml", wu_template, sizeof(wu_template));
    create_work(
        wu,
        wu_template,
        "templates/results_template.xml",
        "templates/results_template.xml",
        infiles,
        1,
        config
    );
    return 0;
}
}}}
This program must be run in the project directory since it expects to find the config.xml file in the current directory.
=== Making lots of workunits ===
If you're making lots of workunits (e.g. to do the various parts of a parallel computation) you'll want the workunits to differ either in their input files, their command-line arguments, or both.
For example, let's say you want to run a program on ten input files 'file0', 'file1', ..., 'file9'. You might modify the above program with the following code:
{{{
char filename[256];
const char* infiles[1];
infiles[0] = filename;
...
for (int i=0; i<10; i++) {
    sprintf(filename, "file%d", i);
    // give each workunit its own name; this naming scheme is only an example
    sprintf(wu.name, "wu_file%d", i);
    create_work(
        wu,
        wu_template,
        "templates/results_template.xml",
        "templates/results_template.xml",
        infiles,
        1,
        config
    );
}
}}}
Note that you only need one workunit template file and one result template file.
Now suppose you want to run a program against a single input file, but with ten command lines, '-flag 0', '-flag 1', ..., '-flag 9'. You might modify the above program with the following code:
{{{
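char command_line[256];
...
for (int i=0; i<10; i++) {
    sprintf(command_line, "-flag %d", i);
    // give each workunit its own name; this naming scheme is only an example
    sprintf(wu.name, "wu_flag%d", i);
    create_work(
        wu,
        wu_template,
        "templates/results_template.xml",
        "templates/results_template.xml",
        infiles,
        1,
        config,
        command_line
    );
}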
}}}
Again, you only need one workunit template file and one result template file.
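For larger batches it may be convenient to generate the per-workunit command lines up front, so they can be logged or checked before any workunit is created. The following is a minimal sketch under that assumption; `make_command_lines` is a hypothetical helper, and it uses snprintf so that a long argument cannot overflow the buffer.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper, not part of BOINC: returns the command lines
// "-flag 0", "-flag 1", ..., "-flag <count-1>".
std::vector<std::string> make_command_lines(int count) {
    std::vector<std::string> lines;
    for (int i = 0; i < count; i++) {
        char buf[256];
        // snprintf never writes more than sizeof(buf) bytes
        std::snprintf(buf, sizeof(buf), "-flag %d", i);
        lines.push_back(buf);
    }
    return lines;
}
```

Each entry's c_str() could then be passed as the command_line argument of create_work() inside the submission loop.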