Version 25 (modified by davea, 13 years ago)

API for remote job submission

This document describes an API for remotely submitting jobs to a BOINC server. The API supports the submission of batches of jobs. A batch could contain a single job, or many thousands of jobs. Currently, the API has two restrictions:

  • All jobs in a batch must use the same application.
  • There can be no dependencies between jobs.

BOINC provides a PHP library that implements the API. The underlying web services are implemented as HTTP/XML RPCs, and it is possible to create bindings in other languages.

The API can be used, for example, to create web interfaces for job submission, as in a science portal web site.

Access control and quotas

This system is coupled with BOINC's multi-user project features, which include access control: users can submit jobs only if they have been given access by project administrators, and admins can restrict the apps for which each user is allowed to submit jobs. Users also have quotas that determine how resources are allocated to their jobs.

Input and output files

Input files can be supplied in any of the following ways:

  • local: the file is resident on the BOINC server, and is specified by its path.
  • inline: the file is included in the job submission request XML message. It will be served to clients from the BOINC server.
  • semilocal: the file is on a data server other than the BOINC server. It is specified by its URL. It will be downloaded by the BOINC server during job submission, and served to clients from the BOINC server.
  • remote: the file is on a data server other than the BOINC server, and will be served to clients from that data server. It is specified by the URL, the file size, and the file MD5.

As jobs are completed, their output files are available to the submitting user via HTTP. When a batch is complete, a zipped archive of all its output files is available via HTTP.

PHP interface

The following functions are provided in the PHP file submit.inc, which is independent of other BOINC code and can therefore be used directly in a portal's web code.

boinc_submit_batch()

Submits a batch.

Arguments: a "request object" whose fields include

  • project: the project URL
  • authenticator: the user's authenticator
  • app_name: the name of the application for which jobs are being submitted
  • batch_name: a symbolic name for the batch. Need not be unique.
  • jobs: an array of job descriptors, each of which contains
    • rsc_fpops_est: an estimate of the FLOPs used by the job
    • command_line: command-line arguments to the application
    • input_files: an array of input file descriptors, each of which contains
      • mode: "inline", "local", "semilocal", or "remote".
      • url: the file's URL (semilocal and remote modes)
      • size: the file's size in bytes (remote mode)
      • md5: the file's MD5 (remote mode)
      • data: the file contents (inline mode)
      • path: the file's absolute path on the BOINC server (local mode)
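As an illustration of the descriptor fields above, the following sketch builds one input file descriptor per mode and attaches two of them to a job. The helper names (input_file_local() and so on) and the example URL are hypothetical and are not part of submit.inc; only the field names come from the list above. The MD5 is assumed to be a hex digest, which this page does not state.

```php
<?php
// Hypothetical helpers (not part of submit.inc) that build input file
// descriptors using the field names documented above.

function input_file_local($path) {
    $f = new stdClass;
    $f->mode = "local";
    $f->path = $path;      // absolute path on the BOINC server
    return $f;
}

function input_file_inline($data) {
    $f = new stdClass;
    $f->mode = "inline";
    $f->data = $data;      // file contents, sent inside the request
    return $f;
}

function input_file_semilocal($url) {
    $f = new stdClass;
    $f->mode = "semilocal";
    $f->url = $url;        // downloaded by the BOINC server at submission
    return $f;
}

function input_file_remote($url, $size, $md5) {
    $f = new stdClass;
    $f->mode = "remote";
    $f->url = $url;        // clients download directly from this URL
    $f->size = $size;      // size in bytes
    $f->md5 = $md5;        // MD5 of the contents (hex digest assumed)
    return $f;
}

// A job descriptor with one inline and one remote input file.
// The remote file's contents are assumed to be known locally here,
// so that its size and MD5 can be computed.
$contents = "hello world";
$job = new stdClass;
$job->rsc_fpops_est = 1e9;
$job->command_line = "--t 1";
$job->input_files = array(
    input_file_inline($contents),
    input_file_remote("http://data.example.org/in.dat",
                      strlen($contents), md5($contents)),
);
```

Keeping one constructor per mode makes it harder to forget a field that a given mode requires, such as size and md5 for remote files.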

Result: a 2-element array containing

  • The batch ID
  • An error message (null if success)

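The following example builds a 10-job batch and requests a completion-time estimate for it with boinc_estimate_batch(), described below. Passing the same request object to boinc_submit_batch() would submit the batch:

require_once("submit.inc");

$req = new stdClass;
$req->project = "http://foo.bar.edu/test/";
$req->authenticator = "xxx";
$req->app_name = "uppercase";
$req->batch_name = "example";
$req->jobs = array();

// One input file, shared by all jobs. In semilocal mode the BOINC server
// downloads it from this URL during job submission.
$f = new stdClass;
$f->mode = "semilocal";
$f->url = "http://foo.bar.edu/index.php";

for ($i=10; $i<20; $i++) {
    // Create a new object for each job. Reusing a single $job object would
    // make all ten array entries refer to the same (last) job.
    $job = new stdClass;
    $job->input_files = array($f);
    $job->rsc_fpops_est = $i*1e9;
    $job->command_line = "--t $i";
    $req->jobs[] = $job;
}

list($e, $errmsg) = boinc_estimate_batch($req);
if ($errmsg) {
    echo "Error: $errmsg\n";
} else {
    echo "The batch will take about $e seconds to complete\n";
}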

boinc_estimate_batch()

Returns an estimate of the elapsed time required to complete a batch.

Arguments: same as boinc_submit_batch() (only relevant fields need to be populated).

Return value: a 2-element array containing

  • The elapsed time estimate, in seconds
  • An error message (null if success)

boinc_query_batches()

Returns a list of this user's batches, both in progress and complete.

Argument: a request object with elements

  • project and authenticator: as above.

Result: a 2-element array. The first element is an array of batch descriptor objects and the second is an error message (null if success). Each batch descriptor has the following fields:

  • id: batch ID
  • state: values are
    • 1: in progress
    • 2: completed (all jobs either succeeded or had fatal errors)
    • 3: aborted
    • 4: retired
  • name: the batch name
  • app_name: the application name
  • create_time: when the batch was submitted
  • est_completion_time: current estimate of completion time
  • njobs: the number of jobs in the batch
  • fraction_done: the fraction of the batch that has been completed (0..1)
  • nerror_jobs: the number of jobs that had fatal errors
  • completion_time: when the batch was completed
  • credit_estimate: BOINC's initial estimate of the credit that would be granted to complete the batch, including replication
  • credit_canonical: the actual credit granted to canonical instances
  • credit_total: the actual credit granted to all instances
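To show how the descriptor fields above might be used in a portal page, here is a small sketch that turns a batch descriptor into a one-line summary. The function names are hypothetical and not part of submit.inc; the state codes and field names are those listed above. The descriptor in the example is built by hand, whereas real ones would come from boinc_query_batches().

```php
<?php
// Hypothetical helpers (not part of submit.inc) for displaying batch
// descriptors.

// Map the numeric state code to the label used in the list above.
function batch_state_label($state) {
    $labels = array(
        1 => "in progress",
        2 => "completed",
        3 => "aborted",
        4 => "retired",
    );
    return isset($labels[$state]) ? $labels[$state] : "unknown ($state)";
}

// Format one batch descriptor as a single line of text.
function batch_summary($batch) {
    return sprintf(
        "%s [%s]: %d jobs, %.0f%% done, %d failed",
        $batch->name,
        batch_state_label((int)$batch->state),
        $batch->njobs,
        100 * $batch->fraction_done,
        $batch->nerror_jobs
    );
}

// Hand-built descriptor for illustration only.
$b = new stdClass;
$b->name = "test_batch";
$b->state = 1;
$b->njobs = 10;
$b->fraction_done = 0.5;
$b->nerror_jobs = 1;
echo batch_summary($b), "\n";
// prints: test_batch [in progress]: 10 jobs, 50% done, 1 failed
```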

boinc_query_batch()

Gets batch details.

Argument: a request object with elements

  • project and authenticator: as above
  • batch_id: specifies a batch.

Result: a 2-element array. The first element is a batch descriptor object as described above, and the second is an error message (null if success). The batch descriptor has one additional field:

  • jobs: an array of job descriptor objects, each one containing
    • id: the database ID of the job's workunit record
    • canonical_instance_id: if the job has a canonical result, its database ID

boinc_query_job()

Gets job details.

Argument: a request object with elements:

  • project and authenticator: as above
  • job_id: specifies a job.

Result: a 2-element array. The first element is a job descriptor object and the second is an error message (null if success). The job descriptor has the following fields:

  • instances: an array of job instance descriptors, each containing:
    • name: the instance's name
    • id: the ID of the corresponding result record
    • state: a string describing the instance's state (unsent, in progress, complete, etc.)
    • outfile: if the instance has finished, a list of output file descriptors, each containing
      • size: file size in bytes
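As a sketch of how the instance descriptors above might be consumed, the following helper adds up the output file sizes of each finished instance. The function name and the instance names in the example are hypothetical; the field names (instances, name, outfile, size) are those listed above, and an instance without an outfile list is assumed to be unfinished.

```php
<?php
// Hypothetical helper (not part of submit.inc): total output size in bytes
// for each finished instance of a job, keyed by instance name.
function instance_output_bytes($job) {
    $totals = array();
    foreach ($job->instances as $inst) {
        if (empty($inst->outfile)) {
            continue;   // no output files listed: assumed not finished
        }
        $bytes = 0;
        foreach ($inst->outfile as $file) {
            $bytes += (int)$file->size;
        }
        $totals[$inst->name] = $bytes;
    }
    return $totals;
}

// Hand-built job descriptor for illustration only.
$job = new stdClass;
$job->instances = array(
    (object) array(
        "name" => "job_0", "id" => 1, "state" => "complete",
        "outfile" => array(
            (object) array("size" => 100),
            (object) array("size" => 250),
        ),
    ),
    (object) array("name" => "job_1", "id" => 2, "state" => "in progress"),
);
$totals = instance_output_bytes($job);
```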

boinc_abort_batch()

Aborts a batch.

Argument: a request object with elements

  • project and authenticator: as above
  • batch_id: specifies a batch.

Result: an error message, null if successful

boinc_get_output_file()

Gets a URL for a particular output file.

Argument: a request object with elements

  • project and authenticator: as above
  • instance_name: specifies a job instance
  • file_num: the ordinal number of one of the output files.

Result: a URL from which the output file can be downloaded.

boinc_get_output_files()

Argument: a request object with elements

  • project and authenticator: as above
  • batch_id: specifies a batch.

Result: a URL from which a zipped archive of all output files from the batch can be downloaded (only the outputs of "canonical" instances are included).

boinc_retire_batch()

Deletes server storage (files, DB records) associated with a batch.

Argument: a request object with elements

  • project and authenticator: as above
  • batch_id: specifies a batch.

Result: an error message, null if successful
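Because retiring a batch deletes its files on the server, a portal would normally download the zipped outputs before calling boinc_retire_batch(). The sketch below shows that ordering. The function archive_and_retire() is hypothetical; its three callable parameters default to the documented API functions and to PHP's copy(), and exist only so the flow can be exercised without a server, as the stubbed example does. Downloading by URL with copy() assumes allow_url_fopen is enabled.

```php
<?php
// Hypothetical helper (not part of submit.inc): fetch the zipped outputs
// of a batch, then retire it. Returns an error message, or null on success.
function archive_and_retire($req, $dest,
                            $get_url = 'boinc_get_output_files',
                            $retire = 'boinc_retire_batch',
                            $download = 'copy') {
    $url = $get_url($req);
    if (!$url) {
        return "no output URL returned";
    }
    // Download first: retiring deletes the batch's files on the server.
    if (!$download($url, $dest)) {
        return "download of $url failed";
    }
    return $retire($req);
}

// Example with stubs in place of the real API; batch ID and URL are made up.
$req = new stdClass;
$req->project = "http://foo.bar.edu/test/";
$req->authenticator = "xxx";
$req->batch_id = 17;

$calls = array();
$err = archive_and_retire(
    $req,
    "/tmp/batch_17.zip",
    function ($r) { return "http://foo.bar.edu/test/out.zip"; },
    function ($r) use (&$calls) { $calls[] = "retire"; return null; },
    function ($url, $dest) use (&$calls) { $calls[] = "download"; return true; }
);
```

With submit.inc loaded, the call reduces to archive_and_retire($req, "/tmp/batch_17.zip"), which uses the real functions.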

HTTPS/XML interface

At a lower level, the APIs are accessed by sending a POST request, using HTTPS, to PROJECT_URL/submit.php. The inputs and outputs of each function are XML documents. The format of the request and reply XML documents can be inferred from inc/submit.inc and user/submit.php.

Bindings of these RPCs can be implemented in languages other than PHP.
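The transport side of such a binding could look roughly like the sketch below, written in PHP for consistency with the rest of this page. It covers only the POST to PROJECT_URL/submit.php and the parsing of the reply. Two things are assumptions rather than facts from this page: that the request XML is sent as a form field named "request", and that the reply body is a single XML document. The function names are hypothetical, and the request XML itself must follow the formats found in inc/submit.inc and user/submit.php.

```php
<?php
// Sketch of the transport only; see the assumptions stated above.

// Build the RPC URL from the project URL.
function submit_rpc_url($project_url) {
    return rtrim($project_url, "/") . "/submit.php";
}

// Stream context options for the POST. The "http" options are also used
// by PHP for https:// URLs.
function submit_rpc_context($request_xml) {
    return array("http" => array(
        "method"  => "POST",
        "header"  => "Content-Type: application/x-www-form-urlencoded\r\n",
        // ASSUMPTION: the form field is named "request".
        "content" => http_build_query(array("request" => $request_xml)),
        "timeout" => 60,
    ));
}

// Send the request; returns array(SimpleXMLElement or null, error or null).
function submit_rpc($project_url, $request_xml) {
    $ctx = stream_context_create(submit_rpc_context($request_xml));
    $reply = @file_get_contents(submit_rpc_url($project_url), false, $ctx);
    if ($reply === false) {
        return array(null, "HTTP request failed");
    }
    $xml = @simplexml_load_string($reply);
    if ($xml === false) {
        return array(null, "reply is not well-formed XML");
    }
    return array($xml, null);
}
```

Returning a 2-element array of result and error message mirrors the convention of the PHP functions described above.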

Example web interface

An example of a web interface for job submission and control, based on this API, can be found here: http://boinc.berkeley.edu/trac/browser/trunk/boinc/html/user/submit_example.php

This example is functional and it shows how to use the API. However, you will have to modify it heavily for your particular applications and web site.