Version 9 (modified by kadam, 17 years ago)

added client run script and a few questions to be answered

PyBOINC: simplified BOINC application development in Python

This is a proposed design for making the development of BOINC applications in Python as simple as possible. PyBOINC provides a master/slave model: the master runs on the server, and the slave jobs are distributed to clients.

Here's an example, which sums the squares of integers from 1 to 100. The application consists of three files. The first, app_types.py, defines the input and output types:

class Input:
    def __init__(self, arg):
        self.value = arg

class Output:
    def __init__(self, arg):
        self.value = arg

The second file, app_master.py, is the master program:

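from app_types import Input
# Assumption: the PyBOINC functions are provided by a module named pyboinc
from pyboinc import pyboinc_call, pyboinc_master

total = 0

def make_calls():
    # create one job per integer from 1 to 100
    for i in range(1, 101):
        pyboinc_call('app_slave.py', Input(i))

def handle_result(output):
    # accumulate the squares returned by the slaves
    global total
    total += output.value

pyboinc_master(make_calls, handle_result)

print("The answer is %d" % total)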

The third file, app_slave.py, is the slave program:

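from app_types import Output
# Assumption: the PyBOINC functions are provided by a module named pyboinc
from pyboinc import pyboinc_get_input, pyboinc_return_output

# read the pickled Input, square its value, and return the result
job_input = pyboinc_get_input()
output = Output(job_input.value * job_input.value)
pyboinc_return_output(output)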
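
To check the expected answer without a BOINC project, the data flow of the example can be emulated in a single process. This is only a sketch: slave() and run_locally() are hypothetical stand-ins for app_slave.py and the pyboinc_master loop, and nothing here is distributed.

```python
class Input:
    def __init__(self, arg):
        self.value = arg

class Output:
    def __init__(self, arg):
        self.value = arg

def slave(job_input):
    """Stand-in for app_slave.py: square the input value."""
    return Output(job_input.value * job_input.value)

def run_locally(first, last):
    """Stand-in for make_calls/handle_result: one 'job' per integer."""
    total = 0
    for i in range(first, last + 1):
        output = slave(Input(i))   # in PyBOINC this would run on a client
        total += output.value      # what handle_result does on the master
    return total

print("The answer is %d" % run_locally(1, 100))  # prints "The answer is 338350"
```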

The procedure for running this program is:

  • Create a BOINC project
  • Run the script ops/py_boinc.php, which configures the project to use PyBOINC
  • Set the environment variable PYBOINC_DIR to the root directory of the project
  • Create a directory (anywhere) containing the above files
  • In that directory, type
    python app_master.py
    
  • This command may take a long time. If it is aborted with ^C, it can be rerun later; in that case no new jobs are created, and the master waits for the remaining slaves to complete.

Implementation

PyBOINC uses a new table, 'batch', which represents a group of jobs. Its fields are:

  • ID
  • ID of user who submitted this batch
  • path of 'batch directory'
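
As a rough illustration of the batch record, the sketch below creates such a table and inserts one row. It uses Python's sqlite3 module purely as a stand-in for the project database, and the column names (id, user_id, batch_dir) and the create_batch() helper are assumptions; the design only lists the three fields.

```python
import sqlite3

# Assumption: column names are illustrative, not part of the design.
SCHEMA = """
CREATE TABLE batch (
    id        INTEGER PRIMARY KEY,  -- batch ID
    user_id   INTEGER NOT NULL,     -- ID of the user who submitted the batch
    batch_dir TEXT    NOT NULL      -- path of the batch directory
)
"""

def create_batch(conn, user_id, batch_dir):
    """Insert a batch record and return its ID."""
    cur = conn.execute(
        "INSERT INTO batch (user_id, batch_dir) VALUES (?, ?)",
        (user_id, batch_dir),
    )
    conn.commit()
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
batch_id = create_batch(conn, user_id=1, batch_dir="/tmp/sum_squares")
```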

PyBOINC uses the following files and subdirectories in the batch directory:

  • pyboinc_checkpoint: if present, this contains the ID of the batch record (jobID in the pseudocode below)
  • new/: result files not yet handled
  • old/: result files already handled

PyBOINC uses Python's Pickler class for serialization.
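
A minimal sketch of the round trip with the standard pickle module follows. The helper names dump_to_file() and load_from_file() are hypothetical. In the pickle module, Pickler writes objects and the matching Unpickler class reads them back.

```python
import io
import pickle

def dump_to_file(obj, fileobj):
    """Serialize obj to an open binary file."""
    pickle.Pickler(fileobj).dump(obj)

def load_from_file(fileobj):
    """Read one pickled object back from an open binary file."""
    return pickle.Unpickler(fileobj).load()

# Round trip through an in-memory buffer instead of a real file.
buf = io.BytesIO()
dump_to_file({"value": 7}, buf)
buf.seek(0)
restored = load_from_file(buf)
```

For instances of user-defined classes such as Input and Output, pickle stores a reference to the class rather than its code, so app_types must also be importable wherever the data is unpickled. The Python documentation also warns against unpickling data from untrusted sources, which may matter for output files returned by clients.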

The PyBOINC setup script creates an application 'pyboinc'. Its work units have two input files: a Python program and a data file. The application runs a Python interpreter on the program file. Its executable is a shell script on Linux/Mac and a batch file on Windows, which runs the Python interpreter on the client code:

python app_client.py
  • Question: what if the Python interpreter is not present on a Windows box? Does Python's license allow distribution of the interpreter?

PyBOINC uses the following daemons:

  • validator: uses the sample bitwise validator (need to check what Python produces for floating-point operations on different platforms)
  • assimilator: uses a variant of sample_assimilator. Given a completed result, it looks up the batch record, then copies the output file to BATCH_DIR/new/
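
The two daemon steps can be sketched in Python as follows. This is an illustration only, not the BOINC sample code: outputs_match() and assimilate() are hypothetical helpers, showing a byte-for-byte comparison of two output files and the copy into BATCH_DIR/new/.

```python
import filecmp
import os
import shutil

def outputs_match(path_a, path_b):
    """Bitwise check: True only if both files have identical contents."""
    return filecmp.cmp(path_a, path_b, shallow=False)

def assimilate(output_path, batch_dir):
    """Copy a completed result's output file into BATCH_DIR/new/."""
    new_dir = os.path.join(batch_dir, "new")
    os.makedirs(new_dir, exist_ok=True)
    dest = os.path.join(new_dir, os.path.basename(output_path))
    shutil.copy(output_path, dest)
    return dest
```

A bitwise comparison treats any difference in the pickled bytes as a mismatch, which is why the floating-point question above matters.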

Pseudocode for the various PyBOINC functions:

static jobID

pyboinc_call(slave_filename, input)
    create a uniquely-named file x in the download hierarchy, file name should contain batch ID
    Pickler(x).dump(input)
    create_work()

pyboinc_master(make_calls, handle_result)
    read jobID from pyboinc_checkpoint
    if none
        create a batch record; jobID = its ID
        make_calls()
        write jobID to checkpoint file
    move all files from old/ to new/
    while (not all jobs done)
        if there is a file x in new/
            output = Unpickler(x).load()
            handle_result(output)
            move x to old/
        else
            sleep(1)

pyboinc_get_input()
    boinc_resolve_filename("input", infile)
    return Unpickler(infile).load()

pyboinc_return_output(output)
    boinc_resolve_filename("output", outfile)
    Pickler(outfile).dump(output)
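
The result-handling loop of pyboinc_master could look roughly like the Python 3 sketch below, run against a local batch directory. Several things are assumptions: the function name process_results(), the use of an expected result count to decide when all jobs are done (the pseudocode does not say how this is determined), and the poll_seconds parameter. Moving files from old/ back to new/ is read here as a way to replay results after a restart, since the master's in-memory state is lost.

```python
import os
import pickle
import shutil
import time

def process_results(batch_dir, handle_result, expected, poll_seconds=1.0):
    """Handle result files in batch_dir/new/ until `expected` have been seen."""
    new_dir = os.path.join(batch_dir, "new")
    old_dir = os.path.join(batch_dir, "old")
    os.makedirs(new_dir, exist_ok=True)
    os.makedirs(old_dir, exist_ok=True)

    # Replay results handled by an earlier, aborted run.
    for name in os.listdir(old_dir):
        shutil.move(os.path.join(old_dir, name), os.path.join(new_dir, name))

    handled = 0
    while handled < expected:
        names = sorted(os.listdir(new_dir))
        if not names:
            time.sleep(poll_seconds)  # nothing new yet; wait for the assimilator
            continue
        for name in names:
            path = os.path.join(new_dir, name)
            with open(path, "rb") as f:
                output = pickle.Unpickler(f).load()
            handle_result(output)
            shutil.move(path, os.path.join(old_dir, name))
            handled += 1
    return handled
```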