= PyBOINC: simplified BOINC application development in Python =

This is a proposed design for making BOINC application development as simple as possible.

PyBOINC provides a master/slave model: the master runs on the server, and the slave is distributed.

Here's an example, which sums the squares of the integers from 1 to 100. The application consists of three files. The first, '''app_types.py''', defines the input and output types:
{{{
class Input:
    def __init__(self, arg):
        self.value = arg

class Output:
    def __init__(self, arg):
        self.value = arg
}}}

The second file, '''app_master.py''', is the master program:
{{{
from app_types import Input

# pyboinc_call and pyboinc_master are provided by PyBOINC.

total = 0

def make_calls():
    # create one job per integer from 1 to 100
    for i in range(1, 101):
        pyboinc_call('app_slave.py', Input(i))

def handle_result(output):
    # called once for each completed job
    global total
    total += output.value

pyboinc_master(make_calls, handle_result)
print("The answer is %d" % total)
}}}

The third file, '''app_slave.py''', is the slave function:
{{{
from app_types import Output

# pyboinc_get_input and pyboinc_return_output are provided by PyBOINC.

inp = pyboinc_get_input()
output = Output(inp.value * inp.value)
pyboinc_return_output(output)
}}}

The procedure for running this program is:

 * Create a BOINC project
 * Run a script ops/py_boinc.php that configures the project to use PyBOINC
 * Set an environment variable PYBOINC_DIR to the root directory of the project
 * Create a directory (anywhere) containing the above files
 * In that directory, type {{{python app_master.py}}}
 * This command may take a long time. If it's aborted via !^C, it may be repeated later. In that case no new jobs are created, and the master waits for the completion of the remaining slaves.

== Implementation ==

PyBOINC uses a new table, 'batch', which represents a group of jobs.
Its fields are:

 * ID
 * ID of the user who submitted this batch
 * path of the 'batch directory'

PyBOINC uses the following files and subdirectories in the batch directory:

 * pyboinc_checkpoint: if present, this contains the ID of the batch record (called jobID in the pseudocode below)
 * new/: result files not yet handled
 * old/: result files already handled

PyBOINC uses Python's [http://docs.python.org/lib/node317.html Pickler] class for serialization.

The PyBOINC setup script creates an application 'pyboinc'. Its work units have two input files: a Python program and a data file. The application runs a Python interpreter on the program file.

The executable of the application is a shell script on Linux/Mac and a batch file on Windows, which executes the Python interpreter with the client code:
{{{
python app_client.py
}}}

 * Question: what if no Python interpreter is present on a Windows machine? Does the Python license allow distribution of the interpreter?

PyBOINC uses the following daemons:

 * validator: uses the sample bitwise validator (we need to check whether Python produces identical floating-point results on the different platforms)
 * assimilator: uses a variant of sample_assimilator. Given a completed result, it looks up the batch record, then copies the output file to BATCH_DIR/new/

Pseudocode for the various PyBOINC functions:
{{{
static jobID

pyboinc_call(slave_filename, input)
    create a uniquely-named file x in the download hierarchy;
        the file name should contain the batch ID
    Pickler(x).dump(input)
    create_work()

pyboinc_master(make_calls, handle_result)
    read jobID from pyboinc_checkpoint
    if none
        create a batch record; jobID = its ID
        make_calls()
        write jobID to checkpoint file
    move all files from old/ to new/
    while (not all jobs done)
        if there is a file x in new/
            output = Unpickler(x).load()
            handle_result(output)
            move x to old/
        else
            sleep(1)

pyboinc_get_input()
    boinc_resolve_filename("input", infile)
    return Unpickler(infile).load()

pyboinc_return_output(output)
    boinc_resolve_filename("output", outfile)
    Pickler(outfile).dump(output)
}}}
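
The result-handling loop of pyboinc_master can be tried out locally, without a BOINC project. The following is only a sketch under stated assumptions, not the PyBOINC implementation: the names `drain_results`, `expected`, `poll_interval` and `demo` are hypothetical, a simple count of expected results stands in for the "not all jobs done" test, plain dicts stand in for the Output class, and the demo writes the pickled result files itself in place of the assimilator.

```python
import os
import pickle
import shutil
import tempfile
import time


def drain_results(batch_dir, handle_result, expected, poll_interval=1.0):
    """Handle result files from batch_dir/new/ until `expected` are processed.

    `expected` is an assumption of this sketch; the design itself only says
    "while not all jobs done".
    """
    new_dir = os.path.join(batch_dir, "new")
    old_dir = os.path.join(batch_dir, "old")
    os.makedirs(new_dir, exist_ok=True)
    os.makedirs(old_dir, exist_ok=True)

    # On restart, replay results handled by an earlier run: the state built
    # up by handle_result (e.g. a running sum) lives only in memory.
    for name in os.listdir(old_dir):
        shutil.move(os.path.join(old_dir, name), os.path.join(new_dir, name))

    handled = 0
    while handled < expected:
        names = sorted(os.listdir(new_dir))
        if not names:
            time.sleep(poll_interval)  # nothing to do yet; poll again later
            continue
        for name in names:
            path = os.path.join(new_dir, name)
            with open(path, "rb") as f:
                output = pickle.load(f)
            handle_result(output)
            shutil.move(path, os.path.join(old_dir, name))
            handled += 1
    return handled


def demo():
    """Simulate 100 completed jobs, then an aborted-and-restarted master."""
    batch_dir = tempfile.mkdtemp()
    new_dir = os.path.join(batch_dir, "new")
    os.makedirs(new_dir)

    # Stand-in for the assimilator: one pickled result file per job.
    for i in range(1, 101):
        with open(os.path.join(new_dir, "result_%d" % i), "wb") as f:
            pickle.dump({"value": i * i}, f)

    totals = {"sum": 0}

    def handle_result(output):
        totals["sum"] += output["value"]

    drain_results(batch_dir, handle_result, expected=100, poll_interval=0.01)
    first_run = totals["sum"]

    # Simulated restart: the in-memory sum is lost, old/ is replayed.
    totals["sum"] = 0
    drain_results(batch_dir, handle_result, expected=100, poll_interval=0.01)
    second_run = totals["sum"]

    shutil.rmtree(batch_dir)
    return first_run, second_run


if __name__ == "__main__":
    print("first run: %d, after restart: %d" % demo())
```

The restart case shows why the design moves everything from old/ back to new/: the second run rebuilds the same total from the saved result files. One thing the sketch does not address is a result file that is still being written when the master reads it; a real assimilator would probably need to write to a temporary name and rename the file into new/.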