PyBOINC: simplified BOINC application development in Python
This is a proposed design for making BOINC application development as simple as possible. PyBOINC provides a master/slave model: the master runs on the server, and the slaves are distributed as BOINC jobs.
Here's an example, which sums the squares of integers from 1 to 100. The application consists of three files. The first, app_types.py, defines the input and output types:
    class Input:
        def __init__(self, arg):
            self.value = arg

    class Output:
        def __init__(self, arg):
            self.value = arg
The second file, app_master.py, is the master program:
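    from app_types import Input
    # pyboinc_call and pyboinc_master are provided by PyBOINC
    # (how they are imported is not specified in this design)

    total = 0

    def make_calls():
        # one job per integer from 1 to 100
        for i in range(1, 101):
            input = Input(i)
            pyboinc_call('app_slave.py', input)

    def handle_result(output):
        global total
        total += output.value

    pyboinc_master(make_calls, handle_result)
    print("The answer is %d" % total)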
The third file, app_slave.py, is the slave program:
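    from app_types import Output
    # pyboinc_get_input and pyboinc_return_output are provided by PyBOINC
    # (how they are imported is not specified in this design)

    input = pyboinc_get_input()
    output = Output(input.value * input.value)
    pyboinc_return_output(output)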
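As a sanity check on what the master should print, the same sum can be computed locally without BOINC. This is only a reference calculation, not part of PyBOINC; the name `expected_sum` is made up for illustration.

```python
# Local reference computation: sum of squares of 1..n, no BOINC involved.
def expected_sum(n):
    """Return 1^2 + 2^2 + ... + n^2 by direct summation."""
    return sum(i * i for i in range(1, n + 1))

# The closed form n(n+1)(2n+1)/6 gives the same value.
n = 100
closed_form = n * (n + 1) * (2 * n + 1) // 6

print(expected_sum(n))  # 338350
print(closed_form)      # 338350
```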
The procedure for running this program is:
- Create a BOINC project
- Run a script ops/py_boinc.php that configures the project to use PyBOINC
- Set the environment variable PYBOINC_DIR to the root directory of the project
- Create a directory (anywhere) containing the above files
- In that directory, run
    python app_master.py
This command may take a long time. If it is interrupted with ^C, it can be rerun later. In that case no new jobs are created, and the master waits for the remaining slaves to complete.
Implementation
PyBOINC uses a new table, 'batch', which represents a group of jobs. Its fields are:
- ID
- ID of user who submitted this batch
- path of 'batch directory'
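A minimal sketch of such a table, using an in-memory SQLite database as a stand-in for the project database. The column names (`id`, `user_id`, `batch_dir`) and the helper `create_batch` are assumptions for illustration; the design only lists the three fields.

```python
import sqlite3

# Hypothetical schema: the column names are illustrative, not from the design.
SCHEMA = """
CREATE TABLE batch (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id   INTEGER NOT NULL,   -- ID of the user who submitted the batch
    batch_dir TEXT    NOT NULL    -- path of the batch directory
)
"""

def create_batch(conn, user_id, batch_dir):
    """Insert a batch record and return its ID."""
    cur = conn.execute(
        "INSERT INTO batch (user_id, batch_dir) VALUES (?, ?)",
        (user_id, batch_dir),
    )
    conn.commit()
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
batch_id = create_batch(conn, 42, "/tmp/my_batch")
row = conn.execute(
    "SELECT id, user_id, batch_dir FROM batch WHERE id = ?", (batch_id,)
).fetchone()
print(row)
```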
PyBOINC uses the following files and subdirectories in the batch directory:
- pyboinc_checkpoint: If present, this contains a job ID
- new/: result files not yet handled
- old/: result files already handled
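The layout listed above could be set up and used roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and storing the ID in the checkpoint file as decimal text is a guess, since the design does not specify the file format.

```python
import os
import tempfile

CHECKPOINT = "pyboinc_checkpoint"  # file name taken from the list above

def init_batch_dir(batch_dir):
    """Create new/ and old/ under batch_dir if they do not exist yet."""
    for sub in ("new", "old"):
        path = os.path.join(batch_dir, sub)
        if not os.path.isdir(path):
            os.makedirs(path)

def read_checkpoint(batch_dir):
    """Return the stored ID as an int, or None if no checkpoint exists."""
    path = os.path.join(batch_dir, CHECKPOINT)
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return int(f.read().strip())

def write_checkpoint(batch_dir, job_id):
    """Record the ID so that a restarted master does not resubmit jobs."""
    with open(os.path.join(batch_dir, CHECKPOINT), "w") as f:
        f.write(str(job_id))

batch_dir = tempfile.mkdtemp()
init_batch_dir(batch_dir)
first_read = read_checkpoint(batch_dir)   # None on a fresh directory
write_checkpoint(batch_dir, 7)
second_read = read_checkpoint(batch_dir)  # 7 after writing
```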
PyBOINC uses Python's Pickler class for serialization.
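For reference, a round trip through the standard library's `pickle.Pickler` and `pickle.Unpickler` looks like this. The sketch uses an in-memory buffer and a plain dictionary to stay self-contained; in PyBOINC the data would be written to job input and output files.

```python
import io
import pickle

# Serialize to an in-memory buffer; a real job would use a file on disk.
buf = io.BytesIO()
pickle.Pickler(buf).dump({"value": 7})

# Deserialize with the matching Unpickler class.
buf.seek(0)
restored = pickle.Unpickler(buf).load()
print(restored["value"])  # 7
```

When class instances such as Input and Output are pickled, the module that defines the class must be importable on the side that unpickles, which is why both the master and the slave import app_types.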
The PyBOINC setup script creates an application 'pyboinc'. Its work units have two input files: a Python program and a data file. The application runs a Python interpreter on the program file. Its executable is a shell script on Linux/Mac and a batch file on Windows, which runs the Python interpreter with the client code:
python app_client.py
- Question: what if the Python interpreter is not present on a Windows machine? Does the Python license allow redistribution of the interpreter?
PyBOINC uses the following daemons:
- validator: uses the sample bitwise validator (to be checked: whether Python produces bitwise-identical floating-point results on different platforms)
- assimilator: uses a variant of sample_assimilator. Given a completed result, it looks up the batch record, then copies the output file to BATCH_DIR/new/
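The floating-point caveat noted for the validator can be seen even on a single machine: a bitwise comparison treats two results as different when they differ only by rounding. A minimal sketch; the helper `bitwise_equal` is hypothetical and is not the BOINC sample validator.

```python
import pickle

def bitwise_equal(a_bytes, b_bytes):
    """Hypothetical stand-in for a bitwise validator: exact byte equality."""
    return a_bytes == b_bytes

# Equal as real numbers, but different once rounded to doubles.
x = 0.1 + 0.2
y = 0.3

same_value_match = bitwise_equal(pickle.dumps(y, 2), pickle.dumps(0.3, 2))
rounded_match = bitwise_equal(pickle.dumps(x, 2), pickle.dumps(y, 2))

print(same_value_match)  # True
print(rounded_match)     # False
```

If slaves on different platforms evaluate the same expression with slightly different rounding, their pickled outputs would fail a bitwise comparison in the same way.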
Pseudocode for the various PyBOINC functions:
    static jobID

    pyboinc_call(slave_filename, input)
        create a uniquely-named file x in the download hierarchy;
            file name should contain batch ID
        Pickler(x).dump(input)
        create_work()

    pyboinc_master(make_calls, handle_result)
        read jobID from pyboinc_checkpoint
        if none
            create a batch record; jobID = its ID
            make_calls()
            write jobID to checkpoint file
        move all files from old/ to new/
        while (not all jobs done)
            if there is a file x in new/
                output = Unpickler(x).load()
                handle_result(output)
                move x to old/
            else
                sleep(1)

    pyboinc_get_input()
        boinc_resolve_filename("input", infile)
        return Unpickler(infile).load()

    pyboinc_return_output(output)
        boinc_resolve_filename("output", outfile)
        Pickler(outfile).dump(output)
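The polling loop of pyboinc_master can be tried locally with a sketch like the one below. It assumes the number of jobs is known in advance (the real master would presumably ask the database whether all jobs are done), uses plain dictionaries in place of Output objects, and simulates the assimilator by writing result files directly into new/. The name `run_master_loop` is hypothetical.

```python
import os
import pickle
import shutil
import tempfile
import time

def run_master_loop(batch_dir, n_jobs, handle_result, poll_interval=1.0):
    """Handle result files from new/ until n_jobs of them have been seen."""
    new_dir = os.path.join(batch_dir, "new")
    old_dir = os.path.join(batch_dir, "old")

    # On (re)start, replay results already handled by an earlier run.
    for name in os.listdir(old_dir):
        shutil.move(os.path.join(old_dir, name), os.path.join(new_dir, name))

    handled = 0
    while handled < n_jobs:
        names = sorted(os.listdir(new_dir))
        if names:
            path = os.path.join(new_dir, names[0])
            with open(path, "rb") as f:
                output = pickle.Unpickler(f).load()
            handle_result(output)
            shutil.move(path, os.path.join(old_dir, names[0]))
            handled += 1
        else:
            time.sleep(poll_interval)

# Simulate an assimilator that has already delivered three results.
batch_dir = tempfile.mkdtemp()
for sub in ("new", "old"):
    os.mkdir(os.path.join(batch_dir, sub))
for i in (1, 2, 3):
    with open(os.path.join(batch_dir, "new", "result_%d" % i), "wb") as f:
        pickle.Pickler(f).dump({"value": i * i})

results = []
run_master_loop(batch_dir, 3, lambda out: results.append(out["value"]))
total = sum(results)
print(total)  # 14
```

Moving files from old/ back to new/ at startup means a restarted master sees every result again, which matches the example master, whose running sum starts from zero on each run.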