Changes between Version 6 and Version 7 of CondorBoinc


Timestamp:
Nov 15, 2012, 7:02:51 PM (12 years ago)
Author:
davea
Comment:

--

  • CondorBoinc

    v6 v7  
    44
    55This document describes the design of Condor-B, extensions to BOINC and Condor
    6 so that a BOINC-based volunteer computing project can provide resources to a Condor pool.
     6so that a BOINC-based volunteer computing project can provide computing resources to a Condor pool.
    77
    8 Goals:
    9 
    10  * From the job submitter's viewpoint,
    11    things should look as much like Condor as possible:
    12    i.e. they prepare Condor submit files and use condor_submit.
    13 
    14  * Exception to the above: applications to be used in this way must be set up ahead of time
    15    on the BOINC server.
    16 
    17 == Issues ==
     8An important design goal is transparency:
     9from the job submitter's viewpoint,
     10things should look as much like Condor as possible:
     11i.e. they prepare Condor submit files and use condor_submit.
    1812
    1913Condor-B must address some basic differences between Condor and BOINC:
     
    2519    Files may be used by many jobs.
    2620    In Condor, a file is associated with a job, and has a single name.
    27   * BOINC is designed for apps that have a fixed, predetermined number of
    28     input and output files; Condor doesn't have this restriction.
     21  * BOINC is designed for apps for which the number and names of output files
     22    are fixed at the time of job submission;
     23    Condor doesn't have this restriction.
    2924
    3025 * Application concept:
     
    3631   A job is associated with an application, not an app version.
    3732
    38 == Proposed architecture ==
     33
     34== Applications ==
     35
     36Applications could run on BOINC in several ways:
     37 * As native BOINC applications.
     38   This would require making source-code modifications and recompiling
     39   for different platforms, linking with the BOINC API library.  Too complex.
     40 * In virtual machines.
     41   This would eliminate multi-platform issues
     42   but would require volunteer hosts to have VirtualBox installed.
     43   Maybe someday.
     44 * Using the BOINC wrapper.
     45   Requires apps to be built for different platforms, but no source code mods.
     46   Let's use this to start.
     47
     48The Condor pool admins will select a set of applications to run under BOINC.
     49For each app, they must
     50
     51 * Create a BOINC "application"
     52 * Create input and output templates
     53 * Compile the app for one or more platforms
     54 * Create BOINC "app versions", with associated job.xml files for the wrapper
     55
     56== Data model ==
     57
     58Goal: minimize data transfer and storage on the BOINC server.
     59To do this, we'll add the following to BOINC:
     60
     61 * DB tables for files, and for batch/file associations
     62 * A daemon that deletes files, and their DB records, once they have
     63   no associations or are past all lease ends
     64
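As a rough illustration of the cleanup rule above, the following sketch selects files that have no batch associations or whose associated batches are all past their lease end. The table and column names (`file`, `batch_file`, and everything except the `lease_end` field on `batch`) are assumptions made for this example, and SQLite stands in for whatever database the BOINC server uses.

```python
import sqlite3


def find_deletable_files(conn, now):
    """Return names of files with no batch associations, or whose
    associated batches are all past their lease end (hypothetical schema)."""
    rows = conn.execute(
        """
        SELECT f.name
        FROM file AS f
        LEFT JOIN batch_file AS bf ON bf.file_id = f.id
        LEFT JOIN batch AS b ON b.id = bf.batch_id
        GROUP BY f.id
        HAVING COUNT(b.id) = 0 OR MAX(b.lease_end) < ?
        ORDER BY f.name
        """,
        (now,),
    ).fetchall()
    return [row[0] for row in rows]


def make_demo_db():
    """Build a small in-memory database with made-up rows for the example."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE file (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE batch (id INTEGER PRIMARY KEY, lease_end INTEGER);
        CREATE TABLE batch_file (batch_id INTEGER, file_id INTEGER);
        INSERT INTO file VALUES (1, 'orphan.dat'), (2, 'expired.dat'), (3, 'live.dat');
        INSERT INTO batch VALUES (10, 100), (11, 500);
        INSERT INTO batch_file VALUES (10, 2), (10, 3), (11, 3);
        """
    )
    return conn
```

With the demo rows, `find_deletable_files(make_demo_db(), 200)` selects `expired.dat` and `orphan.dat`. `live.dat` is kept because one of its batches still has an unexpired lease.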
     65For output files, we'll take the approach that each job has (from BOINC's viewpoint)
     66a single output file, which is a zipped archive of its actual outputs.
     67This will get copied to the submitter host, unzipped,
     68and its components moved to the appropriate directory.
     69
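The unpacking step on the submitter host might look roughly like the sketch below. The function name and the idea of one destination directory per job are assumptions for illustration. The page does not specify how outputs are mapped to directories.

```python
import os
import zipfile


def unpack_job_output(archive_path, dest_dir):
    """Extract a job's zipped outputs into dest_dir and return the member names."""
    os.makedirs(dest_dir, exist_ok=True)
    dest_root = os.path.realpath(dest_dir)
    extracted = []
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            # Refuse members whose path would land outside the destination.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe path in archive: " + info.filename)
            zf.extract(info, dest_root)
            extracted.append(info.filename)
    return sorted(extracted)
```

The path check is defensive: the archive is produced on a volunteer host, so its member names should not be trusted blindly.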
     70== Job submission mechanism ==
    3971
    4072We'll use Condor's existing mechanism for sending jobs to
     
    4274This will involve 2 components:
    4375
    44  * A "BOINC GAHP" program.
     76 * A "BOINC GAHP" program: runs as a daemon process on the submit node.
     77   This does the following:
     78   * Handle RPCs (over pipes) from the Condor job router to
     79     submit and monitor jobs.
     80   * Periodically poll the BOINC server for completed jobs;
     81     when a job is newly completed,
     82         download its output from the BOINC server,
     83         and store it into the appropriate directories on the submit node.
    4584 * A new class in Condor's job_router for managing communication
    4685   with the BOINC GAHP.
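
The polling step of the BOINC GAHP can be sketched as follows. This is only an outline under stated assumptions: `list_completed` and `download_output` are hypothetical callables standing in for whatever requests the GAHP makes to the BOINC server, which this page does not define.

```python
def poll_once(list_completed, download_output, seen):
    """Download outputs of jobs that completed since the last poll.

    list_completed()     -> iterable of job names the server reports as complete
    download_output(job) -> fetches and stores that job's output
    seen                 -> set of job names already handled; updated in place
    """
    newly_done = [job for job in list_completed() if job not in seen]
    for job in newly_done:
        download_output(job)
        seen.add(job)  # mark as handled only after the download succeeded
    return newly_done
```

A daemon would call `poll_once` at a fixed interval. Keeping the set of handled jobs outside the function makes it possible to persist it across restarts.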
     
    144183
    145184
    146 == File management mechanism ==
    147 
    148 To keep track of input files,
    149 we'll add the following to BOINC:
    150 
    151  * DB tables for files, and for batch/file associations
    152  * daemon for deleting files and DB records of files with
    153    no associations, or past all lease ends
    154 
    155185== Changes to BOINC ==
    156186
     
    160190 * Add lease_end field to batch
    161191
    162 
    163 
    164192== Implementation notes ==
    165193
    166194The BOINC GAHP could be implemented in PHP, Python, or C++.
    167 My inclination is to use Python.
     195My inclination is to use Python; we can assume it's available on the submit node.
     196