Changes between Version 9 and Version 10 of CondorBoinc


Timestamp:
Nov 24, 2012, 12:09:10 AM
Author:
davea
Comment:

--

  • CondorBoinc

    v9 v10  
    66so that a BOINC-based volunteer computing project can provide computing resources to a Condor pool.
    77
    8 An important design goal is transparency:
     8A central design goal is transparency:
    99from the job submitter's viewpoint,
    10 things should look as much like Condor as possible:
     10things should look exactly like Condor:
    1111i.e. they prepare Condor submit files and use condor_submit.
    1212
     
    2020    In Condor, a file is associated with a job, and has a single name.
    2121  * BOINC is designed for apps for which the number and names of output files
    22     is fixed at the time of job submission;
     22    is fixed at the time of job submission.
    2323    Condor doesn't have this restriction.
    2424
    2525 * Application concept:
    2626   In Condor, a job is associated with a single executable, and can run
    27    only on hosts of the appropriate platform (and possibly other attributes,
    28    as specified by the job's !ClassAd).
     27   only on hosts of the appropriate platform
     28   (and possibly other attributes, as specified by the job's !ClassAd).
    2929   In BOINC, there may be many app versions for a single application:
    3030   e.g. versions for different platforms, GPU types, etc.
    3131   A job is associated with an application, not an app version.
    3232
    33 == Applications ==
    34 
    35 Applications could run on BOINC in several ways:
    36  * As native BOINC applications.
    37    This would require making source-code modifications and recompiling
    38    for different platforms, linking with the BOINC API library.  Too complex.
    39  * In virtual machines.
     33== BOINC environment choices ==
     34
     35BOINC offers three "environments" in which applications can be deployed:
     36 * '''Native''':
     37   This requires making source-code modifications and recompiling
     38   for different platforms, linking with the BOINC API library.
     39 * '''Virtual machine-based''':
    4040   This would eliminate multi-platform issues
    4141   but would require volunteer hosts to have VirtualBox installed.
    42    Maybe someday.
    43  * Using the BOINC wrapper.
     42 * '''BOINC wrapper''':
    4443   Requires apps to be built for different platforms, but no source code mods.
    45    Let's use this to start.
     44
     45Using the BOINC wrapper is the path of least resistance at this point.
    4646
    4747The Condor pool admins will select a set of applications to run under BOINC.
     
    5656
    5757Goal: minimize data transfer and storage on the BOINC server.
    58 To do this, we'll add the following to BOINC:
    59 
    60  * DB tables for files, and for batch/file associations
    61  * daemon for deleting files and DB records of files with
    62    no associations, or past all lease ends
     58To do this, we'll add the following mechanism to BOINC:
     59
     60 * DB tables for files, and for batch/file associations (with lease ends).
     61   File names will be based on MD5s.
     62 * Web RPCs for querying and uploading files.
     63 * Daemon for deleting files and DB records of files with
     64   no associations, or past all lease ends.
    6365
    6466For output files, we'll take the approach that each job has (from BOINC's viewpoint)
    65 a single output file, which is a zipped archive of its actual outputs.
     67a single output file, which is a zipped archive of its actual output files.
    6668This will get copied to the submitter host, unzipped,
    6769and its components moved to the appropriate directory.
     
    6971== Job submission mechanism  ==
    7072
    71 We'll use Condor's existing mechanism for sending jobs to
    72 non-Condor back ends.
     73We'll use Condor's existing mechanism for sending jobs to non-Condor back ends.
    7374This will involve 2 components:
    7475
     
    7980   * Periodically poll the BOINC server for completed jobs;
    8081     when a job is newly completed,
    81         download its output from the BOINC server,
    82         and store it into the appropriate directories on the submit node.
     82    download its output from the BOINC server,
     83    and store it into the appropriate directories on the submit node.
    8384 * A new class in Condor's job_router for managing communication
    8485   with the BOINC GAHP.
     
    9899         job name
    99100         cmdline
    100                  for each input file
    101                 path on submit node
    102                         name by which app will open file
    103                  bool return_all_output_files
    104                  if the above not set: for each output file
    105                         open name (what the app will create)
    106                         final name (e.g. may have process appended)
    107                  directory where output file(s) go
     101         for each input file
     102             path on submit node
     103             name by which app will open file
     104         bool return_all_output_files (or regular expression)
     105         if the above not set
     106             for each output file
     107                open name (what the app will create)
     108                final name (e.g. may have process appended)
     109         directory where output file(s) go
    108110   output:
    109111      error code
    110112}}}
    111113
    112 What the BOINC GAHP does:
     114The BOINC GAHP handles this as follows:
    113115
    114116 * Make list of all input files
     
    186188 * All jobs belong to a single BOINC account.
    187189
    188 
    189190== Changes to BOINC ==
    190191
     
    194195 * Add lease_end field to batch
    195196
    196 == Implementation notes ==
    197 
    198 The BOINC GAHP could be implemented in PHP, Python, or C++.
     197== BOINC GAHP implementation notes ==
     198
     199The BOINC GAHP could be implemented in Python or C++.
    199200My inclination is to use Python; we can assume it's available on the submit node.
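
As an illustration of the file-handling ideas in this page (file names based on MD5s, and a single zipped output archive that is unpacked on the submit node), here is a minimal Python sketch. It is not the BOINC GAHP: the function names `md5_file_name` and `unpack_job_output`, the optional `prefix` argument, and the `renames` mapping (open name to final name) are all hypothetical, and the sketch assumes archive members are flattened into one output directory.

```python
import hashlib
import shutil
import zipfile
from pathlib import Path


def md5_file_name(path, prefix="", chunk_size=1 << 20):
    """Return a name derived from the MD5 of the file's contents.

    The prefix is a hypothetical convention, not something BOINC defines.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large input files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return prefix + digest.hexdigest()


def unpack_job_output(archive_path, dest_dir, renames=None):
    """Unzip a job's output archive and store its members in dest_dir.

    renames maps an "open name" (what the app created) to a "final name"
    (e.g. with a process number appended); unmapped members keep their name.
    Returns the list of paths written.
    """
    renames = renames or {}
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Keep only the base name, so a member cannot escape dest_dir.
            open_name = Path(info.filename).name
            target = dest / renames.get(open_name, open_name)
            with zf.open(info) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Two input files with identical contents get the same MD5-based name, which is what would let the server avoid storing or transferring a file twice. Flattening member paths is a simplification; a real implementation would need to decide how to treat outputs in subdirectories.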