Changes between Version 4 and Version 5 of AppCoprocessor


Timestamp:
Mar 12, 2008, 9:45:36 AM (17 years ago)
Author:
davea
Comment:

--

  • AppCoprocessor

    v4 v5  
    1 = Managing allocated resources =
     1= Using coprocessors =
    22
    3 BOINC has a hardwired model of various resources:
    4 
    5  * CPUs
    6  * physical memory and swap space
    7  * disk space
    8 
    9 Some hosts have other processing resources:
     3This document describes BOINC's support for applications that use coprocessors such as
    104
    115 * GPU(s)
    126 * SPEs in a Cell processor
    137
    14 How can we extend BOINC to take these resources into account,
    15 i.e. to send the best available app version to hosts,
    16 and to execute the best combination of apps on a given host?
    17 
    188We'll assume that these resources are "allocated" rather than "scheduled":
    19 an application using a resource has it locked
     9an application using a coprocessor has it locked
    2010while the app is in memory,
    2111even if the app is suspended by BOINC or descheduled by the OS.
     
    2313== Proposed design ==
    2414
    25  1. We define an XML notation for resources.  This might look like
     15 1. The BOINC client will probe for coprocessors, and report them in scheduler requests.  The XML looks like:
    2616{{{
    27 <resource>
    28     <type>Cell SPE</type>
    29     <number>6</number>
    30 </resource>
    31 <resource>
    32     <type>NVIDIA 8800 GPU with 1.7 driver</type>
    33     <number>1</number>
    34 </resource>
     17<coprocs>
     18   <coproc_cuda>
     19      <count>1</count>
     20      <name>GeForce 8800 GT (1)</name>
     21      <totalGlobalMem>...</totalGlobalMem>
     22      ...
     23   </coproc_cuda>
     24</coprocs>
    3525}}}
    36  1. The BOINC client will discover resources, and will pass the resource description in scheduler request messages.
    37  1. An app_version record (in the server DB) will have a new field '''resource_requirements''' of the form.
     26 1. An app_version record (in the server DB) will have a new field '''coproc_req''', which is a character string encoding its coprocessor requirements.
     27 1. The scheduler is linked with a project-specific function
    3828{{{
    39 <resource>
    40     <type>REGEXP</type>
    41     <number>n</number>
    42 </resource>
    43 ...
     29bool coprocessor_compatible(COPROCS&, char* coproc_req, double& flops);
    4430}}}
    45  A host is "compatible" with an app_version if, for each required resource, the host has at least n instances of a resource whose name matches REGEXP.
    46  1. In addition, app_version will have an '''acceleration''' field, representing the (approximate) speedup relative to CPU-only execution.
    47  1. The scheduler will be modified so that, when sending a job to a host, it finds the compatible app_version for which '''acceleration''' is greatest.
    48  1. The scheduler reply will include app_version.resource_requirements and app_version.acceleration.
    49  1. The client will be modified so that it keeps track of resource allocation, i.e. how many instances of each resource are free. It only runs an app if enough instances are available, and it decrements the counts accordingly.
    50  1. The client will be modified to use app_version.acceleration in estimating job completion times.
     31This function:
     32 * returns true if the coprocessor resources (COPROCS&) are sufficient for the app version
     33 * fills in the num_used fields of the elements of COPROCS, indicating how many instances of each coprocessor will be used
     34 * returns (in flops) the estimated FLOPS (used to estimate job completion time)
     35
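A project-specific implementation of this function might look like the minimal sketch below. Everything beyond the function name and the `num_used` field is an assumption made for illustration: the `COPROC` and `COPROCS` layouts are stand-ins for BOINC's real types, `coproc_req` is assumed to be a C string of the form `name:count`, and the FLOPS constants are placeholders rather than measured values.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-ins for BOINC's types; only num_used is named in the text.
struct COPROC {
    std::string name;  // coprocessor type, e.g. "CUDA"
    int count;         // instances present on the host
    int num_used;      // instances this app version would use (filled in below)
};

struct COPROCS {
    std::vector<COPROC> coprocs;
};

// Assumed encoding of coproc_req: "<name>:<count>", e.g. "CUDA:1".
// An empty string means the app version needs no coprocessor.
bool coprocessor_compatible(COPROCS& host, const char* coproc_req, double& flops) {
    const double cpu_flops = 1e9;          // placeholder CPU-only estimate
    const double flops_per_coproc = 50e9;  // placeholder per-instance estimate

    if (!coproc_req || !*coproc_req) {
        flops = cpu_flops;
        return true;
    }
    const char* colon = std::strchr(coproc_req, ':');
    if (!colon) return false;              // malformed requirement string
    std::string name(coproc_req, colon - coproc_req);
    int needed = std::atoi(colon + 1);
    if (needed <= 0) return false;

    for (COPROC& c : host.coprocs) {
        if (c.name == name && c.count >= needed) {
            c.num_used = needed;           // report how many instances will be used
            flops = cpu_flops + needed * flops_per_coproc;
            return true;
        }
    }
    return false;                          // host lacks the required coprocessor
}
```

With a host reporting one `CUDA` device, a requirement of `CUDA:1` would be accepted and `CUDA:2` rejected. A real project would presumably derive `flops` from the reported device properties instead of fixed constants.
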
     36 1. The scheduler will be modified so that, when sending a job to a host, it finds the compatible app_version for which flops is greatest.
     37 1. The scheduler reply will include, for each app version, the list of coprocessors that it will use, and the estimated FLOPS.
     38 1. The client will be modified so that it keeps track of coprocessor allocation, i.e. how many instances of each are free. It only runs an app if enough instances are available, and it decrements the counts accordingly.
     39 1. The client will be modified to use app_version.flops in estimating job completion times.
    5140
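The scheduler and client changes listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not BOINC code: `APP_VERSION`, `best_version`, `COPROC_ALLOCATOR`, `estimated_seconds`, and the `job_fpops` parameter are hypothetical names, and coprocessors are identified here by a plain string.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical view of what the scheduler reply carries for one app version.
struct APP_VERSION {
    std::string name;
    std::map<std::string, int> coproc_usage;  // coprocessor name -> instances used
    double flops;                             // estimated FLOPS for this version
};

// Scheduler side: among the compatible versions, pick the one with the
// greatest estimated FLOPS. Returns nullptr if the list is empty.
const APP_VERSION* best_version(const std::vector<APP_VERSION>& compatible) {
    const APP_VERSION* best = nullptr;
    for (const APP_VERSION& av : compatible) {
        if (!best || av.flops > best->flops) best = &av;
    }
    return best;
}

// Client side: track how many instances of each coprocessor are free.
class COPROC_ALLOCATOR {
public:
    explicit COPROC_ALLOCATOR(std::map<std::string, int> total)
        : free_(std::move(total)) {}

    // Run the app only if every coprocessor it needs has enough free
    // instances; on success, decrement the free counts.
    bool try_start(const APP_VERSION& av) {
        for (const auto& u : av.coproc_usage) {
            auto it = free_.find(u.first);
            if (it == free_.end() || it->second < u.second) return false;
        }
        for (const auto& u : av.coproc_usage) free_[u.first] -= u.second;
        return true;
    }

    // Call when the app leaves memory. A suspended app still holds its
    // coprocessors, so suspension must not release them.
    void finish(const APP_VERSION& av) {
        for (const auto& u : av.coproc_usage) free_[u.first] += u.second;
    }

    int free_count(const std::string& name) const {
        auto it = free_.find(name);
        return it == free_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, int> free_;
};

// Completion-time estimate: job size in floating-point operations
// (job_fpops, an assumed input) divided by the version's estimated FLOPS.
double estimated_seconds(double job_fpops, const APP_VERSION& av) {
    return job_fpops / av.flops;
}
```

Checking all requirements before decrementing any count keeps the allocation all-or-nothing, so a job that needs two kinds of coprocessor cannot end up holding one while waiting for the other.
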
    5241== Questions ==