Changes between Version 1 and Version 2 of HomogeneousAppVersion


Timestamp:
May 20, 2012, 3:35:59 PM
Author:
davea
Comment:

--

  • HomogeneousAppVersion

    v1 v2  
    44specify that multiple instances of a job must be run on hosts
    55whose CPU and OS type are similar,
    6 to ensure that correct results are identical or
    7 sufficiently similar to compare.
     6to ensure that correct results are identical or sufficiently similar to compare.
    87
    98The HR mechanism doesn't handle GPU app versions;
     
    1110run with a GPU app version and another instance is run with a CPU app version.
    1211
    13 We considered adding GPU info to the HR mechanism.
    14 This turned out to be infeasible.
    15 
    16 Instead, Kevin Reed and I propose adding a new mechanism called '''homogeneous app version''' (HAV),
     12Instead, there is a mechanism called '''homogeneous app version''' (HAV)
    1713which ensures that instances of a given job are run using the same app version
    1814(e.g., Win32/CUDA etc.).
    19 This can be specified on a per-application basis.
     15This flag can be specified on a per-application basis.
     16You can set it using the [http://boinc.berkeley.edu/trac/wiki/HtmlOps admin web interface].
    2017
    2118Notes:
     
    2421 * Use this only when you're sure that all app versions are correct,
    2522   since it eliminates cross-checking between versions.
    26 
    27 == Implementation notes ==
    28 
    29 New DB fields
    30 
    31  * APP::homogeneous_app_version (bool)
    32  * WORKUNIT::app_version_id  (int)
    33 
    34 The latter is maintained like wu.hr_class:
    35 it's set when we first dispatch an instance of the job,
    36 and it's cleared if all instances error out.
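
The lifecycle of `WORKUNIT::app_version_id` described above can be sketched roughly as follows. This is a minimal Python illustration, not BOINC's actual server code (which is C++). The `Workunit` class and the helper names `on_dispatch` and `on_all_instances_failed` are hypothetical, and the value 0 is assumed to mean "not yet committed to any app version".

```python
from dataclasses import dataclass


@dataclass
class Workunit:
    # Assumption: 0 means "no app version committed yet".
    app_version_id: int = 0


def on_dispatch(wu: Workunit, chosen_app_version_id: int) -> None:
    """Commit the job to an app version when its first instance is dispatched."""
    if wu.app_version_id == 0:
        wu.app_version_id = chosen_app_version_id
    # Otherwise the job is already committed; later dispatches leave it alone.


def on_all_instances_failed(wu: Workunit) -> None:
    """Clear the commitment once every instance of the job has errored out."""
    wu.app_version_id = 0
```

Clearing the field after a total failure lets the job be retried with a different app version, which mirrors how `wu.hr_class` is said to behave.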
    37 
    38 Change to best_app_version():
    39 {{{
    40 if app.homogeneous_app_version and wu.app_version_id
    41    check if this host supports the app version's platform
    42    if app version has plan class, check if host can handle it
    43    check if we need work for the resource type
    44 }}}
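
The three checks in the pseudocode above could look something like the sketch below. It is written in Python for brevity and is only an illustration under stated assumptions: the `AppVersion` and `Host` classes, their field names, and the function `committed_version_usable` are all hypothetical and do not correspond to BOINC's real data structures. What the scheduler does when the check fails is not specified in the text, so the sketch only reports whether the committed version is usable.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class AppVersion:
    id: int
    platform: str                      # e.g. "windows_intelx86"
    plan_class: Optional[str] = None   # e.g. "cuda"; None for a plain CPU version
    resource: str = "cpu"              # resource type this version runs on


@dataclass
class Host:
    platforms: Set[str] = field(default_factory=set)
    plan_classes: Set[str] = field(default_factory=set)            # plan classes the host can handle
    resources_needing_work: Set[str] = field(default_factory=set)  # resource types being requested


def committed_version_usable(av: AppVersion, host: Host) -> bool:
    """Return True if the host can run the app version the job is committed to."""
    # 1. The host must support the app version's platform.
    if av.platform not in host.platforms:
        return False
    # 2. If the app version has a plan class, the host must be able to handle it.
    if av.plan_class is not None and av.plan_class not in host.plan_classes:
        return False
    # 3. The request must actually need work for this resource type.
    return av.resource in host.resources_needing_work
```

The check would apply only when `app.homogeneous_app_version` is set and the job already has a nonzero `app_version_id`; otherwise the normal app version selection is assumed to run unchanged.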
    45 
    46 In some cases this may result in using a non-optimal app version;
    47 e.g. we might use a CUDA 2.0 version for a host capable of running CUDA 2.3.
    48 So be it.
    49 
    50 It's possible that the shared-memory job cache could get clogged up
    51 with jobs already committed to rare app versions.
    52 I don't have a plan for dealing with this.