Changes between Version 1 and Version 2 of HomogeneousAppVersion
- Timestamp: May 20, 2012, 3:35:59 PM
HomogeneousAppVersion
v1  v2
 4   4    specify that multiple instances of a job must be run on hosts
 5   5    whose CPU and OS type are similar,
 6      - to ensure that correct results are identical or
 7      - sufficiently similar to compare.
     6  + to ensure that correct results are identical or sufficiently similar to compare.
 8   7
 9   8    The HR mechanism doesn't handle GPU app versions;
 …   …
11  10    run with a GPU app version and another instance is run with a CPU app version.
12  11
13      - We considered adding GPU info to the HR mechanism.
14      - This turned out to be infeasible.
15      -
16      - Instead, Kevin Reed and I propose adding a new mechanism called '''homogeneous app version''' (HAV),
    12  + Instead, there is a mechanism called '''homogeneous app version''' (HAV)
17  13    which ensures that instances of a given job are run using the same app version
18  14    (e.g., Win32/CUDA etc.).
19      - This can be specified on a per-application basis.
    15  + This flag can be specified on a per-application basis.
    16  + You can set it using the [http://boinc.berkeley.edu/trac/wiki/HtmlOps admin web interface].
20  17
21  18    Notes:
 …   …
24  21    * Use this only when you're sure that all app versions are correct,
25  22    since it eliminates cross-checking between versions.
26      -
27      - == Implementation notes ==
28      -
29      - New DB fields
30      -
31      - * APP::homogeneous_app_version (bool)
32      - * WORKUNIT::app_version_id (int)
33      -
34      - The latter is maintained like wu.hr_class:
35      - it's set when we first dispatch an instance of the job,
36      - and it's cleared if all instances error out.
37      -
38      - Change to best_app_version():
39      - {{{
40      - if app.homogeneous_app_version and wu.app_version_id
41      -     check if this host supports the app version's platform
42      -     if app version has plan class, check if host can handle it
43      -     check if we need work for the resource type
44      - }}}
45      -
46      - In some cases this may result in using a non-optional app version;
47      - e.g. we might use a CUDA 2.0 version for a host capable of running CUDA 2.3.
48      - So be it.
49      -
50      - It's possible that the shared-memory job cache could get clogged up
51      - with jobs already committed to rare app versions.
52      - I don't have a plan for dealing with this.
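
The removed implementation notes describe the HAV check only in pseudocode. The sketch below is one way to read it, written in Python for illustration and not taken from the BOINC scheduler. The classes `AppVersion`, `Host` and `Workunit`, the helper names, and the "first usable version" fallback are all assumptions made for this example; only the `homogeneous_app_version` flag, the `app_version_id` field, and the three checks come from the notes.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class AppVersion:              # hypothetical stand-in for an app version record
    id: int
    platform: str              # e.g. "windows_intelx86"
    plan_class: str = ""       # empty string means "no plan class"
    resource: str = "cpu"      # resource type this version runs on


@dataclass
class Host:                    # hypothetical description of a requesting host
    platforms: Set[str]
    plan_classes: Set[str] = field(default_factory=set)
    needs_work: Set[str] = field(default_factory=set)


@dataclass
class Workunit:
    app_version_id: int = 0    # 0 = job not yet committed to an app version


def version_usable(av: AppVersion, host: Host) -> bool:
    """The three checks listed in the pseudocode."""
    if av.platform not in host.platforms:
        return False
    if av.plan_class and av.plan_class not in host.plan_classes:
        return False
    return av.resource in host.needs_work


def pick_app_version(homogeneous_app_version: bool, wu: Workunit,
                     versions: Dict[int, AppVersion],
                     host: Host) -> Optional[AppVersion]:
    # HAV case: the job is already committed, so only that version may be used.
    if homogeneous_app_version and wu.app_version_id:
        av = versions[wu.app_version_id]
        return av if version_usable(av, host) else None
    # Assumption: stand-in for the normal selection logic (first usable version).
    for av in versions.values():
        if version_usable(av, host):
            return av
    return None


def on_first_dispatch(wu: Workunit, av: AppVersion) -> None:
    # Set when the first instance of the job is dispatched.
    if not wu.app_version_id:
        wu.app_version_id = av.id


def on_all_instances_errored(wu: Workunit) -> None:
    # Cleared if all instances error out, so any version may be chosen again.
    wu.app_version_id = 0
```

Under these assumptions, a job first dispatched with a CPU version stays on that version even for a GPU-capable host, which matches the caveat in the notes that a host may not get the version best suited to it.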