Changes between Version 17 and Version 18 of GpuWorkFetch

Timestamp: Dec 29, 2008, 10:53:34 AM
Author: davea
Comment: --

GpuWorkFetch

-                      v17
+                      v18
 This document proposes a modification to the work-fetch system that solves these problems.
 For simplicity, the design assumes that there is only one GPU type (CUDA).
 It is straightforward to extend the design to handle additional GPU types.
+For simplicity, the design considers only one GPU type (CUDA).
+However, it is straightforward to extend the design to handle additional GPU types.
 == Terminology ==
 …
 New abstraction: '''processing resource''' or PRSC.
 There are two processing resource types: CPU and CUDA.
+The notion of long-term debt
 === Per-resource-type backoff ===
 …
 This is stored in an object of class PRSC_WORK_FETCH.
 Data members of PRSC_WORK_FETCH
+Data members of PRSC_WORK_FETCH:
 '''ninstances'''
 …
 It has the following "persistent" members (i.e., saved in state file):
-'''double long_term_debt*'''
 '''backoff timer'''*: how long to wait before asking the project for work specifically for this PRSC;
 doubled each time we ask for work for this resource and get none
 …
 }}}
+Each project has the following work-fetch-related state:
+'''double long_term_debt*''': the amount of processing (including GPU, but expressed in terms of CPU seconds) owed to this project.
 === debt accounting ===
 {{{
+for each resource type R
+   R.accumulate_debt(dt)
+for each resource type R
+   for each project P
+      if P is not backed off for R
+         P.R.LTD += share
+   for each running job J, project P
+      for each resource R used by J
+         P.R.LTD -= share*dt
 }}}
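The debt-accounting loop above can be sketched in runnable form. This is only an illustration under stated assumptions, not the client's actual code: the `Project` and `Job` classes, the `usage` map, and the normalization of shares among non-backed-off projects are all hypothetical, and the credit is assumed to be scaled by `dt` in the same way as the debit.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    name: str
    share: float                                   # resource share (hypothetical units)
    ltd: dict = field(default_factory=dict)        # resource name -> long-term debt
    backed_off: set = field(default_factory=set)   # resources this project is backed off for


@dataclass
class Job:
    project: Project
    usage: dict   # resource name -> instances used (assumed representation)


def accumulate_debt(resources, projects, running_jobs, dt):
    """Update per-project, per-resource long-term debt over an interval dt."""
    for r in resources:
        # Credit: only projects not backed off for r accrue debt for r.
        eligible = [p for p in projects if r not in p.backed_off]
        total = sum(p.share for p in eligible)
        if total > 0:
            for p in eligible:
                # Assumption: shares are normalized and scaled by dt.
                p.ltd[r] = p.ltd.get(r, 0.0) + (p.share / total) * dt
        # Debit: each running job charges its project for the resources it uses.
        for j in running_jobs:
            if r in j.usage:
                p = j.project
                p.ltd[r] = p.ltd.get(r, 0.0) - j.usage[r] * dt


a = Project("A", share=1.0)
b = Project("B", share=3.0, backed_off={"cuda"})
jobs = [Job(project=a, usage={"cpu": 1.0})]
accumulate_debt(["cpu", "cuda"], [a, b], jobs, dt=10.0)
```

In this example project B is backed off for CUDA, so it accrues no CUDA debt, while project A is credited for both resources and debited for the CPU instance its job occupies.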
 …
    clear backoff for the PRSC of each returned job
 }}}
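The per-resource backoff behaviour described in this document (double the backoff when a request for a resource returns no work, clear it when a job for that resource is returned) can be sketched as follows. The class name `ResourceBackoff` and the `min_interval` and `max_interval` bounds are assumptions made for illustration; the document does not specify them.

```python
class ResourceBackoff:
    """Backoff timer for one processing resource (sketch, hypothetical names)."""

    def __init__(self, min_interval=60.0, max_interval=86400.0):
        self.min_interval = min_interval   # assumed lower bound, in seconds
        self.max_interval = max_interval   # assumed upper bound, in seconds
        self.interval = 0.0                # current backoff length
        self.until = 0.0                   # time before which we do not ask again

    def request_failed(self, now):
        # Asked for work for this resource and got none: double the interval.
        doubled = max(self.interval * 2, self.min_interval)
        self.interval = min(doubled, self.max_interval)
        self.until = now + self.interval

    def clear(self):
        # A job for this resource was returned: reset the backoff.
        self.interval = 0.0
        self.until = 0.0

    def can_fetch(self, now):
        return now >= self.until


cuda = ResourceBackoff()
cuda.request_failed(now=0.0)      # first failure: wait min_interval
first_interval = cuda.interval
cuda.request_failed(now=100.0)    # second failure: interval doubles
second_until = cuda.until
```

Keeping one such timer per project and per resource type lets a project that has no GPU work remain eligible for CPU work requests.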
 == Scheduler changes ==
 {{{