
Changes between Version 21 and Version 22 of GpuWorkFetch

Timestamp:: Dec 30, 2008, 9:59:04 AM
Author:: davea
Comment:: --


GpuWorkFetch

-                      v21
+                      v22
 This document proposes a modification to the work-fetch system that solves these problems.
-For simplicity, the design considers only one GPU type (CUDA).
-However, it is straightforward to extend the design to handle additional GPU types.
 == Example ==
 …
  * The host's GPU is twice as fast as its CPU.
 In this case, the target behavior is for the host to use
+100% of the CPU for project B,
+25% of the GPU for project B,
+and 75% of the GPU for project A.
+In this case, the target behavior is:
+ * the CPU is used 100% by project B
+ * the GPU is used 75% by project A and 25% by project B
 This provides equal processing to the two projects.
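
The arithmetic behind this target can be checked with a short sketch. It assumes, beyond what is stated above, that the two projects have equal resource shares and that speeds are normalized so the CPU counts as 1 unit and the GPU as 2. All names are illustrative and not part of the design.

```python
# Illustrative check of the example's target allocation.
# Assumptions (not from the design text): equal resource shares,
# CPU speed normalized to 1.0, GPU speed 2.0 (twice the CPU).
CPU_SPEED = 1.0
GPU_SPEED = 2.0

# Fraction of each resource given to each project in the target behavior.
allocation = {
    "A": {"cpu": 0.0, "gpu": 0.75},
    "B": {"cpu": 1.0, "gpu": 0.25},
}


def processing(project):
    """Total processing rate a project receives, in CPU-equivalents."""
    a = allocation[project]
    return a["cpu"] * CPU_SPEED + a["gpu"] * GPU_SPEED


rate_a = processing("A")  # 0.75 * 2.0
rate_b = processing("B")  # 1.0 * 1.0 + 0.25 * 2.0
print(rate_a, rate_b)
```

Under these assumptions both projects receive the same rate, which is what "equal processing" means here.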
 == Terminology ==
+New abstraction: '''processing resource''' or PRSC.
+The CPU is a PRSC and each coprocessor type is a PRSC.
 A job sent to a client is associated with an app version,
+which uses some number (possibly fractional) of CPUs and CUDA devices.
+ * A '''CPU job''' is one that uses only CPU.
+ * A '''CUDA job''' is one that uses CUDA (and may use CPU as well).
+which uses some number (possibly fractional) of CPUs,
+and some number of instances of a particular coprocessor type.
+This design does not accommodate:
+ * jobs that use more than one coprocessor type
+ * jobs that change their resource usage dynamically (e.g. coprocessor jobs that decide to use the CPU instead).
 == Scheduler request and reply message ==
 …
 New fields in the scheduler request message:
+'''double cpu_req_seconds''': number of CPU seconds requested
+'''double cpu_req_ninstances''': send enough jobs to occupy this many CPUs
+ '''double cpu_req_seconds''':: number of CPU seconds requested
+ '''double cpu_req_ninstances''':: send enough jobs to occupy this many CPUs
 And for each coprocessor type:
+'''double req_seconds''': number of instance-seconds requested
+'''double req_ninstances''': send enough jobs to occupy this many instances
+ '''double req_seconds''':: number of instance-seconds requested
+ '''double req_ninstances''':: send enough jobs to occupy this many instances
 For compatibility with old servers, the message still has '''work_req_seconds''',
 …
 == Client ==
-New abstraction: '''processing resource''' or PRSC.
-There are two processing resource types: CPU and CUDA.
 === Long-term debt ===
 We'll continue to use the idea of '''long-term debt''' (LTD).
 LTD represents how much work (measured in device instance-seconds) is "owed" to each project.
 This increases over time in proportion to its resource share,
 and decreases as it uses resources.
 Simplified summary: when we need work for a resource,
+We'll continue to use the idea of '''long-term debt''' (LTD),
+which represents how much work (measured in device instance-seconds) is "owed" to each project P.
+This increases over time in proportion to P's resource share,
+and decreases as P uses resources.
+Simplified summary of the new policy: when we need work for a resource,
 we ask the project that may have that type of job and whose LTD is greatest.
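
As a rough illustration of this simplified policy, the sketch below picks the project with the greatest LTD among those that may have jobs for the resource. The `Project` class and its `may_have_jobs` flag are hypothetical stand-ins and are not the client's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    long_term_debt: float  # work "owed" to the project, in instance-seconds
    may_have_jobs: bool    # hypothetical flag: project may have jobs for this resource


def pick_project(projects):
    """Return the eligible project with the greatest LTD, or None if there is none."""
    eligible = [p for p in projects if p.may_have_jobs]
    if not eligible:
        return None
    return max(eligible, key=lambda p: p.long_term_debt)


projects = [
    Project("A", 500.0, True),
    Project("B", 900.0, False),  # greatest LTD, but no jobs of this type
    Project("C", 700.0, True),
]
chosen = pick_project(projects)
print(chosen.name)
```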
 The idea of using RAC as a surrogate for LTD was set aside for various reasons.
+The idea of using RAC as a surrogate for LTD was discussed and set aside for various reasons.
 The notion of LTD needs to span resources;
 …
 but this would lose information.
 So the current plan is:
  * There is a separate LTD for each resource
+The current plan is:
+ * There is a separate LTD for each resource type
  * The "overall LTD", which is used in the work-fetch decision, is the sum of the resource LTDs, weighted by the speed of the resource (FLOPs per instance-second).
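
A minimal sketch of the weighted sum described in the last bullet, assuming per-resource LTD values in instance-seconds and speeds in FLOPs per instance-second. The dictionary layout and the sample numbers are made up for illustration.

```python
def overall_ltd(resource_ltd, flops_per_instance_sec):
    """Sum the per-resource LTDs, each weighted by that resource's speed."""
    return sum(
        resource_ltd[r] * flops_per_instance_sec[r]
        for r in resource_ltd
    )


# Hypothetical project: owed 100 CPU instance-seconds,
# and 20 instance-seconds ahead on CUDA.
ltd = {"cpu": 100.0, "cuda": -20.0}
speed = {"cpu": 1e9, "cuda": 2e9}

total = overall_ltd(ltd, speed)
print(total)
```

Weighting by speed puts the debts for different resources into a common unit (FLOPs), so a second of a fast device counts for more than a second of a slow one.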
 …
 We propose the following:
  * For each project P and resource R there is a boolean flag D(P, R) indication whether P should accumulate debt for R.  The idea is that if D(P,R) is true, then it's likely that P would supply a job for R if we asked it.
+ * For each project P and resource R there is a boolean flag D(P, R) indicating whether P should accumulate debt for R.  The idea is that if D(P,R) is true, then it's likely that P would supply a job for R if we asked it.
  * D(P, R) is initially false.
  * If P supplies a job for R, D(P,R) is set to true.
 …
 Instead, we maintain a separate backoff timer per (project, PRSC).
 This is doubled whenever we ask for only work of that type and don't get any work;
+The backoff interval is doubled up to a limit whenever we ask for work of that type and don't get any work;
 it's cleared whenever we get a job of that type.
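
The per-(project, PRSC) backoff could be sketched as follows. The 24-hour cap is the maximum given for the backoff timer in this document; the 60-second starting interval and the class and method names are assumptions made for illustration only.

```python
MAX_BACKOFF = 24 * 3600.0  # cap: 24 hours, per the backoff timer description
MIN_BACKOFF = 60.0         # assumed starting interval; not specified in the design


class ResourceBackoff:
    """Backoff state for one (project, PRSC) pair."""

    def __init__(self):
        self.interval = 0.0  # 0 means "not backed off"

    def request_failed(self):
        """We asked for work of this type and got none: start or double the interval."""
        if self.interval == 0.0:
            self.interval = MIN_BACKOFF
        else:
            self.interval = min(2 * self.interval, MAX_BACKOFF)

    def got_job(self):
        """We received a job of this type: clear the backoff."""
        self.interval = 0.0


backoff = ResourceBackoff()
for _ in range(3):
    backoff.request_failed()
print(backoff.interval)
```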
 …
 Data members of PRSC_WORK_FETCH:
+'''ninstances'''
+ '''ninstances''':: number of instances of this resource type
 Used/set by rr_simulation():
+'''double shortfall''': shortfall for this resource
+'''double nidle''': number of currently idle instances
+ '''double shortfall''':: shortfall for this resource
+ '''double nidle''':: number of currently idle instances
 Member functions of PRSC_WORK_FETCH:
+'''rr_init()''': called at the start of RR simulation.
+Compute project shares for this PRSC, and clear overall and per-project shortfalls.
+'''set_nidle()''': called by RR sim after initial job assignment.
+ '''rr_init()''':: called at the start of RR simulation.  Compute project shares for this PRSC, and clear overall and per-project shortfalls.
+ '''set_nidle()''':: called by RR sim after initial job assignment.
 Set nidle to # of idle instances.
+'''accumulate_shortfall(dt)''': called by RR sim for each time interval during work buf period.
+ '''accumulate_shortfall(dt)''':: called by RR sim for each time interval during work buf period.
 {{{
 shortfall += dt*(ninstances - instances in use)
 …
 }}}
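
A small sketch of how the per-resource shortfall might accumulate over the simulation intervals, using only the formula shown above. The elided part of the pseudocode is not reproduced. Passing `instances_in_use` as a parameter, and the class name, are assumptions made so that the example is self-contained.

```python
class ResourceWorkFetch:
    """Illustrative stand-in for the per-resource work-fetch state."""

    def __init__(self, ninstances):
        self.ninstances = ninstances  # number of instances of this resource type
        self.shortfall = 0.0          # idle instance-seconds over the work buffer period

    def rr_init(self):
        """Called at the start of the RR simulation: clear the shortfall."""
        self.shortfall = 0.0

    def accumulate_shortfall(self, dt, instances_in_use):
        """Add the idle instance-seconds for one simulation interval of length dt."""
        self.shortfall += dt * (self.ninstances - instances_in_use)


gpu = ResourceWorkFetch(ninstances=2)
gpu.rr_init()
# (interval length in seconds, instances busy during that interval)
for dt, busy in [(100.0, 2), (50.0, 1), (30.0, 0)]:
    gpu.accumulate_shortfall(dt, busy)
print(gpu.shortfall)
```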
+'''select_project()''':
+select the best project to request this type of work from.
+It's the project not backed off for this PRSC,
+and for which LTD + p->shortfall is largest,
+also taking into consideration overworked projects etc.
+'''accumulate_debt(dt)''':
+ '''select_project()''':: select the best project to request this type of work from. It's the project not backed off for this PRSC, and for which LTD + p->shortfall is largest, also taking into consideration overworked projects etc.
+ '''accumulate_debt(dt)'''::
 for each project p:
 {{{
 …
 It has the following "persistent" members (i.e., saved in state file):
+'''backoff timer'''*:  how long to wait before asking the project for work specifically for this PRSC;
+double this any time we ask for work for this PRSC and get none
+(maximum 24 hours).
+Clear it when we ask for work for this PRSC and get some job.
 '''backoff timer'''*::  how long to wait before asking the project for work specifically for this PRSC;
+double this any time we ask for work for this PRSC and get none (maximum 24 hours). Clear it when we ask for work for this PRSC and get some job.
 And the following transient members (used by rr_simulation()):
 '''double share''': # of instances this project should get based on resource share
+ '''double share''':: # of instances this project should get based on resource share
 relative to the set of projects not backed off for this PRSC.
+'''instances_used''': # of instances currently being used
+'''double shortfall'''
+'''accumulate_shortfall(dt)'''
+ '''instances_used''':: # of instances currently being used
+ '''double shortfall'''::
+ '''accumulate_shortfall(dt)'''::
 {{{
 shortfall += dt*(share - instances_used)
 …
 Each project has the following work-fetch-related state:
+'''double long_term_debt*''': the amount of processing (including GPU, but expressed in terms of CPU seconds) owed to this project.
+ '''double long_term_debt*''':: the amount of processing (including GPU, but expressed in terms of CPU seconds) owed to this project.
 === debt accounting ===