Changes between Version 21 and Version 22 of GpuWorkFetch


Timestamp:
Dec 30, 2008, 9:59:04 AM (15 years ago)
Author:
davea
Comment:

--

  • GpuWorkFetch

    v21 v22  
    2323This document proposes a modification to the work-fetch system that solves these problems.
    2424
    25 For simplicity, the design considers only one GPU type (CUDA).
    26 However, it is straightforward to extend the design to handle additional GPU types.
    27 
    2825== Example ==
    2926
     
    3330 * The host's GPU is twice as fast as its CPU.
    3431
    35 In this case, the target behavior is for the host to use
    36 100% of the CPU for project B,
    37 25% of the GPU for project B,
    38 and 75% of the GPU for project A.
     32In this case, the target behavior is:
     33 * the CPU is used 100% by project B
     34 * the GPU is used 75% by project A and 25% by project B
     35
    3936This provides equal processing to the two projects.
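
The arithmetic behind this claim can be checked with a short sketch. It assumes speeds are normalized so that the CPU counts as 1 unit and the GPU as 2 (from the stated 2x ratio); the names `usage`, `speed` and `processing` are illustrative and not part of the design.

```python
# Relative speeds: the GPU is stated to be twice as fast as the CPU.
speed = {"cpu": 1.0, "gpu": 2.0}

# Target usage fractions from the example: resource -> project -> fraction.
usage = {
    "cpu": {"A": 0.0, "B": 1.0},
    "gpu": {"A": 0.75, "B": 0.25},
}

def processing(project):
    """Total processing rate a project receives, summed over resources."""
    return sum(speed[r] * usage[r][project] for r in usage)

print(processing("A"), processing("B"))
```

Both projects end up with the same total, which is the sense in which the split gives them "equal processing".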
    4037
    4138== Terminology ==
    4239
     40New abstraction: '''processing resource''' or PRSC.
     41The CPU is a PRSC and each coprocessor type is a PRSC.
     42
    4343A job sent to a client is associated with an app version,
    44 which uses some number (possibly fractional) of CPUs and CUDA devices.
    45 
    46  * A '''CPU job''' is one that uses only CPU.
    47  * A '''CUDA job''' is one that uses CUDA (and may use CPU as well).
     44which uses some number (possibly fractional) of CPUs,
     45and some number of instances of a particular coprocessor type.
     46
     47This design does not accommodate:
     48
     49 * jobs that use more than one coprocessor type
     50 * jobs that change their resource usage dynamically (e.g. coprocessor jobs that decide to use the CPU instead).
    4851
    4952== Scheduler request and reply message ==
     
    5154New fields in the scheduler request message:
    5255
    53 '''double cpu_req_seconds''': number of CPU seconds requested
    54 
    55 '''double cpu_req_ninstances''': send enough jobs to occupy this many CPUs
     56 '''double cpu_req_seconds''':: number of CPU seconds requested
     57 '''double cpu_req_ninstances''':: send enough jobs to occupy this many CPUs
    5658
    5759And for each coprocessor type:
    5860
    59 '''double req_seconds''': number of instance-seconds requested
    60 
    61 '''double req_ninstances''': send enough jobs to occupy this many instances
     61 '''double req_seconds''':: number of instance-seconds requested
     62 '''double req_ninstances''':: send enough jobs to occupy this many instances
    6263
    6364For compatibility with old servers, the message still has '''work_req_seconds''',
     
    6970== Client ==
    7071
    71 New abstraction: '''processing resource''' or PRSC.
    72 There are two processing resource types: CPU and CUDA.
    73 
    7472=== Long-term debt ===
    7573
    76 We'll continue to use the idea of '''long-term debt''' (LTD).
    77 LTD represents how much work (measured in device instance-seconds) is "owed" to each project.
    78 This increases over time in proportion to its resource share,
    79 and decreases as it uses resources.
    80 Simplified summary: when we need work for a resource,
     74We'll continue to use the idea of '''long-term debt''' (LTD),
     75which represents how much work (measured in device instance-seconds) is "owed" to each project P.
     76This increases over time in proportion to P's resource share,
     77and decreases as P uses resources.
     78Simplified summary of the new policy: when we need work for a resource,
    8179we ask the project that may have that type of job and whose LTD is greatest.
    8280
    83 The idea of using RAC as a surrogate for LTD was set aside for various reasons.
     81The idea of using RAC as a surrogate for LTD was discussed and set aside for various reasons.
    8482
    8583The notion of LTD needs to span resources;
     
    9492but this would lose information.
    9593
    96 So the current plan is:
    97 
    98  * There is a separate LTD for each resource
     94The current plan is:
     95
     96 * There is a separate LTD for each resource type
    9997 * The "overall LTD", which is used in the work-fetch decision, is the sum of the resource LTDs, weighted by the speed of the resource (FLOPs per instance-second).
    10098
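A minimal sketch of the weighted sum described above, assuming per-resource LTDs are held in instance-seconds and speeds in FLOPs per instance-second. The function name, dictionary layout and numbers are hypothetical and chosen only for illustration.

```python
def overall_ltd(resource_ltd, flops_per_instance_sec):
    """Sum the per-resource LTDs, each weighted by the speed of its resource."""
    return sum(resource_ltd[r] * flops_per_instance_sec[r] for r in resource_ltd)

# Hypothetical values: the project is owed CPU time but has overused the GPU.
ltd = {"cpu": 100.0, "cuda": -20.0}   # instance-seconds
speed = {"cpu": 1e9, "cuda": 2e9}     # FLOPs per instance-second
total = overall_ltd(ltd, speed)       # expressed in FLOPs
```

Weighting by speed means that a second of debt on a fast resource counts for more than a second on a slow one, so the overall figure compares projects by amount of computation rather than by wall time.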
     
    104102We propose the following:
    105103
    106  * For each project P and resource R there is a boolean flag D(P, R) indication whether P should accumulate debt for R.  The idea is that if D(P,R) is true, then it's likely that P would supply a job for R if we asked it.
     104 * For each project P and resource R there is a boolean flag D(P, R) indicating whether P should accumulate debt for R.  The idea is that if D(P,R) is true, then it's likely that P would supply a job for R if we asked it.
    107105 * D(P, R) is initially false.
    108106 * If P supplies a job for R, D(P,R) is set to true.
     
    117115
    118116Instead, we maintain a separate backoff timer per (project, PRSC).
    119 This is doubled whenever we ask for only work of that type and don't get any work;
     117The backoff interval is doubled up to a limit whenever we ask for work of that type and don't get any work;
    120118it's cleared whenever we get a job of that type.
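
The per-(project, PRSC) backoff can be sketched as follows. The 24-hour cap is the maximum given for the backoff timer elsewhere in this document; the 60-second starting interval and all class and method names are assumptions made for illustration.

```python
MAX_BACKOFF = 24 * 3600.0  # upper limit in seconds (24 hours, per the document)
MIN_BACKOFF = 60.0         # assumed starting interval; not specified in the document

class Backoff:
    """Backoff state for one (project, PRSC) pair."""

    def __init__(self):
        self.interval = 0.0  # current backoff interval in seconds
        self.until = 0.0     # time before which we should not ask again

    def request_failed(self, now):
        # Asked for work of this type and got none: double the interval, up to the limit.
        self.interval = min(MAX_BACKOFF, max(MIN_BACKOFF, 2 * self.interval))
        self.until = now + self.interval

    def got_job(self):
        # Received a job of this type: clear the backoff.
        self.interval = 0.0
        self.until = 0.0

    def backed_off(self, now):
        return now < self.until

# One timer per (project, PRSC) pair, created on first use.
timers = {}
timer = timers.setdefault(("project_A", "cuda"), Backoff())
```

Keeping the timers in a mapping keyed by the pair lets a project be backed off for one resource while still being asked for work on another.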
    121119
     
    127125Data members of PRSC_WORK_FETCH:
    128126
    129 '''ninstances'''
     127 '''ninstances''':: number of instances of this resource type
    130128
    131129Used/set by rr_simulation():
    132130
    133 '''double shortfall''': shortfall for this resource
    134 
    135 '''double nidle''': number of currently idle instances
     131 '''double shortfall''':: shortfall for this resource
     132 '''double nidle''':: number of currently idle instances
    136133
    137134Member functions of PRSC_WORK_FETCH:
    138135
    139 '''rr_init()''': called at the start of RR simulation.
    140 Compute project shares for this PRSC, and clear overall and per-project shortfalls.
    141 
    142 '''set_nidle()''': called by RR sim after initial job assignment.
     136 '''rr_init()''':: called at the start of RR simulation.  Compute project shares for this PRSC, and clear overall and per-project shortfalls.
     137 '''set_nidle()''':: called by RR sim after initial job assignment.
    143138Set nidle to # of idle instances.
    144 
    145 '''accumulate_shortfall(dt)''': called by RR sim for each time interval during work buf period.
     139 '''accumulate_shortfall(dt)''':: called by RR sim for each time interval during work buf period.
    146140{{{
    147141shortfall += dt*(ninstances - instances in use)
     
    150144}}}
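As a rough illustration of how the shortfall builds up over the simulated intervals, the sketch below implements only the update line visible above and does not reproduce the rest of the block. Passing the number of instances in use as a parameter, and the class name itself, are assumptions.

```python
class ResourceWorkFetch:
    """Illustrative stand-in for the per-resource work-fetch state."""

    def __init__(self, ninstances):
        self.ninstances = ninstances  # number of instances of this resource type
        self.shortfall = 0.0          # idle instance-seconds accumulated so far

    def accumulate_shortfall(self, dt, instances_in_use):
        # Every instance left idle during an interval of length dt adds dt to the shortfall.
        self.shortfall += dt * (self.ninstances - instances_in_use)

rsc = ResourceWorkFetch(ninstances=2)
# Hypothetical simulation: (interval length in seconds, instances busy in that interval).
for dt, in_use in [(100.0, 2), (50.0, 1), (30.0, 0)]:
    rsc.accumulate_shortfall(dt, in_use)
```

Here the first interval adds nothing, the second adds 50 instance-seconds and the third adds 60, so the shortfall measures how much work would be needed to keep all instances busy.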
    151145
    152 '''select_project()''':
    153 select the best project to request this type of work from.
    154 It's the project not backed off for this PRSC,
    155 and for which LTD + p->shortfall is largest,
    156 also taking into consideration overworked projects etc.
    157 
    158 '''accumulate_debt(dt)''':
     146 '''select_project()''':: select the best project to request this type of work from. It's the project not backed off for this PRSC, and for which LTD + p->shortfall is largest, also taking into consideration overworked projects etc.
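
A simplified sketch of this selection rule, assuming each project is represented by a dictionary holding its backoff expiry, LTD and shortfall for this PRSC. The field names are hypothetical, and the handling of overworked projects mentioned above is deliberately left out.

```python
def select_project(projects, now):
    """Return the eligible project with the largest LTD + shortfall, or None."""
    # Skip projects that are still backed off for this PRSC.
    candidates = [p for p in projects if p["backoff_until"] <= now]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p["ltd"] + p["shortfall"])

projects = [
    {"name": "A", "backoff_until": 0.0, "ltd": 100.0, "shortfall": 10.0},
    {"name": "B", "backoff_until": 500.0, "ltd": 900.0, "shortfall": 0.0},
    {"name": "C", "backoff_until": 0.0, "ltd": 50.0, "shortfall": 80.0},
]
best = select_project(projects, now=100.0)
```

In this example project B has the largest debt but is backed off, so the choice falls to C, whose LTD plus shortfall exceeds that of A.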
     147
     148 '''accumulate_debt(dt)'''::
    159149for each project p:
    160150{{{
     
    168158It has the following "persistent" members (i.e., saved in state file):
    169159
    170 '''backoff timer'''*:  how long to wait until ask project for work specifically for this PRSC;
    171 double this any time we ask for work for this rsc and get none
    172 (maximum 24 hours).
    173 Clear it when we ask for work for this PRSC and get some job.
     160 '''backoff timer'''*::  how long to wait before asking the project for work specifically for this PRSC;
     161double this any time we ask for work for this PRSC and get none (maximum 24 hours). Clear it when we ask for work for this PRSC and get a job.
    174162
    175163And the following transient members (used by rr_simulation()):
    176164
    177 '''double share''': # of instances this project should get based on resource share
     165 '''double share''':: # of instances this project should get based on resource share
    178166relative to the set of projects not backed off for this PRSC.
    179 
    180 '''instances_used''': # of instances currently being used
    181 
    182 '''double shortfall'''
    183 
    184 '''accumulate_shortfall(dt)'''
     167 '''instances_used''':: # of instances currently being used
     168 '''double shortfall'''::
     169 '''accumulate_shortfall(dt)'''::
    185170{{{
    186171shortfall += dt*(share - instances_used)
     
    188173
    189174Each project has the following work-fetch-related state:
    190 
    191 '''double long_term_debt*''': the amount of processing (including GPU, but expressed in terms of CPU seconds) owed to this project.
     175 '''double long_term_debt*''':: the amount of processing (including GPU, but expressed in terms of CPU seconds) owed to this project.
    192176
    193177=== debt accounting ===