Changes between Version 21 and Version 22 of GpuWorkFetch
- Timestamp: Dec 30, 2008, 9:59:04 AM
GpuWorkFetch
  v21 v22
   23  23  This document proposes a modification to the work-fetch system that solves these problems.
   24  24
-  25      For simplicity, the design considers only one GPU type (CUDA).
-  26      However, it is straightforward to extend the design to handle additional GPU types.
-  27
   28  25  == Example ==
   29  26
    …   …
   33  30  * The host's GPU is twice as fast as its CPU.
   34  31
-  35      In this case, the target behavior is for the host to use
-  36      100% of the CPU for project B,
-  37      25% of the GPU for project B,
-  38      and 75% of the GPU for project A.
+      32  In this case, the target behavior is:
+      33  * the CPU is used 100% by project B
+      34  * the GPU is used 75% by project A and 25% by project B
+      35
   39  36  This provides equal processing to the two projects.
   40  37
   41  38  == Terminology ==
   42  39
+      40  New abstraction: '''processing resource''' or PRSC.
+      41  The CPU is a PRSC and each coprocessor type is a PRSC.
+      42
   43  43  A job sent to a client is associated with an app version,
-  44      which uses some number (possibly fractional) of CPUs and CUDA devices.
-  45
-  46      * A '''CPU job''' is one that uses only CPU.
-  47      * A '''CUDA job''' is one that uses CUDA (and may use CPU as well).
+      44  which uses some number (possibly fractional) of CPUs,
+      45  and some number of instances of a particular coprocessor type.
+      46
+      47  This design does not accommodate:
+      48
+      49  * jobs that use more than one coprocessor type
+      50  * jobs that change their resource usage dynamically (e.g. coprocessor jobs that decide to use the CPU instead).
   48  51
   49  52  == Scheduler request and reply message ==
    …   …
   51  54  New fields in the scheduler request message:
   52  55
-  53      '''double cpu_req_seconds''': number of CPU seconds requested
-  54
-  55      '''double cpu_req_ninstances''': send enough jobs to occupy this many CPUs
+      56  '''double cpu_req_seconds''':: number of CPU seconds requested
+      57  '''double cpu_req_ninstances''':: send enough jobs to occupy this many CPUs
   56  58
   57  59  And for each coprocessor type:
   58  60
-  59      '''double req_seconds''': number of instance-seconds requested
-  60
-  61      '''double req_ninstances''': send enough jobs to occupy this many instances
+      61  '''double req_seconds''':: number of instance-seconds requested
+      62  '''double req_ninstances''':: send enough jobs to occupy this many instances
   62  63
   63  64  For compatibility with old servers, the message still has '''work_req_seconds''',
    …   …
   69  70  == Client ==
   70  71
-  71      New abstraction: '''processing resource''' or PRSC.
-  72      There are two processing resource types: CPU and CUDA.
-  73
   74  72  === Long-term debt ===
   75  73
-  76      We'll continue to use the idea of '''long-term debt''' (LTD).
-  77      LTD represents how much work (measured in device instance-seconds) is "owed" to each project.
-  78      This increases over time in proportion to its resource share,
-  79      and decreases as it uses resources.
-  80      Simplified summary: when we need work for a resource,
+      74  We'll continue to use the idea of '''long-term debt''' (LTD),
+      75  which represents how much work (measured in device instance-seconds) is "owed" to each project P.
+      76  This increases over time in proportion to P's resource share,
+      77  and decreases as P uses resources.
+      78  Simplified summary of the new policy: when we need work for a resource,
   81  79  we ask the project that may have that type of job and whose LTD is greatest.
   82  80
-  83      The idea of using RAC as a surrogate for LTD was set aside for various reasons.
+      81  The idea of using RAC as a surrogate for LTD was discussed and set aside for various reasons.
   84  82
   85  83  The notion of LTD needs to span resources;
    …   …
   94  92  but this would lose information.
   95  93
-  96      So the current plan is:
-  97
-  98      * There is a separate LTD for each resource
+      94  The current plan is:
+      95
+      96  * There is a separate LTD for each resource type
   99  97  * The "overall LTD", which is used in the work-fetch decision, is the sum of the resource LTDs, weighted by the speed of the resource (FLOPs per instance-second).
  100  98
    …   …
  104 102  We propose the following:
  105 103
- 106      * For each project P and resource R there is a boolean flag D(P, R) indication whether P should accumulate debt for R. The idea is that if D(P,R) is true, then it's likely that P would supply a job for R if we asked it.
+     104  * For each project P and resource R there is a boolean flag D(P, R) indicating whether P should accumulate debt for R. The idea is that if D(P,R) is true, then it's likely that P would supply a job for R if we asked it.
  107 105  * D(P, R) is initially false.
  108 106  * If P supplies a job for R, D(P,R) is set to true.
    …   …
  117 115
  118 116  Instead, we maintain a separate backoff timer per (project, PRSC).
- 119      This is doubled whenever we ask for only work of that type and don't get any work;
+     117  The backoff interval is doubled up to a limit whenever we ask for work of that type and don't get any work;
  120 118  it's cleared whenever we get a job of that type.
  121 119
    …   …
  127 125  Data members of PRSC_WORK_FETCH:
  128 126
- 129      '''ninstances'''
+     127  '''ninstances''':: number of instances of this resource type
  130 128
  131 129  Used/set by rr_simulation()):
  132 130
- 133      '''double shortfall''': shortfall for this resource
- 134
- 135      '''double nidle''': number of currently idle instances
+     131  '''double shortfall''':: shortfall for this resource
+     132  '''double nidle''':: number of currently idle instances
  136 133
  137 134  Member functions of PRSC_WORK_FETCH:
  138 135
- 139      '''rr_init()''': called at the start of RR simulation.
- 140      Compute project shares for this PRSC, and clear overall and per-project shortfalls.
- 141
- 142      '''set_nidle()''': called by RR sim after initial job assignment.
+     136  '''rr_init()''':: called at the start of RR simulation. Compute project shares for this PRSC, and clear overall and per-project shortfalls.
+     137  '''set_nidle()''':: called by RR sim after initial job assignment.
  143 138  Set nidle to # of idle instances.
- 144
- 145      '''accumulate_shortfall(dt)''': called by RR sim for each time interval during work buf period.
+     139  '''accumulate_shortfall(dt)''':: called by RR sim for each time interval during work buf period.
  146 140  {{{
  147 141  shortfall += dt*(ninstances - instances in use)
    …   …
  150 144  }}}
  151 145
- 152      '''select_project()''':
- 153      select the best project to request this type of work from.
- 154      It's the project not backed off for this PRSC,
- 155      and for which LTD + p->shortfall is largest,
- 156      also taking into consideration overworked projects etc.
- 157
- 158      '''accumulate_debt(dt)''':
+     146  '''select_project()''':: select the best project to request this type of work from. It's the project not backed off for this PRSC, and for which LTD + p->shortfall is largest, also taking into consideration overworked projects etc.
+     147
+     148  '''accumulate_debt(dt)'''::
  159 149  for each project p:
  160 150  {{{
    …   …
  168 158  It has the following "persistent" members (i.e., saved in state file):
  169 159
- 170      '''backoff timer'''*: how long to wait until ask project for work specifically for this PRSC;
- 171      double this any time we ask for work for this rsc and get none
- 172      (maximum 24 hours).
- 173      Clear it when we ask for work for this PRSC and get some job.
+     160  '''backoff timer'''*:: how long to wait until ask project for work specifically for this PRSC;
+     161  double this any time we ask for work for this rsc and get none (maximum 24 hours). Clear it when we ask for work for this PRSC and get some job.
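Editorial sketch (not part of the diff): the backoff rule described just above (double the timer when a request for this PRSC returns no work, cap it at 24 hours, clear it when a job arrives) could look roughly like the following. The class name, method names, and the 60-second starting interval are assumptions made for illustration; the page states only the doubling, the 24-hour maximum, and the clearing behavior.

```python
from dataclasses import dataclass

MAX_BACKOFF_SECONDS = 24 * 60 * 60.0  # the 24-hour maximum stated in the text
INITIAL_BACKOFF_SECONDS = 60.0        # assumption: the page gives no starting value


@dataclass
class PrscBackoff:
    """Hypothetical per-(project, PRSC) backoff state; names are illustrative."""

    interval: float = 0.0  # 0 means the project is not backed off for this PRSC

    def on_no_work(self) -> None:
        # We asked for work for this PRSC and got none: start or double the
        # interval, never exceeding the 24-hour cap.
        if self.interval == 0.0:
            self.interval = INITIAL_BACKOFF_SECONDS
        else:
            self.interval = min(self.interval * 2, MAX_BACKOFF_SECONDS)

    def on_work_received(self) -> None:
        # We asked for work for this PRSC and got a job: clear the backoff.
        self.interval = 0.0
```

Keeping one such object per (project, PRSC) pair means a project that has no GPU jobs can be backed off for the GPU while still being asked for CPU work.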
  174 162
  175 163  And the following transient members (used by rr_simulation()):
  176 164
- 177      '''double share''': # of instances this project should get based on resource share
+     165  '''double share''':: # of instances this project should get based on resource share
  178 166  relative to the set of projects not backed off for this PRSC.
- 179
- 180      '''instances_used''': # of instances currently being used
- 181
- 182      '''double shortfall'''
- 183
- 184      '''accumulate_shortfall(dt)'''
+     167  '''instances_used''':: # of instances currently being used
+     168  '''double shortfall'''::
+     169  '''accumulate_shortfall(dt)'''::
  185 170  {{{
  186 171  shortfall += dt*(share - instances_used)
    …   …
  188 173
  189 174  Each project has the following work-fetch-related state:
- 190
- 191      '''double long_term_debt*''': the amount of processing (including GPU, but expressed in terms of CPU seconds) owed to this project.
+     175  '''double long_term_debt*''':: the amount of processing (including GPU, but expressed in terms of CPU seconds) owed to this project.
  192 176
  193 177  === debt accounting ===