Changes between Version 26 and Version 27 of GpuWorkFetch
Timestamp: Jan 27, 2009, 11:51:36 AM
= GpuWorkFetch =
* LTD is computed solely on the basis of CPU time used, so it doesn't provide a meaningful comparison between projects that use only GPUs, or between GPU and CPU projects.

== Examples ==

In the following, A and B are projects.

=== Example 1 ===

Suppose that:
* A has only GPU jobs and B has both GPU and CPU jobs.
* The host is attached to A and B with equal resource shares.
* The host's GPU is twice as fast as its CPU.

The target behavior is:
* the CPU is used 100% by B
* the GPU is used 75% by A and 25% by B

This provides equal total processing to A and B.

=== Example 2 ===

A has a 1-year CPU job with no slack, so it runs in high-priority mode.
B has jobs available.

Goal: after A's job finishes, B gets the CPU for a year.

Variation: a new project C is attached when A's job finishes.
It should immediately share the CPU with B.

=== Example 3 ===

A has GPU jobs but B doesn't.
After a year, B gets a GPU app.

Goal: A and B immediately share the GPU.

== Resource types ==

New abstraction: '''processing resource type''', or just "resource type".
Examples of resource types:
* CPU
* A coprocessor type (a kind of GPU, or the SPE processors in a Cell)

A job sent to a client is associated with an app version,
…
and some number of instances of a particular coprocessor type.
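Returning to Example 1 above, the target split can be checked by a quick calculation (a sketch; the speed units are arbitrary, assuming only that the GPU delivers twice the processing rate of the CPU):

```python
# Example 1 check: with the GPU twice as fast as the CPU,
# giving B the whole CPU plus 25% of the GPU, and A 75% of the GPU,
# yields equal total processing for the two projects.
cpu_speed = 1.0          # arbitrary units of processing per second
gpu_speed = 2.0          # GPU is twice as fast as the CPU

processing_A = 0.75 * gpu_speed                      # A: 75% of the GPU
processing_B = 1.00 * cpu_speed + 0.25 * gpu_speed   # B: all of the CPU + 25% of the GPU

assert processing_A == processing_B == 1.5  # equal shares, as intended
```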
== Scheduler request and reply message ==

…

'''double req_instances''':: send enough jobs to occupy this many instances

The semantics: a scheduler should send jobs for a resource type
only if the request for that type is nonzero.

For compatibility with old servers, the message still has '''work_req_seconds''',
which is the max of the req_seconds.

== Per-resource-type backoff ==

We need to handle the situation where e.g. there's a GPU shortfall
…
we may ask it for resource B as well, even if it's backed off for B.

== Long-term debt ==

We continue to use the idea of '''long-term debt''' (LTD),
…

* There is a separate LTD for each resource type
* The "overall LTD", used in the work-fetch decision, is the sum of the resource LTDs, weighted by the speed of the resource (FLOPs per instance-second).

Per-resource LTD is maintained as follows:
…

== Client data structures ==

=== RSC_WORK_FETCH ===

Work-fetch state for a particular resource type.
Data members:

'''ninstances''':: number of instances of this resource type
…
'''double nidle''':: number of currently idle instances

Member functions:

'''rr_init()''':: called at the start of RR simulation. Compute project shares for this PRSC, and clear overall and per-project shortfalls.
'''set_nidle()''':: called by RR sim after initial job assignment.
Set nidle to # of idle instances.
'''accumulate_shortfall()''':: called by RR sim for each time interval during work buf period.
{{{
shortfall += dt*(ninstances - instances in use)
…
}}}

=== RSC_PROJECT_WORK_FETCH ===

State for a (resource type, project) pair.
It has the following "persistent" members (i.e., saved in state file):

'''backoff_interval''':: how long to wait before asking this project for work specifically for this PRSC;
double this any time we ask for work for this resource and get none (maximum 24 hours). Clear it when we ask for work for this PRSC and get some job.
'''backoff_time''':: back off until this time
'''debt''':: long-term debt

And the following transient members (used by rr_simulation()):

'''double runnable_share''':: # of instances this project should get based on resource share
relative to the set of projects not backed off for this PRSC.
'''instances_used''':: # of instances currently being used

=== PROJECT_WORK_FETCH ===

Per-project work-fetch state.
Members:

'''overall_debt''':: weighted sum of per-resource debts

=== WORK_FETCH ===

Overall work-fetch state.
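The two accumulation rules above can be sketched as runnable Python (a simplified model; the resource names and speed values are hypothetical, and "speed" is FLOPs per instance-second as described under "Long-term debt"):

```python
# Per-resource shortfall accumulation, as in RSC_WORK_FETCH above:
# idle capacity (instances not in use) times the interval length.
def accumulate_shortfall(shortfall, dt, ninstances, instances_in_use):
    return shortfall + dt * (ninstances - instances_in_use)

# Overall debt, as in PROJECT_WORK_FETCH above: per-resource debts
# weighted by resource speed (FLOPs per instance-second).
def overall_debt(debts, speeds):
    return sum(debts[r] * speeds[r] for r in debts)

# 100-second interval, 2 GPUs of which 1 is busy:
# shortfall grows by 100 instance-seconds.
s = accumulate_shortfall(0.0, dt=100.0, ninstances=2, instances_in_use=1)
assert s == 100.0

# Hypothetical speeds: the GPU delivers twice the CPU's FLOPs
# per instance-second, so GPU debt counts double in the sum.
debts = {"cpu": 10.0, "cuda": -5.0}
speeds = {"cpu": 1.0, "cuda": 2.0}
assert overall_debt(debts, speeds) == 0.0
```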
=== Pseudo-code ===

The top-level function is:
{{{
WORK_FETCH::choose_project()
    rr_simulation()
    …
}}}

Debt accounting:
{{{
for each resource type R
    for each project P
        if P is not backed off for R
            P.R.LTD += share
for each running job J, project P
    for each resource R used by J
        P.R.LTD -= share*dt
}}}

=== RR simulation ===

{{{
cpu_work_fetch.rr_init()
cuda_work_fetch.rr_init()

compute initial assignment of jobs
cpu_work_fetch.set_nidle();
cuda_work_fetch.set_nidle();

do simulation as current
on completion of an interval dt
    cpu_work_fetch.accumulate_shortfall(dt)
    cuda_work_fetch.accumulate_shortfall(dt)
}}}

=== Work fetch ===

=== Handling scheduler reply ===

…

The idea of using RAC as a surrogate for LTD was discussed and set aside for various reasons.

This design does not accommodate:

* jobs that use more than one coprocessor type
* jobs that change their resource usage dynamically (e.g. coprocessor jobs that decide to use the CPU instead).
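The debt-accounting pseudo-code can be sketched as runnable Python. This is a simplified model: the project and job structures are hypothetical, and the accrual line `P.R.LTD += share` is read as per-interval (`share * dt`) so that accrual and payback use the same units, matching the `-= share*dt` line.

```python
def update_debts(projects, running_jobs, dt):
    """One debt-accounting pass over an interval of length dt.

    projects: {name: {"share": x, "backed_off": set of resource names,
                      "ltd": {resource name: debt}}}
    running_jobs: list of (project name, list of resources used) pairs.
    """
    # Accrual: each project not backed off for a resource accrues LTD
    # for that resource in proportion to its resource share.
    for p in projects.values():
        for rsc in p["ltd"]:
            if rsc not in p["backed_off"]:
                p["ltd"][rsc] += p["share"] * dt
    # Payback: each running job reduces the project's LTD
    # for each resource the job uses.
    for name, resources in running_jobs:
        p = projects[name]
        for rsc in resources:
            p["ltd"][rsc] -= p["share"] * dt

projects = {
    "A": {"share": 0.5, "backed_off": set(), "ltd": {"cpu": 0.0, "cuda": 0.0}},
    "B": {"share": 0.5, "backed_off": {"cuda"}, "ltd": {"cpu": 0.0, "cuda": 0.0}},
}
# A runs a CPU job this interval; B runs nothing.
update_debts(projects, running_jobs=[("A", ["cpu"])], dt=10.0)

assert projects["A"]["ltd"]["cpu"] == 0.0   # accrued 5, paid back 5
assert projects["A"]["ltd"]["cuda"] == 5.0  # accrued, nothing running
assert projects["B"]["ltd"]["cpu"] == 5.0   # accrued, nothing running
assert projects["B"]["ltd"]["cuda"] == 0.0  # backed off: no accrual
```

Note how the backoff check keeps a project that is backed off for a resource from accumulating debt for it, which is what lets Example 3's project B start sharing the GPU immediately once it gets a GPU app.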