Changes between Version 15 and Version 16 of GpuWorkFetch
Timestamp: Dec 26, 2008, 1:50:45 PM
GpuWorkFetch
--- GpuWorkFetch (v15)
+++ GpuWorkFetch (v16)
@@ -37,5 +37,5 @@
  * A '''CUDA job''' is one that uses CUDA (and may use CPU as well).
 
-== Scheduler request ==
+== Scheduler request and reply message ==
 
 New fields in the scheduler request message:
@@ -52,4 +52,10 @@
 this is the max of (cpu,cuda)_req_seconds.
 
+New fields in the scheduler reply message (these are not currently used):
+
+'''double have_cpu_jobs''': this project sometimes has CPU jobs for this platform (although this reply may not include any).
+
+'''double have_cuda_jobs''': same, for CUDA jobs.
+
 == Client ==
 
@@ -59,10 +65,11 @@
 === Per-resource-type backoff ===
 
-We need to handle the situation where there's a GPU shortfall
+We need to handle the situation where e.g. there's a GPU shortfall
 but no projects are supplying GPU work
 (for either permanent or transient reasons).
 We don't want an overall work-fetch backoff from those projects.
+
 Instead, we maintain a separate backoff timer per (project, PRSC).
-This is doubled whenever we ask for only work of that type and don't get any;
+This is doubled whenever we ask for only work of that type and don't get any work;
 it's cleared whenever we get a job of that type.
 
@@ -85,6 +92,5 @@
 
 '''rr_init()''': called at the start of RR simulation.
-Compute share of each project for this PRSC,
-and clear shortfall.
+Compute project shares for this PRSC, and clear overall and per-project shortfalls.
 
 '''set_nidle()''': called by RR sim after initial job assignment.
@@ -92,7 +98,6 @@
 
 '''accumulate_shortfall(dt)''': called by RR sim for each time interval during work buf period.
-{{{
-nidle_now = ninstances - instances in use
-shortfall += dt*(nidle_now)
+{{{
+shortfall += dt*(ninstances - instances in use)
 for each project p not backed off for this PRSC
     p->PRSC_PROJECT_DATA.accumulate_shortfall(dt)
@@ -102,5 +107,5 @@
 select the best project to request this type of work from.
 It's the project not backed off for this PRSC,
-and for which LTD + this->shortfall is largest
+and for which LTD + p->shortfall is largest
 
 '''accumulate_debt(dt)''':
@@ -162,4 +167,6 @@
 
 {{{
+rr_simulation()
+
 if cuda_work_fetch.nidle
     cpu_work_fetch.shortfall = 0
@@ -195,8 +202,10 @@
 
 === Handling scheduler reply ===
 {{{
 if no jobs returned
     double backoff for each requested PRSC
 else
+    clear backoff for the PRSC of each returned job
+}}}
 == Scheduler changes ==
 {{{
@@ -204,5 +213,5 @@
 have_cpu_app_versions
 have_cuda_app_versions
-per-req vars
+per-request vars
 bool coproc_request
 ncpu_jobs_sending
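
The "Handling scheduler reply" pseudocode completed in v16 (double the backoff for each requested PRSC when no jobs come back, otherwise clear the backoff for the PRSC of each returned job) could be sketched roughly as below. This is an illustrative Python sketch, not the BOINC client code: the names `Backoff`, `Project` and `handle_scheduler_reply`, the resource keys `"cpu"` and `"cuda"`, and the `MIN_BACKOFF` / `MAX_BACKOFF` values are all assumptions made for the example. The page does not state the initial interval or any cap.

```python
from dataclasses import dataclass, field

# Assumed values: the page does not give an initial interval or a cap.
MIN_BACKOFF = 60.0      # seconds, first backoff after a failed request
MAX_BACKOFF = 86400.0   # seconds, upper bound on the interval


@dataclass
class Backoff:
    """Backoff timer for one (project, PRSC) pair; 0 means not backed off."""
    interval: float = 0.0

    def double(self) -> None:
        # Start at MIN_BACKOFF, then double, never exceeding MAX_BACKOFF.
        self.interval = min(max(self.interval * 2, MIN_BACKOFF), MAX_BACKOFF)

    def clear(self) -> None:
        self.interval = 0.0


@dataclass
class Project:
    """Hypothetical stand-in for a project with one timer per resource type."""
    name: str
    backoff: dict = field(
        default_factory=lambda: {"cpu": Backoff(), "cuda": Backoff()}
    )


def handle_scheduler_reply(project, requested, returned_job_types):
    """Apply the reply-handling rule from the page's pseudocode.

    requested: resource types asked for in the request, e.g. ["cuda"]
    returned_job_types: resource type of each job in the reply
    """
    if not returned_job_types:
        # No jobs returned: double the backoff for each requested PRSC.
        for prsc in requested:
            project.backoff[prsc].double()
    else:
        # Jobs returned: clear the backoff for the PRSC of each returned job.
        for prsc in set(returned_job_types):
            project.backoff[prsc].clear()
```

Because the timers are kept per (project, PRSC), an empty reply to a CUDA-only request leaves the same project's CPU timer untouched, which is what avoids the overall work-fetch backoff described under "Per-resource-type backoff".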