== Problems with the current work fetch policy ==

The current work-fetch policy is essentially:
* Do a weighted round-robin simulation, computing overall CPU shortfall
* If there's a shortfall, request work from the project with highest LTD

The scheduler request has a single number "work_req_seconds"
indicating the total duration of jobs being requested.

This policy has various problems.

* There's no way for the client to say "I have N idle CPUs, so send me enough jobs to use them all".

And many problems related to GPUs:

* There may be no CPU shortfall, but GPUs are idle; no work will be fetched.

* If a GPU is idle, we should get work from a project that potentially has jobs for it.

* If a project has both CPU and GPU jobs, we may need to tell it to send only GPU (or only CPU) jobs.

* LTD is computed solely on the basis of CPU time used, so it doesn't provide a meaningful comparison between projects that use only GPUs, or between a GPU project and a CPU project.

This document proposes a work-fetch system that solves these problems.

For simplicity, the design assumes that there is only one GPU type (CUDA).
It is straightforward to extend the design to handle additional GPU types.

== Terminology ==

A job sent to a client is associated with an app version,
which uses some number (possibly fractional) of CPUs and CUDA devices.

* A '''CPU job''' is one that uses only CPU.
* A '''CUDA job''' is one that uses CUDA (and may use CPU as well).

== Scheduler request ==

New fields in the scheduler request message:

'''double cpu_req_seconds''': number of CPU seconds requested

'''double cuda_req_seconds''': number of CUDA seconds requested

'''double ninstances_cpu''': send enough jobs to occupy this many CPUs

'''double ninstances_cuda''': send enough jobs to occupy this many CUDA devices

For compatibility with old servers, the message still has '''work_req_seconds''';
this is the max of cpu_req_seconds and cuda_req_seconds.

== Client ==

New abstraction: '''processing resource''' or PRSC.
There are two processing resource types: CPU and CUDA.

Each PRSC has its own