= Application planning =

'''Application planning''' is a mechanism that lets the scheduler decide,
using project-supplied logic,
whether an application can run on a particular host,
and if so, which resources it will use and how fast it will run.
It works as follows.

Each app_version record in the server database has a string field '''plan_class'''.
It identifies the range of processing resources that the application
requires and is able to use.
You can define plan classes however you like:
for example, "cuda_1_1" for apps that require a CUDA-enabled GPU,
or "mt32" for a multithreaded app able to use 32 CPUs.

The scheduler is linked with a project-supplied function
{{{
bool app_plan(HOST&, char* plan_class, HOST_USAGE&);
}}}
The HOST argument describes the host's CPU(s)
and includes a field 'coprocs' listing its coprocessors.

When called with a particular HOST and plan class,
the function returns true if the host's resources are sufficient for apps of that class.
In that case it also populates the HOST_USAGE structure:
{{{
struct HOST_USAGE {
    COPROCS coprocs;    // coprocessors used by the app (name and count)
    double ncpus;       // number of CPUs used by the app (may be fractional)
    double flops;       // estimated FLOPS
    char opaque[256];   // passed to the app in init_data.xml
};
}}}

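As an illustration, a project-supplied app_plan() for the two example plan classes might look like the sketch below. This is a sketch under stated assumptions, not the scheduler's actual code: the HOST, COPROC and COPROCS definitions are simplified stand-ins, the field names p_ncpus and p_fpops and the lookup() helper are assumed, and the CPU fraction and speedup factor used for the GPU case are invented placeholders that a project would replace with its own estimates.

```cpp
#include <algorithm>
#include <cstring>
#include <string>
#include <vector>

// Simplified stand-ins for the scheduler's types (assumptions for this sketch).
struct COPROC {
    std::string name;   // coprocessor type, e.g. "CUDA"
    int count;          // number of instances
};

struct COPROCS {
    std::vector<COPROC> coprocs;

    // Hypothetical helper: find a coprocessor by name, or return nullptr.
    const COPROC* lookup(const std::string& name) const {
        for (const COPROC& c : coprocs) {
            if (c.name == name) return &c;
        }
        return nullptr;
    }
};

struct HOST {
    int p_ncpus;        // number of CPUs (assumed field name)
    double p_fpops;     // FLOPS of one CPU (assumed field name)
    COPROCS coprocs;    // coprocessors present on the host
};

struct HOST_USAGE {
    COPROCS coprocs;    // coprocessors used by the app (name and count)
    double ncpus;       // number of CPUs used by the app (may be fractional)
    double flops;       // estimated FLOPS
    char opaque[256];   // passed to the app in init_data.xml
};

bool app_plan(HOST& host, char* plan_class, HOST_USAGE& hu) {
    if (!std::strcmp(plan_class, "mt32")) {
        // Multithreaded app: use as many CPUs as the host has, up to 32.
        hu.coprocs.coprocs.clear();
        hu.ncpus = std::min(host.p_ncpus, 32);
        hu.flops = hu.ncpus * host.p_fpops;
        hu.opaque[0] = '\0';
        return true;
    }
    if (!std::strcmp(plan_class, "cuda_1_1")) {
        // GPU app: the host must have at least one CUDA coprocessor.
        const COPROC* gpu = host.coprocs.lookup("CUDA");
        if (!gpu || gpu->count < 1) return false;
        hu.coprocs.coprocs = { COPROC{"CUDA", 1} };
        hu.ncpus = 0.1;                    // placeholder: CPU fraction needed to feed the GPU
        hu.flops = 10.0 * host.p_fpops;    // placeholder: assumed 10x speedup over one CPU
        hu.opaque[0] = '\0';
        return true;
    }
    return false;   // unknown plan class: do not send this version
}
```

Returning false for an unrecognized plan class is a conservative choice in this sketch: the corresponding app version is simply never selected for the host.
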
When deciding whether to send a job to a host,
the scheduler examines all latest-version app_versions for the platform,
calls '''app_plan()''' for each,
and, among those for which it returns true, selects the one whose flops value is greatest.

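The selection step can be sketched as follows. This is a minimal illustration, not the scheduler's code: the APP_VERSION, HOST and HOST_USAGE types are reduced stand-ins, and pick_best_version() and the PlanFn callback type are hypothetical names introduced here so that the example stays self-contained.

```cpp
#include <functional>
#include <string>
#include <vector>

// Reduced stand-ins for the scheduler's types (assumptions for this sketch).
struct HOST {
    int p_ncpus;        // number of CPUs (assumed field name)
    double p_fpops;     // FLOPS of one CPU (assumed field name)
    bool has_cuda;      // simplification of the 'coprocs' list
};

struct HOST_USAGE {
    double ncpus;       // number of CPUs used by the app
    double flops;       // estimated FLOPS
};

struct APP_VERSION {
    std::string plan_class;
    HOST_USAGE usage;   // filled in for the version that gets selected
};

// Hypothetical callback standing in for the project-supplied app_plan().
using PlanFn = std::function<bool(const HOST&, const std::string&, HOST_USAGE&)>;

// Return the usable version with the greatest estimated FLOPS,
// or nullptr if no version can run on this host.
APP_VERSION* pick_best_version(const HOST& host,
                               std::vector<APP_VERSION>& versions,
                               const PlanFn& plan) {
    APP_VERSION* best = nullptr;
    for (APP_VERSION& av : versions) {
        HOST_USAGE hu{};
        if (!plan(host, av.plan_class, hu)) continue;   // host cannot run this version
        if (!best || hu.flops > best->usage.flops) {
            av.usage = hu;
            best = &av;
        }
    }
    return best;
}
```

Ties are resolved in favor of the first version examined; the source does not say how the scheduler breaks ties, so that detail is an arbitrary choice here.
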
The scheduler reply includes, for each app version, an XML encoding of HOST_USAGE.

The client keeps track of coprocessor allocation, i.e. how many instances of each coprocessor are free.
It runs an app only if enough instances are available.

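One way to picture this bookkeeping is a table of free instance counts per coprocessor type, with an all-or-nothing reservation when an app starts. The sketch below is only an illustration: the CoprocPool class and its methods are hypothetical and do not correspond to the client's actual data structures.

```cpp
#include <map>
#include <string>

// Hypothetical bookkeeping of free coprocessor instances on the client.
class CoprocPool {
public:
    // Register 'count' instances of a coprocessor type, e.g. add("CUDA", 2).
    void add(const std::string& name, int count) { free_[name] += count; }

    // Number of currently free instances of the given type (0 if unknown).
    int available(const std::string& name) const {
        auto it = free_.find(name);
        return it == free_.end() ? 0 : it->second;
    }

    // Reserve everything an app needs, or nothing at all.
    // Returns false (and changes nothing) if any type is short.
    bool reserve(const std::map<std::string, int>& needed) {
        for (const auto& kv : needed) {
            if (available(kv.first) < kv.second) return false;
        }
        for (const auto& kv : needed) {
            free_[kv.first] -= kv.second;
        }
        return true;
    }

    // Return instances to the pool when the app stops running.
    void release(const std::map<std::string, int>& used) {
        for (const auto& kv : used) {
            free_[kv.first] += kv.second;
        }
    }

private:
    std::map<std::string, int> free_;
};
```

Checking every requirement before subtracting anything keeps the counts consistent when an app needs several coprocessor types and only some of them are free.
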
The client uses app_version.usage.flops to estimate job completion times.
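
In its simplest form, such an estimate divides the amount of computation a job needs by the app version's estimated FLOPS. The sketch below assumes that each job carries an estimate of its floating-point operation count (called fpops_est here, a name chosen for this example); the helper function is hypothetical and ignores everything else the client may take into account.

```cpp
// Hypothetical helper, not the client's actual code.
// fpops_est: assumed estimate of the floating-point operations the job needs.
// flops:     the app version's estimated FLOPS, as reported in HOST_USAGE.
// Returns the estimated run time in seconds,
// or a negative value if flops is not positive.
double estimate_runtime_seconds(double fpops_est, double flops) {
    if (flops <= 0.0) return -1.0;
    return fpops_est / flops;
}
```

For example, a job estimated at 3.6e12 floating-point operations, run by an app version with flops = 1e9, would be expected to take about 3600 seconds.
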