Application planning
Application planning is a mechanism that lets the scheduler decide, using project-supplied logic, whether an application is able to run on a particular host, and if so what resources it will use and how fast it will run. It works as follows.
An app_version record (in the server DB) has a character string field plan_class. This identifies the range of processing resources that the application requires and is able to use. You can define these however you like; for example, "cuda_1.1" apps might require a CUDA-enabled GPU, and "mt32" might denote a multithreaded app able to use 32 CPUs.
The scheduler is linked with a project-supplied function:
bool app_plan(HOST&, char* plan_class, HOST_USAGE&);
The HOST argument describes the host's CPU(s), and includes a field 'coprocs' listing its coprocessors.
When called with a particular HOST and plan class, the function returns true if the host's resources are sufficient for apps of that class. If true, it populates the HOST_USAGE structure:
struct HOST_USAGE {
    COPROCS coprocs;    // coprocessors used by the app (name and count)
    double ncpus;       // #CPUs used by app (may be fractional)
    double flops;       // estimated FLOPS
    char opaque[256];   // passed to the app in init_data.xml
};
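To make this concrete, here is a minimal sketch of an app_plan() implementation for the "cuda_1.1" and "mt32" classes mentioned above. The struct definitions are simplified stand-ins for the real BOINC server types, showing only the fields the sketch needs; field names such as p_ncpus and p_fpops, and the numbers chosen for ncpus and flops, are assumptions a project would replace with its own logic and estimates.

#include <algorithm>
#include <cstring>
#include <string>
#include <vector>

// Simplified stand-ins for the BOINC server types used below.
struct COPROC {
    std::string name;   // e.g. "CUDA"
    int count;          // number of instances
};
typedef std::vector<COPROC> COPROCS;

struct HOST {
    double p_ncpus;     // number of CPUs (illustrative field name)
    double p_fpops;     // benchmarked FLOPS per CPU (illustrative field name)
    COPROCS coprocs;    // coprocessors present on the host
};

struct HOST_USAGE {
    COPROCS coprocs;    // coprocessors used by the app (name and count)
    double ncpus;       // #CPUs used by app (may be fractional)
    double flops;       // estimated FLOPS
    char opaque[256];   // passed to the app in init_data.xml
};

bool app_plan(HOST& host, char* plan_class, HOST_USAGE& hu) {
    if (!strcmp(plan_class, "cuda_1.1")) {
        // This class needs a CUDA-enabled GPU.
        for (const COPROC& cp : host.coprocs) {
            if (cp.name == "CUDA") {
                hu.coprocs.push_back({cp.name, 1});  // the app uses one GPU instance
                hu.ncpus = 0.5;                      // plus half a CPU
                hu.flops = 100e9;                    // placeholder speed estimate
                return true;
            }
        }
        return false;   // no suitable GPU: apps of this class can't run here
    }
    if (!strcmp(plan_class, "mt32")) {
        // Multithreaded app that can use up to 32 CPUs.
        double n = std::min(host.p_ncpus, 32.0);
        if (n < 2) return false;                 // not worth a multithreaded version
        hu.ncpus = n;
        hu.flops = 0.9 * n * host.p_fpops;       // assume ~90% parallel efficiency
        return true;
    }
    return false;       // unrecognized plan class
}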
When deciding whether to send a job to a host, the scheduler examines all latest-version app_versions for the platform, calls app_plan() for each, and selects the one for which flops is greatest.
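That selection step can be sketched as follows, continuing with the stand-in types above (APP_VERSION here is a simplified placeholder, not the real server struct):

struct APP_VERSION {
    char plan_class[64];
    HOST_USAGE usage;   // filled in from app_plan() when the version is usable
};

// Return the usable app version with the greatest estimated FLOPS,
// or NULL if none of them can run on this host.
APP_VERSION* select_app_version(HOST& host, std::vector<APP_VERSION>& versions) {
    APP_VERSION* best = NULL;
    for (APP_VERSION& av : versions) {
        HOST_USAGE hu{};
        if (!app_plan(host, av.plan_class, hu)) continue;   // host can't handle this class
        if (!best || hu.flops > best->usage.flops) {
            av.usage = hu;
            best = &av;
        }
    }
    return best;
}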
The scheduler reply includes, for each app version, an XML encoding of HOST_USAGE.
The client keeps track of coprocessor allocation, i.e. how many instances of each are free. It only runs an app if enough instances are available.
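A sketch of that bookkeeping, under the assumption that the client tracks per-type counts of total and in-use instances (the structure and names here are illustrative, not the actual client data structures):

#include <map>
#include <string>

// For each coprocessor type: how many instances exist, how many are in use.
struct COPROC_POOL {
    std::map<std::string, int> total;   // instances present on the host
    std::map<std::string, int> in_use;  // instances allocated to running apps

    // Can an app that needs `count` instances of `name` start now?
    bool can_run(const std::string& name, int count) const {
        auto t = total.find(name);
        int have = (t == total.end()) ? 0 : t->second;
        auto u = in_use.find(name);
        int used = (u == in_use.end()) ? 0 : u->second;
        return have - used >= count;
    }

    void allocate(const std::string& name, int count) { in_use[name] += count; }
    void release(const std::string& name, int count)  { in_use[name] -= count; }
};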
The client uses app_version.usage.flops to estimate job completion times.
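For example, if a workunit's expected computation is known in floating-point operations (rsc_fpops_est is the usual BOINC workunit field for this), the estimate is a simple division; a sketch, with WORKUNIT as a stand-in holding just that one field:

struct WORKUNIT {
    double rsc_fpops_est;   // estimated FLOP count for the job
};

// Estimated job duration on this host, using the FLOPS figure that
// app_plan() produced for the chosen app version.
double estimated_runtime(const WORKUNIT& wu, const HOST_USAGE& usage) {
    return wu.rsc_fpops_est / usage.flops;   // e.g. 1e14 FLOPs / 1e10 FLOPS = 10000 seconds
}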
Notes
- It's not always optimal to use only the version with the highest FLOPS estimate. Suppose there's a GPU version and a multithreaded version; on some machines it might be best to run some of each.