Multi-size apps
The difference in throughput between a slow processor (e.g. an Android device that runs infrequently) and a fast processor (e.g. a GPU that's always on) can be a factor of 1,000 or more. Having a single job size can therefore present problems:
- If the size is small, hosts with GPUs get huge numbers of jobs. This causes performance problems on the client and a high DB load on the server.
- If the size is large, slow hosts can't get jobs, or they get jobs that take weeks to finish.
To address this, BOINC provides a mechanism that tries to send large jobs to fast devices and small jobs to slow devices.
How it works
A multi-size application has a set of N size classes, 0 ... N-1. Each job belongs to a size class. Jobs of size class i are smaller than those of size class i+1. You decide how many size classes to have, and how large the jobs of a given size class are.
A size_census script periodically computes statistics about the "effective speed" of devices for each multi-size app, where effective speed is the device speed times host availability. In particular, it computes and maintains the boundaries of the N quantiles.
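The census step can be pictured with a minimal sketch. It assumes one simple quantile convention (equal-count groups over the sorted speeds); size_census.php may compute its boundaries differently. The function and parameter names here are illustrative and not part of BOINC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Effective speed as defined above: device speed times host availability.
// `flops` and `available_frac` are illustrative parameter names.
double effective_speed(double flops, double available_frac) {
    return flops * available_frac;
}

// Return the N-1 interior boundaries that split the sorted speeds into
// N roughly equal-sized groups (assumed convention, not BOINC's own code).
std::vector<double> quantile_boundaries(std::vector<double> speeds, int n_classes) {
    assert(n_classes >= 1 && !speeds.empty());
    std::sort(speeds.begin(), speeds.end());
    std::vector<double> bounds;
    for (int i = 1; i < n_classes; i++) {
        // Index of the first element of group i in the sorted list.
        std::size_t idx = speeds.size() * i / n_classes;
        if (idx >= speeds.size()) idx = speeds.size() - 1;
        bounds.push_back(speeds[idx]);
    }
    return bounds;
}
```

With three size classes, this yields two boundaries: devices below the first fall in class 0, devices at or above the second fall in class 2.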
When a host requests work for a particular device, the scheduler computes its quantile for each multi-size application. It preferentially sends it jobs of the corresponding size class. If it must send jobs of a different size class, it prefers smaller classes.
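The matching rule can be sketched as follows. This is an illustration only, not the scheduler's actual code: it assumes the boundaries are a sorted list of N-1 speeds, and that among non-matching classes the nearest smaller class is tried first, with larger classes as a last resort. The text above only states that smaller classes are preferred, so the exact ordering is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Size class for a device: the number of quantile boundaries that its
// effective speed meets or exceeds. `bounds` must be sorted ascending.
int size_class_for(double speed, const std::vector<double>& bounds) {
    return (int)(std::upper_bound(bounds.begin(), bounds.end(), speed)
                 - bounds.begin());
}

// Order in which to try size classes for a host in class `host_class`:
// the matching class, then smaller classes (nearest first, an assumption),
// then larger classes.
std::vector<int> class_preference(int host_class, int n_classes) {
    assert(host_class >= 0 && host_class < n_classes);
    std::vector<int> order;
    order.push_back(host_class);
    for (int c = host_class - 1; c >= 0; c--) order.push_back(c);
    for (int c = host_class + 1; c < n_classes; c++) order.push_back(c);
    return order;
}
```

For example, with boundaries {3, 5} a device of effective speed 4 lands in class 1, and its jobs would be tried in the order 1, 0, 2.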
Set up the application
To make an app multi-size, set the n_size_classes field of its database entry. Currently this must be done manually, e.g.
update app set n_size_classes=3 where id=14;
Job creation
Set the size class of jobs as you create them. From C++:
...
wu.size_class = 2;
ret = create_work(wu, ...);
From scripts or command line:
create_work ... --size_class 2
Don't forget to set wu.rsc_fpops_est and wu.rsc_fpops_bound appropriately as well.
You may want your work generator to maintain a supply of jobs of each size class. To find the number of unsent jobs of a given size class, use
int count_unsent_results(int&, int appid, int size_class);
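A work generator might use this to top up each size class to a target level. The sketch below is a hypothetical helper, not BOINC code: `jobs_needed` and `target` are invented names, and the `count_unsent` callback stands in for a call to count_unsent_results() with your app ID bound in, so that the logic can be shown without a project database. It assumes a nonzero return value signals an error.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: for each size class, how many new jobs to create so
// that at least `target` unsent jobs exist. `count_unsent` is a stand-in
// for count_unsent_results(); it fills `n` and returns nonzero on error.
std::vector<int> jobs_needed(
    int n_size_classes, int target,
    const std::function<int(int& n, int size_class)>& count_unsent
) {
    assert(n_size_classes > 0 && target >= 0);
    std::vector<int> needed(n_size_classes, 0);
    for (int sc = 0; sc < n_size_classes; sc++) {
        int n = 0;
        if (count_unsent(n, sc) != 0) continue;  // skip this class on error
        if (n < target) needed[sc] = target - n;
    }
    return needed;
}
```

In a real work generator, the callback could be a lambda that calls count_unsent_results(n, app.id, size_class), and the result would drive how many jobs of each class to pass to create_work().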
Daemon configuration
The script size_census.php computes effective speed statistics for multi-size apps, and writes them to flat files (named size_census_APPNAME) in the project directory. Arrange to run it periodically by putting the following in your config.xml:
<task>
    <cmd>run_in_ops size_census.php</cmd>
    <output>size_census.out</output>
    <period>24 hour</period>
</task>
If you run the script with the --all_apps option, it will compute the statistics of all apps, not just multi-size ones. This is useful when you're getting things set up.
For each multi-size app, you must run a size_regulator daemon, which regulates the flow of jobs into the shared-memory job cache so that the cache doesn't get clogged with jobs of a single size:
<daemon>
    <cmd>size_regulator --app_name uppercase --lo 10 --hi 30 --sleep_time 10</cmd>
    <output>size_regulator_uppercase.out</output>
    <pid_file>size_regulator_uppercase.pid</pid_file>
    <disabled>1</disabled>
</daemon>
The command-line options of size_regulator are
- --app_name: name of the application
- --lo: keep at least this many jobs of each size class in cache
- --hi: keep at most this many jobs of each size class in cache
- --sleep_time: sleep this long if nothing to do
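One plausible reading of --lo and --hi is a low/high watermark policy: when the number of available jobs in a size class drops below the low mark, release enough to bring it back up to the high mark. The sketch below illustrates that reading only; the function name is hypothetical and the daemon's actual logic may differ.

```cpp
#include <cassert>

// Hypothetical helper: number of jobs of one size class to release into the
// cache, given how many are currently available. Assumes a simple low/high
// watermark policy; this is not taken from the size_regulator source.
int jobs_to_release(int n_in_cache, int lo, int hi) {
    assert(lo <= hi);
    if (n_in_cache >= lo) return 0;  // enough jobs already; do nothing
    return hi - n_in_cache;          // refill up to the high watermark
}
```

Under this reading, with --lo 10 --hi 30 a class holding 5 jobs would receive 25 more, while a class holding 10 or more would be left alone until it drops below 10.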
The following options correspond to those of the feeder; use the same ones that you pass to the feeder.
- --random_order
- --priority_asc
- --priority_order
- --priority_order_create_time
Configuration
To use this feature, you must include the following in your config.xml:
<job_size_matching>1</job_size_matching>