Changes between Initial Version and Version 1 of MultiSize


Timestamp:
Apr 24, 2013, 8:41:15 PM (11 years ago)
Author:
davea
Comment:

--

  • MultiSize

    v1 v1  
     1= Multi-size apps =
     2
     3The difference in throughput between a slow processor
     4(e.g. an Android device that runs infrequently)
     5and a fast processor (e.g. a GPU that's always on)
     6can be a factor of 1,000 or more.
     7Having a single job size can therefore present problems:
     8
     9 * If the size is small, hosts with GPUs get huge numbers of jobs.
     10   This causes performance problems on the client
     11   and a high DB load on the server.
     12 * If the size is large, slow hosts can't get jobs,
     13   or they get jobs that take weeks to finish.
     14
     15To address this, BOINC provides a mechanism
     16that tries to match large jobs to fast devices.
     17
     18== How it works ==
     19
     20A '''multi-size application''' has a set of N '''size classes''', 0 ... N-1.
     21Each job belongs to a size class.
     22Jobs of size class i are smaller than those of size class i+1.
     23You decide how many size classes to have,
     24and how large the jobs of a given size class are.
     25
     26The BOINC scheduler maintains statistics about the "effective speed" of devices for each multi-size app,
     27where effective speed is the device speed times host availability.
     28In particular, it computes and maintains the boundaries of the N quantiles.
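As a rough illustration of the statistics described above, the following C++ sketch computes effective speeds and the boundaries that split them into N quantiles. It is not the scheduler's actual code: `HostSample`, `effective_speed` and `quantile_boundaries` are hypothetical names, and the simple rank-based rule used to pick each boundary is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-device sample; not a BOINC data structure.
struct HostSample {
    double device_flops;   // speed of the device
    double availability;   // fraction of time the host is available, 0..1
};

// Effective speed = device speed times host availability.
double effective_speed(const HostSample& h) {
    return h.device_flops * h.availability;
}

// Return the N-1 boundaries that split the samples into N quantiles.
// boundaries[k] is the lowest effective speed that falls in quantile k+1.
std::vector<double> quantile_boundaries(const std::vector<HostSample>& hosts,
                                        int n_size_classes) {
    assert(n_size_classes > 0 && !hosts.empty());
    std::vector<double> speeds;
    speeds.reserve(hosts.size());
    for (const HostSample& h : hosts) speeds.push_back(effective_speed(h));
    std::sort(speeds.begin(), speeds.end());

    std::vector<double> boundaries;
    for (int k = 1; k < n_size_classes; ++k) {
        // Index of the first sample in quantile k (assumed rank rule).
        std::size_t idx = speeds.size() * k / n_size_classes;
        boundaries.push_back(speeds[idx]);
    }
    return boundaries;
}
```

With six devices and three size classes, for example, the two boundaries are the third and fifth smallest effective speeds.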
     29
     30When a host requests work for a particular device,
     31the scheduler computes its quantile for each multi-size application.
     32It preferentially sends it jobs of the corresponding size class.
     33If it must send jobs of a different size class, it prefers smaller classes.
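The selection rule above can be sketched as follows. This is a minimal illustration, not the scheduler's implementation: the function names are hypothetical, and the exact fallback order (the device's own class, then smaller classes from nearest to farthest, then larger ones) is an assumption that goes beyond what the text states.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Map an effective speed to a size class, given the sorted N-1 quantile
// boundaries (hypothetical helper). Speeds below boundaries[0] get class 0;
// a speed equal to a boundary falls in the class above it.
int size_class_for(double eff_speed, const std::vector<double>& boundaries) {
    assert(std::is_sorted(boundaries.begin(), boundaries.end()));
    auto it = std::upper_bound(boundaries.begin(), boundaries.end(), eff_speed);
    return static_cast<int>(it - boundaries.begin());
}

// Order in which to try size classes for a device whose own class is
// `preferred`: its own class first, then smaller classes, then larger ones.
std::vector<int> class_preference(int preferred, int n_size_classes) {
    assert(preferred >= 0 && preferred < n_size_classes);
    std::vector<int> order;
    order.push_back(preferred);
    for (int c = preferred - 1; c >= 0; --c) order.push_back(c);
    for (int c = preferred + 1; c < n_size_classes; ++c) order.push_back(c);
    return order;
}
```

For a device in class 1 of 3, this yields the order 1, 0, 2: the matching class, then the smaller one, and the larger one only as a last resort.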
     34
     35== Set up the application ==
     36
     37To make an app multi-size, set the '''n_size_classes''' field of its database entry.
     38Currently this must be done manually, e.g.
     39{{{
     40update app set n_size_classes=3 where id=14;
     41}}}
     42
     43== Job creation ==
     44
     45Set the size class of jobs as you create them.
     46From C++:
     47{{{
     48...
     49wu.size_class = 2;
     50ret = create_work(wu, ...);
     51}}}
     52From scripts or command line:
     53{{{
     54create_work ... --size_class 2
     55}}}
     56
     57Don't forget to set wu.rsc_fpops_est and wu.rsc_fpops_bound appropriately as well.
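One way to keep these estimates in step with the size class is to derive both from a single rule. The sketch below assumes that jobs grow by a fixed factor from one class to the next and that the bound is a fixed multiple of the estimate; both are illustrative assumptions, and `JobEstimate` and `estimate_for_class` are hypothetical names, not part of the BOINC API. The resulting values would be copied into `wu.rsc_fpops_est` and `wu.rsc_fpops_bound` before calling `create_work`.

```cpp
#include <cassert>

// Hypothetical holder for the two per-job estimates.
struct JobEstimate {
    double rsc_fpops_est;    // estimated floating-point operations
    double rsc_fpops_bound;  // upper bound on floating-point operations
};

// Estimate for a job of the given size class, assuming class i+1 jobs are
// `growth` times larger than class i jobs (an assumption of this sketch).
JobEstimate estimate_for_class(int size_class, double base_fpops,
                               double growth, double bound_factor) {
    assert(size_class >= 0 && base_fpops > 0 && growth >= 1 && bound_factor >= 1);
    double est = base_fpops;
    for (int i = 0; i < size_class; ++i) est *= growth;
    return JobEstimate{est, est * bound_factor};
}
```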
     58
     59== Daemon configuration ==
     60
     61Arrange to periodically run '''size_census''',
     62which computes effective speed statistics:
     63{{{
     64    <task>
     65      <cmd>run_in_ops size_census</cmd>
     66      <output>size_census.out</output>
     67      <period>24 hour</period>
     68    </task>
     69}}}
     70
     71For each multi-size app, you must run a '''size_regulator''' daemon
     72that regulates the flow of jobs into the shared-memory job cache,
     73making sure that the cache doesn't get clogged with jobs of a single size:
     74{{{
     75    <daemon>
     76      <cmd>size_regulator --app_name uppercase --lo 10 --hi 30 --sleep_time 10</cmd>
     77      <output>size_regulator_uppercase.out</output>
     78      <pid_file>size_regulator_uppercase.pid</pid_file>
     79      <disabled>1</disabled>
     80    </daemon>
     81}}}
     82
     83The command-line options of size_regulator are
     84
     85 --app_name :: name of the application
     86 --lo :: keep at least this many jobs of each size class in cache
     87 --hi :: keep at most this many jobs of each size class in cache
     88 --sleep_time :: sleep this long if nothing to do
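
The roles of `--lo` and `--hi` can be illustrated with a small sketch. It assumes that, whenever a size class has fewer than `lo` jobs available, the regulator releases enough held-back jobs to bring that class up to `hi`; this is a reading of the option descriptions above, not code taken from `size_regulator`, and `jobs_to_release` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// available[i] is the number of jobs of size class i currently available.
// Returns, for each size class, how many additional jobs to release.
std::vector<int> jobs_to_release(const std::vector<int>& available,
                                 int lo, int hi) {
    assert(lo >= 0 && lo <= hi);
    std::vector<int> release(available.size(), 0);
    for (std::size_t i = 0; i < available.size(); ++i) {
        // Top up to `hi` only when the class has dropped below `lo`.
        if (available[i] < lo) release[i] = hi - available[i];
    }
    return release;
}
```

With `--lo 10 --hi 30` and available counts of 5, 10 and 40, only the first class is topped up, by 25 jobs. If every entry is zero there is nothing to do, which is the case `--sleep_time` covers.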
     89
     90The following options correspond to those for '''feeder'''; use the same ones:
     91
     92 --random_order ::
     93 --priority_asc ::
     94 --priority_order ::
     95 --priority_order_create_time ::