Changes between Initial Version and Version 1 of MultiSize

Timestamp:: Apr 24, 2013, 8:41:15 PM (12 years ago)
Author:: davea
Comment:: --

MultiSize

                       v1
+= Multi-size apps =
+The difference in throughput between a slow processor
+(e.g. an Android device that runs infrequently)
+and a fast processor (e.g. a GPU that's always on)
+can be a factor of 1,000 or more.
+Having a single job size can therefore present problems:
+ * If the size is small, hosts with GPUs get huge numbers of jobs.
+   This causes performance problems on the client
+   and a high DB load on the server.
+ * If the size is large, slow hosts can't get jobs,
+   or they get jobs that take weeks to finish.
+To address this, BOINC provides a mechanism
+that tries to match large jobs to fast devices.
+== How it works ==
+A '''multi-size application''' has a set of N '''size classes''', 0 ... N-1.
+Each job belongs to a size class.
+Jobs of size class i are smaller than those of size class i+1.
+You decide how many size classes to have,
+and how large the jobs of a given size class are.
+The BOINC scheduler maintains statistics about the "effective speed" of devices for each multi-size app,
+where effective speed is the device speed times host availability.
+In particular, it computes and maintains the boundaries of the N quantiles.
+When a host requests work for a particular device,
+the scheduler computes its quantile for each multi-size application.
+It preferentially sends it jobs of the corresponding size class.
+If it must send jobs of a different size class, it prefers smaller classes.
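The matching rule described above can be sketched roughly as follows. This is an illustration only, not the scheduler's actual code: the function names, the representation of the quantile boundaries as a sorted vector of N-1 values, and the choice to try larger classes last (nearest first in each direction) are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Effective speed as defined in the text: device speed times host availability.
// Units are illustrative (e.g. FLOPS and an availability fraction in [0, 1]).
double effective_speed(double device_flops, double availability) {
    return device_flops * availability;
}

// Hypothetical helper: given the N-1 interior quantile boundaries
// (sorted ascending), return the size class 0..N-1 for a device.
int size_class_for(double eff_speed, const std::vector<double>& boundaries) {
    assert(std::is_sorted(boundaries.begin(), boundaries.end()));
    // Count how many boundaries are <= eff_speed; that count is the class index.
    auto it = std::upper_bound(boundaries.begin(), boundaries.end(), eff_speed);
    return static_cast<int>(it - boundaries.begin());
}

// Order in which size classes are tried: the matching class first,
// then smaller classes, then (as an assumed last resort) larger ones.
std::vector<int> preference_order(int own_class, int n_classes) {
    assert(own_class >= 0 && own_class < n_classes);
    std::vector<int> order;
    order.push_back(own_class);
    for (int c = own_class - 1; c >= 0; --c) order.push_back(c);
    for (int c = own_class + 1; c < n_classes; ++c) order.push_back(c);
    return order;
}
```

With three size classes and boundaries at 10 and 100 (arbitrary units), a device with effective speed 50 falls in class 1 and would be offered class 1 jobs first, then class 0.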
+== Set up the application ==
+To make an app multi-size, set the '''n_size_classes''' field of its database entry.
+Currently this must be done manually, e.g.
+{{{
+update app set n_size_classes=3 where id=14;
+}}}
+== Job creation ==
+Set the size class of jobs as you create them.
+From C++:
+{{{
+...
+wu.size_class = 2;
+ret = create_work(wu, ...);
+}}}
+From scripts or command line:
+{{{
+create_work ... --size_class 2
+}}}
+Don't forget to set wu.rsc_fpops_est and wu.rsc_fpops_bound appropriately as well.
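One possible way to keep the FLOPs estimate and bound consistent with the size class is to derive all three from a single helper. The sketch below is hypothetical: `JobParams` is a stand-in for the three workunit fields named above (it is not BOINC's workunit structure), and the geometric growth between classes and the bound factor are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for the workunit fields discussed above. A real work generator
// would copy these values into its workunit before calling create_work().
struct JobParams {
    int size_class;
    double rsc_fpops_est;    // estimated FLOPs for the job
    double rsc_fpops_bound;  // upper bound on FLOPs for the job
};

// Assumed policy: class 0 jobs need base_fpops operations, each further
// class is `growth` times larger, and the bound is a fixed multiple of
// the estimate.
JobParams params_for_class(int size_class, double base_fpops,
                           double growth, double bound_factor) {
    assert(size_class >= 0 && base_fpops > 0);
    assert(growth > 1 && bound_factor >= 1);
    JobParams p;
    p.size_class = size_class;
    p.rsc_fpops_est = base_fpops * std::pow(growth, size_class);
    p.rsc_fpops_bound = p.rsc_fpops_est * bound_factor;
    return p;
}
```

For example, `params_for_class(2, 1e12, 10.0, 10.0)` describes a class 2 job estimated at about 1e14 FLOPs with a bound ten times higher. The actual numbers depend entirely on your application.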
+== Daemon configuration ==
+Arrange to periodically run '''size_census''',
+which computes effective speed statistics:
+{{{
+    <task>
+      <cmd>run_in_ops size_census</cmd>
+      <output>size_census.out</output>
+      <period>24 hour</period>
+    </task>
+}}}
+For each multi-size app, you must run a daemon '''size_regulator'''
+that regulates the flow of jobs into the shared-memory job cache,
+making sure that the cache doesn't get clogged with jobs of a single size:
+{{{
+    <daemon>
+      <cmd>size_regulator --app_name uppercase --lo 10 --hi 30 --sleep_time 10</cmd>
+      <output>size_regulator_uppercase.out</output>
+      <pid_file>size_regulator_uppercase.pid</pid_file>
+      <disabled>1</disabled>
+    </daemon>
+}}}
+The command-line options of size_regulator are:
+ --app_name :: name of the application
+ --lo :: keep at least this many jobs of each size class in cache
+ --hi :: keep at most this many jobs of each size class in cache
+ --sleep_time :: sleep this long if nothing to do
+The following options correspond to those for '''feeder'''; use the same ones:
+ --random_order ::
+ --priority_asc ::
+ --priority_order ::
+ --priority_order_create_time ::
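To make the meaning of --lo and --hi concrete, here is a minimal sketch of one plausible refill policy: when the number of available jobs in a size class drops below lo, release enough jobs to bring it back up to hi. The helper `jobs_to_release` is hypothetical and is not taken from the size_regulator source; the real daemon may behave differently in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of BOINC. cached[i] is the number of jobs
// of size class i currently available; the result says how many more jobs
// of each class to release.
std::vector<int> jobs_to_release(const std::vector<int>& cached, int lo, int hi) {
    assert(0 <= lo && lo <= hi);
    std::vector<int> release(cached.size(), 0);
    for (std::size_t i = 0; i < cached.size(); ++i) {
        // Only top up classes that have fallen below the low-water mark.
        if (cached[i] < lo) {
            release[i] = hi - cached[i];
        }
    }
    return release;
}
```

With `--lo 10 --hi 30`, a size class holding 5 jobs would receive 25 more, while a class holding 10 or more would be left alone until it drains below 10.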