Changes between Version 3 and Version 4 of AutoFlops


Timestamp:
Sep 21, 2009, 11:12:00 AM (15 years ago)
Author:
davea
Comment:

--

  • AutoFlops

    v3 v4  
    1010== App units ==
    1111
    12 '''App units''' are a project-defined measure of how much computation a completed job did.
     12'''App units''' are a project-defined, application-specific measure of computation.
    1313They are typically a count of iterations of the app's main loop.
    14 They should be approximately proportional to FLOPs performed,
     14They should be roughly proportional to FLOPs performed,
    1515but it doesn't matter what the proportion is
    1616(i.e., you don't have to count the FLOPs in your main loop).
     
    3838
    3939The predictions don't have to be exact.
    40 In fact, it's OK if they're systematically too high or low,
    41 as long as there's a linear correlation.
     40In fact, it's OK if they're systematically too high or low.
    4241However, if predicted app units are not linearly correlated with
    4342actual app units, bad completion time estimates will result.
     
    5049== Job completion time estimates and bounds ==
    5150
    52 The BOINC client maintains a per-app-version estimate seconds_per_app_unit;
    53 The completion time estimate of a job J is
    54 {{{
    55 seconds_per_app_unit
     51The BOINC client maintains an estimate '''seconds_per_app_unit(V)'''
     52of elapsed time per app unit for the app version V.
     53It reports this to the scheduler.
     54
     55The scheduler's completion time estimate for a job J
     56using app version V on a given host is
     57{{{
     58seconds_per_app_unit(V)
    5659* J.predicted_aus
    5760* (app.mean_actual_aus/app.mean_predicted_aus + X*au_stdev)
     
    6669== Estimating FLOPS per app unit ==
    6770
    68 For credit-granting purposes
    69 we want to estimate the number of FLOPs per app unit.
     71For credit-granting purposes we want to estimate the number of FLOPs per app unit.
    7072
    7173When the scheduler dispatches a job,
     
    7981The server sends the client peak_flops_per_sec(J).
    8082When the client returns a job, it includes a value
    81 raw_flops_per_sec(J).
     83'''raw_flops_per_sec(J)'''.
    8284This is usually the same as peak_flops_per_sec(J)
    8385but it may be less (see note 2 below).
     
    8587Suppose a job J is executed on a given host using app version V,
    8688and that it reports A app units and uses elapsed time T.
    87 We then define raw_flops(J) as T * raw_flops_per_sec(J).
    88 We define raw_flops_per_au(J) as raw_flops(J)/A.
     89We then define
     90{{{
     91raw_flops(J) = T * raw_flops_per_sec(J)
     92raw_flops_per_au(J) = raw_flops(J)/A
     93}}}
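
The two definitions above can be sketched in Python as follows. This is a minimal illustration, not BOINC code: the function names simply mirror the quantities in the text, the argument names are hypothetical, and the sample numbers are made up.

```python
def raw_flops(elapsed_time, raw_flops_per_sec):
    """raw_flops(J) = T * raw_flops_per_sec(J)."""
    return elapsed_time * raw_flops_per_sec


def raw_flops_per_au(elapsed_time, raw_flops_per_sec, app_units):
    """raw_flops_per_au(J) = raw_flops(J) / A."""
    # Assumption: a job that reports no app units cannot be normalized.
    if app_units <= 0:
        raise ValueError("app_units must be positive")
    return raw_flops(elapsed_time, raw_flops_per_sec) / app_units


# Hypothetical job: T = 1000 s, raw speed 2 GFLOPS, A = 500 app units.
example = raw_flops_per_au(1000.0, 2e9, 500)
```

Because raw_flops_per_au(J) depends on the reported speed of the host, a single job only gives a rough figure; the text goes on to consider jobs run on many hosts.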
    8994
    9095If we run jobs on lots of different hosts,
     
    156161and aborting or discarding others.
    157162
    158 This can be discouraged either by mechanisms
    159 that reduce the number of jobs/day when a job is aborted or times out.
     163This can be discouraged by server mechanisms:
     164
     165 * reducing the number of jobs/day when a job is aborted or times out.
     166 * resending jobs.
    160167
    161168== Anonymous platform notes ==