= Job runtime estimation =

== The old system ==

Jobs have a FLOP count estimate, wu.rsc_fpops_est.

When sending an app version to a host,
the scheduler estimates its FLOPS.
This is either the CPU benchmark,
or a value assigned by the app_plan() function
(the app_plan function is expected to predict
the performance of an app on all possible hosts).

The client maintains a per-project duration correction factor (DCF),
intended to measure the efficiency of the project's apps,
and the systematic error in wu.rsc_fpops_est.
DCF is used to scale runtime estimates on both client and server side.
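
As a rough sketch of how these pieces combine in the old scheme: the nominal runtime is the FLOP count divided by the estimated FLOPS, and DCF scales the result. The function and parameter names are illustrative, not taken from the BOINC source.

{{{
#!python
def old_runtime_estimate(rsc_fpops_est, flops, dcf):
    """Estimated runtime in seconds under the old scheme (illustrative)."""
    if flops <= 0:
        raise ValueError("flops must be positive")
    # Nominal runtime (rsc_fpops_est / flops), corrected by the per-project DCF.
    return dcf * rsc_fpops_est / flops

# A job estimated at 1e13 FLOPs on a host estimated at 2 GFLOPS,
# with a project-wide DCF of 1.5.
estimate = old_runtime_estimate(1e13, 2e9, 1.5)
}}}

Because the DCF is per project, the same factor is applied to every app of that project, which is the first problem listed below.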

Problems with the old system:

 * Projects can have lots of apps.  A single DCF does not suffice.
 * Projects can't be expected to predict app performance,
   either in wu.rsc_fpops_est or in app_plan().

== The new system ==

Projects still have to supply wu.rsc_fpops_est.

The new system has a large overlap with [CreditNew the new credit system];
read that document first.
In particular, we now maintain:

 * A '''host_app_version''' database record
   per (host, app version), or per (host, app, resource type) in the case of anonymous platform.
   This record includes the average elapsed time per wu.rsc_fpops_est.
 * A '''pfc_scale''' per app version, which approximates the efficiency
   of the app version relative to the most efficient version.
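
The average in the host_app_version record can be read as seconds per estimated FLOP, so its reciprocal is an effective FLOPS figure. The sketch below assumes a plain arithmetic mean over completed jobs; the actual averaging scheme is described in CreditNew and may differ. All names are hypothetical.

{{{
#!python
def et_avg(samples):
    """samples: list of (elapsed_seconds, rsc_fpops_est) for completed jobs."""
    if not samples:
        raise ValueError("need at least one sample")
    # Each sample is normalized by the job's FLOP estimate.
    ratios = [elapsed / fpops_est for elapsed, fpops_est in samples]
    return sum(ratios) / len(ratios)

samples = [(1000.0, 1e13), (1200.0, 1e13)]
avg = et_avg(samples)          # seconds per estimated FLOP
effective_flops = 1.0 / avg    # estimated FLOPs completed per second
}}}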

The app_plan() function now returns peak FLOPS,
not the expected actual FLOPS.

DCF is no longer used.

In the process of selecting an app version for each job,
the scheduler estimates its actual FLOPS.
This is stored in BEST_APP_VERSION.HOST_USAGE.flops.

=== Regular case ===

An app version's FLOPS estimate is initially the peak FLOPS.
We then look at the host_app_version record.
If it exists, and there are sufficient samples, we set
{{{
estimated_flops = 1/host_app_version.et.avg
}}}

Otherwise, if app_version.pfc_scale is defined,

{{{
estimated_flops *= app_version.pfc_scale
}}}
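
The regular-case logic above can be sketched as a single function. The sample threshold and the use of None for "no record" or "pfc_scale undefined" are assumptions made for illustration; the parameter names are hypothetical.

{{{
#!python
MIN_SAMPLES = 10  # hypothetical value for "sufficient samples"

def estimate_flops_regular(peak_flops, et_avg=None, et_n=0, pfc_scale=None):
    """Estimated actual FLOPS of an app version on a host (illustrative)."""
    # Start from the peak FLOPS.
    estimated_flops = peak_flops
    # Prefer per-host statistics when the record exists with enough samples.
    if et_avg is not None and et_avg > 0 and et_n >= MIN_SAMPLES:
        return 1.0 / et_avg
    # Otherwise fall back to the app version's efficiency scale, if defined.
    if pfc_scale is not None:
        estimated_flops *= pfc_scale
    return estimated_flops

with_stats = estimate_flops_regular(1e10, et_avg=2e-10, et_n=20)
with_scale = estimate_flops_regular(1e10, pfc_scale=0.5)
peak_only = estimate_flops_regular(1e10)
}}}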

=== Anonymous platform case ===

If the host_app_version record exists and there are sufficient samples,
{{{
estimated_flops = 1/host_app_version.et.avg
}}}

Otherwise, we use the estimate supplied by the client.
This may be specified in the app_info.xml file.
If not, the current client passes the peak FLOPS.

Older clients (predating GPU support) don't pass a FLOPS estimate.
In this case we use the CPU benchmark.
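
A minimal sketch of the anonymous platform fallback chain, under the same assumptions as before (hypothetical names, an assumed sample threshold, and None meaning "not available"):

{{{
#!python
def estimate_flops_anonymous(cpu_benchmark, client_flops=None,
                             et_avg=None, et_n=0, min_samples=10):
    """Estimated FLOPS for an anonymous platform app version (illustrative)."""
    # 1. Per-host statistics, if the record exists with enough samples.
    if et_avg is not None and et_avg > 0 and et_n >= min_samples:
        return 1.0 / et_avg
    # 2. The estimate passed by the client (from app_info.xml, or peak FLOPS).
    if client_flops is not None:
        return client_flops
    # 3. Older clients pass no estimate: use the CPU benchmark.
    return cpu_benchmark

from_stats = estimate_flops_anonymous(3e9, client_flops=1e10, et_avg=2.5e-10, et_n=15)
from_client = estimate_flops_anonymous(3e9, client_flops=1e10)
from_benchmark = estimate_flops_anonymous(3e9)
}}}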

The estimated FLOPS is used to estimate job runtime on the server side.

However, the only way to change the client's runtime estimate is by
adjusting the wu.rsc_fpops_est that we send to the client.
So, in the first case above (the host_app_version record exists
and has sufficient samples), we scale wu.rsc_fpops_est by
{{{
(old estimated flops)/(new estimated flops)
}}}
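
A small worked example of why this scaling works, assuming the client computes its runtime estimate as the FLOP count divided by its own (old) FLOPS estimate. The numbers and names are made up for illustration.

{{{
#!python
def scaled_fpops_est(rsc_fpops_est, old_flops, new_flops):
    """FLOP count to send so the client's estimate matches the server's."""
    if old_flops <= 0 or new_flops <= 0:
        raise ValueError("FLOPS estimates must be positive")
    return rsc_fpops_est * old_flops / new_flops

rsc_fpops_est = 1e13
old_flops = 1e10   # estimate the client uses (e.g. peak FLOPS)
new_flops = 5e9    # server estimate from host_app_version statistics

sent = scaled_fpops_est(rsc_fpops_est, old_flops, new_flops)
client_runtime = sent / old_flops            # what the client computes
server_runtime = rsc_fpops_est / new_flops   # what the server computes
}}}

The two runtimes agree because the old estimate cancels: (rsc_fpops_est * old/new) / old = rsc_fpops_est / new.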

== Implementation notes ==

At the start of send_work(), the scheduler enumerates all host_app_version records for this host.
At the end of the request, when host_scale_time is updated,
we do updates or inserts as appropriate.
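
One way to picture the update-or-insert step is to compare the records enumerated at the start of the request with their state at the end. This is only a sketch: the dictionary layout, the key (an app version ID), and the choice to skip unchanged records are assumptions, not a description of the scheduler code.

{{{
#!python
def plan_writes(loaded, current):
    """
    loaded:  {key: record} as enumerated at the start of the request
    current: {key: record} as it stands at the end of the request
    Returns (updates, inserts) as lists of keys.
    """
    # Records that existed in the database and have changed get an update.
    updates = [k for k in current if k in loaded and current[k] != loaded[k]]
    # Records created during the request get an insert.
    inserts = [k for k in current if k not in loaded]
    return updates, inserts

loaded = {101: {"et_n": 12, "et_avg": 2.0e-10}}
current = {
    101: {"et_n": 13, "et_avg": 2.1e-10},  # modified during the request
    102: {"et_n": 1, "et_avg": 3.0e-10},   # new for this host
}
updates, inserts = plan_writes(loaded, current)
}}}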