Changes between Version 8 and Version 9 of CreditNew

Timestamp:: Nov 4, 2009, 12:24:38 PM
Author:: davea
Comment:: --

CreditNew

-                      v8
+                      v9
 so that the average is the same for each version.
 The adjustment is always downwards:
 we maintain the average PFC*(V) of PFC() for each app version V,
+we maintain the average PFC^mean^(V) of PFC() for each app version V,
 find the minimum X.
 An app version V's jobs are then scaled by the factor
+{{{
 S(V) = (X/PFC*(V))
+}}}
+ S(V) = (X/PFC^mean^(V))
 The result for a given job J
 is called "Version-Normalized Peak FLOP Count", or VNPFC(J):
+{{{
+VNPFC(J) = PFC(J) * (X/PFC*(V))
+}}}
+ VNPFC(J) = PFC(J) * (X/PFC^mean^(V))
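
The version normalization above can be sketched as follows. This is a minimal illustration, not the BOINC server code: the function names, the dictionary layout, and the sample values are assumptions made for the example, and every per-version mean is assumed to be positive.

```python
from statistics import mean


def version_scale_factors(pfc_samples_by_version):
    """Return S(V) = X / PFC_mean(V) for each app version V.

    pfc_samples_by_version maps a (hypothetical) version name to the
    list of PFC() samples collected for that version.
    X is the minimum of the per-version means, so every factor is <= 1
    and the adjustment is always downwards.
    """
    pfc_mean = {v: mean(s) for v, s in pfc_samples_by_version.items()}
    x = min(pfc_mean.values())
    return {v: x / m for v, m in pfc_mean.items()}


def vnpfc(pfc_job, scale):
    """VNPFC(J) = PFC(J) * S(V) for a job J run by version V."""
    return pfc_job * scale


# Made-up sample data: the "gpu_v1" version reports larger peak FLOP
# counts on average than "cpu_v1" for the same kind of job.
samples = {"cpu_v1": [100.0, 120.0, 80.0], "gpu_v1": [400.0, 600.0]}
scales = version_scale_factors(samples)
normalized = vnpfc(500.0, scales["gpu_v1"])
```

With these samples the version with the lowest mean keeps a factor of 1, and the other version is scaled down so that its average matches.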
 Notes:
 …
    (e.g., workunit.rsc_fpops_est)
    we can normalize by this to reduce the variance,
    and make PFC*(V) converge more quickly.
+   and make PFC^mean^(V) converge more quickly.
  * ''a posteriori'' estimates of job size may also exist
    (e.g., an iteration count reported by the app)
 …
 then, for that app,
 hosts should get the same average granted credit per job.
 To ensure this, for each application A we maintain the average VNPFC*(A),
 and for each host H we maintain VNPFC*(H, A).
+To ensure this, for each application A we maintain the average VNPFC^mean^(A),
+and for each host H we maintain VNPFC^mean^(H, A).
 The '''claimed credit''' for a given job J is then
+{{{
 VNPFC(J) * (VNPFC*(A)/VNPFC*(H, A))
+}}}
+ VNPFC(J) * (VNPFC^mean^(A)/VNPFC^mean^(H, A))
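
As a worked illustration of the claimed-credit formula, here is a short sketch. The function and parameter names are hypothetical, and the numbers are invented for the example.

```python
def claimed_credit(vnpfc_job, vnpfc_mean_app, vnpfc_mean_host_app):
    """Claimed credit = VNPFC(J) * (VNPFC_mean(A) / VNPFC_mean(H, A)).

    vnpfc_mean_host_app must be positive; a host whose average is above
    the application-wide average has its claims scaled down, and a host
    below the average has them scaled up.
    """
    return vnpfc_job * (vnpfc_mean_app / vnpfc_mean_host_app)


# A host whose jobs average twice the app-wide mean (200 vs. 100):
# a typical job from that host is scaled back to the app-wide mean.
typical = claimed_credit(200.0, 100.0, 200.0)

# A host whose jobs average half the app-wide mean is scaled up.
efficient = claimed_credit(50.0, 100.0, 50.0)
```

In both cases a host's typical job ends up with the same claimed credit, which is the stated goal of equal average granted credit per job across hosts.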
 There are some cases where hosts are not sent jobs uniformly:
 …
 This can be done by dividing
 each sample in the computation of VNPFC* by WU.rsc_fpops_est
+each sample in the computation of VNPFC^mean^ by WU.rsc_fpops_est
 (in fact, there's no reason not to always do this).
 …
    and increases the claimed credit of hosts that are more efficient
    than average.
  * VNPFC* is averaged over jobs, not hosts.
+ * VNPFC^mean^ is averaged over jobs, not hosts.
 == Computing averages ==
 …
  * One-time cheats (like claiming 1e304) can be prevented by
    capping VNPFC(J) at some multiple (say, 10) of VNPFC*(A).
+   capping VNPFC(J) at some multiple (say, 10) of VNPFC^mean^(A).
  * Cherry-picking: suppose an application has two types of jobs,
   which run for 1 second and 1 hour respectively.
 …
   Suppose a client systematically refuses the 1 hour jobs
   (e.g., by reporting a crash or never reporting them).
   Its VNPFC*(H, A) will quickly decrease,
+  Its VNPFC^mean^(H, A) will quickly decrease,
   and soon it will be getting several thousand times more credit
   per unit of actual work than other hosts!
 …
   whenever a job errors out, times out, or fails to validate,
   set the host's error rate back to the initial default,
   and set its VNPFC*(H, A) to VNPFC*(A) for all apps A.
+  and set its VNPFC^mean^(H, A) to VNPFC^mean^(A) for all apps A.
   This puts the host into a state where several dozen of its
   subsequent jobs will be replicated.
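
The two countermeasures described in this list (capping a single claim, and resetting a host's averages after a failure) might look roughly like this. It is a sketch under assumptions: the function names and dictionary layout are hypothetical, the multiple of 10 is the example value from the text, and the error-rate reset is left out because its default value is not given here.

```python
CAP_MULTIPLE = 10  # "say, 10" in the text; the real value is a policy choice


def capped_vnpfc(vnpfc_job, vnpfc_mean_app, multiple=CAP_MULTIPLE):
    """Cap VNPFC(J) at `multiple` times VNPFC_mean(A)."""
    return min(vnpfc_job, multiple * vnpfc_mean_app)


def reset_host_means(host_means, app_means):
    """Return new per-app host averages after a job errors out, times
    out, or fails to validate: VNPFC_mean(H, A) is set back to
    VNPFC_mean(A) for every app A the host has an average for.
    """
    return {app: app_means[app] for app in host_means}


# A one-time cheat claiming 1e304 is limited to 10x the app mean.
limited = capped_vnpfc(1e304, 100.0)

# A cherry-picking host with an artificially low average loses it
# as soon as one of its jobs fails.
host_after = reset_host_means({"appA": 0.03, "appB": 95.0},
                              {"appA": 100.0, "appB": 90.0})
```

Returning a new dictionary instead of mutating the host record keeps the sketch free of side effects; a real server would update its database rows instead.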
 …
 Unrelated to the credit proposal, but in a similar spirit.
 The server will maintain ET*(H, V), the statistics of
+The server will maintain ET^mean^(H, V), the statistics of
 job runtimes (normalized by wu.rsc_fpops_est) per
 host and application version.
 The server's estimate of a job's runtime is then
+{{{
 R(J, H) = wu.rsc_fpops_est * ET*(H, V)
+}}}
+ R(J, H) = wu.rsc_fpops_est * ET^mean^(H, V)
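
The runtime estimate can be illustrated with a small sketch. It assumes that the statistic kept per host and app version is a plain mean of runtime divided by wu.rsc_fpops_est; the function names and sample values are hypothetical.

```python
from statistics import mean


def et_mean(completed_jobs):
    """Mean of job runtimes normalized by wu.rsc_fpops_est.

    completed_jobs is a list of (runtime_seconds, rsc_fpops_est) pairs
    for one host H and one app version V.
    """
    return mean(runtime / est for runtime, est in completed_jobs)


def estimated_runtime(rsc_fpops_est, et_mean_hv):
    """R(J, H) = wu.rsc_fpops_est * ET_mean(H, V)."""
    return rsc_fpops_est * et_mean_hv


# Two made-up completed jobs on the same host and app version.
history = [(100.0, 1e12), (220.0, 2e12)]
et = et_mean(history)

# Estimated runtime, in seconds, for a new job with 3e12 estimated FLOPs.
estimate = estimated_runtime(3e12, et)
```

Normalizing by wu.rsc_fpops_est lets jobs of different sizes contribute to the same per-host statistic.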
 == Implementation ==