Changes between Version 8 and Version 9 of CreditNew
- Timestamp: Nov 4, 2009, 12:24:38 PM
CreditNew
--- CreditNew (version 8)
+++ CreditNew (version 9)
@@ -134,16 +134,15 @@
 so that the average is the same for each version.
 The adjustment is always downwards:
-we maintain the average PFC*(V) of PFC() for each app version V,
+we maintain the average PFC^mean^(V) of PFC() for each app version V,
 find the minimum X.
 An app version V's jobs are then scaled by the factor
-{{{
-S(V) = (X/PFC*(V))
-}}}
+
+S(V) = (X/PFC^mean^(V))
+
 
 The result for a given job J
 is called "Version-Normalized Peak FLOP Count", or VNPFC(J):
-{{{
-VNPFC(J) = PFC(J) * (X/PFC*(V))
-}}}
+
+VNPFC(J) = PFC(J) * (X/PFC^mean^(V))
 
 Notes:
@@ -162,5 +161,5 @@
 (e.g., workunit.rsc_fpops_est)
 we can normalize by this to reduce the variance,
-and make PFC*(V) converge more quickly.
+and make PFC^mean^(V) converge more quickly.
 * ''a posteriori'' estimates of job size may exist also
 (e.g., an iteration count reported by the app)
@@ -204,10 +203,10 @@
 then, for that app,
 hosts should get the same average granted credit per job.
-To ensure this, for each application A we maintain the average VNPFC*(A),
-and for each host H we maintain VNPFC*(H, A).
+To ensure this, for each application A we maintain the average VNPFC^mean^(A),
+and for each host H we maintain VNPFC^mean^(H, A).
 The '''claimed credit''' for a given job J is then
-{{{
-VNPFC(J) * (VNPFC*(A)/VNPFC*(H, A))
-}}}
+
+VNPFC(J) * (VNPFC^mean^(A)/VNPFC^mean^(H, A))
+
 
 There are some cases where hosts are not sent jobs uniformly:
@@ -219,5 +218,5 @@
 
 This can be done by dividing
-each sample in the computation of VNPFC* by WU.rsc_fpops_est
+each sample in the computation of VNPFC^mean^ by WU.rsc_fpops_est
 (in fact, there's no reason not to always do this).
 
@@ -227,5 +226,5 @@
 and increases the claimed credit of hosts that are more efficient
 than average.
-* VNPFC* is averaged over jobs, not hosts.
+* VNPFC^mean^ is averaged over jobs, not hosts.
 
 == Computing averages ==
@@ -312,5 +311,5 @@
 
 * One-time cheats (like claiming 1e304) can be prevented by
-capping VNPFC(J) at some multiple (say, 10) of VNPFC*(A).
+capping VNPFC(J) at some multiple (say, 10) of VNPFC^mean^(A).
 * Cherry-picking: suppose an application has two types of jobs,
 which run for 1 second and 1 hour respectively.
@@ -319,5 +318,5 @@
 Suppose a client systematically refuses the 1 hour jobs
 (e.g., by reporting a crash or never reporting them).
-Its VNPFC*(H, A) will quickly decrease,
+Its VNPFC^mean^(H, A) will quickly decrease,
 and soon it will be getting several thousand times more credit
 per actual work than other hosts!
@@ -325,5 +324,5 @@
 whenever a job errors out, times out, or fails to validate,
 set the host's error rate back to the initial default,
-and set its VNPFC*(H, A) to VNPFC*(A) for all apps A.
+and set its VNPFC^mean^(H, A) to VNPFC^mean^(A) for all apps A.
 This puts the host to a state where several dozen of its
 subsequent jobs will be replicated.
@@ -335,12 +334,12 @@
 
 Unrelated to the credit proposal, but in a similar spirit.
-The server will maintain ET*(H, V), the statistics of
+The server will maintain ET^mean^(H, V), the statistics of
 job runtimes (normalized by wu.rsc_fpops_est) per
 host and application version.
 
 The server's estimate of a job's runtime is then
-{{{
-R(J, H) = wu.rsc_fpops_est * ET*(H, V)
-}}}
+
+R(J, H) = wu.rsc_fpops_est * ET^mean^(H, V)
+
 
 == Implementation ==
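
The formulas touched by this change can be sketched in Python as follows. This is only an illustration of the arithmetic, not the project's server code: every function and parameter name is hypothetical, the sample numbers are invented, and X is assumed to be the minimum of PFC^mean^(V) over the versions of one app, which the surrounding text implies but does not state outright.

```python
def version_scale(pfc_mean_by_version):
    """S(V) = X / PFC_mean(V), with X assumed to be the smallest mean."""
    x = min(pfc_mean_by_version.values())
    return {v: x / mean for v, mean in pfc_mean_by_version.items()}


def vnpfc(pfc_job, version, pfc_mean_by_version):
    """VNPFC(J) = PFC(J) * (X / PFC_mean(V))."""
    return pfc_job * version_scale(pfc_mean_by_version)[version]


def claimed_credit(vnpfc_job, vnpfc_mean_app, vnpfc_mean_host_app, cap=10.0):
    """VNPFC(J) * (VNPFC_mean(A) / VNPFC_mean(H, A)).

    VNPFC(J) is first capped at cap * VNPFC_mean(A), following the
    one-time-cheat note (the text suggests a multiple of, say, 10).
    """
    capped = min(vnpfc_job, cap * vnpfc_mean_app)
    return capped * (vnpfc_mean_app / vnpfc_mean_host_app)


def estimated_runtime(rsc_fpops_est, et_mean_host_version):
    """R(J, H) = wu.rsc_fpops_est * ET_mean(H, V)."""
    return rsc_fpops_est * et_mean_host_version


# Invented sample values: two versions of one app.
means = {"cpu": 2e12, "gpu": 8e12}
scales = version_scale(means)            # the adjustment is always downwards
job_vnpfc = vnpfc(4e12, "gpu", means)    # GPU job scaled by 0.25
credit = claimed_credit(job_vnpfc, 3e12, 1.5e12)
cheat = claimed_credit(1e304, 3e12, 3e12)  # capped at 10 * VNPFC_mean(A)
runtime = estimated_runtime(1e12, 2e-9)
```

Because X is the minimum, every scale factor is at most 1, which matches the statement that the adjustment is always downwards.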