Changes between Version 30 and Version 31 of CreditNew
- Timestamp: Mar 26, 2010, 1:36:22 PM
CreditNew
--- CreditNew, v30
+++ CreditNew, v31
@@ -100,17 +100,17 @@
 == ''A priori'' job size estimates and bounds ==
 
-Projects supply estimates of the FLOPs used by a job
-(wu.rsc_fpops_est)
-and a limit on FLOPS, after which the job will be aborted
-(wu.rsc_fpops_bound).
-
-Previously, inaccuracy of rsc_fpops_est caused problems.
-The new system still uses rsc_fpops_est,
-but its primary purpose is now to indicate the relative size of jobs.
-Averages of job sizes are normalized by rsc_fpops_est,
-and if rsc_fpops_est is correlated with actual size,
+For each job, the project supplies
+ * an estimate of the FLOPs used by a job (wu.fpops_est)
+ * a limit on FLOPS, after which the job will be aborted
+   (wu.fpops_bound).
+
+Previously, inaccuracy of fpops_est caused problems.
+The new system still uses fpops_est,
+but its primary purpose is now to indicate the relative sizes of jobs.
+
+Averages of FLOP count and elapsed time
+are normalized by fpops_est (see below),
+and if fpops_est is correlated with actual size,
 these averages will converge more quickly.
-
-We'll denote workunit.rsc_fpops_est as E(J).
 
 Notes:
@@ -129,5 +129,5 @@
 based on the resources used by the job and their peak speeds.
 
-If the job is finished in elapsed time T,
+When the job is finished in elapsed time T,
 we define peak_flop_count(J), or PFC(J) as
 
@@ -136,5 +136,6 @@
 Notes:
 
- * PFC(J) is not cheat-proof; e.g. cheaters can falsify elapsed time.
+ * PFC(J) is not cheat-proof;
+   cheaters can falsify elapsed time or device attributes.
  * We use elapsed time instead of actual device time (e.g., CPU time).
    If a job uses a resource inefficiently
@@ -156,17 +157,50 @@
 but is limited and normalized in the following ways:
 
+== Computing averages ==
+
+The policies described below involve computing averages
+of various quantities.
+This computation must take into account:
+
+ * The quantities being averaged may gradually change over time
+   (e.g. average job size may change)
+   and we need to track this.
+   This done as follows: for the first N samples
+   (N = ~100 for app versions, ~10 for hosts)
+   we take the straight average.
+   After that we use an exponentially-weighted average
+   (with appropriate parameter for app version and host)
+
+ * A given sample may be wildly off,
+   and we can't let this mess up the average.
+   Samples after the first are capped at 10 times the current average.
+
+ * We keep track of the number of samples,
+   and use an average only if its number of samples
+   is above a '''sample threshold'''.
+
+== Data ==
+
+We maintain the following estimates:
+
+ app.min_avg_pfc:: an estimate of the average actual FLOPS for an app
+   (normalized by wu.fpops_est)
+ app_version.pfc_avg:: the average of PFC(J)/wu.fpops_est for an app version.
+ host_app_version.pfc_avg:: for each app version V and host H,
+   the average of PFC(J)/wu.fpops_est for jobs completed by H using A.
+
 == Sanity check ==
 
-If PFC(J) is infinite or is > wu.rsc_fpops_bound,
+If PFC(J) is infinite or is > wu.fpops_bound,
 J is assigned a "default PFC" and other processing is skipped.
 Default PFC is determined as follows:
 
- * If min_avg_pfc(A) is defined (see below) then
-
-    D = min_avg_pfc(A) * E(J)
+ * If app.min_avg_pfc is defined then
+
+    D = app.min_avg_pfc * wu.fpops_est
 
  * Otherwise
 
-    D = wu.rsc_fpops_est
+    D = wu.fpops_est
 
 == Cross-version normalization ==
@@ -179,34 +213,33 @@
 so that the average is the same for each version.
 
-We maintain the average PFC^mean^(V) of PFC(J)/E(J) for each app version V.
-We periodically compute PFC^mean^(CPU) and PFC^mean^(GPU),
-and compute X as follows:
+For each app, we periodically compute cpu_pfc
+(the weighted average of app_version.pfc over CPU app versions)
+and similarly gpu_pfc.
+We then compute X as follows:
 
  * If there are only CPU or only GPU versions,
-   and at least 2 versions are above a sample threshold,
-   X is the average.
-
- * If there are both, and at least 1 of each is above a sample
+   and at least 2 versions are above sample threshold,
+   X is their average (weighted by # samples).
+
+ * If there are both, and at least 1 of each is above sample
    threshold, let X be the min of the averages.
 
-If X is defined, then for each version V we set
-
-    Scale(V) = (X/PFC^mean^(V))
-
-An app version V's jobs are scaled by this factor.
-
-For each app, we maintain min_avg_pfc(A),
-the average PFC for the most efficient version of A.
-This is an estimate of the app's average actual FLOPS.
+If X is defined, then for each app version
+
+    app_version.pfc_scale = (X/app_version.pfc_avg)
+
+The PFC of the app version's jobs are scaled by this factor.
 
 If X is defined, then we set
 
-    min_avg_pfc(A) = X
-
-Otherwise, if a version V is above sample threshold, we set
-
-    min_avg_pfc(A) = PFC^mean^(V)
-
-Notes:
+    app.min_avg_pfc = X
+
+Otherwise, if an app version is above sample threshold, we set
+
+    app.min_avg_pfc = app_version.pfc_avg
+
+Notes:
+ * Doesn't host normalization (see below) subsume version normalization?
+   Not if there are both CPU and GPU versions, because of the "min".
  * Version normalization is only applied if at least two
    versions are above sample threshold.
@@ -237,11 +270,8 @@
 Assume jobs for a given app are distributed uniformly among hosts.
 Then the average credit per job should be the same for all hosts.
-To ensure this, for each app version V and host H
-we maintain PFC^mean^(H, A),
-the average of PFC(J)/E(J) for jobs completed by H using A.
-
-This yields the host scaling factor
-
-    Scale(H) = (PFC^mean^(V)/PFC^mean^(H, A))
+
+We scale PFC by the factor
+
+    app_version.pfc_avg / host_app_version.pfc_avg
 
 There are some cases where hosts are not sent jobs uniformly:
@@ -251,9 +281,9 @@
    jobs to GPUs with more processors.
 
-The normalization by E(J) handles this
-(assuming that wu.fpops_est is set appropriately).
-
-Notes:
- * For some apps, the host normalization mechanism is prone to
+The normalization by wu.fpops_est handles this.
+
+Notes:
+ * For apps with large variance of job sizes,
+   the host normalization mechanism is prone to
    a type of cheating called "cherry picking".
    A mechanism for defeating this is described below.
@@ -262,21 +292,4 @@
    and increases the claimed credit of hosts that are more efficient
    than average.
-
-== Computing averages ==
-
-Computation of averages needs to take into account:
-
- * The quantities being averaged may gradually change over time
-   (e.g. average job size may change)
-   and we need to track this.
-   This done as follows: for the first N samples
-   (N = ~100 for app versions, ~10 for hosts)
-   we take the straight average.
-   After that we use an exponential average
-   (with appropriate alpha for app version and host)
-
- * A given sample may be wildly off,
-   and we can't let this mess up the average.
-   Non-first samples are capped at 10 times the current average.
 
 == Anonymous platform ==
@@ -290,17 +303,23 @@
 (-2 for CPU, -3 for NVIDIA GPU, -4 for ATI).
 
-If min_avg_pfc(A) is defined and
-PFC^mean^(H, V) is above a sample threshold,
+If app.min_avg_pfc is defined and
+host_app_version.pfc_avg is above sample threshold,
 we normalize PFC by the factor
 
-    min_avg_pfc(A)/PFC^mean^(H, V)
+    app.min_avg_pfc/host_app_version.pfc_avg
 
 Otherwise the claimed PFC is
 
-    min_avg_pfc(A)*E(J)
-
-If min_avg_pfc(A) is not defined, the claimed PFC is
-
-    wu.rsc_fpops_est
+    app.min_avg_pfc(A)*wu.fpops_est
+
+If app.min_avg_pfc is not defined, the claimed PFC is
+
+    wu.fpops_est
+
+Notes:
+
+ * We don't assume that anonymous platform apps on
+   different hosts but with the same platform and resource type
+   are comparable.
 
 == Summary ==
@@ -309,7 +328,6 @@
 
  * the "claimed PFC" F
- * a flag "approx" that is true if F
-   is an approximation and may not be comparable
-   with other instances of the job
+ * a flag "approx" that is true if F is an approximation
+   and may not be comparable with other instances of the job
 
 The algorithm:
@@ -317,20 +335,19 @@
 pfc = peak FLOP count(J)
 approx = true;
-if pfc > wu.rsc_fpops_bound
-    if min_avg_pfc(A) is defined
-        F = min_avg_pfc(A) * E(J)
+if pfc > wu.fpops_bound
+    if app.min_avg_pfc is defined
+        F = app.min_avg_pfc * wu.fpops_est
     else
-        F = wu.rsc_fpops_est
+        F = wu.fpops_est
 else
     if job is anonymous platform
-        hav = host_app_version record
-        if min_avg_pfc(A) is defined
-            if hav.pfc.n > threshold
+        if app.min_avg_pfc is defined
+            if host_app_version.pfc_avg is above sample threshold
                 approx = false
-                F = min_avg_pfc(A) / hav.pfc.avg
+                F = app.min_avg_pfc / host_app_version.pfc_avg
             else
-                F = min_avg_pfc(A) * E(J)
+                F = app.min_avg_pfc * wu.fpops_est
         else
-            F = wu.rsc_fpops_est
+            F = wu.fpops_est
     else
         F = pfc;
@@ -344,5 +361,5 @@
 The claimed credit of a job (in Cobblestones) is
 
-    C = F * 200/86400e9
+    C = F * 200/86400e9
 
 If replication is not used, this is the granted credit.
@@ -353,8 +370,8 @@
 Otherwise:
 
-if min_avg_pfc(A) is defined
-    C = min_avg_pfc(A)*E(J)
+if app.min_avg_pfc is defined
+    C = app.min_avg_pfc*wu.fpops_est
 else
-    C = wu.rsc_fpops_est * 200/86400e9
+    C = wu.fpops_est * 200/86400e9
 
 == Cross-project version normalization ==
@@ -505,10 +522,10 @@
 Unrelated to the credit proposal, but in a similar spirit.
 The server will maintain ET^mean^(H, V), the statistics of
-job runtimes (normalized by wu.rsc_fpops_est) per
+job runtimes (normalized by wu.fpops_est) per
 host and application version.
 
 The server's estimate of a job's runtime is then
 
-    R(J, H) = wu.rsc_fpops_est * ET^mean^(H, V)
+    R(J, H) = wu.fpops_est * ET^mean^(H, V)
 
 
@@ -522,5 +539,5 @@
     int app_version_id;     // generalized for anon platform
     AVERAGE pfc;
-    AVERAGE_VAR et;     // elapsed time / wu.rsc_fpops_est
+    AVERAGE_VAR et;     // elapsed time / wu.fpops_est
     double host_scale_time;
     bool scale_probation;
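
The averaging rules under "Computing averages" and the credit formula C = F * 200/86400e9 can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the BOINC server code: the names `RunningAverage`, `update`, `usable`, and `claimed_credit` are hypothetical, the exponential weight `alpha` and both thresholds are supplied by the caller (the text says only "appropriate parameter"), and skipping the cap while the average is still zero is a choice made here so that the average cannot get stuck at zero.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical sketch of the averaging policy; not the BOINC AVERAGE struct.
struct RunningAverage {
    int n = 0;         // number of samples seen so far
    double avg = 0.0;  // current average

    // n_straight: number of initial samples that use the straight average
    //             (the text suggests ~100 for app versions, ~10 for hosts)
    // alpha:      weight of a new sample once n exceeds n_straight
    //             (assumed parameter, 0 < alpha <= 1)
    // cap_factor: samples after the first are capped at cap_factor * avg
    void update(double sample, int n_straight, double alpha,
                double cap_factor = 10.0) {
        assert(alpha > 0.0 && alpha <= 1.0);
        // Cap outliers; the first sample (and a zero average) is not capped.
        if (n > 0 && avg > 0.0) {
            sample = std::min(sample, cap_factor * avg);
        }
        ++n;
        if (n <= n_straight) {
            avg += (sample - avg) / n;      // straight (incremental) average
        } else {
            avg += alpha * (sample - avg);  // exponentially-weighted average
        }
    }

    // An average is used only if its sample count is above the threshold.
    bool usable(int sample_threshold) const { return n > sample_threshold; }
};

// Claimed credit in Cobblestones for a claimed PFC F: C = F * 200/86400e9.
inline double claimed_credit(double pfc) {
    return pfc * 200.0 / 86400e9;
}
```

For example, with `n_straight = 3` and `alpha = 0.5`, the samples 1, 3, 100, 4 give averages 1, 2, 8, 6: the third sample is capped at 10 * 2 = 20 before it is averaged, and the fourth is the first to use the exponential weight. By the credit formula, a claimed PFC of 86400e9 FLOPs corresponds to 200 Cobblestones.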