Changes between Version 6 and Version 7 of CreditNew
- Timestamp: Nov 3, 2009, 2:37:20 PM
CreditNew
 v6  v7
 16  16    is the ratio of actual FLOPS to peak FLOPS.
 17  17
 18      - GPUs typically have a much higher (50-100X) peak FLOPS than GPUs.
     18  + GPUs typically have a much higher (50-100X) peak FLOPS than CPUs.
 19  19    However, application efficiency is typically lower
 20  20    (very roughly, 10% for GPUs, 50% for CPUs).
  …   …
156 156    It's not exactly "Actual FLOPs", since the most efficient
157 157    version may not be 100% efficient.
    158  +  * There are two sources of variance in PFC(V):
    159  +    the variation in host efficiency,
    160  +    and possibly the variation in job size.
    161  +    If we have an ''a priori'' estimate of job size
    162  +    (e.g., workunit.rsc_fpops_est)
    163  +    we can normalize by this to reduce the variance,
    164  +    and make PFC*(V) converge more quickly.
    165  +  * ''a posteriori'' estimates of job size may exist also
    166  +    (e.g., an iteration count reported by the app)
    167  +    but using this for anything introduces a new cheating risk,
    168  +    so it's probably better not to.
    169  +
158 170
159 171    == Cross-project normalization ==
  …   …
190 202
191 203    Assuming that hosts are sent jobs for a given app uniformly,
192      - then for a given app
    204  + then, for that app,
193 205    hosts should get the same average granted credit per job.
194 206    To ensure this, for each application A we maintain the average VNPFC*(A),
  …   …
200 212
201 213    There are some cases where hosts are not sent jobs uniformly:
202      -  * job-size matching
    214  +  * job-size matching (smaller jobs sent to slower hosts)
203 215     * GPUGrid.net's scheme for sending some (presumably larger)
204 216      jobs to GPUs with more processors.
205      - In these cases we must scale
    217  + In these cases average credit per job must differ between hosts,
    218  + according to the types of jobs that are sent to them.
    219  +
    220  + This can be done by dividing
    221  + each sample in the computation of VNPFC* by WU.rsc_fpops_est
    222  + (in fact, there's no reason not to always do this).
206 223
207 224    Notes:
208      -  * This mechanism reduces the claimed credit of hosts
    225  +  * The host normalization mechanism reduces the claimed credit of hosts
209 226      that are less efficient than average,
210 227      and increases the claimed credit of hosts that are more efficient
  …   …
269 286    }}}
270 287
271      - == Jobs versus app units ==
272      - To deal with this, we can weight jobs by workunit.rsc_flops_est.
273      -
274      - If a project changes between jobs to app units,
275      - it must reset
276      -
277 288    == Cross-project scaling factors ==
278 289
  …   …
288 299    granted credit = claimed credit.
289 300
290      - For jobs that are replicated, granted credit is be
    301  + For jobs that are replicated, granted credit should be
291 302    set to the min of the valid results
292 303    (min is used instead of average to remove the incentive
  …   …
315 326    == Job runtime estimates ==
316 327
    328  + Unrelated to the credit proposal, but in a similar spirit.
    329  + The server will maintain ET*(H, V), the statistics of
    330  + job runtimes (normalized by wu.rsc_fpops_est) per
    331  + host and application version.
    332  +
    333  + The server's estimate of a job's runtime is then
    334  + {{{
    335  + R(J, H) = wu.rsc_fpops_est * ET*(H, V)
    336  + }}}
    337  +
317 338    == Implementation ==
318 339
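The runtime estimate added in v7 under "Job runtime estimates", R(J, H) = wu.rsc_fpops_est * ET*(H, V), can be illustrated with a rough sketch. The page only says that ET*(H, V) is "the statistics of job runtimes" normalized by wu.rsc_fpops_est; treating it as a plain running mean is an assumption made here, and the names `RunningMean`, `RuntimeEstimator`, `record` and `estimate` are hypothetical, not part of the BOINC server code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RunningMean:
    """Incrementally maintained mean (assumed form of the ET* statistic)."""
    n: int = 0
    mean: float = 0.0

    def update(self, x: float) -> None:
        self.n += 1
        self.mean += (x - self.mean) / self.n


class RuntimeEstimator:
    """Hypothetical holder for ET*(H, V), keyed by (host, app version)."""

    def __init__(self) -> None:
        self._et = defaultdict(RunningMean)

    def record(self, host_id, app_version_id, elapsed_seconds, rsc_fpops_est):
        """Add one completed job: store runtime normalized by rsc_fpops_est."""
        if rsc_fpops_est <= 0:
            raise ValueError("rsc_fpops_est must be positive")
        key = (host_id, app_version_id)
        self._et[key].update(elapsed_seconds / rsc_fpops_est)

    def estimate(self, host_id, app_version_id, rsc_fpops_est):
        """R(J, H) = wu.rsc_fpops_est * ET*(H, V); None if no samples yet."""
        stats = self._et.get((host_id, app_version_id))
        if stats is None or stats.n == 0:
            return None
        return rsc_fpops_est * stats.mean


if __name__ == "__main__":
    est = RuntimeEstimator()
    est.record(host_id=1, app_version_id=7, elapsed_seconds=100.0, rsc_fpops_est=1e12)
    est.record(host_id=1, app_version_id=7, elapsed_seconds=300.0, rsc_fpops_est=2e12)
    print(est.estimate(host_id=1, app_version_id=7, rsc_fpops_est=4e12))
```

Dividing each sample by rsc_fpops_est before averaging is the same a priori job-size normalization that v7 proposes for PFC(V) and VNPFC*: it keeps jobs of different sizes comparable, so the statistic reflects the host and app version rather than the mix of job sizes.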