Changes between Version 23 and Version 24 of CreditNew
- Timestamp: Mar 9, 2010, 9:41:25 PM (15 years ago)
CreditNew
v23  v24
107  107    but addresses its problems in a new way.
108  108
109       - When a job is issued to a host, the scheduler specifies usage(J,D),
110       - J's usage of processing resource D:
111       - how many CPUs and how many GPUs (possibly fractional).
     109  + When a job J is issued to a host,
     110  + the scheduler specifies flops_est(J),
     111  + a FLOPS estimate based on the resources used by the job
     112  + and their peak speeds.
112  113
113  114    If the job is finished in elapsed time T,
…    …
119  120    Notes:
120  121
     122  +  * PFC(J) is
121  123     * We use elapsed time instead of actual device time (e.g., CPU time).
122  124    If a job uses a resource inefficiently
…    …
130  132    dynamically determines the CPU usage.
131  133    For now, though, we'll just use the scheduler's estimate.
     134  +  * For projects (CPDN) that grant partial credit via
     135  + trickle-up messages, substitute "partial job" for "job".
     136  + These projects must include elapsed time,
     137  + app version ID, and FLOPS estimate in the trickle message.
132  138
133  139    The granted credit for a job J is proportional to PFC(J),
…    …
384  390    so we'll move "max_results_day" from the host table to host_app_version.
385  391
     392  + == Anonymous platform ==
     393  +
     394  + For anonymous platform apps, since we don't necessarily
     395  + know anything about the devices involved,
     396  + we don't try to estimate PFC.
     397  + Instead, we give the average credit for the app,
     398  + scaled by the job size.
     399  +
     400  + The server maintains host_app_version record for anonymous platform,
     401  + and it keeps track of elapsed time statistics there.
     402  + These have app_version_id = -1 for CPU, -2 for NVIDIA GPU, -3 for ATI.
     403  +
     404  + == App plan functions ==
     405  +
     406  + App plan functions no longer have to make a FLOPS estimate.
     407  + They just have to return the peak device FLOPS.
     408  +
     409  + The scheduler adjusts this,
     410  + using the elapsed time statistics,
     411  + to get the app_version.flops_est it sends to the client
     412  + (from which job durations are estimated).
     413  +
386  414    == Job runtime estimates ==
387  415
…    …
450  478    == Compatibility ==
451  479
     480  +
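
The "Anonymous platform" text added in v24 says that such apps get the average credit for the app, scaled by the job size, and that their host_app_version records use app_version_id values of -1 (CPU), -2 (NVIDIA GPU) and -3 (ATI). A minimal Python sketch of that rule follows. It is an illustration only, not the scheduler's code: the function names, the resource keys, and the choice to scale by the ratio of the job's size to the app's average job size are assumptions, since the page does not say how "scaled by the job size" is computed.

```python
# Sentinel app_version_id values quoted in the v24 text.
ANON_APP_VERSION_IDS = {
    "cpu": -1,
    "nvidia": -2,
    "ati": -3,
}


def anonymous_app_version_id(resource: str) -> int:
    """Return the sentinel ID for an anonymous-platform resource.

    The resource keys ("cpu", "nvidia", "ati") are hypothetical labels
    chosen for this sketch.
    """
    return ANON_APP_VERSION_IDS[resource]


def anonymous_credit(app_avg_credit: float,
                     job_size: float,
                     app_avg_job_size: float) -> float:
    """Average credit for the app, scaled by the job size.

    Assumption: "scaled" means multiplying the app's average credit by
    job_size / app_avg_job_size. The source does not give the formula.
    """
    if app_avg_job_size <= 0:
        raise ValueError("app_avg_job_size must be positive")
    return app_avg_credit * (job_size / app_avg_job_size)


if __name__ == "__main__":
    # A job twice the app's average size gets twice the average credit.
    print(anonymous_app_version_id("nvidia"))
    print(anonymous_credit(100.0, 2e12, 1e12))
```

Under this assumption, no device information is needed: only per-app averages and the job's own size enter the calculation, which matches the stated reason for not estimating PFC on anonymous platforms.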