Changes between Version 23 and Version 24 of CreditNew


Timestamp:
Mar 9, 2010, 9:41:25 PM (15 years ago)
Author:
davea
Comment:

--

  • CreditNew

    v23 v24  
    107107but addresses its problems in a new way.
    108108
    109 When a job is issued to a host, the scheduler specifies usage(J,D),
    110 J's usage of processing resource D:
    111 how many CPUs and how many GPUs (possibly fractional).
     109When a job J is issued to a host,
     110the scheduler specifies flops_est(J),
     111a FLOPS estimate based on the resources used by the job
     112and their peak speeds.
    112113
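As a rough illustration of the added text above, flops_est(J) could be computed as a weighted sum of peak device speeds. This is only a sketch: the function name, its parameters, and the simple sum over CPU and GPU usage are assumptions made for illustration, not the scheduler's actual code.

```python
def flops_est(ncpus, cpu_peak_flops, ngpus=0.0, gpu_peak_flops=0.0):
    """Hypothetical FLOPS estimate for a job.

    ncpus and ngpus are the (possibly fractional) device counts the job uses;
    the *_peak_flops arguments are the peak speeds of one device of each type.
    Summing the two terms is an assumption, not taken from the source.
    """
    if ncpus < 0 or ngpus < 0:
        raise ValueError("device usage must be non-negative")
    return ncpus * cpu_peak_flops + ngpus * gpu_peak_flops
```

For example, a job using half a CPU at 4 GFLOPS peak plus one GPU at 100 GFLOPS peak would get an estimate of 102 GFLOPS under these assumptions.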
    113114If the job is finished in elapsed time T,
     
    119120Notes:
    120121
     122 * PFC(J) is
    121123 * We use elapsed time instead of actual device time (e.g., CPU time).
    122124   If a job uses a resource inefficiently
     
    130132   dynamically determines the CPU usage.
    131133   For now, though, we'll just use the scheduler's estimate.
     134 * For projects (CPDN) that grant partial credit via
     135   trickle-up messages, substitute "partial job" for "job".
     136   These projects must include elapsed time,
     137   app version ID, and FLOPS estimate in the trickle message.
    132138
    133139The granted credit for a job J is proportional to PFC(J),
     
    384390so we'll move "max_results_day" from the host table to host_app_version.
    385391
     392== Anonymous platform ==
     393
     394For anonymous platform apps, since we don't necessarily
     395know anything about the devices involved,
     396we don't try to estimate PFC.
     397Instead, we give the average credit for the app,
     398scaled by the job size.
     399
     400The server maintains host_app_version records for anonymous platform apps,
     401and keeps track of elapsed time statistics there.
     402These records have app_version_id = -1 for CPU, -2 for NVIDIA GPU, and -3 for ATI GPU.
     403
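The anonymous platform rules above can be sketched as follows. The pseudo app_version_id values come from the text; the function names, the resource-type strings, and the choice to scale by the ratio of the job's size to an average job size are assumptions made for illustration.

```python
# Pseudo app_version_id values for anonymous platform records (from the text).
ANON_APP_VERSION_ID = {
    "cpu": -1,
    "nvidia": -2,
    "ati": -3,
}


def anon_app_version_id(resource_type):
    """Return the pseudo app_version_id for a resource type (hypothetical helper)."""
    try:
        return ANON_APP_VERSION_ID[resource_type]
    except KeyError:
        raise ValueError("unknown resource type: %r" % (resource_type,))


def anon_credit(avg_credit_for_app, job_size, avg_job_size):
    """Average credit for the app, scaled by job size.

    Scaling relative to an average job size is an assumption; the source
    only says the average credit is "scaled by the job size".
    """
    if avg_job_size <= 0:
        raise ValueError("avg_job_size must be positive")
    return avg_credit_for_app * (job_size / avg_job_size)
```

Under these assumptions, a job twice the average size for an app whose average credit is 50 would be granted 100.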
     404== App plan functions ==
     405
     406App plan functions no longer have to make a FLOPS estimate.
     407They just have to return the peak device FLOPS.
     408
     409The scheduler adjusts this,
     410using the elapsed time statistics,
     411to get the app_version.flops_est it sends to the client
     412(from which job durations are estimated).
     413
    386414== Job runtime estimates ==
    387415
     
    450478== Compatibility ==
    451479
     480