Changes between Initial Version and Version 1 of CreditOptions

Timestamp:: May 1, 2018, 1:04:38 PM (7 years ago)
Author:: davea
Comment:: --

+= Credit options (proposal) =
+"Credit" is a number associated with completed jobs,
+reflecting how much floating-point computation was (or could have been) done.
+For CPU applications the basic formula is:
+1 unit of credit = 1/200 day of runtime on a CPU whose Whetstone benchmark is 1 GFLOPS.
+Whetstone measures peak performance, and applications that do a lot of memory or disk access get lower FLOPS.
+So credit measures peak, not actual, FLOPs.
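As a worked illustration of the basic formula, here is a minimal Python sketch. The function and constant names are invented for this example and are not part of BOINC; it assumes a single-CPU job and a Whetstone benchmark expressed in GFLOPS.

```python
SECONDS_PER_DAY = 86400.0
# From the definition above: 1 credit = 1/200 day at 1 GFLOPS,
# so one day at 1 GFLOPS is worth 200 credits.
CREDITS_PER_GFLOPS_DAY = 200.0


def cpu_credit(runtime_seconds: float, whetstone_gflops: float) -> float:
    """Credit for a single-CPU job under the basic formula (illustrative name)."""
    runtime_days = runtime_seconds / SECONDS_PER_DAY
    return runtime_days * whetstone_gflops * CREDITS_PER_GFLOPS_DAY


# One hour on a 3 GFLOPS CPU: (1/24) * 3 * 200 = 25 credits.
print(cpu_credit(3600, 3.0))
```

Note that the result depends only on runtime and benchmark, which is why it tracks peak rather than actual FLOPs.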
+Credit is used for two purposes:
+1) For users, to see their rate of progress,
+to compete with other users or teams,
+and to compare the performance of hosts.
+2) To get an estimate of the peak performance available to a particular project,
+or of the volunteer host pool as a whole.
+For 2) we care only about averages.
+For 1) we also care about parity between similar jobs;
+users get upset if someone else gets a lot more credit for a similar job.
+BOINC provides 4 ways of determining credit.
+The choice (per app) depends on the properties of the app:
+ * If you can estimate a job's FLOPs in advance, use '''pre-assigned''' credit.
+ * Else if you can estimate a job's FLOPs after it completes, use '''post-assigned''' credit.
+ * Else if the app has only CPU versions, use '''runtime-based''' credit.
+ * Else use '''adaptive credit'''.
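The decision list above can be written as a small function. This is only a restatement of the bullets in Python; the function name and boolean parameters are hypothetical and do not correspond to any BOINC configuration option.

```python
def choose_credit_method(flops_known_in_advance: bool,
                         flops_known_after_completion: bool,
                         cpu_only: bool) -> str:
    """Return the credit method suggested by the decision list (illustrative)."""
    if flops_known_in_advance:
        return "pre-assigned"
    if flops_known_after_completion:
        return "post-assigned"
    if cpu_only:
        return "runtime-based"
    return "adaptive"


# An app with GPU versions whose FLOPs cannot be estimated at all:
print(choose_credit_method(False, False, False))
```

The order of the checks matters: earlier methods are preferred whenever their precondition holds.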
+== Pre-assigned credit ==
+You can use this if each job does the same computation.
+Measure the runtime on a machine with known Whetstone benchmarks.
+Pick a machine with enough RAM that you're not paging.
+The credit is then
+credit = (runtime in days) * benchmark * ncpus * 200
+where benchmark is the machine's Whetstone benchmark in GFLOPS and
+ncpus is the number of CPUs used by the app version; use a sequential version if possible.
+You can also use it if the runtime is a linear function of
+some job attribute (e.g. input file size) that's known in advance.
+Currently, pre-assigned credit is specified with the --additional_xml flag or argument to create_work (command line or API).
+This is ugly.
+Proposal: make credit an official argument to both the local and remote job-submission APIs.
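The pre-assigned calculation described in this section, including the case where runtime is a linear function of a job attribute, might be sketched as follows. All names are hypothetical, the benchmark is assumed to be in GFLOPS, and the linear model is assumed to be fitted from two reference measurements; none of this is BOINC API.

```python
SECONDS_PER_DAY = 86400.0


def reference_credit(runtime_seconds, benchmark_gflops, ncpus=1):
    """Credit for a measured run: (runtime in days) * benchmark * ncpus * 200."""
    runtime_days = runtime_seconds / SECONDS_PER_DAY
    return runtime_days * benchmark_gflops * ncpus * 200.0


def linear_credit_model(size_a, credit_a, size_b, credit_b):
    """Fit credit = intercept + slope * size through two reference jobs."""
    slope = (credit_b - credit_a) / (size_b - size_a)
    intercept = credit_a - slope * size_a
    return lambda size: intercept + slope * size


# Two reference jobs measured on a 2 GFLOPS machine (made-up numbers):
small = reference_credit(43200, 2.0)   # 12 hours, input size 100
large = reference_credit(86400, 2.0)   # 24 hours, input size 300
credit_for = linear_credit_model(100, small, 300, large)
print(credit_for(200))                 # credit to pre-assign to a size-200 job
```

The value returned by the model is what would be attached to the job at submission time.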
+== Post-assigned credit ==
+Use this if you can estimate the FLOPs done by a completed job,
+based on the contents of its output files or stderr.
+For example, if your app has an outer loop,
+and you can measure (as above) the credit C due for each iteration,
+the job credit is C times the number of iterations performed.
+To use this: in your validator, have the init_result() function return the credit for the job.
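The arithmetic behind the outer-loop example can be sketched in Python. This is not the validator's init_result() itself; it only shows the calculation such a function might perform. The stderr line format ("iterations: N") and all names are assumptions made for the illustration.

```python
import re

# Hypothetical stderr convention: the app prints a line such as "iterations: 40".
ITERATIONS_RE = re.compile(r"^iterations:\s*(\d+)\s*$", re.MULTILINE)


def post_assigned_credit(stderr_text, credit_per_iteration):
    """Job credit = C * (number of iterations reported in stderr)."""
    match = ITERATIONS_RE.search(stderr_text)
    if match is None:
        raise ValueError("no iteration count found in stderr")
    return credit_per_iteration * int(match.group(1))


stderr = "starting\niterations: 40\ndone\n"
print(post_assigned_credit(stderr, 0.5))
```

Here credit_per_iteration is the value C measured on a reference machine, as in the pre-assigned case.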
+== Runtime-based credit ==
+Use this if the app has only CPU app versions.
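+The "claimed credit" for a job instance is runtime*ncpus*benchmark, in the same units as the basic formula above (runtime in days, benchmark in GFLOPS, scaled by 200).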
+To use this: pass the --credit_from_runtime option to the app's validator.
+The app's efficiency (the ratio of actual FLOPS to peak FLOPS)
+can vary somewhat between hosts (e.g. because of different memory speeds,
+or because small RAM causes paging).
+Therefore claimed credit will vary between identical jobs,
+but generally only by a factor of 2 or so.
+Runtime-based credit can't be used if the app has GPU versions
+because efficiency can vary by orders of magnitude between CPU and GPU versions.
+Currently this assigns the claimed credit of the canonical result to all results.
+TODO: average over valid results.
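To make the difference between the current behavior and the TODO concrete, here is a small Python sketch. It assumes claimed credit uses the same scale as the basic formula (runtime in days, benchmark in GFLOPS, times 200); the function names and the example numbers are invented for illustration.

```python
from statistics import mean

SECONDS_PER_DAY = 86400.0


def claimed_credit(runtime_seconds, ncpus, benchmark_gflops):
    """Claimed credit for one job instance: runtime * ncpus * benchmark, scaled."""
    return (runtime_seconds / SECONDS_PER_DAY) * ncpus * benchmark_gflops * 200.0


def grant_from_canonical(claims, canonical_index):
    """Current behavior: every result gets the canonical result's claim."""
    return claims[canonical_index]


def grant_from_average(claims):
    """Proposed behavior (the TODO): average the claims of all valid results."""
    return mean(claims)


# Two hosts ran the same job with different efficiency (made-up numbers).
claims = [claimed_credit(7200, 1, 2.0), claimed_credit(3600, 1, 3.0)]
print(grant_from_canonical(claims, 0), grant_from_average(claims))
```

Averaging reduces the dependence of granted credit on which result happens to be chosen as canonical.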
+== Adaptive credit ==
+Use this if you have GPU apps, and are unable to estimate FLOPs even after job completion.
+This method maintains performance statistics on a (host, app version) level,
+and uses these to normalize credit between CPU and GPU versions.
+See [CreditNew].
+To use this: do nothing; it is the default.
+If you use this, the adaptation will happen faster if you provide
+values for workunit fp_ops_est that are correlated with the actual FLOPs.
+Use a constant value if you're not sure.
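The idea of normalizing credit between app versions can be illustrated with a deliberately simplified Python sketch. This is not the CreditNew algorithm: it keeps statistics per app version only (not per host), uses a plain mean, and scales every version to the one with the lowest average. The class, its methods, and the numbers are all hypothetical.

```python
from collections import defaultdict


class VersionStats:
    """Running mean of (peak FLOP count / fp_ops_est) per app version."""

    def __init__(self):
        self._sum = defaultdict(float)
        self._count = defaultdict(int)

    def record(self, version, runtime_seconds, peak_gflops, fp_ops_est):
        # Peak FLOP count: what the device could have done in this runtime.
        peak_flop_count = runtime_seconds * peak_gflops * 1e9
        self._sum[version] += peak_flop_count / fp_ops_est
        self._count[version] += 1

    def mean(self, version):
        return self._sum[version] / self._count[version]

    def scale(self, version):
        """Factor that brings this version's average down to the lowest one."""
        lowest = min(self.mean(v) for v in self._count)
        return lowest / self.mean(version)


stats = VersionStats()
stats.record("cpu", runtime_seconds=1000, peak_gflops=2.0, fp_ops_est=1e12)
stats.record("gpu", runtime_seconds=100, peak_gflops=200.0, fp_ops_est=1e12)
print(stats.scale("cpu"), stats.scale("gpu"))
```

In this toy example the GPU version runs far below its peak, so its peak-based claim is scaled down to match the CPU version. It also shows why fp_ops_est values correlated with actual FLOPs help: they reduce the noise in the per-version averages.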