* Eliminate the need for projects to supply FLOPs estimates for jobs
* Eliminate the need for projects to supply FLOPS estimates for app versions (in app_plan())
* Improve job completion time estimates by using different duration correction factors for different app versions
* Simplify the credit-granting policy

== App units ==

'''App units''' are a project-defined measure of how much computation a completed job did.
They are typically a count of iterations of the app's main loop.
They should be approximately proportional to FLOPs performed,
but it doesn't matter what the proportion is
(i.e., you don't have to count the FLOPs in your main loop).

Apps use a new API to report how many app units they've performed
(call this right before boinc_exit()):
{{{
void boinc_app_units(double x);
}}}
This is optional (the default is one).
However, if an app's jobs have highly variable size and don't report app units,
BOINC will tend to give too little credit for them.
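As a minimal sketch of how an app might use this API: the main-loop structure below and the stub BOINC functions are assumptions for illustration only; a real app links against the BOINC library instead of defining the stubs.
{{{
#include <cstdio>

// Stubs standing in for the BOINC API, so the sketch is self-contained.
// A real app gets these from the BOINC library.
static double reported_app_units = 0;
void boinc_app_units(double x) { reported_app_units = x; }
void boinc_exit(int) { /* in a real app: finalize checkpointing and exit */ }

// Hypothetical science app: one app unit per main-loop iteration.
void run_job(int n_iterations) {
    double app_units = 0;
    for (int i = 0; i < n_iterations; i++) {
        // ... one iteration of the science computation ...
        app_units += 1;
    }
    boinc_app_units(app_units);  // report right before exiting
    boinc_exit(0);
}
}}}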

The following API functions are deprecated:
{{{
boinc_ops_per_cpu_second()
boinc_ops_cumulative()
}}}

Job submission has a new optional argument '''predicted_app_units''':
the predicted number of app units the job will take.
The default is one.
The '''rsc_fpops_est''' and '''rsc_fpops_bound''' arguments
to job creation are deprecated.

The predictions don't have to be exact.
In fact, it's OK if they're systematically too high or low,
as long as there's a linear correlation.
However, if predicted app units are not linearly correlated with
actual app units, bad completion time estimates will result.

The server maintains the mean and variance of both actual and predicted app units,
and scales predictions by the ratio of the means.
It also maintains '''au_variance''', the variance of (actual/predicted),
which is used for completion time estimates and bounds.
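A sketch of what this server-side bookkeeping could look like. The struct and function names are illustrative, not actual BOINC code; running mean and variance are kept with Welford's online algorithm.
{{{
#include <cassert>
#include <cmath>

// Running mean/variance via Welford's online algorithm.
struct RunningStats {
    long n = 0;
    double mean = 0, m2 = 0;
    void add(double x) {
        n++;
        double d = x - mean;
        mean += d / n;
        m2 += d * (x - mean);
    }
    double variance() const { return n > 1 ? m2 / (n - 1) : 0; }
};

// Per-app statistics: actual and predicted app units, and their ratio.
struct AppStats {
    RunningStats actual, predicted, ratio;  // ratio gives au_variance
    void on_job_completed(double actual_aus, double predicted_aus) {
        actual.add(actual_aus);
        predicted.add(predicted_aus);
        ratio.add(actual_aus / predicted_aus);
    }
    // Scale a job's prediction by the ratio of the means.
    double scaled_prediction(double predicted_aus) const {
        return predicted_aus * (actual.mean / predicted.mean);
    }
    double au_variance() const { return ratio.variance(); }
};
}}}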

== Job completion time estimates and bounds ==

The BOINC client maintains a per-app-version estimate seconds_per_app_unit.
The completion time estimate of a job J is
{{{
seconds_per_app_unit
* J.predicted_aus
* (app.mean_actual_aus/app.mean_predicted_aus + X*au_stdev)
}}}
where X is, say, 2 (meaning that for roughly 95% of jobs, the actual completion time
will be less than the estimate).

The completion time bound for a job is the same,
but with X=4 or so (meaning that roughly 99.99% of jobs will complete without
being aborted due to the time limit).
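The estimate and bound can be sketched directly from the formula above; the struct and field names here mirror the formula and are illustrative.
{{{
#include <cassert>
#include <cmath>

// Illustrative per-app fields, named after the formula in the text.
struct App {
    double mean_actual_aus;
    double mean_predicted_aus;
    double au_variance;     // variance of (actual/predicted)
};

// Estimate with X = 2; bound with X = 4 or so.
double completion_time(double seconds_per_app_unit, double predicted_aus,
                       const App& app, double X) {
    double au_stdev = sqrt(app.au_variance);
    return seconds_per_app_unit * predicted_aus
         * (app.mean_actual_aus / app.mean_predicted_aus + X * au_stdev);
}
}}}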

== Estimating FLOPS per app unit ==

For credit-granting purposes
we want to estimate the number of FLOPs per app unit.

When the scheduler dispatches a job,
it knows the numbers of CPUs and GPUs used,
as well as the peak hardware FLOPS of each device
(for CPUs, this is the Whetstone benchmark result;
for GPUs, it's derived from hardware parameters).
We define '''peak_flops_per_sec(J)'''
as the sum of these counts times the peak hardware FLOPS of the devices.

The server sends the client peak_flops_per_sec(J).
When the client returns a job, it includes a value
raw_flops_per_sec(J).
This is usually the same as peak_flops_per_sec(J)
but it may be less (see note 2 below).

Suppose a job J is executed on a given host using app version V,
and that it reports A app units and uses elapsed time T.
We then define raw_flops(J) as T * raw_flops_per_sec(J).
We define raw_flops_per_au(J) as raw_flops(J)/A.
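These definitions translate directly into code (names mirror the text; not actual BOINC code):
{{{
#include <cassert>

// raw_flops(J) = T * raw_flops_per_sec(J)
double raw_flops(double elapsed_time, double raw_flops_per_sec) {
    return elapsed_time * raw_flops_per_sec;
}

// raw_flops_per_au(J) = raw_flops(J) / A
double raw_flops_per_au(double elapsed_time, double raw_flops_per_sec,
                        double app_units) {
    return raw_flops(elapsed_time, raw_flops_per_sec) / app_units;
}
}}}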

If we run jobs on lots of different hosts,
the distribution of raw_flops_per_au(J) will have variance,
because applications run with different efficiency on different hardware.
Suppose an app is memory-intensive.
Running on machine A with a fast CPU (say 10 GFLOPS Whetstone) and a slow memory system,
it might do 1 GFLOPS (10% efficiency),
while on machine B with a slow CPU (say 1 GFLOPS Whetstone) and a fast memory system,
it might also do 1 GFLOPS (100% efficiency).
A and B would report raw_flops_per_au(J) as 10 GFLOPS and 1 GFLOPS respectively.

In the absence of FLOPS counting, we can't estimate the true FLOPS/app_unit.
The best we can do is to use raw_flops_per_au(J) from the most efficient hosts.
So what we will do is to use an order statistic,
say the 5th percentile value, from the distribution of raw_flops_per_au(J).
We'll call this est_flops_per_au(app).
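A sketch of computing this order statistic over a buffer of recent samples (function name and buffering are illustrative); std::nth_element gives the k-th smallest value without a full sort.
{{{
#include <algorithm>
#include <cassert>
#include <vector>

// est_flops_per_au(app): the 5th-percentile value of recent
// raw_flops_per_au(J) samples for the app.
double est_flops_per_au(std::vector<double> samples) {
    size_t k = samples.size() / 20;   // index of the 5th-percentile value
    std::nth_element(samples.begin(), samples.begin() + k, samples.end());
    return samples[k];
}
}}}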

Notes:

1) Typically GPUs are much less efficient than CPUs.
If a large fraction of jobs (say, >95%) are being done by GPUs,
then the 5th percentile value won't give us a good estimate.
To deal with this we could maintain separate arrays
for GPU and CPU app versions, and take the min of the two 5th-percentile values.

2) If an application has only very inefficient app versions,
we'll get an inflated estimate of FLOPS/app_unit.
Assuming that GPU apps tend to be more inefficient than CPU apps,
this will happen with projects that have only a GPU app (e.g., GPUGRID.net).
Here's a scheme to address this problem:
a) have the client report raw_flops_per_sec instead of using the server value.

b) pass the client a flag saying that an app has both CPU and GPU versions.

c) if a client has an app version V for an app with only GPU versions,
and it also has app versions V1...Vn for apps that have
both CPU and GPU versions,
then use the average of the est_flops_per_au for V1...Vn
when computing the raw_flops_per_au(J) it reports for V.

The net effect is to propagate efficiency estimates between projects.

3) If an app's jobs have variance that is not reflected in app units,
est_flops_per_au may be way too low
(suppose, e.g., that 5% of jobs end immediately).
A solution to this problem is to define a "reference job"
that's executed on all machines,
and use only that job to compute est_flops_per_au.

== Credit ==

Grant each validated job credit proportional to
{{{
app.mean_app_units * app.est_flops_per_au
}}}

All jobs for a given app get the same credit.
A user doesn't get more credit for a longer-than-average job,
but it averages out in the end.
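A sketch of the resulting credit computation. Since the text only specifies proportionality, the scale factor used here (200 credits per GFLOPS-day) is an assumed placeholder for illustration, not a value taken from this document.
{{{
#include <cassert>
#include <cmath>

// Assumed placeholder scale: 200 credits per day of sustained 1 GFLOPS.
const double CREDIT_PER_FLOP = 200.0 / (1e9 * 86400);

// Every validated job of an app gets the same credit.
double job_credit(double mean_app_units, double est_flops_per_au) {
    double est_flops = mean_app_units * est_flops_per_au;
    return est_flops * CREDIT_PER_FLOP;
}
}}}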

This removes the incentive for hackers to report exaggerated
job statistics (say, app units).

However, it introduces a new form of credit cheating: "cherry picking",
i.e. completing only short jobs
(based on input files, or on trial execution)
and aborting or discarding others.

This can be discouraged by mechanisms that reduce a host's
jobs/day quota when a job is aborted or times out.

== Anonymous platform notes ==

== Implementation ==

=== Protocol ===

Request message:
* for each reported job, add <app_units>
* for each app version, add <seconds_per_app_unit>

Reply message:
* for each app, <est_flops_per_au>
* for each app version, <

=== Database ===

app.est_flops_per_au

=== Transitioner ===

The transitioner maintains, for each app,
an array of the N most recent values of raw_flops_per_au(J) for that app.
When this array is full, the transitioner computes the 5th percentile value of the array.
It then exponentially averages this value with app.est_flops_per_au.
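This transitioner logic can be sketched as follows; N, the averaging weight alpha, and all names are illustrative assumptions, not actual BOINC code.
{{{
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Buffer the N most recent raw_flops_per_au samples for an app; when full,
// fold the 5th-percentile value into est_flops_per_au with an exponential average.
struct AppEstimator {
    size_t N;
    double alpha;               // exponential-averaging weight (assumed parameter)
    double est_flops_per_au;
    std::vector<double> samples;

    AppEstimator(size_t n, double a, double initial)
        : N(n), alpha(a), est_flops_per_au(initial) {}

    void add_sample(double raw_flops_per_au) {
        samples.push_back(raw_flops_per_au);
        if (samples.size() < N) return;
        size_t k = N / 20;      // 5th-percentile index
        std::nth_element(samples.begin(), samples.begin() + k, samples.end());
        est_flops_per_au = alpha * samples[k] + (1 - alpha) * est_flops_per_au;
        samples.clear();        // start filling the next window
    }
};
}}}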

=== Scheduler ===