Changes between Version 3 and Version 4 of CreditNew
Timestamp: Nov 3, 2009, 9:25:51 AM
= CreditNew =
For GPUs, it's given by a manufacturer-supplied formula.

However, other factors affect application performance.
For example, applications access memory,
and the speed of a host's memory system is not reflected
in its Whetstone score.
…
is the ratio of actual FLOPS to peak FLOPS.

GPUs typically have a much higher (50-100X) peak FLOPS than CPUs.
However, application efficiency is typically lower
(very roughly, 10% for GPUs, 50% for CPUs).
…
about the same amount of credit per day for a given host.

It's easy to show that both goals can't be satisfied simultaneously.

== The first credit system ==
…
}}}
There were then various schemes for taking the
average or min claimed credit of the replicas of a job,
and using that as the "granted credit".
…
We call this approach "Actual-FLOPs-based".

SETI@home's application allowed counting of FLOPs,
and they adopted this system,
adding a scaling factor so that average credit per job
was the same as the first credit system.
…
== Goals of the new (third) credit system ==

* Completely automated - projects don't have to
change code, settings, etc.
…

* Limited project neutrality: different projects should grant
about the same amount of credit per CPU hour, averaged over hosts.
Projects with GPU apps should grant credit in proportion
to the efficiency of the apps.
…
== Peak FLOP Count (PFC) ==

This system uses the Peak-FLOPS-based approach,
but addresses its problems in a new way.
…
For now, though, we'll just use the scheduler's estimate.

The granted credit for a job J is proportional to PFC(J),
but is normalized in the following ways:

== Cross-version normalization ==

If a given application has multiple versions (e.g., CPU and GPU versions),
the granted credit per job is adjusted
so that the average is the same for each version.
The adjustment is always downwards:
we maintain the average PFC*(V) of PFC() for each app version V,
and let X be the minimum of these averages.
An app version V's jobs are then scaled by the factor
{{{
S(V) = X / PFC*(V)
}}}

The result for a given job J
is called "Version-Normalized Peak FLOP Count", or VNPFC(J):
{{{
VNPFC(J) = PFC(J) * (X / PFC*(V))
}}}

Notes:
* This addresses the common situation
where an app's GPU version is much less efficient than the CPU version
…
It's not exactly "Actual FLOPs", since the most efficient
version may not be 100% efficient.
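The scaling and normalization formulas above can be sketched in code as follows. This is a minimal illustration, not BOINC's actual server code; the `AppVersion` struct and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sketch of cross-version normalization.
// Each app version V tracks the average PFC of its jobs, written PFC*(V).
struct AppVersion {
    double avg_pfc;  // PFC*(V)
};

// S(V) = X / PFC*(V), where X is the minimum PFC*(V) over all versions.
// Since X <= PFC*(V), the factor is <= 1: the adjustment is always downwards.
double version_scale(const std::vector<AppVersion>& versions,
                     const AppVersion& v) {
    double x = versions[0].avg_pfc;
    for (const AppVersion& w : versions) x = std::min(x, w.avg_pfc);
    return x / v.avg_pfc;
}

// VNPFC(J) = PFC(J) * S(V), where V is the version that processed job J.
double vnpfc(double pfc_j, const std::vector<AppVersion>& versions,
             const AppVersion& v) {
    return pfc_j * version_scale(versions, v);
}
```

For example, if a CPU version averages 100 GFLOPs of PFC per job and a less efficient GPU version averages 400, then S(CPU) = 1 and S(GPU) = 0.25, so a GPU job with PFC 600 is credited as if it had VNPFC 150.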

== Cross-project normalization ==
…
If an application has both CPU and GPU versions,
then the version normalization mechanism uses the CPU
version as a "sanity check" to limit the credit granted to GPU jobs.

Suppose a project has an app with only a GPU version,
so there's no CPU version to act as a sanity check.
If we grant credit based only on GPU peak speed,
the project will grant much more credit per GPU hour than other projects,
violating limited project neutrality.

A solution to this: if an app has only GPU versions,
then for each version V we let
S(V) be the average scaling factor
for that GPU type among projects that do have both CPU and GPU versions.
This factor is obtained from a central BOINC server.
V's jobs are then scaled by S(V) as above.

Notes:

* Projects will run a periodic script to update the scaling factors.
* Rather than GPU type, we'll probably use plan class,
since e.g. the average efficiency of CUDA 2.3 apps may differ
from that of CUDA 2.1 apps.
* Initially we'll obtain scaling factors from large projects
that have both GPU and CPU apps (e.g., SETI@home).
Eventually we'll use an average (weighted by work done) over multiple projects
(see below).

== Host normalization ==

For a given application,
all hosts should get the same average granted credit per job.
To ensure this, for each application A we maintain the average VNPFC*(A),
and for each host H we maintain VNPFC*(H, A).
…
some (presumably larger) jobs to GPUs with more processors.
To deal with this, we can weight jobs by workunit.rsc_flops_est.

== Computing averages ==

* Averages are computed as a moving average,
so that the system will respond quickly as job sizes change
or new app versions are deployed.

== Jobs versus app units ==

== Cross-project scaling factors ==

== Replication and cheating ==
…
double min_avg_vnpfc; // min value of app_version.avg_vnpfc
}}}
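The moving average mentioned under "Computing averages" could be implemented as an exponential moving average, sketched below. This is an assumption for illustration; the actual BOINC averaging code and its decay parameter may differ.

```cpp
#include <cassert>

// Hypothetical sketch: an exponential moving average for quantities like
// PFC*(V) or VNPFC*(H, A). Recent samples dominate, so the estimate adapts
// quickly when job sizes change or a new app version is deployed.
struct MovingAverage {
    double value = 0.0;  // current estimate
    int n = 0;           // number of samples seen
    double alpha;        // weight of each new sample, 0 < alpha <= 1

    explicit MovingAverage(double a) : alpha(a) {}

    void update(double sample) {
        if (n == 0) value = sample;              // seed with the first sample
        else value += alpha * (sample - value);  // decay toward new samples
        n++;
    }
};
```

With alpha = 0.5, an average seeded at 100 moves to 150 and then 175 after two samples of 200, converging toward the new job size within a handful of jobs rather than over the host's whole history.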