Generalizing credit
Note: in the following
- FLOP = floating-point operation
- FLOPs = plural of FLOP
- FLOPS = FLOPs per second
The current credit system is based on FLOPs: a 1 GFLOPS computer running BOINC all the time gets 200 credits per day. We did things this way because BOINC was designed to support scientific computing, where most apps are floating-point intensive and FLOPS is the standard unit of performance (for example, supercomputer performance is measured in FLOPS).
For grant-writing purposes I need to be able to say "BOINC has a peak performance of X PetaFLOPS"; the current credit system lets me do this.
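As a rough illustration of the arithmetic above, the following Python sketch converts between FLOPs and credit using only the stated rate of 200 credits per day for a 1 GFLOPS computer. The function and constant names are hypothetical and are not taken from the BOINC code base.

```python
SECONDS_PER_DAY = 86_400
CREDITS_PER_GFLOPS_DAY = 200.0  # rate stated in the text


def credit_for_flops(flops: float) -> float:
    """Credit earned for a given number of floating-point operations."""
    # One "GFLOPS-day" is the work done by a 1 GFLOPS machine in one day.
    gflops_days = flops / (1e9 * SECONDS_PER_DAY)
    return gflops_days * CREDITS_PER_GFLOPS_DAY


def gflops_from_daily_credit(credit_per_day: float) -> float:
    """Sustained GFLOPS implied by an average daily credit figure."""
    return credit_per_day / CREDITS_PER_GFLOPS_DAY
```

Read in the second direction, this is what lets a combined daily credit total be quoted as a FLOPS figure, as long as credit is only ever granted for FLOPs.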
Recently, a new project (Bitcoin Utopia, or BU) started. Their jobs involve Bitcoin mining, which consists of computing SHA256 hash functions. This is an integer algorithm. It can be done on CPUs and GPUs, but it can be done much faster on ASICs. These ASICs can only do hashing; they can't do floating-point math, and they can't be used by any BOINC project other than BU.
The question was: how should BU grant credit for its jobs? One approach is to decide on an "equivalence" between hashes and FLOPs, and assign credit based on the current formula. How many FLOPs is a hash equivalent to? One way to estimate this is to take a CPU or GPU, measure its FLOPS and its hashes/sec, and divide the first by the second. Depending on the device, this gives an answer in the range of 1,000 to 10,000 FLOPs per hash.
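The equivalence calculation can be sketched in Python as follows. The device figures in the usage note are made-up illustrative numbers, not measurements, and the function names are hypothetical; only the 200 credits per GFLOPS-day rate comes from the text.

```python
CREDITS_PER_GFLOPS_DAY = 200.0  # rate stated in the text


def flops_per_hash(device_flops: float, device_hashes_per_sec: float) -> float:
    """Equivalence factor measured on one general-purpose device."""
    if device_hashes_per_sec <= 0:
        raise ValueError("hash rate must be positive")
    return device_flops / device_hashes_per_sec


def daily_credit_for_hash_rate(hashes_per_sec: float, equivalence: float) -> float:
    """Daily credit if hashes are converted to 'equivalent FLOPs'."""
    equivalent_gflops = hashes_per_sec * equivalence / 1e9
    return equivalent_gflops * CREDITS_PER_GFLOPS_DAY
```

For example, a hypothetical GPU measured at 100 GFLOPS and 50 million hashes/sec gives `flops_per_hash(100e9, 50e6)`, i.e. 2,000 FLOPs per hash. Applying that factor to a hypothetical device that hashes at 10^12 hashes/sec yields 400 million credits per day, even though the device performs no floating-point work at all.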
BU did these things, which are completely reasonable. But it turns out - because ASICs are so fast - that BU is granting huge amounts of credit. With fewer than 1,000 users, BU is granting more credit than the 300,000 users of all other projects combined.
This situation has 2 undesirable consequences:
- Credit no longer measures FLOPs; BOINC's combined average credit no longer measures its peak performance in FLOPS.
- The competitive balance between projects is lost. BU will always grant far more credit than other projects, for a type of computation that is specific to BU and is not usable for other projects.
Proposal
The basic problem is that we have a credit system based on FLOPs, but we want to give credit for things (like hashes) that are not FLOPs. A similar situation actually already exists in BOINC. We'd like to be able to give credit for disk storage and network communication; some projects have applications that use these resources rather than computing. But there's no obvious way to translate storage or bandwidth into "equivalent FLOPs", and even if there were, we'd be destroying the meaning of credit as a measure of FLOPs.
So, I propose that, rather than trying to shoehorn everything into one number, we keep track of multiple types of credit. In particular, I propose 4 types:
- Computing credit: general-purpose FLOPs, i.e. what we have now.
- Storage credit, measured in byte-seconds (bytes stored multiplied by the time they are held).
- Network credit, measured in bytes, the sum of upload and download.
- Project-specific credit. Projects can define and grant this however they like. For BU, this would be hashes. Other projects, like Wildlife@home, might grant credit for a human activity like annotating video.
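To make the units of the storage and network types concrete, here is a minimal Python sketch. It assumes storage credit is the product of bytes stored and seconds held, and that network credit is the plain sum of bytes uploaded and downloaded; the function names are hypothetical.

```python
def storage_credit(bytes_stored: int, seconds_held: int) -> int:
    """Storage credit in byte-seconds (assumed: bytes times seconds)."""
    if bytes_stored < 0 or seconds_held < 0:
        raise ValueError("inputs must be non-negative")
    return bytes_stored * seconds_held


def network_credit(bytes_uploaded: int, bytes_downloaded: int) -> int:
    """Network credit in bytes: upload plus download."""
    if bytes_uploaded < 0 or bytes_downloaded < 0:
        raise ValueError("inputs must be non-negative")
    return bytes_uploaded + bytes_downloaded
```

Under these assumptions, holding 1 GB for one day is `storage_credit(10**9, 86_400)` byte-seconds. Neither quantity is converted into FLOPs.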
The BOINC database will maintain each of these types of credit for each host, user, and team. It will store both total and recent average for each type.
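One possible shape for this bookkeeping is sketched below in Python. It is not the BOINC database schema: the class and field names are hypothetical, and the recent-average rule (an exponentially decayed sum with a one-week half-life) is an assumption chosen only to make the example concrete.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class CreditType(Enum):
    COMPUTING = "computing"
    STORAGE = "storage"
    NETWORK = "network"
    PROJECT_SPECIFIC = "project_specific"


HALF_LIFE_DAYS = 7.0  # assumption for this sketch


def _zeroed() -> Dict[CreditType, float]:
    return {t: 0.0 for t in CreditType}


@dataclass
class CreditRecord:
    """Per-type totals and recent averages for one host, user, or team."""
    total: Dict[CreditType, float] = field(default_factory=_zeroed)
    recent_average: Dict[CreditType, float] = field(default_factory=_zeroed)

    def grant(self, credit_type: CreditType, amount: float) -> None:
        """Add credit of one type; other types are left untouched."""
        if amount < 0:
            raise ValueError("credit amounts must be non-negative")
        self.total[credit_type] += amount
        self.recent_average[credit_type] += amount

    def decay(self, elapsed_days: float) -> None:
        """Age every recent average; totals never decay."""
        factor = 0.5 ** (elapsed_days / HALF_LIFE_DAYS)
        for t in CreditType:
            self.recent_average[t] *= factor
```

Each host, user, and team would own one such record, so a project granting only hashes would change the project-specific entries and leave the computing entries as they were.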
Wherever we show credit on project web sites - leader boards, user and team pages, etc. - we'll show one or more credit types; this will be configurable by the project.
The new types of credit will be included in the XML statistics files exported by projects. Statistics sites (such as BOINCStats) will be extended to show the new types of credit.
Discussion
Eric pointed out the possibility of a variant of the SETI@home app that uses an ASIC or FPGA to compute FFTs. What if these were 1000X faster than GPUs or CPUs? We'd have the same problem as we do now with BU.
My feeling about this is that computing credit should measure 'general-purpose' FLOPs, i.e. FLOPs that are usable by most science applications. FFT FLOPs are not general-purpose. So the right thing would be for SETI@home to grant both computing credit and project-specific credit. CPU and GPU jobs would be granted both; jobs done by ASICs or FPGAs would be granted only project-specific credit.
Similarly, BU could grant computing credit for mining jobs done by CPU or GPU; but for ASIC jobs it would grant only project-specific credit.
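The rule described in the last two paragraphs can be written down as a small Python sketch. The device labels and the function name are hypothetical; the policy itself (CPU and GPU jobs earn both types, ASIC and FPGA jobs earn only project-specific credit) follows the text.

```python
from typing import Set

GENERAL_PURPOSE_DEVICES = {"cpu", "gpu"}
SPECIAL_PURPOSE_DEVICES = {"asic", "fpga"}


def credit_types_for_job(device_kind: str) -> Set[str]:
    """Return the credit types a job should earn, by device kind."""
    kind = device_kind.lower()
    if kind in GENERAL_PURPOSE_DEVICES:
        # General-purpose FLOPs count toward computing credit as well.
        return {"computing", "project_specific"}
    if kind in SPECIAL_PURPOSE_DEVICES:
        # Special-purpose hardware is not usable by most science apps.
        return {"project_specific"}
    raise ValueError(f"unknown device kind: {device_kind!r}")
```

Where the line between the two sets falls is a policy decision, so moving a device kind from one set to the other is the only change needed if that decision changes.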
Of course this is all subjective and fuzzy; you might argue that GPU FLOPs are not general-purpose because some apps don't map well to GPUs. But we need to draw a line somewhere, and I think we've reached a point where GPUs can be considered general-purpose.