Version 6 (modified by Nicolas, 18 years ago)

API for multi-thread apps

Why write a multi-threaded app?

The average number of cores per PC will increase over the next few years, possibly at a faster rate than the average amount of available RAM.

Depending on your application and project, it may be desirable to develop a multi-threaded application. Possible reasons to do this:

Your application's memory footprint is so large that some PCs don't have enough RAM to run a separate copy of the app on each CPU.

You want to reduce the turnaround time of your jobs (either because of human factors, or to reduce server occupancy).

Writing and debugging a multi-threaded app is often hard. You may be able to use existing libraries of numerical "kernels" that are already multi-threaded.

Assumptions

A 'multi-thread app' A uses multiple threads, say Nthreads(A). The average number of processors it actually keeps busy, Ncpus(A), may be less than Nthreads(A) (because of I/O or synchronization).

Ideally, on a host with N CPUs, we want the sum of Ncpus(A) over all running apps to be about N. If it's less, we're leaving CPU time unused. If it's more:

we increase latency without increasing throughput
we use more RAM than needed
we incur higher synchronization overhead

We assume that applications may be able to change Nthreads(A) dynamically in response to hints from BOINC. Nthreads(A) need not be equal to the hint.

Example: suppose

we have an 80-core CPU
app A can use 1,2,4,8,16,32 threads
app B can use 1,2,4,8,16,32,64 threads

Then we want (Nthreads(A), Nthreads(B)) to be either (16,64), which uses all 80 cores, or (32,32), which uses 64, most of the time.
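The choice in this example can be sketched as a small search. This is only an illustration: it assumes the goal is the largest total thread count that does not exceed the number of cores, which is one reading of "about N", and it treats thread counts as if they were CPU counts. The `thread_split` type and the `best_split()` function are hypothetical names, not part of BOINC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: thread counts chosen for two apps. */
typedef struct {
    int a;      /* threads chosen for app A */
    int b;      /* threads chosen for app B */
    int total;  /* a + b */
} thread_split;

/* Try every pair of supported thread counts and keep the pair with the
 * largest total that does not exceed ncpus. On a tie, the first pair
 * found is kept. Returns {0, 0, 0} if no pair fits. */
thread_split best_split(const int *a_opts, size_t na,
                        const int *b_opts, size_t nb, int ncpus) {
    assert(na > 0 && nb > 0 && ncpus > 0);
    thread_split best = {0, 0, 0};
    for (size_t i = 0; i < na; i++) {
        for (size_t j = 0; j < nb; j++) {
            int total = a_opts[i] + b_opts[j];
            if (total <= ncpus && total > best.total) {
                best.a = a_opts[i];
                best.b = b_opts[j];
                best.total = total;
            }
        }
    }
    return best;
}
```

With the thread counts listed above and 80 cores, this search picks 16 threads for A and 64 for B. A real policy would also weigh Ncpus(A) rather than raw thread counts.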

Proposal

API functions:

int boinc_target_nthreads();
void boinc_actual_nthreads(int);

An application calls boinc_target_nthreads() periodically, at points where it is able to change its number of threads. It calls boinc_actual_nthreads() to report its actual number of threads.
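A minimal sketch of how an application might use the two proposed calls. The two `boinc_*` functions here are local stand-ins backed by plain variables so the example compiles on its own; `choose_nthreads()` and `adjust_threads()` are hypothetical helpers, and the rule "largest supported count not above the hint" is an assumption, not something this proposal specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the proposed API, backed by plain variables.
 * In a real app these would come from the BOINC library. */
static int g_hint = 1;      /* what the client currently suggests */
static int g_reported = 0;  /* what the app last reported */

int boinc_target_nthreads(void) { return g_hint; }
void boinc_actual_nthreads(int n) { g_reported = n; }

/* Largest supported thread count that does not exceed the hint.
 * Falls back to the smallest supported count if the hint is lower.
 * `supported` must be sorted in ascending order. */
int choose_nthreads(const int *supported, size_t n, int hint) {
    assert(n > 0);
    int chosen = supported[0];
    for (size_t i = 0; i < n; i++) {
        if (supported[i] <= hint) chosen = supported[i];
    }
    return chosen;
}

/* Call this at a point where the app is able to change its thread count.
 * Returns the new thread count and reports it to the client. */
int adjust_threads(const int *supported, size_t n, int current) {
    int chosen = choose_nthreads(supported, n, boinc_target_nthreads());
    if (chosen != current) {
        /* Application-specific: grow or shrink the worker pool here. */
    }
    boinc_actual_nthreads(chosen);
    return chosen;
}
```

An app that supports 1, 2, 4, 8, 16 or 32 threads and receives a hint of 20 would settle on 16 and report 16, which illustrates why the actual count need not equal the hint.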

A workunit (WU) database record can specify "max average ncpus", an estimate of Ncpus(A) on a host with arbitrarily many CPUs. The client and scheduler use this to estimate completion time.

Implementation

Shared-memory messages:

core->app (process control channel): <target_nthreads>
app->core (process control channel): <actual_nthreads>
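The two messages could be written and read along the following lines. The wire format is an assumption: this page gives only the tag names, so the sketch guesses that each message carries an integer between an opening and a closing tag. `write_int_msg()` and `parse_int_msg()` are hypothetical helpers, not BOINC library functions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write "<tag>value</tag>\n" into buf.
 * Returns 0 on success, -1 if the message does not fit. */
int write_int_msg(char *buf, size_t size, const char *tag, int value) {
    assert(buf && tag && size > 0);
    int n = snprintf(buf, size, "<%s>%d</%s>\n", tag, value, tag);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}

/* Look for "<tag>" in buf and parse the integer that follows it.
 * Returns 1 and stores the number in *value if found, else returns 0. */
int parse_int_msg(const char *buf, const char *tag, int *value) {
    assert(buf && tag && value);
    char open[64];
    int n = snprintf(open, sizeof open, "<%s>", tag);
    if (n < 0 || (size_t)n >= sizeof open) return 0;  /* tag too long */
    const char *p = strstr(buf, open);
    if (!p) return 0;
    p += n;                        /* skip past the opening tag */
    char *end;
    long v = strtol(p, &end, 10);
    if (end == p) return 0;        /* no digits after the tag */
    *value = (int)v;
    return 1;
}
```

Under this assumed format, the client would write a `target_nthreads` message into the process control channel, and the app would parse it and answer with an `actual_nthreads` message built the same way.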

The client maintains an estimate of CPU efficiency for each job and uses it to scale target_nthreads.

Implementation (enforce_schedule()): as we schedule jobs, decrement the available CPU count by each job's scaled actual_nthreads. rr_simulation() needs to be modified too.
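The CPU accounting could look roughly like this. It is a sketch under stated assumptions, not the client's actual enforce_schedule(): "scaled" is taken to mean actual_nthreads multiplied by a per-job CPU efficiency in (0, 1], and a job is scheduled only if its scaled cost fits in the CPUs that remain. The `job` struct and `schedule_jobs()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-job record for this sketch. */
typedef struct {
    int actual_nthreads;    /* last value reported by the app */
    double cpu_efficiency;  /* assumed: average CPUs used per thread, in (0, 1] */
    int scheduled;          /* set to 1 if the job is picked to run */
} job;

/* Walk the jobs in priority order. Each job that fits is scheduled and
 * the remaining CPU count is decremented by its scaled thread count.
 * Returns the number of CPUs left unassigned. */
double schedule_jobs(job *jobs, size_t n, int ncpus) {
    assert(ncpus > 0);
    double cpus_left = (double)ncpus;
    for (size_t i = 0; i < n; i++) {
        double cost = jobs[i].actual_nthreads * jobs[i].cpu_efficiency;
        jobs[i].scheduled = 0;
        if (cost <= cpus_left) {
            jobs[i].scheduled = 1;
            cpus_left -= cost;
        }
    }
    return cpus_left;
}
```

For example, on an 8-CPU host a job with 8 threads at efficiency 0.5 is charged 4 CPUs, which leaves room for further jobs even though its thread count alone equals the CPU count.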
