Changes between Version 1 and Version 2 of LowLatency


Timestamp:
Apr 25, 2007, 1:21:16 PM (17 years ago)
Author:
Nicolas
Comment:

Required manual changes to automatic conversion.

  • LowLatency

    v1 v2  
    11= Low-latency computing =
    22
    3       BOINC was originally designed for high-throughput computing, and one of its basic design goals was to minimize the number of scheduler RPCs (in order to reduce server load and increase scalability). In particular, when a client requests work from a server and there is none, the client uses exponential backoff, up to a maximum backoff of 1 day or so. This policy limits the number of scheduler requests to (roughly) one per job. However, this backoff policy is inappropriate for '''low-latency''' computing, by which we mean projects whose tasks must be completed in a few minutes or hours. Such projects require a '''minimum connection rate''', rather than seeking to minimize the connection rate.
     3BOINC was originally designed for high-throughput computing, and one of its basic design goals was to minimize the number of scheduler RPCs (in order to reduce server load and increase scalability). In particular, when a client requests work from a server and there is none, the client uses exponential backoff, up to a maximum backoff of 1 day or so. This policy limits the number of scheduler requests to (roughly) one per job. However, this backoff policy is inappropriate for '''low-latency''' computing, by which we mean projects whose tasks must be completed in a few minutes or hours. Such projects require a '''minimum connection rate''', rather than seeking to minimize the connection rate.
    44
    55For example, if you need to get batches of 10,000 jobs completed with 5-minute latency, and each job takes 2 minutes of computing, you'll need to arrange to get 10,000 scheduler requests every 3 minutes (and you'll need a server capable of handling this request rate).
    66
    77
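The arithmetic in the example above can be sketched as follows. This is only an illustration, not BOINC code: the function name `required_request_rate` and its parameters are hypothetical, and it assumes that each host runs one job per batch and that a job must be fetched early enough to finish inside the latency bound.

```python
def required_request_rate(batch_size, latency_s, job_runtime_s):
    """Return (window_s, requests_per_second) for one batch.

    window_s is the time left for hosts to contact the scheduler:
    the latency bound minus the time each job needs to compute.
    Assumes one job per host per batch (hypothetical simplification).
    """
    window_s = latency_s - job_runtime_s
    if window_s <= 0:
        raise ValueError("job runtime must be shorter than the latency bound")
    return window_s, batch_size / window_s


# Figures from the example: 10,000 jobs, 5-minute latency, 2-minute jobs.
window, rate = required_request_rate(10_000, 5 * 60, 2 * 60)
print(f"10000 requests every {window / 60:.0f} minutes (~{rate:.1f} requests/s)")
```

With these figures the server would have to sustain roughly 56 scheduler requests per second for the duration of each batch.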
    8 === The minimum connection rate ===
    9  Suppose that, at a given time, the project has N hosts online, and that each host has 1 CPU that computes at X FLOPS.
     8== The minimum connection rate ==
     9Suppose that, at a given time, the project has N hosts online, and that each host has 1 CPU that computes at X FLOPS.
    1010
    1111Suppose that the project's work consists of 'batches' of M jobs. Each batch is generated at a particular time, and all the jobs must be completed within time T. For simplicity, assume that a batch is not created until the previous batch has been completed, and that each host is given at most one job from each batch. Suppose that each job takes Y seconds to complete on the representative X-FLOPS CPU.
     
    1818
    1919
    20 === How to do low-latency computing ===
    21  The key component in the above is the ability to control Z, the time between requests for a given host. Starting with version 5.6 of the BOINC client, it is now possible to control this: each scheduler reply can include a tag
    22 
     20== How to do low-latency computing ==
     21The key component in the above is the ability to control Z, the time between requests for a given host. Starting with version 5.6 of the BOINC client, it is now possible to control this: each scheduler reply can include a tag
    2322
    2423{{{<next_rpc_delay>x</next_rpc_delay>}}}
    25       telling the client when to contact the scheduler again. By varying this value, a project can obtain the connection rate it needs to meet its latency bounds.  The current BOINC scheduler code has no support for sending this tag, or for figuring out what its value should be. If you want to do low-latency computing, the scheduler must be modified and extended as follows:
    2624
     25telling the client when to contact the scheduler again. By varying this value, a project can obtain the connection rate it needs to meet its latency bounds.  The current BOINC scheduler code has no support for sending this tag, or for figuring out what its value should be. If you want to do low-latency computing, the scheduler must be modified and extended as follows:
    2726
    2827 * Keep track of how many active hosts you have (this will change over time).