wiki:VolunteerStorage

Version 7 (modified by davea, 13 years ago) (diff)

Name changed from DistributedFileMgt? to VolunteerStorage

Volunteer storage

BOINC provides features that support volunteer storage: that is, distributed data management systems based on volunteer resources.

Storage applications

One application of volunteer storage is pure storage: volunteer hosts are used to store data that originates from the server. Data units have parameters such as target reliability (loss rate), read latency, and read throughput. The application decides which hosts to use and how to replicate data. It may stripe or code the data, since clients don’t need to access the data in its original form.

Other storage applications combine storage and computation in various ways:

  • Archival of computational results: for example, Climateprediction.net proposed storing the large (2 GB) output files of climate model runs on the host for several months, so that they are available to scientists if something interesting is found in the small summary files that are sent to the server.
  • Dataset storage: for example, gene and protein databases could be distributed across a client pool, and could be queried via BLAST or other standard applications. \MapReduce?-type systems also fall in this category.
  • Data stream buffering: for instruments that produce large amounts of data, volunteer storage can provide a large buffer, increasing the time window during which transient events can be re-analyzed.
  • Locality scheduling: a job assignment policy that preferentially sends jobs whose input files are already resident on that client, thus reducing data server load. Data files are “sticky”, and remain resident on hosts until there are no more jobs that use them, at which point they are deleted.

BOINC's volunteer storage architecture

The architecture involves two layers:

http://boinc.berkeley.edu/storage.png

The BOINC distributed file management provides basic mechanisms:

  • Server-side estimation of host parameters such as future availability, latency, and upload/download speed.
  • File transfers between server and client.
  • Possibly peer-to-peer file transfers using Attic.
  • Maintenance of a database table tracking which files are present on which hosts;
  • A mechanism in the client for allocating disk space to projects and deciding when a project must delete files;
  • A mechanism for conveying this information to the server, and for the server to tell the client which files to delete.

The upper layer consists of storage applications. Each of the storage applications listed above has goals that drive the data placement and replication policy. For example, Dataset Storage applications would try to store an amount of data per host in proportion to its available processing power, to minimize the time needed to process the entire data set.

A storage application is implemented as several components:

  • A "plug-in" to the BOINC scheduler, which is called on each scheduler RPC;
  • A daemon program that handles timeouts;
  • Interface programs, for example, programs allowing users to submit and retrieve files.

The BOINC distributed file management API

The distributed file management API includes several functions, each of which can be invoked as a command-line program are as a C++ function call. To use these features, you must include

<msg_to_host/>

in your config.xml.

Sending files to hosts

From an interface program or daemon, call

int send_file(int host_id, const char* file_name)

or use the command line program

send_file --host_id X --file_name Y

From the scheduler plugin, call

int send_file_sched(const char* filename);

Retrieving files

From an interface program or daemon, call

int get_file(int host_id, const char* file_name)

or use the command line program

get_file --host_id X --file_name Y

From the scheduler plugin, call

int get_file_sched(const char* filename);

Deleting files

From an interface program or daemon, call

int delete_file(int host_id, const char* file_name)

or use the command line program

delete_file -host_id X -file_name Y

From the scheduler plugin, call

int delete_file_sched(const char* filename);

Implementation notes

From interface programs, put_file and get_file create a message to the host (in the msg_to_host table). The message is a chunk of XML that describes a "virtual" app and app version (with no associate executable), and a workunit and result that contain a single input or output file. The result has a name of the form file_xfer_*, which tells the scheduler to treat it specially.