Version 4 (modified by davea, 6 years ago)

Computing with BOINC

BOINC is a platform for distributed high-throughput computing, i.e. large numbers of independent compute-intensive jobs, where the performance goal is a high rate of job completion rather than low turnaround time for individual jobs. It also offers low-level mechanisms for distributed data storage.

BOINC has a client/server architecture. The server distributes jobs. The client runs on worker nodes, which execute jobs.
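
The difference between throughput and turnaround can be illustrated with a toy model of the client/server split. The sketch below is not BOINC code and uses no BOINC API; the function name `simulate`, the job durations, and the "give the next job to whichever worker is free first" rule are all assumptions made for illustration. It only shows that adding worker nodes raises the rate of job completion while each individual job takes just as long as before.

```python
import heapq


def simulate(job_durations, num_workers):
    """Toy model (hypothetical, not BOINC's scheduler).

    A 'server' hands independent jobs, in order, to whichever 'worker'
    becomes free first. Returns (makespan, assignments), where makespan
    is the time at which the last job finishes and assignments maps each
    worker id to the list of job ids it ran.
    """
    if num_workers < 1:
        raise ValueError("need at least one worker")

    # Heap of (time_when_free, worker_id); every worker starts idle.
    workers = [(0.0, w) for w in range(num_workers)]
    heapq.heapify(workers)
    assignments = {w: [] for w in range(num_workers)}

    for job_id, duration in enumerate(job_durations):
        time_free, w = heapq.heappop(workers)   # earliest-free worker
        assignments[w].append(job_id)
        heapq.heappush(workers, (time_free + duration, w))

    makespan = max(t for t, _ in workers)
    return makespan, assignments


if __name__ == "__main__":
    jobs = [2, 2, 2, 2]  # four independent jobs, 2 time units each (made up)
    for n in (2, 4):
        makespan, _ = simulate(jobs, n)
        print(f"{n} workers: throughput = {len(jobs) / makespan} jobs per time unit")
```

With these made-up numbers, doubling the workers from 2 to 4 doubles the throughput (from 1.0 to 2.0 jobs per time unit), yet every job still takes 2 time units to run. That is the sense in which the goal is job completion rate rather than the turnaround of any single job.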

BOINC was originally designed for volunteer computing, where the worker nodes are consumer devices (desktop and laptop computers, tablets, smartphones) volunteered by their owners. It addresses the various challenges inherent in this environment (heterogeneity, host churn and unreliability, scale, security, and so on).

There are a number of volunteer-computing BOINC projects, including SETI@home, LHC@home, and IBM World Community Grid. The BOINC client can be "attached" to one or many of these; it processes jobs for the projects to which it is attached.

BOINC can also be used for in-house computing within an organization (e.g. a company). In this case the worker nodes are cluster nodes or other organizational computers, and they are attached to the organization's server.

BOINC is distributed under the LGPL v3 open-source license. It can be used for any purpose (academic, commercial, or private) and can be used with applications that are not open-source.

Getting started

To compute using BOINC, you'll need to set up a BOINC server and configure your applications to run under BOINC. Instructions for doing this are here.

If you're doing in-house computing, install the BOINC client on your computers, and you're done. This is detailed here; we won't discuss it further.

In the volunteer computing case, you'll need to get clients to attach to your server. There are several ways to do this:

  • Create a public-facing web site for your project. Announce it and publicize it using whatever channels are available to you: mass media, social media, newsletters, paid advertising, etc.
  • Contact us and ask to have your project listed by BOINC. You'll be asked to demonstrate that a) your project is doing what you claim it is, and b) you're following a set of security practices. Your project will then a) be announced on the BOINC web site news column, b) be listed on the BOINC web site, and c) appear in the list of projects shown in the BOINC client GUI.
  • Contact us and ask to have your project included in Science United, a framework in which volunteers sign up for science areas instead of projects. You'll need to tell us what types of research your project is doing, and then you'll automatically get computing power from volunteers who have registered an interest in those areas. This has the advantage that you don't have to create a public-facing web site or do any publicity. In addition, you can ask to be included in Science United even before you've created your project. At that point we can tell you roughly how much computing power you'll get, and you can decide whether this justifies the investment in creating a project.

These approaches are not mutually exclusive; you can do any or all of them.

Organizational options

The volunteer computing projects using BOINC vary in terms of their organizational structure and the set of scientists they serve. Examples include:

  • Research group. The project is operated by a single research group, and serves the members of that group. Examples include SETI@home, Rosetta@home, and Einstein@home.
  • Research community. The project is operated by a single research group, but serves a broader community in that science area. An example is Climateprediction.net, which is based at Oxford but collaborates with projects around the world.
  • Science Gateway. The project is operated by a science gateway, i.e. a web site that serves a particular scientific community, and that provides HTC as well as other functions. An example (in progress) is nanoHUB.
  • University-wide umbrella project. The project is operated by a university, and serves the researchers at that university. An example (no longer operating) is the University of Westminster in London. This idea is elaborated on here.
  • HPC provider. The project is operated by an HPC provider such as a supercomputing center. It processes the provider's HTC jobs (i.e. the jobs that don't actually need a supercomputer), and serves the provider's clients that have HTC workloads. An example (in progress) is Texas Advanced Computing Center (TACC).

There are several advantages in having BOINC projects that are high in the organizational hierarchy, and that serve many scientists:

  • The cost of maintaining a BOINC project is roughly constant, regardless of its size. For large projects, the cost per scientist is lower.
  • Publicity options: high-level organizational entities typically have existing publicity mechanisms (e.g. alumni magazines and newsletters) that can be leveraged to recruit volunteers.
  • Longevity: the duration of one scientist's need for HTC is generally shorter than that of a group of scientists. There are benefits in having a project last a long time (e.g. amortizing the startup cost).
  • Continuity: similarly, one scientist's computing workload may be sporadic, while that of a group of scientists is more continuous. Some volunteers prefer projects with continuous workloads.
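
The cost argument in the first point above is simple arithmetic, sketched here under stated assumptions. The annual cost figure and the function name `cost_per_scientist` are hypothetical; the only thing taken from the text is that the cost of maintaining a project is roughly constant regardless of its size.

```python
def cost_per_scientist(annual_cost, num_scientists):
    """Split a fixed annual project cost evenly across the scientists served."""
    if num_scientists < 1:
        raise ValueError("a project must serve at least one scientist")
    return annual_cost / num_scientists


# Hypothetical figure, chosen only for illustration.
ANNUAL_COST = 50_000

for n in (1, 5, 25):
    share = cost_per_scientist(ANNUAL_COST, n)
    print(f"{n:>2} scientists: {share:>9.2f} per scientist")
```

Under this assumption the per-scientist cost falls in inverse proportion to the number of scientists served, which is the case for placing a project higher in the organizational hierarchy.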

So if you're thinking about using BOINC, consider the possible scope of your project.