Version 11 (modified by Nicolas, 17 years ago)

Berkeley Open System for Skill Aggregation (Bossa)

Bossa is a software infrastructure for creating projects that use the skills or knowledge of large numbers of volunteers, via the Internet, to accomplish a set of tasks. This approach has been variously termed skill aggregation, distributed thinking, and human computing.

Bossa is designed to accommodate a wide range of tasks. In particular:

  • Tasks may be short (performed online via a single web page) or may take several weeks and involve running separate programs.
  • Tasks may be performed by a single user or by a group of cooperating users.
  • Tasks may be automatically validated, or may require validation by comparing redundant instances.

Bossa consists of a MySQL database schema and a set of PHP pages. Projects install these components on their server, and add their own PHP scripts to generate, show, and handle tasks. Bossa uses BOINC Basics for grouping and communication, and can use BOLT for volunteer training.

Bossa is under development. See the reference manual for the current implementation.

Abstractions

A Bossa project has one or more skill apps. A skill app has a dynamic set of skill tasks. Each skill task has an associated set of arguments describing its parameters or input files, and a set of task instances. Each task instance represents a copy of the task, either in progress or completed, and is assigned either to a user or to a team.
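The hierarchy described above can be pictured as a small data model. The following is only an illustrative sketch in Python: Bossa itself is built from a MySQL schema and PHP pages, and every class and field name here (`Project`, `SkillApp`, `SkillTask`, `TaskInstance`, `args`, and so on) is hypothetical rather than taken from the Bossa schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TaskInstance:
    """One copy of a task, assigned to exactly one user or one team."""
    user_id: Optional[int] = None
    team_id: Optional[int] = None
    completed: bool = False

    def __post_init__(self) -> None:
        # An instance goes to a user or to a team, never both or neither.
        if (self.user_id is None) == (self.team_id is None):
            raise ValueError("assign the instance to exactly one of user or team")


@dataclass
class SkillTask:
    """A unit of work; args holds its parameters or input file names."""
    args: dict
    instances: list = field(default_factory=list)


@dataclass
class SkillApp:
    """A kind of work with a dynamic set of tasks."""
    name: str
    tasks: list = field(default_factory=list)


@dataclass
class Project:
    apps: list = field(default_factory=list)


# Hypothetical example: one task with two redundant instances.
task = SkillTask(args={"image": "slide_017.png"})
task.instances.append(TaskInstance(user_id=42))
task.instances.append(TaskInstance(team_id=7))
project = Project(apps=[SkillApp(name="classify_cells", tasks=[task])])
```

In a relational schema the same structure would presumably be expressed as one table per level, each row carrying the ID of its parent.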

Skill apps are classified as:

  • Online: the task is performed by a single user, sequentially, in a web browser.
  • Offline: the task is not performed in a single browser session, e.g. because it may be handled by a group of users or requires other asynchronous activity.

An app has an associated URL identifying a script that takes a task ID argument and displays the task instance. The task may consist either of a single web page or of a sequence of web pages. In either case the last page, when done, should call Bossa API functions to record the completion of the task, and may then display another task.

Skill apps are either:

  • Individually validated: the app has a server-side program that examines a completed instance and decides if it's valid.
  • Group validated: the app has a server-side program that examines a group of instances, sees if there's a consensus, and if so constructs a canonical result and marks the instances as valid or invalid.
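As a rough illustration of group validation, the sketch below looks for a consensus among the answers of redundant instances. It is a minimal example under stated assumptions, not the Bossa validator: the function name `group_validate`, the `quorum` parameter, and the rule "the most common answer wins if it reaches the quorum and is not tied" are all hypothetical choices. A real app would supply its own comparison logic.

```python
from collections import Counter
from typing import Hashable, Optional


def group_validate(
    results: dict[int, Hashable], quorum: int
) -> tuple[Optional[Hashable], dict[int, bool]]:
    """Check a group of instance results for consensus.

    results maps an instance ID to that instance's answer.
    Returns (canonical_answer, validity_by_instance). When there is
    no consensus yet, returns (None, {}) and marks nothing.
    """
    if not results:
        return None, {}
    top = Counter(results.values()).most_common(2)
    answer, votes = top[0]
    # A tie for first place is treated as "no consensus" (assumption).
    tied = len(top) > 1 and top[1][1] == votes
    if votes < quorum or tied:
        return None, {}
    # Instances that agree with the canonical answer are valid.
    validity = {iid: (res == answer) for iid, res in results.items()}
    return answer, validity


canonical, validity = group_validate({1: "cat", 2: "cat", 3: "dog"}, quorum=2)
```

When no consensus is found, a project would typically create further instances of the task and validate again once they complete.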

A project can configure:

  • A maximum number of outstanding offline tasks per user or group
  • A maximum number of tasks per day issued per user or group
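The two limits above could be enforced with a check along the following lines. This is a sketch only: the `Quota` class, its field names, and the `may_issue` function are hypothetical, and the document does not specify how Bossa stores or applies these settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    """Per-user or per-group limits configured by the project."""
    max_outstanding_offline: int
    max_tasks_per_day: int


def may_issue(quota: Quota, outstanding_offline: int,
              issued_today: int, is_offline: bool) -> bool:
    """Return True if one more task may be issued to this user or group."""
    # The daily limit applies to every task, online or offline.
    if issued_today >= quota.max_tasks_per_day:
        return False
    # The outstanding limit applies only to offline tasks.
    if is_offline and outstanding_offline >= quota.max_outstanding_offline:
        return False
    return True


quota = Quota(max_outstanding_offline=2, max_tasks_per_day=10)
```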

Volunteer characteristics

For each skill app and each user, Bossa maintains a skill estimate: an estimate of the user's skill at that app's tasks. It is stored in the user's project-specific XML document. Normally it is a single number in [0..1], and it is initially zero.

The skill estimate can be computed from any of several sources:

  • The results of the user's interaction with a BOLT course associated with the skill app.
  • The user's performance on "calibration tasks" mixed into the stream.
  • The fraction of the user's results classified as invalid by redundancy.
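As one possible reading of the last option, the sketch below turns counts of valid and invalid results into an estimate in [0..1]. The function name and the specific formula (the fraction of validated results that were valid, i.e. one minus the invalid fraction) are assumptions made for illustration. The document does not say how Bossa maps these inputs to a number.

```python
def skill_from_validation(valid: int, invalid: int) -> float:
    """Estimate skill as the fraction of validated results that were valid.

    With no validated results yet, the estimate stays at its initial
    value of zero.
    """
    if valid < 0 or invalid < 0:
        raise ValueError("counts must be non-negative")
    total = valid + invalid
    if total == 0:
        return 0.0
    return valid / total
```

The same shape would work for calibration tasks, with `valid` counting the calibration tasks the user answered correctly.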

Skill estimates are used for two purposes:

  • To decide whether to give tasks to a user;
  • To decide how many redundant instances of a given task are needed.
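Both uses can be sketched as simple functions of the estimate. The policy shown here is purely hypothetical: a fixed threshold for issuing tasks, and a linear rule under which more skilled users need fewer redundant copies. The names `qualifies`, `redundancy_for`, `min_copies`, and `max_copies` are invented for this example. A project would choose its own policy.

```python
def qualifies(skill: float, min_skill: float) -> bool:
    """Decide whether a user may be given tasks for an app."""
    return skill >= min_skill


def redundancy_for(skill: float, min_copies: int = 1, max_copies: int = 5) -> int:
    """Pick a number of redundant instances from a skill estimate.

    Linear rule (assumption): skill 1.0 needs min_copies,
    skill 0.0 needs max_copies.
    """
    if not 0.0 <= skill <= 1.0:
        raise ValueError("skill must be in [0, 1]")
    if min_copies > max_copies:
        raise ValueError("min_copies must not exceed max_copies")
    return max_copies - round(skill * (max_copies - min_copies))
```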

Implementation

To get work, a user goes to a particular Bossa-supplied page. There the user sees a list of the skill apps for which tasks are available and for which the user is qualified, along with links to courses for the other apps. Online and offline apps are listed separately. Each app is listed with an estimate of the time or other resources required to complete a task.

Selecting an online app invokes the Bossa scheduler script, which selects a task instance suitable for the user, and redirects to its instance URL.

Selecting an offline app invokes the Bossa scheduler, which selects a task and redirects to its instance-start URL.

Team administrators are provided with an interface for getting offline tasks for the team. The scheduler allows a team to get instances only for apps for which some team member has the required skill.
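The team rule above ("some team member has the required skill") can be sketched as a filter over apps. The data layout and names here are hypothetical: `member_skills` maps a user ID to that user's per-app skill estimates, and `required` maps an app name to a minimum estimate. Neither reflects the actual Bossa schema.

```python
def apps_available_to_team(
    member_skills: dict[int, dict[str, float]],
    required: dict[str, float],
) -> list[str]:
    """Return the apps for which at least one team member qualifies."""
    available = []
    for app, min_skill in required.items():
        # A missing estimate counts as the initial value, zero.
        if any(skills.get(app, 0.0) >= min_skill
               for skills in member_skills.values()):
            available.append(app)
    return sorted(available)


team = {
    1: {"transcribe": 0.9},
    2: {"classify": 0.2},
}
required = {"transcribe": 0.5, "classify": 0.5, "annotate": 0.3}
eligible = apps_available_to_team(team, required)
```

In this example only one member meets a threshold, so the team can get instances for that member's app alone.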

Users and teams are provided with an interface for seeing a list of their pending offline tasks. They can mark a pending task as completed; this takes them to the instance-complete URL for that task.

Integration with BOINC

Some offline tasks may involve computation done through BOINC. For example, if the task is assigned to a team, the computation is queued in the project's BOINC server and dispatched to members of the team. If the task is assigned to a user with many computers, those computers are used instead.

Such projects should provide a web interface for submitting such jobs. TODO: describe the API by which this script creates the WU. Give an example.