Version 22 (modified by davea, 10 years ago)

--

Validation

A validator is a program that decides whether completed jobs are "valid". You must specify a validator for each application in your project, and include it in the <daemons> section of your project configuration file.

Validating a completed job has two parts:

  • Syntax check: verify that its output files are present on the server and have the correct format.
  • Replication check: if the job is replicated, compare its replicas. If a strict majority of the replicas are found to be "equivalent", those replicas are considered valid and the rest are marked as invalid.
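
The replication check can be sketched as follows. This is a minimal illustration, not BOINC's actual implementation: the function name find_valid_replicas and the equivalent callback are hypothetical, and the sketch assumes that equivalence is symmetric and transitive.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def find_valid_replicas(
    outputs: Sequence[T],
    equivalent: Callable[[T, T], bool],
) -> List[bool]:
    """Return one flag per replica: True if it is in a strict-majority group.

    Hypothetical helper; assumes `equivalent` is symmetric and transitive.
    """
    n = len(outputs)
    for i in range(n):
        # Collect every replica equivalent to replica i (including itself).
        group = [j for j in range(n) if equivalent(outputs[i], outputs[j])]
        if len(group) * 2 > n:  # strict majority: more than half of all replicas
            members = set(group)
            return [j in members for j in range(n)]
    # No strict majority was found, so no replica is marked valid.
    return [False] * n
```

With three replicas whose outputs are "a", "b", "a" and exact equality as the equivalence test, the first and third replicas are marked valid and the second invalid. With only two disagreeing replicas there is no strict majority, so neither is marked valid.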

Validators grant credit for valid results. By default, this uses the mechanism described here. There are several alternative ways of granting credit.

BOINC includes several "standard" validators:

sample_trivial_validator
Marks a job as valid if its output files are present. Use this if all hosts are trusted.
sample_substr_validator
Marks a job as valid if its stderr output includes a string specified by the --stderr_string command-line arg. If the --reject_if_present arg is present, the logic is inverted: a job is valid if its stderr does NOT include the string.
sample_bitwise_validator
Output files are equivalent if they agree byte for byte. This can be used if your application generates exactly matching results (either because it does no floating-point arithmetic, or because you use homogeneous redundancy). If output files are gzip archives, use the --is_gzip command-line arg. This will skip the gzip header when comparing files.
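
The decision rules of the substring and bitwise validators can be illustrated with a short sketch. The function names below are hypothetical and the code works on in-memory strings and bytes; the real validators read result files on the server, and the gzip header handling of --is_gzip is not modelled here.

```python
def stderr_is_valid(stderr_text: str, needle: str, reject_if_present: bool = False) -> bool:
    """Mimic the substring rule: valid if stderr contains `needle`.

    With reject_if_present=True the logic is inverted, as with --reject_if_present.
    """
    found = needle in stderr_text
    return (not found) if reject_if_present else found


def bitwise_equivalent(a: bytes, b: bytes) -> bool:
    """Mimic the bitwise rule: outputs are equivalent only if every byte matches."""
    return a == b
```

For example, stderr_is_valid("run finished OK", "finished") is true, while the same call with reject_if_present=True is false.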

For a given application, you may want to define your own syntax check for output files, or your own notion of "equivalence" (for example, regarding floating-point numbers as equivalent if they agree within some tolerance). In such cases, you can develop a custom validator for the application.
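
As an illustration of what such custom logic might look like, the sketch below pairs a syntax check with a tolerance-based equivalence test. It assumes a hypothetical output format (whitespace-separated floating-point numbers) and hypothetical function names; it is not the interface that a custom BOINC validator must implement.

```python
import math
from typing import List, Optional, Sequence


def parse_output(text: str) -> Optional[List[float]]:
    """Syntax check: return the parsed numbers, or None if the output is malformed."""
    tokens = text.split()
    if not tokens:
        return None  # an empty output file is treated as invalid
    try:
        return [float(tok) for tok in tokens]
    except ValueError:
        return None


def outputs_equivalent(
    a: Sequence[float],
    b: Sequence[float],
    rel_tol: float = 1e-6,
    abs_tol: float = 1e-12,
) -> bool:
    """Equivalence check: same length and every pair of values within tolerance."""
    if len(a) != len(b):
        return False
    return all(math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol) for x, y in zip(a, b))
```

The tolerances are placeholders; appropriate values depend on the numerical behavior of the application.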

Command-line arguments

A validator (either standard or custom) has the following command-line arguments:

--app appname
Name of the application
[ -d N ]
Sets the log verbosity level. 1=low ... 4=high.
[ --sleep_interval n ]
Sleep n seconds if the last pass found nothing to do
[ --one_pass_N_WU N ]
Validate at most N workunits (WUs), then exit
[ --one_pass ]
Make one pass through the WU table, then exit
[ --dry_run ]
Don't update the database; just write logs (for debugging)
[ --mod n i ]
Process only WUs with (id mod n) == i. This option lets you run multiple instances of the validator for increased performance.
[ --max_wu_id n ]
Process only WUs with id <= n
[ --min_wu_id n ]
Process only WUs with id >= n
[ --max_granted_credit X ]
Grant no more than this amount of credit to a job.
[ --update_credited_job ]
For each valid result, create an entry in the credited_job database table. This lets you keep track of which user contributed to each WU, even if you use db_purge.
[ --wu_id N ]
Process the WU with the given ID (for debugging).
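
To show how --mod, --min_wu_id and --max_wu_id partition the work, here is a small sketch. The helper names and the application name myapp are hypothetical; the sketch only models the selection rule described above and builds example command lines.

```python
from typing import List, Optional


def instance_handles_wu(
    wu_id: int,
    n: int,
    i: int,
    min_wu_id: Optional[int] = None,
    max_wu_id: Optional[int] = None,
) -> bool:
    """True if an instance run with '--mod n i' (and optional ID bounds) processes wu_id."""
    if min_wu_id is not None and wu_id < min_wu_id:
        return False
    if max_wu_id is not None and wu_id > max_wu_id:
        return False
    return wu_id % n == i


def validator_commands(validator: str, app: str, n: int) -> List[str]:
    """Build one command line per instance so that the n instances cover all WUs."""
    return [f"{validator} --app {app} --mod {n} {i}" for i in range(n)]
```

For example, validator_commands("sample_bitwise_validator", "myapp", 2) yields two command lines ending in "--mod 2 0" and "--mod 2 1". Each WU ID satisfies the condition for exactly one of them, so no WU is validated twice.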

The following options relate to credit. In general, don't use them: with none of them specified, validators grant credit using the default mechanism, which is designed to be fair.

[ --no_credit ]
Don't grant credit (use this if you grant credit via trickle messages).
[ --credit_from_runtime ]
Grant credit proportional to (runtime * CPU FLOPS). Use this if:
  • the app has only single-threaded CPU versions, and
  • the app's jobs do different amounts of computation on different hosts, e.g. if they exit after a fixed amount of time.
[ --credit_from_wu ]
Grant the amount of credit specified in the WU template. See CreditAlt and JobSubmission.
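
The arithmetic behind --credit_from_runtime and --max_granted_credit can be sketched as follows. The proportionality constant is an assumption: it is chosen here so that one day of runtime on a 1 GFLOPS host earns 200 credits, and the constant used by the actual validator may differ. The function name is hypothetical.

```python
from typing import Optional

# ASSUMPTION: 200 credits per day of runtime on a host rated at 1e9 FLOPS.
ASSUMED_CREDIT_PER_FLOP = 200.0 / (86400.0 * 1e9)


def credit_from_runtime(
    runtime_secs: float,
    cpu_flops: float,
    scale: float = ASSUMED_CREDIT_PER_FLOP,
    max_granted_credit: Optional[float] = None,
) -> float:
    """Credit proportional to (runtime * CPU FLOPS), optionally capped."""
    credit = runtime_secs * cpu_flops * scale
    if max_granted_credit is not None:
        # Mirrors --max_granted_credit: never grant more than the cap.
        credit = min(credit, max_granted_credit)
    return credit
```

Because credit depends only on runtime and the host's CPU speed, this scheme suits jobs that run for a fixed time rather than doing a fixed amount of work.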