Low-level validator framework
BOINC's simple validator framework is sufficient in almost all cases. If for some reason you need more control, you can use the low-level framework (on which the simple framework is based).
To make a validator program using the low-level framework, link validator.cpp with two application-specific functions:
int check_set( vector<RESULT>& results, DB_WORKUNIT& wu, int& canonicalid, double& credit, bool& retry );
- check_set() takes a set of results (all with outcome=SUCCESS). It reads and compares their output files. If there is a quorum of matching results, it selects one of them as the canonical result, returning its ID. In this case it also returns the credit to be granted for correct results for this workunit.
- If an output file for a result has a nonrecoverable error (e.g. the directory is there but the file isn't, or the file is present but has invalid contents), then check_set() must set (in memory, not database) the result's outcome to outcome=RESULT_OUTCOME_VALIDATE_ERROR and its validate_state to validate_state=VALIDATE_STATE_INVALID.
Use BOINC's back-end utility functions (in sched/validate_util.cpp) to get file pathnames and to distinguish recoverable and nonrecoverable file-open errors.
- If a canonical result is found, check_set() must set the validate_state field of each non-ERROR result (in memory, not database) to either validate_state=VALIDATE_STATE_VALID or validate_state=VALIDATE_STATE_INVALID.
- If a recoverable error occurs while reading output files (e.g. a directory wasn't visible due to NFS mount failure) then check_set() should return retry=true. This tells the validator to arrange for this WU to be processed again in a few hours.
- check_set() should return nonzero if a major error occurs. This tells the validator to write an error message and exit.
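The quorum logic described above can be sketched as follows. This is a minimal illustration, not BOINC's implementation: the RESULT and DB_WORKUNIT structs are hypothetical stand-ins, the constant values are placeholders, the `output` field assumes the output files have already been read successfully, and granting the canonical result's `claimed_credit` is an assumed credit policy.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Placeholder constants: names follow the text above, values are assumed.
enum { VALIDATE_STATE_INIT, VALIDATE_STATE_VALID, VALIDATE_STATE_INVALID };
enum { RESULT_OUTCOME_SUCCESS = 1 };

// Hypothetical stand-ins for BOINC's RESULT and DB_WORKUNIT.
struct RESULT {
    int id;
    int outcome;
    int validate_state;
    std::string output;     // assumed: output file contents, already read
    double claimed_credit;  // assumed field used for the credit policy
};
struct DB_WORKUNIT {
    int min_quorum;
};

int check_set(std::vector<RESULT>& results, DB_WORKUNIT& wu,
              int& canonicalid, double& credit, bool& retry) {
    (void)retry;  // no file I/O in this sketch, so retry is left as passed in
    for (std::size_t i = 0; i < results.size(); i++) {
        // Count the results whose output matches results[i] (itself included).
        int matches = 0;
        for (const RESULT& r : results) {
            if (r.output == results[i].output) matches++;
        }
        if (matches < wu.min_quorum) continue;

        // Quorum reached: results[i] becomes the canonical result.
        const std::string canonical_output = results[i].output;
        canonicalid = results[i].id;
        credit = results[i].claimed_credit;  // assumed credit policy
        for (RESULT& r : results) {
            r.validate_state = (r.output == canonical_output)
                ? VALIDATE_STATE_VALID : VALIDATE_STATE_INVALID;
        }
        return 0;
    }
    // No quorum: leave canonicalid, credit, and all results untouched.
    return 0;
}
```

A real check_set() would compare parsed output files rather than strings, and would typically allow for numerical tolerance when comparing floating-point output.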
int check_pair(RESULT& new_result, RESULT& canonical_result, bool& retry);
- check_pair() compares a new result to the canonical result. In the absence of errors, it sets the new result's validate_state to either VALIDATE_STATE_INVALID or VALIDATE_STATE_VALID.
- If it has a nonrecoverable error reading an output file of either result, or if the new result's output file is invalid, it must set the new result's outcome (in memory, not database) to RESULT_OUTCOME_VALIDATE_ERROR.
- If it has a recoverable error while reading an output file of either result, it returns retry=true, which causes the validator to arrange for the WU to be examined again in a few hours.
- check_pair() should return nonzero if a major error occurs. This tells the validator to write an error message and exit.
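The check_pair() rules above might look like this in code. It is a sketch under stated assumptions: RESULT is a hypothetical stand-in, the constant values are placeholders, and the `read_status` and `output` fields stand in for actually opening and parsing the output files. Checking for recoverable errors before nonrecoverable ones is a choice made here, not something the text prescribes.

```cpp
#include <string>

// Placeholder constants: names follow the text above, values are assumed.
enum { VALIDATE_STATE_INIT, VALIDATE_STATE_VALID, VALIDATE_STATE_INVALID };
enum { RESULT_OUTCOME_SUCCESS = 1, RESULT_OUTCOME_VALIDATE_ERROR = 2 };

// Hypothetical: the outcome of trying to read a result's output files.
enum ReadStatus { READ_OK, READ_RECOVERABLE, READ_NONRECOVERABLE };

// Hypothetical stand-in for BOINC's RESULT.
struct RESULT {
    int id;
    int outcome;
    int validate_state;
    ReadStatus read_status;  // assumed: set by an earlier file-reading step
    std::string output;      // assumed: file contents when read_status is READ_OK
};

int check_pair(RESULT& new_result, RESULT& canonical_result, bool& retry) {
    // Recoverable error on either side: ask the validator to try again later.
    if (new_result.read_status == READ_RECOVERABLE ||
        canonical_result.read_status == READ_RECOVERABLE) {
        retry = true;
        return 0;
    }
    // Nonrecoverable error on either side: mark the NEW result, in memory only.
    if (new_result.read_status == READ_NONRECOVERABLE ||
        canonical_result.read_status == READ_NONRECOVERABLE) {
        new_result.outcome = RESULT_OUTCOME_VALIDATE_ERROR;
        return 0;
    }
    // No errors: the new result is valid only if it matches the canonical one.
    new_result.validate_state = (new_result.output == canonical_result.output)
        ? VALIDATE_STATE_VALID : VALIDATE_STATE_INVALID;
    return 0;
}
```

Note that the sketch never writes to canonical_result, even when the canonical result's own file is the one that cannot be read.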
Neither function should delete files or access the BOINC database.
Examples of these two functions may be found in validate_util2.cpp, which implements the simple validator framework.
Pseudocode
int check_set(
    vector<RESULT>& results, DB_WORKUNIT& wu, int& canonicalid,
    double& credit, bool& retry
);

Define N := length of the result vector, and let M := N.

check_set() will ALWAYS be called with N >= wu.min_quorum.
check_set() will ALWAYS be called with ALL results satisfying
    result.outcome == RESULT_OUTCOME_SUCCESS
    result.validate_state == VALIDATE_STATE_INIT
check_set() should NEVER modify wu [although it is not declared const].

[1] Syntax pass (optional)

    for (all N results) {
        if (one or more of the result's output files can be read, and
            one or more of those files contains erroneous, invalid, or
            incorrect output, i.e. bad file syntax) {
            set result.outcome = RESULT_OUTCOME_VALIDATE_ERROR;
            set result.validate_state = VALIDATE_STATE_INVALID;
            decrement counter: M = M-1;
        } // erroneous, incorrect, or invalid output files
        else if (result has a potentially recoverable error, e.g.
            NFS directory not mounted, server unreachable,
            upload server unreachable) {
            do not modify result.validate_state;
            do not modify result.outcome;
            decrement counter: M = M-1;
            set retry = true;
        } // recoverable error
        else if (every output file of the result is unreadable
            or does not exist) {
            set result.outcome = RESULT_OUTCOME_VALIDATE_ERROR;
            set result.validate_state = VALIDATE_STATE_INIT;
            decrement counter: M = M-1;
        } // all result output files unreadable or nonexistent
    } // end of syntax pass loop over all N results

    Define REMAINING RESULTS to be those that do NOT fall into one of the
    three categories above. There are M of these. If the syntax pass has
    been skipped, then M == N.

    if (M < wu.min_quorum) {
        do not modify canonicalid;
        do not modify credit;
        leave retry as set above;
        leave result.outcome unchanged for the M remaining results;
        leave result.validate_state unchanged for the M remaining results;
        return 0;
    } // fewer than min_quorum results remain

    At any point in this process, if a major error occurs, check_set()
    should return nonzero. This will cause the validator to exit. If this
    happens, it does not matter how you have set or modified
    result.outcome, result.validate_state, retry, credit, or canonicalid.

// END OF OPTIONAL SYNTAX PASS

[2] Comparison pass (required)

    We have M >= wu.min_quorum REMAINING RESULTS with
        result.outcome == RESULT_OUTCOME_SUCCESS
        result.validate_state == VALIDATE_STATE_INIT
    All the output files of all of these results are readable. All of the
    output files for a given result are, when taken "in isolation",
    apparently valid. [If these conditions are not met, then you must do
    the syntax pass above.]

    if (one of these results is determined to be THE correct [canonical]
        result) {
        for (correct result) {
            set result.validate_state = VALIDATE_STATE_VALID;
            set canonicalid = result.id;
        } // canonical result
        for (the REMAINING M-1 results) {
            // NOTE: what is below can be done by calling
            // check_pair(result, canonical_result)
            if (result is correct, matches canonical) {
                set result.validate_state = VALIDATE_STATE_VALID;
            } else {
                set result.validate_state = VALIDATE_STATE_INVALID;
            }
        } // loop over remaining M-1 results
        set credit;
        leave retry as set from the syntax pass above;
        return 0;
    } // found canonical result
    else {
        // You are UNABLE to determine if one of the M REMAINING RESULTS
        // is correct, so:
        do not modify result.outcome for ANY of the M remaining results;
        do not modify result.validate_state for ANY of the M remaining results;
        do not set credit;
        do not set canonicalid;
        leave retry as set from the syntax pass;
        return 0;
    } // did not find canonical result

    At any point in this process, if a major error occurs, check_set()
    should return nonzero. This will cause the validator to exit. If this
    happens, it does not matter how you have set result.outcome,
    result.validate_state, retry, credit, or canonicalid for ANY of the
    results.

// end of Comparison pass
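The bookkeeping of the optional syntax pass (three error categories, the counter M, and the retry flag) can be sketched as a helper function. Everything here is illustrative: `syntax_pass`, `FileStatus`, and the `file_status` field are hypothetical names, the constant values are placeholders, and a real validator would determine each result's category by actually opening and parsing its output files.

```cpp
#include <vector>

// Placeholder constants: names follow the pseudocode, values are assumed.
enum { VALIDATE_STATE_INIT, VALIDATE_STATE_VALID, VALIDATE_STATE_INVALID };
enum { RESULT_OUTCOME_SUCCESS = 1, RESULT_OUTCOME_VALIDATE_ERROR = 2 };

// Hypothetical classification of a result's output files.
enum FileStatus {
    FILES_OK,                 // all files readable and apparently valid
    FILES_BAD_SYNTAX,         // a readable file has invalid contents
    FILES_RECOVERABLE_ERROR,  // e.g. NFS directory not mounted
    FILES_ALL_UNREADABLE      // every file unreadable or nonexistent
};

// Hypothetical stand-in for BOINC's RESULT.
struct RESULT {
    int id;
    int outcome;
    int validate_state;
    FileStatus file_status;  // assumed: filled in by a file-checking step
};

// Applies the syntax-pass rules in memory and returns M, the number of
// remaining results. Sets retry to true if any recoverable error was seen.
int syntax_pass(std::vector<RESULT>& results, bool& retry) {
    int m = static_cast<int>(results.size());
    for (RESULT& r : results) {
        switch (r.file_status) {
        case FILES_BAD_SYNTAX:
            r.outcome = RESULT_OUTCOME_VALIDATE_ERROR;
            r.validate_state = VALIDATE_STATE_INVALID;
            m--;
            break;
        case FILES_RECOVERABLE_ERROR:
            // outcome and validate_state are deliberately left unchanged
            retry = true;
            m--;
            break;
        case FILES_ALL_UNREADABLE:
            r.outcome = RESULT_OUTCOME_VALIDATE_ERROR;
            r.validate_state = VALIDATE_STATE_INIT;
            m--;
            break;
        case FILES_OK:
            break;  // stays among the remaining results
        }
    }
    return m;
}
```

A caller would compare the returned M against wu.min_quorum and return 0 without running the comparison pass when too few results remain.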