Example 3: Calibration jobs and adaptive replication
This example uses a more refined replication policy: instead of doing 2-fold replication, we estimate each volunteer's error rate using calibration tasks, and run enough instances of each job that the overall probability of error is below a threshold.
We'll track the error rate separately for positive and negative jobs (i.e., images with and without an ellipse). The opaque data structure for users has components:
nneg: # of negative calibration jobs completed
nneg_err: of those, the number of errors
npos: # of positive calibration jobs completed
npos_err: of those, the number of errors
From these we'll derive:
neg_err_rate: error rate for negative cases
pos_err_rate: error rate for positive cases
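The derivation of these rates from the counters can be sketched as follows. This is a minimal illustration, not code from bossa_example3.inc: the helper name derive_error_rates() is hypothetical, and using a rate of 1.0 when a volunteer has no calibration data of a given kind is an assumption made here so that such a volunteer never lowers the consensus product.

```php
<?php
// Hypothetical helper (not part of bossa_example3.inc): derives the two
// error rates from the calibration counters in a user's opaque data.
function derive_error_rates($u) {
    // Assumption: with no calibration jobs of a given kind, use 1.0 so
    // this volunteer's answers contribute no confidence to a consensus.
    $u->pos_err_rate = $u->npos ? $u->npos_err / $u->npos : 1.0;
    $u->neg_err_rate = $u->nneg ? $u->nneg_err / $u->nneg : 1.0;
    return $u;
}

// Example: 2 errors in 20 positive calibration jobs, no negative jobs yet.
$u = new stdClass;
$u->npos = 20; $u->npos_err = 2;
$u->nneg = 0;  $u->nneg_err = 0;
$u = derive_error_rates($u);
printf("pos %.2f, neg %.2f\n", $u->pos_err_rate, $u->neg_err_rate);
// prints "pos 0.10, neg 1.00"
```

Note that a volunteer with calibration jobs but no errors gets a rate of exactly 0, which drives any product to 0; a project might want to clamp rates to a small floor to avoid that.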
Our replication policy is:
- A job is marked Done if either
  - there are N positive instances that match within 20 pixels, and for which the product of pos_err_rate (for the corresponding users) is less than 1e-3, or
  - there are N negative instances and the product of neg_err_rate is less than 1e-3.
- Else a job is marked Inconclusive if there are 10 finished instances.
The job distribution policy is the same as in example 2.
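To make the threshold concrete, here is a small worked sketch of the arithmetic behind the replication policy. The names PROB_LIMIT_SKETCH, combined_error_prob() and is_conclusive() are hypothetical, and multiplying the rates treats volunteers' errors as independent, which is an assumption of this sketch.

```php
<?php
// Illustrative only: mirrors the 1e-3 threshold from the policy above.
const PROB_LIMIT_SKETCH = 1e-3;

// Probability that all agreeing volunteers are wrong, treating their
// errors as independent (an assumption of this sketch).
function combined_error_prob(array $err_rates) {
    return array_product($err_rates);
}

function is_conclusive(array $err_rates) {
    // an empty set of agreeing instances is never conclusive
    return count($err_rates) > 0
        && combined_error_prob($err_rates) < PROB_LIMIT_SKETCH;
}

// Two volunteers at 10% error: product is about 0.01, not enough.
var_dump(is_conclusive([0.1, 0.1]));        // bool(false)
// A third volunteer at 5% error: product is about 0.0005.
var_dump(is_conclusive([0.1, 0.1, 0.05]));  // bool(true)
```

So the number of instances needed depends on who answers: a few reliable volunteers can finish a job that would otherwise need many unreliable ones.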
Setup
Create an application named bossa_example3. Create some jobs:
php bossa_example_make_jobs.php --app_name bossa_example3 --dir example
We'll also need to create some calibration jobs:
php bossa_example_make_jobs.php --app_name bossa_example3 --dir example --calibration
This will create 10 calibration jobs based on the images in example/ (recall that these images have corresponding "answer" files).
Callback functions
This example's callback functions are in html/inc/bossa_example3.inc. The replication policy is implemented in job_finished():
function job_finished($job, $inst, $user) {
    // record the volunteer's response in the instance's opaque data
    // (use an object rather than null so properties can be assigned)
    //
    $response = new stdClass;
    if (get_str('submit', true)) {
        $response->have_ellipse = 0;
    } else {
        $response->have_ellipse = 1;
        $response->cx = get_int('pic_x');
        $response->cy = get_int('pic_y');
    }
    $inst->set_opaque_data($response);

    // if this is a calibration job, update user's opaque data
    //
    if ($job->calibration) {
        $b = $user->bossa;
        $info = $job->get_opaque_data();
        $answer = $info->answer;
        $u = $b->get_opaque_data();
        if (!$u) {
            // first calibration job for this user: start the counters
            $u = new stdClass;
            $u->npos = 0;
            $u->npos_err = 0;
            $u->nneg = 0;
            $u->nneg_err = 0;
        }
        if (compatible($response, $answer)) {
            if ($answer->have_ellipse) {
                $u->npos++;
            } else {
                $u->nneg++;
            }
        } else {
            if ($answer->have_ellipse) {
                $u->npos++;
                $u->npos_err++;
            } else {
                $u->nneg++;
                $u->nneg_err++;
            }
        }
        $b->set_opaque_data($u);
        return;
    }

    // now see if job is done
    //
    $insts = $job->get_finished_instances();
    $n = count($insts);

    $results = array();
    $users = array();
    foreach ($insts as $inst) {
        $results[] = $inst->get_opaque_data();
        $u = $inst->get_user();
        $users[] = $u->bossa->get_opaque_data();
    }

    // see if there's a negative consensus
    //
    $prob = 1;
    for ($i=0; $i<$n; $i++) {
        $r = $results[$i];
        if ($r->have_ellipse) continue;
        $u = $users[$i];
        $prob *= $u->neg_err_rate;
    }
    if ($prob < PROB_LIMIT) {
        $job->set_state(BOSSA_JOB_DONE);
        return;
    }

    // see if there's a positive consensus
    //
    for ($i=0; $i<$n; $i++) {
        $r1 = $results[$i];
        // only positive responses can anchor a positive consensus
        if (!$r1->have_ellipse) continue;
        $u = $users[$i];
        $prob = $u->pos_err_rate;
        for ($j=0; $j<$n; $j++) {
            if ($j == $i) continue;
            $r2 = $results[$j];
            if (compatible($r1, $r2)) {
                $u2 = $users[$j];
                $prob *= $u2->pos_err_rate;
            }
        }
        if ($prob < PROB_LIMIT) {
            $job->set_state(BOSSA_JOB_DONE);
            return;
        }
    }

    // see if there are too many instances without a consensus
    //
    if ($n >= 10) {
        $job->set_state(BOSSA_JOB_INCONCLUSIVE);
        return;
    }

    // still looking for consensus - get another instance
    //
    $job->set_priority(2);
}
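job_finished() relies on compatible() to decide whether two responses agree; its definition is not shown here. A minimal sketch of what such a test might look like, based on the "match within 20 pixels" rule in the replication policy, is given below. The name compatible_sketch(), the constant MATCH_RADIUS and the use of Euclidean distance between ellipse centers are assumptions; the real function in bossa_example3.inc may differ.

```php
<?php
// Hypothetical sketch of a compatibility test; the real compatible()
// in bossa_example3.inc may differ.
const MATCH_RADIUS = 20;   // pixels, from the replication policy

function compatible_sketch($r1, $r2) {
    // a positive and a negative response never agree
    if ((bool)$r1->have_ellipse !== (bool)$r2->have_ellipse) return false;
    // two negative responses agree trivially
    if (!$r1->have_ellipse) return true;
    // two positive responses agree if their centers are close enough
    // (assumption: Euclidean distance, boundary included)
    $dx = $r1->cx - $r2->cx;
    $dy = $r1->cy - $r2->cy;
    return $dx*$dx + $dy*$dy <= MATCH_RADIUS*MATCH_RADIUS;
}

// Example responses shaped like the ones stored by job_finished()
$a = (object)['have_ellipse' => 1, 'cx' => 100, 'cy' => 100];
$b = (object)['have_ellipse' => 1, 'cx' => 110, 'cy' => 110];
$c = (object)['have_ellipse' => 1, 'cx' => 130, 'cy' => 100];
$none = (object)['have_ellipse' => 0];

var_dump(compatible_sketch($a, $b));     // about 14 pixels apart: bool(true)
var_dump(compatible_sketch($a, $c));     // 30 pixels apart: bool(false)
var_dump(compatible_sketch($a, $none));  // bool(false)
```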
We also supply a callback function to show a user's opaque data on administrative web pages:
function user_summary($user) {
    $b = $user->bossa;
    $info = $b->get_opaque_data();
    if ($info) {
        // show "---" when there are no calibration jobs of that kind yet
        if ($info->npos) {
            $pos_err = $info->npos_err/$info->npos;
        } else {
            $pos_err = "---";
        }
        if ($info->nneg) {
            $neg_err = $info->nneg_err/$info->nneg;
        } else {
            $neg_err = "---";
        }
        return "error rate: positive $pos_err ($info->npos_err/$info->npos), negative $neg_err ($info->nneg_err/$info->nneg) ";
    } else {
        return "No data";
    }
}