Example 3: Calibration jobs and adaptive replication
This example uses a more refined replication policy: instead of doing 2-fold replication, we estimate the error rate of volunteers using calibration jobs, and run enough instances of each job that the overall probability of error is below a threshold.
We'll track the error rate separately for positive and negative jobs (i.e., images with and without an ellipse). The opaque data structure for users has components:
nneg: # of negative calibration jobs completed
nneg_err: of those, the number of errors
npos: # of positive calibration jobs completed
npos_err: of those, the number of errors
From these we'll derive:
neg_err_rate: error rate for negative cases
pos_err_rate: error rate for positive cases
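The derived rates could be computed from the counters along the following lines. This is a minimal sketch, not code from bossa_example3.inc: the helper name derive_error_rates() and the fallback value $prior (used when a user has not yet completed any calibration jobs of that kind) are assumptions made for illustration.

```php
<?php
// Hypothetical helper: derive per-user error rates from the calibration counters.
// $prior is an assumed fallback for users with no calibration jobs of that kind.
function derive_error_rates($u, $prior = 0.5) {
    $u->pos_err_rate = $u->npos ? $u->npos_err / $u->npos : $prior;
    $u->neg_err_rate = $u->nneg ? $u->nneg_err / $u->nneg : $prior;
    return $u;
}

// Example: 8 positive calibration jobs with 2 errors, no negative ones yet.
$u = new stdClass;
$u->npos = 8; $u->npos_err = 2;
$u->nneg = 0; $u->nneg_err = 0;
derive_error_rates($u);
```

Note that a user with no recorded errors gets a rate of exactly 0 under this simple ratio, which drives any product containing it to 0. A project may prefer a smoothed estimate for that reason.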
Our replication policy is:
- A job is marked Done if either
  - there are N positive instances that match within 20 pixels, and for which the product of pos_err_rate (for the corresponding users) is less than 1e-3, or
  - there are N negative instances and the product of neg_err_rate is less than 1e-3.
- Else a job is marked Inconclusive if there are 10 finished instances.
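To see how the threshold behaves, the product test can be sketched in isolation. The names PROB_LIMIT_EXAMPLE and below_limit() are hypothetical and exist only for this illustration. Multiplying the rates treats the volunteers' errors as independent.

```php
<?php
// Threshold from the policy above (1e-3); the constant name is illustrative.
const PROB_LIMIT_EXAMPLE = 1e-3;

// Hypothetical helper: true if the combined error probability of the
// agreeing volunteers (product of their error rates) is below the limit.
function below_limit(array $err_rates) {
    return array_product($err_rates) < PROB_LIMIT_EXAMPLE;
}

// Two volunteers at 10% error each give 0.01: not yet enough.
$two = below_limit([0.1, 0.1]);
// A third agreeing volunteer at 5% brings the product to 0.0005: enough.
$three = below_limit([0.1, 0.1, 0.05]);
```

With no agreeing instances the product is 1, so an empty set never satisfies the policy.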
The job distribution policy is the same as in example 2.
Setup
Create an application named bossa_example3. Create some jobs:
php bossa_example_make_jobs.php --app_name bossa_example3 --dir example
We'll also need to create some calibration jobs:
php bossa_example_make_jobs.php --app_name bossa_example3 --dir example --calibration
This will create 10 calibration jobs based on the images in example/ (recall that these images have corresponding "answer" files).
Callback functions
This example's callback functions are in html/inc/bossa_example3.inc. The replication policy is implemented in job_finished():
function job_finished($job, $inst, $user) {
    // record this instance's response
    $response = new stdClass;
    if (get_str('submit', true)) {
        $response->have_ellipse = 0;
    } else {
        $response->have_ellipse = 1;
        $response->cx = get_int('pic_x');
        $response->cy = get_int('pic_y');
    }
    $inst->set_opaque_data($response);

    // if this is a calibration job, update user's opaque data
    //
    if ($job->calibration) {
        $b = $user->bossa;
        $info = $job->get_opaque_data();
        $answer = $info->answer;
        $u = $b->get_opaque_data();
        if (!$u) {
            // first calibration job for this user: start the counters at zero
            $u = new stdClass;
            $u->npos = 0;
            $u->npos_err = 0;
            $u->nneg = 0;
            $u->nneg_err = 0;
        }
        if (compatible($response, $answer)) {
            if ($answer->have_ellipse) {
                $u->npos++;
            } else {
                $u->nneg++;
            }
        } else {
            if ($answer->have_ellipse) {
                $u->npos++;
                $u->npos_err++;
            } else {
                $u->nneg++;
                $u->nneg_err++;
            }
        }
        $b->set_opaque_data($u);
        return;
    }

    // now see if job is done
    //
    $insts = $job->get_finished_instances();
    $n = count($insts);

    $results = array();
    $users = array();
    foreach ($insts as $inst) {
        $results[] = $inst->get_opaque_data();
        $u = $inst->get_user();
        $users[] = $u->bossa->get_opaque_data();
    }

    // see if there's a negative consensus
    //
    $prob = 1;
    for ($i=0; $i<$n; $i++) {
        $r = $results[$i];
        if ($r->have_ellipse) continue;
        $u = $users[$i];
        $prob *= $u->neg_err_rate;
    }
    if ($prob < PROB_LIMIT) {
        $job->set_state(BOSSA_JOB_DONE);
        return;
    }

    // see if there's a positive consensus
    //
    for ($i=0; $i<$n; $i++) {
        $r1 = $results[$i];
        // only positive results can anchor a positive consensus
        if (!$r1->have_ellipse) continue;
        $u = $users[$i];
        $prob = $u->pos_err_rate;
        for ($j=0; $j<$n; $j++) {
            if ($j == $i) continue;
            $r2 = $results[$j];
            if (compatible($r1, $r2)) {
                $u2 = $users[$j];
                $prob *= $u2->pos_err_rate;
            }
        }
        if ($prob < PROB_LIMIT) {
            $job->set_state(BOSSA_JOB_DONE);
            return;
        }
    }

    // see if there are too many instances without a consensus
    //
    if ($n >= 10) {
        $job->set_state(BOSSA_JOB_INCONCLUSIVE);
        return;
    }

    // still looking for consensus - get another instance
    //
    $job->set_priority(2);
}
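job_finished() relies on a compatible() function to decide whether two responses agree. Its definition is not shown here; a minimal sketch of what such a test might look like follows. The name compatible_sketch(), the constant MATCH_TOLERANCE, and the choice of Euclidean distance for "within 20 pixels" are all assumptions rather than the actual implementation.

```php
<?php
// Assumed matching tolerance in pixels, taken from the policy description.
const MATCH_TOLERANCE = 20;

// Hypothetical stand-in for compatible(): two responses agree if both are
// negative, or both are positive with centers at most MATCH_TOLERANCE apart.
function compatible_sketch($r1, $r2) {
    if ((bool)$r1->have_ellipse !== (bool)$r2->have_ellipse) return false;
    if (!$r1->have_ellipse) return true;   // both negative
    $dx = $r1->cx - $r2->cx;
    $dy = $r1->cy - $r2->cy;
    // compare squared distances to avoid a square root
    return $dx*$dx + $dy*$dy <= MATCH_TOLERANCE * MATCH_TOLERANCE;
}

// Sample responses in the same shape job_finished() stores them.
$a   = (object)['have_ellipse' => 1, 'cx' => 100, 'cy' => 100];
$b   = (object)['have_ellipse' => 1, 'cx' => 110, 'cy' => 110];
$c   = (object)['have_ellipse' => 1, 'cx' => 130, 'cy' => 100];
$neg = (object)['have_ellipse' => 0];
```

Here $a and $b would count as a match (about 14 pixels apart), while $a and $c (30 pixels apart) and $a and $neg would not.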
We also supply a callback function to show a user's opaque data on administrative web pages:
function user_summary($user) {
    $b = $user->bossa;
    $info = $b->get_opaque_data();
    if ($info) {
        // show "---" when no calibration jobs of that kind have been done
        if ($info->npos) {
            $pos_err = $info->npos_err/$info->npos;
        } else {
            $pos_err = "---";
        }
        if ($info->nneg) {
            $neg_err = $info->nneg_err/$info->nneg;
        } else {
            $neg_err = "---";
        }
        return "error rate: positive $pos_err ($info->npos_err/$info->npos),
            negative $neg_err ($info->nneg_err/$info->nneg)
        ";
    } else {
        return "No data";
    }
}
