Changes between Version 3 and Version 4 of LocalityNew
- Timestamp: Aug 14, 2012, 5:17:04 PM
LocalityNew
--- LocalityNew (v3)
+++ LocalityNew (v4)
@@ -60,32 +60,35 @@
 {{{
 batch
     // this table already exists; we may need to add fields to it
 
 batch_host // batch/host association table
-
-
-
+    host_id integer
+    batch_id integer
+    cursor_id integer
 
 locality_cursor
-    batch_id integer
-    expavg_credit double
-        // sum of expavg_credit of hosts in the team
-    first_job_num integer
-    last_job_num integer
-    first_unfinished_job_num integer
-        // all jobs before this have been completed
-    first_ungenerated_job_num integer
-        // we've generated workunit records for all jobs before this
-    index on (batch_id, expavg_credit)
+    batch_id integer
+    first_job_num integer
+    last_job_num integer
+        // range of jobs to be done
+    expavg_credit double
+        // sum of expavg_credit of hosts in the team
+    remaining_credit double
+        // estimated credit of unfinished jobs
+        // zero means all jobs finished
+    first_unfinished_job_num integer
+        // all jobs before this have been completed
+    first_ungenerated_job_num integer
+        // we've generated workunit records for all jobs before this
 
 workunit (new fields)
-
-
+    cursor_id integer
+    job_num integer
 
 }}}
 
-=== Initialization===
+=== Creating a batch ===
 
-To initialize a batch:
+To create a batch:
 
 * create batch record
@@ -97,16 +100,24 @@
 ==== Assign host to cursors ====
 
+Define
+{{{
+est_time_left(cursor) = cursor.remaining_credit/cursor.expavg_credit
+}}}
+This is an estimate of the time needed to complete the cursor's jobs,
+given its current team.
+
 For each batch:
 
 If this is a new host (i.e. no batch_host record) then
-* assign host to cursor for this batch with least expavg_credit
+* assign host to cursor for this batch with greatest est_time_left().
 * create batch_host record
 * add host's expavg_credit to cursor's expavg_credit
 
 Otherwise, consider moving this host to a different cursor.
-Let C = host's cursor.
-If C.expavg_credit > 2*lowest expavg among cursors,
-then move this host to that cursor.
-(This policy may need to be refined a bit).
+Let C = host's cursor,
+and let D = the cursor for which est_time_left() is greatest.
+If est_time_left(C) < .5*est_time_left(D),
+then move this host to D.
+(This policy may need to be refined a bit to reduce moving hosts between cursors).
 
 ==== Assigning jobs ====
@@ -127,6 +138,6 @@
 ==== Deleting files ====
 
-If the host has a file that's used only by finished jobs,
-tell client to delete it.
+If the host has a sticky file that's not used by an unfinished job in its cursor,
+tell client to delete that file.
 
 Note: names of sticky files should encode the batch and file number.
@@ -143,6 +154,7 @@
 
 {{{
-if workunit.job_num == cursor.first_unfinished_job_num
-    cursor.first_unfinished_job_num++
+while cursor.first_unfinished_job_num is finished
+    cursor.first_unfinished_job_num++
+update cursor.remaining_credit
 }}}
 
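The v4 host-assignment policy can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not the scheduler's actual code: the `Cursor` class and `choose_cursor` function are hypothetical names, the handling of a cursor with no hosts (treated as infinite time left) is an assumption, and so is subtracting the host's credit from its old cursor when it moves, since the page only says "move this host to D". Creating the batch_host record is not modelled.

```python
import math
from dataclasses import dataclass


@dataclass
class Cursor:
    # Field names follow the locality_cursor table in the v4 text.
    id: int
    expavg_credit: float = 0.0     # sum of expavg_credit of hosts in the team
    remaining_credit: float = 0.0  # estimated credit of unfinished jobs


def est_time_left(cursor):
    """cursor.remaining_credit / cursor.expavg_credit, as defined in v4."""
    if cursor.remaining_credit == 0:
        return 0.0  # zero means all jobs finished
    if cursor.expavg_credit == 0:
        # Assumption: a cursor with no hosts is treated as never finishing,
        # so it is the first to receive a new host.
        return math.inf
    return cursor.remaining_credit / cursor.expavg_credit


def choose_cursor(cursors, host_credit, current=None):
    """Return the cursor the host should belong to, updating team credit.

    current is None for a new host (no batch_host record yet).
    """
    target = max(cursors, key=est_time_left)  # D in the text
    if current is None:
        target.expavg_credit += host_credit
        return target
    if current is not target and est_time_left(current) < 0.5 * est_time_left(target):
        # Assumption: moving a host also moves its credit between teams.
        current.expavg_credit -= host_credit
        target.expavg_credit += host_credit
        return target
    return current
```

Note that a move is triggered only when the host's current cursor has less than half the estimated time left of the slowest cursor, so small imbalances leave hosts where they are.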