Changes between Version 12 and Version 13 of ClientSim
- Timestamp: Sep 23, 2010, 2:07:12 PM
Legend:
- Unmodified
- Added
- Removed
- Modified
ClientSim
--- ClientSim (v12)
+++ ClientSim (v13)
  = BOINC client simulator =

- The BOINC client simulator simulates a single BOINC client interacting with one or more projects.
- The simulator models the CPU scheduling and work-fetch policies of the BOINC client very closely;
- in fact it uses the same source code as the core client for these policies.
- The simulator implements several different scheduling policies: the ones currently in use,
- the ones used in the old (version 4) client,
- and several experimental policies that are under development.
+ The BOINC client simulator simulates a single BOINC client
+ interacting with one or more projects.
+ The simulator uses the same source code as the client for
+ the CPU scheduling and work-fetch policies,
+ so it models the BOINC client accurately.

  The intended uses of the simulator include:

-  * Identifying scenarios (combinations of host and project characteristics) where the current scheduling policies don't behave well.
+  * Identifying scenarios (combinations of host and project characteristics)
+    where the current scheduling policies don't behave well.
   * Studying experimental policies.

- However, the simulator is not necessarily perfect - in some cases its results may differ significantly from what the actual client would do.
+ However, the simulator is not necessarily perfect -
+ in some cases its results may differ significantly from what the
+ actual client would do.
  Or its inputs may be inadequate to describe a real-life scenario.
- If you find such cases, please send email to David Anderson or [/email_lists.php boinc_dev].
+ If you find such cases, please send email to David Anderson
+ or [/email_lists.php boinc_dev].

  You can use the simulator in either of two ways:
-  * Through a [/sim_form.php web interface]. This lets you do one simulation at a time, and shows you results graphically.
-  * Compile it yourself. This provides a more flexible, but less convenient, interface.
+  * Through a [/sim_form.php web interface].
+    This lets you do one simulation at a time, and shows you results graphically.
+  * Compile it yourself and run from a command line.
+    This provides a more flexible interface.

  == Input files ==

- The input consists of four files.
-
- === sim_projects.xml === #input_sim_projects
+ The input consists of the following files:
+
+ === client_state.xml === #input_client_state

  This describes a set of attached projects.
+ The format is an extension of the state file generated by the client;
+ you can use the state file of a running client as
+ an input to the simulator.
+
+ The fields used by the simulator are as follows
+ (fields marked with * are not generated by the client).

  {{{
- <projects>
-     <project>
-         <project_name>P1</project_name>
-         <resource_share>100</resource_share>
-         <app>
-             <latency_bound>1000</latency_bound>
-             <fpops_est>1e9</fpops_est>
-             <fpops>
-                 <mean>1e9</mean>
-                 <stdev>1e5</stdev>
-             </fpops>
-             <working_set>1e8</working_set>
-             [<avg_ncpus>x</avg_ncpus>]
-             [<natis>x</natis]
-             [<ncudas>x</ncudas>]
-             [<gpu_ram>x</gpu_ram>]
-             [<flops_est>x</flops_est>]
-         </app>
-         ... other apps
-         <available>
-             <frac>.7</frac>
-             <lambda>1000</lambda>
-         </available>
-         [<max_infeasible_count>N</max_infeasible_count>]
-     </project>
-     ... other projects
- </projects>
+ host_info
+     p_ncpus
+     p_fpops
+     m_nbytes
+ time_stats
+     on_frac
+     connected_frac
+     active_frac
+     gpu_active_frac
+ *available
+     frac
+     lambda
+ *idle
+     frac
+     lambda
+ project
+     project_name
+     resource_share
+     *available
+         frac
+         lambda
+ app
+     name
+     *latency_bound
+     *fpops_est
+     *fpops_actual
+         mean
+         stddev
+     *weight
+ app_version
+     app_name
+     avg_ncpus
+     flops
+     plan_class
+     coproc
+         type
+         count
+     gpu_ram
+     *working_set
+ workunit
+     app_name
+     rsc_fpops_est
+     rsc_fpops_bound
+ result
+     name
+     report_deadline
+     received_time
+ active_task
+     result_name
+     working_set_size
  }}}

- A project has one or more applications.
- Each application has a given latency bound and working-set size.
- The number of FP ops is a truncated normal distribution with the given mean and standard deviation.
-
- The availability of the projects (i.e. the periods when scheduler RPCs succeed) is modeled with two parameters:
- the duration of available periods are exponentially distributed with the given mean,
- and the unavailable periods are exponentially distributed achieving the given available fraction.
-
- max_infeasible_count specifies how many jobs that are infeasible (due to deadline/workload)
- to tolerate over before giving up in work-send loop. Default is 0.
-
- === sim_host.xml === #input_sim_host
-
- This describes the host hardware and availability.
+ Notes:
+ Each application has a fixed latency bound.
+ It can be specified in app.latency_bound.
+ If not, and there is a result for that app,
+ it is computed as report_deadline - received time for one such result.
+ If there is no result, it is 1 week.
+
+ An application has a fixed FLOP count estimate.
+ It can be specified as app.fpops_est.
+ If not, and there is a WU for that app, it is wu.rsc_fpops_est.
+ Otherwise it is 3600*1e9 (i.e., 1 GFLOPS/hr).
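The fallback rules in the v13 notes for an application's latency bound and FLOP count estimate can be sketched as below. This is only an illustration, not the simulator's code: the function names and the use of plain dicts for parsed `app`, `result` and `workunit` records are assumptions, and the caller is assumed to have already selected the results and workunits that belong to the app. The default of 3600*1e9 FLOPs corresponds to one hour of computing at 1 GFLOPS.

```python
WEEK_SECONDS = 7 * 24 * 3600    # default latency bound: 1 week
DEFAULT_FPOPS_EST = 3600 * 1e9  # one hour of computing at 1 GFLOPS


def latency_bound(app, app_results):
    """Latency bound in seconds for one app.

    `app` and the entries of `app_results` are hypothetical dicts holding
    the fields named in the notes.
    """
    if "latency_bound" in app:
        return app["latency_bound"]
    if app_results:
        # "one such result": taking the first is an arbitrary choice here
        r = app_results[0]
        return r["report_deadline"] - r["received_time"]
    return WEEK_SECONDS


def fpops_est(app, app_workunits):
    """FLOP count estimate for one app, with the same fallback order."""
    if "fpops_est" in app:
        return app["fpops_est"]
    if app_workunits:
        return app_workunits[0]["rsc_fpops_est"]
    return DEFAULT_FPOPS_EST
```

For example, an app with no `latency_bound` field and one result received at t=2000 with a deadline of t=5000 gets a latency bound of 3000 seconds.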
+
+ An application has a normal distribution of actual FLOP count.
+ It can be specified as app.fpops_actual.
+ Otherwise it is mean app.fpops_est, stddev 0.
+
+ An application has an associated weight that determines
+ the fraction of its jobs dispatched by that project.
+ This defaults to 1.
+
+ An application version has a fixed working set size.
+ This can be specified as app_version.working_set.
+ If not, and there is an active task for that app version,
+ active_task.working_set_size is used.
+ Otherwise it defaults to 0.
+
+ The availability of the projects (i.e. the periods when scheduler RPCs succeed)
+ is modeled with two parameters:
+ the duration of available periods are exponentially distributed
+ with the given mean,
+ and the unavailable periods are exponentially distributed
+ achieving the given available fraction.
+ The availability of a project can be specified
+ as project.available;
+ otherwise it is always available.
+
+ The algorithm for simulating a scheduler RPC is:

  {{{
- <host>
-     <p_fpops>x</p_fpops>
-     <m_nbytes>x</m_nbytes>
-     <p_ncpus>x</p_ncpus>
-     [ <coproc>
-         <type>cuda</type>
-         <count>n</count>
-         <available_ram>x</available_ram>
-     </coproc> ]
-     <available>
-         <frac>.7</frac>
-         <lambda>1000</lambda>
-     </available>
-     <idle>
-         <frac>.7</frac>
-         <lambda>1000</lambda>
-     </idle>
- </host>
+ while need more work
+     X = list of apps with versions for requested resources
+     choose an app A from X, randomly based on weights
+     V = version that uses requested resources
+         and has highest FLOPS
+     J = generate job
+     if J is feasible
+         update request
+     else
+         infeasible_count++
+         if infeasible_count == 10
+             break
+
  }}}
+

  The available periods (i.e., when BOINC is running) and the idle periods
  (i.e. when there is no user input) are modeled as above.
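The on/off availability model described above (exponentially distributed available periods with a given mean, and exponentially distributed unavailable periods sized to reach a given available fraction) can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulator's implementation: it assumes `lambda` is the mean length of an available period, that the simulation starts in the available state, and the function names are hypothetical. The mean unavailable period follows from frac = mean_on / (mean_on + mean_off).

```python
import random


def mean_unavailable(frac, lam):
    """Mean unavailable-period length that yields available fraction `frac`."""
    return lam * (1.0 - frac) / frac


def simulate_availability(frac, lam, duration, rng):
    """Return contiguous (start, end, available) periods covering [0, duration].

    Assumes the first period is an available one (an assumption of this sketch).
    """
    if not 0.0 < frac <= 1.0:
        raise ValueError("frac must be in (0, 1]")
    if frac == 1.0:
        return [(0.0, duration, True)]  # always available
    periods = []
    t = 0.0
    available = True
    while t < duration:
        mean = lam if available else mean_unavailable(frac, lam)
        # expovariate takes the rate, i.e. 1 / mean
        end = min(t + rng.expovariate(1.0 / mean), duration)
        periods.append((t, end, available))
        t = end
        available = not available
    return periods
```

With `frac` = .7 and `lambda` = 1000, as in the v12 sample files, the mean unavailable period works out to about 429. Passing a seeded `random.Random` instance as `rng` makes a run reproducible.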

- === sim_prefs.xml === #input_sim_prefs
-
- Same format as the [PrefsImpl global_prefs.xml] file.
+ === global_prefs.xml ===
+
+ format described [PrefsImpl here].

  === cc_config.xml === #input_cc_config

- Same format as the client's [ClientMessages cc_config.xml] file.
+ format described [ClientMessages here].

  == Building and running the simulator == #build_and_run