Changes between Version 12 and Version 13 of ClientSim


Timestamp:
Sep 23, 2010, 2:07:12 PM (14 years ago)
Author:
davea
Comment:

--

  • ClientSim

    v12 v13  
    11= BOINC client simulator =
    22
    3 The BOINC client simulator simulates a single BOINC client interacting with one or more projects.
    4 The simulator models the CPU scheduling and work-fetch policies of the BOINC client very closely;
    5 in fact it uses the same source code as the core client for these policies.
    6 The simulator implements several different scheduling policies: the ones currently in use,
    7 the ones used in the old (version 4) client,
    8 and several experimental policies that are under development.
     3The BOINC client simulator simulates a single BOINC client
     4interacting with one or more projects.
     5The simulator uses the same source code as the client for
     6the CPU scheduling and work-fetch policies,
     7so it models the BOINC client accurately.
    98
    109The intended uses of the simulator include:
    1110
    12  * Identifying scenarios (combinations of host and project characteristics) where the current scheduling policies don't behave well.
     11 * Identifying scenarios (combinations of host and project characteristics)
     12  where the current scheduling policies don't behave well.
    1313 * Studying experimental policies.
    1414
    15 However, the simulator is not necessarily perfect - in some cases its results may differ significantly from what the actual client would do.
     15However, the simulator is not necessarily perfect -
     16in some cases its results may differ significantly from what the
     17actual client would do.
    1618Or its inputs may be inadequate to describe a real-life scenario.
    17 If you find such cases, please send email to David Anderson or [/email_lists.php boinc_dev].
     19If you find such cases, please send email to David Anderson
     20or [/email_lists.php boinc_dev].
    1821
    1922You can use the simulator in either of two ways:
    20  * Through a [/sim_form.php web interface]. This lets you do one simulation at a time, and shows you results graphically.
    21  * Compile it yourself. This provides a more flexible, but less convenient, interface.
     23 * Through a [/sim_form.php web interface].
     24   This lets you do one simulation at a time, and shows you results graphically.
     25 * Compile it yourself and run from a command line.
     26   This provides a more flexible interface.
    2227
    2328== Input files ==
    2429
    25 The input consists of four files.
    26 
    27 === sim_projects.xml === #input_sim_projects
     30The input consists of the following files:
     31
     32=== client_state.xml === #input_client_state
    2833
    2934This describes a set of attached projects.
     35The format is an extension of the state file generated by the client;
     36you can use the state file of a running client as
     37an input to the simulator.
     38
     39The fields used by the simulator are as follows
     40(fields marked with * are not generated by the client).
    3041
    3142{{{
    32 <projects>
    33     <project>
    34         <project_name>P1</project_name>
    35         <resource_share>100</resource_share>
    36         <app>
    37             <latency_bound>1000</latency_bound>
    38             <fpops_est>1e9</fpops_est>
    39             <fpops>
    40                 <mean>1e9</mean>
    41                 <stdev>1e5</stdev>
    42             </fpops>
    43             <working_set>1e8</working_set>
    44             [<avg_ncpus>x</avg_ncpus>]
    45             [<natis>x</natis>]
    46             [<ncudas>x</ncudas>]
    47             [<gpu_ram>x</gpu_ram>]
    48             [<flops_est>x</flops_est>]
    49         </app>
    50         ... other apps
    51         <available>
    52             <frac>.7</frac>
    53             <lambda>1000</lambda>
    54         </available>
    55         [<max_infeasible_count>N</max_infeasible_count>]
    56     </project>
    57     ... other projects
    58 </projects>
     43host_info
     44        p_ncpus
     45        p_fpops
     46        m_nbytes
     47time_stats
     48        on_frac
     49        connected_frac
     50        active_frac
     51        gpu_active_frac
     52    *available
     53        frac
     54        lambda
     55    *idle
     56        frac
     57        lambda
     58project
     59        project_name
     60        resource_share
     61        *available
     62                frac
     63                lambda
     64app
     65        name
     66        *latency_bound
     67        *fpops_est
     68        *fpops_actual
     69                mean
     70                stddev
     71        *weight
     72app_version
     73        app_name
     74        avg_ncpus
     75        flops
     76        plan_class
     77        coproc
     78                type
     79                count
     80        gpu_ram
     81        *working_set
     82workunit
     83        app_name
     84        rsc_fpops_est
     85        rsc_fpops_bound
     86result
     87        name
     88        report_deadline
     89        received_time
     90active_task
     91        result_name
     92        working_set_size
    5993}}}
    6094
    61 A project has one or more applications.
    62 Each application has a given latency bound and working-set size.
    63 The number of FP ops is a truncated normal distribution with the given mean and standard deviation.
    64 
    65 The availability of the projects (i.e. the periods when scheduler RPCs succeed) is modeled with two parameters:
    66 the durations of available periods are exponentially distributed with the given mean,
    67 and the durations of unavailable periods are exponentially distributed with a mean chosen to achieve the given available fraction.
    68 
    69 max_infeasible_count specifies how many infeasible jobs (due to deadline/workload)
    70 to tolerate before giving up in the work-send loop.  Default is 0.
    71 
    72 === sim_host.xml === #input_sim_host
    73 
    74 This describes the host hardware and availability.
     95Notes:
     96Each application has a fixed latency bound.
     97It can be specified in app.latency_bound.
     98If not, and there is a result for that app,
     99it is computed as report_deadline - received_time for one such result.
     100If there is no result, it is 1 week.
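
The fallback order for the latency bound can be sketched as follows. This is a minimal illustration, not the simulator's code: the function name `resolve_latency_bound` and the representation of results as `(report_deadline, received_time)` pairs are assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

ONE_WEEK = 7 * 24 * 3600.0  # default latency bound, in seconds


def resolve_latency_bound(
    app_latency_bound: Optional[float],
    results: Sequence[Tuple[float, float]],
) -> float:
    """Return the latency bound for one app (hypothetical helper).

    results holds (report_deadline, received_time) pairs for results
    belonging to this app; the layout is assumed for illustration.
    """
    # 1. An explicit app.latency_bound wins.
    if app_latency_bound is not None:
        return app_latency_bound
    # 2. Otherwise derive it from one result of this app.
    if results:
        report_deadline, received_time = results[0]
        return report_deadline - received_time
    # 3. Otherwise fall back to one week.
    return ONE_WEEK
```

For example, an app with no `latency_bound` and one result received at time 2000 with a deadline of 5000 would get a latency bound of 3000 seconds.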
     101
     102An application has a fixed FLOP count estimate.
     103It can be specified as app.fpops_est.
     104If not, and there is a WU for that app, it is wu.rsc_fpops_est.
     105Otherwise it is 3600*1e9 (i.e., one hour of computing at 1 GFLOPS).
     106
     107The actual FLOP count of an application's jobs is normally distributed.
     108The distribution can be specified as app.fpops_actual.
     109Otherwise the mean is app.fpops_est and the standard deviation is 0.
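
A rough sketch of how the FLOP count estimate and the actual FLOP count could be resolved from these rules. The helper names and argument layout are hypothetical, and the sample is drawn from a plain normal distribution with no truncation, since the text above does not specify one.

```python
import random
from typing import Optional, Tuple

DEFAULT_FPOPS_EST = 3600 * 1e9  # one hour at 1 GFLOPS


def resolve_fpops_est(
    app_fpops_est: Optional[float],
    wu_rsc_fpops_est: Optional[float],
) -> float:
    """app.fpops_est, else wu.rsc_fpops_est, else the default (hypothetical helper)."""
    if app_fpops_est is not None:
        return app_fpops_est
    if wu_rsc_fpops_est is not None:
        return wu_rsc_fpops_est
    return DEFAULT_FPOPS_EST


def sample_fpops_actual(
    fpops_est: float,
    fpops_actual: Optional[Tuple[float, float]],
    rng: random.Random,
) -> float:
    """Draw one job's actual FLOP count.

    fpops_actual is an optional (mean, stddev) pair; when absent the
    distribution degenerates to the estimate with stddev 0.
    """
    mean, stddev = fpops_actual if fpops_actual is not None else (fpops_est, 0.0)
    if stddev == 0:
        return mean
    return rng.gauss(mean, stddev)
```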
     110
     111An application has an associated weight that determines
     112the fraction of its jobs dispatched by that project.
     113This defaults to 1.
     114
     115An application version has a fixed working set size.
     116This can be specified as app_version.working_set.
     117If not, and there is an active task for that app version,
     118active_task.working_set_size is used.
     119Otherwise it defaults to 0.
     120
     121The availability of the projects (i.e. the periods when scheduler RPCs succeed)
     122is modeled with two parameters:
     123the durations of available periods are exponentially distributed
     124with the given mean,
     125and the durations of unavailable periods are exponentially distributed
     126with a mean chosen to achieve the given available fraction.
     127The availability of a project can be specified
     128as project.available;
     129otherwise it is always available.
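
The two-parameter availability model can be illustrated with a short sampler. This sketch assumes that `lambda` is the mean duration of an available period and that the simulation starts in an available period; the function names are hypothetical. Given an available fraction `frac` and mean available duration `m`, the mean unavailable duration that yields that fraction in the long run is `m * (1 - frac) / frac`.

```python
import random
from typing import List, Tuple


def mean_unavailable(frac: float, mean_available: float) -> float:
    """Mean length of unavailable periods that gives the requested fraction."""
    if not 0 < frac <= 1:
        raise ValueError("frac must be in (0, 1]")
    return mean_available * (1 - frac) / frac


def availability_periods(
    frac: float,
    mean_available: float,
    total_time: float,
    rng: random.Random,
) -> List[Tuple[float, float, bool]]:
    """Return contiguous (start, end, is_available) periods covering [0, total_time]."""
    if frac == 1:
        return [(0.0, total_time, True)]  # always available
    mean_off = mean_unavailable(frac, mean_available)
    periods = []
    t = 0.0
    available = True  # assumption: start in an available period
    while t < total_time:
        mean = mean_available if available else mean_off
        # expovariate takes a rate, i.e. 1 / mean
        end = min(t + rng.expovariate(1.0 / mean), total_time)
        periods.append((t, end, available))
        t = end
        available = not available
    return periods
```

With `frac = 0.7` and a mean available duration of 1000, the mean unavailable duration comes out to roughly 429.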
     130
     131The algorithm for simulating a scheduler RPC is:
    75132
    76133{{{
    77 <host>
    78     <p_fpops>x</p_fpops>
    79     <m_nbytes>x</m_nbytes>
    80     <p_ncpus>x</p_ncpus>
    81     [ <coproc>
    82         <type>cuda</type>
    83         <count>n</count>
    84         <available_ram>x</available_ram>
    85     </coproc> ]
    86     <available>
    87         <frac>.7</frac>
    88         <lambda>1000</lambda>
    89     </available>
    90     <idle>
    91         <frac>.7</frac>
    92         <lambda>1000</lambda>
    93     </idle>
    94 </host>
     134while need more work
     135        X = list of apps with versions for requested resources
     136        choose an app A from X, randomly based on weights
     137        V = version that uses requested resources
     138                and has highest FLOPS
     139        J = generate job
     140        if J is feasible
     141                update request
     142        else
     143                infeasible_count++
     144                if infeasible_count == 10
     145                        break
     146       
    95147}}}
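
The loop above might look roughly like this in Python. The `App` and `AppVersion` classes, the notion of a request as a number of seconds of work for one resource, and the feasibility test (estimated runtime no greater than the latency bound) are all simplifying assumptions made for this sketch; the actual simulator's data structures and feasibility check may differ.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_INFEASIBLE = 10  # give up after this many infeasible jobs


@dataclass
class AppVersion:
    resource: str  # e.g. "cpu" or "cuda" (hypothetical labels)
    flops: float


@dataclass
class App:
    name: str
    weight: float
    fpops_est: float
    latency_bound: float
    versions: List[AppVersion] = field(default_factory=list)


def simulate_rpc(
    apps: List[App],
    resource: str,
    seconds_requested: float,
    rng: random.Random,
) -> List[Tuple[str, float]]:
    """Return (app name, estimated runtime) for each job sent."""
    jobs = []
    infeasible_count = 0
    # X = apps that have a version for the requested resource
    candidates = [a for a in apps if any(v.resource == resource for v in a.versions)]
    while seconds_requested > 0 and candidates:
        # choose an app randomly, in proportion to its weight
        app = rng.choices(candidates, weights=[a.weight for a in candidates])[0]
        # V = fastest version that uses the requested resource
        version = max(
            (v for v in app.versions if v.resource == resource),
            key=lambda v: v.flops,
        )
        runtime = app.fpops_est / version.flops
        if runtime <= app.latency_bound:  # assumed feasibility test
            jobs.append((app.name, runtime))
            seconds_requested -= runtime  # update request
        else:
            infeasible_count += 1
            if infeasible_count == MAX_INFEASIBLE:
                break
    return jobs
```

Choosing the app with `random.Random.choices` and per-app weights mirrors the "randomly based on weights" step; an app with weight 2 is picked about twice as often as one with weight 1.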
     148
    96149
    97150The available periods (i.e., when BOINC is running) and the idle periods
    98151(i.e. when there is no user input) are modeled as above.
    99152
    100 === sim_prefs.xml === #input_sim_prefs
    101 
    102 Same format as the [PrefsImpl global_prefs.xml] file.
     153=== global_prefs.xml ===
     154
     155The format is described [PrefsImpl here].
    103156
    104157=== cc_config.xml === #input_cc_config
    105158
    106 Same format as the client's [ClientMessages cc_config.xml] file.
     159The format is described [ClientMessages here].
    107160
    108161== Building and running the simulator == #build_and_run