Version 8 (modified by davea, 17 years ago)

--

BOINC client simulator

The BOINC client simulator simulates a single BOINC client interacting with one or more projects. It models the client's CPU scheduling and work-fetch policies very closely; in fact, it uses the same source code as the core client for these policies. The simulator implements several scheduling policies: those currently in use, those used in the old (version 4) client, and several experimental policies under development.

The intended uses of the simulator include:

  • Identifying scenarios (combinations of host and project characteristics) where the current scheduling policies don't behave well.
  • Studying experimental policies.

However, the simulator is not perfect: in some cases its results may differ significantly from what the actual client would do. If you find such cases, please send email to David Anderson or boinc_dev.

You can use the simulator in either of two ways:

  • Through a web interface. This lets you do one simulation at a time, and shows you results graphically.
  • By compiling it yourself. This provides a more flexible, but less convenient, interface.

Input files

The input consists of four files.

sim_projects.xml

This describes a set of attached projects.

<projects>
    <project>
        <project_name>P1</project_name>
        <resource_share>100</resource_share>
        <app>
            <latency_bound>1000</latency_bound>
            <fpops_est>1e9</fpops_est>
            <fpops>
                <mean>1e9</mean>
                <stdev>1e5</stdev>
            </fpops>
            <working_set>100000000</working_set>
        </app>
        ...
        <available>
            <frac>.7</frac>
            <lambda>1000</lambda>
        </available>
        [<max_infeasible_count>N</max_infeasible_count>]
    </project>
    ...
</projects>

A project has one or more applications. Each application has a given latency bound and working-set size. The number of FP ops per job is drawn from a truncated normal distribution with the given mean and standard deviation.
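As a rough illustration of the FP-ops model, the following Python sketch draws job sizes from a normal distribution and rejects non-positive samples. The truncation point (zero) and the function name sample_fpops are assumptions made for this sketch; the simulator's actual truncation rule is not described on this page.

```python
import random

def sample_fpops(mean, stdev, rng):
    """Draw one job size from a normal distribution truncated below at zero.

    Truncating at zero is an assumption made for this sketch.
    """
    while True:
        x = rng.gauss(mean, stdev)
        if x > 0:
            return x

rng = random.Random(1)  # fixed seed so the run is repeatable
# Same mean and stdev as the <fpops> element in the sample file above.
sizes = [sample_fpops(1e9, 1e5, rng) for _ in range(1000)]
print(min(sizes), max(sizes))
```

With a standard deviation this small relative to the mean, the truncation almost never triggers; it matters only when stdev is comparable to mean.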

The availability of a project (i.e., the periods when scheduler RPCs succeed) is modeled with two parameters. The durations of available periods are exponentially distributed with the given mean (lambda). The durations of unavailable periods are also exponentially distributed, with a mean chosen so that the given available fraction (frac) is achieved.
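The availability model can be sketched in Python as follows. If available periods have mean m and the long-run available fraction is f, then the unavailable periods need mean m(1 - f)/f, since f = m / (m + mean_off). The function names, the choice to start in an available period, and the fixed random seed are assumptions of this sketch, not details taken from the simulator source.

```python
import random

def mean_unavailable(frac, mean_available):
    """Mean unavailable duration that yields the target available fraction.

    Assumes 0 < frac < 1.
    """
    return mean_available * (1.0 - frac) / frac

def simulate_available_fraction(frac, mean_available, duration, rng):
    """Alternate exponential on/off periods; return the observed on-fraction."""
    mean_off = mean_unavailable(frac, mean_available)
    t = 0.0
    on_time = 0.0
    available = True  # assumption: start in an available period
    while t < duration:
        mean = mean_available if available else mean_off
        # Clip the last period so we never run past the simulated duration.
        period = min(rng.expovariate(1.0 / mean), duration - t)
        if available:
            on_time += period
        t += period
        available = not available
    return on_time / duration

rng = random.Random(0)
observed = simulate_available_fraction(0.7, 1000.0, 1e7, rng)
print(round(observed, 3))
```

Over a long enough run, the observed fraction should settle near the configured frac value.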

max_infeasible_count specifies how many infeasible jobs (due to deadline/workload) to tolerate before giving up in the work-send loop. The default is 0.

sim_host.xml

This describes the host hardware and availability.

<host>
    <p_fpops>x</p_fpops>
    <m_nbytes>x</m_nbytes>
    <p_ncpus>x</p_ncpus>
    <available>
        <frac>.7</frac>
        <lambda>1000</lambda>
    </available>
    <idle>
        <frac>.7</frac>
        <lambda>1000</lambda>
    </idle>
</host>

The available periods (i.e., when BOINC is running) and the idle periods (i.e., when there is no user input) are modeled in the same way as project availability above.
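For batch experiments it can be convenient to generate the host file from a script. The sketch below builds a host element with Python's standard xml.etree.ElementTree module. The helper name build_host and all numeric values are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def build_host(p_fpops, m_nbytes, p_ncpus, available, idle):
    """Build a <host> element; available and idle are (frac, lambda) pairs."""
    host = ET.Element("host")
    ET.SubElement(host, "p_fpops").text = str(p_fpops)
    ET.SubElement(host, "m_nbytes").text = str(m_nbytes)
    ET.SubElement(host, "p_ncpus").text = str(p_ncpus)
    for tag, (frac, lam) in (("available", available), ("idle", idle)):
        period = ET.SubElement(host, tag)
        ET.SubElement(period, "frac").text = str(frac)
        ET.SubElement(period, "lambda").text = str(lam)
    return host

# Hypothetical values, not recommendations.
host = build_host(1e9, 2e9, 2, available=(0.7, 1000), idle=(0.7, 1000))
xml_text = ET.tostring(host, encoding="unicode")
print(xml_text)
```

To write the result to disk, ET.ElementTree(host).write("sim_host.xml") can be used in the directory where the simulation will run.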

sim_prefs.xml

Same format as the global_prefs.xml file.

cc_config.xml

Same format as the client's cc_config.xml file.

Building and running the simulator

The simulator can be built with 'makefile_sim' on Unix or the 'sim' project on Windows. The usage is:

sim [--duration X] [--delta X] [--server_uses_workload] [--dcf_dont_use] [--dcf_stats] [--dirs d1 ...]
--duration
Simulate this much time.
--delta
Time step of the simulation.
--server_uses_workload
Servers take the existing workload into account when deciding whether to send jobs.
--dcf_dont_use
Fix the duration correction factor (DCF) at one.
--dcf_stats
Use a formula for DCF based on the mean and standard deviation of completion times.
--dirs d1 ...
chdir into each of the given directories and run a simulation based on the input files there. Print a summary of each one separately, and a total summary.
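When running many scenarios, the command line can be assembled from a script. The following Python sketch builds an argument list using only the options listed above. The helper names, the ./sim path, the directory names, and the numeric values are assumptions for illustration; the units of --duration and --delta are not stated on this page.

```python
import subprocess

def build_sim_command(duration=None, delta=None, server_uses_workload=False,
                      dcf_dont_use=False, dcf_stats=False, dirs=()):
    """Assemble an argument list for the sim binary from the documented options."""
    cmd = ["./sim"]  # assumption: the built binary is in the current directory
    if duration is not None:
        cmd += ["--duration", str(duration)]
    if delta is not None:
        cmd += ["--delta", str(delta)]
    if server_uses_workload:
        cmd.append("--server_uses_workload")
    if dcf_dont_use:
        cmd.append("--dcf_dont_use")
    if dcf_stats:
        cmd.append("--dcf_stats")
    if dirs:
        cmd += ["--dirs", *dirs]
    return cmd

def run_sim(cmd):
    """Run the simulator; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)

# Hypothetical scenario directories, each holding the four input files.
cmd = build_sim_command(duration=86400, delta=60, dirs=["scenario1", "scenario2"])
print(" ".join(cmd))
# prints: ./sim --duration 86400 --delta 60 --dirs scenario1 scenario2
```

run_sim is not called here because it requires a built simulator; call run_sim(cmd) once the binary and input directories exist.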

Output files

The simulator creates two output files:

sim_log.txt: This is the message log (same as would be generated by the client). Its contents are controlled by cc_config.xml.

sim_out.html: When viewed in a web browser, this shows a 'time line' of what's running when. The bottom of this file contains four numbers that summarize how well the policies performed:

wasted_frac
Of the total CPU time, the fraction spent computing results that missed their deadline.
idle_frac
Of the total CPU time, the fraction spent not computing.
share_violation
A measure (0 to 1) of how badly resource shares were violated.
monotony
A measure (0 to 1) of how long a single project used all CPUs (so that the user would see only that project on their screensaver, and get bored).

In addition, information is printed about the per-project CPU time and waste.
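The first two summary numbers are simple ratios, which the Python sketch below spells out. It covers only wasted_frac and idle_frac, because the formulas for share_violation and monotony are not given on this page. The function name and the sample totals are hypothetical, and how the simulator itself accumulates total CPU time is not specified here.

```python
def summary_fractions(total_cpu_time, wasted_cpu_time, idle_cpu_time):
    """Return wasted_frac and idle_frac as fractions of the total CPU time."""
    if total_cpu_time <= 0:
        raise ValueError("total_cpu_time must be positive")
    return {
        # CPU time spent on results that missed their deadline.
        "wasted_frac": wasted_cpu_time / total_cpu_time,
        # CPU time during which nothing was being computed.
        "idle_frac": idle_cpu_time / total_cpu_time,
    }

# Hypothetical totals, in arbitrary time units.
result = summary_fractions(1000.0, 50.0, 200.0)
print(result)
```

Lower values are better for both numbers: a wasted_frac near zero means few deadline misses, and an idle_frac near zero means the CPUs were kept busy.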