Changes between Version 66 and Version 67 of ProjectOptions


Timestamp:
May 23, 2008, 5:11:42 PM (17 years ago)
Author:
davea
Comment:

--

  • ProjectOptions

    v66 v67  
    1212}}}
    1313
    14 == Scheduling == #scheduling
     14== Scheduling: limiting work distribution ==
     15
    1516{{{
    1617<one_result_per_user_per_wu/>
    1718}}}
    1819If set, send at most one result of a given workunit to a given user. This is useful for checking accuracy/validity of results. It ensures that the results for a given workunit are generated by different users. If you have a validator that compares different results for a given workunit to ensure that they are equivalent, you should probably enable this. Otherwise you may end up validating results from a given user with results from the same user.
    19 
     20{{{
     21<one_result_per_host_per_wu/>
     22}}}
     23If present, send at most one result of a given workunit to a given host. This is weaker than `one_result_per_user_per_wu`; it is useful if you're using homogeneous redundancy and most of the hosts of a particular class belong to a single user.
    2024{{{
    2125<max_wus_to_send> N </max_wus_to_send>
     
    3943By default, results are not sent to hosts too slow to complete them within the delay bound. If this flag is set, this rule is not enforced.
    4044{{{
    41 <dont_generate_upload_certificates/>
    42 }}}
    43 Don't put upload certificates in results. This makes result generation a lot faster, since no encryption is done, but you lose protection against DoS attacks on your upload servers.
    44 {{{
    45 <ignore_upload_certificates/>
    46 }}}
    47 If upload certificates are not generated, this option must be enabled to force the file upload handler to accept files being uploaded.
    48 {{{
    49 <locality_scheduling/>
    50 }}}
    51 When possible, send work that uses the same files that the host already has. This is intended for projects which have large data files, where many different workunits use the same data file. In this case, to reduce download demands on the server, it may be advantageous to retain the data files on the hosts, and send them work for the files that they already have. See [LocalityScheduling Locality Scheduling].
    52 {{{
    53 <locality_scheduling_wait_period> N </locality_scheduling_wait_period>
    54 }}}
    55 This element only has an effect when used in conjunction with the previous locality scheduling element. It tells the scheduler to use 'trigger files' to inform the project that more work is needed for specific files. The period is the number of seconds which the scheduler will wait to see if the project can create additional work. Together with project-specific daemons or scripts this can be used for 'just-in-time' workunit creation. See [LocalityScheduling Locality Scheduling].
    56 {{{
    57 <min_core_client_version> N </min_core_client_version>
    58 }}}
    59 If the scheduler gets a request from a client with a version number less than this, it returns an error message and doesn't do any other processing.  The version number is expressed as an integer with the encoding major*100+minor.  You can also specify this separately for each [AppVersion application].
    60 {{{
    61 <choose_download_url_by_timezone> 0|1 </choose_download_url_by_timezone>
    62 }}}
    63 When the scheduler sends work to hosts, it replaces the download URL appearing in the data and executable file descriptions with the download URL closest to the host's timezone. The project must provide a two-column file called 'download_servers' in the project root directory. This is a list of all download servers that will be inserted when work is sent to hosts. The first column is an integer listing the server's offset in seconds from UTC. The second column is the server URL in a format such as !http://einstein.phys.uwm.edu. The download servers must have identical file hierarchies and contents, and the path to files and executables must start with '/download/...' as in '!http://X/download/123/some_file_name'.
    64 {{{
    65 <cache_md5_info> 0|1 </cache_md5_info>
    66 }}}
    67 When creating work, keep a record (in files called foo.md5) of the file length and md5 sum of data files and executables. This can greatly reduce the time needed to create work, if (1) these files are re-used, and (2) there are many of these files, and (3) reading the files from disk is time-consuming.
    68 {{{
    69 <nowork_skip> 0|1 </nowork_skip>
    70 }}}
    71 If the scheduling server has no work, it replies to RPCs without doing any database access (e.g., without looking up the user or host record). This reduces DB load, but it fails to update preferences when users click on Update. Use it if your server DB is overloaded.
    72 {{{
    73 <resend_lost_results> 0|1 </resend_lost_results>
    74 }}}
    75 If set, and an <other_results> list is present in the scheduler request, resend any in-progress results not in the list. This is recommended; it may increase the efficiency of your project. For reasons that are not well understood, a BOINC client sometimes fails to receive the scheduler reply. This flag addresses that issue: it causes the SAME results to be resent by the scheduler, if the client has failed to receive them.  Note: this will increase the load on your DB server; you can minimize this by creating an index:
    76 {{{
    77 alter table result add index res_host_state (hostid, server_state);
    78 }}}
    79 {{{
    80 <send_result_abort>0|1</send_result_abort>
    81 }}}
    82 If set, and the client is processing a result for a WU that has been canceled or is not in the DB (i.e. there's no chance of getting credit), tell the client to abort the result regardless of state. If the client is processing a result for a WU that has been assimilated or is overdue (i.e. there's a chance of not getting credit), tell the client to abort the result if it hasn't started yet. Note: this will increase the load on your DB server.
    83 {{{
    84 <default_disk_max_used_gb> X </default_disk_max_used_gb>
    85 }}}
    86 Sets the default value for the `disk_max_used_gb` preference so it's consistent between the scheduler and web pages. The scheduler uses it when a request for work doesn't include preferences, or the preference is set to zero. The web page scripts use it to set the initial value when displaying or editing preferences the first time, or when the user never saved them. Default is 100.
    87 {{{
    88 <default_disk_max_used_pct> X </default_disk_max_used_pct>
    89 }}}
    90 Sets the default value for the `disk_max_used_pct` preference so it's consistent between the scheduler and web pages. The scheduler uses it when a request for work doesn't include preferences, or the preference is set to zero. The web page scripts use it to set the initial value when displaying or editing preferences the first time, or when the user never saved them. Default is 50.
    91 {{{
    92 <default_disk_min_free_gb> X </default_disk_min_free_gb>
    93 }}}
    94 Sets the default value for the `disk_min_free_gb` preference so it's consistent between the scheduler and web pages. The scheduler uses it when a request for work doesn't include preferences. The web page scripts use it to set the initial value when displaying or editing preferences the first time, or when the user never saved them. Also, the scheduler uses this setting to override any smaller preference from the host; it enforces a 'minimum free disk space' to keep from filling up the drive. Recommend setting this no smaller than .001 (1MB or 1,000,000 bytes). Default is .001.
    95 {{{
    96 <one_result_per_host_per_wu/>
    97 }}}
    98 If present, send at most one result of a given workunit to a given host. This is weaker than `one_result_per_user_per_wu`; it is useful if you're using homogeneous redundancy and most of the hosts of a particular class belong to a single user.
    99 {{{
    100 <next_rpc_delay>x</next_rpc_delay>
    101 }}}
    102 In each scheduler reply, tell the clients to do another scheduler RPC after at most X seconds, regardless of whether they need work.  This is useful, e.g., to ensure that in-progress jobs can be canceled in a bounded amount of time.
     45<ban_os>regexp</ban_os>
     46}}}
     47Any host for which os_name<tab>os_version matches the given regular expression will not be sent jobs. This is a POSIX extended regular expression.
     48{{{
     49<ban_cpu>regexp</ban_cpu>
     50}}}
     51Any host for which p_vendor<tab>p_model matches the given regular expression will not be sent jobs. This is a POSIX extended regular expression.
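
The matching rule for `ban_os` and `ban_cpu` can be sketched as follows. This is an illustration only, not the scheduler's code: the function name `is_banned` is hypothetical, Python's `re` module stands in for a POSIX extended regular expression engine (the two agree for simple patterns like the ones shown), and an unanchored search is assumed.

```python
import re


def is_banned(first_field: str, second_field: str, pattern: str) -> bool:
    """Return True if 'first_field<tab>second_field' matches the pattern.

    For ban_os the fields would be os_name and os_version;
    for ban_cpu they would be p_vendor and p_model.
    Assumption: the pattern may match anywhere in the string.
    """
    subject = f"{first_field}\t{second_field}"
    return re.search(pattern, subject) is not None


# Hypothetical pattern: ban hosts whose OS name is Linux with a 2.4 kernel.
print(is_banned("Linux", "2.4.20", "^Linux\t2\\.4"))
print(is_banned("Linux", "2.6.24", "^Linux\t2\\.4"))
```

Anchoring the pattern with `^` and including the tab, as in the example, keeps a pattern for one field from accidentally matching text in the other.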
    10352{{{
    10453<workload_sim>0|1</workload_sim>
    10554}}}
    10655Do a simulation, based on current client workload, in deciding whether a job's deadline can be met.
    107 {{{
    108 <shmem_work_items>N</shmem_work_items>
    109 }}}
    110 The size of the shared-memory array of jobs.  Default is 100.
    111 {{{
    112 <feeder_query_size>N</feeder_query_size>
    113 }}}
    114  The size of the feeder's enumeration query.  Default is 200.
    11556{{{
    11657<homogeneous_redundancy>N</homogeneous_redundancy>
     
    11960Otherwise, specifies the granularity of host classification (1=fine, 2=coarse).
    12061{{{
    121 <ended>0|1</ended>
    122 }}}
    123 Project has permanently ended.  Tell clients so user can be notified.
    124 {{{
    125 <ban_os>regexp</ban_os>
    126 }}}
    127 Any host for which os_name<tab>os_version matches the given regular expression will not be sent jobs. This is a POSIX extended regular expression.
    128 {{{
    129 <ban_cpu>regexp</ban_cpu>
    130 }}}
    131 Any host for which p_vendor<tab>p_model matches the given regular expression will not be sent jobs. This is a POSIX extended regular expression.
    132 {{{
    133 <max_ncpus>N</max_ncpus>
    134 }}}
    135 Treat all hosts as having no more than N CPUs.  This affects things like max results per day.  Use this, e.g., if your application uses a GPU or other co-processor.
    136 {{{
    137 <granted_credit_weight>X</granted_credit_weight>
    138 }}}
    139 {{{
    140 <granted_credit_ramp_up>N</granted_credit_ramp_up>
    141 }}}
    142 {{{
    14362<distinct_beta_apps>0|1</distinct_beta_apps>
    14463}}}
    14564If set, [AppFiltering user application selection] applies to [BetaTest beta test applications] as well as others.
    146 }}}
    147 Hosts whose average turnaround is at most reliable_max_avg_turnaround and whose error rate is at most reliable_max_error_rate are considered 'reliable'.
     65
     66== Scheduling: array-based scheduling ==
     67{{{
     68<nowork_skip> 0|1 </nowork_skip>
     69}}}
     70If the scheduling server has no work, it replies to RPCs without doing any database access (e.g., without looking up the user or host record). This reduces DB load, but it fails to update preferences when users click on Update. Use it if your server DB is overloaded.
     71{{{
     72<shmem_work_items>N</shmem_work_items>
     73}}}
     74The size of the shared-memory array of jobs.  Default is 100.
     75{{{
     76<feeder_query_size>N</feeder_query_size>
     77}}}
     78The size of the feeder's enumeration query.  Default is 200.
     79
     80{{{
     81<reliable_max_avg_turnaround_time>secs</reliable_max_avg_turnaround_time>
     82<reliable_max_error_rate>secs</reliable_max_error_rate>
     83}}}
     84Hosts whose average turnaround is at most reliable_max_avg_turnaround_time
     85and whose error rate is at most reliable_max_error_rate
     86are considered 'reliable'.
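
The 'reliable' test described above amounts to two threshold comparisons. A minimal sketch, assuming the turnaround is measured in seconds and the error rate is a fraction between 0 and 1 (the function and parameter names are hypothetical, not taken from the scheduler):

```python
def is_reliable(avg_turnaround: float,
                error_rate: float,
                max_avg_turnaround: float,
                max_error_rate: float) -> bool:
    """A host is reliable only if it is under both limits.

    max_avg_turnaround and max_error_rate stand for the configured
    reliable_max_avg_turnaround_time and reliable_max_error_rate values.
    """
    return (avg_turnaround <= max_avg_turnaround
            and error_rate <= max_error_rate)


# Hypothetical limits: one day of turnaround, 1% error rate.
print(is_reliable(43200.0, 0.005, 86400.0, 0.01))
print(is_reliable(43200.0, 0.05, 86400.0, 0.01))
```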
    14887{{{
    14988<reliable_reduced_delay_bound>X</reliable_reduced_delay_bound>
     
    15695}}}
    15796Results with priority at least 'reliable_on_priority' will be sent only to reliable hosts; increase priority of duplicate results by 'reliable_priority_on_over'; increase priority of duplicates caused by timeout (not error) by 'reliable_priority_on_over_except_error'.
    158 {{{
    159 <granted_credit_weight>X</granted_credit_weight>
    160 }}}
    161 KEVIN - PLEASE EXPLAIN
     97
     98
     99== Scheduling: locality scheduling ==
     100{{{
     101<locality_scheduling/>
     102}}}
     103When possible, send work that uses the same files that the host already has. This is intended for projects which have large data files, where many different workunits use the same data file. In this case, to reduce download demands on the server, it may be advantageous to retain the data files on the hosts, and send them work for the files that they already have. See [LocalityScheduling Locality Scheduling].
     104{{{
     105<locality_scheduling_wait_period> N </locality_scheduling_wait_period>
     106}}}
     107This element only has an effect when used in conjunction with the previous locality scheduling element. It tells the scheduler to use 'trigger files' to inform the project that more work is needed for specific files. The period is the number of seconds which the scheduler will wait to see if the project can create additional work. Together with project-specific daemons or scripts this can be used for 'just-in-time' workunit creation. See [LocalityScheduling Locality Scheduling].
     108
     109== Scheduling: job retransmission ==
     110{{{
     111<resend_lost_results> 0|1 </resend_lost_results>
     112}}}
     113If set, and an <other_results> list is present in the scheduler request, resend any in-progress results not in the list. This is recommended; it may increase the efficiency of your project. For reasons that are not well understood, a BOINC client sometimes fails to receive the scheduler reply. This flag addresses that issue: it causes the SAME results to be resent by the scheduler, if the client has failed to receive them.  Note: this will increase the load on your DB server; you can minimize this by creating an index:
     114{{{
     115alter table result add index res_host_state (hostid, server_state);
     116}}}
     117{{{
     118<send_result_abort>0|1</send_result_abort>
     119}}}
     120If set, and the client is processing a result for a WU that has been canceled or is not in the DB (i.e. there's no chance of getting credit), tell the client to abort the result regardless of state. If the client is processing a result for a WU that has been assimilated or is overdue (i.e. there's a chance of not getting credit), tell the client to abort the result if it hasn't started yet. Note: this will increase the load on your DB server.
     121
     122
     123== Scheduling: data distribution ==
     124{{{
     125<choose_download_url_by_timezone> 0|1 </choose_download_url_by_timezone>
     126}}}
     127When the scheduler sends work to hosts, it replaces the download URL appearing in the data and executable file descriptions with the download URL closest to the host's timezone. The project must provide a two-column file called 'download_servers' in the project root directory. This is a list of all download servers that will be inserted when work is sent to hosts. The first column is an integer listing the server's offset in seconds from UTC. The second column is the server URL in a format such as !http://einstein.phys.uwm.edu. The download servers must have identical file hierarchies and contents, and the path to files and executables must start with '/download/...' as in '!http://X/download/123/some_file_name'.
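
As a rough illustration of the 'download_servers' format and the URL replacement, the sketch below parses a two-column listing, picks the server whose UTC offset is nearest the host's, and swaps it in ahead of '/download/'. It is not the scheduler's implementation: the function names, the mirror URL and the offsets are made up, and 'closest' is assumed to mean the smallest absolute difference in seconds, with no wraparound at the date line.

```python
def parse_download_servers(text: str) -> list[tuple[int, str]]:
    """Parse lines of '<offset seconds from UTC> <server URL>'."""
    servers = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        servers.append((int(parts[0]), parts[1]))
    return servers


def closest_server(servers: list[tuple[int, str]], host_offset: int) -> str:
    """Return the URL whose offset is nearest to the host's offset."""
    return min(servers, key=lambda s: abs(s[0] - host_offset))[1]


def rewrite_url(url: str, server: str) -> str:
    """Replace everything before '/download/' with the chosen server."""
    idx = url.find("/download/")
    if idx < 0:
        return url  # path does not follow the required layout
    return server.rstrip("/") + url[idx:]


# Hypothetical file contents; the offsets are illustrative only.
listing = """-21600 http://einstein.phys.uwm.edu
3600 http://mirror.example.org
"""
servers = parse_download_servers(listing)
chosen = closest_server(servers, 7200)
print(rewrite_url("http://X/download/123/some_file_name", chosen))
```

The rewrite only works because every server exposes the same '/download/...' hierarchy, which is why the requirement above insists on identical contents.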
     128{{{
     129<cache_md5_info> 0|1 </cache_md5_info>
     130}}}
     131When creating work, keep a record (in files called foo.md5) of the file length and md5 sum of data files and executables. This can greatly reduce the time needed to create work, if (1) these files are re-used, and (2) there are many of these files, and (3) reading the files from disk is time-consuming.
     132
     133== Upload certificates ==
     134{{{
     135<dont_generate_upload_certificates/>
     136}}}
     137Don't put upload certificates in results. This makes result generation a lot faster, since no encryption is done, but you lose protection against DoS attacks on your upload servers.
     138{{{
     139<ignore_upload_certificates/>
     140}}}
     141If upload certificates are not generated, this option must be enabled to force the file upload handler to accept files being uploaded.
     142
     143== Default preferences ==
     144{{{
     145<default_disk_max_used_gb> X </default_disk_max_used_gb>
     146}}}
     147Sets the default value for the `disk_max_used_gb` preference so it's consistent between the scheduler and web pages. The scheduler uses it when a request for work doesn't include preferences, or the preference is set to zero. The web page scripts use it to set the initial value when displaying or editing preferences the first time, or when the user never saved them. Default is 100.
     148{{{
     149<default_disk_max_used_pct> X </default_disk_max_used_pct>
     150}}}
     151Sets the default value for the `disk_max_used_pct` preference so it's consistent between the scheduler and web pages. The scheduler uses it when a request for work doesn't include preferences, or the preference is set to zero. The web page scripts use it to set the initial value when displaying or editing preferences the first time, or when the user never saved them. Default is 50.
     152{{{
     153<default_disk_min_free_gb> X </default_disk_min_free_gb>
     154}}}
     155Sets the default value for the `disk_min_free_gb` preference so it's consistent between the scheduler and web pages. The scheduler uses it when a request for work doesn't include preferences. The web page scripts use it to set the initial value when displaying or editing preferences the first time, or when the user never saved them. Also, the scheduler uses this setting to override any smaller preference from the host; it enforces a 'minimum free disk space' to keep from filling up the drive. Recommend setting this no smaller than .001 (1MB or 1,000,000 bytes). Default is .001.
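
The override behaviour of `default_disk_min_free_gb` can be pictured as taking the larger of the host's preference and the project setting. A minimal sketch under that reading (the function name is hypothetical, and `None` stands for a request that carries no preferences):

```python
from typing import Optional


def effective_min_free_gb(host_pref_gb: Optional[float],
                          project_default_gb: float = 0.001) -> float:
    """Return the minimum free disk space the scheduler would enforce.

    With no preferences in the request the project default applies;
    otherwise a smaller host preference is raised to the default.
    """
    if host_pref_gb is None:
        return project_default_gb
    return max(host_pref_gb, project_default_gb)


print(effective_min_free_gb(None))
print(effective_min_free_gb(0.0005))
print(effective_min_free_gb(2.0))
```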
     156
    162157
    163158=== Deprecated options ===
     
    180175== Client control == #client-control
    181176{{{
     177<next_rpc_delay>x</next_rpc_delay>
     178}}}
     179In each scheduler reply, tell the clients to do another scheduler RPC after at most X seconds, regardless of whether they need work.  This is useful, e.g., to ensure that in-progress jobs can be canceled in a bounded amount of time.
     180{{{
    182181<verify_files_on_app_start/>
    183182}}}
     
    265264(See also the command-line options of the [ValidationIntro validator]).
    266265{{{
     266<granted_credit_weight>X</granted_credit_weight>
     267}}}
     268{{{
     269<granted_credit_ramp_up>N</granted_credit_ramp_up>
     270}}}
     271{{{
    267272<fp_benchmark_weight> X </fp_benchmark_weight>
    268273}}}
    269274The weighting given to the Whetstone benchmark in the calculation of claimed credit. Must be in [0 .. 1]. Projects whose applications are floating-point intensive should use 1; pure integer applications, 0. Choosing an appropriate value will reduce the disparity in claimed credit between hosts. The script html/ops/credit_study.php, run against the database of a running project, will suggest what value to use.
     275{{{
     276<granted_credit_weight>X</granted_credit_weight>
     277}}}
     278KEVIN - PLEASE EXPLAIN
    270279
    271280== File deletion policy == #file-deletion
     
    354363
    355364== Miscellaneous == #misc
     365{{{
     366<min_core_client_version> N </min_core_client_version>
     367}}}
     368If the scheduler gets a request from a client with a version number less than this, it returns an error message and doesn't do any other processing.  The version number is expressed as an integer with the encoding major*100+minor.  You can also specify this separately for each [AppVersion application].
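
The major*100+minor encoding and the resulting comparison can be sketched as below. The helper names are hypothetical; only the encoding itself comes from the text above.

```python
def encode_version(major: int, minor: int) -> int:
    """Encode a client version as major*100 + minor."""
    return major * 100 + minor


def client_too_old(major: int, minor: int, min_core_client_version: int) -> bool:
    """True if the client is older than the configured minimum."""
    return encode_version(major, minor) < min_core_client_version


# With <min_core_client_version> 510 </min_core_client_version>,
# a 5.8 client would be rejected and a 6.2 client accepted.
print(encode_version(5, 10))
print(client_too_old(5, 8, 510))
print(client_too_old(6, 2, 510))
```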
     369{{{
     370<ended>0|1</ended>
     371}}}
     372Project has permanently ended.  Tell clients so user can be notified.
    356373{{{
    357374<disable_account_creation/>
     
    449466 * scripts: use the bin/parse_config program
    450467
     468