| | 1 | = Database purging utility = |
| | 2 | |
| | 3 | As a BOINC project operates, the size of its workunit and result tables increases. Eventually they become so large that adding a field or building an index may take hours or days. |
| | 4 | |
| | 5 | To address this problem, BOINC provides a utility '''db_purge''' that writes result and WU records to XML-format archive files, then deletes them from the database. |
| | 6 | |
| | 7 | Workunits are purged only when their input files have been deleted. Because of BOINC's file-deletion policy, this implies that all results are completed. So when a workunit is purged, all its results are purged too. |
| | 8 | |
| | 9 | Run db_purge from the project's bin/ directory. It will create an archive/ directory and store archive files there. |
| | 10 | |
| | 11 | db_purge is normally run as a daemon, specified in the [ProjectConfigFile config.xml] file. It has the following command-line options: |
| | 12 | |
| | 13 | '''-min_age_days N ''':: |
| | 14 | Purge only WUs with mod_time at least N days in the past. Recommended value: 7 or so. This lets users examine their recent results. |
| | 15 | '''-max N''':: |
| | 16 | Purge at most N WUs, then exit |
| | 17 | '''-max_wu_per_file N''':: |
| | 18 | Write at most N WUs to each archive file. Recommended value: 10,000 or so. |
| | 19 | '''-zip''':: |
| | 20 | Compress archive files using zip |
| | 21 | '''-gzip''':: |
| | 22 | Compress archive files using gzip |
| | 23 | '''-d N''':: |
| | 24 | Set logging verbosity to N (1,2,3) |
| | 25 | |
| | 26 | == Archive file format == |
| | 27 | |
| | 28 | The archive files have names of the form `wu_archive_TIME` and `result_archive_TIME` where `TIME` is the Unix time the file was created. In addition, db_purge generates index files 'wu_index' and 'result_index' associating each WU and result ID with the timestamp of its archive file. |
| | 29 | |
| | 30 | The format of both type of index files is a number of rows each containing: |
| | 31 | {{{ |
| | 32 | ID TIME |
| | 33 | }}} |
| | 34 | The ID field of the WU or result, 5 spaces, and the timestamp part of the archive filename where the record with that ID can be found. |
| | 35 | |
| | 36 | The format of a record in the result archive file is: |
| | 37 | |
| | 38 | {{{ |
| | 39 | <result_archive> |
| | 40 | <id>%d</id> |
| | 41 | <create_time>%d</create_time> |
| | 42 | <workunitid>%d</workunitid> |
| | 43 | <server_state>%d</server_state> |
| | 44 | <outcome>%d</outcome> |
| | 45 | <client_state>%d</client_state> |
| | 46 | <hostid>%d</hostid> |
| | 47 | <userid>%d</userid> |
| | 48 | <report_deadline>%d</report_deadline> |
| | 49 | <sent_time>%d</sent_time> |
| | 50 | <received_time>%d</received_time> |
| | 51 | <name>%s</name> |
| | 52 | <cpu_time>%.15e</cpu_time> |
| | 53 | <xml_doc_in>%s</xml_doc_in> |
| | 54 | <xml_doc_out>%s</xml_doc_out> |
| | 55 | <stderr_out>%s</stderr_out> |
| | 56 | <batch>%d</batch> |
| | 57 | <file_delete_state>%d</file_delete_state> |
| | 58 | <validate_state>%d</validate_state> |
| | 59 | <claimed_credit>%.15e</claimed_credit> |
| | 60 | <granted_credit>%.15e</granted_credit> |
| | 61 | <opaque>%f</opaque> |
| | 62 | <random>%d</random> |
| | 63 | <app_version_num>%d</app_version_num> |
| | 64 | <appid>%d</appid> |
| | 65 | <exit_status>%d</exit_status> |
| | 66 | <teamid>%d</teamid> |
| | 67 | <priority>%d</priority> |
| | 68 | <mod_time>%s</mod_time> |
| | 69 | </result_archive> |
| | 70 | }}} |
| | 71 | |
| | 72 | The format of a record in the WU archive file is: |
| | 73 | {{{ |
| | 74 | <workunit_archive> |
| | 75 | <id>%d</id> |
| | 76 | <create_time>%d</create_time> |
| | 77 | <appid>%d</appid> |
| | 78 | <name>%s</name> |
| | 79 | <xml_doc>%s</xml_doc> |
| | 80 | <batch>%d</batch> |
| | 81 | <rsc_fpops_est>%.15e</rsc_fpops_est> |
| | 82 | <rsc_fpops_bound>%.15e</rsc_fpops_bound> |
| | 83 | <rsc_memory_bound>%.15e</rsc_memory_bound> |
| | 84 | <rsc_disk_bound>%.15e</rsc_disk_bound> |
| | 85 | <need_validate>%d</need_validate> |
| | 86 | <canonical_resultid>%d</canonical_resultid> |
| | 87 | <canonical_credit>%.15e</canonical_credit> |
| | 88 | <transition_time>%d</transition_time> |
| | 89 | <delay_bound>%d</delay_bound> |
| | 90 | <error_mask>%d</error_mask> |
| | 91 | <file_delete_state>%d</file_delete_state> |
| | 92 | <assimilate_state>%d</assimilate_state> |
| | 93 | <hr_class>%d</hr_class> |
| | 94 | <opaque>%f</opaque> |
| | 95 | <min_quorum>%d</min_quorum> |
| | 96 | <target_nresults>%d</target_nresults> |
| | 97 | <max_error_results>%d</max_error_results> |
| | 98 | <max_total_results>%d</max_total_results> |
| | 99 | <max_success_results>%d</max_success_results> |
| | 100 | <result_template_file>%s</result_template_file> |
| | 101 | <priority>%d</priority> |
| | 102 | <mod_time>%s</mod_time> |
| | 103 | </workunit_archive> |
| | 104 | }}} |