Changes between Initial Version and Version 1 of DiskManagement


Timestamp:
Apr 29, 2007, 8:48:59 PM (17 years ago)
Author:
KSMarksPsych
Comment:

Added page

  • DiskManagement

    v1 v1  
     1= Disk space management =
     2       
     3== THIS FILE IS DEPRECATED ==
     4
     5This document describes the core client's policies for managing disk space. The goals are the following (highest to lowest priority):
     6
     7   1. '''Enforce user disk usage preferences.'''
     8   2. '''Enforce resource shares.'''
     9   3. '''Provide disk space for completed results.'''
     10   4. '''Provide disk space for active results.'''
     11   5. '''Provide disk space for queued results.'''
     12   6. '''Provide disk space for project file storage ('sticky files').'''
     13
     14'''Total_Limit''': the disk usage limit as determined by prefs, disk size, and non-BOINC usage
     15
     16'''Core_Usage''': the space currently being used by the core client
     17
     18'''all_Projects_limit [aPl] =''' Total_Limit - Core_Usage; this is the space that projects can use
     19
     20'''Project_usage(p) =''' the size of all files associated with a project p; returns project->size in most cases
     21
     22'''all_Project_usage [aPu] =''' the sum of Project_usage(p) over all projects p
     23
     24'''Project_limit(p) =''' aPl * resource_share(p), where resource_share(p) is p's fraction of the total resource share
     25
     26The '''free space''' on a client is determined by
     27
     28    free_space = all_Projects_Limit - all_Project_usage
     29
     30A project is an '''offender''' if
     31
     32    project_usage(p) - project_limit(p) > 0
     33
     34The '''greatest offender''' is the project with
     35
     36    max {project_usage(p) - project_limit(p)} for all p.
     37
     38The client will always try to remove files from the greatest offender first before querying other projects.
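
The definitions above can be sketched in Python as follows. This is a minimal illustration, not the core client's code: the `Project` class, its field names, and the assumption that resource shares are fractions summing to 1 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    usage: float           # Project_usage(p), in bytes
    resource_share: float  # assumed: a fraction, summing to 1.0 over all projects


def all_projects_limit(total_limit, core_usage):
    """aPl: the space that projects may use."""
    return total_limit - core_usage


def project_limit(p, apl):
    """Project_limit(p) = aPl * resource_share(p)."""
    return apl * p.resource_share


def free_space(projects, apl):
    """free_space = all_Projects_limit - all_Project_usage."""
    return apl - sum(p.usage for p in projects)


def overage(p, apl):
    """Positive when p is an offender."""
    return p.usage - project_limit(p, apl)


def greatest_offender(projects, apl):
    """Return the offender with the largest overage, or None if there is none."""
    offenders = [p for p in projects if overage(p, apl) > 0]
    return max(offenders, key=lambda p: overage(p, apl), default=None)
```

Returning `None` when no project exceeds its limit keeps the "couldn't find one" case explicit for callers.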
     39
     40== A Project_limit(p) Example ==
     41
     42Consider a system participating in two projects, A and B, with resource shares of 75% and 25%, respectively. After computing the available aPl, A would receive 75% of the aPl as its project limit and B would receive 25% of the aPl as its project limit.
     43
     44Write P_u for Project_usage and P_l for Project_limit. If P_u(A) < P_l(A), Project A is not utilizing all of Project_limit(A). Therefore, B should be able to use the difference (P_l(A) - P_u(A)) if it needs to. This applies in any situation where a project is not utilizing all of its Project_limit(p). This unused space will show up as free_space in most cases.
     45
     46When A wants to add a new file, if adding the file would cause all_Project_usage >= aPL, a project must delete a file first. If A is not an offender, then files are deleted from an offender, in this case, Project B. Files will be deleted from B as described in ''Adding a File to a Project'' below.
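
The scenario can be worked through with concrete numbers. The figures below (an aPl of 1000 MB, current usages of 500 MB and 450 MB, and a 100 MB new file) are invented purely for illustration.

```python
apl = 1000                        # MB available to all projects (assumed value)
share = {"A": 0.75, "B": 0.25}    # resource shares from the example
usage = {"A": 500, "B": 450}      # assumed current Project_usage values, in MB
new_file = 100                    # size of the file A wants to add, in MB

# Project_limit(p) = aPl * resource_share(p)
limit = {p: apl * share[p] for p in share}

# A deletion is needed if adding the file would make all_Project_usage >= aPl.
must_delete = sum(usage.values()) + new_file >= apl

# Offenders are projects whose usage exceeds their limit.
offenders = [p for p in usage if usage[p] - limit[p] > 0]

print(limit, must_delete, offenders)
```

With these numbers A is under its 750 MB limit, B is 200 MB over its 250 MB limit, and the new file would push total usage to 1050 MB, so files must be deleted from B.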
     47
     48== What BOINC does in problem situations ==
     49
     50    * The current project has a disk share large enough for all the workunits needed to keep itself busy, but there is not enough disk space available for all the files because of other projects.
     51
     52          When the BOINC client requests work, it tells the scheduling server how much free space it has on the client, as well as how much space it could potentially free. It is then the project's decision how much work to award the client based on these values. When the files are downloaded, others are deleted as described below.
     53
     54    * A large file was moved somewhere into the BOINC directory structure and begins to contribute to the space BOINC takes up.
     55
     56          Because of the way BOINC calculates the program size, this kind of error is unavoidable. The client will notice that BOINC has violated its disk share and will take the necessary steps described above (see first priority in Maintaining Project Disk Share). The user will have to find the problematic file and remove it in order for BOINC to reclaim that space. Deleted files will not be restored.
     57
     58    * A computation creates large temp files and stores them in its slot directory. These temp files are so large that they fill up the project disk size and the client no longer has enough space to run any computations.
     59
     60          The client will try to create as much space for these temporary files as it can, to the point where all files except those used by a running computation are deleted. Because BOINC makes computation a priority over storage at this time, this is a very bad situation, as it will delete all files in all other projects to ensure the work continues. If the total space used by the BOINC client is still larger than the preferences allow, the client will suspend all activities and notify the user of the issue. The user can either drop the offending project or increase the disk space allowed for BOINC in their preferences.
     61
     62== Adding a file to a project ==
     63
     64The algorithm is run whenever:
     65
     66    * A file is added to a project via an RPC reply
     67
     68The client will first attempt to use all the PDS (project disk share) that is free. When all the space for BOINC projects is used by some combination of projects, files must be deleted to make space.
     69
     70The client maintains the project share sizes and workunit queues by:
     71
     72    * Deleting files only up to the point where a project is no longer an offender
     73    * Starting deletion with the lowest priority files
     74          * Being the greatest offender doesn't guarantee that a project will be the first to lose files; lower priority files from other offenders are deleted first.
     75    * Never deleting workunit files associated with the requesting project; the client gives up at this point
     76    * Never deleting results that have finished computation, so that they can be uploaded and recorded properly
     77
     78The client's method of creating free_space ensures the above:
     79
     80   1. Check whether there is enough free_space in the aPl,
     81          * If yes, return ''true'' and ''increase project->size by file->nbytes''
     82   2. Else check if there are any projects that are offending their project share,
     83          * If yes, delete files starting with the greatest offender and the lowest priority/expired files
     84          * Delete these low-priority/expired files until the project size is below the project's share size
     85          * If there is not enough space made, increase the priority to delete and repeat
     86   3. Else check if there can be room made in the project by deleting any non-workunit files,
     87          * If yes, delete them.
     88          * If there is now enough free space, return ''true'' and ''increase project->size by file->nbytes''
     89   4. Else, return false, there is no way to associate the file with the project and guarantee the statements above
     90
     91=== Pseudo Code ===
     92
     93{{{
     94PROJECT:
     95    double size
     96    double share_size
     97    double resource_share
     98FILE_INFO:
     99    double nbytes
     100
     101get_more_disk_space():
     102    for some PROJECT p and space_needed = number of bytes required
     103    init total_space = 0
     104
     105    total_space = free space in the project disk size
     106    if (total_space >= space_needed):
     107        return true
     108
     109    mark all projects as unchecked
     110
     111    while (total_space < space_needed):
     112        g_of = the greatest_offender
     113        if (couldn't find one or g_of == p):
     114            increase priority to delete from
     115            if can't increase anymore, return false
     116        mark g_of as checked
     117        delete low priority files from g_of until it isn't an offender, adding the freed bytes to total_space
     118    return true
     119
     120
     121associate_file():
     122    for some FILE_INFO fip and a PROJECT p
     123    init space_made = 0
     124    // check offenders
     125    if (get_more_disk_space(p, fip->nbytes)):
     126        p->size += fip->nbytes
     127        return true
     128    // check self
     129    is there any free space?
     130    try and delete expendable files from p, adding the freed bytes to space_made
     131    if (space_made > fip->nbytes):
     132        return true
     133    // if it hasn't returned true yet, failure
     134    return false
     135}}}
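
The offender loop in the pseudocode can be made runnable as a small Python sketch. Everything here is an assumption made for illustration: the `FileInfo` and `Project` classes, the `limit` field standing in for Project_limit(p), and the choice to visit every offender at one priority level before raising the level. Unlike the client, it does not distinguish expired files or workunit files.

```python
from dataclasses import dataclass, field


@dataclass
class FileInfo:
    nbytes: int
    priority: int = 1   # 1 = lowest priority, deleted first


@dataclass
class Project:
    name: str
    limit: int                      # Project_limit(p), in bytes
    files: list = field(default_factory=list)

    @property
    def size(self):
        return sum(f.nbytes for f in self.files)


def get_more_disk_space(requester, projects, apl, space_needed, max_priority=5):
    """Try to free space_needed bytes by trimming offenders other than requester."""
    total_space = apl - sum(p.size for p in projects)
    for level in range(1, max_priority + 1):
        if total_space >= space_needed:
            return True
        # Visit offenders from greatest to least, never the requesting project.
        offenders = sorted(
            (p for p in projects if p is not requester and p.size > p.limit),
            key=lambda p: p.size - p.limit, reverse=True)
        for p in offenders:
            for f in sorted(p.files, key=lambda f: f.priority):
                # Stop at files above the current level, once p is no longer
                # an offender, or once enough space has been freed.
                if (f.priority > level or p.size <= p.limit
                        or total_space >= space_needed):
                    break
                p.files.remove(f)
                total_space += f.nbytes
    return total_space >= space_needed
```

Note that this sketch does not roll back deletions when it ultimately returns `False`; whether the client should do so is not stated in this document.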
     136
     137== Violating User Disk Usage Preferences ==
     138
     139This check is done in `data_manager_poll()`, which is placed at the top of the client's FSM hierarchy. If there is no space violation, it takes no action and returns false. If BOINC's usage is larger than the Total_Limit, the client reduces each Project_usage using the following method:
     140
     141   1. Check all the offending projects and delete files from them until none of them is an offender or no more files can be deleted.
     142   2. Cycle through each project, deleting one file at a time from each project, starting with the lowest priority and expired files, until only referenced files are left for the project.
     143   3. Cycle through each project, deleting one result at a time from each project until the only results left are those waiting for their 'server ack' or part of a current computation. This removes references to files and hence marks them as available for deletion.
     144
     145If all three steps fail to free enough space, all computation is suspended, a message is sent to the user explaining the problem, and the function returns true. If at any time in the function the total used space falls below the allowed disk usage set by the user preferences, the function returns false.
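
The three steps and the return convention can be sketched as below. This is a simplified model, not the real `data_manager_poll()`: each project is a plain dict with hypothetical keys, and deleting a result is modelled as directly freeing its bytes rather than releasing file references.

```python
def enforce_total_limit(projects, total_limit, core_usage):
    """Return False once usage fits under total_limit, True if it cannot.

    Each project is a dict with these (hypothetical) keys:
      limit      - Project_limit(p) in bytes
      pinned     - bytes that may never be deleted here
      expendable - sizes of unreferenced files, lowest priority first
      results    - sizes of deletable results (not running, not awaiting ack)
    """
    def usage(p):
        return p["pinned"] + sum(p["expendable"]) + sum(p["results"])

    def over():
        return core_usage + sum(usage(p) for p in projects) > total_limit

    # Step 1: trim each offender back toward its own limit.
    for p in projects:
        while over() and usage(p) > p["limit"] and p["expendable"]:
            p["expendable"].pop(0)

    # Steps 2 and 3: round-robin, one file (then one result) per project per cycle.
    for kind in ("expendable", "results"):
        while over() and any(p[kind] for p in projects):
            for p in projects:
                if over() and p[kind]:
                    p[kind].pop(0)

    # True means the caller should suspend computation and notify the user.
    return over()
```

Checking `over()` before every deletion mirrors the rule that the function returns false as soon as usage falls within the user's limit.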
     146
     147== Work Fetch Policy ==
     148
     149In conjunction with the CPU Scheduler's work fetch policy, the data manager's work fetch policy's goals are to:
     150
     151    * Calculate total free space.
     152    * Calculate total free space if lowest priority files were deleted.
     153    * Calculate total free space if highest priority files were deleted.
     154
     155When an RPC request is made, the client communicates to the server the values described above and the server can make a decision on how much work to send to the client.
     156
     157=== Calculating free space ===
     158
     159'''Pseudocode'''
     160
     161{{{
     162anything_free():
     163    init total_size = 0
     164    foreach p in projects:
     165        total_size += p.size
     166    get project disk size
     167    return project disk size - total_size
     168
     169// get the total number of bytes that would be free
     170// if files with priority < pr were deleted from all other projects
     171// and low priority files were deleted from this project
     172//
     173
     174total_potentially_free():
     175    for some project my_p and some priority pr
     176    init tps = anything_free()
     177
     178    ref_count all files in file_infos
     179    foreach p in projects:
     180        if (p != my_p):
     181            tps += potentially free space from p with priority less than pr
     182
     183
     184    foreach fip in file_infos:
     185        if (fip.project == my_p, is permanent, and not part of a computation)
     186                and (fip has lowest priority or is expired):
     187            tps += fip->nbytes
     188    return tps
     189potentially_free():
     190    for a project p and some priority pr
     191    if it is not an offender:
     192        return 0
     193    init tps = 0
     194    foreach fip in file_infos:
     195        if (fip.project == p, is permanent, not part of a computation, and (fip.priority <= pr or is expired)):
     196            tps += fip->nbytes
     197    return tps
     198}}}
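
As a runnable counterpart to `potentially_free()`, here is a minimal Python sketch. The `FileInfo` fields (`permanent`, `in_use`) and the `is_offender` argument are hypothetical stand-ins for state the client would look up itself.

```python
import time
from dataclasses import dataclass


@dataclass
class FileInfo:
    project: str
    nbytes: int
    priority: int
    exp_date: float          # seconds since the epoch
    permanent: bool = True
    in_use: bool = False     # part of a running computation


def potentially_free(project, files, pr, is_offender, now=None):
    """Bytes that could be freed from `project` at priority level `pr`."""
    now = time.time() if now is None else now
    if not is_offender:
        return 0
    return sum(
        f.nbytes for f in files
        if f.project == project and f.permanent and not f.in_use
        and (f.priority <= pr or f.exp_date < now))
```

Passing `now` explicitly makes the expiry check deterministic, which is convenient when trying the function out.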
     199
     200== Project Deletion Policy ==
     201
     202There are three types of deletion policy that a project can specify in its config.xml:
     203
     204   1. '''Priority deletion'''. Files with the lowest priority level get deleted first in the order they were introduced to the client (downloaded).
     205   2. '''Expiration deletion'''. Whenever space runs out, all files that have passed their expiration date are deleted first. Any file whose expiration_date is earlier than the current time is deleted.
     206   3. '''Least Recently Used (LRU)'''. The DEFAULT method: the file that was least recently downloaded/uploaded is deleted first. The LRU policy is always used to determine the next file to delete.
     207
     208The policies are invoked by including the following in the config.xml file:
     209
     210    <deletion_policy_priority/>
     211    <deletion_policy_expire/>
     212    - the LRU policy is embedded in the core client as the default
     213
     214If any of these flags are present in the config.xml, a similar tag will be included in a successful RPC request.
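
One way to picture how the three policies could interact is as a sort key. The document does not specify how the priority and expiration policies combine, so the ordering below (expired files first, then lowest priority, then least recently used) is an assumption, as are the dict field names.

```python
def deletion_order(files, now, use_priority=False, use_expire=False):
    """Return files in the order they would be deleted under this sketch.

    Each file is a dict with 'priority', 'exp_date' and 'time_last_used' keys.
    LRU (oldest time_last_used first) always acts as the final tie-breaker.
    """
    def key(f):
        expired_rank = 0 if (use_expire and f["exp_date"] < now) else 1
        priority_rank = f["priority"] if use_priority else 0
        return (expired_rank, priority_rank, f["time_last_used"])

    return sorted(files, key=key)
```

With neither flag set, the key reduces to `time_last_used` alone, which matches LRU being the default.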
     215
     216=== Using a Project Deletion Policy ===
     217
     218A FILE_INFO, when created, has the following default values related to a project's deletion policy. These values are created for every file.
     219
     220    priority = P_LOW;
     221    time_last_used = time now
     222    exp_date = 60 days from now
     223
     224where P_LOW is defined in client_types.h as the following
     225
     226    #define P_LOW 1
     227    #define P_MEDIUM 3
     228    #define P_HIGH 5
     229
     230If using the defaults, files are not guaranteed to survive more than sixty days if <deletion_policy_expire> is set.
     231
     232If a priority or exp_date other than the default is required, the priority must be set when the workunit is created. By including the following tags in a workunit or result template, the default information is replaced.
     233
     234    <priority>(int; 1-5)</priority>
     235    <exp_days>(int; # of days to keep)</exp_days>
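
The defaults and their overrides can be illustrated with a small Python sketch. The function name and the dict layout are hypothetical; only the constants and the sixty-day default come from the text above.

```python
import time

P_LOW, P_MEDIUM, P_HIGH = 1, 3, 5
DEFAULT_EXP_DAYS = 60
SECONDS_PER_DAY = 86400


def new_file_info(now=None, priority=None, exp_days=None):
    """Build the deletion-policy fields of a new FILE_INFO.

    priority and exp_days model the optional <priority> and <exp_days> tags.
    """
    now = time.time() if now is None else now
    if priority is not None and not 1 <= priority <= 5:
        raise ValueError("priority must be between 1 and 5")
    days = DEFAULT_EXP_DAYS if exp_days is None else exp_days
    return {
        "priority": P_LOW if priority is None else priority,
        "time_last_used": now,
        "exp_date": now + days * SECONDS_PER_DAY,
    }
```

For example, `new_file_info(priority=P_HIGH, exp_days=7)` models a workunit template that sets both tags.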
     236
     237== Scheduling Server Changes ==
     238
     239The client communicates three values of disk usage to the server.
     240
     241   1. The amount of free_space
     242   2. The amount of free_space if the client were to delete files from offending projects
     243   3. The amount of free_space if the client were to delete files from offending projects & itself
     244
     245The server will assign workunits normally using the first amount. If no workunits were assigned, a second pass of the database is made using the second amount. If no workunits were assigned and the following is in the config.xml:
     246
     247    <delete_from_self/>
     248
     249the third amount of free_space is used and a third pass of the database is made. The server returns whatever workunits were deemed acceptable for the host.
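
The three-pass selection can be sketched as follows. This is an illustrative model only: the function name, the `(name, disk_bytes)` tuples, and the greedy first-fit choice within a pass are assumptions, not the scheduler's actual logic.

```python
def pick_work(workunits, free, free_after_offenders, free_after_self,
              delete_from_self=False):
    """Return the names of workunits chosen by the first pass that yields any.

    workunits is a list of (name, disk_bytes) pairs. The third budget is only
    tried when delete_from_self is set, mirroring the <delete_from_self/> flag.
    """
    budgets = [free, free_after_offenders]
    if delete_from_self:
        budgets.append(free_after_self)
    for budget in budgets:
        chosen, remaining = [], budget
        for name, need in workunits:
            if need <= remaining:
                chosen.append(name)
                remaining -= need
        if chosen:
            return chosen
    return []
```

A later pass runs only when the earlier ones assigned nothing, so a host with enough plain free space never triggers deletions on the client.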
     250
     251Under most circumstances, the amount of free_space will be enough to get workunits for a project. If a project has larger workunits (> 1 GB) or the host is storing many files for a project, amounts 2 and 3 become more important. The amount of free_space if files are deleted is found by:
     252
     253    * For offending projects, marking files that could be deleted up to where the project is no longer an offender
     254    * For the requesting project, marking all files that could be deleted
     255          * Files that are not expired are never included if deletion_policy_expire is true
     256
     257== TODO: Future Changes ==
     258
     259There is currently a method for requesting a list of files from the project. There needs to be a way to communicate the information back to the project, such as an XML document that can be parsed by the project.
     260
     261There also needs to be a database, separate from the scheduling database, which keeps track of the files on hosts' clients.