= Hierarchical upload/download directories =
The data server for a large project may store hundreds of thousands or millions of files at any given time. If these files are stored in 'flat' directories (project/download and project/upload), the data server may spend a lot of CPU time searching directories. If you see a high CPU load average, with a lot of time in kernel mode, this is probably what's happening. The solution is to use '''hierarchical upload/download directories'''. To do this, include the line
{{{
<uldl_dir_fanout>1024</uldl_dir_fanout>
}}}
in your [ProjectOptions config.xml file] (this is the default for new projects). This causes BOINC to use hierarchical upload/download directories. The upload and download directories will each have a set of 1024 subdirectories, named 0 to 3ff (hexadecimal). Files are hashed, based on their filename, into these subdirectories.
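The mapping from filename to subdirectory can be illustrated with a minimal sketch. The names `sketch_hash` and `sketch_hier_path` are hypothetical, and the FNV-1a hash is only a stand-in: it is not the hash BOINC uses, so the subdirectory it picks will not match a real project. The point is only that the same filename always lands in the same one of `fanout` hex-named subdirectories.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in hash (32-bit FNV-1a). NOT BOINC's hash function; it only
// illustrates mapping a filename to a stable bucket number.
static std::uint32_t sketch_hash(const std::string& filename) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : filename) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Build "<root>/<bucket in lowercase hex>/<filename>" for a given fanout.
std::string sketch_hier_path(const std::string& filename,
                             const std::string& root, int fanout) {
    assert(fanout > 0);
    std::uint32_t bucket =
        sketch_hash(filename) % static_cast<std::uint32_t>(fanout);
    char dir[16];
    std::snprintf(dir, sizeof dir, "%x", static_cast<unsigned>(bucket));
    return root + "/" + dir + "/" + filename;
}
```

With a fanout of 1024 the bucket is in the range 0 to 1023, so the subdirectory name falls between `0` and `3ff`.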
The hierarchy is used for input and output files only. Executables and other application version files are in the top level of the download directory.
This affects your project-specific code in two places. First, your work generator must put input files in the right directory before calling [WorkGeneration create_work()]. To do this, it can use the function
{{{
int dir_hier_path(
const char* filename, const char* root, int fanout, char* result,
bool make_directory_if_needed=false
);
}}}
This takes the name of the input file, the absolute path of the root of the download hierarchy (typically the `download_dir` element from [ProjectOptions config.xml]) and the fanout, and stores the absolute path of the file in the hierarchy in `result`. Generally `make_directory_if_needed` should be set to true: this creates a fanout directory if needed to accommodate a particular file. Second, your validator and assimilator should call
{{{
int get_output_file_path(RESULT const& result, string& path);
// or
int get_output_file_paths(RESULT const& result, vector<string>&);
}}}
to get the paths of output files in the hierarchy. A couple of utility programs are available (run these in the project root directory):
{{{
dir_hier_move src_dir dst_dir fanout
dir_hier_path filename
}}}
`dir_hier_move` moves all files from `src_dir` (flat) into `dst_dir` (hierarchical with the given fanout). `dir_hier_path`, given a filename, prints the full pathname of that file in the hierarchy.
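To make the flat-to-hierarchical move concrete, here is a rough sketch of the idea using C++17 `std::filesystem`. It is not the source of `dir_hier_move`: the names `sketch_bucket` and `sketch_hier_move` are hypothetical, and the FNV-1a hash is a stand-in for BOINC's own filename hash, so use the real utility on a live project.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <filesystem>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Stand-in bucket function (32-bit FNV-1a); NOT BOINC's actual hash.
static unsigned sketch_bucket(const std::string& name, unsigned fanout) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : name) {
        h ^= c;
        h *= 16777619u;
    }
    return h % fanout;
}

// Move every regular file in the flat directory `src` into
// `dst/<hex bucket>/`. Returns the number of files moved.
// fs::rename throws if src and dst are on different filesystems.
int sketch_hier_move(const fs::path& src, const fs::path& dst,
                     unsigned fanout) {
    assert(fanout > 0);
    // Collect the files first so the directory is not modified
    // while it is being iterated.
    std::vector<fs::path> files;
    for (const auto& entry : fs::directory_iterator(src)) {
        if (entry.is_regular_file()) files.push_back(entry.path());
    }
    int moved = 0;
    for (const auto& file : files) {
        const std::string name = file.filename().string();
        char hex[16];
        std::snprintf(hex, sizeof hex, "%x", sketch_bucket(name, fanout));
        const fs::path dir = dst / hex;
        fs::create_directories(dir);  // create the fanout directory if needed
        fs::rename(file, dir / name);
        ++moved;
    }
    return moved;
}
```

Subdirectories of `src` are skipped, so the sketch also works when `dst` is located inside `src` (for example download/fanout).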
== Transitioning from flat to hierarchical directories ==
If you are operating a project with flat directories, you can transition to a hierarchy as follows:
 * Stop the project and add `<uldl_dir_fanout>` to [ProjectOptions config.xml]. You may want to locate the hierarchy root at a new place (e.g. download/fanout); in this case update the `<download_dir>` element of config.xml, and add the element
{{{
<download_dir_alt>old download dir</download_dir_alt>
}}}
This causes the file deleter to check both old and new locations.
* Use `dir_hier_move` to move existing upload files to a hierarchy.
* Start the project, and monitor everything closely for a while.