Changes between Version 4 and Version 5 of AppCoprocessor
Timestamp: Mar 12, 2008, 9:45:36 AM
AppCoprocessor
--- AppCoprocessor (v4)
+++ AppCoprocessor (v5)
@@ -1,21 +1,11 @@
-= Managing allocated resources =
+= Using coprocessors =
 
-BOINC has a hardwired model of various resources:
-
-* CPUs
-* physical memory and swap space
-* disk space
-
-Some hosts have other processing resources:
+This document describes BOINC's support for applications that use coprocessors such as
 
 * GPU(s)
 * SPEs in a Cell processor
 
-How can we extend BOINC to take these resources into account,
-i.e. to send the best available app version to hosts,
-and to execute the best combination of apps on a given host?
-
 We'll assume that these resources are "allocated" rather than "scheduled":
-an application using a resource has it locked
+an application using a coprocessor has it locked
 while the app is in memory,
 even if the app is suspended by BOINC or descheduled by the OS.
@@ -23,30 +13,29 @@
 == Proposed design ==
 
-1. We define an XML notation for resources. This might look like
+1. The BOINC client will probe for coprocessors, and report them in scheduler requests. The XML looks like:
 {{{
-<resource>
-<type>Cell SPE</type>
-<number>6</number>
-</resource>
-<resource>
-<type>NVIDIA 8800 GPU with 1.7 driver</type>
-<number>1</number>
-</resource>
+<coprocs>
+<coproc_cuda>
+<count>1</count>
+<name>GeForce 8800 GT (1)</name>
+<totalGlobalMem>...</totalGlobalMem>
+...
+</coproc_cuda>
+</coprocs>
 }}}
-1. The BOINC client will discover resources, and will pass the resource description in scheduler request messages.
-1. An app_version record (in the server DB) will have a new field '''resource_requirements''' of the form.
+1. An app_version record (in the server DB) will have a new field '''coproc_req''', which is a character string encoding its coprocessor requirements.
+1. The scheduler is linked with a project-specific function
 {{{
-<resource>
-<type>REGEXP</type>
-<number>n</number>
-</resource>
-...
+bool coprocessor_compatible(COPROCS&, char coproc_req, double& flops);
 }}}
-A host is "compatible" with an app_version if, for each required resource, the host has at least n instances of a resource whose name matches REGEXP.
-1. In addition, app_version will have an '''acceleration''' field, representing the (approximate) speedup relative to CPU-only execution.
-1. The scheduler will be modified so that, when sending a job to a host, it finds the compatible app_version for which '''acceleration''' is greatest.
-1. The scheduler reply will include app_version.resource_requirements and app_version.acceleration.
-1. The client will be modified so that it keeps track of resource allocation, i.e. how many instances of each resource are free. It only runs an app if enough instances are available, and it decrements the counts accordingly.
-1. The client will be modified to use app_version.acceleration in estimating job completion times.
+This function:
+* returns true if the coprocessor resources (COPROC&) are sufficient for the app version
+* fills in the num_used fields of the elements of COPROC, indicating how many instances of each coprocessor will be used
+* returns (in flops) the estimated FLOPS (used to estimate job completion time)
+
+1. The scheduler will be modified so that, when sending a job to a host, it finds the compatible app_version for which flops is greatest.
+1. The scheduler reply will include, for each app version, the list of coprocessors that it will use, and the estimated FLOPS.
+1. The client will be modified so that it keeps track of coprocessor allocation, i.e. how many instances of each are free. It only runs an app if enough instances are available, and it decrements the counts accordingly.
+1. The client will be modified to use app_version.flops in estimating job completion times.
 
 == Questions ==
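Note on the v5 design in the diff above: the project-specific compatibility check and the scheduler's "greatest flops" selection could be sketched roughly as follows. This is a minimal illustration, not BOINC code. The `Coproc`, `Coprocs`, and `AppVersion` structs, the `name-substring:count` encoding of coproc_req, the fixed per-instance FLOPS figure, and the `pick_best_version` helper are all assumptions made for this example. The sketch also takes coproc_req as a `std::string` rather than the `char` shown in the v5 signature.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the COPROC / COPROCS types named in the v5 text.
struct Coproc {
    std::string name;  // e.g. "GeForce 8800 GT (1)"
    int count;         // instances reported by the client
    int num_used;      // filled in by the compatibility check
};

struct Coprocs {
    std::vector<Coproc> coprocs;
};

// Hypothetical app_version record carrying only the fields this sketch needs.
struct AppVersion {
    int id;
    std::string coproc_req;  // assumed encoding: "<name-substring>:<n>"
};

// Sketch of a project-specific check: returns true if the host has at least n
// instances of a coprocessor whose name contains the requested substring,
// records num_used on it, and reports an estimated FLOPS value.
bool coprocessor_compatible_sketch(Coprocs& host, const std::string& coproc_req,
                                   double& flops) {
    const std::string::size_type sep = coproc_req.rfind(':');
    if (sep == std::string::npos) return false;
    const std::string wanted = coproc_req.substr(0, sep);
    if (wanted.empty()) return false;
    int needed = 0;
    try {
        needed = std::stoi(coproc_req.substr(sep + 1));
    } catch (...) {
        return false;  // count missing or not a number
    }
    if (needed <= 0) return false;
    for (Coproc& c : host.coprocs) {
        if (c.name.find(wanted) != std::string::npos && c.count >= needed) {
            c.num_used = needed;
            assert(c.num_used <= c.count);  // never allocate more than reported
            flops = 1e10 * needed;          // placeholder estimate, not measured
            return true;
        }
    }
    return false;
}

// Returns the index of the compatible version with the greatest estimated
// FLOPS, or -1 if none is compatible. Each check runs on a scratch copy so
// num_used from a rejected version does not leak into the host description.
int pick_best_version(const Coprocs& host, const std::vector<AppVersion>& versions,
                      double& best_flops) {
    int best = -1;
    best_flops = 0.0;
    for (std::vector<AppVersion>::size_type i = 0; i < versions.size(); ++i) {
        Coprocs scratch = host;
        double flops = 0.0;
        if (coprocessor_compatible_sketch(scratch, versions[i].coproc_req, flops) &&
            flops > best_flops) {
            best = static_cast<int>(i);
            best_flops = flops;
        }
    }
    return best;
}
```

For a host reporting two instances of "GeForce 8800 GT (1)", a version requiring `GeForce:2` would be preferred over one requiring `GeForce:1` under this placeholder estimate, and a version requiring `Cell SPE:6` would be rejected. A real project would substitute its own requirement encoding and performance model.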