Changes between Version 11 and Version 12 of GpuWorkFetch


Timestamp:
Dec 26, 2008, 12:29:13 PM
Author:
davea
Comment:

--

  • GpuWorkFetch

    v11 v12  
    1010indicating the total duration of jobs being requested.
    1111
    12 This policy has various problems.  First:
     12This policy has some problems.  First:
    1313
    1414 * There's no way for the client to say "I have N idle CPUs; send me enough jobs to use them all".
    1515
    16 Problems related to GPUs:
     16And various problems related to GPUs:
    1717
    1818 * If there is no CPU shortfall, no work will be fetched even if GPUs are idle.
     
    5454== Client ==
    5555
    56 
    5756New abstraction: '''processing resource''' or PRSC.
    5857There are two processing resource types: CPU and CUDA.
    5958
    60 === Per-resource-type backoff
     59=== Per-resource-type backoff ===
    6160
    6261We need to handle the situation where there's a GPU shortfall
     
    6867it's cleared whenever we get a job of that type.
    6968
    70 ==- Work-fetch state ==
     69=== Work-fetch state ===
    7170
    7271Each PRSC has its own set of data related to work fetch.
    7372This is stored in an object of class PRSC_WORK_FETCH.
    7473
    75 Data members of PRSC_WORK_FETCH:
     74Data members of PRSC_WORK_FETCH (set by rr_simulation()):
    7675
    7776'''double shortfall''': shortfall for this resource
    78 '''double max_nidle''': number of idle instances
     77
     78'''double nidle''': number of currently idle instances
    7979
    8080Member functions of PRSC_WORK_FETCH:
    8181
    82 '''clear()''': called at the start of RR simulation
    83 
     82'''rr_init()''': called at the start of RR simulation.
     83Compute share of each project for this PRSC,
     84and clear shortfall.
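
The `rr_init()` step added in v12 can be illustrated with a minimal sketch. This is not the BOINC client code: the type names `ProjectPrscData` and `PrscWorkFetch`, the `ninstances` and `resource_share` fields, and the proportional split of instances by resource share are all assumptions made for illustration. The only parts taken from the text are that `rr_init()` computes each project's share for the PRSC and clears the shortfall.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for PRSC_PROJECT_DATA (one per project, per PRSC).
struct ProjectPrscData {
    double resource_share = 0.0;  // assumed input: the project's resource share (RS)
    double share = 0.0;           // # of instances this project should get based on RS
};

// Hypothetical stand-in for PRSC_WORK_FETCH (one per PRSC, e.g. CPU or CUDA).
struct PrscWorkFetch {
    int ninstances = 0;      // assumed: number of instances of this resource
    double shortfall = 0.0;  // shortfall for this resource
    double nidle = 0.0;      // number of currently idle instances

    // Called at the start of RR simulation: compute each project's share
    // for this PRSC and clear the shortfall.
    void rr_init(std::vector<ProjectPrscData>& projects) {
        assert(ninstances >= 0);
        shortfall = 0.0;

        double total_rs = 0.0;
        for (const auto& p : projects) total_rs += p.resource_share;

        // Assumption: instances are divided in proportion to resource share.
        for (auto& p : projects) {
            p.share = (total_rs > 0.0)
                ? ninstances * p.resource_share / total_rs
                : 0.0;
        }
    }
};
```

With two projects whose resource shares are 100 and 300 and a PRSC with 4 instances, this sketch assigns shares of 1 and 3 instances.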
     85
     86------------
    8487'''prepare()''': called before exists_fetchable_project().
    8588Checks whether there is a project to request work from for this resource, and caches it
     
    135138Each PRSC also needs to have some per-project data.
    136139This is stored in an object of class PRSC_PROJECT_DATA.
    137 Its members include (* means save in state file):
     140It has the following "persistent" members (i.e., saved in state file):
     141
     142'''double long_term_debt*'''
     143
     144'''backoff timer'''*:  how long to wait before asking the project for work specifically for this PRSC;
     145double it whenever we ask for work for this PRSC and get none
     146(maximum 24 hours).
     147Clear it when we ask for work for this PRSC and get a job.
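
The backoff rule described for v12 (double on an empty reply, cap at 24 hours, clear on success) can be sketched as follows. The names `PrscBackoff`, `request_failed()`, `got_job()` and `may_request()` are hypothetical, and the 60-second initial interval is an assumption; the text specifies only the doubling, the 24-hour maximum, and the clearing condition.

```cpp
#include <algorithm>
#include <cassert>

// Assumed starting interval; the source does not state one.
constexpr double kInitialBackoffSeconds = 60.0;
// Maximum backoff of 24 hours, as stated in the text.
constexpr double kMaxBackoffSeconds = 24.0 * 3600.0;

// Hypothetical per-project, per-PRSC backoff state.
struct PrscBackoff {
    double interval = 0.0;  // current backoff length in seconds; 0 means no backoff
    double until = 0.0;     // time before which we do not ask for this PRSC

    // We asked for work for this PRSC and got none: double the interval.
    void request_failed(double now) {
        assert(now >= 0.0);
        interval = (interval > 0.0)
            ? std::min(interval * 2.0, kMaxBackoffSeconds)
            : kInitialBackoffSeconds;
        until = now + interval;
    }

    // We asked for work for this PRSC and got a job: clear the backoff.
    void got_job() {
        interval = 0.0;
        until = 0.0;
    }

    // True if the backoff has expired and we may ask again.
    bool may_request(double now) const { return now >= until; }
};
```

Keeping the interval and the expiry time as separate fields lets the doubling depend only on the previous interval, regardless of when the next request is actually made.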
     148
     149And the following transient members (used by rr_simulation()):
    138150
    139151'''double shortfall'''
    140152
    141 '''int last_job'''*: last time we had a job from this proj using this rsc
    142 if the time is within last N days (30?)
    143 we assume that the project may possibly have jobs of that type
    144 
    145153'''bool runnable'''
    146154
    147155'''max deficit'''
    148156
    149 '''backoff timer'''*:  how long to wait until ask project for work only for this rsc
    150 double this any time we ask only for work for this rsc and get none
    151 (maximum 24 hours).
    152 Clear it when we have a job that uses the PRSC.
    153 
    154157'''double share''': # of instances this project should get based on RS
    155158
    156 '''double long_term_debt*'''
    157 
     159'''instances_used''': # of instances currently being used
    158160
    159161=== debt accounting ===
     
    166168
    167169{{{
     170cpu_work_fetch.rr_init()
     171cuda_work_fetch.rr_init()
    168172do simulation as current
    169173on completion of an interval dt