| | 59 | |
| | 60 | New approach: |
| | 61 | Do it one resource at a time (GPUs first). |
| | 62 | For each resource: |
| | 63 | |
| | 64 | * For each app, find the best app version and the best reliable app version |
| | 65 | * For each of these app versions, find the expected speed |
| | 66 | (taking on-fraction etc. into account). |
| | 67 | Based on this, and the statistics of the host population, |
| | 68 | decide what size job to send for this resource. |
| | 69 | * Scan the job array, starting at a random point. |
| | 70 | Make a list of jobs for which an app version is available, |
| | 71 | and that are of the right size. |
| | 72 | * Sort this list by a "score" that combines the above criteria |
| | 73 | (reliable, beta, previously infeasible, locality scheduling lite). |
| | 74 | * Scan the list; for each job |
| | 75 | * Make sure it's still in the array |
| | 76 | * Do quick checks |
| | 77 | * Lock entry and do slow checks |
| | 78 | * Send job |
| | 79 | * Leave loop if resource request is satisfied or we're out of disk space |
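The per-resource loop above can be sketched roughly as follows. This is a minimal illustration, not the scheduler's actual code: the `Job` fields, the score weights, and the `quick_check` / `slow_check` callbacks are all hypothetical names introduced here, and entry locking is reduced to a comment because the sketch runs in a single process.

```python
import random
from dataclasses import dataclass


@dataclass
class Job:
    # All fields are illustrative assumptions, not the real job-array layout.
    job_id: int
    app: str
    size_class: int
    needs_reliable: bool = False
    beta: bool = False
    previously_infeasible: bool = False
    has_local_files: bool = False  # stand-in for "locality scheduling lite"
    disk_bytes: float = 0.0
    est_seconds: float = 0.0       # estimated runtime on this resource


def score(job):
    """Combine the criteria into one number (weights are made up)."""
    return (4 * job.needs_reliable + 2 * job.beta
            + 1 * job.previously_infeasible + 1 * job.has_local_files)


def candidates(job_array, usable_apps, size_class, start=None):
    """Scan the array from a random point; keep usable jobs of the right size."""
    n = len(job_array)
    if n == 0:
        return []
    if start is None:
        start = random.randrange(n)
    found = []
    for k in range(n):
        i = (start + k) % n
        job = job_array[i]
        if job is not None and job.app in usable_apps \
                and job.size_class == size_class:
            found.append((i, job))
    return found


def send_jobs(job_array, usable_apps, size_class, req_seconds, disk_free,
              quick_check=lambda job: True, slow_check=lambda job: True,
              start=None):
    """Send jobs for one resource until the request or disk space runs out."""
    cands = candidates(job_array, usable_apps, size_class, start)
    # Highest score first; the sort is stable, so ties keep scan order.
    cands.sort(key=lambda pair: score(pair[1]), reverse=True)
    sent = []
    for i, job in cands:
        if job_array[i] is not job:      # no longer in the array
            continue
        if job.disk_bytes > disk_free or not quick_check(job):
            continue
        # A real scheduler would lock the entry here before the slow checks.
        if not slow_check(job):
            continue
        job_array[i] = None              # "send": remove from the array
        sent.append(job)
        req_seconds -= job.est_seconds
        disk_free -= job.disk_bytes
        if req_seconds <= 0 or disk_free <= 0:
            break
    return sent
```

A caller would run `send_jobs` once per resource, GPUs first, passing the set of apps for which a usable app version was found and the size class chosen for that resource. Sorting once and then re-checking each entry before sending keeps the expensive checks off jobs that another request has already taken.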