| 59 | |
| 60 | New approach: |
| 61 | Do it one resource at a time (GPUs first). |
| 62 | For each resource: |
| 63 | |
| 64 | * For each app, find the best app version and the best reliable app version |
| 65 | * For each of these app versions, find the expected speed |
| 66 | (taking on-fraction etc. into account). |
| 67 | Based on this, and the statistics of the host population, |
| 68 | decide what size job to send for this resource. |
| 69 | * Scan the job array, starting at a random point. |
| 70 | Make a list of jobs for which an app version is available, |
| 71 | and that are of the right size. |
| 72 | * Sort this list by a "score" that combines the above criteria |
| 73 | (reliable, beta, previously infeasible, locality scheduling lite). |
| 74 | * Scan the list; for each job: |
| 75 | * Make sure it's still in the array |
| 76 | * Do quick checks |
| 77 | * Lock entry and do slow checks |
| 78 | * Send job |
| 79 | * Leave loop if resource request is satisfied or we're out of disk space |
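
The scan, sort, and send steps above can be sketched roughly as follows. This is an illustrative Python sketch, not the scheduler's actual code: the `Job` fields, the `score` weights, and the `send_jobs` signature are all hypothetical, and the locking and slow checks are reduced to a comment.

```python
import random
from dataclasses import dataclass


@dataclass
class Job:
    # All fields are hypothetical stand-ins for the real job-array entry.
    job_id: int
    app: str
    size_class: int
    est_seconds: float
    disk_bytes: int
    is_beta: bool = False
    previously_infeasible: bool = False
    needs_reliable: bool = False


def score(job, reliable_host):
    """Combine the criteria into one number (weights are arbitrary here)."""
    s = 0.0
    if job.needs_reliable and reliable_host:
        s += 1.0
    if job.is_beta:
        s += 1.0
    if job.previously_infeasible:
        s += 0.5
    return s


def send_jobs(job_array, usable_apps, target_size, seconds_requested,
              disk_available, reliable_host=False, rng=random):
    """Pick jobs for one resource; returns the list of jobs sent."""
    n = len(job_array)
    if n == 0:
        return []
    # Scan the job array, starting at a random point.
    start = rng.randrange(n)
    candidates = []
    for k in range(n):
        i = (start + k) % n
        job = job_array[i]
        if job is None:
            continue
        # Keep jobs with a usable app version and the right size.
        if job.app in usable_apps and job.size_class == target_size:
            candidates.append((i, job))
    # Sort by score, best first.
    candidates.sort(key=lambda c: score(c[1], reliable_host), reverse=True)
    sent = []
    for i, job in candidates:
        # Leave the loop once the request is satisfied or disk is exhausted.
        if seconds_requested <= 0 or disk_available <= 0:
            break
        if job_array[i] is not job:
            continue  # no longer in the array
        if job.disk_bytes > disk_available:
            continue  # quick check failed
        # A real scheduler would lock the entry and run slow checks here.
        job_array[i] = None  # "send" the job by removing it from the array
        sent.append(job)
        seconds_requested -= job.est_seconds
        disk_available -= job.disk_bytes
    return sent
```

Sorting the candidate list once, and only re-validating each entry just before sending, keeps the expensive checks limited to jobs that are actually likely to be sent.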