8th BOINC Workshop: Summary and work items
Replacing the heartbeat mechanism
Problem: while an app is doing I/O-intensive work, other apps get no-heartbeat exits.
- I changed the client and API so that the client passes its PID to the app, and the app periodically checks whether the client is alive instead of relying on heartbeat messages. This mechanism will be used only with new (7.0.37+) clients and new app versions. Other combinations will continue to use heartbeats.
- We discussed having the client send heartbeat messages in a separate thread. I propose not doing this because the problem should be solved by the above.
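The PID-based check described above can be sketched as follows. This is a minimal illustration, not the BOINC API code: it assumes a POSIX system, and the names `client_is_alive` and `should_exit` are hypothetical. It relies on the standard convention that sending signal 0 performs only an existence and permission check.

```python
import os


def client_is_alive(client_pid: int) -> bool:
    """Return True if a process with this PID appears to exist (POSIX only)."""
    if client_pid <= 0:
        # 0 and negative values address process groups, not a single process.
        return False
    try:
        # Signal 0 delivers nothing; it only checks existence and permissions.
        os.kill(client_pid, 0)
    except ProcessLookupError:
        return False  # no such process: the client has exited
    except PermissionError:
        return True   # the process exists but is owned by another user
    return True


def should_exit(client_pid: int) -> bool:
    """Hypothetical hook an app could call periodically in place of a heartbeat check."""
    return not client_is_alive(client_pid)
```

Because the app polls the client's existence directly, a client that is merely slow to send messages during heavy I/O would not trigger an exit. On Windows a different check would be needed, since `os.kill` behaves differently there.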
Handling long non-checkpoint jobs
Problem: need a mechanism for sending long jobs that don't checkpoint only to hosts that are likely to finish them.
- Have the client send its current uptime and the duration of its previous session in the scheduler request message.
- On the server, allow flagging app versions as non-checkpointing.
- Scheduler: if an app version is non-checkpointing, send a job to a host only if the job's expected runtime is less than the host's current uptime or its previous session's uptime.
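The scheduler rule above can be sketched as a small predicate. This is an illustration under stated assumptions, not the scheduler's actual code: the names `HostUptime` and `may_send_job` are hypothetical, times are in seconds, and "less than the uptime or previous uptime" is read as "less than at least one of the two".

```python
from dataclasses import dataclass


@dataclass
class HostUptime:
    current_uptime: float   # seconds the client has been up in this session
    previous_uptime: float  # duration of the previous session, in seconds


def may_send_job(expected_runtime: float, non_checkpointing: bool, host: HostUptime) -> bool:
    """Decide whether a job may be sent to a host under the proposed rule."""
    if not non_checkpointing:
        # Checkpointing app versions are not restricted by this rule.
        return True
    # Assumed reading: the job qualifies if it fits within either uptime figure.
    return expected_runtime < max(host.current_uptime, host.previous_uptime)
```

For example, a one-hour non-checkpointing job would be sent to a host that has been up for two hours, or whose previous session lasted two hours, but not to a host whose current and previous sessions both lasted under an hour.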
Server software testing and release management
Goals include:
- Increase the quality and frequency of server software releases.
- Increase the stability of the server software in trunk.
We discussed the following:
- Automated system-level testing of server software. We used to have frameworks for this (boinc/test/) but they're not maintained. We lack the manpower to do this; volunteers are needed.
- How to test server software? When to do releases? Automated testing would help, but a large number of features can feasibly be tested only in live use. I think we need projects to help as follows:
  - Operate test projects for testing new server software.
  - Use these projects to beta-test server software.
  - When we have a release candidate, create a new branch, test it using these projects, and release it once all bugs are fixed.
- Unit testing of server software. I'm not sure this has a good cost/benefit ratio; few if any bugs would be detected. But if a volunteer wants to write unit tests, I'd be happy to add them to the tree.
- Automated nightly builds. Rom will look into this. How should this be done for Windows and Mac?
- Automated system testing of web software. We lack the manpower to do this; volunteer help is needed.
Remote job submission
Some changes were proposed but I forget what they were. Wenjing?
Server scheduling (user quotas, accelerated batch completion)
Several people expressed interest in these features. We will work on them, hopefully in the 2-3 month timeframe. Design docs are here:
http://boinc.berkeley.edu/trac/wiki/JobPrioritization
http://boinc.berkeley.edu/trac/wiki/PortalFeatures
Comments (on boinc_dev) are welcome.
Python framework for validation and assimilation
David Coss worked on documentation for this. David, please add to the Wiki or send to me.
Support for job DAGs
David Coss presented this. I think it would be a useful feature, although no project other than David's had an immediate need for it. We should document it and add it to the source tree.
Drupal/BOINC integration
Oliver demonstrated this. My impression is that it's about 90% complete. When done we can add it to BOINC.
Locality scheduling
This is on hold until someone (e.g. Einstein@home) needs it.
http://boinc.berkeley.edu/trac/wiki/LocalityNew
BOINC on Android
Current work items:
- Make sure that everything needed to build BOINC/Android, and test apps, is in the BOINC tree and documented (Rom).
- Finish the GUI. Main items:
  - Add an interface for adding/removing projects and account managers.
  - Show graphics of some sort (BOINC and/or project-specific).
- Get some projects to add Android/ARM app versions.