Testing BOINC software
Our options for development workflow and release management depend on our ability to test. This is a summary of what needs to be tested, and the existing and prospective ways of doing it.
Terminology: a branch is "stable" if it has been thoroughly tested, i.e. there is evidence that most features work on most supported platforms. "Unstable" means there is no such evidence. "Most" is something like 90%; 100% is impossible.
The only changes that can be made to a stable branch are tested bug fixes. New features aren't added to stable branches.
Client software
The client software is feature-rich, and runs on many platforms and configurations. In the early days of BOINC we realized that developers could not adequately test the client software. We also realized that active reporting is needed: an absence of bug reports doesn't mean an absence of bugs. So we created a testing and release system with several related parts:
- An Alpha test project, in which a large pool of volunteers with many different types of computers, GPUs, configurations, etc., test new versions of the client software.
- A comprehensive list of test cases.
- A web-based interface by which testers report results.
- A web-based interface for viewing test results.
- A policy for approving a version: It must get at least 5 "no problems found" reports for each sub-platform (e.g. Win7, Win10, Mac 10.11, Mac 10.12, Android).
- A release management policy that works as follows:
- Master is unstable.
- When significant new features have been added to master, create a new "client release branch".
- Have the Alpha test project test this branch.
- Fix bugs and repeat.
- When done, the branch is stable; make it the new public release.
- Thereafter, only bug fixes are allowed in the branch. Retest with Alpha test, and release new minor versions as needed.
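The approval policy above is a simple counting rule. As a sketch (hypothetical code, not the actual Alpha-test project implementation), it amounts to:

```python
# Sketch of the approval rule: a client version is approved only when
# every sub-platform has at least 5 "no problems found" reports.
# Hypothetical; the real Alpha-test project code is not shown here.

MIN_CLEAN_REPORTS = 5

def version_approved(reports, sub_platforms, min_clean=MIN_CLEAN_REPORTS):
    """reports: iterable of (sub_platform, problems_found) pairs."""
    clean = {p: 0 for p in sub_platforms}
    for platform, problems_found in reports:
        if platform in clean and not problems_found:
            clean[platform] += 1
    return all(n >= min_clean for n in clean.values())

platforms = ["Win7", "Win10", "Mac 10.11", "Mac 10.12", "Android"]
full = [(p, False) for p in platforms for _ in range(5)]
print(version_approved(full, platforms))        # True
print(version_approved(full[:10], platforms))   # False: only Win7 and Win10 covered
```

Note that reports with problems don't count against a version; they simply trigger bug fixes and a retest of the branch.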
Fixing bugs that occur on volunteer hosts can be hard. To facilitate it, I built a Client emulator that lets me recreate volunteer scenarios in a place where I can use a debugger.
This has worked extremely well. Over the years we have simultaneously
- developed the client at a very fast pace
- done a sequence of multi-platform client releases that have been free of major defects.
We did this with minimal staff overhead (a small fraction of me and Rom). Leveraging volunteer testers was critical.
Server software
This includes the scheduler, transitioner and other back-end programs, make_project, create_work, start script, remote file management, and remote job submission.
Testing the server software is hard because:
- Basic functionality is complex, e.g.:
- New results are created when others time out
- A workunit errors out when too many results have been created
- For a given work request, the appropriate number of jobs is sent and the appropriate app versions are used
- File deletion and DB purging eventually clean up everything.
It's not realistic to expect a developer to test all of these.
- Realistic testing cannot be done with a single client. It must involve a diverse population of clients, some of them unreliable or malicious (or a simulation of such a population).
- The server software has hundreds of options and features (to name a few: job assignment, job size matching, plan classes, trickle messages, jobs with big data files, non-CPU-intensive apps, and so on). Many of these exist to support a particular project, and require that project's context (project-specific applications, validators, assimilators, etc.) in order to test.
- Server functionality can be centralized (a single server) or distributed across multiple hosts. Testing must cover both cases.
- The server software must work with various versions of dependent software (MySQL, PHP, Linux, VirtualBox).
- Unintended consequences: a change in one place can introduce bugs in (unpredictable) other places. When you change anything, you have to test everything.
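To make the first point concrete, the timeout-retry and error-out policies can be sketched as follows (an illustrative toy model, not the BOINC transitioner; `MAX_TOTAL_RESULTS` is an assumed limit analogous to a workunit's maximum result count):

```python
# Toy model of two of the policies listed above: when a result times
# out, a replacement result is created; when a workunit has used up
# its result quota, it errors out. Not actual BOINC transitioner code.

MAX_TOTAL_RESULTS = 8  # assumed per-workunit limit

def simulate_workunit(outcomes, max_total=MAX_TOTAL_RESULTS):
    """outcomes: the fate of each result in turn, 'ok' or 'timeout'.
    Returns ('success' | 'error' | 'in_progress', results created)."""
    created = 1  # the initial result
    for outcome in outcomes:
        if outcome == "ok":
            return ("success", created)
        # timed out: issue a replacement, unless the quota is exhausted
        if created >= max_total:
            return ("error", created)
        created += 1
    return ("in_progress", created)

print(simulate_workunit(["timeout", "timeout", "ok"]))  # ('success', 3)
print(simulate_workunit(["timeout"] * 10))              # ('error', 8)
```

Even this stripped-down version has corner cases (quota exactly reached, no outcomes yet); the real logic also interacts with validation, deadlines, and app version selection, which is why a single developer can't cover it all.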
The bottom line:
Developers, by themselves, are unable to test the server software.
In fact, we currently have no way of testing the server software.
Thus the server software in the master branch is inherently unstable.
Current
I'm the main developer of server software. Before I commit server software to master I deploy it on a minimal test project and make sure whatever I changed works, and that basic functionality (creating and sending jobs) works.
This tests only a tiny fraction of the server functionality. A change could pass this test but cause serious problems that appear only after hours or days.
Eventually I test master in SETI@home beta, and then in SETI@home. Sometimes there is a delay of a month or two in doing this.
There has never been a stable release of the server software. In practice master is bug-free most of the time, but there is no guarantee of this.
Proposal
Most large projects maintain separate test projects (e.g. SETI@home beta). I propose that these be used to test server software, the same way we use volunteer Alpha testers to test the client software. To test a branch, projects will deploy it in their beta test projects, and test the features they use, with their apps, and will report results using a web-based system (we could use the Alpha-test system for this).
We should maintain server release branches, similar to client release branches. When significant new features have been added to master, create a new candidate release branch. Deploy to beta projects. Fix bugs as needed; repeat. When done, that branch is stable; make it the public release branch.
Server release branches should, in fact, be a stable version of everything that a project needs: server, web, API, wrappers. NOTE: these branches are not stable with respect to the client software.
This will also encourage existing projects to merge their project-specific changes into BOINC (as, e.g., SETI@home does), so that they can use current server software.
Fixing bugs that occur in a particular project's beta test can be difficult. Sometimes I've added logging output. As a last resort projects have let me log into their servers. Maybe we can do something analogous to the Client Emulator for server code.
Automated testing?
We could create a framework for automated testing of server software. I started doing this a long time ago, and didn't make much progress. More recently, I (with Trilce Estrada) added a mechanism for simulating a project against a dynamic population of clients. This could be used as a basis for automated testing. But it would be a lot of work - like 6 FTE months.
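The idea of simulating a project against a dynamic client population can be sketched as follows (a toy model with assumed parameters, not the actual simulator built with Trilce Estrada): jobs are drawn from a queue by clients that are only partly reliable, and failed jobs are reissued.

```python
# Toy simulation of a server-side job queue driven by a population of
# partly unreliable clients. Hypothetical sketch; not the real
# BOINC simulation mechanism.
import random

def simulate(n_jobs, n_clients, reliability=0.8, seed=1):
    """Each tick, every client fetches one job and either returns a
    valid result (with probability `reliability`) or fails, which
    puts the job back in the queue. Returns ticks until all jobs
    have a valid result."""
    rng = random.Random(seed)
    queue = list(range(n_jobs))
    done = set()
    ticks = 0
    while len(done) < n_jobs:
        ticks += 1
        for _ in range(n_clients):
            if not queue:
                break
            job = queue.pop(0)
            if rng.random() < reliability:
                done.add(job)
            else:
                queue.append(job)  # reissue later
    return ticks

print(simulate(n_jobs=100, n_clients=10))
```

A real test framework would have to model far more: client churn, deadlines, malicious results, heterogeneous hardware, and so on, which is where the estimated 6 FTE months goes.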
I'm skeptical about the feasibility of this. The effort needed to construct and maintain an automated test for a complex feature can exceed the effort needed for the feature itself. I spent a lot of time making a system for testing basic features; I never found any bugs in BOINC, just a never-ending sequence of bugs in the testing system.
If we were to create such a system, and made it available to developers, we could use it for testing changes to master, at which point master would be stable. This is the only way that master can be stable; I don't think we can use beta-test projects for individual developer changes.
Web software
This is the PHP code that implements project web site features such as forums, leaderboards, etc.
As with server software, the web code is feature-rich, and can't be thoroughly tested by a single developer.
Current
I test changes on my test project before committing to master. Then I deploy the changes to SETI@home, where they are seen by thousands of people. Any problems are reported to me by Jord, and I fix them quickly. This works well, though there are short periods when master has bugs.
Proposal
Although the two are mostly independent, we should use the same approach for web software as for server software: test it, using beta-test projects, in the server stable branch.
We should also explore tools for syntax-checking and security testing, and integrate them into the CI process on master.
App API software and wrappers
These are feature-rich, and many of the features are used only by specific applications that are not available to developers. So it's not feasible for developers to test them.
We should test these using beta-test projects, as for server software.