Opened 16 years ago

Closed 16 years ago

Last modified 15 years ago

#728 closed Enhancement (wontfix)

Add delay for reporting results and check for work upload to be finished

Reported by: Ageless
Owned by: davea
Priority: Major
Milestone: Undetermined
Component: Client - Scheduler Policy
Version: 6.2.18
Keywords:
Cc: Ageless, Pepo

Description

As requested in the BOINC Dev forums, is it possible to add a delay for reporting results on servers where:

1) The scheduler delay is extremely short.
2) It's possible to lose work done to a validator error because the upload is still going while BOINC asks for work (for another CPU core) and immediately reports the work it is still uploading.

In scenario 1) it can also happen that the person is still uploading work on a slow connection while their BOINC already requests more work and reports the task being uploaded. But in general it's about projects, like Seti, where the default scheduler deferral is 11 seconds (or thereabouts).

Solution: If the deferral given by the project server is less than a minimum time set in BOINC, add the minimum deferral time for good measure. Make sure <report_results_immediately> follows this minimum time as well.
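
The proposed minimum could be sketched roughly as follows. This is only an illustration of the idea in the ticket, not BOINC client code: `MIN_DEFERRAL_SECONDS`, its value of 60, and `effective_deferral` are hypothetical names chosen for this example.

```python
# Hypothetical client-side minimum; 60 s is an arbitrary illustrative value.
MIN_DEFERRAL_SECONDS = 60.0


def effective_deferral(server_deferral: float,
                       minimum: float = MIN_DEFERRAL_SECONDS) -> float:
    """Return how long the client waits before its next scheduler request.

    If the project server asks for less than the client-side minimum,
    the minimum is added on top of the server's value, as the ticket
    proposes. Otherwise the server's value is used unchanged.
    """
    if server_deferral < minimum:
        return server_deferral + minimum
    return server_deferral
```

With these assumed values, a server deferral of 11 seconds would become 71 seconds, while a deferral of 120 seconds would be left alone.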

In scenario 2) it doesn't even matter much if one has a quick or slow upload option. On projects where the result files are really big (CPDN, Einstein), it can happen that BOINC is still uploading the task, while a new scheduler request is made for more work. The new "ready to report" rules state that work is reported at the next request for work.

Solution: Give uploads a flag that the client checks, and defer reporting the work for as long as that flag is set. Make sure <report_results_immediately> follows this flag as well.
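
A minimal sketch of such a check, assuming a per-file flag that records whether the upload has completed. The `OutputFile` and `Result` classes and both function names are hypothetical and do not mirror the BOINC client's actual data structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OutputFile:
    name: str
    uploaded: bool = False  # hypothetical flag: True once the transfer has completed


@dataclass
class Result:
    name: str
    computation_done: bool
    output_files: List[OutputFile] = field(default_factory=list)


def ready_to_report(result: Result) -> bool:
    """A result may be reported only if it is computed and every file is uploaded."""
    return result.computation_done and all(f.uploaded for f in result.output_files)


def results_to_report(results: List[Result]) -> List[Result]:
    """Select the results that may go into the next scheduler request."""
    return [r for r in results if ready_to_report(r)]
```

In this sketch a result with an upload still in progress is simply left out of the scheduler request and picked up by a later one.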

Change History (11)

comment:1 Changed 16 years ago by Ageless

Cc: Ageless added

comment:2 Changed 16 years ago by Der Meister

Does the client really report results before the uploads for the result have finished? If yes I would consider this a serious bug that needs to be fixed.

comment:3 in reply to:  2 Changed 16 years ago by Ageless

Replying to Der Meister:
It does this when:

1) On multi-processor systems a core is about to go 'dry', so a new request for work is made. As you know, any work that is ready to report is reported at any request for work. For some reason (a bug) it can happen that work is still being uploaded when it is already being reported.
2) In the case of Seti (and perhaps other projects), it appears that work cannot be reported immediately after uploading it, as that gives immediate validation errors. Sounds to me like a problem with the database, but who am I. ;-) (I can't reproduce it either: all my Seti tasks reported with the RRI function, within 7 seconds of uploading, go through without problem.)

The trouble with the second 'bug' is that work is wasted, as the task will be sent to a third computer unnecessarily.

comment:4 Changed 16 years ago by Nicolas

Does it REALLY report work before the workunit is completely uploaded? Or is it that it reports very shortly after the upload finishes?

Maybe SETI server setup means the validator can't access uploaded files until some seconds pass (and files are transferred from one server to another or whatever). In which case it would be a SETI bug: the validator should be prepared for this problem, and defer validating that WU for a few seconds if the file isn't available yet.
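
The kind of deferral suggested here might look like the sketch below. It assumes the validator can tell how long it has already waited for a given file. `validation_step`, the `grace` parameter, and the three return strings are hypothetical and are not part of the BOINC validator.

```python
import os
from typing import Callable


def validation_step(path: str,
                    waited: float,
                    grace: float,
                    validate: Callable[[str], bool]) -> str:
    """Return 'valid', 'invalid', or 'retry' for one validation attempt.

    path     -- where the validator expects the uploaded result file
    waited   -- seconds already spent waiting for the file to appear
    grace    -- how long the validator is willing to wait in total
    validate -- project-specific check, called only when the file exists
    """
    if not os.path.exists(path):
        # File not visible yet: defer instead of failing, while within the grace period.
        return "retry" if waited < grace else "invalid"
    return "valid" if validate(path) else "invalid"
```

A caller would re-queue the workunit on 'retry' and only record an error once the grace period has run out.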

comment:5 in reply to:  4 Changed 16 years ago by Ageless

Replying to Nicolas:

> Does it REALLY report work before the workunit is completely uploaded? Or is it that it reports very shortly after the upload finishes?

Both, as far as I understand. See Les Bayliss' post which explains it for CPDN.

> Maybe SETI server setup means the validator can't access uploaded files until some seconds pass (and files are transferred from one server to another or whatever). In which case it would be a SETI bug: the validator should be prepared for this problem, and defer validating that WU for a few seconds if the file isn't available yet.

I agree with that conclusion and did ask people to go back to Seti and have them change (at least) the scheduler deferral. Back to at least a minute? Why is it 11 seconds?

Still, it won't hurt to actually have a check if work to be reported has actually left the machine completely before trying to report it.

comment:6 Changed 16 years ago by Dagorath

It should not matter to the server whether it receives the report first or the result first. If it receives the report first then it should just be patient. If it doesn't receive the result by the deadline then mark it MIA or whatever and replicate another task. That's the sensible way to handle it.
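
As a rough sketch of that server-side behaviour, assuming the server tracks whether the report and the upload have each arrived: the function name and the state strings below are hypothetical, not BOINC server states.

```python
def result_state(reported: bool, uploaded: bool,
                 now: float, deadline: float) -> str:
    """Decide what the server does with a result, whichever message arrives first.

    Returns one of three hypothetical states:
      'ready_to_validate' -- both the report and the uploaded files are present
      'waiting'           -- something is still missing, but the deadline has not passed
      'missing'           -- the deadline passed; replicate another task
    """
    if reported and uploaded:
        return "ready_to_validate"
    if now > deadline:
        return "missing"
    return "waiting"
```

The order of arrival does not appear anywhere in the function, which is the point being made: only the deadline decides when the server gives up.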

The "let's implement a delay" game never ends. You can lengthen the delay but inevitably there arises a situation where the delay is not long enough. So you lengthen the delay again. Then some users complain that the delay is too long while others point to the fact that they are still losing work units due to the delay not being long enough. Why play a game you can never win?

The other somewhat sensible (but not foolproof) strategy is to simply not report a task until the upload is finished. But that can fail where a project has 2 different servers: server A for result uploads and server B for result reports. Sometimes they are in different cities. If communications between A and B fail (it does happen) then server B will be unaware that the result has been uploaded and mark the result invalid.

comment:7 Changed 16 years ago by Nicolas

The client should never report a result until the upload is finished.

And in the case there are server A for result uploads and server B for result reports, it's up to the project to correctly handle the case where the upload arrived to A, the report arrives to B, and B can't access A in time.

comment:8 Changed 16 years ago by Pepo

Cc: Pepo added

comment:9 Changed 16 years ago by davea

Results are reported only after uploads are finished. -- David

comment:10 Changed 16 years ago by romw

Milestone: changed from 6.6 to Undetermined

comment:11 Changed 16 years ago by romw

Resolution: set to wontfix
Status: changed from new to closed