
#925 closed Defect (fixed)

BOINC 6.6.36 non-CUDA checkpoint interval scaled by host.ncpus

Reported by:	Thyme Lawn	Owned by:	davea
Priority:	Minor	Milestone:	Undetermined
Component:	Client - Daemon	Version:	6.6.36
Keywords:	checkpoint	Cc:

Description

BOINC 6.6.36 has a checkpoint multiplier problem for non-CUDA tasks on multi-core systems.

My checkpoint interval is set to 10 minutes, but checkpoints for malariacontrol.net and WCG are happening every 20 minutes on a dual-core system and every 40 minutes on a quad-core system. Applications are kept in memory and WCG usually checkpoints soon after being scheduled, so a task can stay scheduled for more than 80 minutes on the quad-core system instead of the 60 minutes set in preferences.

The problem is in ACTIVE_TASK::write_app_init_file() which contains the following pair of lines:

    int nprocs = (result->avp->ncudas)?coproc_cuda->count:gstate.ncpus;
    aid.checkpoint_period = nprocs*gstate.global_prefs.disk_interval;

This means the checkpoint interval for non-CUDA tasks will always be scaled up by the host's number of CPUs (gstate.ncpus) instead of the average number of CPUs requested for the task (result->avp->avg_ncpus).

Sure enough, app_init.xml on the dual core system has

<checkpoint_period>1200.000000</checkpoint_period>

and on the quad core it has

<checkpoint_period>2400.000000</checkpoint_period>
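
The arithmetic behind those two values can be reproduced with a small standalone sketch. It is not the client code and not the attached patch: AppVersion, checkpoint_period_reported() and checkpoint_period_suggested() are hypothetical names that model only the fields mentioned above, and the "suggested" variant is just one reading of the avg_ncpus idea. It assumes disk_interval is held in seconds (600 for a 10 minute preference), which matches the 1200 and 2400 values seen in app_init.xml.

```cpp
// Hypothetical, simplified stand-in for the app version data referenced in
// the ticket; only the two fields discussed there are modelled.
struct AppVersion {
    int ncudas;        // CUDA devices used by the app version (0 = CPU-only)
    double avg_ncpus;  // average number of CPUs requested for the task
};

// Behaviour described in the ticket: a non-CUDA task is scaled by the
// host's CPU count (gstate.ncpus in the quoted lines).
double checkpoint_period_reported(const AppVersion& avp, int cuda_count,
                                  int host_ncpus, double disk_interval) {
    int nprocs = avp.ncudas ? cuda_count : host_ncpus;
    return nprocs * disk_interval;
}

// Assumed alternative: scale a non-CUDA task by the CPUs it requested.
// This is an illustration only, not the change committed for this ticket.
double checkpoint_period_suggested(const AppVersion& avp, int cuda_count,
                                   double disk_interval) {
    double nprocs = avp.ncudas ? cuda_count : avp.avg_ncpus;
    return nprocs * disk_interval;
}
```

With a single-CPU, non-CUDA task and a 600 second preference, the first function gives 1200 on a dual core host and 2400 on a quad core host, while the second gives 600 on both.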

Attachments (1)

app_start_cpp.patch (637 bytes) - added by Thyme Lawn 16 years ago.: Patch


Change History (3)

Changed 16 years ago by Thyme Lawn

Attachment:	app_start_cpp.patch added

Patch

comment:1 Changed 16 years ago by romw

Resolution:	→ fixed
Status:	new → closed

Now fixed in 6.10.

comment:2 Changed 16 years ago by Nicolas

Fixed in r19293, backported to 6.10 in r19321.
