Opened 13 years ago

Last modified 13 years ago

#1081 new Defect

BOINC Screen Saver does not respond properly when a GPUGrid.net task with Swan_Sync set to 0 saturates and starves either the primary GPU device or the CPU

Reported by: JacobKlein Owned by: romw
Priority: Major Milestone: Undetermined
Component: Client - Screen Saver Version: 6.10.58
Keywords: ScreenSaver Respond GPU White Swan_Sync Cc: Jacob_W_Klein@…

Description

Rom and I have been testing the screensaver as part of Alpha testing, and I believe I have found some reproducible bugs that should be reported as tickets. As far as I know, these bugs exist in 6.10.58 and in the latest alphas.

This ticket covers the case where the Screen Saver is running while the primary GPU device is also running a GPU task. Essentially, the Screen Saver is starved for GPU cycles, becomes delayed, and bad things can happen.

Among those bad things are:
- Moving the mouse or pressing a key does not release the screensaver, and the user has to hit Ctrl+Alt+Del. (In one case even that did not respond and I had to hard-reboot, but for the most part, if you wait patiently, it eventually responds.)
- The Screen Saver sometimes makes the screen completely white for a few seconds, and may flicker between full-black and full-white.
- Windows Event Viewer sometimes shows an "Application Hang" event. (Interestingly, in addition to hangs for the BOINC screen saver files boinc.scr and boincscr.exe, I've even seen an "Application Hang" event for the Windows 7 "Blank" screen saver, scrnsave.scr.)
- While the screen saver is running, any graphics you are able to see may appear choppy or frozen.

I'm not sure how to fix this, but I am pretty sure the problem is a result of GPU tasks running. For me, I only notice it when GPUGrid.net is running GPU tasks. If I'm able to do more testing, I'll report back here.

One possible fix (that Rom thought up) might be to disable GPU processing on the primary display whenever that display is in screen-saver mode (or maybe whenever any screen saver is running, since the Windows 7 "Blank" screen saver hung on me earlier).
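The proposed fix can be sketched as a small policy function. This is only an illustration of the idea, not BOINC code: the `GpuTask` type, the `tasks_to_suspend` helper, the task names, and the assumption that device 0 drives the primary display are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuTask:
    name: str          # hypothetical task label
    device_index: int  # GPU the task is running on


def tasks_to_suspend(tasks, primary_device, screensaver_active):
    """Return the GPU tasks the proposed policy would pause.

    Only tasks on the GPU driving the primary display are paused,
    and only while a screen saver is active.
    """
    if not screensaver_active:
        return []
    return [t for t in tasks if t.device_index == primary_device]


# Three GPUs, one task each (names are made up for illustration).
tasks = [GpuTask("gpugrid_a", 0), GpuTask("gpugrid_b", 1), GpuTask("gpugrid_c", 2)]
paused = tasks_to_suspend(tasks, primary_device=0, screensaver_active=True)
print([t.name for t in paused])
```

Under this sketch the other two GPUs keep crunching, so only the device that has to draw the screen saver is freed up.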

For reference, my system is a Windows 7 x64 PC with an Intel i7 965 Extreme Edition (quad-core with hyperthreading, so both Windows and BOINC see 8 processors) and 3 GPUs (1 GeForce GTX 460 and 2 GeForce 9800 GTs). I regularly run GPUGrid.net with the Swan_Sync system variable set to 0, which makes the CPU process that accompanies each GPU task consume as much CPU as it can, and I have BOINC set up to use 63% (5 of 8) of my processors. So BOINC is crunching 5 "cpu-only" tasks and 3 "gpu+cpu" tasks, which ends up fully saturating the entire PC. A mouthful.
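As a quick check of the "63% (5 of 8)" figure, the arithmetic can be written out as below. The rounding rule (round down, with a minimum of 1) is an assumption made for this sketch, and `cpus_allowed` is a hypothetical helper, not a BOINC function.

```python
import math


def cpus_allowed(logical_cpus: int, percent: float) -> int:
    """CPUs permitted by a percentage cap (assumed: round down, minimum 1)."""
    return max(1, math.floor(logical_cpus * percent / 100))


LOGICAL_CPUS = 8  # quad-core with hyperthreading
GPU_TASKS = 3     # one GPUGrid.net task per GPU, as described in this ticket

for percent in (63, 100):
    cpu_only = cpus_allowed(LOGICAL_CPUS, percent)
    print(f"{percent}% -> {cpu_only} cpu-only tasks + {GPU_TASKS} gpu+cpu tasks")
```

With these numbers, 63% gives 5 cpu-only tasks and 100% gives 8, matching the two setups described in this ticket.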

Below is an exchange of emails regarding the issue.

==============================

From: jacob_w_klein@…
To: romw@…; boinc_alpha@…
Date: Thu, 31 Mar 2011 17:30:56 -0500
Subject: Re: [boinc_alpha] 6.12.19 - Strange screensaver results

I just had another breakthrough regarding the white-screen issue! (Which may be entirely separate from the Ctrl+Alt+Del issue.)

On a test just now, when the screensaver was supposed to start, all I saw was a white screen and a busy cursor, for several seconds.
Then, it may have flickered black, but then went back to white.

Mouse and Keyboard were not responding, and so I hit Ctrl+Alt+Del, and that took several seconds (20-25?) to finally return me to the login.

The best part of all this... is that it got logged (attached)!

Here are the juicy parts!
[03/31/11 17:14:55] Screen saver is executed
[03/31/11 17:15:43] Threads and project selection happens (a whole 48 seconds late, and I'm betting I saw all white during that time!)
[03/31/11 17:15:44] Docking project chosen.
[03/31/11 17:15:55] Activity detected (me pressing keyboard? Looks like 11 seconds into the project's time slice)
[03/31/11 17:15:55] - [03/31/11 17:16:00] Tons of shutdown calls (shouldn't there only be one?)
... and to top it off, the log didn't get flushed properly.
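The delays quoted above can be recomputed from the timestamps. This is a small sketch that assumes the log uses a month/day/two-digit-year format, which matches the date of this email; `seconds_between` is a helper introduced only for illustration.

```python
from datetime import datetime

FMT = "%m/%d/%y %H:%M:%S"  # assumed timestamp format, e.g. 03/31/11 17:14:55


def seconds_between(start: str, end: str) -> int:
    """Whole seconds elapsed between two log timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


executed = "03/31/11 17:14:55"      # screen saver is executed
selection = "03/31/11 17:15:43"     # threads and project selection
chosen = "03/31/11 17:15:44"        # Docking project chosen
activity = "03/31/11 17:15:55"      # activity detected, first shutdown call
last_shutdown = "03/31/11 17:16:00" # last shutdown call

print("startup delay:", seconds_between(executed, selection), "s")
print("time into project slice:", seconds_between(chosen, activity), "s")
print("span of shutdown calls:", seconds_between(activity, last_shutdown), "s")
```

The first two values are the 48 and 11 seconds mentioned above; the third shows that the repeated shutdown calls were spread over about 5 seconds.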

I see chaos. Let the fixing begin?

Still smiling,
Jacob

==============================

From: jacob_w_klein@…
To: romw@…; boinc_alpha@…
Subject: RE: [boinc_alpha] 6.12.19 - Strange screensaver results
Date: Thu, 31 Mar 2011 17:56:03 -0500

Note: The "white-out" apparently occurred while GPU tasks were running, and while my CPU % setting was at 100, meaning: 8 cpu-only tasks, alongside 3 cpu+gpu tasks
... i.e., overloaded even for my system.

I'm thinking the "white-out" issue is a result of running GPU tasks alongside the screensaver.

Here's a thought regarding the shutdown:
Is it possible to boost the screensaver priority whenever things are being shut down? (to ensure shutdown happens ASAP)
Because, at the time I hit Ctrl+Alt+Del, I'm sure I had 8 cpu processes at "Low" priority, 3 cpu processes at "Below Normal" priority, and 3 GPUs that were tied up.
I'm not sure what the screensaver priority was, but it should be at least "Normal" if not higher.

- Jacob

==============================

Date: Thu, 31 Mar 2011 19:07:01 -0400
From: romw@…
To: jacob_w_klein@…; boinc_alpha@…
Subject: Re: [boinc_alpha] 6.12.19 - Strange screensaver results

Screensaver priority is determined by Windows when the screensaver is
launched.

I suspect the only way to "fix" this issue would be to shut down the GPU
app running on the primary display. Then the screensaver and OpenGL
graphics application from the project wouldn't be starved (waiting) for
GPU cycles.

----- Rom

Attachments (1)

stdoutscr_JWK_016_6.12.19_boincscr.310311.x64.txt (16.2 KB) - added by JacobKlein 13 years ago.
Alpha testing - Log file showing major delays and bad behavior

Change History (4)

Changed 13 years ago by JacobKlein

Alpha testing - Log file showing major delays and bad behavior

comment:1 Changed 13 years ago by JacobKlein

The Windows Event Viewer "Application Hang" events are Event ID 1002, Task Category (101), and I have events (hangs) listed for the following programs:
- boinc.scr version 6.12.19.0
- boincscr.exe version 6.12.19.0
- scrnsave.scr version 6.1.7600.16385
- minirosetta_graphics_1.92_windows_x86_64.exe version 0.0.0.0
- charmm34_6.23_graphics_windows_x86_64 version 0.0.0.0

comment:2 Changed 13 years ago by JacobKlein

Keywords: Swan_Sync added
Summary changed from "BOINC Screen Saver does not respond properly when a GPU task saturates and starves the primary GPU device (like a GPUGrid.net task)" to "BOINC Screen Saver does not respond properly when a GPUGrid.net task with Swan_Sync set to 0 saturates and starves either the primary GPU device or the CPU"

Here are more testing results, reported on the Alpha email list:

> From: jacob_w_klein@…
> To: davea@…; boinc_alpha@…; romw@…
> Date: Sat, 2 Apr 2011 12:16:04 -0500
> Subject: Re: [boinc_alpha] 6.12.19 - Strange screensaver results
>
>
> Rom / David / List:
>
> I've done some more testing regarding the screensaver slowness and hangs...
> and although I'm unable to conclude whether it is either GPU starvation or CPU starvation,
> I do believe it has something to do with how GPUGrid.net uses the system variable "Swan_Sync" when set to "0".
>
> Here are some tests I've done on my quad-core hyperthreaded (BOINC sees 8 CPUs) machine, with 3 GPUs:
>
> When "Swan_Sync" is not set, and I have BOINC set up to use 100% of my processors:
> - BOINC uses 8 CPUs as Low priority processes
> - GPUGrid.net uses all 3 GPUs
> - GPUGrid.net also creates 3 "acemd..." Below Normal priority processes, which consume very little CPU because "Swan_Sync" is not set.
> - ScreenSaver responsiveness seems very good. Good when starting, good when executing, and good when releasing.
>
> When "Swan_Sync" is set to "0", and I have BOINC set up to use 63% of my processors:
>
> - BOINC uses 5 CPUs as Low priority processes
> - GPUGrid.net uses all 3 GPUs
> - GPUGrid.net also creates 3 "acemd..." Below Normal priority processes, which consume as much CPU as they can because "Swan_Sync" is set to "0".
> - ScreenSaver responsiveness is poor. Slow to start, choppy when executing, and slow to release, even sometimes hanging.
>
> Do you think that, when the screensaver runs, having fully active CPU processes at a priority higher than "Low" can cause the screensaver to have problems?
> Again, I'm not sure if it's GPU starvation or CPU starvation, mainly because I'm unsure of the effects of Swan_Sync on GPU utilization, but either way it's not good.
>
> Note:
> At GPUGrid.Net, we are told to set "Swan_Sync" to "0" to make crunching faster for modern GPUs, and to make tasks crash less for older GPUs.
> I plan on keeping the setting set.
> I am only playing with the setting, and running the BOINC Screensaver, to help with testing - my normal setup is to run the "Blank" screensaver.
>
> Regards,
> Jacob
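The two configurations compared in the email above can be tallied by priority class. The sketch below uses only the process counts and priorities reported there. It treats a process as either fully CPU-bound or idle, which is a simplification, and it does not model GPU contention at all.

```python
from collections import Counter

LOGICAL_CPUS = 8  # quad-core with hyperthreading

# (priority class, number of processes, CPU-bound?) as described in the email.
CONFIGS = {
    "Swan_Sync unset, 100% CPUs": [("Low", 8, True), ("Below Normal", 3, False)],
    "Swan_Sync=0, 63% CPUs": [("Low", 5, True), ("Below Normal", 3, True)],
}


def busy_by_priority(processes):
    """Count fully busy processes per priority class."""
    busy = Counter()
    for priority, count, cpu_bound in processes:
        if cpu_bound:
            busy[priority] += count
    return busy


for name, processes in CONFIGS.items():
    busy = busy_by_priority(processes)
    total = sum(busy.values())
    print(f"{name}: {dict(busy)} -> {total} busy of {LOGICAL_CPUS} logical CPUs")
```

In both cases all 8 logical CPUs are busy. The difference is that with Swan_Sync set to 0, three of the busy processes run at Below Normal rather than Low, which fits the question raised in the email but does not by itself separate CPU starvation from GPU starvation.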

comment:3 Changed 13 years ago by JacobKlein

I have noticed that, when processing on the GPU with Swan_Sync set to 0, even the Windows Blank screensaver has trouble responding promptly, especially when Windows has just entered screensaver mode (i.e., the screen goes blank, you wait 1-2 seconds, then move the mouse). I've seen 5-15 second delays in the response, and I've seen white screens with the busy-cursor animation.

So, I believe this issue is not limited to the BOINC Screensaver; it likely affects any screensaver.
