Opened 11 years ago

Last modified 11 years ago

#1247 new Defect

Non-Admin Execution can erroneously result in VirtualStore contents

Reported by: JacobKlein Owned by: romw
Priority: Undetermined Milestone: Undetermined
Component: Client - Setup Version: 7.0.64
Keywords: VirtualStore Admin Privilege Windows Account Cc: Jacob_W_Klein@…

Description

Somehow, my BOINC installation resulted in a VirtualStore directory being created. A later installation, which fixed BOINC to reference the proper ProgramData BOINC directory, still did not quite work correctly, because of the existence of the VirtualStore data.

We should try to:
- Find how the VirtualStore directory got created, and prevent it if possible
- Clean up any existing BOINC VirtualStore directories, if possible, since their mere existence can cause problems.

The issue I had was that the GPUGrid.net project would use some files from C:\ProgramData\BOINC\slots\0 ... and some files from VirtualStore\ProgramData\BOINC\slots\0 ...  and the end effect was that tasks would complete immediately, be marked successful, and I would be granted credit.  Not good.
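
To check whether a machine is affected, one option is to compare the real BOINC data directory with its VirtualStore counterpart. The following is only a minimal sketch, not part of BOINC: the helper names virtualstore_root and find_shadowed_files are hypothetical, and it assumes the default Windows locations (%LOCALAPPDATA%\VirtualStore for the per-user VirtualStore and %PROGRAMDATA%\BOINC for the data directory). It only reports files and does not delete anything.

```python
import os
from pathlib import Path


def virtualstore_root(local_appdata=None):
    """Return the per-user VirtualStore root.

    Assumption: the default Windows location, %LOCALAPPDATA%\\VirtualStore.
    """
    base = local_appdata or os.environ.get("LOCALAPPDATA", "")
    return Path(base) / "VirtualStore"


def find_shadowed_files(real_dir, virtual_dir):
    """List files found under virtual_dir.

    Each entry is (relative_path, also_in_real_dir). A True flag means a
    file with the same relative path also exists under real_dir, so the
    VirtualStore copy may be used in place of the real one.
    """
    real_dir, virtual_dir = Path(real_dir), Path(virtual_dir)
    results = []
    if not virtual_dir.is_dir():
        return results  # no VirtualStore copy of this directory
    for path in sorted(virtual_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(virtual_dir)
            results.append((rel.as_posix(), (real_dir / rel).is_file()))
    return results


if __name__ == "__main__":
    # Assumed default data directory; adjust if BOINC was installed elsewhere.
    real = Path(os.environ.get("PROGRAMDATA", r"C:\ProgramData")) / "BOINC"
    virtual = virtualstore_root() / "ProgramData" / "BOINC"
    for rel, also_real in find_shadowed_files(real, virtual):
        note = "shadows real file" if also_real else "exists only in VirtualStore"
        print(f"{rel}: {note}")
```

Run under the affected Windows account, any output at all indicates that a VirtualStore copy of the BOINC data directory exists for that user.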

Change History (1)

comment:1 Changed 11 years ago by JacobKlein

The following email exchange documents the finding of the issue, with some developer feedback.


From: jacob_w_klein@…
To: romw@…
CC: davea@…
Subject: RE: GPUGrid.net problem - Completing immediately, status successful, and granting credit
Date: Mon, 13 May 2013 20:33:04 -0400

Rom,

Is your fix capable of making BOINC work correctly in the presence of an existing "VirtualStore\ProgramData\BOINC" directory?
My initial testing with your private drop indicates that it doesn't fix that issue.
I know you said, before, "but the damage was done", but... are there any plans to fix that issue?

Note: I am now able to readily and easily recreate the directory structure that causes problems, but I have not yet researched the actual trigger, and haven't yet tested your fix against that trigger. I plan to do that later this week.

Regards,
Jacob

----
Subject: RE: GPUGrid.net problem - Completing immediately, status successful, and granting credit
Date: Mon, 13 May 2013 12:46:10 -0400
From: romw@…
To: jacob_w_klein@…; davea@…

Jacob,

Thanks again for the bug report.

After going back over the thread, I suspect the actual trigger in your case was that you removed your current account from all the BOINC groups and from the administrators group.  When BOINC launched, all files it opened for output were redirected to the VirtualStore.  The next time you installed BOINC, the installer fixed the environment, but the damage was done.

From that point on Windows was mixing and matching files between the two directory structures.

From: Jacob Klein [mailto:jacob_w_klein@msn.com]
Sent: Monday, May 13, 2013 12:16 PM
To: David Anderson (BOINC)
Cc: Rom Walton
Subject: RE: GPUGrid.net problem - Completing immediately, status successful, and granting credit

Thanks!
Finding the trigger for this issue is a big win for me!

The forum moderator at GPUGrid, skgiven, and MrS and Beyond (Ed), all gave several suggestions, but these were generally guesses (try disabling the firewall) and speculation (maybe the CPU is overloaded, maybe app_config). Lately, skgiven had been suggesting that I just reinstall the OS.  I was almost at that point, because I had almost run out of things to try.

I'm just so glad that my methodical approach paid off, and I remembered to use Process Monitor, before resorting to an OS reinstall.

The GPUGrid admins had said that I was the only one that currently had this problem (based presumably on a database search). While that may be true... from what I understand about the problem, I would think other users (and other projects!) would also be susceptible.

I still don't fully understand the cause, or which scenarios could cause it.


> Date: Mon, 13 May 2013 08:56:38 -0700
> From: David Anderson
> To: Jacob Klein
> CC: Rom Walton
> Subject: Re: GPUGrid.net problem - Completing immediately, status successful, and granting credit
>
> Wow! Congrats on figuring that out.
>
> On 13-May-2013 5:09 AM, Jacob Klein wrote:
> > David / Rom:
> >
> > You both know I've been trouble-shooting a GPUGrid problem I've been having,
> > where Nathan tasks in slot 0 appear to complete immediately and grant full+bonus
> > credit.
> > I had done a TON of testing on it recently, to confirm that it wasn't related to
> > app_config, cc_config, global_prefs_override, having multiple GPUs, nVidia
> > driver versions, or any conflicting programs.
> >
> > Well, I wanted to let you know that I believe I have now solved it.
> >
> > I think you will find the following post very very interesting:
> > http://www.gpugrid.net/forum_thread.php?id=3332&nowrap=true#29894
> >
> > I'm not sure where the bug truly is, but somehow someone (maybe BOINC, maybe
> > GPUGrid acemd app, maybe Windows itself)...
> > ... is not handling Virtual Store correctly.
> >
> > Your thoughts?
> >
> > Thanks for your help guys, I truly do appreciate it.
> > - Jacob
> >
> >
> >
> >
> >
> > > Date: Sun, 12 May 2013 22:37:53 -0700
> > > From: David Anderson
> > > To: Jacob Klein
> > > Subject: Re: GPUGrid.net problem - Completing immediately, status successful, and granting credit
> > >
> > > Jacob:
> > >
> > > Nothing comes to mind.
> > > I agree that the evidence suggests a problem somewhere in the BOINC
> > > runtime system (the way client communicates with app).
> > >
> > > However, the fact is that the app is exiting,
> > > apparently by calling boinc_finish()
> > > (since the file boinc_finish_called is present).
> > > Without more info about why the app is exiting,
> > > I'm not sure I can figure out anything.
> > >
> > > Strangely, there's nothing in the stderr output of the short jobs
> > > (strange because boinc_finish() writes something there).
> > >
> > > So it would help me if GPUGrid would add some fprintf(stderr,...)s
> > > to their app (especially wherever it exits).
> > > I'd be happy to work closely with them (and you) to solve this problem.
> > >
> > > -- David
> > >
> > > On 10-May-2013 3:24 PM, Jacob Klein wrote:
> > > > Hey David,
> > > >
> > > > Regarding this issue (where a GPUGrid task completes in <5 seconds, says it's
> > > > successful, and grants full+bonus credit)...
> > > >
> > > > I've been able to determine that:
> > > > - It happens even when I'm not running World Community Grid Help Conquer Cancer
> > > > GPU tasks (so, it's unrelated to WCG)
> > > > - It happens even when running only 1 task per GPU (so it's not caused by
> > > > running 2-tasks-on-1-GPU)
> > > >
> > > > It inconclusively *appears* also that:
> > > > - It happens on Nathan Long-run dhfr tasks only (That's the only type of work
> > > > units I've seen it happen on)
> > > > - It happens only when I use an app_config.xml file (with 9999 max_concurrent, 1
> > > > gpu_usage, 0.001 cpu_usage) (I haven't been able to reproduce the error when not
> > > > running an app_config.xml file)
> > > > - It happens on slot 0 (For the past 3 times it happened, I've paid closer
> > > > attention, and they all were on slot 0)
> > > >
> > > > I was wondering:
> > > > Would you please consider doing the following:
> > > > - Inspecting the code to see if app_config.xml would do anything funky with the
> > > > CPU? I believed that the 0.001 cpu_usage was only used in the calculation of
> > > > how many CPU tasks to additionally start, but, if it also has any other effect,
> > > > it could be causing an issue.
> > > > - Inspecting the code to see if anything special happens on slot 0?
> > > > - Inspecting the log snippets I captured, to see if anything jumps out as
> > > > suspicious? I had tons of debug flags on, and you probably can read them better
> > > > than anyone.
> > > >
> > > > The task details are at:
> > > > http://www.gpugrid.net/forum_thread.php?id=3332&nowrap=true#29804
> > > > Log part 1/3: http://www.gpugrid.net/forum_thread.php?id=3332&nowrap=true#29805
> > > > Log part 2/3: http://www.gpugrid.net/forum_thread.php?id=3332&nowrap=true#29806
> > > > Log part 3/3: http://www.gpugrid.net/forum_thread.php?id=3332&nowrap=true#29807
> > > >
> > > > Log parts 1&2 are of the download of the task, and may not be relevant.
> > > > Log part 3 has the log where it began execution and immediately completed, and
> > > > would be where to focus.
> > > > Searching for the term 'RND5144' might be the easiest way to find pertinent info.
> > > >
> > > > Basically, I'm hoping you conclude that it's an application problem, but because
> > > > it happens only with app_config, and only on slot 0, I'm concerned it could be a
> > > > BOINC issue.
> > > > Side note: The GPUGrid.net admins are still not being helpful in resolving this
> > > > issue.
> > > >
> > > > Thanks for all your help,
> > > > Jacob
