Posts by DylanMadeley

1) Message boards : Problems : Observation about LHC jobs (Message 1246)
Posted 19 Jul 2022 by DylanMadeley
Post:
Thanks. I have joined their forum. In the meantime, I have found ways to keep the LHC ATLAS simulation tasks steadily going to the one machine that processes them pretty well, and I find other things for the newer one to do.
2) Message boards : Problems : Observation about LHC jobs (Message 1243)
Posted 18 Jul 2022 by DylanMadeley
Post:
I use two laptops. One is an older Dell G7 17. The other is a newer Alienware G15 with 20 cores, and it is the source of the issues I posted about earlier.

I have now observed that the newest batch of LHC jobs runs well enough on the older Dell. In fact, it doesn't even generate excess heat. My earliest problem was solved by switching off eFMer TThrottle and dedicating that older model to LHC jobs; they run cool and finish on time.

On the newer machine, I thought I had solved the problem by disabling TThrottle and removing every other job, so that I could dedicate 16 cores to LHC: two jobs at 8 cores apiece. It still ran like Zeno's Paradox: it would just keep slowing until it could never reach the end.

I'm giving the new machine a shutdown, a time out, and lots of rest, but maybe it just isn't any good for LHC jobs. It can run Einstein, PrimeGrid, and Amicable. It can even run 19-core Amicable jobs, but then it has to be dedicated to that one job because it has 20 cores in total. I guess I just have to steer it away from LHC and keep those jobs on the older model.
3) Message boards : Problems : Reference_by_pointer bluescreens since yesterday (Message 1167)
Posted 25 Jun 2022 by DylanMadeley
Post:
Seems resolved: Alienware Command Center had screwed up during a software update and this somehow corrupted Windows. After a full reset/reinstall, back to work with no problems.
4) Message boards : Problems : Reference_by_pointer bluescreens since yesterday (Message 1161)
Posted 24 Jun 2022 by DylanMadeley
Post:
Short update: reinstalled Windows, reinstalled the few programs I had on there since I'm literally just running it for BOINC until I need it for anything else.

Current theory: every crash log showed a crash in Alienware Command Center (AWCC) concurrent with the bluescreen, and AWCC has 1-star reviews for good reason, so I suspect AWCC is what flaked out. The moment of truth: having reinstalled and updated every driver and program needed, including TThrottle, if I can run BOINC on lighter-than-usual settings and not get a crash, then it's fixed. If it bluescreens again, then perhaps I bricked my GPU. No crashes since the reinstallation, but it has only just started putting the GPU to work for BOINC.
5) Message boards : Problems : Reference_by_pointer bluescreens since yesterday (Message 1160)
Posted 24 Jun 2022 by DylanMadeley
Post:
This is for me and anybody else running similar specs who is suddenly running into frequent crashes.

laptop specs: 12th Gen Intel(R) Core(TM) i7-12700H [Family 6 Model 154 Stepping 3], NVIDIA NVIDIA GeForce RTX 3080 Ti Laptop GPU (4095MB), Microsoft Windows 11 Core x64 Edition, 32461.29 MB memory, 20 cores, BOINC version: 7.16.20, TThrottle running to manage temperatures

problem: Since late last night, it keeps throwing reference-by-pointer bluescreen errors. I have since uninstalled BOINC on that machine, only to find the same error keeps happening. Even when four loud fans are all running full blast, TThrottle shows temps as nominal, and the keyboard/surface is cool to the touch, attempted installations of basic program updates have caused a crash.

current actions: Ran diagnostics for corrupted data and memory, reinstalled the NVIDIA driver and checked all drivers, but errors kept occurring. My next course of action is to reset the thing, wiping out all personal files (I have backups, thankfully, and have only had this thing a month) and reinstalling Windows, since whatever corruption may have happened seems to run too deep for me to figure out with my pedestrian computer skills.

lesson one: This isn't the ideal platform for hours-long distributed computing applications anyway; I made a terrible mistake purchasing this machine. I don't even need the portability, so I should have gone with a tower with robust cooling. Even if the tower had the same processor, RAM and GPU specs, it would at least be able to use its processing power to the full and would live a longer life of good work.

fears: Have I managed to brick any of the hardware within a month of purchasing this machine? Trade-in value at the seller looks sad and I will not be in position to finance more suitable equipment for a long time. Well, I tried.

© 2024 UC Berkeley