EVGA

Trying to track down a BSOD culprit

Author
Sajin
EVGA Forum Moderator
  • Total Posts : 49168
  • Reward points : 0
  • Joined: 2010/06/07 21:11:51
  • Location: Texas, USA.
  • Status: online
  • Ribbons : 199
Re: Trying to track down a BSOD culprit 2022/10/17 20:31:29 (permalink)
Sounds like you’re good then.
#31
ZoranC
FTW Member
  • Total Posts : 1099
  • Reward points : 0
  • Joined: 2011/05/24 17:22:15
  • Status: offline
  • Ribbons : 16
Re: Trying to track down a BSOD culprit 2022/10/17 21:36:56 (permalink)
arestavo
I've got a hunch that it's the motherboard, since even after replacing the CMOS battery and later the RAM, the two voltages for the RAM are off. One reads the proper voltage, and the other reads higher than what it's set at.
 
As for the storage options, I'll try without them connected later - yet they both register as fine for smart data (and testing with Micron's executive software for the 9300). Sadly, I can go for days without a BSOD and then get one - so, it'll be a bit of a bother without the game drive and the movie drive.
 
Memtest86 and Memtest86+ both fully pass stock JEDEC and XMP1 settings for the old and new RAM. I'm running an all night Memtest86 if it'll keep going for the free version.

I'm sorry you are having a hard time.
 
I'm not sure it's the mb. I'm running an X299 Dark and I have observed discrepancies in voltage readouts, but some variation is expected, and the readouts depended on which sensor was used to read them (look for my posts on that topic). In the end I figured readouts don't matter as long as the system is stable. It would help if you shared more info.
 
Intel's processor tests are not very thorough, so I wouldn't say they are a 100% guarantee that the CPU is OK. However, if I had to suspect something it would be your memory. I use MemTest86 Pro and GSAT for memory testing. My G.Skill is 3200/16. Your 3600/16 might be too much. Have you tried using OCCT to stress test your CPU and memory?
 
 
 
#32
arestavo
CLASSIFIED ULTRA Member
  • Total Posts : 6916
  • Reward points : 0
  • Joined: 2008/02/06 06:58:57
  • Location: Through the Scary Door
  • Status: offline
  • Ribbons : 76
Re: Trying to track down a BSOD culprit 2022/10/17 23:06:22 (permalink)
ZoranC
 
I'm sorry you are having a hard time.
 
I'm not sure it's the mb. I'm running an X299 Dark and I have observed discrepancies in voltage readouts, but some variation is expected, and the readouts depended on which sensor was used to read them (look for my posts on that topic). In the end I figured readouts don't matter as long as the system is stable. It would help if you shared more info.
 
Intel's processor tests are not very thorough, so I wouldn't say they are a 100% guarantee that the CPU is OK. However, if I had to suspect something it would be your memory. I use MemTest86 Pro and GSAT for memory testing. My G.Skill is 3200/16. Your 3600/16 might be too much. Have you tried using OCCT to stress test your CPU and memory?

OCCT passed the CPU test for over an hour, same for AIDA64. FWIW, I got the same BSODs with 8 sticks of Corsair Vengeance 3200MHz C16 RAM.
 
I started an RMA request with Intel. In the meantime, I pulled the trigger on an AMD 5800X3D and MSI MEG X570S Ace Max. Also snagged a 2TB Firecuda 530 M.2 since I'll need to cut down on my PCIe cards. That drive has some serious specs that actually exceed those of my old PCIe Intel 900P drive in everything except random 4K Q1 read speeds, where the 900P still beats the Firecuda. I'll just have the X299 FTW K, 10940X, and Intel 900P to replace my aging backup X79, which is getting quite long in the tooth.
post edited by arestavo - 2022/10/17 23:50:11
#33
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/18 10:36:50 (permalink)
arestavo
OCCT passed the CPU test for over an hour, same for AIDA64. FWIW, I got the same BSODs with 8 sticks of Corsair Vengeance 3200MHz C16 RAM.

I assume you tested all variations of instruction sets, and also ran the memory test, with OCCT? If you haven't already, you might want to try the "power supply" portion of the OCCT test, which subjects both CPU and GPU to a simultaneous load.
 
FWIW, IME Corsair Vengeance also throwing errors isn't an indicator that it isn't the memory. I had frequent memory test failures and issues when I tried to use Corsair in my X299 Dark, with more than one model of their memory. Not a single one with G.Skill. G.Skill also claims compatibility with the X299 Dark. Corsair doesn't.
 
With that said, I fully understand you wanting to be done with it and move on. If my X299 Dark wasn't stable, that is what I would be doing too. Thank God (and knock on wood) it is, so I will be sitting on the bench watching what the next generations of CPUs will bring.
#34
arestavo
Re: Trying to track down a BSOD culprit 2022/10/18 14:07:12 (permalink)
Sajin
Sounds like you’re good then.

Intel is saying that enabling RAM XMP profiles above the CPU rating voids the warranty? The 10940X was only rated for 2933....
 
Guess I'll try RMAing the X299 FTW K with EVGA.
#35
Sajin
Re: Trying to track down a BSOD culprit 2022/10/18 14:09:46 (permalink)
Lol. You weren’t supposed to tell them that.
#36
arestavo
Re: Trying to track down a BSOD culprit 2022/10/18 14:14:25 (permalink)
Sajin
Lol. You weren’t supposed to tell them that.

Well I didn't, I just find it hard to believe that enabling a conservative XMP profile of 3200MHz would void a warranty.
#37
Sajin
Re: Trying to track down a BSOD culprit 2022/10/18 14:15:37 (permalink)
I hear ya, but those are the rules from Intel.
#38
arestavo
Re: Trying to track down a BSOD culprit 2022/10/18 16:24:44 (permalink)
One interesting note, as I sit here with Intel's XTU stress testing AVX (because Star Citizen uses AVX) for 4 hours straight with no issues - I've only gotten these BSODs in games, which are variable loads, unlike these stress tests, which keep the workload very steady.
#39
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/18 16:43:29 (permalink)
arestavo
Sajin
Lol. You weren’t supposed to tell them that.

Well I didn't, I just find it hard to believe that enabling a conservative XMP profile of 3200MHz would void a warranty.

That has been a subject of debate for quite a while now. Both sides have valid ground to stand on, but digging in won't help David when dealing with Goliath, so the best thing to do is what you did.
#40
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/18 16:44:51 (permalink)
arestavo
Sajin
Sounds like you’re good then.

Intel is saying that enabling RAM XMP profiles above the CPU rating voids the warranty? The 10940X was only rated for 2933....
 
Guess I'll try RMAing the X299 FTW K with EVGA.

While you are at it, you might want to ask them if they could upgrade you to an X299 Dark.
#41
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/18 16:59:59 (permalink)
arestavo
One interesting note, as I sit here with Intel's XTU stress testing AVX (because Star Citizen uses AVX) for 4 hours straight with no issues - I've only gotten these BSODs in games, which are variable loads, unlike these stress tests, which keep the workload very steady.

First, IME Intel's XTU doesn't stress the system anywhere near as much as OCCT and P95 do. So if I were you I wouldn't waste time on XTU; I would spend that time on OCCT's "power supply" test.
 
Second, you are saying exactly what I have been saying for quite a while now in my threads on my X299 Dark build: constant-load tests of a single component are not the ultimate way to confirm final system stability. The best way is random, variable loads applied simultaneously across multiple components.
 
Third, in other words, you are talking about the same thing I have talked about before: I had stability issues when I allowed the system to vary its power. That is why I locked all cores to the same multiplier, disabled Turbo Boost 3.0, disabled C states and the rest of the stuff I mentioned above, and am running the Ultimate power plan so power fluctuations are minimal.
 
Also, I have configured the driver for my 2080 to prefer maximum performance, to minimize spike-related instabilities with it too.
 
P.S. Is your CSM enabled or disabled? Is your above 4G decoding enabled or disabled? Are you trying to use ReBAR?
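To illustrate the difference between a constant load and the random, variable load described in this post, here is a minimal Python sketch. It is not a replacement for OCCT or any real stress tool: it only loads a single core, and the function names (`make_load_schedule`, `run_schedule`) and the burst lengths are hypothetical choices made for this illustration.

```python
import random
import time


def make_load_schedule(total_seconds, min_burst=0.05, max_burst=0.5, seed=None):
    """Return (busy_seconds, idle_seconds) pairs covering at least total_seconds."""
    rng = random.Random(seed)  # a seed makes the schedule repeatable
    schedule = []
    elapsed = 0.0
    while elapsed < total_seconds:
        busy = rng.uniform(min_burst, max_burst)
        idle = rng.uniform(min_burst, max_burst)
        schedule.append((busy, idle))
        elapsed += busy + idle
    return schedule


def run_schedule(schedule):
    """Alternate between spinning one core and sleeping, following the schedule."""
    for busy, idle in schedule:
        deadline = time.perf_counter() + busy
        counter = 0
        while time.perf_counter() < deadline:
            counter += 1  # busy-wait: keeps one core loaded for the burst
        time.sleep(idle)  # let the core drop back toward idle
```

Calling `run_schedule(make_load_schedule(60))` would alternate short bursts and pauses for roughly a minute, which is the kind of fluctuating demand that a fixed-intensity stress test does not produce.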
#42
arestavo
Re: Trying to track down a BSOD culprit 2022/10/18 17:32:30 (permalink)
ZoranC
 
First, IME Intel's XTU doesn't stress the system anywhere near as much as OCCT and P95 do. So if I were you I wouldn't waste time on XTU; I would spend that time on OCCT's "power supply" test.
 
Second, you are saying exactly what I have been saying for quite a while now in my threads on my X299 Dark build: constant-load tests of a single component are not the ultimate way to confirm final system stability. The best way is random, variable loads applied simultaneously across multiple components.
 
Third, in other words, you are talking about the same thing I have talked about before: I had stability issues when I allowed the system to vary its power. That is why I locked all cores to the same multiplier, disabled Turbo Boost 3.0, disabled C states and the rest of the stuff I mentioned above, and am running the Ultimate power plan so power fluctuations are minimal.
 
Also, I have configured the driver for my 2080 to prefer maximum performance, to minimize spike-related instabilities with it too.
 
P.S. Is your CSM enabled or disabled? Is your above 4G decoding enabled or disabled? Are you trying to use ReBAR?


CSM is disabled, ReBAR is enabled with above 4G decoding enabled.
#43
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/18 17:50:28 (permalink)
arestavo
CSM is disabled, ReBAR is enabled with above 4G decoding enabled.

You might want to try with above 4G and ReBAR disabled. The few times I had odd behavior, they were enabled. I can't claim with 100% certainty that they were the culprit because the issue was very sporadic, but ever since then I have kept them disabled and the issue hasn't repeated itself. Which is fine with me, because I couldn't see any real-world benefit when they were enabled.
#44
arestavo
Re: Trying to track down a BSOD culprit 2022/10/18 17:52:34 (permalink)
ZoranC
 
You might want to try with above 4G and ReBAR disabled. The few times I had odd behavior, they were enabled. I can't claim with 100% certainty that they were the culprit because the issue was very sporadic, but ever since then I have kept them disabled and the issue hasn't repeated itself. Which is fine with me, because I couldn't see any real-world benefit when they were enabled.


I'll give it a shot
#45
arestavo
Re: Trying to track down a BSOD culprit 2022/10/18 18:49:58 (permalink)
Ha. So, as part of my testing for Intel they had me download the Intel XTU program and use the stress test. I got a BSOD within a minute of starting the test. All the other "stress" tests that I've done until now were fine.
 
For ZoranC, that was with ReBAR and Above 4G Decoding disabled.
#46
Sajin
Re: Trying to track down a BSOD culprit 2022/10/18 20:14:09 (permalink)
Lol.
#47
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/19 14:21:20 (permalink)
arestavo
Ha. So, as part of my testing for Intel they had me download the Intel XTU program and use the stress test. I got a BSOD within a minute of starting the test. All the other "stress" tests that I've done until now were fine.
 
For ZoranC, that was with ReBAR and Above 4G Decoding disabled.

Didn't you say you tested with XTU for 4 hours straight without any issues??? What has changed?
 
If I were you I would go back to basics: BIOS settings at default multipliers and values for voltages (no overclocking), AVX offset 3/5, and test that memory with XMP on extensively.
post edited by ZoranC - 2022/10/19 14:32:54
#48
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/19 14:21:36 (permalink)
Sajin
Lol.

???
#49
Sajin
Re: Trying to track down a BSOD culprit 2022/10/19 15:57:23 (permalink)
ZoranC
Sajin
Lol.

???

arestavo
Ha. So, as part of my testing for Intel they had me download the Intel XTU program and use the stress test. I got a BSOD within a minute of starting the test. All the other "stress" tests that I've done until now were fine.

I found that funny.
#50
arestavo
Re: Trying to track down a BSOD culprit 2022/10/19 16:44:10 (permalink)
ZoranC
Didn't you say you tested with XTU for 4 hours straight without any issues??? What has changed?
 
If I were you I would go back to basics: BIOS settings at default multipliers and values for voltages (no overclocking), AVX offset 3/5, and test that memory with XMP on extensively.


Oh no, that was several hours of testing OCCT and AIDA64. And yes, that was 100% stock settings with RAM speed set to 2933 where I got that BSOD in under a minute with Intel's XTU stress test.
#51
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/19 17:42:01 (permalink)
arestavo
ZoranC
Didn't you say you tested with XTU for 4 hours straight without any issues??? What has changed?

Oh no, that was several hours of testing OCCT and AIDA64.

This...
arestavo
One interesting note, as I sit here with Intel's XTU stress testing AVX (because Star Citizen uses AVX) for 4 hours straight with no issues ...

... is why I thought you said you already tested with XTU without issues.
 
arestavo
And yes, that was 100% stock settings with RAM speed set to 2933 where I got that BSOD in under a minute with Intel's XTU stress test.

Manually setting RAM speed is not a 100% stock setting. When I had problems with Corsair memory I tried manually setting the memory speed, and that resulted in even more frequent problems. Clear your CMOS and configure the following:
 
{Boot}
Fast Boot : On -> Off
 
{CPU}
CPU Multiplier Control : Auto -> Manual - PerCore
Per Core OC Ratio : X -> 43 (CPU dependent, all core boost for i9-10900X/10920X is 4.3GHz)
Core 1 to ‘n’ Ratio : X -> 43 for i9-10900X/10920X (CPU dependent)
Mesh Ratio : Auto -> 24
AVX2 Negative Offset : X -> 3 for i9-10900X/10920X (CPU dependent)
AVX3 Negative Offset : X -> 5 for i9-10900X/10920X (CPU dependent)
 
{Memory}
Memory Profiles : Automatic -> XMP Profile 1
Force Memory Retraining : Disabled -> Enabled
 
{CPU Configuration}
EIST : Enabled
EIST>Turbo Mode : Enabled
CPU C states : Auto -> Disabled (Enhanced C1 disabled)
MSR Lock Control : Enabled
Hyper-Threading : Enabled
Virtualization Technology : Enabled
Intel Virtualization Technology For Directed I/O : Disabled -> Enabled
CPU TjMax : Auto -> 86C (CPU dependent)
Intel Turbo Boost 3.0 Driver Support : Enabled -> Disabled
 
{PCIe Configuration}
Above 4G Decoding : Disabled
Re-Size BAR Support : Disabled
PE1-PE6 Speed : Auto -> Gen3
 
Leave everything else at -default- after CMOS clearing.
 
Test memory using PassMark's MemTest86 -PRO-, booting from a -USB flash drive-. Booting from flash is the only way to thoroughly test practically all of the memory. The free version doesn't include the "hammer" test, and it was the hammer test that revealed issues with my Corsair memory. Also do at least 4 passes, ideally 8 or more. At default settings the first pass doesn't run everything, and errors sometimes wouldn't pop up until the second or third pass.
 
Only once the memory has a clean bill of health at these settings, move on to Windows-based tests. Put Windows on the Ultimate performance plan, and modify the plan:
USB settings > USB selective suspend setting: Enabled -> Disabled
PCI Express > Link state power management: Off
 
Set the Nvidia driver to prefer maximum performance and read up on debug mode. Consider putting your 3090's BIOS switch in a different position.
 
Repeat your XTU tests, testing one instruction set at a time. Start with regular, then move on to AVX2 and then AVX3.
 
And consider trying out a G.Skill memory model that has been certified for your mb.
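The two Windows power-plan changes listed in this post (USB selective suspend disabled, PCI Express link state power management off) can also be made from the command line with `powercfg`. The Python sketch below only builds and prints the commands unless told otherwise. The GUIDs are the ones commonly documented for these settings, and the value index 0 is assumed to mean "Disabled"/"Off"; both are assumptions that should be confirmed with `powercfg /query` on the machine in question. It also assumes the Ultimate performance plan is already the active plan, so it edits `SCHEME_CURRENT`. The helper names are hypothetical.

```python
import subprocess
import sys

# Assumption: commonly documented GUIDs; verify with `powercfg /query`.
SUB_USB = "2a737441-1930-4402-8d77-b2bebba308a3"
USB_SELECTIVE_SUSPEND = "48e6b7a6-50f5-4782-a5d4-53bb8f07e226"
SUB_PCIEXPRESS = "501a4d13-42af-4429-9fd1-a8218c268e20"
PCIE_LINK_STATE = "ee12f906-d277-404b-b6da-e5fa1a576df5"


def build_commands(scheme="SCHEME_CURRENT"):
    """Return the powercfg invocations for the two plan changes."""
    settings = (
        (SUB_USB, USB_SELECTIVE_SUSPEND),   # USB selective suspend -> Disabled
        (SUB_PCIEXPRESS, PCIE_LINK_STATE),  # Link state power management -> Off
    )
    # Index 0 is assumed to be "Disabled"/"Off" for both settings.
    cmds = [["powercfg", "/setacvalueindex", scheme, sub, setting, "0"]
            for sub, setting in settings]
    # Re-activating the scheme applies the edited values.
    cmds.append(["powercfg", "/setactive", scheme])
    return cmds


def apply_commands(cmds, dry_run=True):
    """Print the commands; run them only on Windows when dry_run is False."""
    for cmd in cmds:
        if dry_run or sys.platform != "win32":
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


apply_commands(build_commands())
```

Running it as-is only prints three command lines; passing `dry_run=False` from an elevated prompt on Windows would actually apply them.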
#52
arestavo
Re: Trying to track down a BSOD culprit 2022/10/19 19:33:01 (permalink)
Sorry, the stress test (top one) was different than the AVX test that I ran in XTU. The AVX one was fine for 4 straight hours, however the regular stress test (which I hadn't tried before) popped the BSOD in under a minute.
 
If I get the chance this weekend I'll see about testing those settings - though I do not have the pro version of MemTest86.
#53
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/19 20:21:52 (permalink)
arestavo
The AVX one was fine for 4 straight hours, however the regular stress test (which I hadn't tried before) popped the BSOD in under a minute.

If the stress test of the regular instruction set, which is usually less demanding than the AVX2/3 instruction sets, pops the error but AVX2/3 doesn't, then either:
 
a) the AVX2/3 test success was just pure luck, or
b) something is off about your regular instruction frequencies/voltages.
 
What frequency do Task Manager and XTU say your CPU cores are at when running the regular stress test vs. the AVX2/3 one? Don't forget to watch for things like power and temperature throttling while running XTU.
 
In my experience MemTest86 Pro is worth its price.
#54
ZoranC
Re: Trying to track down a BSOD culprit 2022/10/19 20:25:49 (permalink)
Oops, I just caught a mistake: you have a 10940X and I have a 10920X, so your testing multipliers should be:
 
Per Core OC Ratio : X -> 41
Core 1 to ‘n’ Ratio : X -> 41
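As a quick sanity check on what these ratios mean in practice, the arithmetic can be sketched in Python. This assumes a 100 MHz base clock (BCLK) and that the AVX2/AVX3 negative offsets of 3 and 5 from the earlier settings list are kept, which that list marks as CPU dependent; the function name is hypothetical.

```python
BCLK_MHZ = 100  # assumption: stock base clock


def effective_mhz(ratio, avx_offset=0, bclk_mhz=BCLK_MHZ):
    """Core frequency for a given multiplier, minus any AVX negative offset."""
    return (ratio - avx_offset) * bclk_mhz


# Ratio 41 with the offsets from the settings list (0 = non-AVX load).
for label, offset in (("non-AVX", 0), ("AVX2", 3), ("AVX3", 5)):
    print(f"{label}: {effective_mhz(41, offset)} MHz")
# prints:
# non-AVX: 4100 MHz
# AVX2: 3800 MHz
# AVX3: 3600 MHz
```

These are the frequencies to compare against what Task Manager and XTU report during each test; a large mismatch would suggest throttling or a setting that did not take effect.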
#55