2020/09/30 23:43:38
flyinion
I've been randomly having Folding units cause blue screen restarts off and on since I started Folding back in March.  After finding out early that others had had problems with bad work units, I always just wrote it off as that.  Today I actually caught one of the crashes (usually it's overnight) and noticed the displays went all corrupted, with alternating light/dark rectangles and graphical anomalies covering the desktops.
 
It was suggested that I run the OCCT GPU tests, so I did.  The 3D test ran for 45 minutes with no issues.  The VRAM test causes the issue in seconds.  I tried underclocking the VRAM by 500 and it lasted nearly 6 minutes and 12 passes, then it happened again.
 
I'm not sure if that's just an OCCT issue, a driver issue (running 456.38), or even potentially an issue caused by running a PCIE extender due to vertical mounting.  The extender is the Phanteks 220mm with the extra shielding and the new design where it's split into a number of individual sections.  I can't test without it since I'm watercooled.  At least not without draining the loop, disassembling the GPU section, and finding some extra tubing to reconfigure it.
 
I've had zero issues in real content like gaming, benchmarks like Firestrike, etc.  Just trying to figure out if I have a real issue, or the FTW3 cards are just pushed to the limit and high stress like Folding or OCCT is able to push it past that limit.  Anyone have any thoughts?
 
edit:  just to add, even with the OCCT crashes, it's not something like the card just went today or anything.  After the OCCT issue it actually ran a large Folding unit through to completion no problems.  So it seems to be just extreme high stress that causes it.
2020/10/01 00:35:15
NobleNomad10
How is the Hydro Copper this series?
2020/10/01 09:34:58
Sajin
Riser could be the issue. I’d suggest testing without one.

I also posted the following in one of your threads... https://forums.evga.com/FindPost/3060562
2020/10/01 13:57:49
flyinion
Sajin
Riser could be the issue. I’d suggest testing without one.

I also posted the following in one of your threads... https://forums.evga.com/FindPost/3060562



Thanks, unfortunately I can't test without the riser, not without a LOT of work to reconfigure since it's watercooled.  I can't just remove the block and leave it hanging off and slap an air cooler back on and stick it directly in the slot either since it's a factory hydrocopper card so I don't have an air cooler for it.
2020/10/01 18:05:09
flyinion
Not sure if this helps.  I tried the latest driver, thinking maybe it was a driver issue as it seemed to start happening more frequently again since the one from mid-September got installed.  Didn't fix the OCCT problem so far, but I managed to get some event log errors from it this time.  It's stuff like this, which from what I could find potentially also points to driver issues as much as hardware issues.  For software issues it was pointed out that it occurs under very high stress loads which is my case.  Again, zero issues with games, benchmarks, etc.  Wondering if I should just contact EVGA about RMA anyway but if I do what the heck am I even going to end up with as a replacement since all the 2080's are out of stock and same with the 3xxx cards.
 
These all have nvlddmkm as the Source and EventID 13.
The first error is this one, though, with EventID 14:
\Device\Video3
20df(3268) 00000000 00000000
 
EventID 13:
\Device\Video3
Graphics Exception: ESR 0x52df30=0xb030020 0x52df34=0x4 0x52df28=0x4c1eb72 0x52df2c=0x174
 
\Device\Video3
Graphics SM Global Exception on (GPC 5, TPC 3, SM 0): Multiple Warp Errors
 
\Device\Video3
Graphics SM Global Exception on (GPC 5, TPC 2, SM 0): Multiple Warp Errors
 
\Device\Video3
Graphics SM Warp Exception on (GPC 5, TPC 2, SM 0): MMU NACK Errors
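
For anyone comparing their own logs, here is a minimal Python sketch that tallies which GPC/TPC/SM locations show up in messages like the ones above. The regular expression is only inferred from the three "Graphics SM ... Exception" lines pasted in this post, not from any documented NVIDIA format, and the function name `tally_sm_exceptions` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the pasted messages only (an assumption), e.g.
# "Graphics SM Global Exception on (GPC 5, TPC 3, SM 0): Multiple Warp Errors"
SM_EXCEPTION = re.compile(
    r"Graphics SM (?P<kind>\w+) Exception on "
    r"\(GPC (?P<gpc>\d+), TPC (?P<tpc>\d+), SM (?P<sm>\d+)\): (?P<detail>.+)"
)


def tally_sm_exceptions(lines):
    """Count SM exception messages per (GPC, TPC, SM) location."""
    counts = Counter()
    for line in lines:
        match = SM_EXCEPTION.search(line)
        if match:  # lines in any other format are skipped
            location = (int(match["gpc"]), int(match["tpc"]), int(match["sm"]))
            counts[location] += 1
    return counts


# The three SM exception messages quoted in this post.
sample = [
    "Graphics SM Global Exception on (GPC 5, TPC 3, SM 0): Multiple Warp Errors",
    "Graphics SM Global Exception on (GPC 5, TPC 2, SM 0): Multiple Warp Errors",
    "Graphics SM Warp Exception on (GPC 5, TPC 2, SM 0): MMU NACK Errors",
]

for location, count in sorted(tally_sm_exceptions(sample).items()):
    print(location, count)
```

On the three lines above, everything lands in GPC 5. With a sample this small that is only an observation, not evidence of where a fault actually is.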
 
2020/10/01 18:07:23
flyinion
Interesting, I just found some info searching for that first error.  It could actually be an ongoing AMD/Nvidia conflict issue.
 
2020/10/01 20:11:41
Sajin
It's a faulty card... https://forums.evga.com/2080-Ti-XC-ULTRA-Random-crashes-in-several-games-m3063964.aspx
 
I'm sure you'll get another 2080 at some point.
2020/10/01 20:42:39
flyinion
Sajin
It's a faulty card... https://forums.evga.com/2080-Ti-XC-ULTRA-Random-crashes-in-several-games-m3063964.aspx
 
I'm sure you'll get another 2080 at some point.


Yeah, I may have to. It's just weird since it doesn't do it in anything except random Folding units or that VRAM test in OCCT (which apparently had issues at some point anyway).
2020/10/01 22:00:30
flyinion
Here's what I see when the issue happens.  Anyway, I'm sending support a message and we'll go from there.  I just did 2 hours of WoW with streaming with NVenc2 and still had no issues.  Also, I'm still wondering if it's somehow a driver issue or, as some forum posts I've read have suggested, a possible AMD AGESA issue and something with the PCIE bus not responding properly.  When I induced it to get the corruption pic, Windows actually recovered quickly twice, which is an improvement since putting the latest (non-hotfix) driver on early this evening.  You can see in the OCCT window where DWM recovered but lost all the taskbar icons.
 
Attached Image(s)

2020/10/01 22:11:10
Sajin
If it were a driver issue, you could make the issue go away by installing a really old driver. If it's an AMD AGESA issue, you could fix it by updating/downgrading the version. You could also test the card in another computer that isn't an AMD system. I'm guessing underclocking both the core & memory to max negative clocks inside MSI Afterburner did nothing?
