Hi,
I've also submitted a tech support request about this, as I'm running out of things to try. I'm running 2 x GTX 980 ACX 2.0 in SLI, and have been for the last 4 months or so. The PSU is an EVGA 1000W P2. Last night I started getting 0x00000116 BSODs. I could boot into Windows with SLI enabled, but as soon as I did anything even close to gaming, it would crash again. My monitor is plugged into Card #1, and disabling SLI resolved the issue (except that obviously I'm now only using 1 card). Plugging the monitor into the second card with SLI disabled resulted in a BSOD. I thought this would be easy to troubleshoot, since it's obviously Card #2 that is the problem. So this morning I did some further digging to isolate it, but I can't get the crash to happen with just 1 card installed in the system. Any time I enable SLI and try to game, it crashes.
Here's what I've tried with both cards:
- Replacing SLI bridge
- Swapping which card is in which PCIe slot
- Swapping the PCIe power cables between the two cards
- Clean install of all drivers
None of the above worked, so I moved on to finding which specific card was the problem. I took out Card #2 and ran Furmark against Card #1. No problems. I also ran BarsWF (Google it), a fairly old program that lets me run a heavy CUDA workload and specify which GPU to run it on, and no issues were found. Then I took out Card #1, put Card #2 in the same slot with the same power cables, and re-ran both tests. No problems at all. At this point I figured that maybe it just needed reseating or something, so I put Card #2 back in its own slot with its own power cables. All tests passed again.
Then I put Card #1 back in and reconnected the SLI bridge. Instant BSOD on opening a game. So I removed the SLI bridge and booted without it, which leaves both GPUs connected to the system but no SLI available. BarsWF can still use both cards, since it gives each one a workload individually. I passed it a --gpu_mask 1 parameter (stress Card #1), and it worked without issues. Then I tried --gpu_mask 2 and got an instant BSOD. I then switched Card #1 and Card #2 around, and this time --gpu_mask 1 gave the BSOD, which again points to Card #2 as the problem.
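For anyone following along with the --gpu_mask values, here is a rough Python sketch of how I read them. It assumes the value is a plain bitmask where bit 0 selects Card #1 and bit 1 selects Card #2 (so 3 would mean both); that matches the 1 and 2 I used above, but the bitmask reading is my assumption, not something taken from the BarsWF documentation. The function name gpus_in_mask is made up for illustration.

```python
def gpus_in_mask(mask: int) -> list[int]:
    """Return the 1-based card numbers selected by a bitmask.

    Assumption (not confirmed against BarsWF itself): bit 0 is Card #1,
    bit 1 is Card #2, and so on.
    """
    if mask < 0:
        raise ValueError("mask must be non-negative")
    cards = []
    bit = 0
    # Walk the bits from least significant upward until none remain.
    while mask >> bit:
        if (mask >> bit) & 1:
            cards.append(bit + 1)
        bit += 1
    return cards


if __name__ == "__main__":
    # The two values used in the tests above, plus the assumed "both cards" value.
    for mask in (1, 2, 3):
        print(f"--gpu_mask {mask} -> cards {gpus_in_mask(mask)}")
```

Under that assumption, swapping the cards between slots changes which physical card each mask value hits, which is why the BSOD moved from --gpu_mask 2 to --gpu_mask 1.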
The problem is, I can never seem to reproduce the crash when just 1 card is in the system! I haven't done any recent BIOS updates or anything of that nature. The NVIDIA drivers I'm using (350.12) have been on my system since their release. I've also tried plugging the cards into several different connectors on the PSU. But I don't think it's a power problem, as the crash occurs immediately when the BarsWF test is running against just 1 GPU while 2 are installed in the system (and the other card isn't under any load). I also feel like it could be some driver issue, but why would it only start happening now?
Any help is much appreciated.
post edited by Pet0r - 2015/05/03 05:55:22