EVGA

RTX 2080 ti FTW3 Ultra Crashing

Author
InArduaTendit
New Member
  • Total Posts : 10
  • Reward points : 0
  • Joined: 2019/05/14 01:07:23
  • Status: offline
  • Ribbons : 0
2019/05/14 18:54:45 (permalink)
I built my computer a little under 2 months ago. For the first few weeks everything was fine, but now I have been dealing with crashes of increasing frequency. When it crashes it just instantly shuts off and reboots: no stalling, no blue screen, just an instant reboot. No dump files are created, and Event Viewer only shows an Event ID 41 Kernel-Power critical error, "The system has rebooted without cleanly shutting down first.", with a bugcheck code of 0. Also, occasionally it would fail to boot, stalling during "Installation of PCH Runtime Services" while the motherboard debug LED lit up for a VGA error (GPU not detected or failed), but it would boot correctly after turning it off and back on.

It started crashing maybe one out of every 10 times I launched Mass Effect Andromeda, always at roughly the same spot during the loading screen. Then it started crashing every time it loaded. Then other games (Sniper Elite 4, Dying Light, Destiny 2) started crashing, also during loading screens. I updated drivers and re-seated all hardware; it still crashed. I flashed the motherboard BIOS and clean installed Windows; it still crashed. Now it has started crashing during gameplay, not just on loading screens. But the crashes are not predictable, and sometimes it will go for several hours without issue.

Monitoring software shows that most temperatures are well under their limits, with the GPU 1 and GPU 2 sensors rarely going above 60C. The only outliers are the Mem2 and Mem3 modules, which are routinely much hotter than the rest of the card. For instance, when Mem1 is at 48C, Mem2 and Mem3 are at 66C and 68C respectively, and that is after setting a much more aggressive fan curve in Precision X1 (30% at 30C, 60% at 50C and 100% at 70C on all 3 fans); they were even hotter with the default curve. My CPU never really goes above 50C.
I have run Fire Strike stress testing and gotten 98.7% stability, it stays totally stable during Prime95 torture testing, I ran four passes of MemTest86 with no errors and chkdsk with no errors, UserBenchmark shows every component working above expectations, and I ran pretty much everything else I can think of and found no issues. Voltages seem pretty stable, but (and I'm not sure that this matters) as soon as I am under load I get a PerfCap reason of VRel and sometimes Pwr, VRel. Unfortunately I have neither an extra GPU to swap in nor another computer I can put this GPU into, so I am limited to testing with my current equipment. Nothing in my system is overclocked (except for XMP on the RAM, and disabling XMP did not stop the crashing), even though the Scan feature of Precision X1 says I could set it to +80.

I am pretty much at a loss for what to do. The only hint I have gotten is the VGA error during "Installation of PCH Runtime Services", which points to the GPU. Do I have a faulty GPU? Is there something else I am missing? What do I do next? Any help would be appreciated. If there is any information I forgot to supply, or any tests I should run and post results of, I would be happy to.
 
These are my system specs
Case: CoolerMaster h500m Mesh
CPU: Intel i7 9700K
CPU Cooler: EVGA CLC 280
GPU: EVGA 2080 ti FTW3 Ultra
RAM: 16GB (2x8GB) G.Skill TridentZ RGB 3200 cl16
HD: 1TB Samsung 970 EVO Plus and 500GB Samsung 860 EVO 
PSU: Corsair AX860 860w 80+ Platinum
Motherboard: MSI MEG Z390 ACE
 
EDIT: Included motherboard on system specs
post edited by InArduaTendit - 2019/05/14 19:58:51
#1


    AHowes
    CLASSIFIED ULTRA Member
    • Total Posts : 6502
    • Reward points : 0
    • Joined: 2005/09/20 15:38:10
    • Location: Macomb MI
    • Status: offline
    • Ribbons : 27
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/14 19:07:11 (permalink)
    Is the PSU new? That's usually the cause of what you're experiencing.

    Intel i9 9900K @ 5.2Ghz Single HUGE Custom Water Loop.
    Asus Z390 ROG Extreme XI MB
    G.Skill Trident Z 32GB (4x8GB) 4266MHz DDR4 
    EVGA 2080ti K|NGP|N w/ Hydro Copper block.  
    34" Dell Alienware AW3418DW 1440 Ultra Wide GSync Monitor
    Thermaltake Core P7 Modded w/ 2x EK Dual D5 pump top,2 x EK XE 480 2X 360 rads.1 Corsair 520 Rad.
    #2
    InArduaTendit
    New Member
    • Total Posts : 10
    • Reward points : 0
    • Joined: 2019/05/14 01:07:23
    • Status: offline
    • Ribbons : 0
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/14 19:19:19 (permalink)
    Yes and no. The PSU is certified factory refurbished, but it was purchased at the same time as the other components (March) and has only been used in this computer. Could the PSU cause the VGA failure on boot? And are there any signs I could look for in the voltages that would indicate a PSU issue? Voltages seem stable (for instance, the +12V rail stays between 12.000V and 12.096V). Is there any way to test the PSU without just buying a second one and swapping it in?
    #3
    InArduaTendit
    New Member
    • Total Posts : 10
    • Reward points : 0
    • Joined: 2019/05/14 01:07:23
    • Status: offline
    • Ribbons : 0
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/14 19:26:39 (permalink)
    Also, just now I added GPU Board Power to the HWM in Precision X1 and the readings seem weird, since the computer is basically idle other than using this forum. Here is a screenshot. If it's too small, it is maxing out at 823.
     
    This is a screenshot of the max and min values after FurMark benchmarking too, so you can see the voltages.
     
    #4
    Sajin
    EVGA Forum Moderator
    • Total Posts : 49088
    • Reward points : 0
    • Joined: 2010/06/07 21:11:51
    • Location: Texas, USA.
    • Status: offline
    • Ribbons : 199
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/14 21:32:10 (permalink)
    The PSU or the video card could cause this.
    #5
    InArduaTendit
    New Member
    • Total Posts : 10
    • Reward points : 0
    • Joined: 2019/05/14 01:07:23
    • Status: offline
    • Ribbons : 0
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/14 23:55:47 (permalink)
    So my best (or at least cheapest) option is probably to replace the PSU, and if it keeps happening after that then I'll know it's the video card? Or is there a way to test the PSU without replacing it that I don't know of? If I have Precision X1 logging stats to file when a crash happens, is there anything specific I can look for in the voltages, power draw, etc. immediately prior to the crash that would help determine which one is causing it? Or is the only way to swap them out and see if that helps? My reluctance to swap-test them is based on (a) it costing me quite a bit of money, and (b) the fact that it can go hours or days without crashing, even during long gaming sessions. Since nothing reliably makes it crash, and it doesn't crash during stress testing or benchmarking, it will be hard to tell quickly whether the problem is fixed after a swap, unless there are anomalies in the monitoring readouts that I could watch for immediate changes.
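One way to look at a log after a crash is to compare the last few samples before the file ends with everything recorded earlier. The sketch below assumes a plain CSV with a header row; the column names (`v12`, `board_power_w`) and the sample data are invented for illustration, and the real Precision X1 log layout may well differ. The helper name `tail_vs_baseline` is also made up.

```python
import csv
import io

# Hypothetical log excerpt; real column names and units may differ.
SAMPLE_LOG = """time,v12,board_power_w
0,12.05,210
1,12.04,215
2,12.05,212
3,12.03,260
4,11.80,310
"""

def tail_vs_baseline(csv_text, column, tail_rows=5):
    """Compare the last `tail_rows` samples of `column` with all earlier samples."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Skip rows where the field is missing or empty, e.g. a final line
    # that was cut short when the machine lost power.
    values = [float(r[column]) for r in rows if r.get(column)]
    if len(values) <= tail_rows:
        raise ValueError("not enough samples to split into baseline and tail")
    baseline, tail = values[:-tail_rows], values[-tail_rows:]
    return {
        "baseline_min": min(baseline),
        "baseline_max": max(baseline),
        "tail_min": min(tail),
        "tail_max": max(tail),
    }

if __name__ == "__main__":
    print(tail_vs_baseline(SAMPLE_LOG, "v12", tail_rows=2))
```

A tail that sits well outside the baseline range would be worth a closer look. If the logging interval is long compared with how fast the reboot happens, the log may simply not capture the event, so a clean tail does not rule anything out.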
    #6
    Sajin
    EVGA Forum Moderator
    • Total Posts : 49088
    • Reward points : 0
    • Joined: 2010/06/07 21:11:51
    • Location: Texas, USA.
    • Status: offline
    • Ribbons : 199
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/15 01:09:32 (permalink)
    Pretty much, but you could use a digital multimeter to see if the 12V rail drops out of spec when the crashes occur. That would point to a faulty PSU. Are you powering the card with two separate PCIe power cables? If not, do that first, as it could be the problem.
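For reference, "in spec" for the +12V rail is commonly taken as the ATX tolerance of plus or minus 5%, i.e. roughly 11.40V to 12.60V. A minimal sketch of that check follows; the function name `rail_in_spec` is made up, and the 11.2V reading is a hypothetical bad value, not something measured in this thread.

```python
NOMINAL_12V = 12.0
TOLERANCE = 0.05  # ATX tolerance on the +12V rail, assumed here as +/-5%

def rail_in_spec(volts, nominal=NOMINAL_12V, tolerance=TOLERANCE):
    """Return True if a reading lies within nominal +/- tolerance."""
    low = nominal * (1 - tolerance)
    high = nominal * (1 + tolerance)
    return low <= volts <= high

if __name__ == "__main__":
    # 12.000 and 12.096 are the software readings quoted earlier in the thread;
    # 11.2 is a hypothetical out-of-spec reading for comparison.
    for reading in (12.000, 12.096, 11.2):
        status = "in spec" if rail_in_spec(reading) else "OUT OF SPEC"
        print(f"{reading:.3f} V: {status}")
```

Readings taken with a multimeter at the connector are generally more trustworthy for this than motherboard sensor values reported by software.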
    #7
    InArduaTendit
    New Member
    • Total Posts : 10
    • Reward points : 0
    • Joined: 2019/05/14 01:07:23
    • Status: offline
    • Ribbons : 0
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/15 01:51:23 (permalink)
    Yes, I am using two PCIe 6+2 pin power cables for the GPU and two 4+4 pin power cables for the CPU, so power delivery to the card and board shouldn't be an issue (unless something is wrong with the hardware, of course). If I do swap-test the PSU, should I step up to a higher wattage? I figured that 860W would be plenty, since even at load my system really shouldn't be drawing more than 600W (a system configurator estimated 530W peak, but I wouldn't be surprised if it went a bit over that), but I am in no way an expert.
     
    EDIT: Here is a picture of my system up and running (I only removed the glass to take the picture without reflections; I don't run the case open). The 25 in the top right corner is my CPU temp, and the LEDs on the GPU are configured to turn from blue to red as temps increase (I know it's not perfectly accurate, but it's nice to have a quick visual indicator).

    post edited by InArduaTendit - 2019/05/15 02:01:37
    #8
    Sajin
    EVGA Forum Moderator
    • Total Posts : 49088
    • Reward points : 0
    • Joined: 2010/06/07 21:11:51
    • Location: Texas, USA.
    • Status: offline
    • Ribbons : 199
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/15 02:00:36 (permalink)
    860W is enough for your system.
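The headroom can be worked out from the figures already quoted in the thread (530W configurator estimate, 600W as the upper guess, 860W rated capacity). This is only sustained-load arithmetic, and the helper name `load_fraction` is made up for the example.

```python
PSU_WATTS = 860  # rated capacity of the PSU in the thread

def load_fraction(draw_watts, capacity_watts=PSU_WATTS):
    """Return the fraction of rated PSU capacity used by a given draw."""
    return draw_watts / capacity_watts

if __name__ == "__main__":
    for label, draw in (("configurator estimate", 530), ("upper guess", 600)):
        print(f"{label}: {draw} W is {load_fraction(draw):.0%} of {PSU_WATTS} W")
```

That works out to roughly 62% and 70% of rated capacity.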
    #9
    InArduaTendit
    New Member
    • Total Posts : 10
    • Reward points : 0
    • Joined: 2019/05/14 01:07:23
    • Status: offline
    • Ribbons : 0
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/15 02:21:26 (permalink)
    Thanks, I figured that was the case but it's good to hear it from someone else. So I guess the next step is to replace the PSU and then run it as hard as I can to see if I can get it to crash.

    One interesting thing is that I have not yet had a crash since I set the custom aggressive fan curve (I only set it earlier today, when I noticed the elevated temps on the Mem2 and Mem3 modules). Is it possible that those memory modules were, even briefly, exceeding their safe operating temperature, either while loading games before the default fan curve kicked in enough to cool them, or during long periods under load when the GPU package temps were too low to speed the fans up enough? Granted, I've only used it for a couple of hours with the new curve, and even less time in games, but it made me think: maybe those modules were heating up really quickly before the second two fans even kicked on, or at least before they got to a speed where they could cool them adequately under long loads. (Am I wrong in thinking that loading screens are the most intensive times for memory modules and RAM?) I guess I'll know for sure if it crashes again before I get the replacement PSU, since I now have those temps logging to file constantly.
    #10
    Sajin
    EVGA Forum Moderator
    • Total Posts : 49088
    • Reward points : 0
    • Joined: 2010/06/07 21:11:51
    • Location: Texas, USA.
    • Status: offline
    • Ribbons : 199
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/15 02:26:20 (permalink)
    No problem. No, your mem temps were fine. The memory is fine up to 95C.
    #11
    InArduaTendit
    New Member
    • Total Posts : 10
    • Reward points : 0
    • Joined: 2019/05/14 01:07:23
    • Status: offline
    • Ribbons : 0
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/15 02:41:47 (permalink)
    Thank you for everything, I'll go ahead with replacing the PSU and will report back after that's up and running for long enough to see if the problem is fixed. I want you to know how much I appreciate the time you are taking to read and respond, it really is above and beyond. So thank you again!
    #12
    AHowes
    CLASSIFIED ULTRA Member
    • Total Posts : 6502
    • Reward points : 0
    • Joined: 2005/09/20 15:38:10
    • Location: Macomb MI
    • Status: offline
    • Ribbons : 27
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/15 08:15:44 (permalink)
    Just one thing to point out about the fan curves. When you go to set up a custom curve, remember that you're setting each fan's curve against a specific component's temperature: the GPU fan follows the GPU temp, the memory fan follows one of those chips on the GPU (no idea which one it's sensing from; some chips run much hotter than others), and the power fan follows the VRM.

    So use the ICX button to check those temps and set the curves accordingly.

    I like to get the fans moving at max on load so they can do their job. So 80% at 40C for the GPU and 100% at 45C and up. This would be more effective if the other 2 fans worked the same way, since the heatsink is all connected to the GPU die.

    It would be best if we could sync it all to the GPU temp, but I don't think that works. So you need to watch what the load temps are for the memory and power areas to make sure those fans ramp up as well to help cool the GPU.

    If you have a 2nd monitor, have PX1 open on that monitor with ICX open to watch those temps while gaming or benchmarking, and then adjust the fan profile accordingly.

    Basically, have the fans running at something like 30-50% at idle to keep the temps in check before they ramp up to help the overall load temps.

    Don't let everything cook and then expect to be able to control the temps on air once it's already sitting well over 50-60C at idle.

    #13
    InArduaTendit
    New Member
    • Total Posts : 10
    • Reward points : 0
    • Joined: 2019/05/14 01:07:23
    • Status: offline
    • Ribbons : 0
    Re: RTX 2080 ti FTW3 Ultra Crashing 2019/05/15 08:46:05 (permalink)
    That was my issue with the default curve: it only ran one fan at 20% when idle, and the other two didn't turn on until it was pretty warm, and even then they spun fairly slowly. Now I have all 3 set to spin at 30% while idle, and all 3 start ramping up as soon as it hits 30C, running at 100% at 70C and above. I don't have a second monitor, but I have all the ICX temps logging to file so I can see what they were under load. With my current fan curve, the hottest any of the sensors gets under full load is 68C on the Mem3 module; everything else is considerably lower. Since Sajin said they are safe up to 95C, I don't think I need to make the curve any more aggressive. I know cooler is always better, but less spinning does mean a longer lifespan for the bearings and less dust being forced through the card unnecessarily (obviously the case is filtered and configured for positive pressure, but some dust always gets through; it's not like the mesh filters are HEPA quality).
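For reference, the curve described in this thread (30% at 30C, 60% at 50C, 100% at 70C) can be sketched as a piecewise-linear function. This assumes the fan speed is interpolated linearly between set points and held flat outside them; whether Precision X1 interpolates exactly this way is not confirmed here, and the name `fan_percent` is made up for illustration.

```python
# Set points from the thread: (temperature in C, fan speed in %).
CURVE = [(30, 30), (50, 60), (70, 100)]

def fan_percent(temp_c, curve=CURVE):
    """Return fan speed in % for a temperature, assuming linear interpolation."""
    # Below the first point hold the idle speed; above the last hold the maximum.
    if temp_c <= curve[0][0]:
        return float(curve[0][1])
    if temp_c >= curve[-1][0]:
        return float(curve[-1][1])
    # Find the segment that contains temp_c and interpolate within it.
    for (t0, s0), (t1, s1) in zip(curve, curve[1:]):
        if t0 <= temp_c <= t1:
            return s0 + (s1 - s0) * (temp_c - t0) / (t1 - t0)

if __name__ == "__main__":
    for t in (25, 40, 60, 68, 75):
        print(f"{t}C -> {fan_percent(t):.0f}%")
```

Under that assumption, the 68C peak seen on Mem3 would correspond to a fan speed of about 96% if the fan were tracking that sensor.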
    #14