EVGA

Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards.

Author
Feklar
Superclocked Member
  • Total Posts : 152
  • Reward points : 0
  • Joined: 2007/05/08 13:01:44
  • Location: Foxtrot Uniform
  • Status: offline
  • Ribbons : 1
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 02:18:39 (permalink)
yaggaz
Rewire92

A firmware update should be able to resolve this.




Wow thanks for figuring all this out.  If they could fix with a firmware update, would that be a case of limiting the card's performance to achieve safety?


It better not be considered a solution if it limits the card's performance. This is not why we buy FTW3 cards at a higher cost. I wouldn't accept that.
#31
Rewire92
New Member
  • Total Posts : 46
  • Reward points : 0
  • Joined: 2018/09/06 00:21:39
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 03:17:26 (permalink)
Feklar
yaggaz
Rewire92

A firmware update should be able to resolve this.




Wow thanks for figuring all this out.  If they could fix with a firmware update, would that be a case of limiting the card's performance to achieve safety?


It better not be considered a solution if it limits the card's performance. This is not why we buy FTW3 cards at a higher cost. I wouldn't accept that.


It shouldn't limit the performance at all, as the voltage limits *should* only apply for the lower power clock states.  Remember, the card has issues at LOW power states, not high power states.

An update: no crashes experienced since setting a voltage limit of 1.062V.  Running an overclock of +120 Core/+1250 Memory. Runs Cyberpunk at 2070 MHz/11000 MHz on Ultra without ray tracing at 1440p at 80-95 FPS without DLSS.

Capped my FPS in LoL and Halo and the like at 163 FPS.  Still running CSGO at like 400 FPS.  No issues.

At this point I am confident this workaround fixes the issue that keeps sending FTW3 series cards back for RMA.  Haven't had any more black screens at all, no issues in games, great FPS everywhere.
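
For anyone trying to picture what the voltage limit does to the curve, here is a minimal sketch in Python. It is not an MSI Afterburner API or file format; `cap_curve` and the sample points are hypothetical, made up for illustration. The assumption is that a cap works by flattening every point above the chosen voltage down to the clock already reached at that voltage, so the boost algorithm has no reason to request anything higher.

```python
from typing import List, Tuple

Point = Tuple[float, int]  # (voltage in volts, clock in MHz)


def cap_curve(points: List[Point], v_cap: float) -> List[Point]:
    """Flatten every point above v_cap to the highest clock at or below v_cap."""
    ordered = sorted(points)
    below = [mhz for volts, mhz in ordered if volts <= v_cap]
    if not below:
        raise ValueError("cap is below the lowest point on the curve")
    ceiling = max(below)
    # Points at or below the cap are untouched; points above it are flattened.
    return [(volts, mhz if volts <= v_cap else ceiling) for volts, mhz in ordered]


# Made-up sample curve, not measured from any card.
sample = [(0.950, 1905), (1.000, 1980), (1.062, 2070), (1.081, 2100), (1.100, 2115)]
capped = cap_curve(sample, v_cap=1.062)
for volts, mhz in capped:
    print(f"{volts:.3f} V -> {mhz} MHz")
```

The points at and below 1.062V keep their clocks, which matches the claim above that the cap should not cost performance as long as the card already reaches its clocks at that voltage.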
post edited by Rewire92 - 2021/02/10 03:25:44


#32
neteng101
Superclocked Member
  • Total Posts : 153
  • Reward points : 0
  • Joined: 2018/02/03 06:56:57
  • Status: offline
  • Ribbons : 1
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 13:32:35 (permalink)
It's the transient response of the card during voltage shifts.  If you want to know more about transient response and how it affects electronic circuits, I recommend watching some of Buildzoid's videos on YouTube.
 
The really oversimplified version is that changes in state can lead to overshoot, and you can't see this in monitoring software, only on an oscilloscope.  The voltage spikes you can't see can be much higher than the safe limits, e.g. the card tries to ramp up to 1.081V, but overshoot could, say, lead it to run at 1.3V for a brief moment.  That leads to crashes at the least, and at worst will just burn out components.
 
The power delivery of the FTW3 cards can't respond fast enough to regulate the voltage changes safely - my guess is they didn't quite account for something correctly with the 3x8 power inputs, or the programming for the VRMs isn't right.  Best case, they can reprogram the firmware and a BIOS update can fix this; worst case, the card might need a redesign of its power delivery/filtering.  We already know that Nvidia pushed Ampere to the limits so much that early cards were crashing, with some cheaper cards being more prone to crashing.
 
If you're wondering why someone would be interested enough to go listen to Buildzoid's technical ramblings, it taught me a lot about LLC settings and overclocking my CPU.  I can instantly crash my otherwise stable system by forcing an extreme state transition - e.g. launching Intel XTU's benchmark test.  I was able to tweak LLC on my Z370 board enough to deal with a lot of instability, but its VRMs were never designed for the power-hungry i9-9900K I upgraded to.  What you're doing in LoL is going from a low load state to max speed/voltage, so it's a big jump in the curve - you can mask the problem by limiting the overshoot, say by setting 1.05V max, so your overshoot is lowered enough that it doesn't crash your system, but it doesn't solve the bad transient response on the card itself.
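
To put rough numbers on the masking idea, here is a toy sketch in Python. It uses the textbook step response of an underdamped second-order system as a stand-in; a real VRM depends on its controller, output filter and load, so this is not a model of the FTW3. The damping ratio `ZETA` and the idle voltage `V_IDLE` are made-up assumptions, not measurements.

```python
import math


def overshoot_fraction(zeta: float) -> float:
    """Peak overshoot of an underdamped second-order step response,
    as a fraction of the step size (textbook formula, 0 < zeta < 1)."""
    if not 0.0 < zeta < 1.0:
        raise ValueError("zeta must be between 0 and 1 (underdamped)")
    return math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta * zeta))


def peak_voltage(v_start: float, v_target: float, zeta: float) -> float:
    """Highest voltage reached while stepping from v_start to v_target."""
    return v_start + (v_target - v_start) * (1.0 + overshoot_fraction(zeta))


V_IDLE = 0.75  # assumed low-load voltage, illustration only
ZETA = 0.2     # assumed damping ratio, illustration only

peak_stock = peak_voltage(V_IDLE, 1.081, ZETA)
peak_capped = peak_voltage(V_IDLE, 1.05, ZETA)
print(f"target 1.081 V -> peak {peak_stock:.3f} V")
print(f"target 1.050 V -> peak {peak_capped:.3f} V")
```

In this model the overshoot scales with the size of the step, so a lower target gives a lower peak, but the fraction itself does not change. That is the sense in which a voltage cap masks the bad transient response rather than fixing it.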
#33
f0resight
New Member
  • Total Posts : 16
  • Reward points : 0
  • Joined: 2008/01/07 12:06:44
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 16:18:42 (permalink)
My original 3090 failed, I'm not sure what the voltages were on it.  The replacement 2114 S/N model does not go above 1.062V for the GPU stock without any tweaking in light loads.  I've tested Halo and FFXIV, not LoL yet.  I've not had any issues the last couple of weeks I've had it.
#34
ClowReed
New Member
  • Total Posts : 34
  • Reward points : 0
  • Joined: 2011/05/13 05:24:19
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 17:52:15 (permalink)
neteng101
It's the transient response of the card during voltage shifts.  If you want to know more about transient response and how it affects electronic circuits, I recommend watching some of Buildzoid's videos on YouTube.
 
The really oversimplified version is that changes in state can lead to overshoot, and you can't see this in monitoring software, only on an oscilloscope.  The voltage spikes you can't see can be much higher than the safe limits, e.g. the card tries to ramp up to 1.081V, but overshoot could, say, lead it to run at 1.3V for a brief moment.  That leads to crashes at the least, and at worst will just burn out components.
 
The power delivery of the FTW3 cards can't respond fast enough to regulate the voltage changes safely - my guess is they didn't quite account for something correctly with the 3x8 power inputs, or the programming for the VRMs isn't right.  Best case, they can reprogram the firmware and a BIOS update can fix this; worst case, the card might need a redesign of its power delivery/filtering.  We already know that Nvidia pushed Ampere to the limits so much that early cards were crashing, with some cheaper cards being more prone to crashing.
 
If you're wondering why someone would be interested enough to go listen to Buildzoid's technical ramblings, it taught me a lot about LLC settings and overclocking my CPU.  I can instantly crash my otherwise stable system by forcing an extreme state transition - e.g. launching Intel XTU's benchmark test.  I was able to tweak LLC on my Z370 board enough to deal with a lot of instability, but its VRMs were never designed for the power-hungry i9-9900K I upgraded to.  What you're doing in LoL is going from a low load state to max speed/voltage, so it's a big jump in the curve - you can mask the problem by limiting the overshoot, say by setting 1.05V max, so your overshoot is lowered enough that it doesn't crash your system, but it doesn't solve the bad transient response on the card itself.


 Nice explanation! I tried to reach out to Buildzoid about this matter, but so far he hasn't answered.
Do you think it's possible for us to limit the overshoot, with a custom BIOS or something? Or is EVGA the only one with that kind of power?
#35
TechJessica87
Superclocked Member
  • Total Posts : 104
  • Reward points : 0
  • Joined: 2021/02/08 12:20:28
  • Location: Dearborn, MI
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 19:09:05 (permalink)
Solid info, I'm sure it'll come together!
#36
theanalyzer
New Member
  • Total Posts : 5
  • Reward points : 0
  • Joined: 2021/02/06 03:52:17
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 19:12:05 (permalink)
Does mining affect cards? I'll have to go back and check (got 1x3080 FTW3 mining in my spare time)
 
Have only played Dota 2 while mining, and a bit of SOTR.
#37
Rewire92
New Member
  • Total Posts : 46
  • Reward points : 0
  • Joined: 2018/09/06 00:21:39
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 20:55:49 (permalink)
theanalyzer
Does mining affect cards? I'll have to go back and check (got 1x3080 FTW3 mining in my spare time)
 
Have only played Dota 2 while mining, and a bit of SOTR.


Funnily enough, playing a game while mining would be a proper mitigation, because the card would be locked at full voltage regardless of rendering load.


#38
jackychim
New Member
  • Total Posts : 2
  • Reward points : 0
  • Joined: 2021/01/30 05:36:56
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 21:23:22 (permalink)
Can anyone tell me how to adjust the MSI curve on the back end? It keeps jumping in alignment with the lower voltages.
#39
jackychim
New Member
  • Total Posts : 2
  • Reward points : 0
  • Joined: 2021/01/30 05:36:56
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/10 21:32:25 (permalink)
Rewire92
arestavo
1.1V is what these GPUs are rated for. Whether or not the GPU will be able to hit that due to GPU boost's algorithm is a whole different story.


Well they may be "rated" for 1.1V, but since I did this fix I found, I'm going on 6 hours with no crashing, and no voltage spikes past 1.068V.

It may not be the root of the problem, but it's certainly fixed it.

EDIT:  Also, the crashes were happening in low power states at low wattage and GPU usage.  You're running the highest performance state with full GPU usage, which has no problems, as demonstrated by my 150 hours on Cyberpunk.




Would you kindly be able to show me how to adjust the back curve of the clock speeds like in OP's screenshot?  My MSI Afterburner keeps resetting higher to match the previous power curve pivot point.
#40
Exsurgolol
New Member
  • Total Posts : 8
  • Reward points : 0
  • Joined: 2014/02/08 11:48:29
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 06:08:37 (permalink)
EVGA needs to recall or put out an emergency bios update for these cards.
I can 100% replicate this issue (older low demand games will blackscreen my system in minutes and require several reboot cycles to even output display) with my FTW3 Ultra that seems to have been progressively failing the past week.  It might be only certain cards with faulty voltage regulation/spikes at fault here (mine is 2012 China model).
#41
no00wa
New Member
  • Total Posts : 3
  • Reward points : 0
  • Joined: 2021/02/01 02:20:50
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 06:20:12 (permalink)
Has anyone had a card that isn't necessarily failing (permanently) but is suffering from degraded performance, like stuttering in games, after the fact? clockspeeds / memory speeds all seem normal.
#42
ClowReed
New Member
  • Total Posts : 34
  • Reward points : 0
  • Joined: 2011/05/13 05:24:19
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 06:36:43 (permalink)
no00wa
Has anyone had a card that isn't necessarily failing (permanently) but is suffering from degraded performance, like stuttering in games, after the fact? clockspeeds / memory speeds all seem normal.


Better yet... Someone with a perfectly working card for 2+ months in older games? Cause it feels like ALL cards have this problem, but only a few vocal folks come forward to shout about it.
Mine should arrive today and I have 7 days to return it. If it fails even once I'll return it and get a TUF, since the Strix is WAY expensive here. :/
#43
exlink
New Member
  • Total Posts : 100
  • Reward points : 0
  • Joined: 2007/04/27 20:38:59
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 06:39:09 (permalink)
I find it interesting that Jacob from EVGA mentioned that the FTW3 HydroCopper block may not be sold separately due to “technical reasons”. After the HC cards being delayed over 4 months I’m wondering if EVGA is in the middle of a PCB revision. Maybe a fix for the issues?

Would explain the HC delays and why the HydroCopper blocks may not be sold separately since they could potentially be incompatible with the first PCB version.
#44
ClowReed
New Member
  • Total Posts : 34
  • Reward points : 0
  • Joined: 2011/05/13 05:24:19
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 06:50:04 (permalink)
If they revise it, do you think they will recall the ones already sold? And what about consumers outside the US and Europe? Do they normally recall for those too? Sorry... I'm a little worried haha
#45
exlink
New Member
  • Total Posts : 100
  • Reward points : 0
  • Joined: 2007/04/27 20:38:59
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 07:40:24 (permalink)
ClowReed
If they revise it, do you think they will recall the ones already sold? And what about consumers outside the US and Europe? Do they normally recall for those too? Sorry... I'm a little worried haha

Your guess is as good as mine. If there is a PCB revision to fix any issues, but the original cards don’t cause damage to any other components and aren’t a danger to customers then I wouldn’t be surprised if EVGA doesn’t do a recall. They would probably just replace the cards if they fail and come in for RMA. But this is all speculation on my end.
#46
B0baganoosh
CLASSIFIED Member
  • Total Posts : 2365
  • Reward points : 0
  • Joined: 2009/08/04 04:27:18
  • Status: offline
  • Ribbons : 39
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 13:07:37 (permalink)
If they can make the cards safe without a noticeable performance loss (seeing as the issue, if widespread, seems to be at low power-states, not in power-hungry apps/games), I would bet they just put out a bios update that forcibly does what you've done in MSI-AB. From a business perspective, I would expect a manager to weigh the cost/benefit of that vs. a complete recall and go with the bios update followed by a PCB revision that just goes out to newer orders. It's costly to do a board spin in the middle of production, but if you can time it right for a smooth cut-in, that's ideal. This is following exlink's theory on HC delays.
 
The end result is your card works great, it doesn't fail, your most demanding games see no loss in performance, and the problem goes away...for everyone that updates their bios. Cutting in new boards eventually just removes the dependency on people to update their bios or worrying about what bios is on what cards.

#47
i.am.pekk
New Member
  • Total Posts : 2
  • Reward points : 0
  • Joined: 2020/12/17 11:04:44
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 19:14:13 (permalink)
This is very interesting. I also have an FTW3 Ultra 3090 and I just RMA'd the first one. I'm not OC-savvy enough to know much about how undervolting and overclocking work, but I can confirm I had the same black screen issue. My screens go black and the monitor stops receiving a signal. Sometimes my audio still works (game audio and Discord audio still come through). I can sometimes continue talking through open mic on Discord as well.
 
I mostly play Overwatch, where I have most settings on the minimum (to optimize for frame rate and eliminate distractions). I think I've seen OW crash once or twice at most, but usually it doesn't have any issues. Sometimes when this crash occurs the computer is able to recover so the black screen is just a flicker and my desktop returns shortly after (like switching inputs on the monitor would appear). At this point anything that uses hardware rendering seems to fail. Blizzard's bnet launcher fails to render after a recoverable crash saying that the UI can't get a 3d context (or something like that), which is often resolved by exiting and restarting the client.
 
Software that has reliably caused issues:
  • Mozilla Firefox - Assuming this is related to GPU acceleration and video playback (youtube, twitch.tv, websites that use GPU accelerated animations?)
  • League of Legends - I play at max graphical settings here, unlike OW - Crash happens mid-game
  • Resident Evil: HD Remaster - Also max settings
  • Sekiro: Shadows Die Twice - Also max settings - Seems to happen while alt-tabbing to desktop
  • DaVinci Resolve 16 - Has crashed multiple times
  • DaVinci Resolve 17 beta - Has also crashed once
My second card has only crashed once or twice so far and I haven't put in a ticket yet as it was only a day or so ago and I'm still trying to see what the issue might be to provide more details. I thought maybe I was using some bad/cheap cables and that was my issue.
 
I have 2 monitors plugged in:
  1. Dell Alienware AW2518HF 240hz (DisplayPort 1.2 cable)
  2. Elgato HD60 S Capture Card (HDMI 2.0 cable)
 
It seems like most people who have issues have multiple screens. I was wondering if the refresh rate of the screens/resolution had any correlation. i.e., the amount of data bandwidth required to run those screens?
 
I just replaced the HDMI cable on the capture card hoping that would resolve it since the capture card was also having some issues with sound dropping and video colors distorting. So it seemed likely that the capture card was somehow related.
 
Oddly League of Legends was one of the most reliable crash reproducing games. I crashed enough that I got a leave buster warning for leaving too many games and had to digitally sign an agreement to never leave another game early.
 
I've never once crashed during 3DMark Time Spy.
 
If you need more software to use to test reliability you can try the ones I mentioned. Also worth noting I often lurk in many twitch streams to help viewership so there could be multiple videos playing simultaneously, though I mute the sound through firefox or by using a separate sound output device for the web browser and mute that.
 
Has undervolting reliably fixed your issue? I saw some other posts on nvidia forums where people tried undervolting to no avail, so I never bothered trying it myself.
#48
Noodle 1
New Member
  • Total Posts : 100
  • Reward points : 0
  • Joined: 2020/12/03 14:16:42
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 19:18:59 (permalink)
Does anyone know if this is related to EVGA cards, FTW specifically, or just 3000 series in general?
#49
MatthewAMEL
Superclocked Member
  • Total Posts : 164
  • Reward points : 0
  • Joined: 2016/07/13 23:15:40
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 20:06:02 (permalink)
ClowReed
no00wa
Has anyone had a card that isn't necessarily failing (permanently) but is suffering from degraded performance, like stuttering in games, after the fact? clockspeeds / memory speeds all seem normal.


Better yet... Someone with a perfectly working card for 2+ months in older games? Cause it feels like ALL cards have this problem, but only a few vocal folks come forward to shout about it.
Mine should arrive today and I have 7 days to return it. If it fails even once I'll return it and get a TUF, since the Strix is WAY expensive here. :/



Are we talking only 3090's? I have a 3080 FTW3. Received in November. I play a lot of older games (Fallout:NV, SupCom, Homeworld Remastered). I have put hundreds of hours in on the older games since I received my card. I've never had a crash or black screen. I do not OC my card. My most recent TS: 17205, TSE: 8890.
#50
Rewire92
New Member
  • Total Posts : 46
  • Reward points : 0
  • Joined: 2018/09/06 00:21:39
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/11 20:26:56 (permalink)
i.am.pekk
This is very interesting. I also have an FTW3 Ultra 3090 and I just RMA'd the first one. I'm not OC-savvy enough to know much about how undervolting and overclocking work, but I can confirm I had the same black screen issue. My screens go black and the monitor stops receiving a signal. Sometimes my audio still works (game audio and Discord audio still come through). I can sometimes continue talking through open mic on Discord as well.
 
I mostly play Overwatch, where I have most settings on the minimum (to optimize for frame rate and eliminate distractions). I think I've seen OW crash once or twice at most, but usually it doesn't have any issues. Sometimes when this crash occurs the computer is able to recover so the black screen is just a flicker and my desktop returns shortly after (like switching inputs on the monitor would appear). At this point anything that uses hardware rendering seems to fail. Blizzard's bnet launcher fails to render after a recoverable crash saying that the UI can't get a 3d context (or something like that), which is often resolved by exiting and restarting the client.
 
Software that has reliably caused issues:
  • Mozilla Firefox - Assuming this is related to GPU acceleration and video playback (youtube, twitch.tv, websites that use GPU accelerated animations?)
  • League of Legends - I play at max graphical settings here, unlike OW - Crash happens mid-game
  • Resident Evil: HD Remaster - Also max settings
  • Sekiro: Shadows Die Twice - Also max settings - Seems to happen while alt-tabbing to desktop
  • DaVinci Resolve 16 - Has crashed multiple times
  • DaVinci Resolve 17 beta - Has also crashed once
My second card has only crashed once or twice so far and I haven't put in a ticket yet as it was only a day or so ago and I'm still trying to see what the issue might be to provide more details. I thought maybe I was using some bad/cheap cables and that was my issue.
 
I have 2 monitors plugged in:
  1. Dell Alienware AW2518HF 240hz (DisplayPort 1.2 cable)
  2. Elgato HD60 S Capture Card (HDMI 2.0 cable)
 
It seems like most people who have issues have multiple screens. I was wondering if the refresh rate of the screens/resolution had any correlation. i.e., the amount of data bandwidth required to run those screens?
 
I just replaced the HDMI cable on the capture card hoping that would resolve it since the capture card was also having some issues with sound dropping and video colors distorting. So it seemed likely that the capture card was somehow related.
 
Oddly League of Legends was one of the most reliable crash reproducing games. I crashed enough that I got a leave buster warning for leaving too many games and had to digitally sign an agreement to never leave another game early.
 
I've never once crashed during 3DMark Time Spy.
 
If you need more software to use to test reliability you can try the ones I mentioned. Also worth noting I often lurk in many twitch streams to help viewership so there could be multiple videos playing simultaneously, though I mute the sound through firefox or by using a separate sound output device for the web browser and mute that.
 
Has undervolting reliably fixed your issue? I saw some other posts on nvidia forums where people tried undervolting to no avail, so I never bothered trying it myself.


My workaround should fix your issue.  You are having the same issues I had, which my voltage curve adjustment has resolved at this point 100%.

I have 5 monitors, no identical models.
post edited by Rewire92 - 2021/02/11 20:30:45


#51
Exsurgolol
New Member
  • Total Posts : 8
  • Reward points : 0
  • Joined: 2014/02/08 11:48:29
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/12 02:35:47 (permalink)
While waiting for EVGA support to get back to me I have been searching for ways to replicate this issue.
I have now found 2 benchmark suites that replicate this low power game issue and will usually black screen lock up my system within 5 minutes, all at stock settings.
 
The alarming thing is the spikes to 1.081V only black screen my card faster.  Spiking frequently to 1.062V will blackscreen my system too, it just takes a little longer.  Black screens are probably possible lower down the voltage curve too, but much less likely.
 
The issue with replicating this problem is that I think you need games or benchmarks that vary significantly in load, i.e. that push the card up and down the V/F curve (3DMark etc. tends not to do this).  The problem is that this actually does happen a lot in games when you transition through cut scenes, enter menus, etc., but it can often take 1h+ for the issue to arise.  Even FS2020 can blackscreen for me, because when approaching a busy airport for landing the game becomes CPU bottlenecked, causing GPU utilisation spikes.
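
One way to compare workloads on this point is to count how often the logged voltage climbs from a low state to the top of the curve. The Python sketch below is only an illustration: `count_ramps`, the thresholds and both sample series are made up, not taken from any real log.

```python
from typing import Iterable


def count_ramps(voltages: Iterable[float], low: float, high: float) -> int:
    """Count how many times the series climbs from <= low to >= high."""
    ramps = 0
    armed = False  # True once a sample at or below `low` has been seen
    for v in voltages:
        if v <= low:
            armed = True
        elif v >= high and armed:
            ramps += 1
            armed = False  # wait for the next drop before counting again
    return ramps


# Made-up samples: a steady heavy load versus a load that keeps dropping out.
steady = [1.062, 1.062, 1.056, 1.062, 1.062]
bursty = [0.750, 1.081, 0.744, 0.750, 1.062, 1.081, 0.738, 1.081]

steady_ramps = count_ramps(steady, low=0.80, high=1.06)
bursty_ramps = count_ramps(bursty, low=0.80, high=1.06)
print(steady_ramps, bursty_ramps)
```

A steady benchmark-style load never drops to the low state, so it scores zero, while the bursty series scores one ramp per dropout followed by a climb.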
#52
RKR21566
New Member
  • Total Posts : 30
  • Reward points : 0
  • Joined: 2020/12/26 01:28:38
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/12 04:33:46 (permalink)
This is my new (10 days old) 3090 FTW3 Ultra. (The first one was a 2012 S/N and died playing LoL after 40 days.)
This one is a 2014 S/N.

Is this so bad?
The GPU-Z values were recorded after a couple of games of Overwatch.
Since I got the new card I've experienced no black screens, fan ramping, etc.

In a 1-hour game I found 40 lines in the GPU-Z log that include the 1.0810 spikes (a couple of seconds each).
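
For anyone who wants to count those lines without scrolling through the log by hand, here is a minimal Python sketch. It assumes the sensor log is comma-separated text with a header row and space-padded cells, and that the voltage column is named `GPU Voltage [V]`; that column name and the sample rows are assumptions, so check the header of your own log file. `rows_at_or_above` is a hypothetical helper, not part of GPU-Z.

```python
import csv
import io


def rows_at_or_above(log_text: str, column: str, threshold: float) -> int:
    """Count log rows whose value in `column` is at or above `threshold`."""
    reader = csv.reader(io.StringIO(log_text))
    header = [name.strip() for name in next(reader)]
    idx = header.index(column)  # raises ValueError if the column is missing
    hits = 0
    for row in reader:
        if len(row) <= idx:
            continue  # skip blank or short lines
        try:
            value = float(row[idx].strip())
        except ValueError:
            continue  # skip empty or non-numeric cells
        if value >= threshold:
            hits += 1
    return hits


# Made-up rows in the assumed layout, not copied from a real log.
sample_log = """Date , GPU Clock [MHz] , GPU Voltage [V] ,
2021-02-12 04:00:01 , 1995.0 , 1.0620 ,
2021-02-12 04:00:02 , 2010.0 , 1.0810 ,
2021-02-12 04:00:03 , 1350.0 , 0.8500 ,
2021-02-12 04:00:04 , 2010.0 , 1.0810 ,
"""
hits = rows_at_or_above(sample_log, "GPU Voltage [V]", 1.081)
print(hits)
```

Keep in mind the point made earlier in the thread: a software log only shows the requested voltage steps, not any overshoot between samples.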


post edited by RKR21566 - 2021/02/12 04:45:41

EKWB Dual Custom Loop - AMD Ryzen 9 3950X @4.4Ghz - Asus X570 Crosshair VIII Formula - G.Skill TridentZ Neo 64Gb @3.6Ghz - Samsung 970 Pro M.2 512Gb -
Samsung 970 EVO Plus M.2 2Tb - Samsung 860 QVO 2Tb - EVGA RTX 3090 FTW3 Ultra - Creative Sound Blaster X7 LE - Creative E-MU XM7 -
Thermaltake The Tower 900 - Corsair AX1600i - Corsair Glaive RGB Pro - Corsair Polaris - Corsair K95 - AORUS FV43U 4K @144hz- Oculus Rift S -
Sennheiser HDV 820 - Sennheiser HD800S/HD25 Amperior 
#53
Exsurgolol
New Member
  • Total Posts : 8
  • Reward points : 0
  • Joined: 2014/02/08 11:48:29
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/12 05:01:19 (permalink)
I can't say if it's bad or not.  In theory these cards should be engineered to withstand far higher than 1.081V, considering they have a voltage slider that can go up to 100.
 
My gut feeling is capping the voltage artificially with MSI V/F curve is a Band-Aid, masking deeper power delivery problems that will probably reveal themselves sooner or later.  However, I doubt all FTW3 cards suffer from this problem, I think it's more likely to be several bad batches as I would assume QC testing would reveal these problems fairly quickly if it was every card.  In theory I could give details for the benchmarks I have found that black screen my card within 5 minutes average at stock settings but I don't want any more cards to suffer unnecessary damage.  Hopefully EVGA get back to me on my testing results and I hope they can release a bios fix for these cards.
 
 
#54
Fracture-7
New Member
  • Total Posts : 1
  • Reward points : 0
  • Joined: 2021/02/12 05:03:13
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/12 05:18:12 (permalink)
Exsurgolol
While waiting for EVGA support to get back to me I have been searching for ways to replicate this issue.
I have now found 2 benchmark suites that replicate this low power game issue and will usually black screen lock up my system within 5 minutes, all at stock settings.
 
The alarming thing is the spikes to 1.081V only black screen my card faster.  Spiking frequently to 1.062V will blackscreen my system too, it just takes a little longer.  Black screens are probably possible lower down the voltage curve too, but much less likely.
 
The issue with replicating this problem is I think you need games or benchmarks that vary significantly in performance, i.e pushing the card up and down the V/F curve (3DMark etc tends not to do this).  Problem is this actually does happen a lot in games when you transition through cut scenes, enter menus etc, but often can take 1h+ for the issue to arise.  Even FS2020 can blackscreen for me because when approaching a busy airport for landing the game becomes CPU bottlenecked, causing GPU utilisation spikes.



I wanted to chime in on this issue, as I have a FTW3 3080 that was crashing constantly, at random intervals, when gaming. In these crashes my entire system would lose power instantly, and I'd have to reset the power switch on my PSU to even turn the system back on. Crashes were happening recently when playing Tarkov and Valheim, as well as in other games previously (I can't recall which). WHEA was showing that the problem causing the shutdown was related to power.

At first, given some advice from others, I thought perhaps it was a PSU-related problem and I simply did not have enough juice to support the massive power spikes this card can hit (I've seen it peak at almost 400W and I know it can draw even more than that), because the issue only started happening when I swapped my 5800X for a 5900X. Thus it made sense that, with a higher power draw CPU installed, the card's spikes would now be enough to trigger the overcurrent protection and cause the shutdown. I thought I was getting somewhere with that, as I lowered the power limit to 65% and was no longer able to cause a full shutdown even after 30+ minutes of stress testing.

For comparison, I had been running Furmark + Prime95 blend on 24 threads + OCCT's Power stress test, all simultaneously. This was the only combination of tests I found that was able to replicate the shutdown, and it was occurring without fail within 5 minutes of running all 3 at stock settings, sometimes within 60 seconds. I noticed my voltage also peaking at 1.063V many times (including the instant before shutdown), and the max hitting the same number you saw too: 1.081V.

So given what I had read in this thread, and knowing it was related either to power or to the GPU's voltage, I did some further testing. Since the 65% power limit caused no issues, and lowering the power limit also lowers voltage, I decided to reset everything to stock and allow the 100% power limit. From there, I did a complete undervolt to 0.0975V.

The same combo of stress tests that was causing a crash within 5 minutes then ran flawlessly for 2 consecutive 30+ minute sessions with no crash, and I have not crashed while gaming yet. It definitely isn't a wattage issue either: at 100% power limit with the undervolt I'm still seeing maxes near 400W, just like I was on stock settings. The only difference is my voltage doesn't hit those spikes, obviously.

Sir, I think you're definitely on to something and have figured out this issue. It also appears to be an issue affecting FTW3s in general, at least for 3080s and 3090s. Perhaps it is related to the 3x 8-pin connection vs. the 2x 8-pin connection found on the 3060 Ti and 3070 versions, as I haven't found any reports of those having issues. Either way: if you read this thread, just know that Exsurgo is correct and his solution will likely work for you, as it did for me. We both had basically the exact same situation, the exact same voltage spikes, and the same solution, despite me having a 3080 and him having a 3090. If you have the FTW3 version of these cards, PLEASE keep an eye on them. Also, while I listed the way I was able to replicate the shutdown on my PC, follow along at your own risk, knowing that forcing the GPU to kill itself like that could lead to long-term or permanent damage. I have an old card to use and could have patiently waited for an RMA if mine died doing these tests, but I'd hate to see someone kill their card trying to do this. Just listen to us, undervolt your GPU, and see the results.
post edited by Fracture-7 - 2021/02/12 05:21:09
#55
RKR21566
New Member
  • Total Posts : 30
  • Reward points : 0
  • Joined: 2020/12/26 01:28:38
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/12 05:28:26 (permalink)
Somewhere on Reddit, one user was saying that the problem is related to the 3*8pin connection. He says that these cards were initially designed to use 2*8pin+1*6pin, and that this configuration was changed to 3*8pin at the last moment.

EKWB Dual Custom Loop - AMD Ryzen 9 3950X @4.4Ghz - Asus X570 Crosshair VIII Formula - G.Skill TridentZ Neo 64Gb @3.6Ghz - Samsung 970 Pro M.2 512Gb -
Samsung 970 EVO Plus M.2 2Tb - Samsung 860 QVO 2Tb - EVGA RTX 3090 FTW3 Ultra - Creative Sound Blaster X7 LE - Creative E-MU XM7 -
Thermaltake The Tower 900 - Corsair AX1600i - Corsair Glaive RGB Pro - Corsair Polaris - Corsair K95 - AORUS FV43U 4K @144hz- Oculus Rift S -
Sennheiser HDV 820 - Sennheiser HD800S/HD25 Amperior 
#56
i.am.pekk
New Member
  • Total Posts : 2
  • Reward points : 0
  • Joined: 2020/12/17 11:04:44
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/12 13:34:27 (permalink)
Rewire92
i.am.pekk
...
Has undervolting reliably fixed your issue? I saw some other posts on nvidia forums where people tried undervolting to no avail, so I never bothered trying it myself.

My workaround should fix your issue.  You are having the same issues I had, which my voltage curve adjustment has resolved at this point 100%.

I have 5 monitors, no identical models.




Sorry if I missed something in the thread, do I just eyeball the values from your screenshot or is there a way for me to download your curve and load it in? Can I do it with the EVGA X1 tool or do I need MSI afterburner?
#57
Paynal
New Member
  • Total Posts : 20
  • Reward points : 0
  • Joined: 2013/03/23 11:13:56
  • Status: offline
  • Ribbons : 0
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/12 16:03:55 (permalink)
i.am.pekk
Has undervolting reliably fixed your issue? I saw some other posts on nvidia forums where people tried undervolting to no avail, so I never bothered trying it myself.

 
Popping in to add a perspective from someone who's been running with this solution for an extended period on a November 2020 red 3090 FTW Ultra.  I had my first blackscreen of impending death first week of December, 2020, and started undervolting immediately afterwards (via Afterburner).  Since then, I've subjected the card to dozens of hours of different workloads, ranging from video and lightweight 2D platformers to stress tests, RTX, and high framerate, high rez VR.  VR in particular is prone to placing wildly-shifting loads on the system that are a prime candidate for triggering a voltage spike as the card rapidly shifts from high to low and back up again.  (Half-Life Alyx was what did it for me, and I've seen that one mentioned as trouble by other VR users on this board.)
 
The results?  2.5 months of perfect performance.  No more blackscreens, with or without maxed out fans.  No red light of death.  No voltage spikes in my logs.  Card performance has not degraded, I'm still pulling the same scores in 3DMark I was months ago and seeing the same framerates in VR.  I'm not sacrificing performance in high demand games, either, as this particular card happily runs at 1980mhz at 0.881 volts.
 
TL;DR -- Undervolting works for the long run, go for it.
 
 
@Rewire92, Exsurgolol, and Fracture-7 -- I saw the exact same spikes you three did when I started having trouble last year.  It was even the same exact values.  Card would regularly throw spikes up to 1.060 and stay on its feet (with stutters for a second or two sometimes), but 1.061 would often come with a program crash, though the card would usually stay up.  1.063 and up and it might crash violently -- blackscreen with either screaming fans or several second blackscreen and then stuck in an extreme low-clock state (200-400mhz), requiring a system reboot to clear.
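
The bands described above can be sketched as a small Python helper for tagging logged readings. This is only an illustration of one card's observations, not a specification: the function name and labels are made up, and treating 1.062V the same as 1.061V is an assumption, since the post does not say what happened at that value.

```python
def classify_spike(volts):
    """Map a logged core voltage to the behavior observed on one 3090 FTW3.

    Thresholds come from the forum observations above. Readings are
    compared in whole millivolts to avoid floating-point edge cases.
    """
    millivolts = round(volts * 1000)
    if millivolts >= 1063:
        # 1.063V and up: possible blackscreen or stuck low-clock state
        return "hard crash risk"
    if millivolts >= 1061:
        # 1.061V: often a program crash (1.062V grouped here by assumption)
        return "program crash risk"
    # Up to 1.060V: card stayed up, sometimes with brief stutters
    return "stable"


for reading in (0.881, 1.060, 1.061, 1.063, 1.081):
    print(f"{reading:.3f} V -> {classify_spike(reading)}")
```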
post edited by Paynal - 2021/02/12 16:06:43

3090FTW Ultra (1980mhz at .881V)/10900K/Asus z590 Maximus XIII Extreme/128GB Micron E-Die/SuperFlower LeadEx Platinum 1600W/Valve Index/HTC Vive Pro 1 (WiFi'ed)/Thermaltake Core W100 + half a dozen Delta EFBs
#58
Kylearan
iCX Member
  • Total Posts : 288
  • Reward points : 0
  • Joined: 2013/12/26 04:04:40
  • Status: offline
  • Ribbons : 2
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/12 17:59:59 (permalink)
So this is very similar to the MSI MXM GTX 1070 v1.0 bug, except that one happened between 0.950v-1.013v (someone said it was exactly 1.013v that crashed, but I never owned that card; I have the v1.2 version), and it didn't destroy the card either.  Probably because of the much lower power draw.  Hard for magic smoke to come out of a 115W card.
 
http://forum.notebookreview.com/threads/1070-laptop-gt73vr-gt62vr-gt72vr-reboot-crash-problem.804978/
 
If anyone here cares.  I know a lot of people don't care much about my posts.
#59
_Gir_
iCX Member
  • Total Posts : 331
  • Reward points : 0
  • Joined: 2016/02/02 20:12:10
  • Status: offline
  • Ribbons : 2
Re: Fixing EVGA's 7 Figure Problem with FTW3 30 Series cards. 2021/02/12 18:48:09 (permalink)
Are you measuring the voltage in software or a meter?
 
Is the aforementioned voltage being measured at the source (VRM) or in the die?  
 
 
#60