
F@H Performance Assessments & Comparisons

Chris21010
FTW Member
  • Total Posts : 1587
  • Reward points : 0
  • Joined: 2006/05/03 07:26:39
  • Status: offline
  • Ribbons : 9
Re: F@H Performance Assessments & Comparisons 2018/04/02 21:32:29 (permalink)
OK, in my latest test I compared all the different driver versions I currently had running and found that 390.25 (Linux) was the best of them all. I now have 390.48 running on a few of the machines so I can compare it to 390.25, but even so, 390.25 only showed a minor ~6% performance gain over the 384.xx drivers.


#31
ProDigit
iCX Member
  • Total Posts : 465
  • Reward points : 0
  • Joined: 2019/02/20 14:04:37
  • Status: offline
  • Ribbons : 4
Re: F@H Performance Assessments & Comparisons 2019/07/26 23:41:08 (permalink)
Let me add to this thread:
RAM doesn't really do much for me. However, I assume that when you used 8GB, you either compared it with 4GB in dual channel in Windows, or you ran Linux with the memory in dual channel vs. 4GB in single channel!
My previous Xeon setup folded no differently using 2x 2GB or 2x 4GB of RAM under Windows or Linux.
I believe your numbers are the result of running dual channel: the CPU can retrieve data from RAM and send it to the GPU much faster, which reduces latency compared to running on single channel. A single memory stick (running single channel) is usually fast enough, but it adds latency.
Likewise, tests have been done where 16GB and more actually slowed the system down compared to running less memory.
I recommend 2x 2GB for DDR3 systems, and 2x 4GB for DDR4 systems (they don't sell DDR4 sticks below 4GB). This setup is much better than running 1x 4GB or 1x 8GB of DDR3, or 1x 8GB or 1x 16GB of DDR4 in a system.
 
PCIE 3.0:
- For PCIE speeds, a modern RTX 2060 graphics card can do with a PCIE 3.0 1x to 16x riser and get nearly 97% of its full-slot performance.
- The RTX 2060 Super, as well as the older RTX 2070, can also work fine on 1x; however, if you had a 2x slot, it would be better.
- The RTX 2080 and 2070 Super need a 2x slot (which, aside from an m.2 slot converted to 4x, you won't find). So for the 2070 Super, 2080, 2080 Super, and 2080 Ti, you should probably get a 3.0 4x slot or higher.

For PCIE 2.0:
- An RTX 2060 runs best off a PCIE 2.0 4x slot or greater.
- An RTX 2060 Super or 2070 can run off a 4x slot, but preferably use an 8x slot or greater.
- An RTX 2070 Super, 2080, 2080 Super, or 2080 Ti needs a PCIE 2.0 8x slot or greater.
PCIE 2.0 1x slots aren't recommended for GTX or better cards! A GT 1030 is about as fast a card as a 2.0 1x slot can feed. Even a GTX 1050 will run 10-15% slower on a PCIE 2.0 1x slot! (The rough link-bandwidth math behind these recommendations is sketched below.)
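For reference, here is a minimal sketch of that bandwidth math in Python. The per-lane figures are the approximate usable rates from the PCIe 2.0/3.0 specs after encoding overhead; the function name is just for illustration:

```python
# Approximate usable PCIe bandwidth per lane, in MB/s, after encoding
# overhead (8b/10b for gen 2, 128b/130b for gen 3).
PER_LANE_MB_S = {2: 500, 3: 985}

def link_bandwidth_mb_s(gen: int, lanes: int) -> int:
    """One-directional bandwidth of a PCIe link, in MB/s."""
    return PER_LANE_MB_S[gen] * lanes

# A 3.0 1x riser moves roughly as much data as a 2.0 2x slot, which is
# why the 2.0 recommendations above need roughly double the lanes:
print(link_bandwidth_mb_s(3, 1))  # ~985 MB/s
print(link_bandwidth_mb_s(2, 2))  # ~1000 MB/s
print(link_bandwidth_mb_s(2, 4))  # ~2000 MB/s, the 2.0 floor for an RTX 2060
```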
 
MultiGPUs:
It appears that most modern Intel motherboards offering multiple full-size slots can only drive up to 4 GPUs.
I've tested multiple Asus, ASRock, MSI, and Gigabyte LGA 1151 motherboards in the sub-$200 range, for Intel 6th & 7th gen and 8th & 9th gen CPUs, trying to get more than 4 GPUs to work at a time, without success.
*Please note: I did not use any performance-lowering PCIE 1x to 4x splitters to drive multi-card setups. I also didn't run motherboards with many PCIE 1x slots. Most motherboards came with 2 or 3 full-size slots and 3 PCIE 1x slots.*
Some motherboards don't even recognize more than 3 GPUs, and are finicky in that they sometimes do, and sometimes don't, show all the GPUs on boot!
With the limitation of 1 GPU per core for the most optimal results, I would recommend sticking with quad-core CPUs, as most motherboards don't support more than 4 GPUs (2 to 3 in full-size slots, the remainder in PCIE 1x slots) when running Linux.
Get a 6-core CPU, like the Intel Core i5 9400F (or K), if you're running Windows; or you can disable cores to save power, matching your 1 core per GPU (+1 core for Windows).
 
There are some (standard ATX or extended ATX) motherboards which support more than 4 GPUs, but the majority won't (not counting mining motherboards filled with PCIE 1x slots); and faster GPUs, like the 2070 Super, 2080, or higher, need PCIE 4x minimum, so those multi PCIE 1x slot boards aren't really a good alternative for running these cards.
Many of these cheaper Chinese mining boards also run PCIE 2.0, so they're not a good option for folding!

The problem with most motherboards lies in their full-size PCIE slot speed configuration (8x, 4x, 4x), which often leaves only a single 1x slot available for a 4th GPU before the PCIE lanes are used up.
If they had created a 4x/4x/4x configuration on the full-size slots, it would have been possible to drive 4 additional GPUs via 1x to 16x risers (see the lane arithmetic below).
Sadly, this is not the case.
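The lane arithmetic, as a quick sketch (the 16-lane budget for the full-size slots is implied by the 8x/4x/4x wiring above, not something I've verified per board):

```python
# Lane budget behind the complaint above: x8/x4/x4 wiring spends all
# 16 lanes on the full-size slots, while x4/x4/x4 would leave 4 spare.
typical = [8, 4, 4]   # common full-size slot wiring
wished = [4, 4, 4]    # the configuration proposed above
spare = sum(typical) - sum(wished)
print(spare)  # 4 spare lanes = four more 1x-to-16x risers, i.e. 4 more GPUs
```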

What's more, CPUs like the i5 9400F have no IGP. Motherboard manufacturers could have routed an additional 8 PCIE lanes (from the CPU pins originally dedicated to the IGP) to a PCIE 4x slot on the board when the CPU's IGP was either not present or bypassed.
But they don't.
 
CPU:
I've also tested CPU throttling.
- While 6th and 7th gen Intel CPUs can be throttled down, you'll see a ~10% performance penalty for running the CPU at 10% lower power consumption. In my case, this resulted in 5-6 watts of power savings, which was not worth it. My recommendation is to run the CPU at full speed, though you could disable Turbo Boost if you like: Turbo Boost is the cause of most CPU power spikes, and folding on Nvidia cards uses the CPU constantly, not in spikes, anyway. As long as the CPU runs fast enough to drive your GPU, you should be OK.

- With Intel 9th gen CPUs, I've noticed that cutting power by 5-10W can result in an infinite boot loop. On both Gigabyte and Asus boards, I messed up the system by giving the CPU less than the recommended power. Normally the CPU would just run at lower speeds, but the BIOS allows CPUs to be set to much less than 50W (for a 65W CPU). Once you pass that threshold, the CPU no longer gets enough power to boot the board, and you'll end up with a broken mobo. Not recommended!
 
- I haven't yet gotten to the performance difference between, e.g., an Intel quad-core with HT vs. a true 6- or 8-core CPU.
If you have a quad-core with HT running only 4 GPUs, it is recommended to turn off HT in Linux; it allows for faster performance. It is estimated that you could run 6 GPUs fine on a 4C/8T CPU without speed degradation, but I haven't been able to test this claim. I guess as long as each thread can keep up with each GPU's demand for processing power, it might work.
But it'd be interesting to see the numbers with HT-enabled CPUs (e.g., cut 3 cores off of a Core i7, run single core + HT pushing 2 GPUs, then lower the CPU frequency until one card starts throttling).
 
- While it is possible to run multiple Nvidia GPUs per CPU core, it's really not recommended in Linux, seeing that a 3.6GHz dual-core CPU could barely push 4x RTX 2060 cards, and with a loss of performance. The 1-core-per-GPU rule still holds true (if you want to keep that 90-95% efficiency margin compared to running 1 GPU per core). However, if you'd like to save some cost in power, you could reduce the CPU speed in the BIOS to match your most demanding GPU (see the sketch below):
For an RTX 2080 Ti @ 2.2M PPD, you need a 3GHz CPU or faster.
For an RTX 2080 @ 1.5M PPD, you need a 2.75GHz CPU or faster.
For an RTX 2060 @ 1M PPD, you need a 2GHz CPU or faster.
For a GTX GPU @ 150-600k PPD, you need a 1.7GHz CPU or faster.
For a GTX GPU @ <150k PPD, you can get by with the currently slowest CPUs (Intel Atom @ 1.6GHz).
I can see the CPU usage in Linux, and I also measured PPD efficiency dropping drastically when the CPU frequency fell below the above standards.
If you run one RTX 2080 and the rest slower cards, you could cap your CPU to 2.8 or 3GHz on all cores.
While it's not going to save you lots of power, it should come with a fairly unnoticeable performance penalty, and your CPU will run cooler too.
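As a rough sketch of that rule in Python (the thresholds are the figures I listed above; the function name and structure are just for illustration):

```python
# Minimum all-core clock (GHz) to feed one GPU, per the thresholds above.
# PPD values are per-GPU estimates; substitute your own cards' numbers.
MIN_GHZ_BY_PPD = [
    (2_200_000, 3.00),  # RTX 2080 Ti class
    (1_500_000, 2.75),  # RTX 2080 class
    (1_000_000, 2.00),  # RTX 2060 class
    (150_000, 1.70),    # mid-range GTX class
    (0, 1.60),          # anything slower
]

def cpu_ghz_floor(gpu_ppds):
    """Clock cap suggested by the most demanding GPU in the rig."""
    worst = max(gpu_ppds)
    return next(ghz for ppd, ghz in MIN_GHZ_BY_PPD if worst >= ppd)

# One RTX 2080 plus slower cards -> cap all cores around 2.75-3GHz:
print(cpu_ghz_floor([1_500_000, 600_000, 140_000]))  # 2.75
```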
 
 
All the above findings were recorded in Lubuntu 18.10, with the GeForce 428.xx drivers; and for me, for a GPU to be running efficiently, it has to run within 90-95% of the PPD you'd get running the same GPU by itself in a PCIE 3.0 16x slot in Linux.
If the numbers are more than 10% off, I would not consider using that kind of setup, even if it saves you money on buying hardware.
Think about it: a 10% performance loss on 8 GPUs that do 1M PPD each equals an 800k PPD loss.
You'll lose the equivalent of a small RTX 2060 in score!
Also, these losses compound: if you're losing 10% of speed from using PCIE 2.0, 10% from using Windows, and 10% from using a PCIE splitter, you're running at 0.9 × 0.9 × 0.9 ≈ 73% efficiency.
This is not recommended!
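To make the compounding explicit, here's a tiny sketch (the 10% figures are the estimates above, not measurements):

```python
from functools import reduce

def stacked_efficiency(losses):
    """Multiply out independent fractional losses (0.10 means 10%)."""
    return reduce(lambda eff, loss: eff * (1 - loss), losses, 1.0)

# PCIE 2.0, Windows, and a PCIE splitter, each costing ~10%:
print(f"{stacked_efficiency([0.10, 0.10, 0.10]):.1%}")  # 72.9%
```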
 
 
 
Since the GTX 1080 Ti is faster than the RTX 2060, I would highly doubt you could run 8 GPUs through a 3.7GHz dual-core CPU like the G4620 without a serious (like 80%) performance penalty!
At best, in Linux, according to my calculations, you could run 4 GTX 1080 Tis on a dual core that is at least 3.7GHz, but that's only theoretical.
I would like to reopen the conversation about running multiple cards per CPU core, to iron out this inconsistency with my own findings.
post edited by ProDigit - 2019/07/27 00:38:49
#32
Chris21010
FTW Member
  • Total Posts : 1587
  • Reward points : 0
  • Joined: 2006/05/03 07:26:39
  • Status: offline
  • Ribbons : 9
Re: F@H Performance Assessments & Comparisons 2019/07/27 11:56:27 (permalink)
While it is nice to see some updated info, some of your statements are in direct conflict with my results. My G4620 (dual core with HT, 3.7 GHz) only suffered a ~28% performance loss with 8 GPUs, not an 80% loss. While it is important that the CPU is fast enough to properly feed your GPU data to process, the vast majority of the load you see on the CPU comes from the process constantly asking the GPU "are you done yet?". This is basically an idle process eating up 100% load on the thread. Because of this, the losses are not as bad as you would expect: the CPU isn't actually doing work while it's waiting for the GPU, and HT kicks in and allows other processes that need real work done to use time on that core instead.
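A toy illustration of that polling pattern (the real client spins inside the GPU driver, not in Python; this only shows why a waiting thread reads as 100% load):

```python
import threading, time

gpu_done = threading.Event()

def spin_wait():
    # Busy-poll "are you done yet?": the thread shows 100% CPU load
    # even though it does no useful work until the GPU finishes.
    while not gpu_done.is_set():
        pass

t = threading.Thread(target=spin_wait)
t.start()
time.sleep(0.1)  # stand-in for the GPU crunching a work unit
gpu_done.set()   # GPU done; the spinning thread exits
t.join()
```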
 
Also, to clarify on the memory question: I had to increase the memory to 8GB because I saw in Linux that each WU thread could ask for up to 800 MB of RAM. With these large WUs asking for a lot of memory, the performance gain from the added RAM came from avoiding the use of page files on the disk drive. This won't always come up, since you could get a bunch of smaller WUs and only need ~300MB per GPU, but if you have enough large WUs to push RAM into page files, performance will be lost.
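A quick back-of-the-envelope for sizing RAM from those numbers (the ~1.5GB OS overhead is my assumption for a lightweight Linux install, not something measured in the post):

```python
OS_OVERHEAD_MB = 1500  # assumed headroom for a lightweight Linux desktop

def worst_case_ram_mb(num_gpus, per_wu_mb=800):
    # RAM needed so worst-case large WUs (~800MB each, per the post)
    # never spill into the page file.
    return num_gpus * per_wu_mb + OS_OVERHEAD_MB

print(worst_case_ram_mb(8))  # 7900 MB -- right at the 8GB upgrade point
```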
post edited by Chris21010 - 2019/07/27 12:17:37


#33
ProDigit
iCX Member
  • Total Posts : 465
  • Reward points : 0
  • Joined: 2019/02/20 14:04:37
  • Status: offline
  • Ribbons : 4
Re: F@H Performance Assessments & Comparisons 2019/07/28 07:20:27 (permalink)
Each GPU in my case takes about 500MB of its VRAM.
As far as Lubuntu goes, it only uses 300-600MB of actual PC RAM. I don't think I've ever seen it pass 1.5GB when using a browser and some other programs.
I don't see any benefit in adding more than 2 memory sticks of 1GB if it's just for folding; except that DDR3 2GB sticks are so cheap nowadays, and even 4GB sticks, so one might as well buy 2x 4GB.
For DDR4 there's really no option to buy anything smaller than 4GB per stick; use 2 sticks for dual channel.
 
As far as the CPU goes, in Windows and Linux you can see the kernel times.
While the threads are 100% in use per GPU, the kernel times give a pretty accurate picture of how much of the CPU is actually used by the GPU (excluding the idle polling).
Accurate in the sense that a 3GHz CPU will show ~60% kernel time usage for an RTX 2070.
Drop that CPU down to 2GHz and kernel times increase to ~95%, and the RTX 2070 starts throttling (lowering performance).
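On Linux, one way to watch this is with the third-party psutil package (the 90% threshold here is a rough cutoff based on the numbers above, not a hard limit):

```python
import psutil  # pip install psutil

# Sample CPU time shares over 5 seconds; 'system' is kernel time.
times = psutil.cpu_times_percent(interval=5)
print(f"user {times.user:.0f}%  system {times.system:.0f}%  idle {times.idle:.0f}%")
if times.system > 90:
    print("Kernel time pegged: the CPU is likely too slow to feed the GPU.")
```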
 
While it's possible that you can run 8 GPUs on a dual-core CPU, I think you'd benefit greatly from upgrading your CPU to at least a quad-core like this one:
https://www.amazon.com/Intel-Core-i5-3470-Quad-Core-Processor/dp/B0087EVHVW
It's only 7 watts more TDP.

If you can find a Core i7 second hand (I presume your motherboard runs 6th to 7th gen Core i CPUs), it would be even better, as they have HT.
Aside from the small initial purchase price, you'd see (in your own words) a 28% performance increase.
With 8 GPUs, that would be the same as running an additional 2 GPUs, at the cost of just 1 CPU. It would be well worth the investment if you ask me...
post edited by ProDigit - 2019/07/28 07:23:12
#34