Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Asus ROG Maximus IX Hero Z270 / i7-7700K / Windows 10 Pro / EVGA GTX 1080 TI FTW3 Elite GPU / 32GB G.SKILL TridentZ RGB Series DDR4 3200MHz / EVGA Super Nova 850 G3 80 Plus Gold Modular PSU / Case: Phanteks Eclipse P400 in Red / (1) Samsung 960 EVO M.2 Internal SSD 500GB for OS (2) Samsung 850 EVO 1TB in RAID-0 for games / (1) Western Digital Black 7200 RPM 3TB Hard Drive for system backups - EVGA CLC 280 CPU Cooler. (EVGA affiliate code SKLZ84OQ2M)
|
freakysqeeky
iCX Member
- Total Posts : 475
- Reward points : 0
- Joined: 2006/06/02 22:35:08
- Status: offline
- Ribbons : 3
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/22 16:19:25
(permalink)
I've been noticing that the utilization on one GPU of your 295 is usually lower than on mine when you have a load on all three GPUs (both GPUs of the 295 plus your 280). Up to 20% less utilization. It must be taking 20% off the load for PhysX, because it's not hurting your score. I see it helping you a lot. Some of the applications you wouldn't think had any PhysX involved are still using your PhysX card, but by no means in a bad way.
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/22 20:36:05
(permalink)
My concern is that it looks to me, in more than 1 app now, that when my 295 is used with my 280, the workload is never high enough on my 295. (Not like yours...)

I feel like they need to put 1/3 of the work on each of my 3 GPUs, but what I normally get is 50-60% on each 1/2 of my 295, and 80-90% on my 280. My fear is the programmer is just saying 50% of the work goes to the 295, and the other 50% to the 280. They need to say 2/3 of the work goes to the 295, and 1/3 to the 280.

I also believe my 280 PhysX processor does help in more than PhysX apps, but I want the extra performance the 280 brings without my 295 only 1/2 trying. (Yes, I am thrilled by that too. Dedicated PhysX my eye!)

It does make me wonder how the guys with smaller PhysX processors will do in apps like this, where the workload wants to be highest on the PhysX GPU.

I want 90% utilization on all 3 GPUs. (Demanding, I know!!)

I'm afraid the 'absolute beauty' of Mandelbulb's GPU workload distribution using the OptiX libraries has ruined me... I want that kind of even workload balance on all multi-GPU apps. Especially benchmark apps.
post edited by Talonman - 2009/12/22 21:16:46
|
freakysqeeky
iCX Member
- Total Posts : 475
- Reward points : 0
- Joined: 2006/06/02 22:35:08
- Status: offline
- Ribbons : 3
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/22 21:50:00
(permalink)
Talonman wrote: "My concern is that it looks to me, in more than 1 app now, that when my 295 is used with my 280, the workload is never high enough on my 295. (Not like yours...) I feel like they need to put 1/3 of the work on each of my 3 GPUs, but what I normally get is 50-60% on each 1/2 of my 295, and 80-90% on my 280. My fear is the programmer is just saying 50% of the work goes to the 295, and the other 50% to the 280. They need to say 2/3 of the work goes to the 295, and 1/3 to the 280. I also believe my 280 PhysX processor does help in more than PhysX apps, but I want the extra performance the 280 brings without my 295 only 1/2 trying. (Yes, I am thrilled by that too. Dedicated PhysX my eye!) It does make me wonder how the guys with smaller PhysX processors will do in apps like this, where the workload wants to be highest on the PhysX GPU. I want 90% utilization on all 3 GPUs. (Demanding, I know!!) I'm afraid the 'absolute beauty' of Mandelbulb's GPU workload distribution using the OptiX libraries has ruined me... I want that kind of even workload balance on all multi-GPU apps. Especially benchmark apps."

"I want 90% utilization on all 3 GPUs."
Amen, good but could be better. I'm at my limit with a 295 on this benchmark. I still have plenty left on the CPU, 14% :) Let some of the CPU do some work as well. It must be hard for programmers. 99% GPU1, 98% GPU2.

"My fear is the programmer is just saying 50% of the work goes to the 295, and the other 50% to the 280."
I agree, but it's better than one core on the CPU and half load on the 295.
post edited by freakysqeeky - 2009/12/22 21:53:51
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/23 05:46:47
(permalink)
True... I can see we both just want to have all of our system resources harnessed. All CPU cores, and all GPUs!
|
luv2increase
CLASSIFIED Member
- Total Posts : 2643
- Reward points : 0
- Joined: 2008/12/31 16:26:56
- Status: offline
- Ribbons : 8
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/23 16:16:43
(permalink)
I can finally run the OpenCL benchmark in SiSoftware Sandra Lite 2010 with my graphics cards. Before, with the 9.11 drivers and Stream 2 beta 4, it would only test my CPU's OpenCL speed. Now it does the GPU too.

It only recognizes "2" GPUs though. I saw the percentage utilization during the test: GPU #2 did most of the rendering at ~60% utilization, and halfway through the test GPU #1 started working at ~25%.

Here are my results with the GPUs at stock speeds. All three are enabled but, like I said, it is only picking up 2.
HEATWARE - Intel Core i7 920 @ 4.1Ghz 24/7 * Have x5650 Xeon 6c/12t want to install!!! - Corsair Dominator 12GB - EVGA x58 Classified 760 - MSI GTX 960 - MegaRAID 9260-8i Raid Card - 4 x Samsung 850 EVO 120GB in Raid-0 - 4 x Samsung EcoGreen 1.5TB - Thermaltake Toughpower 1200W - IKONIK Ra X10 SIM - Pioneer BD-RW - 46" Samsung LN46A630 1080p - Windows 10 Professional Build 10147
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/23 16:44:30
(permalink)
Thanks...

luv2increase wrote: "It only recognizes '2' GPUs though. I saw the percentage utilization during the test, and GPU #2 did most of the rendering at ~60% utilization and halfway through the test, GPU #1 started working at ~25%."

So it's not just the Nvidia side that shows uneven utilization on the GPUs it uses... It's still early in the game, I guess...

For trivia, this was my utilization during Sandra's OpenCL test:

A fresh run on my system, same graph as you posted:
post edited by Talonman - 2009/12/23 17:03:20
|
luv2increase
CLASSIFIED Member
- Total Posts : 2643
- Reward points : 0
- Joined: 2008/12/31 16:26:56
- Status: offline
- Ribbons : 8
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/23 17:11:58
(permalink)
Here is something crazy I just realized. The clocks on all 3 of my GPUs don't go above 400/900, even while under load in this Sandra OpenCL test!!!
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/23 17:29:58
(permalink)
Unacceptable!!
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 09:09:16
(permalink)
freakysqeeky wrote: "Talonman wrote: 'My concern is that it looks to me, in more than 1 app now, that when my 295 is used with my 280, the workload is never high enough on my 295. (Not like yours...) I feel like they need to put 1/3 of the work on each of my 3 GPUs, but what I normally get is 50-60% on each 1/2 of my 295, and 80-90% on my 280. My fear is the programmer is just saying 50% of the work goes to the 295, and the other 50% to the 280. They need to say 2/3 of the work goes to the 295, and 1/3 to the 280. I want 90% utilization on all 3 GPUs.' Amen, good but could be better. I'm at my limit with a 295 on this benchmark. I still have plenty left on the CPU, 14% :) Let some of the CPU do some work as well. It must be hard for programmers. 99% GPU1 - 98% GPU2. I agree, but better than one core on the CPU and half load on the 295."

You know, I was thinking more about the GPU workload distribution. My fear that the programmer was just splitting the work 50/50 between the 295 and 280 holds no water.

The system has to consider that they are 3 individual GPUs, each with its own memory. I don't believe they could just assign a single workload to my 295 and expect that both sides would load up... I think only 1 side would. I believe on a system running 3 GPUs, the workload MUST be divided into 3 separate assignments if all 3 GPUs are to be used. So in the OpenCL test, they must have included the logic to divide up the workload the best they could?

On your system, both sides of your 295 are running full tilt. But I'm still not sure how your 99% load on GPU1 and 98% load on GPU2 (when running a single 295) becomes this on mine (when running a 295 and 280):

First 1/2 of my 295 reporting 61% utilization...
Second 1/2 of my 295 reporting 58% utilization...
280 checking in at a whopping 92% utilization...

I guess they simply must have assigned the biggest workload to the 280? Again, it makes me wonder how the workload would go if you ran a smaller PhysX card. Would the app realize this, and assign a smaller workload to the PhysX GPU?

Update!!: v0.45 does much better on GPU workload distribution. 97%, 98%, and 98% GPU utilization... Sweet!

My best score was C1786.0 on the old version...

Now both halves of my 295 are each doing 30% of the total work... Love it! The CPU has some skin in the game too. 3 GPUs working efficiently generate a nice score...
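To make the "per card" versus "per GPU" split concrete, here is a minimal Python sketch. It is not how the benchmark actually divides its work (the thread never shows that); the function name `split_work`, the weights, and the 900 work items are all made up for illustration.

```python
def split_work(total_items, weights):
    """Split total_items across devices in proportion to their weights."""
    total_weight = sum(weights.values())
    shares = {name: int(total_items * w / total_weight) for name, w in weights.items()}
    # Rounding down can leave a few items unassigned; give them to the
    # highest-weighted device so nothing is dropped.
    leftover = total_items - sum(shares.values())
    shares[max(weights, key=weights.get)] += leftover
    return shares


# Hypothetical weights: work split per card, so the 295's half is
# shared between its two GPUs.
per_card = {"295 GPU A": 0.5, "295 GPU B": 0.5, "280": 1.0}
# Hypothetical weights: work split per GPU, every GPU treated equally.
per_gpu = {"295 GPU A": 1.0, "295 GPU B": 1.0, "280": 1.0}

print(split_work(900, per_card))
print(split_work(900, per_gpu))
```

With the per-card weights the 280 gets twice the work of each half of the 295, which would look a lot like the lopsided utilization described above, though that alone doesn't prove the old version worked this way.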
post edited by Talonman - 2009/12/24 10:51:43
|
freakysqeeky
iCX Member
- Total Posts : 475
- Reward points : 0
- Joined: 2006/06/02 22:35:08
- Status: offline
- Ribbons : 3
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 12:43:20
(permalink)
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 12:52:18
(permalink)
Thanks... (We do think alike. This multi-GPU thing has such potential, if done right.)

I think because 'Pat' the programmer produced a report listing each GPU and the % it chipped in to the total workload, he naturally had workload balance in mind. Looks like he made some adjustments too. We need more like Pat.
post edited by Talonman - 2009/12/24 12:53:26
|
freakysqeeky
iCX Member
- Total Posts : 475
- Reward points : 0
- Joined: 2006/06/02 22:35:08
- Status: offline
- Ribbons : 3
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 13:54:29
(permalink)
Yeah, he did a good job on this one. Just that CPU. Fermi should help, and a good programmer.

"The World Isn’t Flat, It’s Parallel"
post edited by freakysqeeky - 2009/12/24 14:07:01
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 14:33:02
(permalink)
Man-O... Thanks for the pictures.

Looks like in the multi-GPU OpenCL test, 4 CPU cores is the max that the app will use. (Still good.)

So a single 295 scores around C1431 using both sides of the GPU...
A 295 and 280 scored C1786 on version 0.44, when the workload balance was off...
And a 295 and 280 score C2418 on version 0.45, when the workload is well balanced...

Parallel rules!!

I wonder what a Quad-SLI rig would score, and if it would use its 4th GPU?

(We Computer Operators look for inefficiencies and bottlenecks on the system, and raise the concern up to support for a programmer to look into.) That's just what I'm programmed to do... Pat and I would get along just fine!
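For anyone who wants the three scores above as ratios, here is a small Python sketch. The scores are the ones quoted in this post; the labels and the `speedup` helper are just illustrative names.

```python
# Scores quoted in the post above (OpenCL test).
scores = {
    "295 alone": 1431.0,
    "295 + 280, v0.44": 1786.0,
    "295 + 280, v0.45": 2418.0,
}


def speedup(new, old):
    """Ratio of two benchmark scores; above 1.0 means 'new' is faster."""
    return new / old


baseline = scores["295 alone"]
for label, score in scores.items():
    print(f"{label}: C{score:.0f} = {speedup(score, baseline):.2f}x the single 295")

# Gain from the rebalanced workload alone (same hardware, newer version).
gain = speedup(scores["295 + 280, v0.45"], scores["295 + 280, v0.44"]) - 1.0
print(f"v0.44 -> v0.45 on the same hardware: +{gain:.0%}")
```

The last line is the interesting one: the hardware did not change between 0.44 and 0.45, so that gain comes from the version change itself.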
post edited by Talonman - 2009/12/24 14:42:55
|
luv2increase
CLASSIFIED Member
- Total Posts : 2643
- Reward points : 0
- Joined: 2008/12/31 16:26:56
- Status: offline
- Ribbons : 8
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 14:43:14
(permalink)
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 14:47:57
(permalink)
So (1) 5870 scores higher than both sides of a 295 plus a 280 in this test? Gasp... Wonder why?

Please run the OpenCL test with profile cs_4_0 and post your results... I don't know what cs_5_0 does to the score.
post edited by Talonman - 2009/12/24 14:53:16
|
luv2increase
CLASSIFIED Member
- Total Posts : 2643
- Reward points : 0
- Joined: 2008/12/31 16:26:56
- Status: offline
- Ribbons : 8
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 14:50:59
(permalink)
Talonman wrote: "So (1) 5870 scores higher than both sides of a 295, and 280 in this test? Gasp... Wonder why?"

I would have to say that DXCompute is more efficient. The part that has me puzzled is the OpenCL. A single 5870 scores 3x better than what you had there too. It is either that, or ATI really hit a home run with their latest drivers "and" architecture.
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 14:55:27
(permalink)
Profile cs_5_0 vs. cs_4_0, I think. (I edited my post above...)

Until the Combined DirectCompute test works, I don't care too much about it... My thrill comes when all the GPUs kick in.

He does have the 'Combined DirectCompute' option in the drop-down menu as a valid selection... I have to believe he is currently working on that feature too.
post edited by Talonman - 2009/12/24 15:14:52
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 19:09:14
(permalink)
I am still looking for an ATI guy to run the OpenCL test using profile cs_4_0. I would like to see what your numbers are, even if it's just for a single card.

Just for fun, I set both my 295 and 280 to the exact same clock speeds: Core=702, Shaders=1512, and Memory=1188.

Result: a new high score for me. C2575.6!!

Note that the Workload Share of each half of my 295 is now officially above 30% (almost 31%), and that my 280 is still doing the lion's share of the work at 38.43%.

I also verified that my clocks stay in 3D Performance mode for the entire duration of the test.

Needless to say, I love this app, and if the Combined DirectCompute test starts to work as well as the Combined OpenCL test does, it may become my favorite benchmark program.

I hope the Combined DirectCompute test also produces a Workload Share report. Those will also be some fun numbers to look at.
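As a quick sanity check on those Workload Share numbers, here is a tiny Python sketch. It assumes the three shares add up to 100% and that the two halves of the 295 split the remainder evenly; the report itself isn't reproduced in this post, so both are assumptions, and `remaining_share_per_device` is a made-up helper name.

```python
def remaining_share_per_device(known_share_percent, other_devices):
    """Evenly divide whatever is left of 100% among the other devices."""
    return (100.0 - known_share_percent) / other_devices


share_280 = 38.43  # Workload Share of the 280, from the post above
per_half_295 = remaining_share_per_device(share_280, 2)

print(f"Each half of the 295: about {per_half_295:.2f}%")
print(f"280 vs. one half of the 295: {share_280 / per_half_295:.2f}x")
```

Under those assumptions each half lands just under 31%, which lines up with the "almost 31%" above.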
post edited by Talonman - 2009/12/24 19:24:18
|
luv2increase
CLASSIFIED Member
- Total Posts : 2643
- Reward points : 0
- Joined: 2008/12/31 16:26:56
- Status: offline
- Ribbons : 8
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 19:34:54
(permalink)
Talonman, here is the run with CS 4.0. To be honest, the results are virtually identical when running in 5.0 and 4.0. I haven't tried 4.1 yet. I think he doesn't have it programmed correctly for a DX11 GPU with the DX11 API installed to run a lower Compute Shader version in his benchmark.

I found out something interesting. When I checked the GPU utilization before, I checked it while it was going through the OpenCL part of the test. It hits up to 98% utilization on a single GPU "only". I noticed, though, that with the DirectCompute test, I have "2" GPUs active and both have 95%-98% GPU utilization. So, OpenCL only uses 1 GPU, and DirectCompute uses 2 GPUs. It won't allow me to just do a "combined" DXCompute or "combined" OpenCL though.
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 19:40:41
(permalink)
Wow... Thanks.

So a single 5870 produces C4405 in the OpenCL test using profile cs_4_0. I would be glad for a reasonable explanation for this.

For your information, we Nvidia guys don't have a cs_4_1 profile option available.
post edited by Talonman - 2009/12/24 19:43:07
|
luv2increase
CLASSIFIED Member
- Total Posts : 2643
- Reward points : 0
- Joined: 2008/12/31 16:26:56
- Status: offline
- Ribbons : 8
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 19:55:23
(permalink)
Well, the OpenCL test doesn't rely on the "Compute Shader" level.

Compute Shader 4.0 = DX10 = DirectCompute 10
Compute Shader 4.1 = DX10.1 = DirectCompute 10.1
Compute Shader 5.0 = DX11 = DirectCompute 11

So, CS 4.0 or 5.0 doesn't have anything to do with the OpenCL part of the test. It is only applied to the DirectCompute part of the test.

edit: Hopefully, when the programmer Pat fixes the OpenCL part of the test so you can use more than a single ATI card, I will get 4400x2 or even 4400x3. 13200 would be a nice score indeed :) That is, if the scaling is 100% for OpenCL. It may be.
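Here is a rough Python sketch of the two points above: the profile-to-DirectX mapping, and what "100% scaling" would mean for the score. The `projected_score` helper and its `scaling_efficiency` parameter are hypothetical, and the 80% figure is only there to show what imperfect scaling would look like, not a measurement.

```python
# Compute Shader profile -> DirectX level, as listed in the post above.
COMPUTE_SHADER_TO_DIRECTX = {
    "cs_4_0": "DirectX 10",
    "cs_4_1": "DirectX 10.1",
    "cs_5_0": "DirectX 11",
}


def projected_score(single_gpu_score, gpu_count, scaling_efficiency=1.0):
    """Estimate a multi-GPU score.

    The first GPU counts fully; each additional GPU contributes
    scaling_efficiency (0.0 to 1.0) of a full GPU.
    """
    return single_gpu_score * (1 + (gpu_count - 1) * scaling_efficiency)


print(COMPUTE_SHADER_TO_DIRECTX["cs_5_0"])
print(projected_score(4400, 2))       # perfect scaling, 2 cards
print(projected_score(4400, 3))       # perfect scaling, 3 cards
print(projected_score(4400, 3, 0.8))  # hypothetical 80% scaling
```

Whether OpenCL across several cards actually gets anywhere near perfect scaling in this benchmark is exactly the open question.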
post edited by luv2increase - 2009/12/24 20:14:33
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 20:28:39
(permalink)
Odd...

Looking at the main thread listed in the OP, more scores are being posted.
http://www.ngohq.com/grap...pencl-benchmark-9.html

(added a 8800 gts to my 4850)
NVIDIA GeForce 8800 GTS @ 1300 MHz (10DE / 193 / 22511682)
Intel(R) Core(TM) i7 CPU 965 @ 3.20GHz (8 logical CPUs)
OpenCL: C453.4

Here are my scores with my 9300 @ stock and my 4890 at 940 core 1070 memory. Using 0.45b
OpenCL C2350.2

NVIDIA GeForce GTX 260 @ 1296 MHz (10DE / 5E2 / 20D5107D)
Intel(R) Core(TM)2 Duo CPU E8500 @ 3.16GHz (2 logical CPUs)
OpenCL: C707.3

XFX HD 5770 1GB GDDR5
OpenCL: C1042.9

So that means:
An 8800 GTS and a single 4850 produce around C453.4
A single 260 produces around C707.3
A single XFX HD 5770 1GB produces around C1042.9
A single 295 produces around C1431 using both sides of the GPU...
A single 4890 produces around C2350
A single 295 and single 280 produce around C2575
Luv's single 5870 produces around C4405

I am joining that site... I want Pat's ear!

"Thank you for registering, Talonman. Your account has been submitted for moderation by an administrator and will be activated shortly. You will be notified by email when this happens. To return to the forums, click here."
post edited by Talonman - 2009/12/24 21:00:16
|
luv2increase
CLASSIFIED Member
- Total Posts : 2643
- Reward points : 0
- Joined: 2008/12/31 16:26:56
- Status: offline
- Ribbons : 8
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 20:44:01
(permalink)
Talonman wrote: "Here are my scores with my 9300 @ stock and my 4890 at 940 core 1070 memory. Using 0.45b OpenCL C2350.2"

A 5870 has double the stream processors of the 4890, so double the performance, or almost double (4400 vs. 2350), is right on.

Nvidia just might not have the right architecture for OpenCL, or their drivers aren't mature enough yet, or Pat doesn't know how to program for multi-GPU ATI, or for Nvidia in general, when it comes to OpenCL...
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 20:47:39
(permalink)
Also note that the single 260 is about 1/2 the 295's score... Guess the numbers are good.
post edited by Talonman - 2009/12/24 20:50:21
|
luv2increase
CLASSIFIED Member
- Total Posts : 2643
- Reward points : 0
- Joined: 2008/12/31 16:26:56
- Status: offline
- Ribbons : 8
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 20:48:47
(permalink)
Talonman wrote: "Also note that the single 260, is about 1/2 the 295 score... Guess the numbers are good."

If that is the case, Nvidia won't want to port PhysX to OpenCL like they were thinking...

edit: Also, they may have stopped working with ATI to implement PhysX on their cards because they realize that the higher number of shader processors on the ATI cards, although less efficient, does a better job of parallel processing on their own CUDA!!! I think I just figured out a conspiracy again ROFL
post edited by luv2increase - 2009/12/24 20:50:32
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 20:54:12
(permalink)
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 21:58:43
(permalink)
I noticed Pat had posted on page 2...

"Setting different profiles for CPU and OpenCL does not mean anything, so you got almost the same results (it's hard to get the same results for CPU because of background tasks). The profile combobox is only enabled in DirectCompute tests and forces the DirectX shader compiler to build the GPU code for a specific shader model. The score you get is simply the number of mega kernel loops (10^6) per second that your CPU can process (using 12 threads). Higher number = better CPU performance. The scores for different APIs are comparable, so getting C1000 and M10 means your graphics card can handle 100x more calculations per second than your CPU. That's mainly because the GPU can process thousands of threads at the same time without thread switching, and the CPU usually can process 2, 4 or 8 threads."

Question: If scores for both CPUs and GPUs are generated by counting mega kernel loops (10^6) per second...

I know Nvidia shaders do more work in 1 clock cycle than ATI's. I wonder if just counting kernel loops will equate to real-world performance when comparing ATI to Nvidia in OpenCL apps?

I still have a hard time accepting that a single 5870 would actually deliver more performance than a 295 and 280 working together, all with high utilization.

I think the app gives accurate performance info when comparing Nvidia to Nvidia, or ATI to ATI, but I am still not sure about comparing Nvidia to ATI. The 'counting kernel loops' thing has me wondering now...
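Going only by Pat's description (mega kernel loops per second), the score could be sketched in Python like this. The function name `benchmark_score` and the loop counts are made up; the counts were picked so the results match Pat's C1000 and M10 example, and the real app may well count or time things differently.

```python
def benchmark_score(kernel_loops, elapsed_seconds):
    """Score = millions (10^6) of kernel loops completed per second."""
    return kernel_loops / elapsed_seconds / 1e6


# Hypothetical loop counts over a 20 second run.
gpu_score = benchmark_score(20_000_000_000, 20.0)
cpu_score = benchmark_score(200_000_000, 20.0)

print(f"GPU: C{gpu_score:.0f}, CPU: M{cpu_score:.0f}")
print(f"GPU handles {gpu_score / cpu_score:.0f}x the loops per second")
```

Note that nothing in this formula cares how much work a single shader does per clock: only finished loops per second count, which is the point being debated here.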
post edited by Talonman - 2009/12/24 22:09:54
|
luv2increase
CLASSIFIED Member
- Total Posts : 2643
- Reward points : 0
- Joined: 2008/12/31 16:26:56
- Status: offline
- Ribbons : 8
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 22:31:34
(permalink)
Talonman wrote: "Question: If scores for both CPUs and GPUs are generated by counting mega kernel loops (10^6) per second... I know Nvidia shaders do more work in 1 clock cycle than ATI's. I wonder if just counting kernel loops will equate to real-world performance when comparing ATI to Nvidia in OpenCL apps?"

You are missing exactly how it works. Clock cycles have nothing to do with it. I think you have the "per second" in your head. We aren't comparing a single CUDA core to a single ATI Stream processor here. Here is an example of what exactly this benchmark does and how it comes up with a score.

Suppose we have 2 groups of people counting apples. One group has 100 really fast apple counters (Nvidia CUDA cores), while the other group has 1000 slower apple counters (ATI Stream processors). The OpenCL GPU test takes 10-20 seconds to complete. Let us just say 20 for this example.

Each second, those 100 CUDA cores count 800 apples (8 apples per core per second). Each second, those 1000 Stream processors count 4000 apples (4 apples per core per second). So, in 20 seconds, the CUDA core group counts 20 x 800 = 16000 apples, while the ATI Stream processor group counts 20 x 4000 = 80000.

Even though each CUDA core does more work every cycle (a second, in our example), the fact remains that there just aren't enough of them to match the number of apples counted by the larger number of Stream processors.

I think the deal breaker, as we are seeing evolve here, is that the larger the number of shader cores you have, the better the parallel processing power will be, which is what DXCompute, OpenCL, ATI Stream, and CUDA processing require. We may be witnessing something "VERY" bad for Nvidia. ATI's design of "MANY" inefficient shader cores WINS when it comes to GPGPU computing. In gaming, it is just about "evened out", but in GPGPU computing we may be witnessing, as I said before, something not so nice to know about Nvidia's architecture of fewer, more efficient shader cores.
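The apple-counting example can be written out as a few lines of Python. This is only a sketch of the analogy, not of the benchmark itself: the function name is invented here, and the model assumes every counter works at a fixed rate for the whole run.

```python
def apples_counted(counters: int, apples_per_counter_per_second: float, seconds: float) -> float:
    """Total apples for a group: counters x per-counter rate x duration."""
    return counters * apples_per_counter_per_second * seconds


TEST_SECONDS = 20  # the example's assumed test length

# Figures taken straight from the analogy above.
nvidia_total = apples_counted(100, 8, TEST_SECONDS)   # 100 fast counters
ati_total = apples_counted(1000, 4, TEST_SECONDS)     # 1000 slower counters

print(nvidia_total, ati_total, ati_total / nvidia_total)
```

With these numbers the larger group finishes 5 times as many apples, even though each of its counters is half as fast.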
post edited by luv2increase - 2009/12/24 22:34:04
HEATWARE - Intel Core i7 920 @ 4.1Ghz 24/7 * Have x5650 Xeon 6c/12t want to install!!! - Corsair Dominator 12GB - EVGA x58 Classified 760 - MSI GTX 960 - MegaRAID 9260-8i Raid Card - 4 x Samsung 850 EVO 120GB in Raid-0 - 4 x Samsung EcoGreen 1.5TB - Thermaltake Toughpower 1200W - IKONIK Ra X10 SIM - Pioneer BD-RW - 46" Samsung LN46A630 1080p - Windows 10 Professional Build 10147
|
Talonman
FTW Member
- Total Posts : 1391
- Reward points : 0
- Joined: 2008/04/01 09:26:53
- Location: Ohio
- Status: offline
- Ribbons : 31
Re:DirectCompute & OpenCL Benchmark BETA, now supports Multi-GPU's.
2009/12/24 22:51:29
(permalink)
Thanks for the post. So how many Stream Processors (shaders) does your one GPU have? I have 720 between my 3 GPUs. Update: Found it. The 5870 has 1600 SPUs, 80 TAUs and 32 ROPs. So it's 720 Nvidia apple counters vs. 1600 ATI apple counters?
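Sticking with the apple-counter picture, a quick sketch of the break-even point for 720 vs. 1600 counters. It assumes throughput is just counters times per-counter rate, which ignores clocks, memory bandwidth and how well each architecture keeps its units busy, so treat it as toy arithmetic only. The function name is invented, and the 240-per-GPU split is the breakdown that adds up to the 720 total.

```python
def break_even_per_core_ratio(few_cores: int, many_cores: int) -> float:
    """How many times faster each core in the smaller group must be to tie."""
    return many_cores / few_cores


nvidia_cores = 2 * 240 + 240  # GTX 295 (two GPUs, 240 each) plus GTX 280 (240)
ati_cores = 1600              # HD 5870 figure from the post

ratio = break_even_per_core_ratio(nvidia_cores, ati_cores)
print(f"{nvidia_cores} vs {ati_cores}: each Nvidia counter needs {ratio:.2f}x the rate to tie")
```

In the earlier analogy each fast counter was 2x as quick (8 vs. 4 apples per second), which under this toy model would fall a little short of the roughly 2.22x needed.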
post edited by Talonman - 2009/12/24 23:11:29
|