2013/12/16 01:08:28
eduncan911
Update 2013/12/18: Added Titan SLI and Titan w/Titan PhysX with Double Precision enabled.  Added FPS line graph, showing how bad the Titan + CPU runs really were.  Converted all graphs for a much clearer view and also links to larger versions.
--
 
So it came up in another thread that a used Titan is a good purchase right now for the 6GB of RAM, which should last for the next several years.  After that, it could be repurposed as a PhysX engine.  I know, most of us would be like, "What?  A $1000 GPU for a PhysX engine?"  Well, this got me thinking... Just what exactly are the results of using a beast of a GPU for a PhysX engine?  The results are interesting...
 
 
Testbed / Gaming machine / Full-time work PC
Intel Core i7 4930k @ 4.7 GHz, 160 Bclk, HT Enabled, 16 GB RAM @ 2133 CAS9 (see sig)
2x EVGA GTX Titan w/Skyn3t V3 1006 MHz BIOS @ 1202 MHz core / 6,800 MHz Memory
1x EVGA GTX 460 EE
 
CPU-Z Validation: http://valid.canardpc.com/rabp3t

 
 
Skyn3t's GTX Titan "V3" BIOS 1006
The Titan BIOS, like almost all other 600 and 700 series BIOSes, limits overclocking, mostly through the Power Target limiter of 106% on the Titan (115% on my 780 Classified, etc).  Also, a lot of people report "+115" as an overclock when it isn't really one - the BIOS will boost to whatever clock it attempts to run at, and will lower the clock under a number of circumstances.  This is fine for the average user who does nothing to the card for cooling; but those of us who know how to manage cooling (or have better cooling) need more room to grow.
 
Enter custom BIOSes you can flash to the card.  EVGA has stated that under an RMA, the card must have its original BIOS and be write-locked.  Otherwise flashing BIOSes does not void warranty.  I really love EVGA.
 
There are a number of BIOSes available for the Titan now.  I've tried out a lot of them on my 780 Classified and Titans, and feel the best support is given by Skyn3t and his brother over on the OCN forums.  So, I picked his "V3 1006" BIOS for these tests.  Verbatim from the sparse readme file:
 
Base core clock 1006Mhz
Boost Disabled
Voltage unlocked 1.212v
Default power target 350W with 125% slide = 439w
Max fan speed adjustable to 100%

 
The base core clock is set waaay higher than the factory 837 MHz base, at 1006 MHz, a nice solid number that all air-cooled Titans can run with the stock fan profile and all.  With Boost disabled, we can run the Titan monster at exactly the same MHz for each and every benchmark.  The higher voltage limit allows us to eke out just a little more core speed.  And the increased default power target lets the card draw a lot more power before the PT limit kicks in.
 
As mentioned above, the Titans are set to an overclock of:
 
1202 MHz GPU Core*
6,800 MHz Memory
1.212V
 
* When Double Precision is enabled on GPU2 for certain tests, the GPU Core is lowered 7 steps, or 105 MHz, down to 1097 MHz.  See below.
 
 
Enabling Double Precision with Custom BIOS
I will admit, this had me a little scared at first because enabling Double Precision lowers the clock rate of the GPU core on the stock BIOS.  In my tests, both at the 1006 MHz BIOS default and overclocked, that "lowering" comes to 105 MHz on the dot.  In Nvidia's terms that is a reduction of 7 clock steps, because the Titan adjusts its core clock in 15 MHz steps (7 * 15 = 105 MHz).
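For anyone who wants to check the step math for their own clocks, here is a minimal Python sketch.  The 15 MHz step size and the 7-step drop are simply the figures observed in this post, not values taken from Nvidia documentation, and the function name dp_clock is made up for the example.

```python
# Step size and step count are the figures observed in this post,
# not values taken from Nvidia documentation.
STEP_MHZ = 15
DP_STEPS = 7


def dp_clock(core_mhz: int, steps: int = DP_STEPS, step_mhz: int = STEP_MHZ) -> int:
    """Return the core clock (MHz) after the Double Precision step-down."""
    return core_mhz - steps * step_mhz


# The two core clocks used in this post: the BIOS base and my overclock.
for label, core in (("BIOS base", 1006), ("overclock", 1202)):
    print(f"{label}: {core} MHz -> {dp_clock(core)} MHz with DP enabled")
```

If your card steps by a different amount, pass your own step_mhz and steps values instead of the defaults.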
 

 
But I am running a custom BIOS made for overclocking which, as a matter of fact, has a ~170 MHz overclock (837 to 1006 MHz) built into its "base, will not go lower" clock!  So for the first few tests, I ran with my own overclocks disabled.  With the 1006 MHz BIOS default, enabling DP dropped the core speed to 901 MHz.  With my 1202 MHz overclock, enabling DP dropped the core speed to 1097 MHz.  It made it through a few quick 3D tests, and the GPU2 core temps stayed very low.  So I felt comfortable going through the entire PhysX benchmark here.
 
To recap, when Double Precision (DP) is enabled in the tests below for GPU2, the GPU core speed drops from the overclocked 1202 MHz setting down to 1097 MHz.  This is expected: with a known DP drop of 105 MHz and boost disabled in the BIOS, we have a nice solid base to run a benchmark against.
 
 
Benchmark Software
I first attempted to use PLAGame Benchmark, but I could never get the 2nd GPU to show any usage throughout the entire benchmark.  So, I moved over to my copy of Batman Origins that came with my 780 Classifieds (gave one away here in the Giveaway forums).  It's really a great update to the franchise and I am so glad they got rid of the Microsoft account crap with this version, compared to the last Batman AC.
 
Batman: Arkham Origins

 
Batman Origins' Graphics Setup
I went ahead and maxed every freakin' setting in the game.  Why not?  It's a Titan right?
 

 
Resolution: 1920x1080 (could not enable NVIDIA Surround of 5760x1080 on a single GPU with two identical GPUs in the system)
V-Sync: off
Anti-Aliasing: TXAA (aka 4xMSAA + TXAA)
DX11 graphics enabled in every option.
Hardware Accelerated PhysX: HIGH
 
FYI: TXAA HIGH translates to 4xMSAA + TXAA HIGH.  I could have selected 8xMSAA alone.  But I felt 4xMSAA + TXAA is as smooth as you'd ever want on a 1080p monitor. 
 
 
NVIDIA Control Panel Setup
Now this gets annoying.  I originally wanted to run at my standard 5760x1080 tri-monitor resolution to really beat down on GPU1 and add more PhysX load over such a wide view.  But NVIDIA must have some dumb rule that if you have two identical GPUs in the system, SLI is required for 2D Surround.  No setting or hack I tried would get the 3rd monitor enabled over HDMI (all 3 connected to GPU1) when I had a 2nd Titan in the system as the dedicated PhysX card.  Sure, this works perfectly when I stick in my GTX 460.  But not when I had a 2nd Titan in the system.  Nvidia really wanted me to enable SLI.  I even tried disconnecting the SLI bridge, at which point the Nvidia Control Panel and popups started yelling at me to "connect an SLI bridge."  It continued to gray out the 3rd monitor even then.
 
Back on point..  I used the Nvidia Control Panel to designate which GPU, or the CPU, acts as the PhysX engine, as shown in the upper-right of the screenshot below.
 
 

 

 
SLI was always disabled when using the 2x Titans.  I used the PhysX Settings to designate either Titan GPU1 or Titan GPU2 as the PhysX engine, with GPU2 checked off as "Dedicated to PhysX."
 
 
Alright then, on with the results!
 
Batman Origins Benchmark FPS Results
 
2-way Titan SLI

 
Titan + 2nd Titan set as a Dedicated PhysX (not in SLI)

 
Titan + 2nd Titan set as Dedicated PhysX, with Double Precision enabled

 
Titan + GTX460 set as the dedicated PhysX engine

 
Titan + CPU set as the dedicated PhysX engine

 
A single Titan set as both the main display, as well as the PhysX engine.

 

 
 
 
 
The graphs speak for themselves, and the "Minimum" really is a real-world minimum during the benchmark.  
 
Note that if you have two Titans in SLI and want to play this PhysX game, or maybe another, then disable SLI and designate the 2nd GPU as your PhysX card.  Unfortunately I run with Nvidia Surround, so I have to use both Titans in SLI for tri-monitors.  But you can bet I am eye-balling my 780 Classified sitting in the closet next to me.  Humm...
 
Double Precision actually hurts, just a little.  Most likely because of the lower GPU clock rate when DP is enabled (it drops 105 MHz, as I noted above).
 
About that Titan + CPU test: Don't be fooled into thinking the CPU is just fine.  It's the worst option of all, as you can see in the FPS line graph above where it dips very, very low for two scenes.  Yes, the Titan w/CPU designated as the PhysX engine really does suck that bad, crawling at ~15 to ~30 FPS for a good long time.  Also, I couldn't believe those numbers so I ran it three times - with a fresh reboot in between.  It's dead on every time.  Point being: do NOT use the CPU for Nvidia PhysX of any kind!  Disable PhysX entirely before you even think about running it on the CPU.  It really stuttered hard at the beginning of most scenes and was not smooth at all.
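To show why a min/avg/max bar chart can hide this kind of sustained dip, here is a rough Python sketch of how an FPS log could be summarized.  It assumes one FPS sample per second and a 30 FPS threshold, the function name summarize is made up for the example, and the sample list is invented for illustration - it is NOT my benchmark data.

```python
from statistics import mean


def summarize(fps, threshold=30.0):
    """Summarize a list of FPS samples (assumed one sample per second)."""
    longest = current = 0
    for value in fps:
        # Track the longest unbroken run of samples below the threshold.
        current = current + 1 if value < threshold else 0
        longest = max(longest, current)
    return {
        "min": min(fps),
        "avg": mean(fps),
        "max": max(fps),
        "samples_below": sum(1 for v in fps if v < threshold),
        "longest_dip": longest,
    }


# Made-up samples for illustration only; these are NOT the benchmark numbers.
sample = [60, 58, 22, 18, 25, 61, 63, 27, 59]
print(summarize(sample))
```

The "longest_dip" figure is the one a bar chart cannot show: two runs with the same average can feel completely different if one of them spends long stretches under the threshold.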
 
I also felt using the Titan + GTX460 was actually a lot smoother than the standalone Titan runs, even though there were higher FPS maxes.
 
 
GPU Utilization
 

 

 
First, you can overlay both of these graphs if you like.  The timecodes are exactly the same across both graphs.
 
You'll notice that the Titan Standalone run is pegged at 99% the entire time.  To me, this is a clue that PhysX and Graphics Rendering within the game are equally matched at this resolution.  Perhaps the game developers optimized all graphics rendering and PhysX rendering to be on a single GPU, as that is how the vast majority of players will use the game (and across consoles).
 
Note that the main GPU1 utilization actually doesn't peak at 100% when using a dedicated PhysX engine.  But, surprisingly, which PhysX engine you use really does affect how much GPU1 gets used.  This may indicate that the graphics rendering is bottlenecked by the response of the PhysX GPU: the faster the PhysX GPU, the less bottlenecked (blocked) the primary GPU.  
 
Also note that during the Titan SLI runs, Nvidia with "Auto" selected as my PhysX engine decided to use my GPU2.  This may account for the very big usage difference between GPU1 and GPU2.  As a test, I manually selected GPU1 as the PhysX engine with SLI enabled.  The result was the Titan SLI graph shown above with GPU1 and GPU2 flipped - otherwise it was identical.  
 
 
Conclusion
 
I would summarize by saying to choose your PhysX engine wisely.  Don't rely on that old 9800 GTX thinking it's just fine - it actually directly affects your FPS.  You want the biggest and baddest "dedicated" GPU you got as your PhysX card.  And, Double Precision doesn't matter - at least at this resolution.
 
I personally can only take this test with a grain of salt as I only game at 5760x1080 / 6000x1080 resolutions.  With Nvidia restricting Surround mode on a single GPU when two identical GPUs are in the system, I was unable to fully utilize the hardware for accurate results.
 
So yeah, if I had the money, I'd take a 3rd Titan as my PhysX card please.
 
And donate the GTX 460 for some other purpose...
 
 
 
 


2013/12/16 01:21:11
eduncan911
reserved
2013/12/16 03:35:15
SeeThruHead
What would your recommended card be for physx for someone running a single 780ti or Titan?
2013/12/16 05:32:57
eduncan911
At this point, I'd say as fast a GPU as you can get. You can see in the bar chart that using a Titan for PhysX vs a GTX460 for PhysX is a 33% difference.

Also, just for giggles, I ran the benchmark with two Titans in SLI, letting Nvidia auto-decide which GPU to use for PhysX - it was worse than using a single Titan with another Titan as PhysX! Just a little better than a single Titan alone. Most likely hitting a CPU scaling issue.

I may have captured those data points, but I'm travelling now and can't update it for a few days.
2013/12/16 10:11:28
eduncan911
Thanks for the blue ribbon!
 
I was thinking... To really show how poorly the CPU performs as a PhysX engine, I need to post graphs of the FPS - which I logged as well.  That FPS bar chart is misleading - it makes the CPU look like a viable alternative.  It isn't, trust me.  Problem is I am travelling, so I'll do that when I am back at the machine.
 
I also got a PM that the attached images aren't visible for people with less than 50 posts.  I'll move them to my server as well when I am back home.  
 
So, stay tuned for more updates...
 
2013/12/16 12:03:21
FalconX79Dark
Hi eduncan911,
 
Thank you for running this incredible test.  I am new and not sure how ribbons are awarded, but I definitely think this deserves one as I can't find this information anywhere else online.  Also, thank you for the level of detail you have provided.  I found one Titan SC card left, and TigerDirect won't price match the NCIX price because NCIX is out of stock, and they won't offer the Nvidia bundle if I buy in the store.  Looking forward to reviewing your notes again. 
2013/12/16 12:34:10
staba2009
Great info!! I've been looking for a test like this. Can you please run a test with Titan + Titan for PhysX but with DP enabled on the dedicated PhysX card? I would like to see if double precision helps with physics calculations. (I wonder if a Titan with DP on would be the best choice for a physics card when Maxwell comes out.)
2013/12/16 13:00:31
eduncan911
staba2009
Great info!! I've been looking for a test like this. Can you please run a test with Titan + Titan for PhysX but with DP enabled on the dedicated PhysX card? I would like to see if double precision helps with physics calculations. (I wonder if a Titan with DP on would be the best choice for a physics card when Maxwell comes out.)


I totally forgot that was exactly one of my goals! I even mentioned that goal in the other thread I link to at the top of the post as one of the very reasons to run this test!

OK, got two things to update now... FPS graph and a Double Precision-enabled Titan+Titan run. Will do in a day or two when I get home.

Though, I already speculate it would not matter and will even hurt the scores because DP lowers the clock and memory speeds. Then again, I am running a custom bios at a custom speed.... Humm.

Don't think it will matter much, mostly because the program has to be written to use those extra long decimal places. The one review I found that describes "Double Precision" (the only one) says that enabling double precision goes from 53.525 to 53.525773 (doubling the decimal places). I don't think that is it though. I think it just enables the ability to use doubles in place of floats, or the like. For example, say a value was calculated to be 67242.64. Storing this as a double instead of a float doubles the space available for the value (64 bits instead of 32, i.e. two 32-bit memory slots), which sounds much more plausible to me. I guess I could dig into the CUDA developer notes, as I am a member, to see exactly.
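As a rough illustration of what single vs double precision storage means, here is a small Python sketch using the standard struct module.  It only shows how the same value is stored in 32 bits vs 64 bits on the CPU side; it says nothing about how PhysX or the Titan's DP mode actually uses doubles, which I have not verified.  The helper name as_float32 is made up for the example.

```python
import struct


def as_float32(value: float) -> float:
    """Round-trip a Python float (64-bit) through 32-bit single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


value = 67242.64  # the example value from this post

print(struct.calcsize("<f"), "bytes for single precision")
print(struct.calcsize("<d"), "bytes for double precision")
print(repr(as_float32(value)))  # nearest value a 32-bit float can hold
print(repr(value))              # the same value held as a 64-bit double
```

The single precision copy comes back as 67242.640625 rather than 67242.64, so the extra 32 bits buy more significant digits rather than literally doubling the decimal places.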
2013/12/18 10:06:21
eduncan911
Update 2013/12/18:
 
Added Titan SLI and Titan w/Titan PhysX with Double Precision enabled.  
Added FPS line graph, showing how bad the Titan + CPU runs really were.  
Converted all graphs for a much clearer view and also links to larger versions.
 
Whew... Think I am done with this.
2013/12/18 10:48:45
HeavyHemi
eduncan911
Update 2013/12/18:
 
Added Titan SLI and Titan w/Titan PhysX with Double Precision enabled.  
Added FPS line graph, showing how bad the Titan + CPU runs really were.  
Converted all graphs for a much clearer view and also links to larger versions.
 
Whew... Think I am done with this.


Something else I'm not sure you're aware of. When dedicating PhysX to CPU in the Nvidia Control Panel it sets the PhysX setting in game to normal. You can't set it to high. It will stay on normal until you set PhysX back to either auto or in some manner using a GPU in the Nvidia Control Panel. Then you have to reset the PhysX to High in the game options and restart the game. I bring this up because your SLI TITAN FPS look like that might have been run with PhysX on normal instead of high. Also it reveals, that since the game forces PhysX to normal when using the CPU, the performance for even reduced PhysX on the CPU is terrible as you indicated. This line is a bit confusing as related to your graphs: "SLI was always disabled when using the 2x Titans.  I used the PhysX Settings to designated either Titan GPU1 or Titan GPU2 as the PhysX engine, with GPU2 checked off as "Dedicated to PhysX."  Did you run with the TITANS in SLI with PhysX on auto with just the two TITANS installed?
