My money right now is on a design defect in the hardware. The pattern reminds me of the RRoD problems: the card works completely fine under heavy load for a period of time, but once the crashes start, anything will trigger them constantly. If it were software, I think everyone would be getting it.
I can also rule out anything related to pixel clock or monitor settings, as I just booted up with no monitor plugged in and it died after a few minutes while idle.
Is there some way to get detailed event/driver logging that might provide some clues? I have MSI Afterburner configured to log basic sensor info, nothing showing up there though.
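One place that might hold clues is the Windows System event log, where display driver resets are usually recorded. Here is a minimal sketch of pulling the most recent entries with the built-in `wevtutil` tool. It assumes Windows and assumes the driver logs under the provider name `nvlddmkm` (the NVIDIA kernel-mode driver); the function names are made up for illustration, and I can't promise the entries will say anything more useful than "driver stopped responding".

```python
import subprocess


def build_query(provider="nvlddmkm", count=20):
    """Build a wevtutil command that lists the newest System-log events
    from one provider. The provider name is an assumption; adjust as needed."""
    xpath = f"*[System[Provider[@Name='{provider}']]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",   # XPath filter on the event provider
        f"/c:{count}",   # maximum number of events to return
        "/rd:true",      # newest events first
        "/f:text",       # human-readable output instead of XML
    ]


def recent_driver_events(provider="nvlddmkm", count=20):
    """Run the query and return wevtutil's text output (Windows only)."""
    result = subprocess.run(
        build_query(provider, count),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


if __name__ == "__main__":
    print(recent_driver_events())
```

Comparing the timestamps of those events with the Afterburner sensor log might at least show whether anything changes right before a crash.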
Edit: Posts on other forums about previous generations of cards seem to blame this on heat/power. I don't have a hot card, as I said, and I'm pretty sure power is fine. But I did realize that around the time this started happening, I also underclocked my CPU by choosing the cool/quiet/powersaving OC setting in the ASUS BIOS. I've set that back to default and the system seems to be surviving longer than usual. This was never a problem on previous generations of NVIDIA cards. Will update if it dies with this setting...
Edit 2: Nope, definitely not fixed.