AMD Zen 5 Details Emerge with GCC "Znver5" Patch: New AVX Instructions, Larger Pipelines




rjohnson11

EVGA Forum Moderator

2024/02/12 02:53:26

https://www.techpowerup.com/318991/amd-zen-5-details-emerge-with-gcc-znver5-patch-new-avx-instructions-larger-pipelines

AMD's upcoming Ryzen 9000 series processors on the AM5 platform will be built on a new core microarchitecture, Zen 5. This latest revision of AMD's x86-64 design brings several notable improvements over Zen 4, which it replaces, and targets a rumored 10-15% IPC gain. A recently posted patch set for the GNU Compiler Collection (GCC) that adds "znver5" enablement reveals some of these changes. One of the most interesting additions in Zen 5 over Zen 4 is an expanded instruction set. The new extensions span AVX, AVX-512, and several non-vector instructions: AVX-VNNI, MOVDIRI, MOVDIR64B, AVX512VP2INTERSECT, and PREFETCHI.

AVX-VNNI is a VEX-encoded counterpart to AVX-512 VNNI, the instruction set that accelerates neural network inferencing workloads, and it operates on vectors up to 256 bits wide. It makes the same VNNI operations available to code and CPUs that work with 256-bit vectors but lack full 512-bit AVX-512 capabilities. Although it is narrower in scope than AVX-512 VNNI (it has no opmasking and cannot reach the extra vector registers that AVX-512 adds), AVX-VNNI is important for spreading VNNI inferencing speedups to real-world CPUs and applications. The AVX-512 VP2INTERSECT instruction is also coming to Zen 5. Until now it has been present only in Intel's Tiger Lake processor generation, and Intel now considers it deprecated. The rationale behind its inclusion is unknown, but AMD presumably has a use case for it.
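To make the two vector extensions above more concrete, here is a rough Python reference model of what they compute. It is only a sketch of the documented instruction semantics, not code from the GCC patch: the function names are made up for illustration, the VNNI model follows the non-saturating VPDPBUSD form (unsigned bytes times signed bytes, accumulated into 32-bit lanes), and the VP2INTERSECT model takes plain lists where real hardware works on vector registers and mask registers.

```python
def _as_signed8(byte):
    """Reinterpret a raw byte value (0..255) as a signed 8-bit integer."""
    return byte - 256 if byte >= 128 else byte


def _wrap_int32(value):
    """Wrap a Python int to signed 32-bit, mimicking non-saturating overflow."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value >= (1 << 31) else value


def vpdpbusd_lane(acc, a_bytes, b_bytes):
    """Model one 32-bit lane: acc += sum(unsigned a[i] * signed b[i]), 4 pairs."""
    total = acc
    for a, b in zip(a_bytes, b_bytes):
        total += (a & 0xFF) * _as_signed8(b & 0xFF)
    return _wrap_int32(total)


def vpdpbusd_256(acc, a, b):
    """Model the 256-bit form: 8 accumulator lanes, 32 bytes per source."""
    assert len(acc) == 8 and len(a) == 32 and len(b) == 32
    return [
        vpdpbusd_lane(acc[i], a[4 * i:4 * i + 4], b[4 * i:4 * i + 4])
        for i in range(8)
    ]


def vp2intersect(a, b):
    """Return two bitmasks: which elements of a occur in b, and vice versa."""
    k1 = 0
    k2 = 0
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if x == y:
                k1 |= 1 << i  # a[i] matched something in b
                k2 |= 1 << j  # b[j] matched something in a
    return k1, k2


if __name__ == "__main__":
    print(vpdpbusd_lane(10, [1, 2, 3, 4], [1, 1, 1, 1]))   # 20
    print(vpdpbusd_lane(0, [255, 0, 0, 0], [0xFF, 0, 0, 0]))  # -255
    print(vp2intersect([1, 2, 3, 4], [3, 4, 5, 6]))         # (12, 3)
```

The second VNNI call shows why the operand order matters: the first source is read as unsigned (255) and the second as signed (0xFF is -1), so the product is -255. A single such instruction replaces the multiply, widen, and add sequence an int8 dot product would otherwise need, which is where the inferencing speedup comes from.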

Next, the execution pipelines are wider. The Zen 5 integer unit has six ALUs, compared to the four found in Zen 4. The Address Generation Unit (AGU) count also rises, from three to four. The number of floating point store pipelines is doubled, and each is 256 bits wide, so a 512-bit floating point store can be handled in a single cycle. Some instructions, such as cmov/setcc and floating point shuffles, can now be handled by all ALUs in Zen 5, whereas in Zen 4 they were handled by only two. Apparently, Zen 5 also executes most AVX-512 operations in a single pass through the pipeline, rather than using the old double pumping approach, which split each AVX-512 instruction into two 256-bit halves for processing on 256-bit wide execution units. Lastly, the patch notes that, once again, there is no difference between Zen 5 and Zen 5c cores ISA-wise, just as with Zen 4 and Zen 4c, where the latter differed only in implementing smaller caches.
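The difference between double pumping and single-pass execution can be sketched with a toy Python model. This is purely illustrative and assumes things the article does not state: the helper names are hypothetical, the "512-bit datapath" figure for Zen 5 is an assumption drawn from the single-pass description, and real scheduling, latency, and port contention are ignored entirely.

```python
def add_lanes(a, b):
    """Lane-wise 32-bit add with wrap-around, standing in for any vector op."""
    return [(x + y) & 0xFFFFFFFF for x, y in zip(a, b)]


def execute(op, a, b, vector_bits, datapath_bits, lane_bits=32):
    """Run op over a vector in datapath-sized chunks; return (result, passes)."""
    lanes_per_pass = datapath_bits // lane_bits
    total_lanes = vector_bits // lane_bits
    assert len(a) == len(b) == total_lanes
    result = []
    passes = 0
    for start in range(0, total_lanes, lanes_per_pass):
        end = start + lanes_per_pass
        result += op(a[start:end], b[start:end])
        passes += 1
    return result, passes


def store_cycles(store_bits, pipe_bits, pipe_count):
    """Cycles to retire one store if all store pipes work in parallel."""
    bits_per_cycle = pipe_bits * pipe_count
    return -(-store_bits // bits_per_cycle)  # ceiling division


if __name__ == "__main__":
    a = list(range(16))   # sixteen 32-bit lanes = one 512-bit vector
    b = [100] * 16

    pumped, pumped_passes = execute(add_lanes, a, b, 512, 256)
    wide, wide_passes = execute(add_lanes, a, b, 512, 512)

    print(pumped == wide)        # True: same answer either way
    print(pumped_passes)         # 2: double pumped on a 256-bit datapath
    print(wide_passes)           # 1: single pass on a 512-bit datapath
    print(store_cycles(512, 256, 1))  # 2
    print(store_cycles(512, 256, 2))  # 1
```

Both paths produce identical results; only the number of passes changes. That is why double pumping is invisible to software apart from throughput, and why code compiled for AVX-512 on Zen 4 should run unmodified on Zen 5.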

I wish AMD would hurry up and release these CPUs.

AMD Ryzen 9 7950X, Corsair Mp700 Pro M.2, 64GB Corsair Dominator Titanium DDR5 X670E Steel Legend, MSI RTX 4090 Associate Code: H5U80QBH6BH0AXF. I am NOT an employee of EVGA
