AMD X670 Chipset and Ryzen 9 Early Adopter Woes
A few days ago, I upgraded my desktop from an AMD Ryzen 7 2700 with 64 GB of DDR4 memory to a Ryzen 7950X with 128 GB of DDR5 memory. I had to change motherboards, of course, from an ASUS Prime B350-Plus to an MSI PRO X670-P, and decided to get a new cabinet as well. The motherboard, CPU, RAM, and cabinet were put together at the shop. Back at home, we transferred my existing PSU (an NZXT C850), GPU (an MSI GeForce RTX 3080 Gaming Z TRIO LHR[1]), the four hard disks (12 TB Western Digital drives), a Samsung EVO Plus 1 TB NVMe SSD, and an older Samsung 480 GB SATA SSD from the existing cabinet. We powered it on with just the display, keyboard, mouse, and power connected. The lights turned on, the fans started spinning, and… the display remained blank before switching to power saving mode.
Apparently, the first boot of an X670 chipset after losing mains power can result in long wait times because it first goes through ‘memory training’: running the RAM through various timings to find stable and performant settings. I was told this had happened when booting at the shop too (with a different external GPU and display). It seems like a useful feature in theory. However, in this case, we saw the motherboard’s ‘EZ Debug LEDs’ were on: the red and yellow LEDs were solid the entire time, which the manual told us indicated issues with the CPU and RAM, respectively. We tried reseating the RAM, the oversized CPU cooler, and the GPU. We even waited 25 minutes in case it was doing something strange and needed extra time, but it wouldn’t boot. Given that it was already late, there was no choice but to take it into the shop the next morning.
Fixing it ultimately took no time. The root cause? They had set the RAM to the rated speed (5200 MHz) instead of leaving it on auto and letting the motherboard figure it out. More on that in a moment.
With the computer returned to me in working condition, I got to tweaking. I enabled Eco Mode, which I’d heard is very effective on the 7950X. I also enabled A-XMP (automatic, motherboard-controlled memory timings, separate from the normal training). That turned out to be a bad idea. The computer booted but Windows threw up a BSOD and thereafter declared my installation was corrupted. All this was not helped by my having just tried to install the latest feature update via Windows Update, which seemed to have failed.
From then on, every time I tried to start Windows, I was shown an error screen. From there, I could choose to enter recovery mode, which then rebooted instead of repairing anything. Even when I tried my recovery USB drive, it immediately rebooted. All of this was complicated further by my use of VeraCrypt’s full-disk encryption, which installs its own boot loader to decrypt the drive before passing control to the main boot loader.
I finally returned to the recovery screen by resetting the BIOS (incidentally, the recovery drive was plugged in this time but not booted from); in particular, the Memory Context Restore setting is meant to minimize the need for memory re-training, but it seemed to be causing instability. I elected to repair the Windows installation, sat through two reboots while it rolled back the half-applied update, and was greeted by my login screen. Everything was normal within Windows itself.
Based on this experience and comments from the shop, it looks like the motherboard needs a few more BIOS revisions before it settles down. Right now, it’ll only boot with an automatically-selected memory speed of 3600 MHz, and even then, not entirely reliably. Going up to 5200 MHz leaves it booting successfully only one out of every 10 or 20 times, and there’s no way to tell which that’ll be since there’s always an arbitrary delay… apart from the few times it somehow boots instantly. (Note: I ran a full memory diagnostic without trouble; the RAM doesn’t appear to be defective.) What’s more, the motherboard package is mysteriously missing a jumper cap, so I have no way to restore the CMOS settings to the default if it doesn’t boot.
The mysterious instability has been explained: according to the official specifications, going beyond two RAM sticks lowers the maximum memory speed to 3600 MHz for all four new Ryzen processors. The only remaining question is why the motherboard’s A-XMP settings tried to run at 5200 MHz in the first place. I suppose it must have been an oversight. Either way, I’m relieved to know my hardware isn’t faulty.
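To make that rule concrete, here is a minimal Python sketch of the check I wish the firmware had done. It only encodes the two figures mentioned above (the 3600 MHz ceiling once there are more than two sticks, and my kit's 5200 MHz rating). The function name and the choice to leave two-stick setups uncapped are my own assumptions for illustration, not anything taken from AMD's or MSI's documentation.

```python
# Ceiling quoted in the post for configurations with more than two sticks.
MAX_SPEED_ABOVE_TWO_STICKS_MHZ = 3600


def safe_memory_speed(stick_count: int, rated_speed_mhz: int) -> int:
    """Return the speed worth configuring, given the stick count.

    Hypothetical helper: with one or two sticks it simply trusts the
    kit's rated speed (an assumption; no separate cap is modelled).
    """
    if stick_count < 1:
        raise ValueError("at least one RAM stick is required")
    if stick_count > 2:
        # More than two sticks: never exceed the lower platform ceiling.
        return min(rated_speed_mhz, MAX_SPEED_ABOVE_TWO_STICKS_MHZ)
    return rated_speed_mhz


if __name__ == "__main__":
    # A 5200 MHz kit spread over four sticks versus two sticks.
    print(safe_memory_speed(4, 5200))
    print(safe_memory_speed(2, 5200))
```

Under these assumptions, a four-stick build like mine should be left at 3600 MHz no matter what the kit's label says, which matches the only speed the board would boot at.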
The problem that persists is that only alternate reboots work. This is a nuisance when running Windows updates.
What I originally wanted was a brand new Genoa-based EPYC. One of the biggest draws of that range for me, as someone who works with a lot of containers, VMs, IDEs, and memory-devouring Electron applications alongside some gaming, is the support for large quantities of ECC RAM (which adds error detection and correction). The original plan was to quadruple my 64 GB. I’d been waiting for the new platform since the middle of the year, but a full month after the November release, I couldn’t find any information on availability or pricing even abroad, let alone in India.
Thus it was that in December I lost patience and settled on the also new 7950X (my first time buying anything right at the top of the range, come to think of it). The Ryzen CPUs only support 128 GB of memory, sadly, so I’m at the limit. I could also have looked at the Threadripper line, with its 2 TB limit, but the newest models at the time of writing were released this past March and are based on the previous generation of the Zen architecture. In any case, if I were sacrificing clock speed for core count, it would only be for EPYC.
As it happens, I almost decided to get a 7900X instead when I heard the shop recommended a liquid cooling setup with the 7950X—I have never used and would rather continue never to use liquid cooling—but some cursory research showed that a ridiculously large air cooler is perfectly adequate for the 7950X. I chose the Noctua NH-D15.
My old cabinet was the reliable and unremarkable CoolerMaster HAF 912 Plus. The new one is a Lian Li LanCool III. It’s noticeably bigger and features RGB fans as well as tempered glass side panels, all of which I consider disadvantages. Indeed, the main reason for choosing this cabinet was the paucity of mid-tower cases with room for four SATA drives, although it’s true that my options were additionally limited by the size of the 3080 and the new cooler, and the larger size lends itself to healthy airflow as well as easy access and tidy cable management.
I spent a lot of time trying to figure out how to disable the psychedelic colour scheme of the fans, which the shop had taken care of on the second day but which was undone when I reset the BIOS. You can turn off the colours for one power cycle by holding down the ‘M’ button on the front of the cabinet for 5 seconds; the permanent solution, though, is to enter the BIOS settings and turn off EZ LED Control. I still need to see how to disable the completely independent lights on the GPU, which the glass side panel puts on full display.
All in all, this isn’t quite the upgrade I’d planned, but it’s a giant improvement on my 2018 hardware. Everything’s suddenly snappy and responsive despite my not having reinstalled Windows yet. I also like having a fully functional array of USB ports. Only one of the two USB ports on the front of the old cabinet ever worked, and it was the 2.0 one at that; in addition, the old motherboard’s rear I/O panel had its own problems, so I only had two fully functional 3.0 ports and two fully functional 2.0 ports. (None of that included USB-C, of course.)
[1] Purchased at what turned out not to be the peak of the crypto- and Covid-induced price gouging in March. It replaced a freshly-ailing-but-always-underwhelming Quadro RTX 4000 that I bought when I was living in 2019.