ECC RAM: what it is, how it works and how it differs from standard RAM

ECC memory modules are often used in servers and business workstations, but what are the main differences compared to standard RAM?

When we talk about servers and computer systems for professional use, we often refer to ECC (Error Correction Code) RAM.
Unlike non-ECC RAM, which is more common in home and consumer systems, ECC RAM is designed to detect and correct bit-level errors when reading and writing data. Online catalogs offer a wide choice of both ECC and non-ECC modules: let's take a closer look at the main differences between the two.

Our devices are constantly moving data to and from RAM. Most of the time this is a "painless" process, but of course things do not always go smoothly.
Design errors, manufacturing defects, thermal stress, overvoltage and overcurrent, and electromagnetic interference (that is, both physical and environmental factors) can cause errors while the memory is operating.

The RAM that most non-business users rely on does not integrate error correction algorithms: this is non-ECC RAM. Errors can still be managed to some extent, which reduces the risk of the device crashing.
Non-ECC RAM leaves error handling to software: in some cases the operating system can recognize faulty memory and handle it accordingly, for example by avoiding allocating data in areas known to be damaged and using other parts of memory that work properly. A bit flip that goes unnoticed, however, is not corrected.

Conversely, ECC RAM uses an error correction algorithm to detect and correct bit-level errors that occur during data processing. The algorithm relies on an error-correcting code that adds extra bits to each word stored in memory, so that errors can be detected and corrected automatically.

In the context of RAM, the term word refers to the sequence of bits that is read or written simultaneously. In other words, it is the amount of data that the memory can transfer in a single read or write operation.

In practice, when a word is read from ECC RAM and an error is detected, the algorithm uses the extra bits to correct the error automatically and return the correct data. This correction happens in the background, without interrupting the user or the running applications.

The extra bits associated with each stored word are also known as check bits and are used to detect errors when reading or writing data in RAM. The Hamming code, for example, uses the extra bits to build a check vector computed from the stored data. When the data is read back, the check vector is used to verify that the data was stored correctly. If a single-bit error has occurred, the error-correcting code can use the information in the check vector to determine which bit is wrong and correct it automatically.
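To make the idea of check bits concrete, here is a minimal sketch of the classic Hamming(7,4) code in Python. It is only an illustration: real ECC modules protect much wider words in hardware, and the function names below are made up for this example.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword.

    Positions are numbered 1..7; check bits sit at positions 1, 2 and 4,
    data bits at positions 3, 5, 6 and 7.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected_codeword, error_position).

    error_position is 0 when no error is seen; otherwise it is the
    1-based position of the single bit that was flipped back.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # binary value = position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1
    return c, syndrome


def hamming74_data(codeword):
    """Extract the 4 data bits from a 7-bit codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


if __name__ == "__main__":
    stored = hamming74_encode([1, 0, 1, 1])
    corrupted = list(stored)
    corrupted[4] ^= 1  # simulate a bit flip at position 5
    fixed, position = hamming74_correct(corrupted)
    print("error at position", position)
    print("recovered data:", hamming74_data(fixed))
```

The three recomputed parities form the syndrome: read as a binary number, it points directly at the flipped position, which is why a single-bit error can be repaired without knowing the original data. This small code cannot cope with two flipped bits in the same codeword.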

Using ECC RAM improves the reliability and stability of computer systems, reducing the risk of system crashes and loss of critical data. However, ECC RAM is generally more expensive than non-ECC RAM and requires a motherboard that is compatible with the technology. That's why this type of memory is mostly used in servers and in high-reliability computer systems.

As we saw in the article on how to choose RAM, ECC modules also differ in per-module capacity, clock speed, latency and, obviously, price, but also in one further important parameter: the number of correction bits.
ECC RAM uses different levels of error correction, depending on the number of correction bits: some ECC memories use single-bit correction (SEC), others use two-bit correction (DSEC), while still others use three-bit correction (TSEC). The scheme most commonly found on standard ECC modules is SEC-DED, which corrects single-bit errors and detects, without correcting, double-bit errors.
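How the number of check bits grows with the word size can be estimated with the standard Hamming condition: a single-error-correcting code with m data bits needs the smallest r such that 2^r >= m + r + 1. The sketch below applies that formula; the function names are invented for this example, and it covers only single-bit correction (optionally with one extra overall parity bit for double-bit detection), not the multi-bit schemes mentioned above.

```python
def sec_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """SEC plus one overall parity bit, which adds double-error detection."""
    return sec_check_bits(data_bits) + 1


if __name__ == "__main__":
    for width in (8, 16, 32, 64):
        print(width, "data bits:",
              sec_check_bits(width), "check bits for SEC,",
              secded_check_bits(width), "for SEC-DED")
```

For a 64-bit word the formula gives 7 check bits, or 8 with double-bit detection, which matches the 72-bit width (64 data bits plus 8 check bits) of common ECC modules.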

Modules with two- or three-bit error correction are used in systems, such as some types of servers and workstations, where stability is a top priority and data corruption cannot be tolerated.

Is it possible to use ECC RAM on consumer devices?

Most consumer systems (think desktop PCs and notebooks) do not support ECC RAM. However, there are some notable exceptions: some high-end consumer motherboards may support ECC RAM, but the user needs to verify this carefully.
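On Linux, one way to check whether ECC is actually active, rather than merely installed, is to look at the kernel's EDAC (Error Detection and Correction) counters in sysfs. The sketch below reads the corrected (`ce_count`) and uncorrected (`ue_count`) error counters for each memory controller. It assumes the EDAC driver for the platform is loaded; the function name and the `root` parameter are made up for this example, the latter only so the function can be pointed at a test directory.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    An empty result means no EDAC memory controller is exposed, which
    may mean ECC is not active or simply that no EDAC driver is loaded.
    """
    root_path = Path(root)
    counts = {}
    if not root_path.is_dir():
        return counts
    for controller in sorted(root_path.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((controller / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None  # counter missing or unreadable
        counts[controller.name] = entry
    return counts


if __name__ == "__main__":
    result = read_edac_counts()
    if not result:
        print("No EDAC memory controllers found.")
    for controller, entry in result.items():
        print(controller, entry)
```

An empty result is not proof that ECC is unsupported. As a cross-check, `dmidecode -t memory` (run as root) reports an "Error Correction Type" field for the installed memory.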

On AMD chips, ECC memory is "unofficially" supported. That means it's not an advertised feature, but it's also not something the company is closing the door on.
On the Intel side, ECC memory is supported on a handful of chipsets as of 2021, although finding an ECC-capable motherboard that accommodates consumer processors can be challenging. Unlike AMD, which leaves it up to motherboard manufacturers whether to implement ECC support, Intel limits the number of chipsets compatible with memory that integrates automatic error correction. Compatible motherboards can therefore be found, but they are enterprise-grade boards rather than true consumer products.

Why isn’t ECC RAM used on all devices?

Linus Torvalds, father of the Linux kernel, has criticized Intel for having, in his view, led the industry not to support ECC memory on the consumer side, arguing that ECC could have brought, and could still bring, benefits to consumers.

There are, however, some disadvantages to using ECC memory. Because of the error correction process, ECC RAM is slightly slower than regular RAM, by roughly 2% to 5%. The additional features also mean a higher cost, roughly 10% to 20% more than standard RAM.

Consumer RAM is currently very stable and errors generally occur very rarely; in any case, an occasional reboot is tolerable. The situation is radically different for corporate servers and workstations, where even the temporary unavailability of a service could be a problem.

Images used as “thumbnails” for this item are from the Amazon listing of Kingston Server Premier 16GB 2666MT/s DDR4 ECC memory.
