johnwhelan
Well-known member
The only users who absolutely need ECC memory are those who use the machine for “mission critical” work, certainly not for running games or for performance applications.
Let's go back to basics. The computer memory we use today is called DRAM, or dynamic RAM. Some of the early machines I worked on used core memory: little circular magnets, so you could turn the power off and the memory stayed in the same state. Unfortunately it was very expensive to make and rather slow to access. DRAM is much cheaper to make and has much faster access times.
A DRAM cell is basically two plates separated by an insulator, a tiny capacitor. You bung a charge, a group of electrons, on one plate and they stay there, held by the positive charge on the other plate. Over time, though, the electrons wander off and drift through the insulator.
Unfortunately, in chemistry electrons are a little odd. When we talk about them floating around in orbits we are actually simplifying things a little. The uncertainty principle says we can know where an electron is or how fast it is moving, but not both precisely at once, so we are unable to say exactly where a particular electron is at a point in time. However, to know what state a memory cell is in we need that kind of certainty. What we can do is use probability theory to say where we expect a group of them to be.
If we read DRAM fairly quickly after writing it, there is a good chance the cell will still be in the same state; leave it for a week and it will probably have leaked down to zero. So we refresh the memory cells every x milliseconds. For the sake of argument, refreshing every 30 milliseconds gives us something like a 99.99999+% chance a cell stays in the same state. Refresh more frequently and the chance is higher; less frequently and it is lower. The designers have a trade-off between cost and performance.
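To get a feel for that trade-off, here is a toy model. All the figures in it (the per-cycle reliabilities, the intervals) are invented for illustration, not real DRAM specifications:

```python
# Toy model of the refresh trade-off. Assumption: each refresh cycle, a
# single cell keeps its state with some probability, and that probability
# is worse when the interval between refreshes is longer.

def chance_correct_over(seconds, refresh_ms, p_per_cycle):
    """Probability a single cell is still read correctly after `seconds`,
    given one refresh cycle every `refresh_ms` milliseconds."""
    cycles = int(seconds * 1000 / refresh_ms)
    return p_per_cycle ** cycles

# Refreshing every 30 ms at 99.99999% reliability per cycle, over one hour:
p_30 = chance_correct_over(3600, 30, 0.9999999)
# Refresh ten times less often, with a (made-up) lower per-cycle figure:
p_300 = chance_correct_over(3600, 300, 0.99999)

print(f"every 30 ms:  {p_30:.4f}")   # about 0.988
print(f"every 300 ms: {p_300:.4f}")  # about 0.887
```

Even a single cell drifts toward unreliability over enough cycles; the refresh interval just sets how fast.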
When memory modules are tested as they come off the assembly line, a certain level of error is permitted. Brand X cheaper memory parts are often the ones that failed the first round of testing but passed on the second.
Note we are dealing in probabilities not certain states of memory such as on or off.
When we read a memory cell we are effectively counting electrons. If, for example, the cell is designed to hold 10,000 electrons, then a count of 8,000 or more is probably a one, and a count under 1,000 is probably a zero. Because electrons tend to drift around, a memory cell that starts with zero electrons can end up with a few from its surroundings. And if we have a bad connection into the cell, we might only be dumping in 8,100 electrons rather than 10,000; as long as the rule is "more than 8,000 means one", we are safe, provided we don't lose more than 100 electrons to the surroundings.
The problem is: what happens if you have 4,500 electrons? Is that a one or a zero?
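That guard-band idea can be sketched as a tiny decision function. The electron counts are the made-up figures from the paragraph above, not real cell capacities:

```python
# Classify a cell read by counting electrons, with a guard band between
# the "definitely one" and "definitely zero" thresholds.

def read_cell(electron_count):
    if electron_count >= 8000:
        return 1       # probably still charged: call it a one
    if electron_count < 1000:
        return 0       # probably empty: call it a zero
    return None        # ambiguous: the sense logic has to guess

print(read_cell(8100))   # weak write through a bad connection -> 1
print(read_cell(350))    # empty cell that picked up strays -> 0
print(read_cell(4500))   # the problem case above -> None
```

Real sense amplifiers don't return "don't know", of course; they pick a side, and that is where misreads come from.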
But 99.99999% of the time it's OK. Yes, for each memory cell; but when you have something like 6*1024*1024*1024 memory cells, a per-cell 99.99999% doesn't look quite as good: on any one pass over the memory you should expect some cells to read back wrong. Throw in that each cell is cycled 1000/30, roughly 33, times per second, and occasionally a memory cell will be read incorrectly. We can catch most of these errors by storing check bits alongside the data using an error-correcting code, or ECC (a related scheme, the cyclic redundancy check or CRC, detects errors without correcting them). We've just put our cost up, because now we need to store both the data and the check bits. Note we are still using probability theory, so if a block picks up more errors than the code can correct, we could still end up with bad data. However, the code will detect the vast majority of these cases.
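The scaling argument can be checked in a couple of lines, using the cell count and per-cell figure from the paragraph above:

```python
# Per-cell reliability compounds across billions of cells.
cells = 6 * 1024**3      # the 6 * 1024 * 1024 * 1024 figure used above
p_cell = 0.9999999       # 99.99999% chance one cell reads correctly

# Chance that *every* cell reads correctly on a single pass:
p_all = p_cell ** cells
print(f"{p_all:.3e}")    # vanishingly small: errors are expected

# Expected number of bad reads per pass:
expected_errors = cells * (1 - p_cell)
print(f"{expected_errors:.0f}")   # roughly 644 per pass
```

So at this scale the question is never whether errors happen, only how many, and how often you pass over the memory.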
So if you accept that any DRAM will throw up errors from time to time, then logically, if you need a level of confidence that the memory is working correctly, it makes sense to use an error-correcting code to protect you from them. That is exactly what ECC memory does.
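For the curious, real ECC DIMMs use a Hamming-style SECDED code (single-error-correct, double-error-detect) rather than a CRC. Here is a minimal Hamming(7,4) sketch, a simplified cousin of what the memory controller does: 4 data bits protected by 3 check bits, enough to locate and flip any single bad bit:

```python
# Minimal Hamming(7,4) code: a sketch of the idea behind ECC memory,
# not the actual code used on any particular DIMM.

def encode(d):                        # d = [d1, d2, d3, d4]
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4                 # covers codeword positions 1,3,5,7
    p2 = d1 ^ d3 ^ d4                 # covers positions 2,3,6,7
    p3 = d2 ^ d3 ^ d4                 # covers positions 4,5,6,7
    return [p1, p2, d1, p3, d2, d3, d4]   # positions 1..7

def decode(c):
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]    # re-check each parity group
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    pos = s1 + 2 * s2 + 4 * s3        # syndrome = 1-based error position
    c = c[:]
    if pos:
        c[pos - 1] ^= 1               # flip the bad bit back
    return [c[2], c[4], c[5], c[6]]   # recover d1..d4

word = encode([1, 0, 1, 1])
word[4] ^= 1                          # simulate one leaky cell flipping
print(decode(word))                   # prints [1, 0, 1, 1]: data intact
```

The check bits cost extra storage, exactly the cost trade-off described above, but a single flipped cell no longer corrupts your data.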
Cheerio John