Monday, April 8, 2013

Priebe's analogy

The memory hierarchy is one of the abstractions in computer architecture that has withstood the test of time for at least the past 30 years. It was a lot simpler in the '80s: microprocessors did not yet have caches, and while mainframes had access to tape and disk storage, most microcomputers did not have hard disk drives.

There are many figures of merit for memory. The most important consideration is the tradeoff between size and speed: you can have one or the other, but not both. There are frontiers to what is possible with memory using today's technology. No memory has the capacity of a modern hard disk with the access time of a CPU register, not for all the money in the world. However, since a computer needs to execute programs out of instruction and data memory very quickly, and it also needs to store large files for future use, multiple levels of memory have to be present in every digital system except the most basic embedded devices. Other considerations exist as well: cost per gigabyte, reliability, physical size, and expandability.
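
To make the tradeoff concrete, here's a tiny C sketch of my own (not anything rigorous) that prints the kind of rough, order-of-magnitude access times and capacities I have in mind for each level of a 2013-era desktop. The numbers are ballpark assumptions, not measurements, and they vary a lot from system to system.

    #include <stdio.h>

    /* Rough, order-of-magnitude figures for a 2013-era desktop;
       these are illustrative assumptions, not measurements. */
    struct level { const char *name; double latency_ns; const char *capacity; };

    int main(void) {
        struct level hierarchy[] = {
            { "register file",       0.3,        "hundreds of bytes" },
            { "on-chip cache",       1.0,        "kilobytes to megabytes" },
            { "DRAM (main memory)",  100.0,      "gigabytes" },
            { "SSD (flash)",         100000.0,   "hundreds of gigabytes" },
            { "HDD (magnetic disk)", 10000000.0, "terabytes" },
        };
        for (int i = 0; i < 5; i++)
            printf("%-20s ~%11.1f ns   %s\n", hierarchy[i].name,
                   hierarchy[i].latency_ns, hierarchy[i].capacity);
        return 0;
    }

Every step down the table buys you orders of magnitude more capacity at the cost of orders of magnitude more latency, which is the whole reason the hierarchy exists.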

Volatility


One characteristic of the fastest memory that I haven't mentioned yet is that it is invariably short-term. It is possible to execute code that rewrites the entire register file a million times in the blink of an eye. The SRAM on the processor, known as the cache, is short-term memory for the processor, and it is constantly updated with whatever is most likely to be used next, possibly replacing every entry every time a new process runs. DRAM modules, historically and still known as "main memory," need to be constantly refreshed, and their contents vanish when the power goes out. Flash memory is persistent semiconductor memory, but compared to a disk it cannot be written as frequently; as of this writing, typical flash endurance is about 10,000 program/erase cycles per cell. Even so, manufacturers as of 2013 are able to warranty SSDs (solid-state drives, or hard drives based on flash) for the same length as conventional disk drives. Nevertheless, the cost per gigabyte of semiconductor memory is much higher than that of magnetic disk platters, so HDDs remain dominant for persistent storage at this time.
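
Here's a quick back-of-the-envelope sketch, in C, of why a 10,000-cycle endurance can still support a multi-year warranty. The drive size, the daily write volume, and the assumption of perfect wear leveling are all mine for illustration; real drives also suffer write amplification, so the practical figure is lower.

    #include <stdio.h>

    int main(void) {
        /* Assumed figures, for illustration only. */
        double capacity_gb    = 256.0;   /* drive size in GB */
        double pe_cycles      = 10000.0; /* program/erase cycles per cell */
        double writes_per_day = 20.0;    /* GB the host writes per day */

        /* With ideal wear leveling every cell absorbs pe_cycles writes,
           so total writable data is roughly capacity * cycles.
           Real drives do worse because of write amplification. */
        double total_gb = capacity_gb * pe_cycles;
        double years    = total_gb / writes_per_day / 365.0;

        printf("Total writable data: about %.1f PB\n", total_gb / (1024.0 * 1024.0));
        printf("Estimated lifetime:  about %.0f years\n", years);
        return 0;
    }

Even if the real-world number is ten times worse than this idealized estimate, the drive still outlives its warranty period by a comfortable margin.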

The bad old days


In the beginning, there was obviously no defined hierarchy. However, even the earliest machines had registers, even if it was only an accumulator for intermediate arithmetic results. The computer programs of the 1940s to 1960s were mostly arithmetical, designed to compute artillery trajectories and to handle business bookkeeping in batch-processing runs. Instructions and data still had to be entered into the machine's memory from punched cards or tape, and "main memory" went through some ugly experimentation (UNIVAC used mercury delay line memory) until it settled on magnetic core memory around 1954. Core memory had a great run, but by the end of the 1970s it had been totally superseded by semiconductor-based DRAM.

All of these memories were slow (DRAM is still very slow compared to CPUs today, and it will never catch up), but the main performance bottleneck has almost always been getting data into memory in the first place. It took far longer to enter the punched cards and read them into the machine than it did to execute the instructions. Although memories became larger and could store ever more complex programs, they were always outpaced by the demands of those programs. Efficient memory usage was especially critical on the small-scale microcomputers that were the vanguard of the PC revolution in the early 1980s. These had a small amount of ROM that was read at power-on to run the BIOS or built-in firmware; the cheaper 8-bit home computers like the VIC-20 and Apple IIe did not have hard disk drives, and this untouchable ROM was the only persistent memory on the machine, save for external floppy disks. The IBM PC XT started shipping with a hard drive in 1983, and eventually everybody else got the idea.

Does anyone miss reading programs into memory on microcomputers using floppy disks, or, even worse, compact cassettes? Neither do I. Computers sucked back then. I can be nostalgic about some aspects of old technology, but we should be infinitely grateful for what the past 20 years has brought us in terms of computing.

In the 1980s, the processor rarely had any on-chip memory aside from the register file, and assembly coders (or compilers, for that matter) routinely used as many of the registers as they could, so there were rarely any to spare for holding data longer than a few instructions. Programs were universally loaded into DRAM, which, as previously stated, was just as slow relative to the CPU then as it is now. To get around this problem, Intel started putting L1 cache memory on-chip with the 486 processor, and the Pentium Pro brought the L2 cache into the processor package as well. Cache memory stands alongside pipelining and multiple cores as one of the most important breakthroughs in microprocessors. Since then we've added L3 cache too, since the roles of each cache level have changed a bit with multiple cores on the same die, but the memory hierarchy has remained in essentially the same form since c. 1995.
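
You can actually watch the cache doing its job with a minimal experiment. The little C program below is my own illustration, not anything from the lecture: it sums the same large matrix twice, once in row-major order and once in column-major order. On most machines the column-major walk is several times slower, purely because it squanders the cache lines that the row-major walk reuses.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    #define N 8192   /* 8192 x 8192 ints = 256 MB, far larger than any cache */

    int main(void) {
        int (*m)[N] = malloc(sizeof(int[N][N]));
        if (!m) return 1;

        /* Touch every element once so the pages really exist in memory. */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                m[i][j] = i + j;

        long long sum = 0;
        clock_t t0 = clock();
        for (int i = 0; i < N; i++)        /* row-major walk: consecutive addresses, */
            for (int j = 0; j < N; j++)    /* so each cache line is fully reused     */
                sum += m[i][j];
        clock_t t1 = clock();
        for (int j = 0; j < N; j++)        /* column-major walk: nearly every access */
            for (int i = 0; i < N; i++)    /* lands on a different cache line        */
                sum += m[i][j];
        clock_t t2 = clock();

        printf("row-major: %.2f s   column-major: %.2f s   (sum=%lld)\n",
               (double)(t1 - t0) / CLOCKS_PER_SEC,
               (double)(t2 - t1) / CLOCKS_PER_SEC, sum);
        free(m);
        return 0;
    }

Same data, same amount of arithmetic; the only difference is how kindly the access pattern treats the hierarchy.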

Priebe's analogy*

All of the various types of memory have different roles to play, and they are all needed in some way. Any general-purpose computer built nowadays has a CPU with a register file and onboard cache, needs RAM on the motherboard, and will probably also need an SSD or HDD. For truly massive data storage, tertiary storage sits below the disk.

To understand the role they each play, and the timeliness versus capacity tradeoff, I'll share with you Priebe's analogy. Suppose you are watching a football game at home. You are responsible for providing access to beer whenever one of your friends runs empty.



  1. All beer enters your mouth from a bottle in your hand. Think of a beer in your hand as a register (and when everyone at the party has a beer in their hand, you may be said to be fully utilizing the register file). It is so quickly available that there is practically no latency. However, just as you can only hold one beer in your hand, the register file can't hold much data, perhaps on the order of hundreds of bytes.
  2. You could go to the fridge each time someone wants a beer (perhaps like you once did in the olden days), but you have yourself a fancy cooler onsite. Think of the cooler as the cache. The cooler also provides a short access time: although not quite as short as bringing the bottle to your lips, you can still lean over, flip the lid, and grab a cold one in a few seconds. It takes maybe ten times longer to do this than to drink from a beer that is already in your hand, but this can't be helped. If you use an operand in a register for as long as you can and consistently find what you need in the cache, it's the best possible outcome.
  3. The fridge is the lowest level of access that could be described as readily available. Think of the fridge as main memory. You're going to take a couple of minutes to get what you need and load the cooler back up, making it much less convenient. However, it is still basically on hand: if someone has beer in their fridge, they treat it as being on the premises. Going into memory is a distinct penalty compared to using the cache, but it is only a hundred or so times slower. On a time scale where a trip to the cooler takes a few seconds, that couple-of-minutes walk to the fridge is actually about right (see the back-of-the-envelope arithmetic after this list). Still, as we're going to see, it could be a lot worse.
  4. You have a large party and the guests consume all the beer in the fridge! If the fridge is empty, this is a very bad thing. You can't cancel the festivities, but everyone is in for a long wait: you have to go to the store to get more beer. Think of the supermarket as the disk drive, or secondary storage. The difference between walking to the fridge to retrieve even one beer and driving through heavy traffic, hiking to the rearmost aisle, and standing in an enormous checkout line to buy a new supply is somewhat analogous to the difference between a memory access and a disk access. Except that a disk access is on the scale of milliseconds, which amounts to millions of cycles. It's massively slower than main memory, so it's truly a tragedy for performance whenever a disk access occurs (sometimes through a page fault), but its size is practically unlimited for most purposes.
  5. For desktops, the disk is as far down as the hierarchy goes, and it is where the example ended in class. However, I personally wanted to carry it further! If an organization had implausibly high demands for beer and denuded an entire store of its supply, it would just go to another one, and another, until its demands were met. Likewise, a user who needed more storage space than a single disk provides would simply buy more disks until the demand was satisfied. However, there are businesses whose sole role is to deal in beer and provide it to people, and for them, their own supply might exceed that of any supermarket, so it makes sense that they would incorporate a layer of supply above and beyond going to the store and buying some. Likewise, there are archives and databases of tertiary storage which hold petabytes of data, far more than any individual would normally acquire, and store it on magnetic tape libraries, swapping between sections of the library as needed. Accessing data on a tape archive amounts to billions of cycles. It's so slow that, using the same time scale as the previous examples, you could think of tertiary storage as founding a brewery and building it from the ground up into one of the world's largest breweries. It's incredibly, agonizingly slow, but archives do need to exist to back up data in huge quantities, just as someone had to spend the time to build a successful brewing company, or else we wouldn't have beer available to us in the first place.
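
Just for fun, here's the scaling spelled out in a short C sketch. It takes the same ballpark latencies as before (plus an assumed one-minute fetch for a tape robot, my number, not a measured one), stretches a single register access out to one second, and prints what everything else becomes on the beer time scale.

    #include <stdio.h>

    int main(void) {
        /* Assumed round-number latencies in nanoseconds (same caveats as before). */
        const char *level[] = { "register", "cache", "main memory", "disk", "tape library" };
        double ns[]         = { 0.3,        1.0,     100.0,         1.0e7,  6.0e10 };

        /* Stretch one register access out to one "beer second" and scale the rest. */
        double scale = 1.0 / ns[0];
        for (int i = 0; i < 5; i++) {
            double secs = ns[i] * scale;
            printf("%-12s %14.1f ns  ->  %14.1f s  (%.1f days)\n",
                   level[i], ns[i], secs, secs / 86400.0);
        }
        return 0;
    }

On that scale, a trip to the fridge takes a few minutes, a beer run to the store takes about a year, and building the brewery takes millennia, which is exactly why nobody wants a page fault in the middle of the game.
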
*- This analogy was mentioned in a lecture of my introductory computer architecture class as taught by Dr. Apan Qasem in Spring 2013. Dr. Roger Priebe is a Senior Lecturer in the Computer Science Department at Texas State University, my alma mater. Dr. Qasem is an Assistant Professor at the same institution.