Electricity runs through circuits in much the same way that water runs through plumbing. Voltage is like the water pressure. Electric current is like the amount of water. A capacitor is like the tank on the back of the toilet.
There is one slight problem with the plumbing analogy. You fill a bathtub with water using the water pressure in the line, but you empty a bathtub by opening the drain and letting gravity pull the water down. Circuits, however, use both positive and negative voltage, so electric charge drains from a circuit at the same speed at which it fills, and both speeds are determined by the voltage.
If you increase the water pressure, you can fill a glass or a bathtub more quickly. If you increase the voltage on a CPU chip, it can perform its operations more quickly. However, at a higher voltage the CPU uses more power and generates more heat. If all you need to know is the difference between on and off, then it doesn’t matter if you are talking about a 100 watt light bulb or an LED that uses a fraction of a watt.
Most casual computer users can get along quite nicely with an Intel Atom CPU that uses 4 watts of power. However, you can also buy a Quad Core top of the line CPU chip that can use as much as 125 watts.
To indicate a 1, you have to fill something. It doesn’t matter what you fill as long as you can quickly fill or empty it and you can accurately measure whether it is full or empty. Obviously, it takes less time to fill a shot glass with water than it does a bathtub. Not only is it faster, but it takes less water (current), and you can do it with a much lower water pressure (voltage). You can also empty it faster.
So by making a computer circuit smaller, it can operate faster, at lower voltage, using less power, and generating less heat. Circuit size is identified by the width of the smallest circuit that can be constructed, but the actual size of a circuit is an area in two dimensions rather than just a width. Recent generations of circuit sizes have been 90, 65, 45, and now 32 nanometers. Once these numbers are squared to turn them into areas, each new generation has circuits roughly half as large, or alternately twice as many circuits on the same sized chip.
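The "half as large" claim can be checked with a little arithmetic. A quick sketch, using the node widths named above:

```python
# Square each process node's width to compare circuit areas between
# generations; each shrink comes out near one half of the previous area.
nodes_nm = [90, 65, 45, 32]
areas = [n * n for n in nodes_nm]

ratios = [new / old for old, new in zip(areas, areas[1:])]
for n, r in zip(nodes_nm[1:], ratios):
    print(f"{n} nm generation: {r:.2f} of the previous area")
```

The ratios land between roughly 0.48 and 0.52, which is why "half" is a fair summary.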
For decades we have referred to computer chips as “silicon” because that is the material on which they are constructed. However, starting with the old Pentium 4 chip, these smaller circuit sizes started to have a problem. As the size of circuits got smaller, so did the insulating layer of silicon dioxide between circuits and wires. The insulation works at larger sizes, but when the distance between the wires gets small enough, some current leaks across to nearby wires. Circuits are “on” or “off”, much like a faucet. However, as circuits became smaller, one set in the “off” position began to leak current, much like a dripping faucet.
The solution was to replace the silicon dioxide with other materials that provide better insulation (Intel calls its hafnium-based replacement a “high-k” dielectric), but that change in manufacturing process took much of the last decade to accomplish. Today the latest generation of chips not only has much smaller circuits, but they also don’t leak as badly.
Flanders and Swann wrote a song about the laws of thermodynamics. Their summary runs “Heat is work and work is heat.” In plumbing, it takes work to move the water up to your second floor bathroom. You don’t see the work because it is being done by massive pumps at the Water Company. However, if you had to pump the water yourself, or carry it upstairs in buckets, you would immediately recognize that work is involved. As you build up a sweat, you realize that work is heat.
The smaller a circuit is, the less work is required. It requires a lot less effort to carry up the stairs enough water to fill a shot glass than it does to fill a bathtub.
Every modern CPU chip generates more heat than it can tolerate. By itself, the chip will overheat in a few seconds and stop running. A few years back, someone posted pictures on the Web of a test in which they cooked an egg on a standard CPU chip. In a real computer, however, heat is just a waste product that must be discarded.
Cooling a CPU follows essentially the same rules as cooling a car engine. You can pump liquid through a block of metal that covers the chip, then vent the heat through a radiator. Most computers today, however, opt for a simple block of metal with cooling fins (a “heat sink”) that makes the CPU air-cooled. Anyone alive in the ’60s may remember that the old Volkswagens had air-cooled engines that worked the same way.
Smaller circuits generate less heat, but the heat they do generate is concentrated in a smaller area. This may require better cooling. The most recent processors require a layer of copper between the CPU and the heat sink, because copper conducts heat better than aluminum. The very best heat sinks are all copper, but that presents another problem. Copper is much heavier than aluminum, and an all copper heat sink can add more weight than the motherboard can support, particularly when the system is moved.
If a zero bit is represented by empty and a one bit is represented by full, then under ideal circumstances every measurement would find every circuit either entirely empty or entirely full. However, there are slight imperfections in the material or manufacturing process for each circuit. Some may fill or empty slightly slower than others. So the computer is designed with some tolerance. If the circuit is less than 1/3 full, we may treat it as “empty”. If it is more than 2/3 full, we may treat it as “full”. In between, the status is indeterminate.
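That tolerance rule can be sketched in a few lines. This is only an illustration of the 1/3 and 2/3 thresholds described above; the function name and the idea of returning a value for "indeterminate" are mine, not part of any real chip design:

```python
def read_bit(fill_level):
    """Interpret a measured charge level between 0.0 (empty) and 1.0 (full)."""
    if fill_level < 1 / 3:
        return 0      # close enough to empty: read as a zero bit
    if fill_level > 2 / 3:
        return 1      # close enough to full: read as a one bit
    return None       # in between: status is indeterminate
```

So a circuit measured at 0.9 full still reads cleanly as a 1 even though a slight imperfection kept it from filling completely.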
Intel or AMD test every circuit in every chip they make. They apply a standard voltage, then fill or empty the circuits and measure them. Some chips will be nearly perfect and will operate at the highest speed. Other chips may have circuits that run a bit more slowly and take longer to fill or empty. They will be sold to run at a slower speed.
The target for a CPU chip is that it makes a mistake once in a decade, or even once in a century. Of course, they cannot test each chip for years. So what they do is run the test at a lower voltage and a higher clock speed than the chip is designed to use. Then they test it for some number of hours. If the chip works during the test under these extreme conditions, it will certainly work under normal conditions at the normal speed and voltage.
It appears that the industry standard slop factor is about 33%. That is, if a packaged chip running at its normal speed and voltage is to stay within the allowable error tolerance (once per century or whatever), then when run at a clock rate one third faster, or at a power level one third lower, it will still be reliable enough to make errors only a few times a day.
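Putting the binning and the slop factor together, the sorting logic might look like the sketch below. The speed grades and the exact 33% margin are illustrative numbers, not actual Intel or AMD figures:

```python
def bin_chip(measured_max_mhz, grades=(3000, 2666, 2400), slop=0.33):
    """Assign the highest speed grade the chip passes with a 33% margin.

    To be sold at a given grade, the chip must run error-free during the
    stress test at a clock one third faster than that grade.
    """
    for grade in grades:          # grades are listed fastest first
        if measured_max_mhz >= grade * (1 + slop):
            return grade
    return None                   # fails even the slowest grade: reject
```

For example, a chip that survives testing at 4000 MHz qualifies for the 3000 MHz grade, while one that tops out at 3500 MHz only qualifies for the 2400 MHz grade.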
Of course, if you are using your computer to make financial calculations for your business, then that rate of errors is absolutely unacceptable. However, if the only application of your computer is to listen to music and blast alien invaders from the planet Zoron, then a higher error rate is fine. Most of the time a single failure in a calculation will simply move one pixel around on the screen for a split second and you will never notice it.
So gamers take advantage of the slop factor built into the chip manufacturing and testing process to run their CPU chips faster by manually setting the clock speed to a higher than standard rate. This makes games run faster. For the rest of us, CPU power is almost never the thing that is limiting performance. Business and casual users do not use more than a fraction of the power they already have, and messing with the clock speed is not helpful even if it is possible.
However, the most recent generation of Quad-Core Intel CPU chips has a feature called Turbo Boost. If you are doing some work that uses one or two of the cores, but you are not using all four, then the chip turns off the two cores you are not using and increases the voltage and clock speed on the two remaining cores. Essentially, it turns the underutilized 4 Core CPU chip into a higher speed 2 Core chip.
Most of the time you are using your computer it is doing very little work. Then you get up from the machine to put on a pot of tea, and the computer is doing nothing at all. When a modern CPU chip goes idle for a few seconds, it can enter a low power phase. It drops the clock rate in half and then lowers the voltage level. The CPU is still running, waiting for something to do, but during this idle phase the CPU uses much less power and generates less heat.
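The payoff of that idle phase follows from the standard rule of thumb that dynamic CMOS power scales with frequency times voltage squared. A sketch, where the half-clock figure comes from the paragraph above and the 80% voltage figure is just an illustrative assumption:

```python
def relative_power(freq_factor, voltage_factor):
    # dynamic CPU power scales roughly as frequency * voltage^2
    return freq_factor * voltage_factor ** 2

full_speed = relative_power(1.0, 1.0)
idle_phase = relative_power(0.5, 0.8)   # half the clock, 80% of the voltage
# idle_phase comes out to 0.32, roughly a third of full power
```

Cutting the clock alone saves power linearly; it is the voltage drop that makes the savings dramatic, because voltage is squared.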
Of course, the CPU chip is only part of the system. Mainboard vendors have their own versions of this technology that reduces clock speed and power use in the chipset. Video vendors are also introducing technology to cut off power to large sections of the card when all you are doing is browsing normal Web pages.
At slow speeds anything works, so the first computer designs used whichever options were simpler. Then the speed is pushed higher and higher until it hits a barrier. There are several ways to move data around in a computer at higher speed, each solving a different problem, and several new architectures combine more than one of these features to provide some improvement.
If you look out a window, you may see a bird sitting on top of a power line. Birds can do this because voltage isn’t an absolute property but is always measured relative to something else. Electricity doesn’t move through the bird because the bird isn’t connected to the other power wire or to a ground. Occasionally a squirrel will touch two wires to complete the circuit.
All early computer interfaces started by assigning one wire to each bit of data or control signal. The voltage on all the signal wires would be measured relative to a single common ground wire. This works at low speed, over short distances.
The problem is using a single common ground to measure several different wires. There is some delay after the signal wires change before everything settles down and a reliable measurement can be made. Things are better if you have fewer signal wires for every ground wire; in a modern CPU, every fourth pin can be a ground connection. Still, this type of structure seems to max out at around a 200 MHz clock signal.
The solution to this problem has been known since the ’60s, but was first applied to communication over long distances. Each signal is represented by two dedicated wires. To generate a signal, apply a small positive voltage to one wire and an equal negative voltage to the other wire. The receiver measures the difference between the two wires in the pair and determines which is positive and which is negative. Since the two wires have opposite signals, they exactly balance each other and produce 0 net voltage relative to any external reference point.
Balanced pairs also solve the problem of external interference. A long wire is also an antenna. Look in your AM radio and you may find that the antenna is just some ordinary electric wire run around the case. The longer the wire, the more outside signal gets picked up. Insulation blocks the flow of electricity, but radio waves pass right through it. The radio measures a signal induced on a single wire loop. However, when computers use a pair of balanced wires, any external source of interference produces exactly the same effect on each wire of the pair, and at the receiving end the two cancel out.
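Both ideas, reading the difference between the two wires and canceling interference that hits both wires equally, can be shown in a short sketch. The function names and the 0.2 volt signal level are illustrative, not from any real standard:

```python
def drive_pair(bit, v=0.2):
    """Transmit one bit as equal and opposite voltages on a balanced pair."""
    return (v, -v) if bit else (-v, v)

def receive_pair(wire_a, wire_b, interference=0.0):
    """Recover the bit from the difference between the two wires.

    External interference hits both wires of the pair equally,
    so it cancels out of the difference.
    """
    a = wire_a + interference
    b = wire_b + interference
    return 1 if a - b > 0 else 0
```

Even with interference many times larger than the signal itself, say `receive_pair(*drive_pair(1), interference=5.0)`, the difference between the wires is unchanged and the bit is recovered correctly.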
Over short distances, like on a mainboard, it is sufficient to run the pair of wires next to each other. Any interference they generate or receive tends to cancel out when the pair is measured against each other. Over longer distances, such as a USB or Ethernet cable, the pair of wires is twisted round each other. This prevents either wire from being “closer” all the time to either a source or recipient of interference.
The bad news is that you need more wires or pins than in the older one-pin-per-signal design. If every fourth pin used to be a ground, the pin count increases by 50% to switch to a balanced signal (3 signals require 6 pins instead of 4). However, the clock speed on the pair of wires can be increased by such a large factor that all of the new balanced pair connections end up using far fewer wires in total.
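The pin arithmetic above can be written out explicitly. This is just the counting rule from the paragraph, with the one-ground-per-three-signals layout as the baseline:

```python
import math

def single_ended_pins(signals, signals_per_ground=3):
    # one pin per signal, plus a shared ground pin for every few signals
    return signals + math.ceil(signals / signals_per_ground)

def balanced_pins(signals):
    # two dedicated pins per signal; no shared ground reference needed
    return 2 * signals
```

For 3 signals, the single-ended design needs 4 pins and the balanced design needs 6, the 50% increase described above, but the balanced pair's much higher clock rate more than pays for the extra pins.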
If you send the same electric signal through any parallel set of wires, the electricity will move more slowly through some wires due to slight differences in the metal. The signals arrive at the other end with very small timing differences. This is called “skew”. Like runners in the various lanes of a 100 meter dash, they all start at the same time and place, but they arrive at the finish line staggered by small differences in speed.
[Diagram: the interval of one clock pulse is wide enough to cover the worst-case skew]
Skew is less of a problem over short distances or low speeds. It gets worse when, as in the PCI bus, the wire is connected at points along the path to connectors on sockets into which adapter cards may or may not be plugged. Every time the signal hits a point where it is soldered to something, or where the signal splits in two directions, there will be some delay. These contacts must be manufactured for pennies, so it isn’t feasible for them to be of uniform quality.
Inside a CPU chip, where you have lower voltages and shorter distances, a version of skew can be created by differences in the individual circuits that receive the signal at the far end. Typically, these are circuits that fill up with an electric charge generated by the transmitter. However, variations in the makeup of the circuit can cause some receivers to fill more slowly than others, creating variations in the amount of time it takes to completely receive each signal on each wire.
In addition to the signal wires, a parallel bus carries a clock pulse. All the data bits start out at the same time, but the clock cycles halfway between adjacent data bits. The idea is that the clock signals both the earliest point when the next bit can arrive on any wire and the last moment by which the slowest previous data bit must have arrived.
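That timing rule reduces to a single comparison: the slowest wire's bit must arrive before the sample point halfway into the clock cycle. A sketch, with a made-up clock period rather than any real bus specification:

```python
CLOCK_PERIOD_NS = 10.0   # illustrative clock period, not a real bus spec

def transfer_succeeds(per_wire_skew_ns):
    """A parallel transfer works only if the most-skewed wire's bit
    arrives before the sample point halfway between adjacent data bits."""
    sample_point = CLOCK_PERIOD_NS / 2
    return max(per_wire_skew_ns) < sample_point
```

Notice that `transfer_succeeds([0.5, 1.2, 3.9])` passes but `transfer_succeeds([0.5, 1.2, 6.3])` fails: one badly skewed wire ruins the whole transfer, which is exactly why skew caps the clock speed of a parallel bus.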
The problems with a parallel bus are problems in physics, wire, and solder. You can’t fix them with faster CPU chips. Eventually, each parallel bus in the computer is replaced by something better.
The alternative has been understood for as long as there have been Personal Computers. Instead of sending one bit of data down each wire in a parallel bus, send all of the data one bit at a time down a single pair of wires. If there are slight imperfections in the wire, they affect each bit equally. The bits arrive at the other end at the same speed they were sent. This is a Serial bus.
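The serializing step itself is simple. A sketch of one byte travelling down a single pair, one bit at a time (the helper names are mine, for illustration):

```python
def serialize(byte):
    """Send one byte as eight separate bits, most significant bit first."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]

def deserialize(bits):
    """Reassemble the byte at the receiving end, in arrival order."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```

Round-tripping always works (`deserialize(serialize(0xA5))` gives back `0xA5`), and since every bit travels the same wire one after another, there is nothing for it to skew against.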
The problem with a Serial bus is that it requires a very fast computer chip to generate and receive the signal. This, however, was a problem that solved itself as silicon computer chips got faster and cheaper. As chip technology improved, one by one each parallel bus in the PC has been replaced by a serial alternative.
In computer terms, a “bus” is a communication path shared by many devices. The memory bus is a sequence of 1 to 4 slots into which you can plug “DIMM” modules of memory. The PCI bus is a sequence of up to 5 slots into which you plug adapter cards. Even the IDE and SCSI disk connectors are bus cables that support more than one device.
However, each time a wire encounters a slot or connector there is an interface point that minimally increases skew and can also generate a “reflection” where the signal bounces off the interface and starts to travel back towards its source.
The solution is to redesign each connection to be “point to point” between two devices. In the newer disk technologies (SATA replacing old IDE, and SAS replacing the old SCSI) a dedicated cable with two pairs of wires connects each disk to a dedicated connector on the controller card. On many mainboards today, the fastest supported memory speed is only permitted when a single DIMM is plugged into the memory slot. If you want to plug two DIMMs of memory into the same bus, you have to drop performance to a lower clock speed.
The most advanced version of this design principle is provided by HyperTransport, a new high speed chip interconnect system used by AMD, IBM, and Apple. The CPU is connected to other chips on the mainboard using a system of point-to-point wires. To make the system work, each chip can receive a signal from wires on one side and relay the signal bit by bit on to the next chip.
The conventional design for memory, CPU, PCI, and other chip connection technology is to use the same pins to both transmit and receive data. This requires the chip to have both “transmit” and “receive” electronics connected to the same wire.
Other systems designate one pair of wires to transmit data from the chip, and a different pair of wires to receive data on the chip. Superficially such a design appears either to require twice as many wires to get the same speed or else to cut the amount of data that can be transferred in half. That would be true if the data were going in only one direction. In practice, however, data has to flow in both directions, and this design allows each chip to transmit and receive data simultaneously. Plus it allows a slightly higher clock rate.
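The two-way traffic argument can be made concrete by counting clock cycles. A sketch comparing the two designs (the cycle-counting model is a deliberate simplification; real links also spend cycles on turnaround and protocol overhead):

```python
def half_duplex_cycles(bits_out, bits_in):
    # shared pins: transmit and receive must take turns on the same wires
    return bits_out + bits_in

def full_duplex_cycles(bits_out, bits_in):
    # dedicated transmit and receive pairs run at the same time
    return max(bits_out, bits_in)
```

For balanced two-way traffic of 1000 bits each way, the shared design needs 2000 cycles while the dedicated-pairs design finishes in 1000, so the "extra" wires earn their keep whenever data flows both directions.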
Typically no system requires all of these features at once. Almost all new technologies use balanced pairs of wires. Beyond that, Intel’s PCI Express uses serial point-to-point links, while AMD, IBM, and Apple favor HyperTransport, which is a parallel point-to-point (not serial, not bus) system.
The following table shows some PC serial connections that have been or will be replacing older parallel connections:
|New serial connection||Replaces||Introduced|
|USB printer||Parallel printer cable||2000|
|Serial ATA||80-wire ATA cable||2003|
|Serial SCSI||various SCSI cables||2004|
|PCI Express||PCI bus||2004|