We had this problem before computers

Characters had to be encoded in the early 20th Century for telegraph applications (AT&T) and punch card tabulation (IBM).

In the 1930s, corporate data processing was done using decks of punched cards. A deck of sales data could be run through a mechanical sorter to arrange the cards by region, customer, salesman, or product. Then the sorted deck was fed into a mechanical tabulator “programmed” with patch panel wires to sum up groups of columns containing quantity or dollar value to generate reports. IBM dominated this early information processing technology.

IBM introduced computers into this existing product line to act as a more sophisticated card tabulating device. As a result, IBM created a computer character code that matched the way that characters were punched on the cards. The punch card was not really a binary system. It had ten rows labeled “0” to “9” and two extra rows above them; those two rows and the “0” row together formed the top three rows used for control punches. Mechanical adding machines operated on decimal numbers, not binary numbers. For each column on the card there was a wheel with positions numbered 0 to 9. To add a number punched on the card, the device would sense which of the ten holes was punched and then rotate the wheel the corresponding number of positions. A “0” punch left the wheel alone. A punch in the “5” row rotated the wheel 5 positions. When the wheel rotated from 9 back to 0, a 1 carried over to the wheel one column to the left.
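The wheel-and-carry mechanism described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a model of any particular IBM machine; the function names add_card_field and advance are made up for this example, and a carry out of the leftmost wheel is assumed to be silently lost.

```python
def advance(wheels, col):
    """Rotate one wheel by a single position, carrying to the left on 9 -> 0."""
    wheels[col] = (wheels[col] + 1) % 10
    if wheels[col] == 0 and col > 0:
        advance(wheels, col - 1)  # the wheel wrapped, so nudge its left neighbor


def add_card_field(wheels, punches):
    """Add one punched field to a bank of decimal wheels.

    wheels  -- current wheel positions, leftmost (most significant) first
    punches -- the digit row punched in each card column, same length
    Returns the new wheel positions as a list.
    """
    wheels = list(wheels)
    for col in range(len(wheels) - 1, -1, -1):
        # A punch in row n rotates that column's wheel n positions;
        # a 0 punch leaves the wheel alone.
        for _ in range(punches[col]):
            advance(wheels, col)
    return wheels


# Wheels showing 095, card field punched 007.
print(add_card_field([0, 9, 5], [0, 0, 7]))
```

Each wheel only ever holds a decimal digit, which is why a code built around this hardware had no reason to be binary.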

When letters were punched on the card instead of numbers, one of the top three control rows was also punched. Nevertheless, IBM arranged the letters in a decimal system. “A” was represented by a punch in the 1 row and a punch in one of the control rows. “B” had a 2 punch, and so on up to “I” with a 9 punch. Then the letters wrapped back, so “J” again had a 1 punch but with a different control punch in the top three rows, and the run continued up to “R” with a 9 punch. “S” began a third group at the 2 row with the remaining control punch.

When IBM created a character code for computers, it based the code on the decimal system used in the punch cards rather than binary values that would have been more natural for computers. The IBM code (called “EBCDIC”) preserved the break between “I” and “J” and between “R” and “S”.
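The breaks are easy to see with Python's built-in cp037 codec, which is one common EBCDIC code page (other EBCDIC variants exist, and choosing cp037 here is an assumption made for illustration). The helper name ebcdic is invented for this sketch.

```python
def ebcdic(ch):
    """Return the code value of a single character in EBCDIC code page 037."""
    return ch.encode("cp037")[0]


# Print the three letter groups with their hexadecimal code values.
for group in ("ABCDEFGHI", "JKLMNOPQR", "STUVWXYZ"):
    print(" ".join(f"{c}={ebcdic(c):02X}" for c in group))

# In ASCII, consecutive letters are always one apart.
ascii_gap = ord("J") - ord("I")

# In EBCDIC, the code jumps at the two breaks.
gap_ij = ebcdic("J") - ebcdic("I")
gap_rs = ebcdic("S") - ebcdic("R")
print(ascii_gap, gap_ij, gap_rs)
```

Within each group the second hexadecimal digit counts up decimally (1 to 9, 1 to 9, then 2 to 9), while the first digit changes from group to group. A practical consequence is that a test like "is this byte between A and Z" also accepts non-letter values in EBCDIC.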

Punch cards were a fairly expensive method of entering data. If the equipment had not already existed, it would not have been created just for computers. Other early computer makers wanted a less expensive system, but they were not prepared to design their own equipment. So they turned to a device created by another US monopoly: the phone company. AT&T had been producing Teletype machines for years to support transmission of text messages over the telephone network.

When telegraph signals were sent by hand, characters were transmitted as a series of dots and dashes. Teletype machines, however, transmitted text electromechanically by sending a continuous sound over the phone line that alternated between two different tones. One tone, called “Mark”, represented a 1, while the other, called “Space”, represented a 0. A Teletype transmitted 10 characters per second. An operator could dial the phone, connect to another Teletype, and type a message at the keyboard. Each key pressed on one machine was printed onto the roll of paper at the other end. However, long distance phone calls were expensive in those days. It was more efficient to prepare the message in advance by recording it on punched paper tape. Then when the phone connection was made, a paper tape reader connected to the Teletype device could transmit the message at the maximum speed of 10 characters per second.

By good luck, the Teletype code was binary and was therefore ideal for computer applications. The Teletype only supported uppercase letters. Ten numeric digits, 26 uppercase letters, and a space take up 37 code values, which left room for 27 punctuation marks while still remaining within the 64 code values permitted by a six-bit code. However, the Teletype also needed the highest value (all ones) reserved for correcting typographical errors, since a mistyped character on paper tape could be obliterated by punching out every hole. It also required “control characters” that correspond to the typewriter operations of Return, Tab, Backspace, and Page Eject.

Teletypes had a few extra control functions like BEL (“Bell”). The printer had a little bell in it that rang when the BEL character was received. Teletypes connected wire services like AP to newspapers and TV and radio stations. When an important story was about to be printed out, a few BEL characters were sent to attract attention. The “Hot Line” between the President and the Kremlin was never a pair of red phones; it was a pair of Teletype machines.

So although the Teletype had only 64 graphic characters, it needed an additional 32 control characters. That forced the code up to seven bits. A seven-bit code has possible values from 0 to 127. The highest value is reserved for error correction. This meant the original ASCII standard had 32 control codes (0-31), 64 graphic characters (32-95), 31 unassigned code values (96-126), and DEL (127).
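The layout can be checked with a short Python sketch. The ranges are taken directly from the paragraph above; the function name ascii_1963_class and the class labels are invented for this example.

```python
from collections import Counter


def ascii_1963_class(code):
    """Classify a seven-bit code value using the ranges given in the text."""
    if not 0 <= code <= 127:
        raise ValueError("not a seven-bit code value")
    if code <= 31:
        return "control"
    if code <= 95:
        return "graphic"
    if code <= 126:
        return "unassigned"
    return "DEL"


# Tally how many code values fall into each class.
counts = Counter(ascii_1963_class(c) for c in range(128))
print(dict(counts))

# DEL is the all-ones value of a seven-bit code.
del_is_all_ones = (127 == 0b1111111)

# Within the graphic range: everything that is not a space, digit, or letter.
punctuation = [chr(c) for c in range(32, 96)
               if chr(c) != " " and not chr(c).isalnum()]
print(len(punctuation), "".join(punctuation))
```

The four classes add up to exactly 128, the number of values a seven-bit code can hold. The punctuation printed here is what modern ASCII places in positions 32-95; a couple of those glyphs differed in the original standard, though the positions and the count are the same.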

Standards are reviewed every five years. At its first revision, the 31 previously unused code values were assigned to support lowercase letters and some additional punctuation. This provided a basic Latin alphabet that would support English-language text.
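As a rough check, the following Python sketch lists what occupies those 31 positions in ASCII as it stands today (assumed here to match the revised assignment) and shows how the lowercase letters line up with the uppercase ones. The variable names are invented for this example.

```python
# The 31 code values that the original standard left unassigned.
added = [chr(c) for c in range(96, 127)]

lowercase = [ch for ch in added if ch.islower()]
extra_punctuation = [ch for ch in added if not ch.isalpha()]
print(len(lowercase), len(extra_punctuation), "".join(extra_punctuation))

# Each lowercase letter sits exactly 32 positions above its uppercase
# partner, so the two differ in a single bit (value 0x20).
one_bit_apart = all(ord(ch) ^ 0x20 == ord(ch.upper()) for ch in lowercase)
print(one_bit_apart)
```

Placing the lowercase alphabet 32 positions above the uppercase one means case can be switched by flipping one bit, which fits the binary design described earlier.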