Harley Hahn's Guide to
Chapter 7... Using the Keyboard With Unix
In Chapter 6, we talked about the differences between the GUI (graphical user interface) and the CLI (command line interface). Starting with this chapter, and for the rest of the book, we will be concentrating on the CLI, the traditional way to use Unix. There are several ways in which you can use the CLI. When you work with your own computer, you can use a virtual console or a terminal window (including the Konsole program). We discussed the details in Chapter 6. When you work with a remote host, you can connect via the ssh program, which will act as a terminal emulator for you. Regardless of how you get to a Unix command line, once you are there, it always works the same way (more or less). If you are using a GUI-based system, I would like you to be familiar with several topics from Chapter 6 before you read this chapter: virtual consoles, terminal windows, and how to select and paste. With a GUI, understanding these ideas is crucial to using the CLI well.
When Unix was first developed by Ken Thompson and Dennis Ritchie (see Chapter 2), they used Teletype ASR33 terminals (see Chapter 3). The Teletype ASR33 was an electromechanical device, originally developed to send and receive text messages. It had a keyboard for input and a built-in printer for output. It also had a paper tape punch, which could store data by punching holes on paper tape, as well as a paper tape reader, which could read data from punched tape.
The ASR33's capabilities made it suitable to use as a computer terminal. In fact, from the mid-1960s to the mid-1970s, virtually all non-IBM computer systems used an ASR33 for the console. This was true of the PDP minicomputers used by Thompson and Ritchie, so it was natural that these devices should become the very first Unix terminals (see box).
The Teletype ASR33
The ASR33, manufactured by the Teletype Corporation, was introduced in 1963. There were three Teletype 33 models: the RO, KSR and ASR. Of the three Teletype 33s, the ASR was by far the most popular.
The RO (Receive-Only) had a printer but no keyboard. As such, it could receive messages, but not send them.
The KSR (Keyboard Send-Receive) had both a printer and a keyboard, and could send and receive messages. The outgoing messages were typed by hand on the keyboard.
The ASR (Automatic Send-Receive) had a printer, a keyboard, and a paper tape punch/reader. Like the KSR, the ASR could send and receive messages. However, with the ASR the outgoing text could be generated in two ways. It could be typed at the keyboard by hand, or it could be read automatically from pre-punched paper tape (hence the name "Automatic"). It was this combination of features that made the ASR33 useful as a computer terminal.
The Teletype ASR33 terminal weighed 56 pounds, including a 12-pound stand. If you had bought one from DEC in 1974, it would have cost you $1,850, plus a $120 installation fee and $37/month maintenance. In 2008 dollars, that's $8,400 for the machine, $550 for the installation, and $170/month for maintenance.
You can see photos of an ASR33 in Chapter 3. Figures 3-1 and 3-2 show the machine. Figure 3-3 is a close-up of the paper tape punch/reader.
The keyboard of the ASR33 was originally designed to send and receive messages, not to control the operation of a computer. As such, Thompson and Ritchie had to adapt Unix to work with the ASR33 keyboard. What is interesting is that the basic system they devised worked so well, it is still used today.
As you would expect, the keyboard of the Teletype contained keys for the 26 letters of the alphabet, the digits 0-9, as well as the most common punctuation symbols. However, there were also a few special keys that were used to provide the functions necessary to send and receive messages (see Figure 7-1). The most important such keys were <Esc>, <Ctrl>, <Shift>, <Tab> and <Return>.
The <Ctrl> (Control) key was especially useful because, like the <Shift> key, it could be combined with other keys to form new combinations. For example, by holding down the <Ctrl> key and pressing one of the letters or numbers, you could send a signal such as <Ctrl-A>, <Ctrl-B>, <Ctrl-C>, and so on. What Thompson and Ritchie did was to incorporate the use of these keys into their basic design of the operating system. To do this, they wrote Unix so that certain signals could be used to control the operation of a program as it was running. For example, the signal called intr (interrupt) was used to terminate a program. To send the intr signal, you pressed <Ctrl-C>. In technical terms, when there is an equivalence between two things, we say that there exists a MAPPING between them. When we create such an equivalence, we say that we MAP one thing onto the other. For example, if we say that A is mapped onto B, it means that, when we use A, it is the same as using B. The idea of mapping is an important concept, one that you will meet again and again as you use computers. In this case, we can say that, within Unix, the <Ctrl-C> character is MAPPED onto the intr signal. This is the same as saying that when we press <Ctrl-C>, it has the effect of sending the intr signal. In a moment, we'll talk about the Unix signals in detail. In fact, my main goal in this chapter is to explain the important signals and their keyboard mappings. First, however, I want to take a moment to talk about nomenclature.
As you use Unix, you will find that many conventions are based on the technology of the 1970s, the time during which the first versions of Unix were developed. In particular, the world of Unix abounds with ideas that are based on the characteristics of the early terminals. This is why I have made a point of talking about Teletypes (the original terminals) and VT100s (the most popular terminals), both in this chapter and in Chapter 3. In this section, I'd like to take a quick detour to mention two such conventions that you will encounter a great deal. The first convention to which I want to draw your attention is the abbreviation "tty" (pronounced "tee-tee-why"). During the many years that Teletype machines were in use, they were referred to as TTYs. This custom was adopted into Unix and, even though it has been a long time since Teletypes were used, the word "tty" is often used as a synonym for a Unix terminal. In particular, you will see this term a lot in Unix documentation and in the names of programs and commands. Here are some examples: • Within a Unix system, each terminal has its own name. The command to display the name of your terminal is tty. (Try it and see what you get.) • The stty ("set tty") command can be used to display or change the settings of your terminal. • The getty ("get tty") program is used to open communication with a terminal and start the login process. The second convention I want to mention relates to the idea of printing. Teletype terminals had two ways to output data. They could print data on a continuous roll of 8½ inch paper for a human to read(*), and they could punch data on 1-inch wide paper tape for a machine to read. If you take a look at Figure 3-2 in Chapter 3, you can see both the roll of printer paper (in the center) and the roll of paper tape (to the left). * Footnote In case you are curious, Teletypes printed output on continuous rolls of 8½ inch paper, which could be up to 5 inches in diameter. 
The machine printed 10 characters/inch, with a maximum line length of 72 characters. The vertical spacing was 6 lines/inch. The printing was in one color, normally black.
Because output was printed, it became the custom within Unix to use the word PRINT to describe the outputting of information. At the time, this made sense because output was, literally, printed on paper. What is interesting, however, is that, even when more modern terminals became available and data was displayed on monitors, the word "print" was still used, and that is still the case today.
Thus, within Unix documentation, whenever you read about printing data, it almost always refers to displaying data. For example, the tty command I mentioned above displays the internal name of your terminal. If you look up this command in the Linux version of the online Unix manual (see Chapter 9), you will see that the purpose of tty is to "print the file name of the terminal connected to standard input".
Here is another example. As you work within the Unix file system, the directory in which you are working at the current time is called your "working directory". (We'll cover these ideas in Chapter 23.) The command to display the name of your working directory is pwd, which stands for "print working directory".
At this point, it only makes sense to ask: If "print" means "display", what term do we use when we really mean print? There are two answers to this question. First, in some cases, "print" is used to refer to real printing and the meaning is clear from context. At other times, you will see the term "line printer" (itself an anachronism) or the abbreviation "lp". When you see this, you can consider it a synonym for "printer". For example, the two most important commands to print files are named lp and lpr. (lp comes from System V; lpr comes from Berkeley Unix.)
As I explained in Chapter 3, Unix was designed as a system in which people used terminals to access a host computer. One of the most important problems that Unix developers had to overcome was that each type of terminal had its own characteristics and used its own set of commands. For example, although all display terminals have a command to clear the screen, the command may not be the same for all terminals.
So, what do you do if you are writing a program and, at a particular point, you need to clear the screen of the user's terminal? How would you know what command to send to the terminal, when the actual command depends on what type of terminal is being used?
It doesn't make sense to require every program to know every command for every type of terminal. This would be an enormous burden for software developers.(*) Moreover, what would happen when a new terminal was introduced? How could it be made to work properly with existing programs?
* Footnote
Even as long ago as 1980, most terminals supported well over 100 different commands.
The solution was to collect descriptions of all the different types of terminals into a database. Then, when a program wanted to send a command to a terminal, it could be done in a standardized manner by using the information in the database. (We'll talk more about how it works in a moment.)
The first system of this nature was created by Bill Joy, one of the fathers of Berkeley Unix (see Chapter 2). In 1977, when Joy was a graduate student putting together 1BSD, the first official version of Berkeley Unix, he included a system for managing the display screen of various types of terminals. In mid-1978, he released 2BSD with a more elaborate version of this system, which he named TERMCAP ("terminal capabilities"). The first important program to use Termcap was the vi editor (see Chapter 22), also written by Joy.
Using Termcap from within a program was a lot of work. To make it easier, another Berkeley student, Ken Arnold, developed a programming interface he called curses. (The name refers to "cursor addressing".) curses was designed to carry out all the functions that were necessary to manage a screen display, while hiding the details from the programmer. Once a programmer learned how to use curses, he could write programs that would work with any type of terminal, even those that had yet to be invented. All that was required was that the terminal should have an entry in the Termcap database. The first program to use curses was a popular text-based game called Rogue (see box).
The First Use of curses and Termcap:
The first program to use curses and Termcap to control the display screen was Rogue, a single-player, text-based fantasy game in the genre of Dungeons & Dragons.
To play Rogue, you would take on the role of an adventurer in an enormous dungeon. At the beginning of the game, you are at the top level of the dungeon. Your goal is to fight your way to the bottom of the dungeon, where you can pick up the Amulet of Yendor. You must then return to the top, bringing the amulet with you. Along the way, you encounter monsters, traps, secret doors and treasures.
At the time Rogue was developed, there was another single-player fantasy game, Adventure, that was very, very popular among programmers. (In fact, I remember playing it on an old Texas Instruments print terminal, connected to a Unix computer over a slow phone line.) Adventure was the same each time you played it but, with Rogue, the dungeon and its contents were generated randomly. This meant that the game was always different. In addition, because Rogue used curses, it was able to draw simple maps, something Adventure was not able to do.
The authors of Rogue were Michael Toy, Glenn Wichman and, later, Ken Arnold. The first version of the game was written for Berkeley Unix around 1980, and Rogue was later included with 4.2BSD. 4.2BSD was so popular that, within a short time, Rogue was available to students around the world.
If you were to look at the first version of Rogue, you would find it incredibly primitive. However, it was much more sophisticated than any previous computer game and, at the time, was considered to be very cool.
Eventually, Rogue was ported to a variety of other systems, including the PC, Macintosh, Amiga and Atari ST. Today, Rogue is still around and, in its modern incarnation, is played by people around the world. If you are interested in taking a look at one of the more interesting legacies from the early days of Unix, search for "Rogue" on the Internet.
In order to work effectively, the Termcap database had to contain technical information for every variation of every terminal that might be used with Unix, and all this data was contained in a single file. Over the years, as many new terminals became available, the Termcap file grew so large as to become unwieldy to maintain and slow to search.
At the time, curses was being enhanced by the programmers at Bell Labs for System III and, later, for System V Release 1 (see Chapter 2). To improve the performance of curses, the Bell Labs programmers replaced Termcap with a new facility called TERMINFO ("terminal information"). Terminfo stored its data in a series of files, one for each terminal type. The files were organized into 26 directories named a through z, all of which were kept in a single Terminfo directory. (This will make sense after you read Chapter 23.)
The Terminfo design was so flexible that it is still used today. For example, within Linux, the information for the generic VT100 terminal is stored in the file named:
/usr/share/terminfo/v/vt100
The location of the master Terminfo directory can vary from one system to another. In case you want to look for it on your system, the most common names are:
/usr/share/terminfo/
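The directory layout described above can be written as a small shell function. This is only a sketch: the name terminfo_path is made up for illustration, and the function assumes both the one-letter subdirectory layout and the /usr/share/terminfo location mentioned in the text. Your system may keep its Terminfo files somewhere else, or name the subdirectories differently, so treat the result as a place to look, not a guarantee.

```sh
# terminfo_path NAME [DIR]
# Print where the Terminfo file for terminal NAME would be stored,
# assuming one subdirectory per first letter. (Hypothetical helper.)
terminfo_path() {
    name=$1
    dir=${2:-/usr/share/terminfo}            # default location is an assumption
    first=$(printf '%s' "$name" | cut -c1)   # first character of the name
    printf '%s/%s/%s\n' "$dir" "$first" "$name"
}

terminfo_path vt100
```

To check the guess for the terminal you are using right now, you could try ls -l "$(terminfo_path "$TERM")".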
The biggest problem with Terminfo was that AT&T, which owned Bell Labs, would not release source code (see Chapter 2). This meant that, although System V had Terminfo and a better version of curses, the hacker community did not have access to it. They had to make do with the older, less powerful Termcap-based facility.
To overcome this limitation, in 1982, a programmer named Pavel Curtis began to work on a free version of curses, which he called ncurses ("new curses"). ncurses had very limited distribution until it was taken over by another programmer, Zeyd Ben-Halim, in 1991. In late 1993, Ben-Halim was joined by Eric Raymond, and together they began to work on ncurses in earnest. Throughout the early 1990s, ncurses had a lot of problems. However, in time, as other people joined the effort, the problems were solved and ncurses and Terminfo emerged as an enduring standard.
Today, Terminfo has replaced Termcap permanently. However, to maintain compatibility with very old programs, some Unix systems still have a Termcap file, even though it is obsolete and its use is deprecated(*).
* Footnote
If something is deprecated, it means that, although you can use it, you shouldn't, because it is obsolete. You will often see the term "deprecated" in computer documentation, especially in the programming world, where things change quickly. When you see such a note, you should take it as a warning that the feature may be eliminated in future versions of the product.
Would you like to see what Termcap or Terminfo information looks like? The Termcap database is easy to display because it consists of plain text, stored as one long file. If your system has a Termcap file, you can display it by using the following command:
less /etc/termcap
The less program displays a file, one screenful at a time. We will talk about less in detail in Chapter 21. For now, I'll tell you that, once less starts:
• To move forward one screenful, press <Space>.
When a program writes a line of output to the bottom of your screen and all the other lines move up one position, we say they SCROLL upward. If a program produces output too fast, data will scroll off the top of the screen before you can read it. If you want to see an example of this, use one of the following commands. The dmesg command, which we met in Chapter 6, shows you all the messages that were displayed when the system booted. Alternatively, you can use the cat command to display the Termcap file:
dmesg
cat /etc/termcap
The cat command, which we will meet in Chapter 16, concatenates data and sends it to the default output location, or "standard output". In this case, cat copies data from the file /etc/termcap to your display. However, the copying is so fast that most of the data scrolls off the screen before you can read it, which is the purpose of this example.
In such cases, you have three choices. First, if the lost data is not important, you can ignore it.
Second, you can restart the program that generates the data and have it send the output to a so-called paging program like less (Chapter 21) that will display the output one screenful at a time. This is what we did earlier in the chapter when we used the command:
less /etc/termcap
With dmesg, we would use a different command that makes use of the | (vertical bar) character. This is called the "pipe symbol", and we will discuss it in Chapter 15. The idea is to reroute the output of dmesg to less.
dmesg | less
Finally, you can press the ^S key to send the stop signal. This tells Unix to pause the screen display temporarily. Once the display is paused, you can restart it by pressing ^Q to send the start signal. To remember, just think of "S" for Stop and "Q" for Qontinue.
Using ^S and ^Q can be handy. However, you should understand that ^S only tells Unix to stop displaying output. It does not pause the program that is executing. The program will keep running and will not stop generating output. Unix will store the output so that none will be lost and, as soon as you press ^Q, whatever output remains will be displayed. If a great many lines of new data were generated while the screen display was paused, they will probably whiz by rapidly once you press ^Q. For this reason, you may find that it is just too difficult to press the ^S and ^Q keys fast enough. If this is the case, you should use less to control the output.
By the way, you might be wondering, why were ^S and ^Q chosen to map to the stop and start signals? It does seem like an odd choice. The answer is, on the Teletype ASR33, <Ctrl-Q> sent the XON code, which turned on the paper tape reader; <Ctrl-S> sent the XOFF code, which turned it off.
hint
If your terminal ever locks up mysteriously, try pressing ^Q. You may have pressed ^S inadvertently and paused the display. When everything seems to have stopped mysteriously, you will never cause any harm by pressing ^Q.
From time to time, you will work with programs that expect you to enter data from the keyboard. When you get to the point where there is no more data, you indicate this by pressing ^D, which sends the eof (end of file) signal. Here is an example. In Chapter 8, I discuss the bc program, which provides the services of a built-in calculator. Once you start bc, you enter one calculation after another. After each calculation, bc displays the answer. When you are finished, you press ^D to tell bc that there is no more data. Upon receiving the eof signal, the program terminates.
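From a program's point of view, "no more data" looks the same whether it comes from ^D at the keyboard or from reaching the end of a file or pipe. Here is a minimal sketch of a program that keeps reading until the data runs out. The name count_lines is hypothetical, chosen only for this illustration.

```sh
# count_lines: read standard input one line at a time until end of file,
# then print how many lines were read. (Hypothetical name, for illustration.)
count_lines() {
    n=0
    # read fails as soon as there is no more data, which ends the loop
    while IFS= read -r line; do
        n=$((n + 1))
    done
    echo "$n"
}

# Here the data comes from a pipe, so it ends when printf finishes.
printf 'one\ntwo\nthree\n' | count_lines
```

If you define count_lines in your shell and run it by itself at a terminal, it will wait for you to type. Enter a few lines, then press ^D at the beginning of a line to signal that you are finished.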
In Chapter 2, I explained that the shell is the program that reads your Unix commands and interprets them. When the shell is ready to read a command, it displays a prompt. You type a command and press <Return>. The shell processes the command and then displays a new prompt. In some cases, your command will start a program, such as a text editor, that you will work with for a while. When you end the program, you will be returned to the shell prompt. Thus, in general terms, a Unix session with the CLI (command line interface) consists of entering one command after another.
Although the shell may seem mysterious, it is really just a program. And from the point of view of the shell, the commands you type are just data that needs to be processed. Thus, you can stop the shell by indicating that there is no more data. In other words, you can stop the shell by pressing ^D, the eof key.
But what does stopping the shell really mean? It means that you have finished your work and, when the shell stops, Unix logs you out automatically. This is why it is possible to log out by pressing ^D. You are actually telling the shell (and Unix) that there is no more work to be done.
Of course, there is a potential problem. What if you press ^D by accident? You will be logged out immediately. The solution is to tell the shell to trap the eof signal. How you do this depends on what shell you are using. Let's take each shell in turn (Bash, the Korn shell, and the C-Shell), and you can experiment with your particular shell.
Bash is the default shell with Linux. To tell Bash to ignore the eof signal, you use a shell variable named IGNOREEOF. (Notice there are two Es in a row, so be careful when you spell it.) Here is how it works.
IGNOREEOF is set to a particular number, which indicates how many times Bash will ignore ^D at the beginning of a line before logging you out. To set IGNOREEOF, use a command like the following. (You can use any number you want instead of 5.)
IGNOREEOF=5
To test it, press ^D repeatedly, and count how many ^Ds are ignored until you are logged out.
When IGNOREEOF is set and you press ^D, you will see a message telling you that you can't log out by pressing ^D. If you are working in the login shell (that is, the shell that was started automatically when you logged in), you will see:
Use "logout" to leave the shell.
If you are working in a subshell (that is, a shell that you started after you logged in), you will see:
Use "exit" to leave the shell.
If, for some reason, you want to turn off the IGNOREEOF feature, just set it to 0:
IGNOREEOF=0
To display the current value of IGNOREEOF, use:
echo $IGNOREEOF
To set IGNOREEOF automatically each time you log in, put the appropriate command in your .profile file (see Chapter 14).
The Korn shell is the default shell on various commercial Unix systems. In addition, the default shell for FreeBSD is almost the same as the Korn shell.
To tell the Korn shell to ignore ^D, you set a shell option named ignoreeof. (Notice there are two es in a row, so be careful when you spell it.) To do this, use the command:
set -o ignoreeof
Once you do, if you press ^D, you will see a message telling you that you can't log out by pressing ^D:
Use "exit" to leave shell.
If, for some reason, you want to turn off the ignoreeof option, use:
set +o ignoreeof
To display the current value of ignoreeof, use:
set -o
This will show you all the shell options and tell you whether they are off or on.
To set ignoreeof automatically each time you log in, put the appropriate set command in your .profile file (see Chapter 14).
To tell the C-Shell to ignore ^D, you set a shell variable named ignoreeof. (Notice there are two es in a row, so be careful when you spell it.) To do this, use the command:
set ignoreeof
Once you do, if you press ^D, you will see a message telling you that you can't log out by pressing ^D. If you are working in the login shell (that is, the shell that was started automatically when you logged in), you will see:
Use "logout" to logout.
If you are working in a subshell (that is, a shell that you started after you logged in), you will see:
Use "exit" to leave csh.
(csh is the name of the C-Shell program.)
If, for some reason, you want to turn off the ignoreeof feature, use:
unset ignoreeof
To display the current value of ignoreeof, use:
echo $ignoreeof
If ignoreeof is set, you will see nothing. If it is not set, you will see:
ignoreeof: Undefined variable.
To set ignoreeof automatically each time you log in, put the set command in your .cshrc file (see Chapter 14).
So far, I have mentioned a number of keyboard signals, each of which corresponds to some key on your keyboard. These are shown in Figure 7-4. The key mappings I have shown are the most common ones, but they are changeable. Figure 7-4: Summary of important keyboard signals
To display the key mappings on your system, use the following command.
stty -a
stty is the "set terminal" command; -a means "show me all the settings".
The stty command displays several lines of information about your terminal. The only lines we are interested in are the ones that show the keyboard signals and the keys to which they are mapped. Here is an example from a Linux system:
intr = ^C; quit = ^\; erase = ^?; kill = ^U;
And here is an example from a FreeBSD system:
discard = ^O; dsusp = ^Y; eof = ^D; eol = <undef>;
Notice that the FreeBSD example has an erase2 signal. As you can see, there are several signals I did not cover. Most of these are not important for day-to-day work, and you can ignore them. hint In Chapter 26, we will discuss how to pause and restart programs that are running. At the time, you will see that you can pause a program by pressing ^Z, which is mapped to the susp (suspend) signal. Once you pause a program with ^Z, it stops running until you restart it by entering the fg (foreground) command. So if you are ever working and, all of a sudden, your program stops and you see a message like Suspended or Stopped, it means you have accidentally pressed ^Z. When this happens, all you have to do is enter fg, and your program will come back to life.
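If you only care about one mapping, you do not have to read through the whole report. The following sketch picks a single setting out of text in the format shown in the examples above. The name key_for is made up for this illustration; it is not a standard command, and it assumes each setting has the form "name = key;" as in those examples.

```sh
# key_for SIGNAL
# Read a report in "stty -a" format on standard input and print the key
# mapped to SIGNAL. (key_for is a made-up name, not a standard command.)
key_for() {
    # Put each "name = key" setting on its own line, then print the key
    # from the line whose name matches SIGNAL exactly.
    tr ';' '\n' | sed -n "s/^[[:space:]]*$1 = \(.*\)\$/\1/p"
}

# Sample text in the same format as the Linux example above.
printf 'intr = ^C; quit = ^\\; erase = ^?; kill = ^U;\n' | key_for erase
```

At a terminal, you would feed it the real report, for example: stty -a | key_for intr. Because the name must match exactly, asking for erase will not pick up erase2 by mistake.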
If you would like to change a key mapping, use the stty command. Just type stty, followed by the name of the signal, followed by the new key assignment. For example, to change the kill key to ^U, enter:
stty kill ^U
Important: Be sure to type the <Ctrl> key combination as two separate characters, not as a real <Ctrl> combination; stty will figure it out. For example, in this case, you would type the ^ (caret) character, followed by the U character. You would not type <Ctrl-U>.
When you use stty with the name of a <Ctrl> character, it is not necessary to type an uppercase letter. For instance, the following two commands will both work:
stty kill ^U
stty kill ^u
Just remember to type two separate characters. Strictly speaking, you can map any key you want to a signal. For example, you could map the letter K to the kill signal. Either of these commands will do the job:
stty kill k
Of course, such a mapping would only lead to problems. Every time you pressed the <K> key, Unix would erase the line you were typing! What an interesting trick this would be to play on a friend.(*)
* Footnote
You didn't read it here.
Normally, we would use only <Ctrl> combinations for mappings. In fact, in almost all cases, it's better to just leave things the way they are, and stick with the standard key assignments. Here is one situation, however, where you may want to make a change.
Let's say you often connect to a remote host over a network and, on that host, the erase key is ^?. However, your <Backspace> key sends a ^H. To make life more convenient, you map ^H to erase. This allows you to press <Backspace> to delete a character.
stty erase ^H
Here is the opposite example. You connect to a remote host on which ^H is mapped to erase. However, your <Backspace> (or <Delete>) key sends ^?. Use stty to change the mapping as follows:
stty erase ^?
Remember, the notation ^? does not refer to an actual <Ctrl> key combination. ^? is a two-character abbreviation for "whichever key on your keyboard sends the DEL code".
If you decide to mess around with keyboard mappings, you can use the command I described above to check them:
stty -a
Alternatively, you can enter the stty command by itself:
stty
This will display an abbreviated report, showing only those mappings that have been changed from the default.
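If you are curious about the arithmetic behind the two-character ^ notation, here is a sketch. In this notation, the character after the caret stands for the code you get by flipping one bit of that character's own code (an exclusive-or with 64). That is why ^H means code 8 (BS) and ^? means code 127 (DEL), even though there is no real <Ctrl-?> combination. The function caret_code is a hypothetical helper written for this illustration; it is not part of stty.

```sh
# caret_code NOTATION
# Print the numeric character code that a two-character name such as ^C
# stands for. (caret_code is a made-up helper, not part of stty.)
caret_code() {
    ch=${1#\^}                                  # drop the leading caret
    ch=$(printf '%s' "$ch" | tr 'a-z' 'A-Z')    # ^u and ^U mean the same thing
    code=$(printf '%d' "'$ch")                  # character code of what is left
    echo $(( code ^ 64 ))                       # flip the bit worth 64
}

caret_code '^C'
caret_code '^H'
caret_code '^?'
```

The three calls display 3, 8 and 127. Notice that the function converts lowercase to uppercase first, which mirrors the fact that stty accepts either ^u or ^U.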
As you type on the command line, the cursor points to the next available location. Each time you type a character, the cursor moves one position to the right. What do you do when you make a mistake? As we discussed earlier in the chapter, you press the <Backspace> key, erase one or more characters, and type the new ones.

However, what if you want to fix a mistake at the beginning of the line, and you have already typed 20 characters after the mistake? Certainly you can press <Backspace> 21 times, fix the mistake, and retype the 20 characters. However, there is an easier way.

With most (but not all) shells, you can simply use the left-arrow key, which I will call <Left>. Each time you press this key, it moves your cursor to the left without erasing anything. You can then make the changes you want and press <Return>.

Try this example, using the echo command. (echo simply displays the value of whatever you give it.) Type the following:

echo "This is a test!"

Now press <Return>. The shell will display

This is a test!

Now type the following, but do not press <Return>:

echo "Thus is a test!"

Before you press <Return>, you need to change Thus to This. Your cursor should be at the end of the line, so press <Left> repeatedly until the cursor is just to the right of the u in Thus. Press <Backspace> once to erase the u, and then type i. You can now press <Return>, and you should see the correct output.

This is an example of what is called COMMAND LINE EDITING, that is, changing what is on the command line before you send it to the shell. Notice that you did not have to move the cursor to the end of the line before you pressed <Return>.

hint
When you press <Return>, the characters that are on the command line are sent to the shell to be interpreted. Because the cursor does not generate a character, the shell doesn't care where the cursor is when you send it a command. This means that, when you are command line editing, you can press the <Return> key from anywhere in the line. The cursor does not have to be at the end of the line.

All modern shells support some type of command line editing, but the details vary from one shell to another. For that reason, we will leave the bulk of the discussion to later in the book, when we talk about each individual shell. For now, I'll teach you the three most important techniques. Try them and see if they work with your particular shell.

First, as you are typing, you can use the <Left> and <Right> arrow keys to move the cursor within the command line. This is what we did in our example above.

Second, at any time, you can press <Backspace> to erase the previous character. With some shells, you can also use the <Delete> key to erase the current character. (The <Delete> key I am talking about is the one you find on a PC keyboard, next to the <Insert> key.)

Third, as you enter commands, the shell keeps them in an invisible "history list". You can use the <Up> and <Down> arrow keys to move backward and forward within this list. When you press <Up>, the current command vanishes and is replaced by the previous command. If you press <Up> again, you get the command before that. Thus, you can press <Up> one or more times to recall a previous command. If you go too far, press <Down> to move down in the list. You can then edit the line to your liking, and resubmit it by pressing <Return>.

What's in a Name?
Destructive backspace, Non-destructive backspace

When you press the <Backspace> key, it moves the cursor one position to the left while erasing a character. When you press the <Left> arrow key, it moves the cursor to the left without erasing a character.

In a sense, the two actions are similar in that they both move the cursor backwards. The only difference is whether or not anything is erased. To capture this idea, you will sometimes see the terms "destructive backspace" and "non-destructive backspace" used.

A DESTRUCTIVE BACKSPACE occurs when the cursor moves back and characters are erased. This is what happens when you press the <Backspace> key. A NON-DESTRUCTIVE BACKSPACE occurs when the cursor moves back but nothing is changed. This is what happens when you press the <Left> key.
Earlier in the chapter, we discussed how the way in which Unix handles the <Backspace> key can be traced back to the original Unix terminal, the Teletype ASR33. More specifically, there were two Teletype codes (BS and DEL) that were involved in erasing a character on paper tape. The Unix developers chose one of these codes to use for the erase signal.

Interestingly enough, they had the same type of choice when it came to deciding what should happen with the <Return> key. Moreover, the decision they made regarding the <Return> key turned out to be much more important than the one they made with the <Backspace> key. This is because the code they chose is used not only with the <Return> key but as a special marker that goes at the end of every line in a text file. To begin our discussion, we need to, once again, go back in time to the Teletype ASR33.

The Teletype ASR33 had a print head that used a ribbon to print characters on paper. As characters were printed, the print head moved from left to right. When the print head got to the end of a line, two things had to happen. First, the paper had to be moved up one line; second, the print head, which was attached to a "carriage", had to be returned to the far left.

To make the Teletype perform these actions, there were codes embedded in whatever data was being printed. The data could come from the keyboard, from an incoming communication line, or from the paper tape reader. The first code, CR (carriage return), returned the carriage to its leftmost position. The second code, LF (linefeed), caused the paper to be moved up one line. Thus, the sequence CR-LF performed the actions necessary to prepare to print a new line.

From the keyboard, you would send a CR code by pressing either the <Return> key or ^M. (They were equivalent.) You would send the LF code by pressing either the <Linefeed> key or ^J. (If you look at Figure 7-1 earlier in the chapter, the <Return> key is at the far right of the second row from the top. The <Linefeed> key is one position to the left.)

When the Unix developers came to use the Teletype as a terminal, they created two signals based on the CR and LF codes. The CR code became the RETURN signal. The LF code became the LINEFEED signal.

So now, let us ask a question: When you type at a Unix terminal, what happens when you press the <Return> key? Before I can answer that, I need to talk a bit about how Unix organizes plain text into files.
As we have discussed, Unix uses two signals based on the old Teletype: return and linefeed. From the keyboard, you send return by pressing ^M and linefeed by pressing ^J. Since return and linefeed are really the same as ^M and ^J, we usually refer to them as characters, rather than signals. Here is how they are used in three different situations.

First: When files contain textual data, we usually divide the data into lines. In Unix, we use a ^J character to mark the end of each line. When we use ^J in this way, we refer to it as a newline character, rather than linefeed. Thus, when a program reads data from a file, it knows it has reached the end of a line when it encounters a newline (that is, a ^J character).

Second: When you are typing characters at a terminal, you press <Return> at the end of the line. Doing so sends the return character, that is, ^M.

Third: When data is displayed, it is sent to your terminal one line at a time. At the end of each line, the cursor must be moved to the beginning of the next line. As with the Teletype, this involves two separate actions: a "carriage return" to move the cursor to the beginning of the line, followed by a "linefeed" to move the cursor down one line. For the "carriage return", Unix sends a return character (that is, ^M). For the "linefeed", Unix sends a linefeed character (that is, ^J). Thus, when data is displayed, each line must end with ^M^J.

One of the most elegant features of Unix is that data typed at the keyboard is treated the same as data read from a file. For example, say you have a program that reads a series of names, one per line. Such a program can read the names either from a file on your disk or from the keyboard. The program does not need to be written in a special way to have such flexibility. This feature, called "standard input", is built into Unix. Standard input allows all Unix programs to read data in the same way, without having to worry about the source of the data. (We will discuss this idea in Chapter 15.)

In order for standard input to work properly, every line of data must end with a newline. However, when you type characters at the keyboard, they come in with a return at the end of the line, not a newline. This creates a problem.

Similarly, when Unix programs output data, they can make use of "standard output". This allows all programs to write data in the same way, without having to worry about where the data is going. When data is written to a file, each line must end with a newline character (that is, ^J). However, when data is written to the terminal, each line must end with return+linefeed (^M^J). This creates a second problem.

These problems are reconciled in two ways. First, as you type, whenever you press <Return>, Unix changes the return into a newline. That is, it changes the ^M into ^J. Second, when data is being written to the terminal, Unix changes each newline to a return+linefeed. That is, it changes ^J to ^M^J.

Here is a quick summary to help you make sense of all this:

return = ^M
linefeed = newline = ^J
Within a text file, every line ends with a newline (^J).
When you press <Return> at the end of a line, the return (^M) is changed to a newline (^J).
When data is written to the terminal, each newline (^J) is changed to return+linefeed (^M^J).

At first, this may seem a bit confusing. Eventually, you will come to see that it all makes perfect sense, at which time you will know that you have finally started to think in Unix.

hint
Within text files, Unix marks the end of each line with a ^J (newline) character. Microsoft Windows, however, does it differently. Windows marks the end of each line with a ^M^J. (In Unix terms, that would be return+linefeed.) Thus, when you copy text files from Unix to Windows, each ^J must be changed to ^M^J. Conversely, when you copy files from Windows to Unix, each ^M^J must be changed to ^J.

When you use a program to copy files between two such computers, the program should know how to make the changes for you automatically. If not, there are utility programs available to do the job.
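
To make the conversion in the hint above concrete, here is a minimal sketch using the standard tr, awk and od programs. The function names (windows_to_unix, unix_to_windows, show_bytes) are made up for this illustration; they are not commands discussed in this book, and a dedicated conversion utility may handle more cases than this does.

```shell
# Hypothetical helper names, chosen for this illustration only.

# Windows to Unix: delete every return character (^M), so that
# each line ends with a newline (^J) alone.
# Note: this also deletes any ^M that is not at the end of a line.
windows_to_unix() {
    tr -d '\r'
}

# Unix to Windows: write each line followed by return+linefeed (^M^J).
# Assumes the input uses Unix line endings to begin with.
unix_to_windows() {
    awk '{ printf "%s\r\n", $0 }'
}

# Display data as hexadecimal bytes: 0d is ^M and 0a is ^J.
show_bytes() {
    od -An -tx1
}

# Two lines of text, first as Unix stores them, then as Windows does.
printf 'one\ntwo\n' | show_bytes
printf 'one\ntwo\n' | unix_to_windows | show_bytes
```

In the first display, each line should end with the single byte 0a. In the second, each line should end with the pair 0d 0a. Because the functions read standard input and write standard output, they can be used with files as well, for example: windows_to_unix < oldfile > newfile (where both file names are placeholders).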
Unless you are a programmer, it is not really necessary to master all the technical details regarding return and newline. Just remember to press <Return> at the end of each line and let Unix do the work. However, there are some situations in which understanding these ideas is helpful.

On rare occasions, the settings for your terminal may become so messed up that the terminal does not work properly. In such cases, there are two commands you can use to reset your terminal settings to reasonable values: stty sane or reset.

In rare cases, you may find that when you try to enter one of these commands by pressing <Return>, the return to newline conversion will not work, and Unix will not accept the command. The solution is to press ^J (the same as newline), which is all Unix wants anyway. Thus, when all else fails, typing one of the following commands may rejuvenate your terminal. Be sure to type ^J before and after the command. You can try it now if you want; it won't hurt anything.

<Ctrl-J>stty sane<Ctrl-J>
<Ctrl-J>reset<Ctrl-J>
You might ask, if that is the case, can you press ^J instead of <Return> to enter a command at any time? Of course; try it.

To show you how useful these commands can be, here is a true story. I have a friend, Susan, who was helping someone with a Linux installation. They were working with a program that allows you to choose which options you want included in the kernel. The program needed a particular directory that did not exist, so Susan pressed ^Z to pause the program. She now had an opportunity to create the directory.

However, it happened that the program, in order to keep the display from changing, had disabled the effect of the return character. This meant that, whenever Susan entered commands, the output would not display properly. (Remember, when Unix writes data to the terminal, it puts return+linefeed at the end of every line. What do you think happens when only the linefeed works?)

Susan, however, is nothing if not resourceful. She entered the reset command and, in an instant, the terminal was working properly. She then created the directory she needed, restarted the installation program, and lived happily ever after.
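
If you are curious whether the two conversions are currently turned on for your terminal, the stty command can tell you. The following is only a sketch under stated assumptions. On Linux and most other modern Unix systems, the setting that changes return into newline as you type is named icrnl, and the setting that changes newline into return+linefeed on output is named onlcr. These names do not appear in the discussion above, and the helper function report_setting is hypothetical, written just for this example.

```shell
# Hypothetical helper: report whether a named terminal setting is on.
# It reads the text printed by "stty -a" from standard input.
report_setting() {
    # Split the stty output into one word per line, then look for a
    # line that matches the name exactly. A setting that is turned
    # off is shown with a leading "-" (for example, -icrnl), so it
    # will not match. A name that does not appear at all is also
    # reported as off.
    if tr -s ' ;\t' '\n\n\n' | grep -qx -- "$1"; then
        echo "$1 is on"
    else
        echo "$1 is off"
    fi
}

# Only ask stty about the terminal when there actually is one.
if [ -t 0 ]; then
    stty -a | report_setting icrnl   # return to newline, on input
    stty -a | report_setting onlcr   # newline to return+linefeed, on output
fi
```

On a terminal that is working normally, both settings would usually be reported as on. If one of them is reported as off, that is the kind of situation in which stty sane or reset can help.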
A long time ago, there lived a young, handsome, charming programmer (you can tell this is a fable), who won the love of a beautiful princess. However, the night before their wedding, the princess was kidnapped.

Fortunately, the princess had the presence of mind to leave a trail of pearls from her necklace. The programmer followed the trail to a remote corner of the lawless Silicon Valley, where he discovered that his love was being held captive in an abandoned technical support center by an evil Vice President of Marketing.

Thinking quickly, the programmer took a powerful magnet and entered the building. He tracked down the princess and broke into the room where the VP of Marketing stood gloating over the terrified girl.

"Release that girl immediately," roared the programmer, "or I will use this magnet and scramble all your disks."

The VP pressed a secret button and in the blink of an eye, four more ugly, hulking vice presidents entered the room.

"On the other hand," said the programmer, "perhaps we can make a deal."

"What did you have in mind?" said the VP.

"You set me any Unix task you want," answered the programmer. "If I do it, the princess and I will go free. If I fail, I will leave, never to return, and the princess is yours."

"Agreed," said the VP, his eyes gleaming like two toady red nuggets encased in suet. "Sit down at this terminal. Your task will have two parts. First, using a single command, display the time and date."

"Child's play," said the programmer, as he typed date and pressed the <Return> key.

"Now," said the VP, "do it again." However, as the programmer once again typed date, the VP added, "but this time you are not allowed to use either the <Return> key or ^M."

"RTFM, you ignorant buffoon!" cried the programmer, whereupon he pressed ^J, grabbed the princess, and led her to his waiting Ferrari and a life of freedom.
Review Question #1: Why is it a Unix convention to use the abbreviation "tty" to refer to terminals?

Review Question #2: Why is it a Unix convention to use the word "print" to refer to displaying data on a monitor?

Review Question #3: What does the term "deprecated" mean?

Review Question #4: How does Unix know which terminal you are using?

Review Question #5: Which key do you press to erase the last character you typed? The last word? The entire line?

Applying Your Knowledge #1: By default, the erase key is the <Backspace> key (or on a Macintosh, the <Delete> key). Normally, this key is mapped to ^H or, less often, ^?. Use the stty command to change the erase key to the uppercase letter "X". Once you do this, you can erase the last character you typed by pressing "X". Test this. What happens when you press a lowercase "x"? Why? Now use stty to change the erase key back to the <Backspace> (or <Delete>) key. Test to make sure it worked.

For Further Thought #1: One way to log out is to press ^D (which sends the eof signal) at the shell prompt. Since you might do this by accident, you can tell the shell to ignore the eof signal. Why is this not the default? What does this tell you about the type of people who use Unix?

For Further Thought #2: In Chapter 1, I mentioned that the first version of Unix was developed by Ken Thompson, so he could run a program called Space Travel. In this chapter, I explained that the first program to use Termcap (terminal information database) and curses (terminal manager interface) was a text-based fantasy game called Rogue, written by Michael Toy and Glenn Wichman. Creating a new operating system and experimenting with a brand new set of interfaces are both extremely time-consuming, difficult tasks. What do you think motivated Thompson and, later, Toy and Wichman to take on such challenging work for what seem to be trivial reasons? If you were managing a group of programmers, what motivations do you think they would respond to (aside from money)?
© All contents Copyright 2024, Harley Hahn