Control characters in ASCII and Unicode
Tens of odd control characters appear in ASCII charts. The same characters have found their way to Unicode as well. CR, LF, ESC, CAN... what are all these codes for? Should I care about them? This is an in-depth look into control characters in ASCII and its descendants, including Unicode, ANSI and ISO standards.
When ASCII first appeared in the 1960s, control characters were an essential part of the new character set. Since then, many new character sets and standards have been published. Computing is not the same either. What happened to the control characters? Are they still used and if yes, for what?
This article looks back at the history of character sets while keeping an eye on modern use. The information is based on a number of standards released by ANSI, ISO, ECMA and The Unicode Consortium, as well as industry practice. In many cases, the standards define one use for a character, but common practice is different. Some characters are used contrary to the standards. In addition, certain characters were originally defined in an ambiguous or loose way, which has resulted in confusion in their use.
- Groups of control characters
- Control characters in standards
- Control characters in modern applications
- Character list
- Character index
This article starts by looking at the history of control characters in standards. We then move to modern times. The rest of the article lists all the control characters in detail.
The above clickable table summarizes the control characters. The character codes are given in hexadecimal. Color coding indicates character category. Click a character to jump to more information on it.
Groups of control characters
For the purposes of this document, control characters are divided into three groups.
1. ASCII control characters. The ASCII control character area covers code positions 0–31 (hex 00–1F). This area is also called the C0 set. Two additional controls appear at 32 and 127 (hex 20 and 7F). The ASCII control characters cover a wide range of uses, such as text layout, transmission and device control, and more.
2. C1 control characters. C1 covers positions 128–159 (hex 80–9F). C1 is primarily for displays and printers. This set is related to ANSI escape sequences and VT100.
3. ISO 8859 special characters. Two special characters, NBSP and SHY, are from ISO 8859. They are also used in Windows and Unicode. They appear at 160 and 173 (hex A0 and AD).
Note: These are not the only control character sets ever used. Other C0 and C1 sets do exist. Alternative sets were defined for special uses. In them, some of the standard C0/C1 controls have been deleted or replaced by new controls. Even totally different alternative sets exist. Alternative control characters are not discussed in this article. One can find them in the International Register of Coded Character Sets.
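The three groups above can be sketched as a small lookup. This is only an illustration of the code ranges given in this article; the function name `control_group` and the group labels are made up for the example, and the ranges apply to the standard C0/C1 sets only, not to the alternative sets mentioned in the note.

```python
def control_group(code: int) -> str:
    """Return the group (as used in this article) for a code position 0-255."""
    if not 0 <= code <= 0xFF:
        raise ValueError("expected a code position in the range 0-255")
    if code <= 0x1F:
        return "C0"                      # ASCII control area, 0-31
    if code in (0x20, 0x7F):
        return "ASCII additional"        # SP and DEL
    if 0x80 <= code <= 0x9F:
        return "C1"                      # 128-159
    if code in (0xA0, 0xAD):
        return "ISO 8859 special"        # NBSP and SHY
    return "graphic"                     # everything else


if __name__ == "__main__":
    for code in (0x0A, 0x20, 0x41, 0x7F, 0x85, 0xA0, 0xAD):
        print(f"{code:3d}  ${code:02X}  {control_group(code)}")
```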
Control characters in standards
ASCII control characters
C0 = positions 0–31. Originates from the ASCII and ISO 646 character sets. Characters SP and DEL appear together with C0.
The first group of control characters originates from ASCII. These characters consist of a set called C0 and two additional characters. The C0 set is in locations 0 to 31. Two additional ASCII characters, SP and DEL, fall outside the C0 area, but they are closely related to the C0 set. All of these characters are defined by the same standards.
This set of control characters covers many uses. There are "Format Effectors" that control the appearance of plain text. There are "Transmission Controls" for use with transmission protocols and "Device Controls" to start, operate and stop auxiliary devices. There are "Information Separators" that delimit various pieces of data. Other controls exist for producing alerts, filling a media, indicating end of media, and for dealing with errors. There are even controls to create new characters and controls. The C0 set was defined with perforated tape, punched cards and typewriter-like devices in mind. Devices have changed since then, but the C0 controls have survived.
History of ASCII control characters
The first version of ASCII was released in 1963. Like the ASCII of today, the 1963 version covered some letters and symbols, as well as control characters. While many of those 35 control characters were similar to those of modern ASCII, some were different. ASCII-1963 had some serious shortcomings, such as no support for lower case letters. It quickly turned out that the standard had to be revised. Today, ASCII-1963 is practically forgotten. Since ASCII-1963 deviates a lot from later ASCII versions in the control character area too, we will not go any deeper into it.
The next revision was ASCII-1965. This version, although formally accepted, was not published. Another revision was going to take place. ASCII as we know it is based on the ASCII-1967 standard (USAS X3.4-1967). This version was an important milestone. It was already very close to the version that then became widely used.
In 1968 ASCII was slightly updated and released as USAS X3.4-1968 (later retroactively renamed ANSI X3.4-1968). The actual updates were very small, only adding an option to use the character LF as a "newline", and designating ASCII and USASCII as the names of the standard. (Later on, the name USASCII was dropped, leaving ASCII as the official name.)
ASCII-1968 became immensely popular. Almost all of today's computer systems use ASCII or one of its descendants. (A notable exception is EBCDIC used on IBM mainframes, very different from ASCII.) The Internet is based on ASCII-1968 as well.
ASCII-1968 defined the 34 control characters that remained: the C0 set, SP and DEL. Included was a short description of the intended functionality of each control character. These definitions also made their way into RFC 20 word for word. Most of them have remained materially unchanged for decades. Later standards have updated the text, but the basic functionality is still the same. That is the situation as far as the standards go. Non-standard use is common and often contrary to the standards.
When ASCII emerged, computing equipment was quite different from the equipment on which ASCII later became popular. Computers were regularly operated through punched cards, perforated tape and teletypewriters (TTYs). TTYs were typewriter-like devices, which were used as interactive computer terminals. Instead of a monitor they produced output on paper. The ASCII control characters were naturally designed with the devices of those days in mind. Since then, new devices such as monitors have emerged. It hasn't always been simple to adapt the control characters to the newer devices. Despite the challenges, the control characters of the 1960s are still with us.
ISO 646. ASCII evolved into an ISO standard, which is known as ISO 646. The first version came out in 1967. ISO 646 is the "international edition" of ASCII, with a few differences. Despite the differences, these standards were closely related. ISO 646 allowed national variants to support the national characters required for each country. The US national variant was ASCII. Several other national variants were released to support accented letters (à, ü and the like) and other symbols. The ISO variants including ASCII were a common way to express text in the 1970s and 1980s.
As to the control characters, the ASCII control characters set also appeared in ISO 646. The functionality of the control characters remained quite intact, even though the definitions were updated.
More standards. ISO 646 was also released as ECMA-6. The control characters in ECMA-6 are very similar to those in ISO 646.
Some of the C0 codes were further refined in other standards. SI, SO and ESC appeared as character set extension controls in ANSI X3.41, ISO 2022 and ECMA-35. These characters became widely used to invoke additional character sets. The Transmission Control characters (TC1 to TC10) appeared in ISO 1745 in 1975, which gave a detailed description of where and how they should be used. How widely ISO 1745 was actually used in transmission is another question.
Current status of ASCII control characters
ASCII was later updated in 1977 and again in 1986 to be in conformance with ISO 646. The control characters in ASCII-1986 and ISO 646/ECMA-6 are very similar, even though minor differences do exist.
The current ISO and ECMA versions, namely ISO 646:1991 and ECMA-6:1991, no longer define the C0 control characters. The control characters didn't go away, however. They now appear in ISO/IEC 6429:1992 and ECMA-48:1991, respectively. Simply put, the C0 set was lumped together with other control characters, the C1 set, which follows below.
As to some specific control characters, the current detailed definitions of SI, SO and ESC can be found in ANSI X3.41, ISO 2022 and ECMA-35. The current details for the Transmission Control characters (TC1 to TC10) appear in the old ISO 1745 from 1975.
Even though the history of the various standards related to the ASCII control codes may sound unnecessarily complicated, the standard functionality of the characters has not changed dramatically. It's still mostly the same as back in 1967. That is the situation as far as the standards go. Practice is quite different. Some control characters are indeed commonly used the standard way. On the other hand, many are used contrary to the standards, or simply ignored. It's not uncommon to find control characters forbidden in data. Control characters can have unwanted or unknown side-effects. The easiest way for programmers to deal with them is to shut their eyes or reject such characters altogether.
C1 control characters
C1 = positions 128–159. Primarily for displays and printers.
The C1 set appeared in the late 1970s. It is primarily designed for controlling display and printer devices, even though some of the controls warrant other uses as well. The C1 set is intended for use with the C0 set.
The C1 set includes "Format effectors" that control horizontal and vertical movement when displaying or printing. There are "Presentation controls" for defining line-break behavior. There are "Area definition" controls for form filling. There are "Introducers" and "Shift Functions" to support extra controls and characters. Additional controls exist for sending command strings and setting an indicator. Some of the controls were intended to cover for shortcomings in the C0 set. Some controls were reserved: 2 controls are for private use, while 4 controls were (and still are) reserved for future standardization.
The C1 set occupies positions 128–159 in 8-bit environments. There are also escape codes to use the C1 set on 7-bit systems. The respective escape codes (ESC char) are given in the C1 list further below.
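As a rough sketch of the 7-bit representation: under the usual ECMA-48 convention, a C1 control at position 0x80–0x9F corresponds to ESC followed by the character 0x40 lower (0x40–0x5F). That offset is stated here as an assumption rather than quoted from this article, and the function names are hypothetical.

```python
ESC = 0x1B


def c1_to_7bit(code: int) -> bytes:
    """Return the two-byte ESC form of an 8-bit C1 control (0x80-0x9F)."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError("not a C1 control position")
    return bytes([ESC, code - 0x40])


def c1_from_7bit(seq: bytes) -> int:
    """Return the 8-bit C1 code for a two-byte ESC form."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError("not a 7-bit representation of a C1 control")
    return seq[1] + 0x40


if __name__ == "__main__":
    # NEL (0x85) becomes ESC E, CSI (0x9B) becomes ESC [
    for code in (0x85, 0x9B):
        print(f"${code:02X} -> {c1_to_7bit(code)!r}")
```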
History of C1
In 1979 ANSI released additional controls for use with ASCII (ANSI X3.64). This came to be known as the C1 set. A similar set was also released as ECMA-48. According to ANSI, the C1 controls were intended for
input/output control of two-dimensional character-imaging devices, including interactive terminals of both the cathode ray tube and printer types, as well as output to microfilm printers.
A bit later, in 1983, the C1 set was standardized as ISO 6429. Standard-wise, the C1 set has been volatile. Both ISO 6429 and ECMA-48 were updated several times. New control characters were added and definitions updated. One of the C1 characters (IND) was eventually deprecated and removed.
The standards actually cover more control codes than those that fit in the C1 area. These additional controls are used via control sequences (escape sequences). The sequences are beyond the subject of this article. Let it suffice that the sequences are an important part of the standards that should be used together with the C1 controls. The sequences, together with C1, are also known as VT100 and ANSI escape sequences.
Current status of C1
The current standards for C1 are ISO/IEC 6429:1992 and ECMA-48:1991. These standards now define both the C0 and C1 control characters.
Unicode allows the use of C1 (and C0 too). In fact, the C1 area has been entirely reserved for control codes in Unicode. In contrast, the (somewhat outdated) DOS and Windows codepages, i.e. character sets, have not reserved space for C1. Instead, they have included additional graphic characters in the C1 area. This doesn't prevent the use of C1 controls on DOS and Windows, though.
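The difference is easy to see by decoding the same bytes with different character sets. The sketch below uses Python's built-in codecs `latin-1`, `cp1252` and `cp437` as stand-ins for ISO 8859-1, a Windows codepage and a DOS codepage; the three sample bytes are arbitrary picks from the C1 area, and the helper name is made up for the example.

```python
import unicodedata

RAW = bytes([0x80, 0x85, 0x9B])  # three positions in the C1 area


def decode_with_categories(codec: str) -> list[tuple[str, str]]:
    """Decode RAW with the given codec and pair each character
    with its Unicode general category ('Cc' means control)."""
    text = RAW.decode(codec)
    return [(ch, unicodedata.category(ch)) for ch in text]


if __name__ == "__main__":
    for codec in ("latin-1", "cp1252", "cp437"):
        print(codec, decode_with_categories(codec))
```

With `latin-1` all three bytes decode to control characters (category Cc), whereas the Windows and DOS codepages yield graphic characters for the same positions.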
In practice, the C1 control characters are not very common. They are specialized codes for special applications.
ISO 8859 special characters NBSP and SHY
Positions 160 and 173.
ISO 8859 is a group of 8-bit extended character sets. The sets cover various Latin characters and also Cyrillic, Greek, Arabic, Hebrew and Thai characters. ISO 8859 is related to the Windows character sets ("ANSI codepages"), but these are actually different from each other.
Two characters in ISO 8859 are of interest to us: Non-Breaking Space (NBSP) and Soft Hyphen (SHY). They both have control character like properties, even though they are not actually called control characters in ISO 8859.
NBSP appears in position 160 (hex A0) and SHY in position 173 (hex AD). The same positions, and roughly the same meanings too, have been adopted in many of the Windows codepages and in Unicode.
Note: ISO 8859-8 Latin/Hebrew defines two additional special characters, namely LRM (left-to-right mark) and RLM (right-to-left mark). These characters are not universal in ISO 8859, but specific to Hebrew. Since LRM and RLM were not used in any other ISO 8859 character set, and since they do not appear in Unicode at the same positions, they are not further presented in this article.
Current status of NBSP and SHY
Several current standards include NBSP and SHY. They appear at the same positions in all of the following:
- ISO 8859-1 to 8859-16.
Exception: ISO 8859-11 Latin/Thai does not include SHY.
- Windows codepages 1250–1258.
- Unicode, block U+0080 C1 Controls and Latin-1 Supplement.
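A quick way to check the shared positions is to decode bytes A0 and AD with a few of these character sets. The sketch below only samples a handful of Python's built-in codecs (chosen arbitrarily for the example), not the full list above.

```python
# A small sample of ISO 8859 parts and Windows codepages available in Python.
CODECS = ["iso8859_1", "iso8859_2", "iso8859_15", "cp1250", "cp1252"]

NBSP = "\u00a0"  # NO-BREAK SPACE
SHY = "\u00ad"   # SOFT HYPHEN


def decode_specials(codec: str) -> str:
    """Decode bytes 0xA0 and 0xAD with the given codec."""
    return bytes([0xA0, 0xAD]).decode(codec)


if __name__ == "__main__":
    for codec in CODECS:
        print(codec, decode_specials(codec) == NBSP + SHY)
```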
Control characters in Unicode
Control characters have made their way to Unicode as well. Unicode recognizes control characters and explicitly allows their use. While Unicode doesn't obsolete control characters, it defines special rules for just a handful of them. Let the standard speak for itself:
The Unicode Standard provides for the intact interchange of these code points, neither adding to nor subtracting from their semantics. The semantics of the control codes are generally determined by the application with which they are used. However, in the absence of specific application uses, they may be interpreted according to the control function semantics specified in ISO/IEC 6429:1992. (Unicode 9.0 p. 822)
Unicode specifies semantics for the following control characters. The semantics appear to be in line with their original semantics, even though some differences may exist.
- ASCII control characters:
- HT and SP are considered whitespace.
- LF, VT, FF and CR are considered whitespace, and also mandatory line breaks in the line breaking algorithm.
- FS, GS, RS and US are considered separators in the bi-directional algorithm.
- C1 control characters:
- NEL is considered a mandatory line break in the line breaking algorithm, even though supporting it is optional.
- ISO 8859:
- NBSP and SHY. These characters are not actually control characters in Unicode. Instead, NBSP is "Separator, space" and SHY is "Other, format". Both characters have features in the line-breaking algorithm. With SHY, Unicode is significantly more elaborate than ISO 8859 in that Unicode suggests more hyphenation features than just displaying a hyphen.
Note: While no new control characters appear in Unicode, it does define some of its own special characters, such as formatting characters. These characters are beyond the scope of this article.
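Some of these properties can be inspected with Python's standard `unicodedata` module, as in the sketch below. It reports the general category and bidirectional class stored in the Unicode Character Database bundled with Python. It does not run the line breaking or bidirectional algorithms themselves, and `str.isspace` and `str.splitlines` reflect Python's own rules, which merely happen to build on the same data.

```python
import unicodedata


def unicode_props(ch: str) -> tuple[str, str]:
    """Return (general category, bidirectional class) for one character."""
    return unicodedata.category(ch), unicodedata.bidirectional(ch)


SAMPLES = [
    ("HT", "\t"), ("LF", "\n"), ("CR", "\r"),
    ("FS", "\x1c"), ("GS", "\x1d"), ("RS", "\x1e"), ("US", "\x1f"),
    ("NEL", "\x85"), ("NBSP", "\xa0"), ("SHY", "\xad"),
]

if __name__ == "__main__":
    for name, ch in SAMPLES:
        category, bidi = unicode_props(ch)
        print(f"{name:5} U+{ord(ch):04X}  category={category}  bidi={bidi}")
```

The controls come out as category Cc, while NBSP is Zs ("Separator, space") and SHY is Cf ("Other, format"), in line with the list above.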
From ASCII via ISO to Unicode
The following diagram summarizes the development of character standards. You can see how the control characters were propagated from ASCII (X3.4) and other standards to Unicode.
Control characters in modern applications
With so many control characters coming from the 1960s and 1970s, are they still useful for application programmers?
It depends on the application. Generally speaking, one needs control characters to work with old interfaces or devices. New protocols and file formats tend to use some other mechanism than control characters. Current formats typically use textual markup such as XML, which has little use for control characters beyond whitespace. On the device control side, unless you are writing device drivers, you control devices through operating system calls or library routines rather than sending them control strings to do tricks.
The following is a subjective list of which characters are still in common use and which ones are used less. The list is based on experience writing application software for Windows and DOS.
- ASCII control characters: some used, some not
- NUL is still common in everyday use. NUL terminates a string in many programming languages and interfaces.
- Transmission control characters (TC1 to TC10) are generally of little use. Data transfer is done through TCP/IP sockets, HTTP, FTP or some other protocol. Individual transmission control characters appear for special uses.
- BEL probably no longer appears in its original use. Rather than sending BEL to produce beeps, applications will rather play a tune via other means.
- Format effectors (FE0 to FE5) are possibly the most important control characters these days. Some of them, such as CR and LF, are essential for a system to work at all. HT is also very common, especially in plain text files. BS and FF are less common. VT appears only rarely if ever.
- Device control characters (DC1 to DC4) are not required to control devices, really. To control a device from an application you rather make a system call. On the other hand, you might still need XOFF (^S) or XON (^Q) in a command line session from time to time.
- SO, SI and ESC used to be common, but this has changed. One may still find them from time to time, presumably on older systems.
- CAN and EM are not in common use.
- SUB might no longer appear as a substitute. You will more likely see something like "?" or the Unicode REPLACEMENT CHARACTER (U+FFFD) as a substitute for a bad character. Another use for SUB still exists, though. You could find it at the end of a text file.
- Information separators (IS4 to IS1) are technically still valid. Whether anyone uses them to separate information is another question. Other techniques are used instead, such as XML or database systems. As a simple delimiter character, a NUL, HT, CR/LF, comma or semicolon is more common than any of the information separators originally designed for the purpose.
- SP must be the heaviest used control character of them all.
- DEL – well, did you ever see one?
- Characters ^A to ^Z (1 to 26) frequently appear as keyboard shortcuts in various applications and operating systems. The actual feature triggered by a keyboard shortcut is often unrelated to the respective control character. More of that follows below.
- C1 control characters: little use
- NEL is the only C1 character for which Unicode specifies semantics. The most probable case to run into NEL is when EBCDIC compatibility is required.
- The other C1 characters appear outdated now. That said, VT100 (which uses C1 extensively) is still a current method with Unix shell sessions, so C1 is alive there, maybe even everyday business for you. From a programmer's point of view, however, the entire C1 set is rarely used.
- ISO 8859 special characters: in use
Some frequently used characters, especially in a special field, may not have been mentioned. If you know frequent current uses for any of the characters, let us know.
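To illustrate what the information separators mentioned in the list were designed for, here is a minimal sketch that uses US (unit separator, hex 1F) between fields and RS (record separator, hex 1E) between records. The function names are hypothetical and this is not a format defined by any of the standards. The point is only that fields may then contain commas, tabs and line breaks without any quoting, as long as they never contain the separators themselves.

```python
US = "\x1f"  # unit separator (IS1): between fields
RS = "\x1e"  # record separator (IS2): between records


def encode_records(records: list[list[str]]) -> str:
    """Join fields with US and records with RS."""
    for fields in records:
        for field in fields:
            if US in field or RS in field:
                raise ValueError("field contains a separator character")
    return RS.join(US.join(fields) for fields in records)


def decode_records(data: str) -> list[list[str]]:
    """Split data produced by encode_records back into records and fields."""
    return [record.split(US) for record in data.split(RS)]


if __name__ == "__main__":
    rows = [["Smith, John", "line 1\nline 2"], ["a\tb", "x;y"]]
    print(decode_records(encode_records(rows)) == rows)
```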
Many of the control characters only appear rarely. How did this affect the space efficiency of 7-bit and 8-bit character sets? Instead of reserving space for control characters, it was possible to reuse these areas for additional graphics. This was actually done by DOS, Windows and Mac, all of which assigned graphic characters to the control character areas. Unicode chose to be different in this respect. Since its code space is much larger than 128 or 256, it was possible to reserve the C0/C1 areas entirely for control characters. This has helped the control characters to survive, if not in practical use, then at least in various code charts and lists.
Keyboards and control characters
Users can create many of the control characters from their keyboards. This usually happens in combination with the Ctrl key, and, more rarely, with the Esc key. There are also some special keys that produce control characters on their own. ←Backspace, Enter, Esc, Space and ↹ Tab are the usual ones.
Key presses and control characters, while having some things in common, are usually unrelated. Pressing a key combination doesn't generally trigger the functionality of the respective control character. As an example, while it's possible to press Ctrl+O to create an SO (Shift Out), pressing the keys seldom runs the operation associated with SO (pick an alternate character set). Instead, Ctrl+O might start an operation beginning with an O, such as "Open".
In some cases a key press does trigger the respective control character feature. Pressing the ↹ Tab key, or Ctrl+I, can indeed produce an HT (Horizontal Tabulation) and move the cursor forward on the line. This is an exception rather than the norm, though.
Some key combinations are more likely than others. Ctrl+A through Ctrl+Z (in other words, ^A to ^Z) are common keyboard shortcuts. Control key combinations with a symbol (^@, ^[, ^\, ^], ^^, ^_, ^?) are less common. There is a reason why such combinations should be avoided. Considerable variation exists with symbol keys in different keyboard layouts. A Ctrl and symbol key combination doesn't always produce the same control character, or any character at all, which makes it less useful as a keyboard shortcut.
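The caret notation used above follows a simple pattern: the control code differs from the code of the key's character by one bit (hex 40). The sketch below converts in both directions; the function names are made up for the example, and it says nothing about what a given keyboard layout or application actually does with the key press.

```python
def caret_to_code(notation: str) -> int:
    """Convert caret notation such as '^A' or '^?' to a control code."""
    if len(notation) != 2 or notation[0] != "^":
        raise ValueError("expected a caret followed by one character")
    code = ord(notation[1].upper()) ^ 0x40  # flip bit 6
    if not (code <= 0x1F or code == 0x7F):
        raise ValueError("not a control character")
    return code


def code_to_caret(code: int) -> str:
    """Convert a control code (0-31 or 127) to caret notation."""
    if not (0 <= code <= 0x1F or code == 0x7F):
        raise ValueError("not a control character")
    return "^" + chr(code ^ 0x40)


if __name__ == "__main__":
    for notation in ("^@", "^A", "^I", "^Z", "^[", "^_", "^?"):
        print(notation, caret_to_code(notation))
```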
In this article the focus is on the programmatic features of control characters. Less focus is put on the use of keyboard shortcuts.
About the character list
Next we are going to list every control character in detail. The column Dec refers to the decimal value of the control code ("ASCII value"). Hex is the same in hexadecimal, preceded by a dollar sign for clarity. An octal value is also given. The column Pos shows the row/column of the character in code charts.
The list shows key presses that (often) produce the control character on the keyboard. In addition, C-style escape sequences (\c) are provided where available, as are special constants supported by Visual Basic: classic version and Visual Basic .NET.
The last column lists mnemonics and graphic symbols. The symbols (in black) have been standardized, but they have fallen into disuse. The 2-letter mnemonics are standardized for the ASCII section. Additional 2-letter mnemonics for the C1 and ISO 8859 sections are taken from RFC 1345, which is not a standard, but is frequently referred to in this context.
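The numeric columns can all be derived from the character code. The following sketch shows one way to do it, assuming that Pos is simply the high and low four bits of the code written as two decimal numbers (as in 0/10 for LF); the function name `describe` is hypothetical.

```python
def describe(code: int) -> dict[str, str]:
    """Return the Dec, Hex, Oct and Pos columns for a character code 0-255."""
    if not 0 <= code <= 0xFF:
        raise ValueError("expected a code in the range 0-255")
    return {
        "Dec": str(code),
        "Hex": f"${code:02X}",                  # dollar sign prefix, two hex digits
        "Oct": f"{code:03o}",                   # three octal digits
        "Pos": f"{code >> 4}/{code & 0x0F}",    # high nibble / low nibble
    }


if __name__ == "__main__":
    for code in (0, 9, 10, 27, 127, 133, 160):
        print(describe(code))
```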
ASCII control characters (C0)
The ASCII control characters work in 7-bit and 8-bit environments, as well as in Unicode. These controls originate from a set of related standards: ASCII, ISO 646 and ECMA-6, and also ISO 6429 and ECMA-48. All of these characters are available in Unicode, too. The actual C0 set consists of characters NUL through US (0–31). Two additional characters, SP and DEL, are a part of ASCII and the related standards as well.
*) The 2-character mnemonics for the ASCII set are from ANSI X3.32, ISO 2047 and ECMA-17. So are also the graphic symbols. The symbols are outdated and rarely used. A couple of the symbols also have alternative forms.
|\0||^@||NUL is defined in the standards as a filler character. It can be used as media-fill or time-fill. NUL doesn't affect the information content of a data stream. It may affect the information layout and the control of equipment, though.|
|Note: NUL was originally intended as an ignorable filler character with no meaning. Especially convenient on paper tape, where a NUL equals no holes punched, it could be used to reserve space for new information or correcting errors. ASCII-1986 even suggests NUL as a "time-waster" character to be added after a newline to accommodate mechanical devices where a carriage return works slowly. Despite this, NUL has been used contrary to the standards in null-terminated strings as an End-Of-String marker. Several programming languages use this convention.|
|Constant in Visual Basic and VB.NET:
|1||$01||SOH||Start of Heading — TC1 Transmission control character 1||001||0/1||SH|
|^A||Indicates the beginning of a heading in a transmission. The heading can be terminated by STX. As per ASCII-1968, a heading constitutes a machine-sensible address or routing information. Later standards have dropped the explanation.|
|Note: SOH, along with STX and ETX, was intended for data transmission. It is not intended for marking a heading in a document.|
|2||$02||STX||Start of Text — TC2 Transmission control character 2||002||0/2||SX|
|^B||STX has two functions in a transmission: it 1) indicates the beginning of a text and 2) may terminate a heading (see SOH). As per ASCII-1968, text is what should be transmitted to a destination. Later standards have dropped the explanation.|
|3||$03||ETX||End of Text — TC3 Transmission control character 3||003||0/3||EX|
|^C||Terminates a text in a transmission. As per ASCII-1968, a text starts with STX and ends with ETX. Later standards don't necessarily require the pairing of STX with ETX.|
|Note: ETX may be used to call for reply from a slave station after a message has been sent. ETX is also commonly used to terminate an interactive process (keyboard: Ctrl+C).|
|Ctrl+Break on PC keyboard produces this character code.|
|4||$04||EOT||End of Transmission — TC4 Transmission control character 4||004||0/4||ET|
|^D||Indicates the conclusion of a transmission. The transmission may have contained one or more texts and associated heading(s).|
|Note: EOT can be used to end or abort a transmission. It can also be a reply to indicate inability to receive further messages. EOT (keyboard: Ctrl+D) is even used as an End-Of-File control in a Unix shell session.|
|5||$05||ENQ||Enquiry — TC5 Transmission control character 5||005||0/5||EQ|
|^E||Requests a response from a remote station. The response may include station identification or status. ENQ can be used as a "Who Are You" (WRU) to identify a remote station, especially after a new connection has been established.|
|6||$06||ACK||Acknowledge — TC6 Transmission control character 6||006||0/6||AK|
|^F||An affirmative response. Transmitted from a receiver as a response to the sender.|
|Note: ACK can indicate that a slave station has received a message correctly and is ready to receive more.|
|\a||^G||Calls for human attention. BEL may control alarm or attention devices.|
|Note: BEL is the only control character with an audible effect. It has been used to ring a bell (indeed) or produce a beep sound. A visual alarm is also possible.|
|In Unicode, this control character is abbreviated BEL but named ALERT, while the name BELL is confusingly used for a graphic character (🔔).|
|8||$08||BS||Backspace — FE0 Format effector 0||010||0/8||BS|
|\b||^H||Moves one character position backwards (keeping the previous character).|
|Note: Contrary to the standards, BS has been used as a combined "move back and delete" operation to remove the previous character. This is not the standard meaning of BS, however. BS is defined as a non-destructive "move back" or "move left" operation, similar to a backspace in mechanical typewriters. To delete the previous character, BS should be followed by DEL. On paper tape the result would be the previous character being completely punched out (erased). BS followed by another character would strike two characters in the same position. Overstriking was a way to produce combined characters. This option was intended to internationalize ASCII. A letter followed by BS followed by a diacritic symbol would produce an accented letter. As an example, u BS ^ would produce û. Several ASCII characters (" ' ` ^ ~ ,) were indeed defined to be used as diacritic symbols. Overstriking could also be suitable with other characters, such as for underlining with the "_" character or printing a slash "/" over "=" to produce "not equal". It could even be used to achieve a strike-through effect (perhaps with -, / or X) to indicate removed text. A boldface effect could be achieved by striking the same character several times at the same position.|
|Overstriking was a useful option with printing devices, but displays hardly support it. With the advent of more capable character sets and formatting techniques overstriking can be considered outdated. ASCII-1986 does not require overstriking capabilities and suggests that overstriking may be proscribed in the future. ISO 8859 explicitly forbids overstriking.|
|←Backspace on PC keyboard produces this character code.|
|Constant in Visual Basic and VB.NET:
|See also: CCH|
|9||$09||HT||Horizontal Tabulation — FE1 Format effector 1 (Character Tabulation)||011||0/9||HT|
|\t||^I||Advances to the next pre-determined character position (horizontal tab stop). HT could also be used as a skip function on punched cards.|
|Note: HT is commonly also abbreviated TAB.|
|Even though the standards don't set a universal tab width, a typical fixed tab width is 8 columns. Other tab widths, as well as custom tab positions, are used as well. HT is a simple method of data compression: a single character can represent several spaces in formatted text.|
|The ↹ Tab key on the keyboard is consistent with HT in that it usually produces the code HT. How the HT is treated in each application is another story. In windowing environments, there are three common alternative uses. Pressing ↹ Tab can either add an HT character into text, indent text (possibly by adding an appropriate number of spaces or shifting the margin), or do something completely different: jump to the next field or control in a graphical user interface. This way the key has been extended to cover more uses than what HT was originally intended for.|
|The original name of HT is Horizontal Tabulation. It was later renamed as HT Character Tabulation, first in ECMA-48:1986.|
|↹ Tab on PC keyboard produces this character code.|
|Constant in Visual Basic and VB.NET:
|10||$0A||LF||Line Feed — FE2 Format effector 2||012||0/10||LF|
|^J||LF has two alternative functions. It advances to the same character position on the next line (move down), or optionally to the first position on the next line (move to start of next line, i.e. newline). Originally LF was a move-down. A newline option (NL) was added soon afterwards. The option allowed LF to be used as a newline, which works like a combined CR LF. Use of LF as a newline requires agreement between sender and recipient of data. Universal agreement has not been reached.|
|Note: LF, having two alternative functions, has been a major source of confusion. While LF was initially defined as a "move down" operator, standards began to allow LF as a newline too. As a result, operating systems differ in their definition of a newline. A newline is LF on Unix. Operating systems using CR LF include CP/M, DOS, OS/2 and Windows. Naturally, this caused an incompatibility. To solve the problem, control characters IND and NEL were added to the C1 area. This did not solve the issue, resulting in IND being removed later. ECMA-6:1985 and ASCII-1986 attempted to clarify the situation by declaring LF deprecated for a newline and recommending CR LF instead. ECMA-48:1991 no longer allows LF to function as a newline.|
|The escape sequence for newline and LF is another source of confusion. \n is the common sequence for a newline, whereas there is no such sequence for a line feed. The actual control character(s) represented by \n depend on the system. In some cases, \n indeed represents LF, but it can also represent another newline sequence.|
|Ctrl+Enter on PC keyboard produces this character code.|
|Constant in Visual Basic and VB.NET:
|See also: CR IND NEL|
|11||$0B||VT||Vertical Tabulation — FE3 Format effector 3 (Line Tabulation)||013||0/11||VT|
|\v||^K||Advances to the same character position on the next pre-determined line. ASCII-1977 and ASCII-1986 optionally allow VT to advance to the first position on the next pre-determined line, if agreed on.|
|Note: The original name of VT is Vertical Tabulation. It was later renamed as VT Line Tabulation, first in ECMA-48:1986. VT has been used to jump down to the next pre-defined line when printing on a paper form. According to some sources, vertical tab stops were typically spaced 6 lines apart. VT is a simple data compression method where a single VT represents several LF characters (and optionally a CR too).|
|In modern use VT must be quite a rare character. As Bob Bemer, one of the original designers of ASCII, put it: "This is a very dangerous character to use. It cannot be used directly on any terminal that I know of. Even if it could, the implementation rules are not supplied unambiguously in the ASCII standard."|
|Constant in Visual Basic and VB.NET:
|12||$0C||FF||Form Feed — FE4 Format effector 4||014||0/12||FF|
|\f||^L||Advances to the next form or page. Standards differ in what column the subsequent character position will be in. Originally, ASCII-1968 did not define the column at all. ISO and ECMA standards declare that FF does not change the column. ASCII-1977 and ASCII-1986 optionally allow, by agreement, moving to the first column, as if FF was actually CR FF.|
|Note: FF has been used as "page break" in text files, "new page" on printers and "clear the screen" on displays. It was originally unclear whether FF was just a "new page" operator or "new page, move to column 1". ASCII-1977 and ECMA-6:1985 attempted to clarify the situation by recommending the use of CR FF. ASCII-1986 even implied that the "new page, move to column 1" option might be deleted in a future edition of ASCII.|
|Constant in Visual Basic and VB.NET:
|13||$0D||CR||Carriage Return — FE5 Format effector 5||015||0/13||CR|
|\r||^M||Traditional definition: Moves to the first position on the same line (ASCII, ISO 646, ECMA-6). Newer definition: Moves to the line home position or line limit position of the same line (ISO 6429, ECMA-48).|
|Note: The standard meaning of CR is "move to beginning of current line". This allows overprinting the line with new characters, which could be used to achieve underlining, for example. For advancing to the next line CR would be followed by LF. On CP/M, DOS, OS/2 and Windows the newline marker is CR LF, which conforms to the definition. CR alone has been used as the newline character on some systems, such as Commodore and Apple; this use does not conform to the standards in question. The order CR LF (instead of LF CR) may have been important on mechanical devices where a carriage return took relatively long to execute. A non-printing LF was a more suitable output while the print head was returning than a graphic symbol, which might have been struck in the middle of the line.|
|Enter on PC keyboard produces this character code.|
|Constant in Visual Basic and VB.NET:
|See also: LF|
|14||$0E||SO||Shift Out — LS1 Locking-Shift One||016||0/14||SO|
|^N||Used to extend the character set. SO may alter the meaning of the following bit combinations until an SI is reached. Between SO and SI, character positions 33-126 (decimal) may represent additional characters that would not otherwise fit in the regular character set.|
|Note: SO (Shift Out) is the normal name of this control. LS1 (Locking-Shift One) is used by ECMA-35 and ECMA-48. In those standards, SO is used in 7-bit environments and LS1 in 8-bit environments. The mechanism to select the alternative character set(s) was defined in ANSI X3.41, ISO 2022 and ECMA-35. It includes the use of escape sequences starting with ESC. SO has also been used on printers to select enlarged characters or another color.|
|15||$0F||SI||Shift In — LS0 Locking-Shift Zero||017||0/15||SI|
|^O||Used in conjunction with SO. It may reinstate the standard meanings of the characters following it.|
|Note: SI (Shift In) is the normal name of this control. LS0 (Locking-Shift Zero) is used by ECMA-35 and ECMA-48. In those standards, SI is used in 7-bit environments and LS0 in 8-bit environments. SI has also been used on printers to select condensed characters or to reset color.|
|16||$10||DLE||Data Link Escape — TC7 Transmission control character 7||020||1/0||DL|
|^P||Used to provide supplementary data transmission control functions. DLE changes the meaning of a limited number of following characters.|
|Note: DLE is the "escape" character for transmission control. DLE can potentially be placed in front of a transmission control character (TC1-TC10) to pass it through "as is" instead of controlling the current transmission. This is not always the case, though. It is possible to create new transmission control sequences with DLE in the same way that ESC is used to create escape sequences for other purposes. Contrary to the standards, ^P has been used as a keyboard shortcut to echo console activity at the printer.|
|17||$11||DC1||Device Control 1 — XON||021||1/1||D1|
|^Q||Intended to turn on or start an ancillary device, to restore it to the basic operation mode (see DC2 and DC3), or for any other device control function.|
|Note: DC1 is conventionally called XON when used in communication for software flow control. The meaning of XON is to continue data transmission after an XOFF (DC3) has been received. The name XON ("transmit on") does not come from a standard, but it is commonly used.|
|18||$12||DC2||Device Control 2||022||1/2||D2|
|^R||Intended for turning on or starting an ancillary device, setting it to a special mode (restored via DC1), or for any other device control function.|
|19||$13||DC3||Device Control 3 — XOFF||023||1/3||D3|
|^S||Intended for turning off or stopping an ancillary device. It may be a secondary level stop such as wait, pause, stand-by or halt (restored via DC1). It can also perform any other device control function.|
|Note: DC3 is conventionally called XOFF when used in communication for software flow control. An XOFF is issued to stop transmission when a device cannot accept more data. Transmission can be continued via XON (DC1). The name XOFF ("transmit off") does not come from a standard, but it is commonly used. The use of XOFF and XON is in line with the standards, even though not directly defined in them.|
|XOFF (^S) is sometimes used as a pause command. Continuing requires pressing XON (^Q). ^S also works as a pause on MS-DOS and in the Windows command prompt, where pressing any key continues.|
|20||$14||DC4||Device Control 4 (Stop)||024||1/4||D4|
|^T||Intended to turn off, stop or interrupt an ancillary device, or for any other device control function.|
|21||$15||NAK||Negative Acknowledge — TC8 Transmission control character 8||025||1/5||NK|
|^U||Negative response. Transmitted from a receiver as a response to the sender.|
|Note: NAK can be sent as a response to indicate inability to receive a message, or to request resending.|
|22||$16||SYN||Synchronous Idle — TC9 Transmission control character 9||026||1/6||SY|
|^V||Used as "time-fill" in synchronous transmission. Sent during an idle condition to retain a signal when there are no other characters to send.|
|Note: SYN has been used by synchronous modems, which have to send data constantly. Beginning each transmission with at least two SYN characters is a way to achieve synchronization. The receiving station may ignore SYN, since it doesn't belong to the actual data content.|
|23||$17||ETB||End of Transmission Block — TC10 Transmission control character 10||027||1/7||EB|
|^W||Indicates the end of a block of data. Used when data is divided into blocks for transmission.|
|Note: ETB, when used to end a block, may call for a reply from a slave station.|
|24||$18||CAN||Cancel||030||1/8||CN|
|^X||Indicates that data is in error or should be disregarded. Affects "the data with which it is sent" (ASCII-1968, ASCII-1977) or "the data preceding it" (ASCII-1986, ISO 646, ECMA-6, ECMA-48).|
|Note: There are two alternative definitions for the data to be disregarded. The actual scope of cancellation is undefined by the standards and should be defined case by case. ^X has been used as a keyboard shortcut to cancel (delete) the characters on the current line, a use that conforms to the standards.|
|25||$19||EM||End of Medium||031||1/9||EM|
|^Y||Identifies 1) the physical end of a medium, 2) the end of the used portion of a medium, or 3) the end of wanted data on a medium.|
|Note: EM may have been suitable for paper tape or magnetic tape to say "no more data". Disk file systems use more sophisticated ways to keep track of the used and unused areas of the medium.|
|This character is commonly abbreviated EM. Unicode is an exception: its alias for the character uses the abbreviation EOM.|
|26||$1A||SUB||Substitute||032||1/10||SB|
|^Z||Used in place of an invalid or erroneous character. Introduced by automatic means in cases like a transmission error.|
|Note: When SUB is used as a substitution character, the reverse question mark symbol seems quite good as its visual representation. Compare SUB to Unicode U+FFFD REPLACEMENT CHARACTER.|
|SUB has often been used contrary to the standards. On CP/M and MS-DOS, it appears as an End-Of-File marker for text files (^Z). On Unix, ^Z is a keyboard signal to suspend a foreground process.|
|27||$1B||ESC||Escape||033||1/11||EC|
|\e||^[||The first character of an escape sequence. Provides either supplementary characters or additional control functions. ESC changes the meaning of a limited number of following characters.|
|Note: ESC is used to form escape sequences, which perform various control functions or apply additional character sets. ESC can also be used to invoke the C1 control characters on a 7-bit system that only supports character positions 0–127.|
|On the keyboard, the Esc key sometimes does produce the ESC control character. In windowing environments, the key typically cancels a dialog or an operation, rather than producing a control character or starting an escape sequence. This kind of "escape" is not based on the character standards, however. The closest ASCII equivalent for canceling a dialog would be CAN, but since there is no Can key on common keyboards, it can't be used.|
|Esc on PC keyboard produces this character code.|
|28||$1C||FS||File Separator — IS4 Information separator 4||034||1/12||FS|
|^\||The four information separators (FS, GS, RS and US) are used to separate and qualify data. Each separator has two alternative names: Information Separator Four equals File Separator, Information Separator Three equals Group Separator, Information Separator Two equals Record Separator and Information Separator One equals Unit Separator. The separators can be used either hierarchically or in a non-hierarchical manner. When used hierarchically, the order is US (least inclusive), RS, GS and FS (most inclusive). The content and length of a file, group, record or unit are not specified by the standards.|
|FS, when used in a hierarchical order, delimits a data item called a file. It can also delimit anything else.|
|29||$1D||GS||Group Separator — IS3 Information separator 3||035||1/13||GS|
|^]||GS, when used in a hierarchical order, delimits a data item called a group. It can also delimit anything else.|
|30||$1E||RS||Record Separator — IS2 Information separator 2||036||1/14||RS|
|^^||RS, when used in a hierarchical order, delimits a data item called a record. It can also delimit anything else.|
|31||$1F||US||Unit Separator — IS1 Information separator 1||037||1/15||US|
|^_||US, when used in a hierarchical order, delimits a data item called a unit. It can also delimit anything else.|
|Note: The information separators were deliberately arranged next to SPACE, which can also be used as an information separator (word separator).|
|32||$20||SP||Space||040||2/0||SP|
|Moves one character position forwards. Space may also have a function equivalent to that of an information separator.|
|Note: Space has a dual nature. It can be classified as both a control character and a (non-printing) graphic character. SP is similar to a Format Effector. It can also be used as a fifth Information Separator. Space is sometimes represented by the symbol ƀ or ␢ (b with a stroke) or ␣ (open box). SP does not belong to the C0 set.|
|Spacebar on PC keyboard produces this character code.|
|See also: NBSP|
|127||$7F||DEL||Delete||177||7/15||DT|
|^?||Outdated. An ignorable character originally intended for erasing an erroneous or unwanted character in punched tape. In this standard use, DEL wouldn't affect the information content of data, even though it may have affected the information layout and the control of equipment. Standards also allowed DEL to be used as media-fill or time-fill (even though a NUL may be more appropriate).|
|Note: DEL is now outdated. It was removed from the latest standards (ECMA-48 in 1991 and ISO 6429 in 1992). The origin of DEL is with perforated paper. On that, DEL was equal to "all holes punched", which is a way to invalidate an erroneous character (rubout). In a sense, DEL is similar to NUL, since both characters mean "nothing". ASCII-1977 suggests the use of DEL as a "time waster" to accommodate mechanical devices where a carriage return takes time to execute. ASCII-1986 recommends NUL as a time waster instead of DEL. DEL does not belong to the C0 set, but is an individual control code.|
|Ctrl+←Backspace on PC keyboard produces this character code.|
|See also: NUL|
\x is what you write in a C program to produce the given control character. ^X means you press Ctrl+X to produce the given control character.
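The ^X notation follows a simple arithmetic rule: the control code equals the code of the printable character X with the bit of value 64 (hex 40) toggled. For example, J is hex 4A, so ^J is hex 0A (LF), and ? is hex 3F, so ^? is hex 7F (DEL). The following Python sketch illustrates the rule. The function names caret_to_code and code_to_caret are made up for this example and do not come from any standard.

```python
def caret_to_code(caret: str) -> int:
    """Convert caret notation such as '^J' or '^?' to a control code."""
    if len(caret) != 2 or caret[0] != "^":
        raise ValueError(f"not caret notation: {caret!r}")
    ch = caret[1]
    if "a" <= ch <= "z":
        ch = ch.upper()          # Ctrl+j and Ctrl+J are treated alike here
    key = ord(ch)
    if not 0x3F <= key <= 0x5F:  # '?', '@', 'A'..'Z', '[', '\', ']', '^', '_'
        raise ValueError(f"no control character for {caret!r}")
    return key ^ 0x40            # toggle the bit of value 64


def code_to_caret(code: int) -> str:
    """Convert a control code (0-31 or 127) to caret notation."""
    if not (0 <= code <= 0x1F or code == 0x7F):
        raise ValueError(f"not an ASCII control code: {code}")
    return "^" + chr(code ^ 0x40)


print(hex(caret_to_code("^J")))   # code of LF
print(code_to_caret(0x1B))        # caret notation of ESC
```

Whether a given Ctrl+key combination actually delivers that code to an application depends on the keyboard driver and the terminal or windowing environment in use.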
C1 control characters
The C1 control characters work in 8-bit environments. These controls come from three related standards: ANSI X3.64, ISO 6429 and ECMA-48. All of these characters are available in Unicode, too. There are three unassigned control characters: PAD, HOP and SGCI. They were planned for use in the failed draft DIS 10646, but they were never actually standardized or put to use. Despite this, one can find these control characters in various C1 lists online, and also as aliases in later Unicode standards.
†) The 2-character mnemonics for C1 are from RFC 1345. They are not standardized.
|128||$80||PAD||unassigned, "Padding Character"||200||8/0||PA|
|ESC @||A reserved control code. Intended for use as PAD Padding Character in draft DIS 10646, rejected, never standardized (not accepted to ISO 10646).|
|Note: Not part of ISO/IEC 6429 or ECMA-48.|
|Unicode lists this character as XXX and provides PAD as an alias.|
|129||$81||HOP||unassigned, "High Octet Preset"||201||8/1||HO|
|ESC A||A reserved control code. Intended for use as HOP High Octet Preset in draft DIS 10646, rejected, never standardized (not accepted to ISO 10646).|
|Note: Not part of ISO/IEC 6429 or ECMA-48.|
|Unicode lists this character as XXX and provides HOP as an alias.|
|130||$82||BPH||Break Permitted Here||202||8/2||BH|
|ESC B||A point where a line break may occur.|
|Note: Roughly equivalent to a soft hyphen except that the means for indicating a line break is not necessarily a hyphen. Compare to Unicode U+200B ZERO WIDTH SPACE.|
|131||$83||NBH||No Break Here||203||8/3||NH|
|ESC C||A point where a line break may not occur.|
|Note: Compare to Unicode U+2060 WORD JOINER.|
|132||$84||IND||Index||204||8/4||IN|
|ESC D||Moves to the next line keeping the current horizontal position.|
|Note: According to ECMA-48:1986, IND was provided for use in those cases where LF was implemented as New Line. IND was deprecated in 1988 and withdrawn in 1992 from ISO/IEC 6429 (1986 and 1991 respectively for ECMA-48).|
|See also: LF RI|
|133||$85||NEL||Next Line||205||8/5||NL|
|ESC E||Moves to the first position of the next line. Alternatively, to line home or line limit position.|
|Note: NEL maps to the control character NL (New Line) in the EBCDIC character set used on IBM mainframes.|
|See also: LF|
|134||$86||SSA||Start of Selected Area||206||8/6||SA|
|ESC F||Starts a string of character positions whose contents can be transmitted. The string ends at ESA (or end of display).|
|135||$87||ESA||End of Selected Area||207||8/7||ES|
|ESC G||Ends a string of character positions (started by SSA) whose contents can be transmitted.|
|136||$88||HTS||Horizontal Tabulation Set, Character Tabulation Set||210||8/8||HS|
|ESC H||Sets a tab stop at the active position.|
|Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed HTS as Character Tabulation Set.|
|137||$89||HTJ||Horizontal Tabulation with Justification, Character Tabulation with Justification||211||8/9||HJ|
|ESC I||Moves text to the following tab stop. The text is what comes after the previous tab stop up to the active position.|
|Note: This character has several names. ANSI X3.64 originally called it Horizontal Tabulation with Justify. ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed HTJ as Character Tabulation with Justification.|
|138||$8A||VTS||Vertical Tabulation Set, Line Tabulation Set||212||8/10||VS|
|ESC J||Sets a vertical tab stop at the active line.|
|Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 have renamed VTS as Line Tabulation Set.|
|139||$8B||PLD||Partial Line Down, Partial Line Forward||213||8/11||PD|
|ESC K||Moves down so that following characters will appear as subscripts. Subscripts end at the next PLU.|
|Note: ISO 6429:1992 and ECMA-48:1991 have renamed PLD as Partial Line Forward. Sample: text PLD subscript PLU text.|
|140||$8C||PLU||Partial Line Up, Partial Line Backward||214||8/12||PU|
|ESC L||Moves up so that following characters will appear as superscripts. Superscripts end at the next PLD.|
|Note: ISO 6429:1992 and ECMA-48:1991 have renamed PLU as Partial Line Backward. Sample: text PLU superscript PLD text.|
|141||$8D||RI||Reverse Index, Reverse Line Feed||215||8/13||RI|
|ESC M||Moves to the previous line keeping the current horizontal position.|
|Note: ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991 renamed RI as Reverse Line Feed, apparently related to the removal of IND.|
|See also: IND|
|142||$8E||SS2||Single Shift Two||216||8/14||S2|
|ESC N||Used to extend the character set. The next character will be from the currently chosen G2 set.|
|Note: For more information see ISO 2022 or ECMA-35. The next character should be in the decimal range 33-126 or 32-127.|
|143||$8F||SS3||Single Shift Three||217||8/15||S3|
|ESC O||Used to extend the character set. The next character will be from the currently chosen G3 set.|
|Note: For more information see ISO 2022 or ECMA-35. The next character should be in the decimal range 33-126 or 32-127.|
|144||$90||DCS||Device Control String||220||9/0||DC|
|ESC P||Starts a device control string. ST ends the string. The control string may include commands to the receiving device, or a status report from the sending device.|
|145||$91||PU1||Private Use One||221||9/1||P1|
|ESC Q||Reserved for private use, no standardized meaning.|
|146||$92||PU2||Private Use Two||222||9/2||P2|
|ESC R||Reserved for private use, no standardized meaning.|
|147||$93||STS||Set Transmit State||223||9/3||TS|
|ESC S||Notifies that data is ready for transfer from a device (ANSI X3.64), or establishes the transmit state in the receiving device (ISO 6429, ECMA-48). Doesn't initiate the actual transmission.|
|148||$94||CCH||Cancel Character||224||9/4||CC|
|ESC T||Ignore the preceding graphic character (and CCH itself too). If the previous character is a control character or sequence, ANSI X3.64 says it should be ignored, while ISO 6429 and ECMA-48 leave the action undefined.|
|Note: Destructive backspace. Intended to eliminate ambiguity about the meaning of BS.|
|See also: BS|
|149||$95||MW||Message Waiting||225||9/5||MW|
|ESC U||Sets a message waiting indicator in the receiving device.|
|150||$96||SPA||Start of Guarded Protected Area, Start of Protected Area, Start of Guarded Area||226||9/6||SG|
|ESC V||Starts a string of character positions that can't be altered manually or transmitted. Optionally protects against erasure too. EPA will end the string.|
|Note: SPA is known as Start of Protected Area (ANSI X3.64, ECMA-48:1979), Start of Guarded Protected Area (ISO 6429:1983, ECMA-48:1984) and Start of Guarded Area (ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991).|
|151||$97||EPA||End of Guarded Protected Area, End of Protected Area, End of Guarded Area||227||9/7||EG|
|ESC W||Ends the area started by SPA.|
|Note: EPA is known as End of Protected Area (ANSI X3.64, ECMA-48:1979), End of Guarded Protected Area (ISO 6429:1983, ECMA-48:1984) and End of Guarded Area (ISO 6429:1992, ECMA-48:1986 and ECMA-48:1991).|
|152||$98||SOS||Start of String||230||9/8||SS|
|ESC X||Starts a control string. The string ends at ST. It cannot contain an SOS. The interpretation of the string depends on the application.|
|153||$99||SGCI||unassigned, "Single Graphic Character Introducer"||231||9/9||GC|
|ESC Y||A reserved control code. Intended for use as SGCI Single Graphic Character Introducer in draft DIS 10646, rejected, never standardized (not accepted to ISO 10646).|
|Note: Not part of ISO/IEC 6429 or ECMA-48.|
|Unicode lists this character as XXX and provides SGC as an alias.|
|154||$9A||SCI||Single Character Introducer||232||9/10||SC|
|ESC Z||A reserved control code. The name was standardized as SCI Single Character Introducer, but the actual functionality was not implemented in the standards.|
|Note: SCI was to be followed by a single byte, which would represent a control function or a graphic character. The functions or characters were not defined in the standards.|
|155||$9B||CSI||Control Sequence Introducer||233||9/11||CI|
|ESC [||Starts a control sequence.|
|156||$9C||ST||String Terminator||234||9/12||ST|
|ESC \||Closes a string opened by APC, DCS, OSC, PM or SOS.|
|157||$9D||OSC||Operating System Command||235||9/13||OC|
|ESC ]||Starts an operating system control string. The string ends at ST and is interpreted subject to the operating system.|
|158||$9E||PM||Privacy Message||236||9/14||PM|
|ESC ^||Starts a privacy message. ST will end the message.|
|159||$9F||APC||Application Program Command||237||9/15||AC|
|ESC _||Starts an application program command string. ST will end the command. The interpretation of the command is subject to the program in question.|
ESC X means you press Esc followed by X to produce this control character.
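The ESC column above follows a fixed pattern: in a 7-bit environment, a C1 control at hex 80 to 9F is represented by ESC followed by the character whose code is hex 40 lower. For example, CSI (hex 9B) becomes ESC [ (hex 1B 5B). A minimal Python sketch of this mapping follows. The function names c1_to_escape and escape_to_c1 are hypothetical, chosen for this example only.

```python
ESC = 0x1B


def c1_to_escape(code: int) -> bytes:
    """Return the 7-bit two-byte form (ESC + final byte) of a C1 control."""
    if not 0x80 <= code <= 0x9F:
        raise ValueError(f"not a C1 control code: {code:#x}")
    return bytes([ESC, code - 0x40])


def escape_to_c1(seq: bytes) -> int:
    """Return the 8-bit C1 code for a two-byte ESC sequence."""
    if len(seq) != 2 or seq[0] != ESC or not 0x40 <= seq[1] <= 0x5F:
        raise ValueError(f"not a 7-bit C1 representation: {seq!r}")
    return seq[1] + 0x40


print(c1_to_escape(0x9B))             # 7-bit form of CSI
print(hex(escape_to_c1(b"\x1bE")))    # 8-bit code of NEL
```

The sketch covers only the two-byte representation of the C1 set. Longer escape sequences, such as those used for selecting character sets under ISO 2022, are outside its scope.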
ISO 8859 special characters
The two special characters, NBSP and SHY, are not really control characters. They are graphic characters with a special feature. The characters also appear in Unicode. They are included here for the sake of completeness.
‡) The 2-character mnemonics for NBSP and SHY are from RFC 1345. They are not standardized.
|160||$A0||NBSP||No-Break Space||240||10/0||NS|
|A space for use when a line break is to be prevented.|
|Note: NBSP can sometimes be produced by pressing Ctrl+Shift+SPACE. No universally supported key combination exists.|
|In HTML you can write
|See also: SP|
|173||$AD||SHY||Soft Hyphen||255||10/13||--|
|Indicates an intraword break point for use when a word must be broken across lines. The visual rendering either is a hyphen (ISO 8859) or varies (Unicode).|
|Note: SHY can sometimes be produced by pressing Ctrl+-. No universally supported key combination exists.|
|In HTML you can write
A summary of character categories. Mostly based on ANSI X3.64, ISO 6429, ECMA-6 and ECMA-48.
- Delimiters (Control string delimiters)
- Delimiters start and end a control string. A control string consists of an opening delimiter, a command string or a character string, and a terminating delimiter (ST).
- APC DCS OSC PM SOS ST
- Introducers
- An introducer is a control character or escape sequence that begins a sequence. The sequence is interpreted as a single graphic character or control.
- CSI ESC SCI
- Shift function characters (was: Code extension control)
- Shift function characters are used to extend the character set of the code. They may alter the meaning of one or more characters that follow them.
- SI SO SS2 SS3
- Format effectors (also: Layout characters)
- Format effectors are mainly intended for the control of the layout and positioning of information. Format effectors (most of them) are data which happen to have a format representation rather than a graphic representation.
- BS CR FF HT HTJ HTS IND LF NEL PLD PLU RI VT VTS
- Information separators
- Information separators separate and qualify data logically. They may be used either in hierarchical order or non-hierarchically. Their specific meanings depend on the application.
- FS GS RS US
- Presentation control characters
- Presentation control characters indicate where a line break may or may not occur.
- BPH NBH
- Graphic characters
- Graphic characters appearing here are those that have control-character-like properties.
- NBSP SHY SP
- Area definition characters (was: Form filling)
- Area definition characters are used for entering information into a preformatted visual display.
- EPA ESA SPA SSA
- Device control characters
- Device control characters are intended for the control of local or remote or ancillary devices. They are not intended to control data communication systems; this should be done with transmission control characters.
- DC1 DC2 DC3 DC4
- Transmission control characters (was: Communication control)
- Transmission control characters are intended to control or facilitate transmission of information over telecommunication networks.
- ACK DLE ENQ EOT ETB ETX NAK SOH STX SYN
- Miscellaneous
- Miscellaneous control characters fall outside other categories.
- BEL CAN CCH DEL EM MW NUL PU1 PU2 STS SUB
- Not assigned
- Unassigned control characters are ones that were not standardized. Their location was reserved for future standardization. These characters are known by names that appeared in a draft (DIS 10646), even though they didn't make it to the final standard.
- PAD HOP SGCI
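As an illustration of hierarchical use of the information separators, the following Python sketch splits a string into files, groups, records and units, with FS as the most inclusive separator and US as the least inclusive. The function name split_hierarchy and the sample data are hypothetical. The standards do not prescribe the content or length of any of these items, so this is only one possible convention.

```python
FS, GS, RS, US = "\x1c", "\x1d", "\x1e", "\x1f"


def split_hierarchy(data: str) -> list:
    """Split data into nested lists: files > groups > records > units."""
    return [
        [
            [record.split(US) for record in group.split(RS)]
            for group in file.split(GS)
        ]
        for file in data.split(FS)
    ]


# One file holding one group of two records, each with two units.
sample = "Ada" + US + "1815" + RS + "Alan" + US + "1912"
for record in split_hierarchy(sample)[0][0]:
    print(record)
```

Because the separators are control characters, they are unlikely to occur inside ordinary text fields, which avoids the quoting rules needed when a printable character such as a comma is used as the delimiter.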
The translated terms are taken from the given standards. Several alternative translations may exist.
|Standard||Unicode 5.0||GOST 34.301‑91, GOST 34.302.2‑91||T.53, T.51, T.50||DIN 66003||SFS 4017 *|
|SOH||Start of Heading||DET||début d'en-tête||НЗ||начало заголовка||comienzo de encabezamiento||Anfang des Kopfes||otsikon alku|
|STX||Start of Text||DTX||début de texte||НТ||начало текста||comienzo de texto||Anfang des Textes||tekstin alku|
|ETX||End of Text||FTX||fin de texte||КТ||конец текста||fin de texto||Ende des Textes||tekstin loppu|
|EOT||End of Transmission||FTR||fin de transmission||КП||конец передачи||fin de transmisión||Ende der Übertragung||tekstin loppu|
|ACK||Acknowledge||ACC||accusé de réception [positif]||ДА||подтверждение||acuse de recibo||Positive Rückmeldung||kuittaus|
|BS||Backspace||EFF||espace arrière||ВШ||возврат на шаг||retroceso||Rückwärtsschritt||peruutus|
|HT||Horizontal Tabulation||TAB||tabulation horizontale||ГТ||горизонтальная табуляция||tabulación de caracteres||Horizontal-Tabulator||sarakeohjaus|
|LF||Line Feed||PAL||changement de ligne||ПС||перевод строки||cambio de renglón||Zeilenvorschub||riviaskel|
|VT||Vertical Tabulation||TAV||tabulation verticale||ВТ||вертикальная табуляция||tabulación vertical||Vertikal-Tabulator||rivitys|
|FF||Form Feed||SDP||saut de page, page suivante||ПФ||перевод формата||página siguiente||Formularvorschub||sivun vaihto|
|CR||Carriage Return||RC||retour de chariot||ВК||возврат каретки||retorno del carro||Wagenrücklauf||vaunun palautus|
|SO||Shift Out||HC||hors code||ВЫХ||выход||cambio-salida||Dauerumschaltung||koodinvaihto|
|SI||Shift In||EC||en code||ВХ||вход||cambio-entrada||Rückschaltung||koodinpalautus|
|DLE||Data Link Escape||ÉCT||échappement transmission||AP1||авторегистр один||escape de enlace de datos||Datenübertragungsumschaltung||ohjauskoodin poikkeus|
|DC1||Device Control 1||CD1||commande d'appareil un||СУ1||символ устройства один||control de dispositivo uno||Gerätesteuerung 1||laitteen ohjaus 1|
|DC2||Device Control 2||CD2||commande d'appareil deux||СУ2||символ устройства два||control de dispositivo dos||Gerätesteuerung 2||laitteen ohjaus 2|
|DC3||Device Control 3||CD3||commande d'appareil trois||СУ3||символ устройства три||control de dispositivo tres||Gerätesteuerung 3||laitteen ohjaus 3|
|DC4||Device Control 4 (Stop)||CD4||commande d'appareil quatre||СУ4||символ устройства четыре||control de dispositivo cuatro||Gerätesteuerung 4||laitteen ohjaus 4|
|NAK||Negative Acknowledge||ACN||accusé de réception négatif||НЕТ||отрицание||acuse de recibo negativo||Negative Rückmeldung||kielteinen kuittaus|
|SYN||Synchronous Idle||SYN||synchronisation||СИН||синхронизация||reposo síncrono||Synchronisierung||tahditus|
|ETB||End of Transmission Block||FBT||fin de bloc de transmission||КБ||конец блока||fin de bloque de transmisión||Ende des Übertragungsblocks||jaksonsiirron loppu|
|EM||End of Medium||FS||fin de support||КН||конец носителя||fin del medio físico||Ende der Aufzeichnung||tietovälineen loppu|
|ESC||Escape||ÉCH||échappement||АР2||авторегистр два||escape||Umschaltung||koodin poikkeus|
|FS||File Separator||SF||séparateur de fichiers||РФ||разделитель файлов||separador de fichero||Hauptgruppen-Trennung||tiedoston erotusmerkki|
|GS||Group Separator||SG||séparateur de groupes||РГ||разделитель групп||separador de grupo||Gruppen-Trennung||ryhmän erotusmerkki|
|RS||Record Separator||SA||séparateur d'enregistrements, séparateur d'articles||РЗ||разделитель записей||separador de registro||Untergruppen-Trennung||tietueiden erotusmerkki|
|US||Unit Separator||SSA||séparateur de sous-articles||РЭ||разделитель элементов||separador de unidad||Teilgruppen-Trennung||yksikön erotusmerkki|
|PAD||"Padding Character"||caractère de bourre|
|HOP||"High Octet Preset"||octet supérieur prédéfini|
|BPH||Break Permitted Here||API||arrêt permis ici||РПС||разрешение переноса строки||corte permitido aquí|
|NBH||No Break Here||PAI||aucun arrêt ici||ЗПС||запрет переноса строки||corte no permitido aquí|
|NEL||Next Line||NL||à la ligne||НС||новая строка|
|SSA||Start of Selected Area||DZS||début de zone sélectionnée||НВО||начало выбранной области|
|ESA||End of Selected Area||FZS||fin de zone sélectionnée||КВО||конец выбранной области|
|HTS||Horizontal Tabulation Set||TTH||taquet de tabulateur horizontal||УГТ||установка горизонтальной табуляции|
|HTJ||Horizontal Tabulation with Justification||THJ||tabulateur horizontal avec justification||ГТВ||горизонтальная табуляция с выключкой|
|VTS||Vertical Tabulation Set||TTV||taquet de tabulateur vertical||УВТ||установка вертикальной табуляции|
|PLD||Partial Line Down||IPav||interligne partiel avant||CCB||смещение строки вперед||avance de línea parcial|
|PLU||Partial Line Up||IPar||interligne partiel arrière||CCH||смещение строки назад||retroceso de línea parcial|
|RI||Reverse Index||IR||index renversé, interligne inversé||ОПС||обратный перевод строки||cambio de renglón inverso|
|SS2||Single Shift Two||RU2||remplacement unique deux||ПЕ2||переключатель единичный два||cambio individual dos|
|SS3||Single Shift Three||RU3||remplacement unique trois||ПЕ3||переключатель единичный три||cambio individual tres|
|DCS||Device Control String||CCA||chaîne de commande d'appareils||УЦУ||управляющая цепочка устройства|
|PU1||Private Use One||UP1||usage privé un||ЧИ1||частное использование один|
|PU2||Private Use Two||UP2||usage privé deux||ЧИ2||частное использование два|
|STS||Set Transmit State||MMT||mise en mode transmission||УСП||установка состояния передачи|
|CCH||Cancel Character||ANC||annulation du caractère précédent||OTC||отмена символа|
|MW||Message Waiting||MES ATT||message en attente||ОС||ожидание сообщения|
|SPA||Start of Guarded Protected Area||DZP||début de zone protégée||НСО||начало сохраняемой области|
|EPA||End of Guarded Protected Area||FZP||fin de zone protégée||КСО||конец сохраняемой области|
|SOS||Start of String||DC||début de chaîne||НЦ||начало цепочки||comienzo de cadena|
|SGCI||"Single Graphic Character Introducer"||introducteur de caractère graphique unique|
|SCI||Single Character Introducer||ICU||introducteur de caractère unique||ГЕС||головной символ единичного символа|
|CSI||Control Sequence Introducer||ISC||introducteur de séquence de commandes||ГУП||головной символ управляющей последовательности||introductor de secuencia de control|
|ST||String Terminator||FC||fin de chaîne||ТРЦ||терминатор цепочки||terminador de cadena|
|OSC||Operating System Command||CSE||commande de système d'exploitation||КОС||команда операционной системы|
|PM||Privacy Message||MP||message privé||ЧС||частное сообщение|
|APC||Application Program Command||CO PRO||commande de progiciel||КПП||команда прикладной программы|
|NBSP||No-Break Space||ESP INS||espace insécable||||непрерывающий пробел||espacio anticorte||yhdistävä välilyönti *|
|SHY||Soft Hyphen||CDN||trait d'union conditionnel||||гибкий дефис||guión de corte programable||pehmeä tavuviiva *|
* Finnish terms marked with an asterisk are not taken from any standard but from the recommendation Eurooppalaisen merkistön merkkien suomenkieliset nimet.
- ASA standard X3.4-1963: American Standard Code for Information Interchange. Note: ASCII-1963.
- USAS X3.4-1967: USA Standard Code for Information Interchange. United States of America Standards Institute, New York, USA, 1967. Note: ASCII-1967.
- USAS X3.4-1968: USA Standard Code for Information Interchange. Reprinted as NIC 11246 in Feinler & Postel (ed.): Arpanet Protocol Handbook. NIC 7104 Rev. Jan 1978. ADA-052 594. Network Information Center, Menlo Park, California, USA. Note: ASCII-1968.
- ANSI X3.4-1977: American National Standard Code for Information Interchange. American National Standards Institute, Inc, New York, USA, 1977. Also reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982. Note: ASCII-1977.
- ANSI X3.4-1986: Coded Character Sets – 7-bit American National Standard Code for Information Interchange. American National Standards Institute, Inc, New York, USA, 1986. Note: ASCII-1986.
- ANSI X3.32-1973: Graphic Representation of the Control Characters of American National Standard Code for Information Interchange. Reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
- ANSI X3.64-1979: Additional Controls for Use with American National Standard Code for Information Interchange. American National Standards Institute, Inc, New York, USA, 1979.
- Bemer, R.W.: Inside ASCII. Best of Interface Age, Volume 2: General Purpose Software. Oregon, USA (1980). Pages 1–50.
- Bies, Lammert: ASCII character map.
- Digital Research: An Introduction to CP/M Features and Facilities, version 1.3, 1976.
- ECMA-6: 7-bit Coded Character Set, 4th edition 1973, 5th edition 1985.
- ECMA-17: Graphic Representation of the Control Characters of the ECMA 7-Bit Coded Character Set for Information Interchange, 1st edition (withdrawn).
- ECMA-35: Character Code Structure and Extension Techniques, 6th edition.
- ECMA-48: Control Functions for Coded Character Sets, 2nd, 3rd, 4th and 5th edition.
- Gerstung, Olaf: Tabellen — Verschiedenes. Bedeutung der Steuerzeichen im ASCII und nach DIN 66003.
- GOST 34.301-91: Information technology. 7-bit and 8-bit coded character sets. Control functions – ГОСТ 34.301-91 (ИСО 6429-88) Информационная технология. 7-битные и 8-битные кодированные наборы символов. Управляющие функции.
- GOST 34.302.2-91: Information technology. 8-bit single-byte coded graphic character sets. Latin alphabet No. 2 – ГОСТ 34.302-91 (ИСО 8859/2-87) Информационная технология. Наборы 8-битных однобайтовых кодированных графических символов. Латинский алфавит № 2.
- Helsingin yliopiston yleisen kielitieteen laitos: Eurooppalaisen merkistön merkkien suomenkieliset nimet, 2. laitos, toukokuu 2004.
- ISO / R 646-1967 (E): 6 and 7-bit coded character sets for information processing interchange, 1st edition December 1967. International Organization for Standardization, Switzerland.
- ISO 646-1973 (E): 7-bit coded character set for information processing interchange. ISO Standards Handbook 1: Information transfer, 1st edition, 1977. Also reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
- ISO 646:1991: Information technology – 7-bit coded character set for information processing interchange.
- ISO 2022-1973 (E): Code extension techniques for use with the ISO 7-bit coded character set. ISO Standards Handbook 1: Information transfer, 1st edition, 1977.
- ISO 2047-1975 (E): Information processing – Graphical representations for the control characters of the 7-bit coded character set. ISO Standards Handbook 1: Information transfer, 1st edition, 1977.
- ISO/IEC 6429:1992 (E): Information technology – Control functions for coded character sets.
- ISO 1745-1975 (E): Information processing – Basic mode control procedures for data communication systems. Reprinted in McGraw Hill's Compilation of Data Communication Standards, edition II, McGraw-Hill, 1982.
- ISO/IEC 8859: Information technology – 8-bit single-byte coded graphic character sets. Note: Mostly ISO/IEC 8859-1:1998: 8-bit single-byte coded graphic character sets – Part 1: Latin alphabet No. 1.
- ISO-IR 001: The set of control characters of the ISO 646. Note: ISO-IR 001 deviates slightly from ISO 646-1973 in wording. DEL is missing.
- ISO-IR 077: C1 Control Character Set of ISO 6429-1983.
- Jennings, Tom: An annotated history of some character codes, revised 29 October, 2004.
- RFC 20: ASCII format for Network Interchange. Note: Identical to USAS X3.4-1968 (ASCII-1968), but without Appendices A–D.
- RFC 1345: Character Mnemonics & Character Sets.
- SFS 4017: Tietojen vaihdossa käytettävä 7-bittinen koodi – 7-bit coded character set for information processing interchange. Suomen standardisoimisliitto, Helsinki, Finland, 1977.
- UIT-T T.50 (04/92): Alfabeto internacional de referencia, (anteriormente alfabeto internacional N.° 5 o IA5) – Tecnología de la información - Juego de caracteres codificado de siete bits para intercambio de información.
- UIT-T T.51 (09/92): Juegos de caracteres codificados basados en el alfabeto latino para los servicios de telemática.
- UIT-T T.53 (04/94): Funciones de control codificadas mediante caracteres para los servicios telemáticos.
- Unicode, Inc.: Unicode 5.0, section française.
- Unicode, Inc.: The Unicode Standard, version 9.0.0, 2016.
- Unicode, Inc.: Unicode Character Database, NameAliases-9.0.0.txt.
- Whistler, Ken: Why Nothing Ever Goes Away (was: Re: Acquiring DIS 10646). Unicode Mail List, 5 Oct 2015.
- Wikipedia: ASCII.
- Wikipedia: C0 and C1 control codes.
- Wikipedia: Control character.
- Wikipedia: Newline.
- Wikipedia: Software flow control.
Most of the sources were consulted in September/October 2011.
Special thanks to Douglas A. Kerr, the principal author and editor of the published standards document of the first complete version of ASCII, for his help.
Updated in August 2016: Unicode 9.0, CP/M, additional details on PAD, HOP and SGCI.
Updated in : Source links updated/fixed.