Control character: Difference between revisions

imported>Spitzak
 
imported>Plasticwonder
m Reverted good faith edits by ~2026-29240-32 (talk): editing tests (WS)
 
Line 4: Line 4:
{{More citations needed|date=September 2007}}
{{More citations needed|date=September 2007}}


In [[computing]] and [[telecommunications]], a '''control character''' or '''non-printing character''' ('''NPC''') is a [[code point]] in a [[character encoding|character set]] that does not represent a written [[Character (computing)|character]] or symbol. They are used as [[in-band signaling]] to cause effects other than the addition of a symbol to the text. All other characters are mainly ''[[graphic character]]s'', also known as ''printing characters'' (or ''printable characters''), except perhaps for "[[space (punctuation)|space]]" characters. In the [[ASCII]] standard there are 33 control characters, such as code 7, {{tt|BEL}}, which rings a terminal bell.
In [[computing]] and [[telecommunications]], a '''control character''' or '''non-printing character''' ('''NPC''') is a [[code point]] in a [[character encoding|character set]] that does not represent a written [[Character (computing)|character]] or symbol. They are used as [[in-band signaling]] to cause effects other than the addition of a symbol to the text. All other characters are mainly ''[[graphic character]]s'', also known as ''printing characters'' (or ''printable characters''), except perhaps for "[[space (punctuation)|space]]" characters. In the [[ASCII]] standard there are 33 control characters,<ref>{{cite book |at=4. Specification of the Coded Character Set |url=https://www.unicode.org/L2/L2006/06388-review-incits4.pdf |title=American National Standard for Information Systems — Coded Character Sets — 7-Bit American National Standard Code for Information Interchange (7-Bit ASCII), ANSI INCITS 4-1986 (R2002) (formerly ANSI X3.4-1986 (R1997)). |publisher=[[American National Standards Institute]], [[International Committee for Information Technology Standards]] |date=1986-03-26}}</ref> such as code 7, {{tt|BEL}}, which might [[Bell character|ring a bell]].


==History==
==History==
Line 33: Line 33:
  | title      = Component Description: IBM 2780 Data Transmission Terminal  
  | title      = Component Description: IBM 2780 Data Transmission Terminal  
  | id          = GA27-3005-3
  | id          = GA27-3005-3
  | section    = EOT (End of ransmission)  
  | section    = EOT (End of transmission)  
  | section-url = http://bitsavers.org/pdf/ibm/2780/GA27-3005-3-2780_Data_Terminal_Description_Aug71.pdf#page=31
  | section-url = http://bitsavers.org/pdf/ibm/2780/GA27-3005-3-2780_Data_Terminal_Description_Aug71.pdf#page=31
  | page        = 31
  | page        = 31
Line 50: Line 50:
* 0x0D ([[carriage return]], {{tt|CR}}, {{tt|\r}}, {{tt|^M}}), moves the printing position to the start of the line, allowing overprinting. Used as the end of line marker in [[Classic Mac OS]], [[OS-9]], [[FLEX (operating system)|FLEX]] (and variants). A {{tt|CR+LF}} pair is used by [[CP/M]]-80 and its derivatives including [[DOS]] and [[Windows]].
* 0x0D ([[carriage return]], {{tt|CR}}, {{tt|\r}}, {{tt|^M}}), moves the printing position to the start of the line, allowing overprinting. Used as the end of line marker in [[Classic Mac OS]], [[OS-9]], [[FLEX (operating system)|FLEX]] (and variants). A {{tt|CR+LF}} pair is used by [[CP/M]]-80 and its derivatives including [[DOS]] and [[Windows]].
* 0x1B ([[escape character|escape]], {{tt|ESC}}, {{tt|\e}} ([[GCC (software)|GCC]] only), {{tt|^[}}). Introduces an [[escape sequence]].
* 0x1B ([[escape character|escape]], {{tt|ESC}}, {{tt|\e}} ([[GCC (software)|GCC]] only), {{tt|^[}}). Introduces an [[escape sequence]].
* 0x1C ([[file separator]], {{tt|FS}}, {{tt|^\}}) ISO/IEC 10538 uses this as a "document terminator" code to indicate the end of text in a document.<ref name="ISO 10538">{{cite ISO standard |date=1991-09-15 |title=ISO/IEC 10538:1991 Information technology — Control functions for text communication |csnumber=18608 |access-date=2026-04-04}}</ref>{{rp|page=12}}
* 0x1D ([[group separator]], {{tt|GS}}, {{tt|^]}}) ISO/IEC 10538 uses this as a "page terminator" code to indicate the end of text on a page.<ref name="ISO 10538" />{{rp|page=16}}


Control characters may do something when the user inputs them, such as Ctrl+C ([[End-of-Text character]], ETX) to interrupt the running process, and Ctrl+Z ([[Substitute character]], SUB) to end a typed-in file on Windows. These uses usually have little to do with their ASCII definition. Modern systems often describe shortcuts as though they are control characters ("type a Ctrl+V to paste") but the code number is not even used to implement this.
Control characters may do something when the user inputs them, such as Ctrl+C ([[End-of-Text character]], ETX) to interrupt the running process, and Ctrl+Z ([[Substitute character]], SUB) to end a typed-in file on Windows. These uses usually have little to do with their ASCII definition. Modern systems often describe shortcuts as though they are control characters ("type a Ctrl+V to paste") but the code number is not even used to implement this.
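
The pairing of Ctrl+C with ETX (code 3) and Ctrl+Z with SUB (code 26) can be illustrated with a minimal Python sketch. It assumes the traditional terminal convention that holding Ctrl keeps only the low five bits of the key's ASCII code; the function name <code>ctrl_code</code> is hypothetical and chosen for illustration only.

```python
def ctrl_code(key: str) -> int:
    """Return the ASCII control code traditionally sent by Ctrl+key.

    Assumption: the terminal keeps only the low five bits of the key's
    code, so only keys in the range '@' (0x40) to '_' (0x5F) are valid.
    """
    code = ord(key.upper())
    if not 0x40 <= code <= 0x5F:
        raise ValueError(f"no control code for key {key!r}")
    return code & 0x1F


print(ctrl_code("C"))  # ETX
print(ctrl_code("Z"))  # SUB
print(ctrl_code("["))  # ESC
```

As the paragraph above notes, graphical shortcuts such as Ctrl+V for paste are generally not implemented this way, so the sketch applies only to terminal-style input.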
Line 62: Line 64:
==Display==
==Display==
There are a number of techniques to display non-printing characters, which may be illustrated with the [[bell character]] in [[ASCII]] encoding:
There are a number of techniques to display non-printing characters, which may be illustrated with the [[bell character]] in [[ASCII]] encoding:
* [[Code point]]: decimal 7, hexadecimal 0x07
* [[Caret notation]] in ASCII using the ''n''th letter of the alphabet: {{tt|^G}}
* An abbreviation, often three capital letters: BEL
* A special character condensing the abbreviation: Unicode U+2407 (␇), "symbol for bell"
* An [[ISO 2047]] graphical representation: Unicode U+237E (⍾), "graphic for bell"
* [[Caret notation]] in ASCII, where code point 00xxxxx is represented as a caret followed by the capital letter at code point 10xxxxx: ^G
* An [[escape sequence]], as in [[C (programming language)|C]]/[[C++]] character string codes: {{mono|\a}}, {{mono|\007}}, {{mono|\x07}}, etc.
* An [[escape sequence]], as in [[C (programming language)|C]]/[[C++]] character string codes: {{mono|\a}}, {{mono|\007}}, {{mono|\x07}}, etc.
* An abbreviation, often three capital letters: {{tt|BEL}}
* A Unicode character from the [[Control Pictures]] Unicode block that condenses the abbreviation: {{unichar|2407}}
* An [[ISO 2047]] graphical representation: {{unichar|237E}}
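
Several of the display techniques listed above can be computed directly from the code point. The following is a minimal Python sketch, restricted to the ASCII codes 0 to 31; the function names are hypothetical. It assumes the Control Pictures block places the symbol for code ''n'' at U+2400 + ''n'', which agrees with U+2407 for the bell character.

```python
def caret_notation(code: int) -> str:
    """Caret notation: code point 00xxxxx maps to '^' plus the character at 10xxxxx."""
    if not 0 <= code <= 0x1F:
        raise ValueError("expected a code in the range 0-31")
    return "^" + chr(code | 0x40)


def control_picture(code: int) -> str:
    """Symbol from the Control Pictures block (assumed to be U+2400 + code)."""
    if not 0 <= code <= 0x1F:
        raise ValueError("expected a code in the range 0-31")
    return chr(0x2400 + code)


def hex_escape(code: int) -> str:
    """C-style hexadecimal escape sequence, such as \\x07."""
    if not 0 <= code <= 0x1F:
        raise ValueError("expected a code in the range 0-31")
    return f"\\x{code:02x}"


BEL = 0x07
print(BEL, hex(BEL))          # code point in decimal and hexadecimal
print(caret_notation(BEL))    # caret notation
print(hex_escape(BEL))        # escape sequence
print(control_picture(BEL))   # Control Pictures symbol
```

The abbreviation (BEL) and the ISO 2047 graphic are not derivable by arithmetic and would need a lookup table, which is omitted here.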


==How control characters map to keyboards==
==How control characters map to keyboards==
Line 80: Line 81:
Many keyboards include keys that do not correspond to any ASCII printable or control character, for example cursor control arrows and [[word processing]] functions.  The associated keypresses are communicated to computer programs by one of four methods: appropriating otherwise unused control characters; using some encoding other than ASCII; using multi-character control sequences; or using an additional mechanism outside of generating characters. "Dumb" [[computer terminal]]s typically use control sequences.  Keyboards attached to stand-alone [[personal computer]]s made in the 1980s typically use one (or both) of the first two methods.  Modern computer keyboards generate [[scancode]]s that identify the specific physical keys that are pressed; computer software then determines how to handle the keys that are pressed, including any of the four methods described above.
Many keyboards include keys that do not correspond to any ASCII printable or control character, for example cursor control arrows and [[word processing]] functions.  The associated keypresses are communicated to computer programs by one of four methods: appropriating otherwise unused control characters; using some encoding other than ASCII; using multi-character control sequences; or using an additional mechanism outside of generating characters. "Dumb" [[computer terminal]]s typically use control sequences.  Keyboards attached to stand-alone [[personal computer]]s made in the 1980s typically use one (or both) of the first two methods.  Modern computer keyboards generate [[scancode]]s that identify the specific physical keys that are pressed; computer software then determines how to handle the keys that are pressed, including any of the four methods described above.


==The design purpose==
==Design purpose==
{{Unreferenced section|date=February 2012}}
{{Unreferenced section|date=February 2012}}


Line 134: Line 135:


==See also==
==See also==
* {{Slink|Arrow keys|HJKL keys}}, HJKL as arrow keys, used on ADM-3A terminal
*{{Slink|Arrow keys|HJKL keys}}, HJKL as arrow keys, used on ADM-3A terminal
* [[C0 and C1 control codes]]
*{{anl|C0 and C1 control codes}}
* [[Escape sequence]]
*{{anl|Escape sequence}}
* [[In-band signaling]]
*{{anl|Implicit directional marks}}
* [[Whitespace character]]
*{{anl|In-band signaling}}
*{{anl|Whitespace character}}


==Notes and references==
==Notes and references==