Introduction
The ASCII code includes control characters that do not print a
character. Each has a short standard name, a standard function,
and a large number of non-standard applications.
The original code was developed in the days of mechanical
terminals such as Teletypes. The meanings of the control codes are
defined in terms of typewriter-like actions: tab, ring bell,
backspace, return, and line feed. These have been re-interpreted
as cursor movements for CRTs. Many computer systems use the
control codes for special purposes. A competent software
engineer will know about the control codes: what they were
designed to mean, and how they are used or misused in real
systems.
In many high-level languages the ASCII characters are representable as a
function (e.g. Pascal: chr(i), C: (char)i, or Ada: CHARACTER'VAL(i)), where
"i" is an integer. Ada specifies a special standard package that defines
ASCII with standard names for constants representing the coded characters.
In C they can be indicated by a backslash character (\) followed by
either a special letter or a hexadecimal or octal number. The
following dictionary defines each ASCII name, its position in the ASCII
code, its original meaning, its Ada 83 symbol, and its C backslash
letter where one exists.
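The same idea can be sketched in Python, which is not one of the languages discussed above and is used here only as an illustration. The built-in chr(i) maps a code number to its character, ord(c) maps back, and string literals accept the same backslash letters as C, plus hexadecimal and octal forms.

```python
# Code number to character and back, using Python's built-ins.
bel = chr(7)
assert ord(bel) == 7

# The backslash letters shared with C, paired with their ASCII codes.
escapes = {"\a": 7, "\b": 8, "\t": 9, "\n": 10, "\v": 11, "\f": 12, "\r": 13}
for ch, code in escapes.items():
    assert ord(ch) == code

# Hexadecimal and octal escapes name a character by its code number.
esc_hex = "\x1b"   # ESC, code 27, written in hexadecimal
esc_oct = "\033"   # ESC again, written in octal
print(esc_hex == esc_oct == chr(27))   # True
```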
Control characters
- ASCII::=
Net{
There are 128 ASCII codes numbered from 0 to 127. I will use the notation of
character_nbr(i)
to indicate the i'th ASCII character:
- character_nbr::0..127---char, -- The "---" indicates that there are precisely 128 standard characters that correspond, one-for-one, to their code numbers.
I use the C/C++ abbreviation to indicate the set of all ASCII codes:
- char::=character_nbr(0..127), --all the ASCII codes.
- CTRL_CHARS::=
Net{
- NUL::=character_nbr(0)::= Fills in time* (ASCII'NUL).
- SOH::=character_nbr(1)::= Start Of Header (routing info)(ASCII'SOH).
- STX::=character_nbr(2)::= Start Of Text (end of header)(ASCII'STX).
- ETX::=character_nbr(3)::= End Of Text(ASCII'ETX).
- EOT::=character_nbr(4)::= End Of Transmission(ASCII'EOT).
- ENQ::=character_nbr(5)::= ENQuiry, asking who is there(ASCII'ENQ).
- ACK::=character_nbr(6)::= Receiver ACKnowledges positively(ASCII'ACK).
- BEL::=character_nbr(7)::= Rings BELl or beeps(ASCII'BEL, C \a).
- BS::=character_nbr(8)::= Move print head Back one Space(ASCII'BS, C \b).
- HT::=character_nbr(9)::= Move to next Tab-stop(ASCII'HT, C \t).
- LF::=character_nbr(10)::= Line Feed(ASCII'LF, C \n).
- VT::=character_nbr(11)::= Vertical Tabulation(ASCII'VT, C \v).
- FF::=character_nbr(12)::= Form Feed - new page or form(ASCII'FF, C \f).
- CR::=character_nbr(13)::= Carriage Return to left margin(ASCII'CR, C \r).
- SO::=character_nbr(14)::= Shift Out of ASCII(ASCII'SO).
- SI::=character_nbr(15)::= Shift into ASCII(ASCII'SI).
- DLE::=character_nbr(16)::= Data Link Escape(ASCII'DLE).
- DC1::=character_nbr(17)::= Device control(ASCII'DC1).
- DC2::=character_nbr(18)::= Device control(ASCII'DC2).
- DC3::=character_nbr(19)::= Device control(ASCII'DC3).
- DC4::=character_nbr(20)::= Device control(ASCII'DC4).
- NAK::=character_nbr(21)::= Negative Acknowledgment(ASCII'NAK).
- SYN::=character_nbr(22)::= Sent in place of data to keep systems synchronized(ASCII'SYN).
- ETB::=character_nbr(23)::= End of transmission block(ASCII'ETB).
- CAN::=character_nbr(24)::= Cancel previous data(ASCII'CAN).
- EM::=character_nbr(25)::= End of Medium(ASCII'EM).
- SUB::=character_nbr(26)::= Substitute(ASCII'SUB).
- ESC::=character_nbr(27)::= Escape to extended character set(ASCII'ESC).
- FS::=character_nbr(28)::= File separator(ASCII'FS).
- GS::=character_nbr(29)::= Group separator(ASCII'GS).
- RS::=character_nbr(30)::= Record separator(ASCII'RS).
- US::=character_nbr(31)::= Unit separator(ASCII'US).
- SP::=character_nbr(32)::= Blank Space character(ASCII'SP).
- DEL::=character_nbr(127)::=Punch out all bits on paper tape(delete).
}=::
CTRL_CHARS.
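The dictionary above can be written down as a lookup table. The following is a minimal Python sketch; the names CTRL_NAMES, CTRL_CHARS, and name_of are hypothetical helpers introduced for illustration. It follows this document in listing SP and DEL alongside codes 0 to 31.

```python
# Names for codes 0..32, in code order, as listed in CTRL_CHARS above.
CTRL_NAMES = (
    "NUL SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI "
    "DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US SP"
).split()

# Map each name to its character; DEL sits apart at code 127.
CTRL_CHARS = {name: chr(i) for i, name in enumerate(CTRL_NAMES)}
CTRL_CHARS["DEL"] = chr(127)

def name_of(ch):
    """Return the standard name of a control character, or None."""
    for name, c in CTRL_CHARS.items():
        if c == ch:
            return name
    return None

print(ord(CTRL_CHARS["ESC"]))   # 27
print(name_of("\r"))            # CR
```

A dictionary keyed by name keeps the table readable; the reverse lookup is a linear scan, which is adequate for 34 entries.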
Normal Characters
- OTHER_CHARS::=
Note
Notice that NO ASCII character sends a BREAK signal. This is not a
character. It is transmitted through an RS232 cable by holding the
transmit data line in the spacing state for longer than one character
frame, or through a modem by ceasing to send the carrier frequency for
a fixed length of time. NUL transmits a character (with all bits=0);
BREAK does not.
Whitespace
- whitespace::= whitespace_char #(whitespace_char).
- whitespace_char::= SP | CR | LF | HT | ... .
- EOLN::=End Of Line string -- depends on the system you are using.
- |-EOLN ==> (CR | LF) #(CR | LF | HT | VT | ...).
In COBOL and MATHS the "." character is both a punctuator and a decimal point.
The following defines the cases when a "." is acting as a punctuator:
- period::="." whitespace,-- A dot followed by white space is treated as a period.
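These definitions translate fairly directly into regular expressions. The Python sketch below makes several assumptions: it reads #(x) as "zero or more occurrences of x", it uses only the members that are actually listed (the "..." alternatives are left out), and the names WHITESPACE, EOLN, PERIOD, and punctuator_dots are hypothetical.

```python
import re

# whitespace_char ::= SP | CR | LF | HT | ...  (only the four listed members)
WHITESPACE_CHAR = r"[ \r\n\t]"
# whitespace ::= whitespace_char #(whitespace_char), i.e. one or more
WHITESPACE = WHITESPACE_CHAR + "+"
# EOLN ==> (CR | LF) #(CR | LF | HT | VT | ...): a pattern every EOLN fits
EOLN = r"[\r\n][\r\n\t\v]*"
# period ::= "." whitespace
PERIOD = re.compile(r"\." + WHITESPACE)

def punctuator_dots(text):
    """Return the offsets of each "." that acts as a punctuator."""
    return [m.start() for m in PERIOD.finditer(text)]

print(punctuator_dots("PI is 3.14. Done. "))   # [10, 16]
```

The dot inside 3.14 is skipped because a digit follows it. Under this definition a dot at the very end of the input, with no whitespace after it, is not counted as a period either.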
Standard Character Sets
char is the set of all ASCII characters.
- digit::="0".."9". See digits.
- letter::=upper_case_letter | lower_case_letter.
- upper_case_letter::="A".."Z". See upper_case_letters: characters 65..90.
- lower_case_letter::="a".."z". See lower_case_letters: characters 97..122.
}=::
ASCII.
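The standard character sets can be checked against their code numbers with a short Python sketch. The set names mirror the definitions above. The digit range 48..57 does not appear in this document and is taken from the ASCII table itself. Note that "z" is character 122.

```python
# Build each set from its code range (range() excludes its upper bound).
digit = {chr(i) for i in range(48, 58)}                # "0".."9"
upper_case_letter = {chr(i) for i in range(65, 91)}    # "A".."Z"
lower_case_letter = {chr(i) for i in range(97, 123)}   # "a".."z"
letter = upper_case_letter | lower_case_letter

print(len(letter))                                      # 52
print(min(lower_case_letter), max(lower_case_letter))   # a z

# Each upper case letter sits exactly 32 below its lower case form.
print(all(ord(c.lower()) - ord(c) == 32 for c in upper_case_letter))  # True
```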
See Also
Tables of ISO Latin 1 codes:
[ symbol.html ]
Notes on special uses of special characters
- The following have been used to mark the end of a string:
NUL, ESC, 2 ESCs, grave accent, apostrophe, quotation, EOLN, slash.
- The following have been used to indicate the end of input:
EOT, SUB, 2 CRs
- The following have been used to kill or delete the previous character:
DEL, BS, #.
- The following have been used to cancel the current line of input:
DEL, NAK(^U), hash(#)
- On a network the special characters take on yet more meanings. For
example, commonly RS232 communications use DC3(^S) and DC1(^Q) to
delay and restart data transmission (originally to allow data to be punched).
In an X.25 packet switched network SI(^O) forces the data through the
intervening machines and DLE(^P) allows you to send commands to your local
"Pad". Proprietary networks often have a special 'escape' character as well.
The following can be used as a prefix that allows the next control
character to appear in text:
SYN(^V),
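
The caret forms used in these notes, such as ^S for DC3, follow a common convention: a control character is written as ^ followed by the printable character whose code differs from it by 64. A minimal Python sketch, with caret and from_caret as hypothetical helper names:

```python
def caret(ch):
    """Write a control character in caret form, e.g. DC3 -> '^S'."""
    code = ord(ch)
    if code < 32 or code == 127:
        # Flipping bit 6 (value 64) maps 0..31 to '@'..'_' and 127 to '?'.
        return "^" + chr(code ^ 0x40)
    return ch

def from_caret(text):
    """Inverse: turn a two-character caret form back into its character."""
    if len(text) == 2 and text[0] == "^":
        return chr(ord(text[1].upper()) ^ 0x40)
    return text

print(caret(chr(19)), caret(chr(17)))   # ^S ^Q
print(ord(from_caret("^U")))            # 21
```

With this mapping ^O is SI (15), ^P is DLE (16), ^U is NAK (21), and ^V is SYN (22), matching the pairs quoted above.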