| 4.2 Character strings |
| |
| A character string is a sequence of characters. All the characters in a character string are taken from a single character set. A character string has a length, which is the number of characters in the sequence. The length is 0 (zero) or a positive integer. A character string type is described by a character string type descriptor. |
| |
| A character string type descriptor contains: |
| - The name of the specific character string type (CHARACTER , CHARACTER VARYING , and CHARACTER LARGE OBJECT ; NATIONAL CHARACTER , NATIONAL CHARACTER VARYING , and NATIONAL CHARACTER LARGE OBJECT are represented as CHARACTER , CHARACTER VARYING , and CHARACTER LARGE OBJECT , respectively). |
| - The length or maximum length in characters of the character string type. |
| - The catalog name, schema name, and character set name of the character set of the character string type. |
| - The catalog name, schema name, and collation name of the collation of the character string type. |
| |
| The character set of a character string type may be specified explicitly or implicitly. |
| |
| The <key word>s NATIONAL CHARACTER are used to specify an implementation-defined character set. Special syntax (N'string') is provided for representing literals in that character set. With two exceptions, a character string expression is assignable only to sites of a character string type whose character set is the same. The exceptions are as specified in Subclause 4.2.7, "Universal character sets", and such other cases as may be implementation-defined. If a store assignment would result in the loss of non-<space> characters due to truncation, then an exception condition is raised. If a retrieval assignment or evaluation of a <cast specification> would result in the loss of characters due to truncation, then a warning condition is raised. |
| |
| Character sets fall into three categories: those defined by national or international standards, those defined by SQL-implementations, and those defined by applications. The character sets defined by ISO/IEC 10646 and The Unicode Standard are known as Universal Character Sets (UCS) and their treatment is described in Subclause 4.2.7, "Universal character sets". Every character set contains the <space> character (equivalent to U+0020). An application defines a character set by assigning a new name to a character set from one of the first two categories. They can be defined to "reside" in any schema chosen by the application. Character sets defined by standards or by SQL-implementations reside in the Information Schema (named INFORMATION_SCHEMA) in each catalog, as do collations defined by standards and collations, transliterations, and transcodings defined by SQL-implementations. |
| |
| NOTE 9 : The Information Schema is defined in ISO/IEC 9075-11. |
| |