Base class for single-byte and multi-byte character encodings.
abstract class
|
CharacterEncoding
|
base of
|
CharacterEncodingSimple
|
Only non-surrogate Unicode characters (see OtherSurrogate) in the range [0..65535] can be encoded to or decoded from a byte sequence. Every other character results in Unprintable.
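The restriction above can be illustrated with a small check. This is only a sketch, not the library's code: the function name `is_encodable` is hypothetical, and the surrogate range U+D800..U+DFFF is taken from the Unicode standard rather than from this page.

```python
# Hypothetical helper, not part of the documented API.
SURROGATE_FIRST = 0xD800  # first surrogate code point (Unicode standard)
SURROGATE_LAST = 0xDFFF   # last surrogate code point (Unicode standard)


def is_encodable(code_point: int) -> bool:
    """Return True for non-surrogate code points in the range [0..65535]."""
    if not 0 <= code_point <= 0xFFFF:
        return False
    return not (SURROGATE_FIRST <= code_point <= SURROGATE_LAST)
```

Under these assumptions, a code point for which `is_encodable` returns False would correspond to the case where Unprintable is produced.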
public
constant
|
Unprintable
=
|
||
type
|
char
|
Character encoding UTF-16 (big-endian).
public
static
readonly
field
|
UTF_16_BE
|
||
type
|
CharacterEncoding
|
Character encoding UTF-16 (little-endian).
public
static
readonly
field
|
UTF_16_LE
|
||
type
|
CharacterEncoding
|
Character encoding UTF-8.
public
static
readonly
field
|
UTF_8
|
||
type
|
CharacterEncoding
|
Returns the number of bytes per character.
public
abstract
property
|
BytesPerCharacter
{
get
}
|
||
type
|
int32
|
||
value
|
|
The number of bytes per character, or 0 if this encoding uses variable byte counts (e.g. UTF-8).
|
Returns the name of this character encoding.
public
abstract
property
|
Name
{
get
}
|
||
type
|
string
|
||
value
|
|
The encoding name. |
Returns a character encoding by its name.
public
static
method
|
For
(string name,
CharacterEncoding defaultEncoding = null)
|
||
type
|
CharacterEncoding
|
||
params
|
name
|
The name or null. |
|
defaultEncoding
|
The default encoding to return in case name is not recognized. Defaults to null.
|
||
returns
|
The found encoding or null. |
Remarks:
The given name is normalized before trying to find a character encoding: First, the name is converted to lower-case. Then, all characters that are neither letters nor digits are removed (i.e. only '0'..'9', 'a'..'z' are retained). The resulting normalized name is then tested against the following values, in order, to find a character encoding:
'utf8'
'utf16be'
'utf16', 'utf16le'
'cp1252', 'iso88591', 'latin1', 'windows1252'
'ascii', 'usascii'
'437', 'cp437', 'ibm437'
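The normalization and lookup described in the remarks can be sketched as follows. This is an illustration under assumptions, not the library's implementation: `normalize`, `find_group` and `ALIAS_GROUPS` are hypothetical names, each group is identified only by its listed aliases, and returning the default for a null name is assumed.

```python
import string

# Only '0'..'9' and 'a'..'z' survive normalization.
_KEEP = set(string.ascii_lowercase + string.digits)

# Alias groups in the order listed in the remarks (one tuple per group).
ALIAS_GROUPS = [
    ("utf8",),
    ("utf16be",),
    ("utf16", "utf16le"),
    ("cp1252", "iso88591", "latin1", "windows1252"),
    ("ascii", "usascii"),
    ("437", "cp437", "ibm437"),
]


def normalize(name):
    """Lower-case the name, then drop everything except letters and digits."""
    return "".join(ch for ch in name.lower() if ch in _KEEP)


def find_group(name, default=None):
    """Return the first alias group containing the normalized name."""
    if name is None:  # assumption: a null name yields the default
        return default
    key = normalize(name)
    for group in ALIAS_GROUPS:
        if key in group:
            return group
    return default
```

With this sketch, spellings such as "UTF-8", "utf_8" and "Utf 8" all normalize to 'utf8', which is why the lookup tolerates differences in case and punctuation.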
Decodes a single Unicode character.
public
abstract
method
|
Decode
(ByteBuffer input)
|
||
type
|
int32
|
||
params
|
input
|
[not-null]
|
The input buffer. |
returns
|
The decoded Unicode character, or -1 if the given buffer does not contain any more characters.
|
Decodes a single Unicode character.
public
abstract
method
|
Decode
(IDataStream input)
|
||
type
|
int32
|
||
params
|
input
|
[not-null]
|
The input stream. |
returns
|
The decoded Unicode character, or -1 if the given stream does not contain any more characters.
|
Converts the given encoded string to Unicode.
public
method
|
DecodeString
(ByteBuffer bytes)
|
||
type
|
string
|
||
params
|
bytes
|
The single-byte encoded string. | |
returns
|
The Unicode string, or null if and only if bytes is null.
|
Encodes the given Unicode character.
public
abstract
method
|
Encode
(char character,
ByteBuffer output)
|
||
type
|
int32
|
||
params
|
character
|
The Unicode character. | |
output
|
[not-null]
|
The output buffer. | |
returns
|
The number of bytes that have been written to output if there was enough space left; -n if the buffer does not have enough space left, where n is the number of bytes that would have been written.
|
Remarks:
Characters that cannot be encoded will be replaced with '?'.
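The return convention of this overload can be sketched as follows. This is a rough illustration only: `FixedBuffer` and `encode_char` are hypothetical stand-ins, UTF-8 is used as an example encoding, and leaving the buffer untouched when space is insufficient is an assumption.

```python
class FixedBuffer:
    """Hypothetical stand-in for an output buffer with limited space."""

    def __init__(self, capacity):
        self.data = bytearray()
        self.capacity = capacity

    @property
    def remaining(self):
        return self.capacity - len(self.data)


def encode_char(character, output):
    """Return bytes written, or -n if n bytes would not fit into output."""
    # errors="replace" substitutes '?' for characters that cannot be encoded.
    encoded = character.encode("utf-8", errors="replace")
    if len(encoded) > output.remaining:
        return -len(encoded)  # assumption: nothing is written in this case
    output.data += encoded
    return len(encoded)
```

A caller can use the negated result to decide how much additional space is needed before retrying.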
Encodes the given Unicode character.
public
abstract
method
|
Encode
(char character,
IDataStream output)
|
||
type
|
int32
|
||
params
|
character
|
The Unicode character. | |
output
|
[not-null]
|
The output stream. | |
returns
|
The number of bytes that have been written to output. |
Remarks:
Characters that cannot be encoded will be replaced with '?'.
Returns the number of bytes needed to encode the given Unicode character.
public
abstract
method
|
EncodeCount
(char character)
|
||
type
|
int32
|
||
params
|
character
|
The Unicode character. | |
returns
|
The number of bytes that would have been written to an output buffer. |
Remarks:
Characters that cannot be encoded will be replaced with '?'.
Converts the given Unicode string to this encoding.
[OwnerReturn]
|
||||
public
method
|
EncodeString
(string str,
[Owner]
ByteBuffer bytes = null)
|
|||
type
|
ByteBuffer
|
|||
params
|
str
|
The Unicode string. | ||
bytes
|
The output buffer (can be null). Defaults to null.
|
|||
returns
|
The resulting buffer, or null if and only if str is null.
|
Remarks:
The encoded bytes will be written to bytes beginning at the current buffer position. Before returning, this method sets the Position and Limit to the range of encoded bytes that have been output.
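The position and limit behaviour described in the remarks might look roughly like this. The sketch rests on several assumptions: `SketchBuffer` and `encode_string` are hypothetical names, UTF-8 stands in for the encoding, and both allocating a buffer when none is passed and growing a buffer that is too small are guesses about behaviour that this page does not specify.

```python
class SketchBuffer:
    """Hypothetical buffer exposing only data, position and limit."""

    def __init__(self, capacity):
        self.data = bytearray(capacity)
        self.position = 0
        self.limit = capacity


def encode_string(text, buf=None):
    """Write the encoded text at buf.position and return the buffer."""
    if text is None:
        return None
    encoded = text.encode("utf-8", errors="replace")
    if buf is None:
        buf = SketchBuffer(len(encoded))  # assumption: allocate on demand
    start = buf.position
    end = start + len(encoded)
    if end > len(buf.data):
        # Assumption: the buffer grows when it is too small.
        buf.data.extend(bytes(end - len(buf.data)))
    buf.data[start:end] = encoded
    # Position and limit now delimit exactly the bytes just written.
    buf.position, buf.limit = start, end
    return buf
```

After the call, the bytes between position and limit are the encoded string, so the returned buffer can be handed directly to code that reads from position up to limit.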
Returns the number of encoded bytes for the given string.
public
method
|
GetByteCount
(char[] str)
|
||
type
|
int32
|
||
params
|
str
|
[not-null]
|
The string. |
returns
|
|
The number of encoded bytes. |
Returns the number of encoded bytes for the given string.
public
method
|
GetByteCount
(string str)
|
||
type
|
int32
|
||
params
|
str
|
[not-null]
|
The string. |
returns
|
|
The number of encoded bytes. |
Returns the number of encoded bytes for the given string.
public
method
|
GetByteCount
(char[] str,
int32 offset,
int32 count)
|
||
type
|
int32
|
||
params
|
str
|
[not-null]
|
The string. |
offset
|
[>=0]
|
Offset to first character in string. | |
count
|
[>=0]
|
Number of characters in string. | |
returns
|
|
The number of encoded bytes. |
Returns the number of encoded bytes for the given string.
public
method
|
GetByteCount
(string str,
int32 offset,
int32 count)
|
||
type
|
int32
|
||
params
|
str
|
[not-null]
|
The string. |
offset
|
[>=0]
|
Offset to first character in string. | |
count
|
[>=0]
|
Number of characters in string. | |
returns
|
|
The number of encoded bytes. |
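The relationship between the per-character count and the range-based overloads can be sketched as a simple sum. This is an illustration, not the library's code: `encode_count` and `get_byte_count` are hypothetical Python stand-ins, UTF-8 is used as the example encoding, and raising an error for an out-of-range offset or count is an assumption.

```python
def encode_count(character):
    """Number of bytes one character would occupy (UTF-8 as an example)."""
    return len(character.encode("utf-8", errors="replace"))


def get_byte_count(text, offset=0, count=None):
    """Sum the per-character byte counts over text[offset:offset + count]."""
    if count is None:
        count = len(text) - offset
    if offset < 0 or count < 0 or offset + count > len(text):
        raise ValueError("offset/count out of range")  # assumed behaviour
    return sum(encode_count(ch) for ch in text[offset:offset + count])
```

For an encoding with a fixed, non-zero number of bytes per character, the same result could be computed as that number multiplied by count, without visiting each character.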