Previous section.
DCE 1.1: Authentication and Security Services
Copyright © 1997 The Open Group
Checksum Mechanisms
Terminology, Notation, and Conventions
is devoted to general terminology, notation and conventions that are used
throughout this specification.
Subsequent sections of this chapter specify the
(cryptographic and noncryptographic)
checksum mechanisms supported by DCE. The following list
specifies all currently supported checksum mechanisms, and this chapter
is therefore restricted to these checksums only:

Noncryptographic checksums
The CRC32 Cyclic Redundancy Checksum.

Cryptographic checksums
The MD4 and MD5 Message Digest Algorithms, of RSA Data
Security, Inc.
Terminology, Notation, and Conventions
This section introduces terminology, notation and conventions, including
lowlevel formatting details, used throughout this specification.
Use of Pseudocode
Pseudocode is employed in this and the following chapters.
It has the expository purpose of explaining observable
external behaviour only, and does not impose internal requirements on
conforming implementations. The pseudocode notation is mostly
"Clike", augmented by some other elements of standard scientific
exposition (for example, "f(x) = y'' to define the
value of a function f at an argument x to be y).
Anything expressed in pseudocode is intended to be readily
understandable by the intended audience of this specification, but
if confusion can possibly result the pseudocode is also expressed
in words.
Sequences
A (finite) sequence (or
string or vector or ntuple) V of
length lambda(V) = n >= 0 will be denoted in order of
increasing "address" (or "name" or "subscript") notation;
that is, in the form V = <v_{0}, ···,
v_{n1}>; if n = 0, this notation
degenerates to V = <>, the empty string.
The ntuple V is said to begin, at the left,
with initial element v_{0}, and to end, at
the right, with final element v_{n1}.
If 0 <= n´ < n´´ <= n1, then
v_{n´} is said to precede or occur
before (or earlier than or to the left of)
v_{n´´} in V, and
v_{n´´} is said to follow or occur
after (or later than or to the right of)
v_{n´}.
The natural identification of a 1tuple is always made with "the
thing itself": <V> = V. Similarly, the
concatenation of an mtuple and an ntuple are
also identified with an
(m+n)tuple via <U, V> =
<<u_{0}, ···, u_{m1}>,
<v_{0}, ···, v_{n1}>> =
<u_{0}, ···, u_{m1},
v_{0}, ···, v_{n1}>.
In particular, <V, <>> = <<>, V> = <V> = V (via
natural identifications).
The terminology prepend to V refers to concatenating another
vector to the beginning of V; append to V refers to
concatenation to the end of V.
Bits, Bytes, Words, and so on
A bit is a boolean (binary) quantity that can take either of the
values T (true) or F (false).
A byte (or octet) is a sequence of 8 bits.
A shortword is a sequence of 2 bytes, a word is a sequence
of 4 bytes, a longword is a sequence of 8 bytes, and a
quadword is a sequence of 16 bytes.
By concatenation, a sequence of bits of length congruent to 0 modulo
8, 16, 32, 64 or 128, respectively, can be identified with a sequence
of bytes, shortwords, words, longwords or quadwords (of the appropriate
shorter length). Similarly, a sequence of bytes of length divisible by
4 can be identified with a sequence of words, and so forth.
 Notes:

Some other terms exist in the literature that won't
be used in this specification; for example, "nibble" for a sequence
of 4 bits. Also, the definitions of some of the terms above vary
throughout the literature; for example, in some quarters ("16bit
architectures"), "word" means a sequence of 2 bytes, with
corresponding redefinitions of "longword" and "quadword".

Bits can be naturally identified with the elements of the
finite field of 2 elements (integers modulo 2), F_{2} =
{0, 1} (called the bit field in this context), via
the mapping (or correspondence) defined by:
F <> 0, T <> 1.
And then, ntuples of bits can be viewed as elements of the
vector space F^{n}_{2} of dimension
n over F_{2}.
This point of view is standard in mathematical treatments of
cryptography, but it is not insisted upon here.
Integer Representations (Endianness)
The notion of "endianness" arises when one maps bit and/or
bytesequences to (nonnegative) integers. Namely,
the question is whether the left (~ "big") or right
(~ "little") end of the sequence is mapped to the higher
value (that is, is "more significant").
First recall that any nonnegative integer i has a natural
2adic expansion (which is unique if it is the minimal
2adic expansion;
that is, if one does not allow superfluous "leading zeroes"): i =
2^{k1}i_{k1} +
··· +
2^{0}i_{0} with each coefficient
i_{j} (0 <= j <= k1,
k >= 0) equal to 0 or 1 (and 2^{k1} = 1).
The question of endianness arises when one imposes an order on the
collection of coefficients i_{0}, ···,
i_{k1};
that is, when one realises this collection of coefficients as a
sequence. Given a mapping between coefficients and sequences,
bits, bytes, shortwords, words, longwords and quadwords
(or, for that matter, bit or bytesequences of any length)
can be interpreted as nonnegative ("unsigned") integers, in the ranges
[0, 2^{k}1] where k = 1, 8, 16, 32, 64 or 128,
respectively, as described below.
See
Endianness
for an illustration of the endianness concepts defined below. As shown
there, "addresses" (names or subscripts of sequence elements) are always
visualised as increasing (with respect to a fixed ordering on addresses
themselves) from left to right. The meaning of
big (respectively, little) endianness is then determined by whether
sequence elements are mapped to integer powerof2 values
("significance") in a decreasing (respectively, increasing) manner
from left to right, respectively.
Figure: Endianness
Mapping BitSequences to Integers
In all mappings of bitsequences to integers that are considered,
single bits are mapped to integers via the mapping
F <> 0 and T <> 1 (and these mappings are always
treated as identifications, even notationally);
that is, for bitsequences of length 1 the
mapping/identification are always made between bitsequences and integers:


<0> <> 0that is, "<0> = 0"
<1> <> 1that is, "<1> = 1"
Bytes (and more generally, any bitsequences) can be interpreted as
integers via either the bigendian (or mostsignificantfirst)
mapping, or the littleendian (or leastsignificantfirst)
mapping, which are now defined.
(Other mappings of bitsequences to integers are also possible, but
not of interest.)
In bigendian mapping, the bits of the byte are considered
to be ordered in decreasing significance, from the highorder
or most significant bit (MSb) (on the left) to the
loworder or least significant bit (LSb) (on the right).
That is, for bytes (for example, bitsequences of other lengths are
similar):


<a_{0},
a_{1}, ···, a_{7}> <>
2^{7}a_{0} +
2^{6}a_{1} + ··· +
2^{0}a_{7}
And in littleendian mapping, the ordering goes the opposite way:


<a_{0},
a_{1}, ···, a_{7}> <>
2^{7}a_{7} +
2^{6}a_{6} + ··· +
2^{0}a_{0}
When one of these mappings has been chosen and fixed in a given context,
it will usually be considered to be an "identification" instead of a
"mapping", and the symbol "=" instead of "<>" will be used
(by a common abuse of notation).
Mapping ByteSequences to Integers
Similarly, bytes, shortwords, words, longwords and quadwords (and more
generally, any bitsequence of length a multiple of 8 that is considered
as a bytesequence) can be interpreted, once a mapping of bytes to
integers has been chosen and fixed, as integers via either
bigendian or littleendian mapping (for single bytes, the two
mappings coincide of course), which are now defined.
In bigendian mapping, the bytes of the bytesequence are considered
to be ordered in decreasing significance, from the highorder or most
significant byte (MSB) to the loworder or least significant
byte (LSB). That is, for words (for example, bytesequences of
other lengths are similar):


<a, b, c, d> <>
(2^{8})^{3}a +
(2^{8})^{2}b +
(2^{8})^{1}c +
(2^{8})^{0}d
And in littleendian mapping, the ordering goes the opposite way:


<a, b, c, d> <>
(2^{8})^{3}d +
(2^{8})^{2}c +
(2^{8})^{1}b +
(2^{8})^{0}a
Mapping Mixed Bit/ByteSequences to Integers
Since bitsequences of length a multiple of 8 can also be
viewed as bytesequences, for such sequences
the above mappings can be mixed to arrive at the following
taxonomy of endianness for "byte/bit"sequences.
Let W be, say, a ("wordsized") integer in the range
[0, 2^{32}1] (similar remarks hold for integers of other
sizes), with 2adic expansion:


W =
2^{31}w_{31} + ··· +
2^{0}w_{0} =
(2^{8})^{3}(2^{7}w_{31}
+ ··· + 2^{0}w_{24}) +
(2^{8})^{2}(2^{7}w_{23}
+ ··· + 2^{0}w_{16}) +
(2^{8})^{1}(2^{7}w_{15}
+ ··· + 2^{0}w_{8}) +
(2^{8})^{0}(2^{7}w_{7}
+ ··· + 2^{0}w_{0})
There are then the following four bitrepresentations of W.
Here, the terminology "X/Yendian"
(for X, Y member of {"big", "little"})
means: "Xendian with respect to bytes, the bytes being
Yendian with respect to bits". (See also
Endianness
.)

Big/bigendian mapping:
W <>
<<w_{31}, ···, w_{24}>,
<w_{23}, ···, w_{16}>,
<w_{15}, ···, w_{8}>,
<w_{7}, ···, w_{0}>>

Little/bigendian mapping:
W <>
<<w_{7}, ···, w_{0}>,
<w_{15}, ···, w_{8}>,
<w_{23}, ···, w_{16}>,
<w_{31}, ···, w_{24}>>

Little/littleendian mapping:
W <>
<<w_{0}, ···, w_{7}>,
<w_{8}, ···, w_{15}>,
<w_{16}, ···, w_{23}>,
<w_{24}, ···, w_{31}>>

Big/littleendian mapping:
W <>
<<w_{24}, ···, w_{31}>,
<w_{16}, ···, w_{23}>,
<w_{8}, ···, w_{15}>,
<w_{0}, ···, w_{7}>>
Of the four mappings listed above, the first two are the most important
in this specification, because bytes in computer memories are
invariably considered to represent integers
(in the range [0, 2^{8}1]) via the bigendian mapping.
(On the other hand, bytes are variously transmitted over different
serial networks in bigendian or littleendian order:
the most or least significant bit may be transmitted first.)
Clearly, one could extend the above ideas about
endianness to include such higherlevel constructs as
shortwordsequences, wordsequences, and so onthe ideas are
straightforward, and there is no need to elaborate on them here.
Modular Arithmetic
The usual convention is adopted that modular arithmetic on integers is
denoted by "(mod n)" where n > 0 is the modulus, and
the modular equivalence class of any integer N is
identified with its representative in the interval [0, n1].
That is, for any integer N (positive, negative or zero):


0 <= N(mod n) <= n1
Bitwise Operations and Rotations
The following notations for four bitwise operations will be used, all
operating on bitsequences (usually words) U, V, W of
the same length:

~ U
Bitwise boolean NOT (or binary complement).
On bits, it is defined by:


~ 0 = 1
~ 1 = 0

U  V
Bitwise boolean OR.
On bits, it is defined by:


0  0 = 0
0  1 = 1
1  0 = 1
1  1 = 1

U & V
Bitwise boolean AND.
On bits, it is defined by:


0 & 0 = 0
0 & 1 = 0
1 & 0 = 0
1 & 1 = 1

U ^ V
Bitwise boolean XOR ("exclusive or"; that is,
bitwise binary (mod 2) addition of integers).
On bits, it is defined by:


0 ^ 0 = 0
0 ^ 1 = 1
1 ^ 0 = 1
1 ^ 1 = 0
Also, the following notation for two kinds of rotations will be
used (the first is used in this chapter, the second is used in
Encryption/Decryption Mechanisms
):

i <<< s
Left numerical rotation (or left numerical circular shift)
by s, operating on the value of an integer i
(not any of its bitrepresentations), is defined as follows.
If i =
2^{k1}i_{k1}
+ ···
+ 2^{0}i_{0}
is a 2adic expansion of i in powers of 2 (not necessarily the
minimal 2adic expansion of i; that is, leading zeroes are permitted),
then for any integer s:


i <<< s =
2^{k1}i_{(k1+s)(mod k)}
+ ···
+ 2^{0}i_{(0+s)(mod k)}
(Even though this definition is valid for all s, it will only be
used for 0 <= s <= k1.)

V <<<< s
Left bitwise rotation (or left bitwise circular shift)
by s, operating on the bitvector V (not its
value with respect to either endianness), is defined as follows.
If V = <v_{0}, ···,
v_{n1}>, then for any integer s:


V <<<< s =
<v_{(0+s)(mod n)}, ···,
v_{(n1+s)(mod n)}>
(Even though this definition is valid for all s, it will only be
used for 0 <= s <= n1.)
 Notes:

Note that the two kinds of rotations agree for, say, words
(k = n = 32) if and only if words are identified with
integers via the big/bigendian mappingbut we're using the
little/bigendian mapping in the remainder of this chapter.

The notations "<<<" and "<<<<" for the two left circular shift
operators defined above have been adopted in analogy with the Clanguage
"<<" operator. (The C "<<" operator is a left noncircular numerical
shift operator, despite the fact that it is usually spoken of as a
"bitwise" operator.)
(IDL/NDR) Pickles
By a pickle is meant an encoded/marshalled (or
"linearised" or "flattened") bitvector representation of a
(value of a) data type specified in some computer language,
suitable for applicationlevel storage purposes in the absence
of a communications context (hence the use
of the word "pickle", meaning "preserved substance").
Pickles appear in several places in this specification.
In the case of the present revision of DCE, the only
language currently supported for pickles is IDL, and the only
encoding/marshalling currently supported for IDL pickles is NDR
(for the specifications of the IDL language and its NDR
marshalling/encoding, see the referenced Open Group DCE 1.1 RPC Specification). Therefore,
for the purposes of this revision of DCE,
"pickle" always means IDL/NDR pickle
(specified in detail below).
In essence, then, a pickle for the purposes of this specification
is an inmemory representation of RPC input/output data that
"normally" exists only "onthewire" (and hence is normally
only interpreted by the RPC runtime), in a form that can be
interpreted at application level.
Pickles as specified in this specification are bitvectors (actually,
bytevectorssee below) having a common structure,
which will be denoted:


PKL = <H, B>
where:

H is the pickle's header. It is metadata, describing the actual
data carried by the (body bgcolor="#FFFFFF" of the) pickle. It is always nonempty.

B is the pickle's body bgcolor="#FFFFFF". It embodies the actual data
carried by the pickle. It consists of IDLdefined
NDRmarshalled data. (Actually, as will be seen below, B
itself also contains some secondlevel metadata.)
It is always nonempty.
Each of H and B is actually a bytesequence
(that is, has bitlength a nonnegative integral multiple of
8); they will henceforth always be regarded as bytesequences, not
bitsequences. In particular, the length
of such a (byte)vector M henceforth in this section
will always mean its length in bytes.
PKL's header H is specified as an IDL data type
(but whose encoding is not NDR)
as follows (for the IDL definition of the data types not defined
here, see Appendix N, IDL Data Type Declarations, of the referenced Open Group DCE 1.1 RPC Specification):


typedef struct {
uuid_t stx_id;
unsigned32 stx_version;
} rpc_syntax_id_t;
typedef struct {
unsigned8 pkl_version;
unsigned8 pkl_length_hi;
unsigned16 pkl_length_low;
rpc_syntax_id_t pkl_syntax;
uuid_t pkl_type;
} idl_pkl_header_t;
The fields of these data types are the following:

stx_id
RPC ("transfer") syntax identifier (16byte IDLdefined UUID
(uuid_t), encoded in big/bigendian format
see below). Since this field represents a UUID, it can also be specified as
a string (see Appendix A, Universal Unique Identifier, of the referenced Open Group DCE 1.1 RPC Specification).
Its registered values are specified in
Appendix I, Protocol Identifiers, of the referenced Open Group DCE 1.1 RPC Specification
(the only currently supported syntax is NDR).

stx_version
RPC syntax version number (4byte integer, big/bigendian encoded).
This field specifies the version number of the syntax identified in the
stx_id field.
Its registered values are specified in
Appendix I, Protocol Identifiers, of the referenced Open Group DCE 1.1 RPC Specification
(the only currently supported version number of NDR is version 1).

pkl_version
Pickle header version number (1byte integer, bigendian encoded).
The only currently supported value is 0.

pkl_length_hi and pkl_length_low
The length (3byte integer, big/bigendian encoded), in bytes,
of the pickle body bgcolor="#FFFFFF" B. Its value is in the range
[0, 2^{24}1].

pkl_syntax
RPC syntax (20byte rpc_syntax_id_t)
of the encoded/marshalled data in the pickle body bgcolor="#FFFFFF" B.
In the case of NDR version 1 (the only currently supported
value), the pickle body bgcolor="#FFFFFF" B includes some metadata in
addition to its actual encoded/marshalled data
(this is specified in detail below).

pkl_type
Pickle type identifier (16byte IDLdefined UUID (uuid_t),
encoded in big/big endian formatsee below).
Since this field represents a UUID, it can also be specified as
a string (see Appendix A, Universal Unique Identifier, of the referenced Open Group DCE 1.1 RPC Specification).
This field specifies the semantics of the data in the pickle body bgcolor="#FFFFFF" B.
Here, "encoding a UUID in big/bigendian format" means
that the UUID's string representation (see
Appendix A, Universal Unique Identifier, of the referenced Open Group DCE 1.1 RPC Specification)
is to have its hyphens removed, and the resulting
32character string interpreted as a hexadecimal integer, which
is then encoded in big/bigendian format. Thus, for example,
the UUID (in string representation)
0123456789abcdef0123456789abcdef is mapped to the byte
vector <0x01, 0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef, 0x01,
0x23, 0x45, 0x67, 0x89, 0xab, 0xcd, 0xef> (where "0x" is the usual
Clanguage hexadecimal prefix notation, and where each byte is
bitencoded in bigendian format). (This definition can also be
formulated equivalently in terms of big/bigendian encodings of
the individual fields of a uuid_tsee
Appendix N, IDL Data Type Declarations, of the referenced Open Group DCE 1.1 RPC Specification
but this is not done here.)
Note that the encoding/marshalling of the fields of
rpc_syntax_id_t and of idl_pkl_header_t
is included in their specifications above
(namely, it is always big/bigendian, not NDR).
It is further specified that the encoded
order of the fields of rpc_syntax_id_t and of
idl_pkl_header_t is the same order as the syntactic
listing of the fields in their IDL structure definitions, above; and
there are no "padding" bits. Thus, according to this encoding
specification, the encoded/marshalled length of
idl_pkl_header_t is 40 bytes, formatted as follows (using
Clike "dot notation" for structure fields, and using subscripts
for sizes in bytes):


<pkl_version_{1},
<pkl_length_hi_{1}, pkl_length_low_{2}>,
<pkl_syntax.stx_id_{16}, pkl_syntax.stx_version_{4}>,
pkl_type_{16}>
PKL's body bgcolor="#FFFFFF" B depends on PKL's
syntax (that is, PKL's syntax field, H.pkl_syntax).
The pickle bodies for the currently supported pickle syntaxes are
specified as follows:
 Note:

Fullyfledged support for pickling (or "encoding services"), as
opposed to the "handrolled" pickles described in this section,
is anticipated in a future revision of DCE.
CRC32
This section specifies the "noncryptographic" checksum mechanisms
employed in this specification.
Only one such type of mechanism, CRC32, is currently supported,
so this whole section is devoted to that.
 Notes:

This section is based on, and (unless stated otherwise) is technically
aligned with, CCITT V.42.
However, for editorial reasons, this section stands
independently, and no familiarity with that document is required.
(Thus, the part of this chapter that duplicates information in the cited
document is intended to be technically equivalent to that document,
rewritten for the expository purposes of this document, and any technical
discrepancies between the two are inadvertent and to be reconciled.)

CRC32 is used in this specification primarily in
Key Distribution (Authentication) Services
(Kerberos, see
Registered Encryption Types
), and in
Protected RPC
1
(Protected RPC).
It also makes a minor appearance in
rs_acct_key_transmit_t
.
Cyclic Redundancy Checksums
Cyclic redundancy checksums (CRCs) are defined abstractly in
terms of polynomials with modulus2 coefficients (that is, in terms of the
polynomial ring F_{2}[X] in the indeterminate
X), as follows.
(Note that in this abstract definition there is no need to specify
any conventions regarding identifications (endianness) of bitsequences
with integers.)
For purposes of this definition, identify arbitrary nonempty
bitvectors V of length k > 0 with polynomials
V(X) regarded as having highest power k1 via
the following bigendian correspondence:


V = <v_{0}, ···, v_{k1}>
<­>
V(X) =
v_{0}X^{k1} +
··· + v_{k1}X^{0}
Here, V(X) is regarded as having "highest power" k1,
though not necessarily "degree" k1, since some of its leading
coefficients v_{0}, v_{1}, ···,
may be 0.
Now fix an integer N > 0 (for the purposes of DCE,
N will always be 32),
and fix a polynomial G(X) member of F_{2}[X]
of degree exactly
N (that is, the coefficient of X^{N} in
G is nonzero). Then, for any S(X) =
s_{0}X^{N1} + ··· +
s_{N1}X^{0}
regarded as having highest power N1,
and any M(X) =
m_{0}X^{K1} + ··· +
m_{K1}X^{0}
regarded as having highest power K1 (K > 0),
there exist unique polynomials Q(X) and R(X)
(depending on K, S(X), G(X) and
M(X)), such that:

X^{K}S(X) + M(X) =
G(X)Q(X) + R(X)

degree(R(X)) < N
This follows from the elementary technique of polynomial long division
(using modular arithmetic with modulus 2 on the coefficients);
that is, Q(X) is the quotient and R(X)
is the remainder of the division of
X^{K}S(X) + M(X)
by G(X).
Note that under the (bigendian) identification of bitvectors with
polynomials, the correspondence is as follows:


X^{K}S(X) + M(X) =
s_{0}X^{K+N1} +
··· +
s_{N1}X^{K} +
m_{0}X^{K1}
··· +
m_{K1}X^{0}
<­>
<s_{0}, ···, s_{N1},
m_{0}, ···, m_{K1}>
(That is, in terms of bitvectors, the polynomial
X^{K}S(X) + M(X)
corresponds to prepending (the bits of) S(X)
to (the bits of) M(X).)
Now by definition, the Nbit CRC of an
arbitrary nonempty "bitmessage" M, with respect to the
generator G and the seed (or initialisation
vector) S (identifying bitvectors and polynomials as above, of
course), is the Nbit vector
<r_{0}, ···, r_{N1}>
identified with the remainder polynomial R(X) as described
above. The notation for this CRC is:


CRC_{G}(S, M)
In the common case of the seed S being 0 (that is, a 0vector or
0polynomial), the notation CRC_{G}(0, M)
is sometimes simplified to:


CRC_{G}(M)
With this notation, the following CRC composition law (or
chaining property) is immediately seen to hold:


CRC_{G}(S, <M, M´>) =
CRC_{G}(CRC_{G}(S, M), M´)
Now suppose further that the generator G(X) is
irreducible (that is, G(X) cannot be expressed as a
product of two polynomials of positive degree). Then
Nbit CRCs based on G(X) have the errordetecting
property that any two distinct messages differing
in at most a subsequence of N bits have distinct CRCs
(it also detects "most" other errors, in a sense not described here).
That property, while important for the data communications heritage
of CRCs, is not significant for this chapter.
Instead, for the purposes of this specification, the significant property
met for CRCs based on irreducible generators is the "probabilistic
collisionresistance" property mentioned previously. (Note that any
given message can easily be modifiedby appending a single 32bit word
to itto yield any given CRC_{G}checksum value,
so that CRCs are not "cryptographically collisionresistant".)
Finally, a "twisted" version of CRCs is defined as follows.
Let "§" denote the bitreflection (or bitreversal)
operation, which is defined on arbitrary bitvectors
V = <v_{0}, ···,
v_{k1}> by:


V^{§} =
<v_{0}, ···,
v_{k1}>^{§} =
<v_{k1}, ···, v_{0}>
Likewise, let "§§" denote the perbyte bitreflection operation,
which is defined on bitvectors M as follows.
First, if M is not a bytevector; that is, if
the bitlength of M is not a positive multiple of 8,
append the minimal number of zerobits to it such that the resulting
bitvector (still denoted M, by abuse of notation) does have
bitlength a positive multiple of 8. Then, define the perbyte
bitreflection of the resulting bytevector M by:


M^{§§} =
<<m_{0}···, m_{7}>,
···,
<m_{8l+0}, ···,
m_{8l+7}>>^{§§} =
<<m_{0}, ···, m_{7}>^{§},
···,
<m_{8l+0}, ···,
m_{8l+7}>^{§}> =
<<m_{7}, ···, m_{0}>,
···,
<m_{8l+7}, ···,
m_{8l+0}>>
Then with these notations, the twisted CRC corresponding to
CRC_{G}, denoted CRC^{§}_{G},
is defined for bitvectors M as follows:


CRC^{§}_{G}(S, M) =
(CRC_{G}(S^{§},
M^{§§}))^{§}
If the seed S is 0, the notation CRC^{§}_{G}(0,
M) is sometimes simplified to:


CRC^{§}_{G}(M)
Note that twisted CRCs still satisfy the CRC chaining property
(for bytevectors M, M´):


CRC^{§}_{G}(S, <M, M´>) =
CRC^{§}_{G}(CRC^{§}_{G}(S,
M), M´)
 Note:

The twisting convention is related to the data communications heritage of
CRCs: bytes in computer memories are often communicated across
serial data lines leastsignificantbitfirst. That is, while "incore"
bytes are invariably interpreted as bigendian, they are often (but not
always) twisted to become littleendian bytes "onthewire".
The specific irreducible generating polynomials, G(X),
currently registered in DCE are collected in
Registered CRCs
.
Registered CRCs
The only CRC currently used in DCE
arises by taking G(X) to be the following specific choice of
irreducible generating polynomial (of degree N = 32), said to be
the CCITT32 (or ITUT, OSI/IEC, AutodinII,
Ethernet, FDDI, PKZip, and so on) polynomial:


G_{CCITT32}(X) =
X^{32} + X^{26} + X^{23} +
X^{22} + X^{16} + X^{12} +
X^{11} + X^{10} + X^{8} +
X^{7} + X^{5} + X^{4} +
X^{2} + X^{1} + X^{0}
If the X^{32}term is ignored (or rather, considered to
be implicitly present, since the X^{32}term of a
32degree generating polynomial must always be present),
this polynomial corresponds, according to the (bigendian)
identification of polynomials with bitvectors, to the bitvector:


G_{CCITT32}(X) =
<0,0,0,0, 0,1,0,0, 1,1,0,0, 0,0,0,1, 0,0,0,1, 1,1,0,1, 1,0,1,1, 0,1,1,1>
 Note:
 This bitvector in turn corresponds to various integers under the
various identifications of bitvectors with integers.
For example, under the big/bigendian identification it corresponds to
the integer 0x04c11db7 = 79764919, and under the little/littleendian
identification it corresponds to 0xedb88320 = 3988292384.
These integer representations are of no interest for the purposes of
this specification, though they may be of some use for particular
implementations.
This CCITT32 CRC, being the only CRC used in DCE, will
henceforth be denoted simply, without fear of
confusion: ("the") CRC32
(this is perhaps a slight abuse of terminology, though a commonly
accepted one, because "CRC32" is sometimes also used as a
generic term meaning "an Nbit CRC where N = 32").
Accordingly, this CCITT32 CRC and its twisted version (which is the
version actually used in
Protected RPC
)
will henceforth be denoted simply:


CRC_{32}(S, M)
CRC^{§}_{32}(S, M)
If S = 0, the notation is sometimes simplified to:


CRC_{32}(M)
CRC^{§}_{32}(M)
MD4
This section specifies MD4, one of the "cryptographic" checksum
mechanisms employed in this specification.
 Note:

This section is based on, and (unless stated otherwise) is technically aligned
with, the Internet document RFC 1320, by R. Rivest, dated April 1992.
However, for editorial reasons, this section stands
independently, and no familiarity with that document is required.
(Thus, the
part of this section that duplicates information in the cited document is
intended to be technically equivalent to that document, rewritten for the
expository purposes of this document, and any technical discrepancies
between the two are inadvertent and to be reconciled.)
MD4 is described by an algorithm that takes as
input a bitmessage M of arbitrary length and produces as
output a 128bit message digest (or hash, checksum,
checksumtext, fingerprint) of M. MD4 is a
noninvertible ("oneway") function, and it is
claimed that it has the cryptographic property of being
collisionresistant (or collision"proof"):
it is computationally infeasible to exhibit
(that is, to produce, generate or construct) two distinct messages having
the same message digest as one another, or to exhibit a single
message having a given prespecified message digest. More precisely, it
is claimed that the "difficulty" (computational complexity) of exhibiting
two distinct messages having the same message digest has average order
O(½·2^{64}), and that
the difficulty of exhibiting a single message having a given message digest
has average order O(½·2^{128}).
Let M = <m_{0}, ···,
m_{n1}>
be an arbitrarily given bitmessage (sequence) of length n bits.
Here n is an arbitrary nonnegative integer (it may be 0, it
need not be congruent to 0 (mod 8), and it may be arbitrarily large).
Then the algorithm below is performed to compute the message digest
of the message M, which is denoted by MD4(M).
For the remainder of this section, the little/bigendian
identification of integers and 32bit vectors is used (as defined in
Mapping Mixed Bit/ByteSequences to Integers
).
Some Special Functions
The following four special functions are defined for use in the MD4
algorithm. Let U, V, W be 32bit vectors (or, what
amounts to the same given the fixing of the little/big endian
correspondence in the remainder of this section,
integers in the range [0, 2^{32}1]).

F(U, V, W) =
(U & V)  ((~ U) & W)

G(U, V, W) =
(U & V)  (V & W)  (W & U)

H(U, V, W) =
U ^ V ^ W
 Note:
 It is interesting (from a cryptographic design perspective) to note that
various observations can be made about these functions (even though
these observations are quite unnecessary from a DCE conformance point
of view). For example, it may be noted that F acts as the
conditional: "if U then V else W" (or in Clanguage
notation, "U?V:W"). The function F
could have been defined using "addition (mod 2^{32})" instead
of "", since "U & V" and
"(~U) & W" will never have 1s in the same bit position.
In each bit position, G acts as the majority function:
if at least two of (the bits of) U, V, and W are set,
then the corresponding bit of G(U, V, W) is
set, otherwise it is reset.
If the bits of U, V and W
are independent and unbiased in the statistical sense, then each bit of
F(U, V, W) will be independent and unbiased.
The functions G and H are similar to the function
F, in that they act in "bitwise parallel" to produce their
output from the bits of U, V and W, in such a manner
that if the corresponding bits of U, V and W are
independent and unbiased, then each bit of G(U, V,
W) and H(U, V, W) will be independent
and unbiased.
The function H is just the parity function of its inputs
(that is, each bit position of its output is the bit sum (mod 2) of
its inputs).
Append Padding Bits
The message M is padded (appended to), to produce
a new bitmessage of bitlength n´ = lambda(M´) > n
and n´ equivalence 448 (mod 512) (the meaning of the "magic number"
448 is merely that 448 + 64 = 512see
Append Length
on the significance of the number 64):


M' = <M, 1, <0, ···, 0>>;
Padding is always performed, even if the length of the original message
M is already congruent to 448 (mod 512).
It is performed as follows: a single "1" bit is appended to the
message (on the right), and then a 0vector (one consisting
entirely of (zero or more) "0" bits) is appended, so that the
length in bits of the padded message becomes congruent to 448 (mod 512).
In all, at least 1 bit and at most 512 bits are appended.
Append Length
The padded message M´ is appended
with the 64bit representation of n (mod 2^{64}),
to produce the new bitmessage of bitlength
n´´ = lambda(M´´) = n´ + 64:


M" = <M', n(mod 2^{64})>;
Namely, write n (mod 2^{64}) as an integer, as described
above, and then append the 8 bytes of this longword (in little/bigendian
order) to the end of the padded message M´ from the previous step.
The result is now n´´ equivalence 0 (mod 512).
As described above, M´´ can also be viewed as a sequence of
words: let M´´[0], ···,
M´´[N1]
denote these words, where N = n´´/32 equivalence 0 (mod 16).
 Note:
 This step has the theoretical limitation that it does not
distinguish between messages that differ in length by multiples of
2^{64}. In practice this is not a significant limitation (in
particular, it does not pose a significant security threat), for two reasons:

"For all practical purposes", it may be assumed that
all messages have length
less than 2^{64}. This is because the length of time it
would take to merely transmit (over RPC or any other medium) a message
of length 2^{64} bits (much less compute its MD4 checksum) is
impractically large. Indeed, at a transmission rate of
one gigabit (10^{9} bits) per second,
it would take ~584½ years just to transmit such a
message. Adding to this transmission time any actual local processing time
on the data (such as "generating" it on the sending side, or
"interpreting" it on the receiving side) reduces to insignificance the
practical uses of such very large messages (in the present era of computing).

In the actual usage of MD4 as specified in DCE (namely, in
Protected RPC), all messages actually protected with MD4 have length
< 2^{64} (see
Protected RPC
,
and the referenced Open Group DCE 1.1 RPC Specification).
Initialise State Buffer and Trigonometric Vector
A quadword (= 128 bits) (external) state buffer,
denoted <A´, B´, C´, D´>,
is used to describe the pseudocode computation below.
Here, each of A´, B´, C´, D´ is a
wordsized variable. These variables are initialised to the
following initialisation values or seeds
(expressed as integers in Clike hexadecimal notation, with the
corresponding little/bigendian bytesequences in comments):


A' = 0x67452301; /* = <0x01,0x23,0x45,0x67> */
B' = 0xefcdab89; /* = <0x89,0xab,0xcd,0xef> */
C' = 0x98badcfe; /* = <0xfe,0xdc,0xba,0x98> */
D' = 0x10325476; /* = <0x76,0x54,0x32,0x10> */
The following array of 2 "quadratic"
(unsigned) words, Q[2] and Q[3] is further
initialised:


Q[2] = 0x5a827999;
Q[3] = 0x6ed9eba1;
 Note:
 The origin of the name "quadratic" comes from the fact that
Q[i] = [2^{32}·sqrt(i)],
where on the right hand side "[···]" denotes the integral
part of a real number,
and "sqrt" is the usual square root function.
The values Q[2] and Q[3] are chosen for use
in the MD4 algorithm because the bits of these 2 values are
statistically welldistributed for this application.
Compress Message in 16Word Chunks
Perform the following algorithm (expressed in pseudocode).
Its outer loop processes the padded, appended message
M´´ in chunks of 16 words (= 512 bits) as follows:


for (i = 0; i <= N/16  1; i += 1) {
<A',B',C',D'> =
COMPRESS(A',B',C',D',M"[16*i+0],···,M"[16*i+15]);
}
Its inner compression function, denoted
COMPRESS(A´, B´, C´, D´,
R[0], ···, R[15]),
takes as input a 128bit state vector
<A´, B´, C´, D´>
and a 512bit message chunk <R[0], ···, R[15]>,
and produces as output a 128bit state vector, say
<A´´, B´´, C´´, D´´>.
It is defined by the following algorithm (consisting of 3
rounds of 16 compression operations each), where "+" denotes
("unsigned") addition (mod 2^{32}), and where
F, G and H are the special functions defined in
Some Special Functions
:


/* Initialise internal state buffer <A,B,C,D> = <A',B',C',D'>. */
A = A'; B = B'; C = C'; D = D';
/* Round 1  Let "FF{abcd,r,s}" denote the Fcompression statement
a = (a + F(b,c,d) + R[r] <<< s
and invoke it 16 times as follows: */
FF{ABCD, 0, 3}; FF{DABC, 1, 7}; FF{CDAB, 2,11}; FF{BCDA, 3,19};
FF{ABCD, 4, 3}; FF{DABC, 5, 7}; FF{CDAB, 6,11}; FF{BCDA, 7,19};
FF{ABCD, 8, 3}; FF{DABC, 9, 7}; FF{CDAB,10,11}; FF{BCDA,11,19};
FF{ABCD,12, 3}; FF{DABC,13, 7}; FF{CDAB,14,11}; FF{BCDA,15,19};
/* Round 2  Let "GG{abcd,r,s}" denote the Gcompression statement
a = (a + G(b,c,d) + R[r] + Q[2]) <<< s
and invoke it 16 times as follows: */
GG{ABCD, 0, 3}; GG{DABC, 4, 5}; GG{CDAB, 8, 9}; GG{BCDA,12,13};
GG{ABCD, 1, 3}; GG{DABC, 5, 5}; GG{CDAB, 9, 9}; GG{BCDA,13,13};
GG{ABCD, 2, 3}; GG{DABC, 6, 5}; GG{CDAB,10, 9}; GG{BCDA,14,13};
GG{ABCD, 3, 3}; GG{DABC, 7, 5}; GG{CDAB,11, 9}; GG{BCDA,15,13};
/* Round 3  Let "HH{abcd,r,s,t}" denote the Hcompression statement
a = (a + H(b,c,d) + R[r] + Q[3]) <<< s
and invoke it 16 times as follows: */
HH{ABCD, 0, 3}; HH{DABC, 8, 9}; HH{CDAB, 4,11}; HH{BCDA,12,15};
HH{ABCD, 2, 3}; HH{DABC,10, 9}; HH{CDAB, 6,11}; HH{BCDA,14,15};
HH{ABCD, 1, 3}; HH{DABC, 9, 9}; HH{CDAB, 5,11}; HH{BCDA,13,15};
HH{ABCD, 3, 3}; HH{DABC,11, 9}; HH{CDAB, 7,11}; HH{BCDA,15,15};
/* Output state <A",B",C",D"> = <A',B',C',D'> + <A,B,C,D>. */
A" = A'+A; B" = B'+B; C" = C'+C; D" = D'+D;
Output
Finally, the message digest, MD4(M), is defined to be the final
quadword state resulting from the above algorithm, after the outer loop
completes (it begins with the loworder byte of A´, and ends
with the highorder byte of D´):


MD4(M) = <A', B', C', D'>;
This completes the description of MD4.
MD5
This section specifies MD5, one of the "cryptographic" checksum
mechanisms employed in this specification.
 Note:

This section is based on, and (unless stated otherwise) is technically aligned
with, the Internet document RFC 1321, by R. Rivest, dated April 1992.
However, for editorial reasons, this section stands
independently, and no familiarity with that document is required.
(Thus, the
part of this section that duplicates information in the cited document is
intended to be technically equivalent to that document, rewritten for the
expository purposes of this document, and any technical discrepancies
between the two are inadvertent and to be reconciled.)
MD5 is described by an algorithm that takes as
input a bitmessage M of arbitrary length and produces as
output a 128bit message digest (or hash, checksum,
checksumtext, fingerprint) of M. MD5 is a
noninvertible ("oneway") function, and it is
claimed that it has the cryptographic property of being
collisionresistant (or collision"proof"):
it is computationally infeasible to exhibit
(that is, to produce, generate or construct) two distinct messages having
the same message digest as one another, or to exhibit a single
message having a given prespecified message digest. More precisely, it
is claimed that the "difficulty" (computational complexity) of exhibiting
two distinct messages having the same message digest has average order
O(½·2^{64}), and that
the difficulty of exhibiting a single message having a given message digest
has average order O(½·2^{128}).
Let M = <m_{0}, ···,
m_{n1}>
be an arbitrarily given bitmessage (sequence) of length n bits.
Here n is an arbitrary nonnegative integer (it may be 0, it
need not be congruent to 0 (mod 8), and it may be arbitrarily large).
Then the algorithm below is performed to compute the message digest
of the message M, which is denoted by MD5(M).
For the remainder of this section, the little/bigendian
identification of integers and 32bit vectors is used (as defined in
Mapping Mixed Bit/ByteSequences to Integers
).
Some Special Functions
The following four special functions are defined for use in the MD5
algorithm. Let U, V, W be 32bit vectors (or, what
amounts to the same given the fixing of the little/big endian
correspondence in the remainder of this section,
integers in the range [0, 2^{32}1]).
 Note:
 As with the functions E and H (see
Some Special Functions
), the functions G and I act in "bitwise parallel".
Append Padding Bits
The message M is padded (appended to), to produce
a new bitmessage of bitlength n´ = lambda(M´) > n
and n´ equivalence 448 (mod 512) .
This is performed as described in
Append Padding Bits
.
Append Length
The padded message M´ is appended
with the 64bit representation of n (mod 2^{64}),
to produce the new bitmessage of bitlength
n´´ = lambda(M´´) = n´ + 64.
This is performed as described in
Append Length
.
 Note:
 In the actual usage of MD5 as specified in DCE (namely, in
Protected RPC), all messages actually protected with MD5 have length
< 2^{64} (see
Protected RPC
,
and the referenced Open Group DCE 1.1 RPC Specification).
Initialise State Buffer and Trigonometric Vector
A quadword (= 128 bits) (external) state buffer,
denoted <A´, B´, C´, D´>,
is used to describe the pseudocode computation below.
Here, each of A´, B´, C´, D´ is a
wordsized variable. These variables are initialised to the
same initialisation values or seeds
as described in
Initialise State Buffer and Trigonometric Vector
.
The following array of 64 "trigonometric"
(unsigned) words, T[0], ···, T[63] is further
initialised:


T[] = {0xd76aa478, 0xe8c7b756, 0x242070db, 0xc1bdceee,
0xf57c0faf, 0x4787c62a, 0xa8304613, 0xfd469501,
0x698098d8, 0x8b44f7af, 0xffff5bb1, 0x895cd7be,
0x6b901122, 0xfd987193, 0xa679438e, 0x49b40821,
0xf61e2562, 0xc040b340, 0x265e5a51, 0xe9b6c7aa,
0xd62f105d, 0x02441453, 0xd8a1e681, 0xe7d3fbc8,
0x21e1cde6, 0xc33707d6, 0xf4d50d87, 0x455a14ed,
0xa9e3e905, 0xfcefa3f8, 0x676f02d9, 0x8d2a4c8a,
0xfffa3942, 0x8771f681, 0x6d9d6122, 0xfde5380c,
0xa4beea44, 0x4bdecfa9, 0xf6bb4b60, 0xbebfbc70,
0x289b7ec6, 0xeaa127fa, 0xd4ef3085, 0x04881d05,
0xd9d4d039, 0xe6db99e5, 0x1fa27cf8, 0xc4ac5665,
0xf4292244, 0x432aff97, 0xab9423a7, 0xfc93a039,
0x655b59c3, 0x8f0ccc92, 0xffeff47d, 0x85845dd1,
0x6fa87e4f, 0xfe2ce6e0, 0xa3014314, 0x4e0811a1,
0xf7537e82, 0xbd3af235, 0x2ad7d2bb, 0xeb86d391};
 Note:
 The origin of the name "trigonometric" comes from the fact that
T[i] = [2^{32}sin(i+1)],
where on the right hand side "[···]" denotes the integral
part of a real number, "···" denotes absolute value,
and "sin" is the usual trigonometric sine function, whose
arguments ("angles") i+1 are measured in radians.
The values T[0], ···, T[63] are chosen for use
in the MD5 algorithm because the bits of these 64 values are
statistically welldistributed for this application.
Compress Message in 16Word Chunks
Perform the following algorithm (expressed in pseudocode).
Its outer loop processes the padded, appended message
M´´ in chunks of 16 words (= 512 bits) as follows:


for (i = 0; i <= N/16  1; i += 1) {
<A',B',C',D'> =
COMPRESS(A',B',C',D',M"[16*i+0],···,M"[16*i+15]);
}
Its inner compression function, denoted
COMPRESS(A´, B´, C´, D´,
R[0], ···, R[15]),
takes as input a 128bit state vector
<A´, B´, C´, D´>
and a 512bit message chunk <R[0], ···, R[15]>,
and produces as output a 128bit state vector, say
<A´´, B´´, C´´, D´´>.
It is defined by the following algorithm (consisting of 4
rounds of 16 compression operations each), where "+" denotes
("unsigned") addition (mod 2^{32}), and where
F, G, H and I are the special functions defined in
Some Special Functions
:


/* Initialise internal state buffer <A,B,C,D> = <A',B',C',D'>. */
A = A'; B = B'; C = C'; D = D';
/* Round 1  Let "FF{abcd,r,s,t}" denote the Fcompression statement
a = b + ((a + F(b,c,d) + R[r] + T[t]) <<< s)
and invoke it 16 times as follows: */
FF{ABCD, 0, 7, 0}; FF{DABC, 1,12, 1}; FF{CDAB, 2,17, 2}; FF{BCDA, 3,22, 3};
FF{ABCD, 4, 7, 4}; FF{DABC, 5,12, 5}; FF{CDAB, 6,17, 6}; FF{BCDA, 7,22, 7};
FF{ABCD, 8, 7, 8}; FF{DABC, 9,12, 9}; FF{CDAB,10,17,10}; FF{BCDA,11,22,11};
FF{ABCD,12, 7,12}; FF{DABC,13,12,13}; FF{CDAB,14,17,14}; FF{BCDA,15,22,15};
/* Round 2  Let "GG{abcd,r,s,t}" denote the Gcompression statement
a = b + ((a + G(b,c,d) + R[r] + T[t]) <<< s)
and invoke it 16 times as follows: */
GG{ABCD, 1, 5,16}; GG{DABC, 6, 9,17}; GG{CDAB,11,14,18}; GG{BCDA, 0,20,19};
GG{ABCD, 5, 5,20}; GG{DABC,10, 9,21}; GG{CDAB,15,14,22}; GG{BCDA, 4,20,23};
GG{ABCD, 9, 5,24}; GG{DABC,14, 9,25}; GG{CDAB, 3,14,26}; GG{BCDA, 8,20,27};
GG{ABCD,13, 5,28}; GG{DABC, 2, 9,29}; GG{CDAB, 7,14,30}; GG{BCDA,12,20,31};
/* Round 3  Let "HH{abcd,r,s,t}" denote the Hcompression statement
a = b + ((a + H(b,c,d) + R[r] + T[t]) <<< s)
and invoke it 16 times as follows: */
HH{ABCD, 5, 4,32}; HH{DABC, 8,11,33}; HH{CDAB,11,16,34}; HH{BCDA,14,23,35};
HH{ABCD, 1, 4,36}; HH{DABC, 4,11,37}; HH{CDAB, 7,16,38}; HH{BCDA,10,23,39};
HH{ABCD,13, 4,40}; HH{DABC, 0,11,41}; HH{CDAB, 3,16,42}; HH{BCDA, 6,23,43};
HH{ABCD, 9, 4,44}; HH{DABC,12,11,45}; HH{CDAB,15,16,46}; HH{BCDA, 2,23,47};
/* Round 4  Let "II{abcd,r,s,t}" denote the Icompression statement
a = b + ((a + I(b,c,d) + R[r] + T[t]) <<< s)
and invoke it 16 times as follows: */
II{ABCD, 0, 6,48}; II{DABC, 7,10,49}; II{CDAB,14,15,50}; II{BCDA, 5,21,51};
II{ABCD,12, 6,52}; II{DABC, 3,10,53}; II{CDAB,10,15,54}; II{BCDA, 1,21,55};
II{ABCD, 8, 6,56}; II{DABC,15,10,57}; II{CDAB, 6,15,58}; II{BCDA,13,21,59};
II{ABCD, 4, 6,60}; II{DABC,11,10,61}; II{CDAB, 2,15,62}; II{BCDA, 9,21,63};
/* Output state <A",B",C",D"> = <A',B',C',D'> + <A,B,C,D>. */
A" = A'+A; B" = B'+B; C" = C'+C; D" = D'+D;
Output
Finally, the message digest, MD5(M), is defined to be the final
quadword state resulting from the above algorithm, after the outer loop
completes (it begins with the loworder byte of A´, and ends
with the highorder byte of D´):


MD5(M) = <A', B', C', D'>;
This completes the description of MD5.
Please note that the html version of this specification
may contain formatting aberrations. The definitive version
is available as an electronic publication on CDROM
from The Open Group.