NMF SPIRIT Issue 3.0 Platform Blueprint
Copyright © 1995 Network Management Forum
Source Code Transfer Profile
The main objective of defining application portability profiles
is to enable the porting of application resources, such as source
code and data, from one development environment to another via portable
media or telecommunication lines.
Such porting requires the physical means to transfer application
source code and reference data from one SPIRIT Platform implementation
to another.
In particular, this requires:
-
portable media,
such as floppy disks, magnetic tapes and CD-ROM disks,
used to port application resources across different environments
-
telecommunication protocols,
such as FTAM and FTP, used to transfer application resources between
physically connected systems
-
interchange formats,
such as sequential files with fixed-length records and pax
formats, used to transfer application resources
to or from file or archive formats.
- Note:
- Different platform implementations may encode native character
data differently (see footnote 1).
SPIRIT distinguishes between the concept of character set and code set:
-
A character set is a well-defined set of symbols, without
regard to the binary representation.
Latin-1, Kanji, Katakana, Cyrillic, Arabic, Hebrew and Hangul
are all character sets.
-
A code set is a mapping of one or more character sets into
a set of binary codes.
ASCII, EBCDIC, ISO/IEC 10646 Universal Character Set and Shift-JIS are
all examples of code sets.
Transfer of source code and reference data requires not only a
common base of character sets, but also the means to translate from
one platform's native code set encoding to another.
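
The distinction between a character set and a code set can be illustrated with a minimal sketch. It assumes Python's built-in codecs as stand-ins for the code sets named above (`cp037` is one EBCDIC code page); the function name `codes_for` and the chosen list of code sets are illustrative, not part of SPIRIT.

```python
# One character, several code sets: the symbol stays the same,
# only its binary code differs from one code set to the next.
# Assumption: these Python codec names stand in for the code sets in the text.
CODE_SETS = ["ascii", "cp037", "iso8859-1", "utf-16-be"]


def codes_for(char: str) -> dict:
    """Return the binary code of `char` in every listed code set that covers it."""
    result = {}
    for name in CODE_SETS:
        try:
            result[name] = char.encode(name)
        except UnicodeEncodeError:
            # The character is outside the repertoire mapped by this code set.
            pass
    return result


if __name__ == "__main__":
    for symbol in ("A", "é"):
        for code_set, code in codes_for(symbol).items():
            print(f"{symbol!r} in {code_set}: {code.hex()}")
```

The letter `A` is coded as 0x41 in ASCII and as 0xC1 in this EBCDIC code page, while `é` has no ASCII code at all, which is why a transfer needs both a shared repertoire and a translation step.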
Model
To achieve application source code transfer across platforms with
different native platform encodings, SPIRIT requires translation to
a character set encoding understood in common by the source and
target platforms.
See Figure: Source Code Porting Model.
Figure: Source Code Porting Model
The essential elements for source code portability are a common
character encoding for the transfer of source data and a physical
means, portable media and/or file transfer protocols, of moving the
data from the source development environment to the target
development environment.
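
The porting model can be sketched as two translation steps around the transfer itself. This is a minimal illustration, assuming ISO Latin 1 as the common transfer encoding and Python codec names for the native code sets; `export_source`, `import_source` and `TRANSFER_ENCODING` are hypothetical names, and the physical transfer (media or protocol) is left out.

```python
# Assumption: ISO Latin 1 is the encoding both platforms understand in common.
TRANSFER_ENCODING = "iso8859-1"


def export_source(native_bytes: bytes, source_native: str) -> bytes:
    """Source platform: translate from its native code set to the transfer encoding."""
    return native_bytes.decode(source_native).encode(TRANSFER_ENCODING)


def import_source(transfer_bytes: bytes, target_native: str) -> bytes:
    """Target platform: translate from the transfer encoding to its native code set."""
    return transfer_bytes.decode(TRANSFER_ENCODING).encode(target_native)


if __name__ == "__main__":
    text = "int main(void) { return 0; }\n"
    on_source = text.encode("cp037")              # native EBCDIC bytes on the source
    in_transit = export_source(on_source, "cp037")
    on_target = import_source(in_transit, "ascii")
    print(on_target.decode("ascii"), end="")
```

Neither platform needs to know the other's native code set; each only converts to or from the common encoding.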
Normative References
To gain maximum application portability, SPIRIT specifies the
profile described below.
All references are provided in Part 1, Overview and Core Specifications,
Normative References.
Each corresponding reference is identified by the label used to
classify standards in Part 1, Overview and Core Specifications.
Portable Media
Physical media specifications for application portability are
defined using the base standards given below.
A platform, however, need not directly support devices for reading
from and writing to the media defined here, but need only support
a mechanism for converting to and from the character set
encoding contained on the media.
- MED-1
- Floppy disks
- MED-2
- Magnetic tape
- MED-3
- CD-ROM disks
Telecommunication Protocols
The following specifications are alternatives to portable media
(see Portable Media) and to each other:
- PRO/APPL-8
- File Transfer, Access and Management
- PRO/APPL-9
- Internet File Transfer Protocol
See Part 3, Communications, for further information.
Interchange Formats
The following specifications are information interchange formats.
- EXFOR-4
- Source Code Transfer File Formats - pax
(tar and extended cpio format)
- EXFOR-5
- Numerical Data Representation
- EXFOR-6
- Character Set Encoding (ASN.1 BER)
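
As an illustration of the pax interchange format, the following sketch packs source files into an in-memory archive and reads them back. It assumes Python's standard `tarfile` module as the archiver, which is not something SPIRIT prescribes; the helper names `pack_pax` and `unpack` and the sample file names are hypothetical.

```python
import io
import tarfile


def pack_pax(files: dict) -> bytes:
    """Pack a {file name: content bytes} mapping into a pax-format archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.PAX_FORMAT) as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)          # the header must carry the member size
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def unpack(archive_bytes: bytes) -> dict:
    """Read every regular file back out of an archive produced by pack_pax."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
        return {
            member.name: archive.extractfile(member).read()
            for member in archive.getmembers()
            if member.isfile()
        }


if __name__ == "__main__":
    sources = {"src/main.c": b"int main(void) { return 0; }\n"}
    print(unpack(pack_pax(sources)) == sources)
```

The archive carries file names and contents only; translating the contents between code sets remains a separate step.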
Character Sets
The following specifications are character sets for source code
transfer:
- I18N-1
- ISO Latin 1
- I18N-2
- Alphanumeric
- I18N-3
- Kanji
- I18N-4
- Katakana
- I18N-5
- ISO Latin 2
Code Sets
The following specifications are code sets for source code transfer:
- EXFOR-1
- Seven and eight-bit encodings
- EXFOR-2
- ISO 2022/JIS Transmission Code Set
- EXFOR-3
- Universal Multiple-Octet Coded Character Set
Seven/eight-bit encoding of source data is sufficient when the
source data uses only characters defined in ISO Latin 1 or ISO Latin 2.
Use of ISO 2022 and JIS transmission code sets is an alternative
to the use of the Universal Multiple-Octet Coded Character Set
(ISO/IEC 10646) (see footnote 4).
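
The choice between these code sets can be sketched as a simple check on the source text. This is only an illustration: `choose_code_set` is a hypothetical helper, and Python's `iso8859-1` and `iso8859-2` codecs are assumed to approximate the ISO Latin 1 and ISO Latin 2 repertoires.

```python
def choose_code_set(text: str) -> str:
    """Suggest a SPIRIT code set label for `text` based on the characters it uses."""
    single_byte = (
        ("EXFOR-1 (ISO Latin 1)", "iso8859-1"),
        ("EXFOR-1 (ISO Latin 2)", "iso8859-2"),
    )
    for label, codec in single_byte:
        try:
            text.encode(codec)
            return label               # every character fits a seven/eight-bit encoding
        except UnicodeEncodeError:
            continue
    # Characters outside both repertoires need one of the alternatives.
    return "EXFOR-2 or EXFOR-3"


if __name__ == "__main__":
    for sample in ("naïve", "Łódź", "日本"):
        print(sample, "->", choose_code_set(sample))
```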
When using the ISO 2022 and JIS transmission code sets, code
extension techniques comply with ISO 2022, and the following
should be applied:
-
Graphic characters
Only the G0 set shall be used.
Invocation shall not be used.
The G0 set shall be considered invoked in columns 2 to 7.
The escape sequences registered in the ISO International Register of
Character Sets shall be used with the Escape Sequence.
The announce sequence shall be omitted.
The alphabetic character set, defined in JIS X 0201, designates
the initial shift state.
-
Control characters
Control character set in JIS X 0201 shall be designated in the C0 set.
Mapping Between Character Sets and an Exchange Format
When using UCS (EXFOR-3) as an exchange format, every character in
the following collections of characters should be converted to/from
the code point specified by the corresponding table of Annex 3 of
JIS X0221 (EXFOR-3):
-
JIS X0201: Table 1 and Table 2
-
Non-Kanji characters in JIS X0208: Table 3
-
Non-Kanji characters in JIS X0212: Table 4.
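
A table-driven conversion of this kind can be sketched as follows. The sketch assumes the mapping tables built into Python's `shift_jis` and `euc_jp` codecs as stand-ins for the Annex 3 tables, which may differ for individual characters; the function names are hypothetical.

```python
def jis_x0201_to_ucs(code: int) -> int:
    """Map a single-byte JIS X0201 code to a UCS code point."""
    # Assumption: the shift_jis codec carries JIS X0201 in its single-byte range.
    return ord(bytes([code]).decode("shift_jis"))


def jis_x0208_to_ucs(first: int, second: int) -> int:
    """Map a two-byte, 7-bit JIS X0208 code to a UCS code point."""
    # EUC-JP carries JIS X0208 with the high bit set on both bytes.
    return ord(bytes([first | 0x80, second | 0x80]).decode("euc_jp"))


if __name__ == "__main__":
    print(hex(jis_x0201_to_ucs(0xB1)))         # halfwidth Katakana letter A
    print(hex(jis_x0208_to_ucs(0x24, 0x22)))   # Hiragana letter A, a non-Kanji character
```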
Character Set Profile for SPIRIT SQL
The following character sets are supported by SPIRIT SQL:
-
Alphanumeric character set
- Name:
- SIMPLE_LATIN
- Character Set Repertoire:
- ISO/IEC 646 (I18N-2)
- Form-of-use:
- Implementation-defined.
- Default Collating Sequence:
- Implementation-defined.
-
Latin-1 character set
- Name:
- LATIN1
- Character Set Repertoire:
- This character set consists of the 191 graphic characters defined in
ISO 8859-1 (I18N-1).
- Form-of-use:
- The coded representation of each character by a single 8-bit byte,
with no designation escape sequences for other character sets.
- Default Collating Sequence:
- Implementation-defined.
-
Latin-2 character set
- Name:
- LATIN2
- Character Set Repertoire:
- This character set consists of the 191 graphic characters defined
in ISO 8859-2 (I18N-5).
- Form-of-use:
- The coded representation of each character by a single 8-bit byte,
with no designation escape sequences for other character sets.
- Default Collating Sequence:
- Implementation-defined.
-
Japanese Katakana character set
- Name:
- JAPANESE_KATAKANA
- Character Set Repertoire:
- JIS X0201 (I18N-4)
- Form-of-use:
- Implementation-defined.
- Default Collating Sequence:
- Implementation-defined.
-
Japanese Kanji character set
- Name:
- JAPANESE_KANJI
- Character Set Repertoire:
- JIS X0208 (I18N-3)
- Form-of-use:
- Implementation-defined.
- Default Collating Sequence:
- Implementation-defined.
-
Japanese all-in-one character set
- Name:
- JAPANESE
- Character Set Repertoire:
- JIS X0201 (I18N-4) + JIS X0208 (I18N-3) + ISO/IEC 646 (I18N-2)
- Form-of-use:
- Implementation-defined.
- Default Collating Sequence:
- Implementation-defined.
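
The repertoire and form-of-use of the LATIN1 and LATIN2 character sets can be checked with a small sketch. It assumes that the 191 graphic characters of an ISO 8859 part occupy code positions 0x20 to 0x7E and 0xA0 to 0xFF, and that Python's `iso8859-1` and `iso8859-2` codecs match the two standards; `fits_character_set` is a hypothetical helper, not part of SPIRIT SQL.

```python
# Code positions of graphic characters in an ISO 8859 part: 95 + 96 = 191.
GRAPHIC_POSITIONS = frozenset(range(0x20, 0x7F)) | frozenset(range(0xA0, 0x100))

# Assumption: these Python codecs correspond to the SPIRIT SQL character set names.
SINGLE_BYTE_CODECS = {"LATIN1": "iso8859-1", "LATIN2": "iso8859-2"}


def fits_character_set(text: str, sql_name: str) -> bool:
    """True if each character of `text` is one graphic character, coded in one byte."""
    try:
        encoded = text.encode(SINGLE_BYTE_CODECS[sql_name])
    except UnicodeEncodeError:
        return False                   # a character lies outside the repertoire
    return all(byte in GRAPHIC_POSITIONS for byte in encoded)


if __name__ == "__main__":
    print(len(GRAPHIC_POSITIONS))
    print(fits_character_set("Zürich", "LATIN1"))
    print(fits_character_set("Łódź", "LATIN1"), fits_character_set("Łódź", "LATIN2"))
```

Control characters such as a tab are rejected because they are not among the graphic characters of either set.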
Footnotes
- 1.
- Although ISO has defined a universal code set, ISO/IEC 10646, most
existing platforms do not use this for their native code set.
It is anticipated that transition to pervasive use of ISO/IEC 10646 as
the native encoding across all platform implementations will take
some time.
- 2.
- This is needed for encoding numerical data embedded in file headers.
- 3.
- This is needed for transferring files via telecommunication lines.
- 4.
- ISO/IEC 10646 represents SPIRIT's future direction.