SPIRIT Platform Blueprint - Source Code Transfer Profile

NMF SPIRIT Issue 3.0 Platform Blueprint

Copyright © 1995 Network Management Forum

Source Code Transfer Profile

The main objective of defining application portability profiles is to enable the porting of application resources, such as source code and data, from one development environment to another via portable media or telecommunication lines.

Such porting requires the physical means to transfer application source code and reference data from one SPIRIT Platform implementation to another. In particular, this requires:

portable media, such as floppy disks, magnetic tapes and CD-ROM disks, used to port application resources across different environments
telecommunication protocols, such as FTAM and FTP, used to transfer application resources between physically connected systems
interchange formats, such as sequential files with fixed-length records and pax formats, used to transfer application resources to or from file or archive formats.

Note: Different platform implementations may encode native character data differently.¹

SPIRIT distinguishes between the concept of character set and code set:

A character set is a well-defined set of symbols, without regard to the binary representation. Latin-1, Kanji, Katakana, Cyrillic, Arabic, Hebrew and Hangul are all character sets.
A code set is a mapping of one or more character sets into a set of binary codes. ASCII, EBCDIC, ISO/IEC 10646 Universal Character Set and Shift-JIS are all examples of code sets.

Transfer of source code and reference data requires not only a common base of character sets, but also the means to translate from one platform's native code set encoding to another.
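The distinction can be illustrated with a minimal sketch. It assumes Python's built-in codecs, with `cp037` standing in for one EBCDIC code page and `utf-16-be` standing in for the two-octet form of ISO/IEC 10646 (the two agree only for characters in the Basic Multilingual Plane). None of these codec names come from SPIRIT itself.

```python
# The same six characters (one character set) under three code sets.
text = "SPIRIT"

ascii_bytes = text.encode("ascii")       # ASCII code set
ebcdic_bytes = text.encode("cp037")      # an EBCDIC code page (assumption: IBM 037)
ucs_bytes = text.encode("utf-16-be")     # two octets per character for BMP characters

# The characters are identical; only the binary codes differ.
same_characters = (
    ascii_bytes.decode("ascii")
    == ebcdic_bytes.decode("cp037")
    == ucs_bytes.decode("utf-16-be")
)
different_codes = ascii_bytes != ebcdic_bytes
```

Decoding each byte string with its own code set recovers the same characters, which is why a transfer needs to know which code set the bytes were written in.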

Model

To achieve application source code transfer across platforms with different native encodings, SPIRIT requires translation to a character set encoding that both the source and target platforms understand. See Figure: Source Code Porting Model.

Figure: Source Code Porting Model

The essential elements of source code portability are a common character encoding for the transferred source data and a means of moving that data from the source development environment to the target development environment: portable media, file transfer protocols, or both.
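The porting model can be sketched as two translation steps around the transfer. This is only an illustration under stated assumptions: the function names are hypothetical, ISO 8859-1 is assumed as the common transfer encoding, and `cp500` and `cp1252` are arbitrary stand-ins for the native code sets of the source and target platforms.

```python
COMMON_ENCODING = "iso8859_1"  # assumption: the encoding agreed for the transfer

def export_for_transfer(native: bytes, native_encoding: str) -> bytes:
    """Source platform: translate native bytes into the common encoding."""
    return native.decode(native_encoding).encode(COMMON_ENCODING)

def import_from_transfer(transferred: bytes, native_encoding: str) -> bytes:
    """Target platform: translate common-encoding bytes into native bytes."""
    return transferred.decode(COMMON_ENCODING).encode(native_encoding)

program = "int main(void) { return 0; }\n"

source_native = program.encode("cp500")                 # as stored on the source platform
on_the_wire = export_for_transfer(source_native, "cp500")
target_native = import_from_transfer(on_the_wire, "cp1252")
```

Both steps use strict error handling by default, so a character that the common encoding cannot represent raises an error instead of being silently replaced.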

Normative References

To gain maximum application portability, SPIRIT specifies the profile described below. All references are provided in Part 1, Overview and Core Specifications, Normative References. Each corresponding reference is identified by the label used to classify standards in Part 1, Overview and Core Specifications.

Portable Media

Physical media specifications for application portability are defined using the base standards given below. A platform, however, need not directly support devices for reading from and writing to the media defined here, but need only support a mechanism for converting to and from the character set encoding contained on the media.

MED-1: Floppy disks
MED-2: Magnetic tape
MED-3: CD-ROM disks

Telecommunication Protocols

The following specifications are alternatives to portable media (see Portable Media) and to each other:

PRO/APPL-8: File Transfer, Access and Management
PRO/APPL-9: Internet File Transfer Protocol

See Part 3, Communications for further information.

Interchange Formats

The following specifications are information interchange formats.

EXFOR-4: Source Code Transfer File Formats - pax (tar and extended cpio format)
EXFOR-5: Numerical Data Representation²
EXFOR-6: Character Set Encoding (ASN.1 BER)³
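As an illustration of an archive interchange format, the sketch below packs source files into a pax archive in memory and reads them back. It assumes Python's standard `tarfile` module, whose `PAX_FORMAT` is the later POSIX.1-2001 pax interchange format and not necessarily the exact format EXFOR-4 refers to. The helper names and the file name are hypothetical.

```python
import io
import tarfile

def pack_pax(files: dict[str, bytes]) -> bytes:
    """Write the given name -> content mapping into an in-memory pax archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.PAX_FORMAT) as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)          # the header must carry the member size
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()

def unpack_pax(blob: bytes) -> dict[str, bytes]:
    """Read every regular file back out of an in-memory archive."""
    result = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as archive:
        for member in archive.getmembers():
            if member.isfile():
                result[member.name] = archive.extractfile(member).read()
    return result

sources = {"src/hello.c": b"int main(void) { return 0; }\n"}
restored = unpack_pax(pack_pax(sources))
```

The archive carries file contents as opaque bytes, so any code set translation still has to happen before packing or after unpacking.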

Character Sets

The following specifications are character sets for source code transfer:

I18N-1: ISO Latin 1
I18N-2: Alphanumeric
I18N-3: Kanji
I18N-4: Katakana
I18N-5: ISO Latin 2

Code Sets

The following specifications are code sets for source code transfer:

EXFOR-1: Seven and eight-bit encodings
EXFOR-2: ISO 2022/JIS Transmission Code Set
EXFOR-3: Universal Multiple-Octet Coded Character Set

Seven/eight-bit encoding of source data is sufficient when the source data uses only characters defined in ISO Latin 1 or ISO Latin 2. Use of ISO 2022 and JIS transmission code sets is an alternative to the use of the Universal Multiple-Octet Coded Character Set (ISO/IEC 10646).⁴
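Whether eight-bit encoding is sufficient for a given source file can be checked mechanically. The sketch below assumes Python's `iso8859_1` and `iso8859_2` codecs as stand-ins for ISO Latin 1 and ISO Latin 2; the function name is hypothetical. These codecs also accept control characters, so the check is slightly more permissive than the graphic repertoire alone.

```python
def eight_bit_code_sets(text: str) -> list[str]:
    """Return the eight-bit code sets that can represent every character in text."""
    usable = []
    for name in ("iso8859_1", "iso8859_2"):
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue                      # at least one character is outside this set
        usable.append(name)
    return usable

latin1_only = eight_bit_code_sets("mañana")   # U+00F1 is in Latin 1 but not Latin 2
latin2_only = eight_bit_code_sets("Łódź")     # U+0141 and U+017A are in Latin 2 only
neither = eight_bit_code_sets("漢字")          # Kanji needs a multi-octet code set
```

An empty result indicates that the data needs one of the other code sets listed above, such as ISO 2022/JIS or ISO/IEC 10646.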

When using the ISO 2022 and JIS transmission code sets, code extension techniques comply with ISO 2022, and the following should be applied:

Graphic characters
Only the G0 set shall be used. Invocation shall not be used. The G0 set shall be considered invoked in columns 2 to 7. The escape sequences registered in the ISO International Register of Character Sets shall be used with the Escape Sequence. The announce sequence shall be omitted. The alphabetic character set, defined in JIS X 0201, designates the initial shift state.
Control characters
Control character set in JIS X 0201 shall be designated in the C0 set.
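The effect of designating sets to G0 with escape sequences can be seen in a short sketch. It assumes Python's `iso2022_jp` codec, which follows the same ISO 2022 technique but begins and ends in ASCII, not in the JIS X 0201 alphabetic set named above, so it approximates this profile without conforming to it.

```python
text = "漢字"
encoded = text.encode("iso2022_jp")

# ESC $ B designates JIS X 0208 to G0; ESC ( B designates ASCII back to G0.
kanji_designation = b"\x1b$B"
ascii_designation = b"\x1b(B"

starts_with_kanji_escape = encoded.startswith(kanji_designation)
ends_in_ascii_state = encoded.endswith(ascii_designation)

# Only G0 is used, so every octet stays within seven bits.
seven_bit_only = all(octet < 0x80 for octet in encoded)

decoded = encoded.decode("iso2022_jp")
```

Because no shift functions or G1 to G3 sets are involved, the stream stays seven-bit clean and the current character set is determined solely by the most recent designation escape sequence.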

Mapping Between Character Sets and an Exchange Format

When using UCS (EXFOR-3) as an exchange format, each character in the following collections should be converted to or from the code point specified by the corresponding table of Annex 3 of JIS X0221 (EXFOR-3):

JIS X0201: Table 1 and Table 2
Non-Kanji characters in JIS X0208: Table 3
Non-Kanji characters in JIS X0212: Table 4.
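A conversion of this kind can be sketched as follows. The mapping tables here are those built into Python's `shift_jis` and `euc_jp` codecs, which are assumed to agree with the JIS X0221 annex tables for the characters shown but are not the tables themselves. The function name is hypothetical.

```python
def ucs_code_points(data: bytes, encoding: str) -> list[str]:
    """Decode bytes with the given codec and list the resulting UCS code points."""
    return [f"U+{ord(ch):04X}" for ch in data.decode(encoding)]

# JIS X0201 Katakana: the single octet 0xB1, as carried in Shift-JIS.
katakana = ucs_code_points(b"\xb1", "shift_jis")

# JIS X0208 non-Kanji: hiragana at row 4, cell 2, as carried in EUC-JP.
hiragana = ucs_code_points(b"\xa4\xa2", "euc_jp")
```

In this sketch the JIS X0201 Katakana octet maps to the halfwidth form U+FF71, while the JIS X0208 hiragana maps to U+3042.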

Character Set Profile for SPIRIT SQL

The following character sets are supported by SPIRIT SQL:

Alphanumeric character set

Name:
SIMPLE_LATIN

Character Set Repertoire:
ISO/IEC 646 (I18N-2)

Form-of-use:
Implementation-defined.

Default Collating Sequence:
Implementation-defined.

Latin-1 character set

Name:
LATIN1

Character Set Repertoire:
This character set consists of the 191 graphic characters defined in ISO 8859-1 (I18N-1).

Form-of-use:
The coded representation of each character by a single 8-bit byte, with no designation escape sequences for other character sets.

Default Collating Sequence:
Implementation-defined.

Latin-2 character set

Name:
LATIN2

Character Set Repertoire:
This character set consists of the 191 graphic characters defined in ISO 8859-2 (I18N-5).

Form-of-use:
The coded representation of each character by a single 8-bit byte, with no designation escape sequences for other character sets.

Default Collating Sequence:
Implementation-defined.

Japanese Katakana character set

Name:
JAPANESE_KATAKANA

Character Set Repertoire:
JIS X0201 (I18N-4)

Form-of-use:
Implementation-defined.

Default Collating Sequence:
Implementation-defined.

Japanese Kanji character set

Name:
JAPANESE_KANJI

Character Set Repertoire:
JIS X0208 (I18N-3)

Form-of-use:
Implementation-defined.

Default Collating Sequence:
Implementation-defined.

Japanese all-in-one character set

Name:
JAPANESE

Character Set Repertoire:
JIS X0201 (I18N-4) + JIS X0208 (I18N-3) + ISO/IEC 646 (I18N-2)

Form-of-use:
Implementation-defined.

Default Collating Sequence:
Implementation-defined.
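The profile above can be summarized as a small data structure. This is a sketch for illustration only: the class and variable names are hypothetical, and the helper simply counts the code positions that an ISO 8859 part assigns to graphic characters (columns 2 to 7 without DELETE, plus columns 10 to 15) to show where the figure of 191 comes from.

```python
from dataclasses import dataclass

IMPL = "implementation-defined"
SINGLE_BYTE = "single 8-bit byte per character, no designation escape sequences"

@dataclass(frozen=True)
class SqlCharacterSet:
    name: str
    repertoire: str
    form_of_use: str
    default_collation: str = IMPL

PROFILE = {cs.name: cs for cs in (
    SqlCharacterSet("SIMPLE_LATIN", "ISO/IEC 646 (I18N-2)", IMPL),
    SqlCharacterSet("LATIN1", "ISO 8859-1 (I18N-1)", SINGLE_BYTE),
    SqlCharacterSet("LATIN2", "ISO 8859-2 (I18N-5)", SINGLE_BYTE),
    SqlCharacterSet("JAPANESE_KATAKANA", "JIS X0201 (I18N-4)", IMPL),
    SqlCharacterSet("JAPANESE_KANJI", "JIS X0208 (I18N-3)", IMPL),
    SqlCharacterSet(
        "JAPANESE",
        "JIS X0201 (I18N-4) + JIS X0208 (I18N-3) + ISO/IEC 646 (I18N-2)",
        IMPL,
    ),
)}

def iso8859_graphic_positions() -> list[int]:
    """Code positions holding graphic characters in an ISO 8859 part."""
    return list(range(0x20, 0x7F)) + list(range(0xA0, 0x100))

graphic_count = len(iso8859_graphic_positions())
```

Only LATIN1 and LATIN2 fix the form-of-use; for the other four character sets an application cannot assume a particular byte representation.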

Footnotes

1. Although ISO has defined a universal code set, ISO/IEC 10646, most existing platforms do not use this for their native code set. It is anticipated that transition to pervasive use of ISO/IEC 10646 as the native encoding across all platform implementations will take some time.
2. This is needed for encoding numerical data embedded in file headers.
3. This is needed for transferring files via telecommunication lines.
4. ISO/IEC 10646 represents SPIRIT's future direction.
