1.3 Data Types and Host Representations


Different host computers have different ways of representing data internally. Some machines are “big-endian”, meaning the high-order byte of an integer is stored first in memory, while others are “little-endian”, meaning the low-order byte is stored first.
Data that are to be transferred between these machines must be byte-swapped. Most machines use the IEEE floating point standard, but DIGITAL VAXes and Alphas running the VMS operating system have their own standard. Some of the IEEE-format machines are byte-swapped relative to each other. Data transferred between these machines must be converted as well.
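
To make the byte-order difference concrete, the following is a minimal sketch in C, for illustration only. The function names (host_is_little_endian, swap32, from_big_endian32) are hypothetical and are not part of any VICAR library. It shows how four bytes written by a big-endian host can be turned into a native 32-bit integer on either kind of machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if this host stores the low-order byte first (little-endian). */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);      /* look at the byte at the lowest address */
    return first == 1;
}

/* Reverse the order of the four bytes in a 32-bit integer. */
uint32_t swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Interpret four bytes written by a big-endian host as a native integer. */
uint32_t from_big_endian32(const unsigned char bytes[4])
{
    uint32_t raw;
    assert(bytes != NULL);
    memcpy(&raw, bytes, 4);         /* bytes exactly as they sit in the file */
    return host_is_little_endian() ? swap32(raw) : raw;
}
```

On a big-endian host the bytes are already in native order, so from_big_endian32 returns them unchanged; on a little-endian host they are swapped first. The sketch covers integers only and does not address the floating-point differences described above.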

1.3.1 VICAR File Representations

Conversion among hosts would be greatly simplified if all data were stored in ASCII instead of binary. However, that is inefficient in both time and space for image data. Image data must be stored in a binary representation. The question is, which one?
A standard, canonical representation could be chosen, such as Sun format: big-endian, IEEE floating point. That would simplify the file format, but would lead to inefficient operation on machines with different native formats. For processing done locally on a VAX, every pixel would have to be converted from or to Sun format every time it was read in or written out, at every processing step. There wouldn't be enough coffee in the world to keep you awake while waiting. Moreover, due to the huge quantity of existing images written in VAX format, the canonical representation would have to be VAX format, which is not desirable in the long run.
Since most processing is done locally on one machine, and transfers between machine architectures are comparatively less frequent, the solution is to use the native format of whatever machine you are running on, and to identify that machine in the image label. That way, local operations are done efficiently, and conversion is done only when switching machines.
Applications must be able to do data format translations automatically. In order to ease the burden, the following conventions have been adopted:

Applications shall be able to read files from any host representation.
Applications shall normally write files in the native host representation of the machine on which they are currently running.
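
The two conventions above can be sketched in C for 16-bit integer pixels. This is only an illustration under stated assumptions: the int_format tag and the functions native_int_format, format_for_new_file, and read_halfwords are hypothetical names invented here, not the actual label items or library routines. The reader accepts either byte order and always delivers native values, while the writer always tags new files with the native format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tag for the integer byte order recorded with a file. */
typedef enum { INT_LOW_FIRST, INT_HIGH_FIRST } int_format;

/* Probe the byte order of the machine this code is running on. */
int_format native_int_format(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1 ? INT_LOW_FIRST : INT_HIGH_FIRST;
}

/* Writer side: new files are always tagged with the native format. */
int_format format_for_new_file(void)
{
    return native_int_format();
}

/* Reader side: accept either format and deliver native 16-bit values. */
void read_halfwords(const unsigned char *file_bytes, size_t count,
                    int_format file_format, int16_t *out)
{
    size_t i;
    assert(file_bytes != NULL && out != NULL);
    for (i = 0; i < count; i++) {
        /* Pick the low and high bytes according to the file's byte order. */
        unsigned lo = file_bytes[2 * i + (file_format == INT_LOW_FIRST ? 0 : 1)];
        unsigned hi = file_bytes[2 * i + (file_format == INT_LOW_FIRST ? 1 : 0)];
        uint16_t bits = (uint16_t)((hi << 8) | lo);
        /* Reinterpret as signed; assumes a two's-complement host. */
        memcpy(&out[i], &bits, sizeof bits);
    }
}
```

Because the reader builds each value from individual bytes, the same code yields the same numbers on any host, whichever byte order the file was written in.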

Placing the burden only on reading greatly simplifies the writing, while still ensuring that the translations will take place in all cases. Some special-purpose applications may choose to write in a non-native format on occasion; however, all applications must be able to read all formats, without exception.
The Run-Time Library relieves most of this burden. When the standard I/O routines are called (x/zvread and x/zvwrit), the translations as stated above are performed automatically for the image data. The application merely calls x/zvread and it receives the data in the native format, ready for processing. It calls x/zvwrit, and the data is written out in the native format (the format the buffer is already in).
There are three cases where applications will have to do their own conversion:

Binary labels: both headers and prefixes must be converted. See Using Binary Labels.
Array I/O: Any program using Array I/O will get the data as it exists in the file, without any translation. Applications using Array I/O are responsible for doing their own data format translations on the data they read.
Convert OFF: It is possible for an application to turn off the RTL's automatic conversion. This should not normally be done, but is available for special cases. If this option is selected, the application must do its own translation.

The x/zvtrans family of RTL routines is used to translate. Do not attempt to write your own data format conversion routines, even if you think it's only byte-swapping. Although at the present time byte-swapping is the only integer conversion, this may not always be the case. Other integer representations exist, such as one's-complement and sign-magnitude, that cannot be translated by a simple byte swap. By having only one set of conversion routines, porting to a new platform with a different data format is easier. x/zvtrans translations are standardized and thoroughly debugged. They are coded to be efficient, especially for simple byte-swapping.
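
The claim that some integer representations cannot be fixed by a byte swap can be checked with a small C sketch. It is purely illustrative: the decoder names are hypothetical, it does not use or imitate x/zvtrans, and real applications should still call the RTL routines. It decodes the 16-bit patterns for -1 under three representations and shows that the sign-magnitude pattern never reads as -1 under two's complement, in either byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit word. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)(((unsigned)v << 8) | ((unsigned)v >> 8));
}

/* Decode a 16-bit two's-complement bit pattern. */
int decode_twos_complement16(uint16_t bits)
{
    return (bits & 0x8000) ? (int)((long)bits - 0x10000L) : (int)bits;
}

/* Decode a 16-bit one's-complement bit pattern (negative = all bits inverted). */
int decode_ones_complement16(uint16_t bits)
{
    return (bits & 0x8000) ? -(int)(~(unsigned)bits & 0x7FFFu) : (int)bits;
}

/* Decode a 16-bit sign-magnitude bit pattern (top bit = sign, rest = magnitude). */
int decode_sign_magnitude16(uint16_t bits)
{
    int magnitude = bits & 0x7FFF;
    return (bits & 0x8000) ? -magnitude : magnitude;
}

/* The value -1 has a different bit pattern in each representation, and
   byte-swapping the sign-magnitude pattern does not produce the
   two's-complement one. */
void show_byte_swap_is_not_enough(void)
{
    assert(decode_twos_complement16(0xFFFF) == -1);
    assert(decode_ones_complement16(0xFFFE) == -1);
    assert(decode_sign_magnitude16(0x8001) == -1);
    assert(decode_twos_complement16(0x8001) != -1);
    assert(decode_twos_complement16(swap16(0x8001)) != -1);
}
```

Supporting such a host would require a change in value encoding, not just byte order, which is one reason to keep all conversions behind a single set of routines.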
