Table of Contents

6 Appendix B: IBIS Internal File Formats
      6.1 IBIS-1 Format - A Review
            6.1.1 VICAR-Level File Format
            6.1.2 File Parameters
            6.1.3 Data Organization
      6.2 IBIS-2 Format
            6.2.1 VICAR-Level File Format
            6.2.2 File Parameters
            6.2.3 Data Organization
                  6.2.3.1 IBIS Segments
                  6.2.3.2 IBIS RECSIZE & BLOCKSIZE
                  6.2.3.3 IBIS Data Location - COFFSET
                  6.2.3.4 Guidelines for Creating a New IBIS-2 file from Scratch.

6 Appendix B: IBIS Internal File Formats

6.1 IBIS-1 Format - A Review

For the sake of comparison, we recall here the internal implementation of the old VAX-VMS based IBIS-1 format (including GRAPHICS-1). This information has been derived from the "IBISFIL.HLP" and "IBISGR.HLP" files used in the online VAX HELP on the subroutine libraries. We include the significant text of the help files here for reference only:

The IBIS TABULAR file format is a table of numbers. There are a number of columns all of a certain length in 4 byte real format. Usually real numbers are stored but sometimes four characters or an integer are stored in each real number space. GRAPHICS-1 files are always REAL*4.

The actual way the TABULAR and GRAPHICS-1 file is stored on disk is in a Vicar-2 image file. The Vicar format is byte, the number of samples is 512, and the number of lines is the number of blocks.

The first block (line) of a TABULAR file, after the Vicar label, contains the column length. The column length is in integer*4 format at the beginning and all of the rest of the block is zero. The columns are stored as one complete column followed by another. A column takes up at least one block and columns always start at the beginning of a block. This method of storing the columns allows for easy access to columns, but more difficult access to rows. The number of columns is calculated from the number of lines (stored in the Vicar label) and from the column length.

Currently one program (QUERY) uses the empty space in the first block to store labels and formats for each column. All of the IBIS programs that use tabular files may be upgraded to use this format in the future.

The GRAPHICS-1 file does not contain any structure information in the file data or label.

6.1.1 VICAR-Level File Format

An IBIS-1 TABULAR or GRAPHICS-1 file is a VICAR-2 format image, whose SYSTEM label indicates that the file is of BYTE format, 512 samples per line, and VAX-VMS host. There have been attempts to port the IBIS-1 subroutine library to other platforms (e.g., VIDS, and ASU's port of VICAR), so the possibility of other HOST label entries cannot be completely ruled out.

6.1.2 File Parameters

The first four bytes of the first VICAR image line of a TABULAR file are to be read in as a VAX-VMS ("LOW" byte-order) longword integer. This number indicates the number NR of rows-per-column. The number of columns in the file may be computed from the equations:

blocks_per_column = (NR + 127)/128 
NC = NL / blocks_per_column
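As an illustration, the two equations above can be sketched in Python. This is a minimal sketch, not part of the IBIS library: the function names are hypothetical, NR is assumed to be a signed 32-bit little-endian value, and NL is used exactly as it appears in the formula above.

```python
import struct

BLOCK_BYTES = 512                                   # one VICAR line (block) of an IBIS-1 file
ELEMENT_BYTES = 4                                   # every IBIS-1 element occupies 4 bytes
ELEMENTS_PER_BLOCK = BLOCK_BYTES // ELEMENT_BYTES   # 128 elements per block


def read_nr(first_block: bytes) -> int:
    """Decode NR from the first four bytes of the first image line.

    Assumption: the longword is a signed 32-bit integer in "LOW"
    (little-endian) byte order.
    """
    return struct.unpack("<i", first_block[:4])[0]


def ibis1_column_count(nr: int, nl: int) -> int:
    """Apply the two equations above using integer division."""
    if nr < 1:
        raise ValueError("NR must be at least 1")
    blocks_per_column = (nr + ELEMENTS_PER_BLOCK - 1) // ELEMENTS_PER_BLOCK
    return nl // blocks_per_column


# Example: a synthetic first block holding NR = 300, followed by zeros.
block = struct.pack("<i", 300) + bytes(BLOCK_BYTES - 4)
nr = read_nr(block)
nc = ibis1_column_count(nr, 12)
```

With NR = 300 each column needs 3 blocks, so an NL of 12 gives 4 columns under the formula as written.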

As mentioned above, the rest of the first block (line) of the image is either zero, or contains application-specific column-labeling information (which never became standard).

The GRAPHICS-1 file does not indicate the NC (Dimension) information; this must be known and specified by the user. Given the NC value, the NR may be computed from:

NR = (NL * 128) / NC

6.1.3 Data Organization

The actual column data of a TABULAR file starts on the second block of the image-data, and as mentioned in the help text above, is parcelled into 4-byte elements, which may be REAL, FULL, or possibly a 4-character array. Usually the data is in VAX-VMS floating point format, but there is no way of determining this from the contents or label of the file itself.

The data is composed of contiguous column data, beginning with the data of column #1, followed by that of column #2, etc. A column takes up at least one block, and columns always start at the beginning of a block.

A GRAPHICS-1 file stores, in order, all NC column elements of the first row in VAX-VMS REAL*4 format, followed by those of the second row, until the end of the file. There is no padding between rows.
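The two IBIS-1 layouts described in this section can be summarized as element-offset calculations. The following Python sketch is only an illustration derived from the description above; the function names are hypothetical, indices are 1-based, and offsets are measured in bytes from the start of the image data (that is, after the VICAR label).

```python
BLOCK_BYTES = 512      # one VICAR line (block)
ELEMENT_BYTES = 4      # every IBIS-1 element occupies 4 bytes
ELEMENTS_PER_BLOCK = BLOCK_BYTES // ELEMENT_BYTES


def tabular_offset(row: int, col: int, nr: int) -> int:
    """Byte offset of element (row, col) in an IBIS-1 TABULAR file.

    Block 0 holds the column length; column data starts at block 1,
    and every column begins on a block boundary.
    """
    blocks_per_column = (nr + ELEMENTS_PER_BLOCK - 1) // ELEMENTS_PER_BLOCK
    first_block = 1 + (col - 1) * blocks_per_column
    return first_block * BLOCK_BYTES + (row - 1) * ELEMENT_BYTES


def graphics_offset(row: int, col: int, nc: int) -> int:
    """Byte offset of element (row, col) in a GRAPHICS-1 file.

    Rows are stored one after another with no padding between them.
    """
    return ((row - 1) * nc + (col - 1)) * ELEMENT_BYTES
```

For example, with NR = 300 the second TABULAR column starts at block 4 (byte 2048), while in a GRAPHICS-1 file with NC = 3 the second row starts at byte 12.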

6.2 IBIS-2 Format

6.2.1 VICAR-Level File Format

An IBIS-2 file is a VICAR-2 format file, possessing an IBIS property label. The SYSTEM label's 'BHOST' value indicates the internal representation of the column data types, as the data is stored in the binary header of the file. Thus there is no explicit assumption about byte-orders, floating point representations, etc. In general, for IBIS-2 files the convention of "read foreign, write local" applies.

6.2.2 File Parameters

The NR and NC of the IBIS-2 file are explicitly specified in the "IBIS" property label. The format of each column is specified by the "FMT_XXX" and the "ASCII_LEN" property labels. The overall file organization is specified by the "ORG" properties, and the column data locations are specified by the COFFSET array, as described below.

6.2.3 Data Organization

The column data is stored in the binary header of the VICAR-2 file. The image data, if any, is therefore completely separate, meaning that theoretically a file could contain both a VICAR image and a set of IBIS-2 formatted columns. However, if an image is present, it will make it difficult for programs to increase the physical size of the IBIS-2 data, as all of the image data will have to be moved down (the IBIS-2 lib does not currently support this).

6.2.3.1 IBIS Segments

An IBIS-2 file is composed of data "segments" whose size in bytes is specified by the SEGMENT property in the IBIS label. It may be thought of as the smallest logical building block of an IBIS file. For alignment purposes, SEGMENT must obey the following SEGMENT-RULE:

The SEGMENT size must be either a divisor or a multiple of 8.

For ROW oriented files a segment must be at least large enough to contain an entire row, and for COLUMN files a segment must be large enough to contain a complete BYTE format column of NR rows. To minimize wasted space, it is suggested (but not required) that SEGMENT be a power of 2.
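The SEGMENT-RULE and the minimum-size requirement can be checked mechanically. The sketch below is a hypothetical helper, not a library routine; `min_bytes` stands for the row size (ROW files) or NR (COLUMN files) and is supplied by the caller.

```python
def satisfies_segment_rule(segment: int) -> bool:
    """True if SEGMENT is either a divisor or a multiple of 8."""
    return segment > 0 and (8 % segment == 0 or segment % 8 == 0)


def segment_is_valid(segment: int, min_bytes: int) -> bool:
    """Combine the SEGMENT-RULE with the minimum-size requirement."""
    return satisfies_segment_rule(segment) and segment >= min_bytes
```

Divisors of 8 (1, 2, 4, 8) and multiples of 8 (16, 24, ...) pass the rule; a value such as 12 does not.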

6.2.3.2 IBIS RECSIZE & BLOCKSIZE

The VICAR RECSIZE has little to do with the internal IBIS organization, and there are no restrictions on the NS value, Pixel format, etc. However, in some cases only a portion of each vicar-record may be used, which is called a block. The size of a block is specified by the IBIS BLOCKSIZE parameter (which shall always be less than or equal to the VICAR 'RECSIZE' label value). As far as the IBIS format is concerned, the bytes of each vicar line in the binary header past BLOCKSIZE do not exist. For the sake of improving efficiency of file access, the SEGMENT size and the BLOCKSIZE are related by the following BLOCKSIZE-RULE:

BLOCKSIZE must be either a divisor or a multiple of SEGMENT.

There are no other restrictions on SEGMENT and BLOCKSIZE. To minimize wasted space from padding, it is suggested (but not required) that the physical recordsize be a power of 2.
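A reader that wants to validate a file before using it could test the BLOCKSIZE-RULE together with the RECSIZE bound along the following lines. This is a minimal sketch with a hypothetical function name.

```python
def satisfies_blocksize_rule(blocksize: int, segment: int, recsize: int) -> bool:
    """True if BLOCKSIZE <= RECSIZE and BLOCKSIZE is either a divisor
    or a multiple of SEGMENT."""
    if blocksize <= 0 or segment <= 0:
        return False
    if blocksize > recsize:
        return False
    return segment % blocksize == 0 or blocksize % segment == 0
```

For instance, BLOCKSIZE=512 with SEGMENT=1024 (a divisor) and BLOCKSIZE=512 with SEGMENT=64 (a multiple) are both acceptable, whereas BLOCKSIZE=512 with SEGMENT=48 is not.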

6.2.3.3 IBIS Data Location -- COFFSET

The data for a given column may occur anywhere in the file, and is determined by the COFFSET array in the IBIS property label. There is one entry in the label item for each column, in the logical column order. The use of offsets for column locations has the benefit of avoiding I/O intensive in-file data shuffling whenever a column is created, deleted or renumbered. There are no restrictions on the values of COFFSET, other than obviously requiring that no two columns have COFFSET's pointing to the same part of the file.

For data element addressing within a file, the COFFSET array is interpreted differently (but analogously) for files with ORG='ROW' and ORG='COLUMN'. The 'byte_offset' values described below refer to the offsets from the start of the binary header AS IF each vicar-record only consisted of BLOCKSIZE bytes. In other words, the line and sample coordinates of the data within the binary header are given by:

line = (byte_offset  /  BLOCKSIZE) + 1
samp = (byte_offset MOD BLOCKSIZE) + 1
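In Python the same conversion can be written with `divmod`. The function name is hypothetical; line and sample are 1-based, as in the formulas above.

```python
def line_and_sample(byte_offset: int, blocksize: int) -> tuple[int, int]:
    """Convert a 0-based byte_offset into 1-based (line, samp)
    coordinates within the binary header."""
    line0, samp0 = divmod(byte_offset, blocksize)
    return line0 + 1, samp0 + 1
```

With BLOCKSIZE=512, byte_offset 0 maps to (line 1, samp 1) and byte_offset 515 maps to (line 2, samp 4).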

ORG=ROW: In this case the first segment contains the first row, the second segment the second row, etc., so that SEGMENT must be large enough to contain a single row. The COFFSET array refers to the byte offsets, relative to the segment boundary, of the successive column values. In other words, the byte_offset from the start of the binary header to the start of the element (row, col) is given by the formula (using FORTRAN style indexing):

byte_offset(row,col) = (row-1)*SEGMENT + COFFSET(col)
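A short sketch of the ORG=ROW formula follows. The function name and the sample values are hypothetical; row and col are 1-based as in the formula, so COFFSET(col) becomes `coffset[col - 1]` in a Python list.

```python
def row_org_byte_offset(row: int, col: int, segment: int, coffset: list[int]) -> int:
    """byte_offset of element (row, col) for an ORG=ROW file.

    coffset holds per-column byte offsets relative to the start of a segment.
    """
    return (row - 1) * segment + coffset[col - 1]


# Example: three columns at byte offsets 0, 4 and 8 inside a 16-byte segment.
example = row_org_byte_offset(row=3, col=2, segment=16, coffset=[0, 4, 8])
```

Here the third row starts at byte 32, and the second column sits 4 bytes into that segment, giving a byte_offset of 36.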

ORG=COLUMN: This is similar to the old IBIS-1 format, in that all of the data for a single column is stored in contiguous segments. There are as many segments in a single column as there are bytes in a single element of that column (we shall call this the DATA_SIZE), and so SEGMENT need only be large enough to contain a single column of BYTE data. The COFFSET array here refers to the number of SEGMENT length segments to the start of the column data for row #1. Thus the analogous formula for the byte_offset to the start of the element (row, col) is:

byte_offset(row,col) = COFFSET(col)*SEGMENT + (row-1)*DATA_SIZE(col)
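The ORG=COLUMN formula can be sketched the same way. Again the function name and sample values are hypothetical, and indices are 1-based; here COFFSET counts whole segments rather than bytes.

```python
def column_org_byte_offset(row: int, col: int, segment: int,
                           coffset: list[int], data_size: list[int]) -> int:
    """byte_offset of element (row, col) for an ORG=COLUMN file.

    coffset[col - 1] is the number of SEGMENT-length segments preceding
    the column; data_size[col - 1] is the element size in bytes.
    """
    return coffset[col - 1] * segment + (row - 1) * data_size[col - 1]


# Example: two 4-byte columns followed by an 8-byte column, SEGMENT = 1024.
example = column_org_byte_offset(row=10, col=2, segment=1024,
                                 coffset=[0, 4, 8], data_size=[4, 4, 8])
```

The second column begins 4 segments (4096 bytes) into the binary header, and row 10 lies 9 elements of 4 bytes further on, giving 4132.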

In general, the numeric values of the columns are stored in the same manner as the equivalent VICAR pixel format value corresponding to that column, using the BHOST integer and real representations specified in the VICAR system label. The ASCII columns are stored in the null-delimited C-string format. Note that this means the DATA_SIZE of an "An" column is actually (n+1) bytes long to allow for the null.

6.2.3.4 Guidelines for Creating a New IBIS-2 file from Scratch.

There are many ways of constructing IBIS-2 files to satisfy the rules listed above. We outline here one general method which will guarantee that the files will be valid (Note: a program designed to read IBIS-2 files without the aid of the subroutine library should not assume that this or any other particular method was used to construct the IBIS file).

Step 0: Choose the size parameters NR and NC for the file, as well as the formats of the columns C1,...C<NC>. Also choose the BHOST representation (typically, the NATIVE host) of the data, and the organization of the file by ROW or by COLUMN.

Step 1: Determine the datasizes of each column in terms of the BHOST representation (recalling that "A<n>" ascii data is stored in (n+1) bytes).
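Step 1 might be sketched as follows. The size table is an assumption: it lists the byte sizes commonly associated with the VICAR pixel formats on typical hosts, and should be replaced by the actual sizes for the chosen BHOST. The function name is hypothetical.

```python
import re

# Assumed element sizes in bytes for common hosts (not taken from this document).
ASSUMED_PIXEL_SIZES = {"BYTE": 1, "HALF": 2, "FULL": 4, "REAL": 4, "DOUB": 8, "COMP": 8}


def data_size(fmt: str) -> int:
    """Return the stored size in bytes of one element of a column.

    "A<n>" ASCII columns occupy n + 1 bytes to allow for the
    terminating null of the C-string.
    """
    fmt = fmt.upper()
    match = re.fullmatch(r"A(\d+)", fmt)
    if match:
        return int(match.group(1)) + 1
    return ASSUMED_PIXEL_SIZES[fmt]   # raises KeyError for unknown formats


sizes = [data_size(f) for f in ("REAL", "A10", "DOUB")]
```

Under these assumptions the formats REAL, A10 and DOUB give datasizes of 4, 11 and 8 bytes.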

Step 2: Define a default format (typically, this would be the most frequently occurring format chosen in the set). When in doubt, use REAL.

Step 3: Perform a loop to determine a set of COFFSET values for each column so that the data elements, if placed at those offsets, will not overlap. The COFFSET values do not need to take byte or word alignment into account.

Step 4: Define the EXTENT of the file to be the largest COFFSET value computed, added to its corresponding column's datasize. This is the amount of space taken up by a complete row of data.
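One simple way to carry out Steps 3 and 4 is to pack the columns one after another. The sketch below does this for byte offsets, as used by ORG=ROW files; the same loop could be applied in units of segments for ORG=COLUMN files. The function name is hypothetical and sequential packing is only one of many valid choices.

```python
def pack_columns(data_sizes: list[int]) -> tuple[list[int], int]:
    """Assign non-overlapping COFFSET values and compute the EXTENT.

    Columns are laid out back to back with no alignment padding.
    """
    coffset = []
    next_free = 0
    for size in data_sizes:
        coffset.append(next_free)   # this column starts where the last one ended
        next_free += size
    # EXTENT: largest COFFSET plus its column's datasize. For sequential
    # packing this is simply the end of the last column.
    extent = max((off + size for off, size in zip(coffset, data_sizes)), default=0)
    return coffset, extent


coffset, extent = pack_columns([4, 11, 8])
```

For datasizes of 4, 11 and 8 bytes this yields COFFSET values 0, 4 and 15 and an EXTENT of 23 bytes.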

Step 5: Determine a LOWER-BOUND for the SEGMENT value:

ORG=COLUMN Files: The lower bound in bytes is given by the NR (as this would be the size of a BYTE column). Technically, this should also be multiplied by the BHOST size in bytes of a BYTE pixel.

ORG=ROW Files: The lower bound is given by the EXTENT of the file (the amount of space taken up by a single row).

Step 6: If there is image data to be appended to the IBIS binary header, determine its VICAR RECSIZE value in bytes. Otherwise, take some reasonable value such as 512.

Step 7: The choice of SEGMENT and BLOCKSIZE now depends on the size of this lower-bound:

LOWER-BOUND <= RECSIZE: Define SEGMENT=LOWER-BOUND, and make BLOCKSIZE the largest multiple of SEGMENT less than or equal to RECSIZE.

LOWER-BOUND > RECSIZE: Make BLOCKSIZE=RECSIZE and choose SEGMENT to be the smallest multiple of BLOCKSIZE greater than or equal to LOWER-BOUND.

Note: this is only a useful heuristic; it is not to be assumed that this is the only way that BLOCKSIZE and SEGMENT will be computed (for example, the "512" in step 6 could also be "1024", etc); IBIS readers may only assume that the BLOCKSIZE-RULE and the SEGMENT-RULE are satisfied, and that BLOCKSIZE<=RECSIZE.
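The two cases of Step 7 can be written as a small function. This sketch follows the heuristic literally and uses a hypothetical function name; it does not itself test the SEGMENT-RULE, so the resulting SEGMENT would still need to be checked against that rule separately.

```python
def choose_segment_and_blocksize(lower_bound: int, recsize: int) -> tuple[int, int]:
    """Apply the Step 7 heuristic and return (SEGMENT, BLOCKSIZE)."""
    if lower_bound < 1 or recsize < 1:
        raise ValueError("lower_bound and recsize must be positive")
    if lower_bound <= recsize:
        segment = lower_bound
        # Largest multiple of SEGMENT that still fits in one record.
        blocksize = (recsize // segment) * segment
    else:
        blocksize = recsize
        # Smallest multiple of BLOCKSIZE that is >= lower_bound (ceiling division).
        segment = -(-lower_bound // blocksize) * blocksize
    return segment, blocksize
```

With RECSIZE=512, a lower bound of 64 gives SEGMENT=64 and BLOCKSIZE=512, while a lower bound of 1000 gives BLOCKSIZE=512 and SEGMENT=1024.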

Step 8: Determine the number of VICAR lines in the binary header to contain the entire set of columns:

ORG=COLUMN Files: The NLB of the file should be at least

(EXTENT * SEGMENT + BLOCKSIZE - 1)/BLOCKSIZE

Extra lines simply permit space for additional columns within the file.

ORG=ROW Files: The NLB of the file should be at least

(NR * SEGMENT + BLOCKSIZE - 1)/BLOCKSIZE

Extra lines simply permit space for additional rows within the file.
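Both expressions in Step 8 are ceiling divisions of the total number of data bytes by BLOCKSIZE. A minimal sketch, with a hypothetical function name, where `count` is the EXTENT for ORG=COLUMN files and NR for ORG=ROW files:

```python
def minimum_header_lines(count: int, segment: int, blocksize: int) -> int:
    """Smallest number of binary header lines that holds count segments.

    count is EXTENT (ORG=COLUMN) or NR (ORG=ROW).
    """
    return (count * segment + blocksize - 1) // blocksize


# ORG=ROW example: 100 rows, SEGMENT=64, BLOCKSIZE=512.
row_lines = minimum_header_lines(100, 64, 512)
# ORG=COLUMN example: EXTENT of 16 segments, SEGMENT=1024, BLOCKSIZE=512.
column_lines = minimum_header_lines(16, 1024, 512)
```

The ROW example needs 6400 bytes and therefore 13 lines; the COLUMN example needs 16384 bytes and therefore 32 lines.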

Step 9: Create a standard VICAR image, using the desired BHOST, BINTFMT, and BREALFMT labels, with NLB lines and RECSIZE bytes, with appropriate FORMAT and NS values (by default, these should be BYTE). Install the NR, NC, ORG, FMT_DEFAULT, FMT_xxx, SEGMENT, BLOCKSIZE and COFFSET values into the "IBIS" property label, as specified in section 3.3.1. Add any optional GROUPS or UNITS, and initialize the column values, if desired.

Step 10: Go have a beer, you're done!