U.S. patent number 3,656,178 [Application Number 04/857,707] was granted by the patent office on 1972-04-11 for data compression and decompression system.
This patent grant is currently assigned to Research Corporation. Invention is credited to Paul A. D. De Maine, Gordon K. Springer.
United States Patent |
3,656,178 |
De Maine , et al. |
April 11, 1972 |
DATA COMPRESSION AND DECOMPRESSION SYSTEM
Abstract
A high speed, multistage, compressor-decompressor system for
processing arbitrary bit strings by reversibly removing redundant
information. Alphanumeric information is processed by Type 1
compression which involves removing patterns of contiguous bytes
and replacing each removed pattern by decompression information
which takes considerably less storage space, and Type 2 compression
which involves removing individual redundant bytes and constructing
a bit map identifying the location of the removed bytes. Numerical
information is processed by a compression technique involving
truncation, recursive differencing, sequence removal, packing, and
then utilizing the Type 1 and Type 2 compression which are used in
conjunction with alphanumeric information. The information which is
to be compressed is arranged in strings of bytes and any
information defining removal of redundant information from a string
is kept together with the string. As a result, each string is
self-defined in the sense that it contains all information needed
to decompress that string.
Inventors: |
De Maine; Paul A. D. (State
College, PA), Springer; Gordon K. (State College, PA) |
Assignee: |
Research Corporation (New York,
NY)
|
Family
ID: |
25326570 |
Appl.
No.: |
04/857,707 |
Filed: |
September 15, 1969 |
Current U.S.
Class: |
341/87 |
Current CPC
Class: |
H03M
7/3066 (20130101) |
Current International
Class: |
H03M
7/30 (20060101); G06f 007/06 () |
Field of
Search: |
340/172.5 |
References Cited
Other References
P. A. D. de Maine, B. A. Marron, and K. Kloss, The Solid System II:
Numeric Compression; The Solid System III: Alphanumeric Compression,
Nat. Bureau of Standards Technical Note 413, Aug. 15, 1967.
R.W. Bemer, Data Compression System, IBM Tech. Disc. Bull. Vol. 3,
No. 8, Jan. 1961.
|
Primary Examiner: Zache; Raulfe B.
Assistant Examiner: Chirlin; Sydney R.
Claims
We claim:
1. Method of utilizing a digital computer system including memory
and control sections comprising the steps of:
a. storing in the memory a string comprising a set of multibit
information units;
b. identifying and storing in the memory a LEXICON table comprising
Type 1 codes defined as units which are of the same format as the
stored string units and which do not occur in the stored
string;
c. searching the stored string for the presence of a plurality of
patterns of contiguous string units which patterns are repeated in
the string, and identifying such repeated patterns;
d. replacing each of the patterns of a plurality of repeated
patterns identified in the preceding step by a unique Type 1 code
from the LEXICON table;
e. storing in the memory decompression information associated with
the string and defining the replacement carried out in the
preceding step; and
f. storing in the memory a PCORD table containing one pattern from
each plurality of identical patterns which have been replaced by a
Type 1 code.
2. Method as in claim 1 including computing and storing in the
memory a savings ratio indicative of the saving in the length of
the stored string achieved through replacing by a Type 1 code
string pattern identical to patterns stored in the PCORD table to
thereby associate a savings ratio with each pattern stored in the
PCORD table.
3. Method as in claim 2 including: defining a maximum value for the
number of patterns that can be stored in the PCORD table; testing
before storing a pattern in the PCORD table if the PCORD table is
full; if the PCORD table is full, testing if the savings ratio of
the pattern to be stored in the PCORD table is more favorable than
the least favorable savings ratio of the patterns already in the
PCORD table; and, if the answer is yes, removing the pattern with
the least favorable savings ratio from the PCORD table and storing
therein the pattern with the more favorable savings ratio.
4. Method of utilizing a digital computer having memory and control
sections to decompress a string compressed by the method of claim 1
comprising:
a. storing the compressed string and the decompression information
in the memory;
b. identifying from the decompression information a Type 1 code and
the pattern replaced by it;
c. locating in the compressed string Type 1 codes identical to the
Type 1 code identified in the preceding step; and
d. replacing each Type 1 code located in the preceding step by the
pattern identified in the identifying step.
5. Method of utilizing a digital computer system including memory
and control sections comprising the steps of:
a. storing in the memory a string comprising a set of multibit
information units;
b. identifying and storing in the memory a LEXICON table comprising
Type 1 codes defined as units which are of the same format as the
stored string units and which do not occur in the stored
string;
c. storing in a PCORD table in the memory a list of PCORD patterns
each composed of a plurality of units of the same format as the
units of the stored string;
d. searching the stored string for a plurality of patterns each
composed of contiguous units and each identical to a PCORD
pattern;
e. identifying such patterns for replacement by Type 1 codes;
f. replacing each of the patterns of a plurality of repeated
patterns identified in the preceding step by a unique Type 1 code
from the LEXICON table; and
g. storing in the memory decompression information associated with
the string and defining the replacement carried out in the
preceding step.
6. Method as in claim 5 including defining prior to the searching
step if the stored string is to be subjected to slow mode
compression or to a fast mode compression, and -- if slow mode
compression is defined -- searching in the searching step for a
plurality of patterns each composed of contiguous units and
identifying such patterns for replacement by Type 1 codes without
reference to the PCORD patterns stored in the PCORD table, but --
if fast mode compression is defined -- searching in the searching
step only for repeating patterns identical to PCORD patterns from
the PCORD table.
7. Method of utilizing a digital computer system including memory
and control sections comprising the steps of storing in the memory
a string comprising a set of multibit information units;
identifying and storing in the memory a LEXICON table comprising
Type 1 codes defined as units which are of the same format as the
stored string units and which do not occur in the stored string;
searching the stored string for the presence of a plurality of
patterns of contiguous string units which patterns are repeated in
the string, and identifying such repeated patterns; replacing each
of the patterns of a plurality of repeated patterns identified in
the preceding step by a unique Type 1 code from the LEXICON table;
storing in the memory decompression information associated with the
string and defining the replacement carried out in the preceding
step; including in said identifying and storing step the substep of
identifying and storing in the LEXICON table Type 2 codes defined
as units which are of the same format as the stored string units
and which occur in the stored string at least a preselected number
of times, and including the additional steps of: searching the
stored string for string units identical to Type 2 codes stored in
the LEXICON portion of the memory in the preceding step;
constructing and storing in the memory a bit map identifying the
locations in the stored string of units identical to a Type 2 code
and occurring at least a preselected number of times in the string;
removing from the string the units identified in the preceding
step; and storing in the memory decompression information
associated with the string and comprising the bit map and one of
the removed string units.
8. Method of utilizing a digital computer system including memory
and control sections comprising the steps of:
a. storing in the memory a string comprising a set of multibit
information units;
b. identifying and storing in the memory a LEXICON table comprising
Type 1 codes defined as units which are of the same format as the
stored string units and which do not occur in the stored
string;
c. searching the stored string for the presence of a plurality of
patterns of contiguous string units which patterns are repeated in
the string, and identifying such repeated patterns;
d. replacing each of the patterns of a plurality of repeated
patterns identified in the preceding step by a unique Type 1 code
from the LEXICON table; and
e. storing in the memory decompression information associated with
the string and defining the replacement carried out in the
preceding step;
f. storing in the memory a PCORD table containing preselected Type
2 codes each of the same format as the units of the stored
string;
g. searching the stored string for units which are identical to a
Type 2 code and which occur in the string at least a preselected
number of times;
h. removing from the string units identified in the preceding
step;
i. constructing a bit map identifying the locations of the removed
units; and
j. storing in the memory decompression information associated with
the string and comprising the bit map and one of the removed string
units.
9. Method as in claim 8 including: defining prior to the step of
searching the stored string for units identical to Type 2 codes if
the stored string is to be subjected to fast mode compression or to
slow mode compression; if slow mode compression is defined
identifying and storing in said identifying and storing step in the
LEXICON table Type 2 codes defined as identical in format with the
units of the stored string and occurring in the string at least a
preselected number of times, searching the stored string for string
units identical to a Type 2 code from the LEXICON table of the
memory and occurring at least a preselected number of times in the
stored string, and then proceeding to the bit map constructing step
without recourse to the table of Type 2 codes; but if fast mode
compression is defined, then searching the stored string for units
identical to Type 2 codes from the PCORD table of Type 2 codes
without recourse to the LEXICON portion of the memory.
10. Method of utilizing a digital computer having memory and
control sections to decompress a string compressed by the method of
claim 8 comprising the steps of:
a. storing the compressed string and the decompression information
associated with the string in the memory;
b. identifying from the decompression information a Type 1 code and
the pattern replaced by it;
c. locating in the compressed string Type 1 codes identical to the
Type 1 code identified in the preceding step;
d. replacing each Type 1 code located in the preceding step by the
pattern identified in step b;
e. identifying from the decompression information a bit map and one
of the string units removed in conjunction with constructing the
bit map; and
f. storing the removed string unit in the places in the string
identified by the bit map and expanding the string accordingly.
11. Method of utilizing a digital computer system including memory
and control sections comprising the steps of:
a. storing in the memory a string comprising a set of multibit
information units;
b. storing in a PCORD table in the memory at least one PCORD
pattern composed of a plurality of units which are of the same
format as the units of the stored string;
c. searching the stored string for the presence of a plurality of
patterns each composed of contiguous string units and each
identical to a PCORD pattern, and identifying such string patterns
if any are found; and
d. replacing each of the string patterns identified in the
preceding step by a Type 1 code defined as a unit which is of the
same format as the string units but which does not occur in the
stored string.
12. Method of utilizing a digital computer having memory and
control sections to decompress a string compressed by the method of
claim 11 comprising the steps of:
a. identifying a Type 1 code which has replaced a pattern and the
pattern replaced by the code;
b. replacing each Type 1 code identical to the identified code and
occurring in the string by the pattern identified in the preceding
step; and
c. expanding the string by a number of units equal to the
difference between the number of units in the replaced patterns and
the number of Type 1 codes which have been replaced by
patterns.
13. Method of utilizing a digital computer system including memory
and control sections comprising the steps of:
a. storing in the memory a string comprising a set of multibit
information units;
b. identifying and storing in a LEXICON table in the memory Type 2
codes defined as units which are of the same format as the stored
string units and which occur in the stored string at least a
preselected number of times;
c. searching the stored string for units identical to a Type 2 code
and identifying the locations in the string of such identical
units;
d. constructing and storing in the memory a bit map identifying the
locations in the stored string of units identified in the preceding
step;
e. removing from the string the identified units; and
f. storing in the memory decompression information associated with
the string and comprising the bit map and one of the removed string
units.
14. Method of utilizing a digital computer having memory and
control sections to decompress a string compressed by the method of
claim 13 comprising the steps of:
a. storing the compressed string and the decompression information
in the memory;
b. identifying from the decompression information a bit map and a
string unit removed in conjunction with constructing the bit map;
and
c. inserting the string unit identified in the preceding step in
the locations in the string identified by the bit map and expanding
the string accordingly.
15. Method of utilizing a computer system including memory and
control sections comprising the steps of:
a. storing in the memory a string composed of a number of multibit
information units;
b. storing in a PCORD table in the memory Type 2 codes defined as
units which are of the same format as the stored string units;
c. searching the stored string for units which are identical to a
Type 2 code from the PCORD table and which occur at least a
pre-selected number of times, and identifying such units if any are
found;
d. constructing a bit map identifying the locations in the stored
string of units identified in the preceding step; and
e. removing from the string the identified units and storing in the
memory together with the string the bit map and one of the removed
string units.
16. Method of utilizing a digital computer having byte-oriented
memory and control sections to compress information supplied in the
form of a string comprising a set of bytes of information bits,
comprising the steps of:
a. using the value of each byte of the string to address a 256 byte
table in which each byte address corresponds to a unique one of the
256 possible bit configurations of a byte and each byte address
contains a count of the number of times the byte address has been
addressed;
b. storing an indication of the address of each byte address of the
256 byte table that has not been addressed in the course of step
(a) in a LEXICON table to compile thereby a set of Type 1 codes
which are bytes that do not occur in the string;
c. detecting the occurrence in the string of a group of
non-overlapping patterns, if any, of R contiguous bytes (R is an
integer greater than 1) which patterns are identical with each
other;
d. replacing each of the patterns detected in the course of step c.
with an identical Type 1 code selected from available Type 1 codes
in the LEXICON table and compressing the string to eliminate the
space vacated because of the difference in length between each such
replaced pattern of R bytes and the one byte Type 1 code replacing
it;
e. associating with the string decompression information comprising
the Type 1 code used in the course of step d and one of the
replaced patterns of R bytes;
f. changing the value of R and repeating steps c through e for as
long as both i. the combined length of the Type 1 codes used as
pattern replacements and the decompression information is less than
the combined length of the replaced patterns and ii. previously
unused Type 1 codes are available in the LEXICON table; and
g. storing a pattern from each group of patterns of R bytes which
has been deleted from the string in a PCORD table.
17. Method as in claim 16 including storing a pattern from each
group of patterns of R bytes which has been deleted from the string
in a PCORD TABLE; associating with each pattern in the PCORD table
a savings ratio indicative of the degree of compression of the
string resulting from the deletion of said pattern from the
string.
18. Method as in claim 17 including: limiting the capacity of the
PCORD table; checking whether the PCORD table is full when an
attempt is made to include therein a new pattern; and, in case the
PCORD table is full, storing the new pattern in the PCORD table and
deleting from the PCORD table the pattern having the lowest
associated savings ratio, but only if said lowest savings ratio is
lower than the savings ratio of the new pattern.
19. Method of utilizing a digital computer having memory and
control sections to decompress a string compressed by the method of
claim 16 comprising the steps of:
a. storing the compressed string and the decompression information
in the memory;
b. identifying from the decompression information a Type 1 code and
the pattern of R bytes replaced by it;
c. locating in the compressed string Type 1 codes identical to the
Type 1 code identified in the preceding step;
d. replacing each Type 1 code located in the preceding step by the
pattern of R contiguous bytes identified in step b; and
e. expanding the string by a number of bytes equal to the
difference between the number of bytes of the patterns replacing
the Type 1 codes and the number of replaced Type 1 codes.
20. Method of utilizing a digital computer having memory and
control sections to compress numeric information supplied in the
form of a first string comprising a set of words A, B, C, D, E. . .
. , each word containing a number, comprising the steps of:
a. recursively differencing by
i. adding the absolute value of the contents of the words of the
first string to obtain a first sum;
ii. generating from the first string a second string having the
same number of words, each word except the first having a value
equal to the difference between the correspondingly located word of
the first string and the next preceding word of the first string,
whereby the second string comprises words A, A-B, B-C, C-D, D-E, .
. . , and adding the absolute values of the words of the second
string to obtain a second sum;
iii. comparing the magnitudes of the first and the second sums and,
continuing to step (b) if the first sum is less than the second
sum, but continuing to substep (iv) if the magnitude of the first
sum is greater than or equal to the magnitude of the second
sum;
iv. generating from the second string generated in substep (ii) a
third string in the same manner as in substep (ii), whereby the
third string comprises words A, A-(A-B), (A-B)-(B-C), (B-C)-(C-D),
(C-D)-(D-E), ... , and repeating substeps i and ii by considering
each newly generated string as a second string and considering the
previously generated string as a first string;
b. detecting sequences of identical words;
c. determining if the replacement of such sequences by defined
decompression information would result in saving in string length
and proceeding to step (d) if saving is indicated but proceeding to
step (e) if no saving is indicated;
d. deleting each detected sequence of identical words and
associating with the string decompression information comprising
the value of the deleted word, the number of deleted words and the
address in the string prior to deletion at which the deleted
sequence started, and compressing the string to take up the space
vacated by the deleted words; and
e. packing the words of the string into double words, each double
word having a set of bits designating the total number of and a set
of bits storing values of words packed in the double word, each
word occupying a fixed number of bit positions, said fixed number
determined by the number of significant bits of the highest value
word packed in the double word.
21. Method as in claim 20 including converting floating point
numbers contained in the words A, B, C, D, E, . . . , into integer
numbers by a logical right shift truncation process.
22. Method as in claim 21 including: placing each integer number in
the string into a four byte word; searching the string to find a
minimum and maximum value of the words; storing said maximum and
minimum values; determining the median value by dividing the sum of
the minimum and maximum values by 2; recording said median value;
subtracting the recorded median value from each word in the string;
associating with the string decompression information defining the
above steps; and storing the decompression information.
23. Method as in claim 21 including: providing a value LSX
indicating the degree of accuracy at which the information stored
in the string is to be maintained; dividing LSX by 2 and storing
the result; finding the minimum value of the words in the string;
subtracting from each word of the string the minimum value and
dividing the difference by the value LSX over 2; and rounding by
removing all digits to the right of the decimal point in the words
leaving only the digits to the left of the decimal point.
24. Method as in claim 21 including: locating in the string
composed of integer numbers each contained in a word any patterns
of words which patterns are composed of a plurality of contiguous
words identical to each other; replacing each such pattern by two
new words one of which is a count of the number of the words
repeated in the pattern and the other one of which is a copy of the
word which is repeated in the pattern, and associating with the
string a third new word indicating the location in the string of
the pattern which is replaced by said two new words.
25. Method of utilizing a digital computer having memory and
control sections to decompress strings compressed by the method of
claim 20 comprising the steps of:
a. storing in the memory the packed string and its decompression
information;
b. unpacking the packed string double words by reference to the set
of bits designating the total number of words packed in the double
words and to the set of bits storing the values of words to
generate unpacked words;
c. if sequences of words were deleted in the course of compressing
the string, utilizing the decompression information to replace in
the string the deleted words; and
d. if second or subsequent strings were generated during
compression, carrying out the reverse of the second and subsequent
string generating step to regenerate the original string of words
A, B, C, D, E, . . . .
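The recursive differencing of claim 20, step (a), and its reversal in claim 25, step (d), can be sketched as follows. This is a minimal illustration, not the patented implementation. The function names and the `max_passes` cap are assumptions added here (the cap only guards against strings whose sums never increase), and the sign convention follows the written form A, A-B, B-C, ... of the claim.

```python
def difference_once(words):
    """Keep the first word; replace every other word by (previous - current)."""
    return [words[0]] + [words[i - 1] - words[i] for i in range(1, len(words))]


def recursive_difference(words, max_passes=8):
    """Difference repeatedly while the sum of absolute values does not grow.

    Returns (passes, string) where `passes` is the number of differencing
    steps applied. `max_passes` is a hypothetical safety limit.
    """
    current = list(words)
    passes = 0
    while passes < max_passes and len(current) > 1:
        candidate = difference_once(current)
        first_sum = sum(abs(w) for w in current)
        second_sum = sum(abs(w) for w in candidate)
        if first_sum < second_sum:
            break  # differencing stopped helping; keep the previous string
        current = candidate
        passes += 1
    return passes, current


def undo_difference(passes, words):
    """Reverse `passes` differencing steps to regenerate the original words."""
    current = list(words)
    for _ in range(passes):
        restored = [current[0]]
        for d in current[1:]:
            restored.append(restored[-1] - d)  # current = previous - difference
        current = restored
    return current
```

The pass count returned by `recursive_difference` plays the role of the decompression information: without it the string cannot be restored.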
Description
A program for a general purpose digital computer for storing,
retrieving or updating and purging a collection of items whose
individual members have been assigned descriptor sets. The program
comprises first translating the assigned descriptor sets into a
special digital linear informational representation form. Within
core storage of the digital computer a first index column array
(MARRAY) is provided. MARRAY consists of subarrays each having
EXECUTIVE POINTERS. The EXECUTIVE POINTERS in the index array
contain only an address of length &ADDLY bytes. The EXECUTIVE
POINTERS are arranged in the MARRAY subarray seriatim in accordance
with the associated M value and the addresses contain the beginning
address of a JARRAY subarray.
A plurality of subarray JARRAYS are provided in core storage which
JARRAYS each contain EXECUTIVE POINTERS including the next
descriptor in the JOB-LIST ITEM and the address of the next
subarray to be checked. The EXECUTIVE POINTERS in the column JARRAY
subarrays are arranged seriatim in accordance with the value of the
JARRAY EXECUTIVE POINTERS descriptors.
The address portion of the JARRAY EXECUTIVE POINTER points to a
subarray in a memory block resident in core. Said memory block
subarrays are similar to said JARRAY subarray.
The final column arrays (RFILE) in the memory block have addresses
which point to the address in the bulk storage of the digital
computer where the collection of items may be stored, retrieved,
updated or purged.
At the end of each subarray in the MARRAY, JARRAY and RFILE
subarrays, a link address is provided where the search may be
continued when a particular column subarray is filled. Core storage
also maintains composite addresses, namely EMPTY which gives the
exact location in core where the next newly created memory block is
to be stored, with the fast memory address (FMADD of EMPTY)
containing the beginning address of a transient portion of the core
storage array or the beginning of the memory block; CURRENT the
address in core storage where the memory block normally resides,
with the fast memory address portion of CURRENT (FMADD of CURRENT)
being the relative address of the first byte of the resident memory
block that is not part of an existing subarray, and; ADDRESS which
is an address extracted from a subarray during a search and which
may be a link address pointing to a continuance of the subarray; an
address extracted from an EXECUTIVE POINTER, or zero if the subpath
is missing. Additionally, there are provided in storage various
indicator elements of which one is called MSIGNAL which will give a
continuous updated indication of the status of the current
search.
After initializing the system, the program searches each descriptor
in the JOB-LIST ITEM starting in the MARRAY and continuing in the
JARRAY through the memory block until, in retrieval, one finds in
RFILE an address or addresses in bulk storage where the information
to be retrieved can be found or, in storage, an address where
information can be stored. Provision is made for automatically
updating storage, for eliminating certain descriptors by using
overrides, and for transferring memory blocks back and forth into
core with a minimum time loss.
The information in bulk storage is in compressed form and is
decompressed only after having been retrieved or prior to storage.
The compressor and decompressor take two forms; one is an
alphanumeric compressor and decompressor (SANPAK) and the other is
a numeric compressor and decompressor (SNUPAK).
The compressor and decompressor can be either in the form of a
program for a general purpose digital computer or may be a hard
wired special computer.
The alphanumeric compressor operates to compress a string of
digital signals by first scanning a segment of the string on a byte
to byte basis. A table (LEXICON) is provided in the core storage of
the digital computer. It has 256 byte positions. Each byte position
corresponds to one of the 256 bit configurations possible in a single byte
of information. In each of the appropriate byte positions of the
table is stored the number of times particular bit configurations
appear in the segment. Those bit configurations that do not appear
in the segment are segregated as possible Type 1 code bytes. If
there are Type 1 code bytes then the segment is scanned in multiple
byte segments, byte by byte to determine if there are any common
groupings of bytes for which a Type 1 code byte may be substituted.
If a common multibyte segment is found, a Type 1 code byte is
substituted for each occurrence in the string and the string is closed
up. At the head of the string is placed the common multi-byte
segment for which substitution has been effected, the Type 1 code
byte substituted, and the number of times that the Type 1 code byte
was substituted. This information is used upon decompression of the
string. The common multi-byte segments (PCORDS) may also be kept in
a special table called the PCORD TABLE and, where certain PCORDS
are expected to be found in a given information, it may not be
necessary to scan the string byte by byte with different common
multi-byte segments, but the string may be scanned with one of the
PCORDS from the PCORD TABLE so as to speed up the compression
process. The PCORD TABLE is continuously being updated with the
amount of saving achieved with particular PCORDS so that only those
PCORDS which achieve a saving may be used in successive compression
steps.
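The Type 1 procedure described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the SANPAK implementation. The function names, the pattern-length bounds, the longest-first search order, and the header cost of pattern length plus two bytes (one code byte and one count byte) are all assumptions made for the example.

```python
from collections import Counter


def best_pattern(data: bytes, r: int):
    """Return the r-byte pattern whose replacement saves the most bytes."""
    windows = Counter(data[i:i + r] for i in range(len(data) - r + 1))
    best, best_saving = None, 0
    for pattern, seen in windows.items():
        if seen < 2:
            continue
        n = data.count(pattern)          # non-overlapping occurrences
        saving = n * (r - 1) - (r + 2)   # bytes removed minus assumed header cost
        if saving > best_saving:
            best, best_saving = pattern, saving
    return best


def type1_compress(data: bytes, min_len: int = 2, max_len: int = 8):
    """Replace repeated patterns by byte values that do not occur in `data`."""
    counts = Counter(data)               # the role of the LEXICON table
    free_codes = [b for b in range(256) if counts[b] == 0]
    header = []                          # (code, pattern, times substituted)
    for r in range(max_len, min_len - 1, -1):
        while free_codes:
            pattern = best_pattern(data, r)
            if pattern is None:
                break
            code = free_codes.pop(0)
            header.append((code, pattern, data.count(pattern)))
            data = data.replace(pattern, bytes([code]))
    return header, data


def type1_decompress(header, data: bytes) -> bytes:
    """Undo the substitutions in reverse order of application."""
    for code, pattern, _times in reversed(header):
        data = data.replace(bytes([code]), pattern)
    return data
```

Undoing the substitutions in reverse order matters because a later pattern may itself contain a code byte introduced by an earlier substitution.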
Additionally, where this type of compression has been utilized to
its fullest, or where it cannot be used because there are no Type 1
codes available, Type 2 compression is effected. In this type of
compression, where a particular byte appears more than 34 times in
the first 256 bytes in the string, these common bytes are removed
from the string and a bit map is placed at the head of the string
to show where the bytes have been removed.
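A rough sketch of Type 2 compression for a single segment follows. The 256-byte window and the threshold of 34 occurrences come from the text. The function names, the bit order within the map (most significant bit first), and the choice to remove only the single most frequent byte are assumptions made for illustration.

```python
from collections import Counter

SEGMENT = 256    # bytes covered by one bit map (from the text)
THRESHOLD = 34   # a byte must occur more than this many times (from the text)


def type2_compress(segment: bytes):
    """Remove the most frequent byte and record its positions in a bit map.

    Returns (value, bitmap, kept) or None when no byte is frequent enough.
    """
    if len(segment) > SEGMENT:
        raise ValueError("segment longer than one bit map can describe")
    if not segment:
        return None
    value, count = Counter(segment).most_common(1)[0]
    if count <= THRESHOLD:
        return None
    bitmap = bytearray((len(segment) + 7) // 8)
    kept = bytearray()
    for i, b in enumerate(segment):
        if b == value:
            bitmap[i // 8] |= 0x80 >> (i % 8)   # mark a removed position
        else:
            kept.append(b)
    return value, bytes(bitmap), bytes(kept)


def type2_decompress(value: int, bitmap: bytes, kept: bytes) -> bytes:
    """Reinsert `value` at every position flagged in the bit map."""
    length = len(kept) + sum(bin(b).count("1") for b in bitmap)
    remaining = iter(kept)
    out = bytearray()
    for i in range(length):
        if bitmap[i // 8] & (0x80 >> (i % 8)):
            out.append(value)
        else:
            out.append(next(remaining))
    return bytes(out)
```

With a full 256-byte segment the bit map occupies 32 bytes, so removal pays off only when the removed byte is frequent, which is consistent with the threshold quoted in the text.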
The numeric compressor compresses digital numeric strings by
converting the strings of numeric information into integers,
eliminating floating point exponents, differencing successive
integers seriatim, placing at the head of the string a number
indicating the number of differencing procedures, condensing
identical sequences in the string and placing information at the
head of the string showing the place where the identical sequences
have been condensed and packing all of the substring integers into
double words in an optimal fashion.
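The step of condensing identical sequences can be illustrated with the sketch below. It records, for each removed run, the repeated value, the number of words removed, and the address in the original string where the run started, as claim 20, step (d), describes. The function names and the `min_run` threshold are assumptions. The threshold of four stands in for the saving test, on the assumption that each of the three recorded items costs one word.

```python
def remove_runs(words, min_run=4):
    """Delete runs of identical words; return (runs, remaining words).

    Each entry of `runs` is (value, count, start address before deletion).
    """
    kept, runs = [], []
    i = 0
    while i < len(words):
        j = i
        while j < len(words) and words[j] == words[i]:
            j += 1
        if j - i >= min_run:
            runs.append((words[i], j - i, i))
        else:
            kept.extend(words[i:j])   # run too short to be worth removing
        i = j
    return runs, kept


def restore_runs(runs, kept):
    """Reinsert removed runs in ascending order of original address."""
    out = list(kept)
    for value, count, start in runs:
        out[start:start] = [value] * count
    return out
```

Because the runs are recorded in ascending address order, each reinsertion finds everything before its start address already restored.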
The search procedure and the compression techniques are all
integrated into a single storage, retrieval, updating and purging
system which has been called the SOLID SYSTEM.
BACKGROUND OF THE INVENTION
The invention is in the field of data storage and retrieval systems
and particularly in the field of large scale systems of this
nature. It also relates to data compression and decompression
systems.
In designing a large computer oriented data storage and retrieval
system it is desirable that the final product meet the following
design and performance specifications:
i. Storage, retrieval, updating and purging tasks must be
accomplished as fast as possible.
ii. The system must be independent of the information base.
iii. Components should be capable of being coded independently of
all the other components in the system.
iv. Programming a particular modification of a fully implemented
scheme or combining equivalent hardware components to meet the
varied needs of users, should be as simple as possible.
v. The system should be open ended to provide for future
modifications.
vi. The coded system should be as free of machine dependence as
possible to provide for easier translation to other computer
configurations.
Because size and scope would prohibit writing such a system as a
single program or creating a one piece hardware embodiment, the
system is organized on a component by component basis. Each
component should perform a single task in the overall scheme. For
example, one component can handle card input; another the output;
and so on for each separate task the system will perform.
To simplify recoding of a large system for another computer, it is
essential that a higher language be used or developed. The present
higher languages (e.g., FORTRAN, ALGOL, COBOL, SNOBOL, COMIT, etc.)
are not suitable for coding large retrieval or indexing systems
because they do not have the bit and byte manipulation capabilities
that are essential for efficient machine coding. A large system
should, therefore, have associated with it an open ended higher
language, such as ALLOCATE, which can grow as the system is
implemented. Thus, a fully implemented System can be coded in a
machine independent higher language that can provide the basis for
a retrieval language. The macro language provided in the IBM System
360 can provide a starting point for the language ALLOCATE.
Each component can be coded in the macro language. The central
concept is thus one of extensively nested macros incorporated into
the assembly language processor of the computer. In this way the
normal operations of the assembly language are extended with macro
instructions that perform the special operations needed. In the IBM
360 System the assembly language is called BAL. A programmer can
add, delete, or ignore certain components of the system to suit
specific needs. This design allows for the unrestricted growth of
the system and the retrieval language (ALLOCATE) by adding new
components to the system macro library.
Translation of a System defined in this manner to other computer
configurations can be greatly simplified by the use of the
component type system. For example, it is possible to translate
directly to FORTRAN IV by a suitable translator. The translation
can be performed component by component rather than by trying to
rewrite an entire system. Moreover, since the components are
independent of one another, only those components needed for the
particular application of SOLID need be translated. Programming
around deleted components becomes unnecessary.
In regard to the data compression part of the invention, the need
for effective compressors is obvious because it is always desirable
to reduce the number of information indicia required to represent
information of given content without affecting the information
content. Special recoding techniques that save storage or
transmission time, such as the "SQUOZE encoding" developed for the
Share 709 System, and the PREST scheme for the IBM 7094 have been
developed in the past and are in use but both are tied
inseparably to the particular data base or to a particular
hardware. It is desirable therefore to have a data compressing
system which is completely independent of the data base so as to
have unrestricted general purpose use and which in addition meets
the following design objectives:
i. By compression, increase the amount of information that can be
stored in mass storage or on magnetic tapes and other peripheral
devices.
ii. Increase the rate of transmission either from "slow" to "fast"
memory or between receiver/transmitter stations by transferring
compressed information.
iii. Automatically decompress the compressed information when it is
needed either by a computer (in fast-memory) or by users of the
system.
iv. Provide error checking procedures which will ensure that errors
in the transmitted compressed information will be found either before
or during compression.
v. Increase the efficiency of the computer system by decreasing the
time required for search and/or fetch operations.
SUMMARY OF THE INVENTION
The invention is in a data management system for manipulating large
amounts of information. In a particular form, the invention resides
in a data storage and retrieval system which, once it has
associated a set of descriptors to an item of information of any
size, is independent of the actual data base of that item of
information. That item of information can then be stored in bulk
memory, in compressed form, and the set of descriptors associated
with it can be stored, retrieved, updated or purged within a fixed
time which is independent of: the number of sets of descriptors
maintained by the invented system, the actual size of a particular
set of descriptors, or the type of search, retrieval, updating or
purging operation carried out. The advantage of this fixed time for
manipulating descriptors derives from the fact that they are
manipulated independently of the items of information in bulk
storage, and that an efficient novel manner of manipulating sets of
descriptors has been provided.
In particular, a very small portion of each set of descriptors is
kept in the fast memory of a computer incorporated into the
invented system such that only a small part of that fast memory is
occupied even though the total storage space required for all sets
of descriptors may exceed the fast memory capacity many times. The
remaining portions of all sets of descriptors are organized in
memory blocks of which one is at all times in fast memory but all
others are kept in virtual memory. The system ensures, through the
use of "continuance tables" which come into use before a search
associated with a particular set of descriptors goes into a memory
block, that the search will be completed within a single memory
block. Thus, for any number of memory blocks needed for a
particular great number of sets of descriptors, a search should
involve a transfer of only one block from virtual memory to fast
memory. The size of the bulk storage space occupied by the
information with which the descriptor sets are associated thus also
has no effect on the search speed.
Storage space in fast and in virtual memory is utilized efficiently
in that the need for reserving specific blocks of memory for a
particular use has been avoided. The invented system utilizes
arrays and subarrays which have no fixed location and which can
vary in size as needed in a manner not requiring the intervention
of a user of the system, but controlled in an optimum manner by a
system control package. Further, any available spaces within a
memory block which have been vacated by purged sets of descriptors
are used for creating new descriptor paths before attempting to
locate previously unused memory space. The control package for
overseeing the use of these vacated spaces operates by linking each
vacated space to another such space to create a continuous chain,
such that only the beginning of it need be kept track of. Once it
is determined that a new descriptor set is to be stored in a
particular memory block, the control package need only locate the
start of this chain of vacated spaces and then insert appropriate
descriptor information in the first available locations along the
link which can take that information. The newly occupied spaces are
then deleted from the link but the rest of the link, if any, closes
again around the deleted spaces.
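The chain of vacated spaces described above behaves much like a singly linked free list. The following Python sketch is an illustration only: the names `MemoryBlock`, `purge` and `store` are hypothetical, and all slots are assumed to be the same size so that the first link in the chain can always take the new information, which the specification does not require.

```python
class MemoryBlock:
    """Toy memory block whose vacated slots form a singly linked chain."""

    def __init__(self, size):
        self.slots = [None] * size   # descriptor data, or a chain link when vacated
        self.free_head = None        # only the beginning of the chain is tracked
        self.next_unused = 0         # first slot that has never been used

    def purge(self, index):
        """Vacate a slot and link it in at the head of the chain."""
        self.slots[index] = ("FREE", self.free_head)
        self.free_head = index

    def store(self, data):
        """Reuse a vacated slot if one exists, else take previously unused space."""
        if self.free_head is not None:
            index = self.free_head
            # Removing the slot closes the chain around it.
            _, self.free_head = self.slots[index]
        else:
            if self.next_unused >= len(self.slots):
                raise MemoryError("memory block is full")
            index = self.next_unused
            self.next_unused += 1
        self.slots[index] = data
        return index


block = MemoryBlock(4)
first = block.store("path-1")
second = block.store("path-2")
block.purge(first)
reused = block.store("path-3")   # lands in the slot vacated by path-1
```

In this sketch a purge and a store each touch only the head of the chain, so the cost does not depend on how many slots have been vacated.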
Storage space and retrieval time for the bulk storage information
are optimized because, before entering bulk storage, the bulk
information may be compressed into self-defining strings such that
each string has associated with it all information needed for
decompressing it. When compressed information is taken out of bulk
storage, it can be decompressed without referring to any additional
information associated with that particular string but stored
elsewhere.
While the compressor-decompressor portion of the invention is of
great importance to the efficiency of the data management system
referred to above, it is also of great utility in any situation
where data compression-decompression may be desirable, such as in
communication between various combinations of peripheral devices,
computer systems and subsystems and communication networks.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 1A, 1B and 1C illustrate a flow diagram of a macro SANPAKC
used in alphanumeric compression.
FIG. 2 is a flow diagram of a macro SANPAKD used in alphanumeric
decompression.
FIG. 3 is a generalized flow diagram of a macro SNUPAKC used in
numeric compression.
FIG. 4 is a flow diagram of step a of the macro SNUPAKC of FIG.
3.
FIG. 5 is a flow diagram of step b of the macro SNUPAKC of FIG.
3.
FIGS. 6A, 6B and 6C illustrate an expanded flow diagram of the macro
SNUPAKC of FIG. 3.
FIGS. 7A, 7B and 7C illustrate a flow diagram of a macro SNUPAKD
used in numeric decompression.
FIG. 8 shows the information at the head of a string prior to its
being supplied to the decompressor SNUPAKD of FIG. 7.
FIG. 9 is a table showing input format code and its meaning with
respect to the type of compression in the string.
FIG. 10 is a flow diagram of Part A of a macro SMEMORY which is in
a GLOBAL MEMORY component of the invented system.
FIG. 11 is a simplified flow diagram of a macro COPAK for use in
compression-decompression.
FIG. 12 is a diagrammatic showing of a typical array in storage.
FIG. 13 is a diagrammatic showing of the manner in which a JOB-LIST
Item is searched in an AUXILIARY FILE.
FIG. 14 is a diagram illustrating the hierarchical arrangement in
the design selected for the invented system.
FIG. 15 is a diagrammatic showing of CONTROL routines utilized in
the invented system.
FIG. 16 is a flow diagram showing the status of information called
MSIGNAL and used in the retrieval package of the invented
system.
FIG. 17 is a flow diagram showing a macro AUXFILE, used in search
procedure of a file called RFILE.
FIGS. 18A and 18B are a flow diagram of a macro called SCREEN and
used in search procedure.
FIGS. 19A and 19B are a flow diagram of a macro SUPERSCH used in
search procedure.
FIGS. 20A and 20B are a flow diagram of a macro called INSERT.
FIG. 21 is a flow diagram of a macro called MMATCH.
FIG. 22 is a flow diagram of a macro called STRATEGY.
FIG. 23 is a flow diagram of a macro called TBADD.
FIGS. 24A and 24B are a flow diagram of a macro called CREATE.
FIG. 25 is a flow diagram for a CONTROL package called SOLIDE.
FIG. 26 is a flow diagram for a CONTROL package called COPAKCO.
FIG. 27 is a flow diagram for a CONTROL package called COPAKAN.
FIG. 28 is a flow diagram for a CONTROL package called COPAKNU.
Preface
In order to facilitate initial orientation into this sophisticated
and multifaceted invention, the detailed description begins with a
brief exploration of the mathematical basis of associating sets of
descriptors with items of information for the purpose of creating a
versatile and flexible data management system. Under the heading
following that, an illustrative example is given of the invention
as a data storage and retrieval system using a particular
normalized form of these descriptors to access information items
stored in large scale bulk storage. Once the cooperation between
various portions of the invented system is indicated by means of
the illustrative example, these portions are explained in great
detail and particularity, with emphasis on their interrelation. The
detailed description concludes with a portion devoted to a data
compressing and decompressing system used in conjunction with the
invented data management system. The data management system is
referred to below as SOLID, and the data compressing and
decompressing system is referred to as COPAK.
A. Mathematical Basis of the SOLID System
Suppose that a particular document has associated with it the
following nine descriptors or designators:
A, B, C, D, E, F, G, H, and I.
To simplify matters, we shall suppose that there are no Type 2
over-rides. Perhaps, at this time, a simple description of the need
for Type 2 over-rides should be presented. There are four codes for
descriptors or designators which are reserved. These codes are as
follows:
0, 1, 2 and 3.
The last three are called Type 1, Type 2, and Type 3 over-rides
respectively. The 0 code for a designator indicates that there is
no information at that point in the information representation.
In the retrieval mode, the Type 1 override code indicates that any
non-zero designator at that particular point in the information
representation is to be accepted for retrieval purposes. In the
storage mode, that is when the information representation is being
stored, the Type 1 over-ride code is used to create a new class
with a 1 as a designator. It is thus possible to create a new
non-specific class by substituting a 1 for one or more of the
designators in an information representation.
The Type 2 over-rides are substantially similar to the Type 1
over-rides in the retrieval mode except that the Type 2 over-ride
indicates that any zero or non-zero designator in the Type 2
over-ride position is acceptable.
When utilized in the storage mode, a Type 2 override code for a
designator in an information representation means that the existing
designator must be replaced by another designator which can be
found in a separate table in storage (table AOVER2R).
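The retrieval-mode meaning of the reserved codes can be summarized in a few lines of Python. This is a minimal sketch with hypothetical function names; it covers only the 0, Type 1 and Type 2 codes as described above. The Type 3 over-ride is not defined in this passage, so the sketch simply treats it like an ordinary code, which is an assumption.

```python
NO_INFO, TYPE1, TYPE2, TYPE3 = 0, 1, 2, 3   # the four reserved designator codes


def designator_matches(query, stored):
    """Retrieval-mode comparison of one query designator with a stored one."""
    if query == TYPE1:
        return stored != NO_INFO      # any non-zero designator is accepted
    if query == TYPE2:
        return True                   # any zero or non-zero designator is accepted
    return query == stored            # ordinary codes must agree exactly (assumed)


def representation_matches(query, stored):
    """Compare two information representations position by position."""
    return len(query) == len(stored) and all(
        designator_matches(q, s) for q, s in zip(query, stored)
    )


# Hypothetical codes: 17 and 42 stand for ordinary designators.
hit = representation_matches([17, TYPE1, TYPE2], [17, 42, 0])
miss = representation_matches([17, TYPE1, TYPE2], [17, 0, 0])
```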
The translation routine, used for an entire collection of
documents, arranges the designators A, B, C, D, E, F, G, H, and I
in the form of a label as follows: ##SPC2##
Here the linear form is simply a shorthand way of writing the
square array. / and // are inserted for clarity. LABEL is called an
information representation (IR) of the particular designator-set.
Its elements are divided into the following three categories or
levels of disclosure.
Kernels: These lie on the principal diagonal (i.e., A, B, and
C).
Ci connectors: Lie above the principal diagonal (i.e., D, E, and
F).
Cii connectors: Lie below the principal diagonal (i.e., G, H, and
I).
In an information representation, IR, the designators cannot be
reclassified by any transformation. This means that IR's can be
manipulated by any transformation rules that leave the designators
in their assigned levels of disclosure. The mathematics of the
transformation rules and normalization are shown and fully
discussed in Appendices I and II of the book "Information
Retrieval: a Critical View" in the article by P.A.D. deMaine and B.
A. Marron, "The SOLID System I. A Method for Organizing and
Searching Files." The book was edited by Schecter and was published
by the Thompson Book Company in Washington, D. C. in 1967.
The elements in an IR can be assigned numbers, from a look-up table
for each level of disclosure, in the translation step. However, as
was discussed previously, four numbers have been assigned special
meanings, namely Type 1, Type 2 and Type 3 over-rides and, of
course, the zero which means no information for a designator.
Information representations can be expanded by replacing a single
kernel by a new IR. Contraction occurs when an information
representation is replaced by an IR with fewer kernels. These
properties of expansion and contraction permit the reclassification
of documents without having to reorganize the file.
Reclassification may be desirable because either the original
designator-set did not adequately describe the referenced
information, or, due to natural growth, new subclasses must be
created. Thus these information representations permit the
uninhibited growth of a retrieval system without incurring
redundancy or obsolescence.
The status of an information representation with respect to
expansion or contraction is indicated by the first bit-map
(B.sub.1) thus:
$$B_1 = (M / m_1, m_2, \ldots, m_M / L / l_1, l_2, \ldots, l_L) \qquad (2)$$
Here M is the number of nested representations in the IR; m.sub.i
is the number of kernels in the ith nested IR within the basic IR; L
and l.sub.i refer to the Type 2 over-rides. The first bit-map for
the above example, (1), is B.sub.1 =(1/3/0). Here we have assumed
that the kernels are single information representations and that no
Type 2 override codes are present. The second bit-map (B.sub.2) is
the binary projection of the linear form of LABEL. For example, if
all nine designators in the LABEL are not zero, B.sub.2 is:
$$B_2 = (111/11/1//11/1)_{\text{binary}} = (7/3/1//3/1)_{\text{decimal}} \qquad (3)$$
The second bit-map (B.sub.2) is constructed from the information
representation in its square-array form. The linear form of LABEL
is compressed by eliminating all zero designators. Terminal zeros
in B.sub.2 are omitted. For example, if E, G and I in (1) are zero,
then:
$$\mathrm{LABEL} = (ABC/D/F//H), \quad B_1 = (1/3/0), \quad B_2 = (7/2/1//1) \qquad (4)$$
It should be noted that the original information representation can
be constructed from LABEL and B.sub.2.
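The construction of LABEL and the second bit-map can be sketched in Python as follows. Because the square array itself is not reproduced here, the layout used below (D and E on the first diagonal above the principal one, F on the second; G and H on the first diagonal below, I on the second) is an assumption chosen to agree with (3) and (4). The function names are hypothetical, and a zero designator is represented by the integer 0.

```python
def diagonals(square):
    """Return the diagonals in the order: principal, above (outward), below (outward)."""
    n = len(square)
    principal = [[square[i][i] for i in range(n)]]
    above = [[square[i][i + k] for i in range(n - k)] for k in range(1, n)]
    below = [[square[i + k][i] for i in range(n - k)] for k in range(1, n)]
    return principal + above + below


def compress(square):
    """Build LABEL (zero designators removed) and B2 (one bit per designator)."""
    label, b2 = [], []
    for diagonal in diagonals(square):
        kept = "".join(d for d in diagonal if d != 0)
        if kept:
            label.append(kept)                 # empty diagonals vanish from LABEL
        bits = "".join("0" if d == 0 else "1" for d in diagonal)
        b2.append(int(bits, 2))                # decimal value of the bit pattern
    while b2 and b2[-1] == 0:                  # terminal zeros in B2 are omitted
        b2.pop()
    return label, b2


# E, G and I are zero, as in the example leading to (4).
square = [
    ["A", "D", "F"],
    [0,   "B", 0],
    [0,   "H", "C"],
]
label, b2 = compress(square)
print(label, b2)
```

Each entry of B2 records which positions of the corresponding diagonal were non-zero, which is the information needed to put the designators of LABEL back in place.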
The MOBILE CANONICALIZATION is used to transform the information
representation and its bit-maps to any of its equivalent forms,
for example, another form for (4) is:
LABEL = (BAC/DE//H); B.sub.1 = (1/3/0); B.sub.2 = (7/3/0//01).
One of these equivalent forms (the NORMAL FORM) is unique for each and
every designator set. This means that there is a unique
"information path" associated with each and every descriptor-set,
no matter what its source. Thus the SOLID System is independent of
the information base. It is possible to avoid normalization, if
that is desired, and to use the unnormalized information
representation. This situation occurs when the designators must
remain in a particular order because of the nature of the
information itself. This might occur, for example, where
designators in the three classes (Kernels, CI, and CII Connectors)
have been assigned the same codes, and their places in the
information representation indicate the type of designator.
In the SOLID System the information in (4) is combined thus:
$$\text{JOB-LIST ITEM} = (1/3//ABC/2/D/1/F//1/H{*}) \qquad (5)$$
The first two numbers are the values for M (= 1) and
J = m.sub.1, m.sub.2, ..., m.sub.M (= 3). ABC is the principal
diagonal of the information representation. The remaining numbers
are the other diagonals of the second bit-map (B.sub.2) and LABEL
alternated. The asterisk (*), added to the last non-zero LABEL
diagonal, indicates that the path is terminated. The field
separators are inserted for clarity. They are not present in the
machine representation. For a general information representation
which contains M nested representations and J = m.sub.1, m.sub.2,
m.sub.3, ..., m.sub.M kernels, (5) is replaced by (6):
$$\text{JOB-LIST ITEM} = (M / J / LD_0 / BD_1 / LD_1 / \ldots / BD_{1-J} / LD_{1-J}{*}) \qquad (6)$$
Here BD.sub.i is the decimal ten value of the binary-bit projection
of the i.sup.th diagonal of the IR, whose compressed form is
LD.sub.i (i.e., no zeros). The range of i can be expressed as
follows: (1-J) ≤ i ≤ (J-1). LD.sub.1-J * means
that the compressed diagonal LD.sub.1-J terminates the
JOB-LIST ITEM. Diagonals of B.sub.2 and LABEL are numbered
beginning with the principal diagonal; through the diagonals with
CI connectors; then through diagonals with CII connectors.
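The alternation of bit-map and LABEL diagonals in a JOB-LIST ITEM can be sketched as below. This is an illustrative Python sketch, not the Translation Package itself: the function name and argument layout are hypothetical, the two diagonal lists are assumed to be aligned entry by entry (an empty string standing for an all-zero LABEL diagonal), and trailing zero diagonals are assumed to have been dropped beforehand.

```python
def job_list_item(m_nested, kernel_counts, label_diagonals, b2_diagonals):
    """Assemble the fields of a JOB-LIST ITEM in the order given by form (6).

    Diagonals are expected in the order: principal, CI diagonals, CII diagonals.
    """
    # M, then the J screen, then the principal LABEL diagonal LD0.
    fields = [m_nested, *kernel_counts, label_diagonals[0]]
    # The principal B2 diagonal is skipped; the others alternate BD, LD.
    for bd, ld in zip(b2_diagonals[1:], label_diagonals[1:]):
        fields.append(bd)
        if ld:                        # an all-zero diagonal has no LD field
            fields.append(ld)
    fields[-1] = f"{fields[-1]}*"     # the asterisk terminates the path
    return fields


# Values taken from (4): LABEL = (ABC/D/F//H), B2 = (7/2/1//1).
item = job_list_item(1, [3], ["ABC", "D", "F", "H"], [7, 2, 1, 1])
print(item)
```

The field separators shown in the text are not modelled, since they are stated not to be present in the machine representation.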
The diagonals of the second bit-map (B.sub.2) are binary bit
projections of the associated diagonals of the information
representation. Each bit in the B.sub.2 diagonal indicates the zero
(bit off) or non-zero (bit on) status of a particular designator in
the information representation. The actual machine method of
representing the B.sub.2 diagonals is based on the fact that the
basic data-unit in the IBM 360 system, a byte, contains eight bits.
Thus in a single byte of a B.sub.2 diagonal the status of up to
eight designators in the associated IR diagonal can be recorded. In
the SOLID System each B.sub.2 diagonal is left adjusted to a byte
boundary. This means that a particular B.sub.2 diagonal begins at
the left-most bit in a particular byte and continues, across byte
boundaries, if necessary, until all the bits needed to indicate the
zero or non-zero status of all the designators in the associated IR
diagonal have been recorded. The used bits in the last byte of a
particular B.sub.2 diagonal will be left adjusted in the byte and
the unused bits (of the original eight) are set to zero or
turned-off. For example, if one takes the principal diagonal in
(1), its B.sub.2 diagonal would contain a single byte with the
eight bit binary-number 11100000. This would be the decimal ten
number 224. If both designators in the LD.sub.1 diagonal of the IR
(1) are not zero, then its single byte BD.sub.1 diagonal would
contain the binary number 11000000 or the decimal ten number 192. It
is understood that where the IR diagonal has more than eight
designators it may be necessary to use two or more bytes in the
B.sub.2 diagonal, to record the zero and non-zero status of the
designators. By using the above left adjusted method for B.sub.2
diagonals, the binary representations will follow without the need
for any additional calculation by the computer.
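The left-adjusted byte layout of a B.sub.2 diagonal can be illustrated with a short Python sketch. The function name is hypothetical, and a zero designator is represented by the integer 0; the sketch only mirrors the rule stated above (one bit per designator, starting at the left-most bit, unused bits of the last byte turned off).

```python
def pack_b2_diagonal(diagonal):
    """Pack the zero/non-zero status of each designator into left-adjusted bytes."""
    bits = [0 if d == 0 else 1 for d in diagonal]
    bits += [0] * (-len(bits) % 8)        # unused bits of the last byte are set to zero
    packed = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit      # left-most designator becomes the high bit
        packed.append(byte)
    return bytes(packed)


principal = pack_b2_diagonal(["A", "B", "C"])   # three non-zero kernels
first_ci = pack_b2_diagonal(["D", "E"])         # two non-zero connectors
print(list(principal), list(first_ci))
```

A diagonal with more than eight designators simply continues into further bytes, as in `pack_b2_diagonal(["X"] * 9)`, which yields two bytes.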
As an example, suppose that the elements in the linear form of
LABEL are from an assigned descriptor set that has been rearranged
to the square array form as follows: ##SPC3##
The second bit-map diagonals are binary bit projections of LABEL
diagonals, left adjusted to a byte boundary. The principal B.sub.2
diagonal, in this case 240, is not present in the JOBLIST ITEM.
The information in LABEL, B.sub.1, and B.sub.2 are combined
thus:
JOB-LIST ITEM = (1/4/ABCD/160/EF/0/128/I/96/PQ*)
The asterisk added to diagonal PQ signifies that this is the terminal
link in the information path. The Translation Package produces the
JOB-LIST ITEM as a self-defined string thus: ##SPC4##
The first two bytes contain the length of the JOB-LIST ITEM. Each
diagonal is preceded by two bytes which define its length. The
compressed referenced information is stored in bulk storage at an
automatically assigned location.
Each m.sub.i utilizes only one byte. This means that a single
representation cannot have more than 255 kernels. It is understood
that the parent IR, which can have up to M representations nested
in it, may have as many as M times 255 kernels. The two byte length
of M limits its maximum value to 65,535. Thus, without considering
the storage limitations, the parent IR can have a maximum of
between 255 (M=1) and 16,711,425 (M=65,535) kernels. It is unlikely
that these limits will ever be attained in applications of the
SOLID System.
Overview - An Illustrative Configuration of the SOLID System In
Data Storage and Retrieval
The JOBLIST items defined under the previous heading are utilized
in a system such as that shown in block form in FIG. 16. Although
the operation of the configuration of FIG. 16 is described in much
greater detail later on in the specification, it is believed that a
brief and qualitative preview of it at this time will help
illuminate and unify understanding of the operation, interrelation and
cooperation of the various portions of the invention particularized
below.
In one specific example in reference to FIG. 16, stages 600 and 602
initialize all appropriate registers in a computer system which is
used in the embodiment of the invention and which has fast and
virtual memories.
Then, a JOBLIST item, which may have been stored or created at
stage 604, is used during the processing at stage 602. A portion of
the JOBLIST item is examined at stage 606 where it may be
determined that it is an index. Control then goes to stage 610.
In stage 610, a control procedure explained in much greater detail
below carries out a search through an array in memory associated
with indexes to locate information which may be associated with the
particular index of the current search. If such information exists,
it would be in the form of an EXECUTIVE POINTER. One of the
following three situations may occur:
(a) An EXECUTIVE POINTER is found and control goes to stage 616
which contains a control package which is called MMATCH and is
explained in detail further in the specification;
(b) An EXECUTIVE POINTER is not found in the searched array or
possible extensions of it. If the system is in retrieval mode, the
control package at stage 616 determines that further search
procedures are unnecessary and aborts the search. If the search had
been for a descriptor or screen in the JOBLIST item in a screen
array (SCREEN SEARCH), and the JOBLIST item currently being used
contained overrides, the control package MMATCH would call, through
its macro called STRATEGY, another control package called MOBILE
CANONICALIZATION PACKAGE. The contents and function of the macros
and packages mentioned above are defined in detail in the
description below. If the system is in a storage or updating mode
then steps which are explained in detail further in this
specification are taken to create a new searching path for the
JOBLIST item currently in use; and
(c) If the memory array reserved for the EXECUTIVE POINTERS which
are being searched is full then control again goes to stage 616
containing the control package MMATCH, but with an indication that
more space for the array may be needed.
Stage 620 contains a control package called TBADD which assumes
control as directed by portions of the MMATCH control package of
stage 616. The control package TBADD which is a complex and highly
efficient control for determining if any transfer of memory blocks
between fast and virtual memory may be required at a particular
stage of using a JOBLIST ITEM. It should be noted that limiting all
search procedures associated with one JOBLIST item to a single
memory block is an important facet of the invention because that
fixes the maximum time required for a search associated with a
JOBLIST item. One memory block is always in fast memory; a number of
memory blocks may be in virtual memory. If a search is limited to a
single memory block then only one block need be transferred from
virtual to fast memory. A search thus will take the same maximum
time whether there are two or N memory blocks in the system.
Because of this provision the search time is considered independent
of the file size.
If a new memory block is required, then the control package TBADD
in Stage 620 transfers control to another control package called
SMEMORY which saves, if necessary, the memory block which had been
in fast memory up to this time and replaces it by a new one
obtained from virtual memory. If only a combination of the memory
space defined for a particular array is needed, the TBADD package
transfers control to a control package called CREATE which performs
the needed operations.
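The policy of keeping exactly one memory block in fast memory can be sketched as follows. This Python model is a loose illustration only: the class and method names are hypothetical, virtual memory is stood in for by a dictionary, and the division of labor between TBADD (deciding whether a transfer is needed) and SMEMORY (saving and replacing the resident block) is collapsed into one method.

```python
class BlockStore:
    """One memory block resident in fast memory; all others kept in virtual memory."""

    def __init__(self, virtual_blocks):
        self.virtual = {k: dict(v) for k, v in virtual_blocks.items()}
        self.resident_id = None
        self.resident = None
        self.dirty = False          # set when the resident block has been modified
        self.transfers = 0          # counts virtual-to-fast transfers

    def require(self, block_id):
        """Make block_id resident, saving the current block first if necessary."""
        if block_id == self.resident_id:
            return self.resident    # already in fast memory: no transfer
        if self.resident_id is not None and self.dirty:
            self.virtual[self.resident_id] = self.resident
        self.resident = dict(self.virtual[block_id])
        self.resident_id = block_id
        self.dirty = False
        self.transfers += 1
        return self.resident


store = BlockStore({1: {"path": 10}, 2: {"path": 20}})
store.require(1)
store.require(1)                    # second request causes no further transfer
print(store.transfers)
```

Because a search is confined to a single block, at most one such transfer is needed per JOBLIST item in this model, however many blocks the dictionary holds.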
Once all operations associated with a portion of the JOBLIST ITEM
currently being used have been completed, the control package TBADD
transfers control to a stage 626 where it is determined if there
are any more portions left of the JOBLIST ITEM. If there are, a
procedure similar to the one referred to above is repeated for
each such portion, until a location is reached in a memory array
called RFILE which is for addresses of information stored in bulk
storage.
In addition to the INDEX SEARCH and the search through RFILE
(AUXFILE) mentioned above, there are also operations involving
search procedures using the JOBLIST ITEM between the INDEX SEARCH
and AUXFILE called SCREEN SEARCH.
The function and operation of the control packages mentioned above,
as well as the function and operation of portions thereof, are
explained in detail under the subsequent headings. Particular
attention is directed to those individual steps or points in the
control packages and portions thereof described below which relate
to transferring or accepting control from another control package
or a portion thereof.
STRUCTURE OF THE FILE
Storage is divided into the MAIN and AUXILIARY FILES. The MAIN FILE
contains the referenced information which is stored in bulk storage
(i.e., tapes, data-cells, etc.). The AUXILIARY FILE contains
information paths which terminate in locations that contain
addresses of the referenced information in the MAIN FILE. Job-List
items, like those in (5) and (6), are used to trace, modify, or
create information paths in the AUXILIARY FILE. The fully automatic
COPAK compressor, discussed hereinafter, is used to substantially
reduce the storage requirements of the MAIN FILE.
The AUXILIARY FILE consists of a maze of fully self-organizing
column-arrays that are associated with an index (M), and screens
(J, LD.sub.0, BD.sub.1, LD.sub.1, ...) that appear in the Job-List
item (6). The column arrays are:
a. A single column array, MA, which is associated with the number
of nested representations (M). Here M.sub.m is the maximum value of
M in the system.
b. M.sub.m column arrays JA(II), with II in the range
1 ≤ II ≤ M.sub.m. The Mth of these [JA(M)] is associated with
the J screen (m.sub.1, m.sub.2, m.sub.3, ..., m.sub.M) for the M nested
representations.
c. With each element of column array JA(M) there are associated two
families of column arrays (BD(I) and LD(I)). BD(I) and LD(I) are
associated with the configurations of the Ith diagonals, for
B.sub.2 and LABEL in the Job-List item (see (6)).
For each B.sub.2, "I" begins with the first positive diagonal,
proceeds through all positive diagonals and then all negative
diagonals, ending with the last non-zero diagonal. For each LABEL,
"I" begins with the principal diagonal then proceeds as for
B.sub.2.
d. The RECORD FILE (RFILE) contains the addresses in bulk storage
of the referenced information.
Each column array has the structure shown in FIG. 12. The first
four bytes of the array contain its length. This is followed by
entries or elements, called executive pointers, and, in the last
position, the link or continuance address for the array. The link
address is &ADDL bytes long, and it contains the location
where the particular kind of column-array (viz MA, JA, or LD(I),
etc) is continued or extended. There are two kinds of elements or
EXECUTIVE POINTERS. One kind, which is used exclusively in the MA
array or its extensions, contains only an address of length
&ADDL bytes. The second kind of EXECUTIVE POINTER is used in
the arrays associated with screens. It contains a screen (e.g., J
in (6)) and a composite address of length &ADDL bytes. The
System Parameter &ADDL, whose value can be changed to meet
storage requirements, will be discussed hereinafter.
If the column array, shown in FIG. 12, is for the index M, the
elements or EXECUTIVE POINTERS are stored seriatim with respect to
the associated M value, and the addresses contain the beginning
address of a JA array which is associated with the particular M
value. Since the EXECUTIVE POINTERS are arranged seriatim, one need
only go to position (M times &ADDL plus four) in the MA array
to find the address of the JA array that is associated with the
particular M value. At the end of the array in FIG. 12 is the
&ADDL long continuance address, which contains the location of
another array where the array is extended (or continued) for other
values of, for example, M. This seriatim arrangement continues over
as many arrays as are required to store the different EXECUTIVE
POINTERS. Thus the system is totally expandable, as the number of
different M values does not in any way affect the system. It is
possible to store as many EXECUTIVE POINTERS, each associated with
a different M value, in the MA array(s) as the system is capable of
storing.
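The direct lookup in the MA array can be sketched in Python over a flat byte string. Several details below are assumptions rather than statements of the specification: &ADDL is given the value 3, the four-byte length field is taken to hold the total length of the EXECUTIVE POINTERS, slots are counted from zero so that the pointer for index M sits at M times &ADDL plus four, a link address of zero means "no extension", and integers are stored big-endian. All function names are hypothetical.

```python
ADDL = 3   # assumed value of the System Parameter &ADDL (address length in bytes)


def build_array(addresses, link=0):
    """Lay out: 4-byte length, one ADDL-byte address per slot, ADDL-byte link address."""
    body = b"".join(a.to_bytes(ADDL, "big") for a in addresses)
    return len(body).to_bytes(4, "big") + body + link.to_bytes(ADDL, "big")


def lookup_ja_address(memory, ma_start, m):
    """Return the JA-array address stored for index m, following extensions."""
    start = ma_start
    while True:
        length = int.from_bytes(memory[start:start + 4], "big")
        slots = length // ADDL
        if m < slots:
            pos = start + m * ADDL + 4            # the rule quoted in the text
            return int.from_bytes(memory[pos:pos + ADDL], "big")
        link_pos = start + 4 + length
        link = int.from_bytes(memory[link_pos:link_pos + ADDL], "big")
        if link == 0:
            return None                           # no extension holds this index
        m -= slots                                # continue counting in the extension
        start = link


first = build_array([0, 100, 200], link=16)       # occupies bytes 0 through 15
extension = build_array([300, 400])               # extension placed at offset 16
memory = first + extension
print(lookup_ja_address(memory, 0, 2), lookup_ja_address(memory, 0, 4))
```

No comparison of M values is needed in this sketch: the position is computed, and only the hop to an extension array depends on how many arrays the index spans.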
All those arrays associated with screens differ from the MA, or
index array, in that each EXECUTIVE POINTER in the arrays contains
two parts. That is, the first part is a particular value of the
particular screen (viz. J, LD.sub.0, BD.sub.1, etc.) that is
associated with the array in question. The second part contains the
address of the array that is associated with the next screen where
the search should continue. Further, within each array, the
EXECUTIVE POINTERS are arranged in numerical order with the lowest
value for the screen in the first position. For example, if the
screen J in (6) is equal to 1, then the corresponding EXECUTIVE POINTER
would be in the first position in array JA, and the address would
contain the location of the array that is associated with screen
LD.sub.0, where the search continues. It should be noted that the
screen J cannot be zero and that the search is aborted if it is. It
should also be noted that all the elements or EXECUTIVE POINTERS in
a particular array (shown in FIG. 12) are the same length (= screen
length + address length). The length of a principal array, that is the total
length of all EXECUTIVE POINTERS (= screen & address) between
the initial four bytes, which indicates the array length, and the
link address is a variable System Parameter (&MATRIXL), that is
discussed hereinafter. Of course, each screen array has a link
address for an extension array, as discussed above. Also the
EXECUTIVE POINTERS are arranged in numerical order over an array
and all of its extensions.
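A search of a screen array and its extensions can be sketched as follows. The Python model is an assumption-laden illustration: arrays are represented as a dictionary keyed by array address, each holding a list of (screen value, next-array address) pairs plus a link, and a simple ordered scan is used. The specification states only that the pointers are kept in numerical order; it does not prescribe this scan.

```python
def find_executive_pointer(arrays, start, screen):
    """Search a screen array and its extensions for `screen`.

    `arrays` maps an array address to (pointers, link). `pointers` is a list of
    (screen value, next-array address) pairs in ascending order; `link` is the
    address of the extension array, or None if there is none.
    """
    address = start
    while address is not None:
        pointers, link = arrays[address]
        for value, next_array in pointers:
            if value == screen:
                return next_array          # the search continues at this array
            if value > screen:
                return None                # ordering lets the scan stop early
        address = link                     # follow the continuance address
    return None


# Hypothetical JA array at address 10 with one extension at address 20.
arrays = {
    10: ([(1, 100), (3, 300)], 20),
    20: ([(4, 400), (7, 700)], None),
}
print(find_executive_pointer(arrays, 10, 7))
```

A return value of None corresponds to the case in which no EXECUTIVE POINTER is found in the array or its extensions.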
Now column-arrays are created when they are needed for inserting
the missing links or sub-paths (i.e., the elements in
column-arrays) that will define a new "information path." The
length of each newly created array is determined by the value of
&MATRIXL.
Each element in the column arrays is called an EXECUTIVE POINTER
which contains the beginning address of another column array.
Elements in arrays that are associated with the diagonals B.sub.2
and LABEL contain a screen (viz., J, LD.sub.O, etc.), in addition to
the address. Within each column array and its extensions the
elements are ordered according to the numerical value of the
associated screen (for diagonals J, LD.sub.O, etc.) or the index M.
The element in the column array that is associated with the last
non-zero diagonal in the JOB-LIST item (see (5) and (6)) contains a
screen (i.e., LD.sub.1.sub.-J *) and the address of a sub-array of
RFILE. The sub-array of RFILE contains the address in bulk storage
of the compressed referenced information.
The values of M, J, LD.sub.O, etc. (see (6)) are used to trace an
"information path" through the maze of arrays of RFILE. For fixed
values of M, each different configuration for each diagonal is
entered only once and then only if it occurs. Since arrays are
created in unoccupied storage areas only when they are needed,
there is minimal movement of data. Moreover, because the "paths"
are essentially independent of each other, the time needed for a
search is not altered by increasing or decreasing the size of the file.
Also, because only non-duplicate subpaths are stored, the AUXILIARY
FILE will require substantially less storage than conventional
files require. Further, because the entire search is accomplished
in core, the search is extremely fast in providing the exact
address in bulk storage where the desired compressed referenced
information is stored.
FIG. 13 is a diagrammatic showing of the manner in which a JOB-LIST
item is used to trace an "information path" in the AUXILIARY FILE.
The value of M in the JOB-LIST item locates the EXECUTIVE POINTER
in the array MA which points to the particular screen array JA(M)
associated with that value of M. In the screen array JA(M), the
screen J (as found in the JOB-LIST item) is used to locate a second
EXECUTIVE POINTER which will point to the particular screen array,
LD(.phi.), that is associated with the JOB-LIST entries J and M. In
the array LD(.phi.), the screen LD is used to locate an EXECUTIVE
POINTER which points to the next array, BD(1). It should be noted
that in this case, in searching through the first column array or
LD(.phi.), there was no EXECUTIVE POINTER found for the value of
screen LD in the JOB-LIST item. However, when one came to the
bottom of the first column array of LD(.phi.) a link address was
given, indicating that the search should be continued in the first
extension column array of LD(.phi.). In this extended column array
of LD(.phi.) there was an EXECUTIVE POINTER pointing to the next
column array, BD(1), which contained the screen LD.sub.o. In BD(1),
the next EXECUTIVE POINTER with BD.sub.1 as a screen was found.
This continues until, finally, the array LB(1) is found where the
screen LD.sub.1 * indicates that the JOB-LIST ITEM is ended and the
EXECUTIVE POINTER then points to a place in RFILE where the
address(es) in bulk storage of the referenced information can be
found. These address(es) in bulk storage are used to fetch the
referenced information that was requested by the JOB-LIST item. It
should be noted that in the search to reach an address in RFILE,
one has been working solely with core storage information. Once the
RFILE address has been located, the time necessary to obtain the
particular information will be determined by the characteristics of
the device on which the bulk storage information is stored. For
example, if the bulk storage information is on a disk, the time it
takes to bring the referenced information into core storage will
include the disk access and transfer times. It should be understood
that, there is no need to search the MAIN FILE because the bulk
storage address(es), which are found in the RFILE array(s), are the
exact location(s) of the requested information. It is also
understood that the information in the bulk storage or MAIN FILE
is, within the context of this system, stored in compressed form,
and that it will be decompressed by the COPAK decompressor in the
core-storage. The COPAK compressor will be discussed
hereinafter.
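The path tracing of FIG. 13 can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the SSEARCH component: each array is modeled as a dictionary from screen to the address of the next array, array addresses are made-up string labels, extension arrays are folded into their parent, and the function name trace_path is hypothetical.

```python
from typing import Dict, Hashable, List, Optional, Sequence

Array = Dict[Hashable, str]  # screen (or index M) -> address of the next array


def trace_path(arrays: Dict[str, Array],
               rfile: Dict[str, List[int]],
               job_list_item: Sequence[Hashable]) -> Optional[List[int]]:
    """Follow one JOB-LIST item from array MA down to an RFILE sub-array."""
    m, *screens = job_list_item
    address = arrays["MA"].get(m)          # index array: M -> screen array JA(M)
    for screen in screens:
        if address is None:
            return None                    # no EXECUTIVE POINTER: path is absent
        address = arrays[address].get(screen)
    if address is None:
        return None
    return rfile[address]                  # bulk storage address(es)


arrays = {
    "MA": {1: "JA(1)"},
    "JA(1)": {3: "LD(0)"},
    "LD(0)": {"ABC": "BD(1)"},
    "BD(1)": {128: "LD(1)"},
    "LD(1)": {"D*": "R0"},
}
rfile = {"R0": [17]}
```

Every step is a lookup in core; only the final bulk storage address leads to a device access.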
Retrieval operations in the SOLID System are illustrated in FIG. 13
and they have just been described. It should be noted that if,
during the search of the AUXILIARY FILE, no EXECUTIVE POINTER can
be found in a particular array, then this means that the requested
referenced information is not in the MAIN FILE or the bulk storage.
If this situation occurs in the retrieval mode then the search is
discontinued and the user is advised that the information is not in
the system. If this occurs in the storage mode, then a new subpath
is created by inserting new EXECUTIVE POINTERS in those arrays
which do not have them. This creation of a new information path
continues through the AUXILIARY FILE until the RFILE is reached and
the new bulk storage address has been allocated for the compressed
referenced information. Thus, later in the retrieval mode, the same
JOBLIST item will trace the newly created information path and
locate the new referenced information. Thus, the system expands, by
itself, independently of the user. Further, it should be noted that
the new items are stored without in any way affecting any of the
other information previously stored in the system. Thus, the system
can automatically expand until all the allocated storage has been
used. Moreover, there is no duplication of information in the
AUXILIARY FILE because, in the storage mode, EXECUTIVE POINTERS are
added and new arrays are created only if subpaths, defined by
addresses in EXECUTIVE POINTERS, cannot be found. The expansion of
the arrays in the AUXILIARY FILE, which occurs when new information
is stored, is entirely independent of the user. Storage areas for
newly created or extended arrays are automatically allocated by the
computer, and are not in any way controlled by the user. Thus,
because all retrieval and storage operations are fully automatic,
the user has no concern whatever with the actual machine structures
of either the AUXILIARY or the MAIN FILES. The actual way in which
the computer uses the AUXILIARY FILE will be described next.
However, the computer method of organizing the AUXILIARY FILE,
which is fully described hereinafter, must first be briefly
discussed.
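The difference between the retrieval mode and the storage mode described above can be sketched as follows. This is a hedged illustration, not the patent's code: the class AuxiliaryFile, its methods, and the string labels used as addresses are all hypothetical, and the counter next_bulk merely plays the role of the next free bulk storage address.

```python
from itertools import count
from typing import Dict, Hashable, List, Optional, Sequence, Tuple


class AuxiliaryFile:
    """Toy model: arrays are dicts; missing sub-paths are created on storage."""

    def __init__(self) -> None:
        self.arrays: Dict[str, Dict[Hashable, str]] = {"MA": {}}
        self.rfile: Dict[str, List[int]] = {}
        self._labels = count(1)   # source of fresh (made-up) array addresses
        self.next_bulk = 0        # next free bulk storage address

    def store(self, item: Sequence[Hashable]) -> Tuple[int, int]:
        """Trace `item`, inserting pointers only where the sub-path is missing."""
        array, created = self.arrays["MA"], 0
        for key in item[:-1]:
            if key not in array:                      # missing link: create it
                label = f"A{next(self._labels)}"
                self.arrays[label] = {}
                array[key] = label
                created += 1
            array = self.arrays[array[key]]
        last = item[-1]
        if last not in array:                         # new RFILE sub-array
            label = f"R{next(self._labels)}"
            self.rfile[label] = []
            array[last] = label
            created += 1
        bulk, self.next_bulk = self.next_bulk, self.next_bulk + 1
        self.rfile[array[last]].append(bulk)
        return created, bulk

    def retrieve(self, item: Sequence[Hashable]) -> Optional[List[int]]:
        """Trace `item`; give up as soon as a pointer is missing."""
        array = self.arrays["MA"]
        for key in item[:-1]:
            label = array.get(key)
            if label is None:
                return None
            array = self.arrays[label]
        label = array.get(item[-1])
        return None if label is None else self.rfile[label]
```

Storing a second item that shares a prefix with an existing path adds only the pointers that differ, which is the sense in which the text says no sub-path is duplicated.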
In the computer, the AUXILIARY FILE is divided into two parts. One
part, which is permanently resident in core storage, contains the
column arrays that are associated with the index M and the screen
J (= m.sub.1 m.sub.2 ... m.sub.M). This part is generated or read from
cards in the RESERVE macro-instruction when the SOLID System is
initialized. It is automatically generated when the system is used
for the first time. The second part of the AUXILIARY FILE contains
all those column arrays associated with the remainder of the
screens in the JOB-LIST ITEM (i.e., J, LD.sub.o, ..., BD.sub.1.sub.-J,
LD.sub.1.sub.-J *) and the bulk storage address. This part is
divided into memory blocks which are stored on disks in the virtual
memory. They are transferred to core storage by the Global Memory
component (SMEMORY) whenever they are needed. The size of the
memory blocks determines the efficiency of the SOLID System,
because as the memory block size increases the average search time
decreases, since the memory blocks will be transferred less
frequently. Each information path is restricted to a single memory
block.
In retrieval operations the "continuance tables" will be used by
the permanently resident part of the AUXILIARY FILE to select the
memory block that might contain the requested path, described by a
JOB-LIST ITEM. The Global Memory Component, SMEMORY, decides either
that the selected memory-block is already core-resident, in which
case it is not transferred, or transfers it from virtual memory to
core. Thus, a maximum of one memory-block is transferred for each
JOBLIST item or request. In storage operations the "continuance
tables" will be used to ensure that no "information path" will
extend over more than one memory-block, which guarantees that a
maximum of one memory block will be transferred for each request.
One System Parameter (&LTHAYY), which is described hereinafter,
sets the memory-block size. It should be understood that in normal
applications of the SOLID System, the memory-block size will be
large enough so that there will be a high probability that many
requests can be answered from a resident memory-block and,
consequently, there will be infrequent transfers of memory-blocks from
the virtual memory (i.e., disks) to core. During the storage cycle,
if the particular storage path crosses more than one memory block,
a program can be used to transfer that path to a single memory
block so as to avoid the problem of crossing memory blocks during a
retrieval cycle.
The system contains many composite addresses that are used during
storage and retrieval operations to insure that memory blocks are
correctly positioned either in core or in storage, and for enabling
the machine to know where information is to be stored or retrieved
at any particular time.
Composite addresses have two parts. The first part, which is called
the slow memory address, specifies a location on a peripheral
device like disks, drums, magnetic tape, data-cells, etc. It is
used when memory-blocks or referenced information is to be
transferred to or from core memory. The second part of the
composite address is called the fast memory address, and it
specifies a location in core-memory. Now there are two such
composite addresses at the head of the permanently core-resident
part of the AUXILIARY FILE, which contains the slow memory address
in virtual memory (e.g. disks) where the next new memory-block can
be stored. The first such composite address is called EMPTY. The
slow memory address part of EMPTY will be used and updated when a
new memory block is created in core-storage, and it is the location
in virtual memory where the newly created memory-block can be
stored by the Global Memory component (SMEMORY). The fast memory
address in EMPTY is the location in core storage where the new
memory block can be created. This location is always immediately
after the permanently resident part of the AUXILIARY FILE.
The second composite address at the head of the permanently
core-resident part of the AUXILIARY FILE is called BULK. It also
has a slow memory address and a fast memory address part. The slow
memory address part of BULK contains the bulk storage location
where newly compressed referenced information can be stored. This
part of BULK is used and updated in the storage mode when new
"information paths" are created or another bulk storage address is
added to an existing RFILE subarray. This occurs in the storage
mode when new referenced information is added to the MAIN FILE. The
fast memory part of BULK contains the location in core storage
where the new uncompressed referenced information is found. In the
SOLID System the address of the uncompressed information is located
in the full word named LBRYY. The COPAK compressor, which is
executed after new bulk storage (or slow memory) addresses are
assigned, compresses the information in the location specified by
the fast memory address and transfers it to the bulk storage
location specified by the recently assigned
slow-memory-address.
During the initialization of the SOLID System, which occurs in the
macro-instruction RESERVE, the permanently core-resident part of
the AUXILIARY FILE is either generated by the macro-instruction
MJARRY, or it is read from cards. This card-deck is punched by the
Global-Memory component (SMEMORY) during the termination procedure.
SMEMORY will be described later in this disclosure. During the
initialization, the first two composite addresses, which precede
the M-J arrays are loaded and the third composite address, which
follows immediately after the M-J arrays, is made equal to the
hexadecimal number FFFFFFFF. These three composite addresses, which
at this point are in the principal data-array (YY), are transferred
to the locations EMPTY, BULK, and CURRENT. Three other composite
addresses (CORD1, (EMPTY+&ADDL), and ADDRESS) also play a
significant role in the SOLID System. Two of these,
(EMPTY+&ADDL) and CORD1, are used to ensure that machine or
operation errors will not damage the AUXILIARY FILE. The last
composite address, ADDRESS, is used to trace or create the
information paths in the AUXILIARY FILE. The roles played by these
six composite addresses (EMPTY, BULK, CURRENT, (EMPTY + &ADDL),
ADDRESS, and CORD1) are described next.
The SSEARCH component uses the information in the JOBLIST ITEM to
trace (retrieval or storage) or create (new storage) information
paths in the AUXILIARY FILE in core-storage. The composite address
BULK is used at the end of a storage operation, when the RFILE array
is reached, to assign bulk storage locations for new compressed
referenced information. After each use BULK is updated to show the
location of the next available space in the bulk storage. In the
termination procedure, which is executed by SMEMORY, BULK is stored
in its assigned location at the head of M-J arrays and the card
deck for the next initialization of the SOLID System is punched. Of
course, this new card-deck will also contain the new value of
EMPTY.
For our present purpose we will suppose that a search is not
discontinued because the sub-path cannot be found or created.
SSEARCH begins each task by setting (EMPTY +&ADDL) equal to the
address EMPTY, and then it searches the MA array, which resides
permanently in core, for the EXECUTIVE POINTER associated with the
M value in the JOBLIST item. If a new sub-path is being executed
then a new EXECUTIVE POINTER will be constructed from the address
EMPTY and correctly inserted in the array. The address part of the
located (or constructed) EXECUTIVE POINTER is placed in ADDRESS. In
the next step, which occurs in the TBADD macro-instruction, the
slow memory address parts of CURRENT and ADDRESS are used to
determine whether or not the request address (ADDRESS) points to a
location in core memory. The three alternatives are:
i. If the slow memory address parts of CURRENT and ADDRESS are
equal, then the fast memory address part of ADDRESS points to a
location in the resident memory block. In this case a transfer
between the core and virtual memory does not occur.
ii. If the slow memory address part of ADDRESS is zero, then the
fast memory address part points to a location in the permanently
resident part of the AUXILIARY FILE. In this case, which occurs
only if the located EXECUTIVE POINTER is in array MA, a transfer of
memory-blocks does not occur.
iii. If the slow memory address parts of CURRENT and ADDRESS are
not equal and that for ADDRESS is not zero, then ADDRESS points to
a memory block that is not resident in core. In this complex
situation the Global Memory component uses the slow parts of
addresses CURRENT and ADDRESS, and the MSIGNAL signal byte to
decide what course of action should be taken. Full details of the
various procedures that are executed by the RETRIEVAL and GLOBAL
MEMORY PACKAGES are given hereinafter.
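The three alternatives can be summarized in a short Python sketch. It is only an illustration of the decision stated in the text: the names CompositeAddress, Residency, and classify are hypothetical, the slow and fast parts are modeled as plain integers, and the MSIGNAL signal byte and the actual block transfer are not modeled. The zero test is made first, on the assumption that a zero slow part always denotes the permanently resident part.

```python
from enum import Enum
from typing import NamedTuple


class CompositeAddress(NamedTuple):
    slow: int  # slow memory address part (location on a peripheral device)
    fast: int  # fast memory address part (location in core)


class Residency(Enum):
    RESIDENT_BLOCK = "i"    # target lies in the memory block already in core
    PERMANENT_PART = "ii"   # target lies in the permanently resident part
    NOT_IN_CORE = "iii"     # target block must be handled by the Global Memory


def classify(current: CompositeAddress, address: CompositeAddress) -> Residency:
    """Decide where ADDRESS points, using only the slow parts."""
    if address.slow == 0:
        return Residency.PERMANENT_PART
    if address.slow == current.slow:
        return Residency.RESIDENT_BLOCK
    return Residency.NOT_IN_CORE


current = CompositeAddress(slow=0x000C05, fast=0x001000)
```

Only the third outcome can lead to a transfer between virtual memory and core.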
From the foregoing discussion it should be recognized that in the
SOLID System the management and organization of both the AUXILIARY
and the MAIN FILES is automatic in every respect. It should be
further recognized that the system can be started anew or restarted
when there is no information about memory-blocks in core storage
and/or in the virtual memory. Moreover, fully automatic safety
procedures, which are discussed hereinafter, protect the files from
all machine and operator hazards. Finally, because storage is
allocated automatically, the growth of the system is bound only by
the total storage capacities of the machine that is being used.
Thus, the entire system does not depend in any sense upon the
amount of information that is stored, whether it is zero or
billions of bytes.
SYSTEM PARAMETERS
There are 14 parameters in the SOLID System which must be set
before the system is compiled. Properly selected values for these
parameters insure that usage of core-storage and the performance of
the SOLID System will be optimal. Nine of the 14 parameters can be
reset, by recompiling the system, at any time. If the other five
parameters (&ADDL, &LSLOW, &LFAST, &NTRKS, and
&TRKL) are reset, the AUXILIARY FILE must be regenerated from
scratch.
Six parameters are used to define the eight principal data-arrays
for the entire SOLID System. One of these (&LTHAYY) is the
amount of core-storage that can be used by both the AUXILIARY FILE
and the COPAK Compressor. Two parameters name (&JBLIST) and
define the length (&LJBLIST) of the data-array that is used for
storing the JOBLIST ITEMS that are produced by the TRANSLATION
PACKAGE from the descriptor-sets. &LJBLIST is also the length
of a work-array (JBWORK). Three parameters (&LOVER1,
&LOVER2, and &LOVER3) specify the lengths of the five
arrays that are used to store information about the three over-ride
codes (Type 1 (1); Type 2 (2); and Type 3 (3)).
Two variable parameters are associated with the COPAK Compressor.
One of these, &LTHAYY, which is also associated with the
AUXILIARY FILE, has already been mentioned. The other parameter
(&TPCORD) is used to optimize the performance of the
alphanumeric component of the COPAK Compressor.
Eight parameters are associated with various aspects of the
AUXILIARY-FILE. One of these (&LTHAYY) has already been
mentioned. Four of the eight parameters (&ADDL, &LTHAYY,
&NTRKS, and &TRKL) are concerned exclusively with the way
in which the AUXILIARY FILE is transferred between core-storage and
virtual memory (i.e. disks). The last four variable parameters
(&LSLOW, &LFAST, &MATRIXL, and &MATRIXS) together
determine the length of each newly created column array. Two
parameters (&MATRIXL and &MATRIXS), and information about
the over-ride codes, determine the way in which the arrays that are
associated with screens are searched.
The roles played by these fourteen System Parameters are discussed
next.
A. Structure of JOBLIST ITEMS.
In the SOLID System a TRANSLATOR component, which is called by the
TRANSLATION PACKAGE, rearranges the previously assigned
descriptor-sets to the particular linear form that is used to trace
the "Information Paths" through the part of the AUXILIARY FILE in
core-storage. A new TRANSLATOR component must be recoded for each
new collection of items that are to be stored in the SOLID
System.
The linear form (JOBLIST ITEM) for an information representation
with M nested representations and
Kernels is:
JOBLIST ITEM = (M / m.sub.1 m.sub.2 ... m.sub.M / LD.sub.0 / BD.sub.1 /
LD.sub.1 / ... / BD.sub.1.sub.-J / LD.sub.1.sub.-J *)   (7)
In the computer, (7) is expanded to (8). ##SPC5##
LJBI is the number of bytes in the JOBLIST ITEM (up to and
including the last screen, LD.sub.1.sub.-J *). Each screen is
preceded by a half-word (two bytes) which contains its length plus
two. For example (LL.sub.O -2) is the length of screen
LD.sub.0.
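The length-prefix rule can be illustrated with a small Python sketch. It covers only what the paragraph states: each screen is preceded by a half-word holding the screen length plus two. The function names are hypothetical, the half-word is assumed to be big-endian (as on the IBM System 360), the screens are treated as raw byte strings with no particular character code, and the leading fields of the computer representation (such as LJBI and M) are not modeled because their layout is not given here.

```python
import struct
from typing import List


def pack_screens(screens: List[bytes]) -> bytes:
    """Prefix each screen with a half-word holding (screen length + 2)."""
    out = bytearray()
    for screen in screens:
        out += struct.pack(">H", len(screen) + 2)  # 2-byte big-endian prefix
        out += screen
    return bytes(out)


def unpack_screens(data: bytes) -> List[bytes]:
    """Walk the prefixes; each one gives the distance to the next prefix."""
    screens, pos = [], 0
    while pos < len(data):
        (total,) = struct.unpack_from(">H", data, pos)
        if total < 2 or pos + total > len(data):
            raise ValueError("corrupt length prefix")
        screens.append(data[pos + 2:pos + total])
        pos += total
    return screens
```

Storing the length plus two means the prefix value is the offset from one prefix to the next, so a reader can skip a screen without subtracting anything.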
The screen that is associated with the "zero" or principal diagonal
of the second bit-map (B.sub.2) is omitted from the computer
representation, (8). Screens associated with terminal zero or empty
diagonals in both the second bit-map (B.sub.2) and LABEL (i.e.
LD.sub.0, LD.sub.1,-, BD.sub.1.sub.-J, LD.sub.1.sub.-J) are omitted
from the representation, 8. B.sub.2 diagonals in 8 are
left-adjusted to a full byte boundary. Thus the JOBLIST ITEM 5
should be written:
JOBLIST ITEM = (1/3 /ABC/128/D/128/F/64/H*)
In the computer this representation is expanded to 9 thus:
In 9 there are 29 bytes in the JOBLIST ITEM. The first screen, J=3,
is used to compute the rank of the information representation (IR).
The first B.sub.2 diagonal
entered in 9 is the screen 128. This screen and the K value
together indicate that the associated diagonal of the IR, whose
screen is D, actually is D.phi.. The asterisk in the last screen
(H*) indicates termination of the information path. The terminal
zero or empty screens associated with B.sub.2 and LABEL have been
omitted.
JOBLIST ITEMS like 9 are constructed from assigned descriptor-sets
by the TRANSLATOR components in the data-array &JBLIST
(=JBLIST), which is &LJBLIST bytes long. Random JOBLIST ITEMS
can be generated for test purposes. The TRANSLATOR components also
extract information about over-rides from the assigned or generated
descriptor-sets and store it in the five over-ride arrays (AOVER1,
AOVER2, AOVER2R, AOVER3, AOVER3R), whose lengths are defined by the
parameters &LOVER1, &LOVER2, and &LOVER3. The
TRANSLATOR components can produce more than one JOBLIST ITEM.
Moreover, the JOBLIST may be automatically changed by adding or
deleting items during retrieval or updating of the AUXILIARY FILE.
The information that is generated by the TRANSLATOR components
(i.e. JOBLIST ITEMS, over-ride tables, etc) is used to trace (for
retrieval), modify (for updating), create (for new storage), or
purge "information paths" in the AUXILIARY FILE.
The variable parameters &JBLIST, &LJBLIST, &LOVER1,
&LOVER2, and &LOVER3 that are associated with the
TRANSLATION PACKAGE can be changed at any time. However, in this
case, the CONTROL routine and three components (SMEMORY, SRESULT,
and SSEARCH) might have to be recompiled.
The structure of the composite addresses used in the SOLID System is
discussed next.
B. Address Structure:
Three system parameters (&ADDL, &LSLOW, and &LFAST)
set the number of bytes in the addresses that are stored in the
AUXILIARY FILE. The elements in the column-arrays in the AUXILIARY
FILE are EXECUTIVE POINTERS, each of which contains a single
address. In the "path" tracing procedure, the address part of an
EXECUTIVE POINTER is the location of the next column array that is
to be created (in a new storage or updating act) or searched (for
retrieval). The elements of the RFILE sub-arrays, which are the
terminal location of the "information paths," contain the Bulk
Storage Address (BSA) of the referenced information.
In the current version of the SOLID System the structure of the
composite addresses is:
Here:
D is a code (0 to 15) which specifies a particular type of
peripheral storage device (viz., disk, magnetic tape, data-cell,
drum, etc.)
DNO is a code (0 to 15) which specifies a particular device of type
D.
TRK (0 to 63) is the track where the information begins.
CYLN (0 to 1023) is the cylinder where the information begins.
(Note: if D specifies magnetic tape, then the record is stored in the
two bytes occupied by TRK and CYLN.)
FMADD is the beginning location in core-memory where the
information will reside.
The lengths of the slow (D,DNO,TRK,CYLN) and fast (FMADD) parts of
the composite addresses are set by the variable parameters
&LSLOW and &LFAST respectively. The total length of the
composite address is set by the parameter &ADDL. In one program
the values of &LSLOW, &LFAST, and &ADDL, are 3, 3, and
6 bytes respectively. Assignments of the device type code (D) for
the AUXILIARY FILE and RFILE address need not be the same. In one
program D=0 means a magnetic tape drive (for Bulk Storage
Addresses) or an IBM 2314 disk drive (for AUXILIARY FILE
Addresses).
Three service-macros (APART, ASADD, and COMPARE) disassemble,
assemble, and compare addresses of the above type. If any of the
above three System Parameters (&ADDL, &LSLOW, or
&LFAST) are changed, these three macros must be recoded, and
the AUXILIARY FILE must be regenerated. In one version of
the SOLID System, one IBM 2314 disk pack is used to store the
AUXILIARY FILE. This disk has been assigned D=0 and DNO=0. The
compressed "referenced" or "bulk" information is stored on magnetic
tapes which have been assigned D=0 and DNO = 0, 1,-, 15.
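The stated field ranges imply bit widths of 4 (D), 4 (DNO), 6 (TRK), and 10 (CYLN), which total 24 bits and so match the 3-byte slow part quoted above. The Python sketch below shows one way such a 6-byte composite address could be assembled and taken apart. It is an assumption-laden illustration: the bit order (D in the high bits, CYLN in the low bits) is a guess, since the layout figure is not reproduced here, and the functions assemble and take_apart are hypothetical stand-ins rather than the APART and ASADD macros.

```python
from typing import Tuple

LSLOW, LFAST = 3, 3  # byte lengths quoted in the text for one program


def assemble(d: int, dno: int, trk: int, cyln: int, fmadd: int) -> bytes:
    """Pack the five fields into a 6-byte composite address (assumed layout)."""
    limits = (("D", d, 15), ("DNO", dno, 15), ("TRK", trk, 63),
              ("CYLN", cyln, 1023), ("FMADD", fmadd, 0xFFFFFF))
    for name, value, limit in limits:
        if not 0 <= value <= limit:
            raise ValueError(f"{name}={value} is outside 0..{limit}")
    slow = (d << 20) | (dno << 16) | (trk << 10) | cyln  # 4 + 4 + 6 + 10 bits
    return slow.to_bytes(LSLOW, "big") + fmadd.to_bytes(LFAST, "big")


def take_apart(address: bytes) -> Tuple[int, int, int, int, int]:
    """Recover (D, DNO, TRK, CYLN, FMADD) from a 6-byte composite address."""
    if len(address) != LSLOW + LFAST:
        raise ValueError("composite address must be 6 bytes")
    slow = int.from_bytes(address[:LSLOW], "big")
    fmadd = int.from_bytes(address[LSLOW:], "big")
    return ((slow >> 20) & 0xF, (slow >> 16) & 0xF,
            (slow >> 10) & 0x3F, slow & 0x3FF, fmadd)
```

With this assumed ordering, TRK and CYLN together fill exactly the last two bytes of the slow part, which agrees with the note that a magnetic tape record occupies the two bytes of TRK and CYLN.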
C. Computer Organization of the AUXILIARY FILE
In the SOLID System the AUXILIARY FILE is divided into two parts.
One part, which is permanently resident in core-storage, contains
the column arrays that are associated with the prime index M and
the screens beginning with J (= m.sub.1 m.sub.2 ... m.sub.M) (see
(8)). This part is generated by the service-macro MJARRAY, which is
called by the initializing component SSTATECL, when the SOLID
System is used for the first time. Thereafter, it is read from
cards by SSTATECL in the macro RESERVE at the start of each
job-stream. A new card-deck is punched as the final act in each
job-stream.
The second part of the AUXILIARY FILE contains all those column
arrays that are associated with the remainder of the screens in the
JOBLIST ITEMS (i.e., LD.sub.0, BD.sub.1, LD.sub.1,-, LD.sub.1-J *)
and the Bulk Storage Addresses (BSA), which are assigned when they
are needed. This part is divided into memory-blocks, which are
stored on disks in virtual memory, and are transferred to
core-storage by the Global Memory component (SMEMORY) whenever they
are needed. The size of the memory-blocks determines the efficiency
of the SOLID System. As the size of the memory-blocks increases, the
average search-time decreases because the memory-blocks will be
transferred less frequently. Complete information paths are
restricted to a single memory-block. Thus, because a maximum of
one memory-block is transferred per query, search-time is virtually
independent of the AUXILIARY FILE size.
Each memory-block is prefaced by a composite address (of length
&ADDL) whose fast-memory address, FMADD (of length
&LFAST), is its first unused byte. This information is used
when new sub-paths are to be created. The slow-memory part of the
composite addresses (of length &LSLOW) contains the beginning
location in virtual memory where the memory-block will be stored.
The first part of the AUXILIARY FILE, which resides permanently in
core-storage, is prefaced by two composite addresses. The first of
these is associated with the first unused byte in the last
memory-block. This information is used to create new memory-blocks
or to extend blocks that are full. The second composite address is
the Bulk Storage Address that is used to assign slow-memory
locations for new referenced information. Its fast-memory part,
FMADD (of length &LFAST) contains the location in core-storage
where the compressed referenced information will reside.
Three system parameters (&LTHAYY, &NTRKS, and &TRKL)
together determine the size of every memory-block. &TRKL is the
length of a single record in virtual memory. &NTRKS is the
number of records of length &TRKL in each memory-block.
&LTHAYY is the number of bytes in the principal data-array
(YY), in core-storage. This array, YY, must contain the permanently
resident part of the AUXILIARY FILE (i.e., arrays associated with M
and J); a single memory-block; and approximately two strings of
uncompressed referenced information. In one example of the program
for the SOLID System &TRKL=7294; &NTRKS=10; and
&LTHAYY=100,000. If these three parameters are changed once
then the service-macros APART, ASADD and COMPARE must be changed,
and the system started from scratch.
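The arithmetic behind these example values can be checked with a few lines of Python. The variable names are illustrative only, and the figure of about 1080 bytes for the M-J arrays is taken from the parameter definitions in Section D; the result is a rough budget, not a statement of how the program actually lays out YY.

```python
# Example values quoted in the text for one program.
trkl = 7294          # bytes per record (one IBM 2314 track)
ntrks = 10           # records per memory-block
yy_length = 100_000  # bytes in the principal data-array YY
mj_bytes = 1080      # approximate size of the permanently resident M-J arrays

memory_block_bytes = ntrks * trkl                      # size of one memory-block
remaining_bytes = yy_length - mj_bytes - memory_block_bytes
per_string_bytes = remaining_bytes // 2                # room for each of two strings

print(memory_block_bytes, remaining_bytes, per_string_bytes)
```

With these values one memory-block occupies 72,940 bytes, leaving roughly 26,000 bytes of YY for the two strings of uncompressed referenced information.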
Two system parameters (&MATRIXL and &MATRIXS) determine the
lengths of all newly-created column-arrays. These two parameters,
which can be reset at any time are defined in the next section,
D.
D. Definitions of Parameters
In this section, D, the fourteen system parameters of the SOLID
System are defined.
&ADDL: The number of bytes in the composite addresses that are
used in the AUXILIARY FILE.
&LSLOW: The length (in bytes) of the slow-memory part of the
composite address.
&LFAST: The number of bytes in the fast-memory part of the
composite address.
Note: &ADDL, &LSLOW, and &LFAST are associated with the
service macros APART, ASADD, and COMPARE. These macros must be
changed if &ADDL, &LSLOW or &LFAST is altered.
&JBLIST: (=JBLIST): The name of the array that is used for
storing the JOBLIST ITEMS that are produced by the TRANSLATOR
COMPONENTS (see Section A).
&LJBLIST: The number of bytes in the JOBLIST array,
&JBLIST, and in the JOBLIST work-array, JBWORK.
&LOVER1: The length (in bytes) of the three principal over-ride
arrays (AOVER1, AOVER2, and AOVER3) that are used for storing
information about Type 1, Type 2, and Type 3 over-rides. Two
additional over-ride arrays (AOVER2R and AOVER3R) are used by the
TRANSLATOR components.
AOVER1, AOVER2, and AOVER3 normally contain information about the
specific locations (in the JOBLIST array, &JBLIST) of the
designated types of over-ride codes.
&LOVER2: The length of the second, Type 2 over-ride array
(AOVER2R). This array normally contains information for updating
the "information paths" (in the AUXILIARY FILE) and the compressed
referenced information (in the Bulk Storage).
&LOVER3: The number of bytes in the second Type 3 (or gate)
over-ride array (AOVER3R), which normally contains the two gates
for each Type 3 over-ride whose location is in array AOVER3.
&LOVER3, should be about the same length as &LOVER2, and
twice the length of &LOVER1.
&LTHAYY designates the length of the principal data-array (YY)
of the SOLID System. YY should be large enough to include the
permanently resident portion of the AUXILIARY FILE, which holds the
M-J arrays in about 1080 bytes; a memory-block
(=&NTRKS*&TRKL bytes); and two strings of uncompressed
reference information. In one program &LTHAYY is set equal to
&NTRKS*&TRKL plus 20,000. In large applications of the
SOLID System (viz., a national retrieval network) it is anticipated
that the value of &LTHAYY will be set between 1,000,000 and
15,000,000.
&MATRIXL is the number of elements in the column-arrays that
are associated with the screen of the principal diagonal of the
information representations. In the JOBLIST ITEM 9 this is ABC.
&MATRIXS: The number of elements (e.g. EXECUTIVE POINTERS) in
the column arrays that are associated with all screens other than
the principal one. In JOBLIST ITEM (8) these are: J(=m.sub.1
m.sub.2 -m.sub.M), BD.sub.1, LD.sub.1, etc. At present &MATRIXS
is also the number of Bulk Storage Addresses (BSA) in the RFILE
sub-array which terminates the "information paths." However, the
RFILE structure can be changed so that it can be accessed
independently.
&NTRKS: The number of records in virtual memory that are in
each memory-block (see Section C). In our program the record length
(&TRKL) is set equal to the IBM 2314 disk track length (=7294
bytes). Thus &NTRKS is the number of tracks needed to store a
single memory block.
&TPCORD: The number of permanent cords that are to be used by
the alphanumeric compressors in the fast mode. This is one of two
parameters which together determine the optimum throughput rate for
the COPAK compressor.
&TRKL: The number of bytes in each record in virtual memory. In
one program &TRKL is set equal to the track length of the IBM
2314 disks (7294).
DESIGN PHILOSOPHY
In designing a large computer oriented system it is essential that
design and performance specifications for the completed system be
set up. The design goals for the SOLID System are given next.
i. Storage, retrieval, updating, and purging tasks must be
accomplished as fast as possible.
ii. The system must be independent of the information base.
iii. Components of the SOLID System should be capable of being
coded independently of all the other components in the system.
iv. Programming of the fully implemented scheme, to meet the varied
needs of users, should be as simple as possible.
v. The system should be open ended to provide for future
innovations.
vi. The coded system should be as free of machine dependence as
possible to provide for easier translation to other computer
configurations.
Because the size and scope of the system prohibit writing it as a single
program, it was decided to write the system on a component by
component basis. Each component performs a single task in the
overall scheme. For example, one component handles card input;
another the output; and so on for each separate task the system
will perform.
To simplify recoding of a large system for another computer, it is
essential that a higher language be used or developed. The present
higher languages (e.g., FORTRAN, ALGOL, COBOL, SNOBOL, COMIT, etc.)
are not suitable for coding large retrieval or indexing systems
because they do not have the bit and byte manipulation capabilities
that are essential for efficient machine coding. The SOLID System
has associated with it an open ended higher language, ALLOCATE,
which grows as the system is implemented. Thus, the fully
implemented SOLID System will be coded in a machine independent
higher language that can provide the basis for a retrieval
language. The macro language provided in the IBM System 360
provides a starting point for the language ALLOCATE.
Each component of SOLID is coded in the macro language. The central
concept is one of extensively nested macros incorporated into the
assembly language processor of the computer. In this way the normal
operations of the assembly language are extended with macro
instructions that perform the special operations needed for SOLID.
In the IBM 360 System the assembly language is called BAL. A
programmer can add, delete, or ignore components of the system at
any time. This design allows for the unrestricted growth of the
system and the retrieval language (ALLOCATE) by adding new
components to the system macro library.
Translation of the SOLID System to other computer configurations is
greatly simplified by the use of the component type system. For
example, it is possible to translate the SOLID System directly to
FORTRAN IV by a suitable translator. The translation will be
performed component by component rather than by trying to rewrite
the entire system. Moreover, since the components are independent
of one another, only those components needed for the particular
application of SOLID need be translated. There is no need to
program around deleted components.
Design of the SOLID System
The design of the SOLID System is based on the concept of a system
which contains two subsets of instructions. In one subset are all
the assembly language instructions. The second subset, which is
entered in the macro library (SOLID.MACLIB) of the System 360
Assembly Language Processor (ALP), contains all of the components
of the SOLID System and certain selected service macros. Components
are entered as macro subroutines with their own USING controls so
that they may be placed anywhere in the system. Independently
compiled components are stored in a partitioned data set,
SOLID.LOAD.
Since both subsets of instructions are processed by the same
compiler, the programmer can code in any arbitrarily selected
combination of assembly language and macro instructions. In the
remainder of this discussion the terms: "level of coding" or
"coding level" refer to one such arbitrarily selected combination
of instructions. At every coding level, instructions can be added,
deleted, or over-ridden. New instructions can be programmed at a
previously defined coding level. In its final form, the SOLID
System, which will be coded at the highest level (in the language
ALLOCATE), is independent of the machine used. As the coding level
is lowered, the proportion of instructions needed from the first
subset (assembly language) is increased and the programming becomes
more difficult. The language ALLOCATE can have less than fifteen
instructions, drawn from the two subsets mentioned above.
The two part design that has evolved for the SOLID System consists
of the various components and service macros, MACROPAK, and a
control routine (CONTROL). The control routine, which is coded in
the evolving higher language ALLOCATE, assigns tasks to the various
components of the SOLID System when a search, storage, compression
or decompression job is executed.
The service macros in MACROPAK perform specialized tasks such as
bit, byte and string manipulation which are necessary for an
information system. Also included in this group are macros used for
the input/output operations; the calling procedures for branching
from the control routine (CONTROL); and other specialized macros.
The components are coded in a hierarchical fashion with extensive
nesting. The kind of hierarchical arrangement that has been
achieved is illustrated in FIG. 14.
Each of the coding levels indicated in FIG. 14 designates an
arbitrarily selected combination of assembly language (first
subset) and macro (second subset) instructions. The
pseudo-operations used in each of these arbitrarily selected coding
levels are defined in MACROPAK and are themselves coded in mixtures
of instructions from any of the lower coding levels. In a
hierarchical scheme of this kind, the difficulty of coding the
system is decreased as the coding level is raised. This occurs
because both the proportion of assembly language instructions and
the need for a specific knowledge of the contents of RESERVE are
decreased.
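The effect of raising the coding level can be pictured with a small sketch of nested macro expansion. It is written in Python for illustration and is not the System 360 macro processor. The macro names and bodies (READREC, OPENDEV, CLOSEDEV) and the assembly statements are hypothetical and do not appear in MACROPAK.

```python
# Hypothetical macro library: each macro expands to lower-level statements.
MACLIB = {
    "READREC": ["OPENDEV", "L R1,BUF", "CLOSEDEV"],
    "OPENDEV": ["LA R2,DCB", "BALR R14,R15"],
    "CLOSEDEV": ["LA R2,DCB", "BALR R14,R15"],
}


def expand(statement, maclib, depth=0, max_depth=32):
    """Recursively expand one statement into plain assembly statements."""
    if depth > max_depth:
        raise RecursionError("macro nesting too deep (possible cycle)")
    opcode = statement.split()[0]
    if opcode not in maclib:
        return [statement]  # first subset: an assembly language instruction
    expanded = []
    for inner in maclib[opcode]:  # second subset: a macro, expand its body
        expanded.extend(expand(inner, maclib, depth + 1, max_depth))
    return expanded


def expand_program(program, maclib):
    """Expand every statement of a program written at any coding level."""
    return [line for stmt in program for line in expand(stmt, maclib)]


def assembly_fraction(program, maclib):
    """Proportion of statements written directly in assembly language."""
    direct = sum(1 for stmt in program if stmt.split()[0] not in maclib)
    return direct / len(program)


high_level = ["READREC"]                          # one macro instruction
low_level = ["OPENDEV", "L R1,BUF", "CLOSEDEV"]   # same task, lower level
print(expand_program(high_level, MACLIB) == expand_program(low_level, MACLIB))
print(assembly_fraction(high_level, MACLIB), assembly_fraction(low_level, MACLIB))
```

Both programs expand to the same assembly statements, but the lower-level version contains a larger proportion of assembly language, which is the sense in which coding becomes harder as the level is lowered.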
The open ended, two part design just described permits optimum
machine language coding while giving rise to a "machine
independent" language that greatly simplifies the task of recoding
the SOLID System for a new configuration. For the System 360, for
example, this design has naturally led to the full utilization of
the System 360 macro language facility. In some configurations it
may be necessary to extend the assembly language processor so that
those instructions that are not defined in the machine instruction
set can be handled. Moreover, some or all of the macro instructions
in the second subset may suggest ways in which existing hardware
can be modified or they may influence the design of fourth
generation machines. In this connection the automatic multi-stage
COPAK compressor can be realized in the form of a small fast
computer with an equivalent hardware set.
Contents of MACROPAK
The macro-instructions in MACROPAK can be classified as
follows:
i. Reserve: There are twenty-two heavily nested macro-instructions
which together contain the data declarations, system parameters,
and status controls for the SOLID System and its stand-alone
subsystems. Only two of these 22 instructions actually appear in
the different CONTROL routines.
ii. Service-Macros: macro-instructions which simulate useful but
unavailable "instructions", including certain rather elaborate
operations.
iii. Components: useful macro-subroutines too large or complex to
be viewed as single operations.
The 151 entries in one of the versions of MACROPAK are stored in
SOLID.MACLIB, which is concatenated with the macro-library of the
System 360 Assembly Language Processor (ALP). The contents and
functions of entries in the three classes (Reserve, Service Macros,
and Components) are described below:
Reserve:
The first and last executable instructions in each CONTROL routine
are a "RESERVE" type and a "SUBMP" type of macro-instruction,
respectively. MACROPAK contains two "RESERVE" (RESERCO and RESERVE)
and eight "SUBMP" macro-instructions. Different combinations of
these two kinds of instructions are used in the CONTROL routines
for the SOLID System and its stand-alone subsystems. The fourteen
System Parameters defined earlier, which are used to tailor the
SOLID System to the machine configuration, are primarily associated
with the "RESERVE" and "SUBMP" types of instructions.
The "RESERVE" type of macro-instruction is located immediately
after the START instruction of the CONTROL routine. It establishes
addressability; defines all global constants, counters, DCB or
format statements, work arrays, and registers; and initializes the
CONTROL routine. It also contains the instructions for opening and
closing all those input/output devices that are used in the SOLID
System for communication purposes and storing the referenced
information. These various functions are performed by one of three
"JUNK" type instructions (JUNK, JUNKC, JUNKR), which contain six of
seven special macro-instructions that perform the identified
special tasks (e.g. define constants). One of these special
macro-instructions (DEVICES) calls the component OPENSHUT, which
executes all opening and closing instructions, and the component
SACTION, which is designed to execute all error-correcting
procedures for the COPAK compressor. The CONTROL routine for the
SOLID System is initialized in the component SSTATECL, which is
called from the RESERVE macro-instruction. JUNK and JUNKC are
slightly modified versions of JUNKR. JUNK is used to separately
compile components (see Components). JUNKC is used in RESERCO for
the stand-alone subsystems.
The fully expanded "JUNK" type instruction contains more than 300
items whose significance must be understood if the SOLID System is
to be coded entirely in assembly language. However, at the highest
coding level (i.e., the language ALLOCATE) the significance of only
the twenty-seven input/output commands need be fully
understood.
The "SUBMP" type of instruction appears immediately before the END
instruction of the CONTROL routines. Its function is to locate the
literal pool; define the principal data arrays and specify their
lengths; dummy the addresses of unused components; and position the
OPENSHUT component at compilation time. The principal data-array
(YY) is specified in "SUBMP" instructions. The two JOBLIST arrays
(JBLIST and JBWORK) and five over-ride arrays (AOVER1, AOVER2,
AOVER3, AOVER2R, and AOVER3R) are specified in the WORKAREA
macro-instruction, which appears in certain of the "SUBMP" type
instructions.
Service Macros:
In one version there are 98 entries in MACROPAK that are classified
as part of the service macros.
For the most part, the macros classified under this heading
perform special tasks and are extensively nested within components
(see Components). Among these entries are macros which will truncate
a floating point number to fixed-point (TRUNC); convert a fixed
point number to floating point (CONVE); move any string of
information left (LMOVE) or right (RMOVE=RMVC) by a designated
number of bytes; and several other macros which perform the
specialized bit, byte, and string manipulative tasks needed for the
SOLID System.
All the macros dealing directly with the input/output devices such
as the card reader, card punch, printer, disks, and tape units are
also classified as part of the service macros. The macros for
reading and writing magnetic tape also perform the tasks of
blocking or deblocking the information. At the highest coding level
(i.e., the language ALLOCATE) card input, tape input and the entire
output-operation(s) are single macro-instructions.
Other macros of special note here are nine macros which facilitate
branching between the components.
Components:
Components are macro-instructions that contain their own USING
controls for establishing addressability. They can be compiled in
the control routine, or they may be separately compiled in named
CSECTS. Compiled CSECTS are link-edited and stored in the
partitioned data-set, SOLID.LOAD. The calling procedures are
discussed hereinafter.
In one version there are 31 components of the SOLID System. The
following three components initialize the system at the indicated
times. OPENSHUT, which is called by the DEVICES macro-instruction
in the "JUNK" type instructions, performs the tasks of opening and
closing input/output devices at the beginning and end of each
job-stream. SSTATECL, which is called by the RESERVE
macro-instruction, initializes the CONTROL routines for the SOLID
System at the beginning of each job-stream. SCOMMAND, which is
called from the control routines, initializes the system at the
beginning of each new job and before each use of the COPAK
compressor.
Four of the components handle the input/output for the SOLID System
(SJOBLIST, SREADC, SREADT, and SOUTPUT). These entries use the
input/output service macros and another three components (SPR1NT,
SPUNXH, and SREID). Performance data for each job (component
SRESULT) and for the COPAK Compressor (service macro SAVINGS in
SANPAKC) are printed. In production runs this information would not
be required.
There are six components which are associated with the COPAK
Compressor. One of these (SACTION), which is called from the "JUNK"
type instruction, is intended to process the error-correcting
procedures for the several forms of the COPAK Compressor. Six
service-macros handle the task of setting-up the strings of
referenced information and setting the status controls for the
three principal compressor components (SANPAKC, SANPAKD, and
SNUPAK). The remaining two components (SNAPAKJ and SNUPAB) are
variants of SANPAKD and SNUPAK that are used in the separate
alphanumeric and numeric stand-alone compressors.
Eleven components are used by the TRANSLATION PACKAGE, which
generates JOBLIST ITEMS in their normal forms. The supervisory
component (SJOBLIST), which also handles input, uses three
service-macros (JLITEM, TRANSLATE, and NORMFORM) to call the random
JOBLIST generator component (SGENITEM); the five translator
components (STLATOR1, STLATOR2, STLATOR3, STLATOR4, and STLATOR5);
and the normalization component (SNORMAL). The fully implemented
SNORMAL component will use the three transformation rules, which
are to be coded in components SCYCLIC, SREFLECT and SXCHANGE. These
transformation rules will be used in the RETRIEVAL PACKAGE also.
SGENITEM can be used to generate the random JOBLISTS that are
needed to evaluate the performance of the SOLID System. The five
translator components will actually perform the task of extracting
(if necessary) and rearranging the descriptor-sets to the JOBLIST
ITEM form. A new translator component must be coded for each new
application of the SOLID System. There are provisions for
incorporating up to 255 different translator components.
Five components are used by the RETRIEVAL PACKAGE, which duplicates
the tracing, creating, purging and modification of "information
paths" described earlier. One component (SRESULT), which is
mentioned above, prints performance data for the RETRIEVAL PACKAGE.
The supervisory component (SSEARCH) uses two service-macros (MMATCH
and TBADD) to call the MOBILE CANONICALIZATION PACKAGE and the
Global Memory component (SMEMORY). The Global Memory transfers the
memory-blocks of the AUXILIARY FILE between core-storage and
virtual-memory whenever they are needed. Two new components (SMATCH
and SMOBILE) and the three transformation rules (SCYCLIC, SREFLECT
and SXCHANGE) are used in the MOBILE CANONICALIZATION PACKAGE. The
supervisory component of the RETRIEVAL PACKAGE, SSEARCH, has ten
variable parameters which together designate the number of elements
in each column array, and optimize the searches.
C. The CONTROL Routine.
The CONTROL routines are used to "thread" the retrieval, storage,
updating, purging or compression problem through the various parts
of the SOLID System. CONTROL is coded exclusively in the higher
language ALLOCATE. In one version the CONTROL routine for the
entire SOLID System contains sixteen statements. Five of these are
ALP instructions. The other eleven instructions are taken from the
second subset of macro-instructions. Thus the CONTROL routines can
be easily changed to meet user needs by simply adding or deleting
single instructions. This facility, and the fourteen System
Parameters mentioned earlier permit the translation of the SOLID
System to a particular machine configuration or application.
There are seven service macros which facilitate the branching among
components. The three types of control that can be achieved by
single macro-instructions are illustrated in FIG. 15. Five service
macros (CALL1, CALL2, CALL3, CALL4, and CALL5) permit branching
from the CONTROL routine to a component and the eventual return to
any pre-assigned address in CONTROL. For example, the
pseudo-operation; CALL1, SEARCH,RESEARCH passes control to the
SSEARCH component and then returns it to the address RESEARCH in
the CONTROL routine. The TRANSFER and GLOBAL service-macros permit
branching between components.
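The two kinds of branching described above (CONTROL to component with a return to a pre-assigned address, and component to component) can be sketched as follows. This is a Python illustration under stated assumptions, not the macro expansion itself: the function names call1, transfer and global_memory are invented stand-ins for the CALL1, TRANSFER and GLOBAL service-macros, and the component bodies are empty stubs that only record the order of execution.

```python
def call1(components, name, return_label, trace):
    """CALL1-style branch: run a component, then name the CONTROL address to resume at."""
    components[name](components, trace)
    return return_label


def transfer(components, name, trace):
    """TRANSFER-style branch: one component passes control to another."""
    components[name](components, trace)


def global_memory(components, trace):
    """GLOBAL-style branch: a TRANSFER reserved for the Global Memory component."""
    transfer(components, "SMEMORY", trace)


def ssearch(components, trace):
    # Stub for the supervisory search component; it only logs its name
    # and then asks the Global Memory component for a memory-block.
    trace.append("SSEARCH")
    global_memory(components, trace)


def smemory(components, trace):
    # Stub for the Global Memory component.
    trace.append("SMEMORY")


COMPONENTS = {"SSEARCH": ssearch, "SMEMORY": smemory}

trace = []
resume_at = call1(COMPONENTS, "SSEARCH", "RESEARCH", trace)
print(resume_at, trace)
```

The value returned by call1 plays the part of the pre-assigned return address (RESEARCH in the example given in the text); how the real macros save and restore registers is not modelled here.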
The first and last executable statements in the CONTROL routines
are the "RESERVE" and "SUBMP" type of macro-instruction
respectively. Any allowed assembly language and service-macro
instruction can be inserted between these two statements. The
following program will read a record of compressed information from
magnetic tape and print it in hexadecimal format.
SOLIDIO  START   0
         RESERIO 600
         MVC     MODE(4),ZERO
         REIDT   LIST,JII
         LA      BRY,JII
         PRINT   B,LIST,0(BRY)
         B       CL1
LIST     DS      150F
         SUBIO
         END     SOLIDIO
The information is read into the array LIST, which can contain up
to 600 bytes. JII contains the record length.
RESERIO and SUBIO are minor modifications of the RESERCO and SUBCE
macro-instructions respectively.
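For readers unfamiliar with the macro language, the task performed by the listing (read one record of at most 600 bytes, then print it in hexadecimal) can be sketched in Python. This is an analogy only: an in-memory byte stream stands in for the magnetic tape, the record framing is assumed to be a known length, and the names read_record and hex_lines are hypothetical.

```python
import io

MAX_RECORD_BYTES = 600  # capacity of LIST in the listing (150 full-words)


def read_record(stream, length):
    """Read one record of `length` bytes, refusing records that overflow LIST."""
    if length > MAX_RECORD_BYTES:
        raise ValueError("record longer than the 600-byte work array")
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("short read: fewer bytes available than the record length")
    return data


def hex_lines(record, bytes_per_line=16):
    """Format a record as lines of two-digit upper-case hexadecimal bytes."""
    return [
        record[i:i + bytes_per_line].hex(" ").upper()
        for i in range(0, len(record), bytes_per_line)
    ]


tape = io.BytesIO(bytes([0x00, 0xC1, 0xFF]))  # stand-in for a tape record
for line in hex_lines(read_record(tape, 3)):
    print(line)
```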
DESCRIPTION OF MACRO-INSTRUCTIONS IN MACROPAK
The two-part open-ended design that has evolved for the SOLID
System consists of MACROPAK, which contains the second subset
of instructions, and a control routine (CONTROL). This control
routine, which operates under the O/S system, assigns the specific
tasks to the individual components of the SOLID System. It is
easily changed to include new hardware or for special applications
of parts of the SOLID System.
MACROPAK, which is entered in the macro-library (SOLID.MACLIB)
of the System 360 Assembly Language Processor (ALP), contains
entries which are identified in the Reserve, Service-Macro, or
Component classes (see above). The individual macro-instructions
are described below. Listings of the macro-instructions are given
in the Appendix.
In the following, the macro-instruction name (e.g., SKIPP) precedes
the prototype statement thus: SKIPP - (&J SKIPP &LC). A
blank left field in the prototype statement means that there is no
location variable.
A. Reserve
Twenty-two entries in MACROPAK are identified in the "Reserve"
category. There are three "JUNK", two "RESERVE", and eight "SUBMP"
types of instructions. The roles played by these three kinds of
macro-instructions have already been described.
Seven of the 22 entries can be viewed as special service-macros for
the "JUNK" type instruction. Another two entries, ENTRANCE and
WORKAREA can be regarded as special service-macros for the SUBMP
type instructions. These nine "special service-macros" of the
Reserve category perform tasks like defining global constants;
beginning and terminating jobstreams; declaring formats; specifying
relocatable addresses and entry points; designating registers,
counters and gates; and defining work areas for the SOLID
System.
Two of the three "JUNK" type instructions are used in the two
"RESERVE" type instructions. The third "JUNK" type instruction is a
special variant that is used to separately compile components of
the SOLID System. Only one of the "RESERVE" type instructions and
one of the eight "SUBMP" type instructions appear in each CONTROL
routine. Three components (OPENSHUT, SACTION and SSTATECL) are
associated exclusively with the "RESERVE" type of instruction. Two
of these (OPENSHUT and SACTION) are called from one of the special
service-macros for JUNK type instructions, DEVICES. The third
component, SSTATECL, is called only by the macro-instruction
RESERVE. The components are described in Section C of this
Chapter.
The fourteen System Parameters, defined elsewhere in this
disclosure, are the variable parameters for the "RESERVE" and
"SUBMP" type macro-instructions. The System Parameters are briefly
described in Table 1.
Table 1. Brief descriptions of System Parameters for the SOLID
System. Definitions are given in the section on "System
Parameters."
System Parameter   Description
__________________________________________________________________________
&ADDL              Length of composite addresses
&JBLIST            Name of the JOBLIST array (JBLIST)
&LFAST             Length of fast-memory part of composite addresses
&LJBLIST           Length of the &JBLIST array
&LOVER1            Length of the principal over-ride arrays
&LOVER2            Length of the second Type II over-ride array
&LOVER3            Length of the second Type III over-ride array
&LSLOW             Length of slow-memory part of composite addresses
&LTHAYY            Length of the principal data-array
&MATRIXL           Number of entries in sub-arrays of the principal screen
&MATRIXS           Number of entries in sub-arrays of the secondary screens
&NTRKS             Number of records in each memory-block
&TPCORD            Number of permanent cords to be used in fast mode
&TRKL              Length of records in memory-blocks
__________________________________________________________________________
The 22 macro-instructions in the Reserve category are described
next.
Special Service-Macros:
1. CONSTANT-(CONSTANT )
Global constants and some error messages are defined in CONSTANT.
This macro-instruction appears only in the three "JUNK" type
instructions.
2. DEVICES-(&J DEVICES )
The DEVICES macro loads the second (BR3, register 11) and third
(BR4, register 12) USING registers for the CONTROL routines. It
calls the component OPENSHUT, which executes the IBM OPEN and CLOSE
system macro-instructions for the communication (e.g. Card Reader
and Punch; Printer) and bulk storage (e.g. magnetic tapes) devices.
DCB statements for these devices are given in the INOUT special
service macro of the Reserve category (see below). DEVICES prints
messages about the bulk-storage devices and it calls the components
SACTION and SMEMORY. After executing the termination procedures (at
the end of each job-stream) control is returned to the IBM
Operating System (O/S).
3. ENTRANCE-(ENTRANCE )
ENTRANCE is used in six of the eleven "SUBMP" type instructions to
specify that the relocatable addresses for all components are in
the main-stem of the designated CONTROL routines. These CONTROL
routines are actually compiled in the so-called extended forms (see
Section C). The remaining five CONTROL routines, whose "SUBMP" type
instructions do not contain ENTRANCE, are executed with planned
overlays.
4. INOUT-(INOUT )
The DCB or format statements for communication (e.g. Printer, Card
Reader and Punch) and bulk storage (e.g. Magnetic Tapes) devices
are specified in INOUT. The current allocation of resources is as
follows:
DCB Name   Device       Assigned Purpose
__________________________________________________________________________
MASTER     Card-Read    Read 80-columns of a card.
PRINT      Print        Prints 132 bytes, preceded by the control
                        character.
PUNCH      Card-Punch   Punches 1 columns on a card.
TAPEIND    Tape         Input tape containing uncompressed referenced
                        information.
TAPEINC    Tape         Input tape containing compressed referenced
                        information.
TAPEOTC    Tape         Output tape for the COPAK Compressor.
TAPEJB     Tape         Input tape containing the descriptor-sets.
__________________________________________________________________________
If new DCB's are added, then the DEVICES macro must be changed
also. Opening and closing operations, and DCB's for the Global
Memory component, SMEMORY, are specified in the DCBMEM
macro-instruction, which is executed in SMEMORY.
5. MODADI-(MODADI )
The relocatable addresses for the 31 components, five over-ride
arrays, and two JOBLIST arrays (JBLIST and JBWORK) are specified in
the MODADI macro-instruction by V-type addresses. The address of
the principal data-array, YY, is also specified. MODADI is used in
the JUNIO, JUNKC and JUNKR macro-instructions. All new V-type and
A-type addresses for the SOLID System must be specified in both
MODADI and MODADX.
6. MODADX-(MODADX )
Except for the EXTRN declarations of all A-type addresses, the
MODADX and MODADI macro-instructions are identical. MODADX is used
only in the macro-instruction JUNK, which is used to separately
compile the individual components as named CSECTS.
7. SAVEAREA-(SAVEAREA )
Global storage-areas for saving registers in the SOLID System are
specified in the SAVEAREA macro-instruction. Some registers are
also assigned names. The five COSAVEX (X=1,2,3,4 or 5) arrays are
used by the TRANSFER calling instruction. If more than five levels
are to be used, new arrays must be defined (see Section C).
8. STORAGE-(STORAGE &TPCORD,&LJBLIST)
The STORAGE macro-instruction allocates storage for the
input/output commands, indicators, counters and gates; and for the
composite addresses used by the RETRIEVAL PACKAGE. The permanent
cords table (PCORDS), and other arrays used by the COPAK compressor
are also allocated in STORAGE. The System Parameters &TPCORD
and &LJBLIST are mentioned in Table 1, above.
9. WORKAREA-(WORKAREA
&LOVER1,&LOVER2,&LOVER3,&LJBLIST)
The lengths of the five over-ride arrays (AOVER1, AOVER2, AOVER2R,
AOVER3, and AOVER3R) and the two JOBLIST arrays (JBLIST and JBWORK)
are assigned in WORKAREA, which is a special service-macro for the
"SUBMP" type of instruction. These eight arrays have been assigned
relocatable addresses (in MODADI and MODADX). The absolute machine
address for the beginning of each array is found in the location
specified in the MODADI and MODADX macro-instructions. For example,
the word AAOVER1 contains the absolute machine address of array
AOVER1.
"JUNK" TYPE INSTRUCTIONS
Two of the three "JUNK" type instructions (JUNKC and JUNKR) are
actually service-macros for the two "RESERVE" type instructions.
The third, JUNK, is used to separately compile the components of
the SOLID System for the overlay forms of the CONTROL routines. The
two System Parameters that are associated with "JUNK" type
instructions specify the number of permanent cords (&TPCORD)
and the length of the two JOBLIST arrays (JBLIST and JBWORK),
&LJBLIST. The values for these parameters are stored in
locations PCGATE and LJBLIST in the STORAGE macro-instruction.
10. JUNK-(JUNK &TPCORD,&LJBLIST)
JUNK is used as a DSECT to separately compile the components of the
SOLID System. It uses the MODADX macro-instruction.
11. JUNKC-(JUNKC &TPCORD,&LJBLIST)
JUNKC is used in the RESERCO macro-instruction, which appears in
the six CONTROL routines for stand-alone operation of the COPAK
compressor and its components.
12. JUNKR-(JUNKR &TPCORD,&LJBLIST)
JUNKR is the service macro for RESERVE. The two CONTROL routines
for the SOLID System, SOLIDE and SOLIDO, contain the RESERVE
macro-instruction.
"RESERVE" Type Instructions
The two "RESERVE" type instructions are the initializing
instructions in different CONTROL routines. The four System
Parameters associated with these instructions have been defined in
Table 1. Each "RESERVE" type instruction executes the IBM SAVE
instruction, which saves all registers, and establishes the
addressability of the entire CONTROL routine with registers 10, 11
and 12. These registers have been named BR1, BR3 and BR4
respectively. A brief description of each "RESERVE" type
instruction is given below:
13. RESERCO-(RESERCO &LTHAYY,&TPCORD,&LJBLIST)
RESERCO is the initializing macro-instruction for the six CONTROL
routines for the stand-alone COPAK compressor and its components.
The addresses SAVEYY, SBRYY and SBRY are computed and the permanent
cords table, PCORDS, is set to zero.
14. RESERVE-(RESERVE
&ADDL,&LTHAYY,&TPCORD,&LJBLIST)
RESERVE initializes the CONTROL routines for the entire SOLID
System (SOLIDE and SOLIDO). The actual initialization of registers,
M-J arrays, and the permanent cords table (PCORDS) occurs in the
component SSTATECL, which is called from RESERVE.
"SUBMP" Type Instructions
The "SUBMP" type instruction is the last executable statement in
the CONTROL routines. Its function is to locate the literal pool;
to dummy the relocatable addresses of unused components; to specify
and position the eight principal arrays; and to position components
compiled in the main stem. There are eight "SUBMP" type
instructions in MACROPAK. Six of these (SUBCB, SUBCBO, SUBCE,
SUBCO, SUBCJ, and SUBCJO) are associated with the six CONTROL
routines for the stand alone compressors. Two (SUBME and SUBMO) are
used in the CONTROL routines SOLIDE and SOLIDO respectively. A
"SUBMP" type instruction ending with an O is used in the CONTROL
routine that is to be executed as a planned overlay. "SUBMP" type
instructions without the final O are used for the so-called extended
forms of the CONTROL routines (see below).
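The naming convention just stated (a trailing letter O marks the planned overlay form) can be checked mechanically. The sketch below is in Python for illustration only; the eight names are those listed in the text, while the function name control_form is an assumption.

```python
SUBMP_INSTRUCTIONS = [
    "SUBCB", "SUBCBO", "SUBCE", "SUBCO",
    "SUBCJ", "SUBCJO", "SUBME", "SUBMO",
]


def control_form(submp_name):
    """Classify a "SUBMP" type instruction by its final letter."""
    if submp_name not in SUBMP_INSTRUCTIONS:
        raise ValueError("not a SUBMP type instruction: " + submp_name)
    return "planned overlay" if submp_name.endswith("O") else "extended"


by_form = {}
for name in SUBMP_INSTRUCTIONS:
    by_form.setdefault(control_form(name), []).append(name)
print(by_form)
```

Applied to the eight names, the rule yields four extended forms and four planned overlay forms.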
Thirteen of the 14 System Parameters are associated with the
"SUBMP" type of instruction. These parameters have been defined in
Table 1. The eight "SUBMP" type instructions are described
next.
15. SUBCB-(SUBCB &LTHAYY)
SUBCB and RESERCO are used in the CONTROL routine for the extended
form of the stand-alone numeric compressor-decompressor
(COPAKNU).
16. SUBCBO-(SUBCBO &LTHAYY)
SUBCBO is a variant of SUBCB which is used for the planned overlay
form of the stand-alone numeric compressor-decompressor
(COPAKNUO).
17. SUBCE-(SUBCE &LTHAYY)
SUBCE and RESERCO are used for the extended form of the stand-alone
combined compressor-decompressor, COPAKCO. This
compressor-decompressor contains both the numeric and alphanumeric
parts.
18. SUBCJ-(SUBCJ &LTHAYY)
The extended form of the stand-alone alphanumeric
compressor-decompressor, COPAKAN, uses the macro-instructions
RESERCO and SUBCJ.
19. SUBCJO-(SUBCJO &LTHAYY)
SUBCJO is the special variant of SUBCJ that is used in the planned
overlay form of the CONTROL routine COPAKAN, which is called
COPAKANO.
20. SUBCO-(SUBCO &LTHAYY)
SUBCO is a special variant of SUBCE. It is used in the planned
overlay CONTROL routine COPAKCOO.
21. SUBME-(SUBME
&ADDL,&LSLOW,&LFAST,&NTRKS,&TRKL,&LTHAYY,
&JBLIST,&LJBLIST,&LOVER1,&LOVER2,&LOVER3,&MATRIXL,&MATRIXS)
SUBME and RESERVE are used in the extended form of the CONTROL
routine for the entire SOLID System, SOLIDE. This CONTROL routine
executes all 31 components. The 13 System Parameters associated
with SUBME have been defined in Table 1.
22. SUBMO-(SUBMO &LOVER1,&LOVER2,
&LOVER3,&LJBLIST)
SUBMO is used for the overlay form of the CONTROL routine for the
entire SOLID System, SOLIDO.
A "RESERVE" type macro-instruction is the second instruction in the
control routine. A "SUBMP" type macro-instruction always precedes
the END statement.
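The placement rules for the "RESERVE" and "SUBMP" type instructions, together with the pairings stated above (RESERCO with the six stand-alone compressor routines, RESERVE with SOLIDE and SOLIDO), can be expressed as a small structural check. This Python sketch is illustrative only: it inspects a list of operation names, the function name check_control is an assumption, and the example routines are hypothetical.

```python
RESERVE_TYPES = {"RESERCO", "RESERVE"}
STANDALONE_SUBMP = {"SUBCB", "SUBCBO", "SUBCE", "SUBCO", "SUBCJ", "SUBCJO"}
SOLID_SUBMP = {"SUBME", "SUBMO"}


def check_control(opcodes):
    """Return a list of structural problems found in a CONTROL routine."""
    if len(opcodes) < 4:
        return ["routine too short to hold START, RESERVE, SUBMP and END"]
    problems = []
    first, second = opcodes[0], opcodes[1]
    before_last, last = opcodes[-2], opcodes[-1]
    if first != "START":
        problems.append("first statement is not START")
    if second not in RESERVE_TYPES:
        problems.append("second statement is not a RESERVE type instruction")
    if before_last not in STANDALONE_SUBMP | SOLID_SUBMP:
        problems.append("statement before END is not a SUBMP type instruction")
    if last != "END":
        problems.append("last statement is not END")
    # Pairings stated in the text: RESERCO belongs to the stand-alone
    # compressor routines, RESERVE to SOLIDE and SOLIDO.
    if second == "RESERCO" and before_last in SOLID_SUBMP:
        problems.append("RESERCO paired with a SOLID System SUBMP instruction")
    if second == "RESERVE" and before_last in STANDALONE_SUBMP:
        problems.append("RESERVE paired with a stand-alone SUBMP instruction")
    return problems


good = ["START", "RESERVE", "CALL1", "SUBME", "END"]   # hypothetical routine
bad = ["START", "RESERCO", "SUBMO", "END"]             # mismatched pair
print(check_control(good))
print(check_control(bad))
```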
B. Service-Macros
The 98 service-macros in the current version of MACROPAK can be
identified as either General or Special Service-Macros. The 37
General Service-Macros are needed for most information processing
systems. Thus they can be regarded as basic operations which are
not in the Assembly Language Processor (ALP) instruction set. The
64 Special Service-Macros execute the special bit, byte and string
manipulative operations used in the SOLID System.
In the following discussion an "address" means either a named
location or a singly subscripted variable. The two register form of
IBM addressing (i.e., D2(X2,B2)) is not allowed.
a. General Service-Macros
The thirty-seven General Service-Macros are identified in five
classes (see Table 2). The General Service-Macros can be used
anywhere in the SOLID System since all registers used by these
macros are preserved. Seven macro-instructions, in Class 4, are
obvious extensions of existing IBM 360 ALP instructions. The
remaining 30 General Service Macros must be viewed as vital
operations for the SOLID System. Twenty-one of these, in Classes 2
and 3, are calling procedures which are used to manage all I/O and
transfers during execution. The last nine macro-instructions, in
Classes 1 and 5 (see Table 3), perform vital arithmetic and string
movement operations on data.
The hardware implementation of many of the General Service Macros,
either singly or in special combinations, can very substantially
increase the already very impressive performance of the SOLID
System.
The 37 General Service Macros are discussed below.
TABLE 2.
Classes of the General Service Macro Instructions
Class    Macro Name    Macro No.
__________________________________________________________________________
1        CONVE         23
         DICE          24
         FACE          25
         TRUNC         26
2        CALL1         27
         CALL2         28
         CALL3         29
         CALL4         30
         CALL5         31
         GLOBAL        32
         STSR          33
         TESTL         34
         TRANSFER      35
3(i)     PR1NT         36
         PUNSH         37
         PUNXH         38
         REID1         39
         REID2         40
         REID3         41
         REID4         42
         REID5         43
         REID6         44
         REID7         45
         REID8         46
3(ii)    REIDJB        47
         REIDT         48
         WR1TE         49
4        CSSCRN        50
         DECPC         51
         DUMADD        52
         HEXPC         53
         LARGEXC       54
         SKIPP         55
         TALE          56
5        LMOVE         57
         RMOVE         58
         RMVC          59
__________________________________________________________________________
1. Arithmetic Operations
23. CONVE-(&J CONVE &FROM,&TO)
The integer number in address &FROM is converted to a
normalized (short) floating-point number and stored in address
&TO. All registers are unchanged.
24. DICE-(&J DICE &NOBITS,&RANDNO,&ODDNO)
A random integer number with the number of bits specified in
&NOBITS is produced in the full-word location &RANDNO
(right adjusted). &ODDNO contains the multiplicand, and it is
updated after each use of DICE. If &ODDNO contains zero, a
starting 32 bit odd number is constructed from the "Time of Day".
Eleven global words (in STORAGE) are reserved exclusively for
storing the random starting numbers. They are initialized to zero.
These words are GENERATE, ODDNO, and ODDNOX (with X=1,2,...,9). All
registers are unchanged.
25. FACE-(&J FACE &NFACES,&RANDNO,&ODDNO)
FACE is a special variant of DICE which produces a random integer
number (in &RANDNO) that lies in the range from 0 to one less
than the number specified in &NFACES. For example, if &NFACES
contains 3, &RANDNO will contain 0, 1 or 2. All registers are unchanged.
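The behavior described for DICE and FACE resembles a multiplicative congruential generator. The following Python sketch is illustrative only: the multiplier constant, the function names, the clock-based seed, and the choice of the high-order bits are assumptions and are not details given in the text.

```python
import time

MASK32 = 0xFFFFFFFF
MULTIPLIER = 69069  # assumed odd multiplier; the text does not state one


def seed_from_clock():
    """Build a 32-bit odd starting number from the time of day."""
    return (int(time.time() * 1000) & MASK32) | 1


def dice(nobits, oddno):
    """Return (randno, new_oddno); randno has at most `nobits` bits."""
    if not 1 <= nobits <= 32:
        raise ValueError("nobits must be between 1 and 32")
    if oddno == 0:
        oddno = seed_from_clock()          # zero means "not yet started"
    oddno = (oddno * MULTIPLIER) & MASK32  # odd times odd stays odd
    randno = oddno >> (32 - nobits)        # keep the high-order bits
    return randno, oddno


def face(nfaces, oddno):
    """Return (randno, new_oddno) with 0 <= randno < nfaces."""
    value, oddno = dice(32, oddno)
    return (value * nfaces) >> 32, oddno
```

With `nfaces` equal to 3, `face` returns only 0, 1 or 2, which matches the example given for FACE. The updated odd number is returned so that the caller can store it back, as the text describes for &ODDNO.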
26. TRUNC-(&J TRUNC &FROM,&TO)
The normalized (short) floating-point number in address &FROM is
truncated to an integer number and stored in address &TO. All
registers are unchanged.
2. Calling Procedures
Special calling procedures are used in the SOLID System to branch
between the CONTROL routine and the components or to branch among
the components themselves. The three types of calling procedures
have already been illustrated (FIG. 15). The Macro-instruction
CALL1 (see below) is the first type. TRANSFER is the second type.
GLOBAL is a special version of TRANSFER which is used exclusively
for calling the Global Memory component, SMEMORY.
CALL2, CALL3, CALL4, and CALL5 are the third type of calling
procedure.
There are two macro-instructions (STSR and TESTL) which are
associated with the seven special calling instructions (see Table
2, Class 2). TESTL is nested in the CALLX (X=1,...,5) and TRANSFER
macro-instructions. The STSR instruction is the first executable
instruction in each component. Together the TESTL and STSR
macro-instructions permit the use of V-type relocatable addresses
(in MODADI and MODADX) for all components of the SOLID System. With
a CALLX (X=1,...,5) instruction the USING and BRANCH registers are
altered. No register is changed by using either the GLOBAL or
TRANSFER instructions.
In the following macro-instructions &NAME, &NAME1, etc. are
the relocatable addresses (V-type) of the designated components. In
the SOLID System the name of a component is its relocatable address
preceded by an S (i.e., ANPAKC (relocatable address); SANPAKC
(Name)). &ALINST is the return address in the CONTROL routine.
&RETURN is the return address in the CONTROL routine or in the
component in which the call is issued.
27. CALL1-(&J CALL1 &NAME,&ALINST)
After the branching to the component S&NAME (entry point
&NAME) control is returned to location &ALINST in the
CONTROL routine. The USING(&UR) and BRANCH(&RR) registers
specified for component S&NAME are altered.
28. CALL2-(&J CALL2 &NAME1,&NAME2,&ALINST)
Components S&NAME1 and S&NAME2 are executed before control
is returned to location &ALINST in CONTROL.
29. CALL3-(&J CALL3
&NAME1,&NAME2,&NAME3,&ALINST)
The components with relocatable addresses &NAME1, &NAME2 and
&NAME3 are executed before returning to the address &ALINST
in the CONTROL routine.
30. CALL4-(&J CALL4
&NAME1,&NAME2,&NAME3,&NAME4,&ALINST)
The four components whose relocatable addresses are given in the
instruction are executed before returning to address &ALINST in
the CONTROL routine.
31. CALL5-(&J CALL5
&NAME1,&NAME2,&NAME3,&NAME4,&NAME5,&ALINST)
The indicated five components are executed before control is
returned to address &ALINST.
32. GLOBAL-(&J GLOBAL &ADDL,&NTRKS,&RETURN)
This instruction can be used anywhere in the SOLID System to call
the Global Memory component, SMEMORY. The return address
(&RETURN) can be in the CONTROL routine or anywhere in the
component where the call is issued. All registers are unchanged.
&ADDL is the length of the composite addresses, and &NTRKS
is the number of tracks or records in each memory-block (see Table
1).
33. STSR-(&J STSR &UR,&RR,&DUMMY)
This instruction, which is the first executable statement in each
component, defines addressability and restores registers 0, 1, 14
and 15 which are used in V-type addressing. &UR and &RR are
the using and branch registers for the component. &DUMMY is
DUMMY (for separate compilation of components) or SOLID (if the
component is positioned by the SUBMP service-macro). Further
details are given in the subsection on Components, Section C.
34. TESTL-(&J TESTL &NAME)
&NAME is the relocatable address of the component (viz. the
relocatable address of component SANPAKC is ANPAKC). This
instruction appears in the calling procedures for components (viz.
CALLX, TRANSFER, and GLOBAL) and the input/output routines (see
below). Together with the STSR instruction, TESTL ensures that the
registers 0, 1, 14, and 15 will be restored after executing a
branch and link instruction to a V-type address.
35. TRANSFER-(&J TRANSFER &N,&NAME,&RETURN)
TRANSFER is the so-called global calling procedure (see Section C).
&N is the assigned level of the component S&NAME.
&RETURN is defined above. If a TRANSFER instruction is issued
within a component, then &N must be greater than the assigned
level of the component. All registers are unchanged.
3. Input/Output Calling Procedures
Assembly Language Processor (ALP) input/output packages for tapes
and cards are a part of the SOLID System. They are global calling
procedures and can be used anywhere in the SOLID System. The DCB
statements for these packages are in the INOUT macro-instruction
(see Reserve). Thus, these DCB statements can be changed without
recompiling the individual input/output components. Here two forms
of the calling procedures are noted.
i. Card/Printer Operations
The print (PR1NT), read (REIDY, Y=1,2,...,8) and punch (PUNXH)
operations in the SOLID System are extremely versatile. A single
variable (&FORMAT) completely designates the formats (i.e., A,
I, J, B, E, F, mixed (e.g., IFB), or X). If certain kinds of
errors are made in these instructions, the operation is aborted and
an appropriate error message is printed. The addresses in these
operations can be either named locations or singly subscripted
variables. The two-register form of IBM addressing (i.e.,
D2(X2,B2)) is not permitted.
36. PR1NT-(&J PR1NT &FORMAT,&FROM,&TO)
All four-byte words (for numerical data) or bytes (for alphanumeric
data) between the addresses &FROM and &TO+4 are printed in
the fields and formats designated by &FORMAT. The first address
(&FROM) must be located on a word boundary. If &FROM is
greater than &TO the operation is aborted with the printed
message: ADDRESSING ERROR. All registers are unaltered.
37. PUNSH-(&J PUNSH &FORMAT,&FROM,&TO)
PUNSH is a special form of the PUNXH operation which punches (on
cards) the information between addresses &FROM and &TO in
the X or column binary format. Unlike PUNXH, PUNSH does not have an
associated component. No register is changed.
38. PUNXH-(&J PUNXH &FORMAT,&FROM,&TO)
Information between addresses &FROM and &TO+14 is punched
on cards in the field(s) and formats designated by &FORMAT.
Format options are A, I, B, E, F, mixed (e.g., IEF) or X. The
address &FROM should be located on a single word boundary.
Output obtained from the PUNXH operation can be read with the REIDY
(Y=1,2,...,8) operation. If &FROM is greater than &TO the PUNXH
operation is aborted with the printed message: ADDRESSING ERROR.
All registers are unaltered.
39 to 47. REIDY, with Y = 1,2,...,8.
There are eight separate card-read instructions with the form:
&J REIDY &FORMAT,&W1,&W2,...,&WY
&W1, &W2,...,&WY are the addresses of the single
variables or arrays where the information is to be stored in the
designated format, &FORMAT (see Chapter V). The REIDY operation
can be used to simultaneously load up to eight separate locations
or arrays. The formats can be A, I, B, J, E, F, mixed or X. At the
end of each completed card-reading operation the total number of
bytes that were stored is in location JII. If any address (&W1,
&W2, etc.) is in protected storage (i.e., in the O/S system or
CONTROL routine) or the &FORMAT conflicts with the number of
variables (Y) the REIDY operation is aborted with an appropriate
error message. No register is changed.
ii. Tape Operations
In the SOLID System there are two tape-read (REIDT and REIDJB) and
one tape-write (WR1TE) macro-instructions which are used in the
input (SREIDT), job-list (SJOBLIST) and output (SOUTPUT)
components. The DCB statements associated with these three
macro-instructions are in the INOUT macro-instruction. &ADDRESS
is the location in memory where the reading (from tape) or writing
(on tape) is to begin. The location &JII contains either the
number of bytes read into memory (for REIDT and REIDJB) or the
number of bytes that are to be written on tape (for WR1TE).
Registers are unchanged in these operations. In their present form,
all tape DCB's have a block-size of 3000 bytes and two buffers.
Variable length records (up to 3000 bytes) are used for all tapes.
However, there is no limit for the value in location &JII.
48. REIDJB-(&J REIDJB &ADDRESS,&JII)
This instruction uses the tape DCB statement TAPEJB (see INOUT in
the Appendix). This tape normally contains the index data that is
to be translated (or rearranged) to the JOB-LIST ITEM form which is
used to trace the information path through the AUXILIARY FILE. The
DCB name is TAPEJB and the DDNAME = COPAK8. No register is
changed.
49. REIDT-(&J REIDT &ADDRESS,&JII)
In this macro-instruction the retrieval command (i.e., MODE = 0
(retrieve) or MODE ≠ 0 (store or update)) determines which of two
tapes will be read. If MODE = 0 compressed bulk information, which
will be decompressed by COPAK, is read with the TAPEINC DCB. For
MODE ≠ 0 bulk (or referenced) information is read with the
TAPEIND DCB. This information will be processed by the COPAK
compressor and stored on the device indicated by the output
command, OUTPXT (see Chapter V). The DDNAMES of TAPEINC and TAPEIND
are COPAK5 and COPAK4 respectively. No register is changed.
50. WR1TE-(&J WR1TE &ADDRESS,&JII)
The &JII bytes of information beginning at &ADDRESS are
written on the tape with DCB TAPEOTC. The DDNAME of the TAPEOTC DCB
is COPAK6. All registers are unaltered.
4. Extended IBM and Message Operations.
There are seven macro-instructions in Class 4. Four of these
(CSSCRN, DUMADD, LARGEXC, and SKIPP) are extensions of existing IBM
360 ALP instructions. The remaining three (DECPC, HEXPC, and TALE)
are used to print messages. Brief descriptions of these Class 4
General Service Macros are given below.
51. CSSCRN-(&J CSSCRN &HLGTH,&ADD1,&ADD2)
This instruction compares the items beginning in locations &ADD1
and &ADD2 over the number of bytes given in the half-word
&HLGTH. &HLGTH can contain any positive integer number up to
65,535. If &HLGTH contains zero the first 256 bytes are
compared. Condition code settings are the same as those for the IBM
CLC (Compare Logical Character) instruction. The registers are not
changed by the CSSCRN instruction.
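The comparison rule stated for CSSCRN can be sketched as follows. This Python fragment is an illustration under stated assumptions: the function name and the use of byte strings in place of storage addresses are hypothetical, and the return value follows the CLC condition-code convention (0 for equal, 1 for first operand low, 2 for first operand high).

```python
def csscrn(hlgth, item1, item2):
    """Compare two byte strings; return a CLC-style condition code.

    0 = equal, 1 = first item low, 2 = first item high.
    A length of zero means "compare the first 256 bytes".
    """
    if not 0 <= hlgth <= 65535:
        raise ValueError("length must fit in an unsigned half-word")
    n = 256 if hlgth == 0 else hlgth
    a, b = bytes(item1[:n]), bytes(item2[:n])
    if a == b:
        return 0
    # Python compares bytes as unsigned values, left to right,
    # which corresponds to a logical (unsigned) compare.
    return 1 if a < b else 2
```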
52. DECPC-(&J DECPC &NAME,&LOC,&DISCR)
This instruction prints a message about the decimal-ten number in
&LOC thus: &NAME=&LOC &DISCR. If &DISCR=SECS,
the number in &LOC is printed as a fixed-point number with two
decimal places. Otherwise &LOC appears as an integer number. If
&DISCR=RATE then BYTES PER SEC. is printed for &DISCR.
&NAME and &DISCR are strings of up to 20 alphanumeric
characters. No register is altered by the DECPC instruction.
53. DUMADD-(&J DUMADD &BRANCH,&DUMMY)
This instruction generates the IBM 360 ALP statement:
&DUMMY EQU &BRANCH
The address that is to be dummied, &DUMMY, is set equivalent to
the address &BRANCH. The DUMADD macro-instruction is used
extensively in the "SUBMP" type instructions of the category
Reserve.
54. HEXPC-(&J HEXPC &NAME,&LOC,&LENGTH)
In this instruction, a string of hexadecimal characters, whose
length (in bytes) is in the full word &LENGTH is printed
thus:
&NAME=HEXADECIMAL STRING
The descriptive label (&NAME) can be any string of alphanumeric
characters up to 19 bytes long. If &LENGTH contains a number
greater than 65 or less than one, it is set to 65. The HEXPC
instruction does not change any register.
55. LARGEXC-(&J LARGEXC &NBYTES,&TO,&FROM)
LARGEXC is the extended form of the IBM XC (Exclusive Or Character)
instruction. The half-word &NBYTES contains the number of
bytes (up to 65535), beginning in the address &FROM, that are
to be exclusive-OR'ed into the string beginning in address &TO. If
&NBYTES contains zero LARGEXC is not executed. If it contains a
negative half-word number, then the first 256 bytes are
exclusive-OR'ed. All
registers are unchanged.
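Because a single XC instruction operates on at most 256 bytes, an extended form must cover a longer string in pieces. The Python sketch below illustrates that idea only; the function name, the chunked loop, and the use of a mutable byte array for the target string are assumptions and do not reproduce the macro expansion.

```python
CHUNK = 256  # a single XC instruction handles at most 256 bytes


def largexc(nbytes, to_buf, from_buf):
    """Exclusive-OR `from_buf` into `to_buf` in place."""
    if nbytes == 0:
        return                            # zero: the operation is skipped
    n = CHUNK if nbytes < 0 else nbytes   # negative: first 256 bytes only
    for start in range(0, n, CHUNK):
        end = min(start + CHUNK, n)
        # Each pass stands in for one hardware XC on up to 256 bytes.
        for i in range(start, end):
            to_buf[i] ^= from_buf[i]
```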
56. SKIPP-(&J SKIPP &LC)
&LC pages are skipped on the printer. All registers are saved.
The instruction: SKIPP 1 skips one page.
57. TALE-(&J TALE &DASH,&MESSAGE,&NOTIMES)
This instruction prints the message START OF NEW &MESSAGE,
&NOTIMES times. &DASH is the background in
which the message is printed. Thus TALE ASTERICK,SEGMENT,2 will
print (extended over 133 characters):
**********START OF NEW SEGMENT**********
**********START OF NEW SEGMENT**********
All registers are unchanged.
5. String Movement Instructions
There are three service-macros which greatly facilitate the
movement of strings of bytes in core-storage. In the first two
instructions (LMOVE and RMOVE), the halfword &NOBYTES contains
the number of bytes that are to be moved, beginning in address
&FROM, to begin in location &TO. The addresses cannot be
doubly subscripted in the RMOVE and LMOVE instructions. In all
three macro-instructions all the registers are saved.
58. LMOVE-(&J LMOVE &NOBYTES,&TO,&FROM)
If &NOBYTES ≤ 0 the LMOVE instruction is not executed.
The maximum number of bytes that can be moved is 65,535. LMOVE is
the extended form of IBM MVC instruction, which moves the bytes in
strings from left to right. No registers are changed.
59. RMOVE-(&J RMOVE &NOBYTES,&TO,&FROM)
This instruction is similar to LMOVE except that the bytes in the
strings are moved from right to left. Thus RMOVE can be used to
displace a string (of length &NOBYTES) right. &TO and
&FROM are the addresses of the leftmost bytes. No registers are
changed.
60. RMVC-(&J RMVC &D1,&IR,&B1,&D2,&B2)
In this special form of the RMOVE macro-instruction
&D1(&B1) is the address of the last byte in the string's
new location. &D2(&B2) is the address of the last byte in
the string that is to be moved. &D1 and &D2 are
displacements. &B1, &B2 and &IR are registers. &IR
contains the number of bytes which are to be moved. If &IR
contains zero or is negative the RMVC instruction is not
executed.
The reverse move macro-instructions (RMOVE or RMVC) are much slower
than the LMOVE or MVC instructions. The speed of the LMOVE and
RMOVE instructions primarily determines the speed of the compressor
and decompressor components of COPAK. These two instructions can
perform best when hardware implemented.
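The distinction between the left-to-right and right-to-left moves matters when the source and target strings overlap. The following Python sketch illustrates the two directions on a single buffer; the function names, the offset arguments, and the use of a byte array in place of core-storage addresses are assumptions made for illustration.

```python
def lmove(buf, nobytes, to, frm):
    """Copy bytes one at a time from left to right (as MVC does)."""
    if nobytes <= 0:
        return
    for i in range(nobytes):
        buf[to + i] = buf[frm + i]


def rmove(buf, nobytes, to, frm):
    """Copy bytes one at a time from right to left."""
    if nobytes <= 0:
        return
    for i in range(nobytes - 1, -1, -1):
        buf[to + i] = buf[frm + i]


# Displacing "ABCDEF" two places to the right within the same buffer:
right = bytearray(b"ABCDEF..")
rmove(right, 6, 2, 0)        # right is now b"ABABCDEF"

wrong = bytearray(b"ABCDEF..")
lmove(wrong, 6, 2, 0)        # wrong is now b"ABABABAB" (source overwritten)
```

In this sketch the right-to-left move preserves the string when it is displaced to the right, whereas the left-to-right move overwrites source bytes before they are read. For a displacement to the left the situation is reversed, and the left-to-right move is the safe one.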
b. Special Service-Macros
There are 64 macro-instructions entered in MACROPAK that are
exclusively associated with the 31 components of the SOLID System.
Two of these, BEGINS and MJARRAY, are used to initialize various
aspects of the entire SOLID System. Another three (DEVICE,
GETJLIST, and STRING), which use the General Service-Macro Input
calling procedures, handle input for the SOLID System and its
subsystems.
One special service-macro, SAVINGS, computes both the percentile
savings and the thruput rate for the COPAK Compressor. The
remaining 58 special service macros perform the special bit, byte,
string, and search operations that are an essential part of the
SOLID System.
The last 58 special service macros, the String Movement and
Arithmetic Operations (see Section B (a)), to a very substantial
degree, determine the speed of compression, decompression,
retrieval, storage, purging, and updating. The principal
service-macros can be hardwired in the SOLID System.
In this Section, B(b), the 64 special service-macros are discussed
in terms of the nine groups of components (see Section C). For
example, there are two components, OPENSHUT and SSTATECL, which
initialize the SOLID System. Also, there are six components in the
COPAK compressor that use the twenty-nine compressor special
service macros. In general all the special service macros have been
designed to perform highly selective tasks at particular locations
in the SOLID System. Registers, arrays, gates, and counters are
changed to achieve the designed purpose. They cannot be used
outside their designated environments.
1. Initializing Operations
61. BEGINS-(&J BEGINS)
This instruction, which appears near the beginning of the SSEARCH
component, is the initializing routine for the global memory
(SMEMORY) component. The physical characteristics of the virtual
memory devices (i.e., the number of cylinders, tracks/cylinder, and
the number of devices) are recorded in arrays CHECK, SOS, BWX, and
LSX.
62. MJARRAY-(&J MJARRAY &ADDL)
MJARRAY is used in the principal initializing component, SSTATECL.
It generates the M-J arrays when the SOLID System is used for the
first time. The MJARRAY macro-instruction is executed if NEWFILE
appears on the first card in the data-deck. &ADDL is the number
of bytes in the composite addresses (see Table 2).
2. Input Operations
Three macro-instructions (DEVICE, GETJLIST, and STRING) are used in
the CONTROL routines to read (from cards) twenty-three of the 27
input commands for the SOLID System and its subsystems. Here the
functions of the three macro-instructions will be discussed.
63. DEVICE-(&J DEVICE
&INPUT,&OUTPUT,&SKIPS,&SLENGTH,&LLENGTH,&RNOS,&CORDS).
The DEVICE and STRING (see below) macro-instructions are special
calling procedures for executing the SCOMMAND component. With
DEVICE the seven device commands are read from a single card. The
default options for each device command are executed in the
component SCOMMAND.
64. GETJLIST-(&J GETJLIST
&JLINPXT,&JLRSKIP,&JLTRAN,&JLNORM,&KLENGTH,&NJOBS,&NTASKS,&NVALUE,&JVALUE,&NUMDIAG,&GENERATE)
GETJLIST is a special calling procedure for executing the component
SJOBLIST which is the supervisory component for the TRANSLATION
PACKAGE. The first eight commands, which are associated exclusively
with the control and use of the TRANSLATORS, are read from a single
card. The last five commands are associated exclusively with the
random generation of JOBLIST ITEMS by Monte Carlo Generators. These
five commands are read (from a single card) only if the input
command &JLINPXT=16.
65. STRING-(&J STRING
&MODE,&POSTOP,&LEXCON,&LEXMODE,&LEXPCH)
STRING is a special calling procedure for executing the second half
of the input component SCOMMAND. The five string commands are read
from a single card.
3. Compressor
There are 29 special service macros associated with the COPAK
Compressor, which has ten components (see Section C). Four of the
special service macros (COPAB, COPAJ, JIMP1, and SINCOP) and two
components (SANPAKJ and SNUPAK) are special variants that are used
in the stand-alone Numeric, COPAKNU, and Alphanumeric, COPAKAN,
versions of COPAK. The relationships between the components of the
compressor are discussed hereinafter. Here it is sufficient to note
the following:
i. Three components handle input (SREADC and SREADT) and output
(SOUTPUT) for the compressor part of the SOLID System.
ii. Two components (SCOMMAND and SSTATECL) set up the strings of
information for processing by the COPAK compressor. SSTATECL also
initializes the AUXILIARY FILE.
iii. The actual compression/decompression (of compressed
information) is done by one special service macro (COPAB, COPAJ, or
COPAK), which calls the components when it is used. Table 3 shows
the relationships between the special service macros and the
compressor/decompressor components they call.
TABLE 3.
Decompressor/compressor components combined means both Numeric and
Alphanumeric Information. ##SPC6##
Strings of information are divided into segments, which are handled
by the alphanumeric components (SANPAKC, SANPAKJ and SANPAKD).
Segments are divided into substrings which are processed by the
numeric components (SNUPAB and SNUPAK). Detailed descriptions of the
composition of the strings before and after compression will be
found elsewhere.
The 29 special service macros associated with the COPAK compressor
are identified in the following three categories:
Calling Procedures: Three macro-instructions (COPAB, COPAJ and
COPAK) are used to call the compressor/decompressor components (see
Table 3.)
Alphanumeric: Sixteen macro-instructions are associated exclusively
with the three alphanumeric components (SANPAKC, SANPAKD and
SANPAKJ).
Numeric: Ten special service macros are associated exclusively with
the two numeric components (SNUPAB and SNUPAK).
A description of the 29 special service macros in their three
categories begins next.
i. Calling Procedures:
A single macro-instruction in each CONTROL routine is used to
execute the compressor/decompressor parts of the several different
forms of the COPAK compressor.
66. COPAB-(&J COPAB &RADD)
This special service macro uses the CALL1 instruction for calling
the stand-alone numeric component (SNUPAB). After executing SNUPAB
control returns to address &RADD in the CONTROL routines
(COPAKNU and COPAKNUO). The COPAB Macro-instruction is associated
exclusively with the "SUBMP" type instructions SUBCB and
SUBCBO.
67. COPAJ-(&J COPAJ &RADD)
COPAJ uses the CALL2 instruction for calling SANPAKJ then SANPAKC
for the stand-alone alphanumeric compressors (COPAKAN and
COPAKANO). &RADD is the return address in the CONTROL routines.
COPAJ is used only when the "SUBMP" type instructions SUBCJ and
SUBCJO are used.
68. COPAK-(&J COPAK &RADD)
COPAK uses the CALL3 instruction to call SANPAKD, SNUPAK, then
SANPAKC (in that order) before returning control to the address
&RADD in the CONTROL routines. COPAK is used in the CONTROL
routines for the SOLID System (SOLIDE and SOLIDO) and for the
stand-alone combined compressor (COPAKCO and COPAKCOO). Four
"SUBMP" type instructions (SUBCE, SUBCO, SUBME, and SUBMO) are
associated with COPAK.
ii. Alphanumeric
The distribution of the sixteen Special Service-Macros among the
several components is shown in Table 4.
TABLE 4.
Distribution of the Sixteen Alphanumeric Special Service Macros
among the indicated components. ##SPC7##
Two macro-instructions, OPCORDS and PPCORDS, organize and punch the
permanent cords table (PCORDS) respectively. The PCORDS table is
used by SANPAKC when it operates in the fast mode, (as described
hereinafter). Thruput rates and percentile savings for the
Compressors are computed in SAVINGS, which is executed at the end
of SANPAKC. In production situations SAVINGS should be removed. The
JIMP1 macro-instruction is used only in the stand-alone
alphanumeric compressor (COPAKAN and COPAKANO). It performs the
substring manipulations that are normally executed in the numeric
component (SNUPAK). The remaining twelve special service macros
perform highly specialized bit, byte, and string manipulative
operations.
69. BBM-(&J BBM &JII,&CODE,&CORD1)
The BBM macro searches the first &JII bytes of a string for the
repeated occurrences of the high order byte in &CODE. A bit-map
is constructed to denote the relative positions in the string where
the repeated byte occurs. The resulting bit-map and its length are
stored beginning at the location &CORD1. All registers are
preserved by this macro.
70. BBMD-(&J BBMD &JII,&CODE)
BBMD disassembles the bit-map which was constructed by the BBM
macro. The associated code and bit-map are removed from the head of
the string. The code is stored in &CODE. The addresses of the
occurrences of the code in the string are determined by
disassembling the bit-map and are stored in the array TL, which is
defined in JUNKR. &JII is reduced by the number of bytes in the
bit-map minus two and the string is moved left the same number of
bytes. All registers are preserved by this macro.
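The bit-map construction performed by BBM, and its reversal by BBMD, can be illustrated with the following Python sketch. It is a simplified model under stated assumptions: the function names and arguments are hypothetical, the bit order within each bit-map byte (leftmost bit first) is assumed, and the bit-map is returned separately instead of being stored with its length at the head of the string.

```python
def bbm(data, code):
    """Remove every occurrence of byte `code`; return (bitmap, remainder)."""
    bitmap = bytearray((len(data) + 7) // 8)
    remainder = bytearray()
    for pos, byte in enumerate(data):
        if byte == code:
            bitmap[pos // 8] |= 0x80 >> (pos % 8)  # mark the removed position
        else:
            remainder.append(byte)
    return bytes(bitmap), bytes(remainder)


def bbmd(bitmap, remainder, code, length):
    """Rebuild the original `length`-byte string from bitmap and remainder."""
    out = bytearray()
    rest = iter(remainder)
    for pos in range(length):
        if bitmap[pos // 8] & (0x80 >> (pos % 8)):
            out.append(code)        # position held the removed byte
        else:
            out.append(next(rest))  # position held an ordinary byte
    return bytes(out)
```

For the nine-byte string `b"A  B   C "` with the blank (0x20) as the repeated byte, this sketch leaves `b"ABC"` and a two-byte bit-map, so a saving results whenever the removed bytes outnumber the bit-map bytes plus any control information.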
71. CCD-(&J CCD &JII,&NBB)
The CCD macro searches a string for the repeats of an R-byte pattern
which is located &NBB bytes from the beginning of a string.
Only the &JII-&NBB bytes following the pattern in the
string are searched. The addresses of the repeats of the pattern
are stored in the array TL. This macro also determines whether or
not a savings can be made by comparing N*(R-1) with R+2. N is the
number of repeats of the pattern. If N*(R-1) > R+2, then a
savings can be made. Registers 0 to 3 are altered in this
macro.
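The search and the savings test described for CCD can be sketched in Python as follows. The function name, the zero-based offsets, and the treatment of repeats as non-overlapping are assumptions; only the inequality N*(R-1) > R+2 is taken from the text.

```python
def ccd(data, nbb, r):
    """Find repeats of the r-byte pattern located at offset nbb.

    Returns (addresses, saves): the offsets of the repeats that follow
    the pattern, and whether N*(R-1) > R+2 holds for N repeats.
    """
    if r < 1:
        raise ValueError("pattern length must be at least one byte")
    pattern = data[nbb:nbb + r]
    addresses = []
    pos = nbb + r                    # search only the bytes after the pattern
    while True:
        pos = data.find(pattern, pos)
        if pos < 0:
            break
        addresses.append(pos)
        pos += r                     # assumed: repeats do not overlap
    n = len(addresses)
    return addresses, n * (r - 1) > r + 2
```

For example, a three-byte pattern repeated three times gives 3*(3-1) = 6, which exceeds 3+2 = 5, so a saving is indicated; a four-byte pattern repeated twice gives 2*(4-1) = 6, which does not exceed 4+2 = 6.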
72. F1ND-(&J F1ND &JII,&CODE)
The F1ND macro searches the first &JII bytes of a string for
the single byte which is contained in &CODE. The addresses
where the byte occurs are stored in the array TL. Registers 0 to 3
are altered by the F1ND macro.
73. JAR-(&J JAR &JII,&NR)
The JAR macro combines the three low order bytes of &JII and
the low order byte of &NR into a single machine word. &JII
is increased by four bytes before the word is constructed. This
composite control word is then inserted at the head of the string
addressed by the register BRYY. This control word is used to
decompress the compressed strings of information.
74. JIMP1-(&J JIMP1 &RADD)
JIMP1 contains the numeric special service macros SOSCODE, ENDSS,
and STRINGA, which are normally executed in the component SNUPAK
(see (iii) Numeric below). JIMP1, which appears at the end of
component SANPAKJ (see Section C), is used for the stand-alone
alphanumeric compressor, COPAKAN and COPAKANO.
75. LAND-(&J LAND &JII,&NR)
The LAND macro removes the four bytes of control information placed
at the head of the string by the JAR macro. The four bytes are then
disassembled into two full words (&JII and &NR). &JII
is decreased by four to its original value. &JII and &NR
occupy three and one bytes in the composite control word. Registers
0, 1 and 2 are changed by this macro.
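The packing performed by JAR and its reversal by LAND can be illustrated with the Python sketch below. The function names and return values are hypothetical, and the placement of the length in the three leftmost bytes of the word with &NR in the rightmost byte is an assumption modeled on the layout stated for the CONSTRCT macro.

```python
def jar(string, jii, nr):
    """Prefix `string` with a 4-byte control word; return (string, jii)."""
    jii += 4                                     # count the control word
    word = ((jii & 0xFFFFFF) << 8) | (nr & 0xFF)
    return word.to_bytes(4, "big") + string, jii


def land(string):
    """Strip the control word; return (string, jii, nr) with jii restored."""
    word = int.from_bytes(string[:4], "big")
    jii = (word >> 8) - 4                        # back to the original length
    nr = word & 0xFF
    return string[4:], jii, nr
```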
76. LEX-(&J LEX &JII,&CODE,&CORD1)
This macro places the high order byte in &CODE and the R-bytes
in &CORD1 at the head of a string. The string is moved to the
right R+1 bytes to accommodate the addition. &JII is increased
by the number R+1. Register 2 is altered by this macro.
77. LEXD-(&J LEXD &JII,&CODE,&CORD1)
The LEXD macro removes the first R+1 bytes of a string storing the
first byte in &CODE and the following R bytes in &CORD1.
The string is moved to the left R+1 bytes. &JII is reduced by
the number R+1. Registers 0 and 1 are changed by this macro.
78. MAS-(&J MAS &JII,&NBB,&CORD1)
The MAS macro creates an R-byte opening in a string beginning
&NBB bytes from the head of the string. The R-bytes contained in
&CORD1 are then substituted for the single byte which was
located &NBB bytes from the head of the string. &JII is
increased by R-1. Registers 0, 1 and 2 are changed by this
macro.
79. OPCORDS-(&J OPCORDS &RADD)
In this macro-instruction, which appears at the beginning of the
SANPAKD component (see Section C), some indicator messages are
printed and the timing of the compression and decompression steps
is begun. In production runs, the statements which perform these
functions would be removed. The principal purpose of OPCORDS is to
select those &TPCORD permanent cords with the highest saving
ratios. &TPCORD is one of fourteen variable parameters which is
set before SOLID is compiled. &RADD is the beginning location
in the SANPAKD component of the string disassembly
macro-instruction (STRINGD).
80. PPCORDS-(&J PPCORDS &RADD)
The PPCORDS macro-instruction appears at the beginning of the
SOUTPUT component. &RADD is the first instruction in SOUTPUT
which follows PPCORDS. The principal function of this macro is to
print and punch (in column binary) the table of permanent cords
(PCORDS) if the LEXMODE command is zero.
81. RAND-(&J RAND &JII,&REPEATS)
The RAND macro removes a single composite byte from the head of a
string. The left four bits of the byte are incremented by one and
stored in &REPEATS. The right four bits of the byte are stored
in RM. RM is incremented by one and stored in R. &JII is
reduced by 1. Registers 0 to 3 are changed by this macro.
82. RRL-(&J RRL &JII,&REPEATS)
The RRL macro constructs a composite byte which is to be added to
the head of a string. The left four bits of the byte contain the
number in &REPEATS decreased by one. The right four bits of the
byte contain the number in RM. The composite byte is inserted at
the head of the string. &JII is increased by 1. Registers 0, 1
and 2 are altered by this macro.
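The composite byte built by RRL and taken apart by RAND can be sketched in Python as follows. The function names, the argument lists, and the range checks are assumptions; the split into two four-bit fields, and the offsets of one, follow the descriptions of the two macros.

```python
def rrl(string, jii, repeats, rm):
    """Prefix `string` with the composite byte; return (string, jii)."""
    if not 1 <= repeats <= 16 or not 0 <= rm <= 15:
        raise ValueError("each field must fit in four bits")
    composite = ((repeats - 1) << 4) | rm   # left nibble, right nibble
    return bytes([composite]) + string, jii + 1


def rand(string, jii):
    """Remove the composite byte; return (string, jii, repeats, r)."""
    composite = string[0]
    repeats = (composite >> 4) + 1          # left four bits, plus one
    rm = composite & 0x0F                   # right four bits
    r = rm + 1                              # pattern length R
    return string[1:], jii - 1, repeats, r
```

Storing each quantity less one allows values from 1 to 16 to be held in a four-bit field.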
83. SAM-(&J SAM &JII,&NBB,&CODE)
SAM substitutes the single byte contained in &CODE for the
R-bytes which are located &NBB bytes from the head of a string.
The trailing bytes of the string are moved left R-1 to eliminate
the spaces created during the substitution. Registers 0, 1 and 2
are changed by the SAM macro.
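The substitution performed by SAM, and the reverse substitution performed by MAS, can be illustrated with the Python sketch below. The function names and arguments are hypothetical, the offset is taken as zero-based, and the strings are modeled as immutable byte strings instead of being shifted in place.

```python
def sam(string, nbb, r, code):
    """Replace the r bytes at offset nbb with the single byte `code`.

    Returns (new_string, removed_pattern); the string shrinks by r - 1.
    """
    removed = string[nbb:nbb + r]
    return string[:nbb] + bytes([code]) + string[nbb + r:], removed


def mas(string, nbb, cord):
    """Replace the single byte at offset nbb with the bytes in `cord`.

    The string grows by len(cord) - 1.
    """
    return string[:nbb] + cord + string[nbb + 1:]
```

Applying `mas` at the same offset with the pattern removed by `sam` restores the original string, which is the relationship the compressor and decompressor rely upon.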
84. SAVINGS-(&J SAVINGS &RADD)
The percentile savings in storage and the thruput rate (expressed in
bytes/second of uncompressed information) are computed in SAVINGS,
which appears at the end of SANPAKC. Both the percentile savings
and thruput rate are printed if MODE ≠ 0 (storage and update
modes). Just the thruput rate is printed if MODE = 0 (retrieval).
SAVINGS would be omitted during production runs. &RADD is a
dummy address.
iii. Numeric
Two of the ten special numeric service macros (CONSTRCT and
EXTRAKT) assemble and decompose the segment control word (PARM),
which contains the segment length and the number of substrings.
Another two macro-instructions (SOSCODE and ENDSS) ensure that the
compressor (NUPAKC) and decompressor (NUPAKD) parts will process
each segment of information one substring at a time. SOSCODE also
redefines the state-of-substring control word, SOS. Of the
remaining three macro-instructions, two (STRINGA and STRINGD)
assemble and decompose compressed substrings respectively. The
macro-instruction COPAKEND, which appears in the output component
(SOUTPUT), executes the post-operation commands.
85. CONSTRCT-(&J CONSTRCT &PARM,&JII,&NV)
The four byte composite status-of-segment control word, &PARM,
is constructed with &JII in the leftmost three bytes and
&NV in the rightmost byte. &JII is the total number of
bytes in the segment of information. &NV is the number of
substrings in the segment. &JII and &NV are four byte
words. The CONSTRCT operation leaves all registers unchanged.
86. COPAKEND-(&J COPAKEND)
The COPAKEND macro-instruction, which appears at the end of
component SOUTPUT, executes the post-operation commands
POSTOP(=LJ), NJOBS (number of bulk-storage items), and NTASKS
(number of ITEMS in Joblist). These commands are:
LJ<0: increment LJ by one and, if LJ=0, set SWITCH=0 and
LEXPCH=1. This means that for the next segment of information the
alphanumeric compressor (SANPAKC) will operate in the fast mode and
the PCORDS table will not be punched.
Decrement NJOBS by one. If NJOBS >0 control goes to the location
CARDREAD in the CONTROL routine, where the next string is read.
If, after decrementation, NJOBS<0, then NTASKS is examined. For
NTASKS >0 control goes to the location LOOKFILE in the CONTROL
routine. Normally the RETRIEVAL PACKAGE is entered at LOOKFILE. For
NTASKS ≤0 control goes to the macro-instruction DISPENSE or
DISPOSE at location ANSWER in the CONTROL routine. Other options of
the POSTOP command are executed in DISPENSE and DISPOSE.
87. ENDSS-(&J ENDSS &RADD)
The ENDSS macro-instruction appears in the SNUPAK and SNUPAB
components after the SOSCODE, NUPAKC, and NUPAKD
macro-instructions. In ENDSS the new four byte status-of-substring
composite control word (SOS) is constructed. During decompression
the absolute error check for each decompressed substring occurs. If
an error is detected, control passes to location RTRANMIT in the
macro-instruction DEVICES (see Section A), where the error
procedure component (SACTION) is called (see Section C). If no
errors are detected, and some substrings are still to be processed,
control returns from ENDSS to location &RADD in SNUPAB or
SNUPAK.
88. EXTRAKT-(&J EXTRAKT &JII,&NV,&PARM)
This macro-instruction is the reverse of the CONSTRCT instruction.
The number of bytes in the segment (&JII) and the number of
substrings (&NV) are extracted from the four byte composite
status-of-segment control word, &PARM. &JII and &NV are
four byte words. All registers and &PARM are unchanged.
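The packing performed by CONSTRCT and reversed by EXTRAKT can be sketched as follows. This is only an illustration in Python, not the System/360 macro code; the function names `constrct` and `extrakt` and the range checks are assumptions made for the example.

```python
def constrct(jii: int, nv: int) -> int:
    """Build the four byte status-of-segment word: JII in the leftmost
    three bytes, NV in the rightmost byte."""
    if not 0 <= jii < (1 << 24):
        raise ValueError("JII must fit in three bytes")
    if not 0 <= nv < (1 << 8):
        raise ValueError("NV must fit in one byte")
    return (jii << 8) | nv


def extrakt(parm: int) -> tuple[int, int]:
    """Recover (JII, NV) from the composite control word; PARM is not modified."""
    return parm >> 8, parm & 0xFF


parm = constrct(1000, 7)      # 1000 bytes in the segment, 7 substrings
jii, nv = extrakt(parm)
```

Because NV occupies a single byte, a segment described this way can hold at most 255 substrings, and JII is limited to three bytes.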
89. NUPAKC-(&J NUPAKC &RADD)
NUPAKC is the numeric compressor macro-instruction which is used in
the numeric compressor components, SNUPAB and SNUPAK. After a substring
is processed by NUPAKC, control goes to location &RADD in
SNUPAB or SNUPAK for execution of the ENDSS macro-instruction.
90. NUPAKD-(&J NUPAKD &RADD)
NUPAKD is the numeric decompressor instruction in the component
SNUPAK. After processing a substring by NUPAKD, control is returned
to the ENDSS instruction in SNUPAK or SNUPAB at location &RADD.
NUPAKD reverses the steps executed in NUPAKC.
91. SINCOP-(&J SINCOP &RADD)
SINCOP is used in the numeric stand-alone compressor component
(SNUPAB). It contains the macro-instruction STRINGD and those parts
of component SANPAKD that are needed to process segments of
information. Information about the segments is printed and timing
of compression/decompression is begun in SINCOP. &RADD is a
dummy address.
92. SOSCODE-(&J SOSCODE &RADD)
The SOSCODE macro-instruction appears near the beginning of the
SNUPAK component (see Appendix). Its principal purpose is to
initialize the substrings for processing by either the compressor
or decompressor parts of the numeric compressor component, SNUPAK
or SNUPAB. The sign and NDR, which occupies the rightmost four bits
of the status-of-substring commands, are modified to divert control
to either the macro NUPAKC or to component SANPAKC. &RADD is
the instruction in SNUPAK which follows SOSCODE.
93. STRINGA-(&J STRINGA &RADD)
This macro is used near the end of the SNUPAK and SNUPAB components
(see Appendix). Its principal function during decompression is to
check that processing by the numeric decompressor part of SNUPAK
yielded the correct number of bytes for the segment. Disagreement
leads to the printing of an appropriate error message and control
passes to the location RTRANMIT, for processing by the
error-procedure component (SACTION).
During compression STRINGA inserts the information that is needed
to decompress and check the substrings at the head of the segment
of compressed information.
&RADD is the location of the instruction which follows STRINGA
in the components SNUPAK or SNUPAB.
94. STRINGD-(&J STRINGD &RADD)
This macro extracts the control information that was inserted at
the head of the segment by STRINGA during compression. This control
information is used to check the alphanumeric decompression (just
completed in SANPAKD) and is used by the NUPAKD macro-instruction
to decompress the segment, one substring at a time. Additional
error checks are made on each substring (in ENDSS) and also on the
segment after processing by SNUPAK (in STRINGA). &RADD is the
return address in the component SANPAKD or SANPAKJ.
4. TRANSLATION PACKAGE
The TRANSLATION PACKAGE is a subsystem of SOLID that produces
normalized JOBLIST ITEMS from the assigned descriptor sets or
generates them with random number generators. The JOBLIST ITEMS are
used to trace, create, purge, or update the information paths in
the AUXILIARY FILE.
The package consists of eleven components and eleven special
service macros. One special service macro (GETJLIST), which has
been described above (see macro 68), calls the TRANSLATION PACKAGE
supervisory component (SJOBLIST) from the CONTROL routines SOLIDE
and SOLIDO. SJOBLIST reads the thirteen TRANSLATION PACKAGE
commands and executes them. SJOBLIST performs the following six
functions:
i. Read TRANSLATION PACKAGE commands from cards.
ii. Read the assigned descriptor-set from the designated input
device, or
iii. Generate JOBLIST ITEMS with random number generators.
iv. Translate the assigned descriptor-sets to the JOBLIST form.
v. Read the override data from cards.
vi. Normalize the JOBLIST ITEMS produced in functions iii and iv.
The last function, vi, is not executed if the command JLNORM equals
zero.
One component (SGENITEM), which is called from SJOBLIST by the
special service macro JLITEM, generates the random JOBLIST ITEMS.
Another six special service macros (BITTHROW, JBLISTI, KERTHROW,
LBLTHROW, MJTHROW and SQUEEZE) are associated exclusively with the
component SGENITEM.
The special service macro TRANLATE calls the five TRANSLATOR
components from SJOBLIST. There are provisions for including up to
255 different TRANSLATORS. The TRANSLATOR components, which
rearrange the assigned descriptor-sets to the JOBLIST form, must be
coded for each new collection of items.
The override data is read (from cards) by a single special service
macro (OVERRIDE) in SJOBLIST.
The NORMALIZATION PACKAGE consists of four components (SNORMAL,
SCYCLIC, SREFLECT, and SXCHANGE) and one special service macro,
NORMFORM. However, three of these components (the TRANSFORMATION
PACKAGE) are also used by the MOBILE CANONICALIZATION PACKAGE.
NORMFORM is used in SJOBLIST to call the normalization supervisory
component, SNORMAL.
Seven of the ten special service macros that are to be described
below require an understanding of the JOBLIST ITEM structure, which
follows. ##SPC8##
LJBI is the number of bytes in the JOBLIST ITEM. M is the number of
nested information representations, which have ranks m_1, m_2,
m_3, ..., m_M. LD_0 is the principal diagonal of the
Information Representation (I.R.) of rank
and (LL_0 - 2) is its length. BD_i and LD_i (i ≠ 0) are
the left adjusted second bit-map (B_2) diagonal and the
associated LABEL diagonal respectively. (LB_i - 2) and (LL_i - 2)
are the lengths of the screens BD_i and LD_i. A JOBLIST
item terminates with an asterisk attached to the last non-zero
diagonal of LABEL. A JOBLIST consists of NTASKS JOBLIST items,
whose total length is found in JLL. A Bit Map Item consists of a
Bit-Map Head (LB_i) and a Bit-Map Screen (BD_i). An Information
Representation Item consists of an I.R. Head (i.e. LL_i) and an
I.R. Screen (i.e. LD_i). Bit-Map and I.R. elements are the bits
and elements in their screens.
95. BITTHROW-(&J BITTHROW &NUMBITS,&BITMAP)
This macro-instruction generates a pseudo-random Bit Map Item,
which contains a two-byte Bit-Map Head and the Bit-Map Screen. The
Bit-Map Head contains the length of the Bit-Map Screen plus two.
&NUMBITS contains the number of elements (i.e. bits) in the
screen. &BITMAP is the beginning address of the Bit-Map Item.
All registers are unchanged.
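A Bit Map Item of the kind BITTHROW produces might be modelled as below. This is a sketch under stated assumptions, not the macro itself: the Python function `bitthrow`, the use of Python's `random` module, big-endian byte order for the head, a screen length counted in whole bytes, and zero padding of the unused low-order bits of the last byte are all choices made for the illustration.

```python
import random


def bitthrow(numbits: int, rng: random.Random) -> bytes:
    """Return a Bit Map Item: two-byte head followed by the screen.

    The head holds the screen length (in bytes, an assumption) plus two.
    """
    if numbits < 0:
        raise ValueError("NUMBITS must not be negative")
    nbytes = (numbits + 7) // 8                  # bytes needed for the screen
    screen = bytearray(rng.getrandbits(8) for _ in range(nbytes))
    spare = nbytes * 8 - numbits                 # unused bits in the last byte
    if spare:
        # Keep the screen left adjusted: clear the unused low-order bits.
        screen[-1] &= (0xFF << spare) & 0xFF
    head = (nbytes + 2).to_bytes(2, "big")
    return head + bytes(screen)


item = bitthrow(12, random.Random(0))            # 12 bits -> 2 screen bytes
```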
96. JBLISTI-(&J JBLISTI
&JBLIST,&MVALUE,&JVALUE,&NUMDIAG)
JBLISTI constructs a pseudo-random JOBLIST item, in the array
&JBLIST(=JBLIST). Location &MVALUE contains the stipulated
value of `M`. &JVALUE contains the maximum value for each of
the m_i (1 ≤ i ≤ M) values in the first screen
(see above). &NUMDIAG contains the number of Bit-Map Item -
I.R. Item pairs that are to be constructed. No register is
altered.
97. JLITEM-(&J JLITEM
&JBLIST,&MVALUE,&JVALUE,&NUMDIAG)
This macro-instruction appears in the supervisory component
(SJOBLIST) and it calls the random JOBLIST ITEM Generator
Component, SGENITEM. Registers 2 through 5 are loaded with the
addresses of the four variables, in that order. If the &JBLIST
array is less than 256 bytes long, an error message is printed and
execution is terminated. The four variable parameters (&JBLIST,
&MVALUE,&JVALUE and &NUMDIAG) are defined above.
98. KERTHROW-(&J KERTHROW &JBLIST)
KERTHROW constructs a random principal diagonal for the information
representation (i.e., LD_0 above) and stores it in the array
JBLIST at the location &JBLIST. The principal diagonal, which
is used as the second screen in the JOBLIST item, has no `zero`
I.R. elements. Registers 5 through 15 are not changed.
99. LBLTHROW-(&J LBLTHROW &JAYBITS,&BITMAP)
This macro-instruction produces a single I.R. Item that is
associated with the Bit Map Item which begins in location
&BITMAP. &JAYBITS contains the number of I.R. elements that
are to be generated. All registers are unchanged.
100. MJTHROW-(&J MJTHROW &JBLIST,&M,&JAY)
MJTHROW assigns to `M` the value of &M and then generates M
random values of m_i (see above) whose values lie between one
and &JAY. &JBLIST is the starting address in the array
JBLIST of the current JOBLIST ITEM. &M contains the assigned
number of nested representations (M). &JAY contains the
maximum rank for the M nested representations. If &M ≤ 0
or &JAY ≤ 1 an error message is printed and execution is
terminated. All sixteen registers are unchanged.
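The parameter checks and rank generation described for MJTHROW can be illustrated as follows. The Python function `mjthrow` and the use of an exception in place of the printed error message are assumptions of this sketch; the actual random number generator used by the macro is not specified here.

```python
import random


def mjthrow(m: int, jay: int, rng: random.Random) -> list[int]:
    """Return M random ranks m_1..m_M, each between one and JAY inclusive."""
    if m <= 0 or jay <= 1:
        # The macro prints an error message and terminates execution;
        # an exception stands in for that behaviour here.
        raise ValueError("M must exceed 0 and JAY must exceed 1")
    return [rng.randint(1, jay) for _ in range(m)]


ranks = mjthrow(4, 6, random.Random(1))   # four ranks, none larger than 6
```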
101. NORMFORM-(&J NORMFORM
&JBLIST,&NTASKS,&JLL,&KLENGTH)
This macro-instruction is used in the TRANSLATION PACKAGE
supervisory component (SJOBLIST) to call the principal
NORMALIZATION PACKAGE component (SNORMAL). The four variable
parameters of NORMFORM are dummy parameters intended to indicate
the key information that is needed by SNORMAL. &JBLIST(=JBLIST)
is the beginning address of the JOBLIST. &NTASKS (=NTASKS)
contains the number of items in the JOBLIST. &JLL (=JLL)
contains the total length of the JOBLIST. &KLENGTH (=KLENGTH)
contains the number of bytes in each element of the I.R. NORMFORM
does not change any register.
102. OVERRIDE-(OVERRIDE )
The OVERRIDE macro-instruction is executed in component SJOBLIST
when control returns from TRANLATE. Information about the number of
Type 1, Type 2, and Type 3 override codes (which is automatically
collected by the TRANSLATORS) is used in OVERRIDE to read updating
information from cards. Registers 3 and 4 are changed.
103. SQUEEZE-(&J SQUEEZE &NUMBER,&ADDRESS)
SQUEEZE removes the `zero` or `empty` elements from a single I.R.
Screen and modifies the I.R. Head accordingly. &NUMBER contains
the number of I.R. elements in the I.R. Screen. &ADDRESS is the
beginning address of the associated Bit Map Item. All registers are
unaltered.
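The effect of SQUEEZE on one I.R. Item can be sketched as below. The sketch is simplified and rests on assumptions: it takes the I.R. Item directly (rather than the address of the associated Bit Map Item), treats the two-byte I.R. Head as a big-endian count equal to the screen length plus two, and takes the element width (KLENGTH bytes) as an argument. The function name `squeeze` is hypothetical.

```python
def squeeze(item: bytes, klength: int) -> bytes:
    """Drop all-zero elements from the screen and rewrite the head."""
    total = int.from_bytes(item[:2], "big")      # head = screen length + 2
    screen = item[2:total]
    if len(screen) % klength:
        raise ValueError("screen is not a whole number of elements")
    elements = [screen[i:i + klength] for i in range(0, len(screen), klength)]
    kept = [e for e in elements if any(e)]       # discard `zero` elements
    new_screen = b"".join(kept)
    return (len(new_screen) + 2).to_bytes(2, "big") + new_screen


# Three two-byte elements (5, 0, 9); the zero element is removed.
squeezed = squeeze(bytes([0, 8, 0, 5, 0, 0, 0, 9]), klength=2)
```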
104. TRANLATE-(&J TRANLATE
&ARRAY,&NTASKS,&JLL,&KLENGTH)
This macro-instruction is used in the supervisory component of the
TRANSLATION PACKAGE, SJOBLIST, to call the five TRANSLATOR
components (STLATORX, with X=1,2,...5). These TRANSLATORS rearrange
the assigned descriptor-sets to the JOBLIST form. Timing of
searches begins, and information about the TRANSLATOR is printed in
TRANLATE. The translator gate, (TGATE) which is a command read in
SJOBLIST, determines which of the five TRANSLATORS will be used.
There are provisions in TRANLATE for incorporating up to 255
different translators. When control returns from the selected
TRANSLATOR component, NOVER1, NOVER2 and NOVER3 contain the number
of occurrences of Type 1, Type 2 and Type 3 override codes. The
locations of these override codes in the JOBLIST are stored in the
three primary override-arrays. This information is used by the
OVERRIDE macro-instruction in the component SJOBLIST to read update
information from cards. The variables are the dummy parameters
defined for NORMFORM.
5. Transformation Package
The TRANSFORMATION PACKAGE consists of three components (SCYCLIC,
SREFLECT, and SXCHANGE). The TRANSFORMATION PACKAGE is used by both
the TRANSLATION and RETRIEVAL PACKAGES.
6. Normalization Package
At this time the NORMALIZATION PACKAGE consists of one empty shell
component (SNORMAL) and its calling macro, NORMFORM (see
Translation Package above). The NORMALIZATION PACKAGE will use the
TRANSFORMATION PACKAGE and it will be executed in the supervisory
component SJOBLIST after the TRANSLATORS have produced JOBLISTS
(see 4. Translation Package). New special service macros will be
incorporated here when they are implemented.
7. Retrieval Package
The RETRIEVAL PACKAGE will automatically retrieve, store, purge, or
update the AUXILIARY FILE and/or the MAIN FILE with information
produced by the Translation Package and, if necessary, the MOBILE
CANONICALIZATION PACKAGE also.
The RETRIEVAL PACKAGE consists of 16 special service macros, two
components (SSEARCH and SRESULT), and the MOBILE CANONICALIZATION
and GLOBAL MEMORY PACKAGES. Another two special service macros
(BEGINS and MJARRAY) and one component (SSTATECL) initialize the
RETRIEVAL Package (see 1. Initializing Operations above). They will
not be considered further here.
The GLOBAL MEMORY PACKAGE is called from the SSEARCH component by
the TBADD macro-instruction whenever a memory-block is to be
transferred between core-storage and virtual memory. The calling
procedure, GLOBAL, has been described above (see 2. Input Operations).
SRESULT, which prints search performance data, is called from the
CONTROL routines SOLIDE and SOLIDO after each use of the RETRIEVAL
PACKAGE. In production situations SRESULT will not be used.
The structure of the composite addresses, that are used in the
AUXILIARY FILE to reference the memory-blocks, is:
D and DNO specify the type of device and its number respectively.
TRK, CYLN, and FMADD are the track, cylinder, and fast-memory
addresses (relative to the starting point) respectively. If a
magnetic tape is specified, then the sixteen bits of TRK and CYLN
specify the record number.
Three of the 16 special service macros (APART, ASADD, and COMPARE),
which are used throughout the RETRIEVAL and GLOBAL MEMORY Packages,
are used to decompose, assemble, and compare the component parts of
the composite addresses. Another macro-instruction, BULK, updates
the Bulk Storage Composite Address (also called BULK) after each
new allocation of storage for referenced information. The
macro-instruction DISPENSE is executed in the CONTROL routines
SOLIDE and SOLIDO when control returns from components SSEARCH and
SRESULT. Its primary function is to execute the post-operation
commands which determine whether or not the COPAK compressor will
be used. DISPOSE is a special form of DISPENSE that is used in the
stand-alone compressor CONTROL routines. The macro-instruction
LINKHOLE, listed as instruction (115) below, ensures that unused
storage in memory-blocks, released during purging operations, will
be efficiently searched. The remaining nine of the 16 special
service macros are associated exclusively with the heavily nested
SSEARCH component.
105. APART-(&J APART
&ADDRESS,&RD,&RDNO,&RTRK,&RCYLN,&RFMADD)
APART extracts the five component parts from the composite address
(in location &ADDRESS) into the designated registers. If
&RTRK and &RCYLN are the same register, a magnetic tape is
specified, and both registers contain the record number. The
composite address, in location &ADDRESS, and all other
registers are unchanged.
106. ASADD-(&J ASADD
&ADDRESS,&RD,&RDNO,&RTRK,&RCYLN,&RFMADD)
This macro-instruction assembles the composite address in location
&ADDRESS from its five component parts, in the designated
registers. If &RTRK and &RCYLN are the same register, a
magnetic tape is specified, and the record number is placed in the
twelve bits normally occupied by TRK and CYLN (see above).
107. AUXFILE-(&J AUXFILE &LSLOW,&ADDL)
AUXFILE is executed in the SSEARCH component whenever the terminal
location of an information path has been found or created. The
terminal locations are in RFILE, and they contain the address(es)
of the compressed reference information. AUXFILE inserts (storage
or updating), deletes (purging), or collects (retrieval) the
bulk-storage address(es) in the RFILE sub-arrays of the AUXILIARY
FILE. New composite bulk storage addresses are assigned and the
macro-instruction BULK is executed in AUXFILE. The two System
Parameters (&LSLOW and &ADDL) are the lengths of the slow part
and of the entire (slow and fast) composite address respectively.
In our program &LSLOW=3 and &ADDL=6.
108. BULK-(&J BULK )
This macro-instruction is executed in AUXFILE after each new
assignment of a bulk storage address (called BULK). It increments
the record number in the composite address BULK by one.
109. COMPARE-(&J COMPARE &ADD1,&ADD2,&ADDL)
The first four parts (D, DNO, TRK and CYLN) in the composite
addresses &ADD1 and &ADD2 are compared. &ADDL is the
number of bytes (viz. six) in the addresses. The COMPARE
instruction sets the condition code in the PSW of the System 360.
&ADDL is a System Parameter which is set at compilation
time.
110. CREATE- (&J CREATE
&ADDL,&LSLOW,&LFAST,&NTRKS,&TRKL,&MATRIXL,&MATRIXS)
The CREATE macro, which is called by TBADD in the SSEARCH
component, is entered whenever a new sub-array is to be created in
a memory-block. The new subarray may be needed to define a new
subpath or it may be needed to extend an RFILE subarray. CREATE,
which is closely interconnected with the global memory component
(SMEMORY) via the TBADD instruction, is extremely complex. Without
this special service-macro the generalized retrieval system which
we have devised would not exist. The seven variable names (of
CREATE) are System Parameters that have been fully defined
elsewhere in this disclosure. If &ADDL, &LSLOW, &LFAST,
&NTRKS or &TRKL are changed, the AUXILIARY FILE must be
started from scratch. &MATRIXL and &MATRIXS can be changed
at any time.
111. CSCREEN-(&J CSCREEN &HLGTH,&ADD1,&ADD2)
CSCREEN is used in the SCREEN macro-instruction whenever the screen
portion of an EXECUTIVE POINTER is to be compared with the
corresponding screen in the JOBLIST item. &ADD1 is the
beginning location of the EXECUTIVE POINTER screen. &ADD2 is
the address of the JOBLIST item screen. &HLGTH is a half-word
which contains the screen length. All registers are unchanged.
The byte JI is used to indicate the comparison status thus:
JI=00; the screens are equal.
JI=01; the first screen (&ADD1) is zero (i.e., the location is
empty).
JI=02; the first screen (&ADD1) is less than the second
(&ADD2).
JI=04; the first screen is greater than the second.
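The four JI status values can be illustrated with a short sketch. It assumes that screens are compared byte by byte as unsigned values (as a System/360 logical compare would do) and that equality is tested before the empty-location test; the source does not state the order of these tests. The function name `cscreen` is used only for the illustration.

```python
def cscreen(first: bytes, second: bytes) -> int:
    """Return the JI comparison status for two screens of equal length."""
    if len(first) != len(second):
        raise ValueError("both screens must have the length given by HLGTH")
    if first == second:
        return 0x00          # the screens are equal
    if not any(first):
        return 0x01          # the first screen is zero (empty location)
    if first < second:       # bytes compare lexicographically, unsigned
        return 0x02          # first screen is less than the second
    return 0x04              # first screen is greater than the second


ji = cscreen(b"\x01\x01", b"\x01\x02")   # status 02: first is less
```

Using distinct bit positions (01, 02, 04) allows the status to be tested with a mask, which fits the multi-bit signalling described for MMATCH.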
112. DISPENSE-(&J DISPENSE &RADD)
DISPENSE is used in the CONTROL routines SOLIDE and SOLIDO
immediately after the SSEARCH and SRESULT components are called. It
is executed after each use of SSEARCH and again after each use of
the COPAK Compressor, when it is called from COPAKEND in SOUTPUT.
DISPENSE executes the Post-Operation (LJ), NTASKS, and NJOBS
commands. Register 1 is altered. &RADD is the entry location
(in the CONTROL routine) for reading input for the compressor.
113. DISPOSE-(&J DISPOSE &RADD)
DISPOSE is used in place of DISPENSE in the CONTROL routines for
the six stand-alone compressors. The primary difference between
DISPENSE and DISPOSE is the way they execute the NTASKS and NJOBS
commands.
114. INSERT-(&J INSERT &ADDL,&LFAST,&LTHAYY)
INSERT is used in the macro-instruction SCREEN (see below) to
insert new executive pointers in their correct positions in
subarrays in the AUXILIARY FILE. It is executed only when an
EXECUTIVE POINTER must be moved from a location in any subarray.
&ADDL, &LFAST, and &LTHAYY are System Parameters.
115. LINKHOLE-(&J LINKHOLE &EPLNGTH,&FEMPTY)
LINKHOLE is used in the CREATE macro-instruction. It is executed
only when a memory-block is completely used, and before a new
memory-block is created. Its designed purpose is to reuse vacant
storage areas in the resident memory-block that might have been
released during purging operations. Implementation instructions
have been incorporated in the macro (see Appendix). &EPLNGTH
contains the length of the new Executive Pointer. &FEMPTY is
the address of the first available subarray in the memory-block.
All available subarrays are chained together via their link
addresses.
116. MMATCH-(&J MMATCH
&NOVER1,&NOVER2,&NOVER3,&CURSCRN,&JBLIST,&JBWORK,&KLENGTH)
This macro-instruction is executed in the SSEARCH component
whenever a screen or index, which corresponds to a subpath, is not
found. It is also executed during retrieval operations if there are
override codes present in the JOBLIST items. If a storage operation
is being performed, the signal, MSIGNAL, is set to indicate that an
insertion is to be made in the resident memory-block and, after
exiting from MMATCH, the insertion is made. The screening
procedure, the CREATE macro, and the global-memory component
(SMEMORY) are tied together by the MSIGNAL and JI (see CSCREEN)
multi-bit signalling system. This signalling system polices the
resident memory-block and notifies CREATE whether or not arrays are
to be created. It also notifies SMEMORY what procedure is to be
executed when a new memory-block is needed or when the job-stream
is to be terminated. In retrieval operations MMATCH aborts the
search if no overrides are present; otherwise it passes control to
the MOBILE CANONICALIZATION PACKAGE via its calling procedure,
STRATEGY. The
last four variable names to MMATCH specify information needed in
STRATEGY (see below). The screen procedures, MMATCH, and the MOBILE
CANONICALIZATION PACKAGE are tied together by the SRGATE multi-bit
signalling system.
&NOVER1=NOVER1 contains the number of Type 1 override
codes.
&NOVER2=NOVER2 contains the number of Type 2 override
codes.
&NOVER3=NOVER3 contains the number of Type 3 override codes.
&CURSCRN,&JBLIST,&JBWORK, and &KLENGTH are dummy
variable names for STRATEGY which are explicitly defined in the
STRATEGY macro (see Appendix).
117. SCREEN-(&J SCREEN
&ADDL,&LFAST,&ADD1,&LTHAYY)
The SCREEN macro-instruction is executed in the SSEARCH Component
whenever a screen is sought (in the resident memory-block) in one
of the column arrays, or its extensions. After initializing the
counters and registers, if no override codes are present in the
JOBLIST item, the location in the array where the executive pointer
should be is determined by the SUPERSCH macro-instruction. If
overrides are present, control will go directly to MMATCH, for
processing by the MOBILE CANONICALIZATION PACKAGE. If the executive
pointer selected by SUPERSCH contains the sought screen (i.e. the
sub-path has been found) the tracing procedure (through TBADD)
continues. If the two screens are not identical, one of the
following can occur:
1. A retrieval job is terminated via the MMATCH macro with a
message that the search was unsuccessful.
2. If a vacancy exists, control again goes to MMATCH for insertion
of the new executive pointer, which defines a new subpath, and the
tracing of the information path continues.
3. If no vacancy exists a hole is created for the new executive
pointer and (2) occurs. The hole creating procedure is as follows:
If the column-array has a vacant location then all executive
pointers greater than the one to be inserted are moved and (2)
occurs. If the array is filled, the last executive pointer
(EP_L) is saved; a hole is created; the new executive pointer
is inserted; the continuance or extension array is found (via the
link-address) or created (via TBADD and CREATE); and SCREEN is
used to search the new array for the location to insert EP_L.
This procedure is repeated until all the executive pointers in an
array and its continuances are arranged in increasing order, then
the tracing (or creation) of sub-paths continues in the normal
manner. The procedure that will be used to ensure that the
information paths do not cross memory-blocks, thus eliminating
costly additional accesses to the virtual memory, is somewhat
analogous to the hole creating procedure that has just been
described. However, in this case, the automatic purging (still to
be implemented), hole-closing, and hole-creating capability of
SCREEN must all be used. Implementation of this capability will
require new service macros for SCREEN and TBADD. The four variable
names in the Screen prototype statement are System Parameters,
defined earlier.
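The hole-creating procedure described for SCREEN can be sketched as follows. The sketch is only a model under stated assumptions: executive pointers are represented by integers, a column-array and its continuances by a list of sorted Python lists of fixed capacity, and the creation of a continuance (done through TBADD and CREATE in the system) by appending an empty list. The function name `insert_pointer` is hypothetical.

```python
import bisect


def insert_pointer(arrays: list[list[int]], capacity: int, new_ep: int) -> None:
    """Insert new_ep so that the array and its continuances stay in
    increasing order, carrying the displaced last pointer forward."""
    carry = new_ep
    index = 0
    while True:
        if index == len(arrays):
            arrays.append([])            # stand-in for creating a continuance
        array = arrays[index]
        if len(array) < capacity:
            bisect.insort(array, carry)  # vacancy: later pointers move up
            return
        if carry > array[-1]:
            index += 1                   # belongs in a later continuance
            continue
        last = array.pop()               # save EP_L and open a hole
        bisect.insort(array, carry)      # insert the new executive pointer
        carry = last                     # EP_L is placed in the next array
        index += 1


chain = [[10, 20, 30]]
insert_pointer(chain, capacity=3, new_ep=15)   # chain becomes [[10, 15, 20], [30]]
```

Each pass touches one array, so an insertion into a long chain of full arrays ripples through every continuance that follows the insertion point.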
118. STRATEGY-(&J STRATEGY
&CURSCRN,&JBLIST,&JBWORK,&LTHAYY)
This macro instruction appears in the MMATCH macro. It is the entry
macro for the MOBILE CANONICALIZATION PACKAGE (see below).
STRATEGY is executed if any overrides are present. It interacts
closely with SCREEN and TBADD via the multi-bit SRGATE command.
&CURSCRN, &JBLIST, &JBWORK, and &LTHAYY are dummy
variables which are fully explained in the STRATEGY macro in the
Appendix.
119. SUPERSCH-(&J SUPERSCH &ADDL,&LFAST)
SUPERSCH is executed in the SCREEN macro-instruction. It is
executed if no override codes are present. However, it can be
entered from the macro-instruction STRATEGY. SUPERSCH does a
partition search of an array to find where the new executive
pointer should be located. The two System Parameters, &ADDL and
&LFAST, have already been defined.
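A partition search of the kind attributed to SUPERSCH can be illustrated as below, on the assumption that "partition search" means repeated halving of an ordered array (a binary search). Executive pointers are represented by integers and the function name `supersch` is used only for the example; the macro's actual register usage and array layout are not modelled.

```python
def supersch(array: list[int], target: int) -> tuple[int, bool]:
    """Return (position, found): where target is, or where it should go."""
    low, high = 0, len(array)
    while low < high:
        mid = (low + high) // 2          # partition the remaining range
        if array[mid] < target:
            low = mid + 1                # target lies in the upper part
        else:
            high = mid                   # target lies in the lower part
    found = low < len(array) and array[low] == target
    return low, found


position, found = supersch([10, 20, 30], 25)   # insertion point 2, not found
```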
120. TBADD-(&J TBADD
&ADDL,&LSLOW,&LFAST,&NTRKS,&TRKL,&MATRIXL,&MATRIXS)
The link address (for continuance or extension arrays) or address
in the executive pointer obtained by SCREEN is stored in ADDRESS,
and control goes to the TBADD macro. The peripheral equipment
addresses, which occupy the three left bytes of ADDRESS and
CURRENT, are compared in TBADD to determine if the required
memory-block (designated by ADDRESS) is resident in core. If it is
not in core the global-memory component (SMEMORY) fetches it. If
the create bit of the MSIGNAL signal byte is on, TBADD directs
CREATE to create a new column-array with &MATRIXL or
&MATRIXS locations for executive pointers. The variables for
TBADD are defined in the section on System Parameters.
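The residency test made in TBADD can be sketched as a comparison of the leading bytes of two composite addresses. The values LSLOW=3 and ADDL=6 are those quoted for the program in the AUXFILE description; the function name `block_is_resident` and the sample byte values are assumptions of the illustration.

```python
LSLOW = 3   # bytes of peripheral equipment (slow) address
ADDL = 6    # bytes in the entire composite address


def block_is_resident(address: bytes, current: bytes) -> bool:
    """True when ADDRESS designates the memory-block now in core (CURRENT)."""
    if len(address) != ADDL or len(current) != ADDL:
        raise ValueError("composite addresses must be ADDL bytes long")
    return address[:LSLOW] == current[:LSLOW]


current = bytes([1, 0, 7, 0, 0, 32])
wanted = bytes([1, 0, 7, 0, 0, 16])       # same block, different fast address
resident = block_is_resident(wanted, current)
```

When the test fails, the system calls the global-memory component (SMEMORY) to fetch the designated memory-block before tracing continues.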
8. Global Memory
The fully implemented GLOBAL MEMORY consists of one component
(SMEMORY) and two special service macros (DCBMEM and GLOBAL). The
calling procedure (GLOBAL), which has already been described above,
is used in the TBADD macro-instructions. GLOBAL MEMORY transfers
memory-blocks between core-storage and virtual memory. The Global
Memory component, SMEMORY, contains its own DCB or format
statements, which are specified in the macro-instruction
DCBMEM.
121. DCBMEM-(&J DCBMEM )
This macro-instruction appears at the end of the SMEMORY
component. It contains DCB or format statements together with the
IBM OPEN and CLOSE instructions for all the peripheral devices that
can be used by the GLOBAL MEMORY. If new peripherals are added the
initializing macro-instruction BEGINS, which appears in the SSEARCH
component, must be altered. However, except for the recompilation
of SSEARCH and SMEMORY, no other changes are necessary. One IBM
2314 disk is currently used for the virtual memory. Branching to
the location CLOSE terminates the job-stream. The virtual memory
peripheral devices are opened (in DCBMEM) when SMEMORY is used for
the first time. The DCB name is GLOBAL1 and its DDNAME is
COPAK7.
9. Mobile Canonicalization Package
This package consists of the calling procedure STRATEGY, which is
described above, and two components (SMATCH and SMOBILE), which
are referred to below. The aforementioned MOBILE CANONICALIZATION
PACKAGE can also use the TRANSFORMATION PACKAGE.
Component SMATCH will determine whether or not mismatches (in
MMATCH) are solely due to the presence of one or more override
codes. Component SMOBILE will be executed after SMATCH only if the
NORMALIZATION PACKAGE was used (to obtain NORMAL FORMS). The design
objectives for the various parts of the MOBILE CANONICALIZATION
PACKAGE have been fully discussed in the article by P. A. D. de Maine
and B.A. Marron, "The SOLID System I. A Method for Organizing and
Searching Files." in the book "Information Retrieval: A Critical
View." The book is edited by G. Schecter and was published by the
Thompson Book Company of Washington, D.C. in 1967.
C. Components
Components are macro-instructions, stored in the macro-library
SOLID.MACLIB, that have their own USING statements
to define addressability. In the components it is assumed that data
about information that is being processed and the information
itself will be found in certain locations and arrays whose
addressability is established in the CONTROL routine. Except for
this restriction, the components may be viewed as independent
subprograms or subroutines. They can be separately compiled (as
named CSECTS) for use in planned overlays (see CONTROL PROGRAMS),
or they may be compiled with the CONTROL routine by inserting their
ENTRY and prototype statements into the `SUBMP` type-instruction.
This flexibility in the use of components facilitates the
implementation or modification of components, and, with planned
overlays, makes it easy to fit the SOLID System onto small 360
configurations. Moreover, with this flexibility the SOLID System
can be spread over several partitions in a single computer or over
several computers in a network to further improve its already
impressive performance.
The calling procedures are used to branch between the CONTROL
routine (viz. main-stem) and the components or among the components
themselves. The three different types of calling procedures are
illustrated in FIG. 15, and they are described in Section B. The
points that are to be emphasized in this section, C, will be
illustrated with the calling procedures CALL1 (prototype: &J
CALL1 &NAME,&ALINST) and TRANSFER (prototype: &J
TRANSFER &N,&NAME,&RETURN), and the alphanumeric
compressor component (prototype: SANPAKC &NAME, &UR,
&RR, &DUMMY). &NAME is the relocatable address of the
component, obtained by dropping the first S from the component's
name. &ALINST and &RETURN are return addresses. &UR and
&RR are the USING and branch registers for the component.
&RR can be any register except &UR or 10, 11, and 12, which
are used to establish addressability in the CONTROL routine. The
choice of the base or USING register (&UR) is generally
restricted to 8, 9, and sometimes 7 (see below). &DUMMY
designates whether the component was separately compiled as a named
CSECT (&DUMMY=DUMMY) or if it was compiled in the CONTROL
routine by inserting the prototype statement in the "SUBMP" type
instruction (&DUMMY=SOLID). If &DUMMY is equal to DUMMY,
the separately compiled component is stored in the module-load
library, SOLID.LOAD, for use in planned overlays.
The instruction:
CALL1 ANPAKC,RANPAKC
executes the branch-and-link to the component (SANPAKC) and, on
completion of the component's task, returns control to the location
RANPAKC. Because the CALLX (X=1,2,...,5) instructions change the
values of registers &UR, &RR and others, they can only be
safely used if the return address (&ALINST=RANPAKC) is in the
CONTROL routine. In general, they cannot be used to branch between
two components.
The TRANSFER instructions can be used to branch between the CONTROL
routine and components or among components. In this instruction all
registers are unchanged. Thus the return address, &RETURN, can
be either in the CONTROL routine or in the component where the
instruction was issued. This greater versatility has been achieved
by assigning levels to every component. The level of the requested
component is incorporated into the TRANSFER instruction. For
example:
TRANSFER 1, ANPAKC,RANPAKC
means that control is to be transferred to component SANPAKC at the
first level. The return address RANPAKC can be in the CONTROL
routine or it can be in another component (where the TRANSFER was
used). The rules for using the TRANSFER instructions are given
next.
i. There are, in the given form, five levels available. This number
can be increased by adding to the COSAVE&N (&N=1,2,...,5)
arrays in the macro-instruction SAVEAREA.
ii. While every component has been assigned a specific level, new
levels can be arbitrarily reassigned. However, a level assigned to
the preceding members of a chain of components cannot be used. This
means that with a CALL1 and five successive TRANSFER instructions
control can be transferred from the CONTROL routine through five
components. The return address for each TRANSFER instruction can be
in the CONTROL routine or in the component which contains it.
iii. Components which have been assigned the level "GLOBAL" are not
subject to the restrictions in (ii). They have their own special
calling procedures, which can be used anywhere in the SOLID system.
PRINT, PUNXH, REIDX (X=1,2,...,8) and GLOBAL are such calling
procedures.
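Rules (i) and (ii) for the TRANSFER instruction can be modelled with a small sketch. It assumes one save area per level (standing in for the COSAVE&N arrays) and treats a level as unavailable while a preceding component in the chain is using it. The class `CallChain` and its methods are hypothetical, and the saving and restoring of registers is not modelled.

```python
class CallChain:
    """Tracks which TRANSFER levels are occupied by the current chain."""

    def __init__(self, levels: int = 5) -> None:
        # One save area per level; None means the level is free.
        self.save_areas = {n: None for n in range(1, levels + 1)}

    def transfer(self, level: int, component: str, return_address: str) -> None:
        if level not in self.save_areas:
            raise ValueError(f"no save area for level {level}")
        if self.save_areas[level] is not None:
            raise RuntimeError(f"level {level} is used earlier in the chain")
        self.save_areas[level] = (component, return_address)

    def leave(self, level: int) -> str:
        """Free the level and give back the return address recorded for it."""
        entry = self.save_areas.get(level)
        if entry is None:
            raise RuntimeError(f"level {level} is not in use")
        self.save_areas[level] = None
        return entry[1]


chain = CallChain()
chain.transfer(1, "ANPAKC", "RANPAKC")    # TRANSFER 1,ANPAKC,RANPAKC
return_to = chain.leave(1)                # control returns to RANPAKC
```

Components assigned the level "GLOBAL" fall outside this model, since they use their own calling procedures.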
The levels assigned to each component are shown in Table 5.
TABLE 5.
Levels and Classes of Components. The assigned levels are used in
the TRANSFER instruction (see Text). The classes are the categories
in this Section, C. ##SPC9##
The thirty-one components of the SOLID System (see Table 5) are
identified in nine classes or categories. Two (OPENSHUT and
SSTATECL) may be viewed primarily as components which initialize
the SOLID System. One of these (OPENSHUT) opens and closes the
peripheral devices (except virtual memory). Three (SPR1NT, SPUNXH,
and SREID) of the seven I/O components perform the basic operations
of reading or punching cards, and printing. The other four
components (SCOMMAND, SOUTPUT, SREADC and SREADT) are used to
communicate with the user. They, together with components SJOBLIST
and SRESULT, handle the input and output for the SOLID System and
its subsystems.
The six components in Class 3 are exclusively associated with the
COPAK compressor subsystem. One of these six (SACTION), which is
called from the macro-instruction DEVICES (see Section A), is
executed when a decompression error is detected. Another two
components, SANPAKJ and SNUPAB, are the special forms of SANPAKD
and SNUPAK that are used in the stand-alone alphanumeric and
numeric compressors.
The TRANSLATION PACKAGE contains seven components (Class 4). The
supervisory component, SJOBLIST, also reads the Translation Package
commands from cards and information about descriptor-sets. The
TRANSFORMATION (Class 5) and NORMALIZATION (Class 6) packages
contain three and one components respectively.
The RETRIEVAL PACKAGE (Class 7) contains two components, SRESULT
and SSEARCH. The component SRESULT prints information after each
use of the SSEARCH component. In production situations SRESULT can
be omitted.
The GLOBAL MEMORY (Class 8) and MOBILE CANONICALIZATION (Class 9)
packages contain one and two components respectively.
In the remainder of this section, C, short descriptions of the 31
components are given.
1. Initializing Components
122. OPENSHUT-(OPENSHUT )
This level 0 component is always compiled in the main stem (viz.
CONTROL ROUTINE). It is positioned by the "SUBMP" type instruction
(see Section A). It opens (at the beginning of each jobstream) and
closes (during termination) the DCB's specified in the INOUT
macro-instruction. All peripheral devices, other than those in the
virtual memory, must be specified in INOUT and OPENSHUT. OPENSHUT,
which is called in the macro-instruction DEVICES, uses registers 8
and 9 for USING and Branching.
123. SSTATECL-(SSTATECL
&NAME,&UR,&RR,&ADDL,&LTHAYY,&DUMMY)
SSTATECL is called from the RESERVE macro-instruction at the start
of each job-stream. It initializes addresses used in SSEARCH and
generates or reads (from cards) the permanently resident part of
the AUXILIARY FILE, which contains the array associated with the
prime index, M, and the screen J (= m_1 m_2 m_3 ... m_M).
If the first card of the data-deck is NEWFILE then the
MJARRAY macro-instruction generates the M-J arrays.
Level=1
&NAME is STATECL
&UR can be register 8 or 9
&RR can be any register except &UR, 10, 11 or 12.
&ADDL is the number of bytes in the composite addresses which
preface a memory-block.
&LTHAYY is the number of bytes in the principal data-array,
YY.
2. I/O Components
SPR1NT, SPUNXH and SREID are the basic card/printer I/O components
for the SOLID System. SCOMMAND reads (from cards) both the device
and string commands. SOUTPUT is the principal output component. It
handles the compressed and decompressed referenced information.
SREADC and SREADT read in the referenced information that is to be
compressed or decompressed.
i. Basic Components
There are two output (SPR1NT and SPUNXH) and one input (SREID)
components in the SOLID System which are used to print, punch cards
and read cards. These components, which are called in the SOUTPUT
and SREAD components, can be used (with the PR1NT, PUNXH, and REIDX
macro-instructions) on a stand-alone basis. They are extremely
versatile, and can be used anywhere in the SOLID System.
124. SPR1NT-(SPR1NT &NAME,&UR,&RR,&DUMMY)
The PR1NT service-macro calls the component SPR1NT, which prints
the requested information in the designated format(s) on the
printer. The DCB, PR1NT, is specified in the INOUT
macro-instruction.
Level=GLOBAL
&NAME is PR1NT
&UR must be register 9; &RR can be any register except
9-12.
125. SPUNXH-(SPUNXH &NAME,&UR,&RR,&DUMMY)
This component is called by PUNXH (see Section B). The DCB, PUNXH,
is specified in INOUT (see Section A).
Level=GLOBAL
&NAME is PUNXH.
&UR must be register 9.
&RR can be any register except 9, 10, 11, or 12.
126. SREID-(SREID &NAME,&UR,&RR,&DUMMY)
This component is called by the eight REIDX (X=1,2,...,8)
service-macros. It reads information from cards with the DCB named
MASTER (specified in INOUT).
Level=GLOBAL
&NAME is REID.
&UR must be 9
&RR can be any register except 9, 10, 11 or 12.
ii. Special Components
127. SCOMMAND-(SCOMMAND &NAME,&UR,&RR,&DUMMY)
This component has two parts, both called from the CONTROL
routines. The first part (&NAME=COMANDD) reads the device
commands from cards (the calling instruction is DEVICE). The second
part (&NAME=COMANDS), which is called by STRING, reads the string
commands. The default options for all input commands are set in
SCOMMAND.
Level=1
&NAME=COMMAND, COMANDD or COMANDS.
&UR must be 8
&RR can be any register except 9, 10, 11 or 12.
128. SOUTPUT-(SOUTPUT &NAME,&UR,&RR,&DUMMY)
This output package for the COPAK compressor prints, punches, or
writes on tape the information strings that are processed by COPAK.
A format code, which is stored with the compressed string of
information, is used to print or punch the output in the entry
format type.
Level=1
&NAME is OUTPUT
&UR can be register 8 or 9
&RR can be any register except &UR, 10, 11, or 12.
129. SREADC-(SREADC &NAME,&UR,&RR,&DUMMY)
The SREADC component reads control information and the substrings
of data on cards that are processed by COPAK. The PCORDS table is
read and checked in SREADC. The status-of-substring control words
(SOS) are modified and the status-of-segment control word (PARM) is
constructed. The input commands are printed at the end of the
SREADC component.
Level=1.
&NAME is READC.
&UR is register 8 or 9.
&RR can be any register except &UR, 10, 11, or 12.
130. SREADT-(SREADT &NAME,&UR,&RR,&DUMMY)
This component reads the compressed and/or decompressed information
from magnetic tape. Compressed information is read with the DCB
named TAPEIND (DDNAME=COPAK4) and uncompressed information is read
with the DCB named TAPEINC (DDNAME=COPAK5). DCB's are specified in
the INOUT macro-instruction.
Level=1
&NAME is READT
&UR can be either 8 or 9.
&RR can be any register except &UR, 10, 11 or 12.
3. Compressor
The stand-alone numeric (COPAKNU), alphanumeric (COPAKAN), and
combined (COPACOKO) Compressors use different combinations of the
six compressor components. The combined compressor, which is also
used in the CONTROL routines SOLIDE and SOLIDO, consists of two
alphanumeric (SANPAKC and SANPAKD) and one numeric (SNUPAK)
component. COPAKAN contains SANPAKC and SANPAKJ, which is a
modified form of SANPAKD. COPAKNU contains the modified form of
SNUPAK, which is called SNUPAB. The SACTION component is called
from the main stem whenever decompression errors occur.
131. SACTION-(SACTION &NAME,&UR,&RR,&DUMMY)
This component is called from the macro-instruction DEVICES in the
"Reserve" type instruction (see Section A). Control passes to
SACTION from the decompressor parts of the compressors whenever an
error is found. Currently SACTION prints appropriate error messages
and terminates the job-stream. Error correcting procedures and/or
retransmission requests should be handled by SACTION.
Level=1
&NAME is ACTION
&ur can be any register other than 0, 1, or 10-15.
&RR can be any register other than &UR, 10, 11 or 12.
132. SANPAKC-(SANPAKC
&NAME,&UR,&RR,&DUMMY)
This component compresses the strings of information by two
recursive bit-pattern methods in one of two modes. In the
SLOW-MODE the recursive bit-patterns that are used are obtained
from the string itself, and those bit patterns which yield savings
are stored in a table, PCORDS. In the FAST-MODE only those
bit-patterns in the PCORDS table are used. The three input commands
associated with SANPAKC (LEXCON, LEXMODE, and LEXPCH) provide the
following options:
i. Build a new PCORDS by compressing the first X strings of
information in the SLOW-MODE, then process all subsequent strings
in the FAST-MODE.
ii. Read in PCORDS and operate exclusively in the FAST-MODE.
iii. Extend the PCORDS table by processing the first X strings in
the SLOW-MODE and then switch to the FAST-MODE.
Level=1.
&NAME is the relocatable address ANPAKC.
&UR -- the base or using register -- can be 8 or 9.
&RR -- the branch register -- can be any register except
&UR, 10, 11 or 12. In our programs &RR = 1.
&DUMMY is discussed above.
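The three options above differ only in whether a PCORDS table is supplied at the start and in how many strings are processed in the SLOW-MODE. The following is a minimal sketch of that scheduling, not the SANPAKC algorithm: the function names, the fixed two-character pattern width, and the "pattern repeats within the string" test for savings are all assumptions made for illustration.

```python
def patterns_in(string, width=2):
    """All substrings of the given width (a stand-in for candidate bit-patterns)."""
    return {string[i:i + width] for i in range(len(string) - width + 1)}


def run(strings, slow_count, pcords=None):
    """Return (mode used for each string, final stand-in for the PCORDS table).

    Option i:   pcords=None,  slow_count=X  (build a new table, then FAST-MODE)
    Option ii:  pcords=table, slow_count=0  (FAST-MODE only)
    Option iii: pcords=table, slow_count=X  (extend the table, then FAST-MODE)
    """
    table = set(pcords or ())
    modes = []
    for index, string in enumerate(strings):
        if index < slow_count:
            modes.append("SLOW")
            # SLOW-MODE: patterns come from the string itself; keep only those
            # assumed to yield savings (here: patterns that occur more than once).
            for pattern in patterns_in(string):
                if string.count(pattern) > 1:
                    table.add(pattern)
        else:
            # FAST-MODE: the table is consulted but never extended.
            modes.append("FAST")
    return modes, table
```

For instance, run(["abab", "cdcd", "efef"], 2) processes the first two strings in the SLOW-MODE and leaves the table untouched for the third.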
133. SANPAKD-(SANPAKD &NAME,&UR,&RR,&DUMMY)
This component first decompresses strings compressed by SANPAKC
then disassembles its decompressed strings for processing by the
decompressor part of the numeric compression package (SNUPAK).
Since all the information that is needed to decompress the strings
is stored in the strings themselves, no additional data is
needed.
Level=1.
&NAME is the relocatable address ANPAKD.
The restrictions on &UR, &RR and &DUMMY for SANPAKC
apply for SANPAKD also.
134. SANPAKJ-(SANPAKJ &NAME,&UR,&RR,&DUMMY)
SANPAKJ is the modified form of SANPAKD that is used in the
stand-alone alphanumeric compressor (COPAKAN). It contains the
macro-instruction JIMP1, which performs those substring operations
which are normally executed in SNUPAK.
Level=1
&NAME is the relocatable address ANPAKD.
&UR, &RR, and &DUMMY have the specifications given for
SANPAKC and SANPAKD.
135. SNUPAB-(SNUPAB &NAME,&UR,&RR,&DUMMY)
SNUPAB is a modified form of SNUPAK that is used in the
stand-alone numeric compressor (COPAKNU). SNUPAB contains the
macro-instruction STRINGD, which is normally executed at the end of
SANPAKD.
Level=1.
&NAME is the relocatable address NUPAK.
&UR can be 8 or 9
&RR can be any register except &UR, 10, 11 or 12.
&DUMMY is set equal to SOLID if SNUPAB is compiled in the
main-stem (viz. extended form). It is set equal to DUMMY if SNUPAB
is compiled separately as a named CSECT.
136. SNUPAK-(SNUPAK &NAME,&UR,&RR,&DUMMY)
SNUPAK is the numeric compressor-decompressor package in COPAK. It
processes strings of information one substring at a time.
Compression is accomplished by a four-step procedure: truncation,
differencing, sequencing, and packing. There are three truncation
methods, two of which are automatic. If savings cannot be achieved,
compression is terminated without loss of information.
Level=1
&NAME is NUPAK.
&UR can be register 8 or 9.
&RR can be any register except &UR, 10, 11, or 12.
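Of the four steps, differencing lends itself to a short illustration. The following is a sketch of reversible recursive differencing that stops as soon as a further pass yields no savings. It is not the SNUPAK procedure: the function names, the pass limit, and the size estimate (bit length of the largest magnitude after the first value) are assumptions.

```python
def difference(values):
    """One pass: keep the first value, then store successive differences."""
    return values[:1] + [b - a for a, b in zip(values, values[1:])]


def undo_difference(diffs):
    """Inverse of difference(): rebuild the values by running summation."""
    out = []
    for d in diffs:
        out.append(d if not out else out[-1] + d)
    return out


def cost(values):
    """Assumed size estimate: bits needed for the largest magnitude."""
    return max((abs(v).bit_length() for v in values), default=0)


def recursive_difference(values, max_passes=4):
    """Apply difference() repeatedly while each pass reduces the cost."""
    current, passes = list(values), 0
    while passes < max_passes:
        candidate = difference(current)
        if cost(candidate[1:]) >= cost(current[1:]):
            break  # no savings: stop here, nothing is lost
        current, passes = candidate, passes + 1
    return current, passes


def restore(diffs, passes):
    """Undo the recorded number of passes to recover the original values."""
    for _ in range(passes):
        diffs = undo_difference(diffs)
    return diffs
```

The number of passes must be kept with the differenced values so that the string carries everything needed for its own decompression.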
4. Translation Package
The TRANSLATION PACKAGE consists of seven components and the
NORMALIZATION PACKAGE, which uses the TRANSFORMATION PACKAGE. One
of the seven components, SJOBLIST, can be regarded as the
supervisory routine for the entire Translation Package. The
macro-instruction GETJLIST calls SJOBLIST from the CONTROL routines
SOLIDE and SOLIDO.
SJOBLIST is also the input component for the Translation Package.
It reads the Translation Package commands; generates or reads
descriptor-sets; reads over-ride information; rearranges
descriptor-sets to the JOBLIST item form; and normalizes the
JOBLIST items. Random JOBLIST items are produced by the component
SGENITEM. Five TRANSLATOR components (STLATORX, with X=1,2,...,5),
which are called by the special service macro TRANSLATE, convert
the descriptor-sets to their JOBLIST item form. There are
provisions for incorporating up to 255 TRANSLATORS. JOBLIST items
are converted to their NORMAL FORMS by the Normalization Package,
which is called in SJOBLIST by the special service macro
NORMFORM.
The TRANSLATION PACKAGE components are briefly described next:
137. SGENITEM-(SGENITEM &NAME,&UR,&RR,&DUMMY)
This component generates JOBLIST items in the array stipulated by
its calling procedure (JLITEM), which appears in the component
SJOBLIST. SGENITEM uses the information in registers 2, 3, 4 and 5
that were loaded in JLITEM.
Level=2
&NAME is the relocatable address GENITEM.
&UR can be register 8 or 9.
&RR can be any register except 2, 3, 4, 5, &UR, 10, 11 or
12
&DUMMY has been defined above.
138. SJOBLIST-(SJOBLIST &NAME,&UR,&RR,&DUMMY)
SJOBLIST is the supervisory component for the TRANSLATION PACKAGE.
Its functions have been briefly described above.
Level=1.
&NAME is JOBLIST
&UR can be either 8 or 9
&RR can be any register except &UR, 10, 11, or 12.
139. to 143. STLATORX, with X=1,2,3,4, or 5
There are five TRANSLATOR components which have prototype
statements like: STLATOR1 &NAME,&UR,&RR,&DUMMY. The
first, STLATOR1, is reserved for the AGISAR Translator, which
rearranges automatically extracted data from N-dimensional graphs
(or pictures) to the JOBLIST form. New Translators must be coded
for each new collection of items. There are provisions in the
special service macro TRANLATE, which appears in component
SJOBLIST, for incorporating up to 255 Translators.
Level=3.
&NAME is TLATOR1 or TLATOR2 or TLATOR3 or TLATOR4 or
TLATOR5.
&UR can be 8 or 9.
&RR can be any register except &UR, 10, 11 or 12.
5. Transformation Package
The TRANSFORMATION PACKAGE consists of three components whose design
purposes are to execute the CYCLIC SHIFT, REFLECTION, and the
INTERCHANGE Transformation Rules. These components can be used by
both the NORMALIZATION and MOBILE CANONICALIZATION packages.
144. SCYCLIC-(SCYCLIC &NAME,&UR,&RR,&DUMMY)
The CYCLIC component will execute both the left and right cyclic
shifts. Entry information needed in CYCLIC is defined in the
component SNORMAL.
Level=3
&NAME is CYCLIC
&UR can be 8 or 9
&RR may be any register except &UR, 10, 11 or 12.
145. SREFLECT-(SREFLECT &NAME,&UR,&RR,&DUMMY)
This component can be called from SNORMAL or SMOBILE. It will
execute the Reflection Rule.
Level=3
&NAME is REFLECT
&UR, &RR and &DUMMY are the same as for component
SCYCLIC
146. SXCHANGE-(SXCHANGE &NAME,&UR,&RR,&DUMMY)
SXCHANGE executes the kernel Interchange Rule.
Level=3
&NAME is XCHANGE
&UR, &RR and &DUMMY have been specified for
SCYCLIC.
6. Normalization Package
The NORMALIZATION PACKAGE contains one component, SNORMAL, which is
called by the NORMFORM macro-instruction in component SJOBLIST.
SNORMAL can use the TRANSFORMATION PACKAGE (see above) to obtain
the NORMAL FORMS of JOBLIST items that are produced by the
TRANSLATORS.
147. SNORMAL-(SNORMAL &NAME,&UR,&RR,&DUMMY)
This component is called by NORMFORM in the component SJOBLIST
after the TRANSLATORS have been executed. Full details of its
assigned role in the SOLID System are given in the publication by
P.A.D. deMaine and B.A. Marron, mentioned earlier.
Level=2.
&NAME is NORMAL
&UR may be 8 or 9
&RR can be any register except &UR, 10, 11, or 12.
7. Retrieval Package
The RETRIEVAL PACKAGE contains two components, SRESULT and SSEARCH,
and it uses the GLOBAL MEMORY and MOBILE CANONICALIZATION PACKAGES,
the latter of which uses the TRANSFORMATION PACKAGE. In its present form the
Retrieval Package can handle explicit retrieval and storage
questions. With minor changes, in the MMATCH and AUXFILE
macro-instructions, it would handle the explicit purging and
updating tasks also. The fully implemented retrieval package is
able to handle any kind of explicit, implied (or non-explicit), or
browsing question.
The SSEARCH component and the GLOBAL MEMORY and MOBILE
CANONICALIZATION PACKAGES are extensively interrelated.
They must be regarded as the Central Core of the SOLID Retrieval
System. The Global Memory Component (SMEMORY), which is called from
the TBADD macro-instruction in SSEARCH, transfers the memory-blocks
between core-storage and the preselected storage devices. The
SSEARCH component automatically traces and/or creates the
information paths in the resident memory-block. The MOBILE
CANONICALIZATION PACKAGE, which is called from the MMATCH
macro-instruction, is used in retrieval operations if override
codes are present. It makes possible the automatic implied (or
non-explicit), fragment, or browsing searches. The SRESULT
component, which is called from the CONTROL routine after
completion of a search, prints results obtained by the SSEARCH
component. SRESULT can be changed to collect statistics on the
performance of the retrieval system. Hard-copy of the stored
bulk-information is produced by the principal output component,
SOUTPUT.
The complex interdependence of SSEARCH and the two packages mentioned
is described elsewhere in this disclosure.
148. SRESULT-(SRESULT
&NAME,&UR,&RR,&LSLOW,&JBLIST,&DUMMY)
The SRESULT component is called from the CONTROL routine after the
SSEARCH component has been executed. In its present form it
constructs and prints an obvious message like: REQUESTED
INFORMATION APPEARS ON PRINTER. It also prints the request JOB-LIST
item and the BULK address(es) assigned or retrieved in RFILE. These
address(es) are normally the location of the compressed referenced
information in the bulk storage. SRESULT can be modified to collect
and analyze performance data for the SOLID System.
Level=2
&NAME is RESULT.
&UR is register 8
&RR can be any register except &UR, 10, 11, and 12
&LSLOW is the length of the slow portion of composite
addresses.
&JBLIST(=JBLIST) is the Joblist array whose address is in
AJBLIST
&DUMMY has been defined.
149. SSEARCH-(SSEARCH
&NAME,&UR1,&UR2,&RR,&ADDL,&LSLOW,&LFAST,&NTRKS,&TRKL,&LTHAYY,&JBLIST,&LBJLIST,
&MATRIXL,&MATRIXS,&DUMMY)
This component uses the information in the JOB-LIST item(s) to
retrieve (MODE=0), store (MODE=1), update or purge (MODE=2,3 or 4)
items in the AUXILIARY FILE and compressed referenced information
in BULK STORAGE. Except for providing the retrieval command, MODE,
all operations are fully automatic. Thus information sub-paths are
automatically traced (MODE=0, 2, and 3) and/or created (MODE=1, 2
or 3) and/or purged (MODE=2 and 3) in fast memory with the JOB-LIST
item information. The Global Memory component (SMEMORY) ensures
that the correct memory-block is resident in core-storage. SMEMORY
also updates the memory-blocks of the AUXILIARY FILE in the virtual
memory (see SMEMORY, below). Protection features in SSEARCH ensure
that the AUXILIARY FILE (in virtual memory) will never be altered
by coding, operator or machine errors.
The MOBILE CANONICALIZATION PACKAGE, which handles implicit and
intersecting file questions, is called from the Service-Macro
MMATCH.
Purging and updating operations in the AUXILIARY FILE will be
executed in MMATCH. Thus the basic form of the SSEARCH component
will never be altered.
Level=1
&NAME - the relocatable address is SEARCH.
&UR1 and &UR2 are the two USING registers (7 and 8) which
establish addressability in the SSEARCH component.
&RR - the branch register - can be any register except 7, 8,
10, 11 or 12.
&DUMMY is defined above. The ten System Parameters were defined
earlier in this disclosure.
8. Global-Memory
The GLOBAL MEMORY PACKAGE consists of one component (SMEMORY) and
its calling procedure (GLOBAL), which has already been discussed.
The DCB or format statement(s) and the opening and closing
instructions for the Global Memory are specified in the
macro-instruction DCBMEM, which is found at the end of component
SMEMORY. If more storage is allocated then DCBMEM must be altered
and the macro BEGINS, which is used in SSEARCH, must be changed.
The Global Memory Package is fully described elsewhere in this
disclosure.
150. SMEMORY-(SMEMORY
&NAME,&UR1,&UR2,&RR,&ADDL,&INTRKS,&TRKL,&JBLIST,&DUMMY)
This component must be positioned by the SUBMP macro-instruction in
the CONTROL routine at compilation time. SMEMORY supervises the
transfer of the memory-blocks (in the AUXILIARY FILE) between core
storage and the designated virtual memory devices. The two parts of
SMEMORY accomplish the following:
Part A: (Relocatable Address MEMORY) This part is entered from the
TBADD macro-instruction in the SSEARCH component if a new
memory-block is needed. If necessary the currently resident
memory-block is rewritten at its assigned location (in virtual
memory) and then the requested memory-block is fetched from virtual
memory. New memory-blocks, which can be created in core-storage,
are automatically assigned storage areas in virtual memory.
Part B: (Relocatable address SAVEFM) This part is entered
immediately before the Operating System of the 360 regains control
to terminate the job. If the resident memory-block has been altered
in any storage, updating or purging operation it is stored at the
assigned location in virtual memory. It is suggested that the IBM
O/S job terminating routine be modified to include a final call to
Part B.
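The behaviour of Parts A and B resembles a single-slot write-back buffer. The following is a minimal sketch under that reading, not the SMEMORY code: the class GlobalMemory, its attribute names, and the use of a dictionary for virtual memory are assumptions, and the changed flag plays the role of the 01 bit of MSIGNAL.

```python
class GlobalMemory:
    """Toy model: one resident memory-block, written back only if changed."""

    def __init__(self, virtual_memory):
        self.virtual = virtual_memory  # dict: block location -> block contents
        self.location = None           # location of the resident block
        self.block = None              # contents of the resident block
        self.changed = False           # stand-in for the 01 bit in MSIGNAL
        self.transfers = 0             # number of blocks fetched so far

    def fetch(self, location):
        """Part A: make the requested block resident, saving the old one first."""
        if location == self.location:
            return self.block          # already resident, no transfer needed
        self.save()
        self.block = list(self.virtual.get(location, []))
        self.location = location
        self.transfers += 1
        return self.block

    def save(self):
        """Part B: rewrite the resident block only if it has been altered."""
        if self.changed and self.location is not None:
            self.virtual[self.location] = list(self.block)
        self.changed = False
```

Calling save() once more before the job terminates corresponds to the suggested final call to Part B.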
The service macro DCBMEM, which appears at the end of SMEMORY,
specifies the DCB, OPEN and CLOSE instructions for SMEMORY. The
macro-instruction BEGINS, which is used in SSEARCH, specifies the
device numbers.
Level=GLOBAL
&NAME is MEMORY.
&UR1 and &UR2 are two base registers (7 and 8).
&RR -- the branch register -- can be any register except 7, 8,
9, 10, 11 or 12.
&ADDL is the length of the composite addresses.
&ENTRKS is the number of tracks (or records) in a
memory-block.
&TRKL is the length of the track (or record).
&JBLIST(=JBLIST) is the Joblist array whose address is in
location AJBLIST.
&DUMMY is defined above.
9. Mobile Canonicalization Package
The MOBILE CANONICALIZATION PACKAGE consists of a service macro
(STRATEGY), which appears in MMATCH, and two components, SMATCH and
SMOBILE. The component SMOBILE will use the TRANSFORMATION PACKAGE
to rearrange previously normalized JOBLIST items that contain
override codes. The component SMATCH will determine if mismatches
occur only because overrides are present. SMATCH and SMOBILE will
also perform the intersecting type of search.
151. SMATCH-(SMATCH &NAME,&UR,&RR,&DUMMY)
The information needed by the SMATCH macro has been given in the
macro-instruction STRATEGY (see Appendix).
Level=2
&NAME is MATCH
&UR can be 7, 8, or 9
&RR can be any register except registers 7-12.
152. SMOBILE-(SMOBILE &NAME,&UR,&RR,&DUMMY)
SMOBILE is executed after SMATCH and then only if a mismatch could
not be resolved. Information required by SMOBILE is stipulated in
the macro STRATEGY (see Appendix).
Level=2
&NAME is MOBILE
&UR can be 7, 8 or 9.
&RR can be any register except registers 7-12
&DUMMY is defined above.
RETRIEVAL PACKAGE
Overview:
The AUXILIARY FILE can be viewed as a maze or network that
automatically grows, contracts, or is modified to exactly fill the
indexing needs for each and every application of the SOLID
Retrieval System. Each path through the maze is unique and
terminates in a location which contains the address in bulk storage
where the compressed referenced information is stored. The JOBLIST
items, which are produced by the Translation Package from the
assigned descriptor-sets, are used to trace or create the subpaths
that together define an information path. New subpaths are created
only when they are needed during a storage or updating assignment.
Subpaths are eliminated during purging and in some updating
operations. The "length" of a path (or search) is determined solely
by the number of decisions that are made while tracing the path,
not by the operation (i.e. storage or retrieval or purging or
updating) that will be performed at the bulk storage address.
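The idea of tracing or creating subpaths can be sketched with nested dictionaries. This is an illustration only, not the AUXILIARY FILE layout: the function trace, the sentinel BULK, and the representation of a JOBLIST item as a tuple of elements are assumptions. MODE=0 stands for retrieval and MODE=1 for storage, as in the text.

```python
BULK = object()  # hypothetical sentinel key marking the end of a path


def trace(aux_file, joblist_item, mode, bulk_address=None):
    """Follow one subpath per element of the item.

    Returns (bulk address or None, number of decisions made).
    """
    node = aux_file
    decisions = 0
    for element in joblist_item:
        decisions += 1
        if element not in node:
            if mode == 0:
                return None, decisions  # retrieval: a missing subpath aborts
            node[element] = {}          # storage: create the subpath when needed
        node = node[element]
    if mode == 1 and BULK not in node:
        node[BULK] = bulk_address       # the path ends in a bulk storage address
    return node.get(BULK), decisions
```

Storing and then retrieving the same item takes the same number of decisions, which reflects the statement that the length of a search depends on the path and not on the operation performed at its end.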
In many respects this scheme is analogous to a telephone network.
Each telephone number can be viewed as a unique description (viz.
JOBLIST item) of a path from the subscriber's substation to another
substation in the network. A telephone call made from the
subscriber's substation will be aborted if any link (i.e. subpath)
of the path does not exist. In the SOLID System the prime index, M,
and the screen, J, are analogous to an "area code", and the other
screens are descriptions of the intermediate substations that are
to be linked for the telephone call. The analogy with a telephone
network breaks down when the following facets of the SOLID System
are considered.
a. Unlike telephone numbers, which are somewhat arbitrarily
assigned to each subscriber, the JOBLIST items actually describe
both the path and the referenced information. This means that
assignment of "idiot numbers", like those in the National Compound
Registry, is quite unnecessary.
b. When the proposed new components are implemented, the Retrieval
Package will have a capability for "browsing", which has no
parallel in telephone networks.
c. Unlike telephone networks, whose new substations must be created
at quite rigidly prescribed locations, the SOLID System creates new
paths or substations wherever storage is available.
The AUXILIARY FILE is divided into two parts. One of these parts,
which resides permanently in the computer, is associated with the
prime index M and the screens J. The second part is divided into
memory-blocks and is stored in virtual memory. The Retrieval
Package uses the JOBLIST items produced by the Translation Package
and a single input command, MODE, to automatically execute all
tracing, creating and purging operations in core storage. A global
memory component, SMEMORY, transfers the memory blocks between
virtual memory and core storage when they are needed. The
Continuance Tables will be used to restrict all paths within single
memory-blocks. This will ensure that each explicit storage, purge,
retrieval, or update request can be executed with, at most, the
transfer of one memory-block.
AUXILIARY FILE
It has already been noted that the AUXILIARY FILE is divided into
two parts. Part A, which resides in core storage, is generated by
the macro MJARRAY or it is read from cards by the SSTATECL
component when the file is initialized. Part B is divided into
memory blocks that are stored in fast-slow or virtual memory.
The principal data array (YY) is divided into three portions as
follows:
a. The first portion contains that part of the AUXILIARY FILE which
will reside permanently in core storage (i.e., Part A below).
b. The second, transient portion must be large enough to hold one
memory-block.
c. The third portion is used to manipulate the strings of
referenced information that are stored or retrieved in the bulk
storage.
Memory blocks are transferred to the transient portion of core
storage by the global memory component (SMEMORY) whenever they are
needed for tracing or creating new "information paths". The two
parts of the AUXILIARY FILE are discussed next:
Part A:
Part A contains one sub-array associated with the prime index, M,
and five subarrays which are associated with the screen J. It is
prefaced by two composite addresses, EMPTY and BULK. The first four
items in EMPTY together give the location in virtual memory where
the next newly created memory block is to be stored. The fast
memory address portion of EMPTY (FMADD) contains the beginning
address of the transient portion of the data array, YY. The first
four items in BULK together give the location in bulk storage where
compressed referenced information is stored. The fast memory
address portion of BULK (FMADD) specifies the core-location of the
referenced information.
Normally, the first input item for the SOLID System is a card deck
with Part A punched in column binary. When the SOLID System is used
for the first time, this card-deck is replaced by a single card
that contains the word NEWFILE. This generates the initializing
information for Part A. Thereafter, if part A was changed, a new
card deck is punched at the end of each job-stream. The information
on the first two cards and last two cards is used to check the card
deck at input time.
Part B:
Each of the memory-blocks is prefaced by a composite address,
CURRENT. The first four items in CURRENT disclose the location in
virtual memory where the memory-block normally resides. The fifth
item, FMADD, is the relative address in the principal data array
(YY) where a new sub-array can be created. Thus FMADD is the
location of the first byte in the resident memory-block that is not
a part of an existing sub-array.
Description:
The translated JOBLIST item, which is stored in the array
&ARRAY, is used by the Retrieval Package to trace (retrieval
and old storage) or create (new storage) the information path in
the AUXILIARY FILE. This is accomplished with the aid of three
addresses (EMPTY, CURRENT, and ADDRESS) and eight bit indicators in
a single byte (MSIGNAL). The composite addresses EMPTY and CURRENT
initially preface Part A and the resident memory block of the
AUXILIARY FILE respectively. ADDRESS is extracted from a subarray
during the search. It is either the link address, pointing to an
extension or continuance of the subarray, or an address extracted
from an EXECUTIVE POINTER. The position of the EXECUTIVE POINTER
will disclose the prime index or screen in the JOBLIST item. If a
subpath is missing, ADDRESS contains zero. In this case the MMATCH
macro instruction either completes the construction of a new
EXECUTIVE POINTER, thus creating a new subpath, or it aborts the
search.
The eight bits of MSIGNAL are used by the retrieval package to
indicate the status of the AUXILIARY FILE with respect to the
current search. The signal system is discussed next.
Signal System (MSIGNAL):
The meanings that are assigned to each of the eight bits in MSIGNAL
are given next.
MSIGNAL BIT ON
(HEXADECIMAL)    Instruction
80               A new memory-block is to be created.
40               The search component has been used before.
20               ADDRESS contains a link address.
10               Tracing with screen J has been completed.
08               Tracing with index M has been completed.
04               A new subarray is to be created.
02               The resident memory-block is new.
01               The resident memory-block has been changed.
The 40 bit is turned off when the system is initialized. This
occurs at the start of each job-stream in the SSTATECL component
after the card-deck of Part A, has been read. Three bits (01, 02
and 80) are used by the Global Memory component (SMEMORY) to save
the resident memory-block and to fetch a new one. Two bits (04 and
20) indicate the status of the sub-array whose beginning address
is in ADDRESS. The last two bits (08 and 10) are used to indicate
the type of executive pointer (e.g., with or without screen) in the
subarray that is to be searched next.
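The eight MSIGNAL bits and their grouping can be written down as a flag set. The bit values below are those given in the text; the member names and the three group constants are hypothetical labels chosen for this sketch.

```python
from enum import IntFlag


class MSignal(IntFlag):
    NEW_BLOCK_NEEDED = 0x80        # a new memory-block is to be created
    SEARCH_USED_BEFORE = 0x40      # the search component has been used before
    LINK_ADDRESS = 0x20            # ADDRESS contains a link address
    SCREEN_J_DONE = 0x10           # tracing with screen J has been completed
    INDEX_M_DONE = 0x08            # tracing with index M has been completed
    NEW_SUBARRAY = 0x04            # a new subarray is to be created
    RESIDENT_BLOCK_NEW = 0x02      # the resident memory-block is new
    RESIDENT_BLOCK_CHANGED = 0x01  # the resident memory-block has been changed


# Bits used by the Global Memory component (01, 02 and 80).
MEMORY_BITS = (MSignal.NEW_BLOCK_NEEDED | MSignal.RESIDENT_BLOCK_NEW
               | MSignal.RESIDENT_BLOCK_CHANGED)
# Bits describing the sub-array whose address is in ADDRESS (04 and 20).
SUBARRAY_BITS = MSignal.NEW_SUBARRAY | MSignal.LINK_ADDRESS
# Bits describing the type of executive pointer searched next (08 and 10).
POINTER_BITS = MSignal.INDEX_M_DONE | MSignal.SCREEN_J_DONE
```

Together with the 40 bit, the three groups account for all eight bits of the byte.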
The role played by MSIGNAL is shown in FIG. 16. A step-by-step
description follows:
Step a
At the start of the job-stream the bits in MSIGNAL are turned off.
This occurs in the SSTATECL component at stage 600. If
CURRENT=FFFFFFFF, bit 80 is turned on. This means that no memory
blocks are present in the AUXILIARY FILE.
Step b
All bits in MSIGNAL except 80, 40, 02 and 01 are turned off at
stage 602. The index registers which point to M in the JOBLIST item
(in array &ARRAY) and to the sub-array associated with M are
initialized, and at stage 602, the composite address EMPTY is saved
at CORD1. If the 40 bit is off, this address will also be stored in
location EMPTY+&ADDL and the 40 bit turned on. If EMPTY and
EMPTY+&ADDL do not contain the same address at the end of the
job-stream a new card deck will be punched for Part A. Step b at
604 is the normal entry point to the retrieval package.
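The bookkeeping in Step b amounts to a mask operation followed by a one-time copy of EMPTY. The sketch below models this with plain integers. The function names are hypothetical, and saved_empty is a stand-in for the location EMPTY+&ADDL.

```python
# Bits 80, 40, 02 and 01 survive the reset at the start of each search.
KEEP_ACROSS_SEARCHES = 0x80 | 0x40 | 0x02 | 0x01


def start_search(msignal, empty, saved_empty):
    """Return (msignal, cord1, saved_empty) after the Step b bookkeeping."""
    msignal &= KEEP_ACROSS_SEARCHES  # turn off every other bit
    cord1 = empty                    # EMPTY is saved at CORD1
    if not msignal & 0x40:           # first use of the search component
        saved_empty = empty          # remember EMPTY as it was at the start
        msignal |= 0x40
    return msignal, cord1, saved_empty


def part_a_needs_punching(empty, saved_empty):
    """A new Part A card deck is needed if EMPTY moved during the job-stream."""
    return empty != saved_empty
```

Because the copy is made only while the 40 bit is off, the saved value always reflects the state of EMPTY at the first search of the job-stream.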
Step c
At stage 606 the 20 bit in MSIGNAL, which is used to indicate that an
extension or link sub-array is to be fetched or created, is turned
off.
Step d
Bit 08 is inspected at stage 608. If it is one, then either the
JOBLIST item index points to a screen, or a search of an RFILE
array, which contains bulk storage address(es) of compressed
referenced information, is indicated. The length of the unused part
of the JOBLIST item is used at stage 612 to differentiate these
situations. If the MSIGNAL 08 bit is zero, then the JOBLIST item
index points at M, and an index search of the subarray MA is
completed at stage 610.
Step e
The sub-array in core storage is searched at stage 614 for an
EXECUTIVE POINTER which contains the screen in the JOBLIST item.
The following situations can occur:
1. An EXECUTIVE POINTER is found. The address portion is loaded
into ADDRESS and control goes to MMATCH at stage 616. In this case,
MSIGNAL is not altered.
2. The subarray is full, so one of its extensions or continuances
must be fetched or created. In this case the "continuance bit", 20,
is turned on and the link address is loaded in ADDRESS.
3. An EXECUTIVE POINTER which contains the screen in the JOBLIST
item cannot be found in the sub-array or in its extension(s). If
MODE=0 (i.e., retrieval) ADDRESS is set to zero. In the storage or
updating modes (MODE>0) a hole is made at the correct spot in
the subarray and the screen is stored in it (left adjusted).
ADDRESS is set to zero.
Step f
In the MMATCH macro instruction at stage 616, ADDRESS is compared
to zero. If it is zero and MODE=0 (i.e. retrieval) the search is
aborted as unsuccessful at stage 618. It should be noted that the
MOBILE CANONICALIZATION PACKAGE is called by the macro STRATEGY from
MMATCH whenever a mismatch (i.e. ADDRESS is zero) occurs in the
retrieval mode and there are over-rides present. If MODE=1 (i.e.
storage), the composite address CURRENT is stored in the correct
location in the subarray, and both the 01 and 04 bits in MSIGNAL
are turned on. This completes the construction of the new EXECUTIVE
POINTER and also indicates that the resident memory-block has been
changed. If ADDRESS is not zero the next subpath has been found and
MSIGNAL is not changed.
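The decision made in MMATCH, as described in step f, can be
summarized in a short sketch. This is an illustrative Python
rendering under stated assumptions, not the macro itself: the
function name, the outcome strings and the returned tuple are
hypothetical, the call to STRATEGY for overrides is omitted, and
MODE values other than 0 and 1 are reported as unspecified because
step f does not describe them.

```python
def mmatch(address: int, mode: int, msignal: int, current: int):
    """Return (outcome, msignal, address_to_store) for the step f decision.

    address_to_store is the composite address CURRENT when a new
    EXECUTIVE POINTER is completed, otherwise None.
    """
    if address != 0:
        # The next subpath has been found; MSIGNAL is not changed.
        return "found", msignal, None
    if mode == 0:
        # Retrieval with no matching pointer: the search is aborted.
        return "aborted", msignal, None
    if mode == 1:
        # Storage: CURRENT completes the new pointer; bits 01 and 04 go on.
        return "created", msignal | 0x01 | 0x04, current
    return "unspecified", msignal, None
```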
Step g
TBADD at stage 620 is the most complex and powerful instruction in
the Retrieval Package. It uses the information in the composite
addresses CURRENT and ADDRESS plus bits 01, 02, 04 and 80 in
MSIGNAL to accomplish the following:
1. If a new memory block is required, then the global memory
component (SMEMORY at stage 624) saves, if necessary, the resident
memory block and fetches a new one. Bits 01, 02, and 80 in MSIGNAL
are used by SMEMORY.
2. If the 04 bit is "on", the CREATE macro instruction at 622
creates a new subarray, beginning in the location specified by the
core address portion of ADDRESS. The composite address CURRENT is
recomputed at stage 626 and, if necessary, the address EMPTY is
recomputed also. The create bit (04) is turned off at the end of
CREATE.
Step h
At stage 626 the continuance bit 20 of MSIGNAL indicates whether or
not the JOBLIST item index is to be incremented and bits 10 and 08
are to be altered. This procedure is executed until either the
search is aborted (in MMATCH) or the search ends successfully at
stage 682 after retrieval or insertion of the bulk storage
addresses in the RFILE subarray of the resident memory-block. These
tasks are performed by the AUXFILE macro instruction at stage 632.
It turns on MSIGNAL 01 bit if a bulk storage address is inserted in
RFILE. The 20 bit is turned on if a continuance of the RFILE
subarray has to be fetched or created.
After a successful search, the bulk storage address(es) are used to
store (MODE=1), update (MODE=4), or retrieve (MODE=0) the
compressed referenced information in the MAINFILE. The retrieved
referenced information is decompressed by COPAK before it is
disseminated.
Search Procedures:
The subarrays in core-storage are searched for EXECUTIVE POINTERS
in one of three different ways (see FIG. 16). These are:
Indexes
The EXECUTIVE POINTERS that are associated with the prime index (M)
are stored in the subarray at the relative address indicated by the
M value. A zero at the relative address means that no EXECUTIVE
POINTER has been inserted.
RFILE
The bulk storage address is stored in the first vacant element of
the RFILE sub-array or its extension (or continuances). A single
macro-instruction, AUXFILE, is responsible for searching and
maintaining the RFILE sub-arrays. Retrieved or newly assigned bulk
storage addresses are stored in the high address end of the
principal data array, YY. The beginning address is found in
location SBRY. The number of retrieved or stored addresses is in
location NJOBS. An unsuccessful retrieval is indicated by setting
JII to zero.
The RFILE subarrays are created at the first available location in
core-storage when they are needed. The number of bulk storage
addresses that can be stored in a particular RFILE sub-array is
determined by the System Parameter &MATRIXS.
The AUXFILE macro-instruction is executed at stage 630 of FIG. 16
whenever the residual or unused length of the JOBLIST item is less
than zero. This occurs after the beginning address of the terminal
RFILE array has been found. AUXFILE uses the retrieval command,
MODE, and, if need be, the bulk storage address (BULK), to retrieve
or store addresses of the compressed referenced information in
RFILE.
The retrieval command, MODE, has been assigned the following five
meanings:
MODE   MEANING
0      Retrieve
1      Store or Update
2      Change items in the AUXILIARY FILE
3      Purge paths from the AUXILIARY FILE
4      Purge, replace and add compressed referenced information to
       the MAINFILE.
In AUXFILE these five commands mean that bulk storage addresses
are to be retrieved, stored, purged or changed in RFILE. The three
types of override codes (Type 1, Type 2, and Type 3), which are
automatically inserted in the JOBLIST items during the translation
of descriptor-sets, have different meanings for each of the five
values of MODE. These are as follows:
MODE  Override  Meaning of Override
0     1         Accept any non-zero value for the designated
                descriptor or element.
      2         Accept any value, zero or non-zero, for the
                designated element.
      3         Accept any value for the designated descriptor that
                lies in the range specified in array AOVER3R.
1     1, 2      Create normal paths with the specified override. If
                such paths exist the inverted file searches will not
                be executed during retrieval.
2     1         Replace specified element(s) by a "1".
      2         Replace specified element(s) by a "2".
      3         Replace specified element(s) by a "3". (For the
                MODE=2 update the Normalization Package will not be
                used.)
3               Purge the information path specified in JOBLIST from
                the AUXILIARY FILE.
      2         Replace the element that is specified in the JOBLIST
                item by the element in the array AOVER3R.
      3         Use the information in array AOVER3R to construct
                alternate paths for the path described by the
                JOBLIST item.
4               Purge the referenced information items whose bulk
                storage address(es) are given in array AOVER1 from
                the MAIN FILE. Add the compressed referenced
                information whose storage addresses are given in
                array AOVER2. Replace the items in MAINFILE as
                specified in array AOVER3.
In its present form the SOLID System can process explicit retrieval
(MODE=0) and storage (MODE=1) requests which do not use the
NORMALIZATION or MOBILE CANONICALIZATION PACKAGES. The full
potential of the SOLID System, which permits the use of all five
MODE options and overrides, requires utilization of the
NORMALIZATION and MOBILE CANONICALIZATION PACKAGE instructions and
slight modification of the MMATCH and AUXFILE macro-instructions. In
the following discussion it will be seen that the AUXFILE
macro-instruction has been designed so that branches for the
remaining three MODE options (2, 3, and 4) can be very easily
incorporated. The flow-chart for AUXFILE (FIG. 17) is discussed
next.
In FIG. 17, the operation starts at stage 634 wherein the question
is asked "is MODE greater than, equal to, or less than 4?" If the
answer is that MODE is greater than 4, then the operation is
terminated at stage 636, because meanings have not yet been
assigned for MODE greater than 4. If MODE is less than 4, control
goes to stage 638. If MODE is equal to 4 control passes to stage
640.
At stage 638, the question asked is "is MODE equal to, less than,
or greater than 2?" If MODE is greater than or equal to 2, then
control again goes to stage 640. Stage 640 is operative to handle
the conditions when MODE is equal to 2, 3 or 4.
If MODE equals one or zero, the program continues to stage 646 and
registers R0 and R1 are loaded with zero and NJOBS
respectively.
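The two three-way comparisons at stages 634 and 638 amount to a
small dispatch on MODE. The sketch below, in Python, returns the
stage number that control reaches next; the function name is
hypothetical and negative MODE values, which the text does not
discuss, are treated like 0 and 1.

```python
def auxfile_entry(mode: int) -> int:
    """Return the stage reached after the tests at stages 634 and 638."""
    if mode > 4:
        return 636  # no meaning assigned yet: operation is terminated
    if mode == 4:
        return 640  # stage 634 sends MODE=4 straight to stage 640
    if mode >= 2:
        return 640  # stage 638 sends MODE=2 and MODE=3 to stage 640
    return 646      # MODE 0 or 1: load R0 with zero and R1 with NJOBS
```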
At this point the significance of two counters, NJOBS and RNJOBS,
must be understood. Both counters are initialized in the
TRANSLATION PACKAGE, which is executed before the RETRIEVAL PACKAGE
is used. For retrieval operations NJOBS and RNJOBS are both set
equal to zero. NJOBS is used to count the number of bulk storage
addresses retrieved from the RFILE array by AUXFILE. For storage
operations both NJOBS and RNJOBS are set equal to the number of
items of compressed referenced information that are to be stored
for the particular JOBLIST item. Thus, if MODE=1, NJOBS cannot be
less than one. NJOBS is incremented (retrieval) or decremented
(storage) each time a bulk storage address is retrieved or inserted
in the RFILE arrays. The bulk storage addresses are also recorded
in a sub-array of the principal data array, YY, that begins at the
location stored in SBRY. SBRY is set during the initialization of
the SOLID System. NJOBS and RNJOBS are used to compute the actual
locations in the sub-array of YY where each bulk storage address is
recorded.
In the following discussion of stage 648 it is assumed that this is
the first time that stage 646 is executed for a particular JOBLIST
item. It will be understood that R1 is incremented (retrieval mode)
or decremented (storage mode) in cycles through or within AUXFILE.
At stage 648, a determination is made as to whether the system is
operating in the retrieval or the storage mode. If it is in the
retrieval mode then the program continues to stage 650 and, the
location where the bulk storage address is to be recorded is
computed in the register R1 from the System Parameter &LSLOW,
SBRY, and R1 itself. It should be noted that before this
computation R1 contains the number of bulk storage addresses that
have been recorded in the subarray of YY to this point. &LSLOW
is the length of each bulk storage address.
If, at stage 648, the system is operating in the storage mode, then
control goes to stage 652, wherein register R1 is set equal to the
difference between RNJOBS and NJOBS. As already noted above this
difference (in R1) is the number of bulk storage addresses that
have been assigned so far for the particular JOBLIST item. From
stage 652 the program goes to stage 650, wherein the new R1 is
computed as described above.
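The computation at stages 648 through 652 can be sketched as
follows. The text states only that the recording location is
computed from &LSLOW, SBRY and R1; the linear formula used here
(SBRY plus the count of addresses already recorded times the
address length) is an assumption, as are the function and parameter
names.

```python
def record_location(sbry: int, lslow: int, mode: int,
                    njobs: int, rnjobs: int) -> int:
    """Return the assumed location in YY for the next bulk storage address."""
    if mode == 0:
        # Retrieval: R1 was loaded with NJOBS, the number retrieved so far.
        count = njobs
    else:
        # Storage (stage 652): RNJOBS - NJOBS addresses assigned so far.
        count = rnjobs - njobs
    # Stage 650, assumed form: base address plus count * address length.
    return sbry + count * lslow
```

With SBRY=1000 and &LSLOW=4, the fourth address would be recorded at
1012 in either mode under this assumption.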
After stage 650 has been executed control goes to stage 654,
wherein the question asked is "is MODE zero?". At this stage it
should be noted that the location of the first element in array
RFILE is recorded in register IR6 and that register IR3 contains
the location of the continuance address of the RFILE array.
If MODE=0 at stage 654 then control goes to stage 656 and the
question is asked "is the element of RFILE specified in register
IR6 zero?" If the answer at stage 656 is "it equals zero" control
then goes to stage 640 and the exit operations of AUXFILE are
executed. If the answer at stage 656 is "it is not zero" then
control goes to stage 658. Therein the bulk storage address in the
RFILE array is recorded in the subarray YY at the location
specified in register R1, and NJOBS is incremented by one. From
stage 658 the program branches to stage 660 and register IR6 is
incremented by the value &LSLOW, which is the length of
elements (viz., Bulk Storage Address(es)) in the RFILE array. At
the next stage, 662, the contents of registers IR6 and IR3 are
compared. If IR6 equals IR3, control goes to stage 664, wherein the
question is asked "is the continuance address of the RFILE array
zero?" If the continuance address is zero, control passes to stage
666 where the question asked is "is MODE equal to or greater than
zero?" If MODE equals zero, the retrieval mode is indicated and
control goes to stage 640 to begin the AUXFILE exit operations. If,
at stage 666, MODE is greater than zero, then the storage mode is
indicated, and the program goes to stage 670, where the MSIGNAL 20
bit is turned on. From stage 670 control goes to stage 640 and exits
from AUXFILE. If, at stage 664, the continuance address is not
zero, then the program goes directly to stage 670 as described
above. The MMATCH and TBADD macro-instructions use the MSIGNAL 20
bit, which was turned on at stage 670, to fetch or create an
extension of the RFILE array. They will be fully discussed
hereinafter.
It should be noted that the MSIGNAL 20 bit is interrogated after
each use of AUXFILE. If it is "one" control goes to the MMATCH
macro-instruction (at stage 616) and the extension of the RFILE
array is fetched (retrieval or storage) or created (storage) at
stage 620. Eventually control returns to the AUXFILE macro via
stages 626, 606, 608 and 612 of FIG. 16.
If, at stage 662, IR6 is greater than IR3, something is wrong, and
the program goes to stage 668, where an error message is printed
before exiting from the system at stage 636. If IR6 is less than
IR3 control returns to the first stage in AUXFILE (634) where the
next cycle through AUXFILE begins.
If the answer to the question asked at stage 654 was "MODE is not
zero," then the storage mode is indicated and control goes to stage
672. At stage 672 the question asked at stage 656 is again posed.
If the answer is "it is not zero" then control goes to stage 660
and the previously described operations are executed. If the answer
at stage 672 is "it is zero", this means that a zero element has
been found in the RFILE array, and control goes to stage 674. At
stage 674 the new bulk storage address (in BULK) is stored in the
RFILE and the YY array, at the locations specified by registers IR6
and R1 respectively. At this point it should be noted that the new
bulk storage address that has just been assigned will be used to
store compressed referenced information. COPAK will compress the
referenced information and then store it in the MAINFILE at the
assigned bulk storage addresses.
At the next stage, 676, of the AUXFILE macro, the address BULK is
updated by the macro-instruction BULK. Thus the address BULK now
contains the next location in the MAIN FILE that can be assigned
for storing compressed referenced information. From stage 676 the
program goes to stage 678, wherein NJOBS is decremented by one, and
the MSIGNAL 01 bit is turned on. It should be noted that at this
point NJOBS contains the number of bulk storage addresses that must
still be assigned for the JOBLIST item, and the 01 MSIGNAL bit now
indicates that the resident memory-block has been changed.
At stage 680 the question is asked "is NJOBS greater than zero?" If
the answer is "it is greater than zero" then control goes to stage
660, and the previously described operations are executed. If the
answer is "it is not greater than zero," which is identical with
the answer "equal zero or negative", then control goes to stage
640.
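The retrieval and storage loops of AUXFILE (stages 654 through 680)
can be condensed into one routine. The following Python sketch is a
simplification under stated assumptions: the RFILE array is modeled
as a list of integers in which zero marks a vacant element, list
indexes replace the byte addresses held in IR6 and IR3, the list of
new addresses stands in for BULK and NJOBS, and the error branch for
IR6 greater than IR3 cannot arise. All names are hypothetical.

```python
def auxfile(rfile, continuance, mode, new_addresses=(), msignal=0):
    """Sketch of AUXFILE for MODE 0 (retrieve) and MODE 1 (store).

    rfile         -- list of bulk storage addresses, 0 = vacant (mutated)
    continuance   -- continuance address of this array, 0 = none
    new_addresses -- addresses still to be assigned (storage mode)
    Returns (yy, msignal): addresses recorded in YY and updated MSIGNAL.
    """
    if mode not in (0, 1):
        raise ValueError("only MODE 0 and MODE 1 are sketched here")
    pending = list(new_addresses)
    if mode == 1 and not pending:
        raise ValueError("storage mode needs at least one address")
    yy = []
    i = 0
    while i < len(rfile):
        if mode == 0:
            if rfile[i] == 0:        # stage 656: vacant element ends retrieval
                return yy, msignal
            yy.append(rfile[i])      # stage 658: record address, NJOBS + 1
        elif rfile[i] == 0:          # stage 672: vacancy found in storage mode
            address = pending.pop(0)
            rfile[i] = address       # stage 674: store in RFILE and in YY
            yy.append(address)
            msignal |= 0x01          # stage 678: memory-block changed
            if not pending:          # stage 680: nothing left to assign
                return yy, msignal
        i += 1                       # stage 660: advance to next element
    # Stage 662: end of array reached (IR6 equals IR3).
    if continuance != 0 or mode > 0:  # stages 664 and 666
        msignal |= 0x20               # stage 670: continuance needed
    return yy, msignal
```

For instance, storing two addresses into an array with one vacancy
records the first, turns on bits 01 and 20, and leaves the second
for the continuance array.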
At stage 640 JII and IR1 are set equal to one and AYY respectively.
Later in the SSEARCH component JII=1 indicates that a successful
search has been performed and that the newly assigned (storage mode)
or retrieved (retrieval mode) bulk storage addresses will be found
in the YY array, beginning in the location specified in SBRY. At
the end of the SSEARCH component the information in this part of
array YY is transferred to a work array, JBWORK, and, for the
storage mode, NJOBS is set equal to RNJOBS. The information in
NJOBS and in array JBWORK is used by the COPAK compressor, which
is discussed hereinafter. The register IR1 has been reset to point
to the beginning of that part of the AUXILIARY FILE that resides
permanently in core-storage.
It should also be noted that in stage 678 the MSIGNAL 01 bit was
turned on. This bit will signal the GLOBAL MEMORY component
(SMEMORY), when it is called later by the TBADD macro-instruction
(see stage 620, FIG. 16), that the resident memory-block must be
updated in virtual memory. GLOBAL MEMORY is discussed later in this
disclosure.
FIG. 18 is a flow diagram of the program steps in the SCREEN
macro-instruction. SCREEN is used (at stage 614 in FIG. 16)
whenever the MSIGNAL 08 bit is "on" and the residual length of the
JOBLIST item, which is recorded in the half-word DUN1, is not less
than zero. These conditions occur when the screens in the JOBLIST
item (viz., J, LD.sub.o, BD.sub.1, etc) are being used to trace or
create subpaths. Before beginning the discussion about the SCREEN
macro instruction, it should be understood that there are two steps
involved. First, an array whose EXECUTIVE POINTERS have the same
length screen as the one in the JOBLIST item must be located.
Second, the located array must be searched for the element where an
EXECUTIVE POINTER with the JOBLIST item screen should be located.
Of course, the located element may contain zero, the desired
EXECUTIVE POINTER, or another EXECUTIVE POINTER. Once the location,
where the EXECUTIVE POINTER should be, is found, then a decision
can be made about the course of action that should be taken.
In the AUXILIARY FILE all the arrays that are associated with a
particular screen (say LD.sub.o) and a particular value for the
preceding screen or index (in our case the screen J) are linked
one to the other by their continuance addresses. Within each array,
all EXECUTIVE POINTERS have the same length screen. However, two
different linked arrays can have EXECUTIVE POINTERS with different
length screens.
At stage 684 in FIG. 18 the byte JI is set to zero and all
registers, R0 to R15, are stored in an array which begins at
DUMX+8. These registers are saved because they are needed if the
attempt to locate an array which can be searched is unsuccessful.
The byte JI has been initialized for the SUPERSCH
macro-instruction, which is executed later in SCREEN, and is fully
described hereinafter.
From stage 684 the program goes to stage 686, wherein the length of
the EXECUTIVE POINTER, whose screen is in the JOBLIST item, is
computed in register R14. Thus R14 contains the screen length plus
&ADDL, which is the address length. Also, at stage 686 the
total length of all the EXECUTIVE POINTERS in the array that are to
be searched is stored in register R1. This length is found in the
first four bytes of the array. At the next stage, 688, the question
is asked "is register R1 exactly divisible by R14?" If the answer
is "no" this means that the array is not to be searched, because
the length of its screens differs from the length of the screen in
the JOBLIST item. In this case control goes to stage 690 wherein
all the registers, R0 to R15, are reloaded from the array DUMX+8,
then the program goes to stage 692. At stage 692 the MSIGNAL 20
bit is turned "on" and register IR6 is set equal to register IR3,
then the program exits from SCREEN at stage 694. At this point it
should be noted that both registers IR3 and IR6 now contain the
location of the array's continuance address. This information, in
register IR6, and the 20 MSIGNAL bit are used by the MMATCH
macro-instruction (Stage 616, FIG. 16) to eventually fetch or
create an extension of the array or to abort the search, as
described for MMATCH hereinafter. It should also be noted, from the
discussion for FIG. 16, that if a search is not aborted at stage
616 control will eventually return to SCREEN via stages 606, 608,
and 612.
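The test at stage 688 is a simple divisibility check. The sketch
below, in Python with hypothetical names, mirrors the registers R14
and R1 described above. As the text presents it, the test only
establishes that the stored length is a multiple of the expected
EXECUTIVE POINTER length.

```python
def array_can_be_searched(total_pointer_bytes: int,
                          screen_len: int, addl: int) -> bool:
    """Stage 688: is R1 exactly divisible by R14?

    total_pointer_bytes -- total length of all pointers in the array (R1)
    screen_len          -- length of the JOBLIST item screen
    addl                -- address length (&ADDL)
    """
    pointer_len = screen_len + addl  # R14: screen length plus &ADDL
    return total_pointer_bytes % pointer_len == 0
```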
If, at stage 688 of FIG. 18, the answer was "yes", this means that
the array can now be searched. At stage 696 all registers, R0 to
R15, are reloaded from the array DUMX+8, wherein they were stored
at stage 684. At the next stage, 700, a series of programmatic
steps, in a conventional manner, will determine whether or not
override codes are present in the JOBLIST item. If overrides are
present, control would go to the MMATCH macro-instruction (stage
616, FIG. 16) for eventual processing by the macro instruction
MOBILE CANONICALIZATION.
From stage 700 control goes to stage 698 wherein the question is
asked "is the half-word DUN1 greater than zero?". At this point it
should be noted that the half-word DUN1, which is set in the
initialization step at stage 602 of FIG. 16, contains the length of
the JOBLIST item screen. If, at stage 698, the answer is "not
greater than zero" control goes to stage 702 and the register R1 is
loaded with the address of &ADD1, which is the address of the
EXECUTIVE POINTER in the array that is being searched. At this
point it must be noted that there is only one EXECUTIVE POINTER in
the array, because the screen length is zero. At the next stage,
704, the question is asked "is the EXECUTIVE POINTER zero?" If it
is not, the program goes to stage 710, wherein the registers IR2 and
IR6 are both incremented with register BRYY. At this point it
should be noted that register BRYY contains the length of the
JOBLIST item screen, IR2 the location of the JOBLIST item screen in
JOBLIST, and IR6 the location of the EXECUTIVE POINTER in the array
that was reached. Thus, the incremented registers IR2 and IR6
contain the location of the next screen (in JOBLIST) and the
address part of the EXECUTIVE POINTER. The program next goes to
stage 694 and then exits from SCREEN to the MMATCH
macro-instruction at stage 616 in FIG. 16, as described above.
If, at stage 704, the answer was "it is zero" then control goes to
stage 706 wherein the byte JI is set equal to one, which indicates
that a vacant element has been found in the array. At the next
stage, 708, the question is asked, "is MODE equal to one?". If the
answer is "it is one" this signifies that a storage operation is
being performed, and the program goes to stage 730. At stage 730
the question is "is the byte LEXICON zero?". If the answer is "it
is zero", then at stage 732 the LMOVE macro instruction is used to
move the JOBLIST item screen (specified in register IR2) to the
EXECUTIVE POINTER position in the array as specified in register
IR6. It should be noted that stage 710 is executed only in the
storage mode, and that the LMOVE operation, just described,
actually constructs the first parts of a new EXECUTIVE POINTER in
the correct position in the array. The address part of this
partially constructed EXECUTIVE POINTER will be inserted at stage
616 of FIG. 16, wherein the MMATCH macro-instruction is executed.
At that time the MSIGNAL 01 bit, which signifies whether or not the
resident memory block has been altered, will be turned on, and
subsequently, in stage 620 of FIG. 16, a new array will be created
in the resident memory block.
Control goes from stage 732 to stage 710 and thence exits at stage
694, in the manner described previously.
Before continuing the discussion, an understanding of the roles
played by SUPERSCH (stage 712), INSERT (stage 728), and the byte
indicator LEXICON is needed. SUPERSCH actually performs the task of
locating the position in the AUXILIARY FILE where an EXECUTIVE
POINTER with the JOBLIST item screen should be. If the location is
occupied by an EXECUTIVE POINTER with a different screen and the
storage mode is indicated, then a vacancy must be created at the
particular spot so that the new EXECUTIVE POINTER can be inserted
in its correctly ordered position in the array and its
continuances. The hole creating process can involve both the
movement of EXECUTIVE POINTERS, within an array, and the transferal
of EXECUTIVE POINTERS across an array. These two tasks are
accomplished in a complex manner by the INSERT macro-instruction,
which will be discussed hereinafter. INSERT uses the LEXICON byte
as an indicator. At stage 730 of FIG. 18 the LEXICON byte indicates
the status of the transferral of EXECUTIVE POINTERS between
arrays.
Now at stage 730 of FIG. 18 the question asked was "is the byte
LEXICON zero?" If the answer is "it is zero" this means that a
vacancy exists and that there are no EXECUTIVE POINTERS being
transferred across arrays. In this case control goes to stage 732
in the manner described earlier. If, at stage 730, the answer is
"it is not zero", then a hole must be created in the array at the
location specified in register IR6. This task is accomplished by
the INSERT macro instruction (at stage 728), wherein the LEXICON
byte is also altered and, if necessary, a transferred EXECUTIVE
POINTER is inserted. Next control goes to stage 694 to begin the
sequences of operations stipulated at stages 616, 620, etc. in FIG.
16. It should be noted that the LEXICON byte indicator is used
after the TBADD macro-instruction (stage 620, FIG. 16) to return
control, if necessary, to the SCREEN macro at stage 684. After the
transferral of EXECUTIVE POINTERS has been completed the new
EXECUTIVE POINTER is constructed in the hole that was created by
the INSERT. This insertion occurs as the final step in the INSERT
macro-instruction.
The remaining stages of FIG. 18, which begin at stage 712, are
discussed next.
Stage 712 is executed when the screen length, which is recorded in
the half-word DUN1, is greater than zero. In this case control
passes from stage 698 to stage 712, wherein the SUPERSCH
macro-instruction is executed. SUPERSCH finds the location in an
array or its extensions (or continuances) where the EXECUTIVE
POINTER with the JOBLIST item screen should be. In SUPERSCH, which
is fully discussed hereinafter, the continuance 20 bit of MSIGNAL
can be turned on, and new arrays are created or fetched by passing
directly from SUPERSCH to the MMATCH (stage 616) and TBADD (stage
620) macro-instructions, as shown in FIG. 16. In both these cases,
if the search operation is not aborted in the MMATCH
macro-instructions, control returns to SUPERSCH via stages 684
through 698 in FIG. 18. Thus when control goes from SUPERSCH (stage
712) to stage 714, the register IR6 specifies exactly where the
EXECUTIVE POINTER with the JOBLIST screen should be.
At stage 714 the CSCREEN macro-instruction is used to compare the
screen in the EXECUTIVE POINTER to zero and to the screen in the
JOBLIST item, and set the byte JI with one of the following
codes:
JI   Meaning
00   The two screens are equal.
01   The EXECUTIVE POINTER screen is zero.
02   The EXECUTIVE POINTER screen has a lower numerical value than
     the JOBLIST item screen.
04   The EXECUTIVE POINTER screen has a higher numerical value than
     the JOBLIST item screen.
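The comparison performed by CSCREEN can be sketched in a few lines
of Python. Screens are modeled as equal-length byte strings compared
as unsigned big-endian numbers. The function name is hypothetical,
and the order of the tests (equality checked before the zero test)
is an assumption, since the text does not say which code applies
when both screens are zero.

```python
def cscreen(pointer_screen: bytes, joblist_screen: bytes) -> int:
    """Return the JI code for an EXECUTIVE POINTER screen vs. a JOBLIST screen."""
    if len(pointer_screen) != len(joblist_screen):
        raise ValueError("screens must have the same length")
    if pointer_screen == joblist_screen:
        return 0x00  # the two screens are equal
    if not any(pointer_screen):
        return 0x01  # the EXECUTIVE POINTER screen is zero (vacant)
    if pointer_screen < joblist_screen:
        return 0x02  # pointer screen is numerically lower
    return 0x04      # pointer screen is numerically higher
```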
At stages 716, 718 and 724 of the SCREEN macro-instructions the
specific number in byte JI is determined. If byte JI is zero, the
program goes from stage 716 to stage 710, and the exit procedures
of SCREEN are executed. If, at stage 718, JI is found to be 02
control goes to stage 720, and the register IR6 is incremented by
register BRY, which contains the EXECUTIVE POINTER length. At the
next stage, 722, the question is asked, "is register IR6 less than
register IR3?" If the answer is "register IR6 is not less than
register IR3" control goes to stage 692 wherein the MSIGNAL 20 bit
is turned on and register IR6 is set equal to register IR3. At this
point register IR6 contains the continuance address of an array
that must be fetched or created, so control goes to stage 694 and
then exits from SCREEN, as described above. If, at stage 722, it
was found that register IR6 is less than register IR3, this would
mean that there is at least one EXECUTIVE POINTER whose screen is
greater than the screen in the JOBLIST item, and control goes to
stage 714, which has been described previously. It should be
recalled that all the equal length EXECUTIVE POINTERS are ordered
within each array and its extensions (or continuances) with the
lowest screen in the first position of the array. The operations in
the branch, which begins at stage 720 and goes through stage 722 to
stage 714, have been included to ensure that the correct location is
found by SUPERSCH (at stage 712).
If, at stage 724, it is determined that JI contains the number 04,
then control goes to stage 726, wherein the question is asked "is
MODE equal to zero?" If MODE is zero, the retrieval mode is
indicated and the program exits from SCREEN directly to the MMATCH
macro-instruction (at stage 616 in FIG. 16), wherein the
unsuccessful search is aborted. If, at stage 726, it is found that
MODE is not zero (i.e. the storage mode is indicated) then control
goes to stage 728 and the INSERT macro is executed, in the manner
described earlier. If, at stage 724, JI is found to contain zero,
then the sequence of steps that begins at stage 730 is executed in
the manner described earlier.
In FIG. 19, the SUPERSCH macro instruction is shown in flow diagram
form. As was stated with respect to the description of FIG. 18, the
SUPERSCH macro execution started after completion of stage 698,
where it is determined:
a. the particular array that is to be searched has EXECUTIVE
POINTERS whose length is a multiple of the contents of register
R14, as it was determined that R1 was exactly divisable by R14; R9
by dividing R1 by R2. Next, the square root of the number of
EXECUTIVE POINTERS in the array, which is in register R9, is
computed and stored at the location MASK1. If the square root is
not an exact number, it is truncated without rounding. Thus, an
integer number is always stored in MASK1. Also, at stage 724, the
absolute address of the last EXECUTIVE POINTER in the array is
computed in register R1. This is done by subtracting the EXECUTIVE
POINTER length, which is in register R2, from the absolute
continuance address, which is stored in location C7.
After completion of stage 724, control goes to stage 726 wherein
the macro-instruction LARGEXC is performed. The LARGEXC macro
instruction is an extended form of the IBM basic assembler
instruction XC. In this LARGEXC macro-instruction the number of
bytes specified in the half-word DUN1 of the array DUMX are set to
zero. Because the half-word DUN1 contains the screen length, this
means that a zero screen has been constructed in array DUMX. After
completion of stage 726, control goes to stage 728, wherein the
byte JI is set equal to the hexadecimal number 60.
At this stage it should be understood that the SUPERSCH
macro-instruction operates by moving back and forth along the array
in jumps equal to the square root of the number of EXECUTIVE
POINTER positions in the array, as recorded in MASK1. Once a
subblock, of the array of length MASK1, where the screen might be
located, is determined it is searched, one EXECUTIVE POINTER at a
time. The left-most four bits of byte JI, which are initialized at
the start of the SCREEN macro-instruction (see stage 684 of FIG.
18), are used to control the directional movements of the
register-pointer during execution of SUPERSCH. These four bits of
the byte JI have been assigned the following meanings in
SUPERSCH.
i. The 80 bit indicates the direction in which the register pointer
R1 must be changed. If the 80 bit is "on" (i.e., one) then R1 must
be increased. If the 80 bit is "off" (i.e., zero) then R1 must be
decreased.
ii. The 40 bit indicates the last direction of change for the
EXECUTIVE POINTER register R1. That is, if the 40 bit is "on", the
last direction of change of R1 was positive, and if "off", the last
direction of change was negative (i.e., R1 was decreased).
b. further, it has been determined if the screen in the JOBLIST and
the screen in the array are greater than zero. As the system enters
SUPERSCH, the first stage of the operation is accomplished at stage
716. At stage 716, registers R0 to R15 are stored in the array C1.
Then, in the next succeeding stage 718, certain registers are
set.
First, both registers R0 and R6 are set equal to the screen length
recorded in the half-word DUN1.
Register R2 is set equal to the EXECUTIVE POINTER length by adding
the screen length, set in R0, and the composite address length
&ADDL, which is a System Parameter.
Register R4 is loaded with the absolute machine address of the head
of the array. This is accomplished by taking the address in
location C4, which is the address of the first available EXECUTIVE
POINTER in the array, and subtracting therefrom four bytes. These
four bytes at the head of the array contain the relative
continuance address. Thus the value in register R4 will be the
absolute address of the start of the array.
Register R1 is then set to the length of the EXECUTIVE POINTERS in
the array. This is accomplished by subtracting four bytes from the
relative continuance address, found at the address recorded in
register R4. The four bytes are subtracted from the relative
continuance address because that address is relative to the head of
the array.
After completion of stage 718, the control is transferred to stage
720 wherein the question is asked "is R1 evenly divisible by R2?".
Actually, the answer should always be yes, as the same question
has been asked at stage 688 in the SCREEN macro. However, a check
is made at stage 720 as to whether the length of the EXECUTIVE
POINTERS in the array, as recorded in register R1, is divisible by
the EXECUTIVE POINTER length recorded in register R2. If the answer
is "no", then the program would continue at stage 722 to print out
an error message for the SUPERSCH macro. If the answer at stage 720
is "yes", the program continues to stage 724.
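The register set-up of stage 718 and the divisibility check of stage 720 amount to a few lines of arithmetic. The following Python sketch is illustrative only: the function name and its parameters are hypothetical, and plain integers stand in for machine addresses.

```python
def stage_718_720(dun1: int, addl: int, c4: int, rel_continuance: int) -> dict:
    """Mirror the register set-up of stage 718 and the check of stage 720.

    dun1            -- screen length (half-word DUN1)
    addl            -- composite address length (System Parameter &ADDL)
    c4              -- address of the first available EXECUTIVE POINTER
    rel_continuance -- relative continuance address stored at the array head
    """
    r0 = r6 = dun1              # screen length
    r2 = r0 + addl              # EXECUTIVE POINTER length
    r4 = c4 - 4                 # absolute address of the head of the array
    r1 = rel_continuance - 4    # total length of the EXECUTIVE POINTERS
    if r1 % r2 != 0:            # stage 720 fails, so go to stage 722
        raise ValueError("SUPERSCH error: R1 is not evenly divisible by R2")
    return {"R0": r0, "R1": r1, "R2": r2, "R4": r4, "R6": r6}
```

For example, a 6-byte screen and a 4-byte composite address give a 10-byte EXECUTIVE POINTER, so a pointer area of 100 bytes passes the check while one of 101 bytes is reported as an error.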
At stage 724, certain parameters are computed. First, the number of
EXECUTIVE POINTERS in the array is computed in register
iii. The 20 bit of the byte JI is used to indicate whether or not
the first direction of change of register R1 has occurred. If it is
"on", this means that the register R1 is to be changed for the
first time. Of course, the 20 bit "off" means that register R1 has
already been changed at least once.
iv. The 10 bit is used to record the last status of the 40 bit of
the JI byte.
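The four control bits of byte JI can be pictured as a set of bit masks. The Python sketch below is illustrative only; the constant and function names are hypothetical and do not appear in the program listing.

```python
# Hypothetical names for the four left-most bits of byte JI.
JI_INCREASE = 0x80      # "80 bit": on means R1 must be increased
JI_LAST_FORWARD = 0x40  # "40 bit": on means the last change of R1 was positive
JI_FIRST_MOVE = 0x20    # "20 bit": on means R1 is to be changed for the first time
JI_PREV_LAST = 0x10     # "10 bit": saved copy of the last status of the 40 bit


def save_last_direction(ji: int) -> int:
    """Copy the current status of the 40 bit into the 10 bit."""
    ji &= ~JI_PREV_LAST & 0xFF
    if ji & JI_LAST_FORWARD:
        ji |= JI_PREV_LAST
    return ji


def describe(ji: int) -> dict:
    """Decode the four control bits into booleans."""
    return {
        "increase_r1": bool(ji & JI_INCREASE),
        "last_change_positive": bool(ji & JI_LAST_FORWARD),
        "first_change_pending": bool(ji & JI_FIRST_MOVE),
        "previous_last_positive": bool(ji & JI_PREV_LAST),
    }
```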
Now, at stage 728, the 40 and 20 bits of byte JI were turned on.
This means, with respect to the 40 bit, that the EXECUTIVE POINTER
register was last increased. This occurred when the position of the
last EXECUTIVE POINTER in the array was computed in Register R1 at
stage 724. With respect to the 20 bit, this means that the register
R1 has not been changed from its initial setting.
At stage 730 the 80 and 10 bits are turned off, and control goes to
stage 732, wherein the question is asked "What is the value of the
40 bit in byte JI?" If the 40 bit is one, then the 10 bit is turned
on at stage 736 and the program goes to stage 734. If, at stage
732, the 40 bit is zero, control goes directly to stage 734. Thus
at stage 734 the status of the 40 bit has been preserved in the 10
bit. At stage 734, the macro-instruction CSSCRN is executed. The
macro-instruction CSSCRN is an extended form of the IBM basic
assembly language instruction CLC. At stage 734, the question is
asked "is the screen at the location indicated in register R1 (the
last EXECUTIVE POINTER in the array at this stage time) equal to
zero or a number other than zero?". If the screen is equal to zero,
then control goes to stage 738. If the screen is a value other than
zero, control goes to stage 740. At stage 740, another CSSCRN macro
instruction determines if the screen, whose address is recorded in
register R1, is equal to, less than, or greater than the screen
whose address is recorded in register IR2. It will be remembered
that the screen whose address is recorded in register IR2 is the
screen in the JOBLIST item, and that the screen whose address is
recorded in register R1 is the screen which we are looking at in
the array. If the screen in the array is greater than the screen in
the JOBLIST item, then we must go backward in the array to find the
correct location and, accordingly, control goes directly to stage
738. If the screen in the array is less than the screen in the
JOBLIST item, then control goes to stage 742, wherein the 80 bit in
the JI byte is "turned on." This indicates that the register R1
should be increased to find the location in the array where the
EXECUTIVE POINTER should be. From stage 742 the program goes to
stage 738.
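The decisions made in stages 730 through 742 can be condensed into a short Python sketch. It is a simplified illustration, not the CSSCRN macro itself: the function name is hypothetical, the two screens are assumed to have the same length, and comparison of Python bytes objects (left to right, by unsigned byte value) is assumed to behave like the CLC-style comparison.

```python
from typing import Tuple


def stages_730_to_742(ji: int, array_screen: bytes,
                      joblist_screen: bytes) -> Tuple[int, str]:
    """Return the updated JI byte and the name of the next stage."""
    ji &= ~(0x80 | 0x10) & 0xFF       # stage 730: 80 and 10 bits off
    if ji & 0x40:                     # stage 732: test the 40 bit
        ji |= 0x10                    # stage 736: preserve it in the 10 bit
    if not any(array_screen):         # stage 734: screen in the array is zero
        return ji, "stage 738"
    if array_screen == joblist_screen:
        return ji, "stage 746"        # stage 740: match found, terminate
    if array_screen < joblist_screen:
        ji |= 0x80                    # stage 742: R1 must be increased
    return ji, "stage 738"            # array screen greater: move backward
```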
At stage 738, the 20 bit in the JI byte is checked. If the 20 bit
is one, then, at stage 744, the 20 bit of the JI byte is turned
off, and the register R9 is decremented by one. This enables us to
return to the first EXECUTIVE POINTER in the array when the new
value of R1 is computed from R9. Control goes from stage 744
directly to stage 756.
Before discussing what happens at stage 738, when the 20 bit is
zero, a discussion must be made of what happens at stage 740 when
the screen in the array is equal to the screen in the JOBLIST item.
When this occurs, we have found an EXECUTIVE POINTER which has the
JOBLIST screen, and control goes to stage 746 where termination of
the SUPERSCH operation will be effected. At stage 746, the location
C4 has recorded therein the address in register R1. C15 has stored
therein the screen length recorded in register R6 and, C16 has
stored therein the EXECUTIVE POINTER length recorded in register
R2. Then registers R0 to R15 are loaded from the array C1. Next
control goes to stage 748 wherein the MSIGNAL 20 bit is checked. If
the MSIGNAL 20 bit is on, the address specified by register R6 is a
continuance address. If the MSIGNAL 20 bit is off, then the
SUPERSCH procedure has been completed.
If, at stage 748, the MSIGNAL 20 bit was "on", control goes to
stage 750 wherein the question is asked "is the continuance address
whose location is recorded in register IR6 equal to zero?" If the
continuance address is equal to zero, then control goes directly to
the MMATCH macro in SSEARCH (see stage 616 in FIG. 16).
If, at stage 750, the continuance address specified by register IR6
is not zero, then it is stored in the location ADDRESS, and control
goes directly to the TBADD macro-instruction in SSEARCH (see stage
620 in FIG. 16).
TBADD macro will fetch a new array, and control will eventually be
returned to the SUPERSCH macro instruction for a search of the new
continuance array.
At stage 738, if it was determined that the 20 bit of JI was zero,
meaning that this is at least the second time that control has come
through stage 738, then the register R9 is reset to the value R9
minus the contents of MASK1, the square root of the total number of
EXECUTIVE POINTERS. After completion of stage 754 or stage 744,
control would pass directly to stage 756 wherein a check is made as
to whether R9 is equal to or less than zero. If so, this means that the
subblock search has been completed and control passes directly to
stage 758. However, it is in the best interest of this discussion
to discuss the result when R9 is greater than zero and the subblock
search must be completed and then return to stage 758 at a later
time in this discussion.
When R9 is greater than zero, the program passes control to stage
760 wherein a register R15 is set equal to the contents of register
R2 times the contents of register R9 or the exact number of bits
which must be moved to look at a new EXECUTIVE POINTER in the
array. After completion of stage 760, control passes to stage 762
wherein the JI 80 bit is checked. If the 80 bit is zero, this means
we must move backwards and if the 80 bit is one this means that we
must move forward.
If the 80 bit is zero, control passes to stage 764 wherein the 40
bit of the JI byte is turned off and R15 is reset to the address in
R1 minus the value set in R15. This, in effect, sets the new
EXECUTIVE POINTER address in the array which is to be checked.
Then, at stage 766, a check is made as to whether this new address
is greater than or equal to the address in register IR6 which is
the first address in the array. If the address in R15 is greater
than or equal to the address in register IR6, the first address in
the array, control would pass to stage 768 wherein register R1
would now be set at the address in register R15. If the answer at
stage 766 had been less than zero, then control would have passed
back to stage 700 for purposes of accomplishing a step search of
the EXECUTIVE POINTERS in the sub-block. This type of search will
be discussed later.
If the 80 bit of the JI byte at stage 762 is "one," this means that
the EXECUTIVE POINTER index, R1, must be increased. Thus register
R15 is set equal to register R1 plus the increment (which is in
register R15) and the 40 bit of byte JI is turned on to indicate
that the movement is forward. At stage 774 the question is asked
"is this new value of register R15 greater than the continuance
address which is recorded in storage C7?" If the address in
register R15 is equal to or greater than the address in location
C7, control goes to stage 766 to execute a series of instructions
that will be discussed later. However, if the address in register
R15 is less than the continuance address (recorded in location C7),
control goes to stage 768 wherein the address recorded in register
R15 is then transferred to location R1.
When the operation at stage 768 is completed the program goes to
stage 778 wherein the question is asked "Are the 40 and 10 bits of
the JI byte both on, both off, or mixed?" If they are mixed, then
the subblock search continues by returning control to stage 730. If
they are both on, this means that there have been two successive
forward going steps and, accordingly, control goes directly to
stage 776. If they were both off, this means there have been two
successive backward going steps and control goes to stage 758
wherein the register R1 is decreased by the subblock length, which
is in MASK1.
Control then goes to stage 780 wherein the question is again asked
"is the new value in register R1 equal to or greater than the
address of the first EXECUTIVE POINTER that is recorded in location
IR6?" If that is the case, control goes directly to stage 776. If
the new address in register R1 is less than the address in register
IR6, control goes to stage 770, wherein the address in register R1
would be set equal to the value of IR6. This assures that the
address in register R1 will not be less than the address of the
first EXECUTIVE POINTER in the array. From stage 770 control goes
to stage 776 wherein the macro-instruction CSSCRN is executed.
There, the question is asked "is the screen of the present
EXECUTIVE POINTER whose address is recorded in register R1 equal
to zero?" If it equals zero, the SUPERSCH termination procedure
that starts at stage 746 is executed. If, at stage 776, the answer is
"not zero", control goes to stage 782 (see above). At stage 782 the
CSSCRN macro-instruction is used to ask the question: "is the
screen whose address is recorded in register R1 less than the
screen whose address is recorded in register IR2 (the JOBLIST
item)?" If the answer is "not less than", then control again goes
to stage 746 to begin the SUPERSCH termination procedure. It should
be noted here that if the two screens are equal, the exact location
in the array has been located, and no new EXECUTIVE POINTER will be
inserted. However, if, during a storage mode, the screen whose
address is in register R1 is greater than the JOBLIST item screen,
whose address is in register IR2, then the INSERT macro-instruction
of FIG. 20 will eventually be used to create a hole and insert a
new EXECUTIVE POINTER.
If the answer at stage 782 is "less than" control goes to stage 784
wherein register R1 is incremented by the EXECUTIVE POINTER length,
which is in register R2.
Then, control moves to stage 786 wherein the question is asked "is
register R1 greater than, less than, or equal to the continuance
address that is recorded in location C7?" If register R1 is greater
than the continuance address, control goes to stage 722, and
therein indicates that there is an error in SUPERSCH. If register
R1 is equal to the continuance address, then control goes to stage
788 wherein the MSIGNAL 20 bit is turned "on", thus indicating that
a continuance of the array must be fetched. Control goes from stage
788 to stage 746 wherein the SUPERSCH termination procedure begins
(see above).
If, at stage 786, register R1 is less than the continuance address,
control returns to stage 776. This cycle (through stages 776, 782,
784, and 786) is repeated until control goes to stage 746 (from
stages 776 or 782) or either of stages 722 or 788 (from stage 786).
Thus, in the SUPERSCH macro, we have either found the exact
location of an EXECUTIVE POINTER or found the exact location where
the new EXECUTIVE POINTER is to be inserted, or we have determined
that an extension (or continuance) of the array must be fetched or
created and searched.
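The closing step-by-step cycle through stages 776, 782, 784 and 786 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the array is modeled as a bytes object, addresses are offsets into it, and the function name and return convention (an address plus a flag standing in for the MSIGNAL 20 bit) are hypothetical.

```python
from typing import Tuple


def step_search(array: bytes, r1: int, ptr_len: int, screen_len: int,
                target: bytes, c7: int) -> Tuple[int, bool]:
    """Walk forward one EXECUTIVE POINTER at a time.

    Returns (address in R1, continuance_needed).
    """
    while True:
        screen = array[r1:r1 + screen_len]
        if not any(screen):       # stage 776: empty slot, terminate (746)
            return r1, False
        if screen >= target:      # stage 782: "not less than", terminate
            return r1, False
        r1 += ptr_len             # stage 784: advance by the pointer length
        if r1 > c7:               # stage 786: past the continuance address
            raise ValueError("error in SUPERSCH")        # stage 722
        if r1 == c7:              # stage 788: a continuance must be fetched
            return r1, True
```

With 4-byte pointers whose screens are AA, CC and EE followed by one empty slot, a search for CC stops at offset 4 (exact match), a search for DD stops at offset 8 (insertion point), and a search for ZZ stops at the empty slot at offset 12.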
FIG. 20 is the flow diagram for the macro-instruction INSERT. By
the time the program reaches the macro-instruction INSERT, at stage
428 of FIG. 18, certain things have occurred. First, the particular
EXECUTIVE POINTER recorded in register IR6 has been defined, and,
additionally, it is known whether, at that address, the space is
empty or filled. Further, if the space is filled, we know that the
screen in the array is greater than the screen in JOBLIST.
Additionally, we know whether we are in the retrieval or storage
mode. We know that it is in the storage mode. Further, we have the
JI value set in CSCREEN; that is, the value of the JI byte is
either 01 or 04 meaning that the screen in the array is zero or
that the screen in the EXECUTIVE POINTER in the array is higher
than the screen in JOBLIST.
The first stage of the macro INSERT is stage 800 wherein the
question is asked "is JI equal to 01?". If the byte JI is 01, then
this means that the screen in the array is equal to zero.
If the value of the screen is zero, the program continues to stage
802. If JI is not 01, then the program continues to stage 804. For
purposes of discussion we will assume that the 04 bit of JI is "on"
and that we are proceeding to program stage 804. At stage 804 two
registers are set. The first register, R0, is set to the length of
the EXECUTIVE POINTER which is stored in the register BRY. Register
R1 is set to the address of the EXECUTIVE POINTER in the array,
which is in register IR6. After these two registers are set, the
program continues to stage 806 wherein the question is asked "is
the EXECUTIVE POINTER in the array whose address is in register R1
vacant or zero?" If it is vacant or zero then the program would
continue to stage 808.
If, at stage 806, the EXECUTIVE POINTER whose address is in
register R1 is not zero, then the program would continue through a
loop defined by stages 810 and 812 until a vacant or zero below the
starting address is found in the array. First, if the EXECUTIVE
POINTER is not zero, at stage 810, register R1 is reset to one
EXECUTIVE POINTER length beyond the address set in IR6, by adding
register R0 to register R1. Then, at stage 812, the determination
is made as to whether the continuance address has been reached. If
it has not been reached, the program returns to stage 806 and a
determination is made as to whether at that new R1 the EXECUTIVE
POINTER is zero. If the EXECUTIVE POINTER was zero, then control
goes to stage 808. If the EXECUTIVE POINTER at the new address in
R1 is filled, the program would continue through stages 810 and 812
until, either, (a) an empty EXECUTIVE POINTER location would have
been found, or, (b) the absolute continuance address in register
IR3 would be found. If the absolute continuance address in register
IR3 is found, the program will continue to stage 814. At stage 814,
the following registers would be set:
a. R1 is reset to the address of the last EXECUTIVE POINTER in the
array by subtracting the value of R0 from R1. Then,
b. Registers R0 to R6 are stored in array C1; and
c. The value of register IR3 is decreased by R0.
After completing these steps, the program continues to stage 816
where a determination is made as to whether the 01 bit in LEXICON
is on.
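The loop through stages 806, 810 and 812, together with step (a) of stage 814, can be sketched in Python. The sketch is a simplification: the array is a bytes object, addresses are offsets, a vacant EXECUTIVE POINTER is assumed to be all zero bytes, and the function name and return convention are hypothetical.

```python
from typing import Tuple


def find_vacant_slot(array: bytes, ir6: int, ptr_len: int,
                     ir3: int) -> Tuple[int, bool]:
    """Search from IR6 toward the continuance address IR3 for a vacant slot.

    Returns (R1, array_full). When array_full is True, R1 is the address
    of the last EXECUTIVE POINTER in the array, as in stage 814(a).
    """
    r1 = ir6                                   # stage 804
    while True:
        if not any(array[r1:r1 + ptr_len]):    # stage 806: vacant or zero
            return r1, False                   # continue at stage 808
        r1 += ptr_len                          # stage 810: R1 = R1 + R0
        if r1 >= ir3:                          # stage 812: continuance reached
            return r1 - ptr_len, True          # stage 814(a)
```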
At this time, perhaps a discussion of the LEXICON array and its use
in INSERT is desirable. The 16 bytes of the LEXICON array are
used by the INSERT macro. In the first byte, which is set in
SSEARCH, the last four bits are significant. The 01 bit is used to
signify whether an address has been determined for LEXICON in the YY
array storage. Further, when the 01 bit is on, there is an
indication that in addition to the address in the YY array having
been determined, the screen in JOBLIST has been stored in the high
end of the YY array and the address of the screen is stored
beginning at LEXICON plus 4 bytes. At stage 816, if the 01 bit is
zero, then the program would continue to stage 818. If the 01 bit
has been on, the program would have continued to stage 820. At
stage 818 certain steps are taken:
a. the register IR1 is set at a value equal to the beginning
address of the main storage array, which is stored in AYY during the
initialization by SSTATECL;
b. the System Parameter &LTHAYY, which is the length of the
main storage array YY, is added to IR1. This would bring us to the
end of the YY array.
c. From this value is subtracted the half word DUN1, which is the
length of the screen in JOBLIST.
d. Then, this address IR4 is stored in LEXICON plus 4 bytes.
e. In LEXICON plus 8 bytes is stored the address IR1 minus R0. R0
contains the length of the EXECUTIVE POINTER in the array.
f. In LEXICON plus 12 bytes is stored IR1 minus 2 times R0.
g. In LEXICON plus 16 is stored ADDRESS, which is the composite
address of the next available array which will be requested in
CREATE.
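The address arithmetic of stage 818 can be checked with a small Python sketch. The function name and the dictionary keyed by LEXICON offset are hypothetical, integers stand in for machine addresses, and it is assumed that the address stored at LEXICON plus 4 is the value just computed in IR1.

```python
def stage_818(ayy: int, yy_length: int, dun1: int, r0: int,
              address: int) -> dict:
    """Compute the values stored in LEXICON at offsets 4, 8, 12 and 16.

    ayy       -- beginning address of the main storage array YY
    yy_length -- length of the YY array (a System Parameter)
    dun1      -- length of the screen in JOBLIST
    r0        -- length of an EXECUTIVE POINTER
    address   -- composite address of the next available array
    """
    ir1 = ayy + yy_length        # steps a and b: end of the YY array
    ir1 -= dun1                  # step c: room for the JOBLIST screen
    return {
        4: ir1,                  # step d: where the screen is saved
        8: ir1 - r0,             # step e: first saved EXECUTIVE POINTER
        12: ir1 - 2 * r0,        # step f: second saved EXECUTIVE POINTER
        16: address,             # step g
    }
```

The save area therefore grows downward from the high end of the YY array: first the JOBLIST screen, then up to two displaced EXECUTIVE POINTERS.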
Then the program continues to stage 822 wherein the LMOVE macro is
used to move the screen from JOBLIST and store it at the location
described by register IR1 at the end of the array YY. This saves
the JOBLIST screen in storage for its eventual insertion in the
correct position in the array.
Then, as with the case of the LEXICON bit being turned "on", the
program continues to stage 820.
At stage 820 a check is made to determine whether the 02 bit of
LEXICON is turned on. If the 02 bit is zero, then, at stage 824,
the 02 bit in LEXICON is turned on and register IR1 is loaded with
the address in LEXICON plus 8. If the 02 bit were on at stage 820,
the LEXICON 04 bit would have been turned on at stage 826 and
register IR1 would have been loaded with the address at location
LEXICON plus 12. In this case, it means that there is already an
EXECUTIVE POINTER defined by the address in LEXICON plus 8 and,
therefore, it is necessary to load the new EXECUTIVE POINTER in the
address defined at location LEXICON plus 12.
After completion of the steps at either 824 or 826, control goes to
stage 828 wherein the half word MASK1 is set equal to the length of
the EXECUTIVE POINTER, which is in register R0. Then control is
transferred to stage 830, wherein the left move macro instruction
LMOVE is executed and the EXECUTIVE POINTER in the array, defined
by register IR3, is stored at location in register IR1 in the YY
array. Then registers R0 to R6 are loaded from array C1 at stage
832. These registers were saved during the transfer process at
stage 814.
At stage 808, the right move macro RMVC is executed. For this
macro, certain registers are set. Register BRYY, the number of
bytes which have to be moved within the array, is set equal to
register R1 minus register R6. Register BRY is set equal to register
R1 minus 1. Register R1 is set equal to register BRY plus register
R0. BRY now contains the end location of the old array.
It should be understood that R1 contains either the address of the
last byte in the last EXECUTIVE POINTER in the array, or
alternatively, if this stage 808 had been reached directly from
stage 806, the address of the last byte in the first vacant
EXECUTIVE POINTER in the array.
The right move macro RMVC moves the EXECUTIVE POINTER starting at
the address IR6 to the end of the array one EXECUTIVE POINTER
length, leaving a hole in the sub array for the new EXECUTIVE
POINTER whose screen is in JOBLIST and whose address will be
inserted in MMATCH. The last EXECUTIVE POINTER in the array must be
stored and, in fact, was stored previously, in the save area of the
YY array and the address where said EXECUTIVE POINTER was stored is
recorded at either LEXICON plus 8 or LEXICON plus 12.
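The effect of the right move at stage 808 can be illustrated with a Python bytearray. This is a sketch of the idea rather than of the RMVC macro: the function names are hypothetical, offsets stand in for addresses, and the caller is assumed to have already made room at the end of the array (either a vacant slot was found, or the last EXECUTIVE POINTER was saved in the YY array).

```python
def open_hole(array: bytearray, ir6: int, end: int, ptr_len: int) -> None:
    """Shift array[ir6:end] right by one pointer length.

    The bytes at array[ir6:ir6 + ptr_len] become the hole; their old
    contents remain until something is written there.
    """
    if end + ptr_len > len(array):
        raise ValueError("no room at the end of the array for the move")
    # The right-hand slice is copied first, so the overlapping move is safe.
    array[ir6 + ptr_len:end + ptr_len] = array[ir6:end]


def insert_pointer(array: bytearray, ir6: int, end: int,
                   new_pointer: bytes) -> None:
    """Open a hole at ir6 and place the new EXECUTIVE POINTER in it."""
    open_hole(array, ir6, end, len(new_pointer))
    array[ir6:ir6 + len(new_pointer)] = new_pointer
```

For instance, inserting BBBB at offset 4 of an array holding AAAA, CCCC and one vacant slot yields AAAA, BBBB, CCCC, with the array length unchanged.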
At the completion of the right move macro RMVC at stage 808, the
program continues to stage 834 wherein location C15 and register
BRYY are both set equal to the value of the half-word DUN1, which
is the length of the screen in JOBLIST. Location C16 and register
BRY are set equal to the value of the half word DUN1 plus
&ADDL. Thus BRY and C16 now contain the length of the EXECUTIVE
POINTER which will be inserted in the array. Next, registers R0 to
R6 are stored in array C1.
Then, the program continues to stage 836 wherein the question is
asked "is the LEXICON byte zero?" This would have occurred only if
the program stage 808 had been entered directly from stage 806. If
this was so, the LEXICON array would not be used because no
continuance is required; and, accordingly, the program then
continues directly to stage 838 wherein certain operations would be
accomplished before exiting from the program.
At stage 838, R0 is set to the value of the half-word DUN1, and IR6 is
changed to the IR6 value plus R0. IR6 now contains the location of
the place in the array where an address must be added to the screen
from JOBLIST to form a new EXECUTIVE POINTER. The address at IR6
is, of course, set to zero. Then IR6 is reset to the screen address
in the array by subtracting therefrom the value R0.
After completing these steps, the control goes to stage 840 wherein
another LMOVE macro is executed. The screen in JOBLIST is moved to
the address designated by IR6 in the array.
After completing stage 840, IR2 is reset to equal IR2 plus BRYY or
the address of the next screen in JOBLIST. IR6 is reset to IR6 plus BRYY
or the position in the array where the address must be placed
adjacent to the screen just taken from JOBLIST. This is
accomplished at stage 842. After completion of stage 842, the
program is completed.
Returning to stage 836, we will consider the possibility when
LEXICON is not equal to zero. Then, the program continues at stage
844 wherein a determination is made whether the 04 bit in LEXICON
is zero or one. If the 04 bit is zero, that means that the location
defined by the address in LEXICON plus 12 bytes is empty. If the 04
bit is one, then the location defined by the address in LEXICON
plus 12 is filled. It should be understood that when the location
in the YY array is filled whose address is in LEXICON plus 12 bytes
then, necessarily, the location determined by the address in
LEXICON plus 8 bytes is also filled and two EXECUTIVE POINTERS are
in place. If the 04 bit is on, the program continues to stage 846.
If the LEXICON 04 bit is zero, the program continues to stage
848.
At stage 846, the half word MASK1 has stored therein the contents
of register R0. R0 contains the EXECUTIVE POINTER length. Then
register IR1 is loaded with the address in LEXICON plus 8
bytes.
If the 04 bit of LEXICON were zero, as was stated previously, the
program would have continued at stage 848 wherein the 01 bit of
LEXICON would be checked. If the 01 bit in LEXICON were one, this
would have meant that there were no more continuances and control
would have gone to stage 852 and thence in the same manner that
will be discussed with respect to stage 802. However, for our
purposes, we will discuss the operation when stage 848 indicates
that the 01 bit of LEXICON is zero. At stage 854 the half-word
MASK1 is loaded with the EXECUTIVE POINTER length, which is in
Register R0. Then, control goes to stage 856 wherein the macro
LARGEXC is executed. This instruction merely zeros or erases the
EXECUTIVE POINTER which was in the hole identified by the address
recorded in register IR6. IR6 has recorded therein the address of
the hole where the screen of JOBLIST is to be inserted. Thus, at
stage 856, this hole has been erased and cleared for a later
insertion.
After completing the step at stage 856, control is transferred to
stage 858 wherein the LEXICON 01 bit is turned on. Then, at stage
860 the address in LEXICON + 8 is recorded in register IR1. Then,
the program continues to stage 862. At stage 862, another LMOVE
macro instruction is executed and the screen of the EXECUTIVE
POINTER located by the address in register IR1 is moved into the
JOBLIST in place of the screen originally therein. The screen
originally therein, of course, has been saved in the YY array at
the address recorded in LEXICON + 4.
After completion of the operation at stage 862, control goes to
stage 864 wherein the registers R0 to R6 are loaded from the array
C1 and the MSIGNAL 20 bit is turned on. After completion of stage
864, it will be obvious that the insertion operation has not been
truly completed by the operation above described, but the program
will continue, as shown in FIG. 16 through MMATCH stage 616, TBADD
macro 620, and stage 626 back to the start of SCREEN at stage 614.
This will recycle and, when it comes through for the second time,
the LEXICON 04 bit will be on. When the LEXICON 04 bit is on,
control would have come through stage 846 to stage 850. At stage
850 an LMOVE macro-instruction would have been executed to insert
the EXECUTIVE POINTER recorded in the YY array at the address in
LEXICON plus 8 bytes into the opening whose address is recorded in
register IR6.
After completing stage 850, control is transferred to stage 866
wherein register IR3 is loaded with the address in LEXICON + 12.
Then, at stage 868, another LMOVE macro is executed to transfer the
EXECUTIVE POINTER in the YY array determined by the address in
LEXICON + 12 to the place in the YY array determined by the address
in LEXICON + 8 bytes. After completing stage 868, control goes to
stage 870 wherein the 08 and 04 bits in the LEXICON byte are turned
off. Then, control goes to stage 862 wherein the screen stored in
the YY array whose address is at LEXICON + 8 bytes is transferred
to JOBLIST. Then control passes through stage 864 as discussed
previously.
If the LEXICON 04 bit was zero and the LEXICON 01 bit was one, the
control goes to stage 852, as discussed earlier. Similarly, when
the 01 bit of the JI byte was on, control also goes from stage 800
to stage 852. In stage 852, the register IR1 has recorded therein
the address in LEXICON + 8 bytes. The half word MASK1 has recorded
therein the value of R0, which is equal to the value in the half
word DUN1 plus &ADDL, the EXECUTIVE POINTER length. From stage
852 control is transferred to stage 872 wherein another LMOVE macro
instruction is executed and the EXECUTIVE POINTER stored in the YY
array at the address defined in LEXICON + 8 is inserted in the
array at the address recorded in IR6. Then, at stage 874, IR1
register is reset to LEXICON + 4 bytes. At stage 876, the screen
whose location was determined by the address in LEXICON + 4 bytes
is transferred back to the JOBLIST.
After completing stage 876, control goes to stage 878 and certain
parameters are set. First, the byte LEXICON is set to zero; the
registers R0 to R6 are loaded from array C1, and the MSIGNAL 20 bit
is turned on. After completing stage 878, control is transferred to
stage 620 in FIG. 16.
It must be understood that the hole created in the array will be
filled when control returns to SCREEN, after cycling through
SSEARCH, at stage 730 and goes to stage 732. The LMOVE macro in
stage 732 will insert the correct screen in the hole left in the
array.
In FIG. 21, there is shown the flow diagram for the MMATCH macro.
In the first stage in MMATCH a determination is made as to whether
the address associated with the EXECUTIVE POINTER in the array, as
determined by the address recorded in register IR6, is zero or a
value other than zero. IR6 has been set by any one of the three
stages in INDEX search 610, screen search 614 or AUXFILE search 630
to point at the address of the EXECUTIVE POINTER in the array or to
the continuance address. If the address specified by register IR6
is not zero then there is no mismatch; and, therefore, control
should go to TBADD. Thus, MMATCH will be bypassed and control will
go directly to TBADD. The output of stage 880 is connected to stage
882 wherein the address specified in register IR6 is recorded in
the location ADDRESS.
Then the program continues to stage 884 wherein the question is
asked "is the MSIGNAL 20 bit zero or one?". If the MSIGNAL 20 bit
is "one", indicating that one must access or create a continuance,
the program goes directly to the TBADD macro. If the MSIGNAL 20 bit
is off or zero, the program continues to stage 886. There the half
word MASK2 value is stored in the half word DUN1. As we noted
previously, the half word DUN1 contains the screen length and it
now contains the screen length of the next screen in JOBLIST.
Further, the third byte of MSIGNAL (or MSIGNAL + 2 bytes) is
incremented by one to indicate that this is the first screen in
JOBLIST. Each time MMATCH is executed at stage 884 and the MSIGNAL
20 bit is 0 the MSIGNAL + 2 byte is incremented by one. After
completing stage 886 the control goes to the TBADD macro at stage
620 in FIG. 16. The programmatic stages 880, 882, 884 and 886 are
not physically within the MMATCH program listing, but they have
been shown in the flow diagram for purposes of clarity.
After determining at stage 880 that the address specified by
register IR6 is zero, control is transferred to stage 888 where
SRGATE is set equal to 02. SRGATE is a special gate which has three
different settings. If SRGATE is 02, this means that the search was
successful and the search may be continued. If SRGATE is 01, this
means that the search must be reexecuted and that the JOBLIST will
have been rearranged. This occurs in the macro-instruction
STRATEGY. If the SRGATE is 00, this means that the search has
failed and it must be terminated. This is accomplished by exiting to
the instruction FINISHED in SSEARCH. The next stage of MMATCH is at
stage 890 where the question is asked "is MODE equal to one?" If
MODE is equal to one, then the search is in the storage mode. If
MODE is not equal to one, then a retrieval or one of the three
update mode operations is being executed. Thus, if the answer at
stage 890 is that MODE is one, control goes to stage 892. If the
answer at stage 890 is that MODE is not equal to one, the control
is transferred to stage 894. At stage 892 the MSIGNAL 01 bit is
turned on, indicating that the resident memory block has been
changed. This information will be used by the TBADD macro and the
Global Memory component (SMEMORY) if a new memory block is
required.
After completing step 892, control goes to stage 895 and SRGATE is
checked. If the SRGATE is zero, then this means that the search is
finished and it is terminated through stage 896 at FINISHED. If
SRGATE is 01, this means that the JOBLIST was rearranged in the
STRATEGY macro and a new search must be started by going through
stage 898 to the location NEWPLAY in the SSEARCH component. If the
SRGATE is in fact 02 at stage 895 then control goes to stage 900
wherein the composite address EMPTY is stored in the array at the
address defined by register IR6. This completes the construction
of the new EXECUTIVE POINTER in the array. Additionally, the
MSIGNAL 04 bit is turned on and the location at the head of the M
array where address EMPTY is normally stored is zeroed. It should
be remembered that the M and J arrays are permanently resident in
core. Also, at stage 900, the address in register IR6 is recorded
in the storage location BWX plus 76. This information may be
required by the CREATE macro and the SMEMORY component.
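The three settings of SRGATE and the branch taken at stage 895 can be summarized in a short Python sketch. The enumeration member names and the function name are hypothetical; only the values 00, 01 and 02 and the destinations FINISHED, NEWPLAY and stage 900 come from the description above.

```python
from enum import IntEnum


class SRGate(IntEnum):
    FAILED = 0      # search has failed and must be terminated
    REEXECUTE = 1   # JOBLIST was rearranged; search must be reexecuted
    SUCCESS = 2     # search was successful and may be continued


def stage_895(srgate: int) -> str:
    """Return the destination selected by the value of SRGATE."""
    gate = SRGate(srgate)            # any other value raises ValueError
    if gate is SRGate.FAILED:
        return "FINISHED"            # through stage 896
    if gate is SRGate.REEXECUTE:
        return "NEWPLAY"             # through stage 898
    return "stage 900"               # complete the new EXECUTIVE POINTER
```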
At stage 894 the register IR6 contains the address of a vacant or
zero location in the array. This means that the retrieval and
update has been unsuccessful and the override code information must
be used to determine whether or not the search must be aborted. It
should be noted that the three remaining options for MODE (namely
2, 3, and 4), which were discussed earlier, can be carried out by
appropriate program steps between stages 890 and 894.
If, at stage 890, it was determined that the search was not in the
storage mode, the program would have continued to stage 894 wherein
the counter JII is set to zero. SRGATE is also set to zero. This
information indicates that the search has been unsuccessful. This
information can, of course, be overridden if in the next succeeding
stages 904, 906 and 908 control goes to STRATEGY and STRATEGY
determines that, in fact, the search can be successful.
After completing the operation at stage 894, control goes to stage
902 wherein the question is asked "is the MSIGNAL 20 bit zero or
one?". The MSIGNAL 20 bit indicates whether or not a link (or
continuance) address is to be inserted at the foot of the array. If
a link address is being considered the answer is one, and control
goes directly to stage 895. Because the SRGATE has been set at zero
the unsuccessful search will be aborted through stage 896. If, at
stage 902, the answer was "zero," then the MSIGNAL 20 bit was not
turned on and control goes to stage 904 wherein the question is
asked "are there any Type 1 overrides?" If the answer is "no",
then control goes to stage 906 wherein the question is asked "are
there any Type 2 overrides?". If the answer at this stage is "no",
control goes to stage 908 where again the
question is asked "are there any Type 3 overrides?" If the
answer at stage 908 is "no", control goes to stage 895 and, since
the SRGATE is zero, the unsuccessful search would be aborted
through stage 896 at location FINISHED in the SSEARCH
component.
If at any one of the stages 904, 906 or 908, Type 1, Type 2, or
Type 3 overrides were found, control is transferred to stage 910
and the macro-instruction STRATEGY is executed. Type 1, Type 2 and
Type 3 overrides are introduced in the assigned descriptor-sets and
they are counted and their locations are noted in the translators
when these descriptor-sets are rearranged to their JOBLIST item
forms. This occurs before the search procedure begins. Thus, if
Type 1, Type 2 or Type 3 overrides are present in the
assigned descriptor-sets, the macro-instruction STRATEGY will be
used.
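The decisions taken at stages 902 through 910 after an unsuccessful
look-up can be summarized in a short sketch. The Python function
below is illustrative only; its name, its arguments and the returned
labels are assumptions, and the override counts stand for the counts
that the translators are said to record.

```python
def after_unsuccessful_lookup(link_bit_on, type1, type2, type3):
    """Decide where control goes once stage 894 has set SRGATE to zero.

    link_bit_on -- state of the MSIGNAL 20 bit (stage 902)
    type1..3    -- number of Type 1, 2 and 3 overrides (stages 904-908)
    """
    if link_bit_on:
        return "FINISHED"      # stage 902 -> 895 -> 896, search aborted
    if type1 or type2 or type3:
        return "STRATEGY"      # stage 910
    return "FINISHED"          # no overrides: stage 895 -> 896
```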
The flow diagram for the STRATEGY macro-instruction is shown in FIG.
22. In the first stage, 916, a special TRANSFER macro-instruction
transfers control to the special component SMATCH. In SMATCH there
will be accomplished, automatically, the inverted or intersecting
file type search.
At stage 918, the question is asked, "what is the value of
SRGATE?". If SRGATE is two, then control goes directly to stage 922
wherein the question is again asked "is SRGATE equal to, less than,
or greater than one?". If SRGATE is zero, at stage 918, the control
passes to FINISHED in the SSEARCH macro. This would terminate the
search.
If SRGATE is one, at stage 918, the following conditions have
probably occurred. The screen in JOBLIST might have been A1BD. If,
in searching the screen array, SMATCH at stage 916 determines that
there is an ABCD screen in the array, SMATCH would have set the
SRGATE so that the MOBILE CANONICALIZATION routine would be
executed at stage 920. SMATCH would not have set the SRGATE at one
if, during the translation stage, the JOBLIST had not been
normalized.
At stage 920, control is transferred to the macro-instruction
MOBILE wherein the MOBILE CANONICALIZATION package is executed on
the JOBLIST item. At the end of the SMOBILE program, control is
transferred back to the stage 922 in STRATEGY. MOBILE
CANONICALIZATION can be defined as a strategic rearrangement of the
JOBLIST item which might effect matching. If the JOBLIST item can
be rearranged so as to achieve matching, then SMOBILE will do
it.
After completing stage 920, control goes to stage 922 wherein again
the question is asked "is SRGATE equal to, less than or greater
than one?". If less than one, the unsuccessful search is terminated
at the location FINISHED in SSEARCH. If SRGATE is equal to one
control goes to the location NEWPLAY in SSEARCH. This occurs when
new permitted arrangements of the JOBLIST item were effected in
SMOBILE. If SRGATE is greater than one, the matching procedure
(executed in SMATCH and/or SMOBILE) has disclosed the existence of
an acceptable information subpath. In this case, at stage 912, the
composite address in location 0(IR6) is loaded into ADDRESS and
control goes to stage 924 in the TBADD macro.
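The control flow of STRATEGY (stages 916 through 922) may be
sketched as below. This is a hedged illustration rather than the
macro itself: smatch and smobile are hypothetical callables that
stand in for the SMATCH and SMOBILE components and simply return the
value each would leave in SRGATE.

```python
def strategy(smatch, smobile):
    """Return the (illustrative) label STRATEGY hands control to."""
    srgate = smatch()            # stage 916: inverted-file type search
    if srgate == 0:              # stage 918
        return "FINISHED"
    if srgate == 1:
        srgate = smobile()       # stage 920: MOBILE CANONICALIZATION
    # stage 922: compare SRGATE with one
    if srgate < 1:
        return "FINISHED"        # unsuccessful search is terminated
    if srgate == 1:
        return "NEWPLAY"         # JOBLIST item was rearranged
    return "TBADD"               # stage 912, then stage 924 in TBADD
```

For instance, strategy(lambda: 1, lambda: 2) models a search in which
SMATCH requests canonicalization and SMOBILE then finds an acceptable
subpath.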
The flow diagram for TBADD is shown in FIG. 23. In TBADD, the first
stage (924) contains a COMPARE macro-instruction which compares the
slow address parts of the composite addresses CURRENT and ADDRESS.
If these two slow memory addresses are equal, this means that the
requested memory-block is already resident in core-memory. It
should be remembered that CURRENT contains the virtual memory
address of the resident memory block, and ADDRESS contains the
virtual memory address of the requested memory-block. The second
part of ADDRESS, which is called the fast-memory address, specifies
the beginning address of the requested array when the requested
memory-block is core-resident.
Now to return to FIG. 23, if at stage 924 the slow address parts of
CURRENT and ADDRESS are equal then the requested memory-block is
already resident in core and control goes to stage 926. There, the
APART macro-instruction extracts the fast-memory part of ADDRESS
and stores it in register IR6. Next control is
transferred to stage 928 and the 04 bit of the MSIGNAL byte is
checked. If the 04 bit of MSIGNAL is off, control goes to stage
930. The 04 bit of MSIGNAL indicates whether or not the requested
subarray exists in core-memory. If the subarray already exists
the MSIGNAL 04 bit is off. If the subarray does not exist, then the
MSIGNAL 04 bit is on and a new subarray must be created by the
CREATE macro at the address specified in register IR6.
If the MSIGNAL 04 bit is on, control goes to stage 932 wherein the
macro instruction CREATE is executed. The operation of the
macro-instruction CREATE will be more fully discussed hereinafter
(see FIG. 24). However, for purposes of a simplified description,
CREATE first checks to see whether it is possible to fit a new
subarray in the resident memory block. If there is not enough space
in the resident block CREATE calls the Global Memory Component
(SMEMORY), which writes the resident memory-block in virtual memory
and then creates a new resident memory-block. All the composite
addresses are updated. Control then goes to stage 930 wherein the
MSIGNAL 04 bit is turned off. After completing stage 930, control
is transferred to stage 626, as shown in FIG. 16.
If there is room in the memory block for the subarray, CREATE
simply creates the subarray, updates the composite addresses in the
memory block, then control goes to stage 930.
If, at stage 924, it was determined that the slow memory portions
of CURRENT and ADDRESS were not equal, then control passes to stage
934 wherein the slow portion of ADDRESS would be compared with
zero. If ADDRESS was equal to zero, this would mean that a memory
block is not required because the fast memory address in ADDRESS
specifies a location in the permanently core-resident part of the
AUXILIARY FILE. Accordingly, control passes immediately to stage
926. If the slow-memory part of ADDRESS is not equal to zero at
stage 934, then control passes to the macro GLOBAL at stage 936.
GLOBAL calls the Global Memory component (SMEMORY), which
supervises the memory-blocks of the AUXILIARY FILE. GLOBAL insures
that the resident memory-block will be updated in virtual memory
and it fetches (or creates) the new memory-block whose address is
in ADDRESS. After GLOBAL has completed its operations, the
composite address CURRENT is set equal to ADDRESS. This is
accomplished at stage 938. This means that the request address
(ADDRESS) is now also stored in CURRENT.
After completing stage 938, control goes to stage 926 in the manner
discussed previously.
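The TBADD logic of FIG. 23 can be sketched as follows. The sketch
is illustrative and makes several assumptions: a composite address is
modeled as a (slow, fast) pair, and load_block and create_subarray
are hypothetical callables standing in for the GLOBAL and CREATE
macros.

```python
SUBARRAY_MISSING = 0x04   # the MSIGNAL 04 bit

def tbadd(current, address, msignal, load_block, create_subarray):
    """Make the requested block resident and return (CURRENT, IR6, MSIGNAL)."""
    slow, fast = address
    # Stages 924 and 934: a block is fetched only when the requested slow
    # address differs from the resident one and is not zero.
    if slow != current[0] and slow != 0:
        load_block(address)              # stage 936: GLOBAL
        current = address                # stage 938
    ir6 = fast                           # stage 926: APART
    if msignal & SUBARRAY_MISSING:       # stage 928
        create_subarray(ir6)             # stage 932: CREATE
    msignal &= ~SUBARRAY_MISSING & 0xFF  # stage 930: 04 bit turned off
    return current, ir6, msignal
```

When the slow parts of CURRENT and ADDRESS agree, neither callable
that touches virtual memory is invoked and only IR6 is set.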
In FIG. 24, there is shown the flow diagram for the CREATE macro.
In the first stage, 940, initializing information is computed.
First, register IR5 is set equal to the sum of the values in
registers IR6 and IR1. This sum in register IR5 is the absolute
machine address of the head of the array that is to be created.
Register IR6 contains the relative address of the head of the array
that is to be created within the resident memory block. Register
IR1 contains the base address of the memory block.
Register BRY is set equal to &TRKL times &NTRKS or the
memory block size. &TRKL and &NTRKS are system parameters
with &TRKL being the number of bytes per track and &NTRKS
being the number of tracks per memory block. Register R0 has
recorded therein the half word DUN1, which is the screen length.
Register R1 has recorded therein &ADDL, which is the address
length for the array. It is now necessary to determine, first, how
much storage is required for the array so that it will then be
possible to determine whether there is sufficient room in the
memory block for the new array to be created. Thus, the first step
is to proceed to stage 942 where the question is asked "is R0,
which contains the screen length, greater than zero, zero, or less
than zero?" If R0 is less than zero, control goes to stage 944,
where register R1 is set to &LSLOW, the slow memory address length
which is a system parameter used in AUXFILE, and register R0 is set
to zero.
If register R0 is equal to or greater than zero, control goes
directly to stage 946. Control also goes from stage 944 to 946. At
stage 946, register DUN7 has recorded therein the sum of registers
R1 and R0. In the case of control coming from stage 944, DUN7 will
contain &LSLOW. If register R0 was equal to or greater than
zero, then DUN7 will contain the sum of &ADDL and the screen
length. This is the EXECUTIVE POINTER length for the array.
Additionally, at stage 946, register BRYY is loaded with
&MATRIXS, which is a system parameter indicating the number of
EXECUTIVE POINTERS in a secondary array. Secondary arrays are those
associated with the screens BD.sub.1, LD.sub.1, BD.sub.2. . . , and
the bulk storage addresses.
The value of the byte (MSIGNAL+2) determines which kind of array is
to be created. At stage 948, the question is asked "does the
MSIGNAL plus 2 byte have two, more than two, or less than two
therein?". If there is less than two in the MSIGNAL +2 byte, then,
at stage 950, BRYY is set to 20. If there is less than 2 in the
MSIGNAL plus 2 byte this means an array associated with index M or
screen J is being created. In this case the length of the array is
arbitrarily set at twenty EXECUTIVE POINTERS. If the MSIGNAL + 2
byte is greater than two, then the present value of &MATRIXS is
correct and control goes directly to stage 952. If the MSIGNAL + 2
byte contains two, then control goes to stage 954 wherein BRYY
would be set equal to &MATRIXL. &MATRIXL is a system
parameter indicating the number of EXECUTIVE POINTERS which would
fill the primary array, which is associated with the screen
LD.sub.o.
At stage 952, the register BRYY (the number of EXECUTIVE POINTERS
in the array) is multiplied by DUN7 (the EXECUTIVE POINTER
length) and four bytes are added thereto; then the total is stored
in register R0. At this point the register R0 contains the relative
continuance address in the array that is to be created. The extra
four bytes are added to account for the location at the head of the
array where the relative address of the continuance address is
stored. Then, register R1 is set equal to register R0 plus
&ADDL, which is the total length of the array that is to be
created. &ADDL is added in order to provide space for the
continuance address, which will be added at the end of the array.
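The size computation of stages 942 through 952 amounts to a small
piece of arithmetic, sketched below. The function and argument names
are illustrative assumptions. The sketch also assumes that a negative
screen length selects &LSLOW as the pointer length, which is what the
description of stage 946 implies.

```python
def array_layout(screen_length, addl, lslow, msignal2, matrixs, matrixl):
    """Return (pointer_len, n_pointers, continuance_offset, total_length)."""
    # Stages 942-946: EXECUTIVE POINTER length (DUN7).
    if screen_length < 0:
        pointer_len = lslow                   # stage 944
    else:
        pointer_len = addl + screen_length
    # Stages 948-954: number of EXECUTIVE POINTERS (BRYY).
    if msignal2 < 2:
        n_pointers = 20                       # index M or screen J array
    elif msignal2 == 2:
        n_pointers = matrixl                  # primary array
    else:
        n_pointers = matrixs                  # secondary array
    # Stage 952: four bytes at the head hold the continuance offset.
    continuance_offset = n_pointers * pointer_len + 4
    total_length = continuance_offset + addl  # room for continuance address
    return pointer_len, n_pointers, continuance_offset, total_length
```

With a screen length of 4 bytes, an address length of 6 bytes and an
MSIGNAL+2 value of 1, the sketch yields twenty 10-byte pointers, a
continuance offset of 204 and a total length of 210 bytes.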
Control then goes to stage 956 where the question is asked "is
register R1 (the length of the array), greater than BRY (the size
of the memory block?)". If the answer is yes, then there is
something wrong in the system which must be checked. First, the
program would proceed to stage 958 wherein the register R14 is
decreased by 1. This means that the length of the array will be
decreased by one EXECUTIVE POINTER length. Then control goes to
stage 960 where the question is asked "is R14 greater than zero?".
If this is the case, as it should be, control returns to stage 952
to repeat stages 952 and 956. This loop will continue until either
the memory block size is greater than the array size, as determined
at stage 956, or the array size is zero. If the register R14 is
zero, control goes to stage 962 where certain registers are saved.
The exact operation at stage 962 and the succeeding stage 964 will
be discussed with respect to another phase of the macro-instruction
CREATE. However, suffice to say that control eventually goes to
stage 966 wherein the question is asked "is register R14 equal to
zero or a value other than zero?". Since, at stage 960 it was
determined that register R14 was zero, control goes to stage 968
where a message is printed that a screen is too large for the
memory block, and termination of the SOLID System would begin at
location CL1 in the CONTROL routine.
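The loop formed by stages 952, 956, 958 and 960 can be sketched as
follows. The sketch is illustrative only: the names are assumptions,
the pointer count is assumed to start above zero, and the abort at
stage 968 is modeled by raising an exception.

```python
def fit_pointer_count(n_pointers, pointer_len, addl, block_size):
    """Shrink the array one EXECUTIVE POINTER at a time until it fits.

    Returns (n_pointers, total_length) for the first size that fits.
    """
    while n_pointers > 0:                              # stage 960
        total = n_pointers * pointer_len + 4 + addl    # stage 952
        if total <= block_size:                        # stage 956
            return n_pointers, total
        n_pointers -= 1                                # stage 958
    # stage 968: not even a single pointer fits in the memory block
    raise RuntimeError("screen is too large for the memory block")
```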
If the array length is less than or equal to the memory block
length then control is transferred from stage 956 to stage 970. At
stage 970, the DUN7 is loaded with register R0. This occurs
because, if the array length had been decreased through stages 958
and 960, then DUN7 will have changed. The value in DUN7 is then
stored in the array at the location designated by register IR5.
Then the value of IR5 is increased by 4 bytes. Thus, IR5 now points
at the first EXECUTIVE POINTER or element in the array that is to
be created. At the final step of stage 970 the registers R0 to R4
are stored in the array C1.
Next, control goes to stage 972 where the EMPTY is decomposed into
its five parts in registers R0, R1, R2, R3, and R4. At this stage,
register R4 contains the address of the first unused byte in the
memory block. Control then passes to stage 974 where register R14
is set equal to the sum of DUN7 plus &ADDL plus register R4.
Thus register R14 contains the relative address of the last byte in
the new array that is to be created in core-memory. Register R15 is
set equal to R15 plus SAVEYY minus AYY. In this equation, the
original R15 was the length of the memory block, computed in BRY,
SAVEYY is the absolute machine address of the beginning of the
resident memory-block and AYY is the absolute machine address of
the beginning of the M and J arrays. Thus register R15 now contains
the relative address of the last byte in the memory block. At stage
976, a determination is made as to whether or not R14 is greater
than R15. If R14 is greater than R15, then this means that the new
array cannot be created in the resident memory-block. If R14 is
less than or equal to R15, control goes to stage 978 where the
register R4 is set equal to register R14. From stage 978 control
goes to stage 980 where the macro-instruction ASADD performs the
task of updating EMPTY with the new value in register R4. After
completing stage 980, control goes to stage 982 where registers R0
to R4 are loaded from array C1. Next, control is transferred to
stage 984 where register R0 is set equal to register R1 minus 4.
Since register R0 originally contained the relative address of the
continuation address and register R1 contains the length of the
array, register R0 now contains the length of the array minus four
bytes.
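The space test of stages 974 through 978 reduces to comparing two
relative addresses, as the following sketch suggests. The names are
illustrative assumptions: array_length stands for DUN7 plus &ADDL,
first_free for the value of register R4 taken from EMPTY, block_base
for SAVEYY and mj_base for AYY.

```python
def reserve_array(first_free, array_length, block_size, block_base, mj_base):
    """Return the new first-free address, or None if the array will not fit."""
    end_of_array = first_free + array_length          # stage 974: R14
    end_of_block = block_size + block_base - mj_base  # stage 974: R15
    if end_of_array > end_of_block:                   # stage 976
        return None                                   # go on to stage 996
    return end_of_array                               # stage 978: new R4
```

A return value of None corresponds to the case in which control
passes to stage 996.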
From stage 984 control goes to stage 986 where certain safety
checks are completed. First, the slow memory portion of address
EMPTY is compared with the slow memory portion of address EMPTY +
&ADDL. If they are equal, this means that no memory block is to
be created and, accordingly, at stage 988, EMPTY is set equal to
EMPTY thus updating the fast portion of EMPTY. If the two are not
equal at stage 986, control goes to stage 990 wherein the contents
of register R0 are stored in the DUM1. Then, at stage 992, the
macro-instruction LARGEXC is performed. In LARGEXC, the entire
array except for the first four bytes is zeroed. At this point the
new array has been created. After completing stage 992, control
goes to stage 994 where register R15 is set equal to SAVEYY, the
absolute address where the CURRENT address is recorded in the
memory-block. Next, the updated composite address EMPTY is stored
in the location specified by register R15. After completing stage
994, control returns to the TBADD macro.
The more difficult problem occurs when, at stage 976, it is found
that there is insufficient space in the resident memory block to
create the new array. When this occurs, control goes directly to
stage 996.
At stage 996, the macro-instruction LINKHOLE (defined previously as
macro 115) determines whether or not there are any unused arrays in
the memory block which can be used for the new array. If the answer
is yes, control would go immediately to stage 990, because the
array already exists in the memory block. If the answer is no,
control goes to stage 998 where a COMPARE macro-instruction
compares the slow memory address portions of EMPTY and (EMPTY +
&ADDL). If they are equal, this means we have not yet computed
the slow memory address of the new memory block that is to be
created. If they are not equal, this means that the slow memory
address of the new memory block that is to be created has been
computed and, accordingly, control goes directly to stage 1000 and,
thence, to stage 1002. If a new slow memory address had not been
computed, stage 1002 will be reached at the end of the branch which
begins at stage 962. For purposes of clarity, it is assumed that
the parts of EMPTY and (EMPTY + &ADDL) were not equal at stage
998. At this point the slow memory address of the new memory block
has been computed and, accordingly, at stage 1000, register R15 is
loaded with the address in SAVEYY, which is the address of the
location at the foot of the M and J arrays. Then, the EMPTY address
is stored at the foot of the MJ array. Next the MSIGNAL 80 bit is
turned off to indicate that the request for a newly created
memory-block is being executed. Also, the request address (ADDRESS)
is set equal to EMPTY + &ADDL. Register R15 is then updated to
(BWX+76). (BWX+76) contains the address in the resident memory
block where the new composite address was inserted (see stage 900
in FIG. 21). Then, in this location in the last memory block,
address EMPTY + &ADDL is inserted. Thus, the old memory block
has now been updated by inserting the new value of the address of
the new array. From stage 1000 control goes to stage
1002. However, before discussing the operation at stage 1002, let
us consider the case when, at stage 998, the slow memory addresses
of EMPTY and (EMPTY + &ADDL) were equal. In this case, control goes
to stage 962 where register R15 is loaded with the address in
SAVEYY, which is the address of the composite address at the foot
of the M and J arrays. Then, EMPTY is stored in the location at the
foot of the M and J arrays. Further, R15 is loaded with the address
contained in (BWX+76), and the composite address in the resident
memory block that is specified in register R15 is set to zero.
From stage 962 control goes to stage 964 wherein the macro
instruction APART separates the five components of the composite
address EMPTY + &ADDL and records them in the five registers R0
to R4. Then, at stage 966 the question is asked "is register R14
equal to zero or a value other than zero?" If R14 is equal to zero, the abort
procedure which begins at stage 968 is executed. If R14 is not zero
control goes to stage 1004. For purposes of clarity, stage 1004 has
been shown as a block grouping which will effect the following:
Stage 1004 is a series of steps which compute the five components
for the new composite address for the memory block that is to be
created. These five components are as follows:
1. &RD which is the device type number (disk, tape, data cell,
etc.) which is recorded in register R0.
2. &RDO which is the device number recorded in register R1.
3. &RTRK which is the beginning track on the device recorded in
register R2.
4. &RCYLN which is the beginning cylinder on the device
recorded in register R3.
5. &RFMADD which is a relative fast memory address in core
where there is space to create an array.
Because a new memory-block is being created &RFMADD is set
equal to the address of the foot of the M and J arrays, plus
&ADDL. If the device is a tape rather than a disk, it is not
necessary to have both track and cylinder numbers, but only a
record number and, accordingly, &RTRK and &RCYLN together
contain a single record number recorded in registers R2 and R3. If,
in computing the new components for the composite address of the
new memory block, it is determined that there is, in fact,
insufficient equipment to store the new memory block; for example,
one has run out of disk memory and there is no other available
virtual memory, then a message is printed to notify the machine
operator that he needs to obtain additional storage devices for
virtual memory. This abort procedure is completed through the
Global Memory component SMEMORY. The resident memory-block is saved
and the operator is notified that additional devices must be
obtained. Additionally, the system also advises the operator as to
what type of devices are to be preferred. The declared universe of
the system, that is, the amount of devices available to the system,
is defined in a single macro-instruction called BEGINS.
From the block of stages 1004 control goes to stage 1006, where the
address (EMPTY+&ADDL) is assembled from the five registers R0 to R4.
Then R4 (=&RFMADD) is incremented by &ADDL at stage 1008.
It should be understood that in stage 1004 the register R4
contained the value in SAVEYY and it is necessary to increment it
by the amount &ADDL to get the relative address of the first
byte where the new array can be created. After completing stage
1008, control goes to stage 1010, where the composite address
ADDRESS is updated with the new value in register R4. EMPTY is set
equal to ADDRESS at stage 1012. Additionally, the EMPTY address is
inserted in the resident memory block at the address in (BWX+76) as
was done with respect to stage 1000.
From stage 1012 control is transferred to stage 1002 wherein
registers R0 to R4 are loaded from the array C1 and then control
goes to stage 936 in the TBADD macro. This return to GLOBAL in
TBADD saves the resident memory block. After completing the save
operation in SMEMORY, control returns from the TBADD
macro-instruction to CREATE again. In CREATE, at stage 976, a
determination is made as to whether there is sufficient space to
create the new array in core and, accordingly, it does so at stages
976, 980, 982, 984, 986, 988, 990, 992 and 994 and then exits from
CREATE.
GLOBAL MEMORY
The GLOBAL memory component (hereinafter called SMEMORY), transfers
memory blocks between the AUXILIARY FILE, which can be located on
any combination of devices, and core storage. The AUXILIARY FILE
must be separated by definition from bulk storage. That is,
information which is utilized to address information in the bulk
storage will be found in the AUXILIARY FILE or in core storage. The
AUXILIARY FILE is, normally, placed on the disk storage of the
computer and aids in finding the address of a particular group or
segments of information. The DCB (macro-instructions of the IBM 360
which define the characteristics of the data set on a peripheral
storage device) and read/write instructions of each new device that
is made a part of the GLOBAL MEMORY are incorporated in the DCBMEM
macro-instruction, which is used in SMEMORY. The storage capacity
of each new device must be given in the macro BEGINS. This
information will be used by the computer to assign new memory
blocks when all previously assigned devices are full. Thus, by
modifying SMEMORY the GLOBAL MEMORY is easily extended to include
new storage devices when they are added. SMEMORY notifies the
operator when the GLOBAL MEMORY is full. Because the existing
storage is not altered in any way when the GLOBAL MEMORY is
extended, this component (SMEMORY) permits the simultaneous growth
of the hardware and retrieval systems.
There are two parts, A and B, of the SMEMORY component. Part A
supervises the AUXILIARY FILE while the information paths are being
traced or purged or created or updated. Part B is entered when the
job stream is terminated. Its function is to save (if necessary)
the resident memory block and to punch (if necessary) the first
part of the AUXILIARY FILE. This punched card deck, which contains
the M and J subarrays, will preface the input deck for the next job
stream.
It should be noted that the M and J subarrays, although part of
AUXILIARY FILE, are always in core storage. The M and J subarrays
are in core storage because all information paths start with these
subarrays and, thus, it is possible to save considerable time by
avoiding the necessity for fetching information from disk storage
to start these paths.
At every step in the retrieval package, safety procedures are
executed which assure that the memory block in the AUXILIARY FILE
will never be damaged by program, input, operator or machine
errors. Only the physical breakdown of the virtual memory hardware
components can damage the AUXILIARY FILE. No attempt will be made
to describe the numerous safety procedures. The two parts of
SMEMORY are described next.
The flow chart of Part A of SMEMORY is shown in FIG. 10. Before
explaining the flow chart, it should be understood that the input
data for SMEMORY contains a composite address whose slow-memory
part specifies the location of a memory block in the virtual memory
storage. The fast memory part of the composite address specifies
the location of the requested information when the memory-block
resides in core-storage. The composite address is normally six
bytes long and it is used to determine the course of action of
GLOBAL MEMORY. For example, in the search procedure it is necessary
that the machine believe, at all times, that the information it is
looking for is in core storage. It is a purpose of GLOBAL MEMORY to
find the information no matter where it may be and transfer it to
core storage whenever it is needed. The composite address discussed
above contains, in its first three bytes, information relating to
non-core storage. The first four bits designate the type of
device of non-core storage where the memory-block can be found. For
example, the sixteen permutations and combinations of the first four
bits will include codes for tape storage, disks, drum storage, etc.
This will key the next four bits to know on which particular one of
a possible 16 different units of tape, disk, or drum storage the
memory-block can be found. The next 16 bits are divided into six
bit and ten bit sections which together designate an address on the
particular storage element. For example, if the storage element is
a disk, the next sixteen bits contain the track number on the first
six bits and the cylinder number on the next ten bits. If the
storage element is a tape, then the 16 bits together specify the
record where the memory block begins.
The remaining three bytes in the composite address contain the
core address where the particular information can be found when the
memory-block resides in core.
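The six-byte composite address just described can be illustrated by
a short packing routine. This is a sketch under stated assumptions:
the fields are taken to be laid out most significant bit first in the
order given above, and the function names are hypothetical.

```python
# Field widths in bits: device type, unit, track, cylinder, core address.
WIDTHS = (4, 4, 6, 10, 24)

def pack_composite(device_type, unit, track, cylinder, fast):
    """Pack the five fields into a 6-byte composite address."""
    value = 0
    for field, width in zip((device_type, unit, track, cylinder, fast), WIDTHS):
        if not 0 <= field < (1 << width):
            raise ValueError("field does not fit in %d bits" % width)
        value = (value << width) | field
    return value.to_bytes(6, "big")

def unpack_composite(raw):
    """Split a 6-byte composite address back into its five fields."""
    value = int.from_bytes(raw, "big")
    fields = []
    for width in reversed(WIDTHS):
        fields.append(value & ((1 << width) - 1))
        value >>= width
    fast, cylinder, track, unit, device_type = fields
    return device_type, unit, track, cylinder, fast
```

For a tape, the track and cylinder fields would together be read as
a single 16-bit record number, (track << 10) | cylinder.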
It should be noted that there is one byte of information known as
MSIGNAL, which is continually being updated, and it specifies the
type of action that is to be taken by the Global Memory Component
(SMEMORY). If the rightmost bit of MSIGNAL is a "one", this means
that the resident memory block or the permanently resident part of
the AUXILIARY file has been altered by inserting a new EXECUTIVE
POINTER and by creating a new subarray. Another portion of storage
which is checked by SMEMORY is the composite address called
CURRENT. If CURRENT equals FFFFFFFF (a condition which is placed
into CURRENT in SSTRATECL when there is no AUXILIARY FILE) the
AUXILIARY FILE does not exist, and a new memory block will have to
be created in core-storage. When CURRENT equals zero, that means
that there is no resident memory block, and one will have to be
fetched from the virtual memory or created in core-storage. These
values of CURRENT are set when the AUXILIARY FILE is initialized at
the beginning of each job stream.
In one version of the SMEMORY component, the virtual memory was on
IBM 2311 disks which had two distinctly different write modes
(i.e., new write and rewrite). The 02, or next-to-rightmost,
bit of the MSIGNAL byte designates which of the two write modes is
to be used to store the resident memory block at the location in
virtual memory specified by the slow-memory part of the composite
address CURRENT.
The flow diagram for the first part of SMEMORY is shown in FIG. 10.
The first step in SMEMORY is to check the rightmost or 01 bit of
MSIGNAL at stage 322. If the MSIGNAL 01 bit is zero, this means
that the memory block was not changed and, therefore, there need be
no read out into virtual memory. If the MSIGNAL 01 bit was one,
then there must be a readout into virtual memory.
If the MSIGNAL 01 bit is "one", the next step is to check the
address CURRENT to determine whether or not there is an AUXILIARY
FILE, by subtracting from CURRENT, FFFFFFFF (CONM.). If there is no
AUXILIARY FILE, and the answer is, therefore, "zero", then control
continues to stage 326 wherein a determination is made as to
whether one wishes to create a memory block in core or whether one
wishes to read it from virtual memory. This determination is made
by testing the MSIGNAL 80 bit which, if it is "one" indicates that
you wish to create a new memory block in core and, if it is "zero",
indicates that you wish to read a memory block from virtual-memory
into core. If the MSIGNAL 80 bit is "one" then the next step is to
execute stage 328 wherein the machine updates the composite address
EMPTY, a position in core storage which contains the composite
address of the next available position in virtual memory where a
new memory-block can be stored. This EMPTY address is set in
CURRENT. Of course, the MSIGNAL 80 bit is turned off and the
MSIGNAL 02 bit is turned on to indicate that this newly created
memory-block has never been written into virtual memory. When the
time comes to write this memory-block into virtual-memory the 02
bit of MSIGNAL will indicate that the new write operation must be
used.
After completing step 328, control goes to stage 330 wherein the
MSIGNAL 01 bit is turned off to indicate that, at this point, there
has been no modification of the new memory block that is now
resident in core.
If, at stage 324 the answer was something other than zero, control
would have been transferred to stage 332. There CURRENT is compared
to zero. If CURRENT were zero, then the procedure set forth with
respect to stages 326, 328 and 330 would have been executed in
substantially the same manner. However, there is one variation that
could occur. That is, at stage 326 it could be determined that an
existing memory block was to be read into core-memory from the
virtual memory address specified in ADDRESS. In this case the
MSIGNAL 80 bit would have been zero and control would have gone to
stage 334, where the memory block which had been requested during
the SEARCH procedure is read into core storage. It should be
understood that in the TBADD macro in the SSEARCH component it was
determined that the resident memory block is not the correct memory
block for a particular search and, in fact, a different memory
block was requested by SSEARCH. After the steps at stage 334 are
completed the MSIGNAL 02 bit is turned off to indicate that the new
memory block has been read from peripheral storage.
It thus should be noted that when the MSIGNAL 02 bit is off, it
indicates that a rewrite procedure must be used when the resident
memory-block is transferred back to its specified location in
virtual-memory. If the MSIGNAL 02 bit is "on" this would indicate
that the resident memory-block is new and the new write procedure
must be used to transfer it to the AUXILIARY FILE. After completing
stage 336, control goes to stage 330 wherein the MSIGNAL 01 bit is
turned off, meaning that the new resident memory block has not yet
been changed. CURRENT is then loaded with the address of the
memory block taken from ADDRESS.
If CURRENT is not zero, it specifies where the resident
memory-block should be stored in the virtual memory. It must be
understood that the memory block in core-memory, which originally
came from the virtual memory or was created, has been modified
before SMEMORY was called. The virtual memory must be updated with
this changed resident memory-block before a new memory-block is
transferred to core-memory or created. Thus, if there is an address
in CURRENT, control goes to stage 338 wherein the MSIGNAL 02 bit is
again reviewed to determine whether this is a new write or a
rewrite procedure. If it is a new write procedure, then the
resident memory block has not been transferred to virtual memory
before. Thus, the resident memory block must be written in the
virtual memory for the first time. Its previously assigned virtual
memory location is found in the slow memory part of the composite
address CURRENT.
If this is a new write procedure, then the MSIGNAL 02 bit is "one"
and the program continues to stage 340 wherein the memory block in
core is written for the first time into the virtual memory at the
location specified by CURRENT. If the MSIGNAL 02 bit is zero a
rewrite procedure is used. Thus, control goes to stage 342 wherein
the rewrite procedure transfers the memory block from core back to
its address in the virtual-memory that is specified in CURRENT.
From stage 342, control goes to stage 326. It should be noted that
the entire purpose of SMEMORY is to give to SSEARCH or a similar
programmatic procedure a new memory block whenever it is needed and
take care of the procedural functions that are necessary to
preserve the core-resident memory block. The stages 322, 324, 332,
338, 340 and 342 have taken care of this procedural function. At
stage 326, a determination is made as to whether a new memory block
is to be created in core storage or whether a memory block is to be
read from virtual memory into core storage. If a new memory block
is to be created in core storage, then control is transferred to
stage 328. If the required memory block already exists it is read
into core storage at stage 334.
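The write-back and reload decisions of stages 332 through 342, 326
and 330 can be summarized in a short Python sketch. It is only an
illustration of the control flow described above, not the IBM
Assembly listing: the dictionary standing in for virtual memory, the
function name smemory_swap, the block size, and the hexadecimal mask
values given to the MSIGNAL bits are all assumptions. In particular,
the text says only that the 80 bit is zero when an existing block is
to be read, so treating a set 80 bit as "create a new block" is an
inference, and the test at stage 324 is not modeled at all.

```python
# Assumed mask values for the MSIGNAL bits named in the text.
CREATE_NEW = 0x80     # "80 bit": zero means read an existing block
NEVER_WRITTEN = 0x02  # "02 bit": on for a new block, off after a read
MODIFIED = 0x01       # "01 bit": off means the resident block is unchanged


def smemory_swap(state, virtual, address, block_size=16):
    """Preserve the resident block, then bring in the block at `address`.

    `state` holds 'msignal', 'current' and 'core'; `virtual` is a dict
    standing in for the virtual memory (hypothetical model).
    Returns which write procedure was used, or None.
    """
    write_kind = None
    # Stages 332 and 338: a nonzero CURRENT means the resident block
    # has a home in virtual memory and must be preserved first.
    if state["current"] != 0:
        if state["msignal"] & NEVER_WRITTEN:
            write_kind = "new write"   # stage 340
        else:
            write_kind = "rewrite"     # stage 342
        virtual[state["current"]] = bytes(state["core"])

    # Stage 326: create a new block or read an existing one.
    if state["msignal"] & CREATE_NEW:
        state["core"] = bytearray(block_size)          # stage 328
        state["msignal"] |= NEVER_WRITTEN
    else:
        state["core"] = bytearray(virtual[address])    # stage 334
        state["msignal"] &= ~NEVER_WRITTEN             # stage 336

    # Stage 330: the new resident block is unmodified; record its address.
    state["msignal"] &= ~MODIFIED
    state["current"] = address
    return write_kind
```

In this toy model both write procedures reduce to the same dictionary
store; the sketch records only which of the two the MSIGNAL 02 bit
would select.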
COPAK COMPRESSOR
COPAK is a high-speed, multistage, compressor-decompressor software
package that can be used to compress arbitrary bit-strings by
reversibly removing redundant information. Decompression occurs
without losing a single significant binary-bit of the original
string. Except for minimal commands, both the compressor and
decompressor parts of COPAK are fully automatic. COPAK operates
independently of both the data-base and the
information-content.
COPAK can be used for supervision of bulk storage and for
transmission of data in communications and computer networks. A
more effective role can be achieved by implementing COPAK on a
small, high-speed, low-cost, specially designed dedicated computer.
This unit could be interfaced with computer/communication networks
or used on a stand-alone basis for compressing and decompressing
information. As such, it is highly useful as a buffer-converter
between various combinations of computer systems and input-output
devices. Careful considerations indicate that this low-cost unit
could have a throughput between three and thirty times faster than
COPAK on the IBM 360/67. The throughput on the 360/67, inclusive of
both input and output times, lies between 40K and 900K BAUDS, with
the optimum near 550K BAUDS.
COPAK has been described in detail hereinunder as a machine process
in the form of a combination of a computer software package and a
general purpose digital computer of adequate capacity and
versatility. In fact, the COPAK package described hereinunder has
been utilized in conjunction with general purpose machines such as
IBM 360/67 and 360/40. It is noted that when carrying out COPAK,
general purpose machines perform a specialized task and only those
components of the warehouse of components contained therein which
are ordered and organized by the COPAK act, as controlled by COPAK.
In effect then, the combination of COPAK and a general purpose
machine becomes a special purpose digital computer. Alternatively,
the flow diagrams and the program steps and instructions described
hereinunder in detail comprise a teaching of combining existing
hardware components, such as those used in the general purpose
machines mentioned above, under control of COPAK, to arrive at a
special purpose computer carrying out COPAK. The process of so
combining existing components as dictated by COPAK is an
engineering task for one of ordinary skill in the art and does not
involve inventive efforts. Although COPAK is usually referred to as
a "software package" hereinunder, its function as a special purpose
machine when combined with a general purpose digital computer
should remain clear.
The communications and computer industries have placed great
emphasis on engineering research which can increase the "efficiency
of networks" by increasing the channel capacity or speed (of
transfer) or by reducing the proportion of redundant signals. In
recent years the storage capacities of peripheral devices (like
disks, drums, data-cells, tapes and cards) have been enormously
increased by advances in engineering technology. Some special
recoding techniques that save storage and/or lower transmission
costs have been widely used. However, these special techniques are
of limited usefulness because they apply to particular devices
and/or they are not independent of the data base. There appears to
be no report of a major effort to devise general software packages
that can increase the information content per unit of the
information itself. Such packages could artificially increase the
storage capacities of existing facilities and lower transmission
costs.
To be of more than transient usefulness these general software
packages should meet as many of the following specifications as
possible.
i. With a minimal number of commands the software packages should
be capable of handling any binary coded information. This means
that compression and decompression must be independent of the
data-base or the information content.
ii. Compressed information should be automatically decompressed
back to the original whenever it is needed.
iii. For communications networks the rate of compression should not
be less than the rate of transmission. The decompression rate (at a
receiver station) should not be slower than the rate of
compression.
iv. There must be checks to ensure that errors in the compressed
information will be detected before or during decompression.
v. The effectiveness of the proposed package for increasing the
capacity of existing storage devices will be determined by several
factors. Some of these are: the access and transfer times of the
peripheral equipment; the speed of decompression; and the frequency
with which the particular information is used. Obviously, infrequently
used information can be highly compressed to release storage that
would not normally be available.
To be fully effective in both storage and communications
applications the general software packages should have adjustable
parameters which would permit the user to stipulate the maximum
amount of time that can be devoted to compressing (or
decompressing) information.
The COPAK compressor meets the five specifications just listed. The
computer speed and two variable parameters determine the rate of
both compression and decompression. On the IBM 360/67 COPAK
compresses information at rates of 40,000 to 900,000 BAUDS.
Decompression is at least one and a half times faster.
Definition and Commands
The two parts of COPAK (Compressor and Decompressor) each have two
stages (SNUPAK and SANPAK). The COPAK compressor handles the
information as strings, segments and substrings. A string of
information can be divided into non-equal segments. Segments can be
sub-divided further into substrings. The lengths of strings,
segments and substrings are a user option. The numeric stage
(SNUPAK), which can process any information designated as binary
coded numbers, handles segments of information at the substring
level. The alphanumeric stage (SANPAK) handles strings of
information at the segment level. As described hereinunder, COPAK
processes one segment per string (i.e., string = segment) with each
segment containing between one and twenty substrings.
The Device Command LLENGTH, which must have a value less than 256,
specifies the number of bytes in the label or key which may preface
each string. This information is not processed by COPAK. The
leftmost LLENGTH bytes are removed from the first segment and the
shortened segment is processed by COPAK. The structure of the
stored composite string is:
The String Command MODE determines which part of the COPAK
compressor is to be used, i.e., MODE=0, decompress; MODE≠0,
compress. Three other String Commands (LEXCON, LEXPCH and LEXMODE)
are associated exclusively with the alphanumeric compressor stage
(SANPAKC) of COPAK. They will be discussed later.
The three Substring Commands (NV, SOS and LSX) are used extensively
in the numeric stages (SNUPAK) of both parts of COPAK. NV, which is
entered once for each segment, is the number of substrings (maximum
20). One SOS command and LSX command are entered for each
substring. Together they determine the entry format-type of the
substring and the path that is to be taken through the compressor
parts of COPAK. The entry format-type for each substring is stored
in the compressed segment as a four-bit format code. This is used
to produce hard copy when the segments are retrieved. Format codes
used are: A=1; I=2; E or F=3; X=4 (printed in the hexadecimal
format (B)). Here X is the IBM 360 column binary. The substring
commands are not entered if MODE= 0 (i.e., for retrieval).
Overview of the COPAK Compressor
In the compression mode (MODE≠0) the compressor parts of
COPAK construct a completely self-defined string which contains the
label or key; format codes; string structure (i.e., segments and
substrings retain their identities); and sufficient information to
ensure that errors will be detected during decompression. This
information, exclusive of the label, is normally less than 24 bytes
per segment. It is added even if there is no actual compression.
The decompressor parts of COPAK unscramble the self-defined string
to obtain the identical original information. The error-checks are
executed during decompression. If an error is found, control goes
to a location where error-correcting procedures and/or
retransmission commands can be executed.
The status of each segment is recorded in a four-byte work area
(PARM) which is updated whenever the segment is altered. The
structure of PARM is:
The status of each substring in a segment is indicated by a
four-byte word (SOS) which is updated whenever the substring is
altered. The substring composite control words (SOS) contain four
items of information thus:
Here NDR is the "depth of representation" that is computed in the
differencing procedure (of NUPAKC).
A flow diagram for the COPAK compressor is given in FIG. 11. As a
segment of information enters the computer its status-of-substring
control words (SOS) are changed, to the form shown above, and the
status-of-segment control word (PARM) is constructed. At this stage
the sign of SOS and the values of both NDR (in SOS) and LSX
together determine how a substring will be handled by the
compressor part of SNUPAK. After some preliminary processing of the
segment in SANPAKD it is processed, one substring at a time, by
SNUPAK. In this step the SOS composite words are updated and a new
status-of-segment control word (PARM) is constructed. The sign of
the first SOS and the number of bytes in the segment emerging from
SNUPAK are transferred to JII, which is the temporary control
variable for SANPAKC. In the final step of SNUPAK the control words
(PARM and SOS), check information, and other data needed by the
decompressor part of SNUPAK are inserted at the head of the
segment.
The information in JII is used by the alphanumeric compressor part
(SANPAKC) to decide whether or not compression of the newly defined
segment is to be attempted. If compression occurs in SANPAKC,
information, which is used by SANPAKD during decompression, is
inserted at the head of the segment. In the final step of SANPAKC,
four bytes of control-information are inserted at the head of the
segment. The label or key information is then inserted preceding
the control-information at the head of the segment, (see Device
Commands). The structure of the four byte word of
control-information is:
Here NL is the number of redundant bit-patterns removed by
SANPAKC.
A single string command suffices to bring about decompression of a
stored segment. When this command, MODE=0, is used the label or key
is first removed from the head of the segment. Then the four bytes
of control information are extracted and the following steps are
executed:
Step a. The compressed segment is decompressed with the
alphanumeric decompressor (SANPAKD) if NL≠0.
Step b. The control information that was inserted after processing
by the SNUPAK compressor is extracted.
Step c. The substrings of information are decompressed one at a
time by the decompressor part of SNUPAK.
Step d. The label or key, previously removed, is replaced at the
head of the decompressed string.
Error-checks occur at every step of this decompression procedure.
Thus a segment with N substrings has (N+2) absolute error checks.
Also, there are an additional 15 error-checks which are made during
the decompression by SANPAKD and SNUPAK. Moreover, the conventional
CHECK-SUM can be used as an additional error check. If errors are
found, the decompression is aborted and control goes to location
RTRANSMIT, where error-correcting and retransmission procedures can
be utilized.
The compressor parts of COPAK have incorporated fail-safe
procedures which prevent the inadvertent destruction of
information. For example, if SNUPAK is told to compress text or
binary information as integers it will abort and change the
processing commands to execute SANPAKC without destroying the
data.
SANPAKC
INTRODUCTION
SANPAKC is the Macro instruction used for alphanumeric compression
of information within the COPAK system.
DETAILED DESCRIPTION
The data on which SANPAKC operates is in alpha-numeric form, in
strings of units. In the embodiment described hereinunder the units
are conventional 8-bit bytes. It should be clear, however, that
SANPAKC, as well as the complete COPAK package, can be equally
applicable to machines using units other than bytes.
Two distinct types of compression are carried out consecutively.
Each may be carried out either in Fast Mode or in Slow Mode.
In Type 1 compression, the string is searched for identical
patterns of two or more contiguous units. If such identical
multi-unit patterns are found, they are deleted from the string and
decompression information which takes less space but has sufficient
information content for subsequent decompression of the string to
its original form is added to the string.
In the Slow Mode of Type 1 compression, the scan for identical
multi-unit patterns is carried out by comparing a pattern of
several contiguous units of the string with all other patterns in
the string of like size. In the Fast Mode, this is carried out by
comparing previously chosen patterns which are believed to occur
often with patterns of like size in the string.
In Type 2 compression, which is executed after the completion of
the Type 1 compression, the compressed string is scanned for
individual units which occur more than a certain number of times.
If such units are found, they are deleted from the string and
decompression information is added to the string, but only if the
length of the decompression information is less than the length of
the deleted information.
As a brief qualitative description of a particular example of
carrying out Type 1 compression in the Slow Mode, a string of 1,000
bytes is scanned such that the numerical value of each byte is used
to address a 256-byte table in which each location corresponds to a
unique one of the 256 possible combinations of the eight binary
bits of each byte of the string and each location of the table acts
as a counter for the number of times it has been addressed. After
the last byte of the 1,000 byte string has been used as an address
in this manner, the table is examined for locations which have not
been addressed. The address values of these locations, if any, are
stored consecutively in one area of the LEXICON table and are called
Type 1 codes. It will be appreciated that these Type 1 codes
represent bytes which are not present in the 1,000 byte string.
Additionally, the address values of the locations of the 256-byte
table which have been addressed more than a certain number of
times, for example, more than 34 times, are stored in another area
of the LEXICON table and are called Type 2 codes. These Type 2
codes represent bytes which occur very often in the 1,000 byte
string and are likely candidates for deletion. Next, a pattern of
contiguous bytes from the string, for example, the first 12 bytes,
is compared with all other patterns of the same format in the
string. Identical patterns found in this manner are deleted from
the string and are replaced by a Type 1 code from the LEXICON
table, but only if actual saving in string length would result from
this process and only if a unique Type 1 code is available for each
group of like patterns. The same Type 1 code followed by the 12
bytes of the deleted pattern is inserted at the beginning of the
string for later use in decompression. The process is repeated for
different patterns of contiguous bytes for as long as there are
unused Type 1 codes and for as long as saving in length of the
string can be achieved. When a pattern has been found to occur
several times in the string and has been deleted therefrom, it is
stored in a PCORDS table which contains patterns likely to occur
often in similar strings. A savings ratio is associated with that
pattern to indicate the degree of compression achieved by the use
of that pattern.
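One pass of the Slow Mode Type 1 replacement just described can be
sketched in Python as follows. This is a minimal illustration under
stated assumptions rather than the SANPAKC implementation: the
function names are hypothetical, occurrences are taken as
non-overlapping, the saving test is a direct comparison of lengths,
and the pattern length is passed to the decompressor as an argument
because the control words that carry it in COPAK are not modeled.

```python
def unused_byte_values(data: bytes) -> list[int]:
    """Byte values absent from the string: candidate Type 1 codes."""
    present = set(data)
    return [v for v in range(256) if v not in present]


def type1_compress(data: bytes, start: int, r: int):
    """Try to remove the r-byte pattern that begins at `start`.

    Returns code + pattern + shortened string, or None when no unused
    code exists or when no saving in length would result.
    """
    codes = unused_byte_values(data)
    if not codes:
        return None
    code = bytes([codes[0]])
    pattern = data[start:start + r]
    # Every (non-overlapping) occurrence is replaced by the one-byte
    # code; the code and the pattern are placed at the head for later use.
    packed = code + pattern + data.replace(pattern, code)
    return packed if len(packed) < len(data) else None


def type1_decompress(packed: bytes, r: int) -> bytes:
    """Undo type1_compress for a known pattern length r."""
    code, pattern, body = packed[:1], packed[1:1 + r], packed[1 + r:]
    # The code never occurred in the original string, so every code
    # byte in the body marks a removed pattern.
    return body.replace(code, pattern)
```

For a 40-byte string holding three copies of a 12-byte pattern, the
sketch yields 20 bytes: one code byte, the 12 pattern bytes, and the
original string with each copy reduced to a single code byte.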
In Slow Mode of Type 2 compression, a portion of the compressed
string, for example, a portion of 256 consecutive bytes, is
examined for redundancy of particular individual bytes anywhere in
the portion. If a particular byte selected from the Type 2 code in
the LEXICON table is still found to occur in that portion more than
a certain number of times, a 256 bit map is constructed in which
each bit location corresponds to a byte of the examined portion of
256 bytes. The bit map serves as a record of the byte position in
which the particular byte was found. The redundant bytes are then
deleted, the string is closed in to take up the vacated space and
the bit map together with the value of the deleted byte is added to
the string after the size of the bit map is minimized. The value of
a deleted byte and the savings ratio associated with it may be
added to the PCORDS table.
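The bit map idea of Type 2 compression can be sketched in Python as
shown here. It is an illustrative sketch only: the function names
and the bit ordering within each map byte (leftmost bit first) are
assumptions, and the step that minimizes the size of the bit map is
not modeled, so the map always occupies 32 bytes.

```python
def type2_compress(portion: bytes, value: int):
    """Delete every byte equal to `value` from a portion of <= 256 bytes.

    Returns (bitmap, kept): a 32-byte (256-bit) map whose set bits mark
    the positions of the deleted bytes, and the closed-up remainder.
    """
    if len(portion) > 256:
        raise ValueError("portion must not exceed 256 bytes")
    bitmap = bytearray(32)
    kept = bytearray()
    for i, b in enumerate(portion):
        if b == value:
            bitmap[i // 8] |= 0x80 >> (i % 8)   # record the position
        else:
            kept.append(b)                      # close up the string
    return bytes(bitmap), bytes(kept)


def type2_decompress(bitmap: bytes, kept: bytes, value: int, length: int) -> bytes:
    """Re-insert `value` at every position marked in the bit map."""
    out = bytearray()
    remaining = iter(kept)
    for i in range(length):
        if bitmap[i // 8] & (0x80 >> (i % 8)):
            out.append(value)
        else:
            out.append(next(remaining))
    return bytes(out)
```

With an unminimized map, the decompression information is 32 map
bytes plus the value byte, so deletion pays only when the byte occurs
more than 33 times in the portion. That is at least compatible with
the "more than 34" threshold used in the example above.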
In the Fast Mode of carrying out Type 2 compression, the portion of 256
bytes from the string is checked for the occurrence of bytes
selected not from the LEXICON table but from previously stored
bytes in the PCORDS table.
In both modes of both Type 1 and Type 2 compressions, continuous
track is kept of various string characteristics for the purpose of
insuring complete reconstruction of the compressed string and for
the purpose of providing adequate error detection features. A more
detailed explanation of the SANPAKC compression, with particular
reference to the flow diagrams in the drawings, can be found
below.
The flow chart of SANPAKC is given in FIGS. 1A, 1B and 1C. The
first step performed in the Macro SANPAKC is to initialize all the
registers and counters in that portion of the computer which is
being used for alphanumeric compression. The next program
instruction at step 11, is to check whether the MODE Command is set
equal to "0" or not. "0" means that no compression is desired, and
"1" means that compression is desired. The "0" value would occur
when the system was in a retrieval mode and therefore, compression
would not be required. If the machine was in the storage mode
(MODE≠0) compression might be desirable. The next step in the
program is to determine if the variable JII is greater than zero.
If JII is equal to or less than "0", that means that no compression
is desired. If it is greater than "0" then compression is desired.
The only way that JII would be negative is by setting it to a
negative value prior to entering SANPAKC indicating that
compression, by SANPAKC, is not desired. Therefore, even though the
computer was in the store mode, one could prevent compression of
the information. If JII is positive, the program begins compression
at stage 12 in the flowchart of FIG. 1A. Although the flowchart
shows various stages in the program, it is understood that this is
just a means of designating a group of steps to be completed at a
particular point in time. The program listing is IBM Assembly
language and is set forth at the end of the written description. At
stage 12, the program initiates the steps of finding all available
codes and then storing the available codes in a location named
LEXICON. Thus, this step is achieved as follows:
a. A thousand byte string of information is scanned one byte at a
time starting from the first byte. The numerical value of the first
byte is used to address a location in a table of 256 byte positions
corresponding to the 256 different bit configurations possible in a
single byte of information. A count is initiated at that particular
position, to indicate that the particular byte has been found once
within the thousand byte string. The next byte within the thousand
byte string is similarly used to address a location in the 256-byte
table (i.e., by adding the numerical value of the scanned byte to
the base address (beginning address) of the table) and a count is
added at that particular location. This is continued throughout the
one thousand bytes in the string. Where bytes within the thousand
byte string are identical, the count at the particular location in
the 256-byte table will indicate that the particular byte of
information appears more than once in the thousand byte string. The
counters in the 256-byte table are not permitted to exceed the
value 255 so that an absolute frequency count of the number of
occurrences of a particular byte is not achieved if the byte occurs
more than 255 times in the string. However, in going through any
thousand byte string, there will be many of the positions or
locations in the 256-byte table which will not be utilized as there
is no byte in the thousand byte string corresponding to that
location. Those locations which are "0" in the 256-byte table are
determined by scanning the table. The corresponding numerical
values (≤255) of the 256-byte table positions (i.e., the number
of bytes past the beginning of the table) are then transmitted by
means of program instructions to the 256-byte array named LEXICON
and stored in consecutive byte locations of LEXICON to act, at a
later stage, as possible Type 1 code numbers (for Type 1
compression) for groups of bytes to be compressed. The count in
each of the individual positions in the 256-byte table where there
has been one or more counts is also scanned to determine whether
any particular location shows more than 34 counts. This indicates
that the particular byte is a candidate for Type 2 compression
(which will be described below). Any location which shows more than
34 counts is also stored in LEXICON to act later as Type 2 codes.
LEXICON has only 256 positions of storage. However, the positions
in the 256 byte table mentioned above which show more than 34
counts are stored starting at the position 256 of LEXICON and
working backwards. For example, if position 193 in the 256-byte
table were to show more than 34 counts it would be placed in
position 256 in LEXICON and if position 232 in the 256-byte table
were also found to have more than 34 counts it would be stored in
the 255th position in LEXICON. It should be noted that it is
impossible for the Type 1 codes stored in LEXICON to overlap the
Type 2 codes stored in LEXICON as these positions are derived from
the 256-byte table mentioned above.
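The construction of LEXICON at stage 12 can be sketched in Python as
follows. This is a minimal sketch assuming zero-based indexing, so
that "position 256" of LEXICON becomes index 255; the function name
build_lexicon and the returned counts are illustrative and do not
appear in the program listing.

```python
def build_lexicon(data: bytes, type2_threshold: int = 34):
    """Return (lexicon, n_type1, n_type2) for one string.

    Type 1 codes (byte values absent from the string) fill LEXICON
    from the front; Type 2 codes (byte values counted more than
    `type2_threshold` times) fill it from the back.
    """
    counts = [0] * 256
    for b in data:
        if counts[b] < 255:        # counters may not exceed 255
            counts[b] += 1

    lexicon = [None] * 256
    n_type1 = 0
    n_type2 = 0
    for value, count in enumerate(counts):
        if count == 0:
            lexicon[n_type1] = value
            n_type1 += 1
        elif count > type2_threshold:
            lexicon[255 - n_type2] = value
            n_type2 += 1
    return lexicon, n_type1, n_type2
```

A byte value is either absent or frequent, never both, and at most
256 values exist in total, which is why the two groups of codes
cannot collide inside the 256 positions.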
The next group of instructions in SANPAKC is shown at stage 14 of
the flowchart. At this point the question is asked "are there any
codes available?". At stage 14 LEXICON is checked at the front end
thereof to see whether any Type 1 codes have been stored as a
result of the steps taken at stage 12. It should be noted that when
the available Type 1 codes were stored in LEXICON, a count was made
of the number of available codes thus stored during the steps
defined with respect to stage 12. When the step set forth in stage
14 occurs a check is made to determine whether the last mentioned
counter has counted any available Type 1 codes being stored in
LEXICON. It is possible, if the thousand byte string contains
256 bytes of information different from each other, that there may
be a count in each one of the 256 counters in the 256-byte table
mentioned previously. In that case, there are no available Type 1
codes stored in LEXICON. If the answer at stage 14 is that there
are no available codes, a different type of compression, namely
Type 2 compression, is used. However, Type 2 compression will be
described more fully below.
If the answer at stage 14 is "yes" then the program continues to
stage 16 where the determination is made as to whether the
compression technique should be completed in the "Fast-Mode" or the
"Slow-Mode". The Fast-Mode will be discussed separately as most of
the operations in the Fast-Mode are included in the Slow-Mode and,
in fact, the Fast-Mode may be considered a special case of the
Slow-Mode. A full description of the Fast-Mode will be set forth
below after consideration of the operation in the Slow-Mode. Thus,
if the answer at stage 16 is "no", the compression process
continues in the Slow-Mode and the program continues to stage 18
where counters R and RM are set. Counter R is a counter set to
contain the value of the largest number of contiguous bytes in the
pattern for the occurrence of which the string will be searched
later. RM is equal to R-1. For purposes of the system, in actual
use, R has been set at a maximum of 12 bytes and, of course, then
RM would equal 11. After the counters R and RM are set, the program
then proceeds to stage 20 where a scan of the input string is
initiated to locate the occurrences of redundant patterns. A
pattern is a contiguous grouping of bytes of variable length.
Initially, we will consider groups of contiguous bytes of length 12
bytes. At stage 20, as shown in the diagram, there is a set of
instructions to be followed entitled CCD: a number JII which
designates the length of the string of information at any given
point in the compression process (please note that in the example
given above the string length was initially 1,000 bytes); and a
number CS3 which is a number designating the number of bytes after
the starting point at which redundancy is to be checked. The
instruction CCD is described below.
A first contiguous group of bytes having a length determined by the
counter R (in the first instance 12 bytes) is recorded and compared
along the length of the string moving one byte at a time to find
how many times and where the like of the first contiguous group of
R bytes can be found in the string. If there is no other contiguous
group of R bytes exactly like the first group being sampled, that
result is transmitted to stage 22 of the flowchart where the
question is asked as to whether a saving can be achieved by
removing the redundant pattern of bytes. The answer to this
question is determined by examining whether the inequality
R+2 < N(R-1) is satisfied or not. N is the number of identical
groups of bytes
found during a single scan of the string. Since only one group had
been found during the scan, R+2=14 and since N(R-1)=11; the answer
to the question as to whether there was a saving is obviously "no".
Since the answer is "no", the program continues by transmitting
this response to stage 24 where a determination is made as to
whether the operation is in the Fast Mode. As was discussed
previously, this operation is being accomplished in the Slow Mode
and the answer at stage 24 is also "no". If the answer were "yes",
another operation would take place. This operation will be
discussed in conjunction with the description of the Fast-Mode.
Since the answer is "no", the program continues by transmitting
this information to stage 26 wherein steps are taken to determine
whether additional comparisons can be made. This determination is
made by checking whether CS3 plus 1 plus 2R is less than JII. This
test determines whether the sampling has reached the end of the
string, since any more comparisons would be useless when there are
not enough bytes left in the string to be able to effect a savings.
Since, in this case, CS3 is "0" and the group CS3+1+2R is equal to
25, and this is certainly less than 1,000 (JII) the answer at stage
26 is "yes". (What happens when the answer is "no" at stage 26 will
be discussed below.) First, since the answer at stage 26 is "yes",
program control is returned to stage 20 where CS3 now has the value
of "1" by the addition of one byte and CCD continues starting with
the next byte of the groups of 12 bytes to be compared. Thus,
starting at one byte past the first point of the string, the
succeeding group of 12 bytes is checked for redundancy going byte
by byte along the string to determine whether there are any similar
groupings of 12 byte patterns along the length of the string.
Assuming in this instance that three such groupings are found along
the length of the string, then this information is passed to stage
22 and entered into the formula R+2<N(R-1) or "is 14 less than
33?". The answer obviously is "yes" and, accordingly, this
information is passed on to stage 28. At stage 28 the counters
associated with the compressor system are updated. That is, a
counter, which for purposes of notation is designated as CS8,
counts the number of compressions which have taken place on this
particular string of information. That is, since this is the first
compression which is to take place on this particular string of
information the counter will be set equal to 1. An additional
counter CS4 is actuated to count the number of compressions which
have been taken with R bytes, that is, with 12 bytes, and,
accordingly, since this is the first compression with 12 bytes this
counter is also set equal to 1. Additionally, the counter CS6 which
is associated with LEXICON to determine the spot where one will get
a Type 1 code from the LEXICON code array is set. In this case this
counter counts "1" and selects the first available code in the
LEXICON code array which was set in the manner described with
respect to stage 12. At this point, having selected a code from
LEXICON, CS6 is now set at "2" so that it is ready to receive at a
future time a request to select a new code from the second spot in
the LEXICON. The next step is to move to stage 30. For
consideration of what happens at stage 30, the effect of the
program instructions in CCD at stage 20 must be understood. Each
time that a redundant group was found during this stage, the
addresses of the redundant groups were stored in core memory
beginning at location TL, (which is an area of core memory). The
address of the first redundant group, namely the first and original
pattern, was stored at location TL in core memory and the address
of each additional redundant group was positioned in increments of
four bytes within the array TL. Each address stored in TL is located
at a specific displacement past the beginning location TL (i.e., a
multiple of four bytes past the beginning). The displacement where
the address of the last redundant group can be found is set in a
register IR4. For example, if there were three redundant groupings,
the register IR4 would have a value equal to "8". It must be
understood that the value "8" is in relative terms and the core
address of the beginning of the TL array must be added to the
displacement value (in IR4) in order to compute the absolute
machine address desired. Thus, if the value in IR4 was "8", the
absolute machine address is the initial address of TL in the core
plus 8. Further, it should be noted that TL has physical storage
limitations and can store addresses for a maximum of 200
repetitions of a pattern during any given pass. Of course, if the
length of the string is maintained in reasonable bounds, this
limitation should not be reached in the ordinary course of
compression, but provision is made in the program for stopping the
compression should there be an attempt to store more than 200
addresses of redundancies in the TL array.
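The displacement arithmetic described above for the TL array can be
sketched as follows. This is only an illustration in Python, not the
patented program: the function names are hypothetical, while the
four-byte slot size, the 200-address limit, and the example value "8"
are taken from the text.

```python
ADDRESS_SIZE = 4    # each stored address occupies four bytes in TL
MAX_MATCHES = 200   # TL can hold at most 200 addresses per pass


def last_displacement(match_count: int) -> int:
    """Return the IR4 value: displacement of the last address stored in TL."""
    if not 1 <= match_count <= MAX_MATCHES:
        raise ValueError("TL holds between 1 and 200 addresses per pass")
    return (match_count - 1) * ADDRESS_SIZE


def absolute_address(tl_base: int, ir4: int) -> int:
    """Absolute machine address of the TL slot at displacement ir4."""
    return tl_base + ir4
```

With three redundant groupings, `last_displacement(3)` gives 8, which
agrees with the example in the text.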
At stage 30, the address of the match or redundancy from TL(IR4)
(the last match found in the string) is loaded into a register IR2.
Note that in this context TL(IR4) means the initial core address of
the TL array plus the displacement value past the beginning of TL
which is recorded in register IR4. That is, the address of TL plus
the contents of register IR4 gives the machine location where the
desired information is to be found. Then, after IR2 is loaded with
the address from TL(IR4), the program moves to stage 32 wherein a
series of instructions named SAM are executed. This set of
instructions first substitutes, in the string of information, a one
byte code in place of the redundant pattern whose address is
recorded in register IR2. The replacement code is the code from the
LEXICON array pulled out at stage 28. This one byte code is then
placed at the address contained in register IR2 in place of the
first byte of the redundancy. Then, a determination is made as to
the number of bytes in the redundant pattern, namely R, the address
of the redundant group, namely the address in IR2, and the length
of the string before compression, namely JII. Then the remainder of
the string following the last byte of the redundant pattern is
moved to close the space between the newly added code information
and the remainder of the string to compress the string by an amount
equal to R-1 bytes. At this time, JII is changed to reflect the
compression of the string by an amount R-1 bytes. It will be
understood that the compression is R-1 as it was necessary to add
one byte of information to account for the space required to store
the replacement code. If the code had not been added, JII would
have been reduced by R. It should be noted that at times, when Type
2 compression is being utilized (to be discussed below) no code
information is placed in the space vacated by the matched grouping
and, in such cases, JII would, in fact, be reduced by R. Since IR4
as set forth above is no longer pointing to the address of the last
group of matched information in TL, (having already made the
required substitution into the string), IR4 must be reduced by 4 as
is accomplished at stage 34 so that, with its new value IR4 points
to the address of the next to last group of matched bytes found in
the scan.
This new IR4 is then checked at stage 36 to determine whether it is
equal to or greater than "0". If it is equal to or greater than
"0", then the above procedure is repeated starting at stage 30. The
string is continually compressed through stages 30, 32, 34, and 36
until finally stage 36 determines that IR4 is less than "0". This
situation occurs when all the redundant patterns located during a
single scan of the string have been substituted for. When this
occurs, the program continues from stage 36 to stage 38.
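The loop through stages 30, 32, 34, and 36 can be sketched as
follows. This is a minimal Python illustration under stated
assumptions: the TL array is modeled as a list of offsets into the
string rather than core addresses, and the function name is
hypothetical. The register names in the comments follow the text.

```python
def substitute_matches(string: bytearray, offsets: list[int],
                       r: int, code: int) -> int:
    """Replace each R-byte match by a one-byte code; return the new JII."""
    ir4 = (len(offsets) - 1) * 4             # displacement of the last match
    while ir4 >= 0:                          # stage 36: repeat while IR4 >= 0
        ir2 = offsets[ir4 // 4]              # stage 30: load match address
        string[ir2:ir2 + r] = bytes([code])  # stage 32: code in, tail closes up
        ir4 -= 4                             # stage 34: next-to-last match
    return len(string)                       # JII, reduced by R-1 per match
```

Working backward from the last match means that the offsets of the
earlier matches remain valid while the tail of the string is being
moved.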
At stage 38, a determination is made as to whether the compression
had been a Type 2 compression mentioned previously. Since this is
not a Type 2 compression, the program continues on to stage 40.
There, information relating to what has occurred in stages 28
through 36 is placed at the front of the string of information.
First, the code taken from the LEXICON array, at stage 28 of the
flowchart, is stored at the head of the string followed by the
pattern which was replaced so that the code defines the particular
12 byte pattern. Thus, in decompression, when one scans the string
and finds the particular code, the information at the head of the
string will define the meaning of the code information. Following
the addition of the code and pattern information to the head of the
string, it is obvious that JII has now been increased by an amount
equal to R plus 1. Accordingly, JII is increased by R plus 1. It
should be noted that the pattern which has been replaced is stored
in the machine at location CORD1.
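The bookkeeping at stage 40 can be illustrated with a short Python
sketch. The function name is hypothetical, and the string is modeled
as a bytearray; only the layout (code first, then the replaced
pattern) and the increase of JII by R plus 1 come from the text.

```python
def prepend_definition(string: bytearray, code: int, pattern: bytes) -> int:
    """Place the code and the pattern it stands for at the head of the string.

    Returns the new string length (JII), which grows by R + 1 bytes.
    """
    string[0:0] = bytes([code]) + pattern
    return len(string)
```

For two matches of a four-byte pattern in a 14-byte string, the
substitutions save 2(4-1) = 6 bytes and the definition costs 4+1 = 5
bytes, so the string ends at 13 bytes.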
Before continuing, it is important now to discuss an element of the
invention which has not yet been discussed and which is germane to
the Fast-Mode procedure. There is in storage a table known as the
PCORDS Table in which are maintained a maximum of 200 patterns
which are considered to be the most repetitive patterns in strings
of information processed by SANPAKC. In certain instances, where
one knows the basic contents of the strings of information being
fed to SANPAKC, one can input a table of PCORDS (permanent cords or
matched groups), which contains the repetitious patterns. Where one
is dealing with unknown alphanumeric information or information
whose content is not known (for example, information that is purely
numeric and should have been transmitted through SNUPAK prior to
entering SANPAKC), one must, in order to operate in the Fast-Mode,
set up a PCORDS Table which will be continuously changing to
optimize the Fast-Mode by selecting the PCORDS with the best
savings ratio. By savings ratio, it is meant the original JII minus
the new JII after compression divided by the old JII, or the number
of bytes saved divided by the old string length. Obviously, it is
desired to utilize those PCORDS which provide the best savings
ratio and, if the best 200 PCORDS are utilized, it may not be
necessary to go through the entire Slow-Mode of operation as was
previously described. It is expected that by utilizing the best 200
PCORDS one would be able to reach the optimum compression while
saving an enormous amount of search time. Thus, it is important to
obtain a PCORDS Table which represents the PCORDS having the best
savings ratio. Obviously, the PCORDS Table will only record those
patterns which have, in fact, effected a saving. A pattern which
does not get past stage 22 (in the flowchart) will not be recorded
in the PCORDS Table. Although the PCORDS Table can hold up to 200
different PCORDS, it may be that the best five or ten PCORDS will
give such a substantial compression that it would be unnecessary to
utilize any further PCORDS. Machine time can thus be substantially
reduced since only those five or ten PCORDS would be searched for
in the input string.
The PCORDS Table always has at least one PCORD therein and, it is
expected in normal operation that the PCORDS Table will be
initialized with six PCORDS of the following types: two PCORDS of
12 byte lengths (one containing all zeros and the other containing
all blanks); two PCORDS of eight byte lengths (as described above)
and two PCORDS of six byte lengths (as described above). By
experience, it is known that in most strings of information there
are groups of blanks and zeros which occur with regularity and,
therefore, it is highly likely that these particular PCORDS will
effect substantial savings in any string of information which might
be fed to SANPAKC. The PCORDS Table includes 201 20-byte segments
of storage space. The first 20-byte segment is control information
which gives the status of the entire table. The next 20-byte
segment is the first PCORD recorded in the PCORDS Table. This
second 20-byte segment, like all succeeding 20-byte segments which
are stored in the PCORDS Table, has its first byte signifying the
number of bytes in the pattern to be stored. The succeeding bytes
after the first byte are the stored pattern or PCORD followed by
binary zeroes up to the 16th byte. From the 17th byte through the
20th byte is recorded the savings ratio associated with that
particular PCORD. Thus, it can be seen that by scanning the first
byte in each 20-byte segment one can determine the length of the
PCORD in the 20-byte segment. By scanning the 17th through 20th
byte in each 20-byte segment one can determine the savings ratio
relating to the particular PCORD. The first 20-byte segment is the
control information which gives the status of each PCORD in the
PCORDS Table. In this first 20-byte segment, the first 12 bytes are
used to indicate the number of patterns of each length (length one
up to length 12) stored in the PCORDS Table. The first byte
contains the number of length 1 patterns, the second byte contains
the number of length 2 patterns, and so on up to the 12th byte. It
should be noted that a byte can hold a value up to 255 and thus the
number of entries in the PCORDS Table for each length can be fully
recorded in the first 12 bytes of the control segment, even if all
the patterns are the same length. The next four bytes, namely bytes
13 through 16 of the control segment contain the address of the
PCORD having the lowest savings ratio. Where the Table has not been
filled, this address would be the address of the last 20-byte
segment in the PCORDS Table, as yet unfilled. However, this
information is extremely important when the PCORDS Table is filled
as it is desirable to replace the PCORD having the lowest savings
ratio with a pattern having a better savings ratio. Since this
information is recorded in the control segment, it is possible to
replace the lowest savings ratio PCORD with the pattern found to
have a higher savings ratio. Of course, in order to compare the
lowest PCORDS savings ratio, it is necessary to know what that
savings ratio is. This information is recorded in bytes 17 through
20 of the control segment.
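The 20-byte segment layout described above can be sketched with
Python's struct module. The text does not state how the savings ratio
or the address is encoded in its four bytes; this sketch assumes a
big-endian 32-bit float and a big-endian unsigned integer purely for
illustration, and the function names are hypothetical.

```python
import struct

SEGMENT_SIZE = 20  # every segment of the PCORDS Table is 20 bytes


def pack_pcord(pattern: bytes, savings_ratio: float) -> bytes:
    """Byte 1: length; bytes 2-16: pattern, zero padded; bytes 17-20: ratio."""
    if not 1 <= len(pattern) <= 12:
        raise ValueError("pattern length must be 1 to 12 bytes")
    # '15s' pads the pattern with binary zeros up to the 16th byte.
    return struct.pack(">B15sf", len(pattern), pattern, savings_ratio)


def unpack_pcord(segment: bytes) -> tuple[bytes, float]:
    """Recover the pattern and its savings ratio from a 20-byte segment."""
    length, padded, ratio = struct.unpack(">B15sf", segment)
    return padded[:length], ratio


def pack_control(counts: list[int], lowest_addr: int,
                 lowest_ratio: float) -> bytes:
    """Bytes 1-12: pattern counts for lengths 1 to 12; bytes 13-16: address
    of the lowest-ratio PCORD; bytes 17-20: that lowest savings ratio."""
    if len(counts) != 12:
        raise ValueError("one count is needed for each length 1 to 12")
    return struct.pack(">12BIf", *counts, lowest_addr, lowest_ratio)
```

Reading only the first byte and the last four bytes of a segment is
enough to learn a PCORD's length and savings ratio, as the text notes.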
In the PCORDS Table, the PCORDS with the longest length are placed
at the top and the smallest length PCORDS at the bottom in
sequential order. When a pattern is to be substituted in place of
the lowest savings ratio PCORD already in the Table, provision is
made for shifting the PCORDS so that this sequential arrangement is
maintained at all times. The reason for this arrangement is that it
is extremely desirable to compress starting with the longest length
PCORDS and working downward to the shortest PCORDS since the
highest savings are achieved with longer length PCORDS. Although
it has been found desirable to always search starting with the
PCORD of the longest length working towards the PCORDS of a shorter
length, it is, of course, possible to reverse the procedure without
affecting the operation of the compression techniques, although a
different amount of compression will, in all probability, occur. It
is expected that, by operating from the PCORDS of the greater length
and working towards those of a shorter length, optimal
compression can be achieved.
It should be noted that, by providing the savings ratio adjacent to
each PCORD, it is possible to select, for example, the 20 PCORDS
having the best savings ratio and then to scan for these selected
PCORDS starting with the PCORDS having the greatest length. It is
desirable to scan PCORDS of a common length so that it is not
necessary to place at the beginning of the string, after
compression with a particular PCORD, the length of the PCORD but
that such information can be added after all PCORDS of the same
length have been utilized in compressing the string. It will be
understood that although SANPAKC has been set up to be its own
lexicographer, (i.e., develop its own code depending upon the
particular string of information supplied thereto) if there is a
known input such as strictly alphanumeric information requiring
only 50 different bytes, the remaining 206 bytes representable in
an eight bit code are known to be available for use as coded
information and, therefore, the PCORDS can be permanently assigned
a code number without the necessity of going through stages 12 and
14 of the SANPAKC program.
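The selection described above (take the PCORDS with the best savings
ratios, then scan for them longest first) can be sketched in Python.
The table is modeled as a list of (pattern, savings ratio) pairs,
which is an assumption made for illustration, as is the function name.

```python
def scan_order(pcords: list[tuple[bytes, float]], keep: int = 20) -> list[bytes]:
    """Pick the `keep` best PCORDS by savings ratio, ordered longest first."""
    best = sorted(pcords, key=lambda p: p[1], reverse=True)[:keep]
    # A stable sort keeps the ratio order among patterns of equal length,
    # so PCORDS of a common length are scanned together.
    best.sort(key=lambda p: len(p[0]), reverse=True)
    return [pattern for pattern, _ in best]
```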
After completing the program step at stage 40, the program
transmits the new JII value and the redundant pattern stored in the
location CORD1 to stage 42 wherein the savings ratio for this
particular pattern is determined. It will be understood that the
absolute savings ratio for this particular pattern in CORD1 is
determined by the formula [N(R-1)-(R+2)] divided by JII. For example,
if the original JII was 1000 and the number of redundant matched
segments was three and the length of CORD1 was 12, the savings
ratio would be 1.9 percent. However, if the pattern in CORD1 was
one of the PCORDS already stored in the PCORDS Table, it is
necessary to compute the savings ratio for the pattern in the
current string and to average this new savings ratio with the old
one stored in the PCORDS Table. For example, if the old savings
ratio was 3 percent and the savings ratio computed for this string
is 1 percent, the savings ratio is determined by adding the old
savings ratio to the new savings ratio and dividing by two. It
should be noted that less emphasis is being placed on past
performance of PCORDS than is placed on the performance on current
or rather recent strings. In this way the PCORDS Table can reflect
very quickly changes in the type of input information so as to
provide a better indication of the true savings ratio of the PCORDS
being utilized on the particular information being supplied. The
pattern in CORD1 with its savings ratio or new savings ratio is now
added to the PCORDS Table. When the PCORDS Table is full, the
pattern is not added to the Table if its savings ratio is less than
the smallest savings ratio already stored for a PCORD in the Table.
All this is determined by the particular state of the PCORDS Table
and the computations carried out at program stage 42. Of course, if
the pattern in CORD1 was not already present in the PCORDS Table
and had effected a savings ratio greater than that of the lowest
PCORDS savings ratio recorded in the control segment of the PCORDS
Table, then it will be added to the PCORDS Table in the correct
position.
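The computations at stage 42 can be sketched as follows. The formula
and the averaging rule come from the text; modeling the PCORDS Table
as a Python dict is an assumption, as are the function names, and the
sketch ignores the length ordering that the real table maintains. The
text leaves the case of an equal savings ratio open, so the sketch
replaces an entry only for a strictly greater ratio.

```python
def savings_ratio(n: int, r: int, jii: int) -> float:
    """[N(R-1) - (R+2)] / JII for N matches of an R-byte pattern."""
    return (n * (r - 1) - (r + 2)) / jii


def update_table(table: dict[bytes, float], pattern: bytes,
                 ratio: float, capacity: int = 200) -> None:
    """Record a pattern's savings ratio following the stage 42 rules."""
    if pattern in table:
        # Known PCORD: average the stored ratio with the new one.
        table[pattern] = (table[pattern] + ratio) / 2
    elif len(table) < capacity:
        table[pattern] = ratio
    else:
        # Full table: displace the lowest-ratio PCORD only if beaten.
        worst = min(table, key=table.get)
        if ratio > table[worst]:
            del table[worst]
            table[pattern] = ratio
```

Because each update halves the weight of everything recorded before
it, recent strings dominate the stored ratio, which matches the remark
that past performance is given less emphasis.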
After stage 42, the program execution continues to stage 44 where a
determination is made as to whether there are more than 16 repeats
of patterns with this R length. Since this is the first pattern of
an R length of 12, the answer must be "no". However, assuming that
in fact there have been 16 patterns of a length 12 when the program
entered stage 44, then the program execution would be immediately
transmitted to stage 46 (shown on Figure 1B) wherein there would
have been placed at the front of the string a one byte code
indicating in the first four bits of the byte the number 16 and in
the next four bits the length of the pattern, 12. The length of the
string would accordingly be increased by one byte. At this point
JII would have been adjusted to indicate this additional byte of
information at the front of the string.
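The composite byte can be illustrated with a short Python sketch. Two
points are assumptions and not statements of the text: the "first four
bits" are taken to be the high-order bits, and, because the text does
not say how a count of 16 is represented in four bits, the count is
stored here as the count minus one. The function names are
hypothetical.

```python
def pack_composite(count: int, length: int) -> int:
    """Build a composite byte: count in the first four bits, length in the next.

    Assumption: the count (1 to 16) is stored as count - 1 so that it fits
    in four bits; the source does not specify this encoding.
    """
    if not 1 <= count <= 16 or not 1 <= length <= 15:
        raise ValueError("count must be 1 to 16 and length 1 to 15")
    return ((count - 1) << 4) | length


def unpack_composite(byte: int) -> tuple[int, int]:
    """Return (count, length) from a composite byte built as above."""
    return (byte >> 4) + 1, byte & 0x0F
```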
After completing this operation at stage 46, the program continues
executing at stage 48 (shown on Figure 1C) wherein the question is
asked whether this is a Type 2 compression. If the answer is "yes",
the program next moves to stage 50 where a further question is
asked as to whether this is the Fast-Mode. If the answer is "no",
the program then moves to stage 90 (shown on Figure 1B) in the
Slow-Mode for Type 2 operations which will be discussed in more
detail below. If the answer at stage 50 is "yes" then the program
next moves to stage 54 for the Fast-Mode series of steps, which
will also be discussed in more detail below.
If the answer at stage 48 was "no", as is in this case, then the
program moves to stage 56 where the question is asked "are there
any codes available?". If the answer is "no", then the program
continues to stage 58 wherein the question is asked "is this
Fast-Mode?". If the answer is "no", then the program moves to stage
60 which is the start of the Type 2 compression in the Slow-Mode.
This will be discussed with respect to Type 2 compression below. If
the answer at stage 58 is "yes", then the program continues to
stage 62 wherein the question is asked "are there any Type 2
PCORDS available?". If the answer is "no", then the program moves
to stage 64 which is the start of the exit procedures of SANPAKC
which will be discussed at the end of the description of SANPAKC.
If the answer at stage 62 is "yes", then the program moves to stage
66, where the counters for the PCORDS Table are updated so that the
first Type 2 PCORD is the next PCORD to be picked up for scanning,
eliminating all of the remaining Type 1 PCORDS. Then, the program
moves back to stage 110 (in Figure 1B) to start the Fast-Mode
scanning of the Type 2 PCORDS.
If there are any codes available as determined at stage 56, then
the program moves to stage 68 wherein a counter CS4 is reset to
zero. This is done so that it can count repeats of patterns of a
particular R length during a further cycle of scans of the string
by SANPAKC. A further count accumulating in CS4 will result in the
addition of another composite byte being added at the head of the
string at a later stage. These composite bytes are added at
various stages during the compression cycle, namely whenever 16
different patterns of a particular length have effected a savings
or whenever the length of the patterns which are being scanned for
in the compressor is to be changed. The composite bytes are the
most important pieces of information which are used during the
decompression cycle to unscramble the compressed string. The
importance of these composite bytes will become clear when
discussing the alphanumeric decompressor, SANPAKD. After resetting
of the counter CS4 to zero, the program moves to stage 70 wherein a
determination is made as to whether the composite byte created at
stage 46 (in Figure 1B) is in the list of available codes in
LEXICON. If the answer to this question is "yes", then this
available code in LEXICON has become non-available and must be
deleted from LEXICON. Thus, if the answer is "yes", the program
moves to stage 72 where this code is deleted from the LEXICON
Table. Further, since the code has been eliminated, the counter of
the number of available codes, CS5, must be changed to indicate one
less available code, and the codes in LEXICON must be shifted one
byte to the left to fill in the space created by the absence of
this now non-available code. Once this has been accomplished, the
program moves to stage 74 where a further determination must be
made as to whether any codes are now available. This question must
be asked because the removal at stage 72 of the code may have
caused the LEXICON Table to be emptied of all available codes. If
the answer at stage 74 is "no", then the program returns to stage
58 as was discussed previously.
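Stages 70, 72, and 74 can be sketched as follows. The LEXICON is
modeled here as a Python bytearray of available one-byte codes, and
the function name is hypothetical; the returned length plays the role
of the counter CS5.

```python
def retire_code(lexicon: bytearray, composite: int) -> int:
    """Drop the composite byte from the available codes, if it is one.

    Returns the number of codes still available (CS5); a result of zero
    corresponds to a "no" answer at stage 74.
    """
    if composite in lexicon:        # stage 70: is the byte an available code?
        lexicon.remove(composite)   # stage 72: later codes shift one byte left
    return len(lexicon)
```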
As was discussed previously, if the answer at stage 74 is "yes",
the program continues at stage 76. Additionally, if the answer at
stage 70 had been "no", then the program would also have
continued at stage 76. At stage 76 the question is asked "is this
the Fast-Mode?" If the answer to this question is "yes", then the
program continues at stage 110 (in Figure 1B) mentioned previously.
If the answer at stage 76 is "no", then the program continues to
stage 78 wherein the question is asked "is R to be decremented?".
It will be understood that where, as in this case, stage 46 (in
Figure 1B) was entered because stage 44 (in Figure 1A) answered
"yes", the answer at stage 78 would be "no". The "no" answer at
stage 78 is the proper one because stage 44 sent control to the
routine which adds a composite byte to the head of the string
because there were 16 repeats of a cord with a particular R length,
in this case 12. This means that more patterns of this length may
yet be found in the string, and decrementing R at this point would
therefore not be desirable. If a "no" answer occurs at
stage 78, then the next step in the program is to return to stage
20 for a new cycle in the Slow-Mode. If the answer at stage 78 is
that R is to be decremented then the program continues at stage 80
and will operate in a manner to be discussed below.
If the answer at stage 44 (in Figure 1A) was "no", then the program
would continue at stage 82 wherein the question to be asked is "are
there any codes left in LEXICON which are available for
substitution?". If the answer is "no", then the program moves to
stage 84 (in Figure 1B) where the question is asked "are there any
Type 2 codes available?". The manner of this operation will be
discussed with respect to the entire Type 2 mode of operation.
If the answer at stage 82 (in Figure 1A) is "yes", then the program
moves to stage 86 where the question is asked "is this the
Fast-Mode?". If the answer to this question is "yes", then the
program moves to stage 89 (in Figure 1B) in the Fast-Mode
operations. This will be more fully discussed with respect to a
direct discussion of the operation of the Fast-Mode. If the answer
at stage 86 is "no", then the program moves to stage 88 wherein the
question is asked "is this a Type 2 compression?". If the answer to
this question is "yes", then the program would move to stage 90 (in
Figure 1B) in the Type 2 mode of operation. However, that mode of
operation will be discussed in more detail below. If the answer at
stage 88 is "no", then the program moves to stage 26 at which point
the question is asked "can more comparisons be made with patterns
of this R length?"
The operation at stage 26 has been discussed previously in detail.
Obviously, if the answer is "yes", the cycle starts again at stage
20. However, if the answer is "no", meaning that all of the
possible comparisons have been made in the string of information
for this R length, then the program moves to stage 92 at which point
the question is asked "was there a Type 1 saving?". If the answer
is that there was a Type 1 saving, then the program continues to
stage 46 and will proceed through the succeeding stages from stage
46 in the manner previously discussed. If the answer at stage 92 is
that there was no Type 1 saving, then the program continues to
stage 80 where R (the length of a pattern to be scanned) is
decremented by 1 and, therefore, R will be set equal to R-1 and RM
will accordingly be set equal to RM-1. After this decrementing
operation the program continues to stage 94 (in Figure 1B) at which
stage the question is asked "is RM equal to "0"?". If RM is not
equal to zero and the answer is therefore "no", then the program is
recycled in the Slow-Mode at stage 20 with the new value of R being
equal to one less than its previously cycled value of R. If the
answer at stage 94 is "yes", then we must be prepared to enter the
Type 2 Slow-Mode of operation and the program continues to stage 84
where the question is asked "are there any Type 2 codes
available?". As was discussed previously, Type 2 codes are
available when, in the LEXICON, there are more than 34 repetitions
of single bytes in the string of information supplied to SANPAKC
(see description of stages 12 and 14). If the answer at stage 84 is
"no", then the program continues to the exit routines of SANPAKC
which start at stage 64 as discussed previously. The operations
following stage 64 will be discussed in full
below.
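The Slow-Mode decisions made at stages 26, 92, 80, 94, and 84 can be
summarized in a small Python sketch. The stage numbers are those of
the flowchart; the function name and the boolean parameters are
hypothetical stand-ins for the questions asked at each stage.

```python
def next_stage_after_26(more_comparisons: bool, type1_saving: bool,
                        rm: int, type2_codes: bool) -> tuple[int, int]:
    """Return (next stage number, updated RM) for the Slow-Mode path."""
    if more_comparisons:    # stage 26: more comparisons at this R length?
        return 20, rm
    if type1_saving:        # stage 92: was there a Type 1 saving?
        return 46, rm
    rm -= 1                 # stage 80: R and RM are each decremented by 1
    if rm != 0:             # stage 94: is RM equal to 0?
        return 20, rm
    # stage 84: start Type 2 Slow-Mode (60) or go to the exit routines (64)
    return (60 if type2_codes else 64), rm
```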
If the answer at stage 84 is that there are Type 2 codes available,
then there is initiated the Type 2 Slow-Mode operation at stage 60.
A discussion of what Type 2 compression involves is now needed.
Where a single byte has reoccurred more than 34 times in the input
string fed to SANPAKC, there is a good chance that, after compression
of the string, the single byte will in fact still appear more
than 34 times in the first 256 bytes of the compressed string. If
this occurs, Type 2 compression will be operative and an attempt
will be made to compress the string further. Type 2 compression
operates as follows:
A scan is made of the first 256 bytes, or a lesser grouping
thereof, to determine how many occurrences there are of the
particular byte pattern and the locations of these one byte
patterns. If the scan shows fewer than 34 occurrences of the byte
pattern in the first 256 bytes of the string then Type 2
compression will not effect a saving and no Type 2 compression will
follow. If, however, there are more than 34 occurrences of the
particular byte pattern, then a bitmap is formed. The first byte of
the bitmap is the byte of the redundant pattern. The second byte of
the bitmap is a number designating the number of bytes in the
entire bitmap. After the second byte, a map is formed in which, for
each byte of the string, a bit is used to designate the presence or
absence of the redundant byte starting from the first byte of the
string and continuing through two hundred and fifty six bytes in
the string. If, in fact, the bitmap extends for the full 256 bytes
in the string, then since each byte has been substituted for a
single bit, there are 32 bytes in the bitmap. However, starting
from the end of the bitmap, if the last byte in the bitmap does not
contain more than one bit showing the occurrence of a redundant
byte, then it is wasteful to have the bitmap the full 32 bytes in
length and the last byte of the bitmap is removed and the bitmap
reduced in size accordingly. A check of the last byte or bytes of
the bitmap is made to determine whether the length of the bitmap is
optimal, and the bitmap is shortened until it has proved to be
optimal in length. The bitmap is then, in fact, a map which shows
where redundant bytes occur in the string of information. Once this
bitmap is made, it is then only necessary to remove the redundant
bytes from the string, substantially compressing the string, and
then placing the bitmap at the head of the string to act as a
pattern to indicate (a) the redundant pattern; (b) the length of
the bitmap; and (c) the locations of the redundant pattern in the
succeeding string which is, of course, the bitmap.
It is now useful to discuss the flow of information through SANPAKC
for the purposes of Type 2 compression. Starting at stage 60 (in
Figure 1B) counters and registers are set up for Type 2
compression. After this set up step at stage 60, control then flows
to stage 52 wherein the machine is next asked to obtain the next
Type 2 code in LEXICON to search for in the string. Type 2 codes
are the codes which previously were stored at the end of LEXICON.
Alternatively, the PCORDS Table can be checked to determine whether
there are any Type 2 PCORDS to be searched for. This alternative
will be discussed with respect to the Fast-Mode operation. In most
instances the length of the string to be searched for a single byte
pattern will be 256 bytes in length, even though the actual length
of the string may be greater than that. If the string itself is
less than 256 bytes in length, then the length of the string to be
searched will equal the actual length of the string. This value of
the number of bytes of the string to search is stored in the
machine location CS3. After stage 52 operations have been
completed, the control of the program continues at stage 96 where
the instructions are executed to scan the first CS3 bytes of the
string for the code which has been picked up from the LEXICON Table
as noted at stage 52. The program then continues to stage 98
wherein the question is asked "was there a savings utilizing this
particular code over the first CS3 bytes of the string?". If there
are fewer than 34 occurrences of the one byte pattern (the selected
code) then there is no saving and the answer is "no" at stage 98
and the program continues at stage 90 in the manner discussed
previously with respect to stage 90. If the answer at stage 90 is
"no", the program continues to stage 100 where again the question
is asked "was there a Type 2 saving?". Since the answer must again
be "no" at stage 100, the program then moves to stage 64 where it
is prepared to end the operation of SANPAKC.
If the answer at stage 90 had been "yes", that there were more Type
2 codes available, then the program returns to stage 52 and the
operation continues with a new Type 2 code retrieved from LEXICON
or from the PCORDS Table. The program will then continue through
stages 96 and 98 as was discussed previously. At stage 98, if the
answer had been "yes", then the program would have continued to
stage 102 wherein the computer proceeds to build the bitmap (BBM)
by scanning the first CS3 bytes of the string, determining where
in those CS3 bytes the code appears and recording that information
in the bitmap, stored temporarily in location CORD1. Thus, at stage
102 the information of concern is the length of the string being
compressed, CS3, the code of the redundant information which will
be pulled out of the string to compress the string, and the bitmap
or CORD1 which provides a road map of the places in the string
where the code is present so that, after the redundant information
is removed from the string, CORD1 provides a record of where that
information was removed from the string, so that it can be later
replaced during decompression. After stage 102 operations have
been completed, the program continues to stage 28 (in Figure 1A)
wherein the counters are updated. There is no need to select a code
from the LEXICON array as this is Type 2 compression and such codes
are unnecessary. The process of removing the Type 2 codes from the
string is then continued through stages 28, 30, 32, 34 and 36 in
the same manner as discussed with respect to Type 1 compression
except for the special case, which is not found in Type 1
compression, wherein not all of the located redundant bytes in the
first CS3 bytes of the string are removed. This arises because,
in optimizing the length of the bitmap as was discussed
earlier, some occurrences of the pattern may be left in place
even though they are in the first CS3 bytes of the string. Thus, it
may be the case in Type 2 compression that the removal of a single
byte pattern from the string effects a savings, but not all
occurrences of the redundant byte are removed from the
string, which is contrary to the case of Type 1 compressions where
all occurrences of the redundant pattern are removed from the
string.
After completing the instructions at stage 152 and having recorded
the exact position of every occurrence of the code in the string,
the program continues execution at stage 154. At stage 154,
register IR2 is decremented by 4 and this new value is loaded into
register IR4. IR4 points to a location in the TL array where the
address of the first spot in the string, where the original byte
pattern which was substituted for during the execution of SANPAKC
is to be replaced, is stored. After completing stage 36, the
program continues to stage 38 wherein the question is again asked
"is this Type 2 compression?". The answer is "yes" and the program
continues on to stage 104 (in Figure 1B) wherein the counters are
set up to assemble the LEXICON at the head of the string. The
counters are set by setting up R and RM. The meaning of R at this
stage is indicative of the number of bytes in the bitmap and RM is
again equal to R minus 1. It should be noted that, in Type 2
compression, this action can be performed by simply adding the
total number of bytes in the bitmap to the current value contained
in R and RM for purposes which will be discussed hereinafter.
After this has been done, the program then continues to stage 40
(in Figure 1A) wherein the LEX instructions are completed which
move the string to the right an amount equal to R+1 bytes, leaving
a space at the front of the string for inserting the control
information for this Type 2 compression. JII is changed by adding
to the original JII an amount equal to R+1. It should be
understood that this is necessary as R+1 bytes of control
information will be placed at the head of the string. The control
information relating to the Type 2 compression (a) one byte for the
code of the byte being compressed (removed from the string); (b)
one byte indicating the length of the bitmap; and (c) the actual
bitmap. Thus JII will have been increased by R+1. The program then
continues through stages 40, 42, 44, 86,and 88 in the same manner
as was discussed previously with respect to Type 1 compression. At
stage 88, when the answer to the question of "is this Type 2?" is
answered "yes", the program continues to stage 90 (in Figure 1B)
wherein the question is asked "are there any more Type 2 codes?".
If the answer at stage 90 is "yes", the program returns to
stage 52 for another cycle of Type 2 compression. If the answer at
stage 90 is "no", the program continues to stage 100 wherein the
question is asked "was there a Type 2 saving?". If the answer at
this stage 100 is "no", then the program continues on to stage 64
which will be discussed below. If the answer at stage 100 is "yes"
that there was a Type 2 saving, the program control is passed to
stage 46, where additional information is placed at the head of the
string, namely a composite byte which in its first four bits gives
the number of times Type 2 compression has been effected; and in
its next four bits is the number of bytes in the pattern being
compressed. In this case the length value would be "1". This will
key the machine for recognizing the occurrence of Type 2 compression
during the decompression cycle. The program continues from stage 46
to stage 48 (in Figure 16) and to stage 50 in a manner discussed
previously. If the answer at stage 50 is "no", that is, this was not
a Fast-Mode type compression, then the program continues again at
stage 90.
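The Type 2 control information described above (code byte,
bitmap-length byte, bitmap) can be sketched in Python as follows.
This is an illustrative sketch only, not the SANPAKC routine: the
function name type2_compress, the bitmap holding one bit per original
byte position read most significant bit first, and the error raised
for an oversized bitmap are all assumptions made for the example.

```python
def type2_compress(data: bytes, code: int) -> bytes:
    """Remove every occurrence of the byte `code` and prepend control info.

    Output layout, following the description: code byte, bitmap-length
    byte, bitmap, then the bytes that were kept.
    Assumed (not stated in the text): bit = 1 marks a removed byte, and
    bits are numbered from the most significant bit of each bitmap byte.
    """
    bitmap = bytearray((len(data) + 7) // 8)
    if len(bitmap) > 255:
        # The bitmap length must fit in the single length byte.
        raise ValueError("bitmap length does not fit in one byte")
    kept = bytearray()
    for pos, byte in enumerate(data):
        if byte == code:
            bitmap[pos // 8] |= 0x80 >> (pos % 8)  # mark the removed position
        else:
            kept.append(byte)
    return bytes([code, len(bitmap)]) + bytes(bitmap) + bytes(kept)
```

For the six-byte input b"A B  C" with the blank (0x20) as the removed
byte, the sketch yields the code byte, a bitmap length of 1, the
bitmap byte 0x58, and the three kept bytes. A string this short shows
the layout but gives no saving.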
If the answer at stage 50 was "yes", then the program continues at
stage 89 (in FIG. 1B) wherein the question is asked "are there any
more PCORDS for this length?". If the answer is "yes", then the
program continues again at stage 54. If the answer is "no", the
program continues to stage 108, where the question is asked
"whether there was a savings?". If the answer at stage 108 is "no",
then the program continues to stage 110 where the question is asked
"are there any more PCORDS?". If the answer is "yes", the program
returns to stage 54 where the next PCORD is retrieved for scanning
the string. If the answer is "no", at stage 110, then the system
continues at stage 64.
If SANPAKC is operating in the Fast-Mode, then the response at
stage 16 (in FIG. 1A) would transfer program control directly to
stage 112 (in FIG. 1B) wherein the counters and registers for the
Fast-Mode will be set up. At this stage, and after setting up the
counters and registers for the Fast-Mode, the program continues to
stage 54 to get the next PCORD to search for in the string (from
the PCORDS Table). If the program is in Type 2 compression, it
would be searching for a Type 2 PCORD or, if in a Type 1
compression, it would of course, be checking the next Type 1 PCORD.
This determination is made at the succeeding stage 114 wherein the
question is "whether the PCORD to be scanned for is a Type 2
PCORD?". If the answer is "yes", then the program starts again at
stage 96 and continues in the manner discussed previously with
respect to Type 2 compression. If the answer is "no" at location
114, then the program continues to stage 20 (in FIG. 1A) where the
string is searched for the Type 1 PCORD.
All of the Type 1 and Type 2 Fast and Slow-Mode branches discussed
previously have eventually terminated at stage 64. At stage 64,
steps are taken to end the compression operation on the string of
information. Only "housekeeping" functions are completed from stage
64 to the end of SANPAKC. That is, at stage 64 the PCORDS Table is
searched to find the PCORD with the lowest savings ratio. It should
be noted that if the PCORDS Table is not filled, the lowest savings
ratio is zero and the address of the PCORD with the lowest savings
ratio is the last location in the PCORDS Table. As was discussed
previously, the PCORDS Table in its first 20 byte segment maintains
the information as to the lowest savings ratio of a PCORD which is
up for review at location 64. It should be noted that, specifically
in the Slow-Mode, every PCORD in the PCORDS Table is updated as to
its savings ratio. In the Fast-Mode, this would have been
accomplished at stage 42 as discussed previously. However, if a
particular PCORD had not been looked for during the Slow-Mode, this
means that it was not in the string of information reviewed and,
accordingly, the savings ratio associated with the particular PCORD
is halved. It should be noted that the savings ratio is not
averaged as might be expected, but is halved meaning that the last
two strings of information have the greatest effect upon which
PCORD remains in the Table with the highest savings ratio. This
allows the system to rapidly change over from one type of
information input to another. For example, if one were compressing
information in English and there immediately followed information
in German, where there might be different grouping of letters, the
PCORDS Table would be very responsive to this change and after only
a few strings of information had been fed through the compressor,
the entries in the PCORDS Table would reflect this change to the
new language being supplied to SANPAKC.
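The effect of halving can be sketched in Python as follows. The
sketch is hypothetical: the table is modeled as a dictionary from
pattern to savings ratio, and the rule used for patterns that were
found (overwrite with the newly measured ratio) is an assumption, as
the text only specifies halving for patterns that were not found.

```python
def update_savings(table: dict, measured: dict) -> None:
    """Update savings ratios in place after one string has been processed.

    `table` maps pattern -> savings ratio; `measured` holds the ratios
    observed for patterns found in the string just compressed.
    """
    for pattern in table:
        if pattern in measured:
            table[pattern] = measured[pattern]  # assumed update rule
        else:
            table[pattern] /= 2.0  # pattern absent: ratio is halved


def lowest_savings(table: dict):
    """Return the pattern with the lowest savings ratio (cf. stage 64)."""
    return min(table, key=table.get)
```

A pattern absent from three successive strings keeps only one eighth
of its ratio, which is why the table adapts within a few strings.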
After operations are completed at stage 64, the program continues
to stage 116. At stage 116, four additional bytes of information
are placed at the head of the string. First, the length of the
compressed string, JII, is recorded in the first three bytes of the
four byte addition. JII, of course have been updated to include the
last value of JII plus the four bytes of information to be added at
this stage. As was stated, three bytes are used to record this
value of the new JII with the fourth byte being utilized to provide
a count of the number of different compressions which took place in
the string to follow and which in fact has been made on the string
prior to reaching stage 116. If the input information has not been
compressed, then, of course, the fourth byte of the above mentioned
four byte segment at the head of the string will be zero. After
completion of the steps at stage 116, the remaining steps outlined
in FIG. 1B are "housekeeping" machine functions which are completed
merely to provide information as to the economics of the
compression techniques completed in SANPAKC and to determine where
the control of the program is to be continued.
For example, at stage 120, after completion of the instructions at
stage 116 there is provided a set of instructions for increasing
the amount of storage within the machine which is addressable by the
program. This is necessary because of a limitation on the number of
instructions which can be addressed in one section of machine
storage. At stage 122 the determination of the value of the
variable MODE, an input command, indicates whether the string has
come through SANPAKC or whether, at the input stage 11, the string
had been directed not to be compressed such as would occur in a
retrieval mode of operation. Accordingly, those strings of
information which are not to be compressed are transferred directly
from stage 11 to stage 120 and then stage 122 without ever passing
through any portion of SANPAKC. At stage 122, if the determination
is made that the information had not been compressed, it goes
directly to the termination point of SANPAKC. If there had been a
compression, and, therefore, the answer at stage 122 is that MODE
is not equal to zero, then at stage 124 there is computed a savings
ratio of the amount of savings achieved by the compression of the
input string by SANPAKC. Thus, the savings ratio is the number of
bytes saved divided by the original number of bytes in the input
string, and accordingly, the actual savings achieved by SANPAKC can
be determined for each string of information supplied.
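The four-byte addition of stage 116 and the savings ratio of stage
124 can be sketched in Python as follows. The function names are
hypothetical, and the big-endian byte order of the three-byte length
is an assumption; the text states only that three bytes hold JII
(including the four added bytes) and one byte holds the number of
compressions.

```python
def add_header(body: bytes, n_compressions: int) -> bytes:
    """Prepend the 3-byte length and 1-byte compression count."""
    jii = len(body) + 4  # JII includes the four bytes being added
    # Byte order is an assumption; the text does not specify it.
    return jii.to_bytes(3, "big") + bytes([n_compressions]) + body


def savings_ratio(original_len: int, compressed_len: int) -> float:
    """Bytes saved divided by the original number of bytes."""
    return (original_len - compressed_len) / original_len
```

An uncompressed five-byte string therefore receives a header recording
a length of 9 and a compression count of zero.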
SANPAKD
INTRODUCTION
SANPAKD is the alphanumeric decompressor which is used for
decompressing strings of information which have been compressed by
SANPAKC. It is the purpose of SANPAKD to take such compressed
information and return it to the form of the original input
information.
DETAILED DESCRIPTION
Briefly, in the case of a string that has undergone both Type 1 and
Type 2 compressions in SANPAKC, the first three bytes of the
compressed string indicate its length in bytes. The fourth byte
specifies the number of compressions carried out on the string in
SANPAKC. These four bytes are removed and set to appropriate
registers to be used for control purposes through the decompression
process. The next two bytes in the compressed string relate to a
Type 2 compression: one gives the Type 2 byte which was deleted
from the string and must now be inserted in the proper location
thereon, the other byte gives the length of the bit-map which
follows next and will be used for finding the right locations in
the string to carry out the insertions. The insertion process is
carried out and then any other deleted Type 2 bytes are reinserted
in the compressed string in the same manner. Next, decompression
information relating to Type 1 compression is examined. As noted
earlier, for each Type 1 compression, the string has at its head a
Type 1 code byte, a byte designating in four bits the length of the
replaced pattern and designating in the other four bits the number
of patterns replaced. These two bytes are set to appropriate
registers for control purposes, and the R bytes of the replaced
pattern which follow at the head of the string are inserted in
place of every occurrence in the string of the Type 1 code just
mentioned. The process is repeated until all deleted patterns are
replaced in the string.
Various housekeeping, control and error checking functions are also
carried out. A detailed description of each step of the process,
with particular references to the drawings, is given below.
The information flowchart for SANPAKD is shown in FIG. 2. In FIG. 2
at stage 130, the instruction SANPAKD is given which will initiate
all the steps which follow as set forth in FIG. 2. The first step
in the decompression of the string such as the output of SANPAKC
discussed above, is to complete the instructions at stage 132. The
instruction OPCORDS at stage 132 is to optimize the PCORDS Table if
the input string has not already been compressed. But if the input
string is intended to be compressed, at stage 132 the PCORDS Table
in SANPAKC will be optimized. This optimization is carried out by
removing all patterns in the PCORDS Table except for a specified
number of PCORDS with the highest savings ratio values. The actual
number of PCORDS to be retained in the PCORDS Table is an option of
the user. For example, if it is only desired to scan the best five
PCORDS, then, in fact, only the top five PCORDS in terms of savings
ratio will be utilized during the SANPAKC compression with all
other PCORDS being deleted from the table. Once this optimization
instruction is completed prior to entering SANPAKC, the program
control then continues with all of the instructions in SANPAKC as
was discussed previously. If the string at stage 132 is intended
for decompression, then program execution is continued at stage 134
where all of the registers and counters for SANPAKD are set to
receive a new string of information. After setting the registers
and counters, the program continues at stage 136 wherein
instructions are given to remove the first four bytes from the head
of the input string. These four bytes were, as discussed
previously with respect to SANPAKC, comprised of three bytes which
designated the length of the string and one byte which indicated
the number of compressions which had been completed on the string
while passing through the SANPAKC. The first three bytes of these
four bytes are removed from the input string and are then stored in
counter JII (length of the string). The fourth byte is stored in
counter CS8. This last counter CS8 indicates the number of
compressions which had been completed on the input string. After
completing the instructions at stage 136, the program then moves to
stage 138 wherein a determination is made as to whether counter CS8
has a value greater than zero. If CS8 is equal to zero, then the
input string has not been compressed and there is therefore no need
for sending the string through any further stages of SANPAKD.
Accordingly, the program control is then immediately passed to
stage 140 which will be discussed at the end of the operation of
SANPAKD.
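Stages 136 and 138 can be sketched in Python as follows. The name
strip_header and the big-endian reading of the three length bytes are
assumptions; the returned values correspond to counters JII and CS8
and to the string with its first four bytes removed.

```python
def strip_header(string: bytes):
    """Split off the four header bytes; return (JII, CS8, remainder)."""
    if len(string) < 4:
        raise ValueError("string is shorter than its four-byte header")
    jii = int.from_bytes(string[:3], "big")  # byte order is an assumption
    cs8 = string[3]  # number of compressions performed by SANPAKC
    return jii, cs8, string[4:]


def needs_decompression(cs8: int) -> bool:
    """Stage 138: a zero count means the string was never compressed."""
    return cs8 > 0
```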
If CS8 is greater than zero, the string was compressed and
therefore requires decompression. Thus, control passes to stage
142, where registers BRYY and BRY, which are registers in the
computer, are loaded with information as to where the string of
input information can be found in the computer. Once this is
determined, the next step is taken at stage 144 where the first
byte of the input string is examined. The first byte of the string
thus processed will have, as was discussed with respect to SANPAKC,
a composite byte comprising first (A) four bits which indicate the
number of repetitions of a particular pattern length to follow; and
(B) the next four bits indicate the length of the first group of
repetitive patterns which have to be decompressed. For example, if
two patterns of length 12 are at the head of the string, then (A)
would be 2 and (B) would be 12. However, it should be noted that it
is most likely the length of the pattern at the head of the string
would be small, as in SANPAKC compression occurs first with the
longest patterns and works downward to the shortest patterns, with
the last patterns to be compressed being the Type 2 patterns,
one byte in length. If there was any compression of the string,
this composite byte would be at the head of the string. The first
half of the composite byte, indicating the number of repetitions,
is stored in counter CS4 and the length of the pattern is stored in
location RM, remembering that R equals RM+1.
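Splitting the composite byte can be sketched in Python as follows.
The sketch assumes that the "first four bits" are the high-order
bits, and it returns the raw four-bit length field without deciding
whether that field corresponds to R or to RM; the function name is
hypothetical.

```python
def split_composite(byte: int):
    """Return (repetition count, length field) from a composite byte."""
    count = (byte >> 4) & 0x0F    # (A) first four bits, destined for CS4
    length_field = byte & 0x0F    # (B) next four bits, the pattern length field
    return count, length_field
```

With the example from the text, a composite byte of 0x2C gives (A)
equal to 2 and (B) equal to 12.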
After the preliminary steps at stage 144, the instructions proceed
to stage 146 wherein a determination is made as to whether the
string should be decompressed for Type 2 or Type 1 information.
Thus, if RM equals "0", then the information is in order for Type 2
decompression and the instructions would proceed to stage 148. If
RM is greater than "0", then Type 1 decompression is in order and
the instructions would proceed to stage 150. At stage 150, the
input string must have, as its first byte, the code which has been
substituted for a particular pattern and, as the succeeding R bytes,
the redundant pattern for which substitution has been
effected. It will be understood that at stage 150 JII is reduced by
R plus 1 bytes and the substitution code and the redundant pattern
are stored in locations CODE and CORD1 respectively. The string is
then moved, to the left, R+1 bytes to close up the space created by
the removal of the above information from the head of the
string.
After completing the steps at stage 150, the program continues at
stage 152 where the instruction FIND is initiated. These
instructions scan the string for the single byte in CODE stored
during the operation at stage 150. This byte is the substitution
code which replaced the occurrences of the redundant pattern during
the execution of SANPAKC. The addresses of the locations in the
string where this code is found are stored in the array TL.
Register IR2 contains the number of bytes past the beginning of the
TL array where the address of the last found occurrence of CODE is
stored.
After completing the instructions at stage 152 and having recorded
the exact position of every occurrence of the code in the string,
the program continues execution at stage 154. At stage 154,
register IR2 is decremented by 4 and its new value is loaded
into register IR4. IR4 points to a location in the TL array where
the address of the first spot in the string, where the original
byte pattern which was substituted for during the execution of
SANPAKC is to be replaced, is stored. After completing this step,
the instructions continue at stage 156, where the number recorded
in register IR4 is loaded into register IR5. Register IR5 indicates
the last string address at which a code was found during the scan
defined in the operations at stage 152. IR5 is now set equal to the
value in IR5 minus the address of the beginning point of the string of
information. Thus, at this point IR5 is equal to the number of
bytes from the beginning of the string where the last code found is
actually located.
The next step is to determine at stage 158 whether the code we are
dealing with is a Type 1 or Type 2 code. If it is a Type 1 code
then the program control goes directly to stage 160. If it is a
Type 2 code, the program control goes to stage 162. This
determination is made by merely checking, as at stage 146, as to
whether RM is equal to zero or greater than zero. Considering the
case with Type 1 compression, the program continues to stage 160
wherein the string is operated on by moving the remainder of the
string one byte past the location pointed to by IR5 (the location
of the CODE in the string) to the right an amount equal to RM (R
minus 1). This leaves an opening of R bytes in the string
subsequent to the location pointed to by IR5 one byte of which is
the substitution code placed in the string by SANPAKC. Then, the
pattern in CORD1 is inserted into the string in the R-byte space
between the location pointed to by IR5 and the remainder of the
string. The insertion of the pattern in CORD1 into the string
erases the code which had replaced the pattern during the
compression cycle and the new string will be returned toward its
decompressed form. In the process, the actual length of the string
has been increased by RM bytes and this amount is added to JII.
After this stage is completed, IR4 is decremented by 4 so that it
now points to the next lower address in the TL array where the next
address in the string to be operated on is stored. This is
accomplished at program stage 162. That is, the new IR4 is equal to
IR4 minus 4 bytes which is the position of the next CODE address
stored in TL. The program then continues to stage 164 where a
determination is made whether the new IR4 is equal to or greater
than zero. If the value is equal to or greater than zero, then the
loop is executed again by returning to the instructions at stage
156 to again insert the pattern in CORD1 at particular CODE
locations. This looping will continue until IR4 is less than zero.
This means that all the codes for this particular pattern have been
replaced by the original string pattern and the program will
continue to the instructions at stage 166. At stage 166 are the
instructions relating to checking the operation of the string
decompression and ensuring correctness. Thus, at stage 166 counter
CS8 is decremented by 1 indicating that the first decompression
step has been completed and that there are now left a new CS8 minus
1 decompression steps to be completed before total decompression of
the string is achieved. Further, counter CS4 is also decremented by
1 meaning that for this particular length of pattern there are CS4
minus 1 decompression steps to be completed before a new composite
byte is located or total decompression of the string has been
achieved.
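The Type 1 loop of stages 150 through 164 can be sketched in Python
as follows. This is a simplified illustration, not the SANPAKD code:
the function name is hypothetical, the list named positions stands in
for the TL array, and the string is assumed to carry control
information for a single Type 1 compression only.

```python
def type1_expand(string: bytes, r: int) -> bytes:
    """Undo one Type 1 compression whose pattern is `r` bytes long."""
    code = string[0]                 # substitution code (location CODE)
    pattern = string[1:1 + r]        # replaced pattern (location CORD1)
    body = bytearray(string[1 + r:])  # string moved left by R+1 bytes
    # Stage 152: record where the code occurs (the role of the TL array).
    positions = [i for i, b in enumerate(body) if b == code]
    # Stages 156-164: work from the last occurrence back to the first,
    # so earlier recorded positions stay valid as the string grows.
    for pos in reversed(positions):
        body[pos:pos + 1] = pattern  # string grows by RM = R - 1 bytes
    return bytes(body)
```

Processing the occurrences from last to first mirrors the
decrementing of IR4 and avoids recomputing positions after each
insertion.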
The next step is to determine, at stage 168, if counter CS4 is
equal to or less than zero. If counter CS4 is greater than zero,
then the program returns to stage 146 and the loop will continue
until CS4 counts down to zero indicating that all patterns of this
R length have been decompressed. When this occurs, the program
continues to stage 170 where a determination is made as to whether
counter CS8 is equal to zero. If counter CS8 is not equal to zero,
then the program returns to stage 144 to remove a new composite
byte which, at this stage, should be the first byte of the string
and to continue through the decompression stages. If, in fact, all
of the decompression steps have been completed, the counter CS8
will be equal to zero and the program will continue to stage 140.
At stage 140, the now decompressed information is then set up for
use in the numeric decompression portion of SNUPAK (if numeric
decompression is required), the operation of which will be
discussed below. This treatment involves breaking down the string
of information into substrings in accordance with whether the
information is textual, floating point, or integer information. All
of this will be more fully discussed with respect to SNUPAK. After
the steps at stage 140, the string would leave SANPAKD fully
decompressed and ready for use wherever needed.
If the determination is made at stage 146 that RM is equal to zero
and, thus, we are dealing with Type 2 compressed information, the
next steps are taken at stage 148. At stage 148, the Type 2 control
information is decompressed by removing three items of information
from the head of the string. The first item of information is the 1
byte code which is to be replaced at selected locations of the
string as is designated upon decompressing the bitmap which is to
follow. The second item of information which is removed from the
string is the byte following the one byte code in the string of
information. This byte of information designates the number of
bytes in the bitmap which follow this byte in the string. It should
be noted that by the removal of this information from the head of
the string the length of the string, JII, must be reduced by an
amount equal to the length of the bitmap plus 2 bytes. The bitmap
is also decompressed at this stage and the addresses of each
location where a substitution must be made are stored seriatim
starting at location TL in four byte increments. The number of
bytes past the beginning of TL, where the last address is stored,
is stored in counter CS10. The program then moves to stage 149
where the contents of counter CS10 is loaded into register IR4. The
program then continues on to stage 156. Register IR4 is transformed
at location 156 in the same manner that was discussed previously.
Control is then transmitted to stage 158 wherein, again,
determination is made whether RM is equal to zero. Since RM is
equal to zero, the program continues to stage 162. At stage 162,
there is the utilization of a counter CS11 which is the original
number of one byte patterns removed from the string during the compression
cycle as determined by the program when disassembling the bitmap at
stage 148. This number minus 1 is loaded into register IR6 and the
new number in IR6 is then stored back in counter CS11. A new value
stored in register IR5 must be determined. IR5 contains the number
of bytes from the head of the string to the location where the code
is to be inserted into the string, and from this number must be
subtracted the value recorded in register IR6 in order to determine
the actual position in the compressed string where the coded
information will be placed. This is required as the address stored
originally in the bitmap has been changed by reason of the other
compressions which had occurred during the Type 2 compression but
have not been restored into the string yet. After completing the
steps at stage 162, a correct value in IR5 has been computed which
can be utilized at stage 160 to insert the Type 2 code into its
correct position in the string. Once this has occurred, then the
loop will continue through stage 162 in the same manner as was
discussed with respect to Type 1 compression until the bitmap has
been completely utilized to insert the Type 2 codes in their
correct position in the string. The result of the SANPAKD operation
is to produce, at the end, an absolute reproduction of the original
input string into SANPAKC. It should be noted that the original
string length JII originally recorded can be checked against the
now new length JII at stage 140 to determine whether the length of
the decompressed string corresponds to the length of the original
input string. Further, there is a check as to whether the number of
compressions equals the number of decompressions which were effected
by SANPAKD. These cross checks ensure that there is no error in
compressing and decompressing the input strings. This completes the
operation of the alphanumeric compressor and decompressor in
COPAK.
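Type 2 decompression can be sketched in Python as follows. The sketch
walks the bitmap forward in a single pass rather than reproducing the
backward insertion loop with registers IR4, IR5 and IR6, so it is an
equivalent illustration and not the SANPAKD procedure itself. The
function name and the bitmap convention (one bit per original byte
position, most significant bit first, 1 marking a removed byte) are
assumptions.

```python
def type2_expand(string: bytes) -> bytes:
    """Reinsert the removed byte at every position flagged in the bitmap."""
    code, map_len = string[0], string[1]   # code byte, bitmap length
    bitmap = string[2:2 + map_len]
    rest = string[2 + map_len:]            # bytes kept by the compressor
    out = bytearray()
    k = 0                                  # index of the next kept byte
    for pos in range(map_len * 8):
        if bitmap[pos // 8] & (0x80 >> (pos % 8)):
            out.append(code)               # removed byte goes back here
        elif k < len(rest):
            out.append(rest[k])
            k += 1
    out.extend(rest[k:])                   # bytes beyond the bitmap's reach
    return bytes(out)
```

For the control information 0x20, 0x01, 0x58 followed by the bytes
"ABC", the sketch restores the six-byte string b"A B  C".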
NUPAKC
INTRODUCTION
NUPAKC is the numeric compressor. That is, NUPAKC is designed
specifically for compressing numeric information. The machine is
normally instructed that certain strings of information are
basically in numeric form, and, such information will be
transmitted to NUPAKC for compression. In FIG. 3, there is shown a
flow diagram of the steps that take place in NUPAKC to compress the
numeric information. In FIG. 3, there is a start up procedure which
instructs the machine to proceed to stage 182, where the input
strings of numeric information are converted into integer numbers
organized in four-byte words in a manner which will be more fully
described in FIG. 4. This conversion into integers is the first
compression step in that it removes the floating point exponent and
allows the numerical information to be treated as an integer so as
to conserve storage facilities and effect more efficient
utilization in the remaining compression steps.
After conversion into integers, the program continues to a
differencing stage. At this stage, successive integer words in a
substring are differenced seriatim so as to substantially reduce
the magnitudes of all the integers following the first integer in
an optimal manner. The procedures at stage 184 are described in
FIG. 6 and will only be accomplished if, in fact, such a
differencing procedure will effect a saving and the number of
differencing cycles will be limited to that number which reaches
the optimal condensing of the input substring.
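Recursive differencing can be sketched in Python as follows. The
sketch is illustrative only: the stopping rule (continue while the sum
of the absolute values after the first integer decreases) and the
choice to leave the first k integers untouched on the k-th cycle are
assumptions, since the actual criterion is the one described with
respect to FIG. 6. The function names are hypothetical.

```python
def _cost(values):
    """Total magnitude of all integers following the first (assumed measure)."""
    return sum(abs(v) for v in values[1:])


def recursive_difference(values):
    """Difference repeatedly while it pays; return (result, cycles used)."""
    current = list(values)
    cycles = 0
    while cycles + 1 < len(current):
        keep = cycles + 1  # leading entries left untouched on this cycle
        candidate = current[:keep] + [
            current[i] - current[i - 1] for i in range(keep, len(current))
        ]
        if _cost(candidate) >= _cost(current):
            break  # no further saving, so stop differencing
        current, cycles = candidate, cycles + 1
    return current, cycles


def undo_difference(values, cycles):
    """Invert `cycles` rounds of differencing by running sums."""
    current = list(values)
    for keep in range(cycles, 0, -1):
        for i in range(keep, len(current)):
            current[i] += current[i - 1]
    return current
```

The number of cycles must be kept with the substring so that the
decompressor can apply the same number of running sums.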
After completing the procedural steps at stage 184, the program
continues to stage 186 described in FIG. 5 in which identical
sequences are removed and condensed information replaces the
sequential information and a map of the position of such
information is placed at the head of the substring so as to
indicate, for decompression purposes, where said condensed
information can be found in the substring. After completion of the
steps at program stage 186, the program continues to stage 188
where all of the substring integers are packed into eight byte
double words in an optimal fashion, i.e., the maximum number of
integers are placed in each double word so as to again condense the
information. It has been found with NUPAKC, that it is possible,
especially in dealing with highly repetitive information such as is
found in graphical data, etc., that compression up to 99.99 percent
is possible. However, more normally, compression of numerical data
is in the range of 80 to 95 percent.
As indicated by the Table of FIG. 9, the string of these double
words may then be directed to SANPAKC for further compression.
It should be noted that the longer the substring, the more likely
are repetitive sequences to occur and more efficiently are the
integers packed into double words. It has been found that when one
substring, which could be compressed to save 88 percent, is
included in a substring ten times as long it would give a savings
of 95 percent. Thus, long substrings should in fact give rise to
higher savings.
DETAILED DESCRIPTION
FIG. 4 is a structural flow diagram of the operation at stage 182
discussed with respect to FIG. 3. It should be understood that all
substrings, which might be utilized in the COPAK system, have
certain identifying words associated with them (the substring
commands). One of the first words has been called SOS. If SOS is a
number less than zero then the substring is intended to be
compressed by SANPAKC only. If SOS is equal to zero then the
substring is to be compressed by NUPAKC. It will further be
understood that when substring information is read into the
computer, this type of identifying material is controlled by the
user through the input commands because he knows what the
information type is (either numeric or alphanumeric) and,
therefore, whether it is suited to numeric compression or
alphanumeric compression. However, it should be noted that NUPAKC
is not purely limited to numeric information and, in fact,
alphanumeric information could be compressed by NUPAKC, assuming
that all the information being entered into the machine is in fact
in numeric form. However, for practical purposes, NUPAKC is
intended strictly for numeric information. In addition to the SOS
substring command, there is a second command called LSX. LSX is a
substring command which determines the type of numeric compression
which will be used. NUPAKC may use no truncation, truncation by the
bin procedure using the value of LSX, or truncation by the logical
right shift method. These operations are discussed below.
In the procedure at stage 182 (in FIG. 4) the first step is a
determination at stage 190 whether LSX is equal to, less than, or
greater than zero. At stage 190, register IR1 contains a value of
the number of bytes past location LSX in core memory of the machine
where the desired LSX value for the particular substring is stored.
If LSX is equal to zero, then the program would continue with no
truncation at stage 192. If LSX is less than zero, then the program
continues at stage 194 to begin execution of the logical right
shift method. If LSX is greater than zero, then the program
continues at stage 196 wherein truncation by the bin procedure
using the value of LSX is started. LSX is an indication of the
degree of reliability which the user desires of the information
passed through NUPAKC. Thus, if one knows that the input data is
correct to within 1 percent, then LSX would equal 0.01. If
the user states that LSX is less than zero (usually set to -1) this
means that the logical right shift method will be used. In the
logical right shift method there will only be a small variation in
the seventh significant figure in the input data upon
decompression. Thus, normally, one who wishes to use the logical
right shift method would be interested in extremely accurate data
with little or no loss of significant information during
compression.
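The three-way test on LSX at stage 190 can be sketched in Python as
follows. The function name and the returned labels are hypothetical;
only the branch conditions and stage numbers come from the text.

```python
def truncation_mode(lsx: float) -> str:
    """Select the truncation procedure from the sign of LSX."""
    if lsx == 0:
        return "none"         # stage 192: no truncation
    if lsx < 0:
        return "right_shift"  # stage 194: logical right shift method
    return "bin"              # stage 196: bin procedure using LSX
```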
Register IR1 will be used throughout this description and it will
have the following meaning. IR1 is associated with the address of
information in various arrays which are used by NUPAKC for each
particular substring. The first location in each array (such as
SOS, LSX, BWX, YM, etc.) contains the compression commands for the
first substring. The second location in each array is the
information associated with the second substring, and so on for
each substring in the string to be compressed. IR1 contains a count
of the number of bytes past the beginning location of the array
where the substring information is stored. Thus, for substring 1,
IR1 will have a value of zero, indicating that the information is
stored at the beginning of each array. For substring 2, IR1
contains the value of 4, meaning that the information is four-bytes
past the beginning of the array. For notation purposes, the symbols
such as SOS(IR1), LSX(IR1), etc., will indicate the above mentioned
meaning. That is, SOS(IR1) means to use the beginning address of
the SOS array plus the number of bytes past the beginning of the
array (the value of IR1) to address the proper location of the
substring information.
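The addressing convention can be sketched in Python as follows. The
sketch assumes four-byte entries, as implied by the IR1 values of 0
and 4, and models an array such as SOS as a block of big-endian signed
32-bit integers; that encoding and the function names are assumptions
made for illustration.

```python
import struct

WORD = 4  # bytes per array entry, implied by the IR1 values 0, 4, ...


def ir1_for(substring_number: int) -> int:
    """Byte offset into each command array for a 1-based substring number."""
    return (substring_number - 1) * WORD


def fetch(array: bytes, ir1: int) -> int:
    """Read the entry at ARRAY(IR1); the integer encoding is assumed."""
    return struct.unpack_from(">i", array, ir1)[0]
```

With an SOS array holding -1, 0 and 5, fetch(SOS, ir1_for(2)) reads
the entry for the second substring.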
When LSX equals zero, as was previously stated, the program
continues to stage 192 wherein the floating point number 0.0 is
stored in location BWX(IR1). The number 0.0 for LSX indicates that
the string is not to be truncated. This information will be added
to the head of the string (composed of all the substrings) after
completion of the passage of all the substrings through NUPAKC.
Thus, after completion of the compression, this information will be
placed at the head of the string to indicate that, no truncation
was completed on this particular substring so the proper
decompression procedure can be affected.
After storing 0.0 in BWX, the program continues to stage 198
wherein instructions are provided to have the substring searched to
find the minimum and maximum values in the substring. The minimum
value is contained in register IR4 and the maximum value of the
substring is contained in register IR5. After this is completed,
the program continues to stage 200 wherein the median value of the
substring is determined. That is, YM(IR1) (the median value for the
substring) is computed as the sum of the minimum and maximum values
stored in IR4 and IR5 divided by 2. IR5 is then reset to equal
the absolute value of the median value for the substring (i.e.,
IR5 will never be negative at this point).
After completing this stage, the program continues to stage 202
wherein a determination is made as to whether IR5 is greater than
67,108,864. This would occur where the input information was not,
in fact, numeric information or there had been some mistake in
entering the input string. The number 67,108,864 is equal to 2 to
the 27th power. If this were to occur, there obviously was an error
in the kind of input information entered and, in fact, the input
information should not have been supplied to NUPAKC. Since IR5
is indicative of the median value and would indicate that there are
some numbers above and some numbers below that value, any numbers
that exceed the 27th power are too large for the numeric compressor
to operate on and, accordingly, they should bypass the
numeric compressor. Thus, if IR5 minus 2^27 is greater than zero,
then the program continues to stage 204 wherein the number of times
the differencing procedure has been executed (in this case 0) is
stored in SOS(IR1). Then, SOS is loaded negatively so as to make
SOS less than zero. As was discussed previously, when SOS is less
than zero, the program is to use only SANPAKC for compression.
After completing this storage, the program continues to stage 206
wherein a printout is made to tell the user that the string could
not be compressed by NUPAKC.
At stage 208, immediately succeeding stage 206, the operations of
NUPAKC on the input string have been completed and program control
is transferred to the end of NUPAKC at location 210 as shown in
FIG. 3.
If the answer at stage 202 is negative, then the program
continues at stage 212. At stage 212, the number of bytes in the
substring is loaded into register IR3 from storage location CS2 and
the address of the first byte of the substring is loaded into
register BRYY from counter CS15. After this step, the program
continues at stage 214 to subtract the median value YM(IR1),
determined at stage 200, from each word in the substring taking the
substring word from storage to complete said subtraction step. Each
word in the substring is stored in four byte intervals in a storage
location addressed by register BRYY and, at stage 214, the first
word in the substring has YM subtracted therefrom. Register IR3,
loaded with the number of bytes in the substring, has four bytes
subtracted therefrom. Thus, IR3 equals IR3 minus 4. Register BRYY
is incremented by 4 to address the next word in the substring.
Next, the program continues to stage 216 where a determination is
made as to whether the new IR3 (IR3 minus 4) is still greater than
zero. If it is greater than zero, then the program returns to stage
214, processes the next word in the substring recorded at BRYY
plus 4, and further decrements IR3 by another four bytes. This
value of the new IR3 is then checked at stage 216 until such time
as the new IR3 is equal to zero. When this happens, the program
continues to the terminal stage 208.
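Stages 198 through 216 may be sketched, by way of example only, as the following Python function. The name center_on_median, the use of a list of Python integers in place of four byte words in storage, and the use of integer division for the median are assumptions of this sketch. The threshold is taken literally from the figure quoted above.

```python
# Literal constant quoted in the text (numerically 2**26).
LIMIT = 67_108_864


def center_on_median(words):
    """Subtract the mid-point of the minimum and maximum from every word.

    Returns (ym, centered_words), or None when the substring must
    bypass the numeric compressor (stages 204 and 206).
    """
    lowest, highest = min(words), max(words)   # stage 198: IR4 and IR5
    ym = (lowest + highest) // 2               # stage 200 (integer division assumed)
    if abs(ym) > LIMIT:                        # stage 202
        return None
    return ym, [w - ym for w in words]         # stages 212 to 216


result = center_on_median([100, 104, 110])
bypass = center_on_median([2 ** 28, 2 ** 28 + 2])
```

In this sketch the loop over IR3 and BRYY is replaced by a list comprehension; the byte counting of the original serves only to step through the words four bytes at a time.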
If LSX had been greater than zero, the program would have continued
at stage 196 to execute the bin procedure truncation. At this
stage, a value LSX(IR1) divided by two is computed. The
LSX(IR1)/2.0 value is stored at location BWX(IR1).
After completing the storage step, the program continues at stage
218 where the substring is scanned to find a minimum and a maximum
value of the floating point numbers in the substring. These values are
contained in registers IR4 and IR5 respectively. At stage 220, the
value of the minimum number of the substring, in IR4, is subtracted
from IR5 and divided by the number stored in BWX(IR1), namely,
(LSX/2) to obtain a value which is stored at location DUM1. Thus,
it will be seen that as LSX approaches zero, DUM1 becomes larger
and larger. If DUM1 becomes greater than 2^27, then the program
returns to stage 194 to execute the logical right shift method. If,
at stage 222, it is determined that DUM1 is less than 2^27, then the
address of the first byte in the string and the number of bytes in
the substring, recorded in counters CS15 and CS2 respectively, are
loaded into registers BRYY and IR3 so as to be ready for use. After
loading the registers with the values from CS2 and CS15, the
program continues to stage 226.
The value of the first word in the substring is replaced in its
storage location by a value computed by subtracting from the
original value at that location, the minimum number in the
substring and dividing by the value (LSX/2) stored in BWX. After
completion of this step, this word is then further operated on at
stage 228 by truncation. The truncation removes all digits to the
right of the decimal place in the word, leaving only the digits
to the left of the decimal place. This is known as truncation
without rounding as there is no significance placed on the size of
the number to the right of the decimal place and it does not affect
the integer which remains. The number or integer thus formed is now
stored in the same location from which it was taken and, next, the
program continues to stage 230 wherein register BRYY is incremented
to address the next number in the substring and IR3 is decreased by
four bytes, the amount one has moved to find the next word in
memory. IR3 is equal to four times the number of words remaining to
be processed in the substring and by decreasing IR3 by four bytes
for each word processed in memory in the substring, it will be
understood that when the entire string has been completed, IR3 will
be equal to zero.
At the next stage, a determination is made as to whether in fact
IR3 has reached zero. At this stage 232 if IR3 is still greater
than zero the program continues back to stage 226 and a new
truncation is performed by utilizing the value of the second word
in the substring and subtracting therefrom the minimum value and
dividing by (LSX/2). This new number is then truncated at stage 228
by dropping all digits to the right of the decimal point and
storing the new integer in the place for the second word in the
substring and continuing to stage 230 to look for the third word in
the string while decreasing the new IR3 by another four bytes. If,
at stage 232, IR3 is still greater than zero, the loop continues
again until IR3 reaches zero. If IR3 is less than or equal to zero,
the program continues to the terminal stage 208.
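The bin procedure of stages 196 and 218 through 232 may be illustrated by the following Python sketch. The function name bin_truncate, the representation of the substring as a list of floats, and the returned tuple are assumptions of this sketch. The case in which DUM1 is exactly equal to 2^27 is not addressed in the description and is here treated as acceptable.

```python
LIMIT = 2 ** 27


def bin_truncate(values, lsx):
    """Map each float to the integer index of its bin of width lsx / 2.

    Returns (bwx, minimum, bins), or None when the range is too wide
    and the logical right shift method should be used instead.
    """
    if lsx <= 0:
        raise ValueError("the bin procedure applies only when LSX > 0")
    bwx = lsx / 2.0                            # stage 196: BWX(IR1)
    lowest, highest = min(values), max(values)  # stage 218: IR4 and IR5
    dum1 = (highest - lowest) / bwx            # stage 220
    if dum1 > LIMIT:                           # stage 222
        return None
    # Stages 226 to 232: shift by the minimum, scale, and truncate
    # without rounding. The operand is never negative, so int() simply
    # drops the digits to the right of the decimal point.
    bins = [int((v - lowest) / bwx) for v in values]
    return bwx, lowest, bins


result = bin_truncate([1.0, 1.3, 2.0], 0.5)
too_wide = bin_truncate([0.0, 1.0e9], 1.0)
```

A smaller LSX gives narrower bins and hence less loss of precision, at the cost of larger integers to be packed.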
If LSX had been less than zero, at stage 190, the program would
continue to stage 194, where the floating point number "-1.0" would
be stored at location BWX(IR1). After completion of this storage
stage, the program continues at stage 234. At stage 234, the
program takes the first number from the substring and logically
adds the last four bits of this number to itself, which is effective
to round the number before the next step, that of shifting the
resultant sum logically to the right five bits, removing the last
five bits from the number. This is a truncation with rounding and
although one has lost the last five bits, the bits removed are the
least significant bits in the floating point number.
The program then continues at stage 236 wherein the truncated,
rounded number is returned to its storage location and BRYY, the
address in storage, is incremented by four bytes to address the next
word in the string, and IR3 is decreased by four bytes. If IR3 is
still greater than zero, at the next stage of the program 238, the
loop continues by returning the program back to stage 234 for
truncation with rounding of the next word in the string. This
continues until IR3 is equal to or less than zero. When this
occurs, the program moves on to stage 240.
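One literal reading of stages 234 through 238 is sketched below in Python. The function name shift_truncate and the treatment of each word as an unsigned 32 bit pattern are assumptions of this sketch; the actual instructions used are those of the program listing, which may differ in detail.

```python
MASK32 = 0xFFFFFFFF  # each word is treated as an unsigned 32 bit pattern


def shift_truncate(word):
    """Add the low four bits of the word to itself, then shift right five."""
    rounded = (word + (word & 0xF)) & MASK32  # stage 234: logical add
    return rounded >> 5                       # stage 234: logical right shift


# Stages 236 and 238: the loop over the substring, one word at a time.
substring = [0x1F, 0x10, 0x40]
shifted = [shift_truncate(w) for w in substring]
```

Only the five least significant bits of each word are discarded, which is why the variation on decompression is confined to the last significant figure.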
It should be noted that the resultant words in the substring are
now all integers and not floating point numbers as, by shifting to
the right five bits, each number in the substring is less than 2^27
and the number, including its exponent at the front thereof, can be
considered, for all practical purposes, as an integer. After this is
completed, the program continues at stage 240 wherein the last
word in the substring is removed from the substring and later
stored in location CHECK(IR1) for purposes of later utilizing said
word as a check on the accuracy of the compression and
decompression of the substring. The number so retrieved at stage
240 is first shifted logically left five bits before being stored in
CHECK(IR1) at stage 242. This is so the number will be in the
exact form in which it should be found after decompression of the
string at the end of NUPAKD.
After completion of the program steps at stage 242, there is stored
in memory the address of the beginning of the first word in the
string, at location CS15, and the number of bytes in the string, at
location CS2. This latter number is loaded into register IR3, which
is the number of bytes past the beginning of the substring where
the last word in the substring is stored. The address of the
beginning of the substring is loaded into register BRYY. After
completion of this step, the program continues to stage 198 and the
information is treated as though there had been no truncation, in
the manner discussed previously with respect to integer numbers,
including finding a median and subtracting the median from all
numbers in the substring. After completion of stages 198, 200, 202,
212, 214 and 216 the converted substring reaches the terminal stage
208 for Step a of NUPAKC.
Step b of NUPAKC consists of identifying sequences and counting of
significant bits so as to achieve condensation of information.
Step b of NUPAKC is best shown in FIG. 5. For purposes of
definition the following are true:
IR4 is equal to the number of consecutive equal integers;
IR1 is the number of times a particular consecutive number is
repeated in the substring;
IR3 is the number of bytes in the particular substring which is
being compressed;
BRYY is the address in storage where the particular substring is
stored.
In view of the detailed description heretofore given, it is now
possible to describe the steps of the program in accordance with
groups of steps and the functions which are achieved by the program
steps without necessity of describing each individual step within a
particular sequence of steps. The actual program listing at the end
of the written description will also aid in the understanding.
In program Step b, in FIG. 5, it is first desirable to examine the
string and search the string to find consecutive numbers which are
repeated along the length of the string. After completing the scan,
as accomplished at the stages identified collectively by the
numeral 244, (i.e., to find repeated numbers, identifying the
number of repeats in a particular sequence, and providing the
address of each of these repeated groups in the string), all of
this information is then stored in the computer.
After having completed the scan to determine the number of repeats
of particular numbers, the address of the repeats and the
particular number being repeated, the program continues to stage
246 where the determination is made as to whether the number of
bytes which can be saved, based on the results of the scan during
stage 244, is greater than the number of bytes needed for control
information for replacing the sequences of consecutive numbers. If
the answer to this question is "yes," i.e., that the number of
bytes saved is greater than zero, then the program continues to
stage 248 wherein the consecutive identical numbers in the string
have substituted therefor two four byte words (a double word) in
which the first four bytes contain a number corresponding to the
number of repeats of the particular information and the second four
byte word indicates the number which is being replaced. The address
in the substring where this sequence begins is stored in location
DUM1(IR4). After completing the operation at stage 248, the
program continues to the steps shown at stage 250 wherein the
substring is closed up to erase the sequence information now
represented in the substring by the double word described with
respect to the steps at stage 248, providing a new substring where
the consecutive identical numbers have been replaced by a double
word indicating the number of times a particular number is repeated
at a given address in the substring. This operation as described
with respect to stages 244, 246, 248 and 250 is continued to find
further consecutive repeated numbers (if they exist) in the
substring. When more than ten such sequences are found, the program
stops with respect to this particular means of compressing the
string. The number ten is merely an arbitrary number selected to
indicate that a multitude of consecutive numbers are found in the
string. It is most probable that the overall saving is not going to
be substantially increased by removing additional repeated numbers
found in the substring and, therefore, there is no need to waste
additional machine time searching the substring. The number of
sequences found in the substring is recorded in register IR6. The
actual length of the string (in bytes) is recorded in register IR7.
When the string has been completely scanned, and there is no more
saving to be effected by executing the stages 244, 246, 248 and
250, the program continues to stage 252 wherein the stored addresses
(in array DUM1) of the replaced consecutive groupings are placed at
the head of the condensed substring.
If at stage 246, it was determined that for a particular repeated
sequence found during the program steps at stage 244, that no
saving would be effected by substitution for the repeated numbers,
then the program continues to stage 254 wherein a determination is
made as to whether the program should continue back to stage 244 to
look for additional repeated numbers or whether to end this
searching procedure and continue on to stage 252. In effect, stage
254 is substantially similar to the operation at stage 250
discussed previously.
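The sequence removal of stages 244 through 254 may be sketched in Python as follows, together with the corresponding expansion, so that the reversibility of the operation can be seen. The function names, the figure of 12 bytes of control information per sequence (one double word plus one four byte address), the recording of addresses as word indices into the condensed substring, and the expansion from the last address to the first are all assumptions of this sketch.

```python
WORD_BYTES = 4
CONTROL_BYTES = 12   # assumed: one double word plus one stored address
MAX_SEQUENCES = 10   # the arbitrary limit mentioned in the description


def remove_sequences(words):
    """Replace profitable runs of equal words by (count, value) pairs."""
    condensed, addresses = [], []
    i = 0
    while i < len(words):
        j = i
        while j < len(words) and words[j] == words[i]:
            j += 1                                  # stage 244: scan the run
        run = j - i
        saved = run * WORD_BYTES - CONTROL_BYTES    # stage 246
        if saved > 0 and len(addresses) < MAX_SEQUENCES:
            addresses.append(len(condensed))        # stage 248: record address
            condensed.extend([run, words[i]])       # the double word
        else:
            condensed.extend(words[i:j])            # run left as it was
        i = j                                       # stage 250: close up
    return addresses, condensed


def restore_sequences(addresses, condensed):
    """Expand each (count, value) pair back into the original run."""
    words = list(condensed)
    # Working from the last address keeps earlier addresses valid.
    for addr in reversed(addresses):
        count, value = words[addr], words[addr + 1]
        words[addr:addr + 2] = [value] * count
    return words


original = [7, 7, 7, 7, 7, 3, 4, 4, 9, 9, 9, 9]
addresses, condensed = remove_sequences(original)
```

Under the assumed cost, a run must contain at least four equal words before its replacement saves any storage, which is why the pair of fours in the example is left untouched.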
After condensing the substring and placing the address information
at the head thereof, the program continues to a stage 256 wherein
the string is now prepared for the packing step which follows as
described in FIG. 6. It should be remembered that no number in the
string is greater than 2^27. Thus, in any four byte word, there
must, necessarily, be five bits which are not used. Thus, the five
leftmost bits in each four byte word are, of necessity, not used by
any number in the substring. For purposes of packing, it will be
necessary to determine, for each four byte word, the number of bits
required to represent the number. In this regard, the string is
prepared by moving the number in each four byte word to the left
five bits, leaving the rightmost five bits in each four byte word
empty. Then, each number is scanned to determine the number of bits
required to represent the number and this information is placed in
the rightmost five bits. It will be understood that since the
number of bits necessary to represent the number cannot be more
than 27 (since the number cannot exceed 2^27), the number of
significant bits will be no more than 27, a number which can be
designated within five bits of digital information. Accordingly,
the substring, as it reaches stage 256, will be in a form wherein
the number in the four byte word is recorded in the first 27 bits
and the next five bits will provide information relating to the
number of significant bits in those 27 bits.
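The preparation of each word at stage 256 may be sketched in Python as follows. The function names tag_word and untag_word are hypothetical, and the sketch assumes non-negative numbers below 2^27; the handling of signed values is left to the program listing.

```python
LIMIT = 2 ** 27


def tag_word(n):
    """Shift n left five bits and store its significant-bit count below it."""
    if not 0 <= n < LIMIT:
        raise ValueError("sketch handles only 0 <= n < 2**27")
    # int.bit_length() gives the number of bits needed to represent n,
    # which is at most 27 here and therefore fits in five bits.
    return (n << 5) | n.bit_length()


def untag_word(word):
    """Recover (number, significant_bits) from a tagged four byte word."""
    return word >> 5, word & 0x1F


tagged = tag_word(5)  # binary 101 needs three bits
```

The largest permitted number, 2^27 - 1, still yields a tagged word below 2^32, so the result always fits in one four byte word.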
FIG. 6 is a complete showing of the flow diagram for the program
set forth schematically in FIG. 3. FIG. 6 indicates that there are,
in fact, four steps required in NUPAKC. The steps are as
follows:
a. Truncation;
b. Differencing;
c. Sequencing;
d. Packing.
In view of the detailed description given in respect to SANPAKC, it
will be obvious to one skilled in the art after a discussion of the
function of NUPAKC as to the manner in which NUPAKC operates from
this functional description. Accordingly, although the flow diagram
is a complete step by step analysis of the operation of NUPAKC, the
description of the programming steps will be only in functional
form without reference to specific counters, storage, and
processing elements which will be accomplished in the computer by
reason of the programming steps.
The step of truncation is effectively the step described in FIG. 4
with respect to stage 182. After completion of truncation which has
been generally designated by the numeral 258, the substring
proceeds to be operated on by the program at stage 184 where the
differencing operation is completed. The differencing technique is
basically an attempt to reduce the number of significant bits in
the numbers being compressed so as to enable the packing
operation to be more efficient. Thus, the lower the number of
significant figures in a given number, the more
efficient the packing which is possible. Differencing is a means of achieving
lower numbers without losing any information in the string.
The differencing operation is effected as follows:
a. First, the substring is added in an absolute manner to determine
the absolute sum of all of the numbers in the substring, regardless
of the sign of any individual number.
b. Each number in the substring then has its next preceding number
subtracted therefrom, seriatim, in a manner whereby, for example,
if the first five numbers of the substring are the numbers (a, b,
c, d, and e), after differencing, the new substring will have the
numbers (a, (b-a), (c-b), (d-c), and (e-d)).
c. Then the new differenced substring is added in the same manner
as in step a, taking the absolute value of the resultant numbers in
the differenced substring to produce a second sum. If the second
sum is greater than the first sum (that is, the absolute sum of the
numbers achieved through the differencing operation is greater than
the absolute sum of the original numbers), then no improvement can be
achieved by differencing and, accordingly, the original substring
will be passed out of the differencing stage of the program without
any differencing operation being completed thereon. If the absolute
sum of the differenced substring in accordance with step b is less
than the absolute sum determined in step a, then there has been some
betterment by the differencing technique and a determination will
then be made as to whether the substring can be even further
improved by a second differencing step.
d. A second differencing step similar to step b is then effected on
the resultant substring of step b to achieve a new string which
will be (a, [(b-a)-a], [(c-b)-(b-a)], [(d-c)-(c-b)], [(e-d)-(d-c)]).
The sum of the numbers in this new substring is then determined and
if this absolute sum is greater than the absolute sum of the
substring in step b, then the substring in step b is continued into
the next step in the program. If the sum in step d is less than the
sum in step b, then the differencing technique is continued until, in
the last differencing step, the absolute sum of the new substring
is greater than that of the previous differencing step. The step which
produces a substring having the lowest absolute sum of the numbers
therein is the substring which will be processed through the
remainder of the program in NUPAKC. A record is kept of the number
of differencing steps achieved in the program stage 184 and this
number is recorded in the variable that maintains the status of the
substring (SOS), which will later be placed at the head of the
substring as information regarding the manner in which the
substring can be decompressed.
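Steps a through d above may be sketched in Python as follows. The function names, the cap of fifteen passes (chosen here because the count is later held in a four bit field of SOS), and the stopping rule when the two sums are equal are assumptions of this sketch. The inverse operation is included only to show that no information is lost.

```python
MAX_PASSES = 15  # assumed cap: the count must fit in four bits of SOS


def abs_sum(words):
    """Absolute sum of the substring (steps a and c)."""
    return sum(abs(w) for w in words)


def difference_once(words):
    """Step b: (a, b, c, ...) becomes (a, b-a, c-b, ...)."""
    return words[:1] + [words[i] - words[i - 1] for i in range(1, len(words))]


def difference(words):
    """Repeat differencing while the absolute sum keeps decreasing.

    Returns (number_of_passes, best_substring).
    """
    best, passes = list(words), 0
    while passes < MAX_PASSES and len(best) > 1:
        candidate = difference_once(best)
        if abs_sum(candidate) >= abs_sum(best):
            break                      # no further improvement
        best, passes = candidate, passes + 1
    return passes, best


def undifference(passes, words):
    """Reverse the differencing by running sums, once per recorded pass."""
    out = list(words)
    for _ in range(passes):
        for i in range(1, len(out)):
            out[i] += out[i - 1]
    return out


passes, reduced = difference([1, 4, 9, 16, 25])
```

For the squares in the example the absolute sum falls from 55 to 25, 9 and then 2; a fourth pass gives no further reduction, so three passes are recorded.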
After completion of the differencing technique at stage 184, the
program continues the sequencing operation described in FIG. 5 with
respect to step b at stage 186. After completing the sequencing
operation, one has a string of information in the form of numbers,
each in four byte words, with the last five bits of each four byte
word giving the number of significant bits in the preceding 27
bits. Additionally, there is information at the head of the string
giving (a) the number of sequencing operations which have been
performed on the string and (b) the addresses of the information
which has been sequenced along the string. The purpose of the
packing stage 188 is to take the string of information and compress
it into its optimal form by the use of a packing technique which
will be described as follows:
a. The information is basically placed into sequential double
words.
b. In each double word, the first eight bits set forth the count of
numbers which will follow in the next fifty-six bits of information. For
example, if a string of information includes numbers whose largest
number requires only five bits of significant information, then it
is possible to place eleven numbers in the 56 bits following the
eight bits control information at the head of the double word. That
is, after the control information indicates that there are eleven
numbers to follow, each one of the numbers in the succeeding string
will be placed in five bit groupings within the double word,
leaving one bit of unused information at the end of
the double word. It will thus be understood that considerable
compression would have been achieved, as 11 numbers would normally
have taken up 44 bytes, whereas, by this technique, it has been
possible to compress them into eight bytes. From this limited point
of view there has been a saving of 36 bytes.
c. The first double word in the substring is different from all of
the other double words in that it has, in its first eight bits, the
number of words which are packed into the last 48 bits in the first
double word. The second eight bits in the first double word
provides the number of sequences which were found in the substring
at stage 186. This leaves only 48 bits in the first double word. It
will be understood that if there have been sequencing operations on
the substring, after the information relating to the number of
sequencing operations, there is at the head of the substring, the
addresses where each one of these sequencing operations took place.
It is thus possible to determine where the addresses for the
sequencing operation begin (after the second eight bits in the
first double word) and where they end (after the number of sequence
addresses set forth in the second eight bits in the first double
word). After all the addresses have been completed, then the
remainder of the substring begins.
It should be understood that within each double word only that
group of numbers which can be fit into the 56 bits following the
control byte will be included within the 56 bits. For example, if
the largest number of a group of successive numbers requires seven
bits of significant information, then there will be eight numbers
within the 56 bits, each in seven bit segments and the control
number will be eight. In this manner, maximum packing will be
achieved for a particular substring which is being operated on by
NUPAKC. This continues until all of the substrings which are
being operated on by NUPAKC have completed their passage through
NUPAKC. When this is completed, the information relating to each
substring is placed at the head of the string sequentially and
additional information is placed at the head of the string relating
to the number of bytes in the now condensed string, and the number
of bytes in the string prior to entering COPAK which operates as a
check for the operation after decompression of this string.
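The packing of ordinary double words (a control byte followed by 56 bits) may be sketched in Python as follows. The function name pack, the greedy grouping of successive numbers, and the field width of 56 divided by the count are assumptions of this sketch; the special layout of the first double word and the handling of signed values are omitted.

```python
PAYLOAD_BITS = 56
LIMIT = 2 ** 27


def pack(words):
    """Pack non-negative integers into 64 bit double words.

    Each double word holds an eight bit count followed by that many
    equal-width fields, left-justified in the remaining 56 bits.
    """
    for n in words:
        if not 0 <= n < LIMIT:
            raise ValueError("sketch handles only 0 <= n < 2**27")
    double_words = []
    i = 0
    while i < len(words):
        count, width = 0, 1
        # Take successive numbers while the widest one still fits.
        while i + count < len(words):
            needed = max(width, words[i + count].bit_length())
            if (count + 1) * needed > PAYLOAD_BITS:
                break
            count, width = count + 1, needed
        field = PAYLOAD_BITS // count      # width a decoder can recompute
        dw = count                         # control byte
        for n in words[i:i + count]:
            dw = (dw << field) | n
        dw <<= PAYLOAD_BITS - count * field  # unused bits at the right
        double_words.append(dw)
        i += count
    return double_words


packed = pack([31] * 11)  # eleven five bit numbers, one bit unused
```

Because the field width is derived from the count alone, the control byte is the only information needed to unpack a double word.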
NUPAKD
The input information to the decompressor NUPAKD, best shown in
FIG. 8, is of the type wherein the head of the string has certain
control information which has been placed in front of the
compressed string immediately subsequent to the completion of the
numeric compression in NUPAKC. The input control information has at
the head thereof four bytes which are designated as JIR, the number
of bytes in the original segment. After JIR, the next four bytes
are designated PARM. The first three bytes of PARM are the number
of bytes in the compressed segment, with the last byte indicating
the number of substrings in the string. It should be noted that no
string contains more than twenty substrings and, therefore, this
information can be placed in one byte.
After PARM, comes the first status-of-substring information (SOS).
The four bytes of SOS information contain, in the first three
bytes, the number of bytes in the compressed substring. The next
four bits contain the format code, which indicates the original
input format type of X, A, I, E or F for the information. The
format code and its meaning with respect to the type of compression
in the string is shown in FIG. 9. The last four bits in the SOS
four byte word contain the number of differencing procedures which
were accomplished on the substring when passing through NUPAKC
stage 184. After the SOS four byte word, there comes a four byte
word indicated by the term CHECK. CHECK is the last four bytes in
the substring which should be reproduced upon decompression. Thus,
after decompression, it will be possible to compare CHECK with the
last four bytes of the decompressed substring to determine whether
there has been an accurate decompression of the substring.
After the CHECK four byte word, the next substring has its four
byte words of SOS and CHECK as indicated. If the second substring
had passed through the NUPAKC compressor utilizing truncation from
floating point to integer form, it would be necessary to add two
additional four byte words relating to said truncation. These two
four byte words of information are BWX, the explanation of which
has been discussed with respect to FIG. 4 and stage 196, and YM,
which is discussed with respect to FIG. 4 and stage 220. The BWX and YM
four byte words are only added if SOS indicates that there is an E
or F type format code, indicating that truncation by the bin
procedure or truncation by the logical right shift method was
effected on the input information. Where the format type in SOS is
neither E nor F, then there will be no words for BWX or YM. It will
be understood that as many SOS and CHECK four byte words are added
to the head of the processed string as there are substrings in the
string.
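The bit fields of JIR, PARM and SOS described above may be separated as in the following Python sketch. The function names, the big-endian byte order, and the sample values are assumptions of this sketch. The numeric values of the format codes are given only in FIG. 9 and are therefore not interpreted here.

```python
import struct


def parse_prefix(data):
    """Return (JIR, compressed byte count, number of substrings)."""
    jir, parm = struct.unpack_from(">II", data, 0)
    # PARM: first three bytes are the byte count, last byte the substring count.
    return jir, parm >> 8, parm & 0xFF


def split_sos(sos):
    """Return (compressed substring bytes, format code, differencing count)."""
    return sos >> 8, (sos >> 4) & 0xF, sos & 0xF


# Hypothetical header: JIR = 400, PARM = (300 bytes, 5 substrings),
# SOS = (100 bytes, format code 3, 2 differencing passes).
header = struct.pack(">III", 400, (300 << 8) | 5, (100 << 8) | (3 << 4) | 2)
prefix = parse_prefix(header)
sos_fields = split_sos(struct.unpack_from(">I", header, 8)[0])
```

The CHECK word, and BWX and YM where present, would follow each SOS word as further four byte words and can be read in the same manner.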
All of the above mentioned information placed at the head of the
string is removed from the substring and stored for use during the
NUPAKD procedure. The first double word which enters NUPAKD
contains in its first eight bits the number of words compressed
within the last 48 bits of the first double word. The second eight
bits of the first double word contain the number of sequencing
operations utilized in compressing the substring. In NUPAKD the
input first double word is picked up at program stage 262 of FIG.
7A wherein the first double word is loaded into registers IR2 and
IR3. IR2 contains the first four bytes and IR3 contains the
second four bytes of the first double word. At program stage 264
immediately following, the position in storage of the substring is
incremented by eight bytes to indicate that the first double word
is now being decompressed. Additionally, IR7, the register which
contains the number of bytes in the condensed substring, is
decreased by eight bytes. After this step, the program moves to
stage 266 wherein the first eight bits in the double word are
extracted from IR2 to provide the number of words in the substring.
After extracting the first eight bits, the program continues at
stage 268 to compute the number of bits in each word in the
remaining portion of the double word. Since the first double word
has 48 bits of information, if the number of words in the substring
were nine, at location 268, a determination will be made that there
are five bits in each segment of the double word which are to be
expanded into full words. The program continues next to stage 270
wherein the information in the next double word is shifted to the
left eight bits so that the next 56 bits in the double word can be
considered. If this is the first double word in the input
substring, then, at stage 272, the next eight bits (NOS) are taken
from IR2 and that number is stored in RSX. If this is the first
double word, then the remaining 48 bits are shifted another eight
bits to the left to bring the last 48 bits to the head of the
double word for operation thereon. If the number NOS is zero, then
the program would skip to program stage 274 and operate in the
manner which will be discussed below. However, if NOS is greater
than zero indicating that there are some condensed sequences, then
the LSX array is used to store the locations in the substring, of
the sequences. In each four byte segment is placed the number from
IR2. First, however, the number from IR2 is placed into register
IR5 at stage 276 at the right hand end of the register so that,
when placed into LSX in four byte segments, the number will appear
at the correct position, namely, the right hand end of the four
byte segments. All of the above shift and storage into LSX of the
information in IR2 and IR3 is accomplished at program stage 278. If,
by reason of a review of NOS, it appears that not all of the sequence
addresses have been included in the first double word, provision
is made through the use of the program stages 282 and 284 to
indicate that the second double word must be similarly
decompressed. It should be understood that with the second double
word, as it enters stage 262, and continues in a manner discussed
with respect to the first double word, that only the first eight
bits would be looked at for current information, namely the number
(NOS), and that at stage 268 the number NOS would be divided into 56,
the number of remaining bits in the second double word (and each
succeeding double word). Thus, after all of the sequencing
addresses have been stored in LSX, the program continues at stage
274 to extract, bit group by bit group, each word compressed in the
remaining 56 bits in each double word. However, in order to save
time in forming the string, it is necessary to first store each 56
words (four byte groups) in core storage position DUN1. As each 56
words are stored in DUN1, they are transferred, as a group, to a
second core storage location YY where they form the partially
decompressed string. This is all accomplished in a series of steps
herein noted as program step 286 (shown in FIG. 7B). These
transfers eliminate the need for continuously shifting all of the
partially decompressed substrings to the right as each additional
word is added to the substring. By grouping 56 words in DUN1, it is
possible to shift this entire amount in one operation to storage
location YY in the correct position at the right hand end of the
partially decompressed substring. This operation continues until
all of the double words have been expanded back into their original
form and the complete partially decompressed substring is presented
in which all packing has been removed. When this has been
completed, the program is at stage 288. At stage 288, the operation
for decompressing the condensed sequences is initiated. This
programmatic operation is done generally within the steps indicated
as stage 290 (in FIG. 7C). If there was no sequencing operation,
then, of course, the entire stage of 290 is bypassed and the
program would continue at the stage succeeding stage 290 which will
be discussed below.
If sequencing has been accomplished, then the register in the LSX
is increased by four bytes at stage 292 and the first address
in LSX is computed at stage 294 to determine where in the substring
the sequence begins. At that location in the substring, the
first word contains the number of times the sequence was repeated
and the second word contains the actual number which was repeated.
This is determined at stage 296. Thereafter, the program computes
the size of the hole which has to be made in the substring in order
to insert the repeated sequences.
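
The restoration of a condensed sequence can be illustrated with a
short Python sketch. It is not the BAL routine of the listing: the
function names, the use of a Python list for the substring, the
assumption that the count word and value word are themselves
overwritten by the restored run, and the choice to process addresses
from the highest downward are all illustrative assumptions.

```python
def expand_sequence(words, address):
    """Expand one condensed run stored at words[address].

    Assumed layout (hypothetical): words[address] holds the repeat
    count and words[address + 1] holds the repeated value. Both marker
    words are replaced by the restored run.
    """
    count = words[address]
    value = words[address + 1]
    # Size of the "hole": slots needed beyond the two marker words.
    hole = count - 2
    # Opening the hole and filling it are done in one list operation;
    # everything to the right of the markers moves over by `hole`.
    expanded = words[:address] + [value] * count + words[address + 2:]
    return expanded, hole


def expand_all(words, addresses):
    """Expand every condensed run listed in `addresses`.

    Addresses are taken to refer to positions in the condensed
    substring, so they are processed from the highest downward; that
    way earlier addresses stay valid while the substring grows.
    """
    for address in sorted(addresses, reverse=True):
        words, _ = expand_sequence(words, address)
    return words
```

For instance, under these assumptions a condensed substring
[7, 4, 9, 1, 3, 5, 2] with sequence addresses 1 and 4 expands to
[7, 9, 9, 9, 9, 1, 5, 5, 5, 2].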
This "hole" creation is accomplished at program stage 298. This
program stage also creates the hole in the substring so as to allow the
data to be inserted into the substring at the proper location. In
the next sequence of steps in the program, the numbers to be
inserted are regenerated and inserted into the substring at the
proper address, and the new length for the string is computed for
storage in counter JII. Finally, at substage 302, the number NOS is
reduced by one and, if it is greater than zero this indicates that
there are additional sequencing addresses in LSX and the operation
continues again at stage 288. If NOS is now zero, that means that
all of the sequencing operations have been completed and all the
sequencing numbers have been decompressed and the operation is
ready to continue at stage 304. At stage 304, the differencing
operation is reversed for the purposes of further expanding the
substring. This is done by continuing to stage 306, wherein
the first word is added to the
second word. The newly formed second word is added to the third
word, the newly formed third word is added to the fourth word, etc.
down the substring until the end of the substring thus reversing
the differencing operation in NUPAKC and decompressing one complete
differencing operation. If more than one differencing operation is
required, at stage 308, a determination will be made that an
additional differencing operation was accomplished on the
compressed information and, accordingly, the steps at program
stage 306 will be repeated until all of the differencing steps have
been reversed, returning the compressed information to its original
form prior to differencing. When this has been completed, NDR will
be equal to zero and the program will continue at stage 310. At
stage 310, the truncation process is reversed. If there was no
truncation, then nothing happens at stage 310. If the right shift
truncation was used during compression, the words will be shifted
to the left five bits thus reversing the right truncation. If the
bin method was used, then this too is reversed by multiplying BWX
times each of the numbers in the substring and adding to each of
the numbers in the substring the minimum YM which has been placed
in storage. BWX is, of course, also placed in storage from
information which was at the head of the substring prior to its
application to NUPAKD. After completion of stage 310, the program
continues, finally, to stage 312 wherein SOS is reconstructed in
accordance with the new string, specifically adding the new JII
and, further checking the new string against CHECK(1) and CHECK(2)
and any other information which has been stored indicating the
original information such as the original length of the string
prior to compression and decompression. The information has thus
been compressed in NUPAKC and decompressed in NUPAKD and is ready
for whatever uses are desired by the user.
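
The reversal of differencing and of truncation described above may
be sketched as follows. This is a minimal Python illustration, not
the NUPAKD code itself. The names NDR, BWX and YM are taken from the
description; the function names are hypothetical, and the sketch
assumes that each differencing pass left the first word of the
substring unchanged.

```python
from itertools import accumulate


def undo_differencing(words, ndr):
    """Reverse `ndr` differencing passes with running sums.

    Adding the first word to the second, the new second to the third,
    and so on down the substring is a cumulative sum.
    """
    for _ in range(ndr):
        words = list(accumulate(words))
    return words


def undo_shift_truncation(words, bits=5):
    """Reverse right-shift truncation by shifting each word left.

    The low-order bits dropped during compression come back as zeros.
    """
    return [w << bits for w in words]


def undo_bin_truncation(words, bwx, ym):
    """Reverse the bin method: multiply by BWX, then add minimum YM."""
    return [w * bwx + ym for w in words]
```

As a check, the substring [10, -7, 1, 1] with NDR = 2 becomes
[10, 3, 4, 5] after one pass and [10, 13, 17, 22] after the second.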
CONTROL PROGRAMS
The CONTROL routine can be viewed as a supervisory program that
serves as a buffer between the O/S system of the IBM 360 and the
SOLID System. It is coded in the "higher language" (ALLOCATE) that
has evolved from the open-ended two-part design. The CONTROL
routine performs the following functions:
a. During assembly, CONTROL positions those components of the SOLID
system that are compiled in the main system with the SUBMP
service macro.
b. During execution, CONTROL calls the components when they are
needed.
c. Special termination procedures, which are designed to protect
the AUXILIARY FILE, are executed in the CONTROL routine before the
O/S system terminates the job in the normal way.
By changing the CONTROL routine and the SUBMP service macro, a user
can easily alter the SOLID System to perform a specific task like
data compression. The SOLID System can be tailored to fit a
particular 360 configuration by altering the planned overlays. A
step-by-step description of the implementation procedure
follows:
Step 1
Code the control routine in ALLOCATE. If some components are not
going to be used the SUBMP service-macro must be altered to include
the dummy entry points for the omitted components. Also, the name
of the component must be deleted from the overlay structure. For
example, if the component SSEARCH is not being used, SUBMP will
contain the two statements:
DUMADD PMARRAYR SEARCH
and SEARCH must be deleted from the planned overlay.
Step 2
Determine the amount of core storage that is available.
Step 3
Figure the amount of storage that is needed for the components and
the CONTROL routine.
Step 4
Select values for the fourteen variable parameters, then compile
the components and store them in the load module library,
SOLID.LOAD. If the retrieval package is being used, the size of the
memory block (defined by &NTRKS, &TRKL and &LTHAYY)
should be as large as possible. A minimum of 22,000 bytes is needed
for the CONTROL routine plus the largest component. Because
frequent accesses to the load module library (SOLID.LOAD) are
costly, it is suggested that the planned overlay structure should
also be considered at this time.
Step 5
Construct the planned overlay structure so that the storage
allocated for the programs will be fully and efficiently utilized.
Separately compile the 31 components and store them in the load
module library (SOLID.LOAD). Assemble the CONTROL routine.
The overlay structures are discussed next, then the CONTROL
routines for data-compression, data-transmission and the SOLID
System are given.
A. Overlay Structure
Here the planned overlays for the IBM 360/40 (128K) and IBM 360/67
(768K) are given as an example. For details of the overlay
technique the reader is directed to the IBM 360 Link Editor
Manual.
i. IBM 360/40
The storage (in bytes) needed for each component is given in
parentheses. Double buffers were assigned for the four tape DCB's.
The single buffer for the disk DCB occupied 3600 bytes. A memory
block was defined as ten (=&NTRKS) logical records
(length=&TRKL=3600). The two principal data arrays, &ARRAY
and YY, had lengths of 1500 and 38000 bytes respectively. The
storage figures given in parentheses below are approximate. This
overlay arrangement requires about 27000 bytes of core storage,
exclusive of the storage needed for data arrays. ##SPC10##
ii. IBM 360/67
On the 768K IBM 360/67 the 30 components after OPENSHUT were on a
single branch of the overlay. Two buffers were used for each tape
DCB. About 84,000 bytes of core are needed for this arrangement.
All 31 components can also be compiled with the CONTROL routine and
positioned by the "SUBMP" type macro instruction.
B. CONTROL for the SOLID System
The program given for the CONTROL System (SOLIDO) requires both the
SOLID.MACLIB and SOLID.LOAD libraries. Normally the CONTROL routine
would be compiled separately and stored in the load module library,
SOLID.LOAD.
Before the CONTROL routine is compiled the variable parameters
associated with the RESERVE and SUBMP macro instructions must be
selected. Normally this would be done before the components are
separately compiled and stored in the load module library,
SOLID.LOAD. All variable parameters have been defined.
In the particular example described in this specification, the
CONTROL routine accomplishes the following:
Storage:
An information path is traced in the AUXILIARY FILE with a JOBLIST
item stored on a tape with the DCB named TAPEJB and a record number
is assigned to the RFILE. The bulk referenced information (stored
on the tape with DCB TAPEIND) is compressed and written on a tape
with the DCB named TAPEOTC.
Retrieval:
The JOBLIST item stored on the tape with the DCB named TAPEJB is
used to trace an information path in the AUXILIARY FILE to the bulk
storage address in RFILE. This address, which is the number of a
logical record on a tape read with the DCB named TAPEINC, is used
to retrieve the compressed referenced information. This referenced
information is decompressed and then appears on the device
designated by OUTPXT. An unsuccessful search is terminated with an
appropriate message.
The translated JOBLIST items can be arranged sequentially on the
tape with the DCB named TAPEJB. The bulk referenced information,
which is to be stored in compressed form, is located on the tape
with DCB named TAPEIND. With OUTPXT=0 the compressed referenced
information can be stored on the tape with DCB named TAPEOTC.
When this CONTROL routine is used on a production basis, the
following steps are taken:
The compressed referenced information is stored on a bulk storage
device like data-cells, disks or tapes. This might involve setting
the tape-read (REIDT) and tape-write (WRITE) macro instructions and
the first BULK address in the initialization macro MJARRAY, which
is executed in SSTATECL.
The variable parameter &LTHAYY is set equal to the size of a
memory block plus 2000 (for the M and J arrays) plus 2 1/2 times the
maximum string length (LIT = SLENGTH + LLENGTH).
For example, if strings are to be less than 2,000 bytes and a
memory block is to have 100 logical records then: &LTHAYY=100
(memory block is 100 × 7294 = 729,400 bytes); LIT = 5000
(i.e., greater than twice the maximum string length);
&LTHAYY = 729,400 + 5,000 + 2,000 = 736,400 (about 740,000). The 2,000 is the
number of bytes needed to store the first arrays associated with
the prime index M and screen J.
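
The sizing rule can be checked with a few lines of Python. The
function and argument names below are hypothetical; only the
constants come from the example above. With those constants the
three terms sum to 736,400 bytes, just under the round figure of
740,000.

```python
def lthayy_size(n_records, record_length, max_string_length, mj_bytes=2000):
    """Estimate the principal array length (&LTHAYY) in bytes.

    memory block  = n_records * record_length
    string space  = 2 1/2 times the maximum string length (LIT)
    mj_bytes      = space for the M and J arrays (2,000 in the text)
    """
    memory_block = n_records * record_length
    # 2.5 * max_string_length in integer arithmetic, rounded up.
    lit = (5 * max_string_length + 1) // 2
    return memory_block + lit + mj_bytes


# 100 logical records of 7294 bytes, strings under 2,000 bytes.
print(lthayy_size(100, 7294, 2000))
```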
The flow diagram for the CONTROL package SOLIDE is shown in FIG.
25. SOLIDE is the extended form of the CONTROL package where no
overlays are required. If overlays were required, the stage 1020
shown in FIG. 25 calling for the macro-instruction SUBME would have
substituted therefor the macro-instruction SUBMO with the numbers
100,000; 500; 500; 500; and 1500. However, for the purposes of
simplicity of the description, the CONTROL package in its extended
form will be fully discussed with only the program at the end of
the specification being utilized to show the operation with
overlays.
In the control package SOLIDE, first, the RESERVE macro instruction
is executed at stage 1022. The RESERVE macro-instruction defines
the storage areas, the registers, tapes, and initializes the SOLID
System. The principal array YY, the override arrays, and the two
arrays (JBLIST and JB1) are defined in the macro instruction SUBME
at stage 1020. The system parameter &ADDL which is the
composite address length is set, for example, at six bytes; the
principal array length parameter &LTHAYY is set at 100,000
bytes; the number of PCORDS in the PCORD TABLE that are to be used
in the fast mode of SANPAKC is set to 5 (&TPCORD); and the
length of the two JOB-LIST arrays JBLIST and JBWORK is set equal to
1500 bytes (&LJBLIST). After these system parameters are set,
control goes to the next stage, 1024, wherein the macro instruction
DEVICE is executed. The DEVICE macro sets up seven device commands
which tell the system where to find information that is to be
compressed and what to do with information after it has been
compressed. These seven device commands are as follows: INPXT
(tells what type of device information is coming in on); OUTPXT
(tells the system what device the information should be put on
after compression or decompression); RSKIPS (a tape command which
tells how far to skip out on the tape before beginning); SLENGTH
(the minimum number of bytes per segment of information, used as a
compressor command); LLENGTH (the number of bytes in the LABEL that
is not to be compressed at the beginning of each segment); RNOS
(the number of strings or segments that have to be processed by the
compressor before a new set of device commands is read); and TPCORD
(the number of PCORDS in the PCORD TABLE that are to be used by
SANPAKC in the fast mode). It should be noted that if TPCORD as set
in the DEVICE command at stage 1024 is not entered, the value of
TPCORD falls back to the value &TPCORD set at stage 1022.
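
The seven device commands and the TPCORD fallback may be pictured
with the following Python sketch. The class, its field types and the
method name are illustrative assumptions; the description specifies
only the command names and the fallback rule.

```python
from dataclasses import dataclass
from typing import Optional

# System-level default (&TPCORD); 5 is the value used in the example.
SYSTEM_TPCORD = 5


@dataclass
class DeviceCommands:
    inpxt: int      # device the information comes in on
    outpxt: int     # device the processed information goes to
    rskips: int     # how far to skip on the tape before beginning
    slength: int    # bytes per segment, as described for SLENGTH
    llength: int    # bytes of uncompressed LABEL per segment
    rnos: int       # strings to process before new device commands
    tpcord: Optional[int] = None  # PCORDS used in the fast mode

    def effective_tpcord(self, system_default=SYSTEM_TPCORD):
        """Return TPCORD, falling back to &TPCORD when not entered."""
        return system_default if self.tpcord is None else self.tpcord
```

A DEVICE command entered without TPCORD then yields the system
value, while an explicit TPCORD overrides it.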
After completion of the DEVICE macro at stage 1024, control goes to
stage 1026 where the macro-instruction STRING is executed. The
STRING macro reads the five string commands that define the status
of the strings or segments of information. These five commands are
MODE (tells whether the string or segment is to be compressed or
decompressed or to be used to update the system); POSTOP (tells the
system what to do after it has completed the current job, i.e., to
get out of the system; to read a new set of device commands; or to
read a new set of string commands); LEXCON (indicates whether the
PCORD TABLE must be read or not read); LEXMODE (indicates whether
the system is operating in the fast mode or the slow mode or simply
extending the PCORD TABLE); and LEXPACH (indicates whether the PCORD
TABLE should be punched or not punched after the current string or
segment has been compressed or decompressed).
After completion of stage 1026, this program continues to stage
1028 where the macro instruction GETJLIST is executed. The
GETJLIST macro performs all of the instructions relating to the
fetching and translation of the descriptor sets to the JOBLIST
form. There are nine instructions associated with the GETJLIST
macro. These instructions are as follows: JLINPXT (designates the
imprint input/output device); JLRSKIP (a tape command which
indicates the number of records that are to be skipped before the
first record is read); JLTRAN (designates the translator that is to
be used); JLNORM (designates whether normalization is to be
executed or not); KLENGTH (designates the number of bytes per
kernel in the JOBLIST item); NJOBS (designates the number of bulk
items to be stored for each information path); NTASKS (designates
the number of items in the JOBLIST); and the last four items
NVALUE, JVALUE, NUMDIAG and GENERATE are special instructions
associated with the Monte Carlo generator for generating random
JOBLIST items. They are read only when JLINPXT equals 16, which
indicates that the Monte Carlo generator is to be used. When random
generation of JOB-LIST items is being effected, NVALUE is the
value of M, JVALUE is the maximum value of any J, NUMDIAG is the
maximum number of diagonals or screens to be generated; and
GENERATE is the location of the random number that is to be used by
the Monte Carlo generator to generate the JOBLIST item. Monte Carlo
or Random generators are used to debug the system or to determine
the limits of the system and to determine the economics of its
operation.
Control then goes to stage 1030 wherein the macro instruction CALL2
is executed. The CALL2 macro first executes the SSEARCH component,
as discussed previously, then the component SRESULT is executed.
SRESULT prints intermediate results of the search. In a continuing
production system, it may be undesirable to even utilize the
SRESULT component and, accordingly, the CALL2 macro instruction may
have substituted therefor a macro instruction CALL1 which would
call only the SEARCH procedure. After the CALL1 or CALL2
instruction, control goes to the location ANSWER. Thereafter, at
stage 1032, the instruction DISPENSE would be executed. In DISPENSE
the determination is made as to whether control should pass through
the compressor, back to stage 1030, back to stage 1028, back to
stage 1026, or back to stage 1024. The other option is, of course,
to leave the machine because the day's operations have been
completed.
At some point, after completion of stage 1032, the program would
continue to stage 1034 wherein another CALL1 instruction would be
effected to pass control to the SREADC component which reads the
substring command, and the bulk information, if it is on cards.
Then control goes to stage 1036 which is a decision box. At stage
1036, determination is made as to whether the INPXT command is zero
or not zero. If the INPXT command is zero, then control goes to
stage 1038 wherein a CALL1 instruction is used to pass control to
SREADT component which reads the sub-strings of information from
magnetic tape. When INPXT is zero, this means that the bulk
information is on tape. If INPXT is not zero, then the control
passes directly to stage 1040 wherein RSKIPS is set to zero.
RSKIPS is normally used for the tape read-out and, since INPXT is
not zero, this means that the bulk information is not on tape and,
therefore, there is no need to have any value of RSKIPS. If the
information had been on tape, and had been read out at stage 1038,
RSKIPS would have been reset to zero so that, at a later stage, it
would be reset to a new value in the macro DISPENSE at 1032.
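
The branch at stages 1034 through 1040 can be summarized in a brief
Python sketch. The function and the two callables standing in for
the SREADC and SREADT components are hypothetical; only the test on
INPXT and the resetting of RSKIPS are taken from the description.

```python
def read_bulk(inpxt, rskips, read_cards, read_tape):
    """Return (bulk_information, rskips) after stages 1034 to 1040.

    read_cards and read_tape are stand-ins (assumptions) for the
    SREADC and SREADT components.
    """
    # Stage 1034: SREADC reads the substring command and, if the
    # bulk information is on cards, the bulk information as well.
    data = read_cards()
    # Stage 1036: INPXT equal to zero means the bulk is on tape.
    if inpxt == 0:
        # Stage 1038: SREADT reads the substrings from magnetic tape,
        # skipping RSKIPS records first.
        data = read_tape(rskips)
    # Stage 1040: RSKIPS ends at zero on either path, so DISPENSE
    # can give it a new value later.
    rskips = 0
    return data, rskips
```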
After completion of stage 1040, control passes to stage 1042 which is
the COPAK macro discussed previously. After completion of the COPAK
macro at stage 1042, control passes to stage 1044 wherein the macro
instruction CALL1 is used to call the SOUTPUT component. SOUTPUT
disposes of the information after compression or
decompression in COPAK in accordance with the OUTPXT command set at
stage 1024. After completion of the SOUTPUT command, control can
pass either back to stages 1032, 1030, 1028, 1026 or 1024 or,
alternatively, can pass out of the system. After completion of
stage 1044, the last stage of the program SUBME at stage 1020
positions all the components correctly at compilation time. The
fourteen system parameters discussed previously are defined at
stage 1020.
CONTROL PROGRAM COPAKCO
If the COPAK compressor program were to stand alone without
relation to the SOLID System, then a separate control program would
be required for COPAK. This has been defined as COPAKCO. The
control program flow diagram for COPAKCO is shown in FIG. 26.
The control program for COPAKCO is substantially similar to the
control program for SOLIDE except that unnecessary
macro-instructions relating strictly to the SOLID System have been
eliminated and new or changed macro instructions have been
substituted therefor. Macro instructions which are similar to the
macro instructions shown in FIG. 25 have been shown in FIG. 26 with
prime numerals. Basically the control program in FIG. 26 is
substantially similar and operates in substantially the same manner
as the control program of FIG. 25.
In FIG. 26, the first stage of the flow diagram for COPAKCO is
stage 1046 wherein the macro instruction RESERCO is executed. In
this instruction, only three system parameters are set, namely
&LTHAYY (the length of the principal data array); &TPCORD
(the number of PCORDS in the PCORD TABLE used in the fast mode);
and &LJBLIST (the length of the job list array). For purposes
of example, in FIG. 26 &LTHAYY has been set at 20,0000 bytes,
&TPCORD has been set at 5 and &LJBLIST has been set at
1,500 bytes. After setting these system parameters, the program
continues through stages 1024' and 1026' to stage 1048 wherein the
macro instruction DISPOSE is executed. DISPOSE performs the same
functions as were performed by DISPENSE at stage 1032 except those
procedures relating to the search operation have been omitted.
After completion of stage 1048, the program continues to stages
1034', 1036', 1038', 1040', 1042' and 1044' in the same manner as
was discussed with respect to FIG. 25. Then, at stage 1050, the
macro command SUBCE is performed which positions all those
components needed for compression or decompression at compilation
time. SUBCE is used if COPAKCO is to be used in the extended form.
If the CONTROL routine is to be executed in the overlay form, then
instruction SUBCO should be used in place of SUBCE.
CONTROL PROGRAM COPAKAN
If the alphanumeric compressor and decompressor is to be used as a
stand alone program, then a separate CONTROL program should be
used. This CONTROL program is shown in FIG. 27 and it is named
COPAKAN. Similar programmatic steps shown in FIGS. 25 and 26 have
been shown by either ' or " numerals in FIG. 27 to indicate that
there is no difference between these programmatic steps and the
steps used in FIG. 27.
In FIG. 27, the program continues as in FIG. 26 through stages
1046', 1024", 1026", 1048', 1034", 1036", 1038", 1040", to stage
1052. At stage 1052, the macro-instruction COPAJ is effected. COPAJ
is a special macro-instruction which, in effect, is COPAK without
the numeric compressor-decompressor component SNUPAK. After
completion of stage 1052, the program continues to stage 1044".
Stage 1054 includes the instruction SUBCJ which positions all of
the components necessary for COPAKAN. It should be noted that the
instruction SUBCJ is for use in the extended form. If operation is
in the overlay form, then there is substituted for the instruction
SUBCJ, the macro instruction SUBCJO. Please note that for both the
COPAKCO and COPAKAN and, additionally, for the COPAKNU instructions
to be discussed hereinafter, only three system parameters are
needed, namely, &LTHAYY, &TPCORD, and &LJBLIST.
CONTROL PROGRAM COPAKNU
If the numeric compressor and decompressor SNUPAK is operated as a
stand alone program without the alphanumeric compressor SANPAK then
a special control program for the macro SNUPAK must be used. This
is shown in FIG. 28. This is defined as COPAKNU. Similar
programmatic steps shown in FIGS. 25, 26, and 27 have been indicated
with prime numerals to indicate similar instructions. In COPAKNU
shown in FIG. 28, again the program starts at stage 1046" continues
through stage 1024'" to stage 1056. At stage 1056, the macro
instruction STRING is effected, but only the first three string
commands MODE, POSTOP and LEXCON are read as the remaining
instructions discussed with respect to FIGS. 27 and 26 relate to
alphanumeric compression and, therefore, are not necessary.
After completion of stage 1056, the program continues through
stages 1048", 1034'", 1036'", 1038'", 1040'", to stage 1058 wherein
the macro instruction COPAB is performed. COPAB is a variation of
COPAK without alphanumeric compression or decompression. After
completion of stage 1058, the program continues to stage 1044'".
Stage 1060 provides the macro-instruction SUBCB which provides all
of the components of COPAKNU at the time of compilation. Again,
SUBCB is the macro instruction in the extended form; if the system
is operating in the overlay form, then a special instruction SUBCBO
must be substituted for the instruction SUBCB.
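
For reference, the pairing of each control program with its
compressor macro and its positioning macro in the extended and
overlay forms can be collected in a small lookup table. The Python
structure and the function name are illustrative only; the macro
names are those given in the description.

```python
# Control program -> compressor macro and positioning macros.
CONTROL_PROGRAMS = {
    "SOLIDE":  {"compressor": "COPAK", "extended": "SUBME", "overlay": "SUBMO"},
    "COPAKCO": {"compressor": "COPAK", "extended": "SUBCE", "overlay": "SUBCO"},
    "COPAKAN": {"compressor": "COPAJ", "extended": "SUBCJ", "overlay": "SUBCJO"},
    "COPAKNU": {"compressor": "COPAB", "extended": "SUBCB", "overlay": "SUBCBO"},
}


def positioning_macro(program, overlay=False):
    """Return the macro that positions the components of `program`."""
    entry = CONTROL_PROGRAMS[program]
    return entry["overlay" if overlay else "extended"]


print(positioning_macro("COPAKNU", overlay=True))
```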
It will be appreciated that all of the functions shown in block
diagram in the drawings are implemented by a digital program. The
digital program listing in accordance with this invention will now
be given in sufficient detail to enable those skilled in the art to
carry it out. This routine is written in IBM BAL language and the
program can be carried out by a number of suitable digital
processing systems. As one example of a digital system on which
this program has been performed, reference is made to the IBM
Computer 360/40. The program is as follows:
* * * * *