U.S. patent application number 09/846443 was filed with the patent office on 2003-11-06 for method and apparatus for generating encryption stream ciphers.
Invention is credited to Rose, Gregory G..
Application Number | 20030206634 09/846443 |
Document ID | / |
Family ID | 25499787 |
Filed Date | 2003-11-06 |
United States Patent
Application |
20030206634 |
Kind Code |
A1 |
Rose, Gregory G. |
November 6, 2003 |
Method and apparatus for generating encryption stream ciphers
Abstract
A method and apparatus for generating encryption stream ciphers.
The recurrence relation is designed to operate over finite fields
larger than GF(2) and is maximal length. An output equation
generates the output based on a plurality of elements in the shift
register used to implement the recurrence relation. The recurrence
relation and the output equation are selected to have distinct pair
distances such that, as the shift register shifts, no particular
pair of elements of the shift register are used twice in either the
recurrence relation or the output equation.
Inventors: |
Rose, Gregory G.; (New South
Wales, AU) |
Correspondence
Address: |
Qualcomm Incorporated
Patents Department
5775 Morehouse Drive
San Diego
CA
92121-1714
US
|
Family ID: |
25499787 |
Appl. No.: |
09/846443 |
Filed: |
April 30, 2001 |
Related U.S. Patent Documents
|
|
|
|
|
|
Application
Number |
Filing Date |
Patent Number |
|
|
09846443 |
Apr 30, 2001 |
|
|
|
08957571 |
Oct 24, 1997 |
|
|
|
6252958 |
|
|
|
|
Current U.S.
Class: |
380/265 ;
380/46 |
Current CPC
Class: |
H04L 9/0668 20130101;
H04L 2209/12 20130101; H04L 2209/20 20130101; Y04S 40/20 20130101;
G09C 1/00 20130101 |
Class at
Publication: |
380/265 ;
380/46 |
International
Class: |
H04L 009/26 |
Claims
I claim:
1. A method for generating a non-linear output stream from a linear
feedback shift register (LFSR), comprising: shifting a plurality of
bits through the LFSR, wherein the LFSR is structured in accordance
with a recurrence relation; performing modular multiplications upon
the plurality of bits, wherein the modular multiplications are
implemented through pre-computed look-up tables, wherein the
pre-computed look-up tables are computed using an irreducible
polynomial; and performing a non-linear operation on a selected
portion of the shifted plurality of bits, wherein the selected
portion is selected so that the pairwise distances between elements
in the selected portion are distinct values.
2. The method of claim 1, wherein the non-linear operation is
defined as
V.sub.n=(S.sub.n+S.sub.n+5).times.(S.sub.n+2+S.sub.n+12), where the
non-linear operation is defined over GF(2.sup.8).
3. The method of claim 1, wherein the non-linear operation is a
stuttering operation.
4. The method of claim 1, further comprising the step of
initializing the LFSR before shifting the plurality of bits,
wherein initializing the LFSR comprises: adding a byte of a secret
key to an element in the LFSR; and adding a byte of a secondary key
to the LFSR for each frame of data that passes through the LFSR.
Description
CROSS REFERENCE
[0001] This application is a continuation application of U.S.
application Ser. No. 08/957,571, filed Oct. 24, 1997, entitled
"Method and Apparatus for Generating Encryption Stream Ciphers",
now allowed.
BACKGROUND OF THE INVENTION
[0002] I. Field of the Invention
[0003] The present invention relates to encryption. More
particularly, the present invention relates to a method and
apparatus for generating encryption stream ciphers.
[0004] II. Description of the Related Art
[0005] Encryption is a process whereby data is manipulated by a
random process such that the data is made unintelligible by all but
the targeted recipient. One method of encryption for digitized data
is through the use of stream ciphers. Stream ciphers work by taking
the data to be encrypted and a stream of pseudo-random bits (or
encryption bit stream) generated by an encryption algorithm and
combining them, usually with the exclusive-or (XOR) operation.
Decryption is simply the process of generating the same encryption
bit stream and removing the encryption bit stream with the
corresponding operation from the encrypted data. If the XOR
operation was performed at the encryption side, the same XOR
operation is also performed at the decryption side. For a secure
encryption, the encryption bit stream must be computationally
difficult to predict.
[0006] Many of the techniques used for generating the stream of
pseudo-random numbers are based on a linear feedback shift register
(LFSR) over the Galois finite field of order 2. This is a special
case of the Galois finite field of order 2.sup.n where n is a
positive integer. For n=1, the elements of the Galois field
comprise bit values zero and one. The register is updated by
shifting the bits over by one bit position and calculating a new
output bit. The new bit is shifted into the register. For a
Fibonacci register, the output bit is a linear function of the bits
in the register. For a Galois register, many bits are updated in
accordance with the output bit just shifted out from the register.
Mathematically, the Fibonacci and Galois register architectures are
equivalent.
[0007] The operations involved in generating the stream of
pseudo-random numbers, namely the shifting and bit extraction, are
efficient in hardware but inefficient in software or other
implementations employing a general purpose processor or
microprocessor. The inefficiency increases as the length of the
shift register exceeds the length of the registers in the processor
used to generate the stream. In addition, for n=0, only one output
bit is generated for each set of operations which, again, results
in a very inefficient use of the processor.
[0008] An exemplary application which utilizes stream ciphers is
wireless telephony. An exemplary wireless telephony communication
system is a code division multiple access (CDMA) system. The
operation of CDMA system is disclosed in U.S. Pat. No. 4,901,307,
entitled "SPREAD SPECTRUM MULTIPLE ACCESS COMMUNICATION SYSTEM
USING SATELLITE OR TERRESTRIAL REPEATERS," assigned to the assignee
of the present invention, and incorporated by reference herein. The
CDMA system is further disclosed in U.S. Pat. No. 5,103,459,
entitled SYSTEM AND METHOD FOR GENERATING SIGNAL WAVEFORMS IN A
CDMA CELLULAR TELEPHONE SYSTEM, assigned to the assignee of the
present invention, and incorporated by reference herein. Another
CDMA system includes the GLOBALSTAR communication system for world
wide communication utilizing low earth orbiting satellites. Other
wireless telephony systems include time division multiple access
(TDMA) systems and frequency division multiple access (FDMA)
systems. The CDMA systems can be designed to conform to the
"TIA/EIA/IS-95 Mobile Station-Base Station Compatibility Standard
for Dual-Mode Wideband Spread Spectrum Cellular System",
hereinafter referred to as the IS95 standard. Similarly, the TDMA
systems can be designed to conform to the TIA/EIA/IS-54 (TDMA)
standard or to the European Global System for Mobile Communication
(GSM) standard.
[0009] Encryption of digitized voice data in wireless telephony has
been hampered by the lack of computational power in the remote
station. This has led to weak encryption processes such as the
Voice Privacy Mask used in the TDMA standard or to hardware
generated stream ciphers such as the A5 cipher used in the GSM
standard. The disadvantages of hardware based stream ciphers are
the additional manufacturing cost of the hardware and the longer
time and larger cost involved in the event the encryption process
needs to be changed. Since many remote stations in wireless
telephony systems and digital telephones comprise a microprocessor
and memory, a stream cipher which is fast and uses little memory is
well suited for these applications.
SUMMARY OF THE INVENTION
[0010] The present invention is a novel and improved method and
apparatus for generating encryption stream ciphers. In accordance
with the present invention, the recurrence relation is designed to
operate over finite fields larger than GF(2). The linear feedback
shift register used to implement the recurrence relation can be
implemented using a circular buffer or a sliding window. In the
exemplary embodiment, multiplications of the elements of the finite
field are implemented using lookup tables. A crytographically
secured output can be obtained by using one or a combination of
non-linear processes applied to the state of the linear feedback
shift register. The stream ciphers can be designed to support
multi-tier keying to suit the requirements of the applications for
which the stream ciphers are used.
[0011] It is an object of the present invention to utilize a
recurrence relation and output equation having distinct pair
distances. Distinct pair distances ensure that, as the shift
register used to implement the recurrence relation shifts, no
particular pair of elements of the shift register are used twice in
either the recurrence relation or the non-linear output equation.
This property removes linearity in the output from the output
equation.
[0012] It is another object of the present invention to utilize a
recurrence relation having maximum length. An exemplary maximum
length recurrence relation of order 17 is: S.sub.n+17=141{circle
over (.times.)}S.sub.n+15{circle over (+)}S.sub.n+4{circle over
(+)}175{circle over (.times.)}S.sub.n, where the operations are
defined over GF(2.sup.8), {circle over (+)} is the exclusive-OR
operation on two bytes, and {circle over (.times.)} is a polynomial
modular multiplication. A recurrence relation of order 17 is well
suited to accommodate 128-bit key material which is required for
many applications.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] The features, objects, and advantages of the present
invention will become more apparent from the detailed description
set forth below when taken in conjunction with the drawings in
which like reference characters identify correspondingly throughout
and wherein:
[0014] FIG. 1 is a block diagram of an exemplary implementation of
a recurrence relation;
[0015] FIG. 2 is an exemplary block diagram of a stream cipher
generator utilizing a processor;
[0016] FIGS. 3A and 3B are diagrams showing the contents of a
circular buffer at time n and time n+1, respectively;
[0017] FIG. 3C is a diagram showing the content of a sliding
window;
[0018] FIG. 4 is a block diagram of an exemplary stream cipher
generator of the present invention;
[0019] FIG. 5 is a flow diagram of an exemplary secret key
initialization process of the present invention;
[0020] FIG. 6 is a flow diagram of an exemplary per frame
initialization process of the present invention;
[0021] FIG. 7 is a block diagram of an alternative exemplary stream
cipher generator of the present invention; and
[0022] FIG. 8 is a flow diagram of an alternative exemplary per
frame initialization process of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0023] Linear feedback shift register (LFSR) is based on a
recurrence relation over the Galois field, where the output
sequence is defined by the following recurrence relation:
S.sub.n+k=C.sub.k-1S.sub.n+k-1+C.sub.k-2S.sub.n+k-2+. . .
+C.sub.1S.sub.n+1+C.sub.0S.sub.n, (1)
[0024] where S.sub.n+k is the output element, C.sub.j are constant
coefficients, k is the order of the recurrence relation, and n is
an index in time. The state variables S and coefficients C are
elements of the underlying finite field. Equation (1) is sometimes
expressed with a constant term which is ignored in this
specification.
[0025] A block diagram of an exemplary implementation of the
recurrence relation in equation (1) is illustrated in FIG. 1. For a
recurrence relation of order k, register 12 comprises k elements
S.sub.n to S.sub.n+k-1. The elements are provided to Galois field
multipliers 14 which multiply the elements with the constants
C.sub.j. The resultant products from multipliers 14 are provided
Galois field adders 16 which sum the products to provide the output
element.
[0026] For n=1, the elements of GF(2) comprise a single bit (having
a value of 0 or 1) and implementation of equation (1) requires many
bit-wise operations. In this case, the implementation of the
recurrence relation using a general purpose processor is
inefficient because a processor which is designed to manipulate
byte or word sized objects is utilized to perform many operations
on single bits.
[0027] In the present invention, the linear feedback shift register
is designed to operate over finite fields larger than GF(2). In
particular, more efficient implementations can be achieved by
selecting a finite field which is more suited for a processor. In
the exemplary embodiment, the finite field selected is the Galois
field with 256 elements (GF(2.sup.8)) or other Galois fields with
2.sup.n elements, where n is the word size of the processor.
[0028] In the preferred embodiment, a Galois field with 256
elements (GF(2.sup.8)) is utilized. This results in each element
and coefficient of the recurrence relation occupying one byte of
memory. Byte manipulations can be performed efficiently by the
processor. In addition, the order k of the recurrence relation
which encodes the same amount of states is reduced by a factor of
n, or 8 for GF(2.sup.8).
[0029] In the present invention, a maximal length recurrence
relation is utilized for optimal results. Maximal length refers to
the length of the output sequence (or the number of states of the
register) before repeating. For a recurrence relation of order k,
the maximal length is N.sup.k-1, where N is the number of elements
in the underlying finite field, and N=256 in the preferred
embodiment. The state of all zeros is not allowed.
[0030] An exemplary block diagram of a stream cipher generator
utilizing a processor is shown in FIG. 2. Controller 20 connects to
processor 22 and comprises the set of instructions which directs
the operation of processor 22. Thus, controller 20 can comprise a
software program or a set of microcodes. Processor 22 is the
hardware which performs the manipulation required by the generator.
Processor 22 can be implemented as a microcontroller, a
microprocessor, or a digital signal processor designed to performed
the functions described herein. Memory element 24 connects to
processor 22 and is used to implement the linear feedback shift
register and to store pre-computed tables and instructions which
are described below. Memory element 24 can be implemented with
random-access-memory or other memory devices designed to perform
the functions described herein. The instructions and tables can be
stored in read-only memory, only the memory for the register itself
needs to be modified during the execution of the algorithm.
[0031] I. Generating Non-Linear Output Stream
[0032] The use of linear feedback shift register for stream ciphers
can be difficult to implement properly. This is because any
linearity remaining in the output stream can be exploited to derive
the state of the register at a point in time. The register can then
be driven forward or backward as desired to recover the output
stream. A number of techniques can be used to generate non-linear
stream ciphers using linear feedback shift register. In the
exemplary embodiment, these non-linear techniques comprise
stuttering (or unpredictable decimation) of the register, the use
of a non-linear function on the state of the register, the use of
multiple registers and non-linear combination of the outputs of the
registers, the use of variable feedback polynomials on one
register, and other non-linear processes. These techniques are each
described below. Some of the techniques are illustrated by the
example below. Other techniques to generate non-linear stream
ciphers can be utilized and are within the scope of the present
invention.
[0033] Stuttering is the process whereby the register is clocked at
a variable and unpredictable manner. Stuttering is simple to
implement and provides good results. With stuttering, the output
associated with some states of the register are not provided at the
stream cipher, thus making is more difficult to reconstruct the
state of the register from the stream cipher.
[0034] Using a non-linear function on the state of the shift
register can also provide good results. For a recurrence relation,
the output element is generated from a linear function of the state
of the register and the coefficients, as defined by equation (1).
To provide non-linearity, the output element can be generated from
a non-linear function of the state of the register. In particular,
non-linear functions which operate on byte or word sized data on
general purpose processors can be utilized.
[0035] Using multiple shift registers and combining the outputs
from the registers in a non-linear fashion can provide good
results. Multiple shift registers can be easily implemented in
hardware where additional cost is minimal and operating the shift
registers in parallel to maintain the same operating speed is
possible. For implementations on a general purpose processor, a
single larger shift register which implements the multiple shift
registers can be utilized since the larger shift register can be
updated in a constant time (without reducing the overall
speed).
[0036] Using a variable feedback polynomial which changes in an
unpredictable manner on one register can also provide good results.
Different polynomials can be interchanged in a random order or the
polynomial can be altered in a random manner. The implementation of
this technique can be simple if properly designed.
[0037] II. Operations on Elements of Larger Order Finite Fields
[0038] The Galois field GF(2.sup.8) comprises 256 elements. The
elements of Galois field GF(2.sup.8) can be represented in one of
several different ways. A common and standard representation is to
form the field from the coefficients modulo 2 of all polynomials
with degree less than 8. That is, the element .alpha. of the field
can be represented by a byte with bits (a.sub.7, a.sub.6, . . . ,
a.sub.0) which represent the polynomial:
a.sub.7x.sup.3+a.sub.6x.sup.6+. . . .div.a.sub.1x+a.sub.0. (2)
[0039] The bits are also referred to as the coefficients of the
polynomial. The addition operation on two polynomials represented
by equation (2) can be performed by addition modulo two for each of
the corresponding coefficients (a.sub.7, a.sub.6, . . . , a.sub.0).
Stated differently, the addition operation on two bytes can be
achieved by performing the exclusive-OR on the two bytes. The
additive identity is the polynomial with all zero coefficients (0,
0, . . . , 0).
[0040] Multiplication in the field can be performed by normal
polynomial multiplication with modulo two coefficients. However,
multiplication of two polynomials of order n produces a resultant
polynomial of order (2n-1) which needs to be reduced to a
polynomial of order n. In the exemplary embodiment, the reduction
is achieved by dividing the resultant polynomial by an irreducible
polynomial, discarding the quotient, and retaining the remainder as
the reduced polynomial. The selection of the irreducible polynomial
alters the mapping of the elements of the group into encoded bytes
in memory, but does not otherwise affect the actual group
operation. In the exemplary embodiment, the irreducible polynomial
of degree 8 is selected to be:
x.sup.8+x.sup.6+x.sup.3+x.sup.2+1. (3)
[0041] Other irreducible monic polynomial of degree 8 can also be
used and are within the scope of the present invention. The
multiplicative identity element is (a.sub.7, a.sub.6, . . . ,
a.sub.0)=(0, 0, . . . , 1).
[0042] Polynomial multiplication and the subsequent reduction are
complicated operations on a general purpose processor. However, for
Galois fields having a moderate number of elements, these
operations can be performed by lookup tables and more simple
operations. In the exemplary embodiment, a multiplication (of
non-zero elements) in the field can be performed by taking the
logarithm of each of the two operands, adding the logarithmic
values modulo 255, and exponentiating the combined logarithmic
value. The reduction can be incorporated within the lookup
tables.
[0043] The exponential and logarithm tables can be generated as
follows. First, a generator g of the multiplicative subgroup
GF(2.sup.8) is determined. In this case, the byte value g=2
(representing the polynomial x) is a generator. The exponential
table, shown in Table 1, is a 256-byte table of the values g.sup.i,
for i=0, 1, . . . 2.sup.8-1. For g.sup.i (considered as an integer)
of less than 256, the value of the exponential is as expected as
evidenced by the first eight entries in the first row of Table 1.
Since g=2, each entry in the table is twice the value of the entry
to the immediate left (taking into account the fact that Table 1
wraps to the next row). However, for each g.sup.i greater than 255,
the exponential is reduced by the irreducible polynomial shown in
equation (3). For example, the exponential x.sup.8 (first row,
ninth column) is reduced by the irreducible polynomial
x.sup.8+x.sup.6+x.sup.3+x.sup.2+1 to produce the remainder
-x.sup.6-x.sup.3-x.sup.2-1. This remainder is equivalent to
x.sup.6+x.sup.3+x.sup.2+1 for modulo two operations and is
represented as 77 (2.sup.6+2.sup.3+2.sup.2+1) in Table 1. The
process is repeated until g.sup.i for all index i=0 to 255 are
computed.
[0044] Having defined the exponential table, the logarithm table
can be computed as the inverse of the exponential table. In Table
1, there is a unique one to one mapping of the exponential value
g.sup.i for each index i which results from using an irreducible
polynomial. For Table 1, the mapping is i2.sup.i, or the value
stored in the i-th location is 2.sup.i. Taking log.sub.2 of both
sides results in the following: log.sub.2(i)i. These two mappings
indicate that if the content of the i-th location in the
exponential table is used as the index of the logarithm table, the
log of this index is the index of the exponential table. For
example, for i=254, the exponential value 2.sup.i=2.sup.254=166 as
shown in the last row, fifth column in Table 1. Taking log.sub.2 of
both sides yields 254=log.sub.2(166). Thus, the entry for the index
i=166 in the logarithmic table is set to 254. The process is
repeated until all entries in the logarithmic table have been
mapped. The log of 0 is an undefined number. In the exemplary
embodiment, a zero is used as a place holder.
[0045] Having defined the exponential and logarithmic tables, a
multiplication (of non-zero elements) in the field can be performed
by looking up the logarithmic of each of the two operands in the
logarithmic table, adding the logarithmic values using modulo 255
arithmetic, and exponentiating the combined logarithmic value by
looking up the exponential table. Thus, the multiplication
operation in the field can be performed with three lookup
operations and a truncated addition. In the exemplary Galois field
GF(2.sup.8), each table is 255 bytes long and can be pre-computed
and stored in memory. In the exemplary embodiment, the logarithm
table has an unused entry in position 0 to avoid the need to
subtract 1 from the indexes. Note that when either operand is a
zero, the corresponding entry in the logarithmic table does not
represent a real value. To provide the correct result, each operand
needs to be tested to see if it is zero, in which case the result
is 0, before performing the multiplication operation as
described.
[0046] For the generation of the output element from a linear
feedback shift register using a recurrence relation, the situation
is simpler since the coefficients C.sub.j are constant as shown in
equation (1). For efficient implementation, these coefficients are
selected to be 0 or 1 whenever possible. Where C.sub.j have values
other than 0 or 1, a table can be pre-computed for the
multiplication t.sub.i=C.sub.j.multidot.i, where i=0, 1, 2, . . . ,
2.sup.8-1. In this case, the multiplication operation can be
performed with a single table lookup and no tests. Such a table is
fixed and can be stored in read-only memory.
1TABLE 1 Exponential Table i xx0 xx1 xx2 xx3 xx4 xx5 xx6 xx7 xx8
xx9 00x 1 2 4 8 16 32 64 128 77 154 01x 121 242 169 31 62 124 248
189 55 110 02x 220 245 167 3 6 12 24 48 96 192 03x 205 215 227 139
91 182 33 66 132 69 04x 138 89 178 41 82 164 5 10 20 40 05x 80 160
13 26 52 104 208 237 151 99 06x 198 193 207 211 235 155 123 246 161
15 07x 30 60 120 240 173 23 46 92 184 61 08x 122 244 165 7 14 28 56
112 224 141 09x 87 174 17 34 68 136 93 186 57 114 10x 228 133 71
142 81 162 9 18 36 72 11x 144 109 218 249 191 51 102 204 213 231
12x 131 75 150 97 194 201 223 243 171 27 13x 54 108 216 253 183 35
70 140 85 170 14x 25 50 100 200 221 247 163 11 22 44 15x 88 176 45
90 180 37 74 148 101 202 16x 217 255 179 43 86 172 21 42 84 168 17x
29 58 116 232 157 119 238 145 111 222 18x 241 175 19 38 76 152 125
250 185 63 19x 126 252 181 39 78 156 117 234 153 127 20x 254 177 47
94 188 53 106 212 229 135 21x 67 134 65 130 73 146 105 210 233 159
22x 115 230 129 79 158 113 226 137 95 190 23x 49 98 196 197 199 195
203 219 251 187 24x 59 118 236 149 103 206 209 239 147 107 25x 214
225 143 83 166
[0047]
2Table 2 Logarithmic Table i xx0 xx1 xx2 xx3 xx4 xx5 xx6 xx7 xx8
xx9 00x 0 0 1 23 2 46 24 83 3 106 01x 47 147 25 52 84 69 4 92 107
182 02x 48 166 148 75 26 140 53 129 85 170 03x 70 13 5 36 93 135
108 155 183 193 04x 49 43 167 163 149 152 76 202 27 230 05x 141 115
54 205 130 18 86 98 171 240 06x 71 79 14 189 6 212 37 210 94 39 07x
136 102 109 214 156 121 184 8 194 223 08x 50 104 44 253 168 138 164
90 150 41 09x 153 34 77 96 203 228 28 123 231 59 10x 142 158 116
244 55 216 206 249 131 111 11x 19 178 87 225 99 220 172 196 241 175
12x 72 10 80 66 15 186 190 199 7 222 13x 213 120 38 101 211 209 95
227 40 33 14x 137 89 103 252 110 177 215 248 157 243 15x 122 58 185
198 9 65 195 174 224 219 16x 51 68 105 146 45 82 254 22 169 12 17x
139 128 165 74 91 181 151 201 42 162 18x 154 192 35 134 78 188 97
239 204 17 19x 229 114 29 61 124 235 232 233 60 234 20x 143 125 159
236 117 30 245 62 56 246 21x 217 63 207 118 250 31 132 160 112 237
22x 20 144 179 126 88 251 226 32 100 208 23x 221 119 173 218 197 64
242 57 176 247 24x 73 180 11 127 81 21 67 145 16 113 25x 187 238
191 133 200 161
[0048] III. Memory Implementation
[0049] When implemented in hardware, shifting bits is a simple and
efficient operation. Using a processor and for a shift register
larger than the registers of the processor, shifting bits is an
iterative procedure which is very inefficient. When the units to be
shifted are bytes or words, shifting becomes simpler because there
is no carry between bytes. However, the shifting process is still
iterative and inefficient.
[0050] In the exemplary embodiment, the linear feedback shift
register is implemented with a circular buffer or a sliding window.
The diagrams showing the contents of circular buffer 24a at time n
and at time n+1 are shown in FIGS. 3A and 3B, respectively. For
circular buffer 24a, each element of the shift register is stored
in a corresponding location in memory. A single index, or pointer
30, maintains the memory location of the most recent element stored
in memory, which is S.sub.k-1 in FIG. 3A. At time n+1, the new
element S.sub.k is computed and stored over the oldest element
S.sub.0 in memory, as shown in FIG. 3B. Thus, instead of shifting
all elements in memory, pointer 30 is moved to the memory location
of the new element S.sub.k. When pointer 30 reaches the end of
circular buffer 24a, it is reset to the beginning (as shown in
FIGS. 3A and 3B). Thus, circular buffer 24a acts as if it is a
circle and not a straight line.
[0051] Circular buffer 24a can be shifted from left-to-right, or
right-to-left as shown in FIGS. 3A and 3B. Correspondingly, pointer
30 can move left-to-right, or right-to-left as shown in FIGS. 3A
and 3B. The choice in the direction of the shift is a matter of
implementation style and does not affect the output result.
[0052] To generate an output element in accordance with a
recurrence relation, more than one element is typically required
from memory. The memory location associated with each required
element can be indicated by a separate pointer which is updated
when the register is shifted. Alternatively, the memory location
associated with each required element can be computed from pointer
30 as necessary. Since there is a one-to-one mapping of each
element to a memory location, a particular element can be obtained
by determining the offset of that element from the newest element
(in accordance with the recurrence relation), adding that offset to
pointer 30, and addressing the memory location indicated by the
updated pointer. Because of the circular nature of the memory, the
calculation of the updated pointer is determined by an addition
modulo k of the offset to pointer 30. Addition modulo k is simple
when k is a power of two but is otherwise an inefficient operation
on a processor.
[0053] In the preferred embodiment, the shift register is
implemented with sliding window 24b as shown in FIG. 3C. Sliding
window 24b is at least twice as long as circular buffer 24a and
comprises two circular buffers 32a and 32b arranged adjacent to
each other. Each of circular buffers 32a and 32b behaves like
circular buffer 24a described above. Circular buffer 32b is an
exact replica of circular buffer 32a. Thus, each element of the
shift register is stored in two corresponding locations in memory,
one each for circular buffers 32a and 32b. Pointer 34 maintains the
memory location of the most recent element stored in circular
buffer 32a, which is S.sub.k-1, in FIG. 3C. In the exemplary
embodiment, pointer 34 starts at the middle of sliding window 24b,
moves right-to-left, and resets to the middle again when it reaches
the end on the left side.
[0054] From FIG. 3C, it can be observed that no matter where in
circular buffer 32a pointer 34 appears, the previous k-1 elements
can be addressed to the right of pointer 34. Thus, to address an
element in the shift register in accordance with the recurrence
relation, an offset of k-1 or less is added to pointer 34. Addition
modulo k is not required since the updated pointer is always to the
right of pointer 34 and computational efficiency is obtained. For
this implementation, sliding window 24b can be of any length at
least twice as long as circular buffer 24a, with any excess bytes
being ignored. Furthermore, the update time is constant and
short.
[0055] IV. Exemplary Stream Cipher Based on LFSR Over
GF(2.sup.8)
[0056] The present invention can be best illustrated by an
exemplary generator for a stream cipher based on a linear feedback
shift register over GF(2.sup.8). The stream cipher described below
uses the byte operations described above over the Galois field of
order 8 with the representation of H and .quadrature. for
operations of addition and multiplication, respectively, over the
Galois field. In the exemplary embodiment, table lookup is utilized
for the required multiplication with constants C.sub.j. In the
exemplary embodiment, a sliding window is used to allow fast
updating of the shift register.
[0057] A block diagram of the exemplary generator is shown in FIG.
4. In the exemplary embodiment, linear feedback shift register 52
is 17 octets (or 136 bits) long which allows shift register 52 to
be in 2.sup.136-1 (or approximately 8.7.times.10.sup.40) states.
The state where the entire register is 0 is not a valid state and
does not occur from any other state. The time to update register 52
with a particular number of non-zero elements in the recurrence
relation is constant irrespective of the length of register 52.
Thus, additional length for register 52 (for higher order
recurrence relation) can be implemented at a nominal cost of extra
bytes in memory.
[0058] In the exemplary embodiment, linear feedback shift register
52 is updated in accordance with the following recurrence
relation:
S.sub.n+17=(100S.sub.n+9).sym.(141S.sub.n), (4)
[0059] where the operations are defined over GF(2.sup.8), {circle
over (+)} is the exclusive-OR operation on two bytes represented by
Galois adders 58, and {circle over (.times.)} is a polynomial
modular multiplication represented by Galois multipliers 54 (see
FIG. 4). In the exemplary embodiment, the modular multiplications
on coefficients 56 are implemented using byte table lookups on
pre-computed tables as described above. In the exemplary
embodiment, the polynomial modular multiplication table is computed
using the irreducible polynomial defined by equation (3). The
recurrence relation in equation (4) was chosen to be maximal
length, to have few non-zero coefficients, and so that the shift
register elements used were distinct from ones used for the
non-linear functions below.
[0060] In the exemplary embodiment, to disguise the linearity of
shift register 52, two of the techniques described above are used,
namely stuttering and using a non-linear function. Additional
non-linearity techniques are utilized and are described below.
[0061] In the exemplary embodiment, non-linearity is introduced by
performing a non-linear operation on multiple elements of shift
register 52. In the exemplary embodiment, four of the elements of
shift register 52 are combined using a function which is
non-linear. An exemplary non-linear function is the following:
V.sub.n=(S.sub.n+S.sub.n+5).times.(S.sub.n+2+S.sub.n+12), (5)
[0062] where V.sub.n is the non-linear output (or the generator
output), + is the addition truncated modulo 256 represented by
arithmetic adders 60, and .times. is the multiplication modulo 257
represented by modular multiplier 62 and described below. In the
exemplary embodiment, the four bytes used are S.sub.n, S.sub.n+2,
S.sub.n+5 and S.sub.n+12, where S.sub.n is the oldest calculated
element in the sequence according to the recurrence relation in
equation (4). These elements are selected such that, as the
register shifts, no two elements are used in the computation of two
of the generator outputs. The pairwise distances between these
elements are distinct values. For example, S.sub.n+12 is not
combined with S.sub.n+5, S.sub.n+2, nor S.sub.n again as it is
shifted through register 52.
[0063] Simple byte addition, with the result truncated modulo 256,
is made non-linear in GF(2.sup.8) by the carry between bits. In the
exemplary embodiment, two pairs of elements in the register
{(S.sub.n and S.sub.n+5) and (S.sub.n+2 and S.sub.n+12)} are
combined using addition modulo 256 to yield two intermediate
results. However, addition modulo 256 is not ideal since the least
significant bits have no carry input and are still combined
linearly.
[0064] Another non-linear function which can be computed
conveniently on a processor is multiplication. However, truncation
of a normal multiplication into a single byte may not yield good
result because multiplication modulo 256 does not form a group
since the results are not well distributed within the field. A
multiplicative group of the field of integers modulo the prime
number 257 can be used. This group consists of integers in the
range of 1 to 256 with the group operation being integer
multiplication reduced modulo 257. Note that the value 0 does not
appear in the group but the value 256 does. In the exemplary
embodiment, the value of 256 can be represented by a byte value of
0.
[0065] Typically, processors can perform multiplication
instructions efficiently but many have no capability to perform,
nor to perform efficiently, divide or modulus instructions. Thus,
the modulo reduction by 257 can represent a performance bottleneck.
However, reduction modulo 257 can be computed using other
computational modulo 2.sup.n, which in the case of n=8 are
efficient on common processors. It can be shown that for a value X
in the range of 1 to 2.sup.16-1 (where X is the result of a
multiplication of two 8th order operands), reduction modulo 257 can
be computed as: 1 X 257 = { X 256 - X 256 } 257 , ( 6 )
[0066] where X.sub.257 is the reduction modulo 257 of X and
X.sub.256 is the reduction modulo 256 of X. Equation (6) indicates
that reduction modulo 257 of a 16-bit number can be obtained by
subtracting the 8 most significant bits (X/256) from the 8 least
significant bits (X.sub.256). The result of the subtraction is in
the range of -255 and 255 and may be negative. If the result is
negative, it can be adjusted to the correct range by adding 257. In
the alternative embodiment, reduction modulo 257 can be performed
with a lookup table comprising 65,536 elements, each 8 bits
wide.
[0067] Multiplication of the two intermediate results is one of
many non-linear functions which can be utilized. Other non-linear
functions, such as bent functions or permuting byte values before
combining them, can also be implemented using lookup tables. The
present invention is directed at the use of these various
non-linear functions for producing non-linear output.
[0068] In the exemplary embodiment, stuttering is also utilized to
inject additional non-linearity. The non-linear output derived from
the state of the linear feedback shift register as described above
may be used to reconstruct the state of the shift register. This
reconstruction can be made more difficult by not representing some
of the states at the output of the generator, and choosing which in
an unpredictable manner. In the exemplary embodiment, the
non-linear output is used to determine what subsequent bytes of
non-linear output appear in the output stream. When the generator
is started, the first output byte is used as the stutter control
byte. In the exemplary embodiment, each stutter control byte is
divided into four pairs of bits, with the least significant pair
being used first. When all four pairs have been used, the next
non-linear output byte from the generator is used as the next
stutter control byte, and so on.
[0069] Each pair of stutter control bits can take on one of four
values. In the exemplary embodiment, the action performed for each
pair value is tabulated in Table 3.
3TABLE 3 Pair Value Action of Generator (0, 0) Register is cycled
but no output is produced (0, 1) Register is cycled and the
non-linear output XOR with the constant (0 1 1 0 1 0 0 1).sub.2
becomes the output of the generator. Register is cycled again. (1,
0) Register is cycled twice and the non-linear output becomes the
output of the generator. (1, 1) Register is cycled and the
non-linear output XOR with the constant (1 1 0 0 0 1 0 1).sub.2
becomes the output of the generator.
[0070] As shown in Table 3, in the exemplary embodiment, when the
pair value is (0, 0), the register is cycled once but no output is
produced. Cycling of the register denotes the calculation of the
next sequence output in accordance with equation (4) and the
shifting this new element into the register. The next stutter
control pair is then used to determine the action to be taken
next.
[0071] In the exemplary embodiment, when the pair value is (0, 1)
the register is cycled and the non-linear output generated in
accordance with equation (5). The non-linear output is XORed with
the constant (0 1 1 0 1 0 0 1).sub.2 and the result is provided as
the generator output. The register is then cycled again. In FIG. 4,
the XORed function is performed by XOR gate 66 and the constant is
selected by multiplexer (MUX) 64 using the stutter control pair
from buffer 70. The output from XOR gate 66 is provided to switch
68 which provides the generator output and the output byte for
stutter control in accordance with the value of the stutter control
pair. The output byte for stutter control is provided to buffer
70.
[0072] In the exemplary embodiment, when the pair value is (1, 0)
the register is cycled twice and the non-linear output generated in
accordance with equation (5) is provided as the generator
output.
[0073] In the exemplary embodiment, when the pair value is (1, 1)
the register is cycled and the non-linear output generated in
accordance with equation (5). The non-linear output is then XORed
with the constant (1 1 0 0 0 1 0 1).sub.2 and the result is
provided as the generator output.
[0074] In the exemplary embodiment, the constants which are used in
the above steps are selected such that when a generator output is
produced, half of the bits in the output are inverted with respect
to the outputs produced by the other stutter control pairs. For
stutter control pair (1, 0), the non-linear output can be viewed as
being XORed with the constant (0 0 0 0 0 0 0 0).sub.2. Thus, the
Hamming distance between any of the three constants is four. The
bit inversion further masks the linearity of the generator and
frustrates any attempt to reconstruct the state based on the
generator output.
[0075] The present invention supports a multi-tier keying
structure. A stream cipher which supports a multi-tier keying
structure is especially useful for wireless communication system
wherein data are transmitted in frames which may be received in
error or out-of-sequence. An exemplary two-tier keying structure is
described below.
[0076] In the exemplary embodiment, one secret key is used to
initialize the generator. The secret key is used to cause the
generator to take an unpredictable leap in the sequence. In the
exemplary embodiment, the secret key has a length of four to k-1
bytes (or 32 to 128 bits for the exemplary recurrence relation of
order 17). Secret keys of less than 4 bytes are not preferred
because the initial randomization may not be adequate. Secret keys
of greater than k-1 bytes can also be utilized but are redundant,
and care should be taken so that a value for the key does not cause
the register state to be set to all 0, a state which cannot happen
with the current limitation.
[0077] A flow diagram of an exemplary secret key initialization
process is shown in FIG. 5. The process starts at block 110. In the
exemplary embodiment, at block 112, the state of the shift register
is first initialized with the Fibonacci numbers modulo 256. Thus,
elements S.sub.0, S.sub.1, S.sub.2, S.sub.3, S.sub.4, S.sub.5, and
so on, are initialized with 1, 1, 2, 3, 5, 8, and so on,
respectively. Although Fibonacci numbers are used, any set of
non-zero numbers which are not linearly related in the Galois field
can be used to initialize the register. These numbers should not
have exploitable linear relationship which can be used to
reconstruct the state of the register.
[0078] Next, the loop index n is set to zero, at block 114. The
secret key initialization process then enters a loop. In the first
step within the loop, at block 116, the first unused byte of the
key material is added to S.sub.n. Addition of the key material
causes the generator to take an unpredictable leap in the sequence.
The key is then shifted by one byte, at block 118, such that byte
used in block 116 is deleted. The register is then cycled, at block
120. The combination of blocks 116 and 120 effectively performs the
following calculation:
S.sub.n+17=(100S.sub.n+9).sym.S.sub.n+4.sym.(141(S.sub.n.sym.K)),
(7)
[0079] where K is the first unused byte of the key material. The
loop index n is incremented, at block 122. A determination is then
made whether all key material have been used, at block 124. If the
answer is no, the process returns to block 116. Otherwise, the
process continues to block 126.
[0080] In the exemplary embodiment, the length of the key is added
to S.sub.n, at block 126. Addition of the length of the key causes
the generator to take an additional leap in the sequence. The
process then enters a second loop. In the first step within the
second loop, at block 128, the register is cycled. The loop index n
is incremented, at block 130, and compared against the order k of
the generator, at block 132. If n is not equal to k, the process
returns to block 128. Otherwise, if n is equal to k, the process
continues to block 134 where the state of the generator is saved.
The process then terminates at block 136.
[0081] In addition to the secret key, a secondary key can also be
used in the present invention. The secondary key is not considered
secret but is used in an exemplary wireless telephony system to
generate a unique cipher for each frame of data. This ensures that
erased or out-of-sequence frames do not disrupt the flow of
information. In the exemplary embodiment, the stream cipher accepts
a per-frame key, called a frame key, in the form of a 4-octet
unsigned integer. The per-frame initialization is similar to the
secret key initialization above but is performed for each frame of
data. If the use of the stream cipher is such that it is
unnecessary to utilize per-frame key information, for example for
file transfer over a reliable link, the per-frame initialization
process can be omitted.
[0082] A flow diagram of an exemplary per-frame initialization
process with the frame key is shown in FIG. 6. The process starts
at block 210. In the exemplary embodiment, at block 212, the state
of the generator is initialized with the state saved from the
secret key initialization process as described above. Next, the
loop index n is set to zero, at block 214. The per-frame
initialization process then enters a loop. In the first step within
the loop, at block 216, the least significant byte of the frame key
is added modulo 256 to S.sub.n. The frame key is then shifted by
three bits, at block 218, such that the three least significant
bits used in block 216 are deleted. The register is then cycled, at
block 220. In the exemplary embodiment, the loop index n is
incremented at block 222 and compared against 11 at block 224. The
value of 11, as used in block 224, corresponds to the 32 bits used
as the frame key and the fact that the frame key is shifted three
bits at a time. Different selections of the frame key and different
numbers of bits shifted at a time can result in different
comparison values used in block 224. If n is not equal to 11, the
process returns to block 216. Otherwise, if n is equal to 11, the
process continues to block 226 and the register is cycled again.
The loop index n is incremented, at block 228, and compared against
2k, at block 230. If n is not equal to 2k, the process returns to
block 226. Otherwise, if n is equal to 2k, the process terminates
at block 232.
[0083] The present invention has been described for the exemplary
Galois finite field having 256 elements. Different finite fields
can also be utilized such that the size of the elements matches the
byte or word size of the processor used to manipulate the elements
and/or the memory used to implement the shift register, or having
other advantages. Thus, various finite fields having more than two
elements can by utilized and are within the scope of the present
invention.
[0084] The example shown above utilizes a variety of non-linear
processes to mask the linearity of the recurrence relation. Other
generators can be designed utilizing different non-linear
processes, or different combinations of the above described
non-linear processes and other non-linear processes. Thus, the use
of various non-linear processes to generate non-linear outputs can
be contemplated and is within the scope of the present
invention.
[0085] The example shown above utilizes a recurrence relation
having an order of 17 and defined by equation (4). Recurrence
relations having other orders can also be generated and are within
the scope of the present invention. In the present invention, a
maximal length recurrence relation is preferred for optimal
results.
[0086] V. A Second Exemplary Stream Cipher Based on LFSR Over
GF(2.sup.8)
[0087] A block diagram of a second exemplary generator is shown in
FIG. 7. In the exemplary embodiment, linear feedback shift register
82 is 17 octets long although other lengths for register 82 (for
different order recurrence relation) can be implemented and are
within the scope of the present invention. A recurrence relation of
order 17 is well suited for applications requiring 128-bit key
material. In the exemplary embodiment, linear feedback shift
register 82 is updated in accordance with the following recurrence
relation:
S.sub.n+17=(141S.sub.n+15).sym.S.sub.n+4.sym.(175S.sub.n), (8)
[0088] where the operations are defined over GF(2.sup.8), {circle
over (+)} is the exclusive-OR operation on two bytes represented by
Galois adders 88, and {circle over (.times.)} is a polynomial
modular multiplication represented by Galois multipliers 84 (see
FIG. 7). In the exemplary embodiment, the modular multiplications
on coefficients 86 are implemented using byte table lookups on
pre-computed tables as described above. The recurrence relation in
equation (8) was chosen to be maximal length.
[0089] In the exemplary embodiment, to disguise the linearity of
shift register 82, two of the techniques described above are used,
namely stuttering and using a non-linear function. Additional
non-linearity techniques are utilized and are described below.
[0090] In the exemplary embodiment, non-linearity is introduced by
combining four of the elements of shift register 82 using a
function (or output equation) which is non-linear with respect to
the linear operation over GF(2.sup.8). In the exemplary embodiment,
the four bytes used are S.sub.n, S.sub.n+2, S.sub.n+5 and
S.sub.n+12, where S.sub.n is the oldest calculated element in the
sequence according to the recurrence relation in equation (8). In
the exemplary embodiment, the four bytes are combined in accordance
with the following output equation:
V.sub.n=S.sub.n+S.sub.n+2+S.sub.n+5+S.sub.n+12, (9)
[0091] where V.sub.n is the non-linear output and + is the addition
truncated modulo 256 (with the overflow discarded) represented by
arithmetic adders 90.
[0092] As stated above, simple byte addition, with the result
truncated modulo 256, is made non-linear in GF(2.sup.8) by the
carry between bits. In the exemplary embodiment, the four bytes are
combined using addition modulo 256 to yield the output. However,
addition modulo 256 is not ideal since the least significant bits
have no carry input and are still combined linearly. In the
exemplary embodiment, the subsequent stuttering step provides
sufficient disguise of the remaining linearity in equation (9). The
use of modulo addition in equation (9) simplifies the computation
required to generate an output.
[0093] In the exemplary embodiment, the bytes used for recurrence
relation (8) comprise S.sub.n, S.sub.n+4, and S.sub.n+15 and the
bytes used for output equation (9) comprise S.sub.n, S.sub.n+2,
S.sub.n+5 and S.sub.n+12. In the exemplary embodiment, these bytes
are selected to have distinct pair distances. For recurrence
relation equation (8), the three bytes used have pair distances of
4 (the distance between S.sub.n and S.sub.n+4), 11 (the distance
between S.sub.n+4 and S.sub.n+15), and 15 (the distance between
S.sub.n and S.sub.n+15). Similarly, for output equation (9), the
four bytes used have pair distances of 2 (the distance between
S.sub.n and S.sub.n+2), 3 (the distance between S.sub.n+2 and
S.sub.n+5), 5 (the distance between S.sub.n and S.sub.n+5), 7 (the
distance between S.sub.n+5 and S.sub.n+12), 10 (the distance
between S.sub.n+2 and S.sub.n+12), and 12 (the distance between
S.sub.n and S.sub.n+12). It can be noted that the pair distances in
recurrence relation (8) (e.g., 4, 11, and 15) are unique (or
distinct) within that first respective group and that the pair
distances in output equation (9) (e.g., 2, 3, 5, 7, 10, and 12) are
also distinct within that second respective group. Furthermore, it
can be noted that the pair distances in recurrence relation (8) are
distinct from the pair distances in output equation (9). Distinct
pair distances ensure that, as shift register 82 shifts, no
particular pair of elements of shift register 82 are used twice in
either recurrence relation (8) or the non-linear output equation
(9). This property removes linearity in the subsequent output
equation (9).
[0094] In the exemplary embodiment, multiplexer (MUX) 92, XOR gate
94, switch 96, and buffer 98 in FIG. 7 operate in the manner
described above for MUX 64, XOR gate 66, switch 68, and buffer 70
in FIG. 4.
[0095] In the exemplary embodiment, the secret key initialization
process as shown in FIG. 5 is performed once and the state of the
generator is saved for later use by the subsequent per-frame
initialization process. In the alternative embodiment, instead of
saving the state of the generator, the secret key initialization
process can be performed whenever the state of the generator is
needed. The alternative embodiment work particularly well when the
secret key is shorter than 17 bytes, or the length of the shift
registers.
[0096] A flow diagram of an alternative exemplary per-frame
initialization process with the frame key is shown in FIG. 8. The
alternative exemplary per-frame initialization process in FIG. 8 is
identical to the per-frame initialization process in FIG. 6, with
the exception of block 213. For frame key which is used somewhat
like a counter (e.g., the least significant bits change most
frequently), the least significant byte of the frame key can be
XORed with the most significant byte such that the most significant
byte can have more impact in the initialization process. This is
represented by block 213 in FIG. 8 which is interposed between
blocks 212 and 214 in the flow diagram of FIG. 6.
[0097] The previous description of the preferred embodiments is
provided to enable any person skilled in the art to make or use the
present invention. The various modifications to these embodiments
will be readily apparent to those skilled in the art, and the
generic principles defined herein may be applied to other
embodiments without the use of the inventive faculty. Thus, the
present invention is not intended to be limited to the embodiments
shown herein but is to be accorded the widest scope consistent with
the principles and novel features disclosed herein.
* * * * *