Method for processing new sequences being recorded into an interlocking trees datastore Mazzagatti; Jane Campbell ; et al. [Claar; Jane Van Keuren]

Patent Application Summary

U.S. patent application number 11/185620 was filed with the patent office on 2005-07-20 and published on 2006-05-11 for a method for processing new sequences being recorded into an interlocking trees datastore. Invention is credited to Jane Van Keuren Claar, Jane Campbell Mazzagatti, Steven L. Rajcan.

Application Number	20060101018 11/185620
Family ID	39096068
Filed Date	2005-07-20

United States Patent Application	20060101018
Kind Code	A1
Mazzagatti; Jane Campbell ; et al.	May 11, 2006

Method for processing new sequences being recorded into an interlocking trees datastore

Abstract

A method of recording information in an interlocking trees datastore having a plurality of K paths includes receiving a data stream input sequence and traversing the interlocking trees datastore in accordance with the received input sequence for recording the received input sequence within the interlocking trees datastore. First determining whether a K path of the plurality of K paths matches the input sequence is performed and second determining that a new sequence has been encountered in accordance with the first determining is performed. New structure is built in accordance with the second determining. A path of the plurality of paths has a plurality of K nodes including a current K node, an adjacent K node that is adjacent to the current K node, the adjacent K node having a non-adjacent K node that is not in the asCase list of the current K node.

Inventors:	Mazzagatti; Jane Campbell; (Blue Bell, PA) ; Claar; Jane Van Keuren; (Bethlehem, PA) ; Rajcan; Steven L.; (Glenmoore, PA)
Correspondence Address:	Unisys Corporation;Attn: Mark T. Starr Unisys Way, MS/E8-114 Blue Bell PA 19424-0001 US
Family ID:	39096068
Appl. No.:	11/185620
Filed:	July 20, 2005

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
60625922	Nov 8, 2004

Current U.S. Class:	1/1 ; 707/999.008; 707/E17.007; 707/E17.012
Current CPC Class:	G06F 16/2343 20190101; Y10S 707/99942 20130101; Y10S 707/99938 20130101; G06F 16/90344 20190101; Y10S 707/99943 20130101; G06F 16/2246 20190101
Class at Publication:	707/008
International Class:	G06F 7/00 20060101 G06F007/00; G06F 17/30 20060101 G06F017/30

Claims

1. A method of recording information in an interlocking trees datastore having a plurality of K paths, comprising: receiving a data stream input sequence; traversing said interlocking trees datastore in accordance with said received data stream input sequence for recording said received data stream input sequence within said interlocking trees datastore; first determining whether a K path of said plurality of K paths matches said data stream input sequence; and second determining that a new sequence has been encountered in accordance with said first determining.

2. The method of recording information in an interlocking trees datastore of claim 1, comprising building new structure in accordance with said second determining.

3. The method of recording information in an interlocking trees datastore of claim 1, wherein a path of said plurality of paths has a plurality of K nodes including a current K node, an adjacent K node that is adjacent to said current K node, said adjacent K node having a non-adjacent K node that is not in the asCase list of said current K node, further comprising traversing said interlocking trees datastore by determining a match between said non-adjacent K node and an adjacent K node.

4. The method of recording information in an interlocking trees datastore of claim 3, wherein the non-adjacent K node is the K node matched to an input particle.

5. The method of recording information in an interlocking trees datastore of claim 3, wherein said interlocking trees datastore includes a plurality of levels and the non-adjacent K node is some other K node of said plurality of K nodes.

6. The method of recording information in an interlocking trees datastore of claim 3, further comprising traversing said interlocking trees datastore in accordance with an asCase bi-directional link of a K node of said plurality of K nodes.

7. The method of recording information in an interlocking trees datastore of claim 6, further comprising determining said match in accordance with a Result pointer of a K node of said plurality of K nodes.

8. The method of recording information in an interlocking trees datastore of claim 3, further comprising traversing from a current K node location to an adjacent K node in accordance with said non-adjacent K node.

9. The method of recording information in an interlocking trees datastore of claim 8, further comprising determining said adjacent K node in accordance with an asCase list of said current K node.

10. The method of recording information in an interlocking trees datastore of claim 8, further comprising determining a match between said adjacent K node and a further K node.

11. The method of recording information in an interlocking trees datastore of claim 10, further comprising determining said match between said adjacent K node and an end node or a K node representing a further particle.

12. The method of recording information in an interlocking trees datastore of claim 8, wherein said current K node is a beginning of thought node.

13. The method of recording information in an interlocking trees datastore of claim 8, wherein said current K node is a subcomponent K node of said path of said plurality of paths.

14. The method of recording information in an interlocking trees datastore of claim 1, further comprising processing a new sequence event in accordance with said second determining.

15. The method for recording information in an interlocking trees datastore of claim 14, wherein said interlocking trees datastore is constructed by a KEngine further comprising processing a new sequence event by way of said KEngine.

16. The method for recording information in an interlocking trees datastore of claim 14, wherein said KEngine has a procedure further comprising processing a new sequence event in accordance with established parameter settings.

17. The method of recording information in an interlocking trees datastore of claim 16, further comprising reporting said new sequence in accordance with said established parameter settings.

18. The method for recording information in an interlocking trees datastore of claim 16, further comprising creating new KStore structure in accordance with said established parameter setting.

19. The method for recording information in an interlocking trees datastore of claim 16, further comprising instantiating a process in accordance with said established parameter setting.

20. The method for recording information in an interlocking trees datastore of claim 16, wherein said established parameter setting is obtained from a graphical user interface.

21. The method for recording information in an interlocking trees datastore of claim 16, wherein said established parameter setting is obtained from a calling procedure.

22. The method for recording information in an interlocking trees datastore of claim 1, further comprising determining a possible next K node in accordance with said first determining.

23. The method for recording information in an interlocking trees datastore of claim 22, further comprising: determining a plurality of possible next K nodes wherein at least one possible next K node of said plurality of possible next K nodes has a node count; and selecting said possible next K node in accordance with said node count.

24. The method for recording information in an interlocking trees datastore of claim 22, further comprising determining said possible next K in accordance with a context.

25. The method for recording information in an interlocking trees datastore of claim 22, further comprising determining said possible next K in accordance with how recently a node on the asCase list of a current node has been created.

26. The method for recording information in an interlocking trees datastore of claim 22, further comprising determining said possible next K in accordance with how recently a node on the asCase list of a current node has been accessed.

27. The method for recording information in an interlocking trees datastore of claim 2, further comprising locking said matched K node when said new K node is built.

28. The method for recording information in an interlocking trees datastore of claim 1, further comprising reporting said new sequence in accordance with said first determining.

29. The method for recording information in an interlocking trees datastore of claim 28, wherein said interlocking trees datastore is provided with a KEngine further comprising reporting said new sequence by way of said KEngine.

30. The method for recording information in an interlocking trees datastore of claim 28, wherein said interlocking trees datastore is accessed by way of a calling procedure further comprising reporting said new sequence by way of returning information to said calling procedure.

31. A method for recording information in an interlocking trees datastore having at least one K node, comprising: receiving a sequence having at least one input into said interlocking trees datastore; first determining a mismatch between said at least one K node and said at least one input; and second determining that said sequence is a new sequence in accordance with said determined mismatch.

32. The method for recording information in an interlocking trees datastore of claim 31, wherein said interlocking trees datastore includes a plurality of K paths having an end node and a plurality of K nodes and said sequence has a plurality of inputs associated with a plurality of sensors, further comprising: receiving said sequence having said plurality of inputs into said interlocking trees datastore; traversing at least one K path of said plurality of K paths in accordance with said plurality of inputs to provide a traversed K path; determining a mismatch between a K node of said plurality of K nodes and an end node or a sensor to provide a mismatched K node and a mismatched end node or sensor; determining that said sequence is a new sequence in accordance with said determined mismatch.

33. The method for recording information in an interlocking trees datastore of claim 32, further comprising traversing said at least one K path by determining a match between a current K node of said at least one K path and a matched sensor K node with an associated matched input.

34. The method for recording information in an interlocking trees datastore of claim 33, further comprising determining an adjacent K node of said current K node.

35. The method for recording information in an interlocking trees datastore of claim 34, further comprising determining said adjacent K node in accordance with an asCase list pointer of said current K node.

36. The method for recording information in an interlocking trees datastore of claim 34, further comprising first determining in accordance with a Result pointer of said adjacent K node.

37. The method for recording information in an interlocking trees datastore of claim 31, further comprising reporting said new sequence in accordance with said first determining.

38. The method for recording information in an interlocking trees datastore of claim 37, wherein said interlocking trees datastore is provided with a KEngine further comprising reporting said new sequence by way of said KEngine.

39. The method for recording information in an interlocking trees datastore of claim 37, wherein said KEngine has a calling procedure further comprising reporting said new sequence by having said KEngine return information to said calling procedure.

40. The method for recording information in an interlocking trees datastore of claim 39, wherein said interlocking trees datastore has a learn engine and said calling procedure comprises said learn engine.

41. The method for recording information in an interlocking trees datastore of claim 39, wherein said calling procedure comprises a query.

42. The method for recording information in an interlocking trees datastore of claim 31, further comprising reporting an event in accordance with said first determining.

43. The method for recording information in an interlocking trees datastore of claim 42, wherein said event comprises recording a best guess in said interlocking trees datastore.

44. The method for recording information in an interlocking trees datastore of claim 31, further comprising logging a message in accordance with said first determining.

45. The method for recording information in an interlocking trees datastore of claim 31, further comprising setting a lock flag in accordance with said first determining.

46. The method for recording information in an interlocking trees datastore of claim 31, further comprising determining a possible next K node in accordance with said first determining.

47. The method for recording information in an interlocking trees datastore of claim 46, further comprising: determining a plurality of possible next K nodes, at least one possible next K node of said plurality of possible next K nodes having a node count; and selecting said possible next K node in accordance with said node count.

48. The method for recording information in an interlocking trees datastore of claim 31, further comprising building a new K node in accordance with said first determining.

49. The method for recording information in an interlocking trees datastore of claim 48, further comprising locking said current K node when said new K node is built.

50. The method for recording information in an interlocking trees datastore of claim 49, further comprising establishing Case bi-directional links between said current K node and said new K node when said current K node is locked.

51. The method for recording information in an interlocking trees datastore of claim 49, further comprising locking the Result node of said new K node when said new K node is built.

52. The method for recording information in an interlocking trees datastore of claim 49, further comprising updating the asResult list of said Result node of said new K node when said Result node of said new K node is locked.

53. The method for recording information in an interlocking trees datastore of claim 48, wherein said new K node has a new node count further comprising initializing said new node count.

54. A system for recording information in an interlocking trees datastore having a plurality of K paths, comprising: a received data stream input sequence; a traversal of said interlocking trees datastore in accordance with said received data stream input sequence for recording said received data stream input sequence within said interlocking trees datastore; a first determination whether a K path of said plurality of K paths matches said data stream input sequence; and a second determination that a new sequence has been encountered in accordance with said first determination.

55. The system for recording information in an interlocking trees datastore of claim 54, comprising a new structure built in accordance with said second determination.

56. The system for recording information in an interlocking trees datastore of claim 54, wherein a path of said plurality of paths has a plurality of K nodes including a current K node, an adjacent K node that is adjacent to said current K node, said adjacent K node having a non-adjacent K node that is not in the asCase list of said current K node, further comprising a traversal of said interlocking trees datastore determined by a match between said non-adjacent K node and an adjacent K node.

57. The system for recording information in an interlocking trees datastore of claim 56, wherein the non-adjacent K node is the K node matched to an input particle.

58. The system for recording information in an interlocking trees datastore of claim 56, wherein said interlocking trees datastore includes a plurality of levels and the non-adjacent K node is some other K node of said plurality of K nodes.

59. The system for recording information in an interlocking trees datastore of claim 56, further comprising a traversal of said interlocking trees datastore in accordance with an asCase bi-directional link of a K node of said plurality of K nodes.

60. The system for recording information in an interlocking trees datastore of claim 59, further comprising a determination of said match in accordance with a Result pointer of a K node of said plurality of K nodes.

61. The system for recording information in an interlocking trees datastore of claim 56, further comprising a traversal from a current K node location to an adjacent K node in accordance with said non-adjacent K node.

62. The system for recording information in an interlocking trees datastore of claim 61, further comprising a determination of said adjacent K node in accordance with an asCase list of said current K node.

63. The system for recording information in an interlocking trees datastore of claim 61, further comprising a determination of a match between said adjacent K node and a further K node.

Description

BACKGROUND OF THE INVENTION

[0001] 1. Field of Invention

[0002] This invention relates to the field of building new structure in an interlocking trees datastore.

[0003] 2. Description of Related Art

[0004] In many applications it is useful to identify when data or input sequences that have not been previously encountered, i.e. new variables or records, are received into a datastore. In known systems, identifying a new sequence has required the very computationally intensive procedure of comparing the new sequence with all of the previously received sequences to search for a match. In another known procedure for identifying new sequences, a table of the distinct values was constructed for comparison. Therefore, more efficient methods for detecting a new sequence are required.

[0005] Additionally, when a new sequence is identified the rules for the construction of a particular interlocking trees datastore may require building a new node or nodes to record the new sequence. When a new node is being built care should be taken to prevent access to nodes being changed, for example, by another thread executing in the datastore. Therefore, nodes that are being changed should be locked until the changes are complete in order to prevent such an access. In known systems the entire interlocking trees datastore was locked from threads adding new sequences to prevent other threads from accessing changing nodes. This was a severe restriction because it slowed the system down and essentially limited the construction of interlocking trees datastore to a single thread.

[0006] All references cited herein are incorporated herein by reference in their entireties.

BRIEF SUMMARY OF THE INVENTION

[0007] A method of recording information in an interlocking trees datastore having a plurality of K paths includes receiving a data stream input sequence and traversing the interlocking trees datastore in accordance with the received data stream input sequence for recording the received data stream input sequence within the interlocking trees datastore. First determining whether a K path of the plurality of K paths matches the data stream input sequence is performed and second determining that a new sequence has been encountered in accordance with the first determining is performed. New structure is built in accordance with the second determining. A path of the plurality of paths has a plurality of K nodes including a current K node, an adjacent K node that is adjacent to the current K node, the adjacent K node having a non-adjacent K node that is not in the asCase list of the current K node. The interlocking trees datastore is traversed by determining a match between the non-adjacent K node and an adjacent K node. The interlocking trees datastore can include a plurality of levels and the non-adjacent K node is some other K node of the plurality of K nodes. The interlocking trees datastore is traversed in accordance with an asCase bi-directional link of a K node of the plurality of K nodes. The match is determined in accordance with a Result pointer of a K node of the plurality of K nodes. Traversal occurs from a current K node location to an adjacent K node in accordance with the non-adjacent K node. The adjacent K node is determined in accordance with an asCase list of the current K node. A match is determined between the adjacent K node and a further K node.

[0008] As a part of its regular processing of streams of input particles, the KStore engine recognizes that a sequence being processed is `new` when there is no existing structure to record the sequence being processed. This condition may occur at any level in the K structure. At this point it is possible to implement many different processes to deal with the new sequence at substantially low incremental cost.

[0009] For example, new K structure may be created to record the event, an error statement may be issued, the event may be logged, processes may be initiated to evaluate the new sequence in accordance with some established criteria, or any combination of processes may be initiated in response to encountering a new sequence.
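The combination of responses listed above can be sketched as a set of flags that a caller passes in as its established parameter settings. This is only an illustrative sketch: the names NewSequenceAction and handle_new_sequence, and the three actions chosen, are hypothetical and are not taken from the application.

```python
from enum import Flag, auto


class NewSequenceAction(Flag):
    """Hypothetical parameter settings; any combination may be enabled."""
    NONE = 0
    BUILD = auto()    # create new K structure to record the event
    LOG = auto()      # log the event
    REPORT = auto()   # report the new sequence to the caller


def handle_new_sequence(sequence, actions, log):
    """Run every enabled action for a newly encountered sequence.

    Returns the names of the actions that were performed, in a fixed order.
    """
    performed = []
    if NewSequenceAction.BUILD in actions:
        # A real engine would build nodes here; this sketch only records the step.
        performed.append("BUILD")
    if NewSequenceAction.LOG in actions:
        log.append(f"new sequence: {sequence}")
        performed.append("LOG")
    if NewSequenceAction.REPORT in actions:
        performed.append("REPORT")
    return performed


log_lines = []
done = handle_new_sequence(
    "CATS", NewSequenceAction.BUILD | NewSequenceAction.LOG, log_lines
)
```

Using a flag set keeps the "any combination" property explicit: the caller enables zero or more actions without the handler needing a separate code path per combination.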

BRIEF DESCRIPTION OF SEVERAL VIEWS OF THE DRAWINGS

[0010] The invention will be described in conjunction with the following drawings in which like reference numerals designate like elements and wherein:

[0011] FIG. 1 shows a block diagram representation of a KStore environment in which the system and method of the present invention can be implemented.

[0012] FIG. 2A shows a minimal KStore structure.

[0013] FIGS. 2B-E show a series of datastore elements that may be formed in the process of building the minimal KStore structure of FIG. 2A.

[0014] FIGS. 2F,G show interlocking trees datastores that may be used to represent data according to the system and method of the present invention.

[0015] FIG. 3 shows a preferred embodiment of the next node process for processing particles in an interlocking trees datastore such as the interlocking trees datastores of FIGS. 2F, G.

[0016] FIGS. 4A, B show exemplary nodes within an interlocking trees datastore such as the interlocking trees datastores of FIGS. 2F, G.

[0017] FIG. 5 shows a preferred embodiment of a process for detecting a new sequence in an interlocking trees datastore such as the interlocking trees datastores of FIGS. 2F, G.

[0018] FIG. 6 shows a preferred embodiment of a process for creating a new node in an interlocking trees datastore such as the interlocking trees datastores of FIGS. 2F, G.

[0019] FIG. 7 shows a preferred embodiment of a process for creating a new node in an interlocking trees datastore such as the interlocking trees datastores of FIGS. 2F, G.

DETAILED DESCRIPTION OF THE INVENTION

[0020] The invention will be illustrated in more detail with reference to the following examples, but it should be understood that the present invention is not deemed to be limited thereto.

[0021] Referring now to FIG. 1, there is shown a block diagram representation of a KStore environment in which the system and method of the present invention may be implemented. Within the KStore environment, information may flow bi-directionally between the KStore 12 and both a data source 30 and an application 34 by way of a K Engine 14, as understood by those skilled in the art. The transmission of information between the data source 30 and the K Engine 14 may be by way of a learn engine 26, and the transmission of information between the application 34 and the K Engine 14 may be by way of an API utility 23, as also understood by those skilled in the art. The data source 30 and the application 34 may be provided with graphical user interfaces 36, 38 to permit a user to communicate with the system.

[0022] Additionally, according to the system and method of the invention, the K Engine 14 can be provided with a new sequence process 18 to permit the processing of new sequences as described in detail herein below. For example, one of the processes that are related to the new sequence process 18 is the lock process for locking previously existing nodes when a new node is constructed. The performance of the new sequence process 18 may be facilitated by providing specialized utilities 16. In the preferred embodiment of the invention the learn engine 26 and the API utility 23 can communicate with the K Engine 14 both directly and by way of the utilities 16.
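A lock process of the kind mentioned above might be sketched as follows, locking only the nodes being changed rather than the whole datastore. This is a minimal sketch under stated assumptions: the class KNode, the function find_or_build, and the use of one threading.Lock per node are hypothetical choices for illustration, not the implementation described in the application.

```python
import threading


class KNode:
    """Hypothetical K node carrying its own lock."""

    def __init__(self, label):
        self.label = label
        self.case = None        # Case pointer
        self.result = None      # Result pointer
        self.as_case = []       # nodes whose Case pointer is this node
        self.as_result = []     # nodes whose Result pointer is this node
        self.lock = threading.Lock()


def find_or_build(current, root):
    """Return (node, created) for the node following `current` with Result `root`.

    The current node is locked while its asCase list is searched and updated,
    so two threads cannot both build the same new node. The Result node is
    then locked separately while its asResult list is updated. The two locks
    are never held at the same time, which avoids lock-ordering deadlocks.
    """
    with current.lock:
        for candidate in current.as_case:
            if candidate.result is root:
                return candidate, False
        node = KNode(current.label + "-" + root.label)
        node.case = current
        node.result = root
        current.as_case.append(node)
    with root.lock:
        root.as_result.append(node)
    return node, True


bot = KNode("BOT")
c_root = KNode("C")
results = []


def worker():
    results.append(find_or_build(bot, c_root))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Eight threads present the same particle at the same location, yet only one new node is built; the other threads find it on the asCase list once the lock is released.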

[0023] Referring now to FIGS. 2A-E, there are shown the minimal KStore structure 100 and a plurality of datastore elements 130-160 which demonstrate a means of creating the minimal KStore structure. The minimal KStore structure 100 is the smallest KStore possible. It includes a single K path, beginning with K node 102 and ending with K node 112, wherein a portion of the K path ending with K node 112 is a portion of triad 114 having a subcomponent node 106. The minimal KStore structure 100 also includes an end product node 112 with its end of thought (EOT) root node 126. The nodes 106, 112 and 126 form a KStore triad.

[0024] In order to build the minimal KStore structure 100 the subcomponent K node 106 can be created by establishing the Case and Result bi-directional links. The Case bi-directional link 104 is established between the primary root node 102 and the K node 106, as best seen in the datastore element 130 of FIG. 2B. A pointer to node 102 becomes the Case entry of node 106 and the asCase list of the primary root node 102 is updated to include a pointer to the subcomponent node 106.

[0025] The Result bi-directional link 120 is established between the elemental root K node 122 and the K node 106. A pointer to node 122 becomes the Result entry of node 106 and the asResult list of the elemental root node 122 is updated to include the subcomponent node 106. It will be understood that the foregoing operations for building a subcomponent node such as the subcomponent node 106 can be described in any order for illustrative purpose and can be performed in any order when practicing the present invention.
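The two bi-directional links described for subcomponent node 106 can be sketched as a pair of pointers on the new node plus a back-pointer entry on each linked node. The sketch below is illustrative only: the class name KNode, its field names, and the helper build_node are assumptions, and the labels simply reuse the reference numerals from the figures.

```python
class KNode:
    """Hypothetical K node: Case and Result pointers plus their back-pointer lists."""

    def __init__(self, label):
        self.label = label
        self.case = None      # Case entry: the preceding node in the K path
        self.result = None    # Result entry: the root node being appended
        self.as_case = []     # nodes that name this node as their Case
        self.as_result = []   # nodes that name this node as their Result


def build_node(label, case_node, result_node):
    """Create a node and establish both bi-directional links."""
    node = KNode(label)
    # Case bi-directional link: pointer on the new node, entry on the asCase list.
    node.case = case_node
    case_node.as_case.append(node)
    # Result bi-directional link: pointer on the new node, entry on the asResult list.
    node.result = result_node
    result_node.as_result.append(node)
    return node


primary_root = KNode("102")     # primary root node
elemental_root = KNode("122")   # elemental root node
subcomponent = build_node("106", primary_root, elemental_root)
```

As the text notes, the order in which the two links are established does not matter; the helper fixes one order only for readability.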

[0026] The Case bi-directional link 110 can then be established between the end product node 112 and the subcomponent node 106, as best seen in the datastore elements 140, 150 of FIGS. 2C, D. By establishing the bi-directional link 124 between the end product node 112 and the EOT node 126, the datastore element 150 is completed, as best seen in the datastore element 160 of FIG. 2E. It will be understood that the datastore element 160 has substantially the same structure as the minimal KStore structure 100. Thus, the datastore elements 130-160 illustrate possible stages in the process of building the minimal KStore structure 100. The building process shown by the datastore elements 130-160 can be continued indefinitely to create a triadic continuum of any size. The building of interlocking trees datastores such as the minimal KStore structure 100 is taught in more detail in U.S. patent application Ser. Nos. 10/385,421, 10/666,382 and 10/759,466.

[0027] Referring now to FIGS. 2F, G, there are shown the interlocking trees datastore 250 and the interlocking trees datastore 290. The interlocking trees datastore 250 provides a representation of the sequence C-A-T. The sequence of letter data particles required for representing the sequence C-A-T can be streamed into the interlocking trees datastore 250 using a data simulator or by using any other method of entering input into a KStore. When the letter particle C of the sequence C-A-T is received the BOT-C node 256 can be built by establishing the Case bi-directional link to the beginning of thought (BOT) node 252 and establishing the Result bi-directional link to the C root node 274. Thus, the BOT-C node 256 can be built in substantially the manner previously described with respect to the node 106 of the triad 114 within the minimal KStore structure 100. The BOT-C node 256 is then designated as the current K location within the structure for this particular thread.

[0028] When the letter particle A of the sequence C-A-T is received the BOT-C-A node 258 can be built into the interlocking trees datastore 250 by establishing the Case bi-directional link to the BOT-C node 256. The Result bi-directional link to the A root node 270 can then be established. Thus, the BOT-C-A node 258 is built in response to receiving the letter particle A. The BOT-C-A node 258 is then designated as the current K location.

[0029] In order to form the BOT-C-A-T node 262 when the T letter particle is received the Case bi-directional link can be established to the BOT-C-A node 258 and the Result bi-directional link can be established to the T root node 280. The end product node 266 can then be created by forming the Case bi-directional link to the BOT-C-A-T node 262 and the Result bi-directional link to the EOT node 282. In this manner the K path 265 of the interlocking trees datastore 250 is built for representing the sequence of letter particles within the string C-A-T. With the processing of an EOT node the current K location is set to BOT.
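The construction of a K path such as the one for C-A-T can be sketched as a loop that appends one node per particle and then closes the path with an end product node. The names KNode, build_path and read_back are hypothetical, and reading the sequence back by following Case pointers is shown only to illustrate that the path records the input; it is not a procedure taken from the application.

```python
class KNode:
    """Hypothetical K node; both bi-directional links are set up on creation."""

    def __init__(self, label, case=None, result=None):
        self.label = label
        self.case = case
        self.result = result
        self.as_case = []
        self.as_result = []
        if case is not None:
            case.as_case.append(self)
        if result is not None:
            result.as_result.append(self)


def build_path(particles):
    """Build BOT -> one node per particle -> end product node; return (BOT, end product)."""
    bot = KNode("BOT")
    eot = KNode("EOT")
    roots = {p: KNode(p) for p in dict.fromkeys(particles)}  # one root per distinct particle
    current = bot
    for p in particles:
        # Case link to the current K location, Result link to the particle's root node.
        current = KNode(current.label + "-" + p, case=current, result=roots[p])
    end_product = KNode(current.label + "-EOT", case=current, result=eot)
    return bot, end_product


def read_back(end_product):
    """Follow Case pointers from the end product node back to BOT."""
    labels = []
    node = end_product.case
    while node.case is not None:   # BOT is the only node without a Case pointer
        labels.append(node.result.label)
        node = node.case
    return "".join(reversed(labels))


bot, cat = build_path("CAT")
```

In this sketch `read_back(cat)` returns the original string, because each node's Result pointer names the particle that was appended at that step.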

[0030] In one preferred embodiment a count can be kept within each node of the interlocking trees datastore 250 in order to keep track of the number of times the node is traversed. The counts of the nodes within a K path of an interlocking trees datastore may be incremented each time they are encountered during later traversals of the K path. Thus, it will be understood that in alternate embodiments of the system and method of the invention the counts of the individual nodes of an interlocking trees datastore such as the interlocking trees datastore 250 can be incremented either as they are built, encountered or when the building and traversal is complete.
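One of the alternatives above, incrementing the counts only when traversal of a K path is complete, might look like the sketch below. The names KNode and increment_path are hypothetical, and incrementing each Result root node together with the path node is an assumption based on the description of root node counts elsewhere in the text.

```python
class KNode:
    """Hypothetical K node with a traversal count."""

    def __init__(self, label, case=None, result=None):
        self.label = label
        self.case = case
        self.result = result
        self.count = 0


def increment_path(end_product):
    """Deferred counting: walk the Case pointers back to BOT once the path is complete."""
    node = end_product
    while node is not None:
        node.count += 1
        if node.result is not None:
            node.result.count += 1   # root node of the particle recorded at this step
        node = node.case


# Build the path BOT-C-A-T-EOT by hand (no counts are touched while building).
bot = KNode("BOT")
roots = {name: KNode(name) for name in ("C", "A", "T", "EOT")}
node = bot
for name in ("C", "A", "T", "EOT"):
    node = KNode(node.label + "-" + name, case=node, result=roots[name])
end_product = node

increment_path(end_product)   # first time the sequence is recorded
increment_path(end_product)   # a later traversal of the same K path
```

Deferring the increments keeps the per-particle step free of count updates; the immediate alternative would instead bump each count as the node is built or matched.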

[0031] Referring now to FIG. 5, there is shown the new sequence determining procedure 500 for determining when a new sequence is received by an interlocking trees datastore such as the interlocking trees datastores 250, 290. In the new sequence determining procedure 500 an input particle is compared to the list of the sensors of the interlocking trees datastore 250, 290, as shown in block 504. If no match is found in block 504 the particle is ignored and execution returns from the new sequence determining procedure 500. If a match is found the root node corresponding to the particle is thereby located.

[0032] If a match is found in block 504 the particle is considered valid. The asCase list of the current K location node is obtained. A comparison is then performed between the Result node of each node in the asCase list and the root node of the input particle. The comparison between the nodes on the asCase list of the current node and the particle root node is shown in block 508. If there is a match in the comparison of block 508 the matched node becomes the current K location node as shown in block 512. The count of the new current K location node and the input particle root node may be incremented at this time or may be incremented along with the other nodes in its K path when an end product node is encountered. If there is no match in the comparison of block 508 a new node may be created as shown in block 516 of new sequence determining procedure 500.

[0033] Referring now to FIG. 3, there is shown a more detailed diagram of blocks 508, 512, and 516 from FIG. 5. The next node processing procedure 300 is a procedure for locating the path indicated by the next K node to be processed. For example, the next node processing procedure 300 can be used to process nodes representing particles or nodes representing complete thoughts within the interlocking trees datastore 250.
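The flow of blocks 504, 508, 512 and 516 can be sketched as a single function that handles one input particle. This is a simplified sketch, not the procedure of FIG. 5 itself: the names KNode, KStore and process_particle are hypothetical, counts are incremented immediately, end of thought handling is omitted, and making a newly built node the current K location is an assumption carried over from the C-A-T example.

```python
class KNode:
    """Hypothetical K node; links are established on creation."""

    def __init__(self, label, case=None, result=None):
        self.label = label
        self.case = case
        self.result = result
        self.as_case = []
        self.as_result = []
        self.count = 0
        if case is not None:
            case.as_case.append(self)
        if result is not None:
            result.as_result.append(self)


class KStore:
    """Hypothetical store: a BOT node, one root node per sensor, a current K location."""

    def __init__(self, sensors):
        self.bot = KNode("BOT")
        self.roots = {s: KNode(s) for s in sensors}
        self.current = self.bot


def process_particle(store, particle):
    """Return 'ignored', 'matched' or 'new' for one input particle."""
    # Block 504: compare the particle to the sensor list.
    root = store.roots.get(particle)
    if root is None:
        return "ignored"
    # Block 508: compare the Result node of each asCase entry to the particle root.
    for candidate in store.current.as_case:
        if candidate.result is root:
            # Block 512: the matched node becomes the current K location.
            store.current = candidate
            candidate.count += 1
            root.count += 1
            return "matched"
    # Block 516: no existing structure, so the sequence is new and a node is built.
    new_node = KNode(store.current.label + "-" + particle,
                     case=store.current, result=root)
    new_node.count = 1
    root.count += 1
    store.current = new_node
    return "new"


store = KStore(sensors="CATS")
first_pass = [process_particle(store, p) for p in "CA?T"]
store.current = store.bot   # simplification: reset by hand instead of processing EOT
second_pass = [process_particle(store, p) for p in "CATS"]
```

Note that the "new" result falls out of the ordinary lookup: no comparison against previously stored sequences is needed beyond the asCase list of the current node.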

[0034] Although the particles set forth herein are input letter particles, it will be understood that the system and method of the invention can apply to any type of particle processing within an interlocking trees datastore. For example, they can apply to input words, sentences, pixels, molecules, amino acids, or any other data that can be received and stored in a datastore.

[0035] It will be understood by those skilled in the art that during its next node processing operations the next node processing procedure 300 will inherently detect the occurrence of a new sequence by the absence of structure to record an event within the interlocking trees datastore 250. This feature of the next node processing procedure 300 eliminates the need for any additional operations, beyond the normal input processing procedures, to determine when a new sequence is received. When a new sequence is determined within a KStore by the next node processing procedure 300 new KStore structure may be built to represent the new sequence. It will be understood that the next node processing procedure 300, and any other process set forth herein, may be applied to a single level KStore such as the interlocking trees datastore 250 as well as any level of a multi-level KStore and to any other KStore.

[0036] Furthermore, the next node particle processing procedure 300 may be used in building new structure when it determines that a new sequence has arrived. For example, the particle next node procedure 300 may be applied to processing the sequence C-A-T-S within the interlocking trees datastore 250 to create the interlocking trees datastore 290.

[0037] When the sequence of letter data particles required for representing the sequence C-A-T-S is streamed into the KEngine for interlocking trees datastore 250 there is already structure within the interlocking trees datastore for the sequence. This structure was previously created for the C-A-T input sequence. The next node processing procedure 300 may thus begin by traversing the existing structure starting at the BOT node and possibly incrementing the count fields in the nodes encountered during the traversal. When the letter particle C of the sequence C-A-T-S is received, the current K location node is BOT and the asCase list of the BOT root node 252 may be followed to the BOT-C node 256. Since the Result node of the BOT-C node 256 (the C root node 274) matches the input letter particle C, traversal can continue. The C root node 274 is defined as a non-adjacent node of the BOT-C node 256 since it is not on the asCase list of the BOT-C node 256. At this point the count of the BOT node 252, the BOT-C node 256, and the C root node 274 can be incremented.

[0038] When the letter particle A of the sequence C-A-T-S is received execution of the next node processing procedure 300 can proceed to block 302. The next node processing procedure 300 may then make a determination whether the Result pointer of any subcomponent node in the asCase list of the BOT-C node 256 points to the A root node 270.

[0039] The only node in the asCase list of the BOT-C node 256 is the BOT-C-A node 258. Therefore, the variable Node is set to point to the BOT-C-A node 258 in block 302. Since Node is thus not null as determined in decision 304, a determination is made in decision 310 whether the Result pointer of the BOT-C-A node 258 points to the A root node 270.

[0040] Since a match is found in this determination the count of the BOT-C-A node 258 and the A root node 270 may be incremented. The BOT-C-A node 258 is then made the current K location node in block 312. In this manner the next node processing procedure 300 can process the input stream. No new structure has been built within the interlocking trees datastore 250 thus far since the BOT-C-A node 258 of the sequence C-A-T-S was already formed when the sequence C-A-T was previously received.

[0041] When the letter particle T of the sequence C-A-T-S is received, the asCase list of the BOT-C-A node 258 can be followed to the BOT-C-A-T node 262 and the Result pointer of the BOT-C-A-T node 262 can be followed to the T root node 280 and another match is found. Again, no new structure is built within the interlocking trees datastore 250. Thus, the next node processing procedure 300 has traversed the interlocking trees datastore 250 incrementing the counts of the nodes encountered as the input particles C-A-T of the sequence C-A-T-S were received.

[0042] However, when the letter particle S is received it will be determined by the next node procedure 300 that the structure for representing the sequence C-A-T-S does not exist within the interlocking trees datastore 250. Accordingly, when block 302 is executed the BOT-C-A-T-EOT node 266 is located using the asCase list of the BOT-C-A-T node 262 and the BOT-C-A-T-EOT node 266 is assigned to the variable Node in block 302. Execution of the next node processing procedure 300 proceeds to decision 310 by way of decision 304 since the BOT-C-A-T-EOT node 266 is not null. When the determination of decision 310 is made execution returns to block 302, since there is no match between the received letter particle S and the EOT node 282 indicated by the Result pointer of the BOT-C-A-T-EOT node 266.

[0043] Since there are no more nodes in the asCase list of the BOT-C-A-T node 262 the determination whether Node is null is affirmative the next time decision 304 is encountered. It will be understood that the affirmative determination in decision 304 indicates that the interlocking trees datastore 250 does not include a K path representing the sequence C-A-T-S. Therefore, the sequence C-A-T-S has not been previously entered into the interlocking trees datastore 250, and it may be determined that the sequence C-A-T-S is a new sequence at this point.
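The walk through the sequence C-A-T-S set forth above might be sketched as follows. The sketch is illustrative only. The KNode class, the next_node function and the roots dictionary are hypothetical names, and the structure for the previously received sequence C-A-T is built by hand rather than by a KEngine. The null value of the variable Node is modeled by None.

```python
class KNode:
    """Hypothetical minimal K node; linking to a Case node updates its asCase list."""

    def __init__(self, label, case=None, result=None):
        self.label = label
        self.case = case
        self.result = result
        self.as_case = []
        self.count = 0
        if case is not None:
            case.as_case.append(self)


def next_node(current, root):
    """Return the matching asCase node, or None when a new sequence is detected."""
    candidates = iter(current.as_case)
    while True:
        node = next(candidates, None)    # block 302: obtain next node, if any
        if node is None:                 # decision 304: Node is null
            return None
        if node.result is root:          # decision 310: Result pointer matches
            node.count += 1
            root.count += 1
            return node                  # block 312: becomes current K location


# Existing structure for the previously received sequence C-A-T.
roots = {name: KNode(name) for name in ["C", "A", "T", "S", "EOT"]}
bot = KNode("BOT")
node = bot
for name in ["C", "A", "T", "EOT"]:
    node = KNode(node.label + "-" + name, case=node, result=roots[name])

# Stream the particles of C-A-T-S through the existing structure.
current = bot
new_sequence_at = None
for particle in "CATS":
    found = next_node(current, roots[particle])
    if found is None:
        new_sequence_at = (current.label, particle)
        break
    current = found
```

In this sketch the only signal that C-A-T-S is a new sequence is the exhaustion of the asCase list of the BOT-C-A-T node, which is the same event that ends ordinary traversal. No separate comparison is performed.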

[0044] It will thus be appreciated by those skilled in the art that using a procedure such as the next node processing procedure 300 the system and method of the present invention may input sequences, and inherently and simultaneously determine the occurrence of a new sequence as part of the normal process of receiving the sequences. There is no need to perform any separate comparisons or any other operations in addition to the processing of the input stream in order to detect the need for new structure since the detection of need for new structure is inherent in decision 304 as part of the process.

[0045] In response to the determination in decision 304 that Node is null and that a new sequence has been encountered or captured, a determination can be made in block 306 whether a new node should be built in the interlocking trees datastore 250. Additionally, a determination can be made whether the occurrence of the new sequence should be reported to the user, administrator, log, etc. of the system and method of the present invention in block 308.

[0046] The determinations of blocks 306, 308 may be made according to predetermined rules or administrative guidelines or any other set of protocols. The parameters of the guidelines or protocols can be communicated to the next node processing procedure 300 by a calling procedure, a GUI or any other source. The rules or administrative guidelines can be set up by a user or administrator or any other party. For example, the user or administrator or other party can adapt the system and method of the invention to ignore the detection of new structure made in decision 304 and do nothing. Under these guidelines the occurrence of the new sequence can be treated substantially as noise.

[0047] Another possibility is to adapt the next node processing procedure 300 to build no new structure when a new sequence is detected or captured, but to report the occurrence to the user or administrator. The reporting can be done through the KEngine 14 or it can be performed by passing information to a calling procedure such as the learn engine 26 or the API utilities 23. Additionally, reporting can be done by way of an email. The reports can be used, for example, to trigger events such as the starting of another process, a new field event or a new record event, to set flags, to log messages, to send information or reports to a graphical user interface (GUI) 36, to send emails to a list of recipients or to perform any other operations. Additionally, the GUI 36 can report the occurrence to the user or administrator and permit the user to determine how to handle it. If a new node is built in view of the determination of block 306 the guidelines may or may not require providing a report.
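One way the determinations of blocks 306 and 308 might be parameterized is sketched below. The NewSequencePolicy class, the on_new_sequence function and the build callback are hypothetical names introduced for illustration. The report here is a plain string passed to a callable, whereas the specification contemplates emails, flags, log messages and GUI notifications.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class NewSequencePolicy:
    """Hypothetical guideline set supplied by a caller, a GUI or an administrator."""
    build_node: bool = True
    report: Optional[Callable[[str], None]] = None


def on_new_sequence(policy, current_label, particle, build):
    """Apply the policy once decision 304 has detected a new sequence."""
    built = None
    if policy.build_node:                  # block 306: build new structure?
        built = build()
    if policy.report is not None:          # block 308: report the occurrence?
        policy.report(
            "new sequence after " + current_label + ": particle " + particle
        )
    return built


log = []
treat_as_noise = NewSequencePolicy(build_node=False)
report_only = NewSequencePolicy(build_node=False, report=log.append)
build_and_report = NewSequencePolicy(build_node=True, report=log.append)

noise_result = on_new_sequence(treat_as_noise, "BOT-C-A-T", "S", lambda: "node")
report_result = on_new_sequence(report_only, "BOT-C-A-T", "S", lambda: "node")
build_result = on_new_sequence(build_and_report, "BOT-C-A-T", "S", lambda: "node")
```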

[0048] When a report is made according to block 308 it may contain any information that may prove useful. For example, the value represented by the current node at the time that the new sequence was determined in decision 304 can be reported. The identity of the newly received particle can be reported. A list of possible next nodes obtained in block 302 of the particle processing procedure 300 according to the asCase list of the current node can be reported. It is also possible to anticipate the next node by making an estimate of which of the possible next nodes obtained in block 302 is the most likely next node. For example, the most likely next node may be determined by comparing the counts of the possible next nodes. The determination of the most likely next node can also be made according to how recently the possible next nodes have been accessed or according to the context in which the determination is made. In another embodiment the determination can be made based upon processing a further node/particle or nodes/particles.
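The count-based estimate of the most likely next node might be sketched as follows. The KNode class, the most_likely_next function and the count values are assumptions chosen purely for illustration. Ties are resolved in favor of the first node in the list, which is a choice of this sketch and not of the specification.

```python
class KNode:
    """Hypothetical K node carrying only a label and a count."""

    def __init__(self, label, count=0):
        self.label = label
        self.count = count


def most_likely_next(possible_next):
    """Return the possible next node with the highest count, or None if empty."""
    return max(possible_next, key=lambda node: node.count, default=None)


# Illustrative asCase list of the BOT-C-A-T node with made-up counts.
possible_next = [KNode("BOT-C-A-T-EOT", count=7), KNode("BOT-C-A-T-S", count=2)]
likely = most_likely_next(possible_next)
```

An estimate based on recency of access could be obtained in the same way by substituting a hypothetical last-accessed field for the count in the key function.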

[0049] Referring again to FIG. 2G, there is shown the interlocking trees datastore 290. The interlocking trees datastore 290 may be built as a result of the next node processing procedure 300 operating on the letter particle S of the sequence C-A-T-S in the interlocking trees datastore 250. This may occur when: i) there is an affirmative determination in decision 304, thereby indicating that a new sequence has been determined, and ii) the protocol of block 306 requires the building of a new node when such a new sequence is determined.

[0050] In order to form the BOT-C-A-T-S node 264 the Case bi-directional link may be established to the BOT-C-A-T node 262 by the routine for creating new nodes called in block 306. A Result bi-directional link may be established to the S root node 278. Thus, a branch is created in the path 265 of the interlocking trees datastore 290 at the BOT-C-A-T node 262 to provide the new K path 263.

[0051] Referring now to FIG. 4A, there is shown the exemplary node 400 of an interlocking trees datastore such as the interlocking trees datastores 250, 290. The exemplary node 400 can be an exemplary elemental root node 402 or an exemplary subcomponent node 404 or end product node 404. When a new node is built in an interlocking trees datastore, memory is allocated for the new node in the manner shown in the exemplary node 400. A plurality of pointers can then be stored in the allocated memory. The new node is defined by setting the Case pointer to point to the previous node and setting the Result pointer to point to the root node. Thus, for example, if the exemplary subcomponent node 404 represents the subcomponent node 264 of the interlocking trees datastore 290, the Case pointer 406 would point to the subcomponent node 262.

[0052] The exemplary node 400 can also include a Result pointer 408. Thus, when the exemplary node 400 represents the subcomponent node 264 of the interlocking trees datastore 290, the Result pointer 408 would point to the S root node 278.

[0053] A pointer to asCase list 410 can also be included in the exemplary node 400. The pointer to asCase list 410 is a pointer to a list of the subcomponent nodes or end product nodes for which the node represented by the exemplary node 400 is the Case node. The pointer to asResult list 412 is a pointer to a list of the subcomponent nodes or end product nodes for which the node represented by the exemplary node 400 is the Result node. The nodes of the interlocking trees datastores 250, 290 can also include one or more additional fields 414. The additional fields 414 may be used for an intensity or count associated with the node or for any number of different items associated with the structure. Another example of a parameter that can be stored in a node is the particle value for an elemental root node, which can be held in the value field 416.
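The fields of the exemplary node 400, and the bi-directional links established when the BOT-C-A-T-S node 264 is formed, might be represented as in the following sketch. The KNode dataclass, its field names and the build_node helper are hypothetical. The reference numerals in the comments indicate only which field of FIG. 4A each attribute is intended to mirror.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class KNode:
    """Hypothetical in-memory layout mirroring the exemplary node 400."""
    case: Optional["KNode"] = None                                      # Case pointer 406
    result: Optional["KNode"] = None                                    # Result pointer 408
    as_case: List["KNode"] = field(default_factory=list, repr=False)    # asCase list 410
    as_result: List["KNode"] = field(default_factory=list, repr=False)  # asResult list 412
    count: int = 0                                                      # additional field 414
    value: Optional[str] = None                                         # value field 416


def build_node(case, result):
    """Create a node and complete both bi-directional links."""
    node = KNode(case=case, result=result)   # forward pointers set first
    case.as_case.append(node)                # Case link made bi-directional
    result.as_result.append(node)            # Result link made bi-directional
    return node


s_root = KNode(value="S")       # stands in for the S root node 278
bot_c_a_t = KNode()             # stands in for the BOT-C-A-T node 262
bot_c_a_t_s = build_node(bot_c_a_t, s_root)   # stands in for node 264
```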

[0054] Referring now to FIG. 4B, there is shown the exemplary node 450. The exemplary node 450 is an alternate embodiment of the exemplary node 400. The exemplary node 450 includes the Case pointer 406, the Result pointer 408, the pointer to asCase list 410, the pointer to asResult list 412, the additional fields 414 and the value field 416 as previously described with respect to the exemplary node 400. Additionally, the exemplary node 450 includes the lock definition field 413. A lock definition within the lock definition field 413 can be set and reset in order to indicate whether the node represented by the exemplary node 450 is locked or unlocked to prevent the node from being accessed while it is being changed. Thus, the lock definition field 413 of a node should be checked prior to accessing any nodes in an interlocking trees datastore in order to prevent any attempts to access a locked node. However, in one preferred embodiment the writing of information to nodes is not permitted without checking the lock definition field 413 while the reading of nodes without checking can be permitted. In another alternate embodiment of the invention the lock definition may be stored at another location rather than within the exemplary node 400.

[0055] Referring now to FIG. 6, there is shown the node locking procedure 600 used when building a new node within the interlocking trees datastores 250, 290. As previously described, care should be taken to prevent access to a node which is being changed by, for example, another thread executing in a datastore. Therefore, nodes that are linked to new nodes may be locked while their asCase list, asResult list, count field or some other field in the node is updated in order to prevent such an access.

[0056] In the node locking procedure 600 a new node is created as shown in block 604. The new node can be created as previously described with respect to FIGS. 2A-E. Memory for the new node is allocated as shown in the exemplary node 400 and the fields of the exemplary node 450 are populated. Block 604 also shows that the current K location node is the Case node of the newly created node. Therefore, the location of the current K location node can be stored in the Case pointer field 406 of the new node. A pointer to the location of the root node is stored in the Result pointer field 408. No locking is required while this new node is being constructed.

[0057] Additionally, the asCase list of the current K location node may be updated to include the newly created node as shown in block 608. Therefore, the current K location node may be locked while its asCase list is updated and unlocked after the asCase list is updated. In a preferred embodiment all list updating operations are performed with the node being locked and immediately updated and unlocked in this manner in order to minimize the lock time.

[0058] Furthermore, no other nodes are locked by this thread during the period that the current node is locked. Thus, it is an important feature of the present invention that only one node at a time is locked by a particular thread, as described in more detail below. In a preferred embodiment, the current K location node may be locked and unlocked by setting and resetting the lock definition in the lock definition field 413 of the exemplary node 400 of the current K location node at the beginning and at the end of the operation of block 608.

[0059] Additionally, the asResult list of the root node may be updated by the node locking procedure 600 to include the newly created node. Therefore, as shown in block 612 the root node is locked and its asResult list is updated. The root node may then be unlocked. In a preferred embodiment of the invention the root node is locked by the node locking procedure 600 only during the time it takes to update its asResult list.

[0060] During the time that the pointers of the exemplary node 400 for the new node are being established there are no pointers pointing to the new node. Therefore, no nodes can access the new node and there is no need to lock the new node. Furthermore, there is no need to lock either the Case node or the Result node while the new node is being established since their data is not changing during this period.

[0061] When the pointers of the new node are established the asCase list of the Case node and the asResult list of the root node may be updated one at a time so that only one or the other of the two is locked. Therefore, a very important feature of the system and method of the present invention is that only one node is locked at a time for this thread. All of the remaining nodes in the interlocking trees datastores 250, 290 may remain unlocked. This reduces the scope of the node locking to the minimum amount possible and permits faster operation of the interlocking trees datastore 250, 290.

[0062] Referring now to FIG. 7, there is shown the node locking procedure 700 for preventing access to nodes which are involved in the construction of a new node in an interlocking trees datastore 250, 290. Within the node locking procedure 700 the current node is locked by a thread as shown in block 702 by setting the lock flag in its lock definition field 413.

[0063] A determination is then made in decision 706 whether the new node has been added prior to the locking operations of block 702. This determination should be made in the node locking procedure 700 since it is possible for a thread other than the instant thread to begin building the new node between the time that the instant thread determines that the new node must be built and the time that the current node is actually locked according to block 702. The determination of decision 706 can be made by checking the asCase list of the current node.

[0064] If the new node has not been added during the foregoing time period as determined in decision 706 of the node locking procedure 700, the new node may be created as shown in block 710. Since the current node was locked in block 702 a pointer to the new node can be added to the asCase list of the current node at this time as shown in block 714. Regardless of whether a new node is created in block 710 the current node is unlocked as shown in block 716.

[0065] If a new node was created in block 710 as determined in decision 718, the root node for the new node is locked as shown in block 720. A pointer to the new node is added to the asResult list of the root node as shown in block 722. The root node can then be unlocked as shown in block 724. Thus, within the node locking procedure 700 only one node is locked at a time.
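The node locking procedure 700 might be sketched as follows, assuming that a threading.Lock held in each node stands in for the lock definition field 413. The KNode class and the add_child function are hypothetical names. The sketch holds at most one lock at a time in a given thread and repeats the asCase check after the current node is locked, as in decision 706.

```python
import threading


class KNode:
    """Hypothetical K node with a per-node lock standing in for field 413."""

    def __init__(self, label, case=None, result=None):
        self.label = label
        self.case = case
        self.result = result
        self.as_case = []
        self.as_result = []
        self.lock = threading.Lock()


def add_child(current, root):
    """Return the child of current whose Result node is root, building it if absent."""
    created = None
    with current.lock:                                        # block 702
        existing = next(
            (n for n in current.as_case if n.result is root), None
        )                                                     # decision 706
        if existing is None:
            created = KNode(
                current.label + "-" + root.label, case=current, result=root
            )                                                 # block 710
            current.as_case.append(created)                   # block 714
    # block 716: the current node is unlocked on leaving the with statement
    if created is None:                                       # decision 718
        return existing
    with root.lock:                                           # block 720
        root.as_result.append(created)                        # block 722
    return created                                            # block 724: root unlocked


s_root = KNode("S")
bot_c_a_t = KNode("BOT-C-A-T")
results = []
threads = [
    threading.Thread(target=lambda: results.append(add_child(bot_c_a_t, s_root)))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the check of decision 706 is made while the current node is locked, the eight threads in this sketch produce a single BOT-C-A-T-S node even though each of them requests that the node be built.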

[0066] While the invention has been described in detail and with reference to specific examples thereof, it will be apparent to one skilled in the art that various changes and modifications can be made therein without departing from the spirit and scope thereof.

* * * * *