U.S. patent application number 11/420451 was filed with the patent office on 2006-11-30 for database caching and invalidation using database provided facilities for query dependency analysis.
This patent application is currently assigned to TERRACOTTA, INC.. Invention is credited to Andrew R. Geweke, Nathaniel D. Harward, Alexander Voskoboynik.
Application Number | 20060271510 11/420451 |
Document ID | / |
Family ID | 37452936 |
Filed Date | 2006-11-30 |
United States Patent
Application |
20060271510 |
Kind Code |
A1 |
Harward; Nathaniel D. ; et
al. |
November 30, 2006 |
Database Caching and Invalidation using Database Provided
Facilities for Query Dependency Analysis
Abstract
Database data is maintained reliably and invalidated based on
actual changes to data in the database. Updates or changes to data
are detected without parsing queries submitted to the database. The
dependencies of a query can be determined by submitting a version
of the received query to the database through a native facility
provided by the database to analyze how query structures are
processed. The caching system can access the results of the
facility to determine the tables, rows, or other partitions of data
a received query is dependent upon or modifies. An abstracted form
of the query can be cached with an indication of the tables, rows,
etc. that queries of that structure access or modify. The tables a
write or update query modifies can be cached with a time of last
modification. When a query is received for which the results are
cached, the system can readily determine dependency information for
the query, the last time the dependencies were modified, and
compare this time with the time indicated for when the cached
results were retrieved. By passing versions of write queries to the
database, updates to the database can be detected.
Inventors: |
Harward; Nathaniel D.; (San
Francisco, CA) ; Geweke; Andrew R.; (Berkeley,
CA) ; Voskoboynik; Alexander; (Foster City,
CA) |
Correspondence
Address: |
VIERRA MAGEN MARCUS & DENIRO LLP
575 MARKET STREET SUITE 2500
SAN FRANCISCO
CA
94105
US
|
Assignee: |
TERRACOTTA, INC.
650 Townsend Street Suite 325
San Francisco
CA
|
Family ID: |
37452936 |
Appl. No.: |
11/420451 |
Filed: |
May 25, 2006 |
Related U.S. Patent Documents
|
|
|
|
|
|
Application
Number |
Filing Date |
Patent Number |
|
|
60684610 |
May 25, 2005 |
|
|
|
Current U.S.
Class: |
1/1 ;
707/999.002 |
Current CPC
Class: |
G06F 9/526 20130101;
G06F 9/445 20130101; G06F 9/544 20130101; G06F 11/1479 20130101;
G06F 16/217 20190101; G06F 9/45504 20130101; G06F 9/52 20130101;
G06F 16/24552 20190101; G06F 2209/522 20130101; G06F 11/203
20130101 |
Class at
Publication: |
707/002 |
International
Class: |
G06F 17/30 20060101
G06F017/30 |
Claims
1. A method of caching database query results, comprising:
receiving a request for information from a database; determining if
previously received results for said request are maintained
locally; if previously received results are not maintained locally
for said request: passing a form of said request to said database
by calling a native analysis facility at said database to assess
how said database processes requests of said form; accessing
results of said assessment by said native analysis facility;
determining from said results one or more dependencies of said
database for requests of said form; maintaining data indicating
that requests of said form for said database include said one or
more dependencies; passing said request to said database and
receiving results; and maintaining information including said
request and said results of said request.
2. The method of claim 1, wherein: maintaining said information
includes maintaining an indication of a time said results were
determined.
3. The method of claim 2, further comprising: if previously
received results are being maintained for said request: accessing
said data to determine said one or more dependencies; accessing
information indicating a time of last update to said one or more
dependencies; determining whether said previously received results
are valid by comparing said indication of said time said results
were determined with said time of last update to said one or more
dependencies; returning said results in response to said request if
said previously received results are valid.
4. The method of claim 3, further comprising: if said previously
received results are determined to be invalid: passing said request
to said database and receiving updated results; updating said
information maintained for said request to indicate said updated
results.
5. The method of claim 1, wherein: said native analysis facility is
"EXPLAIN PLAN."
6. The method of claim 1, wherein: said native analysis facility is
a debugging facility provided by said database.
7. The method of claim 1, wherein: said native analysis facility is
a performance analysis facility provided by said database.
8. The method of claim 1, wherein: said one or more dependencies of
said database for requests of said form are one or more tables of
said database upon which requests of said form depend for
results.
9. The method of claim 1, wherein: said one or more dependencies of
said database for requests of said form are one or more rows of one
or more tables of said database upon which requests of said form
depend for results.
10. The method of claim 1, further comprising, if previously
received results are not being maintained for said request:
abstracting said request by removing any literal expressions
included therein to generate said form of said request; wherein
said request is a database query and said form of said request is a
skeleton of said database query.
11. The method of claim 1, wherein: receiving said request includes
receiving said request at a first caching system; passing said form
of said request includes passing said form of said request from a
second caching system to said database after receiving at least one
of said request and said request at said second caching system from
said first caching system; and maintaining said data and
maintaining said information include maintaining said data and said
information at said second caching system.
12. The method of claim 11, wherein: maintaining said data and
maintaining said information include maintaining said data and
information at said first caching system and said second caching
system.
13. A database caching system, comprising: a database including a
native analysis facility to analyze query structures; a caching
intermediary in communication with said database and an application
accessing said database, said caching intermediary maintains one or
more queries and results received from said database in response to
said one or more queries, said caching intermediary receives a
first query from said application and determines if said first
query and previously received results for said first query are
maintained by said caching intermediary, said caching intermediary
passes a form of said first query to said native analysis facility
at said database if said caching intermediary is not maintaining
results for said first query and accesses results of said analysis
facility to determine one or more tables of said database upon
which queries of said form depend, said caching intermediary
further maintains said form of said first query and an indication
of said one of more tables of said database upon which queries of
said form depend.
14. The database caching system of claim 13, wherein: said caching
intermediary maintains an indication of dependencies from said
database and a time of last modification of each of said
dependencies; said caching intermediary receives a second query
from said application and after determining that said second query
is not maintained at said caching intermediary, passes a form of
said second query to said native analysis facility and accesses
results of said analysis facility to determine if said second query
reads or modifies data at said database, said caching intermediary
determines from said results one or more tables of said database
modified by queries of said form of said second query if said
second query updates data at said database, said caching
intermediary passes said second query to said database and receives
results in return, said caching intermediary maintains said second
query with an indication of said results of said second query and
also maintains an indication of said one or more tables of said
database modified by queries of said form of said second query with
an indication of a time of most recent modification to said one or
more tables.
15. The database caching system of claim 14, wherein: said caching
intermediary maintains said form of said second query with an
indication of said one or more tables modified by queries of said
form of said second query.
16. The database caching system of claim 13, wherein: said form of
said first query is a structure of said first query generated by
removing any literal expressions from said first query.
17. The database caching system of claim 13, wherein: said caching
intermediary includes a first caching intermediary at a first
processing system and a second caching intermediary at a second
processing system.
18. The database caching system of claim 17, wherein: said caching
intermediary further includes one or more additional caching
intermediaries; said second caching intermediary receives queries
received at said first caching intermediary and said one or more
additional caching intermediaries.
19. One or more processor readable storage devices having processor
readable code embodied on said one or more processor readable
storage devices, said processor readable code for programming one
or more processors to perform a method comprising: receiving a
request for information from a database; determining if previously
received results for said request are maintained locally; and if
previously received results are not maintained locally for said
request: passing a form of said request to said database by calling
a native analysis facility at said database to assess how said
database processes requests of said form; accessing results of said
assessment by said native analysis facility; determining from said
results one or more dependencies of said database for requests of
said form; maintaining data indicating that requests of said form
for said database include said one or more dependencies; passing
said request to said database and receiving results; and
maintaining information including said request and said results of
said request.
Description
PRIORITY CLAIM
[0001] This application claims the benefit of U.S. Provisional
Patent Application No. 60/684,610, filed May 25, 2005, entitled
"Terracotta Virtualization Server", and incorporated by reference
herein in its entirety.
CROSS-REFERENCE TO RELATED APPLICATIONS
[0002] The following applications are cross-referenced and
incorporated by reference herein in their entirety:
[0003] U.S. patent application Ser. No. ______, filed concurrently,
entitled "Database Caching and Invalidation Based on Detected
Database Updates," by Harward et al., filed concurrently (Attorney
Docket No. TERA-01008US0); and
[0004] U.S. patent application Ser. No. ______, filed concurrently,
entitled "Database Caching and Invalidation for Stored Procedures,"
by Harward et al., filed concurrently (Attorney Docket No.
TERA-01009US0).
BACKGROUND OF THE INVENTION
[0005] 1. Field of the Invention
[0006] The present invention relates generally to databases and
caching systems for databases.
[0007] 2. Description of the Related Art
[0008] Databases are an integral part of many operations,
businesses and organizations where access to structured data is
needed. Databases offer manageability for the vast amounts of data
that many of today's large scale enterprises rely upon in day to
day operations. Through the use of database storage, seemingly
incomprehensible amounts of data can be maintained in a structured
environment so that requesting users can easily locate and retrieve
needed information. Many of today's large-scale applications such
as e-commerce, banking, etc. are able to present end-users with
seemingly instant access to personalized information by using
databases. Without database storage in many of these environments,
the task of methodically sorting through the amount of information
necessary to access that which is needed would likely frustrate the
ultimate purpose of the application to the point of ending many
operations.
[0009] Databases can be implemented in many languages and may
support many protocols such as Structured Query Language (SQL)
which provides users and applications the ability to interface with
the database through SQL statements to select, insert, update, and
delete information in the database. Of course, the complexity of
operations performed by and submitted to databases continues to
expand. SQL statements can be combined to provide a high level of
logic and interaction with data in a database.
[0010] As the amount of data maintained by databases continues to
expand, the efficiency and performance of database access and
response becomes ever more crucial. For instance, a database
maintaining banking records for users across the country or world
must be able to efficiently respond to requests from numerous users
accessing the database at the same time from various remote
geographic locations.
[0011] Database caching techniques have been developed to decrease
the amount of aggregate traffic transmitted between the database
and requesting client devices. By caching database information in
smaller faster memories, performance can be increased. Certain
access requests will be satisfied from local cached versions of the
database. This will not only increase the performance as seen by
requesting applications or entities, but also decrease the amount
of load placed on the database.
[0012] An inherent difficulty with database caching is the
structured nature of both the data storage and retrieval. In
traditional data caching scenarios, such as caching recently
accessed data from a large memory volume that is indexed according
to a physical or logical address, discrete data segments can be
cached in a smaller memory and referenced according to their
location on the large memory volume. The cache simply looks for an
update to the location on the memory volume corresponding to the
cached data and when the data at that location changes, the cache
can simply invalidate or discard its cached results. In databases,
however, data is retrieved by passing queries such as SQL
statements. These statements can access, combine, and manipulate
data from different locations to satisfy the query request.
[0013] Because of the structured nature of databases, database
caches relying on these traditional caching techniques are required
to store a subset of an entire database's data structures. Data is
cached based on memory locations and entries invalidated based on
updates to those memory locations. These systems do not cache
results based on an actual query. Because queries may combine or
access data from various locations in the database, and request
data based on logic that is dependent upon the data stored therein,
it is difficult to accurately determine when the previously
retrieved results for a query are no longer valid because of
changes to data in the database. Complex analysis to parse a
received query is thus required to determine what the query does at
the database. While in theory such a technique is useful, in
practice it presents obstacles. Databases are typically written in
a proprietary format and the underlying source code is not made
available to those wishing to implement a caching system. Because
an intimate knowledge of the database's internal operations and
code can not be had, parsing the queries is in many instances
deficient or even wrong. Moreover, individual database providers
frequently alter their databases, requiring these parsing
techniques to continually be updated as the database is
updated.
[0014] In other implementations, schedule or time-based caching has
been used for databases to overcome the difficulty with parsing
queries. Entries in a cache are made for particular queries and the
results of those queries. However, the entries are automatically
invalidated after a specified amount of time without any regard to
actual changes at the database. If changes to the database are made
between invalidation periods and a request is received, the cache
may return invalid results. Additionally, this technique can
degrade performance and efficiency by needlessly invalidating
accurate query results.
[0015] Accordingly, an improved system and technique for database
caching is needed to provide efficient, reliable, and accurate
results.
SUMMARY OF THE INVENTION
[0016] A database caching system and related techniques are
provided that can reliably maintain and invalidate database data
based on actual changes to data in the database. Updates or changes
to data at the database are detected without parsing queries
submitted to the database. The dependencies of a received query can
be determined by submitting a version of the received query to the
database through a native facility provided by the database itself
to analyze how query structures are processed at the database. The
caching system can access the results of the facility to determine
the tables or rows (or other data partition) a received query is
dependent upon or modifies. In addition to the results of a query
that can be cached with an indication of the query itself, an
abstracted form of the query can be cached with an indication of
the tables, rows, etc. that queries of that structure access or
modify. The tables a write or update query modifies can be cached
with a time of their last modification. When a query is received
for which the results are cached, the system can readily determine
dependency information for the query, the last time the
dependencies were modified, and compare this time with the time
indicated for when the cached results were retrieved. By passing
versions of write queries to the database, updates to the database
can be detected. In one embodiment, a component is implemented at
or on the system of the database to directly detect changes to the
database data. This component can monitor transactional information
maintained by the database itself to determine when changes to the
database occur. This component can communicate with the cache to
provide notification of changes to the database.
[0017] A method of caching database query results is provided in
one embodiment that includes receiving a request for information
from a database and determining if previously received results for
the request are maintained locally. If previously received results
are not maintained locally for the request, the method can include
passing a form of the request to the database by calling a native
analysis facility at the database to assess how the database
processes requests of the form, accessing results of the assessment
by the native analysis facility, determining from the results one
or more dependencies of the database for requests of the form,
maintaining data indicating that requests of the form for the
database include the one or more dependencies, passing the request
to the database and receiving results, and maintaining information
including the request and the results of the request.
[0018] In one embodiment, a database caching system is provided
that includes a database including a native analysis facility to
analyze query structures and a caching intermediary in
communication with the database and an application accessing the
database. The caching intermediary maintains one or more queries
and results received from the database in response to the one or
more queries. The caching intermediary receives a first query from
the application and determines if the first query and previously
received results for the first query are maintained by the caching
intermediary. The caching intermediary passes a form of the first
query to the native analysis facility at the database if the
caching intermediary is not maintaining results for the first query
and accesses results of the analysis facility to determine one or
more tables of the database upon which queries of the form depend.
The caching intermediary further maintains the form of the first
query and an indication of the one of more tables of the database
upon which queries of the form depend.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] FIG. 1 is a block diagram of a database caching system in
accordance with one embodiment.
[0020] FIG. 2 depicts a database cache in accordance with one
embodiment.
[0021] FIG. 3 is a block diagram of a database caching system in
accordance with one embodiment, wherein processing and caching
database queries is depicted.
[0022] FIG. 4 is a flowchart for processing and caching database
queries in accordance with one embodiment.
[0023] FIG. 5 is a block diagram of a database caching system in
accordance with one embodiment, wherein the detection of a backend
update to the database, and subsequent update to the cache is
depicted.
[0024] FIG. 6 is a flowchart for detecting updates to a database
and updating a database cache according to the update in accordance
with one embodiment.
[0025] FIG. 7 is a flowchart for monitoring updates to a database
using a listener component implemented as a debugger for at least a
portion of the database.
[0026] FIG. 8 is a block diagram of a database caching system in
accordance with one embodiment, wherein processing and caching of
database stored procedures is depicted.
[0027] FIG. 9 is a flowchart for processing and caching database
stored procedures in accordance with one embodiment.
[0028] FIG. 10 is a flowchart for caching database queries and
invalidating the results of the database queries based on row level
updates to the database.
[0029] FIG. 11 is a block diagram of a multi-level database caching
system in accordance with one embodiment.
DETAILED DESCRIPTION
Database Caching Intermediary
[0030] FIG. 1 is a block diagram depicting one embodiment of the
technology described herein executing in an application
server--database server environment. FIG. 1 depicts an application
server 102 and database server 122. A variety of applications 104
can be hosted and executed upon application server 102.
Applications 104 can be implemented in any suitable language,
including but not limited to the Java.TM. programming language,
C++, etc. The application servers on which applications 104 are
executed will vary by embodiment, but can support such platforms as
the Java.TM. 2 Platform, Enterprise Edition (J2EE) Platform.
[0031] In one embodiment, servers, processing systems, and other
processor based devices as described can include, by way of
non-limiting example, one or more processors capable of executing
instructions provided on processor-readable media to perform the
methods and tasks described herein. The processing system may
include a volatile memory, a mass storage device or other
non-volatile memory, a portable storage device, one or more
communications or network interfaces and various I/O devices. A
processing system hardware architecture as described is just one
suitable example of a processing system suitable for implementing
the present technology. While embodiments are presented with
respect to exemplary server-based implementations, variations for
stand-alone, client, and other suitable processing environments can
be made without departing from the scope of the present
disclosure.
[0032] The database server 122 hosts a database application 120.
Database server 122 includes an operating system 124 and a
non-volatile storage device 126 (such as a disk drive or RAID
array). Numerous types of databases 120 can be used in accordance
with embodiments of the disclosed technology. For example, database
120 can include a database provided by Oracle.TM. Corporation of
Sunnyvale, Calif., a database provided by Sybase.TM. Corporation of
Dublin, Calif., or a database provided by Microsoft.TM. Corporation
of Redmond, Wash. The technology described herein is not limited to
any particular type of database application. References to
particular types and brands of databases and the particular
functions of a type of database are presented for exemplary
purposes and are not intended to limit the scope of the present
disclosure or claimed subject matter. The extension of an embodied
concept from an exemplary described database to a different type of
database will be readily apparent to those of ordinary skill in the
art. It is further noted that the disclosed server based
implementations are exemplary and other implementations can be made
on client devices hosting the database and/or caching system, for
example. Moreover, multiple levels of caching in accordance with
embodiments can be provided at one or more servers in addition to
those indicated in FIG. 1. For example, a caching system
implemented at a central server could interface between the
components of database server 122 and those of multiple application
servers. Cached results can be maintained and retrieved from the
central server in addition to or in place of maintaining and
receiving them at an intermediary cache at the application server
level.
[0033] Interface 106 is provided at application server 102 to
enable applications 104 access to database 120. Interface 106 will
vary by implementation according to the type of programming
platform and/or database provided. Interface 106 is provided in
accordance with an Application Programming Interface (API) that
provides connectivity between the programming language(s) and
platforms associated with applications 104 and database 120. In
many modern database implementations, interface 106 is implemented
in accordance with the Java.TM. Database Connectivity (JDBC) API,
an industry standard interface that provides database independent
connectivity between the Java.TM. programming language and a wide
range of databases. Interface 106 is a JDBC driver in one
embodiment, implementing the JDBC API to provide a call level API
for SQL-based database access. The JDBC driver can provide
transparent access for applications 104 to database 120.
[0034] A database caching intermediary 108 is provided at
application server 102 to increase the performance, reliability and
availability of the information maintained by database 120 for
requesting applications 104. Cache 108 can provide previously
received results for queries or other requests/calls submitted to
database 120 that are maintained locally at application server 102
to increase the performance and availability of database 120 while
decreasing the response time for requests. Applications 104 request
data from database 120 by submitting queries or other calls. For
example, applications can submit commands formatted in accordance
with the specifications of interface 106 (e.g., as JDBC commands)
to provide SQL commands to database 120. Intermediary 108 can more
quickly provide the results of a query if valid results are
currently stored in the cache. Moreover, intermediary 108 can
increase database availability by providing cached results when
database 120 is inaccessible.
[0035] The structure of caching intermediary 108 may vary by
implementation according to the requirements of a particular
application. In the embodiment depicted in FIG. 1, caching
intermediary 108 includes an intermediary interface or driver 112
and a cache 110. Interface 112 can provide and execute many of the
disclosed operations while cache 110 maintains portions of the data
from database 120 and internal data used by the caching
intermediary 108 that can maintain portions of database 120 in
local memory. Interface 112 conceptually wraps itself around native
interface 106 to provide transparent intermediation between
applications 104 and database 120. Applications 104 can communicate
with interface 112 in the same manner they would with the native
database interface. To requesting applications 104, interface 112
will appear the same as the native interface 106. By wrapping
around the native interface, the intermediary interface forces
applications 104 to communicate with the intermediary, instead of
directly accessing interface 106.
[0036] In one embodiment, interface 112 is a HA-JDBC High
Availability Java.TM. Database Connectivity (HA-JDBC) driver.
Interface 112 will appear as a standard JDBC driver and thus
provide a transparent solution to applications 104 for caching
query results. Intermediary 108 can be installed and integrated
into an existing system without modifying any application source or
object code and without modifying any database source or object
code. Applications 104 continue to access database 120 in the
typical manner, appearing to interface directly with the native
database driver 106, while in fact, interfacing with the
intermediary driver 112 provided in accordance with one
embodiment.
[0037] Caching intermediary 108 provides cached results that are
maintained current and invalidated where appropriate based on
updates to database 120 or events processed by database 120.
Changed data within the database is the impetus for determining
that cached database results are no longer valid. This contrasts
with traditional database caches that operate as scheduled or
time-based caches. For example, traditional schedule-based caches
invalidate cached results based on a set schedule or time interval.
Entries stored in the cache expire or are invalidated periodically
based solely on an elapsed period of time. Invalidations occur in
response to the mere passage of time without regard to any actual
changes at the database. By using such a technique, these systems
stand not only to return invalid data to requesting applications,
but also to needlessly invalidate valid cached data and decrease
performance.
Depending Analysis Using Native Database Facilities
[0038] Intermediary 108 can detect actual changes to database 120
and invalidate and/or update entries in cache 110 based on these
updates rather than an elapsed period of time according to a
predetermined schedule. In one embodiment, changes to database 120
are detected through queries submitted by applications 104 and
received at the caching intermediary. In one embodiment, caching
intermediary 108 is able to detect changes to database 120 made in
response to queries without parsing the submitted queries. In one
embodiment, the caching intermediary 108 calls a native analysis
facility provided by and inherent to the database to analyze
queries received from requesting applications. The analysis
facility provides information regarding how a query submitted for
analysis is processed by the database. Caching intermediary 108 can
determine details of how the database processes the query,
including the tables or other data partitions the database accesses
to fulfill the requests.
[0039] It is important to note that the analysis facility called by
caching intermediary 108 is a native facility or process provided
by the database. This enables the intermediary to determine which
tables of a database are accessed in responding to a query.
Importantly, the intermediary is able to determine this information
without parsing the actual query. Databases typically provide
debugging and performance analysis facilities as native database
procedures available to end-users. These facilities are
traditionally used by database administrators to debug queries or
to increase the performance associated with a particular query. An
administrator may view how a query is processed and modify the
query structure based on this review to maximize performance.
Caching intermediary 108 can exploit one or more of these native
facilities to determine the dependencies of queries it
receives.
[0040] The use of native database facilities to determine this
information is a strong departure from traditional caching
techniques that rely on parsing queries. For example, intermediary
108 may call an "EXPLAIN PLAN" facility to determine the
dependencies of a query submitted to an Oracle.TM. database. An
abstracted or skeleton version of the query can be issued in a call
to this facility to determine how such a query is processed and
this, the tables accessed. Numerous types of native database
facilities may be used in various embodiments depending on the type
of database platform, programming language, or server platform.
[0041] The use of native database facilities to determine
dependency and other information for a query is a significant and
strong departure from traditional caching techniques that rely on
query parsing, for example. To provide cached results based on
changes to data in a database, previous techniques have relied on
parsing the actual query received from an application to determine
query dependencies, etc. These systems include an inherent
limitation that it is difficult if not impossible to precisely
parse a query without knowledge of the relevant database source
code. Any cache provided by an entity without intimate knowledge of
the database code will not be able to fully understand and parse
queries designed for that database. The technology described herein
avoids these inherent limitations by relying on the database itself
to assess and provide the necessary information for caching to the
intermediary.
[0042] FIG. 2 depicts an exemplary implementation of intermediary
cache 110 in accordance with one embodiment. In FIG. 2, cache 110
includes three individual tables to facilitate caching and cache
invalidation. A results table 204 maintains a list 210 of queries
and the results 212 of those queries as retrieved from the
database. Table 204 further includes an indication 214 of the time
at which the query was processed and the results obtained from the
database. In one embodiment, the results are time-stamped and the
third column is not used.
[0043] Dependency table 206 maintains a list of query structures,
also referred to as skeletons or footprints, along with a list of
the tables from database 120 upon which queries of that structure
are dependent. A query skeleton is a form of an actual query
without any variable data or literal expressions. Literal
expressions refer to the individual variables within a query for
which input data is passed to the database to retrieve unique
records. For example, a simple query (shown as entry 216 in table
204) that selects a record from an employee table where the user_id
column is 12 will have a skeleton as shown by entry 222 in table
206. The skeleton shown is a form of the original query with the
user_id literal expression "12" replaced with ":v." All queries
having the same skeleton will be processed in a similar manner and
retrieve results from the same tables in the database. Any query
having the structure of entry 222 is dependent upon the employee
table. Replacing the literal expressions with ":v" is but one
example of abstracting a query. Any suitable place holder can be
used and in one embodiment, the literal sections are simply "marked
off" using some external method (e.g., recording the indices of the
start and end characters of each literal section). Modification
table 208 tracks modifications or updates to the tables in database
120. Table 208 lists table from database 102 and the time at which
the table was last updated. Using the tables of cache 110, the
caching intermediary can determine if valid results are cached for
a particular query. If results are listed in table 204 for a
received query, table 206 is accessed to determine what tables the
query is dependent upon. Table 208 is then accessed to determine
when those tables were last modified. If the time of the last
modification is before the time the results were obtained (table
204), the results are determined to be valid.
[0044] A listener component 128 is also provided to detect changes
to data maintained by database 120. This component provides the
additional benefit of being able to detect so-called backend
updates to database 120 that do not pass through intermediary 108.
For example, database 120 can be modified by a database
administrator or other entity not in communication with
intermediary 108. Listener component 128 can interface directly
with the database at the database server to detect any backdoor
updates that originate from beyond the reach or control of the
caching intermediary.
[0045] In accordance with one embodiment, invalidations of cached
database results are based on actual changes to database data.
Updates to the database trigger the invalidation of cache entries
within a caching intermediary. A more efficient cache is provided
by avoiding unnecessary invalidations of cached results, as
inherent in time-based caches. Furthermore, invalidations based on
changed data as determined from the database provides consistency
between data in the cache and data in the cache and database.
Moreover, the reliance on the database to analyze queries and
determine dependency information avoids the limitations associated
with parsing queries submitted to a database for which the
underlying code and operation is not well understood.
[0046] FIG. 3 is a simplified block diagram depicting the
communication flow of a database caching system in accordance with
one embodiment. Application 104 (e.g., Application A) at
application server 102 communicates with caching intermediary 108
to access database 120, implemented at database server 122.
Application 104 issues query 140 for execution by database 120.
Query 140 is received at caching intermediary 108. When query 140
is received at the intermediary, the interface 112 accesses
intermediary cache 110 to determine if the intermediary is
maintaining valid results for query 140, retrieved from the
database in response to a previous request including query 140.
[0047] Interface 112 accesses cache 110 as indicated at 141 and
determines whether results for query 140 are currently maintained
locally at the cache. If the results are currently stored--in
results table 204 for example--the intermediary accesses a
dependency table 206 at cache 110 to determine which tables at the
database queries of the received query's structure are dependent
upon. Table 208 is then accessed to determine the last time each
table query 140 is dependent upon was last updated. The time the
results stored in results table 204 were obtained is compared with
the last time each dependent table was updated to determine whether
the cached results were retrieved from the database after the last
update to any dependent table of the query. If the results were
retrieved after the last update the cached results are returned as
indicated at 138.
[0048] If the results were retrieved before the last update or if
no results are maintained, the database is accessed to process
query 140. Interface 112 abstracts the query by removing the
literal expressions from the query string to create a query
skeleton 142 having the structure of query 140. After abstracting
the query, intermediary interface 112 passes the skeleton 142 to
the native database interface which forwards it to database 120. In
one embodiment, interface 112 checks cache 110 first to determine
if the dependency analysis information for the abstracted query is
already stored. If so, the abstracted query is not passed.
[0049] If the query structure and dependencies of query 140 are not
already stored, interface 112 passes query structure 142 by calling
a native analysis or tuning facility 162 provided by database 120.
The term native indicates that the facility or process called at
the database is a process provided by and originating from the
database itself, not caching intermediary 108. Databases typically
provide various functionality for performance analysis, tuning, and
debugging, etc. Debugging facilities can provide information to
database administrators (DBAs) and developers about how a query
will be processed by the database. The facilities may analyze a
query structure and specify which tables are accessed to process
queries of that structure, how or in what order the tables are
accessed, how the data from the tables are combined, and other
types of information about how the database processes a query
having the structure that was passed when calling the analysis
facility. These facilities provide information to DBAs, etc. so
that queries can be restructured or modified to optimize processing
at the database. In one embodiment for example, an Oracle.TM.
database provides a debugging facility commonly referred to as
"EXPLAIN PLAN." In response to a call to the "EXPLAIN PLAN"
facility passing a selected query structure for analysis, the
database will analyze how the database processes that particular
query structure.
[0050] When database 120 receives the abstracted query skeleton
142, it will execute the facility called by the intermediary when
passing the skeleton. After analyzing skeleton 142 to determine how
queries having the particular structure are processed, intermediary
interface 112 accesses the results in order to determine the tables
upon which queries of structure 142 are dependent. Different types
of databases make these results available in different formats. For
example, database 120 may create a table, file, or other data
structure 150 with the results of its analysis and store it locally
on disk 126, for example. Interface 112 can access the results from
disk to determine the dependencies for query structure 142. An
exemplary communication 144 from intermediary 112 to listener 128
instructing the listener to read results 150 from disk 126 is
depicted in FIG. 3. After reading the data from disk 126, listener
128 provides the requested information to interface 112 as depicted
by reply 145. In other embodiments, a database may provide the data
from an analysis by such a facility directly to the requesting
application such as interface 112.
[0051] Numerous types of analysis facilities are provided by
databases that can be accessed in accordance with embodiments to
determine query structures' dependency information. Embodiments of
the technology described herein are not limited to any particular
analysis facility provided by any particular database. It is
important to note that a cache in accordance with one embodiment
can call a native database analysis facility without having any
predetermined knowledge of the query or the database protocols to
be able to parse the query itself. This enables the cache to be
implemented transparently to both requesting applications and
databases. Importantly, the cache does not parse the query received
from the requesting application. This enables the cache to
interface with databases having different protocols, platforms,
programming languages, and interfaces. Because the system calls a
native facility at the database, it does not need to fully
comprehend or be able to parse the queries it receives. The native
database facility is relied upon for the required dependency
analysis.
[0052] It is also important to note that these native analysis
facilities such as "EXPLAIN PLAN" are not intended or designed for
use in caching operations as presently described. These facilities
are typically provided for database administrators to manually
issue queries to the database to analyze the performance of the
database when executing a particular query. These tools are
valuable to database administrators because they can design and
restructure queries to maximize efficiency and performance by the
database. By taking advantage of such facilities for caching query
results, the complicated task of parsing queries is avoided while
improving performance be reducing or eliminating parsing errors.
Because of their intended application, these analysis facilities
typically provide large amounts of information in addition to
indicating the tables the database accesses in response to a
particular query structure. In one embodiment, listener 128 filters
the information received from the database in response to the query
analysis to determine the tables accessed when executing the
query.
[0053] After receiving dependency information, intermediary 108
passes the actual query 140 to database 120. Database 120 executes
the query and returns results 148. If intermediary 108 determines
that query 140 is a read query (as can be determined from the query
structure analysis), a new entry is created in a dependency table
listing skeleton 142 (if an entry does not already exist) and the
dependencies determined by the database analysis facility.
Intermediary 108 also creates an entry in results table 204 for
query 140, specifying the query 140, the results of the query
provided by database 120, and the time at which the results were
obtained. FIG. 3 conceptually depicts an update 143 issued from
interface 112 to cache 110 specifying the data to be stored in the
results and dependency tables. Processing of the query is complete
when results 148 are provided to application 104. It should be
noted that the query and skeleton can be passed to the database in
any order and the cache may determine the dependency information
after receiving the actual query results.
[0054] If intermediary 108 determines that query 140 is a write
query, it updates or creates an entry in a modification table
listing the tables effected by such a write query and the time at
which the write query modified the tables. For example, if the
write query modifies an employee table, and an entry already exists
in modification table 208 for this table, the time at which that
table is indicated to have last been modified is updated to reflect
the new modification time. After updating or creating entries in
the modification table, an indication is returned to application
104 that the query has been successfully processed.
[0055] In one embodiment, an entry is included in dependency table
206 for write query structures and the tables modified by those
write query structures. A database analyzing a write query
structure will provide information regarding the tables updated or
modified by that query structure. When a query is subsequently
received, the dependency table can be accessed to determine if
dependency information for that query's structure is already
stored. If so, the actual query can be submitted to the database
without passing the query structure to the database to determine
the dependent tables. When the results are received, the
modification table can be directly updated. In one embodiment, the
dependency table indicates whether a query is a write or read query
and thus, whether the listed tables are modified by the query or
whether the query is dependent upon those tables.
[0056] In an alternate embodiment, entries in table 204 affected by
a write query are directly invalidated instead of updating times of
last modification in table 208 that will be used later to determine
if cached results are still valid. For example, each entry from
table 204 can be read and the results of any queries that are
dependent upon an updated table immediately invalidated. In one
embodiment, the tables each query listed in the results table 204
is dependent upon is determined from dependency table 206.
Intermediary 108 invalidates any results from the results table
that depend upon an updated table. While such embodiments are
possible, efficiency may decrease if numerous entries are listed in
table 204. By maintaining a list of the last update times for each
table, better efficiency can be achieved by invalidating results
only when a subsequent query corresponding to those results is
received to avoid reading every entry from table 204 in response to
any update to the database.
[0057] FIG. 4 is a flowchart of a method for processing the
database queries and caching results thereof in accordance with one
embodiment. A caching system, such as intermediary 108 executing at
application server 102, receives a query from an application at
step 302. A results table is accessed at step 304 to determine
whether an entry exists for the received query. If results are not
being maintained for the received query, the query is abstracted at
step 306 by removing any literal expressions or variable data.
[0058] After abstracting the query, a dependency table maintained
by the caching system is accessed at step 308 to determine if an
entry exists for the structure of the abstracted query. If an entry
exists, the tables the query is dependent upon (read query) or the
tables the query modifies (write query) are determined at step 310.
After determining the tables at step 310, the unabstracted query is
passed to the database at step 312 for processing. After processing
the query, the database returns the results to the cache.
[0059] If the abstracted query structure is not stored in the
dependency table, the abstracted query skeleton is passed to the
database at step 314 by calling a native analysis facility provided
by the database. The query is passed onto the database and the
results received at step 316. The results of the abstracted query
analysis by the database are accessed at step 318. From the results
accessed at step 318 or step 310, the cache determines whether the
query was a write or read query at step 320. If the query is a read
query, an entry is created in the dependency table at step 322
setting forth the query structure and the table(s) the structure is
dependent upon. If an entry already exists as determined at step
308, step 322 is skipped. An entry is created in results table 204
at step 324 listing the actual query and the results received from
the database upon passing the query at step 312 or 316. These
results are stored with an indication of the time they were
received to facilitate subsequent data validity determinations. The
results of the query are returned to the requesting application at
step 326.
[0060] Returning to step 320, if the received query is determined
to be a write query, an entry is created in the dependency table at
step 328 setting forth the query structure and the table(s) queries
of that structure modify. Any entries in modification table 208
that correspond to tables affected by the received query are
updated at step 330 to reflect the time of the latest update by the
query. If no entry or entries exists in table 208 for the table(s)
updated by the received query, new entries are created. After
updating and/or creating the entries at step 330, a response is
returned to the requesting application at step 332 indicating that
the query has been successfully processed.
[0061] If it is determined at step 304 that an entry in results
table 204 for the received query already exists, table 206 is
accessed at step 334 to determine what tables the query is
dependent upon after abstracting the query. After determining the
dependent tables, modification table 208 is accessed at step 336 to
determine the last time each dependent table was last updated. If
any dependent table was updated after the time of retrieval for the
query results as reflected in the time stamp at the results table,
the results are determined to be invalid at step 338. The actual
query is then passed to the database at step 340 and the results of
the query received. Results table 204 is updated at step 342 to
reflect the new results for the query that were received from the
database. The cache also adjusts the time for the query in the
results table to reflect the time the new results were
retrieved.
[0062] The results of the query are returned to the requesting
application at step 344 from the cache after determining that the
query results are valid at step 336. Otherwise, the results
received at step 338 from the database are returned at step
344.
Database Listener
[0063] FIG. 2 illustrates the detection of updates to a database
directly by a caching intermediary in response to a received query
and structural analysis provided by the database for the query. It
is also possible for updates to a database to be made without a
caching intermediary detecting them. For example, an application
accessing the database from a location outside of application
server 102 may send queries that do not pass through the caching
intermediary. Other applications implemented on different
application servers with their own communication links to database
server 122 may send updates that do not pass through the
intermediary. Database administrators or other users may directly
modify the database through an administration interface provided by
the database. Such updates may be referred to generally as backend
updates.
[0064] FIG. 5 illustrates a backend update 166 to database 120
originating from an entity beyond application server 102. Update
166 is received directly at database server 120 and does not pass
through the caching intermediary 108 at application server 102.
Although a backend update is described for exemplary purposes, it
is noted that updates that pass through the caching intermediary
may also be detected through the hereinafter explained techniques,
in addition to being directly detected by the intermediary.
[0065] As understood by those of ordinary skill in the art,
databases typically maintain portions of user data in local memory
for performance considerations. Databases may aggregate updates to
the database over a period of time and write them all to disk at
once in the background when it has ample bandwidth and resources to
do so. Databases typically maintain a log or other identification
reflecting all updates to the database in order to track the
updates that have propagated to disk and those that are currently
only maintained in local memory. A query that updates one or more
tables in database 120 can be processed, and the updated results
maintained in a memory local to the database such as a software
memory or a fast volatile memory such as RAM. Subsequently received
queries that reference this data before it is written to the
database tables in non-volatile memory are processed using the
updated data in local memory. Database 120 may accumulate numerous
updates to the database from various queries before writing updated
data to the tables or other data structures stored on disk 126 or
other non-volatile memory.
[0066] The log maintained by a database, often referred to as a
transaction log, is continuously updated and written to a
non-volatile memory such as disk 126. If power loss occurs, causing
data maintained in local memory at the database to be lost, the
transaction log enables the database to "replay" any lost
transactions to recover data not written to disk before the power
loss. Upon reboot, the database can read the transaction log,
compare it with the data stored on disk, and if any updates
reflected in the log have not yet propagated to disk, update the
appropriate tables on disk in accordance with the information in
the transaction log.
[0067] In FIG. 5, a transaction log 152 is maintained by database
120 and stored on disk 126. Listener 128 can utilize this
transaction log to detect all changes to database 120 regardless of
origin. When database 120 receives a query, such as query 166,
specifying an update to one or more tables, the database calls the
operating system to write information relating to the query to the
transaction log 152 at disk 126. Database 120 causes the
transaction log to be updated, in addition to making the changes
specified by the query in local memory. Listener 128 can access the
updated transaction log 152 on disk 126 to determine if changes
were made to database data. When listener 128 detects an update to
the database, it notifies caching intermediary 108 that an update
has been made as shown by communication 188. The intermediary can
update a modification table to reflect the time at which the tables
corresponding to any previously listed entries were updated, and to
add new entries and modification times for any updated tables not
already listed. In one embodiment where cached results are
invalidated immediately upon detecting an update and a table such
as table 208 is not used, intermediary 108 can update a results
table 204 by invalidating or deleting those entries dependent upon
an updated table.
[0068] FIG. 6 is a flowchart depicting one embodiment for updating
a cache based on detecting changes to data at the database. In FIG.
6, a database server-side component monitors changes to a database
transaction log to detect and report changes to database data to a
caching intermediary remote from the database server.
[0069] A transaction log or other indication of transactional
updates to the database is accessed at step 352. A server-side
listener component 128 reads the transaction log and at step 354
compares the most recent data from the transaction log with data
read from a previous version of the transaction log. If the
listener detects changes to the transaction log, it makes a
determination at step 356 that the database data has been modified.
If no changes to the transaction log are detected, the listener
determines that the database data has not been modified. In one
embodiment, selected tables of the database are monitored while
others are ignored. Listener 128 can be configured to determine
that a change to monitored database data has occurred when a change
to the transaction by reflecting an update to one of the monitored
tables if detected. In one embodiment, listener 128 detects all
changes to the log and database but only reports changes to
specified tables.
[0070] When an update to database data, or an update to selectively
monitored data is detected, the listener formulates an update to be
sent to the caching intermediary at the application server. An
update is sent to the caching intermediary at the application
server at step 358. The intermediary will update its local cache
results and tables at step 360 to reflect the changes to the
database. For example, an indication of the time a table was last
modified as maintained in a modification table 208 can be updated
to reflect the time of this latest update. After the local cache is
updated, the caching intermediary can optionally send a reply at
step 362 to the listener indicating that the necessary updates have
been completed. After receiving the success response at step 362 or
determining that no monitored tables have been updated at step 356,
the listener will read the transaction log again at step 352 to
determine if more updates have been made. In one embodiment, a
delay is implemented before accessing the transaction log again at
step 352. Additionally, the listener may not wait for a success
response from the cache before accessing the transaction log again
after sending an update at step 358.
[0071] In one exemplary embodiment, listener 128 is configured as a
debugger for a native facility or process associated with database
120. As a debugger for a process executed by the database, listener
128, via an operating system 124, can gain access to and control
operations performed by database 120.
[0072] Databases typically provide a facility or process that is
responsible for writing data to transaction log 152 in accordance
with updates to the database. This type of facility is commonly
referred to as a log writer process for the database. By disguising
itself as a debugger for a database's log writer process, listener
128 can be notified when transaction log 152 is updated, access the
transaction log or information stored therein, and notify the cache
of any relevant updates.
[0073] Referring again to FIG. 5, the exemplary embodiment
presented therein includes a listener 128 configured as a debugger
for a native log writer process 154 of database 120. Listener 128
attaches itself to the database through log writer 154 in such a
manner to be notified of successful write operations to disk 126
that update or create transaction log 152.
[0074] Database 120 receives query 166, specifying an update to one
or more tables of the database. The database initially updates any
data it is storing in local memory to reflect the changes made by
query 166. Database 120 also updates any copy of transaction log
152 it is maintaining in local memory to reflect the updates.
[0075] Log writer process 154 is responsible for updating and
writing transaction log 152 to disk 126 when updates to the
database are made. Database 120, via logwriter process 154, sends a
write command 182 to update the file for transaction log 152 stored
on disk 126 in order to reflect the updates made by query 166.
Write command 182 is received by operating system 124, which
forwards the write command 182 to disk 126, typically via a disk
driver (file system) for the disk. When disk 126 has successfully
written the data for transaction log 152, it sends a response 184
to operating system 124 indicating that the data was successfully
written.
[0076] As a debugger for log writer process 154, listener 128
instructs operating system 124 to notify it when updates are made
to transaction log 152 by log writer process 154. In FIG. 5,
listener 128 is configured to be notified when operating system 124
receives a write request for transaction log 152 from log writer
process 154. When operating system 124 receives write request 182,
it notifies listener 128 at 186 that a write request for the
transaction log has been received. Operating system 124 passes the
new or updated data written to transaction log 152 in response to
update 166 to listener 128, as indicated at 186. In another
embodiment, the operating system passes the full transaction log to
listener 128. Listener 128 parses the information received from
operating system 124 to determine the nature of the database
changes. If the listener determines that the updates made by query
166 effect one or more tables of database 120 that it is configured
to monitor, a notification 188 is provided to caching intermediary
108 so that it can invalidate any query results that are dependent
on the updated table(s). In one embodiment, the intermediary
updates information maintained locally to indicate the time at
which the tables were modified, by modifying or creating new
entries in a modification table such as table 208. In one
embodiment, listener 128 is configured to be notified when a write
operation for transaction log 152 has been successfully completed.
When operating system 124 receives the success response from disk
126, it can notify listener 128 at 186 that a successful write
operation to the transaction log has been completed.
[0077] When operating system 124 receives success response 184 from
disk 126, it will also reply to database 120 with success response
190 informing it that the transaction log was successfully updated.
Log writer process 154 receives response 190 and can issue a
response to the requesting application that its update was
successfully processed.
[0078] It is possible in a configuration like that of FIG. 5 that
an update from listener 128 to intermediary 108 may not be
processed in sufficient time to avoid returning old or no longer
valid data to requesting applications. Typically, databases will
temporarily stop processing updates to database data while the
transaction log is writing the log to disk. Databases my further
refrain from responding to read queries with updated data until
that data has been successfully written to the log file maintained
in non-volatile memory. This avoids data inconsistencies should the
data in local memory be lost after it has been provided in response
to a read request but before it is written to non-volatile memory.
Log writer process 154 can pause database 120 from further write
and/or read query processing while it is writing the transaction
log. When success response 190 is received, log writer process 154
knows the updated data for the database is now maintained in
non-volatile memory. Accordingly, the log writer process will free
the database 120 to continue processing queries.
[0079] Consider a situation where listener 128 and database 120
receive responses 186 and 190, respectively, at the same time. Log
writer process 154 will process success response 190 and then
release the database to use and respond with the newly updated data
as well as process new queries. When intermediary 108 receives
update 188 from listener 128, it updates its cached data as
required to invalidate any effected result entries in results table
204. If the intermediary updates its records after database 120 is
released to begin responding and using the new data from query 166,
intermediary 108 may return old data in response to a query.
Imagine that a read query is received by intermediary 108 after
database 120 is released to use and respond with the new data from
query 166, but before the intermediary updates its records to
reflect that the relevant old entries are no longer valid. The
intermediary may return the cached results from its local memory to
the requesting application since it is unaware of the changes made
to the database by query 150.
[0080] FIG. 7 is a flowchart of a method in accordance with one
embodiment for detecting changes to a database using a listener
configured as a debugger for a native database facility responsible
for updating transactional information to document changes to the
database. In the embodiment of FIG. 7, a technique is implemented
to ensure that a caching intermediary is updated in response to
changes to the database before the database is released to begin
using and responding to requests with the new data. The listener
presents itself as a debugger to the operating system at the
database server. As understood by those of ordinary skill,
debuggers can gain access and control over select portions of code,
applications, or entire systems. By implementing the listener as a
debugger for a portion of the database, the listener can take
control of that portion to receive information regarding updates at
the database from the database itself. Because the listener is
disguised as a debugger, the operating system is instructed to
notify the listener of select actions and allow the listener to
take control of select actions to receive this updated information.
The listener can thus take control of a portion of the database to
ensure that the caching intermediary is updated before the database
begins using updated data. While such an implementation is broadly
and generally described as a debugger, it will be understood that
the listener can gain access and control of portions of the
database using other suitable techniques, such as through various
instrumentations of the database source or object code.
[0081] The database receives an update to data stored in the
database (e.g., write query) at step 370. The database updates its
local memory at step 372 to reflect the changes made by the write
query. The database updates its transaction log in local memory at
step 374 to reflect the updates made by the query. The database
sends a command to the operating system at step 376 to write the
new information to the transaction log stored on a local
non-volatile storage medium disk. In one embodiment, a log writer
process is responsible for writing the transaction log to disk and
issues a write command to the operating system at step 376.
[0082] At step 378, the operating system issues a write command to
update the transaction log at the disk responsible for storing the
transaction log. The write command, which includes the new
information from the query, is received at the listener component
at step 378. The listener component is configured as a debugger for
the log writer process so that the operating system forwards the
write request to it for processing. The listener presents itself as
a debugger in order to gain access and control of the log writer
process to receive the transaction log updates. The listener
forwards the write request to the disk at step 382 and receives a
response that the data was successfully written at step 380.
[0083] The listener is further configured to pause the log writer
process at step 834 when a success response is received from the
disk indicating an update to the transaction log. In one
embodiment, listener 128 pauses the log writer process by
instructing the operating system not to return the success response
to the database. The listener temporarily pauses the log writer
process to insure that the caching intermediary at the application
server has updated its local cache results updated in accordance
with the write or update query. Database 120 will not use the new
data provided by the query until it receives a notification that
the transaction log has successfully been written to disk.
[0084] The listener parses the information received from the
operating system at step 386 after the log writer process has been
paused. The listener determines the tables modified by the query.
If tables the listener was configured to monitor were updated, an
update is sent at step 388 to the caching intermediary at the
application server. At step 390, the caching intermediary updates
its locally stored information (e.g., modification table 208) to
reflect the time at which the tables were modified by the
query.
[0085] After the caching intermediary successfully updates its
data, it provides a success response to the listener at step 392.
Upon receipt of the response, the listener instructs the operating
system at step 394 to release the log writer process. At step 396,
a success response indicating that the transaction log was
successfully written is issued to the log writer process. The log
writer process releases database 120 to use the new data from the
query at step 398. In this manner, the database will only use new
data after that data has propagated to the caching intermediary.
Thus, each cache can remain in a consistent and accurate data state
with respect to the actual database.
[0086] In one embodiment, the listener is configured to be notified
of updates to the transaction log before they are written to disk.
For example, the listener component can be notified of updates by
the operating system when a write command is issued to the disk to
update the transaction log. If updates are made to tables which the
listener is monitoring, the listener begins updating the caches at
the application level prior to the disk handling the write command.
It is possible that the transaction log will not be successfully
written to disk and thus, the database not updated with the new
data. The caching intermediary may invalidate cached results that
are still valid as a result. While invalidations may occur more
often than necessary in this scenario, no data consistency issues
will exist. The intermediary will simply provide queries to the
database for which it was maintaining valid results in its
cache.
[0087] The listener is not limited to the particular structure of
any disclosed arrangement and can be tailored to the requirements
of individual implementations. In one embodiment, the listener is
configured as a file system driver for the disk on which the
database transaction log is to be stored. After the database
receives an update to the database data, it issues a command to the
operating system to write the new information to the transaction
log on disk. The operating system issues this write command to an
interface for the disk adapted to translate internal addresses used
by the operating system to addresses understood by the disk. The
listener can disguise itself as the disk interface or driver for
the disk storing the transaction log in order to intercept commands
issued to the disk via the disk interface.
[0088] The listener will receive the command and forward it to the
actual file system driver for the disk. The file system will pass
the command to the disk and receive a success response in return
when the data is successfully written. The listener can be
configured to access the transaction log or the data passed in the
write command to be written to the transaction log to defect
updates to database data. In one embodiment, the listener accesses
the data written to the transaction log when the write command is
issued to the file system for the disk. In another embodiment, the
listener accesses this information when a success response is
received from the disk after the data has been written. In either
case, the listener parses the information it receives and provides
an update to the caching intermediary if necessary.
[0089] Similarly to the debugger configuration, the listener can
pass the success response received from the disk to the operating
system immediately upon receipt. The operating system will send a
success response to the database, freeing the database to use the
upload information. This scenario can lead to data inconsistencies
between the cache and database as previously described.
[0090] In another embodiment, the listener pauses the log writer
process after receiving the success response from disk. The
listener will wait until it receives a success response from the
caching intermediary before issuing the success response to the
operating system. By pausing the log writer process until the cache
is updated, the listener ensures data consistency as already
described.
Caching Stored Procedures and Invalidating Results
[0091] Stored procedures represent an increased level of complexity
for caching systems associated with databases. Stored procedures
define a set of one or more operations and are stored at a database
in a compiled format to be called by requesting applications to
receive the results of the operations. Stored procedures typically
include a set of SQL statements. Applications or developers wishing
to access a stored procedure simply need to know the stored
procedure's name, rather than schema, indexing, and column
information required to develop a query. In addition to SQL
statements, stored procedures can include operational code. Stored
procedures can execute a set of code, call other databases to
access or modify records, directly access other files, and/or
access and transmit other information, such as by calling other
applications. Stored procedures are valuable, if not essential,
components of many database installations. They provide centralized
query management for institutions, grouping operations and call
commands for increased security and to avoid errors, etc. when
writing individual queries and application code. Stored procedures
improve database performance by reducing the total traffic between
applications and the database. An application can call a stored
procedure once to execute numerous queries and receive a single set
of results, rather than issue each individual query and receive a
result for each issued query. The pre-compiled nature of stored
procedures provides significant improvements in performance when
compared to traditional query statements. Stored procedures are
compiled just once by the database which reuses the compiled
process whenever the procedure is called. Queries on the other
hand, are compiled each time the query is executed.
[0092] Caching techniques that rely on parsing queries received
from requesting applications are unable to handle stored
procedures. A stored procedure is only definable and understood by
the database on which it is stored. For example, a typical stored
procedure as seen by an application or application server will
simply be a textual invocation of the procedure such as foo( ). Any
application or caching system interfacing with the database does
not know the internal operations defined by foo( ), and thus, can
only see this textual call or issue similar calls to access the
procedure. Parsing such a call to a stored procedure will not
reveal anything about the individual queries the stored procedure
references or the commands it executes at the database.
[0093] In order to run stored procedures, databases typically
perform an analysis before run-time to assess how the stored
procedure is processed. For example, the database may compile the
stored procedure, examining the structure of the procedure, the
individual queries the database tables accessed, in what manner and
order the tables are accessed, as well as the various other
operations that the database may perform as part of the stored
procedure. The database may maintain this information or compiled
process locally to increase performance when a request for this
stored procedure is received. In this manner, the database does not
have to determine this information each time the stored procedure
is called. Conceptually, the preparatory work by the database for a
stored procedure amounts to compiling the stored procedure for
run-time execution.
[0094] In accordance with one embodiment of the technology
described herein, stored procedures are successfully cached and the
results invalidated based on changed data in the database. FIG. 8
depicts one embodiment for caching stored procedures. Application
104 issues a request 402 for a stored procedure 403 maintained at
database 120.
[0095] Application 104 (e.g., Application A) can issue a call 402
to a stored procedure 403 and pass one or more variable values.
Intermediary 108 can intercept the request as previously described
for simple queries by wrapping itself around the native database
driver 106. Intermediary 108 opens a trace session at the database
when it determines the request is for a stored procedure. As
understood by those of ordinary skill in the art, trace sessions
are typically used and designed for debugging. A tracing process
(e.g., 161) can collect various types of information associated
with an application, system, or section of code during a period of
time during which the trace is open. For example, a tracing
facility provided by an application can allow a user of the
application to observe files accessed, files modified, new files
created, applications called, methods called, and various other
types of operations associated with the application. Various
facilities provided by databases to track and document operations
the database performs during a period of time may commonly be
referred to as tracing facilities. An Oracle.TM. database
application provides a facility called "trace." Other database
vendors and types have similar but differently named facilities to
perform these processes.
[0096] After opening the trace session with database 120, the
intermediary issues the stored procedure call 402 to the database.
Database 120 receives call 402 and executes stored procedure 403 by
performing the various functions and queries specified by the
stored procedure. Database 120 returns the results 406 of the
stored procedure to intermediary 108 when it is complete. The
results 406 of stored procedure 403 are cached in a results table
such as table 204 at the intermediary in the same manner query
results are cached. An identification of stored procedure 403,
including the literal expressions passed with stored procedure call
402, are maintained along with the results 406 received from the
database 120.
[0097] When intermediary 108 receives the results, indicating that
the stored procedure has been executed, it also closes the trace
session with database 120. When the trace session is closed,
database 120 generates a results file 409, which in the embodiment
of FIG. 8 is stored at local disk 206 in response to write command
408. The database may create and update the trace file while the
trace session is opened in other embodiments. Different databases
record the results of trace sessions in different ways. Databases
may create a trace file containing information about the actions
taken while the trace session was opened. Other databases may
return information directly to the application opening the trace
session. The technology described herein is not limited to any
particular type of database or native facility, and it is
anticipated that the particular facility used may vary according to
the requirements of a particular implementation.
[0098] Intermediary 108 accesses the trace information from file
409 to determine the queries executed at the database when
performing stored procedure 403. In the embodiment of FIG. 8,
intermediary 108 issues a request 410 to listener 128 to read trace
file 409 from disk 126. Listener 128 responds by issuing a read
request 412 to operating system 124. The request is forwarded to
the disk which returns the file to the operating system as
indicated at 414. The operating system forwards the file to
listener 128. Listener 128 parses the information in the trace file
to determine the individual queries that users executed to perform
stored procedure 403. Listener 128 can select the queries executed
in response to the stored procedure by identifying those queries
from the trace file that include an identification associated with
the caching intermediary. Trace files may contain information
identifying all operations, queries, etc. a database performs while
a trace session is open, regardless of the originating application
to which the database was responding. Databases typically store an
identification of the application or process initiating each query
in the trace file which the listener can use to select only those
queries executed for procedure 403.
[0099] A list 416 of queries executed in response to the stored
procedure is sent from listener 128 to intermediary 108 after the
trace file has been parsed. From this list, intermediary 108
determines every query executed at the database in response to
stored procedure 403. The intermediary determines each query
executed by a stored procedure without parsing the stored procedure
or the call issued by application 104.
[0100] Having determined the queries executed in response to stored
procedure 403, intermediary 108 determines which, if any, tables at
the database were modified by the stored procedure. This
determination is made as previously described for a standard query
received at the cache. Intermediary 108 abstracts each individual
query and passes it to the database by calling a native database
analysis facility to analyze how it processes a query of that
structure. The abstracted query 418 is passed to database 120 which
analyzes it and returns results 420 to interface 112 identifying
any modified tables.
[0101] Intermediary 108 creates or modifies any entries in tables
206 and 208 associated with portions of the database affected by
stored procedure 403. An entry is created (if not already existing)
in table 206 for the structure of stored procedure 403, along with
the tables the stored procedure is dependent upon. The intermediary
can abstract a stored procedure in the same manner as a query by
removing any literal expressions or variable data from the stored
procedure call. Interface 112 can list the abstracted stored
procedure call in cache 110 with an indication of the tables the
stored procedure is dependent on. The dependencies are determined
from individual query abstraction analyzed by the database. For
stored procedures, table 206 can further include an indication of
the tables the stored procedure modifies. Table 206 can include an
indication for each listed table whether the procedure depends on
that tables or modifies that table. In table 208, intermediary 108
lists any tables the stored procedure modifies with an indication
of the last time that table was modified. For any pre-existing
entries, the intermediary updates the indication of the last time
of modification.
[0102] FIG. 9 is a flowchart depicting a method in accordance with
one embodiment for caching database stored procedures. A request or
call for a stored procedure maintained at a database is received at
step 702. At step 704, a results table containing previously
received results for various stored procedures and/or queries is
accessed to determine whether an entry is present for the stored
procedure referenced by the call. If an entry does not exist for
the particular stored procedure, the stored procedure is abstracted
at step 706 and a dependency table such as table 206 is accessed to
determine if an entry is present for stored procedures having the
structure of the received query at step 708. If an entry for the
query's structure is not present in the dependency table, a trace
session is opened at the database at step 710. A request can be
issued to the database to track the operations, transaction,
queries, etc. processed while the trace session is open. After
opening the trace session, the stored procedure request is
forwarded to the database at step 712. The database executes the
stored procedure at step 714 by performing the actions specified in
the stored procedure using the data passed with the request. The
database traces its actions performed while executing the stored
procedure at step 716 and creates a file or other indication of the
trace analysis. After executing the stored procedure, the database
returns the results at step 718. The caching intermediary can
receive the results and store them in a results table at step 720.
An indication of the stored procedure along with the results of its
execution can be stored in a results table such as 204. The results
are returned to the requesting application at step 722. After
caching the results of the stored procedure, the trace session at
the database is closed at step 724. In one embodiment, the trace
session is closed when the results from the database are
received.
[0103] The caching intermediary can issue a request to access the
trace information at step 725 if it is not automatically returned
as previously described. For example, a request can be issued to a
listener installed at the database machine to read a file such as
trace file 409 storing the trace data. The listener can issue a
read command and parse the information received from the requested
trace file. The listener can return all or a portion of this data
to the caching system at step 726. In one embodiment, the listener
parses the data and returns a list of the queries executed by the
database for the stored procedure. In other embodiments, the
listener can pass the entire trace file to the intermediary which
can perform the proper analysis to determine the queries. IN one
embodiment, the trace information is automatically returned to the
requesting application such that steps 725-726 can be skipped. In
any event, the queries that are executed as part of the stored
procedure are determined at step 727.
[0104] Each query executed as part of the stored procedure is
abstracted at step 728 by removing any literal expressions from the
query. Each abstracted query or skeleton is passed to the database
at step 730 by calling a native facility of the database to analyze
the structure of each query. The results for each query skeleton
are accessed at step 732. The intermediary creates an entry in a
dependency table such as table 206 at step 734. Each entry will
include a stored procedure skeleton and the tables it is dependent
upon. In one embodiment, any tables modified by the stored
procedure are also listed in the corresponding entry. The table can
include an indication of whether a listed table is a dependency of
the query or a table modified by the query. At step 736, entries in
a dependency modification table such as table 208 are created or
updated for any tables modified by the stored procedure. For each
table that was updated in response to the stored procedure command,
an existing entry in the modification table can be updated with the
time the stored procedure modified the corresponding database table
if an entry for the table already exists. If an entry for the table
does not already exist, a new entry is created by listing the table
and time of modification. The results table, dependency table, and
modification table can be used for future stored procedure requests
to determine whether results at the cache are validated and can be
returned instead of retrieving them from the database itself.
[0105] If it is determined at step 708 that the structure of the
stored procedure is already cached in a table such as 206, for
example, the dependencies of the stored procedure structure are
determined at step 756. A call to the stored procedure is issued at
step 758 and the results received. At step 760, the results are
cached by the intermediary in the results table. The results are
returned to the requesting application at step 762. The
modification table 208 is updated at step 764 if the stored
procedure includes write queries.
[0106] If it is determined at step 704 that an entry in the results
table already exists for the stored procedure, the dependency table
is accessed at step 736 to determine the tables the stored
procedure is dependent upon. The stored procedure can be abstracted
to determine if an entry in the dependency table exists for queries
of the received query's structure. The time of last modification of
each dependent table is determined at step 742. At step 744, it is
determined whether the cached results are still valid. In one
embodiment, a time stamp for the results determined from the
results table is compared with the times listed in the modification
table for each table the stored procedure is dependent upon. If the
results in the results table were retrieved before an update to one
or more of the tables modified by the stored procedure as indicated
in the modification table, the cached results are not valid and are
not returned to the requesting application. A call to the stored
procedure at the database is then issued at step 746 and the
results received. The results of the stored procedure are stored in
the cache at step 748 with an indication of the time they were
obtained. The results are returned to the requesting application at
step 750. The modification table is updated at step 752 to reflect
the time of the latest updates if the stored procedure modifies any
tables.
[0107] If the cached results are determined at step 744 to be
valid, the results are returned to the requesting application at
step 754. As mentioned, stored procedures can include write queries
as well as read queries. In one embodiment, a call to a stored
procedure is issued to the database even when the results of the
procedure maintained by the cache are valid and can be returned to
the requesting application. The stored procedure is issued so that
the updates to the database by the write portions of the procedure
can be made. The cached results can still be returned to increase
performance in responding to applications. When the results of
passing the stored procedure to the database are returned, the
caching system can update the modification table to reflect the
time of the new updates. Additionally, the results can be used to
update the results table with new results for the stored procedure
and the time these results were retrieved.
[0108] In one embodiment, the results table and/or dependency table
can include an indication for each stored procedure entry whether
that stored procedure modifies any data at the database. An
embodiment that specifies whether each table in the dependency
table is updated by the corresponding stored procedure or whether
the stored procedure modifies the listed table has already been
described. When the intermediary determines that a requested
procedure contains a valid entry in the results table and the
procedure does not modify data at the database, the cached results
are returned and the stored procedure is not passed to the
database. If the entry indicates that the stored procedure does
modify data, it can be passed to the database to modify database
data after returning the cached results to the requesting
application. In yet another embodiment, a caching system is
implemented to only cache stored procedure results and other
information for procedures that do not modify data at the
database.
Row-Level Caching and Invalidation
[0109] The embodiments described thus far have described table
based caching systems that invalidate entries when a dependent
table of a query is modified. The disclosed techniques are
extendable to other levels of invalidation based on more refined
analysis of modified data. For example, the disclosed technology
can be used to invalidate cached query results when a row the query
is dependent upon is modified. Likewise, results can be invalidated
only when an actual entry in the database the query is dependent
upon is modified. These techniques can increase efficiency and
performance by further reducing the number of times valid query
results in the cache are invalidated.
[0110] FIG. 10 is a flowchart depicting a row-based caching and
invalidation technique in accordance with one embodiment. In FIG.
10, row based caching is implemented for queries whose dependencies
are limited to the retrieval of records (row of data) based on a
unique identifier for the row. For example, if a query specifies
"select * from employee where user_id=1111", and user_id is a
unique user_id assigned to each individual in the organization,
such a query can be cached and invalidated in accordance with the
disclosed techniques. An update to the individual record (row) of
the employee table specified in the query can be the basis for
invalidating the cached query results. If an update is made to
another record (row) in the employee table, the cached results are
not invalidated since the update does not affect the cached
results.
[0111] For queries that depend on something more than a unique
record identifier, the table-based caching techniques previously
described are employed. Consider an exemplary query that specifies,
"select * from employee where salary >100000." The results of
this query depend upon a column in the employee table other than
the unique record identifier for that row (e.g., user_id or social
security number). The query is not merely dependent upon a part of
the entry that is unique for each individual record, but rather the
salary of an employee which is a non-unique portion of a record. If
the results for this query are cached and a subsequent update
increases the salary of an employee that was previously below
$100,000 to above $100,000, the cached results will no longer be
valid. The employee record with the salary column updated to a
value above $100,000 should now be included in the query results.
However, the update does not affect any rows that were returned in
response to the original query. If invalidation is based only on
changes to those records or rows included in the original results,
row-based caching in this example would fail to properly invalidate
the cached results and invalid results returned for a subsequent
received version of this query.
[0112] In FIG. 10, selective row-based caching is implemented to
guarantee accurate cache results while also improving performance
by reducing the number of invalidations inherent in a table-based
invalidation implementation. A query is received from a requesting
application at step 802 and at 804, it is determined whether an
entry corresponding to the received query exists in a results
table. If there is no corresponding entry in the results table, the
query is abstracted at step 806 and a determination made at step
808 as to whether an entry for the abstracted query skeleton exists
in a dependency table.
[0113] If no entry exists for the query skeleton, the resulting
skeleton is passed to the database at step 810 by calling a native
analysis facility of the database to assess how the query structure
is handled. The results of the analysis are accessed at step 812.
At step 814, an entry is created in the dependency table for the
query skeleton. Dependency tables in the row-based technique
depicted in FIG. 10 are implemented as previously described with
respect to table-based caching. The entry lists the abstracted
query skeleton and any tables queries having that structure are
dependent upon.
[0114] At step 816, it is determined whether the query requests
records based solely on a unique identifier for individual records
of the corresponding table. If the query depends solely on a unique
identifier for a row of the relevant table, the process continues
at step 818 where it is determined if the query is a write or read
query. If the query is a read query, the query is modified at step
820 to include a request for the row ID number of the record
requested by the query. Each record in a table includes an internal
identifier used by the database to uniquely identify every record.
This identifier or row ID number is different than the unique user
data identifier referred to previously.
[0115] The modified query is passed to the database at step 822.
The results of the query are received from the database at step
824. The query results are used to create an entry in results table
204 at step 826. The entry created in table 204 will not only
include the query, the results, and the time the results were
obtained, but also the row ID number associated with that query
that was retrieved via the modified query. The row ID number is
used to determine whether cached query results are valid when
subsequent requests including the cached query are received. The
query results are returned to the requesting application at step
828. If the cached system determines at step 818 that the query is
a write query, a modification table is updated at step 830 to
indicate the modifications to the tables effected by the write
query. To facilitate the row-based caching technique of FIG. 10,
the modification table stores an indication of the times rows
within a table were last updated. At step 830, the modification
table is updated to include the table dependencies determined at
step 812. For each entry, the modification will specify the row_id
for the specific record(s) modified by the query as well as the
table that was modified and the time of modification. If it is
determined at step 816 that the query does not requests records
based solely on a unique identifier for individual records of the
corresponding table, the method continues at step 832 where table
based caching is instituted according to steps 320-332 of FIG. 4.
Steps 322 and 328 are skipped if the query's structure is already
cached.
[0116] If it is determined at step 804 that an entry exists in
results table 204 for the query received at step 802, the
dependencies for the query are determined at step 834. The tables
the query is dependent upon are determined from dependency table
206. The row_id number of the query is determined from results
table 204 at step 836. Modification table 208 is then searched at
step 838 for an entry matching the dependent table of the query and
the row_id number of the query. If an entry is found and the time
of last modification to the specified row was before the results
stored in table 204 were obtained, the cached results are
determined to be valid at step 840. The cached results are returned
to the requesting application at step 842. If no entry is found, it
is determined that the cached results are valid and they are
returned at step 842. If an entry is found and it indicates an
update to the row after the results were obtained, the query is
passed to the database at step 844 and the results received. The
results are provided to the requesting application at step 846.
[0117] It should be noted that row-based caching and table-based
caching can be contemporaneously implemented in accordance with one
embodiment. For example, table 208 may only list table
identifications for certain entries along with the time the table
was modified when the entry is for a query that is dependent on
something other than a unique record identifier. Table 208 may also
include entries including row ID numbers and table IDs for other
entries. Likewise, entries in table 204 for the query results may
list the row_id numbers for queries meeting the unique row
identifier dependency limitation. For more complicated queries with
additional dependencies, only the results and time will be
maintained. In this manner, when a query is received that has an
entry in table 204, an intermediary can determine the row ID
dependency from table 206 in order to compare that dependency with
the time of last update from table 208 to determine whether the
results in table 204 are valid.
[0118] In one embodiment, a caching system can be distributed
across one or more servers or other systems at various levels to
provide further performance and availability. FIG. 11 depicts an
exemplary embodiment including caching intermediaries 108
distributed at application servers 102 as well as a central server
902. Central server 902 is implemented between application servers
102 and database server 122 to provide an additional level of
caching.
[0119] Each application server includes a caching intermediary 108
as previously described. Central server 902 also includes a caching
intermediary 108, providing a centralized cache for results and
invalidation information that can be used by the individual
intermediaries at each application server. In FIG. 11, the caching
intermediary 108 at the central server includes similar cache 110
and interface 112 components as the local caching intermediaries.
These components may include additional operations and perform
slightly different operations in such a multi-level environment.
Again, different implementations of intermediary 108 can be made in
accordance with embodiments. In FIG. 11, the listener component 128
interfaces with the caching intermediary 108 at the central server
902 to provide information regarding updates to the database. The
intermediary at the central server can maintain this information
for invalidating cache entries. In one embodiment, the central
server intermediary can pass this information to the local
intermediaries at the application servers so it can be used there
for invalidations. In yet another embodiment, the listener
component 128 can interface directly with each caching intermediary
108 at the application server level and central server level.
[0120] In an exemplary operation of the system of FIG. 11, an
application 104 at any of the application servers can issue a query
or stored procedure call which is intercepted at the local caching
intermediary 108 associated with that application. The caching
intermediary can determine if it is storing dependency information
for the structure of the received query. If the structure is stored
locally, the local intermediary can determine if it is storing
valid results for the received query as previously described. If
the local intermediary is storing valid results, it can further
access the central server's intermediary 108 to assess the validity
of the results.
[0121] An application at one of the other application servers in
communication with the central server may update database 120 in a
transaction not seen at the local intermediary processing the
current request. However, a query received from any of the local
application servers will pass through central server 902.
Accordingly, the local intermediary can access the central server
to determine the last time any table, row, etc. upon which the
received query is dependent was last modified by any of the
applications at any of the application servers. If the local
intermediary determines that its local results are valid, those
results can be returned to the requesting application. If the local
results are not valid, the local intermediary passes the query to
the caching intermediary at the central server 902. The caching
intermediary at central server 902 will return its locally cached
results if they are valid.
[0122] If neither the local intermediary nor the central
intermediary are maintaining valid results for the query, the
intermediary at the central server will forward the query to the
database and receive results. The intermediary at the central
server will update its cache with the new results, time of
retrieval, and/or any portions of the database modified by the
query if necessary. The results will be returned to the local
intermediary which will also update its cache in accordance with
the new results, time, etc.
[0123] In one embodiment, the central server's caching intermediary
can also send the results and any update information to the local
intermediaries at the application servers in response to updates to
the database from any of the application servers and/or in response
to a direct update detected by listener component 128. In this
manner, the local intermediaries need only check their local caches
to determine if their cached results are still valid.
[0124] If the local intermediary receiving the query initially
determines that it is not storing dependency information for the
structure of the received query, it can abstract the query and pass
it to the intermediary at the central server to determine if the
intermediary at the central server is maintaining the dependency
information. For example, dependency analysis may have already been
performed by the database for the received structure in response to
a query from another application server. If the intermediary at the
central server is maintaining the information, it can pass it to
the local intermediary which can cache the information for later
use. Upon receiving the dependency information in one embodiment,
the local intermediary determines that it is not storing valid
results for the query and forwards the query to the central
server's intermediary. In other embodiments, the local intermediary
can pass the query to the central server's intermediary as soon as
it determines that it is not storing the dependency information,
rather than wait to receive this information.
[0125] If the database is not storing the dependency information
for a query of this structure, it will send a response to the local
intermediary informing it that the query structure has not yet been
analyzed or that queries of that structure are not cacheable. In
one embodiment, the central server intermediary stores information
to indicate query structures that were previously determined to not
be cacheable to avoid the performance of dependency analysis more
than once for the same structure. Additionally, the local
intermediaries can maintain information about query structures that
were previously determined to not be cacheable so that a request to
the central server can be avoided when it is known that queries of
the received structure are not cacheable.
[0126] After receiving a notification from the central server
intermediary that the dependency information is not being stored or
that queries of the received query's structure are not cacheable,
the local intermediary sends the actual query to the intermediary
at the central server. The local intermediary also sends the actual
query to the intermediary after receiving dependency information
and determining that it is not maintaining valid results for the
actual query as mentioned above. The central server intermediary
can again check whether it is storing dependency information for
the received query. In other embodiments, the central server
intermediary only performs this check once. The central server
intermediary will pass the abstracted query to the database by
calling a facility such as "EXPLAIN PLAN" as previously described.
The intermediary at the central server will also pass the actual
query to the database. When the actual query results are received,
the intermediary at the central server can cache the results and
forward them to the local caching intermediary which will also
cache the results. Likewise, when the results of the dependency
analysis are received, the central server intermediary will cache
the dependency information, etc. and forward the results to the
local intermediary which will cache them as well. In one
embodiment, the central server intermediary and local
intermediaries wait to receive the dependency analysis information
before caching query results or dependency information.
[0127] In FIG. 11, a dual-level caching system is provided with
caches at the application server level and another server level
implemented between the application servers and database server. In
one embodiment, additional levels of caching can be used. For
example, application server caching systems may interface with a
regional server running a caching system in accordance with an
embodiment. Multiple regional servers may interface with a central
server running a caching system that interfaces with the database
system. Other variations with additional levels of caching or
further distribution of components across these servers can be
implemented in accordance with various embodiments.
[0128] While exemplary embodiments have been described in which
caching systems or intermediaries execute on application servers
and other servers, any type of processing or computing device may
be used, including personal computers, minicomputers, mainframes,
handheld computing devices, mobile computing devices, and so forth.
Typically, these computing devices will include one or more
processors in communication with one or more processor readable
storage devices, communication interfaces, peripheral devices, and
so forth. Examples of storage devices include RAM, ROM, hard disk
drives, floppy disk drives, CD ROMS, DVDs, flash memory, and so
forth. Examples of peripherals include printers, monitors,
keyboards, pointing devices, and so forth. Examples of
communication interfaces include network cards, modems, wireless
transmitters/receivers, and so forth. In some embodiments, all or
part of the functionality is implemented in software, including
firmware and/or micro code, that is stored on one or more processor
readable storage devices and is used to program one or more
processors to achieve the functionality described herein.
[0129] The foregoing detailed description has been presented for
purposes of illustration and description. It is not intended to be
exhaustive or to limit the disclosure and any claimed subject
matter to the precise form disclosed. Many modifications and
variations are possible in light of the above teachings. The
described embodiments were chosen in order to best explain the
principles of the technology and its practical application to
thereby enable others skilled in the art to best utilize the
technology in various embodiments and with various modifications as
are suited to the particular use contemplated. It is intended that
the scope of the present disclosure and its claimed subject matter
be defined by the claims appended hereto.
* * * * *