U.S. patent application number 10/856748 was filed with the patent office on 2005-01-13 for system and method of query transformation.
This patent application is currently assigned to Cognos Incorporated. Invention is credited to Styles, Michael E..
Application Number | 20050010570 10/856748 |
Document ID | / |
Family ID | 33102849 |
Filed Date | 2005-01-13 |
United States Patent
Application |
20050010570 |
Kind Code |
A1 |
Styles, Michael E. |
January 13, 2005 |
System and method of query transformation
Abstract
A query transformation system for transforming nested aggregates
in a query is provided. The query transformation system comprises a
nested aggregate analysis module for analysing a query that is not
supported by a target database system, and a nested aggregate
transformation module for transforming the query into a
semantically equivalent query that is supported by the target
database system.
Inventors: |
Styles, Michael E.;
(Osgoode, CA) |
Correspondence
Address: |
KING & SPALDING LLP
191 PEACHTREE STREET, N.E.
ATLANTA
GA
30303-1763
US
|
Assignee: |
Cognos Incorporated
Ottawa
CA
|
Family ID: |
33102849 |
Appl. No.: |
10/856748 |
Filed: |
May 27, 2004 |
Current U.S.
Class: |
1/1 ;
707/999.004 |
Current CPC
Class: |
Y10S 707/99931 20130101;
G06F 16/244 20190101; Y10S 707/99933 20130101; Y10S 707/99935
20130101; Y10S 707/954 20130101; Y10S 707/99934 20130101; G06F
16/24547 20190101; G06F 16/283 20190101 |
Class at
Publication: |
707/004 |
International
Class: |
G06F 017/30 |
Foreign Application Data
Date |
Code |
Application Number |
May 27, 2003 |
CA |
2,429,910 |
Claims
What is claimed is:
1. A query transformation system for transforming nested aggregates
in a query, the query transformation system comprising: a nested
aggregate analysis module for analysing a query that is not
supported by a target database system; and a nested aggregate
transformation module for transforming the query into a
semantically equivalent query that is supported by the target
database system.
2. The query transformation system as claimed in claim 1, wherein
the nested aggregate analysis module includes: a nested aggregate
detection unit for traversing a query and detecting a nested
aggregate in the query; and a nested aggregate tagging unit for
traversing a query and tagging the nested aggregate.
3. The query transformation system as claimed in claim 1, wherein
the nested aggregate transformation module includes: a derived
table for storing a copy of a tagged nested aggregate; and a query
editor for inserting the tagged nested aggregate into a query that
is supported by a target database system.
4. A method of nested aggregate transformation, the method
comprising the steps of: analysing a query that is not supported by
a target database system; and transforming the query into a
semantically equivalent query that is supported by the target
database system.
5. The method as claimed in claim 4, wherein the step of analysing
includes the steps of: detecting a nested aggregate in a select
list of a query; and adding a tag to the nested aggregate.
6. The method as claimed in claim 4, wherein the step of analysing
includes the steps of: creating a derived table; adding a column
reference to a select list in the derived table; and adding the
tagged nested aggregate to the select list of the derived
table.
7. A method of nested aggregate transformation, the method
comprising the steps of: generating a derived table to compute
aggregates; extracting nested aggregates and moving nested
aggregates into a select list of the derived table; and determining
whether the transformation is to be performed, including the steps
of: detecting nested aggregates; and detecting framed aggregates
and report aggregates; if nested aggregates are detected,
performing the transformation, including the steps of: marking a
nested aggregate for insertion into an inner select list; marking a
report aggregate for insertion into the inner select list if it
does not contain a nested aggregate or there are framed aggregates
present; and detecting framed aggregates; and adding expressions to
the inner select list of the derived table.
8. A computer data signal embodied in a carrier wave and
representing sequences of instructions which, when executed by a
processor, cause the processor to perform a method of nested
aggregate transformation, the method comprising the steps of:
analysing a query that is not supported by a target database
system; and transforming the query into a semantically equivalent
query that is supported by the target database system.
9. A computer-readable medium having computer readable code
embodied therein for use in the execution in a computer of a method
of nested aggregate transformation, the method comprising the steps
of: analysing a query that is not supported by a target database
system; and transforming the query into a semantically equivalent
query that is supported by the target database system.
10. A computer program product for use in the execution in a
computer of a query transformation system for transforming nested
aggregates in a query, the computer program product comprising: a
nested aggregate analysis module for analysing a query that is not
supported by a target database system; and a nested aggregate
transformation module for transforming the query into a
semantically equivalent query that is supported by the target
database system.
Description
FIELD OF THE INVENTION
[0001] The invention relates generally to data access middleware,
and in particular to a system and method of query
transformation.
BACKGROUND OF THE INVENTION
[0002] A typical data access environment has a multi-tier
architecture. For description purposes, it can be separated into
three distinct tiers:
[0003] a Web server
[0004] Applications
[0005] Data
[0006] The tiers are based on business function, and are typically
separated by fireballs. Client software, such as a browser or a
report-authoring tool, sits above the tiers.
[0007] The web server contains a firewall and one or more gateways.
All web communication is performed through a gateway. A gateway is
responsible for passing on requests to the application server, in
tier 2. for execution.
[0008] The applications Tier contains one or more application
servers. The application server runs requests, such as reports and
queries that are forwarded by a gateway running on the web server.
Typically, one of the components of the applications tier is a
query engine, which is data access middleware that provides
universal data access to a variety of heterogeneous database
systems. The query engine formulates queries (typically SQL) and
passes them on to the data tier, through a native database API
(such as ODBC) for execution.
[0009] The data tier contains database management systems (DBMS),
which manage raw data stored in a database. Examples of such
systems include Oracle, DB2, and Microsoft SQL Server.
[0010] Although a multi-tier architecture can be configured in
several different ways, a typical configuration places each tier on
a separate computer (server). A database server is typically a
"high end" server, and thus can process queries at a relatively
fast speed. An application server cannot generally process queries
as quickly as a database server.
[0011] In order to solve many business questions, a query engine
may generate SQL queries that utilize the SQL/OLAP technology
introduced in the SQL: 1999 standard. However, many database
systems do not support this technology. Thus, the SQL queries would
have to be performed by the query engine on the application server
that is generally slower than the database server. It is desirable
to have as much processing performed on the database server.
[0012] There is a need to prevent or reduce the amount of local
(application server) processing required to process a query. In the
past, the application would be responsible for generating SQL that
contained no nested OLAP functions. Quite often, generating this
type of SQL is more difficult since it is more complex.
[0013] One way of overcoming this problem is for the query engine
to generate a simple base query for collecting the data, and then
process the remainder of the query (including OLAP functions) to
produce the desired result. Unfortunately, this solution requires
extensive processing time on the application server. It is
desirable to have a way of transforming the query into a form that
minimizes the processing Time on the application server. Ideally,
the transformed query would be processed by the DBMS in its
entirety.
SUMMARY OF THE INVENTION
[0014] It is an object of the present invention to provide a method
of nested aggregate transformation in a database system that
supports SQL/OLAP, but does not support nested OLAP functions.
[0015] In accordance with an embodiment of the present invention,
there is provided a query transformation system for transforming
nested aggregates in a query. The query transformation system
comprises a nested aggregate analysis module for analysing a query
that is not supported by a target database system, and a nested
aggregate transformation module for transforming the query into a
semantically equivalent query that is supported by the target
database system.
[0016] In accordance with another embodiment of the present
invention, there is provided a method of nested aggregate
transformation. The method comprises the steps of analysing a query
that is not supported by a target database system, and transforming
the query into a semantically equivalent query that is supported by
the target database system.
[0017] In accordance with another embodiment of the present
invention, there is provided a method of nested aggregate
transformation. The method comprises the steps of generating a
derived table to compute aggregates, extracting nested aggregates
and moving nested aggregates into a select list of the derived
table, and determining whether the transformation is to be
performed. The step of determining includes the steps of detecting
nested aggregates, detecting framed aggregates and report
aggregates, performing the transformation if nested aggregates are
detected, and adding expressions to the inner select list of the
derived table. The step of performing includes the steps of marking
a nested aggregate for insertion into an inner select list, marking
a report aggregate for insertion into the inner select list if it
does not contain a nested aggregate or there are framed aggregates
present, and detecting framed aggregates.
[0018] In accordance with another embodiment of the present
invention, there is provided a computer data signal embodied in a
carrier wave and representing sequences of instructions which, when
executed by a processor, cause the processor to perform a method of
nested aggregate transformation. The method comprises the steps of
analysing a query that is not supported by a target database
system, and transforming the query into a semantically equivalent
query that is supported by the target database system.
[0019] In accordance with another embodiment of the present
invention, there is provided a computer-readable medium having
computer readable code embodied therein for use in the execution in
a computer of a method of nested aggregate transformation. The
method comprises the steps of analysing a query that is not
supported by a target database system, and transforming the query
into a semantically equivalent query that is supported by the
target database system.
[0020] In accordance with another embodiment of the present
invention, there is provided a computer program product for use in
the execution in a computer of a query transformation system for
transforming nested aggregates in a query. The computer program
product comprises a nested aggregate analysis module for analysing
a query that is not supported by a target database system, and a
nested aggregate transformation module for transforming the query
into a semantically equivalent query that is supported by the
target database system.
BRIEF DESCRIPTION OF THE DRAWINGS
[0021] FIG. 1 shows a typical data access environment.
[0022] FIG. 2 shows a nested aggregate transformation system, in
accordance with an embodiment of the present invention.
[0023] FIG. 3 shows in a flowchart an example of a method of a
nested aggregate transformation, in accordance with an embodiment
of the nested aggregate transformation system.
[0024] FIG. 4 shows in a flowchart another example of a method of a
nested aggregate transformation, in accordance with an embodiment
of the nested aggregate transformation system.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0025] FIG. 1 shows a typical data access environment 10 for
processing data. Typically, data is stored in a database 11. A
database management system (DBMS) running on a database server 12
accesses the raw data stored in the database 11. A query engine 15,
running on a report server (or application server) 13 is used to
generate reports on the raw data and instruct the DBMS on the
database server 12 to obtain information pertaining to the raw data
in the database 11. The query engine 15 provides universal data
access to a variety of heterogeneous database systems. An end user
uses a client application 14, running on a client workstation, to
facilitate application server 13 operations.
[0026] In order To solve many business questions, a query engine 15
generates SQL queries that utilize the SQL/online analytical
programming (OLAP) technology introduced in the SQL: 1999 standard.
These SQL queries include SQL/OLAP functions (windowed aggregates).
However, many database systems 12 do not support this technology.
In order to prevent or reduce the amount of local (application
server) processing required to process these types of queries, the
query engine 15 attempts to generate semantically equivalent
queries that can be processed on the database server 12 by the
target database system. These semantically equivalent queries
included standard aggregate functions and a GROUP BY operator.
[0027] In order to solve many business questions, SQL (Structured
Query Language) queries are generated that utilize the SQL/OLAP
(Online Analytical Programming) technology introduced in the SQL:
1999 standard. Quite often, these queries contain OLAP functions
whose arguments and/or window specifications contain other (nested)
OLAP functions. The SQL: 1999 standard (and most databases with a
SQL/OLAP capability) does not permit this.
[0028] FIG. 2 shows a nested aggregate transformation system 20 for
generating a query with no nested OLAP functions that is
semantically equivalent to a query having nested OLAP functions.
The nested aggregate transformation system 20 comprises a nested
aggregate analysis module 21 for analyzing SQL/OLAP queries
containing nested OLAP functions that are not supported by a target
database system, and a nested aggregate transformation module 22
for transforming these SQL/OLAP queries into semantically
equivalent queries that are supported by the target database
system. The nested aggregate analysis module 21 may include a
nested aggregate detection unit for traversing a query and
detecting a nested aggregate in the query, and a nested aggregate
tagging unit for traversing a query and tagging the nested
aggregate. The nested aggregate transformation module may include a
derived table for storing a copy of the tagged nested aggregate,
and a query editor for inserting the tagged nested aggregate into a
query that is supported by a target database system.
[0029] The nested aggregate transformation system 20 may be
implemented as a sub-system of the query engine 15 in the data
access environment 10. This transformation 20 may generate queries
that can be processed in their entirety on the database server 12,
or queries that require processing on both the application server
13 and the database server 12.
[0030] FIG. 3 shows in a flowchart an example of a method of nested
aggregate transformation (30), in accordance with an embodiment of
the nested aggregate transformation system 20. The method (30)
begins with analyzing a query containing a nested aggregate
functions that is not supported by a target database system (31).
Next, the query is transformed into a semantically equivalent query
that is supported by the target database system (32). The method
(30) is done (33).
[0031] Advantageously, the nested aggregate transformation system
20 eases the task of generating SQL by eliminating the need to
analyze all aggregation required by the report, and by eliminating
the need to generate nested derived tables. Furthermore, the nested
aggregate transformation system 20 allows the query engine 15 to
only perform the transformation if necessary.
[0032] There are two types of OLAP functions: framed frictions and
report functions. Framed OLAP fruitions contain a window frame
specification (ROWS or RANGE) and an ORDER BY clause. Through
window frames, capabilities such as cumulative (running) sums and
moving averages can be supported. Report functions do not contain a
window frame specification, and produce the same value for each row
in a partition.
[0033] A nested aggregate/OLAP function is any aggregate/OLAP
function appearing inside the specification of another
aggregate/OLAP function. These aggregates /OLAP functions may
appear in the operand, the PARTITION BY clause, the AT (compute
break) clause, or the ORDER BY clause. Nested aggregates are
computed prior to the parent aggregate being computed. Framed OLAP
functions are given special attention when performing the nested
aggregate transformation 20 due to the presence of the ORDER BY
clause.
[0034] FIG. 4 shows a flowchart of an example of a method of nested
aggregate transformation (40), in accordance with an embodiment of
the nested aggregate transformation system 20. The method (40)
begins with searching for nested aggregates and framed aggregates
in a select list (41). If nested aggregates are not found (42),
then the method is done (46). If nested aggregates are found (42),
then the select list is traversed again with tags added to
appropriate aggregates (43). A derived table is created (44). Once
again, the select list is traversed and simple column references
and tagged aggregates are added to the select list of the derived
table (45). The method (40) is done (46).
[0035] The nested aggregate transformation system 20 generates a
derived table to compute these aggregates. Nested aggregates are
extracted and moved into the select list of the derived Table. This
process may be repeated several times, depending on the level of
nesting.
[0036] To determine whether the transformation must be performed,
each expression in the select list is analyzed to detect the
presence (if any) of nested aggregates (31). This analysis
comprises the following:
[0037] Detection of nested aggregates.
[0038] Detection of framed aggregates (nRAggregates) and report
aggregates (nXAggregates).
[0039] If nested aggregates are detected (32), the transformation
is performed.
[0040] The transformation performs a second analysis of the select
list to determine how the transformation should be performed. This
analysis comprises:
[0041] Marking a nested aggregate for insertion into the inner
select list.
[0042] Marking a report aggregate for insertion into the inner
select list if it does not contain a nested aggregate or there are
framed aggregates present (nRAggregates >0).
[0043] Detection of framed aggregates
[0044] Once this is complete, the select list is traversed again,
with the following types of expressions being added To the inner
select list:
[0045] 1. Simple column references.
[0046] 2. Aggregates marked for insertion. For all other
aggregates, the AT, PARTITION BY, and ORDER BY clauses are
processed according to the rules outlined here.
[0047] 3. Expressions that do not contain nested aggregates or
framed aggregates.
[0048] All other expressions are traversed, and processed according
to the rules outlined here.
[0049] Once the nested aggregate transformation is performed, other
query transformations can be performed as required.
EXAMPLE 1
[0050] In this example, a nested OLAP function appears in the
operand of another OLAP function, so the query must be
transformed.
[0051] Original Query
1 SELECT MAX( SUM( QTY ) OVER ( PARTITION BY SNO )) OVER ( ) FROM
SUPPLY
[0052] Transformed Query
2 SELECT MAX(C0) OVER ( ) FROM ( SELECT SUM( QTY ) OVER ( PARTITION
BY SNO ) C0 FROM SUPPLY ) T1
[0053] Explanation
[0054] Against Oracle, the original query will result in an error
since the OLAP function MAX contains a nested SUM function in its
operand. To eliminate the nesting, a derived table T1 is created,
and the SUM function is pushed into the select list.
EXAMPLE 2
[0055] In this example, a nested OLAP function appears in the
operand of an OLAP function. As well, the same nested OLAP function
appears on its own in the select list. This query must be
transformed.
[0056] Original Query
3 SELECT SNO, PNO, SUM( QTY ) OVER ( PARTITION BY SNO ), MAX( SUM(
QTY ) OVER ( PARTITION BY SNO )) OVER ( ) FROM SUPPLY
[0057] Transformed Query
4 SELECT C0, C1, C2, MAX( C2 ) OVER ( ) FROM ( SELECT SNO C0, PNO
C1, SUM( QTY ) OVER ( PARTITION BY SNO ) C2 FROM SUPPLY ) T1
[0058] Explanation
[0059] Against Oracle, the original query will result in an error
since the OLAP function MAX contains a nested SUM function in its
operand. To eliminate the nesting, a derived table T1 is created,
and the SUM function is pushed into the select list. Note also that
a small optimization is performed with respect to the. first OLAP
function (SUM) in the select list of the original query, since it
is identical to the operand of the MAX function. The SUM function
only needs to be computed once.
EXAMPLE 3
[0060] In this example, a nested OLAP function appears in the
operand of an OLAP function. As well, the nested OLAP function
itself contains a nested OLAP function in its PARTITION BY clause.
This query must be transformed.
[0061] Original Query
5 SELECT SNO, PNO, SUM( QTY ) OVER ( PARTITION BY AVG( QTY ) OVER (
PARTITION BY JNO )), MAX( SUM( QTY ) OVER ( PARTITION BY AVG( QTY )
OVER ( PARTITION BY JNO ))) OVER ( ) FROM SUPPLY
[0062] Transformed Query
6 Pass 1 SELECT C0, C1, C2, MAX( C2 ) OVER ( ) FROM ( SELECT C0,
C1, SUM( C2 ) OVER ( PARTITION BY AVG( QTY ) OVER ( PARTITION BY
JNO ) ) C2 FROM SUPPLY ) T0 Pass 2 SELECT C0, C1, C2, MAX( C2 )
OVER ( ) FROM ( SELECT C0, C1, SUM( C2 ) OVER ( PARTITION BY C3 )
C2 FROM ( SELECT SNO C0, PNO C1, QTY C2, AVG( QTY ) OVER (
PARTITION BY JNO ) C3 FROM SUPPLY ) T0 ) T1
[0063] Explanation
[0064] Against Oracle9i, the original query will result in an error
since the OLAP functions SUM and MAX contain nested OLAP functions.
This particular example requires 2 passes. In the first pass, the
SUM function is pushed into a derived table to eliminate all nested
aggregation in the top-level select list. The second pass, required
since the SUM function contains a nested AVG function in its
PARTITION BY clause, results in the AVG function being moved into
another derived table.
EXAMPLE 4
[0065] In this example, a nested OLAP function appears in the
operand of an OLAP function. As well, the select list contains a
framed aggregate That computes a cumulative SUM. This query must be
transformed.
[0066] Original Query
7 SELECT SNO, PNO, SUM( QTY ) OVER( ORDER BY SNO, PNO ROWS
UNBOUNDED PRECEDING ), SUM( QTY ) OVER ( PARTITION BY SNO ), MAX(
SUM( QTY ) OVER ( PARTITION BY SNO ) OVER ( ) FROM SUPPLY
[0067] Transformed Query
8 Pass 1 SELECT C0, C1, SUM( C2 ) OVER( ORDER BY C1, C2 ROWS
UNBOUNDED PRECEDING ) C2, C3, C4 FROM ( SELECT SNO C0, PNO C1, QTY
C2 SUM( QTY ) OVER ( PARTITION BY SNO ) C3, MAX( SUM( QTY ) OVER (
PARTITION BY SNO ) OVER ( ) C4 FROM SUPPLY ) T0 Pass 2 SELECT C0,
C1, SUM( C2 ) OVER( ORDER BY C1, C2 ROWS UNBOUNDED PRECEDING ) C2,
C3, C4 FROM ( SELECT C0, C1, C2, C3, MAX( C3 ) OVER( ) C4 FROM (
SELECT SNO C0, PNO C1, QTY C2, SUM( QTY) OVER ( PARTITION BY SNO )
C3 FROM SUPPLY ) T0 ) T1
[0068] Explanation
[0069] Against Oracle9i, the original query will result in an error
since the OLAP function MAX contains a nested SUM function in its
operand. This particular example requires 2 passes. In the first
pass, the SUM and MAX functions are pushed into a derived table
since the main select list contains a framed aggregate. The framed
aggregate is left in the main select list since it requires the
data to be ordered a specific way to compute the value. If this
aggregate was pushed into the inner select list, any sorting
required to compute aggregates in the main select may disrupt the
ordering of values produced by the framed aggregate. The second
pass, required since the MAX function contains a nested SUM
function in its operand, results in the SUM function being moved
into another derived table.
[0070] The systems and methods according to the present invention
may be implemented by any hardware, software or a combination of
hardware and software having the functions described above. The
software code, either in its entirety or a part thereof, may be
stored in a computer readable memory. Further, a computer data
signal representing the software code that may be embedded in a
carrier wave may be transmitted via a communication networks Such a
computer readable memory and a computer data signal are also within
the scope of the present invention, as well as the hardware,
software and the combination thereof.
[0071] While particular embodiments of the present invention have
been shown and described, changes and modifications may be made to
such embodiments without departing from the true scope of the
invention.
* * * * *