Oracle Notes

ADVANCED QUERIES

Joining Tables in SQL

A join allows data to be selected from two or more tables by one query. A join returns all possible combinations of rows from the specified tables that satisfy the WHERE clause.

Indicate the tables to be joined in the FROM clause:

    table-name join-type JOIN table-name ON join-condition

where join-type is one of:

    INNER, LEFT OUTER, RIGHT OUTER or FULL OUTER

The "join-condition" is a condition that specifies a relationship between the tables (usually, a column in one table equal to a column in another table). 

Examples:

In an INNER join, only matched rows are returned.  The following SELECT command uses an inner join, with the ON clause, to display manager number, name, portfolio number and cash from the MANAGER and PORTFOLIO tables:

select manager.managerno,managername,portfoliono,cash
    from manager inner join portfolio
    on manager.managerno = portfolio.managerno;

The result of this SELECT is:

MA MANAGERNAME     PORT       CASH
-- --------------- ---- ----------
M2 Brown           P001      25000
M3 Jones           P004      10000
M2 Brown           P005      50000
M3 Jones           P008      37500


The SELECT statement can include a WHERE clause to restrict the rows returned.  To display the stock symbol and current value (numbershares x current price) in portfolio P008:

select holding.stocksymbol,numbershares * currentprice
    from holding join stock
    on holding.stocksymbol = stock.stocksymbol
    where portfoliono = 'P008';

Note that the keyword INNER is optional;  an inner join is the default.

select managername, p.portfoliono, numbershares, currentprice
    from holding h join stock s on h.stocksymbol = s.stocksymbol

                        join portfolio p on p.portfoliono=h.portfoliono

                        join manager m on m.managerno=p.managerno

order by managername;

The first join can reference columns in holding and stock.  The second join can reference columns in all 3 tables and the third, in all 4 tables.

Natural Join

The joining column(s) don’t need to be named.  The columns with the same names, in both tables, are used for the join.  If more than one column matches, in the joined tables, the using clause could be used to specify the column which should be used for the join.


Outer Joins

The join performed in the examples above is an inner join. In an inner join, rows are only included in the result if there is a match. In the first example, manager M1 would not appear in the result as there are no portfolios for manager M1 in the Portfolio table. To include all rows from the Manager table even if they have no match in the Portfolio table, an outer join must be used.

The following SELECT statement uses a left outer join to include all rows from the Manager table (table on the left in the from clause) in the display; rows not containing any match in the Portfolio table will be assigned a null value in the join.

select manager.managerno,managername,portfoliono,cash
    from manager left outer join portfolio
    on manager.managerno = portfolio.managerno;

The result of this SELECT would be:

MA MANAGERNAME     PORT       CASH
-- --------------- ---- ----------
M2 Brown           P001      25000
M3 Jones           P004      10000
M2 Brown           P005      50000
M3 Jones           P008      37500
M1 Smith

A right outer join will include all rows from the table on the right even if there is no match.

A full outer join includes all rows from both tables even if there is no match.


Cross Join

A cross join produces the "product" of two tables - all rows from one table are joined to all rows of the second table.  For example:

    select manager.managerno,managername,portfoliono,cash
        from manager cross join portfolio;

would produce the following output:

MA MANAGERNAME     PORT       CASH
-- --------------- ---- ----------
M1 Smith           P001      25000
M2 Brown           P001      25000
M3 Jones           P001      25000
M1 Smith           P004      10000
M2 Brown           P004      10000
M3 Jones           P004      10000
M1 Smith           P005      50000
M2 Brown           P005      50000
M3 Jones           P005      50000
M1 Smith           P008      37500
M2 Brown           P008      37500
M3 Jones           P008      37500
M1 Smith           P011       8000
M2 Brown           P011       8000
M3 Jones           P011       8000


Subqueries

The value or set of values in the search condition of a SELECT, UPDATE or DELETE command can be obtained from another query called a "subquery"; a subquery may also appear in a HAVING clause. The subquery must be enclosed in parentheses.

The following rules apply:

- subqueries must be placed on the right of the search-condition
- subqueries can't be used with the BETWEEN or LIKE functions
- a subquery can have only a single item in the SELECT clause
- a subquery can not contain an ORDER BY clause


Examples:

To list the stock number and number of numbershares for holdings managed by manager M2:

select stocksymbol,numbershares
    from holding
    where portfoliono in
        (select portfoliono
            from portfolio
            where managerno = 'M2');


To display the portfolio number of the holdings with the largest number of COS numbershares:

select portfoliono
    from holding
    where numbershares =
        (select max(numbershares)
            from holding
            where stocksymbol = 'COS');

Subqueries can be nested: To display the manager name of managers who manage holdings containing stock symbol NTL:

select managername
    from manager
    where managerno in
        (select managerno
            from portfolio
            where portfoliono in
        (select portfoliono
            from holding
            where stocksymbol = 'NTL'));

To display the portfolio number of portfolios that contain JDS stock but not NTL stock:

select portfoliono
    from portfolio
    where portfoliono in
        (select portfoliono
            from holding
            where stocksymbol = 'JDS')
    and portfoliono not in
        (select portfoliono
            from holding
            where stocksymbol = 'NTL');

Multiple-Row Subquery Operators

IN        – equal to any member in the list
ANY    -           < ANY – less than maximum
                        > ANY – greater than minimum
ALL     -           < ALL – less than minimum
                        > ALL – greater than maximum


Set Operators:  Union, Minus and Intersect

These three operators correspond to the relational algebra operators - union, difference and intersection.

They can be used to combine the results of two queries.

Format:

    select-statement {UNION| UNION ALL | MINUS| INTERSECT} select-statement

UNION: the result table combines all rows returned by the two select statements. If ALL is specified, duplicate rows are returned.
MINUS: the result table contains all rows from the first select that do not have a corresponding row returned by the second select.
INTERSECT: the result table contains rows that are returned by both select statements

Example:

Display the stock symbol of all stock that are in a holding with more than 200 shares and have a current price of more than $50.

select stocksymbol from holding where numbershares >200
intersect
select stocksymbol from stock where currentprice > 50;

select portfoliono, purchaseDate, stocksymbol, numbershares 
from holding
union
select portfoliono, to_date(null), stocksymbol, 0 
from holding_history 
order by portfoliono;


Correlated Queries

When a subquery is coded as part of a query, the subquery is evaluated only once and the resulting values returned to the outer query. Correlation allows a subquery to be repeated more than once.

For example, suppose the following is required:

Display the portfolio number of all portfolios that contain stock with the number of numbershares greater than the average number of numbershares for that stock.

To evaluate this query, it will be necessary to fix a row in the holding table and then to loop through the rows in the holding table with the same portfolio number and calculate the average numbershares; this value must then be compared to the numbershares in the current row. The subquery to evaluate the average must be repeated for each row in the holding table.

A correlated query is the method used by SQL to code this type of loop. To code a correlated query:

- place a correlation variable (any name up to 18 characters beginning with a letter) following the table name in the FROM clause
- in the subquery, qualify a variable using the correlation variable when you wish to reference a variable in the outer query

Logically, the subquery is repeated for each row in the outer query. The following correlated query would solve the problem above.

select portfoliono
    from holding x
    where numbershares >
        (select avg(numbershares)
            from holding
            where stocksymbol=x.stocksymbol)

SQL performs a correlated subquery when the subquery references a column from a table in the "outer" query. The presence of the alias or correlation variable, X, indicates that this is a correlated query. In evaluating the query, instead of evaluating the subquery once, SQL will fix a row in the HOLDING table (outer query), and proceed to evaluate the subquery. The variable, X.STOCKSYMBOL, represents the value of STOCKSYMBOL in the current row in the outer query. Thus the average is calculated for all holdings with that value for STOCKSYMBOL.

The query will be evaluated as follows:

- SQL examines the first row of the HOLDING table; the value for STOCKSYMBOL is ST2
- SQL now evaluates the subquery with that value of STOCKSYMBOL (X.STOCKSYMBOL); thus the subquery will calculate the average numbershares for all rows that have an STOCKSYMBOL value of ST2
- the current row will be returned if the share value is greater than the average calculated by the subquery
- SQL then proceeds to the second row of the HOLDING table etc. until the end of the HOLDING table is reached

 

The Exists Operator

 

The Exists Operator tests for the existence of rows in the results from a subquery.  For example:

 

Select stockname

From stock s

Where not exists

(Select 'x'

From holding

Where s.stocksymbol = stocksymbol);

 

 

Top-N Analysis

 

Top-N Analysis provides the n largest or smallest values of a column.

 

Select [column list], Rownum

From (Select [column list]

            From tablename

Order by Top_N_Column)

Where rownum <= N;

 

Rownum  is a pseudocolumn which assigns a sequential value starting with 1 to each of the rows returned from a subquery.

 

Example:  Display the two top earning managers from the manager table:

 

Select rownum as Rank, managername, salary

From (Select managername, salary

From manager

Order by salary DESC)

Where Rownum <=2;

 

 

Extensions of Group By

 

Rollup – extension of Group By to produce cumulative aggregates.

 

Select [column,] group-function (column)

From table

[Where condition]

[Group by [Rollup] group-by-expression]

[Having having-expression];

 

Example:

Select portfoliono, stocksymbol, sum(numbershares)

From holding

Group by Rollup(portfoliono, stocksymbol);

 

Cube – extension of Group By to produce cross-tabulation values with a single select statement.

 

Select [column,] group-function (column)

From table

[Where condition]

[Group by [Cube] group-by-expression]

[Having having-expression];

 

Example:

Select portfoliono, stocksymbol, sum(numbershares)

From holding       

Group by Cube(portfoliono, stocksymbol);

 

Grouping Function – differentiate stored Null values from null values created by Rollup or Cube.  The Grouping function returns 0 (stored null value) or 1 (null result of rollup/cube).

 

Select [column,] group-function (column), Grouping (expr)

From table

[Where condition]

[Group by [Rollup/Cube] group-by-expression]

[Having having-expression];

 

Example:

Select portfoliono, stocksymbol, sum(numbershares), grouping(portfoliono) GRP_Port, grouping(stocksymbol) GRP_Stock

From holding       

Group by Rollup(portfoliono, stocksymbol);

 

Grouping Sets – define multiple queries in same query

 

Select portfoliono, stocksymbol, purchaseDate, sum(numbershares)

From holding       

Group by Grouping Sets ((portfoliono, stocksymbol), (stocksymbol, purchaseDate));