Chapter 6. Basic SQL

Chapter 6. Basic SQL

- SQL (Structured Query Language)

: SQL is now the standard language for commercial DBMSs.

· 1974 - SEQUEL (Structured English QUery Language)

1986 - A standard version of SQL (: SQL86 or SQL1)

1992 - SQL2 (SQL92)

· The SQL language has several parts:

(1) The Data Manipulation Language (DML)

: This subset of SQL allows users to pose queries and to insert, delete and modify

rows. (ex) SELECT, INSERT, etc.

(2) The Data Definition Language (DDL)

: This subset of SQL supports the creation, deletion and modification of

definitions for tables and views.

(ex) CREATE TABLE, CREATE VIEW, ALTER, etc.

(3) Embedded and dynamic SQL

: Embedded SQL features allow SQL code to be called from a host language such

as C, COBOL, C++ or Fortran.

Dynamic SQL features allow a query to be constructed (and executed) at

run- time.

(4) Security

: SQL provides mechanism to control user’s access to data objects such as tables

and views. (ex) GRANT, REVOKE, etc.

(5) Transaction management

: Various commands allow a user to explicitly control aspects of how a

transaction is to be executed. (ex) COMMIT, ROLLBACK, etc.

(6) Client/Server execution and remote database access

: These commands control how a client application program can connect to an

SQL database server, or access database from a database over a network.

(ex) Web programming tech, Java Script, Java Servlet, XML, etc.

- SQL terms

type ≡ domain

table ≡ relation

table definition ≡ schema

row ≡ tuple

column ≡ attribute

- Create table structure

(See sections 6.1 and 6.2 (pp180-187) of the textbook for details)

· (Form) CREATE TABLE T-NAME (attribute and type list);

(ex) CREATE TABLE Customer (cno CHAR(3), balance NUMBER(5) );

CREATE TABLE Employee (

SSN CHAR(9) NOT NULL, added for

NAME VARCHAR(30) NOT NULL, total participation

AGE INT, constraint

PRIMARY KEY(SSN)

);

CREATE TABLE Works_On (

ESSN CHAR(9) NOT NULL,

PNO INT NOT NULL,

HOURS DECIMAL(3,1) NOT NULL,

PRIMARY KEY(ESSN, PNO)

FOREIGN KEY(ESSN) REFERENCES EMPLOYEE(SSN),

FOREIGN KEY(PNO) REFERENCES PROJECT(PNUMBER)

);

- Basic queries in SQL

(See section 6.3 (pp187-198) of the textbook for details)

· (Form) π → SELECT (DISTINCT) <attribute list>

FROM <table list>

σ → WHERE <condition>;

· DISTINCT: SQL usually treats a table as a multiset; duplicate tuples can appear

more than once in a table and in the result of a query. Only distinct

tuples should remain in the result.

Sailors / Sid / sname / rating / age
21 / Bob / 7 / 40
22 / Bob / 8 / 50
23 / Joe / 9 / 31
24 / Bill / 10 / 20

(ex)

sname
Bob
Bob
Joe

SELECT sname

FROM Sailors Þ

WHERE age > 30;

sname
Bob
Joe

SELECT DISTINCT sname

FROM Sailors Þ

WHERE age > 30;

· Enumeration strategy

1. Compute the cross product of the tables in the <table-list>.

2. Delete those rows in the cross product that do not meet the <condition>.

3. Delete all columns that do not appear in the <attribute-list>.

4. If DISTINCT is specified, eliminate duplicate rows

· The condition in the WHERE clause is a Boolean combination of conditions of the

form expression op expression, where op is one of the comparison operators

{=, ≠, <, >, ≤, ≥}.

(ex) SELECT sname

FROM Sailors, Reserves

WHERE Sailors.sid = Reserves.sid AND Reserves.bid = 103;

SELECT sid

FROM Reserves, Boats

WHERE Reserves.bid = Boats.bid AND Boats.color = ‘red’;

SELECT sname

FROM Reserves, Boats, Sailors

WHERE Reserves.bid = Boats.bid AND Boats.color = ‘red’

AND Sailors.sid = Reserves.sid;

· Renaming (Aliasing)

- In SQL the same name can be used for two (or more) attributes as long as the

attributes are in the different relation.

Þ Prefix the relation name to the attribute name separating the two by period

(ex) SELECT FNAME, EMPLOYEE.NAME, ADDRESS

FROM EMPLOYEE, DEPARTMENT

WHERE DEPARTMENT.NAME = ‘Research’ AND

DEPARTMENT.DNUMBER = EMPLOYEE.DNUMBER;

- Ambiguity also arises in the case of queries that refer to the same relation twice.

(ex) For each employee, retrieve the employee’s first and last name and the first and

last name of his or her immediate supervisor.

SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME

FROM EMPLOYEE (AS) E, EMPLOYEE (AS) S

WHERE E.SUPERSSN = S.SSN;

We are allowed to declare alternative relation names E and S, called aliases for

the EMPLOYEE relation. We can think of E and S as two different copies of

the EMPLOYEE relation.

· Special operator

LIKE : for pattern matching

% : 0 or more arbitrary characters

_ : exactly 1 arbitrary character

(ex) SELECT sname

FROM Sailors Þ BoB, B...B, BB, Bob

WHERE Sailors.sname LIKE ‘B_%B’ O O X X

(Note) Although SQL is not case sensitive for commands,

SQL is case sensitive for data.

· Set operations

UNION

INTERSECT

EXCEPT (: set difference)

(NOT) IN (: is/isn’t a member of set)

ANY/ALL (: compare against any/all element)

(NOT) EXISTS

(ex) Name of sailors who have reserved red or (and) green boats

SELECT S.sname

FROM Sailors S, Boats B, Reserves R

WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = ‘red’

UNION (or INTERSECT)

SELECT S2.sname

FROM Sailors S2, Boats B2, Reserves R2

WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = ‘green’;

(ex) The sids of all sailors who have reserved red boats but not green boats

SELECT S.sid

FROM Sailors S, Boats B, Reserves R

WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = ‘red’

EXCEPT

SELECT S2.sid

FROM Sailors S2, Boats B2, Reserves R2

WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B.color = ‘green’;

- Nested queries

(See section 7.1 (pp 207-215) of the textbook for details)

(ex) Names of sailors reserved (or not reserved) boat 103

subquery

SELECT S.name

FROM Sailors S

WHERE S.sid IN (or NOT IN) ( SELECT R.sid

FROM RESERVES R

WHERE R.bid = 103);

SELECT S.name

FROM Sailors S correlated query: occurrence of S in the subquery

(in the form of the literal S.sid)

WHERE EXISTS ( SELECT *

FROM RESERVES R

WHERE R.bid = 103 AND R.sid=S.sid);

(Notes) 1. Correlated subqueries are subqueries in which there is a subquery (and hence

a main, outer query), and the information in the subquery is referenced by the

other, main query. Whenever a condition in the WHERE clause of a nested

query references some attribute of a relation declared in the outer query, the

queries are said to be correlated.

2. Using the EXISTS predicate, the subquery does not form a result set, but

rather returns TRUE or FALSE.

(ex) Names of sailors who have reserved all boats

SELECT S.sname

FROM Sailors S

WHERE NOT EXISTS ( ( SELECT B.bid all boats

FROM Boats B)

EXCEPT

( SELECT R.bid

FROM Reserves R reserved boats

WHERE R.sid = S.sid) );

SELECT S.sname

FROM Sailors S

WHERE NOT EXISTS ( SELECT B.bid

FROM Boats B

WHERE NOT EXISTS ( SELECT R.bid

FROM Reserves R

WHERE R.bid = B.bid

AND R.sid = S.sid )

);

- Aggregate functions

(See section 7.1.7 (pp 216-222) of the textbook for details)

SQL supports five aggregate operations, which can be applied on any column, say A, of

a relation:

(1) COUNT ( (DISTINCT) A ): The number of (unique) values in the A column.

(2) SUM ( (DISTINCT) A ): The sum of all (unique) values in the A column.

(3) AVG ( (DISTINCT) A ): The average of all (unique) values in A column.

(4) MAX (A): The maximum value in the A column.

(5) MIN (A): The minimum value in the A column.

(ex) Find the average age of sailors

MIN (or MAX)

Þ SELECT AVG (age)

FROM Sailors;

Find the average age of sailors with a rating of 10.

MIN (or MAX)

Þ SELECT AVG (age)

FROM Sailors

WHERE rating = 10;

Find the number of sailors.

* refers to the rows (tuples), so COUNT (*)

returns the number of rows in the result of the query

Þ SELECT COUNT (*)

FROM Sailors;

Find the number of distinct sailor names.

Þ SELECT COUNT (DISTINCT sname)

FROM Sailors;

Find the names of sailors who are older than the oldest sailor with a rating of 10.

Þ SELECT S.sname

FROM Sailors S

WHERE S.age > ( SELECT MAX (S2.age)

FROM Sailors S2

WHERE S2.rating = 10 );

(Remark) Find name and age of oldest sailors

SELECT S.sname, MAX (S.age) Þ illegal ( cannot combine aggregate and

FROM Sailors S; nonaggregate in the same query)

SELECT S.sname, S.age

FROM Sailors S

WHERE S.age = ( SELECT MAX (S2.age)

FROM Sailors S2);

- Grouping (See section 7.1.8 (pp 218-222) of the textbook for details)

· In many cases, we want to apply the aggregate functions to subgroups of tuples in a

relation, based on some attributes. In these cases, we need to group the tuples that

have the same value of some attribute(s) and we need to apply the function to each

such group independently.

(ex) Find the average salary of employees in each department.

Find the average age for sailors of each rating.

· The GROUP BY-clause specifies the grouping attributes, which should also appear

in the SELECT-clause, so that the value resulting from applying each function to a

group of tuples appears along with the value of the grouping attribute(s).

· (Form) SELECT (DISTINCT) <attribute-list>

FROM <table-list>

WHERE <condition>

GROUP BY <grouping attribute(s)>

HAVING <group condition> ;

(ex)

Sailors / sid / Sname / rating / Age
22 / Dustine / 7 / 45.0
31 / Lubber / 8 / 55.5
32 / Andy / 8 / 25.5
85 / Art / 3 / 25.5
95 / Bob / 3 / 63.5
58 / Rusty / 10 / 35.0
29 / Brutus / 1 / 33.0
64 / Horatio / 7 / 35.0
71 / Zorba / 10 / 16.0

Find the age of the youngest sailors who is eligible to vote (i.e. is at least 18

years old) for each rating level with at least two such sailors.

SELECT rating, MIN(age)

FROM Sailors Notice that the SELECT-clause includes

WHERE age>18 Þ only the grouping attribute and the functions

GROUP BY rating to be applied on each group of tuples.

HAVING COUNT(*)>1;

(1) Construct cross product of relations in FROM <table-list> if |table-list|>1.

Otherwise, use the original table.

Þ Use the original table.

(2) Apply condition in WHERE-clause to eliminate rows.

Þ Eliminate < 71, Zorba, 10, 16.0 >.

(3) Eliminate unwanted columns

: need columns mentioned in SELECT, GROUP BY or HAVING.

Þ Eliminate sid and sname.

(4) Sort table by GROUP BY-clause to identify groups.

(5) Apply the group condition in the HAVING-cluse

: COUNT(*)>1 Þ Eliminate group with rating=1 and rating=10.

(6) Generate one answer row per remaining group.

<DISTINCT> would now be applied.

after (3) after (4) after (5)

rating / age
1 / 33.0
3 / 25.5
3 / 63.5
7 / 45.0
7 / 35.0
8 / 55.5
8 / 25.5
10 / 35.0
rating / age
7 / 45.0
8 / 55.5
8 / 25.5
3 / 25.5
3 / 63.5
10 / 35.0
1 / 33.0
7 / 35.0
rating / age
1 / 33.0
3 / 25.5
3 / 63.5
7 / 45.0
7 / 35.0
8 / 55.5
8 / 25.5
10 / 35.0

after (6)

rating / age
3 / 25.5
7 / 35.0
8 / 25.5

(ex) For each red boat, number of reservations for that boat.

SELECT B.bid, COUNT(*) AS reservecount

FROM Boats B, Reserves R

WHERE R.bid=B.bid AND B.color=‘red’

GROUP BY B.bid;

HAVING ← No group condition is necessary

(ex) Average age of sailors for each rating level that has at least two sailors.

SELECT S.rating, AVG(S.age) AS average

FROM Sailors S

WHERE ← No tuple condition is necessary

GROUP BY S.rating

HAVING COUNT(*)>1;

- Insert, Delete, and Update statements

(See section 6.4 (pp198-201) of the textbook for details)

(1) INSERT is used to add a single tuple to a relation.

(ex) INSERT INTO Sailors

VALUES (‘10’, ‘Joe’, 7, 30.0);

(2) DELETE removes tuples from a relation.

(ex) DELETE FROM Sailors

WHERE sid = 22; (or sname = ‘rusty’)

(3) UPDATE is used to modify attribute values of one or more selected tuples.

(ex) UPDATE Sailors

SET sname = ‘michael’, rating = 10

WHERE sid = 22;

(Note) NULL Value: SQL provides a special column value, called NULL, to use when

a column is unknown or unapplicable.