80
Chapter 6. Basic SQL
- SQL (Structured Query Language)
: SQL is now the standard language for commercial DBMSs.
· 1974 - SEQUEL (Structured English QUery Language)
1986 - A standard version of SQL (: SQL86 or SQL1)
1992 - SQL2 (SQL92)
· The SQL language has several parts:
(1) The Data Manipulation Language (DML)
: This subset of SQL allows users to pose queries and to insert, delete and modify
rows. (ex) SELECT, INSERT, etc.
(2) The Data Definition Language (DDL)
: This subset of SQL supports the creation, deletion and modification of
definitions for tables and views.
(ex) CREATE TABLE, CREATE VIEW, ALTER, etc.
(3) Embedded and dynamic SQL
: Embedded SQL features allow SQL code to be called from a host language such
as C, COBOL, C++ or Fortran.
Dynamic SQL features allow a query to be constructed (and executed) at
run- time.
(4) Security
: SQL provides mechanism to control user’s access to data objects such as tables
and views. (ex) GRANT, REVOKE, etc.
(5) Transaction management
: Various commands allow a user to explicitly control aspects of how a
transaction is to be executed. (ex) COMMIT, ROLLBACK, etc.
(6) Client/Server execution and remote database access
: These commands control how a client application program can connect to an
SQL database server, or access database from a database over a network.
(ex) Web programming tech, Java Script, Java Servlet, XML, etc.
- SQL terms
type ≡ domain
table ≡ relation
table definition ≡ schema
row ≡ tuple
column ≡ attribute
- Create table structure
(See sections 6.1 and 6.2 (pp180-187) of the textbook for details)
· (Form) CREATE TABLE T-NAME (attribute and type list);
(ex) CREATE TABLE Customer (cno CHAR(3), balance NUMBER(5) );
CREATE TABLE Employee (
SSN CHAR(9) NOT NULL, added for
NAME VARCHAR(30) NOT NULL, total participation
AGE INT, constraint
PRIMARY KEY(SSN)
);
CREATE TABLE Works_On (
ESSN CHAR(9) NOT NULL,
PNO INT NOT NULL,
HOURS DECIMAL(3,1) NOT NULL,
PRIMARY KEY(ESSN, PNO)
FOREIGN KEY(ESSN) REFERENCES EMPLOYEE(SSN),
FOREIGN KEY(PNO) REFERENCES PROJECT(PNUMBER)
);
- Basic queries in SQL
(See section 6.3 (pp187-198) of the textbook for details)
· (Form) π → SELECT (DISTINCT) <attribute list>
FROM <table list>
σ → WHERE <condition>;
· DISTINCT: SQL usually treats a table as a multiset; duplicate tuples can appear
more than once in a table and in the result of a query. Only distinct
tuples should remain in the result.
Sailors / Sid / sname / rating / age21 / Bob / 7 / 40
22 / Bob / 8 / 50
23 / Joe / 9 / 31
24 / Bill / 10 / 20
(ex)
snameBob
Bob
Joe
SELECT sname
FROM Sailors Þ
WHERE age > 30;
snameBob
Joe
SELECT DISTINCT sname
FROM Sailors Þ
WHERE age > 30;
· Enumeration strategy
1. Compute the cross product of the tables in the <table-list>.
2. Delete those rows in the cross product that do not meet the <condition>.
3. Delete all columns that do not appear in the <attribute-list>.
4. If DISTINCT is specified, eliminate duplicate rows
· The condition in the WHERE clause is a Boolean combination of conditions of the
form expression op expression, where op is one of the comparison operators
{=, ≠, <, >, ≤, ≥}.
(ex) SELECT sname
FROM Sailors, Reserves
WHERE Sailors.sid = Reserves.sid AND Reserves.bid = 103;
SELECT sid
FROM Reserves, Boats
WHERE Reserves.bid = Boats.bid AND Boats.color = ‘red’;
SELECT sname
FROM Reserves, Boats, Sailors
WHERE Reserves.bid = Boats.bid AND Boats.color = ‘red’
AND Sailors.sid = Reserves.sid;
· Renaming (Aliasing)
- In SQL the same name can be used for two (or more) attributes as long as the
attributes are in the different relation.
Þ Prefix the relation name to the attribute name separating the two by period
(ex) SELECT FNAME, EMPLOYEE.NAME, ADDRESS
FROM EMPLOYEE, DEPARTMENT
WHERE DEPARTMENT.NAME = ‘Research’ AND
DEPARTMENT.DNUMBER = EMPLOYEE.DNUMBER;
- Ambiguity also arises in the case of queries that refer to the same relation twice.
(ex) For each employee, retrieve the employee’s first and last name and the first and
last name of his or her immediate supervisor.
SELECT E.FNAME, E.LNAME, S.FNAME, S.LNAME
FROM EMPLOYEE (AS) E, EMPLOYEE (AS) S
WHERE E.SUPERSSN = S.SSN;
We are allowed to declare alternative relation names E and S, called aliases for
the EMPLOYEE relation. We can think of E and S as two different copies of
the EMPLOYEE relation.
· Special operator
LIKE : for pattern matching
% : 0 or more arbitrary characters
_ : exactly 1 arbitrary character
(ex) SELECT sname
FROM Sailors Þ BoB, B...B, BB, Bob
WHERE Sailors.sname LIKE ‘B_%B’ O O X X
(Note) Although SQL is not case sensitive for commands,
SQL is case sensitive for data.
· Set operations
UNION
INTERSECT
EXCEPT (: set difference)
(NOT) IN (: is/isn’t a member of set)
ANY/ALL (: compare against any/all element)
(NOT) EXISTS
(ex) Name of sailors who have reserved red or (and) green boats
SELECT S.sname
FROM Sailors S, Boats B, Reserves R
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = ‘red’
UNION (or INTERSECT)
SELECT S2.sname
FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B2.color = ‘green’;
(ex) The sids of all sailors who have reserved red boats but not green boats
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = ‘red’
EXCEPT
SELECT S2.sid
FROM Sailors S2, Boats B2, Reserves R2
WHERE S2.sid = R2.sid AND R2.bid = B2.bid AND B.color = ‘green’;
- Nested queries
(See section 7.1 (pp 207-215) of the textbook for details)
(ex) Names of sailors reserved (or not reserved) boat 103
subquery
SELECT S.name
FROM Sailors S
WHERE S.sid IN (or NOT IN) ( SELECT R.sid
FROM RESERVES R
WHERE R.bid = 103);
or
SELECT S.name
FROM Sailors S correlated query: occurrence of S in the subquery
(in the form of the literal S.sid)
WHERE EXISTS ( SELECT *
FROM RESERVES R
WHERE R.bid = 103 AND R.sid=S.sid);
(Notes) 1. Correlated subqueries are subqueries in which there is a subquery (and hence
a main, outer query), and the information in the subquery is referenced by the
other, main query. Whenever a condition in the WHERE clause of a nested
query references some attribute of a relation declared in the outer query, the
queries are said to be correlated.
2. Using the EXISTS predicate, the subquery does not form a result set, but
rather returns TRUE or FALSE.
(ex) Names of sailors who have reserved all boats
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS ( ( SELECT B.bid all boats
FROM Boats B)
EXCEPT
( SELECT R.bid
FROM Reserves R reserved boats
WHERE R.sid = S.sid) );
or
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS ( SELECT B.bid
FROM Boats B
WHERE NOT EXISTS ( SELECT R.bid
FROM Reserves R
WHERE R.bid = B.bid
AND R.sid = S.sid )
);
- Aggregate functions
(See section 7.1.7 (pp 216-222) of the textbook for details)
SQL supports five aggregate operations, which can be applied on any column, say A, of
a relation:
(1) COUNT ( (DISTINCT) A ): The number of (unique) values in the A column.
(2) SUM ( (DISTINCT) A ): The sum of all (unique) values in the A column.
(3) AVG ( (DISTINCT) A ): The average of all (unique) values in A column.
(4) MAX (A): The maximum value in the A column.
(5) MIN (A): The minimum value in the A column.
(ex) Find the average age of sailors
MIN (or MAX)
Þ SELECT AVG (age)
FROM Sailors;
Find the average age of sailors with a rating of 10.
MIN (or MAX)
Þ SELECT AVG (age)
FROM Sailors
WHERE rating = 10;
Find the number of sailors.
* refers to the rows (tuples), so COUNT (*)
returns the number of rows in the result of the query
Þ SELECT COUNT (*)
FROM Sailors;
Find the number of distinct sailor names.
Þ SELECT COUNT (DISTINCT sname)
FROM Sailors;
Find the names of sailors who are older than the oldest sailor with a rating of 10.
Þ SELECT S.sname
FROM Sailors S
WHERE S.age > ( SELECT MAX (S2.age)
FROM Sailors S2
WHERE S2.rating = 10 );
(Remark) Find name and age of oldest sailors
SELECT S.sname, MAX (S.age) Þ illegal ( cannot combine aggregate and
FROM Sailors S; nonaggregate in the same query)
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age = ( SELECT MAX (S2.age)
FROM Sailors S2);
- Grouping (See section 7.1.8 (pp 218-222) of the textbook for details)
· In many cases, we want to apply the aggregate functions to subgroups of tuples in a
relation, based on some attributes. In these cases, we need to group the tuples that
have the same value of some attribute(s) and we need to apply the function to each
such group independently.
(ex) Find the average salary of employees in each department.
Find the average age for sailors of each rating.
· The GROUP BY-clause specifies the grouping attributes, which should also appear
in the SELECT-clause, so that the value resulting from applying each function to a
group of tuples appears along with the value of the grouping attribute(s).
· (Form) SELECT (DISTINCT) <attribute-list>
FROM <table-list>
WHERE <condition>
GROUP BY <grouping attribute(s)>
HAVING <group condition> ;
(ex)
Sailors / sid / Sname / rating / Age22 / Dustine / 7 / 45.0
31 / Lubber / 8 / 55.5
32 / Andy / 8 / 25.5
85 / Art / 3 / 25.5
95 / Bob / 3 / 63.5
58 / Rusty / 10 / 35.0
29 / Brutus / 1 / 33.0
64 / Horatio / 7 / 35.0
71 / Zorba / 10 / 16.0
Find the age of the youngest sailors who is eligible to vote (i.e. is at least 18
years old) for each rating level with at least two such sailors.
SELECT rating, MIN(age)
FROM Sailors Notice that the SELECT-clause includes
WHERE age>18 Þ only the grouping attribute and the functions
GROUP BY rating to be applied on each group of tuples.
HAVING COUNT(*)>1;
(1) Construct cross product of relations in FROM <table-list> if |table-list|>1.
Otherwise, use the original table.
Þ Use the original table.
(2) Apply condition in WHERE-clause to eliminate rows.
Þ Eliminate < 71, Zorba, 10, 16.0 >.
(3) Eliminate unwanted columns
: need columns mentioned in SELECT, GROUP BY or HAVING.
Þ Eliminate sid and sname.
(4) Sort table by GROUP BY-clause to identify groups.
(5) Apply the group condition in the HAVING-cluse
: COUNT(*)>1 Þ Eliminate group with rating=1 and rating=10.
(6) Generate one answer row per remaining group.
<DISTINCT> would now be applied.
after (3) after (4) after (5)
rating / age1 / 33.0
3 / 25.5
3 / 63.5
7 / 45.0
7 / 35.0
8 / 55.5
8 / 25.5
10 / 35.0
rating / age
7 / 45.0
8 / 55.5
8 / 25.5
3 / 25.5
3 / 63.5
10 / 35.0
1 / 33.0
7 / 35.0
rating / age
1 / 33.0
3 / 25.5
3 / 63.5
7 / 45.0
7 / 35.0
8 / 55.5
8 / 25.5
10 / 35.0
after (6)
rating / age3 / 25.5
7 / 35.0
8 / 25.5
(ex) For each red boat, number of reservations for that boat.
SELECT B.bid, COUNT(*) AS reservecount
FROM Boats B, Reserves R
WHERE R.bid=B.bid AND B.color=‘red’
GROUP BY B.bid;
HAVING ← No group condition is necessary
(ex) Average age of sailors for each rating level that has at least two sailors.
SELECT S.rating, AVG(S.age) AS average
FROM Sailors S
WHERE ← No tuple condition is necessary
GROUP BY S.rating
HAVING COUNT(*)>1;
- Insert, Delete, and Update statements
(See section 6.4 (pp198-201) of the textbook for details)
(1) INSERT is used to add a single tuple to a relation.
(ex) INSERT INTO Sailors
VALUES (‘10’, ‘Joe’, 7, 30.0);
(2) DELETE removes tuples from a relation.
(ex) DELETE FROM Sailors
WHERE sid = 22; (or sname = ‘rusty’)
(3) UPDATE is used to modify attribute values of one or more selected tuples.
(ex) UPDATE Sailors
SET sname = ‘michael’, rating = 10
WHERE sid = 22;
(Note) NULL Value: SQL provides a special column value, called NULL, to use when
a column is unknown or unapplicable.