Relational Databases BORROWED WITH MINOR ADAPTATION FROM PROF. CHRISTOS FALOUTSOS, CMU 15-415/615
Roadmap 3
Introduction
Integrity constraints (IC)
Enforcing IC
Querying Relational Data
ER to tables
Intro to Views
Destroying/altering tables
Why Study the Relational, a.k.a. “SQL” Model? 4
Most widely used model. Vendors: IBM/Informix, Microsoft, Oracle, Sybase, etc. “Legacy systems” in older models e.g., IBM’s IMS Object-oriented concepts have merged in object-relational model Informix->IBM DB2, Oracle
Essentially for up to millions of records (Beyond that, think NoSQL)
Relational Database: Definitions relations
(relation
database: a set of
5
Relational
= table)
specifically
sid 53666 53688 53650
name login Jones
[email protected] Smith
[email protected] Smith
[email protected]
age 18 18 19
gpa 3.4 3.2 3.8
Relational Database: Definitions made up of 2 parts: Schema : specifies name of relation, plus name and type of each column. Instance : a table, with rows and columns. = cardinality #fields = degree / arity
6
Relation:
#rows
sid 53666 53688 53650
name login Jones
[email protected] Smith
[email protected] Smith
[email protected]
age 18 18 19
gpa 3.4 3.2 3.8
Relational Database: Definitions 7
relation:
a set of rows or tuples.
all rows are distinct
no order among rows (why?)
sid 53666 53688 53650
name login Jones
[email protected] Smith
[email protected] Smith
[email protected]
age 18 18 19
gpa 3.4 3.2 3.8
Ex: Instance of Students Relation sid 53666 53688 53650
name login Jones
[email protected] Smith
[email protected] Smith
[email protected]
age 18 18 19
8
gpa 3.4 3.2 3.8
• Cardinality = 3, arity = 5 , • all rows distinct • Q: do values in a column need to be distinct?
SQL* (a.k.a. “Sequel”), standard language
Data Definition Language (DDL) create,
modify, delete relations
specify
constraints
administer E.g.:
users, security, etc.
create table student (ssn fixed, name char(20));
* Structured Query Language
9
SQL - A language for Relational DBs
Data Manipulation Language (DML) Specify
criteria
add,
queries to find tuples that satisfy
modify, remove tuples select * from student ; update takes set grade=4 where name=‘smith’ and cid = ‘db’;
10
SQL - A language for Relational DBs
SQL Overview TABLE
( , … ) INSERT INTO () VALUES () DELETE FROM WHERE
11
CREATE
SQL Overview SET = WHERE SELECT FROM WHERE
12
UPDATE
Creating Relations in SQL the Students relation.
CREATE TABLE Students (sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)
13
Creates
Creating Relations in SQL the Students relation. Note: the type (domain) of each field is specified, and enforced by the DBMS whenever tuples are added or modified. CREATE TABLE Students (sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa FLOAT)
14
Creates
Table Creation (continued) 15
Another
example:
CREATE TABLE Enrolled (sid CHAR(20), cid CHAR(20), grade CHAR(2))
Adding and Deleting Tuples insert a single tuple using:
INSERT INTO Students (sid, name, login, age, gpa) VALUES (‘53688’, ‘Smith’, ‘[email protected]’, 18, 3.2)
16
Can
Adding and Deleting Tuples ‘mass’-delete (all Smiths!) : DELETE FROM Students S WHERE S.name = ‘Smith’
17
•
Roadmap 18
Introduction
Integrity constraints (IC)
Enforcing IC
Querying Relational Data
ER to tables
Intro to Views
Destroying/altering tables
Keys help associate tuples in different relations
Keys
(IC)
are one form of integrity constraint
Enrolled sid 53666 53666 53650 53666
19
Keys
cid 15-101 18-203 15-112 15-105
Students grade C B A B
sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
age 18 18 19
gpa 3.4 3.2 3.8
(Motivation: ) 20
In flat files, how would you check for duplicate ssn, in a student file?
(horror stories, if ssn is duplicate?)
sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
age 18 18 19
gpa 3.4 3.2 3.8
Keys help associate tuples in different relations
Keys
(IC)
are one form of integrity constraint
Enrolled sid 53666 53666 53650 53666
21
Keys
cid 15-101 18-203 15-112 15-105
Students grade C B A B
FOREIGN Key
sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
PRIMARY Key
age 18 18 19
gpa 3.4 3.2 3.8
Primary Keys set of fields is a superkey if:
No
two distinct tuples can have same values in all key fields
A
set of fields is a key for a relation if :
minimal
superkey
Student (ssn, name, address) {ssn,name}: superkey {ssn}: superkey, AND key {name}: not superkey
22
A
Primary Keys if >1 key for a relation?
23
what
Primary Keys
if >1 key for a relation?
one of the keys is chosen (by DBA) to be the primary key. Other keys are called candidate keys..
Q:
example of >1 superkeys?
24
what
Primary Keys
25
what if >1 key for a relation? one of the keys is chosen (by DBA) to be the primary key. Other keys are called candidate keys.. Q: example of >1 superkeys? A1: student: {ssn}, {student-id#}, {driving license#, state} A2: Employee: {ssn}, {phone#}, {room#} A3: computer: {mac-address}, {serial#}
Primary Keys sid
26
E.g.
is a key for Students.
What The
about name?
set {sid, gpa} is a superkey.
Syntax: 27
CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2))
Syntax: 28
CREATE TABLE Enrolled (sid CHAR(20) cid CHAR(20), grade CHAR(2), PRIMARY KEY (sid,cid))
PRIMARY KEY == UNIQUE, NOT NULL
Drill: 29
CREATE TABLE Enrolled CREATE TABLE Enrolled (sid CHAR(20) (sid CHAR(20) cid CHAR(20), cid CHAR(20), vs. grade CHAR(2), grade CHAR(2), PRIMARY KEY (sid), PRIMARY KEY (sid,cid)) UNIQUE (cid, grade))
Drill: 30
CREATE TABLE Enrolled CREATE TABLE Enrolled (sid CHAR(20) (sid CHAR(20) cid CHAR(20), cid CHAR(20), vs. grade CHAR(2), grade CHAR(2), PRIMARY KEY (sid), PRIMARY KEY (sid,cid)) UNIQUE (cid, grade)) Q: what does this mean?
Primary and Candidate Keys in SQL 31
CREATE TABLE Enrolled CREATE TABLE Enrolled (sid CHAR(20) (sid CHAR(20) cid CHAR(20), cid CHAR(20), vs. grade CHAR(2), grade CHAR(2), PRIMARY KEY (sid), PRIMARY KEY (sid,cid)) UNIQUE (cid, grade)) “Students can take only one course, and no two students in a course receive the same grade.”
Foreign Keys 32
Enrolled sid 53666 53666 53650 53666
cid 15-101 18-203 15-112 15-105
grade C B A B
Students sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
age 18 18 19
gpa 3.4 3.2 3.8
Foreign Keys, Referential Integrity Foreign key : Set of fields `refering’ to a
33
tuple in another relation. Must correspond to the primary key of the other relation. Like a `logical pointer’. foreign key constraints enforce referential integrity (i.e., no dangling references.)
Foreign Keys in SQL
sid is a foreign key referring to Students:
Enrolled sid 53666 53666 53650 53666
cid 15-101 18-203 15-112 15-105
grade C B A B
Students sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
age 18 18 19
gpa 3.4 3.2 3.8
34
Example: Only existing students may enroll for courses.
Foreign Keys in SQL 35
CREATE TABLE Enrolled (sid CHAR(20),cid CHAR(20),grade CHAR(2), PRIMARY KEY (sid,cid), FOREIGN KEY (sid) REFERENCES Students )
Enrolled sid 53666 53666 53650 53666
cid 15-101 18-203 15-112 15-105
grade C B A B
Students sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
age 18 18 19
gpa 3.4 3.2 3.8
Roadmap 36
Introduction
Integrity constraints (IC)
Enforcing IC
Querying Relational Data
ER to tables
Intro to Views
Destroying/altering tables
Enforcing Referential Integrity 37
Subtle issues:
What should be done if an Enrolled tuple with a non-existent student id is inserted?
Enrolled sid 53666 53666 53650 53666
cid 15-101 18-203 15-112 15-105
grade C B A B
Students sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
age 18 18 19
gpa 3.4 3.2 3.8
Enforcing Referential Integrity 38
Subtle issues:
What should be done if an Enrolled tuple with a non-existent student id is inserted? (Reject it!)
Enrolled sid 53666 53666 53650 53666
cid 15-101 18-203 15-112 15-105
grade C B A B
Students sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
age 18 18 19
gpa 3.4 3.2 3.8
Enforcing Referential Integrity Subtle issues, cont’d:
What should be done if a Student’s tuple is deleted?
Enrolled sid 53666 53666 53650 53666
cid 15-101 18-203 15-112 15-105
grade C B A B
39
Students sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
age 18 18 19
gpa 3.4 3.2 3.8
Enforcing Referential Integrity Subtle issues, cont’d:
What should be done if a Students tuple is deleted? Also
delete all Enrolled tuples that refer to it?
Disallow
deletion of a Students tuple that is referred to?
Set
sid in Enrolled tuples that refer to it to a default sid?
(In
SQL, also: Set sid in Enrolled tuples that refer to it to a special value null, denoting `unknown’ or `inapplicable’.)
40
Enforcing Referential Integrity Similar issues arise if primary key of Students tuple is updated.
41
Integrity Constraints (ICs) condition that must be true for any instance of the database; e.g., domain constraints. ICs are specified when schema is defined. ICs are checked when relations are modified.
42
IC:
Integrity Constraints (ICs) legal instance of a relation: satisfies all specified ICs. DBMS should not allow illegal instances. we prefer that ICs are enforced by DBMS (as opposed to ?) Blocks data entry errors, too!
43
A
Where do ICs Come From? 44
Where do ICs Come From? application!
45
the
Where do ICs Come From? point: We can check a database instance to see if an IC is violated, but we can NEVER infer that an IC is true by looking at an instance. An
IC is a statement about all possible instances!
Eg.,
name is not a key,
but
the assertion that sid is a key is given to us. sid 53666 53688 53650
name login Jones [email protected] Smith [email protected] Smith [email protected]
age 18 18 19
gpa 3.4 3.2 3.8
46
Subtle
Where do ICs Come From? and foreign key ICs are the most common; more general ICs supported too.
47
Key
Roadmap 48
Introduction
Integrity constraints (IC)
Enforcing IC
Querying Relational Data
ER to tables
Intro to Views
Destroying/altering tables
ER to tables outline: 49
strong entities
weak entities
(binary) relationships
1-to-1, 1-to-many, etc
total/partial participation
ternary relationships
ISA-hierarchies
aggregation
Logical DB Design: ER to Relational
(strong) entity sets to tables.
ssn
name
Employees
lot
50
Logical DB Design: ER to Relational
(strong) entity sets to tables.
ssn
name
lot
51
Ssn
Name
Lot
123-22-6666
Attishoo
48
233-31-5363
Smiley
22
131-24-3650
Smethurst 35
Employees
CREATE TABLE Employees (ssn CHAR(11), name CHAR(20), lot INTEGER, PRIMARY KEY (ssn))
Relationship Sets to Tables 52
Many-to-many: since name ssn
dname lot
Employees
did
Works_In
budget
Departments
Relationship Sets to Tables 53
Many-to-many: since name ssn
dname lot
Employees
budget
did
Works_In
Departments
Ssn
Name
Lot
Ssn
did
since
123-22-6666
Attishoo
48
123-22-6666 51
1/1/91
233-31-5363
Smiley
22
123-22-6666 56
3/3/93
131-24-3650
Smethurst 35
233-31-5363 51
2/2/92
Relationship Sets to Tables 54
CREATE TABLE Works_In( key of many-to-many ssn CHAR(11), did INTEGER, relationships: since DATE, Keys from PRIMARY KEY (ssn, did), participating entity FOREIGN KEY (ssn) sets (as foreign keys). REFERENCES Employees, FOREIGN KEY (did) REFERENCES Departments) Ssn
did
since
123-22-6666 51
1/1/91
123-22-6666 56
3/3/93
233-31-5363 51
2/2/92
Review: Key Constraints in ER 1-to-many:
since name ssn
dname lot
Employees
did
Manages
budget
Departments
55
(Reminder: Key Constraints in ER) 56
1-to-1
1-to Many
Many-to-1
Many-to-Many
ER to tables - summary of basics 57
strong entities: key
-> primary key
(binary) relationships: get
keys from all participating entities - pr. key:
1-to-1
-> either key (other: ‘cand. key’)
1-to-N
-> the key of the ‘N’ part
M-to-N
-> both keys
A subtle point (1-to-many) 58
since name ssn
dname did
lot
Employees
Manages
budget
Departments
Translating ER with Key Constraints since
name
ssn
dname
did
lot Employees
Manages
CREATE TABLE Manages( ssn CHAR(11), did INTEGER, since DATE,
PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees, FOREIGN KEY (did) REFERENCES Departments)
budget
Departments
CREATE TABLE Departments( did INTEGER), dname CHAR(20), budget REAL, PRIMARY KEY (did), )
Two-table-solution
59
Translating ER with Key Constraints since
name
ssn
dname
did
lot Employees
Manages
budget
Departments
CREATE TABLE Dept_Mgr( ssn CHAR(11), did INTEGER, since DATE, dname CHAR(20), budget REAL, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees)
Single-table-solution
60
Translating ER with Key Constraints since
name
ssn
dname
Employees
CREATE ssn did since
did
lot Manages
budget
Departments
TABLE Manages( CREATE TABLE Dept_Mgr( CHAR(11), ssn CHAR(11), INTEGER, did INTEGER, Vs. since DATE, DATE, dname CHAR(20), budget REAL, PRIMARY KEY (did), PRIMARY KEY (did), FOREIGN KEY (ssn) FOREIGN KEY (ssn) REFERENCES Employees, REFERENCES Employees) FOREIGN KEY (did) REFERENCES Departments)
61
Pros and cons? 62
Drill: 63
What if the toy department has no manager (yet) ?
CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11), since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees)
Drill:
A: one-table solution can not handle that. (ie., helps enforce ‘thick arrow’ – see CREATE TABLE Dept_Mgr( did INTEGER, since next) dname CHAR(20), name dname ssn
did
lot Employees
Manages
budget
Departments
budget REAL, ssn CHAR(11), since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees)
64
What if the toy department has no manager (yet) ?
Rules: Thin
arrow -> one-table solution
arrow -> two-table solution
(More rules: next)
since
name ssn
dname
did
lot Employees
Manages
budget
Departments
CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11), since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees)
65
Thick
ER to tables outline: 66
strong entities
weak entities
(binary) relationships
1-to-1, 1-to-many, etc
total/partial participation
ternary relationships
ISA-hierarchies
aggregation
Review: Participation Constraints Does every department have a manager? If so, this is a participation constraint: the participation of Departments in Manages is said to be total (vs. partial).
Every
did value in Departments table must appear in a row of the Manages table (with a non-null ssn value!) since
name ssn
dname did
lot Employees
Manages
Works_In
since
budget Departments
67
Participation Constraints in SQL We can capture participation constraints involving one entity set in a binary relationship, but little else (without resorting to CHECK constraints).
CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11) NOT NULL, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees, ON DELETE NO ACTION)
68
Participation Constraints in SQL Total participation (‘no action’ -> do NOT do the delete)
Ie, a department MUST have a nanager
CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11) NOT NULL, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees, ON DELETE NO ACTION)
69
Participation Constraints in SQL Partial partipation, ie, a department may be headless
CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11) NOT NULL, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees, ON DELETE SET NULL)
70
Participation Constraints in SQL Partial partipation, ie, a department may be headless
OR (better): use the two-table solution
CREATE TABLE Dept_Mgr( did INTEGER, dname CHAR(20), budget REAL, ssn CHAR(11) NOT NULL, since DATE, PRIMARY KEY (did), FOREIGN KEY (ssn) REFERENCES Employees, ON DELETE SET NULL)
71
ER to tables outline: 72
strong entities
weak entities
(binary) relationships
1-to-1, 1-to-many, etc
total/partial participation
ternary relationships
ISA-hierarchies
aggregation
Review: Weak Entities A weak entity can be identified uniquely only by considering the primary key of another (owner) entity.
Owner entity set and weak entity set must participate in a one-to-many relationship set (1 owner, many weak entities).
Weak entity set must have total participation in this identifying relationship set.
name ssn
lot
Employees
cost
Policy
dname
age
Dependents
73
Review: Weak Entities 74
How to turn ‘Dependents’ into a table?
name ssn
lot
Employees
cost
Policy
dname
age
Dependents
Translating Weak Entity Sets
CREATE TABLE Dep_Policy ( dname CHAR(20), age INTEGER, cost REAL, ssn CHAR(11) NOT NULL, PRIMARY KEY (dname, ssn), FOREIGN KEY (ssn) REFERENCES Employees, ON DELETE CASCADE)
75
Weak entity set and identifying relationship set are translated into a single table (== ‘total participation’)
Translating Weak Entity Sets
When the owner entity is deleted, all owned weak entities must also be deleted (-> ‘CASCADE’)
CREATE TABLE Dep_Policy ( dname CHAR(20), age INTEGER, cost REAL, ssn CHAR(11) NOT NULL, PRIMARY KEY (dname, ssn), FOREIGN KEY (ssn) REFERENCES Employees, ON DELETE CASCADE)
76
Weak entity set and identifying relationship set are translated into a single table.
ER to tables outline: 77
strong entities
weak entities
(binary) relationships
1-to-1, 1-to-many, etc
total/partial participation
ternary relationships
ISA-hierarchies
aggregation
name
Review: ISA Hierarchies hourly_wages
ssn
lot 78 Employees
hours_worked ISA contractid
Hourly_Emps
Contract_Emps
Overlap constraints: Can Joe be an Hourly_Emps as well as a Contract_Emps entity? (Allowed/disallowed)
Covering constraints: Does every Employees entity also have to be an Hourly_Emps or a Contract_Emps entity? (Yes/no)
Drill: 79
What would you do?
name ssn
lot
Employees hourly_wages
hours_worked ISA contractid
Hourly_Emps
Contract_Emps
Translating ISA Hierarchies to Relations General approach: 3 relations: Employees, Hourly_Emps and Contract_Emps. how many times do we record an employee? what to do on deletion? EMP (ssn, lot) an employee? how to retrieve allname, info about
H_EMP(ssn,
h_wg, h_wk)
CONTR(ssn, cid)
80
Translating ISA Hierarchies to Relations Alternative: Just Hourly_Emps and Contract_Emps. Hourly_Emps: ssn, name, lot, hourly_wages, hours_worked. Each employee must be in one of these two subclasses. EMP (ssn, name, lot)
H_EMP(ssn, h_wg, h_wk, name, lot) CONTR(ssn, cid, name, lot)
Notice: ‘black’ is gone!
81
ssn name
all
h_wg
hourly
cid
contractors
82
Not in book – why NOT 1 table + nulls?
TYPE ssn name
h_wg
cid
…
… all
hourly
contractors
83
Not in book – why NOT 1 table + nulls?
ER to tables outline: 84
strong entities
weak entities
(binary) relationships
1-to-1, 1-to-many, etc
total/partial participation
ternary relationships
ISA-hierarchies
aggregation
Ternary relationships; aggregation 85
rare
keep keys of all participating entity sets
(or: avoid such situations: break into 2-way relationships or add an auto-generated key
)
Roadmap 86
Introduction
Integrity constraints (IC)
Enforcing IC
Querying Relational Data
ER to tables
Intro to Views
Destroying/altering tables
Views 87
Virtual tables CREATE VIEW YoungActiveStudents(name,grade) AS SELECT S.name, E.grade FROM Students S, Enrolled E WHERE S.sid=E.sid and S.age<21
DROP VIEW
Views and Security 88
DBA: grants authorization to a view for a user
user can only see the view - nothing else
Roadmap 89
Introduction
Integrity constraints (IC)
Enforcing IC
Querying Relational Data
ER to tables
Intro to Views
Destroying/altering tables
Table changes 90
DROP TABLE
ALTER TABLE, e.g. ALTER TABLE students
ADD COLUMN maiden-name CHAR(10)
Relational Model: Summary A tabular representation of data.
Simple and intuitive; widely used
Integrity constraints can be specified by the DBA, based on customer specs. DBMS checks for violations.
Two important ICs: primary and foreign keys
also: not null, unique
In addition, we always have domain constraints.
Mapping from ER to Relational is (fairly) straightforward:
91
ER to tables - summary of basics 92
strong entities:
key -> primary key
(binary) relationships:
get keys from all participating entities - pr. key:
1:1 -> either key
1:N -> the key of the ‘N’ part
M:N -> both keys
weak entities:
strong key + partial key -> primary key
..... ON DELETE CASCADE
total/partial participation:
93
ER to tables - summary of advanced NOT NULL; ON DELETE NO ACTION
ternary relationships:
get keys from all; decide which one(s) -> prim. key
aggregation: like relationships
ISA:
2 tables (‘total coverage’)
3 tables (most general) ISA