Database Notes
Key Concepts
Section titled “Key Concepts”Relation
Section titled “Relation”- In relational databases, a relation is basically a table.
- It consists of rows (tuples) and columns (attributes).
👉 Example: A STUDENT
relation
student_id | name | age | major |
---|---|---|---|
1 | Alice | 20 | Biology |
2 | Bob | 21 | Computer Science |
3 | Carol | 22 | Math |
Schema
Section titled “Schema”- The schema is the structure (definition) of a relation.
- It describes what attributes (columns) the relation has and their data types.
- Think of it as the blueprint of a table.
👉 Example:
STUDENT(student_id INT, name VARCHAR(50), age INT, major VARCHAR(50))
This schema says:
student_id
�?integername
�?string up to 50 charactersage
�?integermajor
�?string up to 50 characters
Instance
Section titled “Instance”- An instance is the actual content (data rows) in a table at a given moment.
- The schema is fixed, but the instance can change over time (when you insert/update/delete rows).
👉 Example:
The current instance of STUDENT
might be:
student_id | name | age | major |
---|---|---|---|
1 | Alice | 20 | Biology |
2 | Bob | 21 | Computer Science |
If tomorrow you insert Carol, the instance changes.
Attribute
Section titled “Attribute”- An attribute is a column of a relation.
- Each attribute has a name and a domain (data type).
👉 Example:
In STUDENT(student_id, name, age, major)
student_id
is an attribute (domain: integer)name
is an attribute (domain: string)- etc.
- A tuple is a row (record) in a relation.
- It represents one entry in the table.
👉 Example:
(1, 'Alice', 20, 'Biology')
is a tuple in STUDENT
.
DDL: definition
DML: manipulation
RDBMS
Basic SQL
Section titled “Basic SQL”primary key (sid)
do not have to be one column
foreign key => primary key (by default)
you can practice these on Leetcode, or simply ask AI
FROM
WHERE
SELECT DISTINCT / AVG
ORDER BY DESC / ASC
LIMIT
GROUP BY
HAVING COUNT (*) > 2
SELECT seller, SUM(amount) AS total_salesFROM salesGROUP BY sellerHAVING SUM(amount) > 200;
Sequence
Section titled “Sequence”FROM : identify table
WHRER : apply selections (eliminate rows)
SELECT : project away columns
GROUP BY : form groups & aggregate
HAVING : eliminate groups
DISTINCT : eliminate duplicates
Cross (Cartesian) Product
Section titled “Cross (Cartesian) Product”A list of tables => combination of tuples => WHERE selection
Aliases
Section titled “Aliases”self-join
SELECT x.name, x.age, y.sname AS sname2, y.age AS age2FROM Sailors AS X, Sailors AS yWHERE x.age > y.age
you can combine with arithmetic expressions
String Comparisons
Section titled “String Comparisons”SELECT S.snameFROM Sailors SWHERE S.name ~ 'B.*'
Combining Predicts
Section titled “Combining Predicts”AND / OR
UNION / INTERSECT / EXCEPT
Multiset Semantics
Section titled “Multiset Semantics”{UNION, INTERSECT, EXCEPT} ALL
ANY
SELECT *FROM Sailors SWHERE S.rating > ANY(SELECT S2.ratingFROM Sailors S2WHERE S2.sname='Popeye')
Nested Queries
Section titled “Nested Queries”IN => subquery NOT IN
EXISTS
correlated subquery
SELECT S.nameFROM Sailors SWHERE EXISTS (SELECT * FROM RESERVES R WHERE R.bid=102 AND S.sid=R.sid)
Division
Section titled “Division”Find sailors who’ve reserved all boats. (sailors with no counterexample missing boats)
SELECT S.snameFROM Sailors SWHERE NOT EXISTS (SELECT B.bid FROM Boats B WHERE NOT EXISTS (SELECT R.bid FROM Reserves R WHERE R.bid=B.bid AND R.sid=S.sid))
ARGMAX
Section titled “ARGMAX”commonly used in optimization problems
-- just the maxSELECT MAX(S.rating)FROM Sailors S;
-- illegal querySELECT S.*, MAX(S.rating)FROM Sailors S;
right implementations
SELECT *FROM S.rating >= ALL (SELECT s2.rating FROM Sailors S2)
-- single return, not entirely deterministicSELECT *FROM Sailors SORDER BY rating DESCLIMIT 1;
INNER JOIN
SELECT s.*, r.bidFROM Sailors s, Reserves rWHERE s.sid = r.sidAND ...
SELECT s.*, r.bidFROM Sailors s INNER JOIN Reserves rON s.sid = r.sidWHERE ...
others
-- INNER is defaultSELECT <column expression list>FROM table name[INNER | NATURAL| {LEFT | RIGHT | FULL} {OUTER}] JOINtable nameON <qualification_list>WHERE ...
generally speaking, natural join is not a good a idea to use
create a virtual table
CREATE VIEW Redcount...
SELECT * FROM Redcount;
view on the fly
SELECT bname, scountFROM Boats2 B,(SELECT B.bid, COUNT (*) FROM Boats2 B, Reserves2 R WHERE R.bid = B.bid AND B.color = 'red' GROUP BY B.bid) AS Reds(bid, scount)WHERE Reds.bid = B.bid AND scount < 10
another “view on the fly” syntax, you can have many queries in WITH
WITH Reds(bid, scount) AS(SELECT B.bid, COUNT (*)FROM Boats2 B, Reserves2 RWHERE R.bid = B. bid AND B.color = 'red'GROUP BY B.bid),
UnpopularReds ASSELECT bname, scountFROM Boats2 B, RedsWHERE Reds.bid = B.bidAND scount < 10
SELECT * FROM UnpopularReds;
ARGMAX GROUP BY
for each age, we’re going to group the max rating
WITH maxratings(age, maxrating) AS(SELECT age, MAX(rating)FROM SailorsGROUP BY age)
SELECT S.* FROM Sailors S, maxratings mWHERE S.age = m.age AND S.rating = m.maxrating;
Rule: (x op NULL) evaluates to … NULL!
With Boolean Logic
NULL AND TRUE => NULL;NULL AND FALSE => FALSE;
NULL OR TRUE => TRUE;NULL OR FALSE => FALSE;
Aggregation: NULL will be ignored
SELECT count(*) FROM sailors; -- should consider NULLSELECT count(rating) FROM sailors;SELECT SUM(rating) FROM sailors;SELECT AVG(rating) FROM sailors;