Summary

Introduction

Trends: data independence - immunity of applications to change in storage structure and access strategy
Relational data model: Proposed by E.F. Codd, store information in two ways:
- by the contents of records stored in the database
- by the ways in which these records are connected together
Two properties:
- all information is represented by data values
- the system supports a very high-level language

Relational access method: XRM.
XRM stores relations in the form of “tuples”, each of which has a unique 32-bit “tuple identifier”: TID.
The most challenging task: design of optimizer algorithms for efficient execution of SQL statements on top of XRM. => optimizer made extensive use of inversions
Demonstrated the usability of SQL.
Pros & Cons:
- Pros: fast w/ finding distinct jobs
- Cons:
  - Too many I/Os For each tuple, look up all its fields
  - Use “inversions” to find TIDs with a given value for a field
Lessons learned:
- The optimizer should take into account not just the cost of fetching tuples, but the costs of creating and manipulating TID lists.
- A better measure of cost would have been number of “I/Os”.
- Cost measure should be a weighted sum of CPU time and I/O count.
- Importance of “join”
- Simpler interaction, minimize the “path length” for simple SQL statements.

Assess method: Research Storage System (RSS)
- Store data in the individual records of the database.
- Pros: all the data value of a record could be fetched by a single I/O.
- B-tree indexes.
Optimizing SQL processor: Relational Data System (RDS)
Subsystems:
- View & Authorization subsystem
  - View definition in the form of an SQL parse tree.
  - Authorization is based on privileges, controlled by GRANT and REVOKE.
- Recovery subsystem
  - Media(disk) failure: image dump + change log.
  - System failure: change log + shadow pages.
    - Why do we need both shadow pages and change logs?
      - You may need log for transaction replay
      - You can only have log because shadow mechanism is quite expensive for large logs
  - Transaction failure: process the change log backwards.
- Locking subsystem
  - “predicate locks” (abandoned)
    - determine whether predicates are satisfiable are difficult and time-consuming
    - two predicates may appear to conflict
  - hierarchy of locks (used by MySQL)
  - “intention” locks