Lecture One –Database

1.1Introduction to Databases and Systems

Readings:
  • Required: Connolly and Begg, Sections 1.3.4 and 1.7.
  • Suggested: Connolly and Begg, Section 1.2.
  • Elective: Connolly and Begg, Section 1.5.

Q1: What are databases and database systems?

Q2: What advantages does a database system offer over a traditional file system?

Q3: What is difference among data, database, database system, and database management system(DBMS)?

File:

- Data: 'raw facts', little meaning and un-organised.

- Field: data grouped together to have some specific meaning.

- Record: logically connected set of fields.

- File: a collection of related records.

An example of a Staff File.

Name / Post / Salary / Date of entry
John / EDP Manager / $25000.00 / 12/08/1996
.. / .. / .. / ..

Related data

The data represents logically coherent facts about some aspects of the real world that are required by an application. For example, in a college database, data about students, faculty, courses and the likes.

Integrated data

Data for multiple applications is stored together and manipulated in a uniform way on a secondary storage. Integration is to support information sharing across multiple applications. For example, the registrar's office can share registration data with the accounting office for billing purposes.

Database

Integrated related data.

Database Management System (DBMS)

A DBMS is a collection of programs that controls a database. It provides an interface to create, maintain, and manipulate multiple databases.

Database System

A database system is to support the operations of a specific set of applications. It consists of:

Database, DBMS and Application software.

Database approach:

Three distinguishing characteristics data abstraction, reliability, and efficiency.

(Before that, let’s see the traditional file system: application programs directly access the stored data, and for this reason the names of files and the organization of data in these files, along with any other data definitions, are embedded in each program that accesses them. So what are the problems with that?)

Data Abstraction

Allowing data to be structured in ways that are more understandable and meaningful than the ways data are physically stored on disks.

High-level, conceptual representations of the data. Example Table column and date data.

The physical organization of data can be changed without affecting the application programs. Example a table stored on more than one physical disk.

Data Independence

The application programs can still continue to operate on the conceptual representation of the data. Similarly, as long as their calling interface stays the same, the implementation of abstract operations can be changed without affecting the code of the application programs. The former is referred to as program-data independence and the latter as program-operation independence.

DBMS stores the structure of the data as part of the description of the database in the system catalog, separately from the application programs.

Reliability

DBMSs provide high reliability by

1)Enforcing integrity constraints.

Integrity constraints reflect the meaning of the data and of the application. E.g. data type. such as an positive integer for price, or enforcing airline reservation system that no two passengers can be assigned the same seat in the same flight.

2)Ensuring data consistency.

When the system (H/W or S/W) fails, DBMSs still guarantee that interrupted update operations do not corrupt the database and no data is lost. After a failure, a DBMS automatically recovers, restoring the database to the consistent state in which it existed just prior to the interruption.

For example, consider the transference of $100 from a savings to a cheque account.

Savings balance = Savings balance – 100

Cheque balance = Cheque balance + 100

Efficiency
  • Minimize Data Redundancy which in turn saves both space, by storing each data item only once, and processing time, by eliminating the need of multiple updates to keep the replicas consistent and up-to-date.
  • Query optimizations simplify the queries so that they can execute faster, and access methods allow direct access to locations where relevant data are stored.
  • DBMSs also decrease response time of transactions by allowing multiple users to access the database concurrently.

Data Duplication

In many cases, same pieces of data may be stored in different locations/files, it will lead to data duplications. Such duplications will lead to a number of problems, and these problems could cause many problems. Let’s see an example:

Student Admission Record

Student Name / Address / Phone No. / Course
John / 1 st. H.K / 2823456 / DCO20107
May / B Tai Wai / 2552433 / DCO10123
Jack / 22 Ma St. Kln. / 2997343 / DCO20105

Student Society

Student / Address / Phone No. / Society / Post
May / B Tai Wai / 2552433 / Art / Member
May / B Tai Wai / 2552433 / Dancing / Chairman
May / B Tai Wai / 2552433 / Dancing / Trainner
Jack / 221 Ma St. Kln. / 2997343 / Tennis / Trainer
James / 34 Wong Yin St. / 2888777 / Baseball / Member

What a DBMS can do? Or Why we need a DBMS?

  1. Store and manage large quantities of related data on a secondary storage that can be efficiently retrieved, accessed and processed.
  2. Ensure the validity and consistency of the stored data by enforcing integrity constraints on their values.
  3. Ensure durability or permanence of data. This means that the effects of successfully completed transactions become permanent in the database, surviving any subsequent system failures. This makes a DBMS particularly suitable for handling the data in mission critical applications such as banking and electronic commerce.
  4. Support easy and efficient data sharing among many users by concurrency control. Allowing two or more users to access the database concurrently, interleaving their operations. Without concurrency control, two users accessing and updating the same data items could interfere with each other in such a way as to result in the loss of information or the violation of integrity.
  5. Provide various levels of representation. DBMS can present each user with different portions of the database based on the user's perspectives. Views can specify the portions of the database that are relevant to a particular user.
  6. Protects from unauthorized access by security and access control methods. Security techniques include encryption and authentication. Access control methods control the kinds of access available to authorized users. They deal with controlling who has access to whatdata, and with controlling the operations they can perform on that data.

(Prepared by Bryancheung-2001)1b-1/11