Introduction

Data VS Information: Information is data presented in context.

Metadata can include: structure, rules and constraints. Ensure consistency and meaning.

Database is a large, integrated, structured collection of data/

Entities and relations: Database systems manage data in a structured way.

Rows&Columns forming relations. Keys&Foreign keys to link relations.

Advantages of Databases vs File processing Systems

  • Data independence

    separation of data and program, application logic

    central data repository, central management

  • Minimal data redundancy

    redundancy can be controlled (normalization)

  • Improved data consistency

    single store: no disagreements, update problems, less storage space

  • Improved data sharing

    data is shared, a corporate resource, not a necessity for an application

    external users can be allowed access multiple views of data, arbitrary views of data

  • Reduced program maintenance

    data structure can change without application data changing

  • Novel ad hoc data access 'without programming'

    SQL

Database development process

Conceptual Design

Construction of a model of the data used in the database - independent of all physical considerations.

Data Models ER Diagram (Entity Relation Diagram)

Big boxes: entities

Blue diamond: mandatory

White diamond: optional

Logical Design

Construction of a (relational) model of the data based on the conceptual design

Independent of a specific database and other physical considerations

Data modeling involves two distinct stages: conceptual design and logical design. The former stage aims to identify the highest-level relationships between different entities, while the latter provides a detailed description of data attributes without considering their physical implementation in the database. Conceptual design is more abstract and focuses on main entities and their relationships, whereas logical design specifies all attributes and primary keys within each entity.

Example:

Consider a library system that comprises authors, books, and publishers. A conceptual design would illustrate the relationships between these entities as follows:

Author writes --> Book published by --> Publisher.

However, this design does not include their attributes or keys. To address this issue, we need to create a logical design that includes the following details:

  1. Author (author_id*, name, email)

  2. Book (book_id*, title, genre, year)

  3. Publisher (publisher_id*, name, address)

  4. Writes (author_id*, book_id*)

  5. PublishedBy (book_id*, publisher_id*)

This revised version provides a clear and concise explanation of the two designs while maintaining coherence.

Physical Design

For a specific DBMS

Describes:

  • Basic relations(data types)

  • File organization

  • Indexes

Types help the DBMS store and use information efficiently

  • Can make assumptions in computation

  • Consistency is guaranteed

Minimize storage space

Need to consider

  • Can you store all possible values

  • Can the type you choose support the data manipulation required

Selection of types may improve data integrity

Data Dictionary

Physical Design Decisions

(1)How to Store “Look Up”

Achieve data integerity in the left side. (Predefine the code in the look-up table)

There’s a trade-off between speed and space(and possibly integrity of data)

(2)To De-Normalise or Not

Last updated