🔥 UNIT–1 : MOST IMPORTANT TOPICS (EXAM-ORIENTED)

🟥 TOP PRIORITY (VERY HIGH CHANCE – ALMOST GUARANTEED)

Database & DBMS

1. Database – Definition

A database is an organized collection of related data stored electronically in a computer system.
It is designed to store, retrieve, and manage data efficiently.

Example:
Student database containing Roll No, Name, Course, Marks.


2. DBMS – Definition

A Database Management System (DBMS) is a software system that allows users to define, create, store, retrieve, update, and control access to a database.

Examples: MySQL, Oracle, SQL Server, PostgreSQL.


3. Database System

A database system consists of:

  • Database

  • DBMS software

  • Application programs

  • Users


4. Characteristics of Database Approach

The database approach has the following important characteristics:


1️⃣ Self-Describing Nature of Database System

A database system stores not only the data but also a description of that data, known as metadata.

  • Metadata includes:

    • Data type

    • Size

    • Structure

    • Constraints

This information is stored in a system catalog (data dictionary).

👉 Hence, the database is self-describing.
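
The catalog idea can be seen in a small sketch. This assumes SQLite (through Python's built-in `sqlite3` module) as the DBMS; the `Student` table is illustrative, and `sqlite_master` and `PRAGMA table_info` are SQLite's own catalog interfaces (other DBMSs expose the catalog differently).

```python
import sqlite3

# In-memory SQLite database; the Student table is an illustrative assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student ("
    " Roll_No INTEGER PRIMARY KEY,"
    " Name TEXT NOT NULL,"
    " Course TEXT,"
    " Marks INTEGER)"
)

# sqlite_master is SQLite's catalog table: it stores the definition
# (metadata) of every table, index, and view next to the data itself.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, declared type, notnull flag, default value, pk position)
columns = conn.execute("PRAGMA table_info(Student)").fetchall()
column_names = [col[1] for col in columns]
column_types = [col[2] for col in columns]

print(catalog_rows)
print(list(zip(column_names, column_types)))
```

The structure of the table is read back from the database itself, without looking at the program that created it.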


2️⃣ Program–Data Independence

In DBMS, application programs are independent of data storage structure.

  • Changes in data structure do not require changes in application programs.

  • Achieved through:

    • Logical data independence

    • Physical data independence

👉 This makes the system flexible and easy to maintain.


3️⃣ Data Abstraction

DBMS hides complex internal details from users and provides different levels of abstraction.

Levels of Data Abstraction:

  1. Physical Level – How data is stored internally

  2. Logical Level – What data is stored and relationships

  3. View Level – User-specific view of data

👉 Data abstraction simplifies database usage.


4️⃣ Multiple Views of Data

A DBMS allows different users to have different views of the same database.

  • Each user sees only the data relevant to them.

  • Improves security and usability.

Example:

  • Student sees marks

  • Admin sees all student records
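
A minimal sketch of the student/admin example, assuming SQLite via Python's `sqlite3`. The view name `My_Marks`, the sample rows, and the marks values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT, Course TEXT, Marks INTEGER);
INSERT INTO Student VALUES (101, 'Rohan', 'BTech', 82);
INSERT INTO Student VALUES (102, 'Anjali', 'BSc', 91);

-- Hypothetical view for student 101: exposes only that student's marks.
CREATE VIEW My_Marks AS
    SELECT Name, Marks FROM Student WHERE Roll_No = 101;
""")

# The student queries the view; the admin queries the base table.
student_view = conn.execute("SELECT * FROM My_Marks").fetchall()
admin_view = conn.execute("SELECT * FROM Student ORDER BY Roll_No").fetchall()

print(student_view)
print(admin_view)
```

Both results come from the same stored data; only the view definition decides what each user sees.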


5️⃣ Data Sharing and Multi-User Access

A DBMS allows multiple users to access the database simultaneously.

  • Supports:

    • Concurrency control

    • Transaction management

  • Prevents data inconsistency.

👉 Enables efficient data sharing in organizations.


File System vs Database System (DBMS)

Definition (Intro – 1 mark)

A file system stores data in separate files managed by the operating system, whereas a Database Management System (DBMS) stores data in an integrated database and provides controlled access to it.


Difference between File System and DBMS

| Basis of Comparison | File System | Database System (DBMS) |
|---------------------|-------------|------------------------|
| Data Redundancy | High data redundancy due to duplicate files | Minimal redundancy due to centralized database |
| Data Consistency | Data inconsistency may occur | Data consistency is maintained |
| Data Sharing | Data sharing is difficult | Data sharing is easy among multiple users |
| Security | Less security, handled by OS | High security using authorization and access control |
| Data Integrity | Integrity constraints are difficult to enforce | Integrity constraints are easily enforced |
| Backup & Recovery | Manual and difficult | Automatic backup and recovery mechanisms |
| Data Independence | No data independence | Logical and physical data independence supported |
| Concurrency Control | Not supported | Supported using transaction management |
| Query Processing | No query language support | SQL support for efficient querying |
| Maintenance Cost | High maintenance cost | Lower maintenance cost in long term |

Why DBMS is Better than File System? (Conclusion – 2 marks)

A DBMS is better than a file system because it reduces redundancy, maintains consistency, ensures data security and integrity, supports multi-user access, provides backup and recovery, and offers data independence, making data management more efficient and reliable.


Advantages and Disadvantages of DBMS

Introduction (1 mark)

A Database Management System (DBMS) provides a systematic way to store, manage, and retrieve data efficiently. It offers several advantages over traditional file systems, but also has some limitations.


Advantages of DBMS

1️⃣ Reduced Data Redundancy

  • Data is stored in a centralized database.

  • Duplicate data is minimized.

👉 This saves storage space.


2️⃣ Improved Data Consistency

  • Since data is not duplicated, updates are reflected everywhere.

  • Prevents conflicting data values.


3️⃣ Data Sharing

  • Multiple users can access the same database simultaneously.

  • Supports multi-user environment.


4️⃣ Data Security

  • DBMS provides authorization and authentication.

  • Only authorized users can access data.


5️⃣ Backup and Recovery

  • Automatic backup and recovery mechanisms are available.

  • Protects data from system failure or crashes.


6️⃣ Data Independence

  • Changes in data structure do not affect application programs.

  • Supports:

    • Logical data independence

    • Physical data independence


Disadvantages of DBMS

1️⃣ High Cost

  • DBMS software is expensive.

  • Requires costly hardware and maintenance.


2️⃣ Complexity

  • DBMS is complex to design and manage.

  • Requires careful configuration.


3️⃣ Performance Overhead

  • For small applications, DBMS may be slower than file systems.

  • Extra processing for security and concurrency.


4️⃣ Skilled Manpower Required

  • Requires trained database administrators (DBA).

  • Users need technical knowledge.


Data Abstraction and Data Independence

Introduction (1 mark)

A DBMS hides the complex details of data storage and provides users with a simplified view of data. This is achieved through data abstraction and data independence.


1. Data Abstraction

Definition

Data abstraction is the process of hiding internal implementation details of data and showing only the relevant information to the user.


Levels of Data Abstraction

A DBMS provides three levels of abstraction:


1️⃣ Physical Level

  • Lowest level of abstraction.

  • Describes how data is stored on disk.

  • Includes file structures, indexes, and storage details.

👉 Used by database administrators.


2️⃣ Logical Level

  • Middle level of abstraction.

  • Describes what data is stored and the relationships among data.

  • Does not include storage details.

👉 Used by database designers.


3️⃣ View Level

  • Highest level of abstraction.

  • Shows only a part of the database to the user.

  • Different users can have different views.

👉 Improves security and simplicity.


2. Data Independence

Definition

Data independence is the ability to change the database schema at one level without affecting the schema at the next higher level.


Types of Data Independence


1️⃣ Logical Data Independence

  • Ability to change the logical schema without changing the view level.

  • Examples:

    • Adding a new field

    • Adding a new table

👉 Harder to achieve.


2️⃣ Physical Data Independence

  • Ability to change the physical schema without changing the logical schema.

  • Examples:

    • Changing file organization

    • Adding indexes

👉 Easier to achieve.
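
Both kinds of independence can be sketched with one unchanged query, assuming SQLite via Python's `sqlite3`. The index name `idx_student_age`, the `Email` column, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT, Age INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(101, "Rohan", 20), (102, "Anjali", 21), (103, "Meera", 19)],
)

# The "application program": a query written once and never edited below.
QUERY = "SELECT Name FROM Student WHERE Age > 19 ORDER BY Name"
baseline = conn.execute(QUERY).fetchall()

# Physical change: add an index on Age (affects the access path only).
conn.execute("CREATE INDEX idx_student_age ON Student(Age)")
after_index = conn.execute(QUERY).fetchall()

# Logical change: add a new field to the table.
conn.execute("ALTER TABLE Student ADD COLUMN Email TEXT")
after_new_column = conn.execute(QUERY).fetchall()

print(baseline, after_index, after_new_column)
```

The query text never changes and returns the same rows after each schema change, which is the practical meaning of data independence.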


Difference between Data Abstraction and Data Independence

| Data Abstraction | Data Independence |
|------------------|-------------------|
| Hides complexity of data | Allows schema changes without affecting programs |
| Concerned with levels | Concerned with schema changes |
| Improves usability | Improves flexibility |

Conclusion (1–2 marks)

Data abstraction simplifies database usage by hiding internal details, while data independence allows changes in database structure without affecting applications, making DBMS flexible and scalable.


🟧 HIGH PRIORITY (VERY LIKELY)

DBMS Architecture

Introduction (1 mark)

DBMS architecture describes how a database system is organized into layers (tiers) and how data is stored, accessed, and managed by users through those layers.


1️⃣ One-Tier (1-Tier) Architecture

Definition

In 1-tier architecture, the database, DBMS, and application all reside on the same system.

Diagram (Textual)

User
 |
Application + DBMS + Database

Explanation

  • Used for local applications

  • No network involved

Example

MS Access on a single computer


2️⃣ Two-Tier (2-Tier) Architecture

Definition

In 2-tier architecture, the client communicates directly with the database server.

Diagram

Client (Application)
        |
     DBMS Server
        |
     Database

Explanation

  • Client handles presentation and logic

  • Server handles database processing

Example

Client–server applications using MySQL


3️⃣ Three-Tier (3-Tier) Architecture

Definition

In 3-tier architecture, the application is divided into three layers: presentation, application, and database.

Diagram

Client (Presentation Layer)
           |
Application Server (Business Logic)
           |
Database Server (DBMS + Database)

Explanation

  • Client sends request to application server

  • Application server processes logic

  • Database server stores data


Advantages of Three-Tier Architecture

1️⃣ Improved Security

  • Database is not directly accessible to users

2️⃣ Scalability

  • Easy to add more users and servers

3️⃣ Better Performance

  • Load is distributed across layers

4️⃣ Easy Maintenance

  • Changes in one layer do not affect others

5️⃣ High Reliability

  • Failure in one layer does not crash entire system

Conclusion (1–2 marks)

Among all architectures, three-tier architecture is most widely used in modern DBMS applications due to its security, scalability, and maintainability.


Database Users and Database Administrator (DBA)

Introduction (1 mark)

A database system is accessed and managed by different types of users, and its overall control is handled by a Database Administrator (DBA).


1. Types of Database Users

1️⃣ Naïve Users

  • Use database through predefined applications.

  • Do not have knowledge of DBMS.

Example: Bank customers using ATM.


2️⃣ Application Programmers

  • Develop application programs to access database.

  • Use host programming languages such as Java or C++ together with SQL.

Example: Software developers.


3️⃣ Sophisticated Users

  • Directly interact with the database using SQL queries.

  • Have good knowledge of DBMS.

Example: Data analysts.


2. Database Administrator (DBA)

Definition

A Database Administrator (DBA) is a person responsible for the overall management, control, and maintenance of the database system.


Roles and Responsibilities of DBA

1️⃣ Schema Definition

  • Defines database structure using DDL.

  • Creates tables, views, and constraints.


2️⃣ Security Management

  • Controls user access using authorization.

  • Prevents unauthorized access.


3️⃣ Backup and Recovery

  • Takes regular database backups.

  • Restores data after system failure.


4️⃣ Performance Tuning

  • Optimizes queries and indexing.

  • Improves system efficiency.


5️⃣ Integrity Maintenance

  • Ensures data accuracy and consistency.

  • Enforces integrity constraints.


Conclusion (1–2 marks)

Database users interact with the DBMS in different ways, while the DBA plays a critical role in ensuring security, performance, and reliability of the database system.


🟨 IMPORTANT (MEDIUM–HIGH CHANCE)

7️⃣ Entity–Relationship (E–R) Model – Basics

⭐⭐⭐⭐ (10 Marks)

Introduction (1 mark)

The Entity–Relationship (E-R) Model is a conceptual data model used to represent the structure of a database using entities, attributes, and relationships.


1. Entity

An entity is a real-world object that has an independent existence and can be uniquely identified.

Example: Student, Employee, Course


2. Attribute

An attribute describes a property of an entity.

Example:
Student → Roll_No, Name, Age


3. Relationship

A relationship represents an association between two or more entities.

Example:
Student enrolls in Course


4. Strong Entity

A strong entity:

  • Has a primary key

  • Exists independently

Example:
Student (Roll_No is primary key)


5. Weak Entity

A weak entity:

  • Does not have a primary key of its own (it has only a partial key)

  • Depends on a strong entity for identification


Example of Weak Entity

Dependent depends on Employee

  • Employee (Emp_ID) → Strong entity

  • Dependent (Name, Age) → Weak entity

Dependent is identified using Emp_ID + Name


Difference between Strong and Weak Entity

| Strong Entity | Weak Entity |
|---------------|-------------|
| Has primary key | No primary key |
| Exists independently | Depends on strong entity |
| Represented by single rectangle | Represented by double rectangle |

Conclusion (1–2 marks)

The E-R model provides a clear and simple way to design databases by representing real-world data using entities, attributes, and relationships.


8️⃣ Data Modeling & Phases of Database Modeling

⭐⭐⭐ (Short Answer / 5–7 Marks)

1. Data Modeling

Data modeling is the process of creating a conceptual representation of data and its relationships before designing a database.


2. Phases of Database Modeling

1️⃣ Conceptual Design

  • High-level design using E-R diagrams

  • Independent of DBMS


2️⃣ Logical Design

  • Converts conceptual model into relational schema

  • Defines tables, keys, and relationships


3️⃣ Physical Design

  • Defines storage structure

  • Indexes, file organization, and memory allocation


3. Benefits of Data Modeling

  • Improves data understanding

  • Reduces redundancy

  • Ensures data consistency

  • Saves development time


Conclusion

Data modeling helps in designing a well-structured, efficient, and scalable database system.


🟩 LOW PRIORITY (ONLY IF TIME LEFT)

9️⃣ Database Design (Basics)

⭐⭐ (Short Answer)

Meaning of Database Design

Database design is the process of structuring data by defining tables, attributes, relationships, and constraints so that it can be stored and retrieved efficiently.


Goals of Database Design

  1. Minimize data redundancy

  2. Ensure data consistency and integrity

  3. Improve data security

  4. Efficient data retrieval

  5. Easy maintenance and scalability



🔟 Strengths and Weaknesses of E–R Model

⭐⭐

Strengths of E–R Model

  1. Simple and easy to understand

  2. Graphical representation of data

  3. Helps in database planning

  4. Improves communication between users and designers

  5. Independent of DBMS


Weaknesses of E–R Model

  1. Not suitable for very complex systems

  2. No standard notation universally followed

  3. Cannot represent procedural logic

  4. Large diagrams become confusing

  5. Needs conversion to relational model




🔥 UNIT–2: RELATIONAL MODEL

🟥 TOP PRIORITY

1️⃣ Relational Model – Introduction & Basic Concepts

Introduction (1 mark)

The Relational Model is a widely used logical database model in which data is stored in tables (relations) consisting of rows and columns.
It was proposed by E.F. Codd in 1970.


1. Relation

A relation is a table with rows and columns.

  • Rows → Represent individual records (tuples)

  • Columns → Represent attributes of data

Example: Student Table

| Roll_No | Name   | Age | Course |
|---------|--------|-----|--------|
| 101     | Rohan  | 20  | BTech  |
| 102     | Anjali | 21  | BSc    |

2. Tuple

  • A tuple is a row in a relation.

  • Represents a single record of the table.

Example: (101, Rohan, 20, BTech)


3. Attribute

  • An attribute is a column in a relation.

  • Represents a property of the entity.

Example: Roll_No, Name, Age, Course


4. Domain

  • A domain is a set of permissible values for an attribute.

Example: Age → {18, 19, 20, 21…}
Course → {BTech, BSc, BCom}


5. Relation Schema

  • A relation schema defines the name of the relation and its attributes.

  • Example: Student(Roll_No, Name, Age, Course)


6. Degree & Cardinality

| Term | Meaning | Example |
|------|---------|---------|
| Degree | Number of attributes in a relation | Student table → 4 |
| Cardinality | Number of tuples (rows) in a relation | Student table → 2 |
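
As a quick check of the two terms, the sketch below counts columns and rows of the Student table from the example above. It assumes SQLite via Python's `sqlite3`; `cursor.description` is the standard DB-API attribute that lists the result columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Course TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(101, "Rohan", 20, "BTech"), (102, "Anjali", 21, "BSc")],
)

cursor = conn.execute("SELECT * FROM Student")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)

print(degree, cardinality)
```

Inserting or deleting rows changes the cardinality, while the degree changes only when the schema is altered.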

7. Keys in Relational Model

1️⃣ Primary Key

  • Uniquely identifies each tuple in a relation

  • Cannot have NULL values

Example: Roll_No in Student table


2️⃣ Candidate Key

  • Minimal set of attributes that can uniquely identify a tuple (no attribute can be removed without losing uniqueness)

  • One candidate key is chosen as primary key

Example: Roll_No, Email (both can uniquely identify Student)


3️⃣ Foreign Key

  • Attribute in one table that refers to primary key of another table

  • Ensures referential integrity

Example:
Enrollment(Student_Roll_No → Student.Roll_No)


4️⃣ Super Key

  • One or more attributes that can uniquely identify a tuple

  • May include extra attributes

Example: (Roll_No), (Roll_No, Name)


Conclusion (1–2 marks)

The relational model provides a simple, structured, and mathematical approach to store and manage data.
Keys ensure uniqueness and relationships, making the database reliable and consistent.


2️⃣ Codd’s Rules (Rules for Fully Relational Systems)

Introduction (1 mark)

Codd’s Rules are a set of 13 rules (numbered 0 to 12, and therefore often called "Codd’s 12 rules") proposed by E.F. Codd to define a fully relational database system.
They ensure data integrity, consistency, and reliability in relational databases.


Important Codd’s Rules (8–10 Rules with Explanation)

1️⃣ Rule 0 – Foundation Rule

  • The system must qualify as a relational database and support relational capabilities fully.

2️⃣ Rule 1 – Information Rule

  • All data is represented as values in tables (relations).

  • Everything, including metadata, must be stored in tables.


3️⃣ Rule 2 – Guaranteed Access Rule

  • Every value in the database can be accessed by specifying table name, primary key, and column name.

  • Ensures no hidden data.


4️⃣ Rule 3 – Systematic Treatment of NULLs

  • The DBMS must support NULLs for missing or inapplicable data.

  • NULLs are handled consistently in all operations.


5️⃣ Rule 8 – Physical Data Independence

  • Changes in physical storage (e.g., indexing, file structure) do not affect application programs.

6️⃣ Rule 9 – Logical Data Independence

  • Changes in logical schema (e.g., adding attributes or tables) do not affect user views or applications.

7️⃣ Rule 10 – Integrity Independence

  • Integrity constraints (e.g., primary key, foreign key) should be stored in the catalog and not in application programs.

  • Allows constraints to be modified without affecting programs.


8️⃣ Rule 12 – Non‑Subversion Rule

  • Low-level access (like pointers or record-level operations) cannot bypass integrity rules.

  • Ensures all database rules are enforced.


9️⃣ Rule 6 – View Updating Rule

  • All views that are theoretically updatable must be updatable by the system.

  • Guarantees data consistency.


10️⃣ Rule 4 – Relational Catalog Rule (Dynamic Online Catalog)

  • The database must contain a catalog (data dictionary) that describes all the database objects in relational terms.

  • Users can query the catalog using the same relational language (SQL).


Conclusion (1–2 marks)

Codd’s rules define a fully relational DBMS.
They ensure data consistency, integrity, independence, and accessibility, forming the foundation for modern relational databases.


3️⃣ Advantages of the Relational Model

  1. Simplicity

    • Data is stored in tables (relations), making it easy to understand and use.
  2. Structural Independence

    • Changes in table structure do not affect applications if keys and relationships are maintained.
  3. Easy Data Manipulation

    • Supports powerful query languages like SQL for inserting, updating, deleting, and retrieving data.
  4. Data Integrity

    • Enforces primary key, foreign key, and other constraints to maintain accuracy and consistency.
  5. Flexibility

    • Easily accommodates new data and changing relationships without redesigning the database.
  6. Security

    • Access can be controlled at table or column level to prevent unauthorized access.
  7. Data Consistency

    • Minimizes redundancy and ensures all data is consistent across tables.
  8. Multi-user Support

    • Multiple users can simultaneously access and update the database safely using transaction management.

🟧 HIGH PRIORITY (VERY LIKELY)

4️⃣ Mapping E-R Model to Relational Model

Introduction (1 mark)

Mapping an E-R model to a relational model is the process of converting entities, attributes, and relationships into tables (relations) in a database.
This is an essential step in database design.


1. Mapping of Strong Entity

  • Each strong entity becomes a relation (table).

  • Attributes of the entity become columns.

  • Primary key of the entity becomes the primary key of the table.

Example:
Entity: Student(Roll_No, Name, Age) → Table: Student(Roll_No PK, Name, Age)


2. Mapping of Weak Entity

  • Weak entity becomes a relation.

  • Primary key of the strong entity (owner) is included as foreign key.

  • Combination of weak entity’s partial key + owner key becomes the primary key.

Example:
Dependent(Name, Age) → Table: Dependent(Emp_ID FK, Name, Age, PK = (Emp_ID, Name))
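
The weak-entity mapping can be sketched as follows, assuming SQLite via Python's `sqlite3`. The sample names are hypothetical, and `ON DELETE CASCADE` is an added design assumption (a dependent should not outlive its owner), not something the mapping rule itself requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Employee (Emp_ID INTEGER PRIMARY KEY, Name TEXT);

-- Weak entity: owner key (Emp_ID) + partial key (Name) form the primary key.
CREATE TABLE Dependent (
    Emp_ID INTEGER NOT NULL REFERENCES Employee(Emp_ID) ON DELETE CASCADE,
    Name   TEXT NOT NULL,
    Age    INTEGER,
    PRIMARY KEY (Emp_ID, Name)
);

INSERT INTO Employee VALUES (1, 'Asha'), (2, 'Vikram');
-- The same dependent name is allowed under different employees.
INSERT INTO Dependent VALUES (1, 'Riya', 8), (2, 'Riya', 5);
""")

dependents_before = conn.execute(
    "SELECT Emp_ID, Name FROM Dependent ORDER BY Emp_ID"
).fetchall()

# Removing the owner removes its dependents as well.
conn.execute("DELETE FROM Employee WHERE Emp_ID = 1")
dependents_after = conn.execute(
    "SELECT Emp_ID, Name FROM Dependent ORDER BY Emp_ID"
).fetchall()

print(dependents_before, dependents_after)
```

The name `Riya` alone does not identify a row; only the pair (Emp_ID, Name) does, which is why the owner's key is part of the primary key.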


3. Mapping of Relationships

a) 1:1 Relationship

  • Include primary key of one entity as a foreign key in the other entity.

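  • Prefer the side with total (mandatory) participation, which avoids NULL foreign keys; otherwise choose the side that accesses the relationship more frequently.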

Example:
Employee ↔ Passport (1:1)
Table: Employee(Emp_ID PK, Name, Passport_No FK)


b) 1:N Relationship

  • Include primary key of “one” side as foreign key in “many” side.

Example:
Department(Dept_ID PK, Name)
Employee(Emp_ID PK, Name, Dept_ID FK)


c) M:N Relationship

  • Create a separate relation for the relationship.

  • Include primary keys of both entities as foreign keys.

  • Combined keys become primary key of the relationship table.

Example:
Student ↔ Course (M:N)
Enrollment(Student_Roll_No FK, Course_ID FK, PK = (Student_Roll_No, Course_ID))
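
A minimal sketch of the M:N mapping, assuming SQLite via Python's `sqlite3`. The `Course` columns and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Course (Course_ID TEXT PRIMARY KEY, Course_Name TEXT);

-- Relationship table: both foreign keys together form the primary key.
CREATE TABLE Enrollment (
    Student_Roll_No INTEGER REFERENCES Student(Roll_No),
    Course_ID       TEXT    REFERENCES Course(Course_ID),
    PRIMARY KEY (Student_Roll_No, Course_ID)
);

INSERT INTO Student VALUES (101, 'Rohan'), (102, 'Anjali');
INSERT INTO Course VALUES ('C1', 'DBMS'), ('C2', 'Networks');
INSERT INTO Enrollment VALUES (101, 'C1'), (101, 'C2'), (102, 'C1');
""")

# The composite primary key rejects enrolling the same student twice.
try:
    conn.execute("INSERT INTO Enrollment VALUES (101, 'C1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

courses_of_101 = [row[0] for row in conn.execute(
    "SELECT Course_ID FROM Enrollment WHERE Student_Roll_No = 101 ORDER BY Course_ID")]
students_in_c1 = [row[0] for row in conn.execute(
    "SELECT Student_Roll_No FROM Enrollment WHERE Course_ID = 'C1' ORDER BY Student_Roll_No")]

print(duplicate_rejected, courses_of_101, students_in_c1)
```

One student appears with many courses and one course with many students, which neither base table could store on its own.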


4. Mapping of Attributes

a) Simple Attribute

  • Directly becomes a column in the table.

Example: Name → Name


b) Composite Attribute

  • Break into component attributes and add each as a column.

Example: Full_Name(First, Last) → First_Name, Last_Name


c) Multivalued Attribute

  • Create a separate table with foreign key referencing the main entity.

Example:
Phone_Numbers → Table: Student_Phone(Roll_No FK, Phone_No, PK=(Roll_No, Phone_No))


Conclusion (1–2 marks)

Mapping E-R model to relational model ensures that the conceptual design is correctly implemented in the relational database, maintaining keys, relationships, and constraints.


5️⃣ Data Integrity in Relational Model

Introduction (1 mark)

Data integrity refers to the accuracy, consistency, and correctness of data in a database over its entire lifecycle.
It ensures that the database remains reliable and error-free.


Types of Data Integrity

1️⃣ Domain Integrity

  • Ensures that values stored in a column are valid and within a specified domain.

  • Implemented using data types, constraints, and validation rules.

Example:

  • Age attribute in Student table must be between 18 and 30.

  • Course can only be {BTech, BSc, BCom}


2️⃣ Entity Integrity

  • Ensures that each row (tuple) in a table is uniquely identifiable.

  • Achieved by primary key constraint.

  • Primary key cannot be NULL.

Example:

  • Roll_No in Student table is a primary key → cannot be NULL and must be unique

3️⃣ Referential Integrity

  • Ensures that foreign key values in one table must match primary key values in another table.

  • Maintains consistency among related tables.

Example:

  • Employee.Department_ID in Employee table must exist in Department.Dept_ID

  • Prevents orphan records
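
The three integrity types can be tried out in a short sketch, assuming SQLite via Python's `sqlite3`. The helper `is_rejected` and the sample rows are hypothetical. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Student (
    Roll_No INTEGER PRIMARY KEY,                             -- entity integrity
    Name    TEXT NOT NULL,
    Age     INTEGER CHECK (Age BETWEEN 18 AND 30),           -- domain integrity
    Course  TEXT CHECK (Course IN ('BTech', 'BSc', 'BCom'))  -- domain integrity
);
CREATE TABLE Department (Dept_ID INTEGER PRIMARY KEY, Dept_Name TEXT);
CREATE TABLE Employee (
    Emp_ID        INTEGER PRIMARY KEY,
    Name          TEXT,
    Department_ID INTEGER REFERENCES Department(Dept_ID)     -- referential integrity
);
INSERT INTO Student VALUES (101, 'Rohan', 20, 'BTech');
INSERT INTO Department VALUES (10, 'Sales');
""")

def is_rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

# Age 45 is outside the allowed domain 18..30.
domain_violation = is_rejected("INSERT INTO Student VALUES (102, 'Anjali', 45, 'BSc')")
# Roll_No 101 already exists, so the primary key would not be unique.
entity_violation = is_rejected("INSERT INTO Student VALUES (101, 'Kabir', 21, 'BCom')")
# Department 99 does not exist, so this row would be an orphan.
referential_violation = is_rejected("INSERT INTO Employee VALUES (1, 'Asha', 99)")
# A row that satisfies every constraint is accepted.
valid_row_rejected = is_rejected("INSERT INTO Employee VALUES (1, 'Asha', 10)")

print(domain_violation, entity_violation, referential_violation, valid_row_rejected)
```

Each rejected statement leaves the tables unchanged, so the constraints are enforced by the DBMS rather than by application code.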


Conclusion (1 mark)

Data integrity constraints ensure that the database remains accurate, consistent, and reliable, which is crucial for relational database management systems.


6️⃣ Data Manipulation in Relational Model

Introduction (1 mark)

Data manipulation refers to the process of adding, deleting, modifying, or retrieving data in a relational database.
It is performed using Data Manipulation Language (DML), but here we focus on the concepts, not SQL syntax.


1. Insert Operation

  • Used to add new tuples (rows) to a relation (table).

  • Ensures that new data obeys integrity constraints.

Example:
Adding a new student record in Student table.


2. Delete Operation

  • Used to remove existing tuples from a relation.

  • Must ensure referential integrity is not violated.

Example:
Removing a student who has left the college.


3. Update Operation

  • Used to modify existing data in one or more tuples.

  • Must maintain data integrity and consistency.

Example:
Changing a student’s course from BSc to BTech.


4. Selection Operation (Basic Idea)

  • Refers to retrieving specific tuples from a table based on conditions.

  • Helps in querying only relevant data.

Example:
Select all students with Age > 20 from the Student table.
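
The four operations can be run end to end in a small sketch, assuming SQLite via Python's `sqlite3`. The sample rows (including the student who leaves) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Course TEXT)")

# Insert: add new tuples.
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(101, "Rohan", 20, "BTech"), (102, "Anjali", 21, "BSc"), (103, "Kabir", 22, "BCom")],
)

# Update: change a student's course from BSc to BTech.
conn.execute("UPDATE Student SET Course = 'BTech' WHERE Roll_No = 102")

# Delete: remove a student who has left the college.
conn.execute("DELETE FROM Student WHERE Roll_No = 103")

# Selection: retrieve only the tuples that satisfy a condition.
older_than_20 = conn.execute(
    "SELECT Roll_No, Name, Course FROM Student WHERE Age > 20"
).fetchall()
remaining = conn.execute("SELECT Roll_No FROM Student ORDER BY Roll_No").fetchall()

print(older_than_20, remaining)
```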


Conclusion (1 mark)

Data manipulation operations allow users to manage and maintain data efficiently in a relational database while ensuring integrity, consistency, and relevance of information.


🔥 UNIT–3 : SQL

🟥 TOP PRIORITY

1️⃣ SQL (Structured Query Language)

Introduction (1 mark)

SQL is a standard computer language used for managing and manipulating relational databases.


Definition (1 mark)

SQL is a high-level language that allows users to define, manipulate, control, and query data in a relational database.


Purpose of SQL (1 mark)

  1. To create and modify database structures

  2. To insert, update, delete, and retrieve data

  3. To control access and security of data

  4. To manage integrity and consistency of the database


Types of SQL (6–7 marks)

1️⃣ DDL – Data Definition Language

  • Used to define and modify database structures

  • Commands: CREATE, ALTER, DROP, TRUNCATE
    Example: CREATE TABLE Student(...)


2️⃣ DML – Data Manipulation Language

  • Used to manipulate data stored in tables

  • Commands: INSERT, UPDATE, DELETE, SELECT
    Example: INSERT INTO Student VALUES(...)


3️⃣ DCL – Data Control Language

  • Used to control access and permissions

  • Commands: GRANT, REVOKE
    Example: GRANT SELECT ON Student TO User1


4️⃣ TCL – Transaction Control Language (Optional Mention)

  • Used to manage transactions

  • Commands: COMMIT, ROLLBACK, SAVEPOINT
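
A short sketch of COMMIT and ROLLBACK, assuming SQLite via Python's `sqlite3`, where `conn.commit()` and `conn.rollback()` issue the corresponding commands. The table contents and the helper `marks` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT, Marks INTEGER)")
conn.execute("INSERT INTO Student VALUES (101, 'Rohan', 70)")
conn.commit()  # COMMIT: the inserted row is now permanent

def marks():
    """Read the current marks of student 101."""
    return conn.execute("SELECT Marks FROM Student WHERE Roll_No = 101").fetchall()[0][0]

# A mistaken update inside an open transaction...
conn.execute("UPDATE Student SET Marks = 0 WHERE Roll_No = 101")
marks_inside_transaction = marks()
conn.rollback()  # ROLLBACK: undo everything since the last commit
marks_after_rollback = marks()

# A correct update, made permanent.
conn.execute("UPDATE Student SET Marks = 75 WHERE Roll_No = 101")
conn.commit()
marks_after_commit = marks()

print(marks_inside_transaction, marks_after_rollback, marks_after_commit)
```

The rolled-back update is visible only inside its own transaction and leaves no trace afterwards.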


Conclusion (1 mark)

SQL provides a comprehensive set of commands for creating, manipulating, controlling, and querying databases, making it the core language for relational database management.


2️⃣ DDL Commands (Data Definition Language)

Introduction

DDL (Data Definition Language) is used to define and manage database structures, such as tables, schemas, and indexes.
It focuses on the structure of the database, not the data itself.


1️⃣ CREATE

  • Purpose: To create new database objects like tables, views, or databases.

  • Use: Defines the structure of tables and columns.

  • Example: Creating a table named Student.


2️⃣ DROP

  • Purpose: To delete existing database objects permanently.

  • Use: Removes the table or database along with its structure and data.

  • Example: Dropping the Student table.


3️⃣ ALTER

  • Purpose: To modify the structure of an existing table.

  • Use: Can add, delete, or modify columns and constraints.

  • Example: Adding a new column Course to the Student table.


4️⃣ TRUNCATE

  • Purpose: To delete all data from a table while keeping its structure intact.

  • Use: Quickly removes all rows without logging individual deletions.

  • Example: Removing all records from the Student table.
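
The DDL commands above can be sketched as follows, assuming SQLite via Python's `sqlite3`. SQLite has no TRUNCATE statement, so the sketch substitutes `DELETE` without a `WHERE` clause; in DBMSs such as MySQL, Oracle, or PostgreSQL the statement would be `TRUNCATE TABLE Student`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define a new table.
conn.execute("CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT)")

# ALTER: add a new column Course to the existing table.
conn.execute("ALTER TABLE Student ADD COLUMN Course TEXT")
columns_after_alter = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]

conn.execute("INSERT INTO Student VALUES (101, 'Rohan', 'BTech')")

# TRUNCATE substitute (SQLite only): remove all rows, keep the structure.
conn.execute("DELETE FROM Student")
rows_after_truncate = conn.execute("SELECT COUNT(*) FROM Student").fetchall()[0][0]
columns_after_truncate = [row[1] for row in conn.execute("PRAGMA table_info(Student)")]

# DROP: remove the table together with its structure.
conn.execute("DROP TABLE Student")
tables_after_drop = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(columns_after_alter, rows_after_truncate, tables_after_drop)
```

After the truncate step the table is empty but still has its columns; after DROP the table no longer exists in the catalog.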


3️⃣ DML Commands (Data Manipulation Language)

Introduction

DML (Data Manipulation Language) is used to manipulate data stored in database tables.
It allows users to insert, update, delete, and retrieve data, but does not affect the table structure.


1️⃣ SELECT

  • Purpose: To retrieve data from one or more tables.

  • Optional Clauses:

    • WHERE: Selects rows based on a condition.
      Example: Select students with Age > 20.

    • DISTINCT: Retrieves unique values only.
      Example: Select unique courses from Student table.

    • ORDER BY: Sorts the result ascending or descending.
      Example: Order students by Name in ascending order.
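
The three clauses can be tried on sample data, assuming SQLite via Python's `sqlite3`. The third student row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Course TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(101, "Rohan", 20, "BTech"), (102, "Anjali", 21, "BSc"), (103, "Kabir", 22, "BTech")],
)

# WHERE: only rows that satisfy the condition.
older_than_20 = conn.execute(
    "SELECT Name FROM Student WHERE Age > 20 ORDER BY Roll_No"
).fetchall()

# DISTINCT: each course once, even though BTech appears twice.
unique_courses = conn.execute(
    "SELECT DISTINCT Course FROM Student ORDER BY Course"
).fetchall()

# ORDER BY: sort by Name in ascending order.
by_name = conn.execute("SELECT Name FROM Student ORDER BY Name ASC").fetchall()

print(older_than_20, unique_courses, by_name)
```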


2️⃣ INSERT

  • Purpose: To add new rows (tuples) into a table.

  • Use: Adds new records while maintaining integrity constraints.

  • Example: Adding a new student to the Student table.


3️⃣ UPDATE

  • Purpose: To modify existing data in one or more rows.

  • Use: Updates values while maintaining data consistency.

  • Example: Changing the Course of a student from BSc to BTech.


4️⃣ DELETE

  • Purpose: To remove existing rows from a table.

  • Use: Deletes records based on conditions, maintaining referential integrity.

  • Example: Removing students who have left the college.


4️⃣ SQL Data Types

  1. INT (Integer)

    • Stores whole numbers without decimals.

    • Example: 10, 250, -5

  2. FLOAT (Floating Point)

    • Stores decimal numbers (approximate values).

    • Example: 3.14, 0.5, -2.75

  3. CHAR (Character)

    • Stores fixed-length text.

    • Example: CHAR(5) → “Hello”

  4. VARCHAR (Variable Character)

    • Stores variable-length text.

    • Example: VARCHAR(20) → “Database”

  5. DATE

    • Stores date values, usually written in YYYY-MM-DD format.

    • Example: 2025-12-20

  6. BOOLEAN

    • Stores TRUE or FALSE values.

    • Example: TRUE, FALSE


🟧 HIGH PRIORITY

5️⃣ Aggregate Functions in SQL

Introduction (1 mark)

Aggregate functions perform calculations on a set of values and return a single summary value.
They are commonly used in reporting, grouping, and analytics.


Common Aggregate Functions (5–6 marks)

| Function | Purpose | Example |
|----------|---------|---------|
| COUNT() | Counts the number of rows | COUNT(*) → Number of students |
| SUM() | Adds values of a column | SUM(Marks) → Total marks of all students |
| AVG() | Calculates average value | AVG(Marks) → Average marks |
| MIN() | Returns minimum value | MIN(Age) → Youngest student |
| MAX() | Returns maximum value | MAX(Age) → Oldest student |

Use with GROUP BY

  • GROUP BY is used to group rows with same values in one or more columns.

  • Aggregate functions can then summarize each group separately.

Example:

  • Total marks obtained by students in each course.
SELECT Course, SUM(Marks)
FROM Student
GROUP BY Course;

Use with HAVING

  • HAVING is used to filter groups based on aggregate values (like WHERE but for groups).

Example:

  • Courses with total marks > 200.
SELECT Course, SUM(Marks)
FROM Student
GROUP BY Course
HAVING SUM(Marks) > 200;
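
To see both queries produce results, here is a runnable sketch assuming SQLite via Python's `sqlite3`. The marks values are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT, Course TEXT, Marks INTEGER)")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?, ?)",
    [(101, "Rohan", "BTech", 120), (102, "Anjali", "BSc", 150), (103, "Kabir", "BTech", 95)],
)

# GROUP BY: one summary row per course.
totals = conn.execute(
    "SELECT Course, SUM(Marks) FROM Student GROUP BY Course ORDER BY Course"
).fetchall()

# HAVING: keep only the groups whose total exceeds 200.
high_totals = conn.execute(
    "SELECT Course, SUM(Marks) FROM Student GROUP BY Course HAVING SUM(Marks) > 200"
).fetchall()

# Aggregates without GROUP BY summarize the whole table.
overall = conn.execute("SELECT COUNT(*), MIN(Marks), MAX(Marks) FROM Student").fetchall()[0]

print(totals, high_totals, overall)
```

WHERE filters individual rows before grouping, while HAVING filters the groups after the aggregate has been computed.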

Conclusion (1 mark)

Aggregate functions allow summarizing and analyzing data efficiently, especially when combined with GROUP BY and HAVING clauses.


6️⃣ JOIN Expressions in SQL

Introduction (1 mark)

JOINs are used to combine rows from two or more tables based on a related column between them.
They help in retrieving related data from multiple tables efficiently.


1️⃣ INNER JOIN

  • Definition: Returns only the matching rows from both tables.

  • Example: Retrieve students with their course details where enrollment exists.

SELECT Student.Name, Course.Course_Name
FROM Student
INNER JOIN Enrollment
ON Student.Roll_No = Enrollment.Student_Roll_No
INNER JOIN Course
ON Enrollment.Course_ID = Course.Course_ID;

2️⃣ LEFT JOIN (or LEFT OUTER JOIN)

  • Definition: Returns all rows from the left table, and matching rows from the right table; left rows without a match get NULL in the right table's columns.

  • Example: List all students and their courses, including students who are not enrolled.

SELECT Student.Name, Course.Course_Name
FROM Student
LEFT JOIN Enrollment
ON Student.Roll_No = Enrollment.Student_Roll_No
LEFT JOIN Course
ON Enrollment.Course_ID = Course.Course_ID;

3️⃣ RIGHT JOIN (or RIGHT OUTER JOIN)

  • Definition: Returns all rows from the right table, and matching rows from the left table.

  • Example: List all courses and students enrolled, including courses with no students.

SELECT Student.Name, Course.Course_Name
FROM Student
INNER JOIN Enrollment
ON Student.Roll_No = Enrollment.Student_Roll_No
RIGHT JOIN Course
ON Enrollment.Course_ID = Course.Course_ID;

4️⃣ FULL JOIN (or FULL OUTER JOIN)

  • Definition: Returns all rows from both tables; rows are combined where a match exists, and unmatched rows from either side get NULL in the other table's columns.

  • Example: List all students and courses, including unmatched students or courses.

SELECT Student.Name, Course.Course_Name
FROM Student
FULL JOIN Enrollment
ON Student.Roll_No = Enrollment.Student_Roll_No
FULL JOIN Course
ON Enrollment.Course_ID = Course.Course_ID;

Conclusion (1 mark)

JOIN expressions allow combining related data from multiple tables.

  • INNER JOIN: Only matches

  • LEFT JOIN: All left + matches

  • RIGHT JOIN: All right + matches

  • FULL JOIN: All rows from both tables
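
The difference between INNER JOIN and LEFT JOIN can be checked with a small sketch using Python's built-in `sqlite3` module. The three-table schema (including the `Course_ID` columns) and the sample rows are assumptions for illustration. RIGHT JOIN and FULL JOIN are left out here because older SQLite builds do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: Enrollment links a student to a course.
conn.executescript(
    """
    CREATE TABLE Student (Roll_No INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Course (Course_ID INTEGER PRIMARY KEY, Course_Name TEXT);
    CREATE TABLE Enrollment (Student_Roll_No INTEGER, Course_ID INTEGER);
    INSERT INTO Student VALUES (1, 'Asha'), (2, 'Ravi');
    INSERT INTO Course VALUES (10, 'DBMS'), (20, 'Networks');
    INSERT INTO Enrollment VALUES (1, 10);
    """
)

query = """
    SELECT s.Name, c.Course_Name
    FROM Student s
    {join} Enrollment e ON s.Roll_No = e.Student_Roll_No
    {join} Course c ON e.Course_ID = c.Course_ID
    ORDER BY s.Roll_No
"""

# INNER JOIN keeps only students that have an enrollment row.
inner_rows = conn.execute(query.format(join="INNER JOIN")).fetchall()
# LEFT JOIN keeps every student; the course is NULL (None) when there is no match.
left_rows = conn.execute(query.format(join="LEFT JOIN")).fetchall()

print(inner_rows)
print(left_rows)
```

Ravi has no enrollment, so he appears only in the LEFT JOIN result, paired with `None`.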




🔥 UNIT–4: SCHEMA REFINEMENT & QUERY PROCESSING & OPTIMIZATION

🟥 TOP PRIORITY (VERY HIGH CHANCE)

1️⃣ Normalization

Introduction (1 mark)

Normalization is the process of organizing data in a database to reduce redundancy and avoid anomalies during insertion, deletion, or update.


Purpose of Normalization (1 mark)

  1. Remove redundancy – avoids duplicate data.

  2. Prevent anomalies – eliminates insert, update, and delete anomalies.

  3. Improve data consistency – ensures data is accurate and reliable.


Types of Normal Forms

1️⃣ First Normal Form (1NF)

  • Definition: Table is in 1NF if all columns have atomic (indivisible) values and there are no repeating groups.

  • Example:

    | Student_ID | Name  | Courses     |
    |------------|-------|-------------|
    | 101        | Rohan | BTech, Math |

    The Courses column holds two values in one cell → Not 1NF

Converted to 1NF:

    | Student_ID | Name  | Course |
    |------------|-------|--------|
    | 101        | Rohan | BTech  |
    | 101        | Rohan | Math   |
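
The conversion to 1NF can be sketched in a few lines of Python. The dictionary representation of a row and the helper name `to_1nf` are hypothetical, chosen only to illustrate the idea of splitting a multi-valued cell into one row per value.

```python
# One unnormalized row: the Courses cell holds a comma-separated list.
unnormalized = [
    {"Student_ID": 101, "Name": "Rohan", "Courses": "BTech, Math"},
]

def to_1nf(rows):
    """Emit one row per course so that every cell holds a single atomic value."""
    atomic_rows = []
    for row in rows:
        for course in row["Courses"].split(","):
            atomic_rows.append(
                {
                    "Student_ID": row["Student_ID"],
                    "Name": row["Name"],
                    "Course": course.strip(),
                }
            )
    return atomic_rows

normalized = to_1nf(unnormalized)
for row in normalized:
    print(row)
```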

2️⃣ Second Normal Form (2NF)

  • Definition: Table is in 2NF if it is in 1NF and no non-prime attribute depends on only a part of a composite candidate key (no partial dependency).

  • Example:
    | Student_ID | Course_ID | Course_Name | Grade |
    Partial dependency: Course_Name depends only on Course_ID → Not 2NF

Converted to 2NF:

  • Student_Course(Student_ID, Course_ID, Grade)

  • Course(Course_ID, Course_Name)


3️⃣ Third Normal Form (3NF)

  • Definition: Table is in 3NF if it is in 2NF and no transitive dependency exists.

  • Example:
    | Emp_ID | Emp_Name | Dept_ID | Dept_Name |
    Dept_Name depends on Dept_ID, which in turn depends on Emp_ID → Transitive dependency

Converted to 3NF:

  • Employee(Emp_ID, Emp_Name, Dept_ID)

  • Department(Dept_ID, Dept_Name)


4️⃣ Boyce-Codd Normal Form (BCNF)

  • Definition: Stronger version of 3NF.

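  • For every non-trivial functional dependency X → Y, the determinant X must be a superkey (it must contain a candidate key).

  • Removes anomalies that can remain in 3NF when a relation has overlapping candidate keys.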


Conclusion (1–2 marks)

Normalization ensures efficient database design, reduces redundancy, prevents anomalies, and improves data consistency.

  • 1NF → Atomic values

  • 2NF → Remove partial dependency

  • 3NF → Remove transitive dependency

  • BCNF → Stronger than 3NF


2️⃣ Functional Dependencies (Single-Valued)

Introduction (1 mark)

A functional dependency (FD) is a relationship between two sets of attributes in a relation where the value of one attribute (or set of attributes) uniquely determines the value of another attribute (or set of attributes).
It is a fundamental concept in normalization.


Notation (1 mark)

  • Represented as:

A → B

  • Meaning: Attribute B is functionally dependent on attribute A.

  • Example:

    • In a Student table:

      • Roll_No → Name

      • This means Roll_No uniquely determines the Name.
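
A functional dependency can be tested against sample data with a short sketch like the one below. The helper `fd_holds` and the sample rows are hypothetical. Note that data can only disprove a dependency: an FD is a rule about every legal state of the table, so passing on one sample does not prove it.

```python
def fd_holds(rows, lhs, rhs):
    """Return False if two rows agree on lhs but differ on rhs, else True."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return False
    return True

students = [
    {"Roll_No": 1, "Name": "Asha", "Course": "BTech"},
    {"Roll_No": 2, "Name": "Ravi", "Course": "BTech"},
    {"Roll_No": 3, "Name": "Asha", "Course": "BSc"},
]

roll_determines_name = fd_holds(students, ["Roll_No"], ["Name"])
name_determines_course = fd_holds(students, ["Name"], ["Course"])
print(roll_determines_name, name_determines_course)
```

Here Roll_No → Name is consistent with the data, while Name → Course is violated because two students named Asha are in different courses.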


Use in Normal Forms (5–6 marks)

1. First Normal Form (1NF)

  • Ensures atomic values, which reduces repeating groups.

  • 1NF itself is not defined using functional dependencies; FDs become relevant from 2NF onward, once multivalued attributes have been split into atomic values.

2. Second Normal Form (2NF)

  • Eliminates partial dependency.

  • A non-prime attribute should be fully dependent on the whole primary key, not part of it.

  • FD Example:

    • Composite Key (Student_ID, Course_ID) → Grade

    • Grade depends on the whole key, not just Student_ID.

3. Third Normal Form (3NF)

  • Eliminates transitive dependency.

  • FD Example:

    • Emp_ID → Dept_ID → Dept_Name

    • Dept_Name depends transitively on Emp_ID → move Dept_Name to Department table.

4. Boyce-Codd Normal Form (BCNF)

  • Ensures that every determinant is a candidate key.

  • FD Example:

    • If Course → Instructor, but Course is not a candidate key, BCNF requires decomposition.
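
The BCNF condition can be checked mechanically with attribute closure. The sketch below is a minimal illustration, assuming a hypothetical relation (Student_ID, Course, Instructor) with the two dependencies listed in the code; the function names are also assumptions.

```python
def closure(attrs, fds):
    """Attribute closure: all attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Non-trivial FDs whose left side is not a superkey of schema."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        is_superkey = set(schema) <= closure(lhs, fds)
        if not trivial and not is_superkey:
            violations.append((lhs, rhs))
    return violations

schema = ("Student_ID", "Course", "Instructor")
fds = [
    (("Student_ID", "Course"), ("Instructor",)),
    (("Course",), ("Instructor",)),
]

course_closure = closure(("Course",), fds)
violations = bcnf_violations(schema, fds)
print(course_closure)
print(violations)
```

The closure of Course does not contain Student_ID, so Course is not a superkey and Course → Instructor is reported as a BCNF violation.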

3️⃣ Decomposition

Introduction (1 mark)

Decomposition is the process of breaking a single relation (table) into two or more relations to eliminate redundancy and anomalies while maintaining data integrity.


Purpose

  • Achieve higher normal forms (1NF, 2NF, 3NF, BCNF)

  • Reduce redundancy and update, insert, delete anomalies

  • Ensure lossless join and dependency preservation


1️⃣ Lossless Decomposition (1–2 marks)

  • A decomposition is lossless if the original table can be reconstructed exactly by joining the decomposed tables.

  • Ensures no data is lost after decomposition.

Example:
Original Table: Student_Course(Student_ID, Name, Course_ID, Grade), where Student_ID → Name

  • Decompose into:

    1. Student(Student_ID, Name)

    2. Enrollment(Student_ID, Course_ID, Grade)

Join: Student ⋈ Enrollment → Original table
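
Whether a decomposition is lossless can be checked on sample data by projecting and re-joining. The sketch below assumes the original relation also carries a Name column (so that the Student projection is meaningful); the helper names and the rows are hypothetical. It also shows a lossy split for contrast.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result

def natural_join(left, right):
    """Join two relations on all attributes they have in common."""
    joined = []
    for l_row in left:
        for r_row in right:
            shared = set(l_row) & set(r_row)
            if all(l_row[a] == r_row[a] for a in shared):
                joined.append({**l_row, **r_row})
    return joined

def as_set(rows):
    """Order-independent form of a relation, used for comparison."""
    return {tuple(sorted(row.items())) for row in rows}

student_course = [
    {"Student_ID": 1, "Name": "Asha", "Course_ID": "C1", "Grade": "A"},
    {"Student_ID": 1, "Name": "Asha", "Course_ID": "C2", "Grade": "B"},
    {"Student_ID": 2, "Name": "Ravi", "Course_ID": "C1", "Grade": "B"},
]

# Lossless: the shared attribute Student_ID is a key of the Student projection.
student = project(student_course, ["Student_ID", "Name"])
enrollment = project(student_course, ["Student_ID", "Course_ID", "Grade"])
lossless = as_set(natural_join(student, enrollment)) == as_set(student_course)

# Lossy: Course_ID is not a key of either part, so the join invents rows.
by_course = project(enrollment, ["Student_ID", "Course_ID"])
by_grade = project(enrollment, ["Course_ID", "Grade"])
rejoined = natural_join(by_course, by_grade)
spurious = len(as_set(rejoined)) - len(as_set(enrollment))

print(lossless, spurious)
```

The second split produces two spurious tuples, which is exactly what "information is lost" means for a lossy decomposition: the join returns extra rows that were never in the original table.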


2️⃣ Dependency Preservation (1–2 marks)

  • A decomposition preserves functional dependencies if all original dependencies can be enforced without joining the tables.

  • Ensures data integrity rules remain intact in decomposed tables.


Example of Decomposition (2–3 marks)

Original Table:
| Emp_ID | Emp_Name | Dept_ID | Dept_Name |

  • Problem: Dept_Name depends on Dept_ID → transitive dependency (not 3NF)

Decomposition into 3NF:

  1. Employee(Emp_ID, Emp_Name, Dept_ID)

  2. Department(Dept_ID, Dept_Name)

  • Now, redundancy and anomalies are removed

  • Lossless: Joining Employee ⋈ Department reconstructs original table

  • Dependency preserved: Emp_ID → Dept_ID and Dept_ID → Dept_Name


Conclusion (1 mark)

Decomposition is essential for efficient, anomaly-free database design, ensuring lossless join and dependency preservation while achieving higher normal forms.




🔥 UNIT‑5: TRANSACTION MANAGEMENT, CONCURRENCY & SECURITY

1️⃣ Transaction Concept and ACID Properties

Introduction (1 mark)

A transaction is a logical unit of work in a database that performs one or more operations (like read, write, update) on the database.

  • A transaction is atomic, meaning it is either fully executed or not executed at all.

  • Transactions ensure data consistency, integrity, and reliability.


ACID Properties (8 marks)

1️⃣ Atomicity (A)

  • Definition: A transaction is atomic, meaning it is indivisible.

  • Either all operations succeed or none are applied.

  • Example:

    • Transferring $100 from Account A to Account B:

      1. Debit A by $100

      2. Credit B by $100

    • If any step fails, rollback occurs and no money is transferred.
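
The transfer example can be sketched with Python's built-in `sqlite3` module, where `with conn:` commits on success and rolls back if an exception is raised. The Account table, the CHECK constraint, and the `transfer` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the CHECK constraint forbids negative balances.
conn.execute(
    "CREATE TABLE Account (Name TEXT PRIMARY KEY, Balance INTEGER CHECK (Balance >= 0))"
)
conn.executemany("INSERT INTO Account VALUES (?, ?)", [("A", 150), ("B", 200)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Debit src and credit dst as one atomic unit; return True on commit."""
    try:
        with conn:  # commit if the block succeeds, roll back if it raises
            conn.execute(
                "UPDATE Account SET Balance = Balance - ? WHERE Name = ?", (amount, src)
            )
            cur = conn.execute(
                "UPDATE Account SET Balance = Balance + ? WHERE Name = ?", (amount, dst)
            )
            if cur.rowcount != 1:
                raise ValueError("unknown destination account")
        return True
    except (ValueError, sqlite3.IntegrityError):
        return False

ok_normal = transfer(conn, "A", "B", 100)      # both steps succeed
ok_bad_target = transfer(conn, "A", "Z", 10)   # credit fails, debit is rolled back
ok_overdraw = transfer(conn, "A", "B", 100)    # debit violates the CHECK constraint
balances = dict(conn.execute("SELECT Name, Balance FROM Account"))
print(ok_normal, ok_bad_target, ok_overdraw, balances)
```

In the second call the debit has already been applied when the credit fails, yet the final balance of A is unchanged, because the rollback undoes the partial work.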


2️⃣ Consistency (C)

  • Definition: Transaction must take the database from one valid state to another, maintaining all integrity constraints.

  • Example:

    • Inserting a student record with a valid Roll_No and Age.

    • Database rules like primary key uniqueness remain intact after transaction.


3️⃣ Isolation (I)

  • Definition: Transactions are executed independently.

  • Intermediate results of a transaction are not visible to other transactions until commit.

  • Example:

    • Two transactions update the same account simultaneously.

    • Isolation ensures final balance is correct, preventing dirty reads.


4️⃣ Durability (D)

  • Definition: Once a transaction is committed, its changes are permanent, even if there is a system crash.

  • Example:

    • After transferring money, even if the system fails, the updated balances remain in the database.

Conclusion (1 mark)

Transactions ensure reliable, consistent, and concurrent operations in a database.
The ACID properties guarantee atomicity, consistency, isolation, and durability, which are essential for a robust DBMS.


2️⃣ Concurrent Transactions & Issues

Introduction (1 mark)

Concurrency in a database occurs when multiple transactions are executed simultaneously.

  • It improves system efficiency and resource utilization.

  • However, it can cause anomalies or inconsistencies if not controlled properly.


Problems in Concurrent Execution (8 marks)

1️⃣ Lost Update

  • Definition: Occurs when two transactions simultaneously update the same data, and one update overwrites the other.

  • Example:

    • Transaction T1 reads Account A = 100 and adds 50 → 150

    • Transaction T2 reads Account A = 100 and subtracts 30 → 70

    • Without control, final balance may become 70 instead of 120, losing T1’s update.
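
The interleaving in this example can be replayed step by step in plain Python. This is only a simulation of the schedule, not a real DBMS; the variable names are hypothetical. The second half shows one possible fix, where a lock makes each read-modify-write indivisible.

```python
import threading

# Unsafe schedule: both transactions read before either one writes.
balance = {"A": 100}
t1_read = balance["A"]          # T1 reads 100
t2_read = balance["A"]          # T2 reads 100
balance["A"] = t1_read + 50     # T1 writes 150
balance["A"] = t2_read - 30     # T2 writes 70, overwriting T1's update
lost_update_result = balance["A"]

# Safe version: the lock forces each read-modify-write to run as one step.
account = {"A": 100}
lock = threading.Lock()

def apply_change(delta):
    with lock:
        current = account["A"]
        account["A"] = current + delta

threads = [threading.Thread(target=apply_change, args=(d,)) for d in (50, -30)]
for t in threads:
    t.start()
for t in threads:
    t.join()
safe_result = account["A"]

print(lost_update_result, safe_result)
```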


2️⃣ Temporary Inconsistency

  • Definition: Occurs when intermediate data produced by one transaction is read by another transaction, leading to wrong results.

  • Example:

    • T1 transfers $100: Debit A = 500 → 400, Credit B = 200 → 300

    • T2 reads A = 400 (after the debit) and B = 200 (before the credit), so it sees a total of 600 instead of 700
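
The arithmetic behind this anomaly can be replayed as a short simulation in Python. The dictionary and variable names are hypothetical, and the code only imitates the order of operations in the schedule.

```python
accounts = {"A": 500, "B": 200}
total_before = accounts["A"] + accounts["B"]      # 700

# T1, step 1: debit A.
accounts["A"] -= 100

# T2 runs here, between T1's two steps, and sums both balances.
total_seen_by_t2 = accounts["A"] + accounts["B"]  # 400 + 200

# T1, step 2: credit B.
accounts["B"] += 100
total_after = accounts["A"] + accounts["B"]       # 400 + 300

print(total_before, total_seen_by_t2, total_after)
```

The total is 700 before and after the transfer, but T2 observes 600 because it reads in the middle of T1.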


3️⃣ Uncommitted Dependency (Dirty Read)

  • Definition: Occurs when a transaction reads data written by another uncommitted transaction.

  • Example:

    • T1 updates Employee salary to 50,000 but not yet committed

    • T2 reads salary as 50,000

    • If T1 rolls back, T2 has inconsistent data (dirty read).


Conclusion (1 mark)

Concurrent execution improves performance, but can lead to lost updates, temporary inconsistency, and dirty reads.

  • To prevent these problems, DBMS uses concurrency control mechanisms like locking, timestamps, and isolation levels.

7️⃣ Deadlock in DBMS

Introduction (1 mark)

A deadlock occurs when two or more transactions are waiting indefinitely for resources held by each other, so none of them can proceed.


Necessary Conditions for Deadlock (4 marks)

  1. Mutual Exclusion

    • At least one resource must be held in a non-shareable mode.

    • Only one transaction can use it at a time.

  2. Hold and Wait

    • A transaction holds at least one resource and waits for additional resources held by other transactions.

  3. No Preemption

    • Resources cannot be forcibly taken from a transaction; they are released only voluntarily.

  4. Circular Wait

    • A set of transactions exists such that each transaction is waiting for a resource held by the next transaction in a cycle.

Detection & Prevention (3–4 marks)

Detection:

  • The DBMS maintains a wait-for graph, in which an edge Ti → Tj means that transaction Ti is waiting for a resource held by Tj.

  • If a cycle is detected, a deadlock exists.

  • Action: Abort one or more transactions to break the cycle.
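
Cycle detection on a wait-for graph can be sketched with a depth-first search. The graph representation (a dictionary mapping each transaction to the transactions it waits for) and the function name are assumptions; a real DBMS would also need a policy for choosing which transaction to abort.

```python
def find_deadlock(wait_for):
    """Return the transactions on one cycle of the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on current path, fully explored
    color = {t: WHITE for t in wait_for}
    path = []

    def visit(t):
        color[t] = GREY
        path.append(t)
        for nxt in wait_for.get(t, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a cycle has been found.
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[t] = BLACK
        return None

    for t in list(wait_for):
        if color[t] == WHITE:
            cycle = visit(t)
            if cycle:
                return cycle
    return None

deadlocked = find_deadlock({"T1": ["T2"], "T2": ["T3"], "T3": ["T1"], "T4": ["T1"]})
no_deadlock = find_deadlock({"T1": ["T2"], "T2": ["T3"]})
print(deadlocked, no_deadlock)
```

In the first graph T1, T2 and T3 wait on each other in a cycle, so aborting any one of them breaks the deadlock; T4 is merely blocked behind T1 and is not part of the cycle.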

Prevention:

  • Eliminate one or more necessary conditions:

    1. Mutual exclusion: Make resources sharable if possible

    2. Hold and wait: Require transactions to request all resources at once

    3. No preemption: Allow the DBMS to take resources away from a transaction, typically by rolling it back

    4. Circular wait: Impose an ordering of resource requests
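
Breaking circular wait by ordering resource requests can be sketched with Python threads. The account names, the `transfer` helper, and the choice of alphabetical order are assumptions; the point is only that every transaction acquires locks in the same global order, so a cycle of waits cannot form.

```python
import threading

locks = {"A": threading.Lock(), "B": threading.Lock()}
balances = {"A": 100, "B": 100}

def transfer(src, dst, amount):
    """Move amount from src to dst (src != dst), locking in a fixed global order."""
    first, second = sorted((src, dst))  # same order regardless of direction
    with locks[first], locks[second]:
        balances[src] -= amount
        balances[dst] += amount

def worker(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)

# Two workers transfer in opposite directions, which would risk deadlock
# if each one locked its own source account first.
threads = [
    threading.Thread(target=worker, args=("A", "B")),
    threading.Thread(target=worker, args=("B", "A")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balances)
```

Both workers lock A before B, so neither can hold one lock while waiting for a lock that the other acquired earlier in the order.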


Conclusion (1 mark)

Deadlocks block concurrent transactions, reducing system performance.
Detection and prevention techniques ensure smooth transaction execution in DBMS.


Transaction in DBMS

1️⃣ What is a Transaction?

A transaction is a logical unit of work that performs one or more database operations (like read, write, update) as a single unit.

  • Either the transaction completes fully (commit) or does not execute at all (abort/rollback).

  • Ensures consistency, integrity, and reliability of the database.


2️⃣ Properties of a Transaction (ACID)

  1. Atomicity: All operations of a transaction succeed or none succeed.

  2. Consistency: Transaction moves database from one valid state to another.

  3. Isolation: Transactions execute independently without affecting each other.

  4. Durability: Committed changes are permanent, even if system fails.


3️⃣ Types of Transactions

a) Based on Purpose

  1. Read-only Transaction

    • Only reads data and performs no updates.

    • Example: Viewing account balance.

  2. Update Transaction

    • Reads and modifies data.

    • Example: Transferring money between accounts.


b) Based on Execution

  1. Single-user Transaction

    • Only one user executes the transaction at a time.

    • Example: Updating personal profile in a local database.

  2. Multi-user Transaction

    • Multiple users concurrently access and modify data.

    • Requires concurrency control to avoid anomalies.


c) Based on Duration

  1. Short Transactions

    • Execute quickly, often read/write a few records.

    • Example: Updating a single student's marks.

  2. Long Transactions

    • Take more time and may involve multiple operations.

    • Example: Processing end-of-month payroll for all employees.


4️⃣ Transaction States (Lifecycle)

| State               | Description                                                |
|---------------------|------------------------------------------------------------|
| Active              | Transaction is executing.                                  |
| Partially Committed | Final operation executed, but not yet committed.           |
| Committed           | All operations successfully completed; changes permanent.  |
| Failed              | Some operation failed; transaction cannot commit.          |
| Aborted/Rolled Back | Transaction rolled back to initial state; no changes saved.|
| Terminated          | Transaction ends after commit or rollback.                 |
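
The lifecycle can be sketched as a small state machine. The enum, the transition table, and the `advance` helper are hypothetical; the allowed transitions follow the usual textbook state diagram rather than any particular DBMS.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "Active"
    PARTIALLY_COMMITTED = "Partially Committed"
    COMMITTED = "Committed"
    FAILED = "Failed"
    ABORTED = "Aborted"
    TERMINATED = "Terminated"

# Which states may follow each state.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.COMMITTED: {State.TERMINATED},
    State.FAILED: {State.ABORTED},
    State.ABORTED: {State.TERMINATED},
    State.TERMINATED: set(),
}

def advance(current, target):
    """Move to target if the lifecycle allows it, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target

# A successful transaction.
state = State.ACTIVE
for step in (State.PARTIALLY_COMMITTED, State.COMMITTED, State.TERMINATED):
    state = advance(state, step)
print(state.value)

# A committed transaction can no longer be aborted.
try:
    advance(State.COMMITTED, State.ABORTED)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```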

5️⃣ Types of Failures

  1. System Crash: Hardware or software failure during transaction.

  2. Transaction Failure: Logical error or constraint violation.

  3. Media Failure: Disk or storage corruption.

  • Recovery mechanisms like logs, checkpoints, and rollbacks restore consistency.

6️⃣ Importance of Transactions

  • Ensures data consistency and integrity

  • Prevents lost updates, dirty reads, and temporary inconsistencies

  • Supports concurrent multi-user environments

  • Guarantees durability of committed changes


Concurrency in DBMS

1️⃣ What is Concurrency?

Concurrency occurs when multiple transactions are executed simultaneously in a database system.

  • Improves system throughput and resource utilization.

  • Without proper control, concurrency can lead to data anomalies and inconsistencies.


2️⃣ Types of Concurrency

  1. Read-Only Concurrency

    • Transactions only read data, no updates.

    • Example: Multiple users viewing the same report.

  2. Update Concurrency

    • Transactions read and write data simultaneously.

    • Requires concurrency control to avoid conflicts.

    • Example: Two users updating the same account balance.


3️⃣ Problems in Concurrent Execution

a) Lost Update

  • Definition: Two transactions simultaneously update the same data, and one update overwrites the other.

  • Example:

    • T1 reads Account A = 100 and adds 50 → 150

    • T2 reads Account A = 100 and subtracts 30 → 70

    • Final balance = 70 (T1 update lost)


b) Temporary Inconsistency

  • Definition: A transaction reads intermediate data of another transaction that has not yet committed, causing inconsistency.

  • Example:

    • T1 transfers $100: Debit A = 500 → 400, Credit B = 200 → 300

    • T2 reads A = 400 (after the debit) and B = 200 (before the credit), so it sees a total of 600 instead of 700


c) Uncommitted Dependency (Dirty Read)

  • Definition: A transaction reads data written by another uncommitted transaction.

  • Example:

    • T1 updates Employee salary to 50,000 (not committed)

    • T2 reads salary = 50,000

    • T1 rolls back → T2 read invalid data (dirty read)


4️⃣ Conclusion

  • Concurrency improves database performance, but may cause lost updates, temporary inconsistencies, and dirty reads.

  • To prevent these problems, DBMS uses locks, 2PL, timestamps, and isolation levels.