🔥 UNIT–1 : MOST IMPORTANT TOPICS (EXAM-ORIENTED)

🟥 TOP PRIORITY (VERY HIGH CHANCE – ALMOST GUARANTEED)

Database & DBMS

1. Database – Definition

A database is an organized collection of related data stored electronically in a computer system.
It is designed to store, retrieve, and manage data efficiently.

Example:
Student database containing Roll No, Name, Course, Marks.

2. DBMS – Definition

A Database Management System (DBMS) is a software system that allows users to define, create, store, retrieve, update, and control access to a database.

Examples: MySQL, Oracle, SQL Server, PostgreSQL.

3. Database System

A database system consists of:

Database
DBMS software
Application programs
Users

4. Characteristics of Database Approach

The database approach has the following important characteristics:

1️⃣ Self-Describing Nature of Database System

A DBMS contains not only the data but also the description of the data, known as metadata.

Metadata includes:
- Data type
- Size
- Structure
- Constraints

This information is stored in a system catalog (data dictionary).

👉 Hence, the database is self-describing.

2️⃣ Program–Data Independence

In DBMS, application programs are independent of data storage structure.

Changes in data structure do not require changes in application programs.
Achieved through:
- Logical data independence
- Physical data independence

👉 This makes the system flexible and easy to maintain.

3️⃣ Data Abstraction

DBMS hides complex internal details from users and provides different levels of abstraction.

Levels of Data Abstraction:

Physical Level – How data is stored internally
Logical Level – What data is stored and relationships
View Level – User-specific view of data

👉 Data abstraction simplifies database usage.

4️⃣ Multiple Views of Data

A DBMS allows different users to have different views of the same database.

Each user sees only the data relevant to them.
Improves security and usability.

Example:

Student sees marks
Admin sees all student records

5️⃣ Data Sharing and Multi-User Access

A DBMS allows multiple users to access the database simultaneously.

Supports:
- Concurrency control
- Transaction management
Prevents data inconsistency.

👉 Enables efficient data sharing in organizations.

File System vs Database System (DBMS)

Definition (Intro – 1 mark)

A file system stores data in separate files managed by the operating system, whereas a Database Management System (DBMS) stores data in an integrated database and provides controlled access to it.

Difference between File System and DBMS

Basis of Comparison	File System	Database System (DBMS)
Data Redundancy	High data redundancy due to duplicate files	Minimal redundancy due to centralized database
Data Consistency	Data inconsistency may occur	Data consistency is maintained
Data Sharing	Data sharing is difficult	Data sharing is easy among multiple users
Security	Less security, handled by OS	High security using authorization and access control
Data Integrity	Integrity constraints are difficult to enforce	Integrity constraints are easily enforced
Backup & Recovery	Manual and difficult	Automatic backup and recovery mechanisms
Data Independence	No data independence	Logical and physical data independence supported
Concurrency Control	Not supported	Supported using transaction management
Query Processing	No query language support	SQL support for efficient querying
Maintenance Cost	High maintenance cost	Lower maintenance cost in long term

Why DBMS is Better than File System? (Conclusion – 2 marks)

A DBMS is better than a file system because it reduces redundancy, maintains consistency, ensures data security and integrity, supports multi-user access, provides backup and recovery, and offers data independence, making data management more efficient and reliable.

Advantages and Disadvantages of DBMS

Introduction (1 mark)

A Database Management System (DBMS) provides a systematic way to store, manage, and retrieve data efficiently. It offers several advantages over traditional file systems, but also has some limitations.

Advantages of DBMS

1️⃣ Reduced Data Redundancy

Data is stored in a centralized database.
Duplicate data is minimized.

👉 This saves storage space.

2️⃣ Improved Data Consistency

Since data is not duplicated, updates are reflected everywhere.
Prevents conflicting data values.

3️⃣ Data Sharing

Multiple users can access the same database simultaneously.
Supports multi-user environment.

4️⃣ Data Security

DBMS provides authorization and authentication.
Only authorized users can access data.

5️⃣ Backup and Recovery

Automatic backup and recovery mechanisms are available.
Protects data from system failure or crashes.

6️⃣ Data Independence

Changes in data structure do not affect application programs.
Supports:
- Logical data independence
- Physical data independence

Disadvantages of DBMS

1️⃣ High Cost

DBMS software is expensive.
Requires costly hardware and maintenance.

2️⃣ Complexity

DBMS is complex to design and manage.
Requires careful configuration.

3️⃣ Performance Overhead

For small applications, DBMS may be slower than file systems.
Extra processing for security and concurrency.

4️⃣ Skilled Manpower Required

Requires trained database administrators (DBA).
Users need technical knowledge.

Data Abstraction and Data Independence

Introduction (1 mark)

A DBMS hides the complex details of data storage and provides users with a simplified view of data. This is achieved through data abstraction and data independence.

1. Data Abstraction

Definition

Data abstraction is the process of hiding internal implementation details of data and showing only the relevant information to the user.

Levels of Data Abstraction

A DBMS provides three levels of abstraction:

1️⃣ Physical Level

Lowest level of abstraction.
Describes how data is stored on disk.
Includes file structures, indexes, and storage details.

👉 Used by database administrators.

2️⃣ Logical Level

Middle level of abstraction.
Describes what data is stored and the relationships among data.
Does not include storage details.

👉 Used by database designers.

3️⃣ View Level

Highest level of abstraction.
Shows only a part of the database to the user.
Different users can have different views.

👉 Improves security and simplicity.

2. Data Independence

Definition

Data independence is the ability to change the database schema at one level without affecting the schema at the next higher level.

Types of Data Independence

1️⃣ Logical Data Independence

Ability to change the logical schema without changing the view level.
Examples:
- Adding a new field
- Adding a new table

👉 Harder to achieve.

2️⃣ Physical Data Independence

Ability to change the physical schema without changing the logical schema.
Examples:
- Changing file organization
- Adding indexes

👉 Easier to achieve.

Difference between Data Abstraction and Data Independence

Data Abstraction	Data Independence
Hides complexity of data	Allows schema changes without affecting programs
Concerned with levels	Concerned with schema changes
Improves usability	Improves flexibility

Conclusion (1–2 marks)

Data abstraction simplifies database usage by hiding internal details, while data independence allows changes in database structure without affecting applications, making DBMS flexible and scalable.

🟧 HIGH PRIORITY (VERY LIKELY)

DBMS Architecture

Introduction (1 mark)

DBMS architecture defines the logical design of a database system that determines how data is stored, accessed, and managed by users.

1️⃣ One-Tier (1-Tier) Architecture

Definition

In 1-tier architecture, the database, DBMS, and application all reside on the same system.

Diagram (Textual)

User
 |
Application + DBMS + Database

Explanation

Used for local applications
No network involved

Example

MS Access on a single computer

2️⃣ Two-Tier (2-Tier) Architecture

Definition

In 2-tier architecture, the client communicates directly with the database server.

Diagram

Client (Application)
        |
     DBMS Server
        |
     Database

Explanation

Client handles presentation and logic
Server handles database processing

Example

Client–server applications using MySQL

3️⃣ Three-Tier (3-Tier) Architecture

Definition

In 3-tier architecture, the application is divided into three layers: presentation, application, and database.

Diagram

Client (Presentation Layer)
           |
Application Server (Business Logic)
           |
Database Server (DBMS + Database)

Explanation

Client sends request to application server
Application server processes logic
Database server stores data

Advantages of Three-Tier Architecture

1️⃣ Improved Security

Database is not directly accessible to users

2️⃣ Scalability

Easy to add more users and servers

3️⃣ Better Performance

Load is distributed across layers

4️⃣ Easy Maintenance

Changes in one layer do not affect others

5️⃣ High Reliability

Failure in one layer does not crash entire system

Conclusion (1–2 marks)

Among all architectures, three-tier architecture is most widely used in modern DBMS applications due to its security, scalability, and maintainability.

Database Users and Database Administrator (DBA)

Introduction (1 mark)

A database system is accessed and managed by different types of users, and its overall control is handled by a Database Administrator (DBA).

1. Types of Database Users

1️⃣ Naïve Users

Use database through predefined applications.
Do not have knowledge of DBMS.

Example: Bank customers using ATM.

2️⃣ Application Programmers

Develop application programs to access database.
Use programming languages like Java, C++, and SQL.

Example: Software developers.

3️⃣ Sophisticated Users

Directly interact with the database using SQL queries.
Have good knowledge of DBMS.

Example: Data analysts.

2. Database Administrator (DBA)

Definition

A Database Administrator (DBA) is a person responsible for the overall management, control, and maintenance of the database system.

Roles and Responsibilities of DBA

1️⃣ Schema Definition

Defines database structure using DDL.
Creates tables, views, and constraints.

2️⃣ Security Management

Controls user access using authorization.
Prevents unauthorized access.

3️⃣ Backup and Recovery

Takes regular database backups.
Restores data after system failure.

4️⃣ Performance Tuning

Optimizes queries and indexing.
Improves system efficiency.

5️⃣ Integrity Maintenance

Ensures data accuracy and consistency.
Enforces integrity constraints.

Conclusion (1–2 marks)

Database users interact with the DBMS in different ways, while the DBA plays a critical role in ensuring security, performance, and reliability of the database system.

🟨 IMPORTANT (MEDIUM–HIGH CHANCE)

7️⃣ Entity–Relationship (E–R) Model – Basics

⭐⭐⭐⭐ (10 Marks)

Introduction (1 mark)

The Entity–Relationship (E-R) Model is a conceptual data model used to represent the structure of a database using entities, attributes, and relationships.

1. Entity

An entity is a real-world object that has an independent existence and can be uniquely identified.

Example: Student, Employee, Course

2. Attribute

An attribute describes a property of an entity.

Example:
Student → Roll_No, Name, Age

3. Relationship

A relationship represents an association between two or more entities.

Example:
Student enrolls in Course

4. Strong Entity

A strong entity:

Has a primary key
Exists independently

Example:
Student (Roll_No is primary key)

5. Weak Entity

A weak entity:

Does not have a primary key
Depends on a strong entity for identification

Example of Weak Entity

Dependent depends on Employee

Employee (Emp_ID) → Strong entity
Dependent (Name, Age) → Weak entity

Dependent is identified using Emp_ID + Name

Difference between Strong and Weak Entity

Strong Entity	Weak Entity
Has primary key	No primary key
Exists independently	Depends on strong entity
Represented by single rectangle	Represented by double rectangle

Conclusion (1–2 marks)

The E-R model provides a clear and simple way to design databases by representing real-world data using entities, attributes, and relationships.

8️⃣ Data Modeling & Phases of Database Modeling

⭐⭐⭐ (Short Answer / 5–7 Marks)

1. Data Modeling

Data modeling is the process of creating a conceptual representation of data and its relationships before designing a database.

2. Phases of Database Modeling

1️⃣ Conceptual Design

High-level design using E-R diagrams
Independent of DBMS

2️⃣ Logical Design

Converts conceptual model into relational schema
Defines tables, keys, and relationships

3️⃣ Physical Design

Defines storage structure
Indexes, file organization, and memory allocation

3. Benefits of Data Modeling

Improves data understanding
Reduces redundancy
Ensures data consistency
Saves development time

Conclusion

Data modeling helps in designing a well-structured, efficient, and scalable database system.

🟩 LOW PRIORITY (ONLY IF TIME LEFT)

9️⃣ Database Design (Basics)

⭐⭐ (Short Answer)

Meaning of Database Design

Database design is the process of organizing data structure by defining tables, attributes, relationships, and constraints to efficiently store and retrieve data.

Goals of Database Design

Minimize data redundancy
Ensure data consistency and integrity
Improve data security
Efficient data retrieval
Easy maintenance and scalability

🔟 Strengths and Weaknesses of E–R Model

⭐⭐

Strengths of E–R Model

Simple and easy to understand
Graphical representation of data
Helps in database planning
Improves communication between users and designers
Independent of DBMS

Weaknesses of E–R Model

Not suitable for very complex systems
No standard notation universally followed
Cannot represent procedural logic
Large diagrams become confusing
Needs conversion to relational model

🔥 UNIT–2: RELATIONAL MODEL

🟥 TOP PRIORITY

1️⃣ Relational Model – Introduction & Basic Concepts

Introduction (1 mark)

The Relational Model is a widely used logical database model in which data is stored in tables (relations) consisting of rows and columns.
It was proposed by E.F. Codd in 1970.

1. Relation

A relation is a table with rows and columns.

Rows → Represent individual records (tuples)
Columns → Represent attributes of data

Example: Student Table

Roll_No	Name	Age	Course
101	Rohan	20	BTech
102	Anjali	21	BSc

2. Tuple

A tuple is a row in a relation.
Represents a single record of the table.

Example: (101, Rohan, 20, BTech)

3. Attribute

An attribute is a column in a relation.
Represents a property of the entity.

Example: Roll_No, Name, Age, Course

4. Domain

A domain is a set of permissible values for an attribute.

Example: Age → {18, 19, 20, 21…}
Course → {BTech, BSc, BCom}

5. Relation Schema

A relation schema defines the name of the relation and its attributes.
Example: Student(Roll_No, Name, Age, Course)

6. Degree & Cardinality

Term	Meaning	Example
Degree	Number of attributes in a relation	Student table → 4
Cardinality	Number of tuples (rows) in a relation	Student table → 2

7. Keys in Relational Model

1️⃣ Primary Key

Uniquely identifies each tuple in a relation
Cannot have NULL values

Example: Roll_No in Student table

2️⃣ Candidate Key

Set of attributes that can uniquely identify a tuple
One candidate key is chosen as primary key

Example: Roll_No, Email (both can uniquely identify Student)

3️⃣ Foreign Key

Attribute in one table that refers to primary key of another table
Ensures referential integrity

Example:
Enrollment(Student_Roll_No → Student.Roll_No)

4️⃣ Super Key

One or more attributes that can uniquely identify a tuple
May include extra attributes

Example: (Roll_No), (Roll_No, Name)

Conclusion (1–2 marks)

The relational model provides a simple, structured, and mathematical approach to store and manage data.
Keys ensure uniqueness and relationships, making the database reliable and consistent.

2️⃣ Codd’s Rules (Rules for Fully Relational Systems)

Introduction (1 mark)

Codd’s Rules are a set of 13 rules proposed by E.F. Codd to define a fully relational database system.
They ensure data integrity, consistency, and reliability in relational databases.

Important Codd’s Rules (8–10 Rules with Explanation)

1️⃣ Rule 0 – Foundation Rule

The system must qualify as a relational database and support relational capabilities fully.

2️⃣ Information Rule

All data is represented as values in tables (relations).
Everything, including metadata, must be stored in tables.

3️⃣ Guaranteed Access Rule

Every value in the database can be accessed by specifying table name, primary key, and column name.
Ensures no hidden data.

4️⃣ Systematic Treatment of NULLs

The DBMS must support NULLs for missing or inapplicable data.
NULLs are handled consistently in all operations.

5️⃣ Physical Data Independence

Changes in physical storage (e.g., indexing, file structure) do not affect application programs.

6️⃣ Logical Data Independence

Changes in logical schema (e.g., adding attributes or tables) do not affect user views or applications.

7️⃣ Integrity Independence

Integrity constraints (e.g., primary key, foreign key) should be stored in the catalog and not in application programs.
Allows constraints to be modified without affecting programs.

8️⃣ Non‑Subversion Rule

Low-level access (like pointers or record-level operations) cannot bypass integrity rules.
Ensures all database rules are enforced.

9️⃣ View Updating Rule

All views that are theoretically updatable must be updatable by the system.
Guarantees data consistency.

10️⃣ Relational Catalog Rule

The database must contain a catalog (data dictionary) that describes all the database objects in relational terms.
Users can query the catalog using the same relational language (SQL).

Conclusion (1–2 marks)

Codd’s rules define a fully relational DBMS.
They ensure data consistency, integrity, independence, and accessibility, forming the foundation for modern relational databases.

3️⃣ Advantages of the Relational Model

Simplicity
- Data is stored in tables (relations), making it easy to understand and use.
Structural Independence
- Changes in table structure do not affect applications if keys and relationships are maintained.
Easy Data Manipulation
- Supports powerful query languages like SQL for inserting, updating, deleting, and retrieving data.
Data Integrity
- Enforces primary key, foreign key, and other constraints to maintain accuracy and consistency.
Flexibility
- Easily accommodates new data and changing relationships without redesigning the database.
Security
- Access can be controlled at table or column level to prevent unauthorized access.
Data Consistency
- Minimizes redundancy and ensures all data is consistent across tables.
Multi-user Support
- Multiple users can simultaneously access and update the database safely using transaction management.

🟧 HIGH PRIORITY (VERY LIKELY)

4️⃣ Mapping E-R Model to Relational Model

Introduction (1 mark)

Mapping an E-R model to a relational model is the process of converting entities, attributes, and relationships into tables (relations) in a database.
This is an essential step in database design.

1. Mapping of Strong Entity

Each strong entity becomes a relation (table).
Attributes of the entity become columns.
Primary key of the entity becomes the primary key of the table.

Example:
Entity: Student(Roll_No, Name, Age) → Table: Student(Roll_No PK, Name, Age)

2. Mapping of Weak Entity

Weak entity becomes a relation.
Primary key of the strong entity (owner) is included as foreign key.
Combination of weak entity’s partial key + owner key becomes the primary key.

Example:
Dependent(Name, Age) → Table: Dependent(Emp_ID FK, Name, Age, PK = (Emp_ID, Name))

3. Mapping of Relationships

a) 1:1 Relationship

Include primary key of one entity as a foreign key in the other entity.
Choose the side that frequently accesses the relationship.

Example:
Employee ↔ Passport (1:1)
Table: Employee(Emp_ID PK, Name, Passport_No FK)

b) 1:N Relationship

Include primary key of “one” side as foreign key in “many” side.

Example:
Department(Dept_ID PK, Name)
Employee(Emp_ID PK, Name, Dept_ID FK)

c) M:N Relationship

Create a separate relation for the relationship.
Include primary keys of both entities as foreign keys.
Combined keys become primary key of the relationship table.

Example:
Student ↔ Course (M:N)
Enrollment(Student_Roll_No FK, Course_ID FK, PK = (Student_Roll_No, Course_ID))

4. Mapping of Attributes

a) Simple Attribute

Directly becomes a column in the table.

Example: Name → Name

b) Composite Attribute

Break into component attributes and add each as a column.

Example: Full_Name(First, Last) → First_Name, Last_Name

c) Multivalued Attribute

Create a separate table with foreign key referencing the main entity.

Example:
Phone_Numbers → Table: Student_Phone(Roll_No FK, Phone_No, PK=(Roll_No, Phone_No))

Conclusion (1–2 marks)

Mapping E-R model to relational model ensures that the conceptual design is correctly implemented in the relational database, maintaining keys, relationships, and constraints.

5️⃣ Data Integrity in Relational Model

Introduction (1 mark)

Data integrity refers to the accuracy, consistency, and correctness of data in a database over its entire lifecycle.
It ensures that the database remains reliable and error-free.

Types of Data Integrity

1️⃣ Domain Integrity

Ensures that values stored in a column are valid and within a specified domain.
Implemented using data types, constraints, and validation rules.

Example:

Age attribute in Student table must be between 18 and 30.
Course can only be {BTech, BSc, BCom}

2️⃣ Entity Integrity

Ensures that each row (tuple) in a table is uniquely identifiable.
Achieved by primary key constraint.
Primary key cannot be NULL.

Example:

Roll_No in Student table is a primary key → cannot be NULL and must be unique

3️⃣ Referential Integrity

Ensures that foreign key values in one table must match primary key values in another table.
Maintains consistency among related tables.

Example:

Employee.Department_ID in Employee table must exist in Department.Dept_ID
Prevents orphan records

Conclusion (1 mark)

Data integrity constraints ensure that the database remains accurate, consistent, and reliable, which is crucial for relational database management systems.

6️⃣ Data Manipulation in Relational Model

Introduction (1 mark)

Data manipulation refers to the process of adding, deleting, modifying, or retrieving data in a relational database.
It is performed using Data Manipulation Language (DML), but here we focus on the concepts, not SQL syntax.

1. Insert Operation

Used to add new tuples (rows) to a relation (table).
Ensures that new data obeys integrity constraints.

Example:
Adding a new student record in Student table.

2. Delete Operation

Used to remove existing tuples from a relation.
Must ensure referential integrity is not violated.

Example:
Removing a student who has left the college.

3. Update Operation

Used to modify existing data in one or more tuples.
Must maintain data integrity and consistency.

Example:
Changing a student’s course from BSc to BTech.

4. Selection Operation (Basic Idea)

Refers to retrieving specific tuples from a table based on conditions.
Helps in querying only relevant data.

Example:
Select all students with Age > 20 from the Student table.

Conclusion (1 mark)

Data manipulation operations allow users to manage and maintain data efficiently in a relational database while ensuring integrity, consistency, and relevance of information.

🔥 UNIT–3 : SQL

🟥 TOP PRIORITY

1️⃣ SQL (Structured Query Language)

Introduction (1 mark)

SQL is a standard computer language used for managing and manipulating relational databases.

Definition (1 mark)

SQL is a high-level language that allows users to define, manipulate, control, and query data in a relational database.

Purpose of SQL (1 mark)

To create and modify database structures
To insert, update, delete, and retrieve data
To control access and security of data
To manage integrity and consistency of the database

Types of SQL (6–7 marks)

1️⃣ DDL – Data Definition Language

Used to define and modify database structures
Commands: CREATE, ALTER, DROP
Example: CREATE TABLE Student(...)

2️⃣ DML – Data Manipulation Language

Used to manipulate data stored in tables
Commands: INSERT, UPDATE, DELETE, SELECT
Example: INSERT INTO Student VALUES(...)

3️⃣ DCL – Data Control Language

Used to control access and permissions
Commands: GRANT, REVOKE
Example: GRANT SELECT ON Student TO User1

4️⃣ TCL – Transaction Control Language (Optional Mention)

Used to manage transactions
Commands: COMMIT, ROLLBACK, SAVEPOINT

Conclusion (1 mark)

SQL provides a comprehensive set of commands for creating, manipulating, controlling, and querying databases, making it the core language for relational database management.

2️⃣ DDL Commands (Data Definition Language)

Introduction

DDL (Data Definition Language) is used to define and manage database structures, such as tables, schemas, and indexes.
It focuses on the structure of the database, not the data itself.

1️⃣ CREATE

Purpose: To create new database objects like tables, views, or databases.
Use: Defines the structure of tables and columns.
Example: Creating a table named Student.

2️⃣ DROP

Purpose: To delete existing database objects permanently.
Use: Removes the table or database along with its structure and data.
Example: Dropping the Student table.

3️⃣ ALTER

Purpose: To modify the structure of an existing table.
Use: Can add, delete, or modify columns and constraints.
Example: Adding a new column Course to the Student table.

4️⃣ TRUNCATE

Purpose: To delete all data from a table while keeping its structure intact.
Use: Quickly removes all rows without logging individual deletions.
Example: Removing all records from the Student table.

3️⃣ DML Commands (Data Manipulation Language)

Introduction

DML (Data Manipulation Language) is used to manipulate data stored in database tables.
It allows users to insert, update, delete, and retrieve data, but does not affect the table structure.

1️⃣ SELECT

Purpose: To retrieve data from one or more tables.
Optional Clauses:
- WHERE: Selects rows based on a condition.
  Example: Select students with Age > 20.
- DISTINCT: Retrieves unique values only.
  Example: Select unique courses from Student table.
- ORDER BY: Sorts the result ascending or descending.
  Example: Order students by Name in ascending order.

2️⃣ INSERT

Purpose: To add new rows (tuples) into a table.
Use: Adds new records while maintaining integrity constraints.
Example: Adding a new student to the Student table.

3️⃣ UPDATE

Purpose: To modify existing data in one or more rows.
Use: Updates values while maintaining data consistency.
Example: Changing the Course of a student from BSc to BTech.

4️⃣ DELETE

Purpose: To remove existing rows from a table.
Use: Deletes records based on conditions, maintaining referential integrity.
Example: Removing students who have left the college.

4️⃣ SQL Data Types

INT (Integer)
- Stores whole numbers without decimals.
- Example: 10, 250, -5
FLOAT (Floating Point)
- Stores decimal numbers (approximate values).
- Example: 3.14, 0.5, -2.75
CHAR (Character)
- Stores fixed-length text.
- Example: CHAR(5) → “Hello”
VARCHAR (Variable Character)
- Stores variable-length text.
- Example: VARCHAR(20) → “Database”
DATE
- Stores date values in YYYY-MM-DD format.
- Example: 2025-12-20
BOOLEAN
- Stores TRUE or FALSE values.
- Example: TRUE, FALSE

🟧 HIGH PRIORITY

5️⃣ Aggregate Functions in SQL

Introduction (1 mark)

Aggregate functions perform calculations on a set of values and return a single summary value.
They are commonly used in reporting, grouping, and analytics.

Common Aggregate Functions (5–6 marks)

Function	Purpose	Example
COUNT()	Counts the number of rows	COUNT(*) → Number of students
SUM()	Adds values of a column	SUM(Marks) → Total marks of all students
AVG()	Calculates average value	AVG(Marks) → Average marks
MIN()	Returns minimum value	MIN(Age) → Youngest student
MAX()	Returns maximum value	MAX(Age) → Oldest student

Use with GROUP BY

GROUP BY is used to group rows with same values in one or more columns.
Aggregate functions can then summarize each group separately.

Example:

Total marks obtained by students in each course.

SELECT Course, SUM(Marks)
FROM Student
GROUP BY Course;

Use with HAVING

HAVING is used to filter groups based on aggregate values (like WHERE but for groups).

Example:

Courses with total marks > 200.

SELECT Course, SUM(Marks)
FROM Student
GROUP BY Course
HAVING SUM(Marks) > 200;

Conclusion (1 mark)

Aggregate functions allow summarizing and analyzing data efficiently, especially when combined with GROUP BY and HAVING clauses.

6️⃣ JOIN Expressions in SQL

Introduction (1 mark)

JOINs are used to combine rows from two or more tables based on a related column between them.
They help in retrieving related data from multiple tables efficiently.

1️⃣ INNER JOIN

Definition: Returns only the matching rows from both tables.
Example: Retrieve students with their course details where enrollment exists.

SELECT Student.Name, Course.Course_Name
FROM Student
INNER JOIN Enrollment
ON Student.Roll_No = Enrollment.Student_Roll_No;

2️⃣ LEFT JOIN (or LEFT OUTER JOIN)

Definition: Returns all rows from the left table, and matching rows from the right table.
Example: List all students and their courses, including students who are not enrolled.

SELECT Student.Name, Course.Course_Name
FROM Student
LEFT JOIN Enrollment
ON Student.Roll_No = Enrollment.Student_Roll_No;

3️⃣ RIGHT JOIN (or RIGHT OUTER JOIN)

Definition: Returns all rows from the right table, and matching rows from the left table.
Example: List all courses and students enrolled, including courses with no students.

SELECT Student.Name, Course.Course_Name
FROM Student
RIGHT JOIN Enrollment
ON Student.Roll_No = Enrollment.Student_Roll_No;

4️⃣ FULL JOIN (or FULL OUTER JOIN)

Definition: Returns all rows when there is a match in either left or right table.
Example: List all students and courses, including unmatched students or courses.

SELECT Student.Name, Course.Course_Name
FROM Student
FULL JOIN Enrollment
ON Student.Roll_No = Enrollment.Student_Roll_No;

Conclusion (1 mark)

JOIN expressions allow combining related data from multiple tables.

INNER JOIN: Only matches
LEFT JOIN: All left + matches
RIGHT JOIN: All right + matches
FULL JOIN: All rows from both tables

🔥 UNIT–4: SCHEMA REFINEMENT & QUERY PROCESSING & OPTIMIZATION

🟥 TOP PRIORITY (VERY HIGH CHANCE)

1️⃣ Normalization

Introduction (1 mark)

Normalization is the process of organizing data in a database to reduce redundancy and avoid anomalies during insertion, deletion, or update.

Purpose of Normalization (1 mark)

Remove redundancy – avoids duplicate data.
Prevent anomalies – eliminates insert, update, and delete anomalies.
Improve data consistency – ensures data is accurate and reliable.

Types of Normal Forms

1️⃣ First Normal Form (1NF)

Definition: Table is in 1NF if all columns have atomic (indivisible) values and there are no repeating groups.
Example:

	| Student_ID |  Name   | Courses       |  
    |------------|---------|---------------|  
    | 101        | Rohan   | BTech, Math   | → Not 1NF

Converted to 1NF:

Student_ID	Name	Course
101	Rohan	BTech
101	Rohan	Math

2️⃣ Second Normal Form (2NF)

Definition: Table is in 2NF if it is in 1NF and no non-prime attribute is partially dependent on part of a composite primary key.
Example:
| Student_ID | Course_ID | Course_Name | Grade |
Partial dependency: Course_Name depends only on Course_ID → Not 2NF

Converted to 2NF:

Student_Course(Student_ID, Course_ID, Grade)
Course(Course_ID, Course_Name)

3️⃣ Third Normal Form (3NF)

Definition: Table is in 3NF if it is in 2NF and no transitive dependency exists.
Example:
| Emp_ID | Emp_Name | Dept_ID | Dept_Name |
Dept_Name depends on Dept_ID, not Emp_ID → Transitive dependency

Converted to 3NF:

Employee(Emp_ID, Emp_Name, Dept_ID)
Department(Dept_ID, Dept_Name)

4️⃣ Boyce-Codd Normal Form (BCNF)

Definition: Stronger version of 3NF.
Every determinant must be a candidate key.
Resolves anomalies that 3NF cannot handle in some rare cases.

Conclusion (1–2 marks)

Normalization ensures efficient database design, reduces redundancy, prevents anomalies, and improves data consistency.

1NF → Atomic values
2NF → Remove partial dependency
3NF → Remove transitive dependency
BCNF → Stronger than 3NF

2️⃣ Functional Dependencies (Single-Valued)

Introduction (1 mark)

A functional dependency (FD) is a relationship between two sets of attributes in a relation where the value of one attribute (or set of attributes) uniquely determines the value of another attribute (or set of attributes).
It is a fundamental concept in normalization.

Notation (1 mark)

Represented as:

A→BA \rightarrow BA→B

Meaning: Attribute B is functionally dependent on attribute A.
Example:
- In a Student table:
  - Roll_No → Name
  - This means Roll_No uniquely determines the Name.

Use in Normal Forms (5–6 marks)

1. First Normal Form (1NF)

Ensures atomic values, which reduces repeating groups.
Functional dependencies are identified to remove multivalued attributes.

2. Second Normal Form (2NF)

Eliminates partial dependency.
A non-prime attribute should be fully dependent on the whole primary key, not part of it.
FD Example:
- Composite Key (Student_ID, Course_ID) → Grade
- Grade depends on the whole key, not just Student_ID.

3. Third Normal Form (3NF)

Eliminates transitive dependency.
FD Example:
- Emp_ID → Dept_ID → Dept_Name
- Dept_Name depends transitively on Emp_ID → move Dept_Name to Department table.

4. Boyce-Codd Normal Form (BCNF)

Ensures that every determinant is a candidate key.
FD Example:
- If Course → Instructor, but Course is not a candidate key, BCNF requires decomposition.

3️⃣ Decomposition

Introduction (1 mark)

Decomposition is the process of breaking a single relation (table) into two or more relations to eliminate redundancy and anomalies while maintaining data integrity.

Purpose

Achieve higher normal forms (1NF, 2NF, 3NF, BCNF)
Reduce redundancy and update, insert, delete anomalies
Ensure lossless join and dependency preservation

1️⃣ Lossless Decomposition (1–2 marks)

A decomposition is lossless if the original table can be reconstructed exactly by joining the decomposed tables.
Ensures no data is lost after decomposition.

Example:
Original Table: Student_Course(Student_ID, Course_ID, Grade)

Decompose into:
1. Student(Student_ID, Name)
2. Enrollment(Student_ID, Course_ID, Grade)

Join: Student ⋈ Enrollment → Original table

2️⃣ Dependency Preservation (1–2 marks)

A decomposition preserves functional dependencies if all original dependencies can be enforced without joining the tables.
Ensures data integrity rules remain intact in decomposed tables.

Example of Decomposition (2–3 marks)

Problem: Dept_Name depends on Dept_ID → transitive dependency (not 3NF)

Decomposition into 3NF:

Employee(Emp_ID, Emp_Name, Dept_ID)
Department(Dept_ID, Dept_Name)

Now, redundancy and anomalies are removed
Lossless: Joining Employee ⋈ Department reconstructs original table
Dependency preserved: Emp_ID → Dept_ID and Dept_ID → Dept_Name

Conclusion (1 mark)

Decomposition is essential for efficient, anomaly-free database design, ensuring lossless join and dependency preservation while achieving higher normal forms.

🔥 UNIT‑5: TRANSACTION MANAGEMENT, CONCURRENCY & SECURITY

1️⃣ Transaction Concept and ACID Properties

Introduction (1 mark)

A transaction is a logical unit of work in a database that performs one or more operations (like read, write, update) on the database.

A transaction is atomic, meaning it is either fully executed or not executed at all.
Transactions ensure data consistency, integrity, and reliability.

ACID Properties (8 marks)

1️⃣ Atomicity (A)

Definition: A transaction is atomic, meaning it is indivisible.
Either all operations succeed or none are applied.
Example:
- Transferring $100 from Account A to Account B:
  1. Debit A by $100
  2. Credit B by $100
- If any step fails, rollback occurs and no money is transferred.

2️⃣ Consistency (C)

Definition: Transaction must take the database from one valid state to another, maintaining all integrity constraints.
Example:
- Inserting a student record with a valid Roll_No and Age.
- Database rules like primary key uniqueness remain intact after transaction.

3️⃣ Isolation (I)

Definition: Transactions are executed independently.
Intermediate results of a transaction are not visible to other transactions until commit.
Example:
- Two transactions update the same account simultaneously.
- Isolation ensures final balance is correct, preventing dirty reads.

4️⃣ Durability (D)

Definition: Once a transaction is committed, its changes are permanent, even if there is a system crash.
Example:
- After transferring money, even if the system fails, the updated balances remain in the database.

Conclusion (1 mark)

Transactions ensure reliable, consistent, and concurrent operations in a database.
The ACID properties guarantee atomicity, consistency, isolation, and durability, which are essential for a robust DBMS.

2️⃣ Concurrent Transactions & Issues

Introduction (1 mark)

Concurrency in a database occurs when multiple transactions are executed simultaneously.

It improves system efficiency and resource utilization.
However, it can cause anomalies or inconsistencies if not controlled properly.

Problems in Concurrent Execution (8 marks)

1️⃣ Lost Update

Definition: Occurs when two transactions simultaneously update the same data, and one update overwrites the other.
Example:
- Transaction T1 reads Account A = 100 and adds 50 → 150
- Transaction T2 reads Account A = 100 and subtracts 30 → 70
- Without control, final balance may become 70 instead of 120, losing T1’s update.

2️⃣ Temporary Inconsistency

Definition: Occurs when intermediate data produced by one transaction is read by another transaction, leading to wrong results.
Example:
- T1 transfers $100: Debit A = 500 → 400, Credit B = 200 → 300
- T2 reads B = 200 during transfer → temporary wrong balance

3️⃣ Uncommitted Dependency (Dirty Read)

Definition: Occurs when a transaction reads data written by another uncommitted transaction.
Example:
- T1 updates Employee salary to 50,000 but not yet committed
- T2 reads salary as 50,000
- If T1 rolls back, T2 has inconsistent data (dirty read).

Conclusion (1 mark)

Concurrent execution improves performance, but can lead to lost updates, temporary inconsistency, and dirty reads.

To prevent these problems, DBMS uses concurrency control mechanisms like locking, timestamps, and isolation levels.

7️⃣ Deadlock in DBMS

Introduction (1 mark)

A deadlock occurs when two or more transactions are waiting indefinitely for resources held by each other, so none of them can proceed.

Necessary Conditions for Deadlock (4 marks)

Mutual Exclusion
- At least one resource must be held in a non-shareable mode.
- Only one transaction can use it at a time.
Hold and Wait
- A transaction holds at least one resource and waits for additional resources held by other transactions.
No Preemption
- Resources cannot be forcibly taken from a transaction; they are released only voluntarily.
Circular Wait
- A set of transactions exists such that each transaction is waiting for a resource held by the next transaction in a cycle.

Detection & Prevention (3–4 marks)

Detection:

DBMS monitors resource allocation graph or uses wait-for graph.
If a cycle is detected, a deadlock exists.
Action: Abort one or more transactions to break the cycle.

Prevention:

Eliminate one or more necessary conditions:
1. Mutual exclusion: Make resources sharable if possible
2. Hold and wait: Require transactions to request all resources at once
3. No preemption: Preempt resources from transactions
4. Circular wait: Impose an ordering of resource requests

Conclusion (1 mark)

Deadlocks block concurrent transactions, reducing system performance.
Detection and prevention techniques ensure smooth transaction execution in DBMS.

7️⃣ Deadlock in DBMS

Introduction (1 mark)

A deadlock occurs when two or more transactions are waiting indefinitely for resources held by each other, so none of them can proceed.

Necessary Conditions for Deadlock (4 marks)

Mutual Exclusion
- At least one resource must be held in a non-shareable mode.
- Only one transaction can use it at a time.
Hold and Wait
- A transaction holds at least one resource and waits for additional resources held by other transactions.
No Preemption
- Resources cannot be forcibly taken from a transaction; they are released only voluntarily.
Circular Wait
- A set of transactions exists such that each transaction is waiting for a resource held by the next transaction in a cycle.

Detection & Prevention (3–4 marks)

Detection:

DBMS monitors resource allocation graph or uses wait-for graph.
If a cycle is detected, a deadlock exists.
Action: Abort one or more transactions to break the cycle.

Prevention:

Eliminate one or more necessary conditions:
1. Mutual exclusion: Make resources sharable if possible
2. Hold and wait: Require transactions to request all resources at once
3. No preemption: Preempt resources from transactions
4. Circular wait: Impose an ordering of resource requests

Conclusion (1 mark)

Deadlocks block concurrent transactions, reducing system performance.
Detection and prevention techniques ensure smooth transaction execution in DBMS.

Transaction in DBMS

1️⃣ What is a Transaction?

A transaction is a logical unit of work that performs one or more database operations (like read, write, update) as a single unit.

Either the transaction completes fully (commit) or does not execute at all (abort/rollback).
Ensures consistency, integrity, and reliability of the database.

2️⃣ Properties of a Transaction (ACID)

Atomicity: All operations of a transaction succeed or none succeed.
Consistency: Transaction moves database from one valid state to another.
Isolation: Transactions execute independently without affecting each other.
Durability: Committed changes are permanent, even if system fails.

3️⃣ Types of Transactions

a) Based on Purpose

Read-only Transaction
- Only reads data and performs no updates.
- Example: Viewing account balance.
Update Transaction
- Reads and modifies data.
- Example: Transferring money between accounts.

b) Based on Execution

Single-user Transaction
- Only one user executes the transaction at a time.
- Example: Updating personal profile in a local database.
Multi-user Transaction
- Multiple users concurrently access and modify data.
- Requires concurrency control to avoid anomalies.

c) Based on Duration

Short Transactions
- Execute quickly, often read/write a few records.
- Example: Updating a single student's marks.
Long Transactions
- Take more time and may involve multiple operations.
- Example: Processing end-of-month payroll for all employees.

4️⃣ Transaction States (Lifecycle)

State	Description
Active	Transaction is executing.
Partially Committed	Final operation executed, but not yet committed.
Committed	All operations successfully completed; changes permanent.
Failed	Some operation failed; transaction cannot commit.
Aborted/Rolled Back	Transaction rolled back to initial state; no changes saved.
Terminated	Transaction ends after commit or rollback.

5️⃣ Types of Failures

System Crash: Hardware or software failure during transaction.
Transaction Failure: Logical error or constraint violation.
Media Failure: Disk or storage corruption.

Recovery mechanisms like logs, checkpoints, and rollbacks restore consistency.

6️⃣ Importance of Transactions

Ensures data consistency and integrity
Prevents lost updates, dirty reads, and temporary inconsistencies
Supports concurrent multi-user environments
Guarantees durability of committed changes

Concurrency in DBMS

1️⃣ What is Concurrency?

Concurrency occurs when multiple transactions are executed simultaneously in a database system.

Improves system throughput and resource utilization.
Without proper control, concurrency can lead to data anomalies and inconsistencies.

2️⃣ Types of Concurrency

Read-Only Concurrency
- Transactions only read data, no updates.
- Example: Multiple users viewing the same report.
Update Concurrency
- Transactions read and write data simultaneously.
- Requires concurrency control to avoid conflicts.
- Example: Two users updating the same account balance.

3️⃣ Problems in Concurrent Execution

a) Lost Update

Definition: Two transactions simultaneously update the same data, and one update overwrites the other.
Example:
- T1 reads Account A = 100 and adds 50 → 150
- T2 reads Account A = 100 and subtracts 30 → 70
- Final balance = 70 (T1 update lost)

b) Temporary Inconsistency

Definition: A transaction reads intermediate data of another transaction that has not yet committed, causing inconsistency.
Example:
- T1 transfers $100: Debit A → 400, Credit B → 300
- T2 reads B = 200 during transfer → temporary wrong balance

c) Uncommitted Dependency (Dirty Read)

Definition: A transaction reads data written by another uncommitted transaction.
Example:
- T1 updates Employee salary to 50,000 (not committed)
- T2 reads salary = 50,000
- T1 rolls back → T2 read invalid data (dirty read)

4️⃣ Conclusion

Concurrency improves database performance, but may cause lost updates, temporary inconsistencies, and dirty reads.
To prevent these problems, DBMS uses locks, 2PL, timestamps, and isolation levels.

FilesExpand file tree

RDBMS.md

Latest commit

History

RDBMS.md

File metadata and controls

🔥 UNIT–1 : MOST IMPORTANT TOPICS (EXAM-ORIENTED)

🟥 TOP PRIORITY (VERY HIGH CHANCE – ALMOST GUARANTEED)

Database & DBMS

1. Database – Definition

2. DBMS – Definition

3. Database System

4. Characteristics of Database Approach

1️⃣ Self-Describing Nature of Database System

2️⃣ Program–Data Independence

3️⃣ Data Abstraction

Levels of Data Abstraction:

4️⃣ Multiple Views of Data

5️⃣ Data Sharing and Multi-User Access

File System vs Database System (DBMS)

Definition (Intro – 1 mark)

Difference between File System and DBMS

Why DBMS is Better than File System? (Conclusion – 2 marks)

Advantages and Disadvantages of DBMS

Introduction (1 mark)

Advantages of DBMS

1️⃣ Reduced Data Redundancy

2️⃣ Improved Data Consistency

3️⃣ Data Sharing

4️⃣ Data Security

5️⃣ Backup and Recovery

6️⃣ Data Independence

Disadvantages of DBMS

1️⃣ High Cost

2️⃣ Complexity

3️⃣ Performance Overhead

4️⃣ Skilled Manpower Required

Data Abstraction and Data Independence

Introduction (1 mark)

1. Data Abstraction

Definition

Levels of Data Abstraction

1️⃣ Physical Level

2️⃣ Logical Level

3️⃣ View Level

2. Data Independence

Definition

Types of Data Independence

1️⃣ Logical Data Independence

2️⃣ Physical Data Independence

Difference between Data Abstraction and Data Independence

Conclusion (1–2 marks)

🟧 HIGH PRIORITY (VERY LIKELY)

DBMS Architecture

Introduction (1 mark)

1️⃣ One-Tier (1-Tier) Architecture

Definition

Diagram (Textual)

Explanation

Example

2️⃣ Two-Tier (2-Tier) Architecture

Definition

Diagram

Explanation

Example

3️⃣ Three-Tier (3-Tier) Architecture

Definition

Diagram

Explanation

Advantages of Three-Tier Architecture

1️⃣ Improved Security

2️⃣ Scalability

3️⃣ Better Performance

4️⃣ Easy Maintenance

5️⃣ High Reliability

Conclusion (1–2 marks)

Database Users and Database Administrator (DBA)

Introduction (1 mark)

1. Types of Database Users

1️⃣ Naïve Users