Commit 5e405e5

FHWN DB Introduction Init
1 parent 112e61f commit 5e405e5

4 files changed

Lines changed: 152 additions & 0 deletions

# Introduction to DB / RDBMS

> [!CAUTION]
> This is only a brief listing of some key points of the course material, and it **cannot** be considered full-fledged learning material in itself.
> It only serves as a short recap of the most important concepts.
> Reading the book and additional resources is still necessary.

## Motivation for DB

### What do we want/need?

Store (a lot of) data (in a structured way).

![What do we want](./what_do_we_want.jpg)

### Why?

To get answers to questions and/or keep track of the state of things.
- What is the latest news?
- How many people liked my last Insta post, and who were they?
- Which supermarket nearby is still open?
- etc.

### What do we do with that data?

**C**reate it

**R**ead it

**U**pdate it

**D**elete it
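
As a minimal sketch of the four CRUD operations, the snippet below uses Python's built-in `sqlite3` module with an in-memory database. The `beer` table and its columns are made up for illustration and are not part of the course material.

```python
import sqlite3

# In-memory database: nothing is written to disk (chosen only for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE beer (id INTEGER PRIMARY KEY, name TEXT, rating INTEGER)"
)

# Create: add a new row.
conn.execute("INSERT INTO beer (name, rating) VALUES (?, ?)", ("Pilsner", 3))

# Read: fetch what is stored.
after_create = conn.execute("SELECT name, rating FROM beer").fetchall()

# Update: change an existing row.
conn.execute("UPDATE beer SET rating = ? WHERE name = ?", (4, "Pilsner"))
after_update = conn.execute("SELECT name, rating FROM beer").fetchall()

# Delete: remove the row again.
conn.execute("DELETE FROM beer WHERE name = ?", ("Pilsner",))
after_delete = conn.execute("SELECT name, rating FROM beer").fetchall()

print(after_create)  # [('Pilsner', 3)]
print(after_update)  # [('Pilsner', 4)]
print(after_delete)  # []

conn.close()
```

The same four verbs apply to any storage tool; only the syntax differs.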

## With what tool?

### General wisdom

> Use the right tool for the job.

Each scenario / requirement / use case is different. Choose the tool for storing the data based on your needs, not on what *the best* or *the state-of-the-art* tool is.

> If your only tool is a hammer, every problem looks like a nail.

RDBMS is not the only solution.
It is a **very** important tool to know, but not suited for everything.
If it *feels* inadequate, look for a more fitting tool; in most cases there is one.
Be open to learning, extend your knowledge, and avoid becoming a one-trick pony.

> Don't bring a knife to a gunfight.

Excel. is. NOT. a. database!

![Excel](./excel.jpg)

> Don't use a sledgehammer to crack a nut.

The tool for your beer tasting diary does not need a distributed Oracle Cloud Autonomous Database with in-memory processing, multi-region replication, and Kubernetes orchestration.

> If it ain't broke, don't fix it.

If your backend runs fine on a good old local MySQL DB, there's no need to ditch it for a trendy BaaS.

### Rules of thumb

**Small amount of data, mostly hierarchical and changing structure, single user**
→ A JSON file should do the trick.
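
A rough sketch of what "a JSON file as storage" can look like, using only Python's standard library. The file name `diary.json` and the record fields are hypothetical; the file is placed in a temporary directory so the example leaves nothing behind in your project.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical storage location for the demo.
path = Path(tempfile.mkdtemp()) / "diary.json"

# Write the initial, hierarchical data.
diary = {"owner": "me", "entries": [{"beer": "Pilsner", "rating": 3}]}
path.write_text(json.dumps(diary, indent=2), encoding="utf-8")

# Load, modify, and save again. The new entry has an extra "notes" field:
# nothing forces all entries to share one structure.
loaded = json.loads(path.read_text(encoding="utf-8"))
loaded["entries"].append({"beer": "Stout", "rating": 5, "notes": "roasty"})
path.write_text(json.dumps(loaded, indent=2), encoding="utf-8")

reloaded = json.loads(path.read_text(encoding="utf-8"))
print(len(reloaded["entries"]))  # 2
```

Note that every change rewrites the whole file, which is one reason this approach fits small data and a single user.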

**If it grows a lot, but the structure keeps changing** → Good time to move to a document-based NoSQL solution (MongoDB, Couchbase, etc.).

**You don't want to manage it yourself and are looking for a cloud solution**
→ Something like Firebase is easy to set up and can carry you a long way.

**If the structure solidifies and is relational** → Probably a good time to learn about RDBMS. It is logical to move for safety and speed, especially if the data keeps growing.

**Small amount of mostly tabular data, with infrequent changes or deletions, single user with manual editing needs**
→ A spreadsheet application will suffice.

**Same, but programmatic access is needed**
→ You can probably still get away with a spreadsheet and a dataframe library like pandas.

**Data keeps growing, just a few sheets, just additions, mostly analytical functions, pivot tables**
→ OLAP is the way.

**Or instead: more and more sheets, interconnected by lots of `XLOOKUP`, `INDEX`, etc., and/or a need for multiple users**
→ This points toward OLTP with an RDBMS.
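
To illustrate why interconnected sheets hint at a relational database: what a spreadsheet does with lookup formulas across sheets, an RDBMS expresses as a join between tables. The sketch below uses Python's built-in `sqlite3` module; the `customer` and `purchase` tables and their contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE purchase (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    amount      REAL NOT NULL
);
INSERT INTO customer (id, name) VALUES (1, 'Anna'), (2, 'Ben');
INSERT INTO purchase (customer_id, amount) VALUES (1, 10.0), (1, 5.5), (2, 7.0);
""")

# One query replaces a column of per-row lookups: match each purchase
# to its customer, then total the amounts per customer.
rows = conn.execute("""
    SELECT c.name, SUM(p.amount)
    FROM purchase AS p
    JOIN customer AS c ON c.id = p.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(rows)  # [('Anna', 15.5), ('Ben', 7.0)]
conn.close()
```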

#### RDBMS rules of thumb

**Small/medium size, single user, manual edits, reports**
→ MS Access / LibreOffice Base is a reasonable choice.

**Still a single user and small/medium size, but programmatic access is needed**
→ SQLite is most probably enough for these needs. (VERY prominent in embedded / mobile apps.)

**Larger size and/or need for multiple users, possibly over the internet**
→ A proper open-source DBMS, like MySQL or PostgreSQL, is a good choice.

**Parallelization and performance become crucial due to growing size and query frequency**
→ A state-of-the-art proprietary RDBMS, like Oracle or MS SQL Server, may become necessary.

**If it is not legally necessary for data to be on-premises, and infrastructure management is better outsourced**
→ Cloud-native managed RDBMS (AWS RDS, Azure SQL).

**Same, but simpler cases**
→ Supabase will probably ease your workflow (similar to Firebase in the NoSQL realm).

#### Still more

Confused already? We still haven't touched on:
- Time-series data → InfluxDB, TimescaleDB
- Special types of data (e.g. geodata) → PostGIS, GeoJSON
- Graph-like relationships → Neo4j
- Streaming platforms → Apache Kafka, Pulsar
- ...

> [!IMPORTANT]
> You **DON'T** need to learn all of these at once. You’ll likely never use more than half, but you don’t know which half.
> The key takeaway: there are tools for many needs. When the need arises, be curious.

## Relational DB basics

TODO: conceptual/logical/physical design
TODO: table/relation, attribute/column, row/record, schema/metadata, key, composite key, primary key, foreign key
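
As a rough illustration of some of the terms listed above (primary key, composite key, foreign key), here is a minimal sketch using Python's built-in `sqlite3` module. The `student` and `enrollment` tables and the `try_insert` helper are hypothetical and not taken from the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE student (
    student_id INTEGER PRIMARY KEY,      -- primary key: identifies one row
    name       TEXT NOT NULL
);
CREATE TABLE enrollment (
    student_id  INTEGER NOT NULL REFERENCES student(student_id),  -- foreign key
    course_code TEXT NOT NULL,
    PRIMARY KEY (student_id, course_code)  -- composite key: two columns together
);
INSERT INTO student (student_id, name) VALUES (1, 'Anna');
INSERT INTO enrollment (student_id, course_code) VALUES (1, 'DB101');
""")


def try_insert(student_id, course_code):
    """Hypothetical helper: report whether the database accepts the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO enrollment (student_id, course_code) VALUES (?, ?)",
                (student_id, course_code),
            )
        return "ok"
    except sqlite3.IntegrityError as exc:
        return f"rejected: {exc}"


fk_result = try_insert(99, "DB101")   # no student 99: foreign key violated
dup_result = try_insert(1, "DB101")   # same (student, course) pair again
ok_result = try_insert(1, "SQL201")   # new pair for an existing student

print(fk_result)
print(dup_result)
print(ok_result)
```

The point of declaring keys is that the database itself refuses inconsistent rows, instead of relying on every application to check.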

## Database (Query) language

TODO: need for formal way of asking questions and an algorithmic way of answering them
TODO: SQL, DDL, DML, DQL, DCL, DTL
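
To make the sublanguage acronyms above a bit more concrete, the following sketch labels a few statements by category, using Python's built-in `sqlite3` module. The `note` table is made up for the example. DCL (typically `GRANT` / `REVOKE`) is left out because SQLite has no user permission system, and transaction control is issued here through the Python connection's `commit()` and `rollback()` methods.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL (Data Definition Language): defines structure.
conn.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT)")

# DML (Data Manipulation Language): changes rows.
conn.execute("INSERT INTO note (body) VALUES (?)", ("keep me",))

# DTL (transaction control): make the change permanent.
conn.commit()

# Another DML statement, followed by DTL that undoes it.
conn.execute("INSERT INTO note (body) VALUES (?)", ("discard me",))
conn.rollback()

# DQL (Data Query Language): asks a question about the data.
bodies = [row[0] for row in conn.execute("SELECT body FROM note ORDER BY id")]

print(bodies)  # ['keep me']
conn.close()
```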

## Recapception (recap of the recap)
![Acronyms](./acronyms.jpg)

Test yourself by explaining the meaning of these acronyms:
- TLA
- DB
- DBMS
- RDBMS
- CRUD
- BaaS
- JSON
- OLAP
- SQL
- DDL
- DML
- DTL
- DCL
- DQL
- PK