MANGA BOT OPEN SOURCE TEAM PROJECT

CONCEPT:

This will be an easy-to-host Discord bot that lets people read manga inside Discord. Unlike the big public Discord bots, we will not run a central instance ourselves. We may put a demo on free hosting so people can try and test it initially.

Instead, people will host it on their own system and use it with their friends, or build it into their own product. It will be open source, so people may also sell it as part of their product. However, we will compile contributor credits into the main C++ engine binary, so the credits cannot be removed just by editing the Python bot code.

Languages used:

C++, Python, [HTML, CSS, JS]

C++:

C++ will handle the main web scraping: fetching info such as the manga title and thumbnail from the web. It will also implement the search system and fulfill requests from the Discord bot through JSON files: it reads instructions from request.json and writes its results back to response.json. The contributor credits will also be written into this engine.

PYTHON:

Python will be used to build the actual Discord bot with discord.py. It will write JSON files that give instructions to the C++ engine. The C++ engine will return the manga pages as image files, with each image's location and name listed in a JSON file. Because the bot is the part that talks to users the most, it will also filter them: if a user spams requests or commands, the bot will ban that user permanently. It will also report bot info to a web server, such as status, uptime, request counts, and error logs.

HTML, CSS, JavaScript:

We will also build a password-protected admin panel, served from a web server, for contributors only. It will show bot error logs, how many bots are active in real time, how many are working, and how many requests are being made. It will also show which users have been banned, and it will let us send announcements to all bots. This gives contributors admin power.

Technical details (code + algorithms)

Implementation details (C++ / Python / Web)

This page is for the real implementation plan: code structure, logic, and algorithms.

1) High-level architecture

Discord bot (Python / discord.py): handles commands, permissions, UI, rate limiting, and chooses what to request.
Engine (C++): scraping, parsing, search indexing, image processing, and returning results.
Shared contract (JSON): Python sends a request JSON to C++. C++ returns a response JSON.
Web admin panel (HTML/CSS/JS + server): collects logs, shows stats, and sends announcements.

2) Shared JSON contract (Python ⇄ C++)

Goals

Stable across versions
Easy to debug
Backward compatible when possible

Request JSON: request.json

{
	"request_id": "uuid-or-increment",
	"action": "search|info|chapter_list|pages|health",
	"source": "mangadex|site_x|site_y",
	"query": "one piece",
	"manga_id": "optional",
	"chapter_id": "optional",
	"language": "en",
	"page": 1,
	"limit": 10,
	"cache": true
}

Response JSON: response.json

{
	"request_id": "same-as-request",
	"ok": true,
	"error": null,
	"data": {
		"items": []
	}
}

Error model

ok=false
error object: { "code": "SCRAPE_FAILED", "message": "...", "retryable": true }
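
To make the error model concrete, here is a minimal sketch of how the Python side might read a response and turn ok=false into an exception. The names EngineError and parse_response are hypothetical and are not part of the project yet; only the JSON field names come from the contract above.

```python
import json


class EngineError(Exception):
    """Raised when the engine reports ok=false (hypothetical helper)."""

    def __init__(self, code: str, message: str, retryable: bool) -> None:
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.retryable = retryable


def parse_response(raw: str, expected_request_id: str) -> dict:
    """Return the `data` object, or raise EngineError when ok is false."""
    response = json.loads(raw)

    # A mismatched request_id means we are reading a stale or foreign file.
    if response.get("request_id") != expected_request_id:
        raise ValueError("response does not match the request that was sent")

    if not response.get("ok", False):
        err = response.get("error") or {}
        raise EngineError(
            code=err.get("code", "UNKNOWN"),
            message=err.get("message", ""),
            retryable=bool(err.get("retryable", False)),
        )

    return response.get("data") or {}
```

A caller could catch EngineError and retry only when retryable is true, which keeps the retry decision in one place.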

3) C++ (Engine) — scraping + algorithms

A) Folder structure suggestion

engine/
	src/
		main.cpp
		api/
			request_parser.cpp
			response_writer.cpp
		scrapers/
			site_x_scraper.cpp
			mangadex_scraper.cpp
		core/
			search_index.cpp
			cache.cpp
			rate_limit.cpp
			credits.cpp
	include/
	third_party/

B) Main algorithm flow

Read request.json
Validate fields (action, required IDs)
Route to handler (search/info/chapter_list/pages)
Scrape or fetch cached result
Write response.json
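
The flow above can be sketched as a small dispatcher. It is written in Python only to keep the examples on this page in one language; the real engine would implement the same steps in C++. The REQUIRED_FIELDS table and the BAD_REQUEST and NOT_IMPLEMENTED codes are assumptions for illustration; SCRAPE_FAILED is the code from the error model.

```python
from typing import Callable, Dict, List, Optional

# Which fields each action needs (an assumption, not a fixed part of the contract).
REQUIRED_FIELDS: Dict[str, List[str]] = {
    "search": ["query"],
    "info": ["manga_id"],
    "chapter_list": ["manga_id"],
    "pages": ["chapter_id"],
    "health": [],
}


def _error(request_id: Optional[str], code: str, message: str,
           retryable: bool = False) -> dict:
    """Build a response that follows the ok=false error model."""
    return {
        "request_id": request_id,
        "ok": False,
        "error": {"code": code, "message": message, "retryable": retryable},
        "data": None,
    }


def handle_request(request: dict,
                   handlers: Dict[str, Callable[[dict], dict]]) -> dict:
    """Validate the request, route it to a handler, and wrap the result."""
    request_id = request.get("request_id")
    action = request.get("action")

    if action not in REQUIRED_FIELDS:
        return _error(request_id, "BAD_REQUEST", f"unknown action: {action!r}")

    missing = [name for name in REQUIRED_FIELDS[action] if not request.get(name)]
    if missing:
        return _error(request_id, "BAD_REQUEST",
                      "missing fields: " + ", ".join(missing))

    handler = handlers.get(action)
    if handler is None:
        return _error(request_id, "NOT_IMPLEMENTED", f"no handler for {action}")

    try:
        data = handler(request)  # scrape, or return a cached result
    except Exception as exc:  # any handler failure becomes a retryable error
        return _error(request_id, "SCRAPE_FAILED", str(exc), retryable=True)

    return {"request_id": request_id, "ok": True, "error": None, "data": data}
```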

C) Scraping pipeline (per site)

Fetch HTML
Parse DOM
Extract:
- Title
- Thumbnail URL
- Manga ID
- Chapter list
- Page image URLs
Normalize outputs into one common internal model
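
As an illustration of the last step, the sketch below maps two differently shaped scraper outputs onto one internal model. The Manga dataclass and the raw field names (id, name, cover, slug, thumb, and so on) are invented for this example; real sites will use their own shapes, and the engine would define the equivalent struct in C++.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Manga:
    """Common internal model shared by all scrapers (hypothetical)."""
    source: str
    manga_id: str
    title: str
    thumbnail_url: str = ""
    chapter_ids: List[str] = field(default_factory=list)


def normalize_site_x(raw: dict) -> Manga:
    # Assumed site_x shape: numeric ids, a flat "name", a "cover" URL.
    return Manga(
        source="site_x",
        manga_id=str(raw["id"]),
        title=raw["name"].strip(),
        thumbnail_url=raw.get("cover") or "",
        chapter_ids=[str(ch["id"]) for ch in raw.get("chapters", [])],
    )


def normalize_site_y(raw: dict) -> Manga:
    # Assumed site_y shape: slug ids and a per-language title map.
    titles = raw.get("title", {})
    return Manga(
        source="site_y",
        manga_id=raw["slug"],
        title=(titles.get("en") or next(iter(titles.values()), "")).strip(),
        thumbnail_url=raw.get("thumb") or "",
        chapter_ids=list(raw.get("chapter_slugs", [])),
    )
```

Once every scraper returns the same model, the search, cache, and response-writing code does not need to know which site the data came from.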

D) Search algorithm options

Option 1: Simple (good start)

Keep an in-memory list of titles per site
On search, do:
- lowercase
- remove punctuation
- substring match + basic scoring

Option 2: Better scoring (still simple)

Tokenize query and titles
Score = token overlap + prefix bonus + length penalty
Return top N
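
Here is a rough sketch of Option 2, using the normalization from Option 1. The weights (0.5 prefix bonus, 0.01 per extra token) are placeholders chosen for the example, not tuned values, and the function names are hypothetical. The same logic ports directly to C++.

```python
import string
from typing import List

_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text: str) -> str:
    """Lowercase, strip ASCII punctuation, and collapse whitespace."""
    return " ".join(text.lower().translate(_PUNCT_TABLE).split())


def score(query: str, title: str) -> float:
    """Token overlap + prefix bonus - length penalty."""
    q, t = normalize(query), normalize(title)
    q_tokens, t_tokens = set(q.split()), set(t.split())
    if not q_tokens or not t_tokens:
        return 0.0

    overlap = len(q_tokens & t_tokens) / len(q_tokens)   # share of query tokens found
    prefix_bonus = 0.5 if t.startswith(q) else 0.0        # title begins with the query
    length_penalty = 0.01 * max(0, len(t_tokens) - len(q_tokens))
    return overlap + prefix_bonus - length_penalty


def search(query: str, titles: List[str], limit: int = 10) -> List[str]:
    """Return the top `limit` titles with a positive score, best first."""
    scored = [(score(query, title), title) for title in titles]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties break alphabetically
    return [title for _, title in scored[:limit]]
```

For example, searching "one piece" over the titles "One Piece", "One Punch Man", and "Naruto" ranks "One Piece" first and drops "Naruto", which shares no tokens with the query.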

E) Caching strategy

Cache by (source, action, manga_id, chapter_id, language)
Disk cache for:
- chapter list
- page URLs
Time-based expiry (example: 6 hours)
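
A minimal sketch of this strategy, assuming one JSON file per cache entry and the 6-hour expiry from the example. The function names and the on-disk layout are assumptions; the engine would implement the equivalent in cache.cpp.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Any, Optional

TTL_SECONDS = 6 * 60 * 60  # example expiry: 6 hours


def cache_key(source: str, action: str, manga_id: str = "",
              chapter_id: str = "", language: str = "") -> str:
    """Hash the (source, action, manga_id, chapter_id, language) tuple."""
    raw = "|".join([source, action, manga_id, chapter_id, language])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def cache_put(cache_dir: Path, key: str, value: Any,
              now: Optional[float] = None) -> None:
    """Store a JSON-serializable value together with its timestamp."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = {"stored_at": time.time() if now is None else now, "value": value}
    (cache_dir / f"{key}.json").write_text(json.dumps(entry), encoding="utf-8")


def cache_get(cache_dir: Path, key: str,
              now: Optional[float] = None) -> Optional[Any]:
    """Return the cached value, or None if it is missing or expired."""
    path = cache_dir / f"{key}.json"
    if not path.exists():
        return None
    entry = json.loads(path.read_text(encoding="utf-8"))
    now = time.time() if now is None else now
    if now - entry["stored_at"] > TTL_SECONDS:
        return None  # expired: the caller should scrape again
    return entry["value"]
```

Hashing the key keeps file names short and safe even when IDs contain slashes or other characters that are awkward in paths.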

F) Pages output

C++ downloads images (or streams them) and outputs:
- file paths
- width/height if available
- page count

Example data.pages payload:

{
	"pages": [
		{ "index": 1, "path": "./cache/onepiece/1071/001.jpg" },
		{ "index": 2, "path": "./cache/onepiece/1071/002.jpg" }
	]
}
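
One way such a payload could be assembled from a directory of downloaded images is sketched below. It assumes page files are zero-padded (001.jpg, 002.jpg, ...) so that a plain sort gives reading order; the function name and the page_count key are illustrative.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def build_pages_payload(chapter_dir: Path) -> dict:
    """List image files in reading order and number them from 1."""
    files = sorted(
        p for p in chapter_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    pages = [
        {"index": i, "path": p.as_posix()}
        for i, p in enumerate(files, start=1)
    ]
    return {"pages": pages, "page_count": len(pages)}
```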

G) Contributor credits (harder to remove)

Practical approach

Store credits as:
- a compiled-in string table
- displayed via a command like /credits
- also included in engine --version output

Harder-to-remove ideas (still not perfect)

Spread credits across multiple translation units
Validate checksum of credit strings at runtime
Fail safe if tampered (but be careful: this can annoy legitimate forks)

4) Python (discord.py) — bot logic + safety

A) Folder structure suggestion

bot/
	bot.py
	cog_reader.py
	cog_admin.py
	engine_client.py
	config.py
	storage.py
	anti_spam.py

B) Bot flow per user command

Parse command (example: /manga search <query>)
Check permissions + server settings
Apply rate limits (per-user + per-guild)
Build request.json
Run the C++ engine (subprocess) or call it as a local service
Parse response.json
Render response:
- embeds
- buttons for next/prev page
- select menus for chapters
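
The "run the C++ engine (subprocess)" step might look like the sketch below. The command-line convention (engine binary followed by the request path and the response path) is an assumption, since the engine's CLI is not defined yet, and call_engine is a hypothetical helper that could live in engine_client.py.

```python
import json
import subprocess
import tempfile
from pathlib import Path
from typing import List


def call_engine(engine_cmd: List[str], request: dict,
                timeout: float = 30.0) -> dict:
    """Write request.json, run the engine once, and read response.json."""
    with tempfile.TemporaryDirectory() as tmp:
        request_path = Path(tmp) / "request.json"
        response_path = Path(tmp) / "response.json"
        request_path.write_text(json.dumps(request), encoding="utf-8")

        # Assumed CLI: <engine> <request.json> <response.json>
        subprocess.run(
            [*engine_cmd, str(request_path), str(response_path)],
            check=True,       # raise CalledProcessError on a non-zero exit code
            timeout=timeout,  # raise TimeoutExpired if the engine hangs
        )
        return json.loads(response_path.read_text(encoding="utf-8"))
```

Because subprocess.run blocks, a discord.py command handler would probably wrap this call, for example with asyncio.to_thread, so the bot's event loop keeps responding while the engine works. Using a fresh temporary directory per call also avoids two concurrent commands overwriting each other's request.json.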

C) Anti-spam algorithm (simple and effective)

Keep a sliding window per user:
- N commands in T seconds
If exceeded:
- warn
- temporary cooldown
Repeat offenders:
- add to ban list

Data model:

banned_users.json
cooldowns.json
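
A minimal in-memory sketch of the sliding window follows. The class name, the default limits, and the rule that every rejected command counts as one strike are assumptions for illustration; persisting the ban list to banned_users.json is left out.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_commands per user within window_seconds."""

    def __init__(self, max_commands: int = 5, window_seconds: float = 10.0,
                 strikes_before_ban: int = 3) -> None:
        self.max_commands = max_commands
        self.window_seconds = window_seconds
        self.strikes_before_ban = strikes_before_ban
        self._events = defaultdict(deque)   # user_id -> recent command times
        self._strikes = defaultdict(int)    # user_id -> rejected command count
        self.banned = set()

    def check(self, user_id: int, now: Optional[float] = None) -> str:
        """Return "ok", "cooldown", or "banned" for one incoming command."""
        now = time.monotonic() if now is None else now
        if user_id in self.banned:
            return "banned"

        window = self._events[user_id]
        # Drop timestamps that have slid out of the window.
        while window and now - window[0] >= self.window_seconds:
            window.popleft()

        if len(window) >= self.max_commands:
            self._strikes[user_id] += 1
            if self._strikes[user_id] >= self.strikes_before_ban:
                self.banned.add(user_id)
                return "banned"
            return "cooldown"

        window.append(now)
        return "ok"
```

The bot would call check() before building a request and reply with a warning on "cooldown" instead of contacting the engine.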

D) Instruction building (what to send to C++)

Always include request_id
Always include action
Only include IDs when needed
Keep a schema_version field once you start evolving the contract
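
These rules could be captured in one helper, sketched here. build_request and SCHEMA_VERSION are hypothetical names; the field names follow the request example earlier on this page, with schema_version added as suggested above.

```python
import uuid
from typing import Optional

SCHEMA_VERSION = 1  # bump when the contract changes in a breaking way


def build_request(action: str, source: str, *,
                  query: Optional[str] = None,
                  manga_id: Optional[str] = None,
                  chapter_id: Optional[str] = None,
                  language: str = "en",
                  page: int = 1,
                  limit: int = 10,
                  cache: bool = True) -> dict:
    """Build a request dict; optional fields are omitted when not given."""
    request = {
        "schema_version": SCHEMA_VERSION,
        "request_id": str(uuid.uuid4()),  # always present
        "action": action,                 # always present
        "source": source,
        "language": language,
        "page": page,
        "limit": limit,
        "cache": cache,
    }
    optional = {"query": query, "manga_id": manga_id, "chapter_id": chapter_id}
    request.update({k: v for k, v in optional.items() if v is not None})
    return request
```

For a search, only query is passed, so manga_id and chapter_id never appear in the JSON and the engine does not have to guess whether an empty ID is meaningful.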

5) Web (Admin panel) — logs, stats, announcements

A) Minimal components

Collector API (server): receives bot heartbeats + logs
Dashboard (web UI): shows bots online, errors, request counts
Auth: password login for contributors

B) Data to send from bots

bot_id
guild_count
uptime_seconds
requests_last_1h
errors_last_1h
engine_version
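
As a sketch, the heartbeat could be built from timestamps the bot already keeps. The Heartbeat dataclass mirrors the field list above; make_heartbeat and its parameters (lists of request and error timestamps in seconds) are assumptions about how the bot tracks its counters.

```python
import json
import time
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class Heartbeat:
    bot_id: str
    guild_count: int
    uptime_seconds: int
    requests_last_1h: int
    errors_last_1h: int
    engine_version: str


def make_heartbeat(bot_id: str, guild_count: int, started_at: float,
                   request_times: List[float], error_times: List[float],
                   engine_version: str, now: Optional[float] = None) -> str:
    """Serialize one heartbeat as a JSON string for the collector API."""
    now = time.time() if now is None else now
    cutoff = now - 3600  # only events from the last hour are counted
    heartbeat = Heartbeat(
        bot_id=bot_id,
        guild_count=guild_count,
        uptime_seconds=int(now - started_at),
        requests_last_1h=sum(1 for t in request_times if t >= cutoff),
        errors_last_1h=sum(1 for t in error_times if t >= cutoff),
        engine_version=engine_version,
    )
    return json.dumps(asdict(heartbeat))
```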

C) Announcement flow

Admin creates announcement in panel
Server stores it + assigns announcement_id
Bots poll periodically or receive a push
Bot posts announcement to configured channels
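
For the polling variant, each bot only needs to remember the last announcement it posted. The sketch below assumes announcement_id is an increasing integer, which the plan above does not specify, and it leaves out the HTTP call that would fetch the list from the server.

```python
from typing import List, Tuple


def new_announcements(announcements: List[dict],
                      last_seen_id: int) -> Tuple[List[dict], int]:
    """Return unseen announcements (oldest first) and the new last_seen_id."""
    fresh = sorted(
        (a for a in announcements if a["announcement_id"] > last_seen_id),
        key=lambda a: a["announcement_id"],
    )
    newest = fresh[-1]["announcement_id"] if fresh else last_seen_id
    return fresh, newest
```

After posting the returned announcements to its configured channels, the bot stores the new last_seen_id so a restart does not repost old messages.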
