Conversation
Summary of Changes
Hello @SuperALKALINEdroiD, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request significantly enhances the database's foundational structure by introducing robust persistence capabilities. It establishes a new manifest system for managing SSTables, refactors configuration loading with dynamic generation and validation, and lays the groundwork for flushing in-memory data to disk. Furthermore, it adds a GET endpoint for data retrieval and updates internal node logic to intelligently handle in-memory storage thresholds and data searching. These changes collectively move the database closer to a fully persistent and reliable state.
Highlights
- Enhanced Configuration Management: The database now dynamically generates a default configuration if one isn't found, and includes robust validation for all settings. The Write-Ahead Log (WAL) path is now configurable and relative to the application's runtime files, improving deployment flexibility.
- Introduction of Manifest System: A new `manifest` package has been introduced to track SSTable metadata, including file paths and associated Bloom filter files. This lays the essential groundwork for implementing efficient disk-based storage and retrieval mechanisms.
- In-Memory Data Management and Flushing: Internal node logic has been updated to monitor the size of in-memory data. When a configurable threshold is reached, the system will automatically trigger a flush of this data to persistent storage, significantly improving data durability and recovery capabilities.
- New Data Retrieval Endpoint: A new `GET /` endpoint has been implemented, allowing clients to query data by key. This endpoint leverages the internal node's in-memory store for quick lookups, marking a step towards full CRUD functionality.
- Persistence Utilities (Bloom Filter): A `BloomFilter` implementation has been added to the `utils/persistance` package. This probabilistic data structure will be crucial for efficient key lookups in disk-based SSTables, helping to reduce unnecessary disk I/O operations.
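The Bloom filter idea behind that package can be sketched as follows. This is a minimal illustration, not the PR's actual `BloomFilter` API: the type name, constructor, and `MightContain` method are assumptions, and the k probe positions are derived by double hashing an FNV-1a digest.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloomFilter is an illustrative sketch, not the PR's actual BloomFilter
// type: m bits and k probe positions derived from a single FNV-1a hash.
type bloomFilter struct {
	bits []bool
	k    uint32
}

func newBloomFilter(m int, k uint32) *bloomFilter {
	return &bloomFilter{bits: make([]bool, m), k: k}
}

// positions derives k bit positions via double hashing: split the 64-bit
// FNV-1a sum into two 32-bit halves and combine them as h1 + i*h2.
func (b *bloomFilter) positions(key string) []uint32 {
	h := fnv.New64a()
	h.Write([]byte(key))
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)
	out := make([]uint32, b.k)
	for i := uint32(0); i < b.k; i++ {
		out[i] = (h1 + i*h2) % uint32(len(b.bits))
	}
	return out
}

func (b *bloomFilter) Add(key string) {
	for _, p := range b.positions(key) {
		b.bits[p] = true
	}
}

// MightContain returns false only if the key was definitely never added;
// true means "possibly present" (false positives are allowed).
func (b *bloomFilter) MightContain(key string) bool {
	for _, p := range b.positions(key) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	bf := newBloomFilter(1024, 3)
	bf.Add("user:1")
	fmt.Println(bf.MightContain("user:1")) // always true for added keys
}
```

Before probing an SSTable on disk, a reader would consult its filter first and skip the file entirely on a definite miss, which is how the disk I/O savings mentioned above come about.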
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request by creating an issue comment (i.e. a comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.
You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
Code Review
This pull request introduces persistence-related features, including configuration validation, manifest file handling, and the groundwork for flushing in-memory data to disk. While these are great additions, there are several critical issues that need to be addressed. These include incorrect algorithm usage (sort.Search), potential panics in HTTP handlers and gRPC methods from unhandled errors and nil dereferences, and flawed logic for memtable flushing, which could lead to data loss. I've left detailed comments on these points with suggestions for fixes.
```go
value := r.URL.Query().Get("value")

grpcNode, hashError := appConfig.NodeHashInfo.GetNode(key)

if hashError != nil {
	panic("Unable to get a node to store data")
}

logs.AddWalEntry(appConfig.WAL, key, value, grpcNode)
```
This GetValue handler has side effects, which is a violation of REST principles for GET requests. It takes a value from the query parameters and writes a WAL entry. GET requests should be idempotent and should not modify server state. This logic seems to be incorrectly copied from an insert/update handler and should be removed.
utils/nodes/server.go
Outdated
```diff
  searhResult, _ := server.memTable.Get(request.Key)

  return &NodeResponse{
  	Status:       Status_OK,
  	Timestamp:    time.Now().String(),
  	Node:         request.Node,
  	Key:          request.Key,
- 	Value:        "DummyValue", // Dummy response
- 	ErrorMessage: "",
+ 	Value:        searhResult.(string),
+ 	ErrorMessage: "Not Found",
  }, nil
```
This method has a critical bug. server.memTable.Get returns two values: the result and a boolean indicating if the key was found. The boolean is ignored here (using _). If the key is not found, searhResult will be nil. The subsequent type assertion searhResult.(string) will cause a panic.
You should check the boolean and handle the 'not found' case gracefully, for example by returning an error or a specific response. Additionally, the ErrorMessage is hardcoded to "Not Found" even when a value is successfully retrieved.
```go
searhResult, found := server.memTable.Get(request.Key)
if !found {
	return &NodeResponse{
		Status:       Status_ERROR,
		Timestamp:    time.Now().String(),
		Node:         request.Node,
		Key:          request.Key,
		ErrorMessage: "Not Found",
	}, nil
}
return &NodeResponse{
	Status:       Status_OK,
	Timestamp:    time.Now().String(),
	Node:         request.Node,
	Key:          request.Key,
	Value:        searhResult.(string),
	ErrorMessage: "",
}, nil
```
```go
logs.AddWalEntry(appConfig.WAL, key, value, grpcNode)

destinationNodeIndex := sort.Search(len(appConfig.Nodes), func(i int) bool { return appConfig.Nodes[i].ID == grpcNode }) % len(appConfig.Nodes)
```
The use of sort.Search here is incorrect. sort.Search requires a sorted slice and a predicate that returns false for the first part of the slice and true for the rest. The provided predicate func(i int) bool { return appConfig.Nodes[i].ID == grpcNode } does not meet this requirement and will not work correctly for finding the node index.
Additionally, comparing node IDs as strings can lead to incorrect sorting order (e.g., "10" < "2").
A simple linear scan would be more appropriate and correct here.
```diff
- destinationNodeIndex := sort.Search(len(appConfig.Nodes), func(i int) bool { return appConfig.Nodes[i].ID == grpcNode }) % len(appConfig.Nodes)
+ destinationNodeIndex := -1
+ for i, node := range appConfig.Nodes {
+ 	if node.ID == grpcNode {
+ 		destinationNodeIndex = i
+ 		break
+ 	}
+ }
+ if destinationNodeIndex == -1 {
+ 	http.Error(w, "Could not find destination node", http.StatusInternalServerError)
+ 	return
+ }
```
utils/nodes/server.go
Outdated
```go
func (server *internalNode) flushMemTableToMemory() {
	log.Println("Starting memory flush to persistant storage")
	defer log.Println("Completed Memory write, Continuing normal operations")

	keysInTable := server.memTable.Keys()

	for index, value := range keysInTable {
		fmt.Println(index, value.(string))
	}

	defer server.memTable.Clear()

}
```
This function has several issues that could lead to data loss:

- It iterates over `keysInTable` but doesn't retrieve the corresponding values from the memtable. The values are needed to write the data to persistent storage.
- `defer server.memTable.Clear()` is risky. If writing to storage fails partway through, the `defer` will still execute, clearing the memtable and causing data loss. The memtable should only be cleared after all its contents have been successfully written to disk.
- The actual logic to persist data is missing; it only prints to the console.
utils/nodes/server.go
Outdated
```go
func (server *internalNode) shouldFlushToMemory() bool {
	nodeCount := server.memTable.Size()
	var sampleNode redblacktree.Node
	nodeSize := int64(unsafe.Sizeof(sampleNode))
	totalSize := int64(nodeCount) * nodeSize

	return totalSize > server.dbConfig.InMemoryStorageThreshold
}
```
The calculation of the memtable's memory size is incorrect. unsafe.Sizeof(sampleNode) only returns the size of the redblacktree.Node struct itself (which contains pointers and other small fields), not the size of the actual key and value data that the pointers point to. This will drastically underestimate the memory usage, and the flush to disk will likely not be triggered when intended.
A more accurate approach would be to track the total size of keys and values as they are added to and removed from the memtable.
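Incremental size tracking could look like this. The `sizeTrackedTable` wrapper is illustrative (the real memtable is a red-black tree, not a map); the point is maintaining a running byte count of key and value payloads on every Put/Delete instead of multiplying node count by `unsafe.Sizeof`.

```go
package main

import "fmt"

// sizeTrackedTable is an illustrative wrapper that keeps a running byte
// total of key and value payloads, updated on every mutation.
type sizeTrackedTable struct {
	data      map[string]string
	totalSize int64
}

func newSizeTrackedTable() *sizeTrackedTable {
	return &sizeTrackedTable{data: map[string]string{}}
}

func (t *sizeTrackedTable) Put(key, value string) {
	// On overwrite, subtract the old entry's size before adding the new one.
	if old, ok := t.data[key]; ok {
		t.totalSize -= int64(len(key) + len(old))
	}
	t.data[key] = value
	t.totalSize += int64(len(key) + len(value))
}

func (t *sizeTrackedTable) Delete(key string) {
	if old, ok := t.data[key]; ok {
		t.totalSize -= int64(len(key) + len(old))
		delete(t.data, key)
	}
}

// shouldFlush compares the tracked payload size against the threshold,
// so it reflects actual data volume rather than node-struct overhead.
func (t *sizeTrackedTable) shouldFlush(threshold int64) bool {
	return t.totalSize > threshold
}

func main() {
	t := newSizeTrackedTable()
	t.Put("key", "value") // 3 + 5 = 8 bytes tracked
	fmt.Println(t.totalSize, t.shouldFlush(4)) // 8 true
}
```

This still undercounts fixed per-node overhead, but unlike the `unsafe.Sizeof` approach it scales with the actual payload, so the threshold fires when intended.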
```go
return nil, fmt.Errorf("failed to save config file, error on encoder: %v", err)
manifest, err := manifest.GetManifest()
if err != nil {
	return nil, fmt.Errorf("failed to parse config file: %v", err)
```
The error message here is misleading. The error is returned from manifest.GetManifest(), not from parsing a config file. A more accurate error message would be "failed to get manifest".
```diff
- return nil, fmt.Errorf("failed to parse config file: %v", err)
+ return nil, fmt.Errorf("failed to get manifest: %v", err)
```
utils/common/main.go
Outdated
```go
func findDestinationNode() {

}
```

```go
func GetAppPath() string {
	path := common.GetAppPath()

	return path
}
```
```go
data, err := json.Marshal(configData)
if err != nil {
	return nil, fmt.Errorf("failed to create config file: %v", err)
	fmt.Println("Error marshalling example config:", err)
```
The codebase predominantly uses the log package for logging. Using fmt.Println here is inconsistent. It's better to use log.Printf or log.Println for consistency and to ensure logs are routed to the standard logger output (usually stderr).
```diff
- fmt.Println("Error marshalling example config:", err)
+ log.Println("Error marshalling example config:", err)
```
```go
if config.validateConfig() {
	manifest, err := manifest.GetManifest()
	if err != nil {
		return nil, fmt.Errorf("failed to parse config file: %v", err)
```
The error message "failed to parse config file" is misleading here. The error comes from manifest.GetManifest(), which can fail for reasons other than parsing, such as file I/O errors when creating the manifest. The error message should more accurately reflect the source of the error, for example: "failed to get manifest".
```diff
- return nil, fmt.Errorf("failed to parse config file: %v", err)
+ return nil, fmt.Errorf("failed to get manifest: %v", err)
```
No description provided.