Distributed File System in Go

A fault-tolerant distributed file system built in Go, inspired by GFS and HDFS. Separates the control plane (Raft-based metadata cluster) from the data plane (storage nodes) with peer-to-peer chunk replication, automatic failure detection, and self-healing.

Architecture

                         ┌─────────────────────────────┐
                         │      Metadata Cluster (Raft)│
                         │   Meta1 ←→ Meta2 ←→ Meta3   │
                         └──────────────┬──────────────┘
                                        │
                ┌───────────────────────┼───────────────────────┐
                │ heartbeat             │ placement             │ repair
                ▼                       ▼                       ▼
         ┌──────────┐           ┌──────────┐           ┌──────────┐
         │ Storage  │◄─────────►│ Storage  │◄─────────►│ Storage  │
         │  Node 1  │           │  Node 2  │           │  Node 3  │
         └──────────┘           └──────────┘           └──────────┘
                ▲
                │ upload / download
                │
            Client CLI

Control Plane — Raft-based metadata cluster. Stores file-to-chunk mappings, chunk-to-node mappings, node health, and replication state. Never touches chunk data.

Data Plane — Storage nodes handle chunk storage, P2P replication between peers, checksum validation, and heartbeating to metadata.

Client — CLI that splits files into chunks, coordinates with metadata for placement, and streams chunks directly to/from storage nodes in parallel.

Quick Start

Prerequisites

Go 1.21+
protoc + protoc-gen-go + protoc-gen-go-grpc plugins
buf CLI
Docker + Docker Compose

go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@latest

Clone and build

git clone https://github.com/your-org/distributed-fs.git
cd distributed-fs
go mod download
make proto-gen       # generate protobuf/gRPC code with buf
make build-all       # produces bin/storage, bin/metadata, bin/client/dfs-cli

Run a local cluster

./scripts/run-cluster.sh -m 1 -s 3 up -d --build

Dynamically generates a docker-compose.yml, builds images, and starts 1 metadata node + 3 storage nodes in daemon mode.

# Override node counts
./scripts/run-cluster.sh -m 3 -s 5 up -d --build

# Include a client container
./scripts/run-cluster.sh -m 1 -s 3 -c up -d --build

# View logs
docker compose -f scripts/docker-compose.yml logs -f

# Stop cluster
docker compose -f scripts/docker-compose.yml down

CLI usage

# Upload a file
./bin/client/dfs-cli upload --file ./data.csv --name data.csv

# List all files
./bin/client/dfs-cli list

# List with prefix filter
./bin/client/dfs-cli list --prefix reports/

# Download a file
./bin/client/dfs-cli download --name data.csv --out ./output/data.csv

# Delete a file
./bin/client/dfs-cli delete --id <file_id>

Features

Raft consensus for replicated, fault-tolerant metadata
Fixed-size chunking (4 MB) with parallel upload and download
Peer-to-peer chunk replication — metadata never transfers bytes
Automatic failure detection via heartbeat timeouts
Self-healing — under-replicated chunks automatically re-replicated
SHA256 checksum verification for chunk integrity
Atomic chunk writes via tmp-file + rename (no partial chunks on disk)
Resumable uploads via local manifest
gRPC-based communication across all components

How It Works

Upload flow

Client splits the file into 4 MB chunks, computes chunk IDs as SHA256(file_id + chunk_index), writes a local manifest
Client calls CreateFile on metadata — receives chunk placement (primary + replicas)
Client streams each chunk to its primary storage node in parallel
Primary writes locally, fans out replication to replica nodes via P2P
Primary calls CommitChunk to metadata after replication quorum
Client calls CommitFile after all chunks are committed

Download flow

Client calls GetFile on metadata — receives ordered chunk list with live node addresses
Client downloads all chunks in parallel from any live replica per chunk
Client verifies SHA256 checksum per chunk, reassembles file in order

Failure detection and repair

Storage nodes send heartbeats every 3 seconds to metadata
Metadata marks a node Suspect after 10s of silence, Dead after 30s
Dead nodes trigger repair: metadata enqueues jobs, piggybacks them on heartbeat responses to source nodes
Source nodes replicate missing chunks to new targets using the same P2P engine as upload
Restarted nodes with stale replicas are reconciled — stale chunks evicted automatically

Configuration

Each component is configured via environment variables. See dedicated docs:

Component	Docs
Storage node	docs/storage.md
Metadata node	docs/metadata.md
Client	docs/client.md
Full reference	docs/configuration.md

Repository Structure

proto/                        - .proto source files (metadata/v1/, storage/v1/)
gen/proto/                    - generated protobuf/gRPC code
internal/                     - shared packages
  checksum/                   - SHA256 helpers
  errors/                     - typed error definitions
  leaderclient/               - metadata leader resolution and caching
  logging/                    - structured logging
  raftutil/                   - Raft bootstrap utilities
  retry/                      - exponential backoff with jitter
  utils/                      - filepath validation, deep copy
storage/                      - storage node (chunk store, replication, heartbeat, gRPC server)
metadata/                     - metadata node (Raft FSM, BoltDB store, node watcher, repair scheduler, reconciliation)
client/                       - CLI client (chunker, manifest, uploader, downloader)
scripts/                      - run-cluster.sh
integration/                  - integration tests (e2e, failover, meta_storage)

gRPC Services

MetadataService (`proto/metadata/v1/`)

Handles file lifecycle (create, commit, get, delete), chunk placement, node registration, heartbeats, and repair coordination. Called by both clients and storage nodes.

StorageService (`proto/storage/v1/`)

Client-facing service on storage nodes. Handles chunk upload (client streaming), chunk download (server streaming), eviction, and verification.

ReplicationService (`proto/storage/v1/`)

Internal P2P service between storage nodes. Handles chunk replication using bidirectional streaming for flow control. Clients never call this directly.

Development

make test-all          # run all unit tests
make integration-test  # run integration tests (real in-process cluster)
make lint              # lint Go + proto files
make fmt               # format Go + proto files
make proto-gen         # regenerate protobuf/gRPC code with buf
make clean             # remove data and manifest directories

Tech Stack

Component	Technology
Language	Go 1.21
Consensus	hashicorp/raft
Raft log store	hashicorp/raft-boltdb
Metadata persistence	BoltDB (bbolt)
RPC framework	gRPC + Protocol Buffers
Checksums	crypto/sha256
CLI	cobra
Local chunk storage	Disk (two-level sharded directories)

Design Decisions

Raft for metadata — avoids single point of failure; automatic leader election ensures availability through minority node failures.
P2P replication — the primary storage node owns fan-out, keeping metadata lightweight regardless of data volume. Metadata only records outcomes, never transfers bytes.
Fixed-size chunking — predictable and simple. Content-defined chunking adds complexity without benefit for a write-once system without deduplication.
Atomic rename for chunk writes — writing to .tmp then renaming guarantees a crash never leaves a corrupt visible chunk.
Piggyback repair on heartbeats — storage nodes already call metadata every 3s; using that channel for repair delivery avoids metadata maintaining outbound connections.

Limitations

This is an academic implementation. Known simplifications:

Write-once model — no file updates or appends
No rack-awareness in placement strategy
No TLS — all gRPC connections are insecure
No authentication or access control
Single metadata cluster — no cross-datacenter replication
No quota management

Name		Name	Last commit message	Last commit date
Latest commit History 160 Commits
.github/workflows		.github/workflows
.vscode		.vscode
bin		bin
client		client
docs		docs
gen/proto		gen/proto
integration		integration
internal		internal
metadata		metadata
proto		proto
scripts		scripts
storage		storage
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
buf.gen.yaml		buf.gen.yaml
buf.yaml		buf.yaml
go.mod		go.mod
go.sum		go.sum

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Distributed File System in Go

Architecture

Quick Start

Prerequisites

Clone and build

Run a local cluster

CLI usage

Features

How It Works

Upload flow

Download flow

Failure detection and repair

Configuration

Repository Structure

gRPC Services

MetadataService (`proto/metadata/v1/`)

StorageService (`proto/storage/v1/`)

ReplicationService (`proto/storage/v1/`)

Development

Tech Stack

Design Decisions

Limitations

References

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Distributed File System in Go

Architecture

Quick Start

Prerequisites

Clone and build

Run a local cluster

CLI usage

Features

How It Works

Upload flow

Download flow

Failure detection and repair

Configuration

Repository Structure

gRPC Services

MetadataService (proto/metadata/v1/)

StorageService (proto/storage/v1/)

ReplicationService (proto/storage/v1/)

Development

Tech Stack

Design Decisions

Limitations

References

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

MetadataService (`proto/metadata/v1/`)

StorageService (`proto/storage/v1/`)

ReplicationService (`proto/storage/v1/`)

Packages