Big Data apps on MacOS

This project sets up a big data applications on a Mac with ARM64 architecture/Apple silicon using Docker. It uses Makefiles to manage and run Zookeeper, Kafka, AKHQ (a GUI for Kafka), Pinot, Superset, ZooNavigator, Spark, Flink, KafkaProducer, and Hadoop services.

Overview

This project provides a streamlined way to set up and manage a big data environment. Each service is managed from its own directory under apps/ using a Makefile-based workflow. Docker Compose files and Makefiles ensure services run correctly, efficiently, and with proper dependency management.

Features

Makefile & Docker Compose Integration: Easily manage services using standardized Makefile commands.
Automatic Dependency Management: Dependencies are started automatically as needed.
Health Checks: Verify that services are healthy with a single command.
Orphan Container Cleanup: Remove orphan containers before starting new ones.
Network Management: All services run on the same Docker network.
Persistent Data & Configs: Data and configuration are stored in app-data/ and configs/.

Prerequisites

Docker or Docker Desktop
Bash

Setup Instructions

1. Clone the Repository

git clone https://github.com/your-repo/big-data-mac-arm64-setup.git
cd big-data-mac-arm64-setup

Running the Services (Makefile-based)

Each service is managed from its own directory under apps/ using Makefile commands. The Makefile system will automatically start dependencies, run health checks, and manage containers.

General Pattern

To start any service:

cd apps/<service>
make up

To stop a service:

make down

To check health:

make health

To view logs:

make logs

Service-specific Instructions

Zookeeper

cd apps/zookeeper
make up

Access Admin UI: http://localhost:8081/commands, http://localhost:8082/commands, http://localhost:8083/commands

Kafka

cd apps/kafka
make up

AKHQ (Kafka UI)

cd apps/akhq
make up

Access UI: http://localhost:9093

Pinot

cd apps/pinot
make up

Access UI: http://localhost:9000

Superset

cd apps/superset
make up

Access UI: http://localhost:8088

ZooNavigator

cd apps/zoonavigator
make up

Access UI: http://localhost:2180

Spark

cd apps/spark
make up

Spark Master UI: http://localhost:8072 Spark Worker UI: http://localhost:8073 Place job JARs in app-data/spark/local-jars/ To run a Spark job:

docker exec -it spark-master spark-submit \
    --master spark://spark-master:7077 \
    --class com.abc.MainClass \
    --executor-memory 1G \
    --total-executor-cores 1 \
    /opt/spark/local-jars/yourjar.jar

For remote debugging, add the appropriate --conf options as in the previous README.

Flink

cd apps/flink
make up

Flink Job Manager UI: http://localhost:8074 To enable flame graphs, set rest.flamegraph.enabled: true in configs/flink/flink-conf.yaml

KafkaProducer

cd apps/kafkaproducer
make up

Place message files in app-data/kafkaproducer/messages/

Hadoop

cd apps/hadoop
make up

if you're running for the first time use

cd apps/hadoop
make up-reset

Namenode UI: http://localhost:9870 ResourceManager UI: http://localhost:8088

Additional Information

Health Checks: Use make health in each service directory.
Dependencies: Handled automatically; you do not need to start dependencies manually.
Network: All services are on the same Docker network for seamless communication.
Logs: Use make logs for troubleshooting.
Volumes: Data and configs are persisted in app-data/ and configs/.
.gitignore: Add /app-data/ and /configs/ to .gitignore to avoid committing data/configs.

By following these instructions, you can set up and manage a big data environment on your Mac with ease using Makefile commands.

Summary Table

Service	Directory	Start Command	UI/Port Example	Depends On
Zookeeper	zookeeper	`make up`	8081/8082/8083	-
Kafka	kafka	`make up`	9092	Zookeeper
AKHQ	akhq	`make up`	9093	Kafka
Pinot	pinot	`make up`	9000	Zookeeper
Superset	superset	`make up`	8088	Pinot, Postgres
ZooNavigator	zoonavigator	`make up`	2180	Zookeeper
Spark	spark	`make up`	8072, 8073, 18080	Hadoop, Zookeeper
Flink	flink	`make up`	8074	Hadoop, Zookeeper
KafkaProducer	kafkaproducer	`make up`	8085	Kafka
Hadoop	hadoop	`make up`	9870, 8088	Zookeeper

License

Contributions are welcome! Please fork the repository and submit a pull request for review. This project is licensed under the MIT License - see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 219 Commits
.idea		.idea
apps		apps
configs		configs
.DS_Store		.DS_Store
.gitignore		.gitignore
LICENSE.txt		LICENSE.txt
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Big Data apps on MacOS

Overview

Features

Prerequisites

Setup Instructions

1. Clone the Repository

Running the Services (Makefile-based)

General Pattern

Service-specific Instructions

Zookeeper

Kafka

AKHQ (Kafka UI)

Pinot

Superset

ZooNavigator

Spark

Flink

KafkaProducer

Hadoop

Additional Information

Summary Table

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Big Data apps on MacOS

Overview

Features

Prerequisites

Setup Instructions

1. Clone the Repository

Running the Services (Makefile-based)

General Pattern

Service-specific Instructions

Zookeeper

Kafka

AKHQ (Kafka UI)

Pinot

Superset

ZooNavigator

Spark

Flink

KafkaProducer

Hadoop

Additional Information

Summary Table

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages