Turn any ML model into a REST API with one command.
```bash
mlship serve model.pkl
```

Deploy your machine learning models locally in seconds - no Docker, no YAML, no configuration files.
Why mlship? Read WHY_MLSHIP.md to see how mlship compares to transformers-serve, vLLM, Ollama, and BentoML.
- ✅ One-command deployment - No configuration needed
- ✅ Multi-framework - sklearn, PyTorch, TensorFlow, HuggingFace (local + Hub)
- ✅ HuggingFace Hub - Serve models directly from Hub without downloading
- ✅ Built-in benchmarking - Measure latency and throughput with `mlship benchmark`
- ✅ Auto-generated API - REST API with interactive docs
- ✅ Works offline - Zero internet dependency after installation
- ✅ Fast - Deploy in seconds, predictions in milliseconds
```bash
# Install
pip install mlship

# Serve any model
mlship serve model.pkl

# Sentiment analysis
mlship serve distilbert-base-uncased-finetuned-sst-2-english --source huggingface

# Test it
curl -X POST http://localhost:8000/predict \
  -H "Content-Type: application/json" \
  -d '{"features": "This product is amazing!"}'
```

📖 See QUICKSTART.md for complete hands-on examples with:
- HuggingFace models (sentiment analysis, text generation)
- Local models (sklearn, PyTorch, TensorFlow)
- Training code, curl commands, and expected responses
| Framework | File Format | Example |
|---|---|---|
| scikit-learn | `.pkl`, `.joblib` | `mlship serve model.pkl` |
| PyTorch | `.pt`, `.pth` | `mlship serve model.pt` |
| TensorFlow | `.h5`, `.keras`, SavedModel | `mlship serve model.h5` |
| HuggingFace (local) | Model directory | `mlship serve ./sentiment-model/` |
| HuggingFace (Hub) | Model ID | `mlship serve bert-base-uncased --source huggingface` |
Note: For custom PyTorch models, use TorchScript format (`torch.jit.save()`) for best compatibility.
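As a sketch of that TorchScript route (assuming PyTorch is installed; the toy model and filename here are illustrative):

```python
import torch
import torch.nn as nn

# A toy model standing in for your custom architecture
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

# Convert to TorchScript and save; the resulting file is self-contained,
# so it can be loaded later without the original class definition
scripted = torch.jit.script(model)
torch.jit.save(scripted, "model.pt")

# Sanity check: reload and run a dummy input before serving it
reloaded = torch.jit.load("model.pt")
out = reloaded(torch.zeros(1, 4))
```

The saved `model.pt` can then be served with `mlship serve model.pt`.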
Serve models directly from HuggingFace Hub:
```bash
mlship serve gpt2 --source huggingface
mlship serve distilbert-base-uncased-finetuned-sst-2-english --source huggingface
```

Models are downloaded on first use and cached locally. See QUICKSTART.md for more examples.
Every model automatically gets:
- POST `/predict` - Make predictions
- GET `/health` - Health check
- GET `/info` - Model metadata
- GET `/docs` - Interactive Swagger UI documentation
Examples in QUICKSTART.md.
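For scripted use, the same endpoints can be called from Python. A minimal client sketch using only the standard library (assumes a model is being served on the default port 8000):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # mlship's default port

def predict(features, base_url=BASE_URL):
    """POST features to /predict and return the parsed JSON response."""
    req = urllib.request.Request(
        f"{base_url}/predict",
        data=json.dumps({"features": features}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# With a server running:
#   predict("This product is amazing!")   # text models
#   predict([5.1, 3.5, 1.4, 0.2])         # tabular models
```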
```bash
# Custom port
mlship serve model.pkl --port 5000

# Development mode (auto-reload on code changes)
mlship serve model.pkl --reload

# Custom model name
mlship serve model.pkl --name "fraud-detector"

# Custom preprocessing/postprocessing
mlship serve model.pkl --pipeline my_module.MyPipeline
```
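The `--pipeline` flag points at a class in your own module. Its exact interface is defined in CONTRIBUTING.md; purely as a hypothetical sketch, a `my_module.MyPipeline` with pre/post hooks could look like this (method names are illustrative, not mlship's actual API):

```python
# my_module.py -- hypothetical pipeline sketch; see CONTRIBUTING.md
# for the real interface mlship expects
class MyPipeline:
    def preprocess(self, payload):
        # e.g. scale raw features before they reach the model
        return [x / 10.0 for x in payload["features"]]

    def postprocess(self, prediction):
        # e.g. wrap the raw model output in a friendlier response
        return {"label": "fraud" if prediction == 1 else "ok"}
```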
```bash
# Benchmark performance
mlship benchmark model.pkl --requests 1000

# Benchmark with custom warmup and port
mlship benchmark model.pkl --requests 500 --warmup 10 --port 9000

# Benchmark with custom payload
mlship benchmark model.pkl --payload '{"features": [5.1, 3.5, 1.4, 0.2]}'

# Benchmark HuggingFace model
mlship benchmark distilbert-base-uncased-finetuned-sst-2-english --source huggingface --requests 50

# Benchmark with JSON output (for CI/CD)
mlship benchmark model.pkl --output json > results.json
```

Benchmark options:
| Option | Default | Description |
|---|---|---|
| `--requests N` | 100 | Number of benchmark requests |
| `--warmup N` | 5 | Number of warmup requests |
| `--port PORT` | 8000 | Server port |
| `--payload JSON` | auto | Custom test payload as a JSON string |
| `--source` | local | Model source (`local` or `huggingface`) |
| `--output` | text | Output format (`text` or `json`) |
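The `--output json` mode lends itself to CI gates. A sketch of a latency-budget check - the field names below are illustrative stand-ins, so inspect your actual `results.json` for the real schema:

```python
import json

# Illustrative results.json contents; mlship reports latency
# percentiles (p50/p95/p99) and throughput, but the exact keys
# in its JSON output may differ from these invented names
sample = '{"p50_ms": 4.2, "p95_ms": 9.8, "p99_ms": 15.1, "requests_per_sec": 210.0}'

results = json.loads(sample)  # in CI: json.load(open("results.json"))

# Fail the build if tail latency regresses past a budget
assert results["p95_ms"] < 50, f"p95 latency too high: {results['p95_ms']}ms"
```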
See CONTRIBUTING.md for custom pipeline documentation.
See QUICKSTART.md for detailed benchmarking guide.
For Students & Learners
- Learn model serving without framework-specific tools
- One tool works for entire ML curriculum (sklearn → PyTorch → transformers)
For Data Scientists
- Prototype models locally before production
- Test models with realistic API interactions
- Share models with teammates without cloud setup
For Educators
- Teach framework-agnostic model serving concepts
- Create reproducible examples that work across frameworks
Read WHY_MLSHIP.md for detailed positioning.
- QUICKSTART.md - 🚀 Start here! Complete hands-on guide with:
  - HuggingFace Hub examples (sentiment analysis, text generation)
  - Local model examples (sklearn, PyTorch, TensorFlow)
  - Training code, curl commands, and expected responses
  - Perfect for first-time users and students
- WHY_MLSHIP.md - 🎯 Understanding mlship's unique value
  - Comparison with transformers-serve, vLLM, Ollama, BentoML
  - Multi-framework advantage explained
  - Target audience and use cases
  - When to use mlship vs alternatives
- CONTRIBUTING.md - 🛠️ For contributors
  - Development setup and workflow
  - Running tests and code style
  - Custom pipeline development
  - How to submit pull requests
- ARCHITECTURE.md - 🏗️ Technical deep dive
  - Design decisions and philosophy
  - Performance optimizations
  - Framework detection system
  - Internal architecture
```bash
pip install mlship
```

With specific frameworks:

```bash
pip install mlship[sklearn]      # scikit-learn
pip install mlship[pytorch]      # PyTorch
pip install mlship[tensorflow]   # TensorFlow
pip install mlship[huggingface]  # HuggingFace
pip install mlship[all]          # All frameworks
```

We welcome contributions! See CONTRIBUTING.md for:
- Development setup
- Running tests
- Code style guidelines
- Custom pipeline development
- Issues: GitHub Issues
- Documentation: See docs linked above
- Examples: Check the `examples/` directory
MIT License - see LICENSE file for details.
mlship is the only zero-code tool that supports sklearn, PyTorch, TensorFlow, AND HuggingFace models with a single command. Read WHY_MLSHIP.md for detailed comparison with transformers-serve, vLLM, Ollama, and BentoML.
Quick comparison:
- ✅ Multi-framework (not just one)
- ✅ Zero code required (no Python files)
- ✅ Local-first (no cloud dependency)
- ✅ HuggingFace Hub integration
- ✅ Perfect for learning and prototyping
✅ Implemented:
- ✅ Multi-framework support - sklearn, PyTorch, TensorFlow, HuggingFace
- ✅ HuggingFace Hub integration - Serve models directly from Hub with `--source huggingface`
- ✅ PyTorch TorchScript support - Full support for custom PyTorch models via TorchScript
- ✅ Built-in benchmarking - Measure latency (p50/p95/p99) and throughput with `mlship benchmark`
- ✅ Zero-code deployment - One command to serve any model
- ✅ Auto-generated REST API - With interactive Swagger docs
- ✅ Custom pipelines - Preprocessing/postprocessing support
- ✅ Local-first - Works completely offline (after installation)

🔄 Planned:
- 🔄 More HuggingFace tasks - Question answering, translation, summarization, NER (requires flexible input schemas)
- 🔄 PyTorch Hub integration - Serve models directly from PyTorch Hub with `--source pytorch-hub`
- 🔄 TensorFlow Hub integration - Serve models from TensorFlow Hub with `--source tensorflow-hub`
- 🔄 XGBoost & LightGBM support - First-class support for gradient boosting frameworks
- 🔄 Model versioning - Support specific model versions (e.g., `--revision main`)
- 🔄 GPU support - Automatic GPU detection and utilization
- 🔄 Batch inference - Efficient batch prediction endpoints
- 🔄 Authentication - Optional API key authentication for deployments
Want to contribute? See CONTRIBUTING.md or open an issue with your ideas!
Happy serving! 🚀