Skip to content

olivenet-iot/entropy-hunter

Repository files navigation

🔥 EntropyHunter-7B

A specialized open-source LLM for industrial exergy analysis and entropy generation detection.

License: Apache 2.0 HuggingFace Status: In Development Benchmark: 85.5%


Current Model: v0.2

Metric Value
Base model Qwen2.5-7B-Instruct
Fine-tuning LoRA r=16, Unsloth
Benchmark 85.5% (40 tests x 3 runs, Grade B+)
vs Base Qwen +17.3pp improvement
Temperature 0.7 (optimal)
Training data 800 examples, Alpaca format

See docs/BENCHMARK_ANALYSIS.md for detailed results.

Next: v0.4 (in progress)

v0.4 training data pipeline is complete: 1369 examples (1235 train / 134 val) in ChatML format, generated via Claude Opus with thermodynamic QC. Covers 8 equipment types and 6 analysis families.


What is EntropyHunter?

EntropyHunter is a domain-specific language model fine-tuned to perform second-law thermodynamic analysis on industrial equipment. Given equipment parameters, it can:

  • Calculate exergy balances (input, output, destruction, efficiency)
  • Identify entropy generation mechanisms (heat transfer, pressure drop, mixing)
  • Perform Gouy-Stodola verification (Ex_destroyed = T₀ × S_gen)
  • Classify equipment with Bejan number grading (A–F)
  • Recommend practical improvements based on avoidable/unavoidable decomposition
  • Conduct thermoeconomic analysis (SPECO methodology)
  • Perform pinch analysis for heat integration
  • Generate ISO 50001 energy management assessments

Supported Equipment

Equipment Subtypes Analysis Depth
Compressor screw, piston, scroll, centrifugal Full
Boiler fire-tube, water-tube, condensing, waste heat, biomass, electric Full
Chiller screw, centrifugal, absorption, air/water-cooled Full
Pump centrifugal, positive displacement, submersible, vertical turbine Full
Heat Exchanger shell & tube, plate, air-cooled, economizer, recuperator Full
Steam Turbine back-pressure, condensing, extraction, ORC, micro Full
Dryer rotary, fluidized bed, spray, belt, heat pump, infrared Full

7 equipment types × 48 subtypes × 14 analysis types

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("OlivenetAI/EntropyHunter-7B")
tokenizer = AutoTokenizer.from_pretrained("OlivenetAI/EntropyHunter-7B")

prompt = """Perform an exergy analysis for a shell & tube heat exchanger.

Operating conditions:
- Hot fluid: Flue gas, inlet 320°C, outlet 180°C, flow rate 2.5 kg/s
- Cold fluid: Water, inlet 25°C, outlet 85°C, flow rate 4.0 kg/s
- Dead state: T₀ = 25°C, P₀ = 101.325 kPa

Provide: exergy balance, efficiency, entropy generation, Bejan number, recommendations."""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Project Structure

entropy-hunter/
├── taxonomy/          # Equipment & analysis type definitions
├── datagen/           # Synthetic training data generation pipeline
├── data/              # Generated datasets (not tracked in git)
├── training/          # LoRA fine-tuning scripts
├── eval/              # Evaluation benchmarks & results
├── docs/              # Analysis documentation
├── archive/           # Archived experiments
└── serving/           # Deployment configurations

Training Methodology

EntropyHunter is trained via knowledge distillation:

  1. Domain Expert Design — Thermodynamic scenarios designed by chemical engineers with field experience
  2. Synthetic Data Generation — High-quality instruction-output pairs generated from frontier models
  3. Quality Control — Every example verified for thermodynamic consistency (energy balance, second law, Gouy-Stodola)
  4. LoRA Fine-tuning — Efficient adaptation on Qwen2.5-7B base using Unsloth
  5. Evaluation — Benchmarked against base model and frontier models on held-out test sets

Training data is informed by the ExergyLab platform's 7 analysis engines, 317 knowledge files, and industrial reference data.

Roadmap

  • v0.0 — Project structure, taxonomy, prompt templates
  • v0.1 — MVP: ~800 examples, basic exergy + EGM + SPECO
  • v0.2 — Current: 85.5% benchmark, LoRA r=16, Qwen2.5-7B (active model)
  • v0.3 — JSON-free experiment: -7.2pp regression, archived
  • v0.4 — 1369 ChatML examples, Calculation Summary scaffold, 14B base eval (in progress)
  • v1.0 — Production: ExergyLab integration, edge deployment

Technical Foundation

Built on established thermodynamic methodologies:

  • Exergy analysis: Kotas, Bejan, Tsatsaronis & Moran
  • SPECO methodology: Tsatsaronis (2009) — Thermoeconomic cost allocation
  • EGM: Bejan (1996) — Entropy Generation Minimization
  • Advanced exergy: Tsatsaronis & Morosuk — EN/EX + AV/UN decomposition
  • Pinch analysis: Linnhoff & Hindmarsh (1983) — Heat integration
  • Gap analysis: 3-layer exergetic gap (minimum / BAT / actual)

Author

Kemal Düzkar — Chemical Engineer & Founder, Olivenet Ltd.

Combining thermodynamic expertise with IoT and AI to hunt entropy in industrial systems.

License

Apache 2.0 — See LICENSE for details.

Citation

@misc{entropyhunter2026,
  author = {Düzkar, Kemal},
  title = {EntropyHunter-7B: Fine-tuned Model for Industrial Exergy Analysis},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/OlivenetAI/EntropyHunter-7B}}
}

"Every irreversibility is a missed opportunity. Every entropy generated is value destroyed. This model finds them."

About

The world's first open-source fine-tuned model for industrial exergy analysis and entropy generation detection

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages