An intelligent system that automatically generates Failure Mode and Effects Analysis (FMEA) from both structured and unstructured data using Large Language Models
## Table of Contents

- Overview
- Features
- Architecture
- Installation
- Quick Start
- Usage
- Configuration
- Project Structure
- Examples
- API Reference
- Contributing
- License
## Overview

Traditional FMEA is manual, time-consuming, and expert-dependent. This system streamlines the process by:
- Automating extraction of failure information from customer reviews, complaints, and reports
- Processing structured data from Excel/CSV files
- Using LLMs for intelligent semantic understanding
- Computing risk scores (Severity, Occurrence, Detection)
- Generating actionable insights with recommended actions
Organizations receive failure information in multiple formats:
- Unstructured: Customer reviews, complaint text, incident reports
- Structured: Excel spreadsheets, CSV files with failure data
This system provides a unified, intelligent solution to convert all these inputs into a standardized FMEA.
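As a concrete illustration of the risk-scoring step, the Risk Priority Number (RPN) is the product of the three 1-10 scores. This standalone snippet is a sketch of the standard FMEA formula, not code from the package:

```python
def compute_rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: product of the three 1-10 FMEA scores."""
    for score in (severity, occurrence, detection):
        if not 1 <= score <= 10:
            raise ValueError("FMEA scores must be in the range 1-10")
    return severity * occurrence * detection

# A severe, fairly frequent, hard-to-detect failure:
print(compute_rpn(10, 7, 8))  # 560
```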
## Features

- ✅ Dual Input Support: Process both structured and unstructured data
- 🤖 LLM-Powered Extraction: Uses Mistral/LLaMA/GPT models for intelligent entity extraction
- 📊 Automated Risk Scoring: Calculates S, O, D scores and RPN automatically
- 🎯 Action Priority Classification: Categorizes risks as Critical, High, Medium, Low
- 📈 Visual Analytics: Interactive dashboards with charts and risk matrices
- 💾 Multiple Export Formats: Excel, CSV, JSON
- 🔄 Hybrid Processing: Combine multiple data sources seamlessly
- 🚀 Production-Ready: Modular, extensible, well-documented code
- NLP Processing: Sentiment analysis, keyword extraction, text cleaning
- Rule-Based Fallback: Works even without LLM for faster processing
- Batch Processing: Handle large datasets efficiently
- Deduplication: Intelligent removal of similar failure modes
- Configurable: YAML-based configuration for easy customization
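The deduplication feature above can be approximated with simple string similarity. This hypothetical helper (not the package's actual implementation) keeps only failure modes that are sufficiently distinct from ones already seen:

```python
from difflib import SequenceMatcher

def deduplicate_failure_modes(modes, threshold=0.85):
    """Drop failure-mode strings that closely match an already-kept one."""
    kept = []
    for mode in modes:
        normalized = mode.strip().lower()
        # Keep only if no previously kept mode is too similar
        if all(SequenceMatcher(None, normalized, k).ratio() < threshold for k in kept):
            kept.append(normalized)
    return kept

modes = ["Brake failure", "brake failure ", "Engine overheating"]
print(deduplicate_failure_modes(modes))  # ['brake failure', 'engine overheating']
```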
## Architecture

```
User Input (Text/CSV/Excel)
          ↓
┌────────────────────┐
│ Data Preprocessing │ ← Text cleaning, validation, sentiment analysis
└────────────────────┘
          ↓
┌────────────────────┐
│   LLM Extraction   │ ← Extract: Failure Mode, Effect, Cause, Component
└────────────────────┘
          ↓
┌────────────────────┐
│    Risk Scoring    │ ← Calculate: Severity, Occurrence, Detection
└────────────────────┘
          ↓
┌────────────────────┐
│   FMEA Generator   │ ← Compute RPN, prioritize, recommend actions
└────────────────────┘
          ↓
┌────────────────────┐
│  Output & Export   │ ← Dashboard, Excel, CSV, JSON
└────────────────────┘
```
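The data flow above can be sketched as a simple function pipeline. The names below are illustrative stand-ins, not the package's API (the real modules live in `src/`):

```python
def preprocess(texts):
    """Stand-in for cleaning/validation: strip whitespace, drop empties."""
    return [t.strip() for t in texts if t.strip()]

def extract(texts):
    """Stand-in for LLM/rule-based extraction of failure entities."""
    return [{"failure_mode": t, "severity": 5, "occurrence": 5, "detection": 5}
            for t in texts]

def score(rows):
    """Stand-in for the scoring stage: attach RPN to each record."""
    for row in rows:
        row["rpn"] = row["severity"] * row["occurrence"] * row["detection"]
    return rows

rows = score(extract(preprocess([" Brake failure on highway "])))
print(rows[0]["rpn"])  # 125
```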
## Installation

### Prerequisites

- Python 3.9 or higher
- 8GB RAM minimum (16GB recommended for LLM)
- GPU (optional, for faster LLM inference)
1. Clone the repository:

```bash
git clone <repository-url>
cd Symboisis
```

2. Create and activate a virtual environment:

```bash
# Windows
python -m venv venv
venv\Scripts\activate

# Linux/Mac
python3 -m venv venv
source venv/bin/activate
```

3. Install dependencies:

```bash
pip install -r requirements.txt
```

4. Download NLP resources:

```bash
python -c "import nltk; nltk.download('punkt'); nltk.download('stopwords'); nltk.download('averaged_perceptron_tagger')"
python -m spacy download en_core_web_sm
```

5. Set up environment variables:

```bash
# Copy example environment file
copy .env.example .env
# Edit .env with your settings (optional)
```

## Quick Start

FASTEST WAY - Process your actual datasets:
```bash
python process_my_data.py
```

This will automatically process:
- ✅ Your FMEA.csv (161 industrial failure modes)
- ✅ Car reviews from archive (3) folder (Ford, Toyota, Honda)
- ✅ Create hybrid analysis combining both
- ✅ Export all results to the `output/` folder
📖 See YOUR_DATA_GUIDE.md for detailed instructions on working with your datasets!
Or launch the interactive dashboard:

```bash
streamlit run app.py
```

Navigate to http://localhost:8501 in your browser.
Or use the command line:

```bash
# From unstructured text
python cli.py --text reviews.csv --output fmea_output.xlsx

# From structured data
python cli.py --structured failures.csv --output fmea_output.xlsx

# Hybrid mode
python cli.py --text reviews.csv --structured failures.csv --output fmea_output.xlsx
```

Or run the demos:

```bash
python examples.py
```

This will run 3 demonstration examples and generate sample FMEAs.
## Usage

### Web Dashboard

1. Start the dashboard:

   ```bash
   streamlit run app.py
   ```

2. Select input type: Unstructured, Structured, or Hybrid
3. Upload files or paste text
4. Click "Generate FMEA"
5. View results: metrics, tables, charts
6. Export: download as Excel or CSV
### Python API

```python
from fmea_generator import FMEAGenerator
import yaml

# Load configuration
with open('config/config.yaml', 'r') as f:
    config = yaml.safe_load(f)

# Initialize generator
generator = FMEAGenerator(config)

# Generate from text
reviews = ["Brake failure on highway...", "Engine overheated..."]
fmea_df = generator.generate_from_text(reviews, is_file=False)

# Generate from structured file
fmea_df = generator.generate_from_structured('data.csv')

# Export
generator.export_fmea(fmea_df, 'output/fmea.xlsx', format='excel')
```

### Command-Line Interface

```bash
# Basic usage
python cli.py --text input.csv --output result.xlsx

# With summary report
python cli.py --text input.csv --output result.xlsx --summary

# Faster rule-based mode (no LLM)
python cli.py --text input.csv --output result.xlsx --no-model

# Custom configuration
python cli.py --text input.csv --config custom_config.yaml --output result.xlsx
```

## Configuration

Edit config/config.yaml to customize:
```yaml
model:
  name: "mistralai/Mistral-7B-Instruct-v0.2"  # LLM model
  max_length: 512
  temperature: 0.3
  device: "auto"        # auto, cuda, cpu
  quantization: true    # Use 4-bit quantization

risk_scoring:
  severity:
    high_keywords: ["critical", "catastrophic", "severe"]
    medium_keywords: ["moderate", "significant"]
    low_keywords: ["minor", "slight"]
    default: 5

text_processing:
  min_review_length: 10
  negative_threshold: 0.3   # Sentiment threshold
  max_reviews_per_batch: 100
  enable_sentiment_filter: true
```

## Project Structure

```
Symboisis/
├── src/
│   ├── preprocessing.py      # Data preprocessing module
│   ├── llm_extractor.py      # LLM-based extraction
│   ├── risk_scoring.py       # Risk scoring engine
│   ├── fmea_generator.py     # Main FMEA generator
│   └── utils.py              # Utility functions
├── config/
│   └── config.yaml           # Configuration file
├── output/                   # Generated FMEAs
├── archive (3)/              # Sample car review data
├── app.py                    # Streamlit dashboard
├── cli.py                    # Command-line interface
├── examples.py               # Usage examples
├── requirements.txt          # Python dependencies
├── .env.example              # Environment variables template
└── README.md                 # This file
```
## Examples

### Example 1: Unstructured Text

```python
reviews = [
    "Brake failure during heavy rain, very dangerous!",
    "Engine overheated and seized, no warning lights."
]
fmea_df = generator.generate_from_text(reviews, is_file=False)
```

Output:
| Failure Mode | Effect | Severity | Occurrence | Detection | RPN | Priority |
|---|---|---|---|---|---|---|
| Brake failure | Unable to stop | 10 | 7 | 8 | 560 | Critical |
| Engine seized | Vehicle breakdown | 9 | 6 | 7 | 378 | High |
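The Priority column follows RPN bands. A sketch of such a mapping, using illustrative thresholds (the actual bands are configurable; these values are assumptions consistent with the table above):

```python
def classify_priority(rpn: int) -> str:
    """Map an RPN (1-1000) to an action-priority band (illustrative thresholds)."""
    if rpn >= 500:
        return "Critical"
    elif rpn >= 200:
        return "High"
    elif rpn >= 80:
        return "Medium"
    return "Low"

print(classify_priority(560))  # Critical
print(classify_priority(378))  # High
```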
### Example 2: Structured Data

Input CSV:

```csv
failure_mode,effect,cause,component
Brake system failure,Cannot stop vehicle,Worn brake pads,Brake System
Engine overheating,Engine damage,Coolant leak,Cooling System
```

```python
fmea_df = generator.generate_from_structured('failures.csv')
```

### Example 3: Real Car Reviews

```python
# Process actual car review data
fmea_df = generator.generate_from_text('archive (3)/Scraped_Car_Review_ford.csv', is_file=True)

# Generate summary
from utils import generate_summary_report
print(generate_summary_report(fmea_df))
```

## API Reference

### FMEAGenerator

Main class for FMEA generation.
#### `generate_from_text(text_input, is_file=False)`

Generate FMEA from unstructured text.

- Args: `text_input` (str or list), `is_file` (bool)
- Returns: DataFrame

#### `generate_from_structured(file_path)`

Generate FMEA from structured CSV/Excel.

- Args: `file_path` (str)
- Returns: DataFrame

#### `generate_hybrid(structured_file, text_input)`

Generate FMEA from both sources.

- Args: `structured_file` (str), `text_input` (str or list)
- Returns: DataFrame

#### `export_fmea(fmea_df, output_path, format='excel')`

Export FMEA to file.

- Args: `fmea_df` (DataFrame), `output_path` (str), `format` (str)
Other modules:

- `preprocessing.py`: handles data cleaning and preprocessing
- `llm_extractor.py`: extracts failure information using LLMs
- `risk_scoring.py`: calculates risk scores and RPN
Run the examples to test the system:

```bash
python examples.py
```

This will:

- Generate FMEA from sample reviews
- Process structured data
- Analyze real car reviews (if available)

## Use Cases

Manufacturing:

- Analyze equipment failure reports
- Process quality control data
- Generate preventive maintenance schedules

Automotive:

- Process customer complaints
- Analyze warranty claims
- Identify safety issues

Healthcare:

- Analyze adverse event reports
- Process medical device failures
- Improve patient safety

Software/IT:

- Analyze bug reports
- Process incident tickets
- Identify system vulnerabilities
This system is suitable for:
- Academic research papers
- Case studies
- Benchmarking studies
- Tool comparisons
- Industry reports
Key Advantages:
- Reproducible results
- Configurable parameters
- Comprehensive logging
- Export capabilities
## Troubleshooting

If the LLM fails to load or download:

```bash
# Use rule-based mode instead
python cli.py --text input.csv --output result.xlsx --no-model
```

If you run out of memory:

- Enable quantization in config.yaml
- Use smaller batch sizes
- Use rule-based mode

If processing is slow:

- Use GPU if available
- Enable quantization
- Reduce batch size
- Use rule-based mode for faster results
## Performance

Processing speed:

| Mode | Speed | Accuracy |
|---|---|---|
| LLM (GPU) | ~2 reviews/sec | High |
| LLM (CPU) | ~0.3 reviews/sec | High |
| Rule-based | ~50 reviews/sec | Medium |
Hardware requirements:

| Component | Minimum | Recommended |
|---|---|---|
| RAM | 8GB | 16GB |
| GPU | None | 8GB VRAM |
| Disk | 2GB | 10GB |
## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests
5. Submit a pull request
## License

This project is licensed under the MIT License.
Developed as a production-grade academic and industry project for automated FMEA generation.
- HuggingFace for transformer models
- Streamlit for dashboard framework
- Open-source community
For issues, questions, or suggestions:
- Open an issue on GitHub
- Check the documentation
- Run examples for guidance
Planned enhancements:

- Fine-tuned domain-specific models
- Fuzzy FMEA support
- Real-time monitoring
- Multi-language support
- Integration with PLM systems
- Advanced analytics
- Mobile app
Transforming failure analysis with AI