Improving Calibration for Long-Tailed Recognition (CVPR2021)
PyTorch implementation of our ECCV 2022 paper "Rethinking Confidence Calibration for Failure Prediction"
[ICCV 2025 CVAMD] The official implementation of the paper "Prompt4Trust: A Reinforcement Learning Prompt Augmentation Framework for Clinically-Aligned Confidence Calibration in Multimodal Large Language Models".
[ACL 2025] Revisiting Epistemic Markers in Confidence Estimation: Can Markers Accurately Reflect Large Language Models' Uncertainty?
[IEEE Trans. Med. Imaging] The official implementation of the paper "Improving Robustness and Reliability in Medical Image Classification with Latent-Guided Diffusion and Nested-Ensembles".
Investigation of how noise perturbations impact neural network calibration and generalisation
Service to examine data processing pipelines (e.g., machine learning or deep learning pipelines) for uncertainty consistency (calibration), fairness, and other safety-relevant aspects.
[ACL 2026 Main] VL-Calibration: Decoupled Confidence Calibration for Large Vision-Language Model Reasoning
[MICCAI 2025] The official implementation of the paper "Exposing and Mitigating Calibration Biases and Demographic Unfairness in MLLM Few-Shot In-Context Learning for Medical Image Classification".
Python framework for high quality confidence estimation of deep neural networks, providing methods such as confidence calibration and ordinal ranking
Code for enhancing Conformal Prediction using Temperature Scaling. Explore more of our work at:
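Temperature scaling, referenced in the entry above, rescales a model's logits by a single scalar T (fitted on held-out data) before the softmax; T > 1 softens overconfident predictions, T < 1 sharpens them. A minimal illustrative sketch of the idea, not code from the listed repository:

```python
import numpy as np

def temperature_scale(logits, temperature):
    """Apply temperature scaling: softmax(logits / T).

    T > 1 flattens the distribution (lower confidence);
    T < 1 sharpens it. Illustrative sketch only.
    """
    scaled = logits / temperature
    # Numerically stable softmax: subtract the row-wise max first
    scaled = scaled - scaled.max(axis=1, keepdims=True)
    exp = np.exp(scaled)
    return exp / exp.sum(axis=1, keepdims=True)

# Example: the same logits become less confident at T = 2
logits = np.array([[4.0, 1.0, 0.5]])
probs_t1 = temperature_scale(logits, 1.0)
probs_t2 = temperature_scale(logits, 2.0)
```

In practice T is fitted by minimizing negative log-likelihood on a validation set; since dividing by a positive scalar preserves the argmax, accuracy is unchanged while confidence is recalibrated.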
Adaptive movement rehabilitation with confidence calibration: a novel Movement Calibration Gap metric combining real-time pose estimation with metacognitive self-assessment.
Evaluate high school math reasoning in LLMs with baseline and Chain-of-Thought (CoT) prompts. Includes confidence calibration metrics, JSON output parsing, and reliability analysis.
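Expected calibration error (ECE) is the most common of the confidence calibration metrics mentioned above: predictions are binned by confidence, and the gaps between per-bin accuracy and mean confidence are averaged, weighted by bin size. A minimal sketch of the standard equal-width binned version (illustrative only, not code from the listed repository):

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width binned ECE.

    confidences: array of predicted confidences in (0, 1].
    correct: array of 0/1 indicators for whether each prediction was right.
    Returns the weighted mean |accuracy - confidence| over non-empty bins.
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = correct[mask].mean()       # empirical accuracy in the bin
            conf = confidences[mask].mean()  # mean confidence in the bin
            ece += (mask.sum() / n) * abs(acc - conf)
    return ece
```

A perfectly calibrated model (e.g. 80% confidence, 80% accuracy) yields ECE near zero; a model that is always 100% confident but only half right yields ECE of 0.5.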
Multimodal deepfake detection with explainable AI, robustness validation, and calibrated trust scoring for real-world media.
FADE: AI that deliberately forgets like humans do, using memory degradation as intrinsic confidence signal. Reduces hallucinations, enables epistemic humility, solves stateful deployment. Conceptual proposal seeking implementation and validation.
The repository of our paper about confidence calibration on RAG.
A complete Streamlit app scaffold that lets you enter your Gemini API key in the sidebar, upload up to four MRI images, invoke Gemini's advanced image analysis (labels, objects, text), and view the raw JSON analytics directly in the app.
yuragi — LLM Confidence Fragility Analyzer. Perturbation-driven hallucination detection with workshop-grade real benchmarks (TruthfulQA n=412 ensemble AUC 0.73, TriviaQA n=200 confidence-inversion AUC 0.75).
🔍 Analyze the mathematical reasoning abilities of the Mistral-7B model using diverse prompting techniques on multi-step math problems.