Confidence_GPT

Adding confidence scores to ChatGPT's predictions to detect hallucinations

Project Specifications:

In this project, we attempt to build upon the ideas of Wang et al. and Lakshminarayanan et al.

To improve the accuracy of the LLM, we use Wang et al.'s idea of 'self-consistency': we sample the answer to the same question multiple times and return the most frequent answer to the user. The whole process can be thought of as a majority vote, which increases the chance of arriving at the correct answer and discards infrequent 'hallucinations'.
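The majority vote described above can be sketched as follows. This is a minimal illustration, not the code in `ensembler.py`: the names `majority_vote`, `self_consistent_answer`, `sample_fn`, and `n_samples` are hypothetical, and `sample_fn` stands in for whatever function issues a single query and returns a parsed answer string.

```python
from collections import Counter
from typing import Callable, List


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    return Counter(answers).most_common(1)[0][0]


def self_consistent_answer(
    sample_fn: Callable[[str], str],  # hypothetical: one query -> one parsed answer
    question: str,
    n_samples: int = 5,               # assumed default, not taken from the repository
) -> str:
    """Ask the same question n_samples times and return the majority answer."""
    answers = [sample_fn(question) for _ in range(n_samples)]
    return majority_vote(answers)
```

For the vote to be informative, the samples need to differ from one another, so the underlying queries would typically be made with a non-zero sampling temperature.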

Next, we extend this by using Lakshminarayanan et al.'s ideas on uncertainty in prediction to quantify how sure the LLM is about the final answer. This metric can be used to set a threshold that prevents hallucinatory answers from being displayed to the user.
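The README does not state which uncertainty metric is used, so the sketch below assumes two simple candidates computed over the sampled answers: the fraction of samples that agree with the majority answer, and the entropy of the empirical answer distribution. The function names and the `threshold` value are hypothetical and only illustrate how a confidence threshold could suppress low-agreement answers.

```python
import math
from collections import Counter
from typing import List, Optional, Tuple


def vote_confidence(answers: List[str]) -> Tuple[str, float]:
    """Return (majority answer, fraction of samples that agree with it)."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    answer, top_count = Counter(answers).most_common(1)[0]
    return answer, top_count / len(answers)


def answer_entropy(answers: List[str]) -> float:
    """Entropy (in nats) of the empirical answer distribution; 0 means full agreement."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in Counter(answers).values())


def answer_or_abstain(answers: List[str], threshold: float = 0.6) -> Optional[str]:
    """Return the majority answer, or None when agreement is below the threshold."""
    answer, confidence = vote_confidence(answers)  # threshold value is an assumption
    return answer if confidence >= threshold else None
```

With four samples `["a", "a", "b", "a"]`, the agreement fraction is 0.75, so the answer is shown at a threshold of 0.6 and withheld at a threshold of 0.8. A suitable threshold would have to be tuned on held-out questions with known answers.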

Files and their usage

sampler.py -> makes a single query to OpenAI's ChatGPT
prompts/ -> folder containing exemplar prompts for different tasks
ensembler.py -> uses the sampler's results to cast a majority vote among answers
parser/ -> folder containing parsers corresponding to the different input prompts
example_usage.py -> demonstrates how to use the provided code

Note: We will provide the code for the "Closed QA" use case first; other use cases can be added subsequently.
