Update cleanlab-tlm package to support binary evals #130
query_identifier: Optional[str] = None,
context_identifier: Optional[str] = None,
response_identifier: Optional[str] = None,
mode: Optional[str] = "numeric",

Missing unit tests.
Co-authored-by: Jonas Mueller <1390638+jwmueller@users.noreply.github.com>
Added here.
Co-authored-by: Aditya Thyagarajan <aditya1593@icloud.com> Co-authored-by: Jonas Mueller <1390638+jwmueller@users.noreply.github.com>
What's going on with the formatting check in the CI?
elisno left a comment
Here's some initial feedback.
trustworthy_rag, # noqa: F401
trustworthy_rag_api_key, # noqa: F401
Why did you remove these? These fixtures are not defined in conftest.py, so they need to be imported.
Why is this file being updated at all?
This came from hatch's automatic format fix. I'll look into these.
# Compile and validate the eval
self.mode = self._compile_mode(mode, criteria, name)

def _compile_mode(self, mode: Optional[str], criteria: str, name: str) -> str:
This _compile_mode method needs to be written more carefully to avoid unintentionally breaking all our tests. Many of the UserWarnings it emits will be raised as errors during automated testing.
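To illustrate the concern, here is a minimal sketch of how a warning turns into a test failure once the warning filter is set to "error". It is not the PR's implementation: `compile_mode` is a hypothetical standalone stand-in for `_compile_mode`, the set of valid modes is assumed from the PR description, and the question-mark heuristic is invented purely to have something that warns.

```python
import warnings
from typing import Optional

# Assumed set of modes; the real package may accept others.
VALID_MODES = ("numeric", "binary")


def compile_mode(mode: Optional[str], criteria: str, name: str) -> str:
    """Hypothetical stand-in for Eval._compile_mode (not the real method)."""
    if mode is None:
        # Assumed fallback, mirroring the default in the signature above.
        return "numeric"
    if mode not in VALID_MODES:
        # Invalid input is a hard error rather than a warning.
        raise ValueError(
            f"Invalid mode {mode!r} for eval {name!r}; expected one of {VALID_MODES}"
        )
    # Invented heuristic: a binary eval whose criteria is not phrased as a question.
    if mode == "binary" and "?" not in criteria:
        warnings.warn(
            f"Eval {name!r} uses mode='binary' but its criteria is not a question.",
            UserWarning,
            stacklevel=2,
        )
    return mode


def raises_under_error_filter(mode: Optional[str], criteria: str) -> bool:
    """Return True if compile_mode's warning is raised as an exception."""
    with warnings.catch_warnings():
        # Mimics a test run configured to treat warnings as errors.
        warnings.simplefilter("error")
        try:
            compile_mode(mode, criteria, "example_eval")
        except UserWarning:
            return True
    return False
```

Under these assumptions, any existing test that constructs an eval matching the warning condition would start failing, so the warning conditions are best kept narrow and each covered by its own test.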
# Compile and validate the eval
self.mode = self._compile_mode(mode, criteria, name)

def _compile_mode(self, mode: Optional[str], criteria: str, name: str) -> str:
Add separate test cases for these.
This PR introduces a mode parameter for TLM RAG evals: binary or continuous.
Merge after the backend PR.