Assumptions#36
rahulbshrestha left a comment
1) Could you use the Variables class in cais/models.py? Feel free to add more variables to it; otherwise, we can create a new shared object that holds all of the parameters used by the functions here, e.g. (I would still prefer sticking to the Variables class):
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

import pandas as pd

@dataclass
class AssumptionVariables:  # feel free to rename this to something better
    df: Optional[pd.DataFrame] = None
    treatment: Optional[str] = None
    outcome: Optional[str] = None
    covariates: List[str] = field(default_factory=list)
    instruments: List[str] = field(default_factory=list)
    running_variable: Optional[str] = None
    time_var: Optional[str] = None
    dataset_description: Optional[str] = None
    variables_summary: Dict[str, Any] = field(default_factory=dict)
    .......
This would let us make each assumption check cleaner:
def check_balance_after_matching(vars: AssumptionVariables) -> AssumptionResult:
.......
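To illustrate, the body of a check like check_balance_after_matching might compute a standard balance diagnostic. This is only a sketch under my own assumptions (the function name and the 0.1 threshold are conventional, not from the PR); the real check would read its inputs off the shared variables object:

```python
import numpy as np
import pandas as pd

def standardized_mean_difference(df: pd.DataFrame, treatment: str, covariate: str) -> float:
    """Absolute standardized mean difference for one covariate.

    A common balance diagnostic after matching: values below ~0.1
    are usually considered well balanced (hypothetical threshold).
    """
    treated = df.loc[df[treatment] == 1, covariate]
    control = df.loc[df[treatment] == 0, covariate]
    pooled_sd = np.sqrt((treated.var() + control.var()) / 2.0)
    return float(abs(treated.mean() - control.mean()) / pooled_sd)
```

The check itself would then just compare this value against the threshold for each covariate and wrap the outcome in a result object.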
2) Could you turn def _result into a class of the form class AssumptionResult? This should also go in cais/models.py.
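As a sketch, the requested result class could be a small dataclass. The field names here are hypothetical placeholders; they should mirror whatever _result currently returns:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class AssumptionResult:
    # Hypothetical fields -- align these with what _result returns today.
    name: str                    # which assumption was checked
    passed: bool                 # did the check pass?
    p_value: Optional[float] = None   # test statistic, if applicable
    details: Dict[str, Any] = field(default_factory=dict)  # extra diagnostics
```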
3) Could we also include a registry associating each assumption check with a method? It would make the checks easier to apply and make the assumption <-> method mapping explicit.
ASSUMPTION_REGISTRY = {
"iv": [
check_iv_relevance,
check_iv_exclusion,
check_iv_exogeneity,
],
"did": [
check_parallel_trends,
check_no_anticipation,
],
}
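With such a registry, applying every check for a given method becomes a one-liner. A minimal, self-contained sketch (the stub checks and the run_assumption_checks helper are my own illustration, not existing code):

```python
from typing import Any, Callable, Dict, List

# Toy stand-ins; the real checks would take the shared variables object
# and return an AssumptionResult.
def check_parallel_trends(vars: Any) -> str:
    return "parallel_trends: ok"

def check_no_anticipation(vars: Any) -> str:
    return "no_anticipation: ok"

ASSUMPTION_REGISTRY: Dict[str, List[Callable]] = {
    "did": [check_parallel_trends, check_no_anticipation],
}

def run_assumption_checks(method: str, vars: Any) -> List[str]:
    """Look up and apply every check registered for a method."""
    return [check(vars) for check in ASSUMPTION_REGISTRY.get(method, [])]
```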
4) Could you also add test cases for each assumption check? Those can go in a separate file in the tests directory.
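One way to structure such tests is to generate synthetic data where the assumption holds by construction and assert that the check passes. The check implementation below is a toy stand-in I wrote for illustration (comparing pre-period OLS slopes), not the PR's actual check_parallel_trends:

```python
import numpy as np
import pandas as pd

def check_parallel_trends_stub(df, time_var, group_var, outcome):
    """Toy stand-in for the real check: compare per-group OLS slopes."""
    slopes = [
        np.polyfit(sub[time_var], sub[outcome], 1)[0]
        for _, sub in df.groupby(group_var)
    ]
    return abs(slopes[0] - slopes[1]) < 0.1  # hypothetical tolerance

def test_parallel_trends_passes_on_parallel_pre_trends():
    t = np.arange(10.0)
    df = pd.DataFrame({
        "time": np.concatenate([t, t]),
        "treated": [0] * 10 + [1] * 10,
        # Same slope, different intercept -> trends are parallel.
        "y": np.concatenate([2 * t + 1, 2 * t + 5]),
    })
    assert check_parallel_trends_stub(df, "time", "treated", "y")
```

A matching negative test (diverging slopes should fail the check) would round out each check's coverage.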
Two comments:
- I would like to see pass/failure rates for some of the assumption checks. On the real datasets we collected, we would expect a pass rate above 90%, since they were used in real causal experiments.
- I like the Jupyter notebooks, but both of them, primarily the use-case notebooks, could use more markdown cells explaining the workflow and why the assumption checks are a valuable addition. Think of it as making the argument for why CAIS is better than something like TabPFN: we rigorously check different assumptions, which is important for a real causal analysis.
Also, it doesn't pass the unit tests for the CAIS core functions. Could you look into why? I can try to come up with a fix if the issue is in how the test cases are executed.
add: modules and use cases notebooks for assumption checking
minor fix in DiD difference_in_differences/diagnostics.py: improve formula quoting using Q() instead of backticks
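For context on that quoting fix: backtick quoting comes from pandas.eval/query and is not part of the patsy formula grammar that statsmodels uses, whereas Q() quotes any column name, including ones with spaces. A small sketch (the quote_term helper and the column names are hypothetical):

```python
def quote_term(name: str) -> str:
    """Wrap a column name in Q(...) when it is not a valid Python identifier,
    so it can be used safely inside a patsy-style formula string."""
    return name if name.isidentifier() else f'Q("{name}")'

# "outcome ~ `post period` * treated" would not parse in patsy;
# Q() quoting produces a formula patsy accepts:
formula = f'outcome ~ {quote_term("post period")} * {quote_term("treated")}'
```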