
Fix RDD estimation issues #33

Open
ARY2260 wants to merge 2 commits into causalNLP:main from ARY2260:RDD-fixes

ARY2260 commented Apr 27, 2026

This PR fixes the following issues in RDD estimation:

  1. The CAIS agent uses "dataset_cleaner_tool" to clean the dataset and, without consulting the dataset description, defaults to assigning treatment to every unit above the cutoff.
  2. The CAIS agent uses "evan-magnusson/rdd" to fit the RDD model and defaults to assigning treatment to every unit above the cutoff, which yields a negative effect estimate for the example below:
Example: real_data/gov_transfers.csv
Key characteristics:
- Strict (sharp) RDD: p(D) jumps between 0 and 1 at the cutoff
- The treatment is applied to units below the cutoff

Current result: eval = {"effect_estimate": -0.061402932849435654, "standard_error": 0.0480432626622543}

Expected result: 0.093 ± 0.046
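
To illustrate why the sign flips, here is a minimal sketch on synthetic data (not gov_transfers.csv). It assumes a plain local linear fit on each side of the cutoff with a uniform kernel; the helper names `rdd_jump` and `rdd_effect` are hypothetical and are neither the "evan-magnusson/rdd" API nor the code in this PR. The `treat_above_cutoff` argument mirrors the new variable introduced here.

```python
import random


def _intercept_at_cutoff(points):
    """OLS fit of y = a + b * d on (d, y) pairs; returns a, the fitted value at d = 0."""
    n = len(points)
    mean_d = sum(d for d, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in points)
    sxy = sum((d - mean_d) * (y - mean_y) for d, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_d


def rdd_jump(x, y, cutoff, bandwidth):
    """Estimated limit of E[y] just above the cutoff minus the limit just below it."""
    right = [(xi - cutoff, yi) for xi, yi in zip(x, y) if 0 <= xi - cutoff <= bandwidth]
    left = [(xi - cutoff, yi) for xi, yi in zip(x, y) if -bandwidth <= xi - cutoff < 0]
    return _intercept_at_cutoff(right) - _intercept_at_cutoff(left)


def rdd_effect(x, y, cutoff, bandwidth, treat_above_cutoff=True):
    """Treated-minus-control effect; negate the jump when treatment is below the cutoff."""
    jump = rdd_jump(x, y, cutoff, bandwidth)
    return jump if treat_above_cutoff else -jump


# Synthetic sharp design: units BELOW the cutoff are treated, true effect is +0.1.
rng = random.Random(0)
true_effect = 0.1
x = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
y = [0.5 + 0.2 * xi + (true_effect if xi < 0 else 0.0) + rng.gauss(0.0, 0.05) for xi in x]

naive = rdd_effect(x, y, cutoff=0.0, bandwidth=0.5, treat_above_cutoff=True)
corrected = rdd_effect(x, y, cutoff=0.0, bandwidth=0.5, treat_above_cutoff=False)
print(f"naive (assumes treated above): {naive:+.3f}")
print(f"corrected (treated below):     {corrected:+.3f}")
```

With treatment assigned below the cutoff, the above-minus-below jump has the opposite sign of the treatment effect, so the default assumption reports a negative estimate for a positive effect.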

The files updated:

  • cais/components/dataset_cleaner.py: pass the dataset description into the planning of dataset transformations to avoid unnecessary transformations.
  • cais/components/method_validator.py: update rdd_design_compliance for cases where treatment is assigned below the cutoff.
  • cais/methods/regression_discontinuity/estimator.py: adjust effect estimation and confidence intervals for cases where treatment is assigned below the cutoff.
  • cais/models.py; cais/prompts/method_identification_prompts.py: add a prompt that identifies a new variable, "treat_above_cutoff": true_false_or_null.
  • cais/components/decision_tree.py; cais/components/explanation_generator.py; cais/components/query_interpreter.py; cais/tools/method_executor_tool.py; cais/tools/method_validator_tool.py; cais/utils/agent.py: account for the new RDD variable "treat_above_cutoff".
  • Added unit tests.
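
As a rough sketch of what a direction-aware compliance check could look like, the snippet below measures the share of units whose observed treatment matches the cutoff rule in either direction. The function name `rdd_compliance`, its signature, and the fallback of inferring the direction when the flag is null are assumptions made for illustration; they are not the actual rdd_design_compliance implementation in cais/components/method_validator.py.

```python
from typing import Optional, Sequence, Tuple


def rdd_compliance(
    running: Sequence[float],
    treatment: Sequence[int],
    cutoff: float,
    treat_above_cutoff: Optional[bool] = None,
) -> Tuple[float, bool]:
    """Return (share of units consistent with the cutoff rule, direction used).

    treat_above_cutoff=True  -> rule is "treated iff running >= cutoff"
    treat_above_cutoff=False -> rule is "treated iff running < cutoff"
    treat_above_cutoff=None  -> hypothetical fallback: pick the direction
                                that the data agree with more often
    """
    if len(running) != len(treatment) or not running:
        raise ValueError("running and treatment must be non-empty and equally long")

    matches_above = sum(
        1 for r, t in zip(running, treatment) if bool(t) == (r >= cutoff)
    )
    share_above = matches_above / len(running)
    share_below = 1.0 - share_above  # the two rules are exact complements

    if treat_above_cutoff is None:
        treat_above_cutoff = share_above >= share_below
    share = share_above if treat_above_cutoff else share_below
    return share, treat_above_cutoff


# Toy sharp design in which units below the cutoff are treated.
running = [-2.0, -1.0, 1.0, 2.0]
treatment = [1, 1, 0, 0]
inferred = rdd_compliance(running, treatment, cutoff=0.0)
forced_above = rdd_compliance(running, treatment, cutoff=0.0, treat_above_cutoff=True)
print(inferred, forced_above)
```

A share near 1 in one direction is consistent with a sharp design; checking only the "above" rule would score this toy dataset as fully non-compliant.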

(In progress for the next PR) Bandwidth selection: the IK optimal bandwidth algorithm appears to underestimate the bandwidth for the PANES real dataset (experiments/data/causcibench/real_data/gov_transfers.csv).

@jacobemmerson jacobemmerson marked this pull request as ready for review April 27, 2026 15:10
@jacobemmerson jacobemmerson deployed to causal-agent April 27, 2026 15:10 — with GitHub Actions Active
@jacobemmerson jacobemmerson marked this pull request as draft April 27, 2026 15:34
@jacobemmerson jacobemmerson marked this pull request as ready for review April 27, 2026 15:34

ARY2260 commented Apr 30, 2026

exp_gov_transfers_with_check_gpt4-o_2026-04-30_10-48-17_2026-04-30.log

These are the experiment logs with the new "treat_above_cutoff" variable identified by the agent.
