
Commit 375333c

add tuning results to docs

1 parent 454361c commit 375333c

3 files changed: 222 additions, 3 deletions

doc/irm/apo.qmd

Lines changed: 120 additions & 3 deletions
@@ -22,7 +22,9 @@ from utils.style_tables import generate_and_show_styled_table
 init_notebook_mode(all_interactive=True)
 ```
 
-## APO Pointwise Coverage
+## Coverage
+
+### APO Pointwise Coverage
 
 The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/datasets.html#dataset-generators)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.
 
@@ -78,7 +80,7 @@ generate_and_show_styled_table(
 ```
 
 
-## APOS Coverage
+### APOS Coverage
 
 The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/datasets.html#dataset-generators)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.
 
@@ -134,7 +136,7 @@ generate_and_show_styled_table(
 )
 ```
 
-## Causal Contrast Coverage
+### Causal Contrast Coverage
 
 The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/datasets.html#dataset-generators)-DGP with $500$ observations. Due to the linearity of the DGP, Lasso and Logit Regression are nearly optimal choices for the nuisance estimation.
 
@@ -189,3 +191,118 @@ generate_and_show_styled_table(
     coverage_highlight_cols=["Coverage", "Uniform Coverage"]
 )
 ```
+
+
+## Tuning
+
+The simulations are based on the [make_irm_data_discrete_treatments](https://docs.doubleml.org/stable/api/datasets.html#dataset-generators)-DGP with $500$ observations. This is only an example, as the untuned version simply relies on the default configuration.
+
+### APOS Coverage
+
+The non-uniform results (coverage, CI length and bias) refer to averaged values over all treatment levels (point-wise confidence intervals).
+
+::: {.callout-note title="Metadata" collapse="true"}
+
+```{python}
+#| echo: false
+metadata_file = '../../results/irm/apos_tune_metadata.csv'
+metadata_df = pd.read_csv(metadata_file)
+print(metadata_df.T.to_string(header=False))
+```
+
+:::
+
+```{python}
+#| echo: false
+
+# set up data
+df_apos = pd.read_csv("../../results/irm/apos_tune_coverage.csv", index_col=None)
+
+assert df_apos["repetition"].nunique() == 1
+n_rep_apos = df_apos["repetition"].unique()[0]
+
+display_columns_apos = ["Learner g", "Learner m", "Tuned", "Bias", "CI Length", "Coverage", "Uniform CI Length", "Uniform Coverage"]
+```
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_apos,
+    filters={"level": 0.95},
+    display_cols=display_columns_apos,
+    n_rep=n_rep_apos,
+    level_col="level",
+    coverage_highlight_cols=["Coverage", "Uniform Coverage"]
+)
+```
+
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_apos,
+    filters={"level": 0.9},
+    display_cols=display_columns_apos,
+    n_rep=n_rep_apos,
+    level_col="level",
+    coverage_highlight_cols=["Coverage", "Uniform Coverage"]
+)
+```
+
+
+### Causal Contrast Coverage
+
+The non-uniform results (coverage, CI length and bias) refer to averaged values over all causal contrasts (point-wise confidence intervals).
+
+
+::: {.callout-note title="Metadata" collapse="true"}
+
+```{python}
+#| echo: false
+metadata_file = '../../results/irm/apos_tune_metadata.csv'
+metadata_df = pd.read_csv(metadata_file)
+print(metadata_df.T.to_string(header=False))
+```
+
+:::
+
+```{python}
+#| echo: false
+
+# set up data
+df_contrast = pd.read_csv("../../results/irm/apos_tune_causal_contrast.csv", index_col=None)
+
+assert df_contrast["repetition"].nunique() == 1
+n_rep_contrast = df_contrast["repetition"].unique()[0]
+
+display_columns_contrast = ["Learner g", "Learner m", "Tuned", "Bias", "CI Length", "Coverage", "Uniform CI Length", "Uniform Coverage"]
+```
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_contrast,
+    filters={"level": 0.95},
+    display_cols=display_columns_contrast,
+    n_rep=n_rep_contrast,
+    level_col="level",
+    coverage_highlight_cols=["Coverage", "Uniform Coverage"]
+)
+```
+
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_contrast,
+    filters={"level": 0.9},
+    display_cols=display_columns_contrast,
+    n_rep=n_rep_contrast,
+    level_col="level",
+    coverage_highlight_cols=["Coverage", "Uniform Coverage"]
+)
+```
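For readers comparing the tuned and untuned rows in the tables above: the idea is to fit the same model once with the nuisance learners at their default configuration and once with learners whose hyperparameters were tuned beforehand. A minimal, hypothetical sketch of how such a learner pair could be prepared is given below; the random-forest learners, the parameter grid, and the use of scikit-learn's GridSearchCV are illustrative assumptions, and the benchmark scripts may tune differently (for instance via DoubleML's built-in tune() routine).

```python
# Illustrative sketch only: one way to obtain "tuned" vs. "untuned" nuisance
# learners before passing them to a DoubleML model. Learners and grid are
# placeholder choices, not the configuration used for the reported results.
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import GridSearchCV

param_grid = {"n_estimators": [100, 300], "max_depth": [3, 5, None]}

# untuned: rely on the library defaults
ml_g_untuned = RandomForestRegressor()
ml_m_untuned = RandomForestClassifier()

# tuned: wrap the same learners in a cross-validated grid search
ml_g_tuned = GridSearchCV(RandomForestRegressor(), param_grid, cv=5)
ml_m_tuned = GridSearchCV(RandomForestClassifier(), param_grid, cv=5, scoring="neg_log_loss")
```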

doc/plm/lplr.qmd

Lines changed: 91 additions & 0 deletions
@@ -111,3 +111,94 @@ generate_and_show_styled_table(
     coverage_highlight_cols=["Coverage"]
 )
 ```
+
+
+## Tuning
+
+The simulations are based on the [make_lplr_LZZ2020](https://docs.doubleml.org/stable/api/generated/doubleml.plm.datasets.make_lplr_LZZ2020.html)-DGP with $500$ observations. This is only an example, as the untuned version simply relies on the default configuration.
+
+::: {.callout-note title="Metadata" collapse="true"}
+
+```{python}
+#| echo: false
+metadata_file = '../../results/plm/lplr_ate_tune_metadata.csv'
+metadata_df = pd.read_csv(metadata_file)
+print(metadata_df.T.to_string(header=False))
+```
+
+:::
+
+```{python}
+#| echo: false
+
+# set up data and rename columns
+df_coverage = pd.read_csv("../../results/plm/lplr_ate_tune_coverage.csv", index_col=None)
+
+if "repetition" in df_coverage.columns and df_coverage["repetition"].nunique() == 1:
+    n_rep_coverage = df_coverage["repetition"].unique()[0]
+elif "n_rep" in df_coverage.columns and df_coverage["n_rep"].nunique() == 1:
+    n_rep_coverage = df_coverage["n_rep"].unique()[0]
+else:
+    n_rep_coverage = "N/A"  # Fallback if n_rep cannot be determined
+
+display_columns_coverage = ["Learner m", "Learner M", "Learner t", "Tuned", "Bias", "CI Length", "Coverage"]
+```
+
+### Nuisance space
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_coverage,
+    filters={"level": 0.95, "Score": "nuisance_space"},
+    display_cols=display_columns_coverage,
+    n_rep=n_rep_coverage,
+    level_col="level",
+    # rename_map={"Learner g": "Learner l"},
+    coverage_highlight_cols=["Coverage"]
+)
+```
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_coverage,
+    filters={"level": 0.9, "Score": "nuisance_space"},
+    display_cols=display_columns_coverage,
+    n_rep=n_rep_coverage,
+    level_col="level",
+    # rename_map={"Learner g": "Learner l"},
+    coverage_highlight_cols=["Coverage"]
+)
+```
+
+### Instrument
+
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_coverage,
+    filters={"level": 0.95, "Score": "instrument"},
+    display_cols=display_columns_coverage,
+    n_rep=n_rep_coverage,
+    level_col="level",
+    coverage_highlight_cols=["Coverage"]
+)
+```
+
+```{python}
+#| echo: false
+
+generate_and_show_styled_table(
+    main_df=df_coverage,
+    filters={"level": 0.9, "Score": "instrument"},
+    display_cols=display_columns_coverage,
+    n_rep=n_rep_coverage,
+    level_col="level",
+    coverage_highlight_cols=["Coverage"]
+)
+```
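All tables in this file are rendered by the repository utility generate_and_show_styled_table (imported from utils/style_tables.py), which is not part of this diff. As orientation only, a simplified, hypothetical sketch of what such a helper does — filter the results by confidence level and score, select the display columns, and highlight the coverage columns — might look as follows; the real utility may differ in signature and styling.

```python
# Hypothetical, simplified stand-in for a helper like generate_and_show_styled_table;
# the real implementation lives in utils/style_tables.py and may differ.
import pandas as pd

def show_styled_table(main_df, filters, display_cols, n_rep, level_col="level",
                      coverage_highlight_cols=("Coverage",), rename_map=None):
    # keep only rows matching the requested filters (e.g. confidence level and score)
    mask = pd.Series(True, index=main_df.index)
    for col, val in filters.items():
        mask &= main_df[col] == val
    df = main_df.loc[mask, display_cols].rename(columns=rename_map or {})

    # flag coverage values that deviate strongly from the nominal level
    level = filters[level_col]
    def highlight(col):
        return ["background-color: #fbb" if abs(v - level) > 0.1 else "" for v in col]

    print(f"Results are based on {n_rep} repetitions at level {level}.")
    return df.style.apply(highlight, subset=list(coverage_highlight_cols))
```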

doc/plm/plr.qmd

Lines changed: 11 additions & 0 deletions
@@ -213,6 +213,17 @@ generate_and_show_styled_table(
 
 The simulations are based on the [make_plr_CCDDHNR2018](https://docs.doubleml.org/stable/api/generated/doubleml.plm.datasets.make_plr_CCDDHNR2018.html)-DGP with $500$ observations. This is only an example, as the untuned version simply relies on the default configuration.
 
+::: {.callout-note title="Metadata" collapse="true"}
+
+```{python}
+#| echo: false
+metadata_file = '../../results/plm/plr_ate_tune_metadata.csv'
+metadata_df = pd.read_csv(metadata_file)
+print(metadata_df.T.to_string(header=False))
+```
+
+:::
+
 ```{python}
 #| echo: false
 
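For context on the tuned results documented here: a hedged sketch of how a tuned PLR run on the make_plr_CCDDHNR2018 DGP could be set up is shown below, using DoubleML's tune() method for a per-learner grid search. The random-forest learners and the parameter grid are illustrative assumptions, not the configuration behind the reported results.

```python
# Hedged sketch of a tuned PLR run; learners and grid are placeholder choices,
# and the benchmark scripts may configure the tuning differently.
from doubleml import DoubleMLPLR
from doubleml.plm.datasets import make_plr_CCDDHNR2018
from sklearn.ensemble import RandomForestRegressor

dml_data = make_plr_CCDDHNR2018(n_obs=500)  # DGP referenced in the section above

dml_plr = DoubleMLPLR(dml_data, RandomForestRegressor(), RandomForestRegressor())

# grid keys follow the PLR nuisance learner names ('ml_l', 'ml_m')
par_grids = {
    "ml_l": {"n_estimators": [100, 300], "max_depth": [3, 5]},
    "ml_m": {"n_estimators": [100, 300], "max_depth": [3, 5]},
}
dml_plr.tune(par_grids)  # grid search over the supplied values (DoubleML's default search mode)
dml_plr.fit()
print(dml_plr.summary)
```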