What happened?
I have a graph with 8 hypotheses representing two doses (A and B) each compared to a shared placebo. H1a/H1b are the primary hypotheses, with H2a/H2b and H3a/H3b as secondary, and H4a/H4b as final hypotheses in the sequence. H1a and H1b are tested jointly using a parametric (Dunnett) test. All other hypotheses use Bonferroni.
In my example, the p-value of every dose A hypothesis is smaller than that of the corresponding dose B hypothesis. Dose A also receives more initial alpha weight (0.5294 vs 0.4706). Despite this, graph_test_closure() rejects H1b but not H1a.
I would expect that if dose A has uniformly smaller p-values across all hypotheses and more initial alpha weight, it should not be possible for a dose B hypothesis to be rejected while the corresponding dose A hypothesis is not.
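The two premises behind this expectation can be checked directly in base R. The sketch below only restates the p-values and initial weights from the reproducible example; the names `p_a`, `p_b`, `w_a`, and `w_b` are introduced here for illustration and are not part of graphicalMCP.

```r
# p-values from the reproducible example, split by dose (H1 to H4)
p_a <- c(H1a = 0.0112505, H2a = 0.0001, H3a = 0.0001, H4a = 0.0001)
p_b <- c(H1b = 0.02, H2b = 0.1, H3b = 0.1, H4b = 0.1)

# Initial weights of the two primary hypotheses
w_a <- 0.5294
w_b <- 0.4706

# Premise 1: every dose A p-value is smaller than its dose B counterpart
all_a_smaller <- all(p_a < p_b)

# Premise 2: dose A starts with more weight than dose B
a_has_more_weight <- w_a > w_b

print(c(all_a_smaller = all_a_smaller, a_has_more_weight = a_has_more_weight))
```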
Session Information
R version 4.5.2 (2025-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)
Matrix products: default
LAPACK version 3.12.1
locale:
[1] LC_COLLATE=English_Switzerland.utf8 LC_CTYPE=English_Switzerland.utf8 LC_MONETARY=English_Switzerland.utf8
[4] LC_NUMERIC=C LC_TIME=English_Switzerland.utf8
time zone: Europe/Zurich
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mvtnorm_1.3-6
loaded via a namespace (and not attached):
[1] compiler_4.5.2 tools_4.5.2 rstudioapi_0.18.0
Reproducible Example
library(graphicalMCP)
# =================
# Reproducible Example: Unexpected rejection pattern with parametric test
# ==================
# --- Define graph ---
# Two primary hypotheses (H1a, H1b) with unequal weights,
# using a weighted Dunnett test,
# followed by two pairs of secondary hypotheses (H2a/H2b, H3a/H3b)
# and two final hypotheses (H4a, H4b)
epsilon <- 0.0001
hypotheses <- c(
H1a = 0.5294,
H1b = 0.4706,
H2a = 0, H2b = 0,
H3a = 0, H3b = 0,
H4a = 0, H4b = 0
)
transitions <- rbind(
c(0, 0, 0.75, 0, 0.25 - epsilon, 0, epsilon, 0),
c(0, 0, 0, 0.75, 0, 0.25 - epsilon, 0, epsilon),
c(0, epsilon, 0, 0, 1 - epsilon, 0, 0, 0),
c(epsilon, 0, 0, 0, 0, 1 - epsilon, 0, 0),
c(0, epsilon, 1 - epsilon, 0, 0, 0, 0, 0),
c(epsilon, 0, 0, 1 - epsilon, 0, 0, 0, 0),
c(0, 0, 0, 0, 0, 0, 0, 0),
c(0, 0, 0, 0, 0, 0, 0, 0)
)
g <- graph_create(hypotheses, transitions)
# --- Define test parameters ---
# H1a and H1b are tested parametrically (Dunnett) with correlation 0.5
# All other hypotheses are tested with Bonferroni
test_groups <- list(c(1, 2), 3:8)
test_types <- c("parametric", "bonferroni")
test_corr <- list(
matrix(c(1, 0.5, 0.5, 1), 2, 2),
NA
)
# --- P-values ---
# Note: every "a" hypothesis has a smaller p-value than its "b" counterpart
# H1a has more weight than H1b
p_values <- c(
H1a = 0.0112505,
H1b = 0.02,
H2a = 0.0001,
H2b = 0.1,
H3a = 0.0001,
H3b = 0.1,
H4a = 0.0001,
H4b = 0.1
)
# --- Run test ---
results <- graph_test_closure(
g,
p = p_values,
alpha = 0.02125,
test_types = test_types,
test_groups = test_groups,
test_corr = test_corr
)
print(results)
# --- Issue ---
# H1a has a smaller p-value than H1b AND has more alpha weight allocated to it.
# Yet the result rejects H1b but not H1a.
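One way to dig into this without calling graphicalMCP is to compute, in base R, the weight each primary hypothesis carries in the intersection of H2b, H3b, and H4b with H1a, and in the same intersection with H1b instead. The sketch below is a diagnostic aid under stated assumptions, not the package's internal code: `remove_hypothesis()` and `intersection_weights()` are hypothetical helpers that apply the usual graph update rule for graphical multiple comparison procedures, and the thresholds assume that a hypothesis that is alone in its parametric group is tested at its weight times alpha.

```r
# Graph from the reproducible example
epsilon <- 0.0001
hyp_names <- c("H1a", "H1b", "H2a", "H2b", "H3a", "H3b", "H4a", "H4b")
w <- c(0.5294, 0.4706, 0, 0, 0, 0, 0, 0)
names(w) <- hyp_names
G <- rbind(
  c(0, 0, 0.75, 0, 0.25 - epsilon, 0, epsilon, 0),
  c(0, 0, 0, 0.75, 0, 0.25 - epsilon, 0, epsilon),
  c(0, epsilon, 0, 0, 1 - epsilon, 0, 0, 0),
  c(epsilon, 0, 0, 0, 0, 1 - epsilon, 0, 0),
  c(0, epsilon, 1 - epsilon, 0, 0, 0, 0, 0),
  c(epsilon, 0, 0, 1 - epsilon, 0, 0, 0, 0),
  c(0, 0, 0, 0, 0, 0, 0, 0),
  c(0, 0, 0, 0, 0, 0, 0, 0)
)
dimnames(G) <- list(hyp_names, hyp_names)

# Hypothetical helper: remove hypothesis j, pass its weight along its
# outgoing edges, and rewire the remaining transitions
remove_hypothesis <- function(w, G, j) {
  w_new <- w + w[j] * G[j, ]
  G_new <- G + outer(G[, j], G[j, ])
  denom <- 1 - G[, j] * G[j, ]
  G_new <- G_new / denom      # row l is divided by denom[l]
  G_new[denom <= 0, ] <- 0    # guard against division by zero
  diag(G_new) <- 0
  G_new[j, ] <- 0
  G_new[, j] <- 0
  w_new[j] <- 0
  list(w = w_new, G = G_new)
}

# Hypothetical helper: weights of the intersection hypothesis indexed by `keep`
intersection_weights <- function(w, G, keep) {
  for (j in setdiff(names(w), keep)) {
    updated <- remove_hypothesis(w, G, j)
    w <- updated$w
    G <- updated$G
  }
  w[keep]
}

alpha <- 0.02125
w_with_H1a <- intersection_weights(w, G, c("H1a", "H2b", "H3b", "H4b"))
w_with_H1b <- intersection_weights(w, G, c("H1b", "H2b", "H3b", "H4b"))

# Weight and weight * alpha for each hypothesis in the two intersections
print(rbind(weight = w_with_H1a, threshold = alpha * w_with_H1a))
print(rbind(weight = w_with_H1b, threshold = alpha * w_with_H1b))
```

Comparing these thresholds with the p-values of H1a (0.0112505) and H1b (0.02) may help show whether the result follows from how weight propagates through the epsilon edges or points to a problem in the parametric step.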