28 commits
5b95236
standards
SFJohnson24 Mar 5, 2026
5adee56
rules
SFJohnson24 Mar 5, 2026
e7852ac
remove sheet .xpt
SFJohnson24 Mar 6, 2026
7402469
rules and test data sorted
SFJohnson24 Mar 20, 2026
ae2f06e
env
SFJohnson24 Apr 1, 2026
209618e
remove xlsx results
SFJohnson24 Apr 1, 2026
2ef0531
script
SFJohnson24 Apr 2, 2026
90abac7
use_case, usdm, results.csv
SFJohnson24 Apr 9, 2026
f89e17d
results processed
SFJohnson24 Apr 14, 2026
f4f278f
usdm trim
SFJohnson24 Apr 14, 2026
45add25
usdm, multiple sub folders
SFJohnson24 Apr 15, 2026
cdce7a3
last tables
SFJohnson24 Apr 15, 2026
6cc170d
update
SFJohnson24 Apr 16, 2026
5ff8146
merge main
SFJohnson24 Apr 20, 2026
f52e2a1
moved stray adam file
SFJohnson24 Apr 21, 2026
5befe49
update datasets and variables csv and docs
SFJohnson24 Apr 21, 2026
9f581ef
remove .xpt extension
SFJohnson24 Apr 21, 2026
e8edd14
Merge branch 'refs/heads/main' into testing
alexfurmenkov Apr 23, 2026
7c09370
allow to override engine location in validation script
alexfurmenkov Apr 23, 2026
03fb6fe
modified five rules to test by workflow
alexfurmenkov Apr 24, 2026
1697b2b
validation script change to support creating two reports
alexfurmenkov Apr 30, 2026
4ce45db
rollback metadata files names
alexfurmenkov Apr 30, 2026
6fa9017
Merge branch 'main' into rules_2
alexfurmenkov May 13, 2026
a976fdf
Merge branch 'main' into feature/update-validation-pipeline
alexfurmenkov May 18, 2026
21dc910
fixed cycle in validation script. rolled back test data
alexfurmenkov May 18, 2026
878e39e
Merge branch 'main' into rules_2
alexfurmenkov May 21, 2026
790bb05
Merge branch 'feature/update-validation-pipeline' into rules_2
alexfurmenkov May 21, 2026
2fac0ec
Merge branch 'main' into rules_2
alexfurmenkov May 21, 2026
4 changes: 2 additions & 2 deletions .github/scripts/convert_results.py
@@ -25,7 +25,7 @@ def detect_standard(data: dict) -> str:
return data.get("Conformance_Details", {}).get("Standard", "").upper()


def convert_nonusdm(issue_details: list) -> tuple[list[str], list[tuple]]:
def convert_non_usdm(issue_details: list) -> tuple[list[str], list[tuple]]:
header = ["Dataset", "Record", "Variable", "Value"]
rows = []
for issue in issue_details:
@@ -65,7 +65,7 @@ def convert(json_path: str, csv_path: str) -> None:
if standard == "USDM":
header, rows = convert_usdm(issue_details)
else:
header, rows = convert_nonusdm(issue_details)
header, rows = convert_non_usdm(issue_details)

with open(csv_path, "w", newline="") as f:
writer = csv.writer(f)
90 changes: 65 additions & 25 deletions .github/scripts/run_validation.sh
@@ -1,7 +1,9 @@
#!/usr/bin/env bash
# run_validation.sh — iterates all positive/ and negative/ test cases for a rule,
# runs the CORE engine against each, converts JSON output to results.csv,
# diffs against any committed results.csv, and writes a markdown report.
# diffs against any committed results.csv, and writes two outputs:
# - $REPO_ROOT/validation_report.md (detailed markdown, legacy/fallback)
# - $REPO_ROOT/case_results.jsonl (one JSON line per test case for the summary table)
#
# Usage:
# bash .github/scripts/run_validation.sh <rule_rel_path> <python_cmd> <repo_root>
@@ -18,9 +20,11 @@ REPO_ROOT="${3:?repo_root required}"

RULE_ID=$(basename "$RULE_REL_PATH")
RULE_DIR="$REPO_ROOT/$RULE_REL_PATH"
ENGINE_DIR="$REPO_ROOT/engine"
# Allow caller to override engine location (e.g. when called from rules-engine repo)
ENGINE_DIR="${ENGINE_DIR_OVERRIDE:-$REPO_ROOT/engine}"
SCRIPTS_DIR="$REPO_ROOT/.github/scripts"
REPORT_FILE="$REPO_ROOT/validation_report.md"
JSONL_FILE="$REPO_ROOT/case_results.jsonl"

# ---------------------------------------------------------------------------
# Locate the rule YAML
@@ -48,27 +52,60 @@ TOTAL_CASES=0
PASSED_CASES=0
FAILED_CASES=0

# ---------------------------------------------------------------------------
# Helper: append one JSON line to case_results.jsonl
# Passes all values via env vars to avoid shell-quoting issues with paths.
# Args: exec(true|false) expected got match(true|false) diff_path stderr_path
# ---------------------------------------------------------------------------
emit_result() {
R_RULE="$RULE_ID" \
R_TYPE="$TEST_TYPE" \
R_NUM="$CASE_ID" \
R_EXEC="$1" \
R_EXPECTED="$2" \
R_GOT="$3" \
R_MATCH="$4" \
R_DIFF="$5" \
R_STDERR="$6" \
python3 -c "
I think we should not hardcode python3 here. The script may fail on systems where Python is not registered as python3.
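
One possible way to address this is to reuse the interpreter the script already receives as its second argument (PYTHON_CMD) inside the helper. The sketch below is a trimmed-down illustration, not the PR's code: the fallback lookup via `command -v`, the temporary output file, and the `RULE-PLACEHOLDER` value are assumptions made so the snippet runs on its own.

```shell
# Sketch only: PYTHON_CMD mirrors the script's <python_cmd> argument.
# The fallback lookup and the mktemp default are illustrative assumptions.
PYTHON_CMD="${PYTHON_CMD:-$(command -v python3 || command -v python)}"
JSONL_FILE="${JSONL_FILE:-$(mktemp)}"

emit_result() {
  # Values travel through env vars, as in the PR, so they need no extra quoting.
  # $PYTHON_CMD is left unquoted to match how the script already invokes it.
  R_RULE="$1" \
  R_MATCH="$2" \
  $PYTHON_CMD -c '
import json, os
e = os.environ
print(json.dumps({"rule": e["R_RULE"], "match": e["R_MATCH"] == "true"}))
' >> "$JSONL_FILE"
}

# Hypothetical rule ID, used only to exercise the helper.
emit_result "RULE-PLACEHOLDER" "true"
cat "$JSONL_FILE"
```

With this shape the helper runs under the same interpreter as convert_results.py and diff_results.py, so a runner that only exposes `python` would not break the JSONL output.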

import json, os
e = os.environ
print(json.dumps({
'rule': e['R_RULE'],
'type': e['R_TYPE'],
'num': e['R_NUM'],
'exec': e['R_EXEC'] == 'true',
'expected': e['R_EXPECTED'],
'got': e['R_GOT'],
'match': e['R_MATCH'] == 'true',
'diff': e['R_DIFF'],
'stderr': e['R_STDERR'],
}))" >> "$JSONL_FILE"
}

# ---------------------------------------------------------------------------
# Iterate test types and cases
# ---------------------------------------------------------------------------
for TEST_TYPE in positive negative; do
TYPE_DIR="$RULE_DIR/$TEST_TYPE"
[ -d "$TYPE_DIR" ] || continue

echo "" >> "$REPORT_FILE"
echo "## $TEST_TYPE" >> "$REPORT_FILE"
echo "" >> "$REPORT_FILE"
{
echo ""
echo "## $TEST_TYPE"
echo ""
} >> "$REPORT_FILE"

for CASE_DIR in $(find "$TYPE_DIR" -mindepth 1 -maxdepth 1 -type d | sort); do
while IFS= read -r -d '' CASE_DIR; do
CASE_ID=$(basename "$CASE_DIR")
DATA_DIR="$CASE_DIR/data"
RESULTS_DIR="$CASE_DIR/results"
CASE_LABEL="$TEST_TYPE/$CASE_ID"

TOTAL_CASES=$((TOTAL_CASES + 1))
echo ""
echo "--- Processing $RULE_ID / $CASE_LABEL ---"

# -- Skip cases that are structurally incomplete (no jsonl entry emitted)
if [ ! -d "$DATA_DIR" ]; then
echo "::warning::Missing data/ directory for $CASE_LABEL — skipping"
echo "### \`$CASE_LABEL\` — ⚠️ Skipped (no data/ directory)" >> "$REPORT_FILE"
@@ -86,6 +123,10 @@ for TEST_TYPE in positive negative; do
continue
fi
echo " .env: $ENV_FILE"

TOTAL_CASES=$((TOTAL_CASES + 1))

# -- Missing committed baseline
if [ ! -f "$RESULTS_DIR/results.csv" ]; then
echo " ERROR: no committed results.csv found for $CASE_LABEL"
{
@@ -94,11 +135,16 @@ for TEST_TYPE in positive negative; do
echo "No \`results.csv\` was found for this test case. Run the rule locally before opening a PR and commit the generated \`results.csv\`."
echo ""
} >> "$REPORT_FILE"
emit_result "false" "" "" "false" "" ""
FAILED_CASES=$((FAILED_CASES + 1))
OVERALL_SUCCESS=false
continue
fi

# Back up committed results.csv before the engine run
cp "$RESULTS_DIR/results.csv" "$RESULTS_DIR/results.csv.committed"
COMMITTED_RESULTS="$RESULTS_DIR/results.csv.committed"

ENGINE_ARGS=(
"-lr" "$RULE_YML"
"-d" "$DATA_DIR"
@@ -110,10 +156,6 @@ for TEST_TYPE in positive negative; do

echo " Command: python core.py validate ${ENGINE_ARGS[*]}"

# Back up committed results.csv before the engine run
cp "$RESULTS_DIR/results.csv" "$RESULTS_DIR/results.csv.committed"
COMMITTED_RESULTS="$RESULTS_DIR/results.csv.committed"

# Run the engine
ENGINE_LOG="/tmp/engine_${TEST_TYPE}_${CASE_ID}.txt"
ENGINE_EXIT=0
@@ -133,6 +175,7 @@ for TEST_TYPE in positive negative; do
echo "</details>"
echo ""
} >> "$REPORT_FILE"
emit_result "false" "" "" "false" "" "$ENGINE_LOG"
FAILED_CASES=$((FAILED_CASES + 1))
OVERALL_SUCCESS=false
mv "$COMMITTED_RESULTS" "$RESULTS_DIR/results.csv"
@@ -147,7 +190,7 @@ for TEST_TYPE in positive negative; do
2>&1 | tee -a "$ENGINE_LOG" || CONVERT_EXIT=$?

if [ $CONVERT_EXIT -ne 0 ]; then
echo " ERROR: failed to convert results.json to results.csv"
echo " ERROR: failed to convert results.json to CSV"
{
echo "### \`$CASE_LABEL\` — ❌ Conversion error"
echo ""
@@ -159,12 +202,17 @@ for TEST_TYPE in positive negative; do
echo "</details>"
echo ""
} >> "$REPORT_FILE"
emit_result "false" "" "" "false" "" "$ENGINE_LOG"
FAILED_CASES=$((FAILED_CASES + 1))
OVERALL_SUCCESS=false
mv "$COMMITTED_RESULTS" "$RESULTS_DIR/results.csv"
continue
fi

# -- Diff
EXPECTED_COUNT=$(( $(wc -l < "$COMMITTED_RESULTS") - 1 ))
GOT_COUNT=$(( $(wc -l < "$GENERATED_CSV") - 1 ))

DIFF_LOG="/tmp/diff_${TEST_TYPE}_${CASE_ID}.txt"
DIFF_EXIT=0
$PYTHON_CMD "$SCRIPTS_DIR/diff_results.py" \
@@ -177,6 +225,7 @@ for TEST_TYPE in positive negative; do
echo "### \`$CASE_LABEL\` — ✅ Results match committed baseline"
echo ""
} >> "$REPORT_FILE"
emit_result "true" "$EXPECTED_COUNT" "$GOT_COUNT" "true" "" ""
PASSED_CASES=$((PASSED_CASES + 1))
else
echo " FAILED — committed results do not match engine output"
@@ -191,24 +240,15 @@ for TEST_TYPE in positive negative; do
echo "</details>"
echo ""
} >> "$REPORT_FILE"
emit_result "true" "$EXPECTED_COUNT" "$GOT_COUNT" "false" "$DIFF_LOG" ""
FAILED_CASES=$((FAILED_CASES + 1))
OVERALL_SUCCESS=false
fi

mv "$COMMITTED_RESULTS" "$RESULTS_DIR/results.csv"
if [ -s "$ENGINE_LOG" ]; then
{
echo "<details><summary>Engine output for \`$CASE_LABEL\`</summary>"
echo ""
echo '```'
cat "$ENGINE_LOG"
echo '```'
echo "</details>"
echo ""
} >> "$REPORT_FILE"
fi

done # cases
done # test types
done < <(find "$TYPE_DIR" -mindepth 1 -maxdepth 1 -type d -print0 | sort -z)
This is meant to handle spaces in the path safely, but the for loop at the top does not use it. For this to be effective, use a while loop instead of the for loop.
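
For reference, a minimal standalone sketch of the NUL-delimited pattern, assuming bash and a `find`/`sort` that support `-print0` and `-z`. The temporary directory and the case names are made up for the demonstration. Each path is read whole, whereas a `for CASE_DIR in $(find ...)` loop would split "02 with space" into three words.

```shell
# Build a throwaway tree; one case directory has spaces in its name (hypothetical data).
TYPE_DIR="$(mktemp -d)"
mkdir -p "$TYPE_DIR/01" "$TYPE_DIR/02 with space"

CASE_IDS=()
# read -d '' consumes one NUL-terminated path at a time; IFS= and -r keep it verbatim.
while IFS= read -r -d '' CASE_DIR; do
  CASE_IDS+=("$(basename "$CASE_DIR")")
done < <(find "$TYPE_DIR" -mindepth 1 -maxdepth 1 -type d -print0 | sort -z)

# One case ID per line, in sorted order.
printf '%s\n' "${CASE_IDS[@]}"
```

Feeding the loop through process substitution rather than a pipe keeps it in the current shell, so counters such as TOTAL_CASES updated inside the loop body remain visible afterwards.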

done

# ---------------------------------------------------------------------------
# Summary