This README documents what the top-level test runner executes today, the order and purpose of each subtest, required environment, logs, and a proposal to make the suite fail fast with targeted triage.
- Entry point: `run_scripts/run_all_tests.sh`
- Optional fail-fast runner: `run_scripts/run_triage_suite.sh`
- Helpers moved to `run_scripts/helpers/`
- Active test entrypoints moved to `run_scripts/active/`
- Goal: Validate Genesis end-to-end on DDS, covering memory, agent-to-agent, interface→agent→service pipelines, RPC math services, durability, framework sanity, and monitoring.
- Philosophy: Run hardest/most comprehensive checks early to fail fast.
- Python: 3.10.x
- DDS: RTI Connext DDS 7.3.0+ installed and configured
  - `NDDSHOME` set, `rtiddsspy` available
  - Optionally: `RTIDDSSPY_BIN` or `RTI_BIN_DIR` overrides
- API keys (where used): `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`
- Virtualenv activated or available at `venv/`
- Activates `venv/` if present; sources `.env` if present.
- Verifies Python 3.10.
- Locates `rtiddsspy` using `RTIDDSSPY_BIN` → `RTI_BIN_DIR` → `$NDDSHOME/bin/rtiddsspy`.
- Creates `logs/` folder.
- Performs DDS cleanup check using `rtiddsspy` and kills lingering test processes if needed.
- Runs each test with a per-test timeout; writes per-test logs under `logs/`.
- On failure, prints the last 20 lines of relevant logs using heuristics to include related files.
1. Memory recall core test: `run_test_agent_memory.sh` (timeout 60s)
   - Validates multi-stage agent memory recall behavior.
   - Pass token in log: "✅ Multi-stage memory recall test PASSED".
2. Agent↔Agent comprehensive: `test_agent_to_agent_communication.py` (120s)
   - PersonalAssistant ⇄ WeatherAgent using `@genesis_tool` auto-discovery.
   - Verifies DDS messages, monitoring events, and successful cross-agent tool execution.
3. Interface→Agent→Service pipeline: `run_interface_agent_service_test.sh` (75s)
   - `SimpleGenesisAgent` + `CalculatorService` + `SimpleGenesisInterfaceStatic` question flow.
   - Checks DDS quiet preflight, connection, RPC client logs, and expected sum in response.
4. Math interface/agent simple: `run_math_interface_agent_simple.sh` (60s)
   - Two-phase test: registration durability then interface request/response.
   - Verifies durable Advertisement, request/reply topics, and pass tokens.
5. Basic calculator client: `run_math.sh` (30s)
   - Spawns `CalculatorService` and uses `GenesisRPCClient` to verify add/sub/mul/div and div-by-zero handling.
6. Multi-instance calculator: `run_multi_math.sh` (60s)
   - Launches 3 calculator services; runs client checks across them.
7. Simple agent with services: `run_simple_agent.sh` (60s)
   - Starts Calculator/TextProcessor/LetterCounter services + `simple_agent`; verifies RPC calls to each.
8. Simple client path: `run_simple_client.sh` (60s)
   - Similar to (7) but drives via a simple client program.
9. Calculator durability: `test_calculator_durability.sh` (60s)
   - Service registration durability and function capability announcements via `rtiddsspy`.
10. Example agent with functions: `run_test_agent_with_functions.sh` (60s)
    - Requires `OPENAI_API_KEY`. Runs function and letter counter tests, checks RPC traces.
11. Services + CLI sanity: `start_services_and_cli.sh` (90s)
    - Spins up services and does a short lifecycle sanity check; the CLI step is currently commented out.
12. Framework sanity: `test_genesis_framework.sh` (120s)
    - DDS/RPC basic checks; services registration poll via `FunctionRequester`.
13. Monitoring test: `test_monitoring.sh` (60s)
    - Starts multiple calculator services, a monitoring script, and runs a test agent to generate events.
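For quick reference, the order and per-test timeouts listed above can be captured as data. The table below only restates the list; the names `SUITE` and `worst_case_seconds` are illustrative and do not appear in the runner.

```python
# (script, timeout in seconds), in the order the runner executes them.
SUITE = [
    ("run_test_agent_memory.sh", 60),
    ("test_agent_to_agent_communication.py", 120),
    ("run_interface_agent_service_test.sh", 75),
    ("run_math_interface_agent_simple.sh", 60),
    ("run_math.sh", 30),
    ("run_multi_math.sh", 60),
    ("run_simple_agent.sh", 60),
    ("run_simple_client.sh", 60),
    ("test_calculator_durability.sh", 60),
    ("run_test_agent_with_functions.sh", 60),
    ("start_services_and_cli.sh", 90),
    ("test_genesis_framework.sh", 120),
    ("test_monitoring.sh", 60),
]


def worst_case_seconds(suite):
    """Upper bound on wall time if every test runs up to its timeout."""
    return sum(timeout for _script, timeout in suite)


if __name__ == "__main__":
    for index, (script, timeout) in enumerate(SUITE, start=1):
        print(f"{index:2d}. {script} ({timeout}s)")
    print(f"worst case: {worst_case_seconds(SUITE)}s")
```

With the timeouts as listed, the worst case totals 915 seconds, excluding setup and cleanup between tests.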
- All logs under `logs/`. Common files include: `test_agent_memory.log`, `test_sga_pipeline.log`, `test_calc_pipeline.log`, `test_static_interface_pipeline.log`, `rtiddsspy_*.log`, `math_test_agent.log`, `math_test_interface.log`, `serviceside_*.log`, etc.
- On failure, the runner prints the tail of the primary log and related logs (heuristics per test).
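The tail-on-failure behavior can be approximated as follows. This is a minimal sketch, not the runner's implementation: `tail_lines` and `related_logs` are hypothetical helpers, and the glob patterns stand in for whatever per-test heuristics the runner actually applies.

```python
from collections import deque
from pathlib import Path


def tail_lines(log_path, count=20):
    """Return the last `count` lines of a log file without trailing newlines."""
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        # A bounded deque keeps only the final `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]


def related_logs(log_dir, patterns):
    """Collect logs matching glob patterns such as 'rtiddsspy_*.log'."""
    found = []
    for pattern in patterns:
        found.extend(sorted(Path(log_dir).glob(pattern)))
    return found


def print_failure_context(log_dir, primary, patterns, count=20):
    """Print the tail of the primary log, then of each related log."""
    paths = [Path(log_dir) / primary] + related_logs(log_dir, patterns)
    for path in paths:
        if path.is_file():
            print(f"--- last {count} lines of {path.name} ---")
            print("\n".join(tail_lines(path, count)))
```

For example, `print_failure_context("logs", "test_agent_memory.log", ["rtiddsspy_*.log"])` would print the memory test log followed by any spy captures.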
- Global default: 120s; specific tests override (see list above).
- Exit code 124 is treated as timeout.
- Runner stops on first failure, prints relevant logs, cleans up processes, and exits non-zero.
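The timeout and stop-on-first-failure rules above can be sketched like this. The exit code 124 matches the convention of the GNU coreutils `timeout` command; the function names are hypothetical, and this sketch does not clean up grandchild processes the way the real runner's cleanup step is described as doing.

```python
import subprocess
import sys
from typing import Sequence

TIMEOUT_EXIT_CODE = 124  # same value GNU `timeout` returns when time runs out
DEFAULT_TIMEOUT_S = 120  # global default; individual tests may override it


def run_with_timeout(cmd: Sequence[str], timeout_s: float = DEFAULT_TIMEOUT_S) -> int:
    """Run cmd and return its exit code, or 124 if it exceeds timeout_s."""
    try:
        # subprocess.run kills the direct child when the timeout expires.
        return subprocess.run(cmd, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return TIMEOUT_EXIT_CODE


def run_fail_fast(tests) -> int:
    """Run (cmd, timeout_s) pairs in order; stop at the first non-zero exit."""
    for cmd, timeout_s in tests:
        code = run_with_timeout(cmd, timeout_s)
        if code != 0:
            reason = "timed out" if code == TIMEOUT_EXIT_CODE else f"failed with exit code {code}"
            print(f"{' '.join(cmd)} {reason}", file=sys.stderr)
            return code
    return 0
```

Returning the failing exit code unchanged lets a caller distinguish a timeout from an ordinary test failure.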
- Stage ordering (already close): put comprehensive tests first, then branch to targeted subtests if they fail.
- For each comprehensive stage, define 1–3 fast subtests to isolate root cause before exiting.
Suggested triage map:
- If Memory (1) fails:
  - Run a quick DDS health check (spy only) and a minimal memory unit subset with verbose logs.
- If Agent↔Agent (2) fails:
  - Run (3) Interface→Agent→Service pipeline to see if service/RPC path is healthy.
  - If still failing, run (4) Math simple to isolate RPC + discovery without multi-agent complexity.
- If Pipeline (3) fails:
  - Run (4) Math simple and (5) `run_math.sh` to confirm RPC service path.
- If Math simple (4) fails:
  - Run (5) `run_math.sh` as the leanest RPC smoke test.
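One way to encode the triage map is as a lookup from failed stage to an ordered list of subtests, broadest first. The stage and subtest names below are placeholders invented for this sketch, and it simplifies the "if still failing" step by running every mapped subtest and reporting the narrowest one that fails.

```python
# Hypothetical stage and subtest names; ordering is broadest to leanest.
TRIAGE_MAP = {
    "memory": ["dds_health_check", "memory_unit_subset"],
    "agent_to_agent": ["pipeline", "math_simple"],
    "pipeline": ["math_simple", "math_client"],
    "math_simple": ["math_client"],
}


def run_triage(failed_stage, run_subtest):
    """Run each mapped subtest; run_subtest(name) returns True on pass."""
    results = {}
    for name in TRIAGE_MAP.get(failed_stage, []):
        results[name] = run_subtest(name)
    return results


def summarize(failed_stage, results):
    """Produce a one-line triage summary from subtest results."""
    failing = [name for name, passed in results.items() if not passed]
    if not failing:
        return f"{failed_stage}: all subtests passed; suspect the stage itself"
    # Subtests are ordered broad to lean, so the last failure is the narrowest.
    return f"{failed_stage}: narrowest failing subtest is {failing[-1]}"


if __name__ == "__main__":
    # Simulate a pipeline failure where only the leanest RPC check passes.
    outcome = run_triage("pipeline", lambda name: name == "math_client")
    print(summarize("pipeline", outcome))
```

Keeping the map as data means adding a new stage or subtest does not require touching the control flow.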
Implementation notes:
- Add a `--triage` flag or `TRIAGE=true` env to enable subtest fallback.
- Encode subtests as small shell functions in `run_all_tests.sh` with shared log parsing.
- Print a short “triage summary” explaining which subtest failed and likely area (DDS env, discovery, RPC, monitoring).
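The opt-in switch could be resolved as sketched below. The proposal targets shell functions inside `run_all_tests.sh`; Python is used here only to show the intended precedence, and `triage_enabled` is a hypothetical name.

```python
import argparse
import os
import sys


def triage_enabled(argv, env) -> bool:
    """Return True if --triage was passed or TRIAGE=true is set in env."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--triage", action="store_true")
    # parse_known_args ignores unrelated arguments instead of exiting.
    args, _unknown = parser.parse_known_args(argv)
    return args.triage or env.get("TRIAGE", "").strip().lower() == "true"


if __name__ == "__main__":
    print("triage enabled" if triage_enabled(sys.argv[1:], os.environ) else "triage disabled")
```

Either mechanism alone turns triage on, so CI can set the environment variable while a developer passes the flag by hand.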
- Keep `run_scripts/` as the authoritative orchestrator. When repos split:
  - Place the runner and example-heavy scripts in `genesis-examples`.
  - Add a CI shim in `genesis-lib` to invoke the examples runner when available; otherwise run a minimal library `pytest` suite.
- Inventory sub-scripts to categorize as “library unit” vs “example/integration”. Keep example/integration here; mirror pure unit tests into `genesis-lib` gradually.
- Set `DEBUG=true` for extra debug output from the runner.
- Verify DDS quickly: `rtiddsspy -v` and ensure `NDDSHOME` and library paths are correct.
- Clean lingering DDS processes if a prior run crashed.
Some scripts look like tests or utilities but are not invoked by the runner today. See the structure notes below for categorization.
- Baselines and variants: `baseline_test_agent.py`, `baseline_test_interface.py`, `run_baseline_*`, `run_math_interface_agent_test.sh`
- Durability variants: `dev/test_interface_durability.sh`, `dev/test_personal_agent_durability.sh`
- Example/demo: `helpers/interactive_memory_test.py`, `dev/run_interactive_memory_test.sh`, `dev/run_example_agent1*.sh`
- Utilities: `helpers/interface_keepalive.py`, `dev/run_math_service.sh`, `helpers/simpleGenesisInterfaceCLI.py`
- Older orchestrators: `run_tests.sh`, `test_genesis_complete.sh`
These are candidates to either (a) fold into the fail-fast triage path, (b) remain as documented dev utilities, or (c) move to the examples repo during the split.
- `run_scripts/`: top-level orchestrators only (and docs).
- `run_scripts/active/`: scripts invoked by `run_all_tests.sh` and/or `run_triage_suite.sh`.
- `run_scripts/helpers/`: Python helpers and drivers used by the runners. Includes:
  - `baseline_test_agent.py`
  - `baseline_test_interface.py`
  - `interactive_memory_test.py`
  - `interface_keepalive.py`
  - `math_test_agent.py`
  - `math_test_interface.py`
  - `simpleGenesisAgent.py`
  - `simpleGenesisInterfaceCLI.py`
  - `simpleGenesisInterfaceStatic.py`
  - `test_agent.py`
  - `test_agent_memory.py`
  - `comprehensive_multi_agent_test_interface.py`

All referencing scripts have been updated to use `run_scripts/helpers/...` paths.
- `run_all_tests.sh`: Full suite orchestrator invoked before merges.
- `run_triage_suite.sh`: Fail-fast orchestrator used in debugging.
- Core tests used by both suites (now under `active/`):
  - `active/run_test_agent_memory.sh`
  - `active/test_agent_to_agent_communication.py`
  - `active/run_interface_agent_service_test.sh`
  - `active/run_math_interface_agent_simple.sh`
  - `active/run_math.sh`
  - `active/run_multi_math.sh`
  - `active/run_simple_agent.sh`
  - `active/run_simple_client.sh`
  - `active/test_calculator_durability.sh`
  - `active/test_monitoring_graph_state.py`
  - `active/test_monitoring_interface_agent_pipeline.py`
  - `active/test_monitoring.sh` (optional when no `OPENAI_API_KEY`)
  - `active/test_viewer_contract.py`
Other groupings:
- `run_scripts/dev/`: development or ad-hoc tests (e.g., `run_math_interface_agent_test.sh`, `run_example_agent1*.sh`, `run_baseline_*`, `limited_mesh_test.sh`, `test_interface_durability.sh`).
- `run_scripts/legacy/`: older orchestrators retained for reference (`run_tests.sh`, `test_genesis_complete.sh`).
If relocating any of these, update both orchestrators accordingly.
- `test_monitoring_graph_state.py`: Validates GraphMonitor/GraphState invariants using `CalculatorService`:
  - One unique node per endpoint (service and each function)
  - `SERVICE→FUNCTION` edges exist for advertised functions
  - For each function call, a `BUSY→READY` pair is observed on the service node (closed reply)

Run: `python run_scripts/test_monitoring_graph_state.py`

Requirements: DDS installed (`NDDSHOME`), Python 3.10.
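The three invariants can be illustrated on plain data structures. This sketch does not use the real GraphMonitor/GraphState API; the node ids, edge tuples, and state strings are assumed representations chosen for the example.

```python
def nodes_are_unique(node_ids):
    """Invariant 1: every endpoint appears as exactly one node."""
    return len(node_ids) == len(set(node_ids))


def service_function_edges_exist(service_id, function_ids, edges):
    """Invariant 2: a (service, function) edge exists per advertised function."""
    return all((service_id, function_id) in edges for function_id in function_ids)


def count_busy_ready_pairs(states):
    """Invariant 3: count closed BUSY→READY pairs on one node.

    Raises ValueError if a BUSY is never closed or is followed by another BUSY.
    A READY with no preceding BUSY (for example an initial state) is ignored.
    """
    pairs = 0
    busy = False
    for state in states:
        if state == "BUSY":
            if busy:
                raise ValueError("BUSY observed while already BUSY")
            busy = True
        elif state == "READY" and busy:
            pairs += 1
            busy = False
    if busy:
        raise ValueError("BUSY was never followed by READY")
    return pairs


if __name__ == "__main__":
    edges = {("calc", "add"), ("calc", "sub")}
    print(nodes_are_unique(["calc", "add", "sub"]))
    print(service_function_edges_exist("calc", ["add", "sub"], edges))
    print(count_busy_ready_pairs(["READY", "BUSY", "READY", "BUSY", "READY"]))
```

A test would then assert that the pair count equals the number of function calls it issued.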
- Use `run_triage_suite.sh` to run: Memory → Agent↔Agent → Pipeline → Monitoring.
- On failure, it runs targeted subtests in order to isolate likely causes:
  - For Agent↔Agent: Pipeline → Math Simple → Math Client
  - For Pipeline: Math Simple → Math Client
- Includes an advisory DDS writer sweep filtered to Genesis topics, logging warnings instead of aborting.
- Keeps `run_all_tests.sh` unchanged.
Monitoring
- Included by default as Stage 4.
- 4a: `test_monitoring_graph_state.py` (no API keys) validates GraphMonitor/GraphState invariants (unique nodes, service→function edges, BUSY→READY pairing for a call).
- 4a.2: `test_monitoring_interface_agent_pipeline.py` runs the interface→agent→service pipeline and asserts INTERFACE→AGENT edge plus INTERFACE_REQUEST_START→COMPLETE activity pairing.
- 4b: `test_monitoring.sh` (requires `OPENAI_API_KEY`) exercises the heavier monitoring path; skipped with a warning if key is missing.
Viewer Contract
- `test_viewer_contract.py` validates that the library exports a stable viewer topology JSON:
  - Converts an in-memory graph (GenesisNetworkGraph) to viewer JSON via `genesis_lib.viewer_export`.
  - Validates structure and required fields (nodes, edges, timestamp, version).
  - Enforces a small back-compat gate on counts and required fields.
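A contract check of this kind might look like the sketch below. It does not call `genesis_lib.viewer_export`; it only validates a JSON string against the four required fields named above, and it assumes `nodes` and `edges` are arrays. The function name and the minimum-count parameters are hypothetical.

```python
import json

REQUIRED_FIELDS = ("nodes", "edges", "timestamp", "version")


def validate_viewer_json(text, min_nodes=0, min_edges=0):
    """Return a list of contract problems; an empty list means the JSON passes."""
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(doc, dict):
        return ["top-level value is not an object"]

    problems = [f"missing field: {key}" for key in REQUIRED_FIELDS if key not in doc]
    # Back-compat gate: counts must not drop below the expected minimums.
    for key, minimum in (("nodes", min_nodes), ("edges", min_edges)):
        if key not in doc:
            continue
        value = doc[key]
        if not isinstance(value, list):
            problems.append(f"{key} is not a list")
        elif len(value) < minimum:
            problems.append(f"{key} has {len(value)} entries, expected at least {minimum}")
    return problems


if __name__ == "__main__":
    sample = json.dumps({"nodes": [{"id": "calc"}], "edges": [], "timestamp": 0, "version": "1"})
    print(validate_viewer_json(sample, min_nodes=1))
```

Returning a list of problems rather than raising on the first one makes a failed contract easier to diagnose in a single run.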
- `mcp/`: Optional MCP server to run tests from Cursor/assistants.
- `mcp/test_runner_server.py` provides tools: `preflight`, `run_triage`, `run_all_tests`, `run_active_test`, `tail_log`, `sweep_dds`.
- `.cursor/mcp.json` registers the server to run via your project `venv`.
- Ensure venv exists and install the `mcp` package if missing:
  - `source venv/bin/activate && pip install mcp`
- Start automatically in Cursor: `.cursor/mcp.json` runs `mcp/start_server.sh`, which:
  - sources `.env` (`NDDSHOME`, API keys), runs `mcp/preflight.sh`, then starts the server if OK.
  - If preflight fails, fix issues and retry.
- Tools available:
  - `preflight` → prints Python, `NDDSHOME`, `rtiddsspy` path, API keys presence.
  - `run_triage` / `run_all_tests` → runs the suites, returns exit code and stdout/stderr.
  - `run_active_test {name}` → runs a single script from `run_scripts/active/`.
  - `tail_log {filename}` → returns the tail of a file under `logs/`.
  - `sweep_dds` → runs an advisory `rtiddsspy -printSample` sweep.
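Since `tail_log` takes a filename from the caller and is described as reading only under `logs/`, a tool like it would typically confine the resolved path to that directory. The sketch below shows one way to do that; it is an assumption about how such a guard could work, not a description of `mcp/test_runner_server.py`, and `resolve_log_path` is a hypothetical helper.

```python
from pathlib import Path


def resolve_log_path(logs_dir, filename):
    """Resolve filename inside logs_dir, rejecting paths that escape it."""
    base = Path(logs_dir).resolve()
    # Joining with an absolute filename discards base, so the check below
    # also rejects absolute paths outside the logs directory.
    target = (base / filename).resolve()
    if not target.is_relative_to(base):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"{filename!r} is outside {base}")
    return target


if __name__ == "__main__":
    print(resolve_log_path("logs", "test_agent_memory.log"))
    try:
        resolve_log_path("logs", "../.env")
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Resolving both paths before comparing them means `..` segments and symlinks cannot be used to read files such as `.env` through the tool.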
(c) 2025 Copyright, Real-Time Innovations, Inc. (RTI) All rights reserved.
RTI grants Licensee a license to use, modify, compile, and create derivative works of the Software. Licensee has the right to distribute object form only for use with RTI products. The Software is provided "as is", with no warranty of any type, including any warranty for fitness for any purpose. RTI is under no obligation to maintain or support the Software. RTI shall not be liable for any incidental or consequential damages arising out of the use or inability to use the software.