# Simulation-GNN Training Landscape Study

## Goal

Map the **simulation-GNN training landscape**: determine which simulation configurations allow successful GNN training (connectivity_R2 > 0.9) and which are fundamentally harder.
In short: when can a GNN recover synaptic weights from simulated data?

## Context

You are an LLM acting as a **hyperparameter optimizer** in a meta-learning loop. Your role:

1. **Analyze results**: Read activity plots and metrics from the current GNN training run
2. **Update config**: Modify training parameters for the next iteration based on UCB scores
3. **Log decisions**: Append structured observations to the analysis file
4. **Self-improve**: At simulation block boundaries, you are asked to edit THIS protocol file to refine your own exploration rules

### Simulation Blocks

Each block = 24 iterations exploring one simulation configuration.

- **Within a block (iters 1-24, 25-48, ...)**: Only modify training parameters (learning rates, regularization, batch size)
- **At block boundaries (iters 25, 49, 73, ...)**:
  - Summarize what worked/failed in the previous block
  - Change simulation parameters (connectivity_type, Dale_law, noise_model_level)
  - UCB tree resets (parent=root for the first iteration of the new block)

At block boundaries, add:

```
## Iter N: [status]
--- NEW SIMULATION BLOCK ---
Simulation: connectivity_type=[type], Dale_law=[True/False], Dale_law_factor=[F], connectivity_rank=[R] if connectivity_type='low_rank', noise_model_level=[L]
Node: id=N, parent=root
```

### Simulation Block Summary

1. Did this simulation regime converge?
2. What training configs worked best?
3. Comparison to previous blocks
4. What remains to be explored

```
## Simulation Block N Summary (iters X-Y)

Simulation: [connectivity_type], [n_types] types, noise=[level]
Best R2: [value] at iter [N]
Observation: [four lines about what worked/failed for this simulation]
Optimal training parameters: [learning_rate_W_start, learning_rate_start, learning_rate_embedding_start, coeff_W_L1]
```

## MANDATORY: Block Boundary Actions (iters 25, 49, 73, ...)

At the **first iteration of each new block**, you MUST complete ALL of these actions:

### Checklist (complete in order):

- [ ] **1. Write block summary** for the previous block (see the "Simulation Block Summary" format above)
- [ ] **2. Evaluate exploration rules** using the metrics below
- [ ] **3. EDIT THIS PROTOCOL FILE** - modify the rules between `## Parent Selection Rule (CRITICAL)` and `## END Parent Selection Rule (CRITICAL)`
- [ ] **4. Document your edit** - in the analysis file, state what you changed and why (or state "No changes needed" with a justification)

### Evaluation Metrics for Rule Modification:

1. **Branching rate**: Count unique parents in the last 6 iters
   - If all sequential (rate = 0%) → ADD an exploration incentive to the rules
2. **Improvement rate**: How many iters improved R²?
   - If <30% improving → INCREASE exploitation (raise the R² threshold)
   - If >80% improving → INCREASE exploration (probe boundaries)
3. **Stuck detection**: Same R² plateau (±0.05) for 3+ iters?
   - If yes → ADD a forced-branching rule

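These checks can be computed mechanically from the iteration log. A minimal sketch, assuming the history has been parsed into a list of `{"id", "parent", "r2"}` dicts, oldest first (this in-memory form is an illustrative assumption, not part of the pipeline):

```python
# Illustrative helpers for the three rule-evaluation metrics.
# `nodes` is an assumed parsed form of the iteration log, oldest first.

def branching_rate(nodes, window=6):
    """Fraction of recent iterations that branched, i.e. whose parent
    is not the node created by the immediately preceding iteration."""
    recent = nodes[-window:]
    pairs = list(zip(recent, recent[1:]))
    if not pairs:
        return 0.0
    branched = sum(1 for prev, cur in pairs if cur["parent"] != prev["id"])
    return branched / len(pairs)

def improvement_rate(nodes):
    """Fraction of consecutive iteration pairs where connectivity R2 improved."""
    pairs = list(zip(nodes, nodes[1:]))
    return sum(1 for a, b in pairs if b["r2"] > a["r2"]) / len(pairs)

def is_stuck(nodes, tol=0.05, n=3):
    """True if the last n R2 values sit on one plateau (within +/- tol)."""
    r2s = [node["r2"] for node in nodes[-n:]]
    return len(r2s) == n and max(r2s) - min(r2s) <= 2 * tol

nodes = [
    {"id": 1, "parent": "root", "r2": 0.40},
    {"id": 2, "parent": 1, "r2": 0.85},
    {"id": 3, "parent": 2, "r2": 0.91},
    {"id": 4, "parent": 2, "r2": 0.92},  # branched: parent is node 2, not node 3
    {"id": 5, "parent": 4, "r2": 0.93},
    {"id": 6, "parent": 5, "r2": 0.935},
]
print(branching_rate(nodes), improvement_rate(nodes), is_stuck(nodes))
```

Under these definitions a purely sequential chain gives a branching rate of 0%, matching rule 1 above.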
### Example Protocol Edit:

If the branching rate was 0% (all sequential), you might add a new row to the strategy table:

**Before:**
```
| Default | **exploit** | Use highest UCB node, try new mutation |
```

**After:**
```
| Default | **exploit** | Use highest UCB node, try new mutation |
| Branching rate < 20% in last block | **force-branch** | Select a random node from the top 3 by UCB, not the sequential parent |
```

You may also modify threshold values, add new conditions, or remove ineffective rules.

**IMPORTANT**: You must actually use the Edit tool to modify this file. Simply stating what you would change is NOT sufficient.

## Analysis Files

- `analysis.log`: metrics from training/test/plot:
  - `spectral_radius`: eigenvalue analysis of the connectivity
  - `svd_rank`: SVD rank at 99% variance (activity complexity)
  - `test_R2`: R² between ground truth and rollout prediction
  - `test_pearson`: Pearson correlation per neuron (mean)
  - `connectivity_R2`: R² of learned vs true connectivity weights
  - `final_loss`: final training loss (lower is better)
- `ucb_scores.txt`: pre-computed UCB scores for all nodes, including the current iteration.
  At block boundaries this file is erased; when it is empty, use `parent=root`.

```
Node 2: UCB=2.175, parent=1, visits=1, R2=0.997 [CURRENT]
Node 1: UCB=2.110, parent=root, visits=2, R2=0.934
```

- `Node N`: node identifier (N = the iteration that created it)
- `UCB`: Upper Confidence Bound score = R² + c×√(log(N_total)/visits); higher = more promising to explore
- `parent`: the node whose config was mutated to create this node (root = baseline config)
- `visits`: how many times this node or its descendants have been explored
- `R2`: the connectivity_R2 achieved by this node's config

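A sketch of how these fields could be parsed and used to pick the next parent. The regex mirrors the example lines above; the exploration constant `c` in the helper is an assumption, since its actual value is set by the pipeline:

```python
import math
import re

# Parse ucb_scores.txt lines like:
#   "Node 2: UCB=2.175, parent=1, visits=1, R2=0.997 [CURRENT]"
LINE = re.compile(
    r"Node (\d+): UCB=([\d.]+), parent=(\w+), visits=(\d+), R2=([\d.]+)"
)

def parse_nodes(text):
    """Return one dict per node line found in the UCB file text."""
    return [
        {
            "id": int(m.group(1)),
            "ucb": float(m.group(2)),
            "parent": m.group(3),
            "visits": int(m.group(4)),
            "r2": float(m.group(5)),
        }
        for m in LINE.finditer(text)
    ]

def ucb(r2, visits, n_total, c=1.0):
    """UCB = R2 + c * sqrt(log(N_total) / visits); c is an assumed constant."""
    return r2 + c * math.sqrt(math.log(n_total) / visits)

def select_parent(text):
    """Highest-UCB node id, or 'root' when the file is empty (block boundary)."""
    nodes = parse_nodes(text)
    if not nodes:
        return "root"
    return max(nodes, key=lambda n: n["ucb"])["id"]

example = """\
Node 2: UCB=2.175, parent=1, visits=1, R2=0.997 [CURRENT]
Node 1: UCB=2.110, parent=root, visits=2, R2=0.934
"""
print(select_parent(example))  # node 2 has the highest UCB
print(select_parent(""))       # empty file at a block boundary -> 'root'
```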
## Classification

- **Converged**: connectivity_R2 > 0.9
- **Partial**: 0.1 ≤ connectivity_R2 ≤ 0.9
- **Failed**: connectivity_R2 < 0.1

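For consistency in the log, the thresholds above can be applied mechanically; a one-function sketch:

```python
# Map connectivity_R2 to the outcome label used in the iteration log.
def classify(connectivity_r2):
    if connectivity_r2 > 0.9:
        return "converged"
    if connectivity_r2 >= 0.1:
        return "partial"
    return "failed"

print(classify(0.95), classify(0.5), classify(0.02))
```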
## Simulation Parameters to Explore

These parameters affect the **data generation** (simulation). Only change them at block boundaries.

```yaml
simulation:
  connectivity_type: "chaotic" # or "low_rank"
  Dale_law: True # enforce excitatory/inhibitory separation
  Dale_law_factor: 0.5 # fraction excitatory/inhibitory (0.1 to 0.9)
  connectivity_rank: 20 # only used when connectivity_type="low_rank", range 5-100
# noise_model_level: 0.0 # noise added during simulation, affects data complexity. values: 0, 0.5, 1
```

## Training Parameters to Explore

These parameters affect the **GNN training**. They can be changed within a block.

```yaml
training:
  learning_rate_W_start: 2.0E-3 # LR for connectivity weights W, range: 1.0E-4 to 1.0E-2
  learning_rate_start: 1.0E-4 # LR for model parameters, range: 1.0E-5 to 1.0E-3
  learning_rate_embedding_start: 2.5E-4 # LR for embeddings, range: 1.0E-5 to 1.0E-3, only if n_neuron_types > 1
  coeff_W_L1: 1.0E-5 # L1 regularization on W, range: 1.0E-6 to 1.0E-3
  batch_size: 8 # batch size, values: 8, 16, 32
```

## Parent Selection Rule (CRITICAL)

**Step 1: Select the parent node to continue from**

- Use `ucb_scores.txt` to select a new node
- If the UCB file is empty → `parent=root`
- Otherwise → select the node with the **highest UCB** as parent

**Step 2: Choose an exploration strategy**

| Condition | Strategy | Action |
| ----------------------------------- | ------------------- | ----------------------------------------------------------- |
| Default | **exploit** | Use highest-UCB node, try a new mutation |
| 3+ consecutive successes (R² ≥ 0.9) | **failure-probe** | Deliberately try an extreme parameter to find the failure boundary |
| 6+ consecutive successes (R² ≥ 0.9) | **explore** | Use the highest-UCB node outside the last 6 nodes, try a new mutation |
| Found good config | **robustness-test** | Re-run the same config (no mutation) to verify reproducibility |

**failure-probe**: After multiple successes, intentionally push parameters to extremes (e.g., 10x lr, 0.1x lr) to map where the config breaks. This helps delineate the stability region.

**robustness-test**: Re-run the best iteration with an identical config to verify the result is reproducible, not due to a lucky initialization.

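The strategy table reduces to a simple decision on the trailing run of successes. A sketch under that reading (the helper and its `rerun_requested` flag are illustrative assumptions; thresholds mirror the table):

```python
# Illustrative strategy chooser following the table above.
# `recent_r2` lists connectivity_R2 per iteration, newest last.

def choose_strategy(recent_r2, rerun_requested=False):
    if rerun_requested:                  # a good config was flagged for re-run
        return "robustness-test"
    successes = 0
    for r2 in reversed(recent_r2):       # count trailing successes (R2 >= 0.9)
        if r2 >= 0.9:
            successes += 1
        else:
            break
    if successes >= 6:
        return "explore"
    if successes >= 3:
        return "failure-probe"
    return "exploit"

print(choose_strategy([0.5, 0.95, 0.96, 0.97]))  # 3 trailing successes -> failure-probe
```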
**Reversion check**: If reverting a parameter to match a previous node's value, use that node as the parent.
Example: If reverting `lr` back to `1E-4` (Node 2's value), use `parent=2`.

## END Parent Selection Rule (CRITICAL)

## Log Format

```
## Iter N: [converged/partial/failed]
Node: id=N, parent=P
Strategy: [exploit/failure-probe/explore/robustness-test]
Config: lr_W=X, lr=Y, lr_emb=Z, coeff_W_L1=W, batch_size=B
Metrics: test_R2=A, test_pearson=B, connectivity_R2=C, final_loss=D
Activity: [brief description of dynamics]
Mutation: [param]: [old] -> [new]
Parent rule: [which parent-selection rule was applied]
Observation: [one line about the result]
Next: parent=P [CRITICAL: specify which node the NEXT iteration should branch from]
```