diff --git a/skills/dpgen-simplify/SKILL.md b/skills/dpgen-simplify/SKILL.md
new file mode 100644
index 000000000..e5cea1d81
--- /dev/null
+++ b/skills/dpgen-simplify/SKILL.md
@@ -0,0 +1,341 @@
+---
+name: dpgen-simplify
+description: Prepare, explain, validate, and run DP-GEN simplify workflows for reducing repeated or redundant structures in DeePMD datasets. Use when the user wants to generate or modify `param.json` and `machine.json`, run `dpgen simplify param.json machine.json`, organize repeated simplify experiments, or inspect simplify outputs.
+compatibility: Requires a runnable environment with Python and an activated DP-GEN runtime where `dpgen` is available in PATH for the outer simplify command. Real execution also requires DeePMD-kit and any backend-specific software required by the selected `fp_style`. For scheduler execution, each stage environment must be explicitly activated in `resources.source_list`.
+license: LGPL-3.0-or-later
+metadata:
+  author: hyb1109
+  version: 0.2.0
+  repository: https://github.com/deepmodeling/dpgen
+---
+
+# DP-GEN Simplify
+
+Use this skill when the user wants to prepare, explain, validate, or execute the `dpgen simplify` workflow.
+
+This skill is for dataset simplification workflows where the user already has candidate data in a DeePMD-compatible format and wants to reduce repeated or redundant structures through iterative selection.
+
+## Core Rule (Critical)
+
+DP-GEN simplify always uses **two parameter classes** and therefore **two JSON files**:
+
+- **Workflow parameters** -> `param.json`
+- **Execution / machine parameters** -> `machine.json`
+
+Run exactly:
+
+```bash
+dpgen simplify param.json machine.json
+```
+
+Environment boundary rule:
+
+- Outer layer: run `dpgen simplify param.json machine.json` in an activated environment where `dpgen --version` works.
+- Inner layer: for scheduler stages, activate the runtime environment explicitly through `resources.source_list`, which takes effect on the server side.
+
+## Agent responsibilities
+
+When using this skill, the agent should:
+
+1. confirm that the task is a simplify workflow
+1. check whether existing configs or templates are already available
+1. collect only the missing dataset, training, FP, and machine inputs
+1. generate or patch `param.json`
+1. generate or patch `machine.json`
+1. explain important simplify parameters in plain language when asked
+1. validate the workflow before execution
+1. provide the exact command for running simplify
+1. if requested, help structure repeated experiments
+1. after execution, summarize outputs and next inspection targets
+
+## Working policy
+
+### 1. Ask only for missing inputs
+
+Do not ask the user for everything if part of the configuration is already available.
+
+If the user already provides:
+
+- a partial `param.json`
+- a partial `machine.json`
+- a known training template
+- a known cluster template
+
+then patch those files instead of rebuilding everything from scratch.
+
+### 2. Preserve the user's scientific choices
+
+Do not silently change:
+
+- descriptor family
+- fitting net structure
+- fp backend
+- trust thresholds
+- `type_map` ordering
+
+If a value looks scientifically questionable, explain the concern instead of silently replacing it.
+
+### 3. Keep local and scheduler execution explicit
+
+If the user wants local execution, produce local-friendly commands.
+
+If the user wants scheduler execution, produce scheduler-friendly commands and keep queue, partition, and resource requests explicit.
+
+Do not invent scheduler module names or executable paths.
+
+### 4. Do not invent environment activation commands
+
+If the user already has a working activation command such as:
+
+- `conda activate ...`
+- `module load ...`
+- `source ...`
+
+reuse it exactly.
+
+If execution is requested and the activation method is unknown, ask the user for the precise activation command.
+
+Do not guess conda environment names, module names, or site-specific paths.
+
+### 4.1 Outer launcher policy
+
+Use an activated DP-GEN environment and verify with:
+
+```bash
+dpgen --version
+```
+
+Do not start simplify from a shell where `dpgen` is unavailable.
+
+### 4.2 Outer vs inner runtime boundaries (critical)
+
+Treat simplify execution as two separate environment layers:
+
+1. Outer layer: the shell that launches `dpgen simplify param.json machine.json` (must have `dpgen` in PATH)
+1. Inner layer: the stage tasks (`train` / `model_devi` / `fp`) that DP-GEN dispatches to the server or runtime side
+
+Even if the outer layer is correct, inner stage tasks still need explicit runtime setup in `machine.json`.
+Do not assume the outer shell environment will be inherited by dispatched stage jobs.
+For scheduler-style execution, `resources.source_list` must explicitly activate the required runtime environment.
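+
+As a hedged illustration of the inner layer, the `resources` block of one scheduler stage might look like the fragment below. The script path is a hypothetical placeholder, not a known site path, and the numeric values are illustrative only. The `_comment` key is an annotation for this sketch and should be dropped from a real file. Fields not shown (such as `cpu_per_node` and `queue_name`) stay as in `assets/machine.template.server-local-slurm.json`.
+
+```json
+{
+    "resources": {
+        "_comment": "hypothetical path; replace with the user's own environment script",
+        "number_node": 1,
+        "gpu_per_node": 1,
+        "group_size": 1,
+        "source_list": ["/path/to/user/env.sh"]
+    }
+}
+```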
+
+### 5. Prefer reproducible output layout
+
+When generating a simplify workflow, keep files organized and predictable.
+
+Recommended structure:
+
+```text
+project/
+├── param.json
+├── machine.json
+├── run.sh
+├── logs/
+└── summary/
+```
+
+For repeated experiments:
+
+```text
+project/
+├── base/
+├── exp_01/
+├── exp_02/
+├── exp_03/
+└── summary/
+```
+
+## Minimum required inputs
+
+Collect the following information before generating files.
+
+### Dataset information
+
+- `pick_data`
+- `sys_configs`
+- `init_data_prefix`
+- `init_data_sys`
+- `sys_batch_size`
+- dataset format
+- `type_map`
+- `mass_map` if needed
+- `labeled`
+
+### Simplify controls
+
+- `init_pick_number`
+- `iter_pick_number`
+- `model_devi_f_trust_lo`
+- `model_devi_f_trust_hi`
+- `model_devi_e_trust_lo` / `model_devi_e_trust_hi` if energy trust is used
+- `numb_models` if not already specified
+
+### Training setup
+
+- `train_backend` if required by environment (for example `pytorch`)
+- `default_training_param`
+  - descriptor settings
+  - fitting network settings
+  - learning rate settings
+  - loss settings
+  - training step settings
+
+### FP setup
+
+- `fp_style`
+- If data is already labeled (energy/force/virial available) and no re-labeling is requested, set `fp_style` to `none`.
+- if `fp_style != "none"`, collect matching FP runtime settings such as:
+  - `fp_task_max`
+  - `fp_task_min`
+  - `fp_params`
+  - pseudopotential or backend file paths if required
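+
+For the already-labeled case, the FP-related part of `param.json` reduces to a short sketch like the one below. It reuses the defaults found in `assets/param.template.json`; every other field is omitted here and still has to be filled in.
+
+```json
+{
+    "labeled": true,
+    "fp_style": "none"
+}
+```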
+
+### Execution setup
+
+For each stage `train`, `model_devi`, and `fp`, collect or preserve:
+
+- `command`
+- `machine.batch_type`
+- `machine.context_type`
+- `machine.local_root`
+- `machine.remote_root`
+- `resources.number_node`
+- `resources.cpu_per_node`
+- `resources.gpu_per_node`
+- `resources.group_size`
+- `resources.source_list` (required for scheduler jobs; list the environment scripts each job must source so that its runtime is activated explicitly)
+- any explicit queue / partition / custom scheduler flags if the user already uses them
+
+Choose a runtime profile first, then fill the matching template:
+
+- server-local Slurm: `assets/machine.template.server-local-slurm.json`
+- local machine -> remote Slurm via SSH: `assets/machine.template.ssh-remote-slurm.json`
+- pure local shell testing: `assets/machine.template.local-shell.json`
+
+## How to build `param.json`
+
+Construct `param.json` around these logical blocks:
+
+1. element and mass definitions
+1. data source and batch settings
+1. model ensemble count
+1. default DeePMD training parameters
+1. FP backend settings
+1. simplify pick settings
+1. trust thresholds
+
+Key fields usually include:
+
+- `type_map`
+- `mass_map`
+- `pick_data`
+- `init_data_prefix`
+- `init_data_sys`
+- `sys_batch_size`
+- `numb_models`
+- `default_training_param`
+- `fp_style`
+- `shuffle_poscar`
+- `fp_task_max`
+- `fp_task_min`
+- `fp_pp_path`
+- `fp_pp_files`
+- `fp_params`
+- `init_pick_number`
+- `iter_pick_number`
+- `model_devi_f_trust_lo`
+- `model_devi_f_trust_hi`
+
+If the user is doing grid experiments, keep a base template and derive variants from it.
+
+Official reference example (QM7-style, adapted with path placeholders):
+
+- `assets/param.example.qm7.from-official-docs.json`
+
+## How to build `machine.json`
+
+Construct `machine.json` with separate stage blocks for:
+
+- `train`
+- `model_devi`
+- `fp`
+
+For each stage, keep the following explicit:
+
+- `command`
+- machine or context configuration
+- resources
+- queue or partition if needed
+- cpu and gpu counts
+- custom scheduler flags
+- environment activation commands
+
+Do not merge all stages into one vague machine block.
+
+## Validation before run
+
+Before execution, validate the workflow in this order:
+
+1. confirm outer-layer `dpgen` is available:
+
+```bash
+dpgen --version
+```
+
+2. validate JSON syntax:
+
+```bash
+python -m json.tool param.json
+python -m json.tool machine.json
+```
+
+3. verify required dataset paths exist
+1. verify stage commands match the selected software stack
+1. if `fp_style` is `none`, do not require FP-specific backend settings
+1. only then run:
+
+```bash
+dpgen simplify param.json machine.json
+```
+
+## Output contract
+
+Always provide:
+
+1. final absolute paths to `param.json` and `machine.json`
+1. the exact simplify command to run (`dpgen simplify param.json machine.json`)
+1. a short pre-run checklist
+1. any unresolved required fields
+1. if execution was performed, the main output locations and next files to inspect
+
+## Guardrails
+
+- Never merge workflow and machine parameters into one file.
+- Never run `dpgen simplify` before both JSON files are present.
+- Never hardcode personal cluster, account, queue, or path settings as universal defaults.
+- Never silently change the user's scientific choices.
+- Keep `type_map` ordering consistent with dataset typing.
+- If required inputs are missing, stop and ask instead of guessing.
+- If `fp_style` is `none`, skip FP-specific prompts and keep FP-specific settings disabled or unset.
+- If data is already labeled and the user does not request new labels, enforce `fp_style = "none"` and do not require active FP runtime fields.
+- Do not assume outer-shell activation is inherited by stage jobs; for scheduler execution, require explicit `source_list` per stage.
+- If the user already has working templates, patch them rather than overwriting them blindly.
+
+## References and bundled files
+
+Use these bundled files:
+
+- `assets/param.template.json`
+- `assets/param.example.qm7.from-official-docs.json`
+- `assets/machine.template.json`
+- `assets/machine.template.server-local-slurm.json`
+- `assets/machine.template.ssh-remote-slurm.json`
+- `assets/machine.template.local-shell.json`
+- `references/param-fields.md`
+- `references/machine-fields.md`
+- `references/workflow-notes.md`
+
+External references:
+
+- DP-GEN simplify overview: https://docs.deepmodeling.com/projects/dpgen/en/latest/simplify/simplify.html
+- simplify parameter definitions: https://docs.deepmodeling.com/projects/dpgen/en/latest/simplify/simplify-jdata.html
+- simplify machine definitions: https://docs.deepmodeling.com/projects/dpgen/en/latest/simplify/simplify-mdata.html
diff --git a/skills/dpgen-simplify/assets/machine.template.json b/skills/dpgen-simplify/assets/machine.template.json
new file mode 100644
index 000000000..e2ddb386d
--- /dev/null
+++ b/skills/dpgen-simplify/assets/machine.template.json
@@ -0,0 +1,49 @@
+{
+    "api_version": "1.0",
+    "deepmd_version": "2.0",
+    "train": {
+        "command": "dp",
+        "machine": {
+            "batch_type": null,
+            "context_type": null,
+            "local_root": "./",
+            "remote_root": null
+        },
+        "resources": {
+            "number_node": null,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "group_size": null
+        }
+    },
+    "model_devi": {
+        "command": "dp",
+        "machine": {
+            "batch_type": null,
+            "context_type": null,
+            "local_root": "./",
+            "remote_root": null
+        },
+        "resources": {
+            "number_node": null,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "group_size": null
+        }
+    },
+    "fp": {
+        "command": null,
+        "machine": {
+            "batch_type": null,
+            "context_type": null,
+            "local_root": "./",
+            "remote_root": null
+        },
+        "resources": {
+            "number_node": null,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "group_size": null
+        }
+    }
+}
diff --git a/skills/dpgen-simplify/assets/machine.template.local-shell.json b/skills/dpgen-simplify/assets/machine.template.local-shell.json
new file mode 100644
index 000000000..6da1ca260
--- /dev/null
+++ b/skills/dpgen-simplify/assets/machine.template.local-shell.json
@@ -0,0 +1,49 @@
+{
+    "api_version": "1.0",
+    "deepmd_version": "2.0",
+    "train": {
+        "command": "dp",
+        "machine": {
+            "batch_type": "Shell",
+            "context_type": "LazyLocalContext",
+            "local_root": "./",
+            "remote_root": "./"
+        },
+        "resources": {
+            "number_node": 1,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "group_size": 1
+        }
+    },
+    "model_devi": {
+        "command": "dp",
+        "machine": {
+            "batch_type": "Shell",
+            "context_type": "LazyLocalContext",
+            "local_root": "./",
+            "remote_root": "./"
+        },
+        "resources": {
+            "number_node": 1,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "group_size": 1
+        }
+    },
+    "fp": {
+        "command": null,
+        "machine": {
+            "batch_type": "Shell",
+            "context_type": "LazyLocalContext",
+            "local_root": "./",
+            "remote_root": "./"
+        },
+        "resources": {
+            "number_node": 1,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "group_size": 1
+        }
+    }
+}
diff --git a/skills/dpgen-simplify/assets/machine.template.server-local-slurm.json b/skills/dpgen-simplify/assets/machine.template.server-local-slurm.json
new file mode 100644
index 000000000..6840701b4
--- /dev/null
+++ b/skills/dpgen-simplify/assets/machine.template.server-local-slurm.json
@@ -0,0 +1,58 @@
+{
+    "api_version": "1.0",
+    "deepmd_version": "2.0",
+    "train": {
+        "command": "dp",
+        "machine": {
+            "batch_type": "Slurm",
+            "context_type": "LocalContext",
+            "local_root": "./",
+            "remote_root": null
+        },
+        "resources": {
+            "number_node": null,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "queue_name": null,
+            "group_size": 1,
+            "custom_flags": [],
+            "source_list": []
+        }
+    },
+    "model_devi": {
+        "command": "dp",
+        "machine": {
+            "batch_type": "Slurm",
+            "context_type": "LocalContext",
+            "local_root": "./",
+            "remote_root": null
+        },
+        "resources": {
+            "number_node": null,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "queue_name": null,
+            "group_size": 1,
+            "custom_flags": [],
+            "source_list": []
+        }
+    },
+    "fp": {
+        "command": null,
+        "machine": {
+            "batch_type": "Slurm",
+            "context_type": "LocalContext",
+            "local_root": "./",
+            "remote_root": null
+        },
+        "resources": {
+            "number_node": null,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "queue_name": null,
+            "group_size": 1,
+            "custom_flags": [],
+            "source_list": []
+        }
+    }
+}
diff --git a/skills/dpgen-simplify/assets/machine.template.ssh-remote-slurm.json b/skills/dpgen-simplify/assets/machine.template.ssh-remote-slurm.json
new file mode 100644
index 000000000..70fa45e95
--- /dev/null
+++ b/skills/dpgen-simplify/assets/machine.template.ssh-remote-slurm.json
@@ -0,0 +1,76 @@
+{
+    "api_version": "1.0",
+    "deepmd_version": "2.0",
+    "train": {
+        "command": "dp",
+        "machine": {
+            "batch_type": "Slurm",
+            "context_type": "SSHContext",
+            "local_root": "./",
+            "remote_root": null,
+            "remote_profile": {
+                "hostname": null,
+                "username": null,
+                "port": 22,
+                "key_filename": null
+            }
+        },
+        "resources": {
+            "number_node": null,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "queue_name": null,
+            "group_size": 1,
+            "custom_flags": [],
+            "source_list": []
+        }
+    },
+    "model_devi": {
+        "command": "dp",
+        "machine": {
+            "batch_type": "Slurm",
+            "context_type": "SSHContext",
+            "local_root": "./",
+            "remote_root": null,
+            "remote_profile": {
+                "hostname": null,
+                "username": null,
+                "port": 22,
+                "key_filename": null
+            }
+        },
+        "resources": {
+            "number_node": null,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "queue_name": null,
+            "group_size": 1,
+            "custom_flags": [],
+            "source_list": []
+        }
+    },
+    "fp": {
+        "command": null,
+        "machine": {
+            "batch_type": "Slurm",
+            "context_type": "SSHContext",
+            "local_root": "./",
+            "remote_root": null,
+            "remote_profile": {
+                "hostname": null,
+                "username": null,
+                "port": 22,
+                "key_filename": null
+            }
+        },
+        "resources": {
+            "number_node": null,
+            "cpu_per_node": null,
+            "gpu_per_node": null,
+            "queue_name": null,
+            "group_size": 1,
+            "custom_flags": [],
+            "source_list": []
+        }
+    }
+}
diff --git a/skills/dpgen-simplify/assets/param.example.qm7.from-official-docs.json b/skills/dpgen-simplify/assets/param.example.qm7.from-official-docs.json
new file mode 100644
index 000000000..0cd4af48c
--- /dev/null
+++ b/skills/dpgen-simplify/assets/param.example.qm7.from-official-docs.json
@@ -0,0 +1,105 @@
+{
+    "type_map": [
+        "C",
+        "H",
+        "N",
+        "O",
+        "S"
+    ],
+    "mass_map": [
+        12.011,
+        1.008,
+        14.007,
+        15.999,
+        32.065
+    ],
+    "_comment": "Adapted from official DP-GEN simplify QM7 example with path placeholders.",
+    "pick_data": "/path/to/qm7/deepmd_npy",
+    "init_data_prefix": "",
+    "init_data_sys": [],
+    "sys_batch_size": [
+        "auto"
+    ],
+    "numb_models": 4,
+    "default_training_param": {
+        "model": {
+            "type_map": [
+                "C",
+                "H",
+                "N",
+                "O",
+                "S"
+            ],
+            "descriptor": {
+                "type": "se_a",
+                "sel": [
+                    7,
+                    16,
+                    3,
+                    3,
+                    1
+                ],
+                "rcut_smth": 1.0,
+                "rcut": 6.0,
+                "neuron": [
+                    25,
+                    50,
+                    100
+                ],
+                "resnet_dt": false,
+                "axis_neuron": 12
+            },
+            "fitting_net": {
+                "neuron": [
+                    240,
+                    240,
+                    240
+                ],
+                "resnet_dt": true
+            }
+        },
+        "learning_rate": {
+            "type": "exp",
+            "start_lr": 0.001,
+            "stop_lr": 5e-08,
+            "decay_rate": 0.99
+        },
+        "loss": {
+            "start_pref_e": 0.02,
+            "limit_pref_e": 1,
+            "start_pref_f": 1000,
+            "limit_pref_f": 1,
+            "start_pref_v": 0,
+            "limit_pref_v": 0,
+            "start_pref_pf": 0,
+            "limit_pref_pf": 0
+        },
+        "training": {
+            "numb_steps": 10000,
+            "disp_file": "lcurve.out",
+            "disp_freq": 1000,
+            "numb_test": 1,
+            "save_freq": 1000,
+            "disp_training": true,
+            "time_training": true,
+            "profiling": false,
+            "profiling_file": "timeline.json"
+        }
+    },
+    "fp_style": "gaussian",
+    "shuffle_poscar": false,
+    "fp_task_max": 1000,
+    "fp_task_min": 10,
+    "fp_pp_path": "/path/to/fp/support/files",
+    "fp_pp_files": [],
+    "fp_params": {
+        "keywords": "mn15/6-31g** force nosymm scf(maxcyc=512)",
+        "nproc": 28,
+        "multiplicity": 1
+    },
+    "init_pick_number": 100,
+    "iter_pick_number": 100,
+    "model_devi_f_trust_lo": 0.25,
+    "model_devi_f_trust_hi": 0.45,
+    "_comment_tail": "Official example source: docs.deepmodeling.com/projects/dpgen/en/latest/simplify/simplify.html"
+}
diff --git a/skills/dpgen-simplify/assets/param.template.json b/skills/dpgen-simplify/assets/param.template.json
new file mode 100644
index 000000000..642d78fab
--- /dev/null
+++ b/skills/dpgen-simplify/assets/param.template.json
@@ -0,0 +1,24 @@
+{
+    "type_map": null,
+    "mass_map": null,
+    "pick_data": null,
+    "sys_configs": null,
+    "init_data_prefix": "",
+    "init_data_sys": [],
+    "sys_batch_size": [
+        "auto"
+    ],
+    "train_backend": null,
+    "labeled": true,
+    "fp_task_max": null,
+    "fp_task_min": null,
+    "model_devi_e_trust_lo": null,
+    "model_devi_e_trust_hi": null,
+    "init_pick_number": null,
+    "iter_pick_number": null,
+    "model_devi_f_trust_lo": null,
+    "model_devi_f_trust_hi": null,
+    "numb_models": 4,
+    "fp_style": "none",
+    "default_training_param": null
+}
diff --git a/skills/dpgen-simplify/references/machine-fields.md b/skills/dpgen-simplify/references/machine-fields.md
new file mode 100644
index 000000000..a2d2b8a28
--- /dev/null
+++ b/skills/dpgen-simplify/references/machine-fields.md
@@ -0,0 +1,97 @@
+# Simplify Machine File Notes
+
+This file gives concise notes for `machine.json` used by `dpgen simplify`.
+
+## General rule
+
+Keep three stage blocks separate:
+
+- `train`
+- `model_devi`
+- `fp`
+
+Do not collapse them into one ambiguous runtime block.
+
+## Runtime profiles
+
+Use one of these profiles based on where `dpgen` is launched:
+
+1. Server-local Slurm (already logged into cluster login node)
+
+   - `context_type = "LocalContext"`
+   - `batch_type = "Slurm"`
+   - template: `assets/machine.template.server-local-slurm.json`
+
+1. Local workstation -> remote Slurm cluster
+
+   - `context_type = "SSHContext"`
+   - `batch_type = "Slurm"`
+   - requires `remote_profile`
+   - template: `assets/machine.template.ssh-remote-slurm.json`
+
+1. Local single-machine shell testing
+
+   - `context_type = "LazyLocalContext"`
+   - `batch_type = "Shell"`
+   - template: `assets/machine.template.local-shell.json`
+
+If your current workflow is "on server, submit Slurm jobs", use profile 1.
+
+## Each stage should make these explicit
+
+- `command`
+- machine or context information
+- resources
+- queue / partition if needed
+- cpu count
+- gpu count
+- environment activation commands
+- custom scheduler flags if needed
+
+Important boundary: the outer `dpgen simplify` environment and the inner stage-job environments are separate layers.
+The outer layer must be an activated DP-GEN environment (`dpgen --version` succeeds).
+Do not assume outer activation is inherited by stage jobs. For scheduler profiles, set `resources.source_list` explicitly.
+
+## `train`
+
+Used for model training.
+
+Typical concerns:
+
+- DeePMD environment
+- GPU availability
+- training queue
+- environment activation
+
+## `model_devi`
+
+Used for model deviation evaluation.
+
+Typical concerns:
+
+- DeePMD runtime
+- consistency with training environment
+- output and log handling
+
+## `fp`
+
+Used for first-principles calculations.
+
+Typical concerns:
+
+- backend executable
+- pseudopotential / basis support files
+- scheduler settings
+- backend-specific environment
+
+If `fp_style` is `none`, keep this stage disabled/unset and do not require active FP executable settings.
+
+## Practical advice
+
+When building `machine.json`:
+
+1. do not invent executable names
+1. do not invent scheduler module names
+1. keep environment activation explicit
+1. keep queue and resource requests explicit
+1. if the user already has a working template, patch it instead of rewriting everything
diff --git a/skills/dpgen-simplify/references/param-fields.md b/skills/dpgen-simplify/references/param-fields.md
new file mode 100644
index 000000000..04c653684
--- /dev/null
+++ b/skills/dpgen-simplify/references/param-fields.md
@@ -0,0 +1,140 @@
+# Simplify Parameter Notes
+
+This file gives concise notes for the most important fields in `param.json` for `dpgen simplify`.
+
+For a complete official-style example template, see:
+
+- `assets/param.example.qm7.from-official-docs.json`
+
+## Core dataset fields
+
+### `pick_data`
+
+Path to the candidate dataset to simplify.
+
+Use this field to point to the existing DeePMD-compatible data source.
+
+### `sys_configs`
+
+Directories of the structures to explore, listed per system.
+
+Given as a nested (2D) list of path patterns; wildcard characters are supported.
+
+### `init_data_prefix`
+
+Prefix used by DP-GEN when resolving initial data systems.
+
+### `init_data_sys`
+
+List of paths to the initial data systems used for training, resolved relative to `init_data_prefix`.
+
+Can be empty when starting fully from `pick_data`.
+
+### `sys_batch_size`
+
+Batch size policy per system, often set to `["auto"]`.
+
+### `type_map`
+
+Ordered list of element symbols.
+
+This ordering should stay consistent with the dataset and the DeePMD training configuration.
+
+### `mass_map`
+
+Atomic masses corresponding to `type_map`.
+
+Keep the order consistent with `type_map`.
+
+## Simplify selection fields
+
+### `init_pick_number`
+
+Number of structures picked at the initial stage.
+
+### `iter_pick_number`
+
+Number of structures picked in each later simplify iteration.
+
+### `model_devi_f_trust_lo`
+
+Lower bound of the force-deviation trust window.
+
+### `model_devi_f_trust_hi`
+
+Upper bound of the force-deviation trust window.
+
+Together, these two values bound the force-deviation window from which candidate structures are selected.
+
+### `model_devi_e_trust_lo`
+
+Lower bound of the energy-deviation trust window (optional).
+
+### `model_devi_e_trust_hi`
+
+Upper bound of the energy-deviation trust window (optional).
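+
+As a concrete reference point, the bundled QM7 example (`assets/param.example.qm7.from-official-docs.json`) sets the selection fields as shown below. These numbers are that example's values, not general recommendations, and the energy trust bounds are left out because that example does not set them.
+
+```json
+{
+    "init_pick_number": 100,
+    "iter_pick_number": 100,
+    "model_devi_f_trust_lo": 0.25,
+    "model_devi_f_trust_hi": 0.45
+}
+```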
+
+## Training-related fields
+
+### `numb_models`
+
+Number of DeePMD models trained as an ensemble.
+
+### `train_backend`
+
+Training backend (for example `pytorch`) when needed by your environment.
+
+### `labeled`
+
+Whether the input candidate data is already labeled.
+
+### `default_training_param`
+
+Training template used during simplify.
+
+This usually contains:
+
+- model descriptor
+- fitting net
+- learning rate
+- loss
+- training steps
+
+## FP-related fields
+
+### `fp_style`
+
+Backend used for first-principles calculations.
+
+Set to `none` if no FP stage is intended.
+If data is already labeled and no re-labeling is requested, use `none`.
+
+### `fp_task_max`
+
+Maximum number of FP tasks.
+
+### `fp_task_min`
+
+Minimum number of FP tasks.
+
+### `fp_params`
+
+Backend-specific parameters for the chosen `fp_style`.
+
+### `fp_pp_path`
+
+Path to pseudopotential or backend support files if needed.
+
+### `fp_pp_files`
+
+List of files needed by the FP backend.
+
+## Practical advice
+
+When modifying `param.json`:
+
+1. keep `type_map` consistent
+1. do not silently switch descriptor families
+1. do not silently change the FP backend
+1. make threshold changes explicit and traceable
+1. if doing experiments, derive all variants from one base template
diff --git a/skills/dpgen-simplify/references/workflow-notes.md b/skills/dpgen-simplify/references/workflow-notes.md
new file mode 100644
index 000000000..d19d1db27
--- /dev/null
+++ b/skills/dpgen-simplify/references/workflow-notes.md
@@ -0,0 +1,65 @@
+# Simplify Workflow Notes
+
+## Standard command (outer layer)
+
+```bash
+dpgen simplify param.json machine.json
+```
+
+Run this command from an activated DP-GEN environment where `dpgen --version` succeeds.
+
+## Recommended workflow
+
+1. confirm this is really a simplify task
+1. choose machine profile (`server-local-slurm` / `ssh-remote-slurm` / `local-shell`)
+1. inspect dataset source
+1. define simplify thresholds
+1. build or patch `param.json`
+1. build or patch `machine.json`
+1. validate the workflow
+1. execute if requested
+1. summarize outputs and next checks
+
+## Validation checklist
+
+Run these before execution:
+
+```bash
+dpgen --version
+python -m json.tool param.json
+python -m json.tool machine.json
+```
+
+Also confirm:
+
+- `pick_data` paths exist
+- the outer launcher shell has the DP-GEN environment activated
+- stage commands match the intended software stack
+- scheduler stages include explicit `resources.source_list` activation
+- FP-specific settings are present only when `fp_style != "none"`
+
+## Recommended repeated-experiment structure
+
+For multiple simplify experiments, use one base template and derive variants.
+
+Example:
+
+```text
+project/
+├── base/
+├── exp_lo020_hi040/
+├── exp_lo025_hi045/
+├── exp_lo030_hi050/
+└── summary/
+```
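+
+A sketch of how one variant could differ from the base, assuming the directory name `exp_lo020_hi040` encodes the force trust window (this naming convention is an assumption of the example, not something DP-GEN enforces). Only the two thresholds change; every other field would be copied unchanged from the base template.
+
+```json
+{
+    "model_devi_f_trust_lo": 0.20,
+    "model_devi_f_trust_hi": 0.40
+}
+```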
+
+## What to summarize after a run
+
+At minimum, report:
+
+- run status
+- stage status
+- output locations
+- simplify thresholds
+- picked counts
+- next files to inspect