Make xe-forge-skill benchmark support optional `--spec` with baseline-derived config fallback #46

## Summary
The benchmark skill currently requires `--spec`, but the KernelBench baseline files already contain enough context to derive most runtime config. This issue proposes making `--spec` optional for `xe-forge-skill benchmark`, preserving current behavior when spec is provided, and adding a baseline-driven fallback path when it is not.

## Current Behavior
In `__init__.py`, the benchmark CLI marks `--spec` as required.
In `benchmark.py`, benchmark execution always loads these spec-derived values:
- input shapes
- flop
- dtype
- input dtypes
- init args

## Problem
This makes benchmark usage more rigid than needed. For many workflows, the baseline kernel/model file already provides enough metadata to run a correctness/perf comparison without a YAML spec.

Users have also asked for more control over input generation (see #38). The widely adopted KernelBench format should serve those needs well.

## Proposed Behavior
1. Make `--spec` optional for benchmark CLI.
2. Keep existing spec-driven path unchanged when `--spec` is provided.
3. When `--spec` is omitted, resolve benchmark config from baseline-derived metadata.
4. In spec-less mode, use the baseline-derived dtype (with an optional explicit override via `--dtype`).
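
A minimal sketch of how the CLI change could look, assuming the benchmark subcommand is built with `argparse`. The function name `build_benchmark_parser`, the positional argument names, and the help strings are illustrative assumptions, not the existing code in `__init__.py`:

```python
import argparse


def build_benchmark_parser() -> argparse.ArgumentParser:
    # Hypothetical parser for illustration; the real one lives in __init__.py.
    parser = argparse.ArgumentParser(prog="xe-forge-skill benchmark")
    parser.add_argument("baseline", help="Path to the baseline kernel/model file")
    parser.add_argument("optimized", help="Path to the optimized kernel file")
    parser.add_argument(
        "--spec",
        default=None,  # previously required=True
        help="Optional YAML spec. If omitted, config is derived from the baseline.",
    )
    parser.add_argument(
        "--dtype",
        default=None,
        help="Optional dtype override for spec-less mode.",
    )
    return parser


# Spec-less invocation: both optional flags default to None.
args = build_benchmark_parser().parse_args(["baseline.py", "optimized.py"])
spec_mode = args.spec is not None
```

With this shape, the rest of the command only needs to branch on `args.spec is not None` to choose between the two paths.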

## Scope
### In scope
- CLI argument requirement/help updates for benchmark.
- Refactor benchmark config resolution into:
  - spec-backed path
  - baseline-backed fallback path
- Add tests for both paths.

### Out of scope
- Broad executor redesign.
- Changes to unrelated optimize/pipeline flows.

## Implementation Notes
- Update benchmark parser in `__init__.py`:
  - remove `required=True` from `--spec`
  - update help text to document fallback behavior
- Refactor `benchmark.py`:
  - isolate config resolution from execution
  - return resolved `input_shapes`, `flop`, `dtype`, `input_dtypes`, `init_args` regardless of source
- Prefer reusing existing analysis/utilities before adding parsing logic (candidate: `kernel_analyzer.py`)
- Keep the `executor.py` interface unchanged; it should receive resolved values as today.
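
One possible shape for the resolution refactor, sketched under several assumptions: the spec is already parsed into a mapping with keys named after the resolved fields, the baseline follows the KernelBench convention of exposing `get_inputs()` and `get_init_inputs()`, and FLOP count is left unset in fallback mode. `BenchmarkConfig`, `config_from_spec`, `config_from_baseline`, and `resolve_config` are hypothetical names, not existing project APIs:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkConfig:
    # The five values the executor receives, regardless of source.
    input_shapes: list
    flop: Optional[float]
    dtype: Optional[str]
    input_dtypes: list
    init_args: list


def config_from_spec(spec: dict) -> BenchmarkConfig:
    # Spec-backed path: read values from an already parsed spec mapping.
    # The key names are assumptions for this sketch.
    return BenchmarkConfig(
        input_shapes=spec["input_shapes"],
        flop=spec.get("flop"),
        dtype=spec.get("dtype"),
        input_dtypes=spec.get("input_dtypes", []),
        init_args=spec.get("init_args", []),
    )


def config_from_baseline(namespace: dict, dtype_override=None) -> BenchmarkConfig:
    # Baseline-backed path: call the KernelBench-style factory functions.
    inputs = namespace["get_inputs"]()
    get_init = namespace.get("get_init_inputs")
    init_args = list(get_init()) if get_init else []
    # Only tensor-like inputs (anything with .shape and .dtype) contribute.
    tensors = [x for x in inputs if hasattr(x, "shape") and hasattr(x, "dtype")]
    input_dtypes = [str(t.dtype) for t in tensors]
    # Simplification: fall back to the first input's dtype.
    dtype = dtype_override or (input_dtypes[0] if input_dtypes else None)
    return BenchmarkConfig(
        input_shapes=[tuple(t.shape) for t in tensors],
        flop=None,  # assumed not inferable from the baseline in this sketch
        dtype=dtype,
        input_dtypes=input_dtypes,
        init_args=init_args,
    )


def resolve_config(spec, namespace, dtype_override=None) -> BenchmarkConfig:
    # Single entry point: execution code never needs to know the source.
    if spec is not None:
        return config_from_spec(spec)
    return config_from_baseline(namespace, dtype_override)
```

Because both paths return the same dataclass, the values handed to `executor.py` keep their current form and only the lookup in front of them changes.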

## Acceptance Criteria
1. `xe-forge-skill benchmark <baseline> <optimized>` runs without `--spec` for KernelBench baseline files.
2. `xe-forge-skill benchmark ... --spec ...` behavior remains unchanged.
3. Spec-less execution resolves dtype from baseline-derived metadata or from an explicit `--dtype`.
4. New/updated tests verify:
   - no regression in spec mode
   - no crash/error for omitted spec in fallback mode
5. CLI help reflects that `--spec` is optional for benchmark.

## Test Plan
- Add benchmark-focused unit tests under `tests` (or the nearest existing skill test location) for:
  - spec provided path
  - spec omitted fallback path
- Run targeted test subset for benchmark + any touched resolution helpers.
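
As a rough illustration of a fallback-path test, the sketch below writes a stub baseline file to a temporary directory, imports it by path, and checks that the KernelBench-style hooks are reachable. `load_baseline` and `STUB_BASELINE` are hypothetical; the stub uses plain lists instead of tensors so the test needs no GPU or framework:

```python
import importlib.util
import tempfile
from pathlib import Path

# Stand-in for a KernelBench-style baseline; real files return tensors.
STUB_BASELINE = '''
def get_inputs():
    return [[1.0, 2.0, 3.0]]


def get_init_inputs():
    return []
'''


def load_baseline(path):
    # Hypothetical helper: import a baseline file as a module by its path.
    spec = importlib.util.spec_from_file_location("baseline_under_test", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def test_fallback_hooks_present():
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "baseline.py"
        path.write_text(STUB_BASELINE)
        module = load_baseline(path)
        # Spec-less mode depends on these two hooks being callable.
        assert callable(module.get_inputs)
        assert module.get_inputs() == [[1.0, 2.0, 3.0]]
        assert module.get_init_inputs() == []
```

A matching spec-mode test would pin the currently resolved values for a known spec so that any regression in that path shows up as a diff.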

## Risks / Open Questions
- Some kernels may not expose enough baseline metadata to infer all fields.
- If dtype cannot be inferred reliably in edge cases, add a minimal opt-in override flag `--dtype` rather than reintroducing a mandatory spec.
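
One way to handle the unreliable-inference case is to fail with an actionable message instead of guessing. The sketch below is an assumption about how that could behave, not a settled design: `resolve_dtype` and `DtypeResolutionError` are hypothetical names, and treating mixed input dtypes as an error is just one possible policy.

```python
class DtypeResolutionError(ValueError):
    """Raised when spec-less mode cannot settle on a single dtype."""


def resolve_dtype(inferred_dtypes, override=None):
    # An explicit --dtype always wins over inference.
    if override is not None:
        return override
    unique = sorted(set(inferred_dtypes))
    if len(unique) == 1:
        return unique[0]
    if not unique:
        reason = "no tensor inputs were found in the baseline"
    else:
        reason = f"inputs use mixed dtypes {unique}"
    raise DtypeResolutionError(
        f"Cannot infer dtype: {reason}. Pass --dtype or --spec."
    )
```

Raising here keeps the fallback honest: the user is pointed at `--dtype` or `--spec` rather than getting a benchmark run with a silently chosen dtype.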
