-- `generate_keywords(ticker)` in `src/news.py`: auto-builds a keyword list for any ticker from Yahoo Finance metadata (company name tokens, executive surnames). Uses a daemon thread bounded by `REQUEST_TIMEOUT` so a hung network call never blocks the pipeline. Falls back to `[ticker]` on timeout, network failure, or unknown ticker. Part of #35 (v0.3 arbitrary ticker support).
-- `Docs/CONTRIBUTORS.md`: acknowledgement list for all project contributors.
-- 5 new unit tests in `tests/test_core.py`: `generate_keywords` known ticker, unknown ticker, network failure, timeout, and `correlate_news` false-positive regression. Total test count: 42.
+- `generate_keywords(ticker)` in `src/news.py` for arbitrary ticker support groundwork:
+  - pulls Yahoo Finance metadata via `yfinance`
+  - includes ticker symbol, company name tokens, and top executive surnames
+  - removes duplicate and short tokens (`< 3` chars)
+  - bounded by `REQUEST_TIMEOUT` using a daemon thread so hung metadata calls cannot block the pipeline
+  - graceful fallback to `[ticker]` for timeout, network failure, or unknown symbols
+- `Docs/CONTRIBUTORS.md` added to document acknowledged contributors.
+- 5 focused tests added in `tests/test_core.py`:
+  - `generate_keywords` known ticker path
+  - unknown ticker fallback path
+  - network failure fallback path
+  - timeout fallback path
+  - `correlate_news` regression guard against substring false positives
+- Test suite count increased to 42.
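
The timeout-and-fallback behaviour described for `generate_keywords` can be sketched roughly as follows. This is a minimal illustration, not the code in `src/news.py`: `keywords_with_timeout`, `clean_tokens`, the injected `fetch` callable (standing in for the `yfinance` metadata lookup), and the `REQUEST_TIMEOUT` value are all assumptions made for the example.

```python
import threading

REQUEST_TIMEOUT = 5.0  # seconds; assumed value, the project's constant may differ


def clean_tokens(tokens, min_len=3):
    """Drop tokens shorter than min_len and exact duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for token in tokens:
        if len(token) >= min_len and token not in seen:
            seen.add(token)
            cleaned.append(token)
    return cleaned


def keywords_with_timeout(ticker, fetch, timeout=REQUEST_TIMEOUT):
    """Run fetch(ticker) in a daemon thread; fall back to [ticker] on any problem.

    fetch is a hypothetical callable standing in for the metadata lookup.
    """
    result = {}

    def worker():
        try:
            result["tokens"] = fetch(ticker)
        except Exception:
            # Network failure or unknown symbol: leave result empty.
            pass

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)  # returns after `timeout` seconds even if fetch hangs

    tokens = clean_tokens(result.get("tokens") or [])
    return tokens if tokens else [ticker]
```

Because the worker is a daemon thread, a lookup that never returns does not keep the process alive; the caller simply stops waiting after `timeout` seconds and receives `[ticker]`. Passing the lookup in as a parameter is a choice made here so the timeout, failure, and unknown-ticker paths can be exercised without network access.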

 ### Fixed
-- `correlate_news` in `src/signals.py` now uses word-boundary regex matching (`\b{kw}\b` via `_kw_re`) instead of plain Python substring containment (`kw in blob`). Previously, short keywords such as `"gold"` matched unrelated articles where the word appeared as a substring (e.g. *Goldman Sachs* articles appearing in the Gold section, *S&P 500* articles matching via broad terms).
+- `correlate_news` in `src/signals.py` now uses word-boundary regex matching (`\b{kw}\b` via `_kw_re`) instead of plain substring containment (`kw in blob`). This removes false-positive matches where short keywords were only present as substrings (for example, `gold` matching `goldman`).
+- Dashboard stale-data handling in `dashboard/main.py` changed from automatic stale-triggered reruns to explicit user action (`Refresh now`), reducing involuntary refresh loops.
+- Dashboard refresh epoch increments now use session-state source of truth to avoid stale local increments during refresh/scan actions.
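
The difference between substring containment and word-boundary matching in the `correlate_news` fix can be illustrated with a small comparison. This is a sketch under assumptions: the function names are hypothetical, and the use of `re.escape` and case-insensitive matching is a guess at what a safe pattern builder might do, not a description of `_kw_re`.

```python
import re


def substring_match(keyword, text):
    """Old-style check: True whenever keyword appears anywhere in text."""
    return keyword.lower() in text.lower()


def word_boundary_match(keyword, text):
    """Match keyword only as a whole word; re.escape guards regex metacharacters."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return pattern.search(text) is not None


headline = "Goldman Sachs raises its year-end forecast"
print(substring_match("gold", headline))                 # True: "gold" is inside "Goldman"
print(word_boundary_match("gold", headline))             # False: no standalone word "gold"
print(word_boundary_match("gold", "Gold prices rally"))  # True
```

One caveat of this pattern shape: `\b` only matches between a word character and a non-word character, so a keyword that starts or ends with punctuation (for example `C++`) would not match even when it is present.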
+
+### Changed
+- Sidebar now includes `Enable auto background scan` toggle in `dashboard/main.py`, allowing contributors/users to disable scan auto-trigger behavior during interactive sessions.
+- Background scan refresh and manual refresh flows were aligned to use consistent state updates before rerun.
+- CI workflow concurrency policy in `.github/workflows/ci.yml` updated to preserve all `main` branch runs (only non-main runs are canceled in-progress), keeping branch checks reliable during rapid pushes.
+
+### Documentation
+- `Docs/code_flow.md` expanded with the keyword generation pipeline and updated news-correlation matching flow notes.
+- `Docs/variable_list.md` extended with new symbols/constants (`generate_keywords`, `_CORP_SUFFIXES`, `_KW_PATTERN_CACHE`, `_kw_re`) and updated correlation behavior references.
+- `README.md` improvements:
+  - fixed disclaimer badge/doc links
+  - added Docker quick-start TOC entry
+  - normalized scan invocation to module form (`python -m app.scan`)
+  - added contributors document references
+- `CONTRIBUTING.md` updated to use module-form scan commands (`python -m app.scan --dry-run`) for consistency with package layout.

 ### Technical
-- `_KW_PATTERN_CACHE` added to `src/signals.py`: module-level dict caching compiled `re.Pattern` objects so keyword patterns are compiled once per process rather than on every article scored.
-- Industry and sector fields intentionally excluded from `generate_keywords` output: terms like `"Technology"` are too broad to use as correlation keywords without causing noise across unrelated assets.
+- `_KW_PATTERN_CACHE` introduced in `src/signals.py` to reuse compiled regex patterns across correlation calls, reducing repeated regex compilation overhead.
+- `_kw_re` helper added to centralize safe keyword pattern construction.
+- Industry and sector fields intentionally excluded from `generate_keywords` output because broad labels (for example, `Technology`) create noisy cross-asset matches.