WIP add downloading part from fink_ai topics in datatransfert #274 by Farid841 · Pull Request #275 · astrolabsoftware/fink-client

Farid841 · 2026-06-11T10:28:28Z

No description provided.

Copilot

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds first-class support to finkctl_transfer for “AI transfer” topics by consuming JSON predictions, joining them with original Avro alerts from a companion feed topic, and writing enriched Parquet output.

Changes:

- Detect fink_ai_* topics and route them through a new AI transfer path.
- Add logic to read/flatten alerts, parse/fix Avro schema, join with predictions, and write Parquet datasets.
- Expand documentation and topic validation messaging to include AI topics.


+    rng = np.random.RandomState(42)
+    pq.write_to_dataset(
+        table,
+        args.outdir,
+        schema=arrow_schema,
+        basename_template="part-0-{{i}}-{}.parquet".format(rng.randint(0, int(1e9))),
+        partition_cols=partitioning,
+        existing_data_behavior="overwrite_or_ignore",
+    )
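A point worth checking in this hunk: `np.random.RandomState(42)` is seeded with a constant, so `rng.randint(0, int(1e9))` returns the same number on every run and the basename suffix does not vary between invocations. Below is a minimal sketch of one alternative, assuming the intent is a suffix that is unique per run. `make_basename_template` is a hypothetical helper, not part of the PR.

```python
import uuid


def make_basename_template(prefix: str = "part-0") -> str:
    """Build a basename template with a suffix that differs on every call.

    The literal "{i}" placeholder is preserved (hence the doubled braces in
    the f-string), because the dataset writer substitutes the file index
    for it.
    """
    return f"{prefix}-{{i}}-{uuid.uuid4().hex}.parquet"


if __name__ == "__main__":
    # Two calls give two different templates, both still containing "{i}".
    print(make_basename_template())
    print(make_basename_template())
```

The result could be passed as `basename_template=make_basename_template()` in the `pq.write_to_dataset` call shown above. If reproducible file names are actually wanted, the fixed seed is fine, but then repeated runs into the same `outdir` target the same file names.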


+    valid_prefixes = ("ftransfer", "fxmatch", _AI_TOPIC_PREFIX)
+    if not args.topic.startswith(valid_prefixes):
        msg = """
 {} is not a valid topic name.
-Topic name must start with `ftransfer_` or `fxmatch_`.
+Topic name must start with `ftransfer_`, `fxmatch_`, or `fink_ai_`.
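In this hunk the tuple passed to `str.startswith` holds `"ftransfer"` and `"fxmatch"` without the trailing underscore, while the error message requires `ftransfer_` and `fxmatch_`, so a topic such as `ftransferfoo` would pass the check. The sketch below keeps the check and the message in sync by deriving both from one tuple. It assumes `_AI_TOPIC_PREFIX` equals `"fink_ai_"` (its value is not shown in the hunk), and `validate_topic` is a hypothetical name.

```python
# Assumption: the third entry mirrors _AI_TOPIC_PREFIX, whose value is not
# visible in the reviewed hunk.
VALID_TOPIC_PREFIXES = ("ftransfer_", "fxmatch_", "fink_ai_")


def validate_topic(topic: str) -> None:
    """Raise ValueError if the topic does not start with a known prefix."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    if not topic.startswith(VALID_TOPIC_PREFIXES):
        allowed = ", ".join(f"`{p}`" for p in VALID_TOPIC_PREFIXES)
        raise ValueError(
            f"{topic} is not a valid topic name. "
            f"Topic name must start with one of: {allowed}."
        )
```

Building the message from the same tuple means a new prefix only has to be added in one place.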


+def _get_ai_schema(kafka_config: dict, feed_topic: str, maxtimeout: float):
+    """Return parsed Avro schema for the feed topic, with Spark union fix applied."""
+    raw = get_schema_from_stream(kafka_config, feed_topic, maxtimeout)
+    if raw is None:
+        return None
+    # raw is already parsed by get_schema_from_stream — re-parse with the fix
+    # by extracting the original dict and re-parsing
+    try:
+        return fastavro.parse_schema(_fix_spark_schema(raw))
+    except Exception:
+        return raw
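The `except Exception: return raw` branch above falls back silently, so a failed schema fix would go unnoticed until a later decode error. A minimal sketch of the same fallback with a narrower exception list and a log message follows. `parse_with_fallback` and its `fix`/`parse` parameters are hypothetical stand-ins for `_fix_spark_schema` and `fastavro.parse_schema`; which exception types those functions actually raise is an assumption that should be checked against fastavro before choosing `errors`.

```python
import logging
from typing import Any, Callable, Optional, Tuple, Type

_LOG = logging.getLogger(__name__)


def parse_with_fallback(
    raw: Optional[Any],
    fix: Callable[[Any], Any],
    parse: Callable[[Any], Any],
    errors: Tuple[Type[BaseException], ...] = (ValueError, KeyError, TypeError),
) -> Optional[Any]:
    """Apply fix then parse; on a listed error, log it and return raw unchanged."""
    if raw is None:
        return None
    try:
        return parse(fix(raw))
    except errors as exc:
        # Make the fallback visible instead of swallowing the failure.
        _LOG.warning("Schema fix failed (%s); using the unfixed schema", exc)
        return raw
```

Errors outside `errors` propagate to the caller, which keeps genuine bugs from being masked by the fallback.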


+                try:
+                    rec = json.loads(msg.value())
+                except (json.JSONDecodeError, Exception):
+                    continue
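In the hunk above, `except (json.JSONDecodeError, Exception)` is equivalent to `except Exception`, since `json.JSONDecodeError` is already a subclass of `Exception`, and every failure is skipped without a trace. The sketch below catches only the decoding failures that `json.loads` is known to raise and counts skipped messages so they can be reported. `decode_predictions` is a hypothetical helper that takes raw payloads (what `msg.value()` would return) rather than Kafka messages.

```python
import json
from typing import Any, Iterable, List, Optional, Tuple, Union

Payload = Optional[Union[bytes, str]]


def decode_predictions(payloads: Iterable[Payload]) -> Tuple[List[Any], int]:
    """Decode JSON payloads, returning (records, number_of_skipped_payloads)."""
    records: List[Any] = []
    skipped = 0
    for payload in payloads:
        try:
            records.append(json.loads(payload))
        except (json.JSONDecodeError, UnicodeDecodeError, TypeError):
            # JSONDecodeError: malformed JSON text
            # UnicodeDecodeError: bytes that are not valid UTF-8/16/32
            # TypeError: payload is None or another unsupported type
            skipped += 1
    return records, skipped
```

Reporting the `skipped` count at the end of a transfer makes it clear when a topic contains malformed predictions.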


+    rows = []
+    for candid, pred in predictions.items():
+        row = {"candid": candid, **pred}
+        if candid in alerts:
+            row.update({k: v for k, v in alerts[candid].items() if k != "candid"})
+        rows.append(row)
+
+    all_keys = list(rows[0].keys())
+    for row in rows[1:]:
+        for k in row:
+            if k not in all_keys:
+                all_keys.append(k)
+
+    columns = {k: [row.get(k) for row in rows] for k in all_keys}
+
+    arrays = {}
+    for k, vals in columns.items():
+        if k == "predictions":
+            arrays[k] = pa.array(vals, type=pa.list_(pa.float64()))
+        else:
+            try:
+                arrays[k] = pa.array(vals)
+            except Exception:
+                arrays[k] = pa.array([str(v) if v is not None else None for v in vals])
+
+    table = pa.table(arrays)
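Two details in the join above may deserve attention: `rows[0]` raises `IndexError` when `predictions` is empty, and the `k not in all_keys` test scans a list for every key of every row. The following sketch covers only the pure-Python part (the join and the column pivot, not the Arrow conversion), assuming `predictions` and `alerts` are dictionaries keyed by `candid` as in the hunk. `build_columns` is a hypothetical name.

```python
from typing import Any, Dict, List


def build_columns(
    predictions: Dict[Any, Dict[str, Any]],
    alerts: Dict[Any, Dict[str, Any]],
) -> Dict[str, List[Any]]:
    """Left-join predictions with alerts on candid and pivot rows to columns."""
    rows: List[Dict[str, Any]] = []
    for candid, pred in predictions.items():
        row = {"candid": candid, **pred}
        alert = alerts.get(candid)
        if alert is not None:
            # Alert fields are added; the alert's own candid is not copied.
            row.update({k: v for k, v in alert.items() if k != "candid"})
        rows.append(row)

    # A dict keeps insertion order and gives O(1) membership checks, so the
    # first-seen column order is preserved without scanning a list.
    all_keys: Dict[str, None] = {}
    for row in rows:
        for key in row:
            all_keys.setdefault(key, None)

    # Missing values become None; an empty input yields an empty mapping.
    return {key: [row.get(key) for row in rows] for key in all_keys}
```

The returned mapping has the same shape as `columns` in the hunk, so the per-column `pa.array` conversion could follow unchanged, with the caller deciding what to write when the mapping is empty.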


WIP add downloading part from fink_ai topics \n (todo) opmised

3686495

Copilot AI review requested due to automatic review settings June 11, 2026 10:28

Copilot AI reviewed Jun 11, 2026

Farid841 wants to merge 1 commit into astrolabsoftware:master from Farid841:master
