Add channel highlights generation flow#3723

Draft
idoshamun wants to merge 2 commits into main from codex/channel-highlights-api

Conversation

@idoshamun
Member

Summary

  • add API-side channel highlights persistence, cron, and worker wiring
  • generate horizon-bound highlight candidates with collection-first story grouping and lightweight continuity state
  • add mocked highlight evaluation plus shadow/publish run tracking and regression coverage for collection upgrades and relation-driven reevaluation

Verification

  • pnpm run lint
  • pnpm run build
  • pnpm run test -- tests/cron/channelHighlights.ts tests/workers/generateChannelHighlight.ts --runInBand

@pulumi

pulumi bot commented Mar 18, 2026

🍹 The Update (preview) for dailydotdev/api/prod (at c1488f9) was successful.

✨ Neo Explanation

This is a standard application deployment rolling out a new version across all services (deployments and cron jobs), introducing a new `channel-highlights` cron job, running DB and Clickhouse migrations for the new version, and replacing the Kubernetes secret due to updated values.

Root Cause Analysis

A new version of the API application has been built and is being deployed to production. All services are being updated to run the new container image, and a new channel-highlights cron job is being introduced for the first time. The Kubernetes secret also has updated values that require a replacement.

Dependency Chain

The new application version cascades uniformly across the entire workload:

  • 7 Deployments (API, background, workers, WebSocket, private, Temporal, personalized digest) are rolling out the new image
  • 38 CronJobs are updated to reference the new image version
  • 1 new CronJob (channel-highlights) is being created — a new scheduled job that didn't exist before
  • Migration Jobs: The previous DB and Clickhouse migration jobs (tied to the old version) are deleted, and new migration jobs for the new version are created to run schema migrations against both the primary database and Clickhouse
  • Kubernetes Secret: Updated secret data requires a replace (delete + recreate)

Risk Analysis

The Kubernetes Secret replacement (vpc-native-k8s-secret) is the most notable item — it will be deleted and recreated, which could briefly cause issues for any pods that reference it if they attempt to reload credentials during the window between deletion and recreation. The migration jobs run against live databases, though this is a standard deployment pattern. All other changes are standard rolling image updates with no stateful resource deletions.

Resource Changes

    Name                                                       Type                           Operation
~   vpc-native-channel-digests-cron                            kubernetes:batch/v1:CronJob    update
~   vpc-native-user-profile-analytics-history-clickhouse-cron  kubernetes:batch/v1:CronJob    update
~   vpc-native-user-profile-updated-sync-cron                  kubernetes:batch/v1:CronJob    update
~   vpc-native-personalized-digest-cron                        kubernetes:batch/v1:CronJob    update
~   vpc-native-hourly-notification-cron                        kubernetes:batch/v1:CronJob    update
~   vpc-native-clean-zombie-opportunities-cron                 kubernetes:batch/v1:CronJob    update
+   vpc-native-api-db-migration-f4027cda                       kubernetes:batch/v1:Job        create
~   vpc-native-update-highlighted-views-cron                   kubernetes:batch/v1:CronJob    update
~   vpc-native-clean-gifted-plus-cron                          kubernetes:batch/v1:CronJob    update
~   vpc-native-update-trending-cron                            kubernetes:batch/v1:CronJob    update
~   vpc-native-validate-active-users-cron                      kubernetes:batch/v1:CronJob    update
~   vpc-native-deployment                                      kubernetes:apps/v1:Deployment  update
~   vpc-native-clean-zombie-users-cron                         kubernetes:batch/v1:CronJob    update
+   vpc-native-channel-highlights-cron                         kubernetes:batch/v1:CronJob    create
~   vpc-native-clean-stale-user-transactions-cron              kubernetes:batch/v1:CronJob    update
~   vpc-native-generate-search-invites-cron                    kubernetes:batch/v1:CronJob    update
~   vpc-native-private-deployment                              kubernetes:apps/v1:Deployment  update
~   vpc-native-clean-zombie-images-cron                        kubernetes:batch/v1:CronJob    update
~   vpc-native-user-posts-analytics-refresh-cron               kubernetes:batch/v1:CronJob    update
~   vpc-native-generic-referral-reminder-cron                  kubernetes:batch/v1:CronJob    update
~   vpc-native-update-tag-recommendations-cron                 kubernetes:batch/v1:CronJob    update
-   vpc-native-api-clickhouse-migration-7552f8e0               kubernetes:batch/v1:Job        delete
~   vpc-native-rotate-weekly-quests-cron                       kubernetes:batch/v1:CronJob    update
~   vpc-native-update-achievement-rarity-cron                  kubernetes:batch/v1:CronJob    update
~   vpc-native-temporal-deployment                             kubernetes:apps/v1:Deployment  update
~   vpc-native-ws-deployment                                   kubernetes:apps/v1:Deployment  update
~   vpc-native-clean-zombie-user-companies-cron                kubernetes:batch/v1:CronJob    update
~   vpc-native-update-current-streak-cron                      kubernetes:batch/v1:CronJob    update
~   vpc-native-personalized-digest-deployment                  kubernetes:apps/v1:Deployment  update
~   vpc-native-check-analytics-report-cron                     kubernetes:batch/v1:CronJob    update
~   vpc-native-daily-digest-cron                               kubernetes:batch/v1:CronJob    update
~   vpc-native-post-analytics-clickhouse-cron                  kubernetes:batch/v1:CronJob    update
~   vpc-native-update-source-public-threshold-cron             kubernetes:batch/v1:CronJob    update
~   vpc-native-sync-subscription-with-cio-cron                 kubernetes:batch/v1:CronJob    update
~   vpc-native-clean-expired-better-auth-sessions-cron         kubernetes:batch/v1:CronJob    update
~   vpc-native-update-tags-str-cron                            kubernetes:batch/v1:CronJob    update
~   vpc-native-post-analytics-history-day-clickhouse-cron      kubernetes:batch/v1:CronJob    update
~   vpc-native-squad-posts-analytics-refresh-cron              kubernetes:batch/v1:CronJob    update
~   vpc-native-user-profile-analytics-clickhouse-cron          kubernetes:batch/v1:CronJob    update
+   vpc-native-api-clickhouse-migration-f4027cda               kubernetes:batch/v1:Job        create
~   vpc-native-update-source-tag-view-cron                     kubernetes:batch/v1:CronJob    update
... and 10 other changes

const cron: Cron = {
  name: 'channel-highlights',
  handler: async (con, logger) => {
    const now = getChannelHighlightsNow();
Member Author

Why are we doing this funky implementation instead of calling `new Date()` directly?

Member Author

Simplified this. The cron now just uses new Date().toISOString() directly in /Users/shamun/Projects/daily/workspaces/highlights-generation/daily-api/src/cron/channelHighlights.ts, and the test was updated to assert the timestamp falls within the handler execution window instead of mocking a helper.
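The updated test approach can be sketched like this (a minimal, hypothetical version; `runHandler` stands in for the real cron handler, which lives in the file mentioned above):

```typescript
// Sketch: assert a generated ISO timestamp falls inside the window in
// which the handler ran, instead of mocking a clock helper. The handler
// shape here is hypothetical, not the actual cron implementation.
const runHandler = (): { scheduledAt: string } => ({
  scheduledAt: new Date().toISOString(),
});

const before = Date.now();
const result = runHandler();
const after = Date.now();

const scheduledMs = Date.parse(result.scheduledAt);
const inWindow = scheduledMs >= before && scheduledMs <= after;
```

Because `toISOString()` round-trips through `Date.parse` at millisecond precision, the bounds check is exact and needs no mocking.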

scheduledAt: string;
}): string => `channel-highlight:lock:${channel}:${scheduledAt}`;

const worker: TypedWorker<'api.v1.generate-channel-highlight'> = {
Member Author

We've been reusing this pattern of acquiring a Redis lock quite often recently. It could be neat to create a helper for this flow that accepts a function to execute after acquiring the lock, and then marks it done / releases the lock.

Member Author

Agreed. I extracted the done-lock flow into /Users/shamun/Projects/daily/workspaces/highlights-generation/daily-api/src/workers/withRedisDoneLock.ts and switched both the highlight worker and the digest worker to use it, so the acquire/execute/mark-done/release pattern lives in one place now.
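The extracted acquire/execute/mark-done/release flow could look roughly like this (a sketch only: an in-memory `Map` stands in for Redis, and the name `withDoneLock` and the TTL-free state handling are illustrative, not the actual `withRedisDoneLock` implementation):

```typescript
// 'locked' means another caller is mid-flight; 'done' means the work
// already completed, so retries become no-ops.
type LockStore = Map<string, 'locked' | 'done'>;

const withDoneLock = async <T>(
  store: LockStore,
  key: string,
  fn: () => Promise<T>,
): Promise<T | null> => {
  const state = store.get(key);
  if (state === 'done' || state === 'locked') {
    return null; // skip: someone else handled (or is handling) this key
  }
  store.set(key, 'locked');
  try {
    const result = await fn();
    store.set(key, 'done'); // mark done so the job is not re-run
    return result;
  } catch (err) {
    store.delete(key); // release on failure so the job can be retried
    throw err;
  }
};
```

A real Redis version would acquire the lock with an atomic `SET key value NX EX ttl` rather than a read-then-write.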

return new Date(Math.max(...candidates));
};

const toQualitySummary = (
Member Author

This reads like slop. Is there a better way to achieve the same result, maybe less verbose?

Member Author

Yes, this was too dense. I split the business logic out of the generator so /Users/shamun/Projects/daily/workspaces/highlights-generation/daily-api/src/common/channelHighlight/generate.ts is now orchestration-only, and moved the heavier logic into queries.ts, stories.ts, decisions.ts, and types.ts. I also added high-level comments around the story-building and publish logic so the control flow is easier to follow without prior context.

: null,
});

const getStoryKey = ({
Member Author

What about Twitter support here?

Member Author

Added Twitter-specific support as part of the refactor. /Users/shamun/Projects/daily/workspaces/highlights-generation/daily-api/src/common/channelHighlight/stories.ts now builds a stable story key from sharedPostId when available, and otherwise falls back to parsing the tweet status id from x.com / twitter.com URLs before falling back to canonical URL or post id.
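The tweet-id fallback described above could be sketched like this (the regex and the helper name `tweetStatusId` are illustrative, not the exact stories.ts implementation):

```typescript
// Extract the numeric status id from an x.com or twitter.com post URL,
// returning null for anything that is not a tweet permalink.
const tweetStatusId = (url: string): string | null => {
  const match = url.match(
    /^https?:\/\/(?:www\.)?(?:x|twitter)\.com\/[^/]+\/status\/(\d+)/,
  );
  return match ? match[1] : null;
};
```

When this returns null, the story key falls back to the canonical URL and finally the post id, as described above.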

const quality = toQualitySummary(story.canonicalPost.contentQuality || {});
const penalty =
  (quality.clickbaitProbability || 0) * 5 +
  (quality.selfPromotionScore || 0) * 3;
Member Author

Self-promotion is tricky, since Anthropic announcing a new Claude model is also self-promotion. I think it's worth checking the production DB for some data and adjusting the ranking here.

);
};

const selectCanonicalPost = (posts: HighlightPost[]): HighlightPost =>
Member Author

The canonical post should be the collection post when possible; otherwise we can fall back to this formula.
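The suggested collection-first rule could look like this (a sketch with assumed field names; `pickCanonical` and the plain `score` fallback are placeholders for the actual formula):

```typescript
interface CandidatePost {
  id: string;
  type: string; // e.g. 'collection', 'article', 'share'
  score: number; // stands in for the real ranking formula
}

// Prefer a collection post as canonical; otherwise fall back to the
// highest-scoring post under the ranking formula.
const pickCanonical = (posts: CandidatePost[]): CandidatePost => {
  const collection = posts.find((p) => p.type === 'collection');
  if (collection) return collection;
  return posts.reduce((best, p) => (p.score > best.score ? p : best));
};
```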

.andWhere('post.visible = true')
.andWhere('post.deleted = false')
.andWhere('post.banned = false')
.andWhere('post.showOnFeed = true')
Member Author

When a post is part of a collection, showOnFeed becomes false. Something to keep in mind, in case it changes our assumptions.

return [...byId.values()];
};

const buildStories = ({
Member Author

This one is very hard to follow.

);
};

const shouldReuseEvaluations = ({
Member Author

This condition isn't clear. Why is that the condition?

Member Author

I clarified this by moving the reuse logic into /Users/shamun/Projects/daily/workspaces/highlights-generation/daily-api/src/common/channelHighlight/decisions.ts and documenting the rule above it. The condition is intentionally strict: we only reuse the cached LLM result when the underlying story is still the same story, the canonical post is unchanged, and no newer post or relation activity happened since the last evaluation.
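The strict reuse rule can be restated as a sketch (field names here are assumed for illustration, not the actual decisions.ts signature):

```typescript
interface ReuseInput {
  sameStoryKey: boolean; // the story is still the same story
  sameCanonicalPost: boolean; // the canonical post is unchanged
  lastEvaluatedAt: number; // epoch ms of the cached LLM evaluation
  latestActivityAt: number; // newest post or relation change in the story
}

// Reuse the cached LLM result only when nothing that feeds the
// evaluation has changed since it was produced.
const shouldReuse = ({
  sameStoryKey,
  sameCanonicalPost,
  lastEvaluatedAt,
  latestActivityAt,
}: ReuseInput): boolean =>
  sameStoryKey && sameCanonicalPost && latestActivityAt <= lastEvaluatedAt;
```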

status: story.cached?.status || 'active',
});

const shouldPublish = ({
Member Author

This is also not clear

Member Author

Same cleanup here. The publish rule now lives in /Users/shamun/Projects/daily/workspaces/highlights-generation/daily-api/src/common/channelHighlight/decisions.ts with a short comment above it. The goal is to only publish when the editorial surface changed in a user-visible way: a different story set, a different canonical post for the same story, a different ordering, or different headlines.
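The "user-visible change" test can be sketched like this (the `HighlightItem` shape and `hasVisibleChange` name are assumptions for illustration, not the decisions.ts code):

```typescript
interface HighlightItem {
  storyKey: string;
  canonicalPostId: string;
  headline: string;
}

// Publish only when the ordered list of highlights differs from the
// last published run: different story set, canonical post, ordering,
// or headline. Comparing index-by-index also catches reordering.
const hasVisibleChange = (
  previous: HighlightItem[],
  next: HighlightItem[],
): boolean =>
  previous.length !== next.length ||
  next.some(
    (item, i) =>
      item.storyKey !== previous[i].storyKey ||
      item.canonicalPostId !== previous[i].canonicalPostId ||
      item.headline !== previous[i].headline,
  );
```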
