
prospect source changes for pocket sunset#292

Closed
lfishhead wants to merge 2 commits into main from prospect_source-updates

Conversation

@lfishhead
Contributor

@lfishhead lfishhead commented Apr 29, 2025

Goal

Various prospect sources are being decommissioned across multiple new tab scheduled surfaces as a result of the Pocket sunset and the related Snowflake sunset.

Changes are summarized in this page

After this metaflow PR is deployed, these changes should all take effect

The Pocket shared data page still needs to be updated.

Implementation Decisions

  • I removed counts, dismissed, timespent, top_saved, and title_url_modeled from ProspectType altogether. Happy to leave them in if that's preferred, but they should not be populated for any surface.

References

confluence

@github-actions

github-actions Bot commented Apr 29, 2025

Plan Result (prospect-translation-lambda-cdk-production)

CI link

Plan: 0 to add, 1 to change, 0 to destroy.
  • Update
    • aws_lambda_function.translation-lambda_translation-sqs-lambda_B9BDF6BA
Change Result (Click me)
  # aws_lambda_function.translation-lambda_translation-sqs-lambda_B9BDF6BA will be updated in-place
  ~ resource "aws_lambda_function" "translation-lambda_translation-sqs-lambda_B9BDF6BA" {
        id                             = "ProspectAPI-Prod-Sqs-Translation-Function"
        tags                           = {
            "app_code"       = "content"
            "component_code" = "content-prospectapi"
            "env_code"       = "prod"
            "environment"    = "Prod"
            "service"        = "ProspectAPI-Sqs-Translation"
        }
        # (22 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "GIT_SHA"                      = (sensitive value)
                # (4 unchanged elements hidden)
            }
        }

        # (4 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

⚠️ Errors

@github-actions

github-actions Bot commented Apr 29, 2025

Plan Result (corpus-scheduler-lambda-cdk-production)

CI link

Plan: 0 to add, 1 to change, 0 to destroy.
  • Update
    • aws_lambda_function.corpus-scheduler-sqs-lambda_F2ECDF9F
Change Result (Click me)
  # aws_lambda_function.corpus-scheduler-sqs-lambda_F2ECDF9F will be updated in-place
  ~ resource "aws_lambda_function" "corpus-scheduler-sqs-lambda_F2ECDF9F" {
        id                             = "CorpusSchedulerLambda-Prod-SQS-Function"
      ~ qualified_arn                  = "arn:aws:lambda:us-east-1:996905175585:function:CorpusSchedulerLambda-Prod-SQS-Function:223" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:996905175585:function:CorpusSchedulerLambda-Prod-SQS-Function:223/invocations" -> (known after apply)
        tags                           = {
            "app_code"       = "content"
            "component_code" = "content-corpusschedulerlambda"
            "env_code"       = "prod"
            "environment"    = "Prod"
            "service"        = "CorpusSchedulerLambda"
        }
      ~ version                        = "223" -> (known after apply)
        # (20 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              ~ "GIT_SHA"                          = (sensitive value)
                # (7 unchanged elements hidden)
            }
        }

        # (4 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

⚠️ Errors

@github-actions

github-actions Bot commented Apr 29, 2025

Plan Result (section-manager-lambda-cdk-production)

CI link

Plan: 0 to add, 1 to change, 0 to destroy.
  • Update
    • aws_lambda_function.section-manager-sqs-lambda_D7365DAE
Change Result (Click me)
  # aws_lambda_function.section-manager-sqs-lambda_D7365DAE will be updated in-place
  ~ resource "aws_lambda_function" "section-manager-sqs-lambda_D7365DAE" {
        id                             = "SectionManagerLambda-Prod-SQS-Function"
      ~ qualified_arn                  = "arn:aws:lambda:us-east-1:996905175585:function:SectionManagerLambda-Prod-SQS-Function:20" -> (known after apply)
      ~ qualified_invoke_arn           = "arn:aws:apigateway:us-east-1:lambda:path/2015-03-31/functions/arn:aws:lambda:us-east-1:996905175585:function:SectionManagerLambda-Prod-SQS-Function:20/invocations" -> (known after apply)
        tags                           = {
            "app_code"       = "content"
            "component_code" = "content-sectionmanagerlambda"
            "env_code"       = "prod"
            "environment"    = "Prod"
            "service"        = "SectionManagerLambda"
        }
      ~ version                        = "20" -> (known after apply)
        # (20 unchanged attributes hidden)

      ~ environment {
          ~ variables = {
              - "GIT_SHA"     = "ed44aa12a7bfd0d8e14b006dcc7f7057b3cfa5bc" -> null
                # (5 unchanged elements hidden)
            }
        }

        # (4 unchanged blocks hidden)
    }

Plan: 0 to add, 1 to change, 0 to destroy.

⚠️ Errors

@github-actions

github-actions Bot commented Apr 29, 2025

Plan Result (prospect-api-cdk-production)

CI link

Plan: 0 to add, 2 to change, 0 to destroy.
  • Update
    • aws_dynamodb_table.dynamodb_prospects_dynamodb_table_9854E41E
    • aws_iam_policy.application_ecs_service_ecs-iam_ecs-task-role-policy_6FC89FB6
Change Result (Click me)
  # data.aws_iam_policy_document.application_ecs_service_ecs-iam_data-ecs-task-role-policy_090CC3AD will be read during apply
  # (depends on a resource or a module with changes pending)
 <= data "aws_iam_policy_document" "application_ecs_service_ecs-iam_data-ecs-task-role-policy_090CC3AD" {
      + id            = (known after apply)
      + json          = (known after apply)
      + minified_json = (known after apply)
      + version       = "2012-10-17"

      + statement {
          + actions   = [
              + "dynamodb:BatchGet*",
              + "dynamodb:DescribeTable",
              + "dynamodb:Get*",
              + "dynamodb:Query",
              + "dynamodb:Scan",
              + "dynamodb:UpdateItem",
            ]
          + effect    = "Allow"
          + resources = [
              + "arn:aws:dynamodb:us-east-1:996905175585:table/PROAPI-Prod-Prospects",
              + "arn:aws:dynamodb:us-east-1:996905175585:table/PROAPI-Prod-Prospects/*",
            ]
        }
      + statement {
          + actions   = [
              + "s3:*",
            ]
          + effect    = "Allow"
          + resources = [
              + "arn:aws:s3:::pocket-prospectapi-prod-images",
              + "arn:aws:s3:::pocket-prospectapi-prod-images/*",
            ]
        }
      + statement {
          + actions   = [
              + "events:PutEvents",
            ]
          + effect    = "Allow"
          + resources = [
              + "arn:aws:events:us-east-1:996905175585:event-bus/PocketEventBridge-Prod-Shared-Event-Bus",
            ]
        }
      + statement {
          + actions   = [
              + "logs:CreateLogGroup",
              + "logs:CreateLogStream",
              + "logs:DescribeLogGroups",
              + "logs:DescribeLogStreams",
              + "logs:PutLogEvents",
            ]
          + effect    = "Allow"
          + resources = [
              + "*",
            ]
        }
    }

  # aws_dynamodb_table.dynamodb_prospects_dynamodb_table_9854E41E will be updated in-place
  ~ resource "aws_dynamodb_table" "dynamodb_prospects_dynamodb_table_9854E41E" {
        id                          = "PROAPI-Prod-Prospects"
        name                        = "PROAPI-Prod-Prospects"
        tags                        = {
            "app_code"       = "content"
            "component_code" = "content-prospectapi"
            "env_code"       = "prod"
            "environment"    = "Prod"
            "service"        = "ProspectAPI"
        }
        # (9 unchanged attributes hidden)

      - global_secondary_index {
          - hash_key           = "scheduledSurfaceGuid" -> null
          - name               = "scheduledSurfaceGuid-prospectType" -> null
          - non_key_attributes = [] -> null
          - projection_type    = "ALL" -> null
          - range_key          = "prospectType" -> null
          - read_capacity      = 0 -> null
          - write_capacity     = 0 -> null
        }
      + global_secondary_index {
          + hash_key           = "scheduledSurfaceGuid"
          + name               = "scheduledSurfaceGuid-prospectType"
          + non_key_attributes = []
          + projection_type    = "ALL"
          + range_key          = "prospectType"
          + read_capacity      = 5
          + write_capacity     = 5
        }

        # (5 unchanged blocks hidden)
    }

  # aws_iam_policy.application_ecs_service_ecs-iam_ecs-task-role-policy_6FC89FB6 will be updated in-place
  ~ resource "aws_iam_policy" "application_ecs_service_ecs-iam_ecs-task-role-policy_6FC89FB6" {
        id               = "arn:aws:iam::996905175585:policy/ProspectAPI-Prod-TaskRolePolicy"
        name             = "ProspectAPI-Prod-TaskRolePolicy"
      ~ policy           = jsonencode(
            {
              - Statement = [
                  - {
                      - Action   = [
                          - "dynamodb:UpdateItem",
                          - "dynamodb:Scan",
                          - "dynamodb:Query",
                          - "dynamodb:Get*",
                          - "dynamodb:DescribeTable",
                          - "dynamodb:BatchGet*",
                        ]
                      - Effect   = "Allow"
                      - Resource = [
                          - "arn:aws:dynamodb:us-east-1:996905175585:table/PROAPI-Prod-Prospects/*",
                          - "arn:aws:dynamodb:us-east-1:996905175585:table/PROAPI-Prod-Prospects",
                        ]
                    },
                  - {
                      - Action   = "s3:*"
                      - Effect   = "Allow"
                      - Resource = [
                          - "arn:aws:s3:::pocket-prospectapi-prod-images/*",
                          - "arn:aws:s3:::pocket-prospectapi-prod-images",
                        ]
                    },
                  - {
                      - Action   = "events:PutEvents"
                      - Effect   = "Allow"
                      - Resource = "arn:aws:events:us-east-1:996905175585:event-bus/PocketEventBridge-Prod-Shared-Event-Bus"
                    },
                  - {
                      - Action   = [
                          - "logs:PutLogEvents",
                          - "logs:DescribeLogStreams",
                          - "logs:DescribeLogGroups",
                          - "logs:CreateLogStream",
                          - "logs:CreateLogGroup",
                        ]
                      - Effect   = "Allow"
                      - Resource = "*"
                    },
                ]
              - Version   = "2012-10-17"
            }
        ) -> (known after apply)
        tags             = {
            "app_code"       = "content"
            "component_code" = "content-prospectapi"
            "env_code"       = "prod"
            "environment"    = "Prod"
            "service"        = "ProspectAPI"
        }
        # (5 unchanged attributes hidden)
    }

Plan: 0 to add, 2 to change, 0 to destroy.

@lfishhead lfishhead changed the title from "prospect source changes fro pocket sunset" to "prospect source changes for pocket sunset" Apr 29, 2025
@jpetto
Contributor

jpetto commented Apr 30, 2025

i think the base code change is fine - the issue is those removed prospect types are referenced in tests, which is why so many checks are failing. luckily this is pretty easy to fix - just search the repo for the prospect types that were removed and replace them with prospect types that still exist.

happy to help get this across the finish line if needed. just ping me in slack if so.
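
The search suggested above could start from a small scan like the one below. This is only a sketch, not code from the repository: `findRemovedProspectTypes` and `REMOVED_PROSPECT_TYPES` are hypothetical names, the list is taken from the PR description, and it assumes tests refer to the types as `ProspectType.<NAME>` (matched case-insensitively).

```typescript
// Hypothetical helper, not part of content-monorepo.
// Names come from the PR description; how they are spelled in code is an assumption.
const REMOVED_PROSPECT_TYPES = [
  "COUNTS",
  "DISMISSED",
  "TIMESPENT",
  "TOP_SAVED",
  "TITLE_URL_MODELED",
];

interface Reference {
  line: number; // 1-based line number within the scanned text
  name: string; // the matched member name, exactly as written in the source
}

// Returns every `ProspectType.<removed name>` reference found in `source`.
function findRemovedProspectTypes(source: string): Reference[] {
  // \b after the name keeps longer identifiers (e.g. a name followed by
  // an underscore suffix) from being reported as a removed type.
  const pattern = new RegExp(
    "\\bProspectType\\.(" + REMOVED_PROSPECT_TYPES.join("|") + ")\\b",
    "gi"
  );
  const refs: Reference[] = [];
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    pattern.lastIndex = 0; // reset the global regex before each line
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(lines[i])) !== null) {
      refs.push({ line: i + 1, name: match[1] });
    }
  }
  return refs;
}
```

Running this over the contents of each test file would list the lines that need a replacement prospect type; which type to substitute is left to whoever updates the tests.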

Contributor

@jpetto jpetto left a comment

i think we just need to update tests throughout the repo to replace references to the removed prospect types.

@jpetto
Contributor

jpetto commented Apr 22, 2026

closing as stale. we'll be revisiting prospecting as a whole later in 2026.

@jpetto jpetto closed this Apr 22, 2026
msujaws added a commit to msujaws/hnt-review-turnaround-time that referenced this pull request Apr 23, 2026

extractSamplesFromPullRequest emitted a pending sample for every
reviewer whose explicit request was still active at the end of the
timeline, with no check on PR state. PR_QUERY hits PRs across all
states within the 3-day updatedAt lookback, so a just-closed PR would
be refetched and its outstanding requests re-emitted into pending on
every run — stranding the reviewer in the overdue list indefinitely.
PR Pocket/content-monorepo#292 closed 2026-04-22 and surfaced this.

Thread GitHub's native "closed" boolean (true for CLOSED or MERGED)
through PullRequestData -> GraphQL queries -> schema -> mapper, and
gate only the pending emission block. Sample emission is left alone:
a review that landed before closure is legitimate completed data.

Self-healing on the next collect run: the fixed extractor will drop
PR 292 (still within the 3-day window) from fresh pending, and the
fresh-replaces-existing logic in collect.ts writes pending.json
without it.
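
The gating described in this commit message might look roughly like the sketch below. The types are simplified and hypothetical (the real `PullRequestData`, timeline shape, and extractor in hnt-review-turnaround-time are not shown here); only the `closed` flag and the function name come from the message.

```typescript
// Simplified, hypothetical shapes; the real ones are not shown in the commit message.
interface TimelineEvent {
  kind: "requested" | "reviewed";
  reviewer: string;
  at: string; // ISO-8601 timestamp
}

interface PullRequestData {
  number: number;
  closed: boolean; // per the commit message: true for CLOSED or MERGED
  timeline: TimelineEvent[];
}

interface Sample { reviewer: string; requestedAt: string; reviewedAt: string; }
interface Pending { reviewer: string; requestedAt: string; }

function extractSamplesFromPullRequest(
  pr: PullRequestData
): { samples: Sample[]; pending: Pending[] } {
  const open: Record<string, string> = {}; // reviewer -> time of the active request
  const samples: Sample[] = [];

  for (const event of pr.timeline) {
    if (event.kind === "requested") {
      open[event.reviewer] = event.at;
    } else {
      const requestedAt: string | undefined = open[event.reviewer];
      if (requestedAt !== undefined) {
        // Completed reviews are emitted regardless of the PR's state.
        samples.push({ reviewer: event.reviewer, requestedAt, reviewedAt: event.at });
        delete open[event.reviewer];
      }
    }
  }

  const pending: Pending[] = [];
  if (!pr.closed) {
    // Only open PRs can still have outstanding review requests.
    for (const reviewer of Object.keys(open)) {
      pending.push({ reviewer, requestedAt: open[reviewer] });
    }
  }
  return { samples, pending };
}
```

With this shape, a closed PR that still has an unanswered request contributes its completed samples but nothing to pending, which is the behaviour the message describes.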

msujaws added a commit to msujaws/hnt-review-turnaround-time that referenced this pull request Apr 23, 2026

Adds src/scripts/pruneClosedPending.ts — a one-shot cleanup that
walks github pending entries, queries each PR's "closed" state via
GraphQL, and filters out those whose PR is closed. Runs after
sourcing .env for GH_PAT: bun run src/scripts/pruneClosedPending.ts

Running it against the current snapshot dropped one entry
(Pocket/content-monorepo#292, reviewer katerinachinnappan, originally
requested 2025-04-29) that the pre-fix extractor had stranded in
pending.json after the PR closed on 2026-04-22. Pending went 41 -> 40.

The collect pipeline's fixed extractor would have healed this on the
next scheduled run; the script lets us fix the dashboard immediately
without waiting for the cron, and is kept in-tree in case a similar
pending-vs-closed drift re-emerges.
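
The filtering step of such a cleanup can be sketched as below. This is not the contents of pruneClosedPending.ts: the entry shape, the `prunePending` name, and the lookup keyed by `owner/repo#number` are assumptions, and the GraphQL query that would fill the lookup is left out.

```typescript
// Hypothetical entry shape; the real pending.json format is not shown here.
interface PendingEntry {
  repo: string;        // e.g. "Pocket/content-monorepo"
  number: number;      // pull request number
  reviewer: string;
  requestedAt: string; // ISO-8601 date
}

// Builds the lookup key used in this sketch, e.g. "Pocket/content-monorepo#292".
function prKey(entry: PendingEntry): string {
  return entry.repo + "#" + entry.number;
}

// Keeps every entry whose PR is not known to be closed.
// `closedByPr` stands in for the per-PR "closed" state fetched elsewhere.
function prunePending(
  entries: PendingEntry[],
  closedByPr: Record<string, boolean>
): PendingEntry[] {
  // Entries with no known state are kept rather than dropped.
  return entries.filter((entry) => closedByPr[prKey(entry)] !== true);
}
```

Keeping entries whose state is unknown is a conservative choice in this sketch: a failed lookup then leaves the dashboard unchanged instead of silently removing a reviewer's outstanding request.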