fix(conf): fix meta-externalagent UA case and use ASN verification #17

Open
adri wants to merge 1 commit into cnlangzi:main from adri:fix/meta-externalagent
adri commented Mar 19, 2026

The actual User-Agent string is lowercase "meta-externalagent", not "Meta-ExternalAgent". Since UA matching is case-sensitive, the bot was not being detected.

Also switch from RDNS to ASN 32934 (Meta) verification for faster and more reliable detection.

Ref: https://developers.facebook.com/docs/sharing/webmasters/crawler
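
To make the case-sensitivity problem concrete, here is a minimal sketch. It assumes the matcher does a case-sensitive substring test; the project's actual matching code is not shown in this PR, so `ua_matches` and the sample header value are hypothetical.

```python
# Hypothetical matcher: assumes a case-sensitive substring test,
# which is how the PR description characterises UA matching.
def ua_matches(user_agent: str, token: str) -> bool:
    return token in user_agent


# Illustrative header value only; the exact string the crawler sends may differ.
header = "meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)"

old = ua_matches(header, "Meta-ExternalAgent")  # token before this PR
new = ua_matches(header, "meta-externalagent")  # token after this PR

print(old, new)  # prints: False True
```

With the mixed-case token the check never succeeds, so the request falls through as ordinary traffic before any verification step runs.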

Summary by Sourcery

Update Meta crawler bot configuration to correctly detect the meta-externalagent user-agent using ASN-based verification instead of RDNS/domain matching.

Bug Fixes:

  • Correct the Meta crawler user-agent string to match the actual lowercase meta-externalagent value for proper detection.

Enhancements:

  • Switch Meta crawler verification from RDNS/domain matching to ASN 32934-based verification for faster and more reliable identification.

sourcery-ai bot commented Mar 19, 2026

Reviewer's Guide

Updates the Meta AI/Facebook crawler bot definition to match the actual lowercase User-Agent string and switches verification from reverse DNS and domain list to ASN-based verification for Meta’s ASN 32934.

Sequence diagram for updated Meta crawler detection (UA + ASN)

sequenceDiagram
    actor MetaCrawler
    participant WebServer
    participant BotDetector
    participant ConfigMetaExternalagent
    MetaCrawler->>WebServer: HTTP GET /resource
    WebServer->>BotDetector: Request with headers and client_ip
    BotDetector->>ConfigMetaExternalagent: Load kind AITraining name meta-externalagent
    ConfigMetaExternalagent-->>BotDetector: ua meta-externalagent, asn 32934
    BotDetector->>BotDetector: Compare UserAgent == meta-externalagent
    alt UserAgent matches
        BotDetector->>ASNService: Lookup ASN for client_ip
        ASNService-->>BotDetector: ASN result
        alt ASN == 32934
            BotDetector-->>WebServer: Mark as verified Meta crawler
        else ASN != 32934
            BotDetector-->>WebServer: Not verified Meta crawler
        end
    else UserAgent does not match
        BotDetector-->>WebServer: Not Meta crawler
    end
    WebServer-->>MetaCrawler: HTTP response

Flow diagram for new Meta crawler verification logic

flowchart TD
    A[Start request] --> B[Read UserAgent and client_ip]
    B --> C{UserAgent == meta-externalagent?}
    C -- No --> D[Treat as normal traffic]
    C -- Yes --> E[Lookup ASN for client_ip]
    E --> F{ASN == 32934?}
    F -- Yes --> G[Classify as verified Meta AI/Facebook crawler]
    F -- No --> H[Do not classify as Meta crawler]
    D --> I[Continue standard handling]
    G --> I
    H --> I
    I --> J[End]
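
The decision flow in the diagrams above can be sketched roughly as follows. This is not the project's implementation: `classify`, `fake_lookup`, and the in-memory ASN table are hypothetical stand-ins, the UA test is assumed to be a substring match, and the IP addresses come from documentation-only ranges.

```python
from typing import Callable, Optional

META_UA = "meta-externalagent"  # UA token from the PR
META_ASN = 32934                # Meta's ASN, as stated in the PR


def classify(
    user_agent: str,
    client_ip: str,
    lookup_asn: Callable[[str], Optional[int]],
) -> str:
    """Mirror the flowchart: UA check first, ASN lookup only on a UA match."""
    if META_UA not in user_agent:
        return "normal traffic"
    if lookup_asn(client_ip) == META_ASN:
        return "verified Meta crawler"
    return "not verified"


# Stand-in for a real IP-to-ASN database (assumption, for illustration only).
FAKE_ASN_DB = {"192.0.2.10": 32934, "198.51.100.7": 64500}


def fake_lookup(ip: str) -> Optional[int]:
    return FAKE_ASN_DB.get(ip)


print(classify("meta-externalagent/1.1", "192.0.2.10", fake_lookup))
print(classify("meta-externalagent/1.1", "198.51.100.7", fake_lookup))
print(classify("Mozilla/5.0", "192.0.2.10", fake_lookup))
```

A request that claims the Meta UA but arrives from an address outside ASN 32934 is left unverified, which is the spoofing case the ASN check is meant to catch.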

File-Level Changes

Change: Align bot User-Agent matching with the actual lowercase meta-externalagent UA.
Details:
  • Change the recorded User-Agent string from "Meta-ExternalAgent" to "meta-externalagent" to ensure case-sensitive matching succeeds.
  • Keep the bot name as meta-externalagent while correcting only the UA value.
Files: bots/conf.d/meta-externalagent.yaml

Change: Switch Meta crawler verification from RDNS/domain-based checks to ASN-based checks using Meta’s ASN 32934.
Details:
  • Remove RDNS-based verification flag and associated fbsv.net, tfbnw.net, and facebook.com domain list.
  • Introduce ASN-based verification by specifying ASN 32934, corresponding to Meta, for more reliable and faster verification.
  • Update the reference documentation comment URL to Meta/Facebook’s crawler docs.
Files: bots/conf.d/meta-externalagent.yaml
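
For context on what is being removed, the sketch below shows how a domain-list RDNS check of the kind described above typically works. It is an illustration under assumptions, not the project's code: `hostname_in_domains`, `rdns_verified`, and the stub resolver are hypothetical, and a real check would call a resolver such as `socket.gethostbyaddr`, which costs a DNS round trip per lookup.

```python
from typing import Callable, Optional

# Domain list removed by this PR.
OLD_DOMAINS = ("fbsv.net", "tfbnw.net", "facebook.com")


def hostname_in_domains(hostname: str, domains=OLD_DOMAINS) -> bool:
    """True if hostname equals a listed domain or is a subdomain of one."""
    host = hostname.rstrip(".").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


def rdns_verified(ip: str, reverse_lookup: Callable[[str], Optional[str]]) -> bool:
    """Resolve the PTR name for ip and test it against the domain list."""
    hostname = reverse_lookup(ip)
    return hostname is not None and hostname_in_domains(hostname)


# Stub resolver with made-up PTR records (documentation-range addresses).
FAKE_PTR = {
    "192.0.2.10": "crawl-1.fbsv.net.",
    "198.51.100.7": "host.evilfacebook.com.",
}


def fake_reverse(ip: str) -> Optional[str]:
    return FAKE_PTR.get(ip)


print(rdns_verified("192.0.2.10", fake_reverse))    # prints: True
print(rdns_verified("198.51.100.7", fake_reverse))  # prints: False
print(rdns_verified("203.0.113.1", fake_reverse))   # prints: False
```

The last case shows one way an RDNS check can miss: an address with no PTR record cannot be verified at all, whereas an ASN check depends only on which network announces the address.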

sourcery-ai bot left a comment

Hey - I've reviewed your changes and they look great!


codecov bot commented Mar 19, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 61.50%. Comparing base (60ce193) to head (4bad4dd).
⚠️ Report is 7 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main      #17       +/-   ##
===========================================
- Coverage   72.76%   61.50%   -11.27%     
===========================================
  Files          15       24        +9     
  Lines         661     1000      +339     
===========================================
+ Hits          481      615      +134     
- Misses        136      327      +191     
- Partials       44       58       +14     
