Contributor

@j616 j616 commented Jul 25, 2025

Details

While developing fine-grained auth approaches in #115, we noticed several deficiencies with the Webhooks endpoints:

  • The CRUD patterns on the Webhooks endpoint do not match the rest of the API
  • The use of a secret attribute for auth on this endpoint is vulnerable to the loss of that secret: if the secret is lost, the webhook can no longer be edited or deleted.
  • The BBC R&D (internal experimental) implementation and the AWS implementation of TAMS differed in their interpretation of what the primary key is
  • Strict interpretation of the spec, and hiding of the secret api_key_value in the Webhooks listing, could result in multiple identical-looking Webhooks being returned
  • The existing approach prevents "matrixing" of event types and filters (e.g. to receive all flows/created events, but only flows/segments_added for specific flows)

This PR proposes changes to the Webhooks endpoint to fix these issues.
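
To illustrate the "matrixing" case from the last bullet, here is a minimal sketch. It assumes each webhook has its own identifier, so that two registrations can target the same URL with different event types and filters. The `Webhook` class, its field names and the `matching_webhooks` helper are hypothetical and are not taken from the spec.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Webhook:
    """Hypothetical, simplified webhook registration (not the spec schema)."""
    id: str
    url: str
    events: set[str]
    # None means "no filter": events for every flow match.
    flow_ids: Optional[set[str]] = None


def matching_webhooks(webhooks, event_type, flow_id):
    """Return the ids of the webhooks that should receive this event."""
    return [
        w.id
        for w in webhooks
        if event_type in w.events
        and (w.flow_ids is None or flow_id in w.flow_ids)
    ]


# Two registrations for the same URL: every flows/created event,
# but flows/segments_added only for one specific flow.
hooks = [
    Webhook("all-created", "https://example.com/hook", {"flows/created"}),
    Webhook("some-segments", "https://example.com/hook",
            {"flows/segments_added"}, {"flow-a"}),
]
```

With a single registration per URL, the flow filter would apply to both event types at once, which is the limitation the bullet describes.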

Jira Issue (if relevant)

Jira URL: https://jira.dev.bbc.co.uk/browse/CLOUDFIT-5461

Related PRs

If merged before #115, add the new endpoints to the auth logic listings in that PR. If merged after, add that logic in this PR.

Submitter PR Checks

(tick as appropriate)

  • PR completes task/fixes bug
  • API version has been incremented if necessary
  • ADR status has been updated, and ADR implementation has been recorded
  • Documentation updated (README, etc.)
  • PR added to Jira Issue (if relevant)
  • Follow-up stories added to Jira

Reviewer PR Checks

(tick as appropriate)

  • PR completes task/fixes bug
  • Design makes sense, and fits with our current code base
  • Code is easy to follow
  • PR size is sensible
  • Commit history is sensible and tidy

Info on PRs

The checks above are guidelines. They don't all have to be ticked, but they should all have been considered.

@j616 j616 requested a review from a team as a code owner July 25, 2025 14:22
Member

@samdbmg samdbmg left a comment

Changes LGTM - couple of docs nits inlined

Comment on lines 265 to 275
Register to receive event notifications as webhooks on a specified URL. Webhook messages will conform to the
format in the `webhooks` section of the API docs, depending on the event type (as defined in the same section).
Availability of this endpoint is indicated by the name "webhooks" appearing in the `event_stream_mechanisms`
list on the service endpoint.
HTTP requests from the service SHOULD include a `api_key_name` header with the 'api_key_value' value. Clients SHOULD verify this against the value they provided when registering the webhook.
API implementations MAY partially support event filtering and transformations.
API implementations SHALL return a 400 response code if the filtering or transformation specified in the request is not supported.
API implementations SHOULD consider the security implementations of providing webhooks, and include appropriate mitigations against Server Side Request Forgery (SSRF) attacks and similar. API implementations SHOULD take appropriate steps to authorize the modification of existing webhooks. This may take the form of RBAC, or ABAC.
Member

I think this might need a tweak to match this being an update endpoint now

Contributor Author

Tweaked in fdcd744
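
The client-side check mentioned in the quoted description (verifying the `api_key_name` header against the `api_key_value` supplied at registration) could be sketched as follows. This is an illustration only: the `is_authentic` helper and its signature are assumptions, not part of the spec or of any implementation discussed here.

```python
import hmac


def is_authentic(headers: dict[str, str], api_key_name: str, api_key_value: str) -> bool:
    """Check that a request carries the header registered with the webhook."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    received = {k.lower(): v for k, v in headers.items()}.get(api_key_name.lower())
    if received is None:
        return False
    # compare_digest takes time independent of where the values differ,
    # so it does not leak how many leading characters matched.
    return hmac.compare_digest(received.encode(), api_key_value.encode())
```

A receiver would call this with the incoming request headers and reject the request (for example with a 401 or 403) when it returns `False`.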

Contributor

danjhd commented Sep 2, 2025

Something that came up in a conversation today and may be something to include in this PR. Should we be supplying more "data" in the deleted events payload, for flows and sources? We had a use case where we wanted to be notified if a "multi" Source is deleted. Since the payload for deleted events only has the id, this is not currently easy/possible.

Contributor Author

j616 commented Sep 2, 2025

> Something that came up in a conversation today and may be something to include in this PR. Should we be supplying more "data" in the deleted events payload, for flows and sources? We had a use case where we wanted to be notified if a "multi" Source is deleted. Since the payload for deleted events only has the id, this is not currently easy/possible.

I think that sounds reasonable. Would you be able to provide some user stories/use cases we can use to inform the change? I'm mainly wondering if that data should be the whole Source/Flow metadata, or just a subset. I wonder if it should be a separate ADR and/or PR to avoid adding even more to this very large change set!

Contributor

danjhd commented Sep 2, 2025

> > Something that came up in a conversation today and may be something to include in this PR. Should we be supplying more "data" in the deleted events payload, for flows and sources? We had a use case where we wanted to be notified if a "multi" Source is deleted. Since the payload for deleted events only has the id, this is not currently easy/possible.
>
> I think that sounds reasonable. Would you be able to provide some user stories/use cases we can use to inform the change? I'm mainly wondering if that data should be the whole Source/Flow metadata, or just a subset. I wonder if it should be a separate ADR and/or PR to avoid adding even more to this very large change set!

A separate PR is the best plan. I just figured it was worth "sounding out" the idea here first in case it was considered "silly" ;)

"204":
  description: No content. The webhook has been deleted.
"403":
  description: Forbidden. You do not have permission to modify this webhook.
Contributor

Since we don't have a read-only field on the webhook, like we do on Flows, I was just wondering if this 403 is needed explicitly? I guess it will be produced by the RBAC/ABAC/Coarse access rules, unless I have missed something. It feels like if we are including a 403 here, we should go back and add 403 to all methods?

Contributor Author

Yes. This was added for access control use cases. We should make sure we're consistent with this.

Contributor

danjhd commented Sep 18, 2025

I think it might be a good idea to have a status field (values of enabled and disabled) on a webhook. This would serve 2 purposes:

  1. It would be updateable on the PUT endpoint so that a client can disable a webhook for a period if needed without deleting it and without the need to alter it in any other way.
  2. It would allow the implementation to automatically disable webhooks after a defined number of failures, to avoid continuously loading the system with POSTs that will fail. The addition of an optional error field (readonly) would be useful to give feedback to the client on the reason for the disablement if it was done by the system rather than the client.

In our implementation I need to solve the problem of repeated failures and thought about adding a disabled flag, but I think this might be useful in the main spec, and also worth adding as part of this PR as it is the same schema?
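
The automatic disabling described in point 2 could look roughly like the sketch below. It is an illustration under assumptions, not the schema that ended up in the spec: `WebhookState`, `record_delivery`, the `consecutive_failures` counter and the `MAX_FAILURES` threshold are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

MAX_FAILURES = 3  # hypothetical threshold; an implementation would choose its own


@dataclass
class WebhookState:
    status: str = "enabled"        # "enabled" or "disabled"
    error: Optional[str] = None    # read-only to clients; set by the service
    consecutive_failures: int = 0  # internal bookkeeping, not exposed


def record_delivery(state: WebhookState, ok: bool, reason: str = "") -> None:
    """Update a webhook's state after one delivery attempt."""
    if ok:
        # Any successful delivery resets the failure count.
        state.consecutive_failures = 0
        return
    state.consecutive_failures += 1
    if state.consecutive_failures >= MAX_FAILURES:
        state.status = "disabled"
        state.error = (
            f"disabled after {state.consecutive_failures} failed deliveries: {reason}"
        )
```

A client that sets the status to disabled itself would leave `error` empty, which is how the two cases in the comment above could be told apart.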

Contributor Author

j616 commented Sep 23, 2025

Rebased

Contributor Author

j616 commented Sep 23, 2025

> I think it might be a good idea to have a status field (values of enabled and disabled) on a webhook. This would serve 2 purposes:
>
>   1. It would be updateable on the PUT endpoint so that a client can disable a webhook for a period if needed without deleting it and without the need to alter it in any other way.
>   2. It would allow the implementation to automatically disable webhooks after a defined number of failures, to avoid continuously loading the system with POSTs that will fail. The addition of an optional error field (readonly) would be useful to give feedback to the client on the reason for the disablement if it was done by the system rather than the client.
>
> In our implementation I need to solve the problem of repeated failures and thought about adding a disabled flag, but I think this might be useful in the main spec, and also worth adding as part of this PR as it is the same schema?

Initial solution proposed in #150 was merged into this PR. This was expanded on in 087510c

@j616 j616 requested a review from samdbmg September 24, 2025 08:54
Contributor Author

j616 commented Sep 24, 2025

Rebased

Member

@samdbmg samdbmg left a comment

One question, but LGTM

Contributor Author

j616 commented Sep 24, 2025

Rebasing again

@j616 j616 merged commit 1e87170 into main Sep 24, 2025
8 checks passed
@j616 j616 deleted the jamessa-webhooks branch September 24, 2025 14:33