I have a list of websites from which I am trying to scrape a given piece of info. For each site, once I have found that info, I want to stop and move on to the next (with several sites being scraped concurrently).

I have tried the following approach (emptying the request queue when my goal is found):
```python
from crawlee.playwright_crawler import PlaywrightCrawler, PlaywrightCrawlingContext
from crawlee.storages import RequestQueue

request_queue = await RequestQueue.open()

crawler = PlaywrightCrawler(
    request_provider=request_queue,
    headless=True,  # Hide the browser window.
    browser_type='firefox',  # Use the Firefox browser.
)

await crawler.add_requests([root_url])

@crawler.router.default_handler
async def request_handler(context: PlaywrightCrawlingContext) -> None:
    # ...
    if found:
        await request_queue.drop()
```

But that's actually raising an error:

```
ValueError: Request queue with id "default" does not exist.
```

Any idea how I should proceed to have finer control over the request queue? Thanks!