When lein ring server-headless is used with :auto-refresh? true, the /__source_changed long-poll endpoint in ring.middleware.refresh can occupy many Jetty worker threads. With enough concurrent long-polls, normal page requests get queued and may take tens of seconds to complete.
This happens often during normal development when refreshing the page or navigating from page to page.
Reproducing
Here is a minimal project for reproducing the issue: https://github.com/luontola/ring-refresh-starvation-repro
- Run `lein ring server-headless`
- Open http://localhost:3000/ in a browser and also open network dev tools
- Refresh the page tens of times
- Or navigate from page to page. Chromium seems to get stuck after just a few page navigations. Firefox takes longer.
- Eventually the `GET /` request gets stuck in a pending state for tens of seconds
The thread dump stack-starved.txt shows tens of Jetty threads blocked in ring.middleware.refresh$watch_until. (For reference, stack-ok.txt is a thread dump of when everything is still fine.)
Ideas
I suppose long-polling can't detect when the user leaves the page, but could websockets be better? Or could websockets avoid the thread starvation issue completely?
Could lein-ring count how many long-polling requests are active, and close the oldest requests automatically? Of course we need to avoid a live loop if somebody has the site open in lots of tabs. Instead of closing the oldest requests outright, could we dynamically lower the timeout duration of the oldest requests? Alternatively, an easier solution could be to exponentially lower the timeout based on how many long-polls are active when starting a new long-poll; then at least the new requests would time out quickly and keep the total number of threads low.
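To make the last idea concrete, here is a rough sketch of the exponential timeout. It is not based on the actual ring.middleware.refresh code: the names `active-polls`, `adaptive-timeout-ms` and `with-poll-slot` are made up for illustration, and the 60 second base timeout and 1 second floor are assumed values.

```clojure
(ns example.adaptive-timeout)

;; Assumed values for illustration, not taken from ring-refresh.
(def base-timeout-ms 60000)
(def min-timeout-ms 1000)

;; Number of long-polls currently being served (hypothetical counter).
(defonce active-polls (atom 0))

(defn adaptive-timeout-ms
  "Halves the base timeout for every long-poll that is already active,
  but never goes below min-timeout-ms."
  [already-active]
  (max min-timeout-ms
       ;; Right shift by n divides by 2^n; cap the shift to stay within a long.
       (bit-shift-right base-timeout-ms (min already-active 62))))

(defn with-poll-slot
  "Calls (f timeout-ms) while counting this request as an active long-poll.
  The counter is decremented even if f throws."
  [f]
  (let [already-active (dec (swap! active-polls inc))]
    (try
      (f (adaptive-timeout-ms already-active))
      (finally
        (swap! active-polls dec)))))

(comment
  (adaptive-timeout-ms 0)   ; => 60000
  (adaptive-timeout-ms 3)   ; => 7500
  (adaptive-timeout-ms 10)  ; => 1000
  ;; The long-poll handler would block inside f for at most timeout-ms:
  (with-poll-slot (fn [timeout-ms] timeout-ms)))
```

With these numbers the sixth and later concurrent long-polls would all wait only about a second, so abandoned polls from earlier page loads would still hold their threads for the full duration, but new ones could not pile up for long.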
Can we increase the thread pool size and/or lower the long-polling delay as a workaround to delay the starvation? Or is the web browser's max connections per host limit also involved, because when navigating from page to page with Chromium it already gets stuck at about 8 requests?