-
Notifications
You must be signed in to change notification settings - Fork 695
Preserve images within figure elements #989
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Context for this change: For recipe website: https://www.allrecipes.com/hot-honey-brussels-sprouts-recipe-11832010 Readability omits image under "Directions" section. |
|
The "1 failing check" was from |
|
Apologies for the slow response, I was at a dedicated workweek last week. In addition to reviewing this PR more closely, I'll try to find some time this week or next to get some other folks to sign up to help with reviews here.
Ideally submit a separate CL with just that change?
Would you mind adding this or a similar case as a testcase so we don't regress this in future? |
| } | ||
|
|
||
| // Handle <img> buried inside nested <div> layers in <figure>. | ||
| if (tag === "div" && this._hasAncestorTag(node, "figure") && this._isSingleImage(node)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works but is a little inefficient if the divs are all nested, because the getElementsByTagName call that is being used for the divs here will find each of the ancestor divs and re-walk it to find the same image (and the ancestor chain to find the same figure).
Would it be feasible to de-nest singly nested divs with images? Along the lines of _prepareBrs, do a prepareImages or something like that. Presumably the nesting of all the divs isn't actually useful for anything if we're removing content otherwise - though I suppose there will be a question of what to do with ids/classnames in terms of scoring...
Also, I'm curious how this works out when there are figure tags with these deeply nested images that also have other content in them (so they fail the _isSingleImage check). From a naive inspection of the page you linked (haven't tried to debug readability right now), it would seem that the page in question does have a noscript tag in the figure as well...
... in fact, why doesn't the logic in this helper already deal with all of this? Why does it fall down for this website?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The <noscript> tags in our example embeds <img> as escaped text content, not HTML tags. That seems to be why _isSingleImage() fails the check.
Perhaps this was #990 ? I just merged that, hopefully just rebasing will address this. |
|
Egh, sorry for the spam, but actually looking at the logfile for the failing PR I think the ? Which would kind of make sense, in that I didn't think the automation would block on |
|
Okay I had more time to dig into _unwrapNoscriptImages(doc) {
...
// Next find noscript and try to extract its image
var noscripts = Array.from(doc.getElementsByTagName("noscript"));
this._forEachNode(noscripts, function (noscript) {
// Parse content of noscript and make sure it only contains image
if (!this._isSingleImage(noscript)) {
return;
}
var tmp = doc.createElement("div");
// We're running in the document context, and using unmodified
// document contents, so doing this should be safe.
// (Also we heavily discourage people from allowing script to
// run at all in this document...)
// eslint-disable-next-line no-unsanitized/property
tmp.innerHTML = noscript.innerHTML;Background info: When JS is available,
So we should keep the approach of this CL? |
This CL fixes an issue where images nested within `<div>` elements inside a `<figure>` tag were being incorrectly removed by Readability's cleaning process. Specifically, a structure like `<figure><div><div><img></div></div></figure>` would first be transformed to `<figure><div><p><img></p></div></figure>`, and then the outer `<div>` (and its contents) would be erroneously identified as extraneous and removed. The fix introduces a targeted exception within _cleanConditionally(). It prevents the removal of `<div>` elements that meet the following criteria: * The element is a `<div>`. * The `<div>` is an ancestor of a `<figure>` element. * The `<div>` contains a single `<img>` element (potentially nested). Also add test case allrecipes-1 taken from: https://www.allrecipes.com/hot-honey-brussels-sprouts-recipe-11832010
|
Added new test case, PTAL. Thanks! |
2afac51 to
c2ecf29
Compare
Thanks for doing this digging, @samuelhuang !
To be clear, this is still passing an element, but the element only has text content rather than a DOM tree representing that "HTML" (text content), is what you mean, right? I was confused why this would not have caused test failures if it was so different from before (without the code moving). From some searching, it seems we're not encountering issues in automation because Perhaps we should update
I think this is the part I would prefer to fix, if possible? I tried to do this in #993 . PTAL and let me know what you think? I wanted to link to a comparison of the diff --git a/test/test-pages/allrecipes-1/expected.html b/test/test-pages/allrecipes-1/expected.html
index 2190e23d7d..2939e04280 100644
--- a/test/test-pages/allrecipes-1/expected.html
+++ b/test/test-pages/allrecipes-1/expected.html
@@ -12,9 +12,7 @@
<div data-tooltip="Nicole Russell is a prolific contributor to Allrecipes and an avid member of the Allrecipes Allstars." tabindex="0" data-inline-tooltip="true">
<p><a href="https://www.allrecipes.com/nicole-russell-7113592" rel="nocaes" data-trigger-link="true" tabindex="-1">Nicole Russell</a></p>
<div id="mntl-author-tooltip_1-0">
- <div>
- <p><img src="https://www.allrecipes.com/thmb/uYVcXCSJ-dbRVP6aENNapCfLdlY=/75x75/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg" width="75" height="75" srcset="https://www.allrecipes.com/thmb/Trlj084IC3wBi4LKJPIKHEidK_Y=/40x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 40w, https://www.allrecipes.com/thmb/PjK7K3x5635tX12yHK_IGTJCha8=/58x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 58w, https://www.allrecipes.com/thmb/r4I1SWydIlNIwza6llxnx6EUpZ4=/76x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 76w, https://www.allrecipes.com/thmb/pp0xDFVleEUku7amHGoSEAIxVdw=/94x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 94w, https://www.allrecipes.com/thmb/XIiaJQ1BUWIDHcE9XW8jQyRKIBQ=/112x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 112w" sizes="80px" alt="Nicole Russell" data-src="https://www.allrecipes.com/thmb/uYVcXCSJ-dbRVP6aENNapCfLdlY=/75x75/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg" data-srcset="https://www.allrecipes.com/thmb/Trlj084IC3wBi4LKJPIKHEidK_Y=/40x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 40w, https://www.allrecipes.com/thmb/PjK7K3x5635tX12yHK_IGTJCha8=/58x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 58w, https://www.allrecipes.com/thmb/r4I1SWydIlNIwza6llxnx6EUpZ4=/76x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 76w, https://www.allrecipes.com/thmb/pp0xDFVleEUku7amHGoSEAIxVdw=/94x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 94w, https://www.allrecipes.com/thmb/XIiaJQ1BUWIDHcE9XW8jQyRKIBQ=/112x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 112w" /></p>
- </div>
+ <p><img src="https://www.allrecipes.com/thmb/uYVcXCSJ-dbRVP6aENNapCfLdlY=/75x75/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg" width="75" height="75" srcset="https://www.allrecipes.com/thmb/Trlj084IC3wBi4LKJPIKHEidK_Y=/40x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 40w, https://www.allrecipes.com/thmb/PjK7K3x5635tX12yHK_IGTJCha8=/58x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 58w, https://www.allrecipes.com/thmb/r4I1SWydIlNIwza6llxnx6EUpZ4=/76x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 76w, https://www.allrecipes.com/thmb/pp0xDFVleEUku7amHGoSEAIxVdw=/94x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 94w, https://www.allrecipes.com/thmb/XIiaJQ1BUWIDHcE9XW8jQyRKIBQ=/112x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 112w" sizes="80px" alt="Nicole Russell" data-src="https://www.allrecipes.com/thmb/uYVcXCSJ-dbRVP6aENNapCfLdlY=/75x75/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg" data-srcset="https://www.allrecipes.com/thmb/Trlj084IC3wBi4LKJPIKHEidK_Y=/40x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 40w, https://www.allrecipes.com/thmb/PjK7K3x5635tX12yHK_IGTJCha8=/58x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 58w, https://www.allrecipes.com/thmb/r4I1SWydIlNIwza6llxnx6EUpZ4=/76x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 76w, https://www.allrecipes.com/thmb/pp0xDFVleEUku7amHGoSEAIxVdw=/94x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 94w, https://www.allrecipes.com/thmb/XIiaJQ1BUWIDHcE9XW8jQyRKIBQ=/112x0/filters:no_upscale():max_bytes(150000):strip_icc()/NicoleRussellSoupLovingNicole-9607577801804bd09ea93bbdb7fcae6d.jpg 112w" /></p>
<p> Nicole Russell is a prolific contributor to Allrecipes and an avid member of the Allrecipes Allstars. </p>
</div>
</div>
@@ -24,9 +22,7 @@
</div>
<div>
<figure id="figure-article_1-0" data-click-tracked="false">
- <div>
- <p><img src="https://www.allrecipes.com/thmb/RDigrCswgpVoo9dPZxanUuRIudI=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-Beauty-4x3-5ee1ce8dde46479da6759f4cf0d8181c.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/LNvP40zVI2ZbS5O3gRnmmadoMEM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-Beauty-4x3-5ee1ce8dde46479da6759f4cf0d8181c.jpg 750w" sizes="750px" alt="Dish of roasted brussels sprouts with crumbled cheese and garnish" /></p>
- </div>
+ <img src="https://www.allrecipes.com/thmb/RDigrCswgpVoo9dPZxanUuRIudI=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-Beauty-4x3-5ee1ce8dde46479da6759f4cf0d8181c.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/LNvP40zVI2ZbS5O3gRnmmadoMEM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-Beauty-4x3-5ee1ce8dde46479da6759f4cf0d8181c.jpg 750w" sizes="750px" alt="Dish of roasted brussels sprouts with crumbled cheese and garnish" />
</figure>
<div id="article-content_1-0">
<div id="mm-recipes-details_1-0">
@@ -106,9 +102,7 @@
<li id="mntl-sc-block_2-0">
<p id="mntl-sc-block_3-0"> Gather all ingredients. Preheat the oven to 400 degrees F (200 degrees C). Line a large baking sheet with aluminum foil. </p>
<figure id="mntl-sc-block_4-0">
- <div>
- <p><img src="https://www.allrecipes.com/thmb/aqdax3V1v0dM1iYGEdk80SuiUW4=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/YgfY5c5ZN8GjWNeoWQQ5ALoNWhM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg 750w" sizes="750px" alt="Ingredients for a recipe including Brussels sprouts oil honey cheese and seasonings arranged in bowls on a marble surface" data-src="https://www.allrecipes.com/thmb/aqdax3V1v0dM1iYGEdk80SuiUW4=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg" data-srcset="https://www.allrecipes.com/thmb/YgfY5c5ZN8GjWNeoWQQ5ALoNWhM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg 750w" data-hi-res-src="https://www.allrecipes.com/thmb/aqdax3V1v0dM1iYGEdk80SuiUW4=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg" /></p>
- </div>
+ <img src="https://www.allrecipes.com/thmb/aqdax3V1v0dM1iYGEdk80SuiUW4=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/YgfY5c5ZN8GjWNeoWQQ5ALoNWhM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg 750w" sizes="750px" alt="Ingredients for a recipe including Brussels sprouts oil honey cheese and seasonings arranged in bowls on a marble surface" data-src="https://www.allrecipes.com/thmb/aqdax3V1v0dM1iYGEdk80SuiUW4=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg" data-srcset="https://www.allrecipes.com/thmb/YgfY5c5ZN8GjWNeoWQQ5ALoNWhM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg 750w" data-hi-res-src="https://www.allrecipes.com/thmb/aqdax3V1v0dM1iYGEdk80SuiUW4=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-01-ced062a22ba24062afd44585679f2b11.jpg" />
<figcaption id="mntl-figure-caption_1-0">
<p>
<span>Allrecipes / Julia Hartbeck</span>
@@ -119,9 +113,7 @@
<li id="mntl-sc-block_7-0">
<p id="mntl-sc-block_8-0"> Place Brussels sprouts in a bowl. Add oil, hot honey, salt, and pepper. Stir to combine and transfer to the baking sheet. </p>
<figure id="mntl-sc-block_9-0">
- <div>
- <p><img src="https://www.allrecipes.com/thmb/eUPioUqRphcDxRjTsAyjPzwxZJc=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/MABSZFZSQiMH9-AAdON6zeW12Xc=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg 750w" sizes="750px" alt="A baking sheet with halved Brussels sprouts arranged on foil prepared for cooking" data-src="https://www.allrecipes.com/thmb/eUPioUqRphcDxRjTsAyjPzwxZJc=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg" data-srcset="https://www.allrecipes.com/thmb/MABSZFZSQiMH9-AAdON6zeW12Xc=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg 750w" data-hi-res-src="https://www.allrecipes.com/thmb/eUPioUqRphcDxRjTsAyjPzwxZJc=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg" /></p>
- </div>
+ <img src="https://www.allrecipes.com/thmb/eUPioUqRphcDxRjTsAyjPzwxZJc=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/MABSZFZSQiMH9-AAdON6zeW12Xc=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg 750w" sizes="750px" alt="A baking sheet with halved Brussels sprouts arranged on foil prepared for cooking" data-src="https://www.allrecipes.com/thmb/eUPioUqRphcDxRjTsAyjPzwxZJc=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg" data-srcset="https://www.allrecipes.com/thmb/MABSZFZSQiMH9-AAdON6zeW12Xc=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg 750w" data-hi-res-src="https://www.allrecipes.com/thmb/eUPioUqRphcDxRjTsAyjPzwxZJc=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-02-9032bfa0cf5f484fb0effe17210fb06a.jpg" />
<figcaption id="mntl-figure-caption_2-0">
<p>
<span>Allrecipes / Julia Hartbeck</span>
@@ -132,9 +124,7 @@
<li id="mntl-sc-block_12-0">
<p id="mntl-sc-block_13-0"> Roast in the preheated oven for 20 minutes. </p>
<figure id="mntl-sc-block_14-0">
- <div>
- <p><img src="https://www.allrecipes.com/thmb/UB3Jz-pfoXewTAcjSJz4RBjrigY=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/2eqyp1WAanqlJ_UvmyL7dRxlfoM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg 750w" sizes="750px" alt="Roasted Brussels sprouts on a foillined baking sheet" data-src="https://www.allrecipes.com/thmb/UB3Jz-pfoXewTAcjSJz4RBjrigY=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg" data-srcset="https://www.allrecipes.com/thmb/2eqyp1WAanqlJ_UvmyL7dRxlfoM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg 750w" data-hi-res-src="https://www.allrecipes.com/thmb/UB3Jz-pfoXewTAcjSJz4RBjrigY=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg" /></p>
- </div>
+ <img src="https://www.allrecipes.com/thmb/UB3Jz-pfoXewTAcjSJz4RBjrigY=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/2eqyp1WAanqlJ_UvmyL7dRxlfoM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg 750w" sizes="750px" alt="Roasted Brussels sprouts on a foillined baking sheet" data-src="https://www.allrecipes.com/thmb/UB3Jz-pfoXewTAcjSJz4RBjrigY=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg" data-srcset="https://www.allrecipes.com/thmb/2eqyp1WAanqlJ_UvmyL7dRxlfoM=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg 750w" data-hi-res-src="https://www.allrecipes.com/thmb/UB3Jz-pfoXewTAcjSJz4RBjrigY=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-03-8b8c035f2a964810b5e253bb4a95ce8d.jpg" />
<figcaption id="mntl-figure-caption_3-0">
<p>
<span>Allrecipes / Julia Hartbeck</span>
@@ -145,9 +135,7 @@
<li id="mntl-sc-block_17-0">
<p id="mntl-sc-block_18-0"> Transfer sprouts to a bowl. Top with feta and scallions. Toss to combine. Serve immediately. </p>
<figure id="mntl-sc-block_19-0">
- <div>
- <p><img src="https://www.allrecipes.com/thmb/TKyvsRHU4FOU6HJ50w5StS_jOqk=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/lUcNGpjFIbAT1nndRb9huLaamgo=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg 750w" sizes="750px" alt="Bowl of roasted Brussels sprouts garnished with toppings and a spoon on the side" data-src="https://www.allrecipes.com/thmb/TKyvsRHU4FOU6HJ50w5StS_jOqk=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg" data-srcset="https://www.allrecipes.com/thmb/lUcNGpjFIbAT1nndRb9huLaamgo=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg 750w" data-hi-res-src="https://www.allrecipes.com/thmb/TKyvsRHU4FOU6HJ50w5StS_jOqk=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg" /></p>
- </div>
+ <img src="https://www.allrecipes.com/thmb/TKyvsRHU4FOU6HJ50w5StS_jOqk=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg" width="4000" height="3000" srcset="https://www.allrecipes.com/thmb/lUcNGpjFIbAT1nndRb9huLaamgo=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg 750w" sizes="750px" alt="Bowl of roasted Brussels sprouts garnished with toppings and a spoon on the side" data-src="https://www.allrecipes.com/thmb/TKyvsRHU4FOU6HJ50w5StS_jOqk=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg" data-srcset="https://www.allrecipes.com/thmb/lUcNGpjFIbAT1nndRb9huLaamgo=/750x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg 750w" data-hi-res-src="https://www.allrecipes.com/thmb/TKyvsRHU4FOU6HJ50w5StS_jOqk=/1500x0/filters:no_upscale():max_bytes(150000):strip_icc()/11832010-hot-honey-brussels-sprouts-recipe-VAT-step-04-59530f25a17644ab8d6d5ebd94c6d064.jpg" />
<figcaption id="mntl-figure-caption_4-0">
<p>
<span>Allrecipes / Julia Hartbeck</span> |
Yes.
Perhaps we can just read from
Sounds good; I'm happy to abandon my pull request and take your approach! |
This CL fixes an issue where images nested within
<div>elementsinside a
<figure>tag were being incorrectly removed by Readability'scleaning process. Specifically, a structure like
<figure><div><div><img></div></div></figure>would first betransformed to
<figure><div><p><img></p></div></figure>, and thenthe outer
<div>(and its contents) would be erroneously identifiedas extraneous and removed.
The fix introduces a targeted exception within _cleanConditionally().
It prevents the removal of
<div>elements that meet the followingcriteria:
<div>.<div>is an ancestor of a<figure>element.<div>contains a single<img>element (potentially nested).Also add test case allrecipes-1 taken from:
https://www.allrecipes.com/hot-honey-brussels-sprouts-recipe-11832010