Commit 2afac51

Preserve images within figure elements.
This CL fixes an issue where images nested within `<div>` elements inside a `<figure>` tag were incorrectly removed by Readability's cleaning process. Specifically, a structure like `<figure><div><div><img></div></div></figure>` would first be transformed to `<figure><div><p><img></p></div></figure>`, and then the outer `<div>` (and its contents) would be erroneously identified as extraneous and removed.

The fix introduces a targeted exception within `_cleanConditionally()`. It prevents the removal of elements that meet all of the following criteria:

* The element is a `<div>`.
* The `<div>` is a descendant of a `<figure>` element.
* The `<div>` contains a single `<img>` element (potentially nested).
1 parent d7949dc commit 2afac51

1 file changed

Lines changed: 5 additions & 0 deletions

Readability.js

@@ -2506,6 +2506,11 @@ Readability.prototype = {
         return false;
       }
 
+      // Handle <img> buried inside nested <div> layers in <figure>.
+      if (tag === "div" && this._hasAncestorTag(node, "figure") && this._isSingleImage(node)) {
+        return false;
+      }
+
       var weight = this._getClassWeight(node);
 
       this.log("Cleaning Conditionally", node);
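The condition added in this hunk can be tried outside Readability with a small sketch. Everything below is an assumption made for illustration: nodes are plain objects rather than real DOM elements, and `el`, `hasAncestorTag`, `isSingleImage`, and `isProtectedFigureDiv` are hypothetical stand-ins, not the library's `_hasAncestorTag` and `_isSingleImage` implementations.

```javascript
// Plain-object stand-in for a DOM node (assumption: not a real DOM).
// `text` holds only the node's own text, unlike DOM textContent.
function el(tagName, children = [], text = "") {
  const node = { tagName: tagName.toUpperCase(), children, text, parent: null };
  for (const child of children) child.parent = node;
  return node;
}

// Hypothetical analogue of _hasAncestorTag: walk up the parent chain.
function hasAncestorTag(node, tagName) {
  const wanted = tagName.toUpperCase();
  for (let p = node.parent; p; p = p.parent) {
    if (p.tagName === wanted) return true;
  }
  return false;
}

// Hypothetical analogue of _isSingleImage: follow a chain of only
// children, each without text of its own, until an <img> is reached.
function isSingleImage(node) {
  while (node) {
    if (node.tagName === "IMG") return true;
    if (node.children.length !== 1 || node.text.trim() !== "") return false;
    node = node.children[0];
  }
  return false;
}

// The three criteria from the commit message, combined.
function isProtectedFigureDiv(node) {
  return node.tagName === "DIV" && hasAncestorTag(node, "figure") && isSingleImage(node);
}

// <figure><div><p><img></p></div></figure>
const outerDiv = el("div", [el("p", [el("img")])]);
const figure = el("figure", [outerDiv]);

// Same <div> shape, but with no <figure> ancestor.
const strayDiv = el("div", [el("p", [el("img")])]);

// A <div> inside a <figure> that holds two images.
const galleryDiv = el("div", [el("img"), el("img")]);
const galleryFigure = el("figure", [galleryDiv]);

console.log(isProtectedFigureDiv(outerDiv));   // true
console.log(isProtectedFigureDiv(strayDiv));   // false
console.log(isProtectedFigureDiv(galleryDiv)); // false
```

In the sketch, only the first `<div>` satisfies all three criteria, which corresponds to the case where the patched callback returns `false` early and the node is kept.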

0 commit comments
