Open
Description
EBSCO::EDS::Record#html_decode_and_sanitize raises an error when the data being sanitized contains an element with more attributes than Nokogiri's default limit, Nokogiri::Gumbo::DEFAULT_MAX_ATTRIBUTES (400).
We've seen an increased number of these errors since late December 2024.
Sanitize accepts a larger :max_attributes value through the :parser_options entry of the Sanitize::Config hash. See: https://github.com/rgrove/sanitize?tab=readme-ov-file#parser_options-hash. The edsapi-ruby gem may need to set a higher limit to suit the data passed through this sanitize method, handle input that exceeds the limit more gracefully, or both.
I was able to trigger this error by searching the EDS API for:
- anom fbi
- fitch communes
- uri menezes
Top of stack trace:
[GEM_ROOT]/gems/nokogiri-1.18.2-x86_64-linux-gnu/lib/nokogiri/html5/document_fragment.rb:166 :in `fragment`
[GEM_ROOT]/gems/nokogiri-1.18.2-x86_64-linux-gnu/lib/nokogiri/html5/document_fragment.rb:166 :in `initialize`
[GEM_ROOT]/gems/nokogiri-1.18.2-x86_64-linux-gnu/lib/nokogiri/xml/document_fragment.rb:44 :in `new`
[GEM_ROOT]/gems/nokogiri-1.18.2-x86_64-linux-gnu/lib/nokogiri/html5/document_fragment.rb:84 :in `parse`
[GEM_ROOT]/gems/nokogiri-1.18.2-x86_64-linux-gnu/lib/nokogiri/html5.rb:281 :in `fragment`
[GEM_ROOT]/gems/sanitize-6.1.3/lib/sanitize.rb:138 :in `fragment`
[GEM_ROOT]/gems/sanitize-6.1.3/lib/sanitize.rb:67 :in `fragment`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/record.rb:966 :in `html_decode_and_sanitize`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/record.rb:936 :in `sanitize_data`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/record.rb:904 :in `block in get_item_data`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/record.rb:902 :in `each`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/record.rb:902 :in `get_item_data`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/record.rb:189 :in `initialize`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/results.rb:57 :in `new`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/results.rb:57 :in `block in initialize`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/results.rb:55 :in `each`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/results.rb:55 :in `initialize`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/session.rb:247 :in `new`
[GEM_ROOT]/gems/ebsco-eds-1.1.5/lib/ebsco/eds/session.rb:247 :in `search`