Optimization for large files #4

@erichulburd

I tried parsing a large file (6 MB) and it hadn't finished after 5 minutes. I tried a few raw Nokogiri searches and they were slow too, though on the order of tens of seconds rather than tens of minutes.

Not sure where to start with this. I've seen conflicting claims about whether XPath (http://nicksda.apotomo.de/2013/01/nokogiris-xpath-search-is-faster/) or tree walking (http://one.valeski.org/2009/10/nokogiri-performance-xpath-vs-tree.html) is faster. I'll have to run some tests.

Before I even do that, I'm going to try to move as much of the initialization code as possible into methods.

Also, as far as the entry_xml variable goes, it looks like it actually stores an XPath query, not the result, so every reference searches the entire XML block again to find the node. It may speed things up to remove the node from the DOM once the greenButtonEntry is initialized and store it as a string in the greenButtonEntry.entry_xml variable.
