"sum" is 10x slower than expected #52

@archenemies

Since 'ff' uses mmap, I would have thought that the implementation of sum in 'ffbase' would be pretty simple - just iterate over the mmap'ed address range while updating an accumulator.
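To make the expectation concrete, here is a minimal base-R sketch of the "walk the data once, keep one accumulator" idea. It is only a stand-in: base R has no mmap, so it reads a temporary file sequentially with `readBin` instead of touching a mapped address range, and the function name `sum_file_doubles`, the chunk length, and the temp file are assumptions made for illustration, not anything taken from 'ff' or 'ffbase'.

```r
# Write n doubles to a temporary file; this stands in for the ff backing file.
n <- 1e5
path <- tempfile(fileext = ".bin")
writeBin(as.double(seq_len(n)), path)

# Hypothetical helper: sum a file of raw doubles in fixed-size chunks,
# updating a single accumulator (sequential reads, NOT mmap).
sum_file_doubles <- function(path, chunk_len = 8192L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  total <- 0
  repeat {
    chunk <- readBin(con, what = "double", n = chunk_len)
    if (length(chunk) == 0L) break   # end of file reached
    total <- total + sum(chunk)
  }
  total
}

result <- sum_file_doubles(path)
unlink(path)
print(result)
```

With a mapped file the loop would be even simpler, since there would be no explicit reads at all; the point is only that one pass and one accumulator should suffice.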

However, benchmarking sum on a long vector suggests that something more complicated is happening:

> x1=rep(0.0,1e8)
> x2=as.ff(x1)
> system.time(sum(x1))
   user  system elapsed 
  0.077   0.003   0.079 
> system.time(sum(x2))     # ran this 3 times
   user  system elapsed 
  0.527   0.333   0.859 
  0.487   0.880   1.369 
  0.440   0.507   0.946 

I noticed the large value for 'system' time, so I attached strace to the R process, and ran sum(x2) again. I found that mmap is being called 12208 times! For sum(x1), mmap was not called at all...

Note that the 0.079 seconds taken by sum(x1) is almost exactly the time required to cat the 'ff' file that backs x2. So the file sits entirely in the page cache, and those 0.079 seconds are simply the time needed to read 763 MB (1e8 doubles at 8 bytes each) from RAM into the CPU caches.
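The arithmetic behind these figures can be checked in a few lines of R. The inputs are the vector length and the elapsed times reported above; the variable names are just illustrative, and the 8-byte element size assumes the vector is stored as plain doubles.

```r
n            <- 1e8                      # vector length from the benchmark
bytes        <- n * 8                    # assuming 8 bytes per double
size_mib     <- bytes / 2^20             # size of the backing data in MiB
elapsed_ram  <- 0.079                    # elapsed time of sum(x1), seconds
elapsed_ff   <- c(0.859, 1.369, 0.946)   # elapsed times of sum(x2), seconds

throughput_gb_s <- bytes / elapsed_ram / 1e9   # implied memory throughput
slowdown        <- elapsed_ff / elapsed_ram    # sum(x2) relative to sum(x1)

print(round(size_mib))            # 763
print(round(throughput_gb_s, 1))  # 10.1
print(round(slowdown, 1))         # 10.9 17.3 12.0
```

So the in-memory sum runs at roughly 10 GB/s, and the three 'ff' runs are about 11x to 17x slower in elapsed time, which is where the "10x" in the title comes from.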

I downloaded the ffbase source but could not find where sum.ff is defined, so I was not able to investigate further today. By the way, this question of efficiency arose out of an email exchange with 'ff' author Jens Oehlschlägel.
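One way to look for the method from inside R rather than by grepping the source is sketched below. It relies on `utils::getAnywhere`, and on the fact that R dispatches `sum` through the `Summary` group generic, so a method may be registered as `Summary.<class>` rather than `sum.<class>`. The helper name `find_sum_method` is made up for this example, and whether either name exists for class "ff" is not something this sketch asserts. The demo uses the base class "factor" so that it runs without 'ff' installed.

```r
# Hypothetical helper: list where a sum method for `class` is visible,
# checking both the direct S3 name and the Summary group generic.
find_sum_method <- function(class) {
  candidates <- paste(c("sum", "Summary"), class, sep = ".")
  hits <- lapply(candidates, function(nm) utils::getAnywhere(nm)$where)
  names(hits) <- candidates
  hits
}

# Demo on a base R class; an empty character vector means "not found".
hits <- find_sum_method("factor")
print(hits)

# With the packages loaded, the same call could be tried for 'ff' objects:
# library(ffbase)
# find_sum_method("ff")
```

Any location reported for a name shows which package or namespace to search for the definition.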

Thanks for your library, and thank you in advance for assistance with this issue.
