Description
Since 'ff' uses mmap, I would have thought that the implementation of sum in 'ffbase' would be pretty simple - just iterate over the mmap'ed address range while updating an accumulator.
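To make the expectation concrete, here is a minimal base-R sketch of chunk-wise accumulation. It is not ffbase's implementation (which I have not located); `chunked_sum` and `chunk_len` are hypothetical names, and a plain numeric vector stands in for the mmap'ed data. The default of 8192 doubles is simply 64 KiB worth of values.

```r
# Hypothetical helper: sum a vector one chunk at a time, keeping a running total.
# This only illustrates the "iterate and accumulate" idea; it is NOT ffbase code.
chunked_sum <- function(x, chunk_len = 8192L) {
  total <- 0
  n <- length(x)
  start <- 1L
  while (start <= n) {
    end <- min(start + chunk_len - 1L, n)    # last index of this chunk
    total <- total + sum(x[start:end])       # accumulate the partial sum
    start <- end + 1L
  }
  total
}

x <- rep(0.5, 1e5)
result <- chunked_sum(x)
```

In pure R each chunk is copied before it is summed, so this sketch will not match the speed of a C-level loop over the mapped range; it only shows the shape of the computation.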
However, benchmarking sum of a long vector seems to indicate that something more complicated is happening:
> x1=rep(0.0,1e8)
> x2=as.ff(x1)
> system.time(sum(x1))
   user  system elapsed
  0.077   0.003   0.079
> system.time(sum(x2)) # ran this 3 times
   user  system elapsed
  0.527   0.333   0.859
  0.487   0.880   1.369
  0.440   0.507   0.946
I noticed the large value for 'system' time, so I attached strace to the R process, and ran sum(x2) again. I found that mmap is being called 12208 times! For sum(x1), mmap was not called at all...
Note that the 0.079 seconds used by sum(x1) is almost exactly the time required to cat the 'ff' file which backs x2. Thus the file is located entirely in the page cache, and this 0.079 seconds simply represents the time required to read 763MB (1e8 doubles at 8 bytes each) from RAM into the CPU caches.
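The numbers above can be checked with a little arithmetic in base R. The 64 KiB window size below is an assumption on my part, chosen because it reproduces the observed mmap count exactly; ff documents an 'ffpagesize' option, but I have not verified that it is what drives these calls.

```r
n_values    <- 1e8                  # length of x1 / x2
bytes_total <- n_values * 8         # doubles are 8 bytes each
size_mib    <- bytes_total / 2^20   # about 763 MiB, matching the file size quoted

# Implied memory bandwidth for sum(x1), using the 0.079 s elapsed time
bandwidth_gb_s <- bytes_total / 0.079 / 1e9   # roughly 10 GB/s

# ASSUMPTION: the data is mapped one 64 KiB window at a time
window_bytes <- 65536
n_windows    <- ceiling(bytes_total / window_bytes)   # 12208
```

If that assumption holds, the 12208 mmap calls correspond to one call per 64 KiB window, which would account for the large 'system' time.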
I then downloaded the ffbase source, but I could not find where sum.ff is defined, so I was not able to investigate further today. By the way, this question of efficiency arose out of an email exchange with 'ff' author Jens Oehlschlägel.
Thanks for your library, and thank you in advance for assistance with this issue.