-
Notifications
You must be signed in to change notification settings - Fork 8
Add Support for IonElement #69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
85bdf53
3635479
6830675
4fa0573
4ea012d
53b4038
51df1e1
d97e811
4f501a6
35cc9f5
0746e71
e40263a
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -84,6 +84,20 @@ abstract class MeasurableReadTask implements MeasurableTask { | |
| */ | ||
| abstract void fullyReadDomFromFile(SideEffectConsumer consumer) throws IOException; | ||
|
|
||
| /** | ||
| * Initialize the element loader and perform a fully-materialized deep read of the data from a buffer using | ||
| * IonElement API. The "loader" is defined as any context that is tied to a single stream. | ||
| * @throws IOException if thrown during reading. | ||
| */ | ||
| abstract void fullyReadElementFromBuffer(SideEffectConsumer consumer) throws IOException; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Not a blocker—it would be nice if we could add new APIs to the benchmark CLI without having to add more methods for each one. (Especially since they are unlikely to be applicable for all data formats.) Have you thought about any ways this could be factored to make it more easily extensible?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You're right, the current approach is a straightforward one but doesn't scale well. A capability-based system where each format declares which APIs it supports might be better. I'd be happy to refactor this in a follow-up if it's worth doing. |
||
|
|
||
| /** | ||
| * Initialize the element loader and perform a fully-materialized deep read of the data from a file using | ||
| * IonElement API. The "loader" is defined as any context that is tied to a single stream. | ||
| * @throws IOException if thrown during reading. | ||
| */ | ||
| abstract void fullyReadElementFromFile(SideEffectConsumer consumer) throws IOException; | ||
|
|
||
| @Override | ||
| public void setUpTrial() throws IOException { | ||
| inputFile = options.convertFileIfNecessary(originalFile).toFile(); | ||
|
|
@@ -124,6 +138,12 @@ public final Task getTask() { | |
| } else { | ||
| return this::fullyReadDomFromFile; | ||
| } | ||
| } else if (options.api == API.ION_ELEMENT_DOM) { | ||
| if (buffer != null) { | ||
| return this::fullyReadElementFromBuffer; | ||
| } else { | ||
| return this::fullyReadElementFromFile; | ||
| } | ||
| } else { | ||
| throw new IllegalStateException("Illegal combination of options."); | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious—when you run this, what sort of results to you get?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It runs two APIs in sequence, then gives two stacked statistical summaries at the end, one for each API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I should have been more clear. Approximately how much of a performance difference are you getting between the two APIs when you run this command?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approximately a 30-ish percentage improvement in
primary_score. While this command is a rough comparison between two APIs, the result aligns well with our more extensive experiments run inSampleTimemode across three ion datasets, where we saw a ~37.09% improvement inprimary_score.