Add tokenIndex to SymbolNode #2796
matthew-canestraro wants to merge 7 commits into josdejong:develop
josdejong
left a comment
This is shaping up really nicely, and it looks well tested!
I made a few inline comments, can you have a look at those? Besides that, we should write a section in the documentation explaining source.
// should have a source for brackets and item delimiters
const expected = [
  { index: 0, text: '[' },
  { index: 2, text: ',' },
If we parse a large matrix, I can imagine that listing all commas impacts performance. I think we should double-check whether this is an issue or not.
Can do! Are you worried about performance during parsing or evaluation or somewhere else?
Yes, I was thinking about the performance of parsing: a lot of extra objects are created then. I think it cannot impact evaluation performance, since the sources are not used during evaluation.
We can set up a small benchmark in the /test/benchmark folder to test if there is a performance impact or not, if you want I can do that.
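A rough way to gauge the concern before running the real benchmark: the sketch below times a stand-in scanner with and without collecting a `{ index, text }` object for every bracket and comma. `scanDelimiters` and `timeIt` are hypothetical helpers written for this illustration, not part of the mathjs parser, so the absolute numbers say nothing about mathjs itself.

```javascript
// Hypothetical stand-in for the parser: walks the string and, optionally,
// records a { index, text } object for every bracket and comma.
function scanDelimiters (expr, collectSources) {
  const sources = []
  let count = 0
  for (let i = 0; i < expr.length; i++) {
    const c = expr[i]
    if (c === '[' || c === ']' || c === ',') {
      count++
      if (collectSources) {
        sources.push({ index: i, text: c })
      }
    }
  }
  return { count, sources }
}

// Average time per call in microseconds (coarse, no warm-up handling).
function timeIt (fn, runs = 200) {
  const start = performance.now()
  for (let i = 0; i < runs; i++) fn()
  const elapsedMs = performance.now() - start
  return elapsedMs / runs * 1000
}

// A large one-row matrix: "[0,1,2,...,9999]"
const big = '[' + Array.from({ length: 10000 }, (_, i) => i).join(',') + ']'

const without = timeIt(() => scanDelimiters(big, false))
const withSources = timeIt(() => scanDelimiters(big, true))
console.log(`without sources: ${without.toFixed(1)} µs, with sources: ${withSources.toFixed(1)} µs`)
```

The gap between the two timings gives a feel for the allocation cost alone; the actual impact on mathjs still has to be measured with the benchmark in /test/benchmark.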
If you're willing to do that, I'm glad for the help, thank you! Otherwise, I'll tackle it after finishing everything else
I've run the benchmark test/benchmark/expression_parser.js and compared the develop branch with this PR (you can try it out yourself to see the differences).
- The `evaluate` benchmark is just as fast in both cases, which is most important: about 1.7 million ops/sec.
- The `parse` benchmark is slower: it goes from 16000 ops/sec to 11000 ops/sec. I think that is acceptable, and simply a consequence of the parse step having to do more work.
Shall we leave it at that?
I'm getting the following results after rebasing the changes on the latest develop branch.
No changes:
expression: 2 + 3 * sin(pi / 4) - 4x
scope: Map(1) { 'x' => 2 }
result: -3.878679656440358
(plain js) evaluate 0.04 µs ±0.00%
(mathjs) evaluate 0.27 µs ±31.61%
(mathjs) parse, compile, evaluate 4.98 µs ±7.58%
(mathjs) parse, compile 4.95 µs ±10.21%
(mathjs) parse 3.68 µs ±8.69%
With changes from the task branch:
expression: 2 + 3 * sin(pi / 4) - 4x
scope: Map(1) { 'x' => 2 }
result: -3.878679656440358
(plain js) evaluate 0.04 µs ±0.00%
(mathjs) evaluate 0.25 µs ±0.23%
(mathjs) parse, compile, evaluate 6.15 µs ±14.67%
(mathjs) parse, compile 6.05 µs ±13.45%
(mathjs) parse 5.17 µs ±29.06%
@josdejong is this acceptable?
Yes I think that's acceptable. Thanks for testing this 👍
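To put the two runs above side by side, a few lines of arithmetic are enough. The sketch below copies the reported averages (in µs) into plain objects and prints the relative change per benchmark; `relativeChange` is a helper name made up for this note, not part of the benchmark script.

```javascript
// Reported averages in microseconds, copied from the two runs above.
const before = { evaluate: 0.27, parseCompileEvaluate: 4.98, parseCompile: 4.95, parse: 3.68 }
const after = { evaluate: 0.25, parseCompileEvaluate: 6.15, parseCompile: 6.05, parse: 5.17 }

// Relative change in percent; positive means slower. (Hypothetical helper.)
function relativeChange (oldValue, newValue) {
  return (newValue - oldValue) / oldValue * 100
}

for (const name of Object.keys(before)) {
  const pct = relativeChange(before[name], after[name])
  console.log(`${name}: ${before[name]} µs -> ${after[name]} µs (${pct.toFixed(1)}%)`)
}
```

With these numbers `parse` comes out roughly 40% slower, while the difference in `evaluate` is well within the ±31.61% spread reported for the baseline run.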
josdejong
left a comment
Thanks for the updates @matthew-canestraro! I made a few comments, mostly about thinking through the exact API for meta regarding optional arguments/properties.
All nodes have the following methods:
- `clone() : Node`
- `clone(options: MetaOptions) : Node`
The options are optional, can you describe that here and for all constructors in the docs?
function emptySourcesFromTree (tree) {
  tree.traverse((node) => {
    node.sources = []
How about using the immutable methods .clone({ sources: [] }) and .transform() in these helper functions to try out whether the immutable API is workable? (eat your own dog food 😉)
Something like this @josdejong ?
function emptySourcesFromTree (tree) {
  return tree.clone().transform(function (node) {
    node.sources = []
    return node
  })
}
Not exactly, I think I meant something like:
function emptySourcesFromTree (tree) {
  return tree.transform((node) => node.clone({ sources: [] }))
}
 */
clone () {
  return new AccessorNode(this.object, this.index)
clone (meta = {}) {
The default value for meta, an empty object {}, does not correspond with the official TypeScript definition, where sources is a required property.
I think it is best to keep sources as a required property. I would also like to get rid of mutating the input meta via meta.sources = ..., so I think we can rewrite the clone methods to something like:
/**
 * Create a clone of this node, a shallow copy
 * @param {MetaOptions} [meta] object with additional options for cloning this node
 * @return {AccessorNode}
 */
clone (meta) {
  const cloned = new AccessorNode(this.object, this.index, meta ?? { sources: this.sources })
  return cloned
}
Having to construct a default meta object via { sources: this.sources } is a bit odd. Would it make sense to just keep the meta object as it is, instead of destructuring it in the Node constructor and having to reconstruct it in the clone methods before we can pass it to another constructor? I.e., we could think about:
const defaultMetaOptions = {
  sources: []
}
class Node {
  /**
   * @constructor Node
   * A generic node, the parent of other AST nodes
   * @param {MetaOptions} [meta] object with additional options for building this node
   */
  constructor (meta = defaultMetaOptions) {
    this.meta = meta
  }
  // ...
}
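To see how the two ideas from this review fit together (keeping `meta` intact on the node, and clearing sources through the immutable `clone`/`transform` API), here is a minimal self-contained sketch. `TinyNode` is a toy class invented for this example; it only mimics the shape of the proposed API and is not the mathjs `Node` implementation, whose `transform` may traverse the tree in a different order.

```javascript
// Shared default, frozen so it cannot be mutated by accident.
const defaultMetaOptions = Object.freeze({ sources: Object.freeze([]) })

// Toy stand-in for an AST node (hypothetical, not the mathjs Node class).
class TinyNode {
  constructor (name, args = [], meta = defaultMetaOptions) {
    this.name = name
    this.args = args
    this.meta = meta // kept as-is, no destructuring
  }

  // Shallow copy; falls back to the node's own meta when none is given.
  clone (meta) {
    return new TinyNode(this.name, this.args, meta ?? this.meta)
  }

  // Returns a new tree: children are transformed first, then the callback
  // is applied to a copy of this node that holds the new children.
  transform (callback) {
    const args = this.args.map((arg) => arg.transform(callback))
    return callback(new TinyNode(this.name, args, this.meta))
  }
}

// Immutable helper: the input tree is left untouched.
function emptySourcesFromTree (tree) {
  return tree.transform((node) => node.clone({ sources: [] }))
}

const x = new TinyNode('x', [], { sources: [{ index: 4, text: 'x' }] })
const tree = new TinyNode('sin', [x], { sources: [{ index: 0, text: 'sin' }] })
const stripped = emptySourcesFromTree(tree)

console.log(stripped.args[0].meta.sources.length) // 0
console.log(tree.args[0].meta.sources.length) // 1, original unchanged
```

Because `clone` falls back to `this.meta`, callers never have to rebuild a `{ sources: ... }` object by hand, which is the oddity the comment above points out.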
 * @param {Node} object The object from which to retrieve
 * a property or subset.
 * @param {IndexNode} index IndexNode containing ranges
 * @param {MetaOptions} object with additional options for building this node
Just for clarity: to make this valid JSDoc, the lines like:
@param {MetaOptions} object with additional options for building this node
Should be changed to:
@param {MetaOptions} meta object with additional options for building this node
And if meta is optional:
@param {MetaOptions} [meta] object with additional options for building this node
@matthew-canestraro we never finished this PR. Are you still interested in getting it done?
Hi @josdejong, sorry for dropping the ball on this, priorities shifted a lot over the past few months. That being said, I still believe this will be a good contribution to MathJS and it would be a shame to lose the work done. I'll do a final pass on the PR this week and ping you when I think it's ready.
Thanks @matthew-canestraro. Also, if you can't find time for it just let me know, maybe someone else can finish your work then.
Prior to this commit it was not possible to trace back the parsed node location in the original string. This information could be used to highlight the syntax of a math expression displayed in a rich editor (e.g. a spreadsheet or a calculator).
This commit is based on the PR josdejong#2796 that has been abandoned for 2 years. I rebased the task branch and fixed the issues reported in the original PR.
In a nutshell, each parsed node stores an array of sources (`SourceMapping[]`) that are set during parsing (see `tokenSource` and its usages for details). The node's constructor and clone method are adjusted to take an optional `MetaOptions` object containing the source mappings. In the future `MetaOptions` could be extended to store more information.
Benchmarks showed no change in `evaluate`. `parse` became slower, from 3.68µs to 5.17µs with the changes from this commit.
Closes josdejong#2795
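As an illustration of the syntax-highlighting use case mentioned in the commit message, the sketch below wraps source spans in `<b>` tags. It assumes each source mapping has the `{ index, text }` shape seen in the tests of this PR; `highlight` and the sample `sources` array are made up for this example and do not come from mathjs.

```javascript
// Wraps every mapped span of `expr` in <b>...</b>.
// Assumes non-overlapping sources of the shape { index, text }.
function highlight (expr, sources) {
  const sorted = [...sources].sort((a, b) => a.index - b.index)
  let out = ''
  let pos = 0
  for (const { index, text } of sorted) {
    out += expr.slice(pos, index) + '<b>' + text + '</b>'
    pos = index + text.length
  }
  return out + expr.slice(pos)
}

// Hand-written sample mappings for the symbols in "sin(pi / 4)".
const sources = [
  { index: 4, text: 'pi' },
  { index: 0, text: 'sin' }
]

console.log(highlight('sin(pi / 4)', sources)) // <b>sin</b>(<b>pi</b> / 4)
```

In a real editor the mappings would be read from the parsed nodes rather than written by hand.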
I have opened a new PR with the changes based on this PR, see #3557
This is a small POC to go with #2795
I kept this minimal, only adding tokenIndex to SymbolNodes specifically, but I could see the value in extending this to most/all nodes
Will mark as draft until the discussion resolves