Skip to content

Add tokenIndex to SymbolNode #2796

Closed
matthew-canestraro wants to merge 7 commits intojosdejong:developfrom
matthew-canestraro:develop
Closed

Add tokenIndex to SymbolNode #2796
matthew-canestraro wants to merge 7 commits intojosdejong:developfrom
matthew-canestraro:develop

Conversation

@matthew-canestraro
Copy link

This is a small POC to go with #2795

I kept this minimal, only adding tokenIndex to SymbolNodes specifically, but I could see the value in extending this to most/all nodes

Will mark as draft until the discussion resolves

@matthew-canestraro matthew-canestraro marked this pull request as ready for review March 15, 2023 19:53
Copy link
Owner

@josdejong josdejong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is shaping up really nicely, and it looks well tested!

I made a few inline comments, can you have a look at those? Besides that, we should write a section in the documentation explaining source.

// should have a source for brackets and item delimiters
const expected = [
{ index: 0, text: '[' },
{ index: 2, text: ',' },
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we parser a large matrix, I can imagine that listing all comma's impacts the performance. I think we should double check if this is an issue or not.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can do! Are you worried about performance during parsing or evaluation or somewhere else?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I was thinking about performance of parsing, there is a lot of extra objects created then. I think it cannot impact evaluation performance since the sources are not used during evaluation.

We can set up a small benchmark in the /test/benchmark folder to test if there is a performance impact or not, if you want I can do that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're willing to do that, I'm glad for the help, thank you! Otherwise, I'll tackle it after finishing everything else

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do that 👍

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've run the benchmark test/benchmark/expression_parser.js and compared the develop branch with this PR (you can try it out yourself to see the differences).

  • The evaluate benchmark is just as fast in both cases, that is most important: about 1.7 million ops/sec
  • The parse benchmark is slower, goes from 16000 ops/second to 11000 ops/sec. I think that is acceptable, and simply a consequence of the parse step having to do more work.

Shall we leave it at that?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm getting the following results after rebasing the changes on the latest develop branch.
No changes:

expression: 2 + 3 * sin(pi / 4) - 4x
scope: Map(1) { 'x' => 2 }
result: -3.878679656440358
(plain js) evaluate                   0.04 µs   ±0.00%
(mathjs) evaluate                     0.27 µs  ±31.61%
(mathjs) parse, compile, evaluate     4.98 µs   ±7.58%
(mathjs) parse, compile               4.95 µs  ±10.21%
(mathjs) parse                        3.68 µs   ±8.69%

With changes from the task branch:

expression: 2 + 3 * sin(pi / 4) - 4x
scope: Map(1) { 'x' => 2 }
result: -3.878679656440358
(plain js) evaluate                   0.04 µs   ±0.00%
(mathjs) evaluate                     0.25 µs   ±0.23%
(mathjs) parse, compile, evaluate     6.15 µs  ±14.67%
(mathjs) parse, compile               6.05 µs  ±13.45%
(mathjs) parse                        5.17 µs  ±29.06%

@josdejong is this acceptable?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think that's acceptable. Thanks for testing this 👍

Copy link
Owner

@josdejong josdejong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates @matthew-canestraro! I made a few comments, mostly about thinking through the exact API for meta regarding optional arguments/properties.

All nodes have the following methods:

- `clone() : Node`
- `clone(options: MetaOptions) : Node`
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The options are optional, can you describe that here and for all constructors in the docs?


function emptySourcesFromTree (tree) {
tree.traverse((node) => {
node.sources = []
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using the immutable methods .clone({ sources: [] }) and .transform() in these helper functions to try out whether the immutable API is workable? (eat your own dog foo 😉 )

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this @josdejong ?

function emptySourcesFromTree (tree) {
  return tree.clone().transform(function (node) {
    node.sources = []
    return node
  })
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not exactly, I think I meant something like:

function emptySourcesFromTree (tree) {
  return tree.transform(node) => node.clone({ sources: [] }))
}

*/
clone () {
return new AccessorNode(this.object, this.index)
clone (meta = {}) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value for meta, an empty object {}, does not correspond with the official TypeScript definition where sources is a require property.

I think it is best to have sources are required property. I would also like to get rid of mutating the input meta via meta.sources = ..., so I think we can rewrite the clone methods to something like:

    /**
     * Create a clone of this node, a shallow copy
     * @param {MetaOptions} [meta] object with additional options for cloning this node
     * @return {AccessorNode}
     */
    clone (meta) {
      const cloned = new AccessorNode(this.object, this.index, meta ?? { sources: this.sources })
      return cloned
    }

Having to construct a default meta object via { sources: this.sources } is a bit odd. Would it make sense to just keep the meta object as it is instead of destructuring it in the Node constructor, and having to re-construct it in the methods clone before we can pass it to another constructor? I.e we could think about:

  const defaultMetaOptions = {
    sources = []
  }

  class Node {
    /**
     * @constructor Node
     * A generic node, the parent of other AST nodes
     * @param {MetaOptions} [meta] object with additional options for building this node
     */
    constructor (meta = defaultMetaOptions) {
      this.meta = meta
    }
    // ...
  }

// should have a source for brackets and item delimiters
const expected = [
{ index: 0, text: '[' },
{ index: 2, text: ',' },
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll do that 👍

* @param {Node} object The object from which to retrieve
* a property or subset.
* @param {IndexNode} index IndexNode containing ranges
* @param {MetaOptions} object with additional options for building this node
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for clarity: to make this valid JSDoc, the lines like:

@param {MetaOptions} object with additional options for building this node

Should be changed to:

@param {MetaOptions} meta                     object with additional options for building this node

And if meta is optional:

@param {MetaOptions} [meta]                   object with additional options for building this node

@josdejong
Copy link
Owner

@matthew-canestraro we never finished this PR. Are you still interested in getting it done?

@matthew-canestraro
Copy link
Author

@matthew-canestraro we never finished this PR. Are you still interested in getting it done?

Hi @josdejong, sorry for dropping the ball on this, priorities shifted a lot over the past few months. That being said, I still believe this will be a good contribution to MathJS and it would be a shame to lose the work done. I'll do a final pass on the PR this week and ping you when I think it's ready

@josdejong
Copy link
Owner

Thanks @matthew-canestraro . Also if you can't find time for it just let me know, maybe someone else can finish your work then.

serso added a commit to serso/mathjs that referenced this pull request Oct 17, 2025
Prior to this commit it was not possible to trace back the parsed node location
in the original string. This information could be used to highlight the syntax of
a math expression displayed in a rich editor (e.g. a spreadsheet or a calculator).

This commit is based on the PR josdejong#2796 that has been abandoned for 2 years.
I rebased the task branch and fixed the issues reported in the original PR.

In the nutshell, each parsed node stores an array of sources (`SourceMapping[]`)
that are set during parsing (see `tokenSource` and its usages for details).
The node's constructor and clone method are adjusted to take an optional `MetaOptions`
object containing the source mappings. In the future `MetaOptions` could be
extended to store more information.

Benchmarks showed no change in `evaluate`. `parse` became slower, from 3.68µs to 5.17µs
with the changes from this commit.

Closes josdejong#2795
@serso
Copy link

serso commented Oct 17, 2025

I have opened a new PR with the changes based on this PR, see #3557
I think this PR could be closed. I've rebased the changes on the latest develop branch and adjusted the code with the suggestions from this PR.

@josdejong
Copy link
Owner

Awesome @serso, that's good news!

I'll close this PR now in favor of #3557

@josdejong josdejong closed this Oct 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants