1010SELECT date, revenue, region FROM sales WHERE year = 2024
1111VISUALISE date AS x, revenue AS y, region AS color
1212DRAW line
13- SCALE x SETTING type => 'date'
13+ SCALE x VIA date
1414COORD cartesian SETTING ylim => [0, 100000]
1515LABEL title => 'Sales by Region', x => 'Date', y => 'Revenue'
1616THEME minimal
@@ -22,7 +22,7 @@ THEME minimal
2222- 507-line Tree-sitter grammar (simplified, no external scanner)
2323- Full bindings: Rust, C, Python, Node.js with tree-sitter integration
2424- Syntax highlighting support via Tree-sitter queries
25- - 166 total tests (comprehensive parser, builder, and integration tests)
25+ - 916 total tests (174 parser tests, comprehensive builder and integration tests)
2626- End-to-end working pipeline: SQL → Data → Visualization
2727- Coordinate transformations: Cartesian (xlim/ylim), Flip, Polar
2828- VISUALISE FROM shorthand syntax with automatic SELECT injection
@@ -100,8 +100,10 @@ DRAW line MAPPING month AS x, total AS y
100100 │
101101 ▼
102102 ┌───────────────────────────────┐
103- │ Query Splitter │
104- │ (Regex-based, tree-sitter) │
103+ │ SourceTree │
104+ │ (Parse once, reuse CST) │
105+ │ • extract_sql() │
106+ │ • extract_visualise() │
105107 └───────────┬───────────────────┘
106108 │
107109 ┌───────────┴───────────┐
@@ -226,24 +228,36 @@ For detailed API documentation, see [`src/doc/API.md`](src/doc/API.md).
226228
227229**Responsibility**: Split queries and parse visualization specifications into typed AST.
228230
229- #### Query Splitter (`splitter.rs`)
231+ #### SourceTree (`source_tree.rs`)
230232
231- - Uses tree-sitter to parse the full query and find VISUALISE statements
232- - Splits query at byte offset of first VISUALISE statement
233- - Handles VISUALISE FROM by injecting `SELECT * FROM <source>`
234- - Robust to parse errors in SQL portion (complex SQL we don't fully parse)
235- - Properly handles semicolons between SQL statements
233+ **Parse-once architecture** that eliminates duplicate parsing throughout the pipeline.
236234
237- **Key Features:**
235+ **Core Design**:
238236
239- 1. **Byte offset splitting**: Uses character positions instead of parse tree node boundaries
240- 2. **SELECT injection**: Automatically adds `SELECT * FROM <source>` when VISUALISE FROM is used
237+ - Wraps tree-sitter `Tree` + source text + `Language`
238+ - Parses query once, reuses CST for all operations
239+ - Declarative tree-sitter query API instead of manual tree walking
240+ - Lazy extraction methods for SQL and VISUALISE portions
241+
242+ **High-Level Query API**:
243+
244+ - `find_node(query)` - Find first matching node via tree-sitter query
245+ - `find_nodes(query)` - Find all matching nodes
246+ - `find_text(query)` - Extract text of first match
247+ - `find_texts(query)` - Extract text of all matches
248+
249+ **Lazy Extraction Methods**:
250+
251+ - `extract_sql()` - Lazily extract SQL portion (before VISUALISE)
252+ - `extract_visualise()` - Lazily extract VISUALISE portion
253+ - Both methods use declarative tree-sitter queries
254+ - Handles VISUALISE FROM by automatically injecting `SELECT * FROM <source>`
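
The lazy extraction behaviour listed above can be sketched as follows. This is only an illustration under stated assumptions: `SourceSketch` is a hypothetical stand-in for `SourceTree`, and a plain keyword search replaces the real tree-sitter queries, so it would also match the keyword inside string literals or identifiers. It requires Rust 1.70 or later for `std::cell::OnceCell`.

```rust
use std::cell::OnceCell;

/// Hypothetical, simplified stand-in for `SourceTree`: it stores the query
/// text and caches the SQL portion the first time it is requested.
/// A keyword search replaces the real tree-sitter queries (assumption).
pub struct SourceSketch {
    source: String,
    split: OnceCell<Option<usize>>,
    sql: OnceCell<Option<String>>,
}

impl SourceSketch {
    pub fn new(source: &str) -> Self {
        Self {
            source: source.to_string(),
            split: OnceCell::new(),
            sql: OnceCell::new(),
        }
    }

    /// Byte offset of the first VISUALISE/VISUALIZE keyword, computed once.
    fn split_offset(&self) -> Option<usize> {
        *self.split.get_or_init(|| {
            // ASCII upper-casing keeps byte offsets identical to the original.
            let upper = self.source.to_ascii_uppercase();
            ["VISUALISE", "VISUALIZE"]
                .iter()
                .filter_map(|kw| upper.find(*kw))
                .min()
        })
    }

    /// VISUALISE portion, or `None` when the query has no VISUALISE keyword.
    pub fn extract_visualise(&self) -> Option<&str> {
        self.split_offset().map(|i| self.source[i..].trim())
    }

    /// SQL portion before VISUALISE. For `VISUALISE FROM <source>` with no
    /// preceding SQL, `SELECT * FROM <source>` is injected instead.
    pub fn extract_sql(&self) -> Option<&str> {
        self.sql
            .get_or_init(|| {
                let Some(i) = self.split_offset() else {
                    // No VISUALISE keyword: the whole text is treated as SQL.
                    let all = self.source.trim();
                    return (!all.is_empty()).then(|| all.to_string());
                };
                let before = self.source[..i].trim();
                if !before.is_empty() {
                    return Some(before.to_string());
                }
                // Both keyword spellings have the same length.
                let mut tokens = self.source[i + "VISUALISE".len()..].split_whitespace();
                match (tokens.next(), tokens.next()) {
                    (Some(kw), Some(name)) if kw.eq_ignore_ascii_case("FROM") => {
                        Some(format!("SELECT * FROM {name}"))
                    }
                    _ => None,
                }
            })
            .as_deref()
    }
}
```

Calling `extract_sql()` twice on the same value runs the split only once; the second call returns the cached string. The caching fields are a design choice of this sketch, not a claim about how `SourceTree` stores its results.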
241255
242256#### Tree-sitter Integration (`mod.rs`)
243257
244258- Uses `tree-sitter-ggsql` grammar (507 lines, simplified approach)
245259- Parses **full query** (SQL + VISUALISE) into concrete syntax tree (CST)
246- - Grammar supports: PLOT/TABLE/MAP types, DRAW/SCALE/FACET/COORD/LABEL/GUIDE/THEME clauses
260+ - Grammar supports: PLOT/TABLE/MAP types, DRAW/SCALE/FACET/COORD/LABEL/THEME clauses
247261- British and American spellings: `VISUALISE` / `VISUALIZE`
248262- **SQL portion parsing**: Basic SQL structure (SELECT, WITH, CREATE, INSERT, subqueries)
249263- **Recursive subquery support**: Fully recursive grammar for complex SQL
@@ -266,11 +280,14 @@ Key grammar rules:
266280
267281```rust
268282pub fn parse_query(query: &str) -> Result<Vec<Plot>> {
269-     // Parse full query (SQL + VISUALISE) with tree-sitter
270-     let tree = parse_full_query(query)?;
283+     // Parse once with SourceTree
284+     let source_tree = SourceTree::new(query)?;
285+
286+     // Validate query structure
287+     source_tree.validate()?;
271288
272289    // Build AST from parse tree
273-     let specs = builder::build_ast(&tree, query)?;
290+     let specs = builder::build_ast(&source_tree)?;
274291    Ok(specs)
275292}
276293```
@@ -288,7 +305,6 @@ pub struct Plot {
288305    pub facet: Option<Facet>,      // FACET clause
289306    pub coord: Option<Coord>,      // COORD clause
290307    pub labels: Option<Labels>,    // LABEL clause
291-     pub guides: Vec<Guide>,        // GUIDE clauses
292308    pub theme: Option<Theme>,      // THEME clause
293309}
294310
@@ -323,19 +339,14 @@ pub enum Geom {
323339
324340pub enum AestheticValue {
325341    Column(String),             // Unquoted column reference: revenue AS x
326-     Literal(LiteralValue),      // Quoted literal: 'value' AS fill
327- }
328-
329- pub enum LiteralValue {
330-     String(String),
331-     Number(f64),
332-     Boolean(bool),
342+     Literal(ParameterValue),    // Quoted literal: 'value' AS fill
333343}
334344
335345pub enum ParameterValue {
336346    String(String),
337347    Number(f64),
338348    Boolean(bool),
349+     Array(Vec<ParameterValue>), // Array values for properties
339350}
340351
341352pub struct Scale {
@@ -395,19 +406,6 @@ pub struct Labels {
395406    pub labels: HashMap<String, String>, // label type → text
396407}
397408
398- pub struct Guide {
399-     pub aesthetic: String,
400-     pub guide_type: Option<GuideType>,
401-     pub properties: HashMap<String, ParameterValue>,
402- }
403-
404- pub enum GuideType {
405-     Legend,
406-     ColorBar,
407-     Axis,
408-     None,
409- }
410-
411409pub struct Theme {
412410    pub style: Option<String>,
413411    pub properties: HashMap<String, ParameterValue>,
@@ -781,7 +779,7 @@ SELECT * FROM (VALUES
781779SELECT * FROM sales
782780VISUALISE
783781DRAW line MAPPING date AS x, revenue AS y, region AS color
784- SCALE x SETTING type => 'date'
782+ SCALE x VIA date
785783LABEL title => 'Sales Trends'
786784```
787785
@@ -1093,16 +1091,15 @@ Where `<global_mapping>` can be:
10931091
10941092### Clause Types
10951093
1096- | Clause      | Repeatable | Purpose            | Example                                   |
1097- | ----------- | ---------- | ------------------ | ----------------------------------------- |
1098- | `VISUALISE` | ✅ Yes     | Entry point        | `VISUALISE date AS x, revenue AS y`       |
1099- | `DRAW`      | ✅ Yes     | Define layers      | `DRAW line MAPPING date AS x, value AS y` |
1100- | `SCALE`     | ✅ Yes     | Configure scales   | `SCALE x SETTING type => 'date'`          |
1101- | `FACET`     | ❌ No      | Small multiples    | `FACET WRAP region`                       |
1102- | `COORD`     | ❌ No      | Coordinate system  | `COORD cartesian SETTING xlim => [0,100]` |
1103- | `LABEL`     | ❌ No      | Text labels        | `LABEL title => 'My Chart', x => 'Date'`  |
1104- | `GUIDE`     | ✅ Yes     | Legend/axis config | `GUIDE color SETTING position => 'right'` |
1105- | `THEME`     | ❌ No      | Visual styling     | `THEME minimal`                           |
1094+ | Clause      | Repeatable | Purpose            | Example                                   |
1095+ | ----------- | ---------- | ------------------ | ----------------------------------------- |
1096+ | `VISUALISE` | ✅ Yes     | Entry point        | `VISUALISE date AS x, revenue AS y`       |
1097+ | `DRAW`      | ✅ Yes     | Define layers      | `DRAW line MAPPING date AS x, value AS y` |
1098+ | `SCALE`     | ✅ Yes     | Configure scales   | `SCALE x VIA date`                        |
1099+ | `FACET`     | ❌ No      | Small multiples    | `FACET WRAP region`                       |
1100+ | `COORD`     | ❌ No      | Coordinate system  | `COORD cartesian SETTING xlim => [0,100]` |
1101+ | `LABEL`     | ❌ No      | Text labels        | `LABEL title => 'My Chart', x => 'Date'`  |
1102+ | `THEME`     | ❌ No      | Visual styling     | `THEME minimal`                           |
11061103
11071104### DRAW Clause (Layers)
11081105
@@ -1214,49 +1211,79 @@ DRAW line
12141211**Syntax**:
12151212
12161213```sql
1217- SCALE <aesthetic> SETTING
1218-   [type => <scale_type>]
1219-   [limits => [min, max]]
1220-   [breaks => <array | interval>]
1221-   [palette => <name>]
1222-   [domain => [values...]]
1214+ SCALE [TYPE] <aesthetic> [FROM <input>] [TO <output>] [VIA <transform>] [SETTING <properties>]
12231215```
12241216
1225- **Scale Types**:
1217+ **Type Modifiers** (optional, placed before aesthetic):
12261218
1227- - **Continuous**: `linear`, `log10`, `log`, `log2`, `sqrt`, `reverse`
1228- - **Discrete**: `categorical`, `ordinal`
1229- - **Temporal**: `date`, `datetime`, `time`
1230- - **Color Palettes**: `viridis`, `plasma`, `magma`, `inferno`, `cividis`, `diverging`, `sequential`
1219+ - **`CONTINUOUS`** - Continuous numeric data
1220+ - **`DISCRETE`** - Categorical/discrete data
1221+ - **`BINNED`** - Binned/bucketed data
1222+ - **`DATE`** - Date data (maps to Vega-Lite temporal type)
1223+ - **`DATETIME`** - Datetime data (maps to Vega-Lite temporal type)
1224+
1225+ **Subclauses**:
1226+
1227+ - **`FROM [...]`** - Input range specification (maps to Vega-Lite `scale.domain`)
1228+ - **`TO [...]`** or **`TO palette`** - Output range as array or named palette (maps to Vega-Lite `scale.range` or `scale.scheme`)
1229+ - **`VIA transform`** - Transformation method (reserved for future use)
1230+ - **`SETTING ...`** - Additional properties (e.g., `breaks`)
1231+
1232+ **Named Palettes** (used with `TO`):
1233+
1234+ - `viridis`, `plasma`, `magma`, `inferno`, `cividis`, `diverging`, `sequential`
12311235
12321236**Critical for Date Formatting**:
12331237
12341238```sql
1235- SCALE x SETTING type => 'date'
1239+ SCALE x VIA date
12361240-- Maps to Vega-Lite field type = "temporal"
12371241-- Enables proper date axis formatting
12381242```
12391243
1240- **Domain Property**:
1244+ **Input Range Specification** (FROM clause):
12411245
1242- The `domain` property explicitly sets the input domain for a scale:
1246+ The `FROM` clause explicitly sets the input range for a scale:
12431247
12441248```sql
1245- -- Set domain for discrete scale
1246- SCALE color SETTING domain => ['red', 'green', 'blue']
1249+ -- Set range for discrete scale
1250+ SCALE DISCRETE color FROM ['A', 'B', 'C']
12471251
1248- -- Set domain for continuous scale
1249- SCALE x SETTING domain => [0, 100]
1252+ -- Set range for continuous scale
1253+ SCALE CONTINUOUS x FROM [0, 100]
12511255
1252- **Note**: Cannot specify domain in both SCALE and COORD for the same aesthetic (will error).
1256+ **Range Specification** (TO clause):
12531257
1254- **Example**:
1258+ The `TO` clause sets the output range - either explicit values or a named palette:
1259+
1260+ ```sql
1261+ -- Explicit color values
1262+ SCALE color FROM ['A', 'B'] TO ['red', 'blue']
1263+
1264+ -- Named palette
1265+ SCALE color TO viridis
1266+ ```
1267+
1268+ **Note**: Cannot specify range in both SCALE and COORD for the same aesthetic (will error).
1269+
1270+ **Examples**:
12551271
12561272```sql
1257- SCALE x SETTING type => 'date', breaks => '2 months'
1258- SCALE y SETTING type => 'log10', limits => [1, 1000]
1259- SCALE color SETTING palette => 'viridis', domain => ['A', 'B', 'C']
1273+ -- Date scale
1274+ SCALE x VIA date
1275+
1276+ -- Continuous scale with input range
1277+ SCALE CONTINUOUS y FROM [0, 100]
1278+
1279+ -- Discrete color scale with input range and output range
1280+ SCALE DISCRETE color FROM ['A', 'B', 'C'] TO ['red', 'green', 'blue']
1281+
1282+ -- Color scale with named palette
1283+ SCALE color TO viridis
1284+
1285+ -- Scale with input range and additional settings
1286+ SCALE x VIA date FROM ['2024-01-01', '2024-12-31'] SETTING breaks => '1 month'
12601287```
12611288
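
The subclause mapping described above (`FROM` to `scale.domain`, `TO [...]` to `scale.range`, `TO palette` to `scale.scheme`) can be sketched as below. The types `Value` and `Output` and the function `vega_scale` are hypothetical names introduced for illustration, not the project's writer API. The sketch builds the JSON by hand and assumes string values contain no quotes or backslashes.

```rust
/// Hypothetical value type for scale arrays (illustration only).
pub enum Value {
    Str(String),
    Num(f64),
}

/// Hypothetical model of the `TO` subclause.
pub enum Output {
    Values(Vec<Value>), // TO ['red', 'blue']
    Palette(String),    // TO viridis
}

/// Render values as a JSON array; assumes strings need no escaping.
fn json_array(values: &[Value]) -> String {
    let items: Vec<String> = values
        .iter()
        .map(|v| match v {
            Value::Str(s) => format!("\"{s}\""),
            Value::Num(n) => n.to_string(),
        })
        .collect();
    format!("[{}]", items.join(","))
}

/// Build the Vega-Lite `scale` object for one aesthetic:
/// FROM -> "domain", TO array -> "range", TO palette -> "scheme".
pub fn vega_scale(from: Option<&[Value]>, to: Option<&Output>) -> String {
    let mut fields = Vec::new();
    if let Some(domain) = from {
        fields.push(format!("\"domain\":{}", json_array(domain)));
    }
    match to {
        Some(Output::Values(values)) => {
            fields.push(format!("\"range\":{}", json_array(values)));
        }
        Some(Output::Palette(name)) => {
            fields.push(format!("\"scheme\":\"{name}\""));
        }
        None => {}
    }
    format!("{{{}}}", fields.join(","))
}
```

Under these assumptions, `SCALE color FROM ['A', 'B'] TO ['red', 'blue']` would correspond to a scale object with both `domain` and `range`, while `SCALE color TO viridis` would produce only `scheme`.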
12621289### FACET Clause
@@ -1313,22 +1340,22 @@ COORD SETTING <properties>
13131340
13141341- `xlim => [min, max]` - Set x-axis limits
13151342- `ylim => [min, max]` - Set y-axis limits
1316- - `<aesthetic> => [values...]` - Set domain for any aesthetic (color, fill, size, etc.)
1343+ - `<aesthetic> => [values...]` - Set range for any aesthetic (color, fill, size, etc.)
13171344
13181345**Flip**:
13191346
1320- - `<aesthetic> => [values...]` - Set domain for any aesthetic
1347+ - `<aesthetic> => [values...]` - Set range for any aesthetic
13211348
13221349**Polar**:
13231350
13241351- `theta => <aesthetic>` - Which aesthetic maps to angle (defaults to `y`)
1325- - `<aesthetic> => [values...]` - Set domain for any aesthetic
1352+ - `<aesthetic> => [values...]` - Set range for any aesthetic
13261353
13271354**Important Notes**:
13281355
132913561. **Axis limits auto-swap**: `xlim => [100, 0]` automatically becomes `[0, 100]`
133013572. **ggplot2 compatibility**: `coord_flip` preserves axis label names (labels stay with aesthetic names, not visual position)
1331- 3. **Domain conflicts**: Error if same aesthetic has domain in both SCALE and COORD
1358+ 3. **Range conflicts**: Error if same aesthetic has input range in both SCALE and COORD
133213594. **Multi-layer support**: All coordinate transforms apply to all layers
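
Notes 1 and 3 above can be illustrated with a small sketch. The function names and the error message are hypothetical and chosen for this example only; the project's actual validation code may be structured differently.

```rust
use std::collections::HashSet;

/// Order a pair of axis limits so the smaller value comes first,
/// mirroring the auto-swap described in note 1.
pub fn normalize_limits(limits: [f64; 2]) -> [f64; 2] {
    if limits[0] > limits[1] {
        [limits[1], limits[0]]
    } else {
        limits
    }
}

/// Return an error for the first aesthetic whose input range is set in
/// both SCALE and COORD (note 3). The message text is an assumption.
pub fn check_range_conflict(
    scale_aesthetics: &[&str],
    coord_aesthetics: &[&str],
) -> Result<(), String> {
    let in_scale: HashSet<&str> = scale_aesthetics.iter().copied().collect();
    match coord_aesthetics.iter().find(|a| in_scale.contains(*a)) {
        Some(a) => Err(format!(
            "input range for '{a}' is set in both SCALE and COORD"
        )),
        None => Ok(()),
    }
}
```

For example, `normalize_limits([100.0, 0.0])` yields `[0.0, 100.0]`, and a query that sets a `color` range in both a SCALE clause and a COORD clause would be rejected by `check_range_conflict(&["color"], &["color"])`.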
13331360
**Status**:
@@ -1344,7 +1371,7 @@ COORD SETTING <properties>
13441371-- Cartesian with axis limits
13451372COORD cartesian SETTING xlim => [0, 100], ylim => [0, 50]
13461373
1347- -- Cartesian with aesthetic domain
1374+ -- Cartesian with aesthetic range
13481375COORD cartesian SETTING color => ['red', 'green', 'blue']
13491376
13501377-- Cartesian shorthand (type optional when using SETTING)
@@ -1353,7 +1380,7 @@ COORD SETTING xlim => [0, 100]
13531380-- Flip coordinates for horizontal bar chart
13541381COORD flip
13551382
1356- -- Flip with aesthetic domain
1383+ -- Flip with aesthetic range
13571384COORD flip SETTING color => ['A', 'B', 'C']
13581385
13591386-- Polar for pie chart (theta defaults to y)
@@ -1427,7 +1454,7 @@ DRAW line
14271454 MAPPING sale_date AS x, total AS y, region AS color
14281455DRAW point
14291456 MAPPING sale_date AS x, total AS y, region AS color
1430- SCALE x SETTING type => 'date'
1457+ SCALE x VIA date
14311458FACET WRAP region
14321459LABEL title => 'Sales Trends by Region', x => 'Date', y => 'Total Quantity'
14331460THEME minimal