[SQL] EXPLAIN COST gives wrong estimated input size when querying via VIEW instead of physical Delta path
Component
Spark SQL (Catalyst statistics / EXPLAIN COST)
Describe the problem
EXPLAIN COST reports the correct estimated input size when a Delta table is queried by its physical path. However, when a VIEW (or table) is created on top of that same path, EXPLAIN COST on the view returns a wrong estimated input size for the identical data. Query results are correct; only the cost/size estimate is wrong.
Steps to reproduce
-- 1. Correct size
EXPLAIN COST SELECT * FROM delta.`/path/to/table_x`;
-- 2. Wrap in a view
CREATE OR REPLACE VIEW table_x AS
SELECT * FROM delta.`/path/to/table_x`;
-- 3. Wrong size
EXPLAIN COST SELECT * FROM table_x;
Observed
Step 1: correct sizeInBytes.
Step 3: incorrect sizeInBytes for the same underlying data.
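To put numbers on the discrepancy, the sizeInBytes value can be pulled out of each EXPLAIN COST output and compared directly. The helper below is a minimal sketch, not part of Spark or Delta: the name `first_size_in_bytes` is hypothetical, and it assumes the plan text contains an entry of the form `Statistics(sizeInBytes=<number> <unit>)` with binary units (B, KiB, MiB, ...), which is how Spark 3.x typically prints it. It returns the first such entry in the text.

```python
import re

# Binary unit multipliers, assuming Spark's human-readable size strings.
_UNITS = {
    "B": 1,
    "KiB": 1024,
    "MiB": 1024 ** 2,
    "GiB": 1024 ** 3,
    "TiB": 1024 ** 4,
    "PiB": 1024 ** 5,
    "EiB": 1024 ** 6,
}

# Matches e.g. "sizeInBytes=2.0 KiB" or "sizeInBytes=1.00E+24 B".
_SIZE_RE = re.compile(
    r"sizeInBytes=([0-9.]+(?:E[+-]?[0-9]+)?)\s*(B|KiB|MiB|GiB|TiB|PiB|EiB)"
)


def first_size_in_bytes(plan_text):
    """Return the first sizeInBytes found in EXPLAIN COST text, in bytes.

    Hypothetical helper: the value is parsed from the rounded,
    human-readable string, so it is only as precise as that string.
    """
    match = _SIZE_RE.search(plan_text)
    if match is None:
        raise ValueError("no sizeInBytes entry found in plan text")
    number, unit = match.groups()
    return float(number) * _UNITS[unit]
```

With an active SparkSession named `spark` (an assumption), the plan text for step 3 can be obtained with `spark.sql("EXPLAIN COST SELECT * FROM table_x").first()[0]`, and likewise for step 1. Passing both texts through the helper gives two byte counts that should be equal if the estimate were consistent.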
Expected
EXPLAIN COST should report the same, accurate estimated input size whether queried via the physical path or via a view/logical table name wrapping it.
Environment
- Spark: 3.5.6
- Delta Lake: 3.3.0
- Deployment: (k8s)
Willingness to contribute