Is your feature request related to a problem or challenge?
When I run a test with the SQL below, the performance is not what I expected:
SELECT
client_ip,
approx_distinct(trace_id) AS cnt
FROM
"*.parquet"
GROUP BY
client_ip
ORDER BY
cnt DESC
LIMIT 10
I have 100M rows and 0.5M unique client_ip values.
datafusion-cli needs 900s to return the result, but DuckDB needs only 3s.
Describe the solution you'd like
No response
Describe alternatives you've considered
No response
Additional context
No response