Consider adding theta_sketch_agg_int64_lgk() #144

@nikunjbhartia

Currently there are two functions for creating theta sketches from INT64 values:

  • theta_sketch_agg_int64(value INT64)
  • theta_sketch_agg_int64_lgk_seed_p(value INT64, params STRUCT<lg_k BYTEINT, seed INT64, p FLOAT64> NOT AGGREGATE)

If I need to increase the precision of the sketch by setting lg_k, I am forced to pass a seed and a p value as well.
Once a sketch is created with a seed, end users can no longer use theta_sketch_agg_union(sketch BYTES). They have to remember the seed value and use theta_sketch_agg_union_lgk_seed(sketch BYTES, params STRUCT<lg_k BYTEINT, seed INT64> NOT AGGREGATE) instead.
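
For illustration, the workflow described above might look roughly like the following. This is only a sketch: the table my_dataset.events, the columns event_date and user_id, and the values 14, 111, and 1.0 are hypothetical placeholders, and the bqutil.datasketches prefix is taken from the error message below.

```sql
-- Hypothetical table, columns, and parameter values, for illustration only.
WITH daily_sketches AS (
  SELECT
    event_date,
    -- Setting lg_k requires passing seed and p as well.
    bqutil.datasketches.theta_sketch_agg_int64_lgk_seed_p(
      user_id,
      STRUCT<BYTEINT, INT64, FLOAT64>(14, 111, 1.0)  -- lg_k, seed, p
    ) AS sketch
  FROM my_dataset.events
  GROUP BY event_date
)
SELECT
  -- The consumer has to know and repeat the same seed here.
  bqutil.datasketches.theta_sketch_agg_union_lgk_seed(
    sketch,
    STRUCT<BYTEINT, INT64>(14, 111)  -- lg_k, seed
  ) AS merged_sketch
FROM daily_sketches;
```

The seed has to be carried from the query that builds the sketches to every query that merges them, which is the coupling this issue is about.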

Using theta_sketch_agg_union() on a sketch initialized with a seed gives me the following error:

seed hash mismatch: expected 37836, actual 54156 at bqutil.datasketches.theta_sketch_agg_union_lgk_seed(BYTES, STRUCT<lg_k INT64, seed INT64>) line 55, columns 6-7; reason: invalidQuery, location: query, message: Error: seed hash mismatch: expected 37836, actual 54156 at bqutil.datasketches.theta_sketch_agg_union_lgk_seed(BYTES, STRUCT<lg_k INT64, seed INT64>) line 55, columns 6-7

Can we consider adding a function theta_sketch_agg_int64_lgk(value INT64, lg_k INT64)?
