Optimization potential #138
Open
Labels
- chore: Things that need to happen but don't change the behaviour
- experimental: Things we want to play with outside the stable build
- help wanted: Extra attention is needed
While doing some rudimentary profiling, I noticed that we are wasting a huge amount of main-thread time in essentially three functions:

- `alpha_blend` (~40%)
- `convert_rgb_to_yuyv` (~25%)
- `cap.retrieve` (~20%)

These values are relative to the time spent in `main`, which takes up roughly as much time as we actually spend processing images (`main` ~32.75%, `bs_maskgen_process` ~33.95% of total runtime).

On the positive side: `bs_maskgen_process` spends ~95% of its time waiting for processing by TFLite.

FWIW: the timings are heavily affected by running under callgrind, but looking at the code of `alpha_blend` and `convert_rgb_to_yuyv`, I'm not really surprised by these results. A rewrite of these functions using some vectorization should yield quite a bit of improvement.
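To make the vectorization idea a bit more concrete, here is a rough sketch of what integer-only, branch-free versions of the two hot loops could look like. This is not the existing code: the names `alpha_blend_u8` and `rgb_to_yuyv_u8` are hypothetical, the buffers are assumed to be tightly packed 8-bit data (no row padding), the mask is assumed to be 8-bit with 255 meaning "foreground", the colour conversion assumes the common BT.601 limited-range integer coefficients, and the chroma of each pixel pair is derived from the averaged RGB values. Loops of this shape are usually good candidates for compiler auto-vectorization at `-O3`, but that would need to be confirmed by measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: out = (fg * mask + bg * (255 - mask)) / 255, rounded.
// fg, bg and out hold `pixels * channels` interleaved bytes; mask holds one
// byte per pixel (assumption: 255 selects fg, 0 selects bg).
void alpha_blend_u8(const std::uint8_t* fg, const std::uint8_t* bg,
                    const std::uint8_t* mask, std::uint8_t* out,
                    std::size_t pixels, std::size_t channels = 3) {
    assert(fg && bg && mask && out);
    for (std::size_t i = 0; i < pixels; ++i) {
        const std::uint32_t a = mask[i];
        const std::uint32_t na = 255u - a;
        for (std::size_t c = 0; c < channels; ++c) {
            const std::size_t k = i * channels + c;
            // Rounded division by 255 using only adds and shifts.
            const std::uint32_t t = fg[k] * a + bg[k] * na + 128u;
            out[k] = static_cast<std::uint8_t>((t + (t >> 8)) >> 8);
        }
    }
}

// BT.601 limited-range luma from 8-bit RGB (result is in 16..235).
static inline std::uint8_t luma601(int r, int g, int b) {
    return static_cast<std::uint8_t>(((66 * r + 129 * g + 25 * b + 128) >> 8) + 16);
}

// Hypothetical helper: packed RGB (3 bytes/pixel) to packed YUYV
// (4 bytes per pixel pair: Y0 U Y1 V). Assumes an even width.
void rgb_to_yuyv_u8(const std::uint8_t* rgb, std::uint8_t* yuyv,
                    std::size_t width, std::size_t height) {
    assert(width % 2 == 0);
    const std::size_t pairs = width * height / 2;
    for (std::size_t p = 0; p < pairs; ++p) {
        const std::uint8_t* s = rgb + p * 6;
        std::uint8_t* d = yuyv + p * 4;
        const int r0 = s[0], g0 = s[1], b0 = s[2];
        const int r1 = s[3], g1 = s[4], b1 = s[5];
        // Shared chroma is computed from the rounded average of both pixels.
        const int r = (r0 + r1 + 1) >> 1;
        const int g = (g0 + g1 + 1) >> 1;
        const int b = (b0 + b1 + 1) >> 1;
        d[0] = luma601(r0, g0, b0);
        // 32896 = 128 (rounding) + (128 << 8) (chroma offset); folding the
        // offset in before the shift keeps the shifted value non-negative.
        d[1] = static_cast<std::uint8_t>((-38 * r - 74 * g + 112 * b + 32896) >> 8);
        d[2] = luma601(r1, g1, b1);
        d[3] = static_cast<std::uint8_t>((112 * r - 94 * g - 18 * b + 32896) >> 8);
    }
}
```

The main design choice here is to stay in fixed-point integer arithmetic and avoid per-pixel branches and floating-point conversions, since those tend to be what keeps such loops from vectorizing. Whether this actually beats the current implementation would need to be checked with a profiler outside of callgrind.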