feat: add blazeface support #1187

Open
chmjkb wants to merge 5 commits into main from @chmjkb/blazeface
Conversation

@chmjkb
Collaborator

@chmjkb chmjkb commented May 26, 2026

Description

Introduces a breaking change?

  • Yes
  • No

Type of change

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Documentation update (improves or adds clarity to existing documentation)
  • Other (chores, tests, code style improvements etc.)

Tested on

  • iOS
  • Android

Testing instructions

Screenshots

Related issues

Checklist

  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings

Additional notes

@chmjkb chmjkb marked this pull request as draft May 26, 2026 08:41
@chmjkb chmjkb force-pushed the @chmjkb/blazeface branch from cb4233a to 9809733 Compare May 26, 2026 08:42
@chmjkb chmjkb force-pushed the @chmjkb/blazeface branch from 9809733 to b3fcb85 Compare May 26, 2026 09:12
@chmjkb chmjkb marked this pull request as ready for review May 26, 2026 12:42
@chmjkb chmjkb requested a review from benITo47 May 26, 2026 12:42
/// Affine transform from model-input pixel coords back to source-image coords:
/// `x_src = x_model * scaleX + offsetX`. Covers both plain stretch (offsets
/// zero) and letterbox (offsets carry the centre-pad).
struct BoxTransform {
Contributor

Maybe move it to utils? This might be used in other model pipes as well.
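
As a rough illustration of the suggestion above, here is a minimal sketch of how the transform from the doc comment could look as a standalone helper. Only `scaleX` and `offsetX` appear in the quoted comment; `scaleY`, `offsetY`, the `toSrcX`/`toSrcY` methods, and both factory functions are hypothetical names introduced for this example, and the actual struct in the PR may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical standalone version of the transform described in the doc
// comment: x_src = x_model * scaleX + offsetX (and the same for y).
struct BoxTransform {
  float scaleX;
  float scaleY;
  float offsetX;
  float offsetY;

  float toSrcX(float xModel) const { return xModel * scaleX + offsetX; }
  float toSrcY(float yModel) const { return yModel * scaleY + offsetY; }
};

// Plain stretch: each axis is resized independently, so the offsets are zero.
// (Assumed helper, not taken from the PR.)
inline BoxTransform makeStretchTransform(float srcW, float srcH, float modelW,
                                         float modelH) {
  return {srcW / modelW, srcH / modelH, 0.0f, 0.0f};
}

// Letterbox: one uniform scale r plus centred padding. The forward mapping is
// x_model = x_src * r + padX, so the inverse is
// x_src = x_model * (1 / r) - padX / r. (Assumed helper, not taken from the PR.)
inline BoxTransform makeLetterboxTransform(float srcW, float srcH,
                                           float modelW, float modelH) {
  const float r = std::min(modelW / srcW, modelH / srcH);
  const float padX = (modelW - srcW * r) / 2.0f;
  const float padY = (modelH - srcH * r) / 2.0f;
  return {1.0f / r, 1.0f / r, -padX / r, -padY / r};
}
```

With a 640x480 source and a 128x128 model input, the letterbox variant pads 16 model pixels at the top and bottom, so model row 16 maps back to source row 0 and model row 112 maps back to source row 480.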

Comment on lines 59 to -138
@@ -86,34 +87,37 @@ std::vector<types::Instance> BaseInstanceSegmentation::runInference(
   auto instances = collectInstances(
       forwardResult.get(), originalSize, modelInputSize, confidenceThreshold,
       classIndices, returnMaskAtOriginalResolution);
-  return finalizeInstances(std::move(instances), iouThreshold, maxInstances);
+  return finalizeInstances(std::move(instances), iouThreshold, maxInstances,
+                           useWeightedNms);
 }

 std::vector<types::Instance> BaseInstanceSegmentation::generateFromString(
     std::string imageSource, double confidenceThreshold, double iouThreshold,
     int32_t maxInstances, std::vector<int32_t> classIndices,
-    bool returnMaskAtOriginalResolution, std::string methodName) {
+    bool returnMaskAtOriginalResolution, std::string methodName,
+    bool useWeightedNms) {

   cv::Mat imageBGR = image_processing::readImage(imageSource);
   cv::Mat imageRGB;
   cv::cvtColor(imageBGR, imageRGB, cv::COLOR_BGR2RGB);

   return runInference(imageRGB, confidenceThreshold, iouThreshold, maxInstances,
-                      classIndices, returnMaskAtOriginalResolution, methodName);
+                      classIndices, returnMaskAtOriginalResolution, methodName,
+                      useWeightedNms);
 }

 std::vector<types::Instance> BaseInstanceSegmentation::generateFromFrame(
     jsi::Runtime &runtime, const jsi::Value &frameData,
     double confidenceThreshold, double iouThreshold, int32_t maxInstances,
     std::vector<int32_t> classIndices, bool returnMaskAtOriginalResolution,
-    std::string methodName) {
+    std::string methodName, bool useWeightedNms) {

   auto orient = ::rnexecutorch::utils::readFrameOrientation(runtime, frameData);
   cv::Mat frame = extractFromFrame(runtime, frameData);
   cv::Mat rotated = utils::rotateFrameForModel(frame, orient);
-  auto instances =
-      runInference(rotated, confidenceThreshold, iouThreshold, maxInstances,
-                   classIndices, returnMaskAtOriginalResolution, methodName);
+  auto instances = runInference(
+      rotated, confidenceThreshold, iouThreshold, maxInstances, classIndices,
+      returnMaskAtOriginalResolution, methodName, useWeightedNms);
   for (auto &inst : instances) {
     utils::inverseRotateBbox(inst.bbox, orient, rotated.size());
     // Inverse-rotate the mask to match the screen orientation
@@ -131,11 +135,13 @@ std::vector<types::Instance> BaseInstanceSegmentation::generateFromFrame(
 std::vector<types::Instance> BaseInstanceSegmentation::generateFromPixels(
     JSTensorViewIn tensorView, double confidenceThreshold, double iouThreshold,
     int32_t maxInstances, std::vector<int32_t> classIndices,
-    bool returnMaskAtOriginalResolution, std::string methodName) {
+    bool returnMaskAtOriginalResolution, std::string methodName,
+    bool useWeightedNms) {

   cv::Mat image = extractFromPixels(tensorView);
   return runInference(image, confidenceThreshold, iouThreshold, maxInstances,
-                      classIndices, returnMaskAtOriginalResolution, methodName);
Member

@msluszniak msluszniak May 26, 2026

We are modifying code that should be general (BASE instance segmentation) and bloating it with positional parameters that are useless for other models. What we have right now is very poor architectural design. We need to discuss internally how to tackle this, because the problem is not limited to this PR; it affects almost all models. I created an RFC in the Discussions section on how to deal with it. See #1189
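
For context, one common way to keep a shared base signature from growing with every model-specific flag is to bundle the parameters into an options struct. The sketch below is only an illustration of that general idea and is not necessarily what RFC #1189 proposes. The field names mirror the parameters visible in the diff; the struct names, the default values, and the `summarize` stand-in function are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical options bundle. Field names follow the parameters in the diff;
// the default values are illustrative assumptions only.
struct SegmentationOptions {
  double confidenceThreshold = 0.5;
  double iouThreshold = 0.5;
  int32_t maxInstances = 100;
  std::vector<int32_t> classIndices;
  bool returnMaskAtOriginalResolution = false;
  std::string methodName = "forward";
};

// Hypothetical model-specific extension: the extra flag lives here instead of
// being appended to every shared function signature.
struct WeightedNmsOptions : SegmentationOptions {
  bool useWeightedNms = true;
};

// Stand-in for a shared entry point. Its signature does not change when a
// model adds a new option, because it only sees the common fields.
inline std::string summarize(const SegmentationOptions &opts) {
  return opts.methodName + ":" + std::to_string(opts.maxInstances);
}
```

A caller would then set only the fields it cares about by name, for example `WeightedNmsOptions o; o.maxInstances = 5;`, and pass `o` wherever a `SegmentationOptions` is expected.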

3 participants