Skip to content

[CUS-11641] Clicks on image using edge based detection and template matching.#370

Open
ManojTestsigma wants to merge 3 commits intodevfrom
CUS-11641
Open

[CUS-11641] Clicks on image using edge based detection and template matching.#370
ManojTestsigma wants to merge 3 commits intodevfrom
CUS-11641

Conversation

@ManojTestsigma
Copy link
Copy Markdown
Contributor

@ManojTestsigma ManojTestsigma commented Apr 2, 2026

please review this addon and publish as PUBLIC

Addon name : Image based actions
Addon accont: https://jarvis.testsigma.com/ui/tenants/2187/addons
Jira: https://testsigma.atlassian.net/browse/CUS-11641

fix

  1. Clicks on image using edge based detection.
  2. clicks on image using template matching.
  3. clicks on given co-ordinates.

Summary by CodeRabbit

  • New Features

    • Added image-based click actions using template matching and edge detection
    • Added coordinate-based click action with visual markers
    • Added screenshot annotation and S3 upload capabilities
  • Chores

    • Updated project version to 1.0.17
    • Added Apache HTTP Client dependency for cloud uploads

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 2, 2026

📝 Walkthrough

Walkthrough

Extended the image-based actions addon with normalized cross-correlation template matching, Sobel edge detection, screenshot annotation/S3 upload utilities, and three Windows action implementations for clicking on coordinates and detected image matches across single-scale and multi-scale approaches.

Changes

Cohort / File(s) Summary
Maven Configuration
image_based_actions/pom.xml
Updated project version to 1.0.17 and added Apache HTTP Client dependency (httpclient:4.5.14) for S3 uploads.
Image Matching Core
image_based_actions/src/main/java/com/testsigma/addons/util/ImageMatchUtils.java
Introduced new utility providing multi-scale NCC-based template matching with Sobel edge detection, image preprocessing (grayscale, resize, statistics), and file utilities (download, dimensions, cleanup).
Screenshot Handling
image_based_actions/src/main/java/com/testsigma/addons/util/ScreenshotUtils.java
Added screenshot annotation (bounding box and center marker overlay), temporary file save, and S3 upload support with configurable HTTP timeouts and error handling.
Windows Actions
image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnCoordinates.java, ClickOnImageEdgeMatch.java, ClickOnImageTemplateMatch.java
Implemented three clickable actions: direct coordinate-based clicking with visual annotation, edge-detection-based image matching (NCC on Sobel maps, threshold 0.45), and template-based image matching (NCC on grayscale, threshold 0.55). All capture screen, annotate results, upload to S3, and perform mouse clicks with cleanup.

Sequence Diagram(s)

sequenceDiagram
    actor User
    participant Action as ClickOnImageTemplateMatch
    participant ImageMatchUtils as ImageMatchUtils
    participant ScreenshotUtils as ScreenshotUtils
    participant Robot as Robot/OS
    participant S3 as S3 Storage
    
    User->>Action: Execute action with image-url
    Action->>Action: Download template image
    Action->>Robot: Capture full screen
    Action->>ImageMatchUtils: Convert to grayscale
    Action->>ImageMatchUtils: searchMultiScale() with NCC
    ImageMatchUtils->>ImageMatchUtils: Iterate scales, compute NCC, track best match
    ImageMatchUtils-->>Action: MatchResult with bounding box
    
    alt Match Found
        Action->>ScreenshotUtils: highlightAndUpload()
        ScreenshotUtils->>ScreenshotUtils: Highlight bounding box on screenshot
        ScreenshotUtils->>ScreenshotUtils: Save to temp file
        ScreenshotUtils->>S3: PUT signed URL with screenshot
        ScreenshotUtils-->>Action: Return temp file path
        Action->>Action: Calculate click point (center of match)
        Action->>Robot: Mouse move + click
        Action-->>User: Success with coordinates & confidence
    else No Match
        Action->>ScreenshotUtils: uploadPlainScreenshot()
        ScreenshotUtils->>S3: PUT plain screenshot to S3
        Action-->>User: Failed - no match found
    end
    
    Action->>Action: Cleanup temp files
Loading
sequenceDiagram
    actor User
    participant Action as ClickOnImageEdgeMatch
    participant ImageMatchUtils as ImageMatchUtils
    participant ScreenshotUtils as ScreenshotUtils
    participant Robot as Robot/OS
    participant S3 as S3 Storage
    
    User->>Action: Execute action with image-url
    Action->>Action: Download reference image
    Action->>Robot: Capture full screen
    Action->>ImageMatchUtils: Convert both to grayscale
    Action->>ImageMatchUtils: sobelEdgeDetection() on screenshot
    Action->>ImageMatchUtils: searchMultiScale() on edge maps (threshold 0.45)
    ImageMatchUtils->>ImageMatchUtils: Iterate scales, compute NCC on Sobel derivatives
    ImageMatchUtils-->>Action: MatchResult with bounding box
    
    alt Match Found
        Action->>ScreenshotUtils: highlightAndUpload()
        ScreenshotUtils->>ScreenshotUtils: Draw highlight box + center marker
        ScreenshotUtils->>S3: Upload highlighted screenshot
        Action->>Action: Scale click point to logical coordinates
        Action->>Robot: Mouse move + click
        Action-->>User: Success
    else No Match
        Action->>ScreenshotUtils: uploadPlainScreenshot()
        ScreenshotUtils->>S3: Upload screenshot
        Action-->>User: Failed - match not found
    end
    
    Action->>Action: Cleanup temp files
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • Ganesh-Testsigma
  • vigneshtestsigma

Poem

🐰 Hops with glee through pixels and frames,
Template matching plays imaging games,
With Sobel's edge dance and NCC's blend,
Screenshots fly to the cloud, our new friend!
Click, highlight, upload—what a sight! 🎯✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main changes: adding image clicking functionality via edge-based detection and template matching across multiple new action classes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch CUS-11641

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnImageTemplateMatch.java (1)

40-134: Consider extracting shared logic with ClickOnImageEdgeMatch.

Both ClickOnImageTemplateMatch and ClickOnImageEdgeMatch share substantial code:

  • Screenshot capture and coordinate scaling logic (lines 55-81)
  • Image loading and grayscale conversion (lines 82-90)
  • Match result handling, highlighting, and click execution (lines 99-123)
  • Cleanup pattern (lines 129-133)

The only differences are the threshold value, whether edgeMode is true/false, and the edge preprocessing in ClickOnImageEdgeMatch.

Consider extracting a shared base class or helper method to reduce duplication and ensure consistent behavior (especially for the cleanup fix).

image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnCoordinates.java (1)

36-39: Consider providing clearer error message for invalid coordinate input.

When coordinates cannot be parsed as integers (e.g., "abc" or "12.5"), Integer.parseInt throws NumberFormatException. The generic exception handler at line 68-71 will catch this, but the error message won't clearly indicate that the input was invalid.

♻️ Optional: Add explicit validation
             String xCoordinateValue = xCoordinate.getValue().toString();
             String yCoordinateValue = yCoordinate.getValue().toString();
-            int x = Integer.parseInt(xCoordinateValue);
-            int y = Integer.parseInt(yCoordinateValue);
+            int x, y;
+            try {
+                x = Integer.parseInt(xCoordinateValue);
+                y = Integer.parseInt(yCoordinateValue);
+            } catch (NumberFormatException e) {
+                setErrorMessage("Invalid coordinates: x=" + xCoordinateValue + ", y=" + yCoordinateValue + ". Must be integers.");
+                return Result.FAILED;
+            }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnCoordinates.java`
around lines 36 - 39, In ClickOnCoordinates, the parsing of
xCoordinate.getValue() and yCoordinate.getValue() using Integer.parseInt can
throw NumberFormatException for invalid inputs; add explicit validation or a
dedicated try/catch around parsing of xCoordinateValue and yCoordinateValue (and
trim the strings) to detect non-integer input and throw or log a clear, specific
error (e.g., "Invalid coordinate value for x: '...', y: '...': must be integer")
instead of letting the generic exception handler handle it; update the parsing
block that assigns x and y to handle NumberFormatException and surface
actionable feedback via the same failure mechanism used elsewhere in this class.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@image_based_actions/src/main/java/com/testsigma/addons/util/ImageMatchUtils.java`:
- Around line 274-276: The code calls ImageIO.read(imageFile) and immediately
uses img.getWidth(), which can NPE if ImageIO.read returns null; update the
method in ImageMatchUtils to check if img == null after ImageIO.read(imageFile)
and handle it explicitly (for example, throw an IOException or a custom
exception with a clear message like "Unsupported image format or unreadable
file: " + imageFile.getName(), or return a sentinel value) so callers don't get
a NullPointerException; ensure the check is placed before accessing
img.getWidth()/getHeight() and adjust the method signature or callers if you
choose to propagate an exception.
- Around line 278-302: The downloadImage method currently calls
urlObject.openStream() without timeouts causing potential hangs; replace that
code path in downloadImage to use HttpURLConnection (cast from
urlObject.openConnection()), call setConnectTimeout(...) and
setReadTimeout(...), connect, then read the InputStream via
connection.getInputStream() inside the existing try-with-resources and
Files.copy flow, and ensure the connection is disconnected in a finally block
(or try-with-resources) while preserving the existing logging and returning of
tempFile; do not change the local-path branch that returns new File(url).

In
`@image_based_actions/src/main/java/com/testsigma/addons/util/ScreenshotUtils.java`:
- Around line 107-109: The uploadFile method logs the full presigned S3 URL
(s3SignedURL) which exposes temporary credentials; change the logging to avoid
printing query parameters or the full URL. Update uploadFile to mask the
presigned URL before logging—e.g., parse s3SignedURL and log only the
scheme+host+path (or a constant like "<masked-presigned-url>") instead of the
query string, or replace the query part with "...[REDACTED]"; keep using the
same logger variable and ensure any error logs do not include the original
s3SignedURL.

In
`@image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnImageEdgeMatch.java`:
- Around line 130-134: The finally block in ClickOnImageEdgeMatch calls
ImageMatchUtils.cleanupFile on targetImageFile which may delete a user's local
file because ImageMatchUtils.downloadImage can return the original File; fix by
tracking whether downloadImage produced a temp/copy and only cleanup those temp
files: when calling ImageMatchUtils.downloadImage from ClickOnImageEdgeMatch,
set a boolean (e.g., isTempTargetImage) based on downloadImage's contract (or
change downloadImage to return a small wrapper/flag) and then in the finally
block call ImageMatchUtils.cleanupFile(targetImageFile) only if
isTempTargetImage is true (leave cleanup of screenshotFile/highlightedFile
unchanged).

In
`@image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnImageTemplateMatch.java`:
- Around line 129-133: In ClickOnImageTemplateMatch, avoid deleting a
user-supplied local image by tracking whether targetImageFile (and
highlightedFile if created by us) were downloaded/created as temp files before
calling ImageMatchUtils.cleanupFile; add boolean flags (e.g., targetIsTemp,
highlightedIsTemp) where you download or generate files and only call
cleanupFile for those with the flag set, leaving user-provided paths untouched,
mirroring the fix used in ClickOnImageEdgeMatch.

---

Nitpick comments:
In
`@image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnCoordinates.java`:
- Around line 36-39: In ClickOnCoordinates, the parsing of
xCoordinate.getValue() and yCoordinate.getValue() using Integer.parseInt can
throw NumberFormatException for invalid inputs; add explicit validation or a
dedicated try/catch around parsing of xCoordinateValue and yCoordinateValue (and
trim the strings) to detect non-integer input and throw or log a clear, specific
error (e.g., "Invalid coordinate value for x: '...', y: '...': must be integer")
instead of letting the generic exception handler handle it; update the parsing
block that assigns x and y to handle NumberFormatException and surface
actionable feedback via the same failure mechanism used elsewhere in this class.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 77db9a8e-ce29-432f-8558-dbf0b25322f0

📥 Commits

Reviewing files that changed from the base of the PR and between 21e0da1 and da7cc5a.

📒 Files selected for processing (6)
  • image_based_actions/pom.xml
  • image_based_actions/src/main/java/com/testsigma/addons/util/ImageMatchUtils.java
  • image_based_actions/src/main/java/com/testsigma/addons/util/ScreenshotUtils.java
  • image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnCoordinates.java
  • image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnImageEdgeMatch.java
  • image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnImageTemplateMatch.java

Comment on lines +274 to +276
BufferedImage img = ImageIO.read(imageFile);
return new int[]{img.getWidth(), img.getHeight()};
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential NullPointerException if ImageIO.read returns null.

When no registered ImageReader can decode the file, the fallback at line 274 calls ImageIO.read(), which can return null for unsupported formats. Accessing img.getWidth() on line 275 would then throw an NPE.

🐛 Proposed fix
         BufferedImage img = ImageIO.read(imageFile);
+        if (img == null) {
+            throw new Exception("Unable to read image dimensions: unsupported format");
+        }
         return new int[]{img.getWidth(), img.getHeight()};
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
BufferedImage img = ImageIO.read(imageFile);
return new int[]{img.getWidth(), img.getHeight()};
}
BufferedImage img = ImageIO.read(imageFile);
if (img == null) {
throw new Exception("Unable to read image dimensions: unsupported format");
}
return new int[]{img.getWidth(), img.getHeight()};
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@image_based_actions/src/main/java/com/testsigma/addons/util/ImageMatchUtils.java`
around lines 274 - 276, The code calls ImageIO.read(imageFile) and immediately
uses img.getWidth(), which can NPE if ImageIO.read returns null; update the
method in ImageMatchUtils to check if img == null after ImageIO.read(imageFile)
and handle it explicitly (for example, throw an IOException or a custom
exception with a clear message like "Unsupported image format or unreadable
file: " + imageFile.getName(), or return a sentinel value) so callers don't get
a NullPointerException; ensure the check is placed before accessing
img.getWidth()/getHeight() and adjust the method signature or callers if you
choose to propagate an exception.

Comment on lines +278 to +302
public static File downloadImage(String fileName, String url, Logger logger) {
try {
if (url.startsWith("https://") || url.startsWith("http://")) {
logger.info("Downloading: " + url);
URL urlObject = new URL(url);
String extension = ".png";
String urlPath = urlObject.getPath();
int dot = urlPath.lastIndexOf('.');
if (dot > 0) extension = urlPath.substring(dot);

File tempFile = File.createTempFile(fileName, extension);
try (InputStream in = urlObject.openStream()) {
Files.copy(in, tempFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
}
logger.info("Downloaded to: " + tempFile.getAbsolutePath());
return tempFile;
} else {
logger.info("Local path: " + url);
return new File(url);
}
} catch (Exception e) {
logger.info("Image access error: " + url);
throw new RuntimeException("Unable to access image: " + url, e);
}
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Missing timeout on URL stream could cause indefinite hangs.

urlObject.openStream() has no timeout configuration. If the remote server is slow or unresponsive, this can block indefinitely, potentially causing test execution to hang.

Consider using HttpURLConnection with explicit timeouts or the Apache HttpClient (already a dependency) for consistency with ScreenshotUtils.

🛠️ Proposed fix using HttpURLConnection with timeouts
-            try (InputStream in = urlObject.openStream()) {
-                Files.copy(in, tempFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
-            }
+            java.net.HttpURLConnection conn = (java.net.HttpURLConnection) urlObject.openConnection();
+            conn.setConnectTimeout(30000);
+            conn.setReadTimeout(30000);
+            try (InputStream in = conn.getInputStream()) {
+                Files.copy(in, tempFile.toPath(), StandardCopyOption.REPLACE_EXISTING);
+            } finally {
+                conn.disconnect();
+            }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@image_based_actions/src/main/java/com/testsigma/addons/util/ImageMatchUtils.java`
around lines 278 - 302, The downloadImage method currently calls
urlObject.openStream() without timeouts causing potential hangs; replace that
code path in downloadImage to use HttpURLConnection (cast from
urlObject.openConnection()), call setConnectTimeout(...) and
setReadTimeout(...), connect, then read the InputStream via
connection.getInputStream() inside the existing try-with-resources and
Files.copy flow, and ensure the connection is disconnected in a finally block
(or try-with-resources) while preserving the existing logging and returning of
tempFile; do not change the local-path branch that returns new File(url).

Comment on lines +130 to +134
} finally {
ImageMatchUtils.cleanupFile(screenshotFile);
ImageMatchUtils.cleanupFile(targetImageFile);
ImageMatchUtils.cleanupFile(highlightedFile);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Risk of deleting user's local image file.

When imageUrl is a local file path (not HTTP/HTTPS), ImageMatchUtils.downloadImage returns that same File object without copying it. The cleanupFile(targetImageFile) call in the finally block will then delete the user's original file.

🐛 Proposed fix: Only cleanup temp files

Track whether the file is a temp download or a local reference:

-        File targetImageFile = null;
+        File targetImageFile = null;
+        boolean isDownloadedTemp = false;
         ...
-            targetImageFile = ImageMatchUtils.downloadImage("target_image", imageUrlValue, logger);
+            String imageUrlValue = imageUrl.getValue().toString();
+            isDownloadedTemp = imageUrlValue.startsWith("http://") || imageUrlValue.startsWith("https://");
+            targetImageFile = ImageMatchUtils.downloadImage("target_image", imageUrlValue, logger);
         ...
         } finally {
             ImageMatchUtils.cleanupFile(screenshotFile);
-            ImageMatchUtils.cleanupFile(targetImageFile);
+            if (isDownloadedTemp) {
+                ImageMatchUtils.cleanupFile(targetImageFile);
+            }
             ImageMatchUtils.cleanupFile(highlightedFile);
         }

Alternatively, modify downloadImage to always create a temp copy for local files, or return a wrapper indicating whether cleanup is safe.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@image_based_actions/src/main/java/com/testsigma/addons/windows/ClickOnImageEdgeMatch.java`
around lines 130 - 134, The finally block in ClickOnImageEdgeMatch calls
ImageMatchUtils.cleanupFile on targetImageFile which may delete a user's local
file because ImageMatchUtils.downloadImage can return the original File; fix by
tracking whether downloadImage produced a temp/copy and only cleanup those temp
files: when calling ImageMatchUtils.downloadImage from ClickOnImageEdgeMatch,
set a boolean (e.g., isTempTargetImage) based on downloadImage's contract (or
change downloadImage to return a small wrapper/flag) and then in the finally
block call ImageMatchUtils.cleanupFile(targetImageFile) only if
isTempTargetImage is true (leave cleanup of screenshotFile/highlightedFile
unchanged).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants