
Threading (?) causing glibc/low level crashes #155

@fricpa


Platform

Knowing the platform greatly narrows down the potential causes of the problem.

  • Platform: linux-arm32/64, Raspberry Pi 3/4, amd64
  • OS version: buster (arm32), bookworm (arm64 = aarch64), Ubuntu 24.04
  • hid4java version: 0.8.0
  • OpenJDK 11.0.23 (arm32, amd64) and 17.0.11 (aarch64), respectively, on those platforms

To Reproduce

Steps to reproduce the behavior:

Write a trivial program:

HidServices hidServices =
            HidManager.getHidServices(new HidServicesSpecification());
while (true) hidServices.getAttachedHidDevices();

Let it run for a while on the specified platforms.

Expected behavior

Runs without issues forever.

Screenshots and logs

I have observed three crash modes so far. All of them often appear within a few minutes of running that loop. Sometimes, however, they don't appear for a long time, or only after I have plugged in some devices and read/written some data to them.

Note that I have a little script running the app and logging some output, but the basic program is as above:

2024-07-18T10:53:34,225 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
2024-07-18T10:53:34,226 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T10:53:34,227 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
2024-07-18T10:53:34,228 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T10:53:34,229 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
double free or corruption (!prev)
./run.sh: line 7:  1484 Aborted                 MAVEN_OPTS="-ea" mvn package exec:java "-Dexec.mainClass=org.example.Main"
FATAL ERROR EXIT CODE 134 AT ./run.sh:7

2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
corrupted size vs. prev_size
./run.sh: line 7:  2845 Aborted                 MAVEN_OPTS="-ea" mvn package exec:java "-Dexec.mainClass=org.example.Main"
FATAL ERROR EXIT CODE 134 AT ./run.sh:7

2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T10:53:44,120 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
    #
    # A fatal error has been detected by the Java Runtime Environment:
    #
    #  SIGSEGV (0xb) at pc=0x0000007fa7bada9c, pid=2754, tid=2802
    #
    # JRE version: OpenJDK Runtime Environment (17.0.11+9) (build 17.0.11+9-Debian-1deb12u1)
    # Java VM: OpenJDK 64-Bit Server VM (17.0.11+9-Debian-1deb12u1, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-aarch64)
    # Problematic frame:
    # C  [libc.so.6+0x8da9c]
    [timeout occurred during error reporting in step "printing problematic frame"] after 30 s.
    # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
    #
    # An error report file with more information is saved as:
    # /home/pi/hid4java-apd-test/hs_err_pid2754.log
    # [ timer expired, abort... ]
    ./run.sh: line 7:  2754 Aborted                 MAVEN_OPTS="-ea" mvn package exec:java "-Dexec.mainClass=org.example.Main"
    FATAL ERROR EXIT CODE 134 AT ./run.sh:7

Or, on Ubuntu 24.04:

2024-07-18T11:07:20,940 INFO  [org.example.Main.main()] org.example.Main - =======================
2024-07-18T11:07:20,940 INFO  [org.example.Main.main()] org.example.Main - enumerate hid devices...
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x000074a8142ab7ec, pid=12909, tid=12965
#
# JRE version: OpenJDK Runtime Environment (11.0.23+9) (build 11.0.23+9-post-Ubuntu-1ubuntu1)
# Java VM: OpenJDK 64-Bit Server VM (11.0.23+9-post-Ubuntu-1ubuntu1, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# C  [libc.so.6+0xab7ec]
[timeout occurred during error reporting in step "printing problematic frame"] after 30 s.
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/ubuntu/hid4java-apd-test/core.12909)
#
# An error report file with more information is saved as:
# /home/ubuntu/hid4java-apd-test/hs_err_pid12909.log

Additional information

I have not observed any of these failure modes on amd64 Windows 10; the loop seems to run forever there, as it should.

On Linux, however, it is definitely broken on every platform I tested.

It seems that many such issues can be caused by calling native code from multiple Java threads:

https://stackoverflow.com/questions/22491797/java-double-free-or-corruption
https://stackoverflow.com/questions/49628615/understanding-corrupted-size-vs-prev-size-glibc-error
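
To illustrate the general idea behind those answers, here is a minimal sketch that funnels every call through one dedicated thread, so the calling code never enters a native library from two Java threads at once. The class name `NativeCallSerializer` and the thread name `native-hid` are hypothetical, made up for this example; they are not part of hid4java. The sketch is untested against the crashes above, and it only serializes calls the application makes itself. Any background threads inside a library are out of its reach.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

/**
 * Hypothetical helper (not part of hid4java): runs every submitted call on
 * one dedicated thread, so calls made through it can never overlap.
 */
final class NativeCallSerializer implements AutoCloseable {

    // A single worker thread; tasks are executed strictly one after another.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "native-hid");
        t.setDaemon(true); // do not keep the JVM alive just for this thread
        return t;
    });

    /** Runs the call on the dedicated thread and blocks until it has finished. */
    <T> T call(Supplier<T> nativeCall) {
        Callable<T> task = nativeCall::get;
        try {
            return executor.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the call", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("call failed", e.getCause());
        }
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

With the reproducer above, the loop body would become something like `serializer.call(() -> hidServices.getAttachedHidDevices())`, where `serializer` is an instance of this hypothetical class.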

I don't quite understand why hid4java needs any threads in the first place:

[image]

At least for my use case, all I would need are synchronous enumeration and synchronous read and write (with timeout), all of which are synchronous calls in hidapi.

FWIW, I have attached the hs_err log files:
hs_err_pid2754.log
hs_err_pid12909.log
