how to troubleshoot storm worker crash #466

@z11373

Hi, sorry to post this here, but I am hoping someone can suggest how to troubleshoot a worker crash we are seeing and track down its cause. We run our Python code with streamparse on Storm 1.1.1.
Below is the log I captured before the worker was recycled after the crash. I am running out of ideas for troubleshooting it, so I would really appreciate any ideas or pointers. Thanks!

2019-08-28 15:05:32.947 o.a.s.s.ShellSpout Thread-11-event_spout-executor[10 10] [INFO] Launched subprocess with pid 10054
2019-08-28 15:05:32.951 o.a.s.d.executor Thread-11-event_spout-executor[10 10] [INFO] Opened spout event_spout:(10)
2019-08-28 15:05:32.953 o.a.s.d.executor Thread-11-event_spout-executor[10 10] [INFO] Activating spout event_spout:(10)
2019-08-28 15:05:32.953 o.a.s.s.ShellSpout Thread-11-event_spout-executor[10 10] [INFO] Start checking heartbeat...
2019-08-28 15:05:32.961 o.a.s.util Thread-11-event_spout-executor[10 10] [ERROR] Async loop died!
java.lang.RuntimeException: pid:10054, name:event_spout exitCode:-1, errorString:
at org.apache.storm.spout.ShellSpout.querySubprocess(ShellSpout.java:218) ~[storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.spout.ShellSpout.sendSyncCommand(ShellSpout.java:145) ~[storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.spout.ShellSpout.activate(ShellSpout.java:266) ~[storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.daemon.executor$fn__4962$fn__4977$fn__5008.invoke(executor.clj:641) ~[storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:484) [storm-core-1.1.1.jar:1.1.1]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: java.lang.RuntimeException: org.apache.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read.
Serializer Exception:

    at org.apache.storm.utils.ShellProcess.readShellMsg(ShellProcess.java:127) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.spout.ShellSpout.querySubprocess(ShellSpout.java:183) ~[storm-core-1.1.1.jar:1.1.1]
    ... 6 more

2019-08-28 15:05:32.968 o.a.s.d.executor Thread-11-event_spout-executor[10 10] [ERROR]
java.lang.RuntimeException: pid:10054, name:event_spout exitCode:-1, errorString:
at org.apache.storm.spout.ShellSpout.querySubprocess(ShellSpout.java:218) ~[storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.spout.ShellSpout.sendSyncCommand(ShellSpout.java:145) ~[storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.spout.ShellSpout.activate(ShellSpout.java:266) ~[storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.daemon.executor$fn__4962$fn__4977$fn__5008.invoke(executor.clj:641) ~[storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:484) [storm-core-1.1.1.jar:1.1.1]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: java.lang.RuntimeException: org.apache.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read.
Serializer Exception:

    at org.apache.storm.utils.ShellProcess.readShellMsg(ShellProcess.java:127) ~[storm-core-1.1.1.jar:1.1.1]
    at org.apache.storm.spout.ShellSpout.querySubprocess(ShellSpout.java:183) ~[storm-core-1.1.1.jar:1.1.1]
    ... 6 more

2019-08-28 15:05:33.009 o.a.s.util Thread-11-event_spout-executor[10 10] [ERROR] Halting process: ("Worker died")
java.lang.RuntimeException: ("Worker died")
at org.apache.storm.util$exit_process_BANG_.doInvoke(util.clj:341) [storm-core-1.1.1.jar:1.1.1]
at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.7.0.jar:?]
at org.apache.storm.daemon.worker$fn__5632$fn__5633.invoke(worker.clj:763) [storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.daemon.executor$mk_executor_data$fn__4848$fn__4849.invoke(executor.clj:276) [storm-core-1.1.1.jar:1.1.1]
at org.apache.storm.util$async_loop$fn__557.invoke(util.clj:494) [storm-core-1.1.1.jar:1.1.1]
at clojure.lang.AFn.run(AFn.java:22) [clojure-1.7.0.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
2019-08-28 15:05:33.018 o.a.s.d.worker Thread-16 [INFO] Shutting down worker tmon-4-1567019114 ba5b3695-b390-4c3e-9d92-af0771f17b86 6700
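
The log shows the Python subprocess (pid 10054) going silent within milliseconds of launch, and the `errorString` is empty, so Storm never saw a traceback from it. One check that might narrow this down is to start the spout command by hand, feed it a multilang setup message, and look at its stderr directly. Below is a rough sketch of that idea, not a verified reproduction. The helper name `multilang_handshake`, the empty `conf`, and the minimal `context` are assumptions for illustration; a real component may expect more fields, and the actual command line for the spout has to be supplied by whoever runs it.

```python
import json
import subprocess
import sys
import tempfile


def multilang_handshake(cmd, timeout=10):
    """Launch `cmd`, send a Storm multilang setup message on stdin, and
    return (returncode, stdout, stderr) so a startup traceback is visible.

    The message layout (JSON followed by a line containing "end") follows
    the multilang protocol; the conf/context values are placeholders.
    """
    with tempfile.TemporaryDirectory() as pid_dir:
        setup = {
            "conf": {},          # assumption: real topologies pass their full config
            "pidDir": pid_dir,   # the component is expected to write a pid file here
            "context": {         # assumption: minimal stand-in for the topology context
                "task->component": {"1": "event_spout"},
                "taskid": 1,
            },
        }
        payload = json.dumps(setup) + "\nend\n"
        proc = subprocess.Popen(
            cmd,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True,
        )
        try:
            out, err = proc.communicate(payload, timeout=timeout)
        except subprocess.TimeoutExpired:
            # A component that keeps waiting for input never exits on its own.
            proc.kill()
            out, err = proc.communicate()
        return proc.returncode, out, err


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python handshake.py <command that starts the spout> [args...]
    rc, out, err = multilang_handshake(sys.argv[1:])
    print("exit code:", rc)
    print("stdout:", out)
    print("stderr:", err)
```

If the subprocess exits immediately with a non-zero code and a traceback on stderr (an import error, a missing dependency in the worker's virtualenv, and so on), that would point at the Python side rather than at Storm itself. Running it as the same user and in the same environment as the Storm worker matters, since the failure may only show up there.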
