
Unable to train controller on colab #26

@MUHAMMAD0KASHIF


!xvfb-run -a -s "-screen 0 1400x900x24" python 05_train_controller.py car_racing --num_worker 16 --num_worker_trial 2 --num_episode 4 --max_length 1000 --eval_steps 25

/usr/local/lib/python3.6/dist-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
from ._conv import register_converters as _register_converters
Using TensorFlow backend.
['mpirun', '-np', '17', '/usr/bin/python3', '05_train_controller.py', 'car_racing', '--num_worker', '16', '--num_worker_trial', '2', '--num_episode', '4', '--max_length', '1000', '--eval_steps', '25']

mpirun has detected an attempt to run as root.
Running at root is strongly discouraged as any mistake (e.g., in
defining TMPDIR) or bug can result in catastrophic damage to the OS
file system, leaving your system in an unusable state.

You can override this protection by adding the --allow-run-as-root
option to your cmd line. However, we reiterate our strong advice
against doing so - please do so at your own risk.

Traceback (most recent call last):
File "05_train_controller.py", line 525, in <module>
if "parent" == mpi_fork(args.num_worker+1): os.exit()
File "05_train_controller.py", line 492, in mpi_fork
subprocess.check_call(["mpirun", "-np", str(n), sys.executable] +['-u']+ sys.argv, env=env)
File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['mpirun', '-np', '17', '/usr/bin/python3', '-u', '05_train_controller.py', 'car_racing', '--num_worker', '16', '--num_worker_trial', '2', '--num_episode', '4', '--max_length', '1000', '--eval_steps', '25']' returned non-zero exit status 1.
[36c641d9ccde:04530] *** Process received signal ***
[36c641d9ccde:04530] Signal: Segmentation fault (11)
[36c641d9ccde:04530] Signal code: Address not mapped (1)
[36c641d9ccde:04530] Failing at address: 0x7f395ad0320d
[36c641d9ccde:04530] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f395ddb2890]
[36c641d9ccde:04530] [ 1] /lib/x86_64-linux-gnu/libc.so.6(getenv+0xa5)[0x7f395d9f1785]
[36c641d9ccde:04530] [ 2] /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4(_ZN13TCMallocGuardD1Ev+0x34)[0x7f395e25ce44]
[36c641d9ccde:04530] [ 3] /lib/x86_64-linux-gnu/libc.so.6(__cxa_finalize+0xf5)[0x7f395d9f2615]
[36c641d9ccde:04530] [ 4] /usr/lib/x86_64-linux-gnu/libtcmalloc.so.4(+0x13cb3)[0x7f395e25acb3]
[36c641d9ccde:04530] *** End of error message ***
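
The log above shows mpirun exiting because the notebook process runs as root, and mpirun's own message names `--allow-run-as-root` as the override. A minimal sketch of how the command list passed to `subprocess.check_call` in `mpi_fork` could be built with that flag follows. The helper name `build_mpirun_command` and its `as_root` parameter are hypothetical and not part of the repository, and the sketch assumes an Open MPI `mpirun` that accepts the flag. Open MPI advises against running as root, so this is probably only reasonable on a disposable machine such as a Colab VM.

```python
import os
import sys


def build_mpirun_command(n, argv, executable=sys.executable, as_root=None):
    """Return the mpirun argument list for n processes.

    Hypothetical helper (not in the original script): it mirrors the list
    built inside mpi_fork and adds --allow-run-as-root when running as root.
    """
    if as_root is None:
        # os.geteuid exists only on POSIX systems; treat other platforms
        # as non-root.
        as_root = hasattr(os, "geteuid") and os.geteuid() == 0

    cmd = ["mpirun", "-np", str(n)]
    if as_root:
        # Flag named in mpirun's own error message; use at your own risk.
        cmd.append("--allow-run-as-root")
    # Same tail as the original call: interpreter, unbuffered flag, script args.
    return cmd + [executable, "-u"] + list(argv)
```

Inside `mpi_fork`, the result would replace the inline list, for example `subprocess.check_call(build_mpirun_command(n, sys.argv), env=env)`. The segmentation fault in `libtcmalloc` is reported while the process is exiting, so it may be a side effect of the failed launch rather than a separate problem, but the log alone does not confirm that.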
