OpenAI with ROS: Chapter 10: Broken with master branch of openai/baselines

I am currently trying to run the code of Chapter 10 on my local PC, but I keep running into problems. When I run the train.py script from this chapter (see this repository for the exact code), I get the following error:

AttributeError: 'function' object has no attribute 'reset' in rollout.py
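For illustration, the message says that something called .reset() on a plain function. A minimal sketch of that pattern follows, assuming the worker receives an environment factory where it expects an environment instance. FakeEnv, make_env and FakeRolloutWorker are hypothetical stand-ins, not the actual baselines classes.

```python
class FakeEnv:
    """Hypothetical stand-in for a gym environment."""

    def reset(self):
        return {"observation": [0.0, 0.0, 0.0]}


def make_env():
    """Hypothetical factory of the kind a training script might pass around."""
    return FakeEnv()


class FakeRolloutWorker:
    """Hypothetical worker that expects an environment instance, not a factory."""

    def __init__(self, env):
        # Fails if `env` is a function, because functions have no reset().
        self.obs_dict = env.reset()


# Passing the factory itself reproduces the AttributeError.
try:
    FakeRolloutWorker(make_env)
except AttributeError as err:
    error_message = str(err)

# Calling the factory first and passing the instance works.
worker = FakeRolloutWorker(make_env())
```

In this sketch the only difference between the failing and the working call is whether the factory is called before it is handed to the worker.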

As stated in this issue, this happens because the code of Chapter 10 is based on an old version of the baselines package (commit 146bbf8). The issue describes two solutions to this problem:

Option 1: Revert to the old version

I tried this, but unfortunately I ran into the following error:

ModuleNotFoundError: No module named 'mujoco_py'

As I would rather modify my train.py script than buy an additional MuJoCo license, I tried the second solution.

Option 2: Initiate environment before initializing RolloutWorker

I tried adding the exact code given in the issue before the RolloutWorker:

from baselines.common.cmd_util import make_vec_env
env = make_vec_env(env, 'robotics', num_env=1, seed=None)

However, when I run the train_modified.py script, I receive the following error:

Traceback (most recent call last):
  File "./src/my_fetch_train/scripts/train_modified.py", line 280, in <module>
    main()
  File "/home/ricks/.catkin_ws_python3/openai_venv/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/ricks/.catkin_ws_python3/openai_venv/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/ricks/.catkin_ws_python3/openai_venv/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ricks/.catkin_ws_python3/openai_venv/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "./src/my_fetch_train/scripts/train_modified.py", line 276, in main
    launch(**kwargs)
  File "./src/my_fetch_train/scripts/train_modified.py", line 210, in launch
    env_vec, policy, dims, logger, **rollout_params
  File "/home/ricks/Development/robot_academy_ws/src/baselines/baselines/her/util.py", line 36, in wrapper
    return method(*positional_args, **keyword_args)
  File "/home/ricks/Development/robot_academy_ws/src/baselines/baselines/her/rollout.py", line 41, in __init__
    self.reset_all_rollouts()
  File "/home/ricks/Development/robot_academy_ws/src/baselines/baselines/her/rollout.py", line 46, in reset_all_rollouts
    self.initial_o = self.obs_dict['observation']
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
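This IndexError is the message NumPy gives when an ndarray is indexed with a string. That suggests (an assumption, not confirmed in this thread) that the environment's reset() returned a flat array instead of a dict with an 'observation' key. A minimal sketch of the difference, using made-up observation shapes and a hypothetical helper initial_observation:

```python
import numpy as np

# Goal-based environments return a dict of arrays (shapes here are made up).
dict_obs = {
    "observation": np.zeros(3),
    "achieved_goal": np.zeros(2),
    "desired_goal": np.zeros(2),
}

# A flattened observation is a single array with no string keys.
flat_obs = np.zeros(7)


def initial_observation(obs):
    """Hypothetical helper mirroring the obs_dict['observation'] lookup."""
    return obs["observation"]


# Works for the dict observation.
first = initial_observation(dict_obs)

# Fails for the flat array with the same kind of IndexError as in the traceback.
try:
    initial_observation(flat_obs)
except IndexError as err:
    error_message = str(err)
```

If that reading is right, the fix would be to make sure the environment handed to RolloutWorker keeps its dict observations rather than flattening them.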

I presume this happens because the DummyEnv class is used instead of my own environment. I therefore tried the following code:

from baselines.common.cmd_util import make_robotics_env
env_vec = make_robotics_env(env, seed=rank_seed, rank=rank)

But this also gave me the error above. Lastly, I tried passing a normal gym.Env instead of a vectorized env, using the following code:

    rollout_worker = RolloutWorker(
        env_vec, policy, dims, logger, **rollout_params
    )
    rollout_worker.seed(rank_seed)

But I then received the following error:

Traceback (most recent call last):
  File "./src/my_fetch_train/scripts/train_modified.py", line 289, in <module>
    main()
  File "/home/ricks/.catkin_ws_python3/openai_venv/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/ricks/.catkin_ws_python3/openai_venv/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/ricks/.catkin_ws_python3/openai_venv/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ricks/.catkin_ws_python3/openai_venv/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "./src/my_fetch_train/scripts/train_modified.py", line 285, in main
    launch(**kwargs)
  File "./src/my_fetch_train/scripts/train_modified.py", line 221, in launch
    rollout_worker.seed(rank_seed)
AttributeError: 'RolloutWorker' object has no attribute 'seed'
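This last error only says that the installed RolloutWorker has no seed method. One defensive pattern, sketched here with hypothetical stand-in classes rather than the real baselines code, is to call seed only when the object provides it:

```python
class WorkerWithoutSeed:
    """Hypothetical stand-in for a worker that has no seed() method."""


class WorkerWithSeed:
    """Hypothetical stand-in for a worker that does expose seed()."""

    def __init__(self):
        self.last_seed = None

    def seed(self, seed):
        self.last_seed = seed


def seed_if_supported(worker, seed):
    """Call worker.seed(seed) if it exists; return True when it was called."""
    seed_fn = getattr(worker, "seed", None)
    if callable(seed_fn):
        seed_fn(seed)
        return True
    return False


old_style = WorkerWithSeed()
seeded_old = seed_if_supported(old_style, 42)
seeded_new = seed_if_supported(WorkerWithoutSeed(), 42)
```

Note that skipping the call leaves the worker unseeded. In that case the seed has to be set wherever the environment is created, for example through the seed argument of make_vec_env shown earlier.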

I was therefore wondering whether somebody has an example of how to use the HER algorithm with the master branch of openai/baselines.

The problems described above were all solved when I switched to the stable_baselines fork.

OK, thanks for also posting the solution. Good to know for anyone who wants to run this locally. Thanks a lot.