Hi Haarnoja,
Thanks a lot for maintaining the amazing repo!
I am a little confused about the implementation of SVGD in soft Q-learning.
At softlearning/softlearning/algorithms/sql.py, line 281 in 05daa55:
| log_probs = svgd_target_values + squash_correction |
the log-probability is computed as log_probs = svgd_target_values + squash_correction, which, as far as I can tell, is the log-probability over the raw (pre-squash) actions, because the squash correction has been added.
However, the SVGD update that follows appears to treat it as the log-probability over the squashed actions.
I think softlearning/softlearning/algorithms/sql.py, line 235 in 05daa55:
| actions = self._policy.actions(expanded_observations) |
should instead be actions = self._policy.raw_actions(expanded_observations).
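To make the distinction concrete, here is a minimal one-dimensional sketch in plain Python. It is not the repo's code: the toy Q-function q, the helper names, and the EPS value are all assumptions made for illustration, and the squashing is assumed to be a = tanh(u). It only shows that Q(a) and Q(a) + squash_correction are log-densities over different variables (a and u respectively), which is the mismatch I am asking about.

```python
import math

EPS = 1e-6  # small stability constant; the value here is an assumption


def q(a):
    """Hypothetical toy soft Q-value of a squashed action a in (-1, 1)."""
    return -2.0 * (a - 0.3) ** 2


def log_prob_squashed(a):
    """Unnormalized log-density over the squashed action a."""
    return q(a)


def log_prob_raw(u):
    """Unnormalized log-density over the raw action u, with a = tanh(u).

    The extra term is the log-Jacobian of tanh, log(1 - a^2).
    """
    a = math.tanh(u)
    squash_correction = math.log(1.0 - a ** 2 + EPS)
    return q(a) + squash_correction


def integrate(f, lo, hi, n=50_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


# Both densities describe the same distribution, so their total masses agree
# (up to discretization error and the small EPS term).
mass_squashed = integrate(lambda a: math.exp(log_prob_squashed(a)), -1.0, 1.0)
mass_raw = integrate(lambda u: math.exp(log_prob_raw(u)), -10.0, 10.0)

# At a single matching point the two log-densities differ by the correction.
u0 = 0.5
gap = log_prob_raw(u0) - log_prob_squashed(math.tanh(u0))
```

If this reading is right, a gradient of the corrected log-density should be taken with respect to the raw actions u, while a gradient with respect to the squashed actions a should use Q(a) without the correction.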
Best,
Yuxuan