ValueError when using temperature_action_probs as p argument in np.random.choice #2

@martinholecekmax

Description

When temperature_action_probs is passed as the p argument to np.random.choice, a ValueError is raised with the message "probabilities do not sum to 1". This happens because temperature_action_probs is raised to the power of 1 / self.args["temperature"], and for any temperature other than 1 the resulting values generally no longer sum to 1.

To fix this issue, you can normalize temperature_action_probs so that its values sum to 1. You can do this by dividing temperature_action_probs by its sum:

temperature_action_probs /= np.sum(temperature_action_probs)

This will ensure that the values in temperature_action_probs sum to 1, which will allow you to use it as the p argument for np.random.choice.
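The normalization step can be sketched as a small helper. This is only an illustration under stated assumptions: the function name sample_with_temperature and its arguments are hypothetical and do not come from the repository, and temperature stands in for self.args["temperature"].

```python
import numpy as np

def sample_with_temperature(action_probs, temperature):
    """Apply temperature scaling, renormalize, and sample an action index.

    Hypothetical helper for illustration; not part of the original code base.
    """
    probs = np.asarray(action_probs, dtype=np.float64)
    # Temperature scaling: for temperature != 1 the values generally stop summing to 1.
    temperature_action_probs = probs ** (1.0 / temperature)
    # Renormalize so the vector is a valid probability distribution again.
    temperature_action_probs = temperature_action_probs / np.sum(temperature_action_probs)
    action = np.random.choice(len(temperature_action_probs), p=temperature_action_probs)
    return action, temperature_action_probs

action, normalized = sample_with_temperature([0.5, 0.3, 0.2], temperature=1.25)
```

The sketch divides out of place rather than with /=, since in-place division fails on integer arrays. It also assumes at least one probability is positive; an all-zero vector would need separate handling.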

Steps to Reproduce:

1. Run the code with temperature_action_probs as the p argument in np.random.choice.
2. Observe the ValueError with the message "probabilities do not sum to 1".

Expected Behavior:

The np.random.choice function should be able to accept temperature_action_probs as the p argument without raising a ValueError.

Actual Behavior:

A ValueError is raised with the message "probabilities do not sum to 1".
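The behavior described above can likely be reproduced in isolation with a few lines. The probability values and the temperature of 1.25 are made-up inputs chosen for illustration; they are not taken from the repository.

```python
import numpy as np

# Hypothetical policy output that sums to 1 (illustrative values only).
action_probs = np.array([0.5, 0.3, 0.2])
temperature = 1.25  # stands in for self.args["temperature"]

# Same transformation as in the issue, without renormalizing afterwards.
temperature_action_probs = action_probs ** (1 / temperature)
total = temperature_action_probs.sum()  # greater than 1 for these inputs

try:
    np.random.choice(len(action_probs), p=temperature_action_probs)
    error_message = None
except ValueError as exc:
    # np.random.choice rejects p when its sum is not 1 within tolerance.
    error_message = str(exc)

print(total, error_message)
```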

Fix:

Normalize temperature_action_probs so that its values sum to 1 by dividing it by its sum:

temperature_action_probs /= np.sum(temperature_action_probs)
