Description
Here is the code from reinforce.py:

```python
for action, r in zip(self.saved_actions, rewards):
    action.reinforce(r)
```
And here is the code from actor_critic.py:

```python
for (action, value), r in zip(saved_actions, rewards):
    reward = r - value.data[0, 0]
    action.reinforce(reward)
    value_loss += F.smooth_l1_loss(value, Variable(torch.Tensor([r])))
```
Since the action is reinforced with r - V(s), an advantage estimate, rather than the raw return, I consider this to be Asynchronous Advantage Actor-Critic (A3C), not plain Actor-Critic.
Activity
jeasinema commented on Oct 31, 2017
Yes, I partly agree with you, but with a small correction: the algorithm implemented should be an offline version of A2C (Advantage Actor-Critic).
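
For reference, here is a minimal sketch of the advantage-weighted update being discussed, written with explicit losses instead of the old `action.reinforce()` / `Variable` API (which has since been removed from PyTorch). The names `saved_log_probs`, `saved_values`, and `returns` are assumptions for illustration, not the repository's code:

```python
# Sketch of an advantage actor-critic (A2C-style) update, not the repo's code.
# Instead of action.reinforce(r - V(s)), the policy loss is built explicitly
# from saved log-probabilities, then backpropagated once per episode.
import torch
import torch.nn.functional as F

def a2c_losses(saved_log_probs, saved_values, returns):
    """saved_log_probs: list of log pi(a_t | s_t) tensors (scalars)
       saved_values:    list of V(s_t) tensors (scalars)
       returns:         list of discounted returns R_t (Python floats)"""
    policy_loss, value_loss = 0.0, 0.0
    for log_prob, value, r in zip(saved_log_probs, saved_values, returns):
        advantage = r - value.item()                      # same role as r - value.data[0, 0]
        policy_loss = policy_loss - log_prob * advantage  # REINFORCE with a value baseline
        value_loss = value_loss + F.smooth_l1_loss(value, torch.tensor(r))
    return policy_loss, value_loss

# Typical usage after an episode (optimizer setup omitted):
#   loss = policy_loss + value_loss
#   loss.backward()
#   optimizer.step()
```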