Chapter 6: Bayesian Multi-armed Bandits Code

After carefully studying the example code for the multi-armed bandit in Chapter 6, I found a piece of code that I believe is missing a parameter:

```
def sample_bandits(self, n=1):
    bb_score = np.zeros(n)
    choices = np.zeros(n)

    for k in range(n):
        # sample from the bandits' priors, and select the largest sample
        choice = np.argmax(np.random.beta(1 + self.wins, 1 + self.trials - self.wins))

        # sample the chosen bandit
        result = self.bandits.pull(choice)
```

Here, `np.random.beta(1 + self.wins, 1 + self.trials - self.wins)` is called without the `size` parameter, so as far as I can tell it returns a single value rather than an array. In that case, using `np.argmax()` to pick a bandit is useless, because it will always return 0.

Shouldn't the code be `np.random.beta(1 + self.wins, 1 + self.trials - self.wins, len(self.n_bandits))`?
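
A small experiment along these lines may help settle the question. It is only a sketch: it assumes `self.wins` and `self.trials` are NumPy arrays with one entry per bandit (the snippet does not show how they are initialised), and the names `wins`, `trials`, and `sample_shapes` are hypothetical stand-ins. NumPy broadcasts array-valued parameters, so under that assumption the call returns one draw per bandit even without `size`; only scalar parameters would give a single value and make `np.argmax()` always return 0.

```python
import numpy as np

# Hypothetical stand-ins for self.wins and self.trials (three bandits).
wins = np.array([2.0, 5.0, 1.0])
trials = np.array([10.0, 10.0, 10.0])


def sample_shapes(wins, trials):
    """Return the shapes of beta draws for three ways of calling np.random.beta."""
    # Array parameters, no size: broadcast to one draw per bandit.
    array_draw = np.random.beta(1 + wins, 1 + trials - wins)
    # Array parameters with an explicit size matching the number of bandits.
    sized_draw = np.random.beta(1 + wins, 1 + trials - wins, size=len(wins))
    # Scalar parameters, no size: a single float.
    scalar_draw = np.random.beta(1 + 2.0, 1 + 10.0 - 2.0)
    return np.shape(array_draw), np.shape(sized_draw), np.shape(scalar_draw)


array_shape, sized_shape, scalar_shape = sample_shapes(wins, trials)
print(array_shape, sized_shape, scalar_shape)

# argmax over the per-bandit draws picks an index between 0 and 2.
choice = int(np.argmax(np.random.beta(1 + wins, 1 + trials - wins)))
print(choice)
```

If the first two shapes match, passing `size` explicitly would not change the behaviour, although it might make the intent clearer to readers.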

Chapter 6: Bayesian Multi-armed Bandits Code #546
