simulations-in-class-team1-3/Simulations-InClass_nour.Rmd at master · PHP2560-Statistical-Programming-R/simulations-in-class-team1-3 · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
---
title: "Simulations In-Class Project"
date: "Due October 13, 2017 at 11:59pm"
output:
  html_document


---

<style type="text/css">
.table {

    width: 80%;
    margin-left:10%;
    margin-right:10%;
}
</style>
```{r,setup, echo=FALSE, cache=TRUE}
## numbers >= 10^5 will be denoted in scientific notation,
## and rounded to 2 digits
options(scipen = 3, digits = 3)
```


#Project Goals:


With this project we will simulate a famous probability problem. This will not require knowledge of probability or statistics but only the logic to follow the steps in order to simulate this problem. This is one way to solve problems by using the computer.

 1. **Gambler's Ruin**: Suppose you have a bankroll of $1000 and make bets of $100 on a fair game. By simulating the outcome directly for at most 5000 iterations of the game (or hands), estimate:
    a. the probability that you have "busted" (lost all your money) by the time you have placed your one hundredth bet.
```{r}

simulateGambling<-function(bankroll,bet, n, p=0.5){
  count=0
  while(count < n & bankroll >= bet){ # check if you have enough cash
    play = rbinom(1,1,p) # 1:win and 0:lost
    if (play==1){ # win
      bankroll=bankroll+bet
    } else{ # lose
      bankroll=bankroll-bet
    }
    count=count+1 # next round
  }
  result<- list(bankroll=bankroll,count=count)
  return(result)
}
```

```
gambling <- function(bankroll =1000 , bet = 100 , winerate = .5, maxhands = 5000) {
  outcome <- c(0,0,0)

  for (i in 1:maxhands) {
    outcome[3] = i
    if (rbinom(1,1,winerate) == 1) {
      bankroll= bankroll + bet
    }
    else { bankroll = bankroll - bet}
  if (bankroll == 0) {break}
  if (i == 100) {
    outcome[1] = bankroll
  }
  if (i == 500){
    outcome[2] = bankroll

  }}
  return(outcome)
}


```

```
function(maxhands, gambling) {
  bankrupt <- 0
  for (i in 1:5000) {
    if (gambling(maxhand = maxhands)[3] < 500) {
      bankrupt <- bankrupt+1
    }
  }
  prob_bankrupt <- bankrupt / 5000
  return(prob_bankrup)
}
```
```{r warning = FALSE}
result<-replicate(10000,simulateGambling(1000,100,5000)$count)
length(which(result<=100))/10000
```
    b. the probability that you have busted by the time you have placed your five hundredth bet by simulating the outcome directly.

```{r warning = FALSE}
length(which(result<=500))/10000
```
    c. the mean time you go bust, given that you go bust within the first 5000 hands.

```{r warning = FALSE}
mean(!result==5000)
```
    d. the mean and variance of your bankroll after 100 hands (including busts).

```{r warning = FALSE}
result3<-replicate(10000, simulateGambling(1000,100,100)$bankroll)
mean(result3)
var(result3)
```
    e. the mean and variance of your bankroll after 500 hands (including busts).
```{r warning = FALSE}
result2<-replicate(10000, simulateGambling(1000,100,500)$bankroll)
mean(result2)
var(result2)
```


Note: you *must* stop playing if your player has gone bust. How will you handle this in the `for` loop?

2. Repeat the previous problem with betting on black in American roulette, where the probability of winning on any spin is 18/38 for an even payout.

 a. the probability that you have "busted" (lost all your money) by the time you have placed your one hundredth bet.

```{r warning = FALSE}
result<-replicate(10000,simulateGambling(1000,100,5000,18/38)$count)
length(which(result<=100))/10000
```
    b. the probability that you have busted by the time you have placed your five hundredth bet by simulating the outcome directly.

```{r warning = FALSE}
length(which(result<=500))/10000
```
    c. the mean time you go bust, given that you go bust within the first 5000 hands.

```{r warning = FALSE}
result[500]
mean(result[result!=5000])
```

    d. the mean and variance of your bankroll after 100 hands (including busts).

```{r warning = FALSE}
result3<-replicate(10000, simulateGambling(1000,100,100)$bankroll)
mean(result3)
var(result3)
```
    e. the mean and variance of your bankroll after 500 hands (including busts).

```{r warning = FALSE}
result2<-replicate(10000, simulateGambling(1000,100,500)$bankroll)
mean(result2)
var(result2)
```

3. **Markov Chains**. Suppose you have a game where the probability of winning on your first hand is 48%; each time you win, that probability goes up by one percentage point for the next game (to a maximum of 100%, where it must stay), and each time you lose, it goes back down to 48%. Assume you cannot go bust and that the size of your wager is a constant $100.
    a. Is this a fair game? Simulate one hundred thousand sequential hands to determine the size of your return. Then repeat this simulation 99 more times to get a range of values to calculate the expectation.
```{r warning = FALSE}
markov_game <- function(wager, iterations, p = .48, bankroll = 0) {
  init.p <- p
  inc <- 0.01
  for (i in 1:iterations) {
    play = rbinom(1,1,p) # 1:win and 0:lost
    if(p < 1 & play == 1) { # win and p <1
      p <- p + inc
      bankroll = bankroll + wager # inc
    } else if (p < 1 & play == 0) {# lose
      bankroll=bankroll-wager # dec
      p <- init.p # reset prob to initial
    }
  }
  return(bankroll)
}
markov.result<-replicate(99, markov_game(100,100000))
mean(markov.result)

```
    b. Repeat this process but change the starting probability to a new value within 2% either way. Get the expected return after 100 repetitions. Keep exploring until you have a return value that is as fair as you can make it. Can you do this automatically?

```{r warning = FALSE}

# automatic way
establish_fair_probs <- function (p=0, incr=0.01) {
  fair_means <- NA
  for (j in 1:100) {
    range <- seq(p, 1, incr)
    for (i in range) {
      res <- markov_game(100, 100, i)
      if (abs(mean(res)) < 50) {
        break
      }
      fair_means[j] = i
    }
  }
  return(fair_means)
}
fair.02 <- establish_fair_probs(0.02)
mean(fair.02, na.rm = TRUE)
```

    c. Repeat again, keeping the initial probability at 48%, but this time change the probability increment to a value different from 1%. Get the expected return after 100 repetitions. Keep changing this value until you have a return value that is as fair as you can make it.

```{r warning = FALSE}
fair.48<-establish_fair_probs(.48, 0.1)
mean(fair.48, na.rm = TRUE)
```


4. Creating a Bootstrap function. There is a particular concept called [bootstrapping]
(https://en.wikipedia.org/wiki/Bootstrapping_(statistics)) where we can easily create 95% confidence intervals, even for complex estimators.

The steps of this process are:

  a. Draw a sample, with replacement, from your data which is the same length of your data.
  b. Calculate the statistic of interest on this boostrap sample (ie mean, variance, regression,...)
  c. Peform steps 1:2 at least 1000 times over until you have a vector of your statistics.
  d. The lower bound of a 95% CI will be the 0.025 percentile
  e. The upper bound of a 95% CI will be the 0.975 percentile

Make a function called `boot_ci` which calculates the 95% confidence interval in this manner.

```{r warning = FALSE}
boot_ci<-function(data,func,n=1000){
  clean.data <- na.omit(as.numeric(data))
  stats<-rep(NA,n)
  for (i in 1:n){
  samp<-sample(clean.data,length(clean.data),replace=TRUE )
  stats[i]<-func(samp)
  }
  lower_bound<-quantile(stats,probs=0.025)
  upper_bound<-quantile(stats,probs=0.975)
  return(list(lower_bound=lower_bound,upper_bound=upper_bound))

}
```


5. For problems 3b and 3c, you calculated a mean value. Because you saved these final results in a vector, use the bootstrap to estimate the variance of the return in each case for your final answer. Once you have these results, which game has the smaller variance in returns?

```{r}
boot_ci(fair.02,mean)
boot_ci(fair.48,mean)
```