forked from behrman/ros
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathgay_simple_tv.Rmd
More file actions
157 lines (123 loc) · 3.19 KB
/
gay_simple_tv.Rmd
File metadata and controls
157 lines (123 loc) · 3.19 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
---
title: "Regression and Other Stories: Gay"
author: "Andrew Gelman, Aki Vehtari"
date: "`r Sys.Date()`"
output:
github_document:
toc: true
---
Tidyverse version by Bill Behrman.
Simple models (linear and discretized age) and attitudes as a
function of age. See Chapter 12 in Regression and Other Stories.
-------------
```{r, message=FALSE}
# Packages
library(tidyverse)
library(rstanarm)
# Parameters
# Data on support for same-sex marriage
file_data <- here::here("Gay/data/naes04.csv")
# Common code
file_common <- here::here("_common.R")
#===============================================================================
# Run common code
source(file_common)
```
# 12 Transformations and regression
## 12.5 Other transformation
### Index and indicator variables
Data
```{r, message=FALSE, warning=FALSE}
data <-
file_data %>%
read_csv() %>%
drop_na(age, gayFavorStateMarriage) %>%
select(age, gayFavorStateMarriage)
data
```
```{r}
summary(data$age)
```
`age` ranges from 18 to 97.
```{r}
data %>%
count(age) %>%
arrange(desc(age))
```
Because of the relatively small number of respondents over age 90, we will let age 91 represent respondents ages 91 and over. We will also create a variable for binned ages.
```{r}
age_max <- 91
data <-
data %>%
mutate(age = if_else(age >= age_max , age_max, age)) %>%
group_by(age) %>%
summarize(favor = sum(gayFavorStateMarriage == "Yes") / n()) %>%
mutate(
age_bin = cut(age, breaks = c(min(age) - 1, seq(29, 79, 10), age_max))
)
data
```
Fit linear regression model.
```{r}
set.seed(235)
fit_linear <- stan_glm(favor ~ age, data = data, refresh = 0)
print(fit_linear, digits = 4)
```
Attitudes toward same-sex marriage by age in 2004.
```{r}
intercept <- coef(fit_linear)[["(Intercept)"]]
slope <- coef(fit_linear)[["age"]]
data %>%
ggplot() +
geom_abline(slope = slope, intercept = intercept) +
geom_point(aes(age, favor)) +
coord_cartesian(ylim = c(0, NA)) +
scale_y_continuous(labels = scales::label_percent(accuracy = 1)) +
labs(
title = "Attitudes toward same-sex marriage by age in 2004",
x = "Age",
y = "Favor same-sex marriage"
)
```
Fit model with binned ages.
```{r}
set.seed(235)
fit_binned <- stan_glm(favor ~ age_bin, data = data, refresh = 0)
print(fit_binned, digits = 2)
```
Attitudes toward same-sex marriage by age in 2004: For binned ages.
```{r}
levels <-
coef(fit_binned) %>%
enframe(value = "favor") %>%
mutate(name = if_else(name == "(Intercept)", "17,29", name)) %>%
extract(
col = name,
into = c("age_min", "age_max"),
regex = "(\\d+),(\\d+)",
convert = TRUE
) %>%
mutate(
age_min = age_min + 1L,
favor =
case_when(
age_min == 18 ~ favor,
TRUE ~ favor[age_min == 18] + favor
)
)
data %>%
ggplot() +
geom_segment(
aes(x = age_min, xend = age_max, y = favor, yend = favor),
data = levels
) +
geom_point(aes(age, favor)) +
coord_cartesian(ylim = c(0, NA)) +
scale_y_continuous(labels = scales::label_percent(accuracy = 1)) +
labs(
title = "Attitudes toward same-sex marriage by age in 2004",
subtitle = "For binned ages",
x = "Age",
y = "Favor same-sex marriage"
)
```