Pivot_wider output producing multiple rows
I am widening data based upon two variables in my data.
However, I find a strange result as widening based on the values of a certain variable the number of rows produced is equal to the number of unique ID variables, so is OK. Widening based on the values of a second variable gives the same number of rows as the original long data.
Here is a MWE:
# Create mwe data
mwe <- structure(list(col1 = c(1, 1, 3, 3, 3, 4, 4, 5, 5, 6, 8, 9, 10,
10, 11, 11, 11, 12, 12, 13, 13, 14, 14, 14, 14),
col2 = c(1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 4L, 3L, 4L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 4L),
col3 = c(37, 43, 33, 35, 42, 22, 34, 31, 35, 41, 22, 32, 20, 23, 27, 30, 33, 32, 38, 28, 35, 15, 35, 38, 41),
col4 = c(1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 1L, 2L, 3L, 1L, 2L, 1L, 2L, 1L, 2L, 3L, 4L),
col5 = c(37, 37, 33, 33, 33, 22, 22, 31, 31, 41, 22, 32, 20, 20, 27, 27, 27, 32, 32, 28, 28, 15, 15, 15, 15),
col6 = c(2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L)),
class = c("grouped_df", "tbl_df", "tbl", "data.frame"),
row.names = c(NA, -25L),
groups = structure(list(col1 = c(1, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14),
.rows = structure(list(1:2, 3:5, 6:7, 8:9, 10L, 11L, 12L, 13:14, 15:17, 18:19, 20:21, 22:25),
ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))),
class = c("tbl_df", "tbl", "data.frame"),
row.names = c(NA, -12L), .drop = TRUE))
# Widen based on values of variable "col3"
wide_ok <- mwe |>
pivot_wider(names_from = col4,
values_from = col3,
names_prefix = "wide_ok")
# Same number of rows in wide_ok as unique identifiers in mwe
nrow(wide_ok)
length(unique(mwe[["col1"]]))
# Widen based on values of variable "col2"
wide_not_ok <- mwe |>
pivot_wider(names_from = col4,
values_from = col2,
names_prefix = "wide_not_ok")
# NOT the same number of rows in wide_ok as unique identifiers in mwe, equal to total # number of rows in mwe
nrow(wide_not_ok)
length(unique(mwe[["col1"]]))
nrow(mwe)