Creates age groups from numeric values using customizable break points. Label templates control how the first, intermediate, and last age groups are formatted.
If a factor is returned, its levels also include age groups that do not occur in the data. This keeps age groups reproducible across data sets, which is useful for joining data (e.g. adding age-grouped population numbers for incidence calculation).
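The value of keeping unobserved levels can be illustrated with a small base-R sketch. It does not call create_agegroups() and is not its implementation; the variable names (ages, breaks, labels, groups, counts) are made up for illustration, and the break points simply mirror the documented default.

```r
# Base-R sketch only; this is NOT how create_agegroups() is implemented.
ages <- c(3, 7, 42, 95)

# Same break points as the documented default, treated as lower bounds.
breaks <- c(5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90)
lower  <- c(0, breaks)
upper  <- c(breaks - 1, NA)
labels <- ifelse(is.na(upper), paste0(lower, "+"), paste0(lower, "-", upper))

# findInterval() returns 0 for ages below the first break, hence the + 1.
groups <- factor(labels[findInterval(ages, breaks) + 1], levels = labels)

# All 13 groups appear in the count table, most of them with zero cases.
counts <- as.data.frame(table(agegroup = groups), responseName = "cases")
```

Because every group is present in counts, a population table keyed by the same labels can be merged on agegroup without losing rows for groups that had no cases.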
Usage
create_agegroups(
  values,
  age_breaks = c(5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90),
  breaks_as_lower_bound = TRUE,
  first_group_format = "0-{x}",
  interval_format = "{x}-{y}",
  last_group_format = "{x}+",
  pad_numbers = FALSE,
  pad_with = "0",
  collapse_single_year_groups = FALSE,
  na_label = NA,
  return_factor = FALSE
)
Arguments
- values
Numeric vector of ages to be grouped
- age_breaks
Numeric vector of break points for age groups. Default: c(5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90)
- breaks_as_lower_bound
Logical; if TRUE (default), breaks are treated as lower bounds of the intervals. If FALSE, as upper bounds.
- first_group_format
Character string template for the first age group. Uses glue syntax. Default: "0-{x}". Other common styles: "<={x}", "<{x+1}"
- interval_format
Character string template for intermediate age groups. Uses glue syntax. Default: "{x}-{y}". Other common styles: "{x} to {y}"
- last_group_format
Character string template for the last age group. Uses glue syntax. Default: "{x}+". Other common styles: ">={x}", ">{x-1}"
- pad_numbers
Logical or numeric; if numeric, pad numbers up to the specified length (Tip: use 2). Not compatible with calculations within glue formats. Default: FALSE
- pad_with
Character to use for padding numbers. Default: "0"
- collapse_single_year_groups
Logical; if TRUE, groups spanning single years are collapsed. Default: FALSE
- na_label
Label for NA values. If NA, keeps default NA handling. Default: NA
- return_factor
Logical; if TRUE, returns a factor; if FALSE, returns a character vector. Default: FALSE
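The three format arguments are glue templates, so expressions such as {x+1} are evaluated when the label is built. The following sketch calls glue::glue() directly to show this. The values passed for x and y are hypothetical, chosen to match the "0-4", "5-9" and "90+" labels of the default output; how create_agegroups() supplies x and y internally is an assumption.

```r
# Requires the glue package. The x and y values are illustrative only.
first  <- glue::glue("<{x + 1}", x = 4)           # alternative first-group style
middle <- glue::glue("{x} to {y}", x = 5, y = 9)  # alternative interval style
last   <- glue::glue(">{x - 1}", x = 90)          # alternative last-group style

print(c(first, middle, last))
```

Templates that contain calculations, like the first and last ones here, are the ones the pad_numbers argument is documented as not compatible with.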
Examples
# Basic usage
create_agegroups(1:100)
#> [1] "0-4" "0-4" "0-4" "0-4" "5-9" "5-9" "5-9" "5-9" "5-9"
#> [10] "10-14" "10-14" "10-14" "10-14" "10-14" "15-19" "15-19" "15-19" "15-19"
#> [19] "15-19" "20-24" "20-24" "20-24" "20-24" "20-24" "25-29" "25-29" "25-29"
#> [28] "25-29" "25-29" "30-39" "30-39" "30-39" "30-39" "30-39" "30-39" "30-39"
#> [37] "30-39" "30-39" "30-39" "40-49" "40-49" "40-49" "40-49" "40-49" "40-49"
#> [46] "40-49" "40-49" "40-49" "40-49" "50-59" "50-59" "50-59" "50-59" "50-59"
#> [55] "50-59" "50-59" "50-59" "50-59" "50-59" "60-69" "60-69" "60-69" "60-69"
#> [64] "60-69" "60-69" "60-69" "60-69" "60-69" "60-69" "70-79" "70-79" "70-79"
#> [73] "70-79" "70-79" "70-79" "70-79" "70-79" "70-79" "70-79" "80-89" "80-89"
#> [82] "80-89" "80-89" "80-89" "80-89" "80-89" "80-89" "80-89" "80-89" "90+"
#> [91] "90+" "90+" "90+" "90+" "90+" "90+" "90+" "90+" "90+"
#> [100] "90+"
# Custom formatting with upper bounds
create_agegroups(1:100,
  breaks_as_lower_bound = FALSE,
  interval_format = "{x} to {y}",
  first_group_format = "0 to {x}"
)
#> [1] "0 to 5" "0 to 5" "0 to 5" "0 to 5" "0 to 5" "6 to 10"
#> [7] "6 to 10" "6 to 10" "6 to 10" "6 to 10" "11 to 15" "11 to 15"
#> [13] "11 to 15" "11 to 15" "11 to 15" "16 to 20" "16 to 20" "16 to 20"
#> [19] "16 to 20" "16 to 20" "21 to 25" "21 to 25" "21 to 25" "21 to 25"
#> [25] "21 to 25" "26 to 30" "26 to 30" "26 to 30" "26 to 30" "26 to 30"
#> [31] "31 to 40" "31 to 40" "31 to 40" "31 to 40" "31 to 40" "31 to 40"
#> [37] "31 to 40" "31 to 40" "31 to 40" "31 to 40" "41 to 50" "41 to 50"
#> [43] "41 to 50" "41 to 50" "41 to 50" "41 to 50" "41 to 50" "41 to 50"
#> [49] "41 to 50" "41 to 50" "51 to 60" "51 to 60" "51 to 60" "51 to 60"
#> [55] "51 to 60" "51 to 60" "51 to 60" "51 to 60" "51 to 60" "51 to 60"
#> [61] "61 to 70" "61 to 70" "61 to 70" "61 to 70" "61 to 70" "61 to 70"
#> [67] "61 to 70" "61 to 70" "61 to 70" "61 to 70" "71 to 80" "71 to 80"
#> [73] "71 to 80" "71 to 80" "71 to 80" "71 to 80" "71 to 80" "71 to 80"
#> [79] "71 to 80" "71 to 80" "81 to 90" "81 to 90" "81 to 90" "81 to 90"
#> [85] "81 to 90" "81 to 90" "81 to 90" "81 to 90" "81 to 90" "81 to 90"
#> [91] "91+" "91+" "91+" "91+" "91+" "91+"
#> [97] "91+" "91+" "91+" "91+"
# Ages 1 to 4 are labeled with a single number by collapsing single-year groups
create_agegroups(1:10,
  age_breaks = c(1, 2, 3, 4, 5, 10),
  collapse_single_year_groups = TRUE
)
#> [1] "1" "2" "3" "4" "5-9" "5-9" "5-9" "5-9" "5-9" "10+"