Expected kin counts by type of relative in a two-sex multi-state time-varying framework
Source:vignettes/Reference_TwoSex_MultiState_TimeVariant.Rmd
Since Caswell [@caswell_formal_2019] proposed the one-sex, time-invariant, age-structured matrix model of kinship, the framework has been extended in many directions (many of these extensions are documented within this package). Caswell [-@caswell_formal_2021] updated the original model to incorporate time-varying vital rates, Caswell [-@caswell_formal_2022] introduced two sexes to the model, and Caswell [-@caswell_formal_2020] considered a multi-stage population of kin. Here, we provide an R function which combines the three aforementioned models.
In this vignette, we’ll demonstrate how the function
kin_multi_stage_time_variant_2sex computes stage-specific
kinship networks encompassing both sexes for an average member of a
population, the sex of whom is user specified, and who is subject to
time-varying demographic rates. We call this individual Focal. We seek
the number, age distribution, and stage distribution of Focal’s relatives,
for each age of Focal’s life, and as a function of the year in which Focal
is born.
pkgload::load_all()
library(Matrix)
library(tictoc)
options(dplyr.summarise.inform = FALSE) # hide summarise output (this also hides the progress bar)
Kin counts by parity
In this example we use parity as the stage. UK data covering 1965 to 2022 are sourced from the Human Mortality Database and the Office for National Statistics. Due to data availability, we make the following simplifying assumptions:
- Fertility rates vary with time and differ among parity classes, but are the same for both sexes (the so-called “androgynous approximation”).
- Mortality rates vary with time and differ by sex, but are the same across parity classes (no parity-specific mortality).
- The age-specific probabilities of parity progression vary with time, but are the same for both sexes (the androgynous approximation again).
In order to implement the model, the function
kin_multi_stage_time_variant_2sex expects the following 7
inputs of vital rates, fed in as lists:
- U_list_females: A list of female age-and-parity specific survival probabilities over the timescale (in matrix form). This input list has length = the timescale, and each entry represents the rates of a specific period in matrix form: stage columns, age rows.
- U_list_males: A list of male age-and-parity specific survival probabilities over the timescale (in matrix form). This input list has length = the timescale, and each entry represents the rates of a specific period in matrix form: stage columns, age rows.
- F_list_females: A list of female age-and-parity specific fertility rates over the timescale (in matrix form). This input list has length = the timescale, and each entry represents the rates of a specific period in matrix form: stage columns, age rows.
- F_list_males: A list of male age-and-parity specific fertility rates over the timescale (in matrix form). This input list has length = the timescale, and each entry represents the rates of a specific period in matrix form: stage columns, age rows.
- T_list_females: A list of lists of female age-specific probabilities of moving up parity over the timescale (in matrix form). The outer list has length = the timescale. The inner list has length = number of ages. Each outer list entry is a list of matrices (stage*stage dimensional), and each matrix describes age-specific probabilities of moving stage. Thus for each year, we have a list of age-specific probabilities of moving from one stage to the next.
- T_list_males: Same as T_list_females, but for males.
- H_list: A list of length = timescale, in which each element is a matrix which assigns the offspring of individuals in some stage to the appropriate age class (age in rows and states in columns).
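To make the expected nesting concrete, here is a minimal sketch that builds toy inputs of the shape described above. The dimensions, rates, and object names (U_toy, T_toy) are hypothetical and chosen only for illustration; the assumption that transition matrices are column-stochastic (columns = stage of origin) is ours and should be checked against the package documentation.

```r
# Toy dimensions (hypothetical): 3 age classes, 2 parity stages, 2 years
n_ages   <- 3
n_stages <- 2
years    <- c(1965, 1966)

# Survival: one matrix per year, ages in rows and stages in columns
U_toy <- lapply(years, function(y) {
  matrix(0.9, nrow = n_ages, ncol = n_stages)
})

# Stage transitions: for each year, a list holding one stage-by-stage
# matrix per age class. Assumption: columns index the stage of origin,
# so each column sums to 1 (matrix() fills column by column).
T_toy <- lapply(years, function(y) {
  lapply(seq_len(n_ages), function(a) {
    matrix(c(0.8, 0.2,   # from stage 1: stay with 0.8, move up with 0.2
             0.0, 1.0),  # from stage 2: remain in stage 2
           nrow = n_stages, ncol = n_stages)
  })
})

# Structural checks mirroring the description of the inputs
length(U_toy)          # one entry per year
dim(U_toy[[1]])        # ages by stages
length(T_toy[[1]])     # one transition matrix per age class
dim(T_toy[[1]][[1]])   # stages by stages
```

Checking these lengths and dimensions before calling the model is a cheap way to catch a list that was built with ages and stages swapped.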
To avoid the need for tedious calculations to put data into such format in this vignette, these lists are constructed in another file and simply imported below. The code below reads in the above function input lists.
F_mat_fem <- Female_parity_fert_list_UK
F_mat_male <- Female_parity_fert_list_UK
T_mat_fem <- Parity_transfers_by_age_list_UK
T_mat_male <- Parity_transfers_by_age_list_UK
U_mat_fem <- Female_parity_mortality_list_UK
U_mat_male <- Male_parity_mortality_list_UK
H_mat <- Redistribution_by_parity_list_UK
Recap: above are lists of period-specific demographic rates, in particular comprising:
- U_mat_fem: list of age by stage matrices; entries give female probability of survival. List starts in 1965 and ends in 2022.
- U_mat_male: list of age by stage matrices; entries give male probability of survival. List starts in 1965 and ends in 2022.
- F_mat_fem: list of age by stage matrices; entries give female fertility. List starts in 1965 and ends in 2022.
- F_mat_male == F_mat_fem.
- T_mat_fem: list of lists of matrices. Each outer list entry is a list of matrices, where each matrix gives the age-specific probabilities that a female moves up parity (the inner list has length equal to the number of age classes). Outer list starts in 1965 and ends in 2022.
- T_mat_male == T_mat_fem.
- H_mat: list of matrices which redistribute newborns to age-class 1 and parity 0. No time-variation.
1. Accumulated number of kin Focal expects over the lifecourse under time-varying rates from 1965 to 2005
We feed the above inputs into the matrix model, along with other arguments:
- UK sex ratio –> birth_female = 0.49
- We are considering parity –> parity = TRUE
- We want some of Focal’s kin network –> output_kin = c("d", "oa", "ys", "os")
- Accumulated kin in this example –> summary_kin = TRUE
- Focal is female –> sex_Focal = "Female"
- Focal born into parity 0 –> initial_stage_Focal = 1
- Timescale as output –> output_years = c(1965, 1975, 1985, 1995, 2005)
Accumulated kin are outputted by the argument
summary_kin = TRUE. In such cases, for each age of Focal,
we sum over all possible ages of kin yielding the marginal stage
distribution of kin.
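The summation over ages of kin can be sketched with a toy data frame. The column names follow those used in this vignette's output, but the numbers are made up and the use of base R's aggregate() is only an illustration, not the package's internal method.

```r
# Toy expected kin counts for a Focal aged 30 (hypothetical values)
kin_toy <- data.frame(
  age_focal = c(30, 30, 30, 30),
  age_kin   = c(0, 5, 0, 5),
  stage_kin = c(1, 1, 2, 2),
  count     = c(0.5, 0.25, 0.125, 0.125)
)

# Sum over all ages of kin, keeping age of Focal and stage of kin:
# this gives the marginal stage distribution of kin at each age of Focal
kin_marginal <- aggregate(count ~ age_focal + stage_kin,
                          data = kin_toy, FUN = sum)
kin_marginal
```

In this toy case the two rows for each stage collapse into one, so the result has one row per combination of age_focal and stage_kin.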
The first entry of each input list holds the rates for 1965, e.g., U_mat_fem[[1]], and the 41st entry, U_mat_fem[[1 + 40]], holds the rates for 2005. The list of vital rates must have the same length as the timescale: length(U_mat_fem[1:(1 + 40)]) equals length(seq(1965, 2005)), i.e., 41. Therefore we pass the input list
U_list_females = U_mat_fem[1:(1 + no_years)], which runs
from U_mat_fem[[1]] (the 1965 set of rates) up to U_mat_fem[[41]] (the 2005
set of rates), and likewise for the other input lists.
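The correspondence between calendar years and list positions can be sketched as follows. The object rates_toy is a hypothetical stand-in for an input list such as U_mat_fem, holding a label per year instead of a matrix.

```r
first_year <- 1965
no_years   <- 40

# Calendar years covered by the run: 1965, ..., 2005
years <- seq(first_year, first_year + no_years)

# Stand-in for a list of period-specific rates (one entry per year)
rates_toy <- lapply(years, function(y) paste("rates for", y))

# Subset used as model input; note the parentheses around 1 + no_years
idx        <- 1:(1 + no_years)
rates_used <- rates_toy[idx]

# Position of a given calendar year within the list
pos_2005 <- match(2005, years)
rates_used[[pos_2005]]

# Pitfall: without parentheses, `:` binds tighter than `+`,
# so this is (1:1) + no_years, a single index rather than a range
wrong_idx <- 1:1 + no_years
```

Using match() against the vector of years avoids off-by-one mistakes when picking out the rates for a particular year.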
This run takes some time (around 10 minutes), so we don’t include the output in the vignette. Please try it!
# Run kinship model for a female Focal over a timescale of no_years (we use 40 here)
no_years <- 40
# and we start projecting kin in 1965
# We decide here to count accumulated kin by age of Focal, and not distributions of kin
kin_out_1965_2005 <-
kin_multi_stage_time_variant_2sex(U_list_females = U_mat_fem[1:(1+no_years)],
U_list_males = U_mat_male[1:(1+no_years)],
F_list_females = F_mat_fem[1:(1+no_years)],
F_list_males = F_mat_male[1:(1+no_years)],
T_list_females = T_mat_fem[1:(1+no_years)],
T_list_males = T_mat_male[1:(1+no_years)],
H_list = H_mat[1:(1+no_years)],
birth_female = 1 - 0.51, ## Sex ratio -- UK value
parity = TRUE,
output_kin = c("d", "oa", "ys", "os"),
summary_kin = TRUE,
sex_Focal = "Female", ## define Focal's sex at birth
initial_stage_Focal = 1, ## Define Focal's stage at birth
output_years = c(1965, 1975, 1985, 1995, 2005) ## the sequence of years we run the function over
)
1.1. Visualizing the output
head(kin_out_1965_2005$kin_summary)
Notice the structure of the output data. We have columns
age_focal and stage_kin because we sum over
all ages of kin, and produce the marginal stage distribution given age
of Focal. We have a column corresponding to sex of kin
sex_kin, a column year showing which year we are
considering, and a column headed group which selects the
kin type. Finally, we have columns showing Focal’s cohort of birth
cohort (i.e., year - age of Focal), and an as.factor()
equivalent.
1.1.1. Plotting kin for an average Focal at some fixed period in time
Let’s suppose that we want to understand the parity distributions of the accumulated number of aunts and uncles older than Focal’s mother and father, for each age of Focal, over the years 1965, 1975, 1985, 1995, and 2005.
We restrict Focal’s kinship network to aunts and uncles older than
Focal’s parents using group == "oa". We visualise the marginal
parity distributions of kin, stage_kin, for each age of
Focal, age_focal, using a different colour for each parity. Implicit
in the below plot is that we really plot Focals born into different
cohorts, i.e., in the 2005 panel a 50 year old
Focal was born in 1955, while a 40 year old Focal was born in 1965.
kin_out_1965_2005$kin_summary %>%
dplyr::filter(group == "oa") %>%
ggplot2::ggplot(ggplot2::aes(x = age_focal, y = count, color = stage_kin, fill = stage_kin)) +
ggplot2::geom_bar(position = "stack", stat = "identity") +
ggplot2::facet_grid(sex_kin ~ year) +
ggplot2::scale_x_continuous(breaks = c(0,10,20,30,40,50,60,70,80,90,100)) +
ggplot2::theme_bw() +
ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5)) +
ggplot2::ylab("Older aunts and uncles")
We could also consider any other kin in Focal’s network, for
instance, offspring, using group == "d":
kin_out_1965_2005$kin_summary %>%
dplyr::filter(group == "d") %>%
ggplot2::ggplot(ggplot2::aes(x = age_focal, y = count, color = stage_kin, fill = stage_kin)) +
ggplot2::geom_bar(position = "stack", stat = "identity") +
ggplot2::facet_grid(sex_kin ~ year) +
ggplot2::scale_x_continuous(breaks = c(0,10,20,30,40,50,60,70,80,90,100)) +
ggplot2::theme_bw() +
ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5)) +
ggplot2::ylab("Offspring")
1.1.2. Plotting the kin of Focal as a function of Focal’s cohort of birth
Since we only ran the model for 40 years (between 1965-2005), there
is very little scope to view kinship as cohort-specific. We can however
compare cohorts for 40-year segments of Focal’s life. Below, following
from the above example, we once again consider offspring and only show
Focals born of cohort 1910, 1925, or 1965:
kin_out_1965_2005$kin_summary %>%
dplyr::filter(group == "d", cohort %in% c(1910,1925,1965) ) %>%
ggplot2::ggplot(ggplot2::aes(x = age_focal, y = count, color = stage_kin, fill = stage_kin)) +
ggplot2::geom_bar(position = "stack", stat = "identity") +
ggplot2::facet_grid(sex_kin ~ cohort) +
ggplot2::theme_bw() +
ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5)) +
ggplot2::ylab("Offspring")
The LHS plot (1910 cohort) should be interpreted as follows: if Focal is born in 1910, between 1965 and 2005 she will be 55-95 years old. Focal will have already accumulated her maximal number of offspring, and their overall number will now be dropping as they become exposed to mortality risk. The offspring of Focal will be approximately 20-35, and will have begun, if not completed, reproduction/parity progression.
The middle plot (1925 cohort) shows Focal between ages 40 and 80. Again, Focal will have completed reproduction and can only lose offspring as she ages. However, when Focal is 40 her offspring will be around 10-20 and still have a high probability of being in parity 0, whereas when Focal is 80 her offspring will be aged around 50 and will in turn have completed reproduction, as demonstrated by a well mixed parity distribution at this age of Focal.
The RHS plot (1965 cohort) simply reflects the fact that Focal will not start reproduction until around 15 years old.
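The age windows quoted above follow directly from cohort = year - age of Focal. A minimal sketch of the arithmetic, using the three cohorts plotted and the 1965 to 2005 run (the object names are ours):

```r
cohorts <- c(1910, 1925, 1965)
period  <- c(start = 1965, end = 2005)

# Age of Focal at the start and end of the modelled period, by cohort
age_windows <- data.frame(
  cohort    = cohorts,
  age_start = unname(period["start"]) - cohorts,
  age_end   = unname(period["end"]) - cohorts
)
age_windows
```

Each cohort is therefore observed over a different 40-year segment of Focal's life, which is why the three panels cover different ranges of age_focal.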
2. Now let’s consider the distributions of kin Focal expects over the lifecourse
To obtain distributions of kin as output, we simply use the
kin_full data.frame.
2.1. Visualizing the output
head(kin_out_1965_2005$kin_full)
Notice the additional column age_kin. Rather than
grouping kin by stage and summing over all ages, the output here (in
data frame form) gives an expected number of kin for each age*stage
combination, for each age of Focal.
2.1.1. Plotting kin distributions for an average Focal of fixed age, at some fixed period in time
Let’s consider a Focal aged 50, age_focal == 50, and
examine her younger siblings, group == "ys". Restricting
ourselves to the years 1965, 1975, 1985, 1995, 2005, we can plot the
expected age*stage distribution of these kin over the considered
periods, as shown below:
kin_out_1965_2005$kin_full %>%
dplyr::filter(group == "ys",
age_focal == 50) %>%
ggplot2::ggplot(ggplot2::aes(x = age_kin, y = count, color = stage_kin, fill = stage_kin)) +
ggplot2::geom_bar(position = "stack", stat = "identity") +
ggplot2::facet_grid(sex_kin ~ year) +
ggplot2::scale_x_continuous(breaks = c(0,10,20,30,40,50,60,70,80,90,100)) +
ggplot2::theme_bw() +
ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5)) +
ggplot2::ylab("Younger siblings") +
ggplot2::ggtitle("Focal 50")
Notice the discontinuity along the x-axis at age 50. This reflects the fact that Focal’s younger siblings must be of age < 50. In contrast, when we look at the age*stage distribution of older siblings, we observe another discontinuity, which bounds kin to be of age > 50, as plotted below:
kin_out_1965_2005$kin_full %>%
dplyr::filter(group == "os",
age_focal == 50) %>%
ggplot2::ggplot(ggplot2::aes(x = age_kin, y = count, color = stage_kin, fill = stage_kin)) +
ggplot2::geom_bar(position = "stack", stat = "identity") +
ggplot2::facet_grid(sex_kin ~ year) +
ggplot2::scale_x_continuous(breaks = c(0,10,20,30,40,50,60,70,80,90,100)) +
ggplot2::theme_bw() +
ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5)) +
ggplot2::ylab("Older siblings") +
ggplot2::ggtitle("Focal 50")
With a little manipulation of the output data frame, we can plot the age*stage distribution of the combined siblings of Focal:
kin_out_1965_2005$kin_full %>%
dplyr::filter((group == "ys" | group == "os"),
age_focal == 50) %>%
tidyr::pivot_wider(names_from = group, values_from = count) %>%
dplyr::mutate(count = `ys` + `os`) %>%
ggplot2::ggplot(ggplot2::aes(x = age_kin, y = count, color = stage_kin, fill = stage_kin)) +
ggplot2::geom_bar(position = "stack", stat = "identity") +
ggplot2::facet_grid(sex_kin ~ year) +
ggplot2::scale_x_continuous(breaks = c(0,10,20,30,40,50,60,70,80,90,100)) +
ggplot2::theme_bw() +
ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5)) +
ggplot2::ylab("All siblings") +
ggplot2::ggtitle("Focal 50")