Purpose
Indicate that the data are from a case-cohort study and describe design specifications (CASECOHORT) or clear case-cohort design specifications (NOCASECOHORT).
Programs
PEANUTS
Syntax
CASECOHORT ccvar {SFRAC # | COHORTSIZE # | SUBCOHORTSIZE #}
{ROBUST | ASYMPTOTIC} {SAMPSTR ccstrvar SAMPFRAC ccsfvar} @
NOCASECOHORT @
Arguments and Subcommands
ccvar
Name of the variable that indicates cases outside the subcohort. ccvar is coded as 0 for records (cases and noncases) in the subcohort and 1 for cases outside the subcohort.
SFRAC #
Specifies the subcohort sampling fraction as a number between 0 and 1. Specification of the sampling fraction overrides specification of cohort size (COHORTSIZE subcommand) or subcohort size (SUBCOHORTSIZE subcommand). When the sampling fraction is specified, the full cohort size is defined as the smallest integer greater than or equal to the ratio of the number of records to the sampling fraction.
COHORTSIZE #
Specifies the size of the full cohort. If the sampling fraction has not been specified, it is set equal to the ratio of the subcohort size (usually determined from the data) to the full cohort size.
SUBCOHORTSIZE #
Overrides the default subcohort size (which is determined from the dataset).
ROBUST
Use the robust covariance matrix as suggested by Langholz and Jiao. This is the default.
ASYMPTOTIC
Use asymptotic covariance matrix.
SAMPSTR ccstrvar
Specifies the variable that contains the sampling strata values for stratified case-cohort analyses. This variable must be categorical. Both the sampling strata variable and the sampling fraction variable must be specified for stratified case-cohort analyses. At present it is assumed that the strata are numbered from 1 to n where n is the number of levels.
SAMPFRAC ccsfvar
Specifies the variable that contains stratum-specific sampling fractions. It is assumed that the sampling fraction is in the range (0, 1] for at least one record in each stratum. If there are multiple valid sampling fraction values within a stratum the first valid value is used and a warning message is printed. If there are no valid values for a stratum an error message is printed and the case-cohort specification is cleared.
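The SAMPFRAC rules above can be illustrated with a short Python sketch. This is not EPICURE code. The function name resolve_stratum_fractions is hypothetical, and the sketch assumes that "multiple valid values" means valid values that differ within a stratum. It takes the first value in (0, 1] for each stratum, warns when a later valid value differs, and raises an error when a stratum has no valid value.

```python
import warnings


def resolve_stratum_fractions(strata, fractions):
    """Return {stratum: sampling fraction}, one fraction per stratum.

    Hypothetical helper that mirrors the documented SAMPFRAC rules;
    it is not part of EPICURE.
    """
    resolved = {}   # stratum -> first valid fraction found
    seen = set()    # every stratum value present in the data
    warned = set()  # strata already reported as inconsistent
    for stratum, frac in zip(strata, fractions):
        seen.add(stratum)
        # A value is valid only if it lies in the range (0, 1].
        if frac is None or not (0 < frac <= 1):
            continue
        if stratum not in resolved:
            resolved[stratum] = frac
        elif frac != resolved[stratum] and stratum not in warned:
            # Assumption: warn once per stratum when valid values differ.
            warnings.warn(
                f"stratum {stratum}: multiple sampling fractions, "
                f"using first valid value {resolved[stratum]}"
            )
            warned.add(stratum)
    missing = sorted(seen - set(resolved))
    if missing:
        # The documented behavior is an error message and a cleared
        # case-cohort specification; here an exception stands in for both.
        raise ValueError(f"no valid sampling fraction for strata {missing}")
    return resolved


print(resolve_stratum_fractions([1, 1, 2, 2], [0.1, 0.1, 0.0, 0.5]))
# prints {1: 0.1, 2: 0.5}
```

In the call shown, the value 0.0 in stratum 2 is outside (0, 1] and is skipped, so the next record's value of 0.5 is used.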
Remarks
Case-cohort designs use data on a possibly stratified sample of the full cohort (the subcohort), drawn with known sampling fractions, together with all cases in the full cohort. This command is used to specify the subcohort sampling fractions and an indicator variable that identifies unsampled records, usually cases, that are outside the subcohort. (In a study in which one analyzes multiple outcomes, the data set may include non-sampled records for all outcomes of interest. In the analysis of a specific outcome, unsampled records with other outcomes will not be used in the risk estimation.) As the above description indicates, one can explicitly specify the sampling fraction and the cohort or subcohort size for unstratified analyses, while for stratified analyses the sampling strata and stratum-specific sampling fractions must be specified using variables in the dataset.
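For unstratified analyses, the relationship among SFRAC, COHORTSIZE and SUBCOHORTSIZE described under Arguments and Subcommands can be sketched in Python. This is not EPICURE code. The function name resolve_unstratified is hypothetical. The sketch assumes that the "number of records" used with SFRAC is the number of subcohort records in the data, and that a call with neither SFRAC nor COHORTSIZE is an error.

```python
import math


def resolve_unstratified(n_subcohort, sfrac=None, cohortsize=None,
                         subcohortsize=None):
    """Return (sampling fraction, full cohort size, subcohort size).

    Hypothetical helper that mirrors the documented rules;
    it is not part of EPICURE.
    """
    if sfrac is not None:
        # SFRAC overrides COHORTSIZE and SUBCOHORTSIZE.
        if not (0 < sfrac <= 1):
            raise ValueError("SFRAC must be greater than 0 and at most 1")
        # Full cohort size: smallest integer >= records / fraction.
        return sfrac, math.ceil(n_subcohort / sfrac), n_subcohort
    if cohortsize is None:
        # Assumption: without SFRAC, a full cohort size is required.
        raise ValueError("specify SFRAC or COHORTSIZE")
    # SUBCOHORTSIZE replaces the size counted from the dataset.
    size = n_subcohort if subcohortsize is None else subcohortsize
    return size / cohortsize, cohortsize, size


print(resolve_unstratified(100, sfrac=0.3))
# prints (0.3, 334, 100)
```

With 100 subcohort records and a sampling fraction of 0.3, the ratio is about 333.3, so the full cohort size is rounded up to 334.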
In the literature on stratified case-cohort analyses, a distinction is made between confounder-stratified and exposure-stratified models. In the former case, the analysis is stratified on all of the sampling strata, while in the latter case the analysis is stratified on only a subset (which may be empty) of the sampling strata. In the latter, exposure-stratified, case it has been suggested (Langholz and Jiao) that one use a slightly modified and somewhat more complex likelihood, but it has also been shown (Borgan et al) that this modification generally has little impact on the risk estimates or their standard errors. At this time, EPICURE treats all stratified case-cohort models as confounder-stratified.
Examples
a) Specify a case-cohort analysis with the variable nonsamp indicating cases not in the sampled subcohort and a full cohort size of 1741 records
CASECOHORT nonsamp COHORTSIZE 1741 @
b) Specify a case-cohort analysis using the sampling fraction in place of the cohort size and select the asymptotic method for variance adjustment (the default is to use the robust method)
CASECOHORT nonsamp SFRAC 0.06 ASYMPTOTIC @
c) Clear the existing case-cohort analysis setup (if any)
NOCASECOHORT @