GMBO/PECAN

The GMBO/PECAN module is used for fitting generalized models to binomial odds or for estimating odds ratios from matched or unmatched case-control data. For binomial data the key variables are a case-count variable and a number-of-trials variable. The case-count variable must be specified to fit a model in GMBO/PECAN; the number-of-trials variable is required if each record represents more than one trial. GMBO/PECAN models can include stratification variables specified with the STRATA command. Stratification variables (also called risk set indicators) must be specified when working with matched case-control data. (Stratification is discussed later in this chapter.)

For Bernoulli data, that is, where each record represents the outcome of a single trial, the case-count variable must be a 0/1 variable. For binomial data in which each record describes the results of several trials, the values of the case-count variable must be greater than or equal to 0 and less than or equal to the number of trials. The default name for the case-count variable is CASES. The CASES command is used for explicit specification of the case-count variable. If the case-count and number-of-trials variables are inconsistent, fitting stops and an error message is printed.
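The consistency rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the check GMBO/PECAN performs internally; the function name check_case_counts and the list-based data layout are assumptions made for the example.

```python
def check_case_counts(cases, trials=None):
    """Return the indices of records whose case count is inconsistent.

    cases  -- case-count value for each record
    trials -- number-of-trials value for each record, or None for
              Bernoulli data (each record is then a single trial)
    """
    bad = []
    for i, c in enumerate(cases):
        # With no trials variable every record represents one trial,
        # so the case count must be 0 or 1.
        n = 1 if trials is None else trials[i]
        if not (0 <= c <= n):
            bad.append(i)
    return bad


# Bernoulli data: the second record is not a 0/1 value.
print(check_case_counts([0, 2, 1]))

# Binomial data: the second record has more cases than trials.
print(check_case_counts([3, 5, 0], trials=[3, 4, 0]))
```

A non-empty result corresponds to the situation in which fitting would stop with an error message.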

Default names for the number-of-trials variable are TRIALS or N. The number-of-trials variable can be explicitly specified using either the TRIALS or the N command. A trials variable is not required for Bernoulli data.

PECAN is a special algorithm used to fit regression models to the odds ratio in matched case-control studies using conditional logistic regression methods (Gail, Lubin et al. 1981). For such data the key variables are an indicator of whether a record refers to a case or a control, called the case-control indicator, and one or more set-identifier variables. Both a case-control indicator and one or more set-identifier variables must be specified before fitting a model to matched case-control data with PECAN.

The case-control indicator variable must be coded as 1 for cases and 0 for controls. (If the input data are not coded in this way, transformations can be used to create the case-control indicator.) Both cases and cc are recognized as default names for the case-control indicator variable. If the input data contains variables with both of these names, the first one encountered is assumed to be the case-control variable. The CASES command can be used for explicit specification of the case-control indicator variable. (CC is recognized as a synonym for CASES.)
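As an illustration of the recoding mentioned above, the following Python sketch converts an arbitrary status coding into the required 1 = case, 0 = control form. It is not the program's own transformation syntax; the function name make_case_indicator and the example coding (2 = case, 1 = control) are hypothetical.

```python
def make_case_indicator(values, case_code):
    """Map each input value to 1 if it equals case_code, otherwise 0."""
    return [1 if v == case_code else 0 for v in values]


# Hypothetical input coded 2 = case, 1 = control.
status = [2, 1, 1, 2, 1]
cc = make_case_indicator(status, case_code=2)
print(cc)
```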

The case-control risk sets can be defined either by a single set-identifier variable that has a unique value for each matched set or by a set of categorical stratification variables. When there is a single set-identifier variable, the names strata and setno are recognized as default names for this key variable. If the input data contains variables with both of these names, the first one encountered is assumed to be the set-identifier variable. Either the STRATA or the SETNO command can be used for explicit specification of the set-identifier variables.
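The idea of defining risk sets by a combination of stratification variables can be sketched as follows: records that share the same values on every stratification variable fall into the same risk set. This Python fragment is a conceptual illustration only; the function name risk_sets, the dictionary-based records, and the variable names sex and agecat are assumptions, not part of GMBO/PECAN.

```python
from collections import defaultdict


def risk_sets(records, strata_vars):
    """Group record indices by their combination of stratum values.

    records     -- list of dicts, one per input record
    strata_vars -- names of the stratification variables
    """
    sets = defaultdict(list)
    for i, rec in enumerate(records):
        # The tuple of stratum values identifies the risk set.
        key = tuple(rec[v] for v in strata_vars)
        sets[key].append(i)
    return dict(sets)


records = [
    {"sex": 1, "agecat": 3, "cases": 1},
    {"sex": 1, "agecat": 3, "cases": 0},
    {"sex": 2, "agecat": 3, "cases": 0},
]
print(risk_sets(records, ["sex", "agecat"]))
```

With a single set-identifier variable the same grouping applies, with a one-element list of variable names.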

A notable feature of GMBO/PECAN is that it can model odds ratios using a combination of conditional and unconditional likelihood methods, with the method used for each stratum determined by the total number of records in that stratum. By default the conditional likelihood is used for risk sets with 50 or fewer records, and the unconditional likelihood, which includes an explicit stratum parameter in the model, is used for strata with more records. The user can change the threshold at which the method switches using the CONDITIONAL command.
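The switching rule can be illustrated with a short Python sketch. It shows only the decision described above, using the default threshold of 50 records; the function name likelihood_method and the stratum labels are assumptions for the example, and the sketch does not reproduce how the CONDITIONAL command is written.

```python
def likelihood_method(stratum_sizes, threshold=50):
    """Choose the likelihood type for each stratum by its record count.

    stratum_sizes -- dict mapping a stratum label to its number of records
    threshold     -- largest stratum size handled by the conditional
                     likelihood (50 is the default stated in the text)
    """
    return {
        stratum: "conditional" if n <= threshold else "unconditional"
        for stratum, n in stratum_sizes.items()
    }


sizes = {"set1": 5, "set2": 50, "set3": 51}
print(likelihood_method(sizes))
print(likelihood_method(sizes, threshold=10))
```

Raising the threshold moves more strata to the conditional likelihood; lowering it moves more strata to the unconditional likelihood with explicit stratum parameters.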