Fit progress and result summary information

As a model is fit, the program provides information on progress of the fitting.  Upon completion of the fitting, the program provides a description of the current model, a summary table with information on the model structure and parameter estimates, the deviance and other information related to the fit of the model.  Output 4.3 shows the fit progress and result summary information for a complex model in an analysis of the joint effects of radiation and smoking on lung cancer risks in atomic bomb survivors (Furukawa, Preston et al. 2010) carried out using AMFIT.

The fit progress information includes the iteration number, the deviance for the parameter estimates used for the iteration, and the number of step halvings, if any, needed to find a set of parameter estimates that improves the fit. Step halving takes place when the parameter vector suggested by the standard Newton-Raphson method does not result in an increase in the likelihood function. In such cases, the program attempts a step in the same direction but with a reduced step size.

Iterations continue until convergence has been achieved or the fitting has failed to converge. Failure to converge happens when the program exceeds the maximum number of iterations without convergence or when repeated step halving in several consecutive iterations suggests that there is a problem with the fitting. When a fit fails to converge, an error message is written following the parameter summary table.
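The iteration scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's actual algorithm: the one-parameter Poisson rate model, the function names (deviance, fit), the data, the convergence tolerance, and the failure rule (a cap on halvings within a single iteration) are all hypothetical choices made to keep the example small.

```python
import math

def deviance(beta, x, y, py):
    """Poisson deviance for the hypothetical rate model mu = py * exp(beta * x)."""
    total = 0.0
    for xi, yi, pi in zip(x, y, py):
        mu = pi * math.exp(beta * xi)
        if yi > 0:
            total += yi * math.log(yi / mu)
        total -= yi - mu
    return 2.0 * total

def fit(x, y, py, beta=0.0, max_iter=50, max_halvings=10, tol=1e-8):
    """Newton-Raphson with step halving; returns (beta, history, converged).

    history holds (iteration, step halvings, deviance) rows, mirroring the
    Iter / Step / Deviance columns of the fit progress display.
    """
    dev = deviance(beta, x, y, py)
    history = [(0, 0, dev)]
    for it in range(1, max_iter + 1):
        mu = [pi * math.exp(beta * xi) for xi, pi in zip(x, py)]
        score = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        info = sum(xi * xi * mi for xi, mi in zip(x, mu))
        step = score / info                      # full Newton-Raphson step
        halvings = 0
        new_dev = deviance(beta + step, x, y, py)
        # A higher deviance means a lower likelihood: retry in the same
        # direction with half the step size.
        while new_dev > dev + tol and halvings < max_halvings:
            step /= 2.0
            halvings += 1
            new_dev = deviance(beta + step, x, y, py)
        if new_dev > dev + tol:
            return beta, history, False          # assumed failure rule
        beta += step
        history.append((it, halvings, new_dev))
        if abs(dev - new_dev) < tol:
            return beta, history, True           # deviance has stabilized
        dev = new_dev
    return beta, history, False                  # iteration limit exceeded

# Made-up data; the poor starting value forces step halving at iteration 1.
x = [0, 1, 2, 3]
y = [100, 165, 272, 448]
py = [100.0, 100.0, 100.0, 100.0]
beta_hat, history, converged = fit(x, y, py, beta=-1.0)
for it, halvings, dev in history:
    print(f"{it:4d} {halvings:5d} {dev:14.3f}")
print("converged:", converged)
```

In this sketch the deviance never increases from one row of the history to the next, which is the property that step halving is meant to guarantee.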

Output 4.3    Fit progress and fitted model summary information

          Iter  Step      Deviance

 

             0     0    26010.231

             1     0    13648.206

             2     0    10314.414

             3     0     9518.743

             4     0     9416.879

             5     0     9410.535

             6     0     9410.296

             7     0     9410.283

             8     0     9410.283

 

Piece-wise exponential regression

Multiplicative excess relative risk  T0 * (1 + T1) * (1 + T2) * (1 + T3) * (1 + T4)

 

   lungca is used for cases

    py10k is used for person years

 

                            Parameter Summary Table

 # Name                            Estimate     Std.Err.  Test Stat.  P value

-- ----------------------------   ----------   ---------  ----------  --------

Log-linear term 0

 1 sex_1....................          2.264       0.121      18.71    < 0.001

 2 sex_2....................          1.712     0.07161       23.9    < 0.001

 3 naga.....................         0.2097     0.05957       3.52    < 0.001

 4 e30......................        -0.1358     0.04244     -3.201    0.00137

 5 nic * hiro...............       -0.06519     0.06761    -0.9643      0.335

 6 nic * naga...............       -0.09588      0.1116    -0.8593       0.39

 7 sex_1 * lage70...........          4.941      0.3587      13.78    < 0.001

 8 sex_2 * lage70...........          4.956      0.3241      15.29    < 0.001

 9 sex_1 * lage70sq.........         -4.996       1.033     -4.838    < 0.001

10 sex_2 * lage70sq.........         -1.933      0.8717     -2.217     0.0266

 

Linear term 1

11 d10gy....................         0.5901      0.1718      3.435    < 0.001

 

Log-linear term 1

12 e30......................         0.2586      0.1537      1.683     0.0924

13 lage70...................         -2.784       1.175     -2.369     0.0178

14 ldp1.....................          9.150       2.662      3.437    < 0.001

15 ldp1 * ldp1..............         -16.64       5.678     -2.931    0.00338

 

Linear product term 1

16 %CON.....................          1.000     Aliased

17 msex.....................         0.5156      0.1393      3.701    < 0.001

 

Linear term 2

18 py50.....................          1.000     Aliased

19 unksmk...................          1.000     Aliased

 

Log-linear term 2

20 mever....................          1.281      0.1658       7.73    < 0.001

21 fever....................          1.753      0.1697      10.33    < 0.001

22 e30smkr..................         0.2866     0.07501       3.82    < 0.001

23 ldur50...................        -0.2429      0.4714    -0.5152      > 0.5

24 ldur50sq.................         -2.506       1.178     -2.127     0.0334

25 ltsq.....................        -0.4662      0.1076     -4.332    < 0.001

26 ax3cat_1 * munksmk.......         0.2095       0.319     0.6566      > 0.5

27 ax3cat_2 * munksmk.......         0.2448      0.3223     0.7596      0.447

28 ax3cat_3 * munksmk.......         0.4412      0.2594      1.701      0.089

29 ax3cat_1 * funksmk.......         -10.00       Fixed     -0.2246      > 0.5

30 ax3cat_2 * funksmk.......         -1.885      0.8391     -2.247     0.0247

31 ax3cat_3 * funksmk.......         -1.202      0.5503     -2.185     0.0289

 

            Records used       128390

 

                Deviance     9410.283

                     AIC     9464.283

            Pearson Chi2       343264      Degrees of freedom  128363

Once fitting is complete, fit summary information is written to the screen and to the log file. The first section of this summary describes the type and form of the model used. In this example, we are analyzing rate data using Poisson regression and the model type is described as “Piece-wise exponential regression”. The following line indicates the form of the risk model, a multiplicative excess relative risk model in this example. The next lines identify the key variables, which for this example are a lung cancer case count variable (lungca) and a person year variable (py10k). Although not shown in this example, when fitting a stratified model, the summary will include information on the stratification variables. Similarly, when the analysis is limited to a subset of the data, this section of the fit summary contains information describing the selection.

The next section is a parameter summary table. The parameters are grouped into sections for each term and subterm in the model. The model in this example has three terms (0, 1, and 2). Term 0 consists solely of a log-linear subterm. Term 1 includes three subterms: a log-linear subterm and two linear subterms (labeled "Linear term 1" and "Linear product term 1" in the output). Term 2 includes a log-linear subterm and a single linear subterm.

The output for each parameter includes:

The parameter number (column 1), which is used for some commands including those for obtaining likelihood bounds (BOUNDS or PROFILE) and for initializing or fixing parameter estimates (PARA). If a parameter number does not appear in the table, the associated parameter was determined to be aliased based on the model specification. (See Interactions and Aliasing for more information on aliasing.)

The names of the covariates associated with each parameter (column 2). For categorical variables the names include a suffix indicating the level associated with this parameter, and for interactions the variable names are separated by *.

The parameter estimate (column 3).

The estimated asymptotic standard error for the parameter estimate (column 4). This column is also used to indicate parameters that were not estimated, either because they were fixed by the user ("Fixed") or because they were determined to be aliased with other parameters ("Aliased").

The test statistic (column 5) and P-value (column 6) columns contain the results of a test of a simple hypothesis about the parameter value. The type of test and the hypothesis being tested depend on whether the parameter was free, fixed, or aliased. For a free parameter the statistic is the Wald statistic (parameter estimate divided by standard error) for the null hypothesis that the parameter equals 0. For a fixed parameter the statistic is the signed square root of the score test of the hypothesis that the parameter is equal to the value at which it is fixed. Nothing is shown in these columns for aliased parameters.
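As a rough check on how the Wald columns relate to the estimate and standard error, the short Python sketch below recomputes the test statistic and P-value for parameter 10 (sex_2 * lage70sq) in Output 4.3. The function name wald_test is hypothetical, and the use of a two-sided P-value from the standard normal distribution is an assumption; it is consistent with the printed values but is not stated explicitly in this section.

```python
import math

def wald_test(estimate, std_err):
    """Return the Wald statistic and an assumed two-sided normal P-value."""
    z = estimate / std_err
    # P(|Z| > |z|) for a standard normal Z, written with the
    # complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Estimate and standard error as printed for parameter 10 in Output 4.3.
z, p = wald_test(-1.933, 0.8717)
print(f"Test stat. {z:.3f}   P value {p:.4f}")
```

Because the inputs are the rounded values shown in the table, recomputed statistics for other parameters can differ slightly in the last digit from those printed by the program.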

The final section of the fit summary provides information on the nature of the fit. The content of this section depends on the model type, but always includes the number of records used as well as the deviance and Akaike Information Criterion (AIC) values for the fitted model. The AIC (Akaike 1974), which is equal to the deviance plus twice the number of free parameters in the model, is often used to compare models, especially non-nested models fit to a data set (i.e. models in which the parameters of one model are not simply a subset of those in the other model). For Poisson regression (AMFIT), as in this example, or unmatched models fit to binomial data (GMBO/PECAN), the output also includes the Pearson chi-squared statistic and the degrees of freedom associated with the deviance and Pearson statistics.
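The AIC and degrees of freedom in Output 4.3 can be reproduced by hand from the parameter summary table, as in the Python sketch below. The counts are read off the table: 31 parameters are listed, of which three are aliased (16, 18, and 19) and one is fixed (29). Taking the degrees of freedom to be the number of records used minus the number of free parameters is an assumption; it matches the printed value but is not spelled out in this section.

```python
# Values taken from Output 4.3.
deviance = 9410.283
records_used = 128390
listed_parameters = 31
aliased_parameters = 3   # parameters 16, 18 and 19
fixed_parameters = 1     # parameter 29

# Only free (estimated) parameters count toward the AIC penalty.
free_parameters = listed_parameters - aliased_parameters - fixed_parameters

aic = deviance + 2 * free_parameters
# Assumed definition: records used minus free parameters.
degrees_of_freedom = records_used - free_parameters

print(f"Free parameters    {free_parameters}")
print(f"AIC                {aic:.3f}")
print(f"Degrees of freedom {degrees_of_freedom}")
```

Both results agree with the values printed at the end of the fit summary.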

Although not shown here, the fit summary will include information on the number of strata used in stratified models.  When records are not used due to missing values, the summary will indicate how many records were not used in the fitting. 

For partial likelihood analyses of survival time data (PEANUTS) or conditional models for the odds ratios (GMBO/PECAN), the summary provides information on the number of risk sets in the analysis.  This information will be discussed later in this guide.