Categorical responses are data that are intrinsically qualitative or non-numerical. As such, they present some unique characteristics and issues for analysis. SAS has multiple procedures for dealing with categorical data, and this tutorial covers common ones used to carry out a variety of analysis types.
Examples of categorical data would be responses such as gender, level of education, Yes/No answers, etc. A characteristic of such data is that the categories are non-overlapping or mutually exclusive. Because of this, when the data are summarized as numbers like percentages, the values must add to 100%, a condition known as "sum-to-one". A consequence of this is that if we know all but one category, we automatically know the remaining category: if there are 60% Yes responses, then we know there are 40% No. If plants are recorded in three categories as 20% short and 30% medium, then we know there are also 50% tall. In analyses, this affects the number of parameters estimated. Generally, if there are C categories, then the analyses will estimate C - 1 parameters.
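The sum-to-one constraint is easy to see with a quick calculation. The following minimal Python sketch (not part of the SAS analysis) uses the plant-height percentages from the example above:

```python
# Sum-to-one: with C categories, knowing C - 1 of them determines the last.
# Percentages for the plant-height example above (short and medium known).
known = {"short": 20.0, "medium": 30.0}

# The remaining category ("tall") is fixed by the constraint.
tall = 100.0 - sum(known.values())
print(tall)  # 50.0
```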
Another characteristic of some categorical data is order. For example, in the short, medium, tall example above, there is a natural progression or order to the categories. Occasionally, we might also categorize truly continuous data such as income or age into ordered categories, e.g. age: 20-35, 36-50, and greater than 50. These ordinal data are often treated differently from non-ordered data (nominal data) by looking at the cumulative responses across the ordered groups, measuring the incremental change between levels rather than the absolute percentage at each level.
Individual categorical responses are referred to as Multinomials, as in a multinomial distribution (Multinomials with two categories are a special class referred to as Binomials). The individual categories or cells of the multinomial are characterized by the probability of that category occurring or being observed, and the sum of all probabilities across the multinomial is 1.0. Categorical data can also occur in combinations or cross-classifications of two or more multinomials, and these are known as Contingency Tables. Like their singular counterparts, each combination or table cell is represented by the probability of that cell combination, and these also sum to 1.0. Contingency tables can also be summarized by row or column totals, and these are referred to as marginal distributions. All of these terms and structures can come into play during an analysis, as will be demonstrated below.
These data come from a survey of participants in several online pesticide safety workshops and are described in "Innovative Virtual Pesticide Recertification Webinar Series Achieves Success during the COVID-19 Pandemic", Himyck, R. et al., 2022, Journal of Pesticide Safety Education. While this survey covered many topics, the data used here are a subset relating to questions on participant age, size of community, and the online device used to access the workshop. Age is categorized into 5 levels, Community Size has 5 levels, and two types of devices (Smart phone and Tablet) are considered. The distributions of Age by Device type are plotted below.
PROC IMPORT OUT= WORK.survey
    DATAFILE= ".\Data\survey.xlsx"
    DBMS=XLSX REPLACE;
    sheet="Survey";
RUN;
proc print data=survey;
run;
proc sgplot data=survey;
styleattrs datacolors=(cx1805A7 cx8805A9 cxF717BB cxF71731 cxF78B17) datacontrastcolors=(black black black);
vbarparm category=Device response=Count / LIMITATTRS=(color=black)
group=Age grouporder=data OUTLINEATTRS=(color=black) groupdisplay=cluster;
xaxis label='Device' TYPE=discrete DISCRETEORDER=formatted LABELATTRS=( Family=Arial Size=15 Weight=Bold) VALUEATTRS=(Family=Arial Size=12 Weight=Bold);
yaxis label="Count" LABELATTRS=( Family=Arial Size=15 Weight=Bold) VALUEATTRS=(Family=Arial Size=12 Weight=Bold);
title1 ' ';
keylegend /AUTOITEMSIZE valueattrs=(size=14) TITLEATTRS=(weight=bold size=12);
run;
Obs  Age  Size  Device  Count 

1  18-35  Large City  Smart phone  18
2  36-45  Large City  Smart phone  16
3  46-55  Large City  Smart phone  14
4  56-65  Large City  Smart phone  6
5  18-35  Large City  Tablet  1
6  36-45  Large City  Tablet  13
7  46-55  Large City  Tablet  2
8  56-65  Large City  Tablet  10
9  Over 65  Large City  Tablet  1
10  18-35  Medium City  Smart phone  8
11  36-45  Medium City  Smart phone  4
12  46-55  Medium City  Smart phone  10
13  56-65  Medium City  Smart phone  6
14  Over 65  Medium City  Smart phone  1
15  18-35  Medium City  Tablet  1
16  56-65  Medium City  Tablet  7
17  Over 65  Medium City  Tablet  5
18  18-35  Small City  Smart phone  1
19  36-45  Small City  Smart phone  7
20  46-55  Small City  Smart phone  3
21  56-65  Small City  Smart phone  10
22  Over 65  Small City  Smart phone  2
23  18-35  Small City  Tablet  5
24  36-45  Small City  Tablet  5
25  46-55  Small City  Tablet  4
26  56-65  Small City  Tablet  10
27  Over 65  Small City  Tablet  2
28  18-35  Town  Smart phone  6
29  36-45  Town  Smart phone  26
30  46-55  Town  Smart phone  6
31  56-65  Town  Smart phone  3
32  Over 65  Town  Smart phone  2
33  18-35  Town  Tablet  8
34  46-55  Town  Tablet  4
35  56-65  Town  Tablet  1
36  Over 65  Town  Tablet  13
37  18-35  Rural  Smart phone  17
38  36-45  Rural  Smart phone  17
39  46-55  Rural  Smart phone  11
40  56-65  Rural  Smart phone  18
41  Over 65  Rural  Smart phone  15
42  18-35  Rural  Tablet  12
43  46-55  Rural  Tablet  14
44  56-65  Rural  Tablet  15
45  Over 65  Rural  Tablet  12
The most basic way to look at these data is to summarize the one-way marginal totals. In this example, we look at Age and Device separately, ignoring Community Size for now. The procedure used is Proc Freq, which is a tabulation procedure. A Weight statement is used to indicate the data are already summarized into counts. The Table statement implements the tabulation request. Both Age and Device are specified in the statement, generating two analyses. We could also issue two Table statements and generate equivalent results. The chisq option requests a test of whether all the categories within each factor have an equal probability of occurring. The plots option requests frequency plots of each marginal distribution.
proc freq data=survey;
weight count;
tables age device/chisq plots(only)=freqplot;
title1 'Age and Device Marginals';
run;
Age and Device Marginals

Age      Frequency  Percent  Cumulative Frequency  Cumulative Percent
18-35    77         20.70    77                    20.70
36-45    88         23.66    165                   44.35
46-55    68         18.28    233                   62.63
56-65    86         23.12    319                   85.75
Over 65  53         14.25    372                   100.00

Chi-Square Test for Equal Proportions
Chi-Square  11.0914
DF          4
Pr > ChiSq  0.0256
Sample Size = 372

Device       Frequency  Percent  Cumulative Frequency  Cumulative Percent
Smart phone  227        61.02    227                   61.02
Tablet       145        38.98    372                   100.00

Chi-Square Test for Equal Proportions
Chi-Square  18.0753
DF          1
Pr > ChiSq  <.0001
Sample Size = 372
The output here gives the one-way tabulation of frequencies (counts), percents, and the cumulative frequencies and percents. The chi-square tests indicate that the probabilities of the categories within each factor are not equal. Note that the degrees of freedom for each factor are one less than the respective number of factor levels.
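The equal-proportions chi-square can be reproduced by hand from the marginal counts in the output above. This Python sketch (a cross-check outside of SAS, not part of the analysis) recomputes the Age statistic:

```python
# Equal-proportions chi-square for the Age marginal: under the null,
# each of the C categories has expected count n / C.
age_counts = [77, 88, 68, 86, 53]        # Age frequencies from Proc Freq
n = sum(age_counts)                      # 372
expected = n / len(age_counts)           # 74.4 per category
chisq = sum((obs - expected) ** 2 / expected for obs in age_counts)
df = len(age_counts) - 1                 # C - 1 = 4
print(round(chisq, 4), df)  # 11.0914 4, matching the output above
```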
Many times we want to assess the association between two or more categorical factors. In this example, a two-way contingency table is set up to look at the combination of Age and Device. A chi-square test (option chisq) is requested to examine the potential association or independence of the two.
options nonotes;
proc freq data=survey;
weight count;
tables age*device/chisq;
title1 'Age by Device Contingency Table';
run;
Age by Device Contingency Table

Statistics for Table of Age by Device

Statistic                    DF  Value    Prob
Chi-Square                   4   30.0534  <.0001
Likelihood Ratio Chi-Square  4   30.7686  <.0001
Mantel-Haenszel Chi-Square   1   19.4680  <.0001
Phi Coefficient                  0.2842
Contingency Coefficient          0.2734
Cramer's V                       0.2842
Sample Size = 372
In the resulting table, four numbers are given in each cell. From top to bottom they are: the number of observations for that cell, the percentage of that cell in the whole table, the row percentage of the cell, and finally, the column percentage. These last two percentages, or marginal distributions, are often useful for thinking about possible associations. For example, we could look at row percentages in this table representing the distribution of device types within each age group. If there were no association, the percentages of smart phones and tablets would be similar for every age class. Looking at the table, however, we see that the percentage of smart phone use in younger groups is higher than in older age groups. The percentages for tablet use follow the reverse trend. This is also evident in the initial plot of the data given above.
The chi-square test option confirms this: the p-value is < 0.0001, indicating an association was detected. Several tests are carried out in this table. Only the first two, Chi-Square and Likelihood Ratio Chi-Square, are relevant here, and either of these can be reported. Note that, like correlation, this does not imply causality; it just indicates that the distribution of Device types changes with Age class, with the two Devices tending to trend in opposite directions as Age increases.
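The Pearson chi-square for this two-way table can likewise be cross-checked by hand. In the Python sketch below, the Age by Device cell counts were tallied from the survey data listed earlier (an illustration, not part of the SAS analysis):

```python
# Pearson chi-square for the Age x Device contingency table.
# Rows: 18-35, 36-45, 46-55, 56-65, Over 65; columns: Smart phone, Tablet.
table = [[50, 27], [70, 18], [44, 24], [43, 43], [20, 33]]
row_tot = [sum(row) for row in table]
col_tot = [sum(col) for col in zip(*table)]
n = sum(row_tot)

# Expected count under independence is (row total * column total) / n.
chisq = sum((table[i][j] - row_tot[i] * col_tot[j] / n) ** 2
            / (row_tot[i] * col_tot[j] / n)
            for i in range(len(table)) for j in range(len(table[0])))
df = (len(table) - 1) * (len(table[0]) - 1)
print(round(chisq, 4), df)  # close to the Chi-Square of 30.0534 with 4 df
```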
Sometimes it is of interest to more directly model the relationships between two or more categorical factors. One common means for doing this is logistic regression. In logistic regression, a binary categorical "response" is modeled as a function of other factors, which can be either categorical or continuous. A key here is that the response is a binary factor. While modeling can be done with more than two categories in the response, the interpretation becomes much more complex. In logistic regression, we indirectly model the proportions of the binary responses. This is done by selecting one of the categories, often referred to as a "success", and representing its proportion of success as p. The proportion of "failures" is then 1 - p because the proportions of "success" and "failure" must add to 1.0. Note that which category we assign as a "success" or "failure" is immaterial, but SAS will choose the lowest numeric value of the binary response as a "success" (in SAS, this can be reversed with the (descending) option placed after the response in the Model statement). Once it is defined, however, the proportion is transformed logarithmically to:
\[ transformed\;\; proportion = ln\left( \frac{p}{1-p}\right) = logit \] The ratio of success to failure inside the log function, p/(1 - p), is referred to as the odds of success, and the entire term as the log odds or logit. When logistic regression is run with a categorical factor on the right hand side of the model, the procedure will form one logit or log odds for every level of that factor. Results are then usually displayed or reported as the ratio of the odds (odds ratio = OR) of all levels relative to one selected reference level. By default, SAS determines this level to be the last alphabetical level. This can be changed if needed (e.g. with the ref option in a Class statement for SAS). An alternative is to also just report and interpret the proportions themselves along with the odds.
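To make the transformation concrete, here is a small Python sketch (with a hypothetical proportion) showing the odds, the logit, and that the inverse link recovers the original proportion:

```python
import math

# Hypothetical proportion of "success" for one group.
p = 0.6494

odds = p / (1 - p)      # odds of success
logit = math.log(odds)  # log odds, i.e. the logit

# Inverse link: exp(logit) / (1 + exp(logit)) maps back to the proportion.
p_back = math.exp(logit) / (1 + math.exp(logit))

print(round(odds, 2), round(logit, 4), round(p_back, 4))  # 1.85 0.6164 0.6494
```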
SAS has several procedures that can run logistic regression. In the example below, Proc Glimmix is used, as logistic regression is actually a generalized linear model. For more information on generalized linear models, see the tutorial here. In this case the factor Device is the binary response. To get SAS to implement logistic regression, however, we need Glimmix to view this response as a numeric binary variable. We could recode the "Smart phone" : "Tablet" character values in the data, but here Proc Format is used to simply coerce the values to 0 and 1, respectively, without manipulating the data. In the model, a binary distribution is specified, for which the logit is the default link function. Age class is the factor on the right hand side of the model. The model will then assess the odds and odds ratios of Smart phone usage for each Age class. The Lsmeans statement, with the ilink and odds options, is used to display the predicted proportions of "success" (Smart phone) and the respective odds for each age class. These are also output to a separate data set and then plotted with Proc Sgplot.
proc format;
value $dvf 'Smart phone' = '0'
'Tablet' = '1';
run;
ods graphics;
proc glimmix data=survey method=quad;
weight count;
class age;
model device = age/dist=binary oddsratio;
format device $dvf.;
lsmeans age/cl ilink odds;
ods output LSMeans=odds;
run;
proc sgplot data=odds noautolegend;
styleattrs datacolors=(cx1805A7 cx8805A9 cxF717BB cxF71731 cxF78B17) DATACONTRASTCOLORS=(cx1805A7 cx8805A9 cxF717BB cxF71731 cxF78B17) ;
highlow y=age high=upperodds low=lowerodds/group=age type=line lineattrs=(pattern=solid) highcap=serif lowcap=serif LINEATTRS=(thickness=2);
scatter y=age x=odds/group=age FILLEDOUTLINEDMARKERS markerattrs=(symbol=circlefilled size=14) MARKEROUTLINEATTRS=(color=black) datalabel=age DATALABELATTRS=(Color=black Family=Arial Style=Italic Weight=Bold size=10);
refline 1 /axis=x lineattrs=(pattern=shortdash color=black) ;
yaxis label='Age' TYPE=discrete DISCRETEORDER=data LABELATTRS=( Family=Arial Size=15 Weight=Bold) display=(NOVALUES NOTICKS);
xaxis label='Odds of Smart Phone Use' LABELATTRS=( Family=Arial Size=15 Weight=Bold) VALUEATTRS=(Family=Arial Size=12 Weight=Bold);
run;
Model Information  

Data Set  WORK.SURVEY 
Response Variable  Device 
Response Distribution  Binary 
Link Function  Logit 
Variance Function  Default 
Weight Variable  Count 
Variance Matrix  Diagonal 
Estimation Technique  Maximum Likelihood 
Degrees of Freedom Method  Residual 
Class Level Information  

Class  Levels  Values 
Age  5  18-35 36-45 46-55 56-65 Over 65
Number of Observations Read  45 

Number of Observations Used  45 
Response Profile

Ordered Value  Device  Total Frequency
1              0       24
2              1       21
The GLIMMIX procedure is modeling the probability that Device='0'. 
Dimensions  

Columns in X  6 
Columns in Z  0 
Subjects (Blocks in V)  1 
Max Obs per Subject  45 
Optimization Information  

Optimization Technique  Newton-Raphson 
Parameters in Optimization  5 
Lower Boundaries  0 
Upper Boundaries  0 
Fixed Effects  Not Profiled 
Iteration History

Iteration  Restarts  Evaluations  Objective Function  Change      Max Gradient
0          0         4            233.65149305        .           3.492468
1          0         3            233.35427822        0.29721483  0.078116
2          0         3            233.354175          0.00010322  0.000033
3          0         3            233.354175          0.00000000  6.73E-12
Convergence criterion (GCONV=1E8) satisfied. 
Fit Statistics  

-2 Log Likelihood  466.71 
AIC (smaller is better)  476.71 
AICC (smaller is better)  478.25 
BIC (smaller is better)  485.74 
CAIC (smaller is better)  490.74 
HQIC (smaller is better)  480.08 
Pearson Chi-Square  372.00 
Pearson Chi-Square / DF  8.27 
Type III Tests of Fixed Effects  

Effect  Num DF  Den DF  F Value  Pr > F 
Age  4  40  7.04  0.0002 
Odds Ratio Estimates

Age    Age      Estimate  DF  95% Confidence Limits
18-35  Over 65  3.056     40  1.445  6.462
36-45  Over 65  6.417     40  2.932  14.042
46-55  Over 65  3.025     40  1.402  6.525
56-65  Over 65  1.650     40  0.803  3.389
Age Least Squares Means

Age  Estimate  Standard Error  DF  t Value  Pr > |t|  Alpha  Lower  Upper  Mean  Standard Error Mean  Lower Mean  Upper Mean  Odds  Lower Odds  Upper Odds
18-35  0.6162  0.2388  40  2.58  0.0137  0.05  0.1335  1.0989  0.6494  0.05438  0.5333  0.7500  1.8519  1.1428  3.0008
36-45  1.3581  0.2643  40  5.14  <.0001  0.05  0.8240  1.8922  0.7955  0.04300  0.6951  0.8690  3.8889  2.2796  6.6342
46-55  0.6061  0.2538  40  2.39  0.0217  0.05  0.09327  1.1190  0.6471  0.05795  0.5233  0.7538  1.8333  1.0978  3.0618
56-65  1.11E-16  0.2157  40  0.00  1.0000  0.05  -0.4359  0.4359  0.5000  0.05392  0.3927  0.6073  1.0000  0.6467  1.5463
Over 65  -0.5008  0.2834  40  -1.77  0.0848  0.05  -1.0735  0.07195  0.3774  0.06658  0.2547  0.5180  0.6061  0.3418  1.0746
Some notes on the output: First, in the Response Profile, Device type is listed as 0 or 1, reflecting the formats from the preceding Proc Format definitions. Also, note that the output states the procedure is modeling the probability that Device=‘0’, which was defined to be “Smart phone”. This will be the “p” in the logit function.
Further down is the Type III Tests of Fixed Effects table, where the test of differences among Age classes is given. In this case, Age has a low p-value, suggesting the probabilities of Smart phone use differ across the Age categories. This is followed by the Odds Ratio table resulting from the oddsratio option in the model statement. Here four values are listed. The last category, "Over 65", is the reference level (it was last alphabetically), so the other categories are compared to it. The odds ratios indicate the relative size of each category's odds compared to this reference level. The odds of Smart phone usage in the "18-35" and "46-55" year olds are about 3 times as large as in the "Over 65" age group, while those of the "36-45" group are about 6.4 times as large. Although odds ratios are the commonly reported statistic for group comparisons in logistic regression, they do not indicate the size of the underlying probabilities.
The LSmeans table completes the information. Here the Estimate column values are the actual logit-transformed values. These are not of much use for interpretation. The ilink and odds options, however, provide the estimated probabilities (Mean column) of Smart phone usage and the respective odds. The "36-45" age group has the highest probability at 0.7955, or about 80%, while the two oldest groups have probabilities closer to 0.5, or equal probability. In the Odds column, the ratio of success = "Smart phone" to failure = "Tablet" is shown. These values indicate how likely Smart phone use is for each group.
Both odds ratios and odds are compared to 1.0. When the probability of success equals that of failure, the odds are 1.0, indicating no preference for either. Likewise, if two groups have equal odds, their odds ratio will be 1.0. This is reflected in the respective confidence intervals for each statistic. In this example, we see that the "56-65" category had a probability of 0.50, resulting in odds of 1.0. In the odds ratio table above, we see that this category, compared to the "Over 65" group, had an odds ratio close to 1.0, indicating the odds of Smart phone use in these two groups were similar.
Important: When reporting logistic regression results, it is not sufficient to report only odds ratios. These are relative measurements and do not indicate the magnitudes of the responses. Always present the underlying probabilities, the odds, or both.
From the Odds Ratio and LSmeans tables, we can reconstruct the values and see the relationships among them. If we take the estimated probability for each category and compute p/(1 - p), we get the odds. For example, in the "18-35" group, the estimated probability of Smart phone use is 0.6494. Computing 0.6494/(1 - 0.6494) gives 1.85, the odds for that category. If we further take the odds for "18-35" and divide by the odds for "Over 65", 1.85/0.6061, we get the odds ratio for "18-35" = 3.056. As noted above, we can reconstruct the underlying probabilities of categories from their odds, but we cannot do the same with only odds ratios. This is why it is important to present complete results when reporting logistic regression.
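This reconstruction can be verified numerically. The Python sketch below repeats the arithmetic using the estimated probabilities for the "18-35" and "Over 65" groups from the LSmeans output:

```python
# Reconstruct odds and an odds ratio from the estimated probabilities.
p_1835 = 0.6494    # probability of Smart phone use, 18-35 group
p_over65 = 0.3774  # probability of Smart phone use, Over 65 group

odds_1835 = p_1835 / (1 - p_1835)        # approx. 1.85
odds_over65 = p_over65 / (1 - p_over65)  # approx. 0.61
odds_ratio = odds_1835 / odds_over65     # approx. 3.06, as in the OR table

print(round(odds_1835, 2), round(odds_over65, 2), round(odds_ratio, 2))
```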
There are times, especially with survey data, when there is no obvious "independent/dependent" relationship among variables. For example, in these data, the relationship between Age and Community Size may not be obvious in terms of one influencing the other. Regardless, we would still like to evaluate the effects of both, or their combination. In these cases, it can be useful to use a class of models called loglinear models. The procedure for these in SAS is Proc Catmod. Note that Catmod can fit other model types, and this example is not meant to cover all those cases. Also, Proc Catmod cannot address or account for random model effects, so it has limitations in that respect.
Below, Catmod is used to assess the combined effect of Age and Size in the data (this ignores Device effects). The syntax for loglinear models in Catmod is unique compared to other SAS procedures. Here we use a special construct called _response_ as a placeholder in the model because both Age and Size could appear on either the left or right hand side of the model. The loglin statement following the model tells SAS what effects to evaluate on the right hand side. Here, the full model of main effects and interactions is requested. Catmod can also carry out contrasts. Because categorical models drop the last level of an effect for estimation, the contrasts may look different from those in other modeling procedures. In Catmod, if the contrast does not involve the last level of an effect, the contrasts are similar to other procedures and the coefficients add to zero. Note, however, that the number of coefficients is one less than the number of levels for a factor. Size, for example, has 5 levels, so there are 4 coefficients in the contrast statements. When a contrast involves the last level (Rural in this case), special care needs to be taken in forming contrast coefficient values, following this rule: the last effect level is set equal to the negative sum of all other coefficients for that effect. To illustrate, let the 5 levels of Size be represented by the Greek letters \(\alpha_1\) through \(\alpha_5\). The last level is then set to \(\alpha_5 = -(\alpha_1+\alpha_2+\alpha_3+\alpha_4)\). A contrast comparing "Large City" to "Rural" would then be:
\[ H_0 : \alpha_1 - \alpha_5 = \alpha_1 - (-(\alpha_1+\alpha_2+\alpha_3+\alpha_4)) =\\ \alpha_1 +\alpha_1+\alpha_2+\alpha_3+\alpha_4 = \\2\alpha_1 + \alpha_2+\alpha_3+\alpha_4\] The coefficients are, therefore, 2 1 1 1. Although these do not add to zero, as would be expected in other procedures, they are correct for this contrast.
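The rule above can be automated. This hypothetical Python helper (the function name is illustrative, not part of SAS) builds contrast coefficients for comparing two levels of a factor, applying the negative-sum substitution whenever the last level is involved:

```python
# Build contrast coefficients for level_a - level_b of a factor whose last
# level is parameterized as the negative sum of the preceding levels.
def contrast_coeffs(levels, level_a, level_b):
    k = len(levels) - 1          # only the first k levels get coefficients
    coef = [0] * k
    for level, sign in ((level_a, 1), (level_b, -1)):
        idx = levels.index(level)
        if idx < k:
            coef[idx] += sign
        else:
            # Last level: substitute -(sum of the other levels' terms).
            coef = [c - sign for c in coef]
    return coef

sizes = ["Large City", "Medium City", "Small City", "Town", "Rural"]
print(contrast_coeffs(sizes, "Large City", "Small City"))  # [1, 0, -1, 0]
print(contrast_coeffs(sizes, "Large City", "Rural"))       # [2, 1, 1, 1]
```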
Unfortunately, contrasts for interaction terms can become much more complex. In those cases, it may be better to construct a combined categorical variable for both Age and Size (in a data step, define it as Age_Size = Age||" "||Size;) and run that variable in the model and as the loglin effect. While it will have many coefficients (24), the last level "Over 65 Rural" will be dropped, and contrasts can be set up as shown above.
proc catmod data=survey order=data;
weight count;
model Size*Age=_response_/pred=prob;
loglin Size Age Size*Age;
contrast 'Large vs Small City' Size 1 0 -1 0;
contrast 'Medium vs Town' Size 0 1 0 -1;
contrast 'Large vs Rural' Size 2 1 1 1;
run;
Data Summary  

Response  Size*Age  Response Levels  25 
Weight Variable  Count  Populations  1 
Data Set  SURVEY  Total Frequency  372 
Frequency Missing  0  Observations  45 
Population Profiles  

Sample  Sample Size 
1  372 
Response Profiles

Response  Size         Age
1         Large City   18-35
2         Large City   36-45
3         Large City   46-55
4         Large City   56-65
5         Large City   Over 65
6         Medium City  18-35
7         Medium City  36-45
8         Medium City  46-55
9         Medium City  56-65
10        Medium City  Over 65
11        Small City   18-35
12        Small City   36-45
13        Small City   46-55
14        Small City   56-65
15        Small City   Over 65
16        Town         18-35
17        Town         36-45
18        Town         46-55
19        Town         56-65
20        Town         Over 65
21        Rural        18-35
22        Rural        36-45
23        Rural        46-55
24        Rural        56-65
25        Rural        Over 65
Maximum Likelihood Analysis 

Maximum likelihood computations converged. 
Maximum Likelihood Analysis of Variance

Source            DF  Chi-Square  Pr > ChiSq
Size              4   68.69       <.0001
Age               4   9.74        0.0451
Size*Age          16  47.55       <.0001
Likelihood Ratio  0   .           .
Analysis of Maximum Likelihood Estimates

Parameter             Estimate  Standard Error  Chi-Square  Pr > ChiSq
Size      Large City   -0.0770  0.1855           0.17       0.6780
          Medium City  -0.3998  0.1492           7.18       0.0074
          Small City   -0.3275  0.1482           4.88       0.0271
          Town          0.0104  0.1341           0.01       0.9381
Age       18-35         0.1395  0.1257           1.23       0.2671
          36-45         0.2176  0.1285           2.87       0.0903
          46-55         0.0601  0.1266           0.23       0.6350
          56-65         0.1948  0.1289           2.28       0.1307
Size*Age  Large City 18-35    0.4335   0.2527    2.94       0.0862
          Large City 36-45    0.7784   0.2408   10.45       0.0012
          Large City 46-55    0.3411   0.2600    1.72       0.1896
          Large City 56-65    0.2064   0.2611    0.62       0.4293
          Medium City 18-35   0.00911  0.2697    0.00       0.9730
          Medium City 36-45  -0.8798   0.3513    6.27       0.0123
          Medium City 46-55   0.1939   0.2626    0.55       0.4602
          Medium City 56-65   0.3216   0.2474    1.69       0.1937
          Small City 18-35   -0.4687   0.3040    2.38       0.1232
          Small City 36-45    0.1464   0.2513    0.34       0.5600
          Small City 46-55   -0.2351   0.2900    0.66       0.4175
          Small City 56-65    0.6800   0.2264    9.02       0.0027
          Town 18-35          0.0407   0.2327    0.03       0.8612
          Town 36-45          0.5817   0.2073    7.87       0.0050
          Town 46-55         -0.2163   0.2543    0.72       0.3949
          Town 56-65         -1.2673   0.3453   13.47       0.0002
Contrasts of Maximum Likelihood Estimates

Contrast             DF  Chi-Square  Pr > ChiSq
Large vs Small City  1   0.82        0.3642
Medium vs Town       1   3.42        0.0645
Large vs Rural       1   13.43       0.0002
Maximum Likelihood Predicted Values for Response Functions

Function Number  Observed Function  Standard Error  Predicted Function  Standard Error  Residual
1                -0.3514            0.299447        -0.3514             0.299447        0
2                0.071459           0.267432        0.071459            0.267432        0
3                -0.52325           0.315495        -0.52325            0.315495        0
4                -0.52325           0.315495        -0.52325            0.315495        0
5                -3.29584           1.01835         -3.29584            1.018201        4.61E-8
6                -1.09861           0.3849          -1.09861            0.3849          0
7                -1.90954           0.535758        -1.90954            0.535759        0
8                -0.99325           0.370185        -0.99325            0.370185        0
9                -0.73089           0.33758         -0.73089            0.33758         0
10               -1.50408           0.451335        -1.50408            0.451336        0
11               -1.50408           0.451335        -1.50408            0.451336        0
12               -0.81093           0.346944        -0.81093            0.346944        0
13               -1.34993           0.424139        -1.34993            0.42414         0
14               -0.3001            0.29502         -0.3001             0.295021        0
15               -1.90954           0.535758        -1.90954            0.535759        0
16               -0.65678           0.329341        -0.65678            0.329341        0
17               -0.03774           0.27477         -0.03774            0.27477         0
18               -0.99325           0.370185        -0.99325            0.370185        0
19               -1.90954           0.535758        -1.90954            0.535759        0
20               -0.58779           0.322031        -0.58779            0.322031        0
21               0.071459           0.267432        0.071459            0.267432        0
22               -0.46262           0.309614        -0.46262            0.309614        0
23               -0.07696           0.277555        -0.07696            0.277556        0
24               0.200671           0.2595          0.200671            0.2595          0
Maximum Likelihood Predicted Values for Probabilities

Size         Age      Observed Probability  Standard Error  Predicted Probability  Standard Error  Residual
Large City   18-35    0.0511                0.0114          0.0511                 0.0114          0
Large City   36-45    0.078                 0.0139          0.078                  0.0139          0
Large City   46-55    0.043                 0.0105          0.043                  0.0105          0
Large City   56-65    0.043                 0.0105          0.043                  0.0105          0
Large City   Over 65  0.0027                0.0027          0.0027                 0.0027          1E-10
Medium City  18-35    0.0242                0.008           0.0242                 0.008           0
Medium City  36-45    0.0108                0.0053          0.0108                 0.0053          0
Medium City  46-55    0.0269                0.0084          0.0269                 0.0084          0
Medium City  56-65    0.0349                0.0095          0.0349                 0.0095          0
Medium City  Over 65  0.0161                0.0065          0.0161                 0.0065          0
Small City   18-35    0.0161                0.0065          0.0161                 0.0065          0
Small City   36-45    0.0323                0.0092          0.0323