1 Introduction

Categorical responses are data that are intrinsically qualitative or non-numerical. As such, they present some unique characteristics and issues for analysis. SAS provides multiple procedures for analyzing categorical data. This tutorial covers common ones used to carry out a variety of analysis types.

1.1 Characteristics of Categorical Data

Examples of categorical data are responses such as gender, level of education, Yes-No answers, etc. A characteristic of such data is that the categories are non-overlapping or mutually exclusive. Because of this, when the data are summarized as percentages, the values must add to 100%, a condition known as “Sum-to-One”. A consequence is that if we know the values for all but one category, we automatically know the value for the remaining category: If there are 60% Yes responses, then we know there are 40% No. If plants are recorded in three categories as 20% short and 30% medium, then we know there are also 50% tall. In analyses, this affects the number of parameters estimated. Generally, if there are C categories, then the analyses will estimate C-1 parameters.

Another characteristic of some categorical data is order. In the short, medium, tall example above, there is a natural progression or order to the categories. Occasionally, we might also categorize truly continuous data such as income or age into ordered categories, e.g. age: 20-35, 36-50, and greater than 50. These ordinal data are often treated differently than non-ordered (nominal) data: the analysis looks at the cumulative responses across the ordered groups, measuring the incremental change between levels rather than the absolute percentage at each level.

Individual categorical responses are referred to as Multinomials, as in a multinomial distribution (Multinomials with two categories are a special class referred to as Binomials). The individual categories or cells of the multinomial are characterized by the probability of that category occurring or being observed and the sum of all probabilities across the multinomial is 1.0. Categorical data can also occur in combinations or cross classifications of 2 or more multinomials and these are known as Contingency Tables. Like their singular counterparts, each combination or table cell is represented by a probability of that cell combination and these also sum to 1.0. Contingency tables can also be summarized by row or column totals and these are referred to as marginal distributions. All of these terms and structures can come into play during an analysis as will be demonstrated below.
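These terms can be restated compactly. The notation below is one common convention and is an assumption introduced here, not something defined by the tutorial: \(p_i\) is the probability of category \(i\) in a multinomial with \(C\) categories, \(p_{ij}\) is the probability of cell \((i,j)\) in a contingency table with \(I\) rows and \(J\) columns, and \(p_{i+}\) and \(p_{+j}\) are the row and column marginal probabilities.

\[ \sum_{i=1}^{C} p_i = 1, \qquad \sum_{i=1}^{I}\sum_{j=1}^{J} p_{ij} = 1, \qquad p_{i+} = \sum_{j=1}^{J} p_{ij}, \qquad p_{+j} = \sum_{i=1}^{I} p_{ij} \]

Because of the Sum-to-One condition, only \(C-1\) of the \(p_i\) (or \(IJ-1\) of the \(p_{ij}\)) are free to vary.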

2 Data used in Examples

These data come from a survey of participants in several online pesticide safety workshops and are described in Innovative Virtual Pesticide Recertification Webinar Series Achieves Success during the COVID-19 Pandemic, Himyck, R. et al. 2022, Journal of Pesticide Safety Education. While this survey covered many topics, the data used here are a subset relating to questions on participant age, size of community, and the online device used to access the workshop. Age is categorized into 5 levels, Community Size has 5 levels, and two types of Devices (Smart phone and Tablet) are considered. The distributions of Age by Device type are plotted below.



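/* Read the "Survey" sheet of the Excel workbook into WORK.survey */
PROC IMPORT OUT= WORK.survey
    DATAFILE= ".\Data\survey.xlsx"
    DBMS=XLSX REPLACE;
    sheet="Survey";
RUN;

/* List the imported data; each row is a pre-summarized count */
proc print data=survey;
run;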

proc sgplot data=survey;
    styleattrs datacolors=(cx1805A7 cx8805A9 cxF717BB cxF71731 cxF78B17) datacontrastcolors=(black black black);
    vbarparm category=Device response=Count / LIMITATTRS=(color=black)
    group=Age grouporder=data OUTLINEATTRS=(color=black) groupdisplay=cluster;
    xaxis label='Device' TYPE=discrete DISCRETEORDER=formatted LABELATTRS=( Family=Arial Size=15 Weight=Bold) VALUEATTRS=(Family=Arial Size=12 Weight=Bold);
    yaxis label="Count" LABELATTRS=( Family=Arial Size=15 Weight=Bold) VALUEATTRS=(Family=Arial Size=12 Weight=Bold);
    title1 ' ';
    keylegend /AUTOITEMSIZE valueattrs=(size=14) TITLEATTRS=(weight=bold size=12); 
run;

Obs	Age	Size	Device	Count
1	18-35	Large City	Smart phone	18
2	36-45	Large City	Smart phone	16
3	46-55	Large City	Smart phone	14
4	56-65	Large City	Smart phone	6
5	18-35	Large City	Tablet	1
6	36-45	Large City	Tablet	13
7	46-55	Large City	Tablet	2
8	56-65	Large City	Tablet	10
9	Over 65	Large City	Tablet	1
10	18-35	Medium City	Smart phone	8
11	36-45	Medium City	Smart phone	4
12	46-55	Medium City	Smart phone	10
13	56-65	Medium City	Smart phone	6
14	Over 65	Medium City	Smart phone	1
15	18-35	Medium City	Tablet	1
16	56-65	Medium City	Tablet	7
17	Over 65	Medium City	Tablet	5
18	18-35	Small City	Smart phone	1
19	36-45	Small City	Smart phone	7
20	46-55	Small City	Smart phone	3
21	56-65	Small City	Smart phone	10
22	Over 65	Small City	Smart phone	2
23	18-35	Small City	Tablet	5
24	36-45	Small City	Tablet	5
25	46-55	Small City	Tablet	4
26	56-65	Small City	Tablet	10
27	Over 65	Small City	Tablet	2
28	18-35	Town	Smart phone	6
29	36-45	Town	Smart phone	26
30	46-55	Town	Smart phone	6
31	56-65	Town	Smart phone	3
32	Over 65	Town	Smart phone	2
33	18-35	Town	Tablet	8
34	46-55	Town	Tablet	4
35	56-65	Town	Tablet	1
36	Over 65	Town	Tablet	13
37	18-35	Rural	Smart phone	17
38	36-45	Rural	Smart phone	17
39	46-55	Rural	Smart phone	11
40	56-65	Rural	Smart phone	18
41	Over 65	Rural	Smart phone	15
42	18-35	Rural	Tablet	12
43	46-55	Rural	Tablet	14
44	56-65	Rural	Tablet	15
45	Over 65	Rural	Tablet	12

Figure: Clustered bar chart of Count by Device, grouped by Age (Proc Sgplot output)

3 Basic Categorical Summary and Inference

3.1 One-way Marginal Analyses

The most basic way to look at these data is to summarize the one-way marginal totals. In this example, we look at Age and Device separately, ignoring Community Size for now. The procedure used is Proc Freq, which is a tabulation procedure. A Weight statement is used to indicate the data are already summarized into counts. The Tables statement implements the tabulation request. Both Age and Device are specified in the statement, generating two analyses. We could also issue two Tables statements and generate equivalent results. The chisq option requests a test to assess whether all the categories within each factor have an equal probability of occurring. The plots option asks for frequency plots of each marginal distribution.



proc freq data=survey;
    weight count;
    tables age device/chisq plots(only)=freqplot;
    title1 'Age and Device Marginals';
run;

Age and Device Marginals


Age	Frequency	Percent	Cumulative Frequency	Cumulative Percent
18-35	77	20.70	77	20.70
36-45	88	23.66	165	44.35
46-55	68	18.28	233	62.63
56-65	86	23.12	319	85.75
Over 65	53	14.25	372	100.00

Bar Chart of Frequencies for Age

Chi-Square Test for Equal Proportions
Chi-Square	11.0914
DF	4
Pr > ChiSq	0.0256

Sample Size = 372


Device	Frequency	Percent	Cumulative Frequency	Cumulative Percent
Smart phone	227	61.02	227	61.02
Tablet	145	38.98	372	100.00

Bar Chart of Frequencies for Device

Chi-Square Test for Equal Proportions
Chi-Square	18.0753
DF	1
Pr > ChiSq	<.0001

Sample Size = 372

The output here gives the one-way tabulation of frequencies (counts), percentages, and the cumulative frequencies and percentages. The small p-values of the chi-square tests (0.0256 for Age and < 0.0001 for Device) indicate that the probabilities of the categories within each factor are not equal. Note that the degrees of freedom for each factor are one less than the respective number of factor levels.
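To make the test concrete, the statistic can be reproduced by hand from the frequency tables above. This sketch assumes the usual Pearson form of the test, with observed counts \(O_i\) and, under equal proportions, expected counts \(E_i = n/C\) (these symbols are notation introduced here for illustration). For Device, \(n = 372\) and \(C = 2\), so \(E_i = 186\):

\[ \chi^2 = \sum_{i=1}^{C} \frac{(O_i - E_i)^2}{E_i} = \frac{(227-186)^2}{186} + \frac{(145-186)^2}{186} = \frac{3362}{186} \approx 18.075 \]

This agrees with the reported value of 18.0753 on \(C-1 = 1\) degree of freedom. The same calculation for Age uses \(E_i = 372/5 = 74.4\) and gives \(825.2/74.4 \approx 11.091\) on 4 degrees of freedom.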

3.2 Two-way Contingency Table Analysis

Many times we want to assess the association between two or more categorical factors. In this example, a two-way contingency table is set up to look at the combination of Age and Device. A chi-square test (option chisq) is requested to test whether the two factors are independent or associated.


options nonotes;
proc freq data=survey;
    weight count;
    tables age*device/chisq;
    title1 'Age by Device Contingency Table';
run;

Age by Device Contingency Table

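Age	Statistic	Smart phone	Tablet	Total
18-35	Frequency	50	27	77
18-35	Percent	13.44	7.26	20.70
18-35	Row Pct	64.94	35.06	
18-35	Col Pct	22.03	18.62	
36-45	Frequency	70	18	88
36-45	Percent	18.82	4.84	23.66
36-45	Row Pct	79.55	20.45	
36-45	Col Pct	30.84	12.41	
46-55	Frequency	44	24	68
46-55	Percent	11.83	6.45	18.28
46-55	Row Pct	64.71	35.29	
46-55	Col Pct	19.38	16.55	
56-65	Frequency	43	43	86
56-65	Percent	11.56	11.56	23.12
56-65	Row Pct	50.00	50.00	
56-65	Col Pct	18.94	29.66	
Over 65	Frequency	20	33	53
Over 65	Percent	5.38	8.87	14.25
Over 65	Row Pct	37.74	62.26	
Over 65	Col Pct	8.81	22.76	
Total	Frequency	227	145	372
Total	Percent	61.02	38.98	100.00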

Statistics for Table of Age by Device

Statistic	DF	Value	Prob
Chi-Square	4	30.0534	<.0001
Likelihood Ratio Chi-Square	4	30.7686	<.0001
Mantel-Haenszel Chi-Square	1	19.4680	<.0001
Phi Coefficient		0.2842
Contingency Coefficient		0.2734
Cramer's V		0.2842

Sample Size = 372

In the resulting table, four numbers are given in each cell. From top to bottom they are: the number of observations for that cell, the corresponding percentage of that cell in the whole table, the row percentage of the cell, and finally, the column percentage. These last two percentages, which describe the distribution within a row or within a column, are often useful for thinking about possible associations. For example, the row percentages in this table represent the distribution of device types within each age group. If there were no association, the percentages of smart phones and tablets would be similar for every age class. Looking at the table, however, we see that the percentage of smart phone use is higher in the younger age groups (65% to 80% for ages 18 to 55) than in the older groups (50% for “56-65” and 38% for “Over 65”). The percentages for tablet use follow the reverse trend. This is also evident in the initial plot of the data given above.

The chi-square test confirms this: the p-value is < 0.0001, indicating that an association was detected. Several tests are reported in this table. Only the first two, Chi-Square and Likelihood Ratio Chi-Square, are relevant here, and either of these can be reported. Note that, like a correlation, this does not imply causality; it only indicates that the distribution of Device types changes across Age classes, with the two Devices trending in opposite directions as Age increases.
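One way to see where the association comes from is to print the counts expected under independence next to the observed counts. The following is a minimal sketch, assuming the same WORK.survey data set as above. The expected, cellchi2, norow, nocol, and nopercent options of the Tables statement are not used in the original example and are added here for illustration.

```sas
/* Sketch: observed vs. expected counts for the Age by Device table.  */
/* Assumes WORK.survey (variables Age, Device, Count) already exists. */
proc freq data=survey;
    weight count;                      /* rows are pre-summarized counts */
    tables age*device / chisq
                        expected       /* expected count under independence */
                        cellchi2       /* each cell's contribution to the chi-square */
                        norow nocol nopercent;  /* suppress the percentages */
    title1 'Age by Device: Observed and Expected Counts';
run;
title1;  /* clear the title */
```

Under independence, the expected count for a cell is (row total × column total) / n. For “18-35” Smart phone users that is 77 × 227 / 372, or about 47, compared with the 50 observed. Cells with large contributions to the chi-square are the ones that depart most from independence.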

4 Advanced Categorical Analysis

4.1 Logistic Regression

Sometimes it is of interest to model the relationships between two or more categorical factors more directly. One common means for doing this is logistic regression. In logistic regression, a binary categorical “response” is modeled as a function of other factors, which can be either categorical or continuous. A key point is that the response is binary. While modeling can be done with more than two categories in the response, the interpretation becomes much more complex. In logistic regression, we indirectly model the proportions of the binary responses. This is done by selecting one of the categories, often referred to as a “success”, and representing its proportion as p. The proportion of “failures” is then 1-p because the proportions of “success” and “failure” must add to 1.0. Which category we assign as a “success” or “failure” is immaterial, but by default SAS will choose the lowest value of the binary response as the “success” (this can be reversed with the (descending) option placed after the response in the model statement). Once the “success” is defined, the proportion is transformed logarithmically to:

\[ \text{transformed proportion} = \ln\left( \frac{p}{1-p}\right) = \text{logit} \]

The fraction of success to failure inside the log function, p/(1-p), is referred to as the odds of success, and the entire term as the log odds or logit. When logistic regression is run with a categorical factor on the right hand side of the model, the procedure will form one logit or log odds for every level of that factor. Results are then usually displayed or reported as the ratio of the odds (odds ratio = OR) of each level relative to one selected reference level. By default, SAS takes this reference to be the last level alphabetically. This can be changed if needed (e.g. with the ref option in a Class statement). An alternative is to report and interpret the proportions themselves along with the odds.

SAS has several procedures that can run logistic regression. In the example below, Proc Glimmix is used because logistic regression is a generalized linear model. For more information on generalized linear models, see the tutorial here. In this case the factor Device is the binary response. In order to get SAS to implement logistic regression, however, we need Glimmix to view this response as a binary variable with values 0 and 1. We could recode the “Smart phone” and “Tablet” character values in the data, but here Proc Format is used to map them to 0 and 1, respectively, without manipulating the data. In the model, a binary distribution is specified, for which the logit is the default link function. Age class is the factor on the right hand side of the model. The model will then assess the odds and odds ratios of Smart phone usage for each Age class. The Lsmeans statement, with the ilink and odds options, is used to display the predicted proportions of “success” (Smart phone) and the respective odds for each age class. These are also output to a separate data set and then plotted with Proc Sgplot.


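/* Map the character Device values to '0'/'1' without changing the data */
proc format;
    value $dvf 'Smart phone' = '0'
                'Tablet' = '1';
run;
ods graphics;

proc glimmix data=survey method=quad;
    weight count;                              /* rows are pre-summarized counts */
    class age;
    model device = age/dist=binary oddsratio;  /* logit link is the default */
    format device $dvf.;
    lsmeans age/cl ilink odds;                 /* probabilities (ilink) and odds per Age */
    ods output LSMeans=odds;                   /* save the LSMeans table for plotting */
run;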
                
                
proc sgplot data=odds noautolegend; 
    styleattrs datacolors=(cx1805A7 cx8805A9 cxF717BB cxF71731 cxF78B17) DATACONTRASTCOLORS=(cx1805A7 cx8805A9 cxF717BB cxF71731 cxF78B17) ;
    highlow y=age high=upperodds low=lowerodds/group=age type=line lineattrs=(pattern=solid) highcap=serif lowcap=serif LINEATTRS=(thickness=2);
    scatter y=age x=odds/group=age FILLEDOUTLINEDMARKERS markerattrs=(symbol=circlefilled size=14) MARKEROUTLINEATTRS=(color=black) datalabel=age DATALABELATTRS=(Color=black Family=Arial Style=Italic Weight=Bold size=10);
    refline 1 /axis=x lineattrs=(pattern=shortdash color=black) ;
    yaxis label='Age' TYPE=discrete DISCRETEORDER=data LABELATTRS=( Family=Arial Size=15 Weight=Bold) display=(NOVALUES NOTICKS);
    xaxis label='Odds of Smart Phone Use'  LABELATTRS=( Family=Arial Size=15 Weight=Bold) VALUEATTRS=(Family=Arial Size=12 Weight=Bold);
run;

Model Information
Data Set	WORK.SURVEY
Response Variable	Device
Response Distribution	Binary
Link Function	Logit
Variance Function	Default
Weight Variable	Count
Variance Matrix	Diagonal
Estimation Technique	Maximum Likelihood
Degrees of Freedom Method	Residual

Class Level Information
Class	Levels	Values
Age	5	18-35 36-45 46-55 56-65 Over 65

Number of Observations Read	45
Number of Observations Used	45

Response Profile
Ordered Value	Device	Total Frequency
1	0	24
2	1	21
The GLIMMIX procedure is modeling the probability that Device='0'.

Dimensions
Columns in X	6
Columns in Z	0
Subjects (Blocks in V)	1
Max Obs per Subject	45

Optimization Information
Optimization Technique	Newton-Raphson
Parameters in Optimization	5
Lower Boundaries	0
Upper Boundaries	0
Fixed Effects	Not Profiled

Iteration History
Iteration	Restarts	Evaluations	Objective Function	Change	Max Gradient
0	0	4	233.65149305	.	3.492468
1	0	3	233.35427822	0.29721483	0.078116
2	0	3	233.354175	0.00010322	0.000033
3	0	3	233.354175	0.00000000	6.73E-12

Convergence criterion (GCONV=1E-8) satisfied.

Fit Statistics
-2 Log Likelihood	466.71
AIC (smaller is better)	476.71
AICC (smaller is better)	478.25
BIC (smaller is better)	485.74
CAIC (smaller is better)	490.74
HQIC (smaller is better)	480.08
Pearson Chi-Square	372.00
Pearson Chi-Square / DF	8.27

Type III Tests of Fixed Effects
Effect	Num DF	Den DF	F Value	Pr > F
Age	4	40	7.04	0.0002

Odds Ratio Estimates
Age	Age	Estimate	DF	95% Confidence Limits
18-35	Over 65	3.056	40	1.445	6.462
36-45	Over 65	6.417	40	2.932	14.042
46-55	Over 65	3.025	40	1.402	6.525
56-65	Over 65	1.650	40	0.803	3.389

Age Least Squares Means
Age	Estimate	Standard Error	DF	t Value	Pr > |t|	Alpha	Lower	Upper	Mean	Standard Error Mean	Lower Mean	Upper Mean	Odds	Lower Odds	Upper Odds
18-35	0.6162	0.2388	40	2.58	0.0137	0.05	0.1335	1.0989	0.6494	0.05438	0.5333	0.7500	1.8519	1.1428	3.0008
36-45	1.3581	0.2643	40	5.14	<.0001	0.05	0.8240	1.8922	0.7955	0.04300	0.6951	0.8690	3.8889	2.2796	6.6342
46-55	0.6061	0.2538	40	2.39	0.0217	0.05	0.09327	1.1190	0.6471	0.05795	0.5233	0.7538	1.8333	1.0978	3.0618
56-65	1.11E-16	0.2157	40	0.00	1.0000	0.05	-0.4359	0.4359	0.5000	0.05392	0.3927	0.6073	1.0000	0.6467	1.5463
Over 65	-0.5008	0.2834	40	-1.77	0.0848	0.05	-1.0735	0.07195	0.3774	0.06658	0.2547	0.5180	0.6061	0.3418	1.0746

Figure: Odds of Smart phone use with confidence limits for each Age class (Proc Sgplot output)

Some notes on the output: First, in the Response Profile, Device type is listed as 0 or 1, reflecting the formats from the preceding Proc Format definitions. Also, note that the output states the procedure is modeling the probability that Device=‘0’, which was defined to be “Smart phone”. This will be the “p” in the logit function.

Further down is the Type III Tests of Fixed Effects table, which tests for differences among the levels of Age. In this case, Age has a low p-value (0.0002), suggesting the probabilities of Smart phone use differ across the Age categories. This is followed by the Odds Ratio table resulting from the oddsratio option in the model statement. Here there are four values listed. The last category, “Over 65”, is the reference level (it was last alphabetically), so the other categories are compared to it. The odds ratios indicate the relative size of each category’s odds compared to this reference level. The odds of Smart phone usage for “18-35” and “46-55” year olds are about 3 times as large as those of the “Over 65” age group, while the odds for the “36-45” group are about 6.4 times as large. Although odds ratios are the commonly reported statistic for group comparison in logistic regression, they do not indicate the size of the underlying probabilities.
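Both defaults mentioned above, the category treated as “success” and the reference level for the odds ratios, can be changed. The following is a hedged sketch rather than part of the original analysis: it assumes the descending response option and the ref= option of the Class statement behave in Glimmix as described in the text, and the choice of “18-35” as the reference is purely illustrative.

```sas
/* Sketch: change the modeled category and the reference Age level. */
/* Assumes WORK.survey exists; the $dvf format is re-created here   */
/* so that the block is self-contained.                             */
proc format;
    value $dvf 'Smart phone' = '0'
               'Tablet'      = '1';
run;

proc glimmix data=survey;
    weight count;
    class age(ref='18-35');             /* illustrative choice of reference level */
    model device(descending) = age      /* model Pr(Device='1'), i.e. Tablet */
          / dist=binary oddsratio;
    format device $dvf.;
    lsmeans age / cl ilink odds;
run;
```

With these settings the Response Profile should report that the probability of Device='1' is being modeled, and the odds ratios should be expressed relative to the “18-35” group. It is worth confirming both in the output before interpreting the estimates.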

The LSmeans table completes the information. Here the Estimate column values are the logit-transformed values, which are not of much use for interpretation. The ilink and odds options, however, provide the estimated probabilities of Smart phone usage (Mean column) and the respective odds. The “36-45” age group has the highest probability at 0.7955 or about 80%, the “56-65” group has a probability of 0.50 (no preference for either device), and the “Over 65” group has the lowest at 0.3774. The Odds column shows the ratio of success = “Smart phone” to failure = “Tablet”. These indicate how much more likely Smart phone use is than Tablet use in each group.

Both Odds Ratios and Odds are compared to 1.0. When the probability of success equals that of failure, the odds are 1.0, indicating no preference for either. Likewise, if two groups have equal odds, their odds ratio will be 1.0. The confidence intervals for each statistic show this: an interval that contains 1.0 is consistent with no preference (odds) or no difference between groups (odds ratio). In this example, the “56-65” category had a probability of 0.50, resulting in odds of 1.0. In the odds ratio table above, this category compared to the “Over 65” group had an odds ratio of 1.650 with a confidence interval (0.803 to 3.389) that includes 1.0, indicating the odds of Smart phone use in these two groups could not be distinguished.
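The comparison against 1.0 can also be done programmatically. This is a minimal sketch, assuming the odds data set written by the ods output statement in the Glimmix step above and the variable names Odds, LowerOdds, and UpperOdds used in the plotting code. The data set name odds_flag and the variable Note are hypothetical names introduced for illustration.

```sas
/* Sketch: flag which odds intervals include 1.0.                  */
/* Assumes the ODDS data set written by "ods output LSMeans=odds". */
data odds_flag;                       /* hypothetical data set name */
    set odds;
    length Note $ 24;                 /* hypothetical label variable */
    if LowerOdds > 1 then Note = 'Smart phone favored';
    else if UpperOdds < 1 then Note = 'Tablet favored';
    else Note = 'Interval includes 1.0';
run;

proc print data=odds_flag noobs;
    var Age Odds LowerOdds UpperOdds Note;
run;
```

With the LSmeans table shown above, the three youngest groups would be flagged as favoring Smart phones, while the intervals for “56-65” (0.6467 to 1.5463) and “Over 65” (0.3418 to 1.0746) include 1.0.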

Important: When reporting logistic regression results, it is not sufficient to report only odds ratios. These are relative measurements and do not indicate the magnitude of the responses. Always present the underlying probabilities, the odds, or both.
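Since several SAS procedures can fit this model, the sketch below shows how it might look in Proc Logistic. This is an illustration and was not part of the original analysis: the freq statement, the event= response option, and param=glm are choices made here, and the printed tests and degrees of freedom need not match the Glimmix output above.

```sas
/* Sketch: the same Device-by-Age logistic model in Proc Logistic. */
/* Assumes WORK.survey (variables Age, Device, Count) exists.      */
proc logistic data=survey;
    freq count;                               /* each row stands for Count respondents */
    class age / param=glm;                    /* GLM coding is needed for Lsmeans */
    model device(event='Smart phone') = age;  /* model Pr(Device = Smart phone) */
    lsmeans age / cl ilink;                   /* logits and back-transformed probabilities */
run;
```

Here the event= option names the “success” category directly, so no format is needed to recode Device. The Lsmeans output with ilink supplies the probabilities that should accompany the odds ratios in a report.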

4.1.0.1 More on Odds Ratio and Odds numeric results

From the Odds Ratio and LSmeans tables, we can reconstruct the values and see the relationships among them. If we take the estimated probability for each category and compute p/(1-p), we get the odds. For example, in the “18-35” group, the estimated probability of Smart phone use is 0.6494. Computing 0.6494/(1-0.6494) gives 1.85, the odds for that category. If we further divide the odds for “18-35” by the odds for “Over 65”, 1.8519/0.6061, we get the odds ratio for “18-35”, which matches the reported 3.056 up to rounding. We can also reconstruct the underlying probability of a category from its odds as odds/(1+odds), but we cannot do the same with only odds ratios. This is why it is important to present complete results when reporting logistic regression.
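The relationships used in this reconstruction can be written compactly. The subscripts below are labels introduced here for illustration (a = “18-35”, b = “Over 65”), and the numbers are the rounded values from the LSmeans table, so the last digit may differ slightly from the SAS output.

\[ \text{odds}_a = \frac{p_a}{1-p_a} = \frac{0.6494}{0.3506} \approx 1.852, \qquad p_a = \frac{\text{odds}_a}{1+\text{odds}_a} = \frac{1.8519}{2.8519} \approx 0.6494, \qquad OR_{a:b} = \frac{\text{odds}_a}{\text{odds}_b} = \frac{1.8519}{0.6061} \approx 3.055 \]

Because an odds ratio fixes only the ratio of two odds, many different pairs of odds, and therefore of probabilities, give the same odds ratio.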

4.2 Loglinear Models

There are times, especially with survey data, when there is no obvious “independent/dependent” variable relationship. For example, in these data, it is not obvious whether Age influences Community Size or the reverse. Regardless, we would still like to evaluate the effects of both or their combination. In these cases, it can be useful to use a class of models called Loglinear models. The procedure for these in SAS is Proc Catmod. Note that Catmod can fit other model types, and this example is not meant to cover all those cases. Also, Proc Catmod cannot account for random model effects, which limits its use.

Below, Catmod is used to assess the combined effect of Age and Size in the data (this ignores Device effects). The syntax for loglinear models in Catmod is unique compared to other SAS procedures. Here we use a special keyword called _response_ as a placeholder in the model because both Age and Size could appear on either the left or right hand side of the model. The loglin statement following the model tells SAS what effects to evaluate on the right hand side. Here, the full model of main effects and their interaction is requested.

Catmod can also carry out contrasts. Because categorical models drop the last level of an effect for estimation, the contrasts may look different than in other modeling procedures. In Catmod, if the contrast does not involve the last level of an effect, the contrasts are similar to other procedures and the coefficients add to zero. Note, however, that the number of coefficients is one less than the number of levels for a factor. Size, for example, has 5 levels, so there are 4 coefficients in the contrast statements. When a contrast involves the last level (Rural in this case), special care needs to be taken in forming contrast coefficient values following this rule: The last effect level is set equal to the negative sum of all other coefficients for that effect. To illustrate, let the 5 levels of Size be represented by the Greek letters \(\alpha_1\) - \(\alpha_5\). The last level is then set to \(\alpha_5 = -(\alpha_1+\alpha_2+\alpha_3+\alpha_4 )\). A contrast comparing “Large City” to “Rural” would then be:

\[ \begin{aligned} H_0 : \alpha_1 - \alpha_5 &= \alpha_1 - \left(-(\alpha_1+\alpha_2+\alpha_3+\alpha_4)\right) \\ &= \alpha_1 + \alpha_1+\alpha_2+\alpha_3+\alpha_4 \\ &= 2\alpha_1 + \alpha_2+\alpha_3+\alpha_4 \end{aligned} \]

The coefficients are, therefore, 2 1 1 1. Although these do not add to zero, as would be expected in other procedures, they are correct for this contrast.

Unfortunately, contrasts for interaction terms can become much more complex. In those cases, it may be better to construct a combined categorical variable for both Age and Size (in a data step, define it as Age_Size = Age||" "||Size;) and use that variable as the response in the model statement and as the loglin effect. While it will have many coefficients (24), the last level “Over 65 Rural” will be dropped and contrasts can be set up as shown above.
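The combined-variable approach might look like the following. This is a minimal sketch under stated assumptions: survey2 is a hypothetical data set name, and the catx function is swapped in for the || concatenation mentioned in the text so that trailing blanks in Age are removed before joining.

```sas
/* Sketch: combined Age-by-Size variable for simpler interaction contrasts. */
/* Assumes WORK.survey exists; survey2 is a hypothetical data set name.     */
data survey2;
    set survey;
    length Age_Size $ 40;               /* long enough for 'Over 65 Medium City' */
    Age_Size = catx(' ', Age, Size);    /* e.g. '18-35 Large City' */
run;

proc catmod data=survey2 order=data;
    weight count;
    model Age_Size = _response_;
    loglin Age_Size;                    /* 25 levels, 24 estimated parameters */
run;
quit;                                   /* Catmod is interactive; quit ends it */
```

With order=data the levels follow their order of appearance in the data, so “Over 65 Rural” is the last level and the one dropped. Any contrast statement added to this model would need 24 coefficients for Age_Size, formed by the same rule as for Size.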


proc catmod data=survey order=data;
    weight count;
    model Size*Age=_response_/pred=prob;
    loglin Size Age Size*Age;
    contrast 'Large vs Small City' Size 1 0 -1 0;
    contrast 'Medium vs Town' Size 0 1 0 -1;
    contrast 'Large vs Rural' Size 2 1 1 1;

run;

Data Summary
Response	Size*Age	Response Levels	25
Weight Variable	Count	Populations	1
Data Set	SURVEY	Total Frequency	372
Frequency Missing	0	Observations	45

Population Profiles
Sample	Sample Size
1	372

Response Profiles
Response	Size	Age
1	Large City	18-35
2	Large City	36-45
3	Large City	46-55
4	Large City	56-65
5	Large City	Over 65
6	Medium City	18-35
7	Medium City	36-45
8	Medium City	46-55
9	Medium City	56-65
10	Medium City	Over 65
11	Small City	18-35
12	Small City	36-45
13	Small City	46-55
14	Small City	56-65
15	Small City	Over 65
16	Town	18-35
17	Town	36-45
18	Town	46-55
19	Town	56-65
20	Town	Over 65
21	Rural	18-35
22	Rural	36-45
23	Rural	46-55
24	Rural	56-65
25	Rural	Over 65

Maximum Likelihood Analysis
Maximum likelihood computations converged.

Maximum Likelihood Analysis of Variance
Source	DF	Chi-Square	Pr > ChiSq
Size	4	68.69	<.0001
Age	4	9.74	0.0451
Size*Age	16	47.55	<.0001
Likelihood Ratio	0	.	.

Analysis of Maximum Likelihood Estimates
Parameter		Estimate	Standard Error	Chi- Square	Pr > ChiSq
Size	Large City	-0.0770	0.1855	0.17	0.6780
	Medium City	-0.3998	0.1492	7.18	0.0074
	Small City	-0.3275	0.1482	4.88	0.0271
	Town	0.0104	0.1341	0.01	0.9381
Age	18-35	0.1395	0.1257	1.23	0.2671
	36-45	0.2176	0.1285	2.87	0.0903
	46-55	0.0601	0.1266	0.23	0.6350
	56-65	0.1948	0.1289	2.28	0.1307
Size*Age	Large City 18-35	0.4335	0.2527	2.94	0.0862
	Large City 36-45	0.7784	0.2408	10.45	0.0012
	Large City 46-55	0.3411	0.2600	1.72	0.1896
	Large City 56-65	0.2064	0.2611	0.62	0.4293
	Medium City 18-35	0.00911	0.2697	0.00	0.9730
	Medium City 36-45	-0.8798	0.3513	6.27	0.0123
	Medium City 46-55	0.1939	0.2626	0.55	0.4602
	Medium City 56-65	0.3216	0.2474	1.69	0.1937
	Small City 18-35	-0.4687	0.3040	2.38	0.1232
	Small City 36-45	0.1464	0.2513	0.34	0.5600
	Small City 46-55	-0.2351	0.2900	0.66	0.4175
	Small City 56-65	0.6800	0.2264	9.02	0.0027
	Town 18-35	0.0407	0.2327	0.03	0.8612
	Town 36-45	0.5817	0.2073	7.87	0.0050
	Town 46-55	-0.2163	0.2543	0.72	0.3949
	Town 56-65	-1.2673	0.3453	13.47	0.0002

Contrasts of Maximum Likelihood Estimates
Contrast	DF	Chi-Square	Pr > ChiSq
Large vs Small City	1	0.82	0.3642
Medium vs Town	1	3.42	0.0645
Large vs Rural	1	13.43	0.0002

Maximum Likelihood Predicted Values for Response Functions
Function Number	Observed		Predicted		Residual
Function Number	Function	Standard Error	Function	Standard Error	Residual
1	-0.3514	0.299447	-0.3514	0.299447	0
2	0.071459	0.267432	0.071459	0.267432	0
3	-0.52325	0.315495	-0.52325	0.315495	0
4	-0.52325	0.315495	-0.52325	0.315495	0
5	-3.29584	1.01835	-3.29584	1.018201	-4.61E-8
6	-1.09861	0.3849	-1.09861	0.3849	0
7	-1.90954	0.535758	-1.90954	0.535759	0
8	-0.99325	0.370185	-0.99325	0.370185	0
9	-0.73089	0.33758	-0.73089	0.33758	0
10	-1.50408	0.451335	-1.50408	0.451336	0
11	-1.50408	0.451335	-1.50408	0.451336	0
12	-0.81093	0.346944	-0.81093	0.346944	0
13	-1.34993	0.424139	-1.34993	0.42414	0
14	-0.3001	0.29502	-0.3001	0.295021	0
15	-1.90954	0.535758	-1.90954	0.535759	0
16	-0.65678	0.329341	-0.65678	0.329341	0
17	-0.03774	0.27477	-0.03774	0.27477	0
18	-0.99325	0.370185	-0.99325	0.370185	0
19	-1.90954	0.535758	-1.90954	0.535759	0
20	-0.58779	0.322031	-0.58779	0.322031	0
21	0.071459	0.267432	0.071459	0.267432	0
22	-0.46262	0.309614	-0.46262	0.309614	0
23	-0.07696	0.277555	-0.07696	0.277556	0
24	0.200671	0.2595	0.200671	0.2595	0

Maximum Likelihood Predicted Values for Probabilities
Size	Age	Observed		Predicted		Residual
Size	Age	Probability	Standard Error	Probability	Standard Error	Residual
Large City	18-35	0.0511	0.0114	0.0511	0.0114	0
Large City	36-45	0.078	0.0139	0.078	0.0139	0
Large City	46-55	0.043	0.0105	0.043	0.0105	0
Large City	56-65	0.043	0.0105	0.043	0.0105	0
Large City	Over 65	0.0027	0.0027	0.0027	0.0027	-1E-10
Medium City	18-35	0.0242	0.008	0.0242	0.008	0
Medium City	36-45	0.0108	0.0053	0.0108	0.0053	0
Medium City	46-55	0.0269	0.0084	0.0269	0.0084	0
Medium City	56-65	0.0349	0.0095	0.0349	0.0095	0
Medium City	Over 65	0.0161	0.0065	0.0161	0.0065	0
Small City	18-35	0.0161	0.0065	0.0161	0.0065	0
Small City	36-45	0.0323	0.0092	0.0323	0.0092	0
Small City	46-55	0.0188	0.007	0.0188	0.007	0
Small City	56-65	0.0538	0.0117	0.0538	0.0117	0
Small City	Over 65	0.0108	0.0053	0.0108	0.0053	0
Town	18-35	0.0376	0.0099	0.0376	0.0099	0
Town	36-45	0.0699	0.0132	0.0699	0.0132	0
Town	46-55	0.0269	0.0084	0.0269	0.0084	0
Town	56-65	0.0108	0.0053	0.0108	0.0053	0
Town	Over 65	0.0403	0.0102	0.0403	0.0102	0
Rural	18-35	0.078	0.0139	0.078	0.0139	0
Rural	36-45	0.0457	0.0108	0.0457	0.0108	0
Rural	46-55	0.0672	0.013	0.0672	0.013	0
Rural	56-65	0.0887	0.0147	0.0887	0.0147	0
Rural	Over 65	0.0726	0.0135	0.0726	0.0135	0

Categorical Data Analysis in SAS

Statistical Programs, University of Idaho

2022-10-26
