Categorical Frequency Data

Procedures Applicable to Categorical Frequency Data

This section contains the following items. Details for each are given below the list.
	°	Binomial Probabilities
	°	Fitting an Observed Frequency Distribution to the Closest Poisson Distribution
	°	For Sequential Sampling: Pascal (Negative Binomial) Probabilities
	°	Chi-Square "Goodness of Fit" Test
	°	Kolmogorov-Smirnov One-Sample Test
	°	For a 2x2 Table of Cross-Categorized Frequency Data: Phi Coefficient; Chi-Square Test of Association; Fisher Exact Probability Test; Rates, Risk Ratio, Odds, Odds Ratio, and Log Odds
	°	Fisher Exact Probability Test for Tables Larger than 2x2
	°	Chi-Square, Cramer's V, and Lambda for a Rows by Columns Contingency Table
	°	Log-Linear Analysis for a 3-Way Contingency Table
	°	Kappa as a Measure of Concordance in Categorical Sorting
	°	McNemar's Test for Correlated Proportions in the Marginals of a 2x2 Contingency Table

Binomial Probabilities, as calculated or estimated according to one or more of the following methods:

	»Exact binomial probabilities
	»Approximation via the normal distribution
	»Approximation via the Poisson distribution
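
As an illustration of how the three methods relate, here is a minimal Python sketch. It is not the code behind this unit; the function names are hypothetical, and the normal approximation is assumed to use the usual continuity correction of 0.5.

```python
from math import comb, erf, exp, factorial, sqrt

def binom_exact(k, n, p):
    """Exact probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_upper_tail_exact(k, n, p):
    """Exact probability of k or more successes in n trials."""
    return sum(binom_exact(i, n, p) for i in range(k, n + 1))

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def binom_upper_tail_normal(k, n, p):
    """Normal approximation to P(X >= k).

    Assumption: a continuity correction of 0.5 is applied.
    """
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return 1 - normal_cdf((k - 0.5 - mean) / sd)

def binom_poisson(k, n, p):
    """Poisson approximation to P(X = k), using the rate n*p."""
    lam = n * p
    return exp(-lam) * lam**k / factorial(k)
```

The normal approximation tends to work best when n*p and n*(1-p) are both reasonably large; the Poisson approximation is better suited to large n with small p.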

Fitting an Observed Frequency Distribution to the Closest Poisson Distribution. The programming for this unit will find the Poisson distribution that most closely fits an observed frequency distribution.
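
The criterion this unit uses for "most closely fits" is not stated here. The sketch below assumes the common approach of setting the Poisson mean equal to the mean of the observed distribution (the maximum-likelihood estimate) and then computing expected frequencies. The function name and data layout are hypothetical.

```python
from math import exp, factorial

def fit_poisson(observed):
    """Fit a Poisson distribution to observed frequencies.

    observed[k] is the number of sampling units in which the event
    occurred exactly k times (assumed layout, for illustration only).
    Returns the estimated mean and the expected frequency for each k.
    """
    total = sum(observed)
    # Assumption: the fitted mean is the mean of the observed distribution.
    lam = sum(k * f for k, f in enumerate(observed)) / total
    expected = [total * exp(-lam) * lam**k / factorial(k)
                for k in range(len(observed))]
    return lam, expected
```

Note that the expected frequencies cover only the listed values of k, so they will sum to slightly less than the total; the remainder belongs to the upper tail.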

For Sequential Sampling: Pascal (Negative Binomial) Probabilities. For a situation in which independent binomial events are randomly sampled in sequence, this unit will calculate (a) the probability that you will end up with exactly k instances of the outcome in question, with the final (kth) instance occurring on trial N; and (b) the probability that you will have to sample at least N events before finding the kth instance of the outcome.
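
The two probabilities can be sketched in Python as follows. This is an illustration, not the unit's own code: the function names are hypothetical, p denotes the probability of the outcome on any single trial, and (b) is read as "the kth instance occurs on trial N or later."

```python
from math import comb

def pascal_exact(k, n, p):
    """(a) Probability that the k-th instance occurs exactly on trial n."""
    if n < k:
        return 0.0
    # k-1 instances somewhere in the first n-1 trials, then one on trial n.
    return comb(n - 1, k - 1) * p**k * (1 - p)**(n - k)

def pascal_at_least(k, n, p):
    """(b) Probability that the k-th instance occurs on trial n or later.

    Equivalent to having fewer than k instances in the first n-1 trials.
    """
    return sum(comb(n - 1, i) * p**i * (1 - p)**(n - 1 - i)
               for i in range(k))
```

For example, with p = 0.5 the probability that the first instance occurs exactly on trial 3 is 0.125, and the probability that it occurs on trial 3 or later is 0.25.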

Chi-Square "Goodness of Fit" Test for up to 8 mutually exclusive categories; with an option for estimating the relevant probability, in the case of small samples, via Monte Carlo simulation of the multinomial sampling distribution. [See also: "The Power of the Chi-Square 'Goodness of Fit' Test," under "Miscellanea," which pertains to the questionable common practice of accepting the null hypothesis upon failing to find a significant result in a one-dimensional chi-square test.]
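
The Monte Carlo option can be pictured along these lines. This is a rough sketch under stated assumptions, not the unit's implementation: the function names, the number of simulated samples, and the fixed seed are all illustrative choices.

```python
import random

def chi_square_stat(observed, expected_props):
    """Pearson chi-square for observed counts against expected proportions."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, expected_props))

def monte_carlo_p(observed, expected_props, n_sims=10000, seed=0):
    """Estimate P(chi-square >= observed value) by multinomial simulation.

    Assumption: each simulated sample has the same size as the observed
    one and is drawn using the null-hypothesis proportions.
    """
    rng = random.Random(seed)
    n = sum(observed)
    categories = range(len(observed))
    obs_stat = chi_square_stat(observed, expected_props)
    hits = 0
    for _ in range(n_sims):
        sim = [0] * len(observed)
        for c in rng.choices(categories, weights=expected_props, k=n):
            sim[c] += 1
        # Small tolerance so that ties with the observed value count as hits.
        if chi_square_stat(sim, expected_props) >= obs_stat - 1e-12:
            hits += 1
    return hits / n_sims
```

The estimate is the proportion of simulated samples whose chi-square value is at least as large as the observed one, so its precision depends on the number of simulations.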

Kolmogorov-Smirnov One-Sample Test. A "goodness of fit" test suitable for small sample sizes.

For a 2x2 Table of Cross-Categorized Frequency Data:

Version 1
·Phi Coefficient of Association
·Chi-Square Test of Association
·Fisher Exact Probability Test

Version 2
Same as Version 1, but with provision for calculating Rates, Risk Ratio, Odds, Odds Ratio, and Log Odds.
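
For orientation, the quantities named in the two versions can be computed as in the following Python sketch. The cell labels a, b, c, d, the row and column layout, and the function names are assumptions made for the example; the chi-square shown is the uncorrected Pearson value, and the Fisher probability is the two-tailed version that sums all tables no more probable than the observed one.

```python
from math import comb, log, sqrt

def two_by_two(a, b, c, d):
    """Summaries for the table [[a, b], [c, d]].

    Assumed layout: rows are the two groups; the first column counts
    cases where the outcome is present, the second where it is absent.
    """
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    phi = (a * d - b * c) / sqrt(margins)
    chi2 = n * phi ** 2              # Pearson chi-square, no correction
    rate1, rate2 = a / (a + b), c / (c + d)
    odds1, odds2 = a / b, c / d      # undefined when b or d is zero
    return {
        "phi": phi,
        "chi2": chi2,
        "rates": (rate1, rate2),
        "risk_ratio": rate1 / rate2,
        "odds": (odds1, odds2),
        "odds_ratio": odds1 / odds2,
        "log_odds_ratio": log(odds1 / odds2),
    }

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact probability with all margins held fixed."""
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of a table whose top-left cell is x.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))
```

For the table with a = 3, b = 1, c = 1, d = 3, this gives phi = 0.5, chi-square = 2.0, a risk ratio of 3, and an odds ratio of 9.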

Fisher Exact Probability Test for Tables Larger than 2x2. Available for 2x3, 2x4, and 3x3 tables.

Chi-Square, Cramer's V, and Lambda for a rows by columns contingency table containing up to 5 rows and 5 columns.
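
A minimal Python sketch of these three measures follows. It is illustrative only: the function names are hypothetical, and lambda is shown in one direction (predicting the column category from the row category); the unit may report other variants.

```python
from math import sqrt

def chi_square_rc(table):
    """Pearson chi-square, degrees of freedom, and Cramer's V.

    table is a list of rows, each a list of cell frequencies.
    """
    rows, cols = len(table), len(table[0])
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    n = sum(row_tot)
    chi2 = 0.0
    for i in range(rows):
        for j in range(cols):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    df = (rows - 1) * (cols - 1)
    v = sqrt(chi2 / (n * min(rows - 1, cols - 1)))
    return chi2, df, v

def lambda_predict_columns(table):
    """Goodman-Kruskal lambda for predicting column from row.

    Proportional reduction in prediction errors when the row category
    is known, relative to always guessing the most frequent column.
    """
    n = sum(sum(r) for r in table)
    largest_col = max(sum(c) for c in zip(*table))
    return (sum(max(r) for r in table) - largest_col) / (n - largest_col)
```

For a table in which rows and columns are perfectly associated, such as [[10, 0], [0, 10]], both Cramer's V and lambda equal 1; for [[5, 5], [5, 5]] both equal 0.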

Log-Linear Analysis for a 3-Way Contingency Table. Log-linear analysis is a version of chi-square analysis in which the relevant values are calculated by way of weighted natural logarithms. The first advantage of this procedure is that it is easier to program in the case of a complex 3-way contingency table, since it allows all chi-square values to be derived through simple addition and subtraction of various combinations of the weighted logarithms. The second advantage is that the chi-square values thus derived are linear (additive), which allows for more complex analyses not readily available through the conventional chi-square computational procedure. When a chi-square value is calculated by the log-linear method, it is typically designated as G² as an indication of its computational origin.
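
To make the "addition and subtraction of weighted logarithms" concrete, the sketch below computes G² for one particular model, complete independence of the three variables, in two ways: from sums of terms of the form x·ln(x), and directly from observed and expected frequencies. The choice of model, the function names, and the table layout are assumptions for illustration; the unit itself evaluates more than this one model.

```python
from math import log

def xlogx(x):
    """Weighted logarithm x * ln(x), taken as 0 when x is 0."""
    return x * log(x) if x > 0 else 0.0

def g2_independence(table):
    """G-squared for complete independence in a 3-way table.

    table[i][j][k] is the frequency in cell (i, j, k).
    Returns (additive form, direct form); the two should agree.
    """
    I, J, K = len(table), len(table[0]), len(table[0][0])
    idx = [(i, j, k) for i in range(I) for j in range(J) for k in range(K)]
    a, b, c = [0] * I, [0] * J, [0] * K   # one-way marginal totals
    for i, j, k in idx:
        a[i] += table[i][j][k]
        b[j] += table[i][j][k]
        c[k] += table[i][j][k]
    n = sum(a)
    # Additive form: only sums and differences of x*ln(x) terms.
    additive = 2 * (sum(xlogx(table[i][j][k]) for i, j, k in idx)
                    - sum(map(xlogx, a)) - sum(map(xlogx, b))
                    - sum(map(xlogx, c)) + 2 * xlogx(n))
    # Direct form: 2 * sum of O * ln(O / E), with E = a*b*c / n^2.
    direct = 2 * sum(
        table[i][j][k] * log(table[i][j][k] * n * n / (a[i] * b[j] * c[k]))
        for i, j, k in idx if table[i][j][k] > 0)
    return additive, direct
```

The two forms are algebraically identical, which is why the log-linear approach reduces the computation to adding and subtracting weighted logarithms of cell and marginal totals.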

Kappa as a Measure of Concordance in Categorical Sorting. Calculates unweighted kappa and kappa with linear and quadratic weightings, along with some other measures of concordance.
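
The three versions of kappa can be sketched in Python as below. This is an illustration rather than the unit's code: the function name and table layout are hypothetical, the categories are assumed to be ordered for the weighted versions, and the weights are expressed as disagreement weights scaled to the range 0 to 1.

```python
def kappa(table, weighting=None):
    """Cohen's kappa for a square table of sorting frequencies.

    table[i][j] is the number of items placed in category i by the
    first sorter and category j by the second (assumed layout).
    weighting is None (unweighted), "linear", or "quadratic".
    """
    if weighting not in (None, "linear", "quadratic"):
        raise ValueError("weighting must be None, 'linear', or 'quadratic'")
    k = len(table)
    n = sum(sum(r) for r in table)
    row = [sum(r) / n for r in table]
    col = [sum(c) / n for c in zip(*table)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, up to 1 at the corners.
        if weighting is None:
            return 0.0 if i == j else 1.0
        d = abs(i - j) / (k - 1)
        return d if weighting == "linear" else d * d

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(weight(i, j) * table[i][j] / n for i, j in cells)
    expected = sum(weight(i, j) * row[i] * col[j] for i, j in cells)
    return 1 - observed / expected
```

With only two categories the linear and quadratic weightings reduce to the unweighted value; the weighted versions differ only when partial disagreements between non-adjacent categories are possible.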

McNemar's Test for Correlated Proportions in the Marginals of a 2x2 Contingency Table. Assesses the significance of the difference between two correlated proportions, such as might be found in the case where the two proportions are based on the same sample of subjects or on matched-pair samples.
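
The test depends only on the two discordant cells of the 2x2 table, as the following Python sketch shows. The labels b and c for those cells and the function name are assumptions for the example; the sketch returns the uncorrected chi-square, the continuity-corrected chi-square, and a two-tailed exact binomial probability, without claiming which of these the unit reports.

```python
from math import comb

def mcnemar(b, c):
    """McNemar's test from the two discordant cell frequencies.

    b: pairs positive on the first measure only
    c: pairs positive on the second measure only
    Requires b + c > 0.
    """
    n = b + c
    chi2 = (b - c) ** 2 / n                 # 1 degree of freedom
    chi2_cc = (abs(b - c) - 1) ** 2 / n     # with continuity correction
    # Exact test: under the null hypothesis each discordant pair is
    # equally likely to fall in either cell (binomial with p = 0.5).
    tail = sum(comb(n, i) for i in range(min(b, c) + 1)) / 2 ** n
    p_exact = min(1.0, 2 * tail)
    return chi2, chi2_cc, p_exact
```

The concordant cells do not enter the calculation, since pairs that agree on both measures carry no information about the difference between the two proportions.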