Background: Counts are familiar outcomes in education research, including in studies that test interventions. Clustered data are also common in education research because data are often collected from students within classrooms or schools. A wide array of distributions and models can be used for clustered count data and can adequately capture the shape and nature of diverse count distributions (Cameron & Trivedi, 2013; Grimm & Stegmann, 2019; Hilbe, 2011). In this paper we focus on two specific models for multilevel analyses of counts, both forms of generalized linear mixed models (GLMMs) for discrete data: multilevel Poisson (P) and multilevel negative binomial (NB). We cover the theoretical background of both approaches and present a series of guiding questions to help researchers determine which model may be most appropriate for their particular analyses and research questions. Purpose/objective: Our paper focuses on two comparative applications (P and NB) of multilevel models for counts. Our objectives are to (1) provide theoretical background; (2) clarify assessment of the degree of clustering for each model, as well as parameter interpretation; (3) review guiding questions that assist researchers in model comparison and choice; (4) highlight software issues likely to be encountered when analyzing clustered count data; and (5) briefly discuss possible extensions or other analysis options. Application: Our motivating example is based on the ECLS-K (https://nces.ed.gov/ecls/kindergarten.asp) and was simulated/modified for demonstration purposes. These data correspond to a study in which proficiency counts were collected for a sample of n = 12,938 kindergarten children sampled from within J = 720 early-grade schools.
Proficiency ("profcount") was scored as the number of correct responses to a set of twelve early reading and early math skills items, with the total possible count ranging from 0 to 12 (M = 2.05 items, S = 1.85, $S^2 = 3.44$). Child-level predictors of proficiency included "male" (dichotomous, 50% male), highest level of mother's education ("c_momed," centered and continuous, Md = 5 (at least some college)), and food insecurity ("foodinsec," dichotomous, 8% experiencing food insecurity). School-level predictors of proficiency included "public" (78% public schools) and a count of crime/conflict issues in the surrounding community ranging from 0 to 21 ("nbhoodprobs," M = 6.29 items, S = 3.39, $S^2 = 11.46$). Methods: The Poisson distribution assumes equidispersion, which implies that the mean of the outcome (conditional on the covariates in the model) is equal to its variance: $E(y_i) = \lambda_i$ and $V(y_i) = \lambda_i$. Equidispersion is violated when the variance exceeds the mean, a condition commonly known as overdispersion. The NB distribution we discuss here, also referred to as NB2 (Cameron & Trivedi, 2013; Hilbe, 2011), relaxes this assumption through the use of a Poisson-Gamma mixture, which captures overdispersion in the counts. As a Poisson-Gamma mixture, the NB model has the same mean structure as the Poisson but has conditional variance $\lambda_i + \lambda_i^2/\theta$. The extra dispersion term, $\lambda_i^2/\theta$, tends to 0 as $\theta$ tends to infinity, representing an inverse relationship between $\theta$ and dispersion (Friendly & Myer, 2016; Hilbe, 2011).
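The NB2 mean and variance stated above can be checked numerically. The following Python sketch sums the NB2 probability mass function directly; it is an illustration only and not part of the original analysis. The function names and the values of theta are assumptions chosen for demonstration, and the mean of 2.05 simply borrows the reported sample mean of profcount.

```python
import math


def nb2_pmf(y: int, lam: float, theta: float) -> float:
    """NB2 probability of count y, with mean lam and Gamma shape theta."""
    log_p = (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + lam))
        + y * math.log(lam / (theta + lam))
    )
    return math.exp(log_p)


def nb2_moments(lam: float, theta: float, y_max: int = 500):
    """Mean and variance obtained by summing the pmf over 0..y_max.

    y_max is an assumed truncation point; it only needs to be large
    enough that the omitted tail probability is negligible.
    """
    probs = [nb2_pmf(y, lam, theta) for y in range(y_max + 1)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
    return mean, var


lam = 2.05  # reported sample mean of profcount, used here for illustration
for theta in (1.0, 5.0, 1e4):  # hypothetical values of theta
    mean, var = nb2_moments(lam, theta)
    # The summed variance should agree with lam + lam**2 / theta.
    print(theta, round(mean, 4), round(var, 4), round(lam + lam**2 / theta, 4))
```

As theta grows, the summed variance approaches the mean, which is the Poisson limit described in the text.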
To capture a direct association between the dispersion parameter and the variance, the negative binomial model can be reparameterized so that $\alpha = 1/\theta$ scales the second (quadratic) term; under this parameterization the Poisson-Gamma mixture has variance $\lambda_i + \alpha\lambda_i^2$. As $\alpha$ increases, so does the additional variance; as $\alpha$ tends toward 0, the Gamma distribution contributes no additional dispersion and the variance of the Poisson-Gamma mixture converges to that of the Poisson. Interest focuses on estimating $\alpha$, called the heterogeneity or NB dispersion parameter under this parameterization (Fox, 2017; Hilbe, 2011), and on assessing whether including it improves the fit of the model to the data relative to the Poisson. A multilevel approach accounts for variability that arises from clustered or correlated count data. The degree of clustering is an important issue for multilevel models in general, and we present recent advances in variance partitioning for multilevel count models, including estimation of the intraclass correlation coefficient (ICC) following Leckie et al. (2020). We propose four guiding questions for comparing models and deciding between multilevel P and NB analyses: (1) To what degree is clustering present in the data, and does accounting for clustering improve model fit? (2) After adjusting for clustering, is the assumption of equidispersion reasonable, or should this assumption be relaxed? (3) Does inclusion of covariates improve model fit, and should any level-one predictors be treated as random or fixed? and (4) How are parameter estimates interpreted for the final model (multilevel P or NB)? Results: We focus here on specific aspects of the guiding questions; a full description is included in the paper and our presentation.
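The variance partitioning mentioned in the Methods can be sketched for a two-level random-intercept model without covariates. The Python sketch below assumes a log link, a normally distributed school intercept with variance sigma2_u, and the NB2 variance function; the expressions follow from standard lognormal moments, in the spirit of Leckie et al. (2020), and are not copied from the paper. The function name and all input values are hypothetical.

```python
import math


def icc_count(beta0: float, sigma2_u: float, alpha: float = 0.0) -> float:
    """ICC for a random-intercept count model with no covariates.

    Assumes log(mu_ij) = beta0 + u_j with u_j ~ N(0, sigma2_u).
    alpha = 0 gives the Poisson case; alpha > 0 gives NB2.
    """
    # Marginal mean: E[mu] = exp(beta0 + sigma2_u / 2).
    marginal_mean = math.exp(beta0 + sigma2_u / 2.0)
    # Level-2 (between-school) variance: Var(mu).
    level2_var = marginal_mean ** 2 * (math.exp(sigma2_u) - 1.0)
    # Level-1 variance: E[mu + alpha * mu^2].
    level1_var = marginal_mean + alpha * marginal_mean ** 2 * math.exp(sigma2_u)
    return level2_var / (level2_var + level1_var)


# Hypothetical parameter values, for illustration only.
icc_p = icc_count(beta0=0.5, sigma2_u=0.3)
icc_nb = icc_count(beta0=0.5, sigma2_u=0.3, alpha=0.15)
print(round(icc_p, 3), round(icc_nb, 3))
```

Holding the intercept and intercept variance fixed, a positive alpha adds level-1 variance and therefore lowers the ICC relative to the Poisson case. In practice each model is fitted separately, so the two ICCs also differ because the estimated parameters differ.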
Selected model results for our demonstration data are presented in Table 1 for two-level random-intercept P and NB analyses. Variance partitioning based on Leckie et al. (2020) yields ICC(P) = 0.37 and ICC(NB) = 0.28. Once clustering, itself a source of overdispersion, is accommodated, the NB model, which includes extra dispersion as a random effect at level 1, indicates a somewhat lower estimated correlation between students from the same school. The extra dispersion parameter for the NB model is statistically different from 0 (Wald's Z = 0.15/0.01 = 15.0), and thus the equidispersion assumption is not upheld, even after all predictors are included in the model (Z = 0.12/0.01 = 12.0). A likelihood ratio test comparing the full NB model with the corresponding nested P model, $\chi^2_1 = 211.61$, p < 0.0001, favors the NB model; this model also has the smallest AIC of the models presented. Summary/Discussion: Multilevel NB models are becoming more common in the education research literature. The four questions posed here can help guide education researchers in their choice of multilevel count models, particularly between P and NB analyses. Our paper and presentation will include further details on the guiding questions and model parameters, including interpretation of incidence rate ratios, $\exp(\gamma)$.
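The test statistics reported in the Results can be reproduced with a few lines of arithmetic. The Python sketch below uses only the standard library; the function names are assumptions made for illustration, and the coefficient passed to the incidence rate ratio helper is hypothetical, since no coefficient values are given in this abstract.

```python
import math


def wald_z(estimate: float, std_error: float) -> float:
    """Wald statistic: estimate divided by its standard error."""
    return estimate / std_error


def lrt_pvalue_df1(chi2_stat: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 df.

    Uses the identity P(chi2_1 > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2_stat / 2.0))


def incidence_rate_ratio(gamma: float) -> float:
    """Multiplicative change in the expected count per unit of a predictor."""
    return math.exp(gamma)


# Values reported in the text.
print(wald_z(0.15, 0.01))      # dispersion parameter, model without predictors
print(wald_z(0.12, 0.01))      # dispersion parameter, full model
print(lrt_pvalue_df1(211.61))  # NB versus nested P, 1 df

# Hypothetical coefficient, for illustration only.
print(incidence_rate_ratio(-0.10))
```

An incidence rate ratio below 1 indicates a lower expected count for each one-unit increase in the predictor, holding the other predictors and the random effect constant.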