
 

Substance Abuse in States and Metropolitan Areas:
Model Based Estimates from the 1991-1993 National Household Surveys on Drug Abuse

Chapter 4

4. Method Evaluation:

4.1 Evaluation Strategies

4.2 Results of the Evaluations

4.3 Summary of the Evaluations

 

  

4. Method Evaluation

In any small area estimation (SAE) project, questions arise as to how well the model describes the outcome variables; in this case, how effectively the small area estimation models describe substance abuse in the States and MSAs. When producing small area estimates, there usually are no "gold standards" (i.e., direct estimates with known reliability and validity) that can be used to evaluate the overall methodology. In fact, if such estimates were available, policy makers could use them and would not need small area estimates. Thus, in the absence of such gold standards, one uses multiple evaluation procedures which, when considered jointly, give an overall indication of the quality of the small area estimates. In this small area estimation study, a large variety of evaluations were conducted. The following summarizes some of these evaluations. (footnote#17)

This SAE study developed a method that 1) reduced bias associated with synthetic/regression estimates which often fail to reflect the full range of variability across small areas, 2) made maximum use of the information that was collected in the NHSDA, and 3) produced estimates that summed to the NHSDA direct estimates for the nation. The evaluations discussed in this chapter found that:

  • The statistical evaluations based on examining the agreement between direct NHSDA estimates and the SAE estimates within evaluation subgroups (see below for an explanation of how this was done), indicated that the SAE estimates had moderate to high correlations with direct estimates, that there were few significant differences between the direct estimates and SAE estimates, and that the SAE estimates adequately reflected the range of prevalence in substance abuse.


  • The SAE national estimates closely track the NHSDA direct estimates.

     

  • Statistical evaluations based on comparisons to external data on cigarette and alcohol use indicated that the SAE estimates were at about the same level as the external data, that the correlations with the external data were moderate to high, and that the estimates were good to very good at reflecting the range of prevalence of cigarette and alcohol use across the States.


  • Statistical evaluations based on comparisons to external data from administrative data on arrest and past year treatment for substance abuse indicated that the SAE estimates for States were lower than those based on administrative data (as are the NHSDA estimates at the national level) and had only poor to fair correlations with the corresponding administrative data.


  • Substance abuse characteristics that have higher prevalence were more adequately reflected by the modeling than those characteristics with lower levels of prevalence.


4.1 Evaluation Strategies

Three strategies were used to evaluate the estimates:

1) Goodness-of-fit tests

2) Comparisons to external measures of substance abuse

3) Comparisons of the final SAE model to other models

1) Goodness-of-fit tests for the final SAE model

Statistical tests of how well models predict the outcome variables (substance abuse in this case) are termed goodness-of-fit tests. Different goodness-of-fit tests detect different types of problems in the estimates, i.e., different lack-of-fit attributes. (footnote#18)  Two sets of tests were completed, each of which used three goodness-of-fit measures (the correlation, the χ2 probability, and the range ratio). The first set evaluated the final SAE model by partitioning the entire 1991-1993 NHSDA sample into evaluation subgroups, (footnote#19)  calculating both direct survey-based and model-based estimates for each evaluation subgroup, and comparing these using the three goodness-of-fit statistics. The evaluation subgroups are constructed so that the sample size in each is large enough to produce reliable direct estimates. This can be thought of as using the sample to create a set of artificial subpopulations that are predicted to have similar rates of substance abuse within each subgroup and different rates across subgroups. The predicted prevalence estimates are then compared to those actually observed in the NHSDA for the artificial subpopulations.
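The idea of evaluation subgroups can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that subgroups are formed by sorting respondents on their model-predicted probability and cutting the sorted list into groups of nearly equal size, and it ignores survey weights. The function name `evaluation_subgroups` and the toy data are hypothetical; the actual procedure is the one described in footnote 19, which is not reproduced here.

```python
from statistics import mean

def evaluation_subgroups(observed, predicted, n_groups):
    """Sort respondents by predicted probability and split them into
    n_groups groups of (nearly) equal size.

    observed  -- 0/1 outcome reported by each respondent
    predicted -- model-predicted probability for the same respondent
    Returns one (direct_estimate, model_estimate) pair per group.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not 1 <= n_groups <= len(observed):
        raise ValueError("n_groups must be between 1 and the sample size")
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    pairs = []
    for g in range(n_groups):
        # Integer arithmetic spreads any remainder across the groups.
        start = g * len(order) // n_groups
        stop = (g + 1) * len(order) // n_groups
        members = order[start:stop]
        direct = mean(observed[i] for i in members)   # observed rate
        model = mean(predicted[i] for i in members)   # mean predicted rate
        pairs.append((direct, model))
    return pairs

# Hypothetical toy data: eight respondents split into two subgroups.
observed = [0, 0, 1, 0, 1, 1, 0, 1]
predicted = [0.1, 0.2, 0.3, 0.2, 0.7, 0.8, 0.6, 0.9]
pairs = evaluation_subgroups(observed, predicted, 2)
```

Each pair can then be fed to whatever goodness-of-fit statistic is being computed across subgroups.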

The second set of goodness-of-fit tests was conducted using what is termed a cross-validation approach. Under this approach, the models are first fit to part of the data, and the resulting prediction equations are then used to make estimates for the remainder of the full data set, that is, the part not used initially to fit the model. (footnote#20) This sample splitting is done multiple times, and each remainder subsample is used separately to form the evaluation subgroups and calculate the associated goodness-of-fit statistics. Finally, the multiple goodness-of-fit measures are averaged to form a single measure. (footnote#21)
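The split, fit, predict, and average cycle of cross-validation can be sketched as follows. This is a toy illustration, not the study's procedure: the "model" is simply the observed rate within each demographic cell, the fit measure is a mean absolute gap rather than the correlation, χ2 probability, or range ratio, and the names `fit_cell_rates`, `mean_abs_gap`, and `cross_validate` are hypothetical.

```python
import random
from statistics import mean

def fit_cell_rates(records):
    """Toy stand-in for model fitting: the observed rate in each cell."""
    totals, counts = {}, {}
    for cell, outcome in records:
        totals[cell] = totals.get(cell, 0) + outcome
        counts[cell] = counts.get(cell, 0) + 1
    return {cell: totals[cell] / counts[cell] for cell in totals}

def mean_abs_gap(model, holdout):
    """Toy fit measure: mean |direct - predicted| rate over held-out cells.
    Returns None when no held-out cell can be predicted by the model."""
    direct = fit_cell_rates(holdout)
    gaps = [abs(direct[c] - model[c]) for c in direct if c in model]
    return mean(gaps) if gaps else None

def cross_validate(records, n_splits, holdout_fraction, seed=0):
    """Average the fit measure over repeated random fit/holdout splits."""
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    n_holdout = max(1, int(len(records) * holdout_fraction))
    scores = []
    for _ in range(n_splits):
        shuffled = list(records)
        rng.shuffle(shuffled)
        holdout, fit_part = shuffled[:n_holdout], shuffled[n_holdout:]
        model = fit_cell_rates(fit_part)      # fit on one part of the data
        score = mean_abs_gap(model, holdout)  # evaluate on the remainder
        if score is not None:
            scores.append(score)
    if not scores:
        raise ValueError("no split produced a usable fit measure")
    return mean(scores)

# Hypothetical records: (demographic cell, 0/1 outcome).
records = [("a", 1)] * 10 + [("a", 0)] * 10 + [("b", 1)] * 5 + [("b", 0)] * 15
average_gap = cross_validate(records, n_splits=20, holdout_fraction=0.25)
```

Because the held-out records never influence the fitted rates, the averaged measure is not flattered by the model having already seen the data it is judged on.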

2) Comparisons to external measures of substance abuse

The second evaluation strategy that was used was to compare the SAE estimates to indicators of substance abuse from external sources. Three different comparisons were made:

  • Comparisons of the SAE estimates of use of alcohol and cigarettes to similar estimates from the Behavioral Risk Factor Surveillance System (BRFSS), which is a telephone survey conducted in all 50 States under cooperative agreements with the Centers for Disease Control and Prevention.

  • Comparison of the SAE estimates for drug treatment to data from the NDATUS study.

     

  • Comparison of the SAE estimates of arrest in the past year to UCR data on arrests.

Although external data were available for only a subset of the estimates from the SAE study, these comparisons were selected because they provided another vehicle for assessing the quality of the entire estimation methodology.

    3) Comparisons of the final SAE model to other models

    Another approach that was used to evaluate the full SAE model was to compare it to other models using the cross-validation goodness-of-fit tests and the comparisons to external data. Six models were investigated including:

    • The final SAE model: The composite estimator which is approximated by a weighted combination of an indirect logistic regression estimator and a local area effect that is a function of the direct survey estimates. The indirect logistic regression estimator was constructed by fitting models which contained county and block group demographic characteristics which were hypothesized to have a relationship to the substance abuse outcome variable. These are listed in Exhibit 2.1. The block group demographic characteristics comprised two types: basic demographic variables (gender, age, and race/ethnicity) and the other demographic variables. The local area effect is a function of the actual NHSDA estimates for the local area. Because the 1991-1993 NHSDA included very large samples for six large US cities, separate models were fit for these six large cities and the remainder of the nation.

    • The big-city-subsample model is similar to the full SAE in that it is a composite estimator that includes both an indirect and direct component; however, in this case a subsample of the respondents from the six large cities was selected and pooled with the full sample from the remainder of the nation before fitting the models. This model was investigated because it reflects the composition of continuing NHSDA surveys which do not have large oversamples from these big cities.

    • The SAE indirect estimator drops the local area direct effects from the full SAE model. This model was fit to evaluate the usefulness of using the composite estimator. Differences between this model and the full SAE model give one an idea as to whether adding the local area effects to the model improves its ability to predict substance abuse. Also, comparing this model with the next simpler model, the county demographic model (below), measures the impact of the addition of the other demographic variables.

    Three simpler models were also used: a basic demographic model, a demographic model that used the six-large-city/non-large-city split, and a model that combined the basic demographics with county-level predictors.

  • The basic demographic model uses only demographic predictors of gender, age and race/ethnicity, and applies the coefficients for the significant demographic predictors (main effects and their interactions) to population distributions in the block groups to estimate the prevalence in the block groups. Separate models were not fit for the six large cities and the remainder of the nation. This model is analogous to the simple synthetic estimators and was constructed to investigate the gains that come from using additional predictors (other than demographic characteristics) in the logistic regression models.

  • The six large city/non-large city demographic model is analogous to the basic demographic model in that it uses only the demographic predictors gender, age, and race/ethnicity. However, in this model, the six-big-city/non-six-big city sample split was used to determine if it was possible to improve these estimates by reflecting the fact that there is a different association between substance abuse and demographic composition in the six-big-cities and the remainder of the nation. Comparing this model to the basic demographic model provides an indication of whether or not simple demographic models could be improved if they were fit within areas that were likely to have divergent associations between demographic characteristics and substance abuse.

  • The county demographic model adds county level predictors to the six-big-city/non-big-city demographic model. This model was investigated to determine if it was possible to improve over simple demographic indirect estimators by including county level indicators of substance abuse. This type of model could be a good candidate for monitoring drug use over time because it is simpler than the full SAE model and because it may be possible to monitor changes over time by observing changes in the county level indicators of drug use.

Using these three strategies (goodness-of-fit tests, comparisons to external data, and comparisons to other models), we evaluated the SAE model by: 1) carrying out goodness-of-fit tests for the final SAE model using the evaluation subgroups, 2) conducting cross-validation goodness-of-fit tests on the final SAE model and several other models, and 3) comparing estimates from the SAE model and the other types of models to external data by calculating rank correlations and range ratios for the various sets of estimates. The results are given in the next section.
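The composite estimators compared in this section combine an indirect (model-based) estimate with information from the direct survey estimate for the local area. A minimal sketch of that idea follows. It simplifies the combination to a convex weight between the two estimates, and the weighting rule `sample_size_weight`, with its constant `k`, is a hypothetical illustration; the weights actually used in the final SAE model are not given in this chapter.

```python
def sample_size_weight(n_local, k):
    """Hypothetical weighting rule: the larger the local sample, the more
    weight the direct estimate receives. k sets how fast the weight nears 1."""
    if n_local < 0 or k <= 0:
        raise ValueError("n_local must be >= 0 and k must be > 0")
    return n_local / (n_local + k)

def composite_estimate(direct, indirect, weight):
    """Weighted combination of a direct and an indirect estimate.

    weight is the share given to the direct survey estimate (0 to 1).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie between 0 and 1")
    return weight * direct + (1.0 - weight) * indirect

# Hypothetical area: 300 local respondents, direct rate 0.60, model rate 0.50.
weight = sample_size_weight(n_local=300, k=100)  # 0.75
estimate = composite_estimate(direct=0.60, indirect=0.50, weight=weight)
```

With a weight of 0 the sketch reduces to the indirect estimator alone, which mirrors the comparison between the full SAE model and the SAE indirect estimator.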

      

    4.2 Results of the Evaluations

    Goodness-of-fit tests for the final SAE model using evaluation subgroups:

    Exhibit 4.1 summarizes the evaluations of the final SAE model based on the Goodness-of-fit tests that were constructed by dividing the sample into evaluation subgroups. Three statistics are presented:

     

  • The correlation between the sample mean of the predicted values and the associated NHSDA direct estimates for each of the evaluation subgroups. Values closer to 1 are better than smaller values.
  • The Chi-Square probabilities from a comparison of the observed (direct estimates) and predicted estimates for each of the evaluation subgroups. (footnote#22)  Based on statistical theory, with a large number of tests as is shown in Exhibit 4.1, if the distribution of the observed and predicted is the same, about half of the χ2 probabilities would be above 0.5 and half below.

      

Exhibit 4.1 - Summary of Evaluation of the Final SAE Model Composite Based on Forming Evaluation Subgroups and Comparing the Model Based Estimates with the Direct Survey Estimates. Correlations, Chi-square Probabilities*, and Range Ratios

NHSDA SAE Outcome Measure                  Age Group
  Statistic       12-17    18-25    26-34    35 Plus  All Ages

Licit Drugs

Past Month Cigarette Use
  Correlation     0.917    0.952    0.952    0.946    0.978
  χ2 probability  0.038    0.054    0.136    0.846    0.788
  Range Ratio     0.870    0.934    1.016    1.027    0.985

Past Month Alcohol Use
  Correlation     0.871    0.960    0.972    0.981    0.990
  χ2 probability  0.056    0.066    0.049    0.984    0.759
  Range Ratio     0.878    0.872    0.894    0.968    0.948

Illicit Drugs

Past Month Any Illicit Drug Use
  Correlation     0.939    0.973    0.962    0.942    0.990
  χ2 probability  0.340    0.639    0.384    0.474    0.450
  Range Ratio     0.905    0.827    0.880    0.881    0.868

Past Month Any Illicit But Marijuana Use
  Correlation     0.916    0.926    0.941    0.845    0.973
  χ2 probability  0.611    0.089    0.774    0.566    0.609
  Range Ratio     0.923    0.801    1.028    0.821    0.879

Past Month Cocaine Use
  Correlation     0.824    0.878    0.920    0.903    0.970
  χ2 probability  0.864    0.094    0.803    0.611    0.607
  Range Ratio     0.856    0.740    0.880    0.889    0.849

Dependence

Past Year Dependence On Illicit Drugs
  Correlation     0.912    0.868    0.827    0.723    0.966
  χ2 probability  0.871    0.778    0.625    0.670    0.537
  Range Ratio     1.031    0.882    0.894    1.147    1.000

Past Year Dependence On Alcohol
  Correlation     0.787    0.898    0.943    0.927    0.978
  χ2 probability  0.221    0.007    0.688    0.523    0.020
  Range Ratio     0.892    0.716    0.899    0.878    0.807

Treatment

Past Year Treatment For Illicit Drugs
  Correlation     0.908    0.772    0.918    0.884    0.962
  χ2 probability  0.489    0.437    0.539    0.354    0.184
  Range Ratio     0.879    0.920    0.923    0.984    0.944

Past Year Treatment For Alcohol
  Correlation     0.776    0.842    0.763    0.878    0.948
  χ2 probability  0.428    0.680    0.612    0.037    0.159
  Range Ratio     0.978    0.748    0.899    0.814    0.826

Needing Treatment In Past Year
  Correlation     0.899    0.910    0.923    0.934    0.980
  χ2 probability  0.112    0.255    0.355    0.859    0.403
  Range Ratio     0.892    0.851    0.937    0.823    0.872

Arrest

Past Year Arrested
  Correlation     0.961    0.883    0.911    0.878    0.977
  χ2 probability  0.623    0.225    0.902    0.589    0.416
  Range Ratio     0.778    0.883    0.952    1.123    0.950

*Probability of observing the calculated difference in the predicted and direct estimates across evaluation subgroups given that there is no difference.

     

  • The ratio of the range of the predicted values to the range of the direct estimates. Range ratios close to 1 indicate better agreement between the predicted and the direct estimates. Ratios larger than 1 indicate that the predicted values are more dispersed than the actual values, which probably means that the model predictions are too large for the subgroups with the highest prevalence and too small for the subgroups with the lowest prevalence. Range ratios smaller than 1 indicate that the model fails to reflect the actual variability in prevalence across subgroups.
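Two of the three statistics, the correlation and the range ratio, follow directly from their definitions and can be sketched in a few lines of Python. The χ2 probability is omitted because its exact construction is given in footnote 22 and is not reproduced here. The subgroup values below are hypothetical, and the sketch is unweighted.

```python
from math import sqrt

def pearson_correlation(xs, ys):
    """Pearson correlation between two equal-length lists of estimates."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)

def range_ratio(predicted, direct):
    """Range of the predicted values divided by the range of the direct
    estimates. Values below 1 mean the predictions understate variability."""
    direct_range = max(direct) - min(direct)
    if direct_range == 0:
        raise ValueError("range ratio is undefined when direct estimates are equal")
    return (max(predicted) - min(predicted)) / direct_range

# Hypothetical prevalence rates for four evaluation subgroups.
direct = [0.10, 0.20, 0.30, 0.40]
predicted = [0.15, 0.20, 0.30, 0.35]
r = pearson_correlation(predicted, direct)
rr = range_ratio(predicted, direct)
```

In this toy case the predictions track the direct estimates closely (high correlation) but span only about two thirds of their range, the pattern the text associates with synthetic estimators.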

    Summary test statistics are presented for 55 estimates: 11 outcome measures for each of 4 age groups and for all ages combined. Examining Exhibit 4.1, it can be seen that all of the tests indicate that the full SAE model works quite well.

     

  • In all but five of the 55 cases, the correlation between the predicted and the direct estimates is above 0.8. In addition, these correlations are slightly higher for the more prevalent substance abuse measures.

  • The Chi-Square probabilities show a similar picture. The actual median of the values in the table is 0.467.

  • The range ratios are quite good. This shows that the final SAE model eliminates one of the major disadvantages of the prior methods in that they failed to reflect the full range of variation in the actual estimates.

Thus, the evaluations summarized in Exhibit 4.1 indicate that the final SAE model worked well. However, because the estimates within the evaluation subgroups are not independent, this may present an overly optimistic picture of the quality of the estimates. The following cross-validation approach corrects for this lack of independence.

      

Exhibit 4.2 - Summary of Evaluations of Final SAE Model and Comparisons to Alternative Models Based on Cross-validation Goodness-of-fit Tests. Correlations, χ2 Probabilities*, and Range Ratios.

Columns:
  (1) Final Composite SAE Model
  Indirect estimators:
  (2) Final SAE Model
  (3) County Demographic Model
  (4) Large City/NonLarge City Demographic Model
  (5) Basic Demographic Model

SAE Outcome Measure
  Statistic       (1)      (2)      (3)      (4)      (5)

Past Month Cigarette Use
  Correlation     0.765    0.737    0.622    0.514    0.584
  χ2 probability  0.015    0.007    0.008    0.000    0.000
  Range Ratio     1.095    1.084    0.956    0.440    0.272

Past Month Alcohol Use
  Correlation     0.866    0.858    0.832    0.824    0.839
  χ2 probability  0.280    0.231    0.108    0.023    0.001
  Range Ratio     0.969    0.841    0.966    0.685    0.524

Past Month Any Illicit Drug Use
  Correlation     0.728    0.704    0.659    0.573    0.637
  χ2 probability  0.043    0.036    0.132    0.080    0.109
  Range Ratio     1.618    1.500    1.038    0.510    0.310

Past Month Any Illicit Drug Use But Marijuana
  Correlation     0.636    0.615    0.392    0.212    0.297
  χ2 probability  0.412    0.249    0.242    0.131    0.131
  Range Ratio     1.450    1.451    1.213    0.256    0.153

Past Year Treatment For Illicit Drugs
  Correlation     0.588    0.481    0.407    0.532    0.561
  χ2 probability  0.287    0.275    0.386    0.294    0.297
  Range Ratio     1.420    1.509    1.348    0.278    0.219

Past Year Arrested
  Correlation     0.641    0.662    0.543    0.639    0.635
  χ2 probability  0.418    0.399    0.324    0.278    0.165
  Range Ratio     1.552    1.537    1.430    0.373    0.425

MEAN
  Correlation     0.704    0.676    0.576    0.549    0.592
  χ2 probability  0.243    0.199    0.200    0.134    0.117
  Range Ratio     1.351    1.321    1.159    0.424    0.317

Note: These tests were restricted to the 26- to 34-year-old age group due to the cost of computations.

*Probability of observing the calculated difference in the predicted and direct estimates across evaluation subgroups given that there is no difference.

    Cross-validation goodness-of-fit tests for the final SAE model and comparison to other models:

    Exhibit 4.2 summarizes the cross-validation analyses. (footnote#23) In this table, we also introduce for the first time four of the other models that were examined. For example, looking at the first row of Exhibit 4.2, which shows the correlations between the predicted and the direct estimates in the evaluation subgroups, we note that the final SAE model has the highest correlation, the SAE model without the direct local area effects the next highest, and so on.

    Of note in Exhibit 4.2 is the following:

    • The cross-validation approach presents a somewhat less optimistic view of the final SAE model than did the goodness-of-fit tests using evaluation subgroups. The correlations are lower and the range ratios are higher than those observed in Exhibit 4.1. This probably means that the final SAE model included some predictors that should not have been included. This is sometimes called "over-fitting" in that some of the factors that were identified as being good predictors of substance abuse in one sample were actually not very good predictors when applied to a different sample. (footnote#24)

    • The models that used only the demographic characteristics are the poorest performers in almost all cases. The range ratios indicate that simply using demographic characteristics to predict drug use in a small area will probably not reflect the true range in prevalence estimates across the small areas. This indicates that improvements can be achieved by including more predictors in the models.

    • The county demographic model worked fairly well.  (footnote#25)  Although the full SAE model was the overall best performer, both the county demographic model and the SAE model without the direct effects worked fairly well. The county demographic model estimates are simpler to calculate than the SAE model without the direct effects; therefore, it may be a good candidate for future SAE modeling activities.

    Comparisons to other sources of substance abuse data:

    For each of the six models examined, Exhibits 4.3 through 4.6 compare the estimated prevalences to those from other sources using both the rank correlation and the range ratios. Methodological and definitional differences between the NHSDA and the other sources are considered in making these comparisons. Adjustments were made to the external data in some cases, but it was not possible to fully account for the differences. The other data sources and the adjustments made for comparison with SAE estimates are described below:

    Behavioral Risk Factor Surveillance System (BRFSS): Comparisons of the SAE estimates of the prevalence of past month alcohol and cigarette use are made to estimates from the BRFSS without making any adjustments. The BRFSS is a telephone survey conducted in all 50 States under cooperative agreements with the Centers for Disease Control and Prevention. Definitions used are comparable to NHSDA definitions, since the BRFSS estimates reflect past month use. Studies have shown that reporting of substance use behaviors may be lower in telephone surveys than in face-to-face surveys, particularly for illicit drugs. The BRFSS State estimates are simple averages over the three years 1991 through 1993.

    National Drug and Alcoholism Treatment Unit Survey (NDATUS): SAE estimates of the number of persons receiving treatment for drug abuse are compared to estimates constructed from NDATUS. NDATUS is an inventory of all specialty substance abuse treatment facilities in the U.S. Based on reporting by State substance abuse agencies, it provides estimates (including adjustments for nonresponse) of the number of clients in treatment at a given point in time. To develop an estimate of persons treated during a year, the NDATUS client counts (including drug only and combined drug and alcohol clients) were multiplied by the reciprocals of average lengths of stay, and adjusted to account for multiple treatment episodes in a year by the same individual. Estimates of length of stay and multiple episodes were obtained from the Drug Services Research Survey, conducted in 1990. These calculations were done within categories of treatment modality and applied separately to each State. No adjustment was made to account for the inclusion in the SAE estimates of persons reporting treatment through self-help groups, private physicians, or emergency rooms, none of which are counted in NDATUS. The State estimates are averages over only 1992 and 1993 since the 1991 estimates could not be adequately adjusted for nonresponse.

    Uniform Crime Reports (UCR): The SAE estimates of the number of persons arrested in the past year are compared to estimates derived from the UCR. The UCR compiles data from local jurisdictions on the number of arrests. For comparison with the SAE estimates, an adjustment to the UCR data was made to account for persons arrested more than once during a year, so the adjusted UCR estimates reflect number of persons arrested at least once. This adjustment was made within the four Census regions, using data on multiple arrests reported by arrestees in the NHSDA sample. The State estimates are simple averages over the three years 1991 through 1993.
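The arithmetic behind the NDATUS and UCR adjustments described above can be sketched as follows. All figures are hypothetical, the field names are invented for illustration, and length of stay is assumed to be expressed as a fraction of a year so that its reciprocal converts a point-in-time count into annual episodes. The actual adjustment factors came from the Drug Services Research Survey and from NHSDA arrestee reports and are not reproduced here.

```python
def persons_treated_per_year(modalities):
    """Convert point-in-time client counts into persons treated in a year.

    modalities -- list of dicts, one per treatment modality, with keys
      clients_on_census_day, avg_stay_years, episodes_per_person
    """
    total = 0.0
    for m in modalities:
        # Point-in-time count times the reciprocal of the average length
        # of stay approximates the number of episodes in a year.
        episodes = m["clients_on_census_day"] * (1.0 / m["avg_stay_years"])
        # Dividing by episodes per person converts episodes to persons.
        total += episodes / m["episodes_per_person"]
    return total

def persons_arrested(arrests, arrests_per_arrestee):
    """Convert a count of arrests into persons arrested at least once."""
    if arrests_per_arrestee < 1.0:
        raise ValueError("arrests_per_arrestee cannot be below 1")
    return arrests / arrests_per_arrestee

# Hypothetical State with two treatment modalities.
modalities = [
    {"clients_on_census_day": 1000, "avg_stay_years": 0.5, "episodes_per_person": 1.25},
    {"clients_on_census_day": 200, "avg_stay_years": 0.25, "episodes_per_person": 1.0},
]
treated = persons_treated_per_year(modalities)

# Hypothetical State: 12,000 arrests, regional average of 1.25 arrests per arrestee.
arrested = persons_arrested(12000, 1.25)
```

Both adjustments move the administrative counts toward the NHSDA concept of "persons in the past year", but they do not remove the coverage differences noted in the text.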

    Of note in the rank correlations (footnote#26) and range ratios (footnote#27) reported in Exhibits 4.3 through 4.6 is the following:

    Exhibit 4.3 presents the results that compare the final SAE estimates to the BRFSS estimates for alcohol use. The rank correlations between the final SAE estimates and the BRFSS estimates are quite high (over 0.85) and the range ratio is good (nearly 0.6). This indicates that the SAE model produced estimates that are very consistent with what is found in the BRFSS. In addition, both sources estimate a similar prevalence of use at the national level, with the NHSDA estimates being somewhat larger. This higher level of reporting is consistent with findings from methodology studies which have shown that telephone surveys yield lower reports of use than self-administered surveys.  (footnote#28)  In addition, we note that the rank correlations and range ratios are much better for the final SAE model than the corresponding statistics for the two demographic models. This indicates that the SAE model is performing much better than typical synthetic estimators which States might construct by applying the NHSDA rates of use to their population distribution.

    The comparison of the final SAE estimates to the BRFSS smoking data (Exhibit 4.4) presents a similar picture. The range ratio is very good (over 0.9) and the rank correlations are moderate (about 0.5). The fact that the BRFSS uses a somewhat different question than the NHSDA may account for some of the lack of comparability. Again, we observe that the NHSDA estimates higher levels of cigarette use than the BRFSS, which is consistent with the difference in interview methodology. The demographic model does not do as well as the final SAE model, indicating that the SAE model is better than using a synthetic estimator.

    Exhibits 4.5 and 4.6 present the results for past year drug treatment and past year arrest. Although we again observe that the SAE models perform better than the demographic models, the rank correlations are still only 0.38 for treatment and 0.35 for arrest. These low correlations are probably due to the lack of correspondence in methodology between the divergent data sources, the low prevalence rates of these items, and the more restricted range of these rates across States.  (footnote#29)

    One of the characteristics that is desirable in a small area estimation procedure is that it produce estimates that adequately reflect the range of differences across areas. Because the estimated NHSDA prevalence rates for treatment and arrest are lower than the corresponding estimates from NDATUS and UCR, the calculated range ratios are going to appear artificially small. That is, the range ratios in Exhibits 4.5 and 4.6 are not a good indication of how well the models reflect the range of differences across areas because of the differences in overall prevalence levels.
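The rank correlations used in these external comparisons can be illustrated with a short sketch. It assumes a Spearman rank correlation (the Pearson correlation of the ranks, with tied values sharing their average rank); the exact definition used in the report is in footnote 26 and is not reproduced here. The input values are the BRFSS and final SAE alcohol estimates for the first five State rows of Exhibit 4.3 (New Jersey, New York, Pennsylvania, Florida, Georgia), so the result describes only that small subset and not the correlation reported for all States.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation computed as the Pearson correlation of ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("rank correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)

# BRFSS and final SAE estimates for NJ, NY, PA, FL, GA from Exhibit 4.3.
brfss = [56.60, 53.10, 57.10, 53.70, 35.90]
final_sae = [59.94, 57.04, 55.82, 48.45, 48.57]
rho = spearman(final_sae, brfss)
```

Because only the ordering of States matters, a rank correlation is unaffected by one source reporting uniformly higher levels than the other, which is why it is paired with the range ratio in these exhibits.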

      

    4.3 Summary of the Evaluations

    As was noted in the beginning of this section, a variety of approaches must be used to evaluate small area estimation methods. Considering all of the evidence presented,  (footnote#30) the following findings are noteworthy:

    • The full SAE model is generally the best model in that it tends to have the highest correlations with the direct estimates, to adequately reflect the range of prevalence, and to have exhibited few significant differences when compared to direct estimates in Goodness-of-fit tests.

    • The full SAE model is a better predictor for the more prevalent behaviors than for the less prevalent behaviors and better at reflecting the differences in prevalence rates across areas when there is a wider dispersion of rates across the areas.

  • The demographic models are poor predictors of substance abuse. The estimates presented in this report are much better than States could achieve by simply applying the NHSDA prevalence rates to the population distribution in their States.

  • Estimates of substance abuse are quite sensitive to method effects: the correspondence between the SAE model estimates and the external small area estimates weakened as the methodologies of the two information sources diverged.

  • Using a NHSDA sample that does not oversample the largest cities and an associated global SAE model appears to work almost as well as the full SAE model. This result bodes well for any future NHSDA small area estimator projects based on data where the six cities were not oversampled.

  • The county demographic model is a very promising alternative to the full SAE model particularly if modeling resources are limited. The county level predictors are appealing since they can be updated from year to year reflecting temporal trends.


Exhibit 4.3 - Comparison of Alternative Small Area Estimators of Prevalence of Alcohol Use to Direct Survey Estimates from the Behavioral Risk Factor Surveillance System (BRFSS).

Columns:
  (1) BRFSS Estimate
  Composite Estimators:
  (2) Final SAE Model
  (3) Big City Sub-Sampled Model
  Indirect Estimators:
  (4) SAE Model
  (5) County Demog Model
  (6) Big City/Remaind Demog Model
  (7) Basic Demog Model
  Direct:
  (8) 91-93 NHSDA Estimates

State                 (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)

Total United States   50.90   53.46   53.59   53.43   53.01   53.41   53.40   53.01

North East Region
New Jersey            56.60   59.94   60.25   62.03   60.32   52.62   52.91   61.10
New York              53.10   57.04   57.35   56.60   56.51   53.60   52.52   56.96
Pennsylvania          57.10   55.82   56.14   53.75   55.90   53.39   53.96   52.70

South Region
Florida               53.70   48.45   48.71   48.52   49.75   52.43   52.78   49.67
Georgia               35.90   48.57   47.04   48.21   47.48   52.28   52.78   48.78
Kentucky              32.40   41.18   40.48   44.26   40.97   54.18   54.79   32.03
Louisiana             46.00   49.40   49.77   44.13   44.66   51.53   52.02   56.62
North Carolina        36.10   46.73   45.18   48.17   46.44   52.49   53.01   43.04
Oklahoma              35.50   39.81   39.74   44.02   44.22   52.84   53.16   36.50
South Carolina        36.90   46.84   44.32   46.67   41.34   51.83   52.36   47.03
Tennessee             26.40   40.70   38.56   45.10   39.07   53.06   53.65   35.76
Texas                 52.00   52.88   53.09   48.80   50.07   53.10   53.06   55.23
Virginia              50.20   51.21   49.88   52.28   47.34   54.18   53.42   48.16
West Virginia         28.30   38.61   38.61   39.48   38.58   53.99   54.63   38.41

North Central Region
Illinois              50.70   54.43   55.37   55.26   55.48   55.18   53.27   55.73
Indiana               47.40   47.70   48.73   52.40   52.36   54.07   54.65   44.95
Kansas                45.40   56.51   57.63   54.33   56.40   54.10   54.60   60.82
Michigan              55.50   56.08   57.53   55.23   54.44   53.33   53.86   58.26
Minnesota             63.50   63.32   63.32   57.06   56.56   54.74   55.29   64.96
Missouri              46.50   54.04   53.10   52.22   54.71   53.49   54.07   44.10

    Ohio

    41.20

    52.24

    53.06

    53.18

    55.30

    53.56

    54.14

    50.45

    Wisconsin

    67.80

    59.15

    59.49

    55.95

    56.31

    54.38

    54.94

    67.92

                     
                     

    West Region

                   

    California

    58.90

    56.67

    56.79

    58.07

    56.05

    53.08

    52.55

    57.69

    New Mexico

    48.10

    53.98

    53.73

    51.57

    57.40

    52.41

    51.88

    56.21

    Oregon

    55.40

    55.95

    56.83

    54.83

    59.81

    53.97

    54.45

    59.72

    Washington

    58.40

    58.33

    59.46

    57.32

    59.97

    54.05

    54.43

    59.55

                     
                     

    Rank Correlation1

    .

    0.861

    0.854

    0.841

    0.765

    0.278

    0.097

    0.807

    Range Ratio2

    1.000

    0.597

    0.598

    0.545

    0.525

    0.088

    0.082

    0.867

                     

    1 Correlation calculated by first ranking states and then calculating the correlation of the ranks.

    2 Ratio of the range of the predicted values to the range of the BRFSS data.

    Note: Estimates of prevalence rates have been multiplied by 100.

    Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse. Model-based estimates using 1991-1993 NHSDA data.
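The two summary measures defined in the footnotes can be sketched in a few lines of Python. The BRFSS and direct NHSDA values below are copied from the 26 State rows of Exhibit 4.3; the function names are hypothetical, and the use of average ranks for ties is an assumption, since the footnote does not say how ties were handled.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks (assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Ordinary product-moment correlation of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def rank_correlation(predicted, reference):
    """Footnote 1: rank the States in each list, then correlate the ranks."""
    return pearson(ranks(predicted), ranks(reference))


def range_ratio(predicted, reference):
    """Footnote 2: range of the predicted values over range of the reference data."""
    return (max(predicted) - min(predicted)) / (max(reference) - min(reference))


# BRFSS Estimate column, Exhibit 4.3 (State rows only, in table order).
brfss = [56.60, 53.10, 57.10, 53.70, 35.90, 32.40, 46.00, 36.10, 35.50,
         36.90, 26.40, 52.00, 50.20, 28.30, 50.70, 47.40, 45.40, 55.50,
         63.50, 46.50, 41.20, 67.80, 58.90, 48.10, 55.40, 58.40]

# Direct 91-93 NHSDA Estimates column, Exhibit 4.3 (same State order).
direct_nhsda = [61.10, 56.96, 52.70, 49.67, 48.78, 32.03, 56.62, 43.04, 36.50,
                47.03, 35.76, 55.23, 48.16, 38.41, 55.73, 44.95, 60.82, 58.26,
                64.96, 44.10, 50.45, 67.92, 57.69, 56.21, 59.72, 59.55]

rho = rank_correlation(direct_nhsda, brfss)
ratio = range_ratio(direct_nhsda, brfss)
print(f"rank correlation = {rho:.3f}, range ratio = {ratio:.3f}")
```

For these two columns the range ratio works out to (67.92 - 32.03) / (67.80 - 26.40), which rounds to the 0.867 shown in the exhibit. A range ratio well below 1, as for the demographic models, indicates that an estimator compresses the State-to-State variation seen in the external data.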

      

    Exhibit 4.4: Comparison of Alternative Small Area Estimators of Past Month Prevalence of Cigarette Use to Direct Survey Estimates from the Behavioral Risk Factor Surveillance System (BRFSS).

                                      Composite Estimators    Indirect Estimators                             Direct
    State                 BRFSS       Final SAE   Big City    Final SAE   County      Big City/   Demog Model 91-93 NHSDA
                          Estimate    Model       Sub-Sampled Model       Demog Model Remaind                 Estimates
                                                  Model                               Demog Model

    Total United States   23.10       27.16       27.27       27.68       27.73       27.49       27.43       27.66

    North East Region
    New Jersey            20.50       26.08       24.68       26.24       26.37       27.75       27.25       25.43
    New York              23.50       25.13       25.26       26.57       26.06       26.10       27.24       24.18
    Pennsylvania          24.30       28.56       29.19       28.47       28.24       27.92       27.53       30.06

    South Region
    Florida               23.10       25.75       25.15       27.29       26.62       27.15       27.03       26.34
    Georgia               21.80       28.55       28.74       28.81       29.18       28.59       28.14       28.41
    Kentucky              29.80       33.74       31.95       31.17       29.55       28.15       27.78       34.97
    Louisiana             24.00       27.90       30.10       30.75       29.35       28.55       28.09       24.21
    North Carolina        25.70       28.26       30.18       28.69       28.59       28.43       27.97       30.49
    Oklahoma              26.20       29.00       28.14       31.09       29.36       27.87       27.20       25.85
    South Carolina        25.20       31.00       31.51       30.61       29.76       28.61       28.17       30.38
    Tennessee             27.20       31.44       30.37       30.33       30.44       28.22       27.82       31.39
    Texas                 22.70       28.38       27.88       27.16       26.74       27.54       27.11       27.51
    Virginia              23.00       26.49       27.84       27.16       27.81       27.47       27.98       28.14
    West Virginia         25.70       32.67       32.06       31.58       29.50       27.75       27.40       34.39

    North Central Region
    Illinois              24.10       27.87       26.58       27.01       27.15       26.15       27.54       27.40
    Indiana               26.40       26.04       28.03       28.20       28.13       28.14       27.77       24.65
    Kansas                21.60       25.78       26.00       27.69       27.17       27.95       27.54       24.55
    Michigan              26.00       28.78       27.95       29.89       29.41       28.26       27.84       29.45
    Minnesota             22.90       24.16       27.27       26.30       28.99       28.11       27.70       25.42
    Missouri              25.20       26.76       27.21       28.29       28.25       28.08       27.68       27.53
    Ohio                  24.10       31.18       29.96       29.26       29.10       28.16       27.77       31.80
    Wisconsin             24.50       24.94       28.86       27.13       28.21       28.03       27.63       27.98

    West Region
    California            19.50       24.35       23.97       24.97       25.81       26.35       26.70       25.52
    New Mexico            19.60       30.67       29.08       27.29       28.02       26.61       26.02       28.56
    Oregon                21.30       27.11       27.38       28.59       27.42       27.65       27.20       25.20
    Washington            22.70       25.16       26.99       27.33       28.27       27.92       27.37       25.00

    Rank Correlation¹     .           0.491       0.610       0.631       0.670       0.485       0.472       0.473
    Range Ratio²          1.000       0.930       0.786       0.642       0.449       0.244       0.208       1.048

    ¹ Correlation calculated by first ranking states and then calculating the correlation of the ranks.
    ² Ratio of the range of the predicted values to the range of the BRFSS data.

    Note: Estimates of prevalence rates have been multiplied by 100.

    Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse. Model-based estimates using 1991-1993 NHSDA data.

      

    Exhibit 4.5: Comparison of Alternative Small Area Estimators of Prevalence of Past Year Drug Treatment to Estimates from the National Drug and Alcoholism Treatment Unit Survey (NDATUS).

                                      Composite Estimators    Indirect Estimators                             Direct
    State                 Adjusted    Final SAE   Big City    Final SAE   County      Big City/   Demog Model 91-93 NHSDA
                          1992-93     Model       Sub-Sampled Model       Demog Model Remaind                 Estimates
                          NDATUS                  Model                               Demog Model
                          Estimate

    Total United States   0.85        0.70        0.71        0.70        0.68        0.65        0.64        0.62

    North East Region
    New Jersey            1.08        0.68        0.68        0.68        0.67        0.64        0.63        0.42
    New York              1.61        0.64        0.78        0.60        0.67        0.62        0.64        0.68
    Pennsylvania          0.94        0.56        0.57        0.60        0.62        0.62        0.62        0.45

    South Region
    Florida               0.77        0.69        0.75        0.73        0.69        0.60        0.60        0.47
    Georgia               0.48        0.71        0.64        0.71        0.70        0.70        0.70        0.75
    Kentucky              0.61        0.57        0.58        0.57        0.66        0.63        0.63        0.41
    Louisiana             0.88        0.64        0.65        0.66        0.61        0.70        0.70        0.35
    North Carolina        0.54        0.64        0.78        0.66        0.85        0.68        0.68        0.58
    Oklahoma              0.68        0.86        0.65        0.88        0.71        0.66        0.63        0.60
    South Carolina        0.68        0.53        0.73        0.54        0.79        0.69        0.70        0.39
    Tennessee             0.44        0.60        0.51        0.60        0.62        0.65        0.65        0.27
    Texas                 0.74        0.61        0.61        0.66        0.68        0.66        0.65        0.65
    Virginia              0.61        0.65        0.66        0.66        0.63        0.67        0.69        0.47
    West Virginia         0.29        0.49        0.46        0.47        0.42        0.59        0.59        0.49

    North Central Region
    Illinois              0.68        0.54        0.52        0.59        0.53        0.63        0.65        0.46
    Indiana               0.62        0.52        0.53        0.53        0.63        0.64        0.64        0.53
    Kansas                0.81        0.66        0.57        0.66        0.63        0.63        0.63        0.48
    Michigan              0.92        0.76        0.89        0.77        1.01        0.66        0.65        0.85
    Minnesota             0.45        0.84        0.61        0.82        0.54        0.64        0.63        0.59
    Missouri              0.59        0.70        0.61        0.64        0.79        0.63        0.63        0.78
    Ohio                  0.75        0.70        0.73        0.70        0.66        0.64        0.64        0.34
    Wisconsin             0.69        0.61        0.64        0.63        0.50        0.63        0.62        0.19

    West Region
    California            0.87        0.97        0.91        0.92        0.77        0.65        0.65        1.04
    New Mexico            0.98        0.77        0.61        0.80        0.76        0.64        0.61        0.25
    Oregon                1.20        0.90        0.79        0.88        0.80        0.60        0.59        0.69
    Washington            1.67        0.85        0.79        0.89        0.73        0.65        0.63        0.61

    Rank Correlation¹     .           0.375       0.523       0.406       0.289       -0.160      -0.240      0.063
    Range Ratio²          1.000       0.349       0.326       0.324       0.432       0.079       0.080       0.615

    ¹ Correlation calculated by first ranking states and then calculating the correlation of the ranks.
    ² Ratio of the range of the predicted values to the range of the NDATUS data.

    Note: Estimates of prevalence rates have been multiplied by 100.

    Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse. Model-based estimates using 1991-1993 NHSDA data.

      

    Exhibit 4.6: Comparison of Alternative Small Area Estimators of Prevalence of Past Year Arrest to Estimates Based on the Uniform Crime Reports (UCR).

                                      Composite Estimators    Indirect Estimators                             Direct
    State                 Adjusted    Final SAE   Big City    SAE         County      Big City/   Demog Model 91-93 NHSDA
                          1991-93     Model       Sub-Sampled Model       Demog Model Remaind                 Estimates
                          UCR                     Model                               Demog Model
                          Estimate

    Total United States   4.10        1.64        1.69        1.62        1.66        1.61        1.60        1.57

    North East Region
    New Jersey            3.24        1.18        1.14        1.23        1.26        1.63        1.55        1.18
    New York              5.04        1.18        1.20        1.16        1.05        1.40        1.59        0.66
    Pennsylvania          2.45        1.21        1.23        1.26        1.19        1.56        1.50        1.07

    South Region
    Florida               3.91        1.54        1.69        1.48        1.72        1.50        1.51        1.66
    Georgia               5.35        2.57        2.14        2.04        1.99        1.91        1.80        2.50
    Kentucky              6.17        1.59        1.66        1.69        1.77        1.63        1.57        1.31
    Louisiana             5.37        2.26        2.10        2.14        2.06        1.89        1.78        1.47
    North Carolina        5.92        1.70        1.92        1.95        2.21        1.82        1.73        1.46
    Oklahoma              3.64        1.27        1.72        1.52        1.79        1.54        1.47        2.07
    South Carolina        4.22        1.71        2.05        1.90        1.66        1.91        1.80        1.15
    Tennessee             5.28        2.07        2.25        1.60        1.54        1.69        1.61        1.10
    Texas                 4.80        1.77        2.22        1.68        2.03        1.85        1.73        1.92
    Virginia              5.07        1.40        1.40        1.60        1.77        1.67        1.73        1.31
    West Virginia         2.89        1.17        1.43        1.35        1.74        1.47        1.42        1.75

    North Central Region
    Illinois              2.61        1.58        1.29        1.46        1.56        1.38        1.62        1.35
    Indiana               2.78        2.29        2.24        1.88        1.73        1.64        1.57        2.68
    Kansas                4.33        2.37        2.52        1.84        1.73        1.62        1.55        2.46
    Michigan              3.18        1.89        1.73        1.97        1.76        1.69        1.62        1.41
    Minnesota             2.27        1.50        1.74        1.70        2.01        1.58        1.52        1.75
    Missouri              4.99        1.83        1.88        1.91        1.69        1.61        1.54        1.70
    Ohio                  3.47        2.07        1.96        1.96        1.81        1.63        1.56        2.03
    Wisconsin             4.57        1.29        1.37        1.74        1.95        1.60        1.54        1.91

    West Region
    California            4.14        1.90        1.87        1.68        1.74        1.58        1.65        1.90
    New Mexico            5.46        2.43        2.76        2.02        1.87        1.65        1.54        1.79
    Oregon                3.36        1.70        1.79        1.66        1.55        1.46        1.41        1.67
    Washington            4.15        1.51        1.66        1.77        1.60        1.56        1.50        1.21

    Rank Correlation¹     .           0.350       0.351       0.389       0.356       0.510       0.450       -0.066
    Range Ratio²          1.000       0.361       0.415       0.252       0.298       0.136       0.100       0.518

    ¹ Correlation calculated by first ranking states and then calculating the correlation of the ranks.
    ² Ratio of the range of the predicted values to the range of the UCR data.

    Note: Estimates of prevalence rates have been multiplied by 100.

    Source: SAMHSA, Office of Applied Studies, National Household Survey on Drug Abuse. Model-based estimates using 1991-1993 NHSDA data.


    This page was last updated on June 16, 2008.

    SAMHSA, an agency in the Department of Health and Human Services, is the Federal Government's lead agency for improving the quality and availability of substance abuse prevention, addiction treatment, and mental health services in the United States.
