Real A00-240 exam questions available for test preparation

We provide fully valid SASInstitute A00-240 Questions and Answers, updated for the current A00-240 test. Working through these Questions and Answers prepares candidates to pass the exam and earn the certification, an excellent way to advance your position as a specialist in the industry. A00-240 braindumps with the VCE practice test are an effective way to score high marks on the A00-240 exam.

Exam Code: A00-240 Practice exam 2022 by team
A00-240 SAS Statistical Business Analysis Using SAS 9: Regression and Modeling

This exam is administered by SAS and Pearson VUE.
60 scored multiple-choice and short-answer questions.
A score of 68 percent correct is required to pass.
In addition to the 60 scored items, there may be up to five unscored items.
Two hours to complete exam.
Use exam ID A00-240 when registering with Pearson VUE.

ANOVA - 10%
Verify the assumptions of ANOVA
Analyze differences between population means using the GLM and TTEST procedures
Perform ANOVA post hoc test to evaluate treatment effect
Detect and analyze interactions between factors

Linear Regression - 20%
Fit a multiple linear regression model using the REG and GLM procedures
Analyze the output of the REG, PLM, and GLM procedures for multiple linear regression models
Use the REG or GLMSELECT procedure to perform model selection
Assess the validity of a given regression model through the use of diagnostic and residual analysis

Logistic Regression - 25%
Perform logistic regression with the LOGISTIC procedure
Optimize model performance through input selection
Interpret the output of the LOGISTIC procedure
Score new data sets using the LOGISTIC and PLM procedures

Prepare Inputs for Predictive Model Performance - 20%
Identify the potential challenges when preparing input data for a model
Use the DATA step to manipulate data with loops, arrays, conditional statements and functions
Improve the predictive power of categorical inputs
Screen variables for irrelevance and non-linear association using the CORR procedure
Screen variables for non-linearity using empirical logit plots

Measure Model Performance - 25%
Apply the principles of honest assessment to model performance measurement
Assess classifier performance using the confusion matrix
Model selection and validation using training and validation data
Create and interpret graphs (ROC, lift, and gains charts) for model comparison and selection
Establish effective decision cut-off values for scoring

Verify the assumptions of ANOVA
 Explain the central limit theorem and when it must be applied
 Examine the distribution of continuous variables (histogram, box-whisker, Q-Q plots)
 Describe the effect of skewness on the normal distribution
 Define H0, H1, Type I/II error, statistical power, p-value
 Describe the effect of sample size on p-value and power
 Interpret the results of hypothesis testing
 Interpret histograms and normal probability charts
 Draw conclusions about your data from histogram, box-whisker, and Q-Q plots
 Identify the kinds of problems that may be present in the data (biased sample, outliers, extreme values)
 For a given experiment, verify that the observations are independent
 For a given experiment, verify the errors are normally distributed
 Use the UNIVARIATE procedure to examine residuals
 For a given experiment, verify all groups have equal response variance
 Use the HOVTEST option of the MEANS statement in PROC GLM to assess response variance

Analyze differences between population means using the GLM and TTEST procedures
 Use the GLM Procedure to perform ANOVA
o CLASS statement
o MODEL statement
o MEANS statement
o OUTPUT statement
 Evaluate the null hypothesis using the output of the GLM procedure
 Interpret the statistical output of the GLM procedure (variance derived from MSE, F value, p-value, R-squared, Levene's test)
 Interpret the graphical output of the GLM procedure
 Use the TTEST procedure to compare means

Perform ANOVA post hoc test to evaluate treatment effect

Use the LSMEANS statement in the GLM or PLM procedure to perform pairwise comparisons
 Use PDIFF option of LSMEANS statement
 Use ADJUST option of the LSMEANS statement (TUKEY and DUNNETT)
 Interpret diffograms to evaluate pairwise comparisons
 Interpret control plots to evaluate pairwise comparisons
 Compare/contrast use of pairwise t tests and the Tukey and Dunnett comparison methods

Detect and analyze interactions between factors
 Use the GLM procedure to produce reports that will help determine the significance of the interaction between factors (MODEL statement)
 LSMEANS with SLICE=option (Also using PROC PLM)
 Interpret the output of the GLM procedure to identify interaction between factors:
 p-value
 F Value
 R Squared

Linear Regression - 20%

Fit a multiple linear regression model using the REG and GLM procedures
 Use the REG procedure to fit a multiple linear regression model
 Use the GLM procedure to fit a multiple linear regression model

Analyze the output of the REG, PLM, and GLM procedures for multiple linear regression models
 Interpret REG or GLM procedure output for a multiple linear regression model:
 Convert models to algebraic expressions
 Identify missing degrees of freedom
 Identify variance due to model/error, and total variance
 Calculate a missing F value
 Identify variable with largest impact to model
 For output from two models, identify which model is better
 Identify how much of the variation in the dependent variable is explained by the model
 Conclusions that can be drawn from REG, GLM, or PLM output: (about H0, model quality, graphics)
Use the REG or GLMSELECT procedure to perform model selection

 Use the SELECTION= option of the MODEL statement in the GLMSELECT procedure
 Compare the different model selection methods (STEPWISE, FORWARD, BACKWARD)
 Enable ODS graphics to display graphs from the REG or GLMSELECT procedure
 Identify best models by examining the graphical output (fit criterion from the REG or GLMSELECT procedure)
 Assign names to models in the REG procedure (multiple model statements)
Assess the validity of a given regression model through the use of diagnostic and residual analysis
 Explain the assumptions for linear regression
 From a set of residual plots, assess which assumption about the error terms has been violated
 Use REG procedure MODEL statement options to identify influential observations (Student Residuals, Cook's D, DFFITS, DFBETAS)
 Explain options for handling influential observations
 Identify collinearity problems by examining REG procedure output
 Use MODEL statement options to diagnose collinearity problems (VIF, COLLIN, COLLINOINT)

Logistic Regression - 25%
Perform logistic regression with the LOGISTIC procedure
 Identify experiments that require analysis via logistic regression
 Identify logistic regression assumptions
 Explain logistic regression concepts (log odds, logit transformation, sigmoidal relationship between p and X)
 Use the LOGISTIC procedure to fit a binary logistic regression model (MODEL and CLASS statements)

Optimize model performance through input selection
 Use the LOGISTIC procedure to fit a multiple logistic regression model
 Perform Model Selection (STEPWISE, FORWARD, BACKWARD) within the LOGISTIC procedure

Interpret the output of the LOGISTIC procedure
 Interpret the output from the LOGISTIC procedure for binary logistic regression models:
 Model Convergence section
 Testing Global Null Hypothesis table
 Type 3 Analysis of Effects table
 Analysis of Maximum Likelihood Estimates table

Association of Predicted Probabilities and Observed Responses
Score new data sets using the LOGISTIC and PLM procedures
 Use the SCORE statement in the PLM procedure to score new cases
 Use the CODE statement in PROC LOGISTIC to score new data
 Describe when you would use the SCORE statement vs the CODE statement in PROC LOGISTIC
 Explain how to score new data when you have developed a model from a biased sample

Prepare Inputs for Predictive Model Performance - 20%
Identify the potential challenges when preparing input data for a model
 Identify problems that missing values can cause in creating predictive models and scoring new data sets
 Identify limitations of Complete Case Analysis
 Explain problems caused by categorical variables with numerous levels
 Discuss the problem of redundant variables
 Discuss the problem of irrelevant variables
 Discuss the non-linearities and the problems they create in predictive models
 Discuss outliers and the problems they create in predictive models
 Describe quasi-complete separation
 Discuss the effect of interactions
 Determine when it is necessary to oversample data

Use the DATA step to manipulate data with loops, arrays, conditional statements and functions
 Use ARRAYs to create missing indicators
 Use ARRAYS, LOOP, IF, and explicit OUTPUT statements

Improve the predictive power of categorical inputs
 Reduce the number of levels of a categorical variable
 Explain thresholding
 Explain Greenacre's method
 Cluster the levels of a categorical variable via Greenacre's method using the CLUSTER procedure
o METHOD=WARD option
o FREQ, VAR, ID statement
o Use ODS output to create an output data set
 Convert categorical variables to continuous using smooth weight of evidence

Screen variables for irrelevance and non-linear association using the CORR procedure
 Explain how Hoeffding's D and Spearman statistics can be used to find irrelevant variables and non-linear associations
 Produce Spearman and Hoeffding's D statistic using the CORR procedure (VAR, WITH statement)
 Interpret a scatter plot of Hoeffding's D and Spearman statistics to identify irrelevant variables and non-linear associations

Screen variables for non-linearity using empirical logit plots
 Use the RANK procedure to bin continuous input variables (GROUPS=, OUT= option; VAR, RANK statements)
 Interpret RANK procedure output
 Use the MEANS procedure to calculate the sum and means for the target cases and total events (NWAY option; CLASS, VAR, OUTPUT statements)
 Create empirical logit plots with the SGPLOT procedure
 Interpret empirical logit plots

Measure Model Performance - 25%
Apply the principles of honest assessment to model performance measurement
 Explain techniques to honestly assess classifier performance
 Explain overfitting
 Explain differences between validation and test data
 Identify the impact of performing data preparation before data is split

Assess classifier performance using the confusion matrix
 Explain the confusion matrix
 Define: Accuracy, Error Rate, Sensitivity, Specificity, PV+, PV-
 Explain the effect of oversampling on the confusion matrix
 Adjust the confusion matrix for oversampling
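As a numeric illustration of the confusion-matrix measures named above (Accuracy, Error Rate, Sensitivity, Specificity, PV+, PV-), here is a minimal Python sketch; the function name and the cell counts are invented for the example, and this is not part of the exam outline itself.

```python
# Illustrative only: standard summary statistics from confusion-matrix counts.
# tp/fp/fn/tn values below are made up.
def confusion_metrics(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    return {
        "accuracy":    (tp + tn) / total,
        "error_rate":  (fp + fn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "pv_plus":     tp / (tp + fp),  # positive predictive value (PV+)
        "pv_minus":    tn / (tn + fn),  # negative predictive value (PV-)
    }

m = confusion_metrics(tp=40, fp=10, fn=20, tn=30)
```

With these counts, accuracy is 0.70 and sensitivity is 40/60; adjusting the matrix for oversampling (the last objective above) amounts to reweighting the cell counts by the population-to-sample event proportions before computing the same ratios.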

Model selection and validation using training and validation data
 Divide data into training and validation data sets using the SURVEYSELECT procedure
 Discuss the subset selection methods available in PROC LOGISTIC
 Discuss methods to determine interactions (forward selection, with bar and @ notation)
 Create interaction plots with the results from PROC LOGISTIC
 Select the model with fit statistics (BIC, AIC, KS, Brier score)
Create and interpret graphs (ROC, lift, and gains charts) for model comparison and selection
 Explain and interpret charts (ROC, Lift, Gains)
 Create a ROC curve (OUTROC option of the SCORE statement in the LOGISTIC procedure)
 Use the ROC and ROCCONTRAST statements to create an overlay plot of ROC curves for two or more models
 Explain the concept of depth as it relates to the gains chart

Establish effective decision cut-off values for scoring
 Illustrate a decision rule that maximizes the expected profit
 Explain the profit matrix and how to use it to estimate the profit per scored customer
 Calculate decision cutoffs using Bayes rule, given a profit matrix
 Determine optimum cutoff values from profit plots
 Given a profit matrix, and model results, determine the model with the highest average profit
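The cutoff calculation in the objectives above can be sketched numerically. This is a hypothetical illustration, not SAS code: the profit values are invented, and the Bayes cutoff is derived by equating the expected profits of the two decisions at a predicted event probability p.

```python
# Hypothetical profit matrix entries (invented for illustration):
#   profit_tp: decide "positive", case is an event
#   profit_fp: decide "positive", case is a non-event
#   profit_fn: decide "negative", case is an event
#   profit_tn: decide "negative", case is a non-event
def bayes_cutoff(profit_tp, profit_fp, profit_fn, profit_tn):
    # Expected profit at predicted probability p:
    #   decide positive: p*profit_tp + (1-p)*profit_fp
    #   decide negative: p*profit_fn + (1-p)*profit_tn
    # Setting these equal and solving for p gives the cutoff.
    return (profit_tn - profit_fp) / (
        (profit_tp - profit_fn) + (profit_tn - profit_fp)
    )

# Example: a true positive earns 10, a false positive costs 1.
cut = bayes_cutoff(profit_tp=10.0, profit_fp=-1.0, profit_fn=0.0, profit_tn=0.0)
# Score a case as "positive" when its predicted probability >= cut.
```

With these numbers the cutoff is 1/11, about 0.09: because a missed event is far more costly than a false alarm, the optimal rule flags cases at quite low predicted probabilities.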

SAS Statistical Business Analysis Using SAS 9: Regression and Modeling

Example 54.2: Multilevel Response

In this example, two preparations, a standard preparation and a test preparation, are each given at several dose levels to groups of insects. The symptoms are recorded for each insect within each group, and two multilevel probit models are fit. Because the natural sort order of the three levels is not the same as the response order, the ORDER=DATA option is specified in the PROC statement to get the desired order.

The following statements produce Output 54.2.1:

   data multi;
      input Prep $ Dose Symptoms $ N;
      LDose=log(Dose);
      if Prep='test' then PrepDose=LDose;
      else PrepDose=0;
      datalines;
   stand     10      None       33
   stand     10      Mild        7
   stand     10      Severe     10
   stand     20      None       17
   stand     20      Mild       13
   stand     20      Severe     17
   stand     30      None       14
   stand     30      Mild        3
   stand     30      Severe     28
   stand     40      None        9
   stand     40      Mild        8
   stand     40      Severe     32
   test      10      None       44
   test      10      Mild        6
   test      10      Severe      0
   test      20      None       32
   test      20      Mild       10
   test      20      Severe     12
   test      30      None       23
   test      30      Mild        7
   test      30      Severe     21
   test      40      None       16
   test      40      Mild        6
   test      40      Severe     19
   ;

   proc probit order=data;
      class Prep Symptoms;
      nonpara: model Symptoms=Prep LDose PrepDose / lackfit;
      weight N;
      parallel: model Symptoms=Prep LDose / lackfit;
      weight N;
      title 'Probit Models for Symptom Severity';
   run;

The first model uses the PrepDose variable to allow for nonparallelism between the dose response curves for the two preparations. The results of this first model indicate that the parameter for the PrepDose variable is not significant, having a Wald chi-square of 0.73. Also, since the first model is a generalization of the second, a likelihood ratio test statistic for this same parameter can be obtained by multiplying the difference in log likelihoods between the two models by 2. The value obtained, 2 ×(-345.94 - (-346.31)), is 0.73. This is in close agreement with the Wald chi-square from the first model. The lack-of-fit test statistics for the two models do not indicate a problem with either fit.
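The likelihood ratio calculation described above is simple enough to verify by hand; a small Python sketch (using the rounded log likelihoods quoted in the text) follows.

```python
import math

# Log likelihoods as printed in the example above (rounded to two decimals).
ll_nonparallel = -345.94   # model including PrepDose
ll_parallel    = -346.31   # model without PrepDose

# Likelihood ratio statistic: twice the difference in log likelihoods.
# With these rounded values it is 0.74; the text's 0.73 reflects the
# unrounded log likelihoods.
lr = 2 * (ll_nonparallel - ll_parallel)

# One extra parameter => chi-square with 1 df; the upper tail probability
# of a 1-df chi-square at x is erfc(sqrt(x/2)).
p_value = math.erfc(math.sqrt(lr / 2))
```

The p-value is around 0.39, consistent with the text's conclusion that the PrepDose (nonparallelism) parameter is not significant.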

Output 54.2.1: Multilevel Response: PROC PROBIT

The negative coefficient associated with LDose indicates that the probability of having no symptoms (Symptoms='None') or no or mild symptoms (Symptoms='None' or Symptoms='Mild') decreases as LDose increases; that is, the probability of a severe symptom increases with LDose. This association is apparent for both treatment groups.

The negative coefficient associated with the standard treatment group (Prep = stand) indicates that the standard treatment is associated with more severe symptoms across all LDose values.

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.


Multiple Comparisons

When comparing more than two means, an ANOVA F-test tells you whether the means are significantly different from each other, but it does not tell you which means differ from which other means. Multiple comparison procedures (MCPs), also called mean separation tests, give you more detailed information about the differences among the means. The goal in multiple comparisons is to compare the average effects of three or more "treatments" (for example, drugs, groups of subjects) to decide which treatments are better, which ones are worse, and by how much, while controlling the probability of making an incorrect decision. A variety of multiple comparison methods are available with the MEANS and LSMEANS statements in the GLM procedure.

The following classification is due to Hsu (1996). Multiple comparison procedures can be categorized in two ways: by the comparisons they make and by the strength of inference they provide. With respect to which comparisons are made, the GLM procedure offers two types:

  • comparisons between all pairs of means
  • comparisons between a control and all other means

The strength of inference says what can be inferred about the structure of the means when a test is significant; it is related to what type of error rate the MCP controls. MCPs available in the GLM procedure provide one of the following types of inference, in order from weakest to strongest.

  • Individual: differences between means, unadjusted for multiplicity
  • Inhomogeneity: means are different
  • Inequalities: which means are different
  • Intervals: simultaneous confidence intervals for mean differences

Methods that control only individual error rates are not true MCPs at all. Methods that yield the strongest level of inference, simultaneous confidence intervals, are usually preferred, since they enable you not only to say which means are different but also to put confidence bounds on how much they differ, making it easier to assess the practical significance of a difference. They are also less likely to lead nonstatisticians to the invalid conclusion that nonsignificantly different sample means imply equal population means. Interval MCPs are available for both arithmetic means and LS-means via the MEANS and LSMEANS statements, respectively.


Table 30.3 and Table 30.4 display the MCPs available in PROC GLM for all pairwise comparisons and for comparisons with a control, respectively, along with the associated strength of inference and the syntax (when applicable) for both the MEANS and the LSMEANS statements.

Table 30.3: Multiple Comparisons Procedures for All Pairwise Comparisons
Table 30.4: Multiple Comparisons Procedures for Comparisons with a Control

Note: One-sided Dunnett's tests are also available from the MEANS statement with the DUNNETTL and DUNNETTU options and from the LSMEANS statement with PDIFF=CONTROLL and PDIFF=CONTROLU.

Details of these multiple comparison methods are given in the following sections.

Pairwise Comparisons

All the methods discussed in this section depend on the standardized pairwise differences t_{ij} = (\bar{y}_i - \bar{y}_j)/\hat{\sigma}_{ij}, where

  • i and j are the indices of two groups
  • \bar{y}_i and \bar{y}_j are the means or LS-means for groups i and j
  • \hat{\sigma}_{ij} is the square root of the estimated variance of \bar{y}_i - \bar{y}_j. For simple arithmetic means, \hat{\sigma}^2_{ij} = s^2(1/n_i + 1/n_j), where n_i and n_j are the sizes of groups i and j, respectively, and s^2 is the mean square for error, with \nu degrees of freedom. For weighted arithmetic means, \hat{\sigma}^2_{ij} = s^2(1/w_i + 1/w_j), where w_i and w_j are the sums of the weights in groups i and j, respectively. Finally, for LS-means defined by the linear combinations l_i'b and l_j'b of the parameter estimates, \hat{\sigma}^2_{ij} = s^2 l_i'(X'X)^- l_j.

Furthermore, all of the methods are discussed in terms of significance tests of the form

|t_{ij}| \geq c(\alpha)

where c(\alpha) is some constant depending on the significance level. Such tests can be inverted to form confidence intervals of the form

(\bar{y}_i - \bar{y}_j) - \hat{\sigma}_{ij} c(\alpha) \leq \mu_i - \mu_j \leq (\bar{y}_i - \bar{y}_j) + \hat{\sigma}_{ij} c(\alpha)

The simplest approach to multiple comparisons is to do a t test on every pair of means (the T option in the MEANS statement, ADJUST=T in the LSMEANS statement). For the ith and jth means, you can reject the null hypothesis that the population means are equal if

|t_{ij}| \geq t(\alpha; \nu)

where \alpha is the significance level, \nu is the number of error degrees of freedom, and t(\alpha ; \nu) is the two-tailed critical value from a Student's t distribution. If the cell sizes are all equal to, say, n, the preceding formula can be rearranged to give

|\bar{y}_i - \bar{y}_j| \geq t(\alpha; \nu) \, s \sqrt{\frac{2}{n}}

the value of the right-hand side being Fisher's least significant difference (LSD).

There is a problem with repeated t tests, however. Suppose there are ten means and each t test is performed at the 0.05 level. There are 10(10-1)/2=45 pairs of means to compare, each with a 0.05 probability of a type 1 error (a false rejection of the null hypothesis). The chance of making at least one type 1 error is much higher than 0.05. It is difficult to calculate the exact probability, but you can derive a pessimistic approximation by assuming that the comparisons are independent, giving an upper bound to the probability of making at least one type 1 error (the experimentwise error rate) of

1 - (1 - 0.05)^{45} = 0.90

The actual probability is somewhat less than 0.90, but as the number of means increases, the chance of making at least one type 1 error approaches 1.

If you decide to control the individual type 1 error rates for each comparison, you are controlling the individual or comparisonwise error rate. On the other hand, if you want to control the overall type 1 error rate for all the comparisons, you are controlling the experimentwise error rate. It is up to you to decide whether to control the comparisonwise error rate or the experimentwise error rate, but there are many situations in which the experimentwise error rate should be held to a small value. Statistical methods for comparing three or more means while controlling the probability of making at least one type 1 error are called multiple comparisons procedures.
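The arithmetic behind the 0.90 bound above is worth seeing explicitly; a quick illustrative Python check (assuming, as the text does, independent comparisons):

```python
# Upper bound on the experimentwise error rate for c independent
# comparisons, each run at level alpha (the calculation from the text).
k = 10                      # number of means
c = k * (k - 1) // 2        # 45 pairwise comparisons
alpha = 0.05
eer_bound = 1 - (1 - alpha) ** c   # about 0.90
```

The bound evaluates to roughly 0.9006, matching the text's 0.90; the true experimentwise rate is somewhat lower because the 45 tests are not independent.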

It has been suggested that the experimentwise error rate can be held to the \alpha level by performing the overall ANOVA F-test at the \alpha level and making further comparisons only if the F-test is significant, as in Fisher's protected LSD. This assertion is false if there are more than three means (Einot and Gabriel 1975). Consider again the situation with ten means. Suppose that one population mean differs from the others by such a sufficiently large amount that the power (probability of correctly rejecting the null hypothesis) of the F-test is near 1 but that all the other population means are equal to each other. There will be 9(9 - 1)/2=36 t tests of true null hypotheses, with an upper limit of 0.84 on the probability of at least one type 1 error. Thus, you must distinguish between the experimentwise error rate under the complete null hypothesis, in which all population means are equal, and the experimentwise error rate under a partial null hypothesis, in which some means are equal but others differ. The following abbreviations are used in the discussion:

CER = comparisonwise error rate
EERC = experimentwise error rate under the complete null hypothesis
MEER = maximum experimentwise error rate under any complete or partial null hypothesis

These error rates are associated with the different strengths of inference: individual tests control the CER; tests for inhomogeneity of means control the EERC; tests that yield confidence inequalities or confidence intervals control the MEER. A preliminary F-test controls the EERC but not the MEER.

You can control the MEER at the \alpha level by setting the CER to a sufficiently small value. The Bonferroni inequality (Miller 1981) has been widely used for this purpose. If

\mathrm{CER} = \frac{\alpha}{c}

where c is the total number of comparisons, then the MEER is less than \alpha. Bonferroni t tests (the BON option in the MEANS statement, ADJUST=BON in the LSMEANS statement) with \mathrm{MEER} < \alpha declare two means to be significantly different if

|t_{ij}| \geq t(\epsilon; \nu)


where \epsilon = \frac{2\alpha}{k(k - 1)}

for comparison of k means.

Sidak (1967) has provided a tighter bound, showing that

\mathrm{CER} = 1 - (1 - \alpha)^{1/c}

also ensures that \mathrm{MEER} \leq \alpha for any set of c comparisons. A Sidak t test (Games 1977), provided by the SIDAK option, is thus given by

|t_{ij}| \geq t(\epsilon; \nu)


where \epsilon = 1 - (1 - \alpha)^{\frac{2}{k(k - 1)}}

for comparison of k means.
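The Bonferroni and Sidak per-comparison levels above are easy to compare numerically; a small illustrative Python sketch (the function names are just for the example):

```python
# Per-comparison significance levels (epsilon) that control the MEER at
# alpha over all k(k-1)/2 pairwise comparisons, per the formulas above.
def bonferroni_eps(alpha, k):
    return 2 * alpha / (k * (k - 1))

def sidak_eps(alpha, k):
    return 1 - (1 - alpha) ** (2 / (k * (k - 1)))

alpha, k = 0.05, 10
eps_bon = bonferroni_eps(alpha, k)      # 0.05/45, about 0.00111
eps_sidak = sidak_eps(alpha, k)         # about 0.00114
# Sidak's level is slightly larger, i.e., slightly less conservative.
```

This makes concrete the statement that Sidak provides a tighter bound: its per-comparison level always exceeds Bonferroni's, so it rejects a little more often while still holding the MEER at or below alpha.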

You can use the Bonferroni additive inequality and the Sidak multiplicative inequality to control the MEER for any set of contrasts or other hypothesis tests, not just pairwise comparisons. The Bonferroni inequality can provide simultaneous inferences in any statistical application requiring tests of more than one hypothesis. Other methods discussed in this section for pairwise comparisons can also be adapted for general contrasts (Miller 1981).

Scheffé (1953, 1959) proposes another method to control the MEER for any set of contrasts or other linear hypotheses in the analysis of linear models, including pairwise comparisons, obtained with the SCHEFFE option. Two means are declared significantly different if

|t_{ij}| \geq \sqrt{(k-1) \, F(\alpha; k-1, \nu)}

where F(\alpha; k-1, \nu) is the \alpha-level critical value of an F distribution with k-1 numerator degrees of freedom and \nu denominator degrees of freedom.

Scheffé's test is compatible with the overall ANOVA F-test in that Scheffé's method never declares a contrast significant if the overall F-test is nonsignificant. Most other multiple comparison methods can find significant contrasts when the overall F-test is nonsignificant and, therefore, suffer a loss of power when used with a preliminary F-test.

Scheffé's method may be more powerful than the Bonferroni or Sidak methods if the number of comparisons is large relative to the number of means. For pairwise comparisons, Sidak t tests are generally more powerful.

Tukey (1952, 1953) proposes a test designed specifically for pairwise comparisons based on the studentized range, sometimes called the "honestly significant difference test," that controls the MEER when the sample sizes are equal. Tukey (1953) and Kramer (1956) independently propose a modification for unequal cell sizes. The Tukey or Tukey-Kramer method is provided by the TUKEY option in the MEANS statement and the ADJUST=TUKEY option in the LSMEANS statement. This method has fared extremely well in Monte Carlo studies (Dunnett 1980). In addition, Hayter (1984) gives a proof that the Tukey-Kramer procedure controls the MEER for means comparisons, and Hayter (1989) describes the extent to which the Tukey-Kramer procedure has been proven to control the MEER for LS-means comparisons. The Tukey-Kramer method is more powerful than the Bonferroni, Sidak, or Scheffé methods for pairwise comparisons. Two means are considered significantly different by the Tukey-Kramer criterion if

|t_{ij}| \geq \frac{q(\alpha; k, \nu)}{\sqrt{2}}

where q(\alpha;k,\nu) is the \alpha-level critical value of a studentized range distribution of k independent normal random variables with \nu degrees of freedom.
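As a numerical illustration (not from the original text): the studentized range critical value can be computed with scipy (`scipy.stats.studentized_range`, available in scipy 1.7 and later); dividing by the square root of 2 converts it to a threshold on the standardized difference t_ij in the equal-sample-size case. The parameter values below are invented for the example.

```python
import math
from scipy.stats import studentized_range, t

# Critical value of the studentized range used by the Tukey-Kramer test.
# For k=3 groups and nu=12 error df, published tables give
# q(0.05; 3, 12) of about 3.77.
alpha, k, nu = 0.05, 3, 12
q = studentized_range.ppf(1 - alpha, k, nu)
t_crit = q / math.sqrt(2)   # threshold on |t_ij|

# The Tukey threshold exceeds the unadjusted two-sided t critical value,
# reflecting the multiplicity adjustment.
t_unadj = t.ppf(1 - alpha / 2, nu)
```

Here t_crit is about 2.67 versus an unadjusted critical value of about 2.18, so a pairwise difference must be noticeably larger to be declared significant under Tukey's criterion.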

Hochberg (1974) devised a method (the GT2 or SMM option) similar to Tukey's, but it uses the studentized maximum modulus instead of the studentized range and employs Sidak's (1967) uncorrelated t inequality. It is proven to hold the MEER at a level not exceeding \alpha with unequal sample sizes. It is generally less powerful than the Tukey-Kramer method and always less powerful than Tukey's test for equal cell sizes. Two means are declared significantly different if

|t_{ij}| \geq m(\alpha; c, \nu)

where m(\alpha;c,\nu) is the \alpha-level critical value of the studentized maximum modulus distribution of c independent normal random variables with \nu degrees of freedom and c = k(k-1)/2.

Gabriel (1978) proposes another method (the GABRIEL option) based on the studentized maximum modulus. This method is applicable only to arithmetic means. It rejects if

\frac{|\bar{y}_i - \bar{y}_j|}{s \left( \frac{1}{\sqrt{2n_i}} + \frac{1}{\sqrt{2n_j}} \right)} \geq m(\alpha; k, \nu)

For equal cell sizes, Gabriel's test is equivalent to Hochberg's GT2 method. For unequal cell sizes, Gabriel's method is more powerful than GT2 but may become liberal with highly disparate cell sizes (refer also to Dunnett 1980). Gabriel's test is the only method for unequal sample sizes that lends itself to a graphical representation as intervals around the means. Assuming \bar{y}_i > \bar{y}_j, you can rewrite the preceding inequality as

\bar{y}_i - m(\alpha; k, \nu) \frac{s}{\sqrt{2n_i}} \geq \bar{y}_j + m(\alpha; k, \nu) \frac{s}{\sqrt{2n_j}}

The expression on the left does not depend on j, nor does the expression on the right depend on i. Hence, you can form what Gabriel calls an (l,u)-interval around each sample mean and declare two means to be significantly different if their (l,u)-intervals do not overlap. See Hsu (1996) for a discussion of other methods of graphically representing all pairwise comparisons.

Comparing All Treatments to a Control

One special case of means comparison is that in which the only comparisons that need to be tested are between a set of new treatments and a single control. In this case, you can achieve better power by using a method that is restricted to test only comparisons to the single control mean. Dunnett (1955) proposes a test for this situation that declares a mean significantly different from the control if

|t_{i0}| \geq d(\alpha; k, \nu, \rho_1, \ldots, \rho_{k-1})

where \bar{y}_0 is the control mean and d(\alpha; k, \nu, \rho_1, \ldots, \rho_{k-1}) is the critical value of the "many-to-one t statistic" (Miller 1981; Krishnaiah and Armitage 1966) for k means to be compared to a control, with \nu error degrees of freedom and correlations \rho_1, \ldots, \rho_{k-1}, where \rho_i = n_i / (n_0 + n_i). The correlation terms arise because each of the treatment means is being compared to the same control. Dunnett's test holds the MEER to a level not exceeding the stated \alpha.

Approximate and Simulation-based Methods

Both Tukey's and Dunnett's tests are based on the same general quantile calculation:

q^t(\alpha, \nu, R) = \{ q : P(\max(|t_1|, \ldots, |t_n|) > q) = \alpha \}

where the t_i have a joint multivariate t distribution with \nu degrees of freedom and correlation matrix R. In general, evaluating q^t(\alpha,\nu,R) requires repeated numerical calculation of an (n+1)-fold integral. This is usually intractable, but the problem reduces to a feasible 2-fold integral when R has a certain symmetry in the case of Tukey's test, and a factor analytic structure (cf. Hsu 1992) in the case of Dunnett's test. The R matrix has the required symmetry for exact computation of Tukey's test if the t_i are studentized differences between

  • k(k-1)/2 pairs of k uncorrelated means with equal variances, that is, equal sample sizes
  • k(k-1)/2 pairs of k LS-means from a variance-balanced design (for example, a balanced incomplete block design)

Refer to Hsu (1992, 1996) for more information. The R matrix has the factor analytic structure for exact computation of Dunnett's test if the t_i are studentized differences between

  • k-1 means and a control mean, all uncorrelated. (Dunnett's one-sided methods depend on a similar probability calculation, without the absolute values.) Note that it is not required that the variances of the means (that is, the sample sizes) be equal.
  • k-1 LS-means and a control LS-mean from either a variance-balanced design, or a design in which the other factors are orthogonal to the treatment factor (for example, a randomized block design with proportional cell frequencies).

However, other important situations that do not result in a correlation matrix R that has the structure for exact computation include

  • all pairwise differences with unequal sample sizes
  • differences between LS-means in many unbalanced designs

In these situations, exact calculation of q^t(\alpha,\nu,R) is intractable in general. Most of the preceding methods can be viewed as using various approximations for q^t(\alpha,\nu,R). When the sample sizes are unequal, the Tukey-Kramer test is equivalent to another approximation. For comparisons with a control when the correlation R does not have a factor analytic structure, Hsu (1992) suggests approximating R with a matrix R^* that does have such a structure and correspondingly approximating q^t(\alpha,\nu,R) with q^t(\alpha,\nu,R^*). When you request Dunnett's test for LS-means (the PDIFF=CONTROL and ADJUST=DUNNETT options), the GLM procedure automatically uses Hsu's approximation when appropriate.

Finally, Edwards and Berry (1987) suggest calculating q^t(\alpha,\nu,R) by simulation. Multivariate t vectors are sampled from a distribution with the appropriate \nu and R parameters, and Edwards and Berry (1987) suggest estimating q^t(\alpha,\nu,R) by \hat{q}, the upper \alpha percentile of the observed values of \max(| t_1|, ... ,| t_n|). Sufficient samples are generated for the true P(\max(| t_1|, ... ,| t_n|)\gt\hat{q}) to be within a certain accuracy radius \gamma of \alpha with accuracy confidence 100(1-\epsilon)%. You can approximate q^t(\alpha,\nu,R) by simulation for comparisons between LS-means by specifying ADJUST=SIM (with either PDIFF=ALL or PDIFF=CONTROL). By default, \gamma=0.005 and \epsilon=0.01, so that the tail area of \hat{q} is within 0.005 of \alpha with 99% confidence. You can use the ACC= and EPS= options with ADJUST=SIM to reset \gamma and \epsilon, or you can use the NSAMP= option to set the sample size directly. You can also control the random number sequence with the SEED= option.
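The crude simulation approach of Edwards and Berry can be sketched in a few lines of code. The example below is an illustrative stand-in, not the GLM implementation: it samples multivariate t vectors with an equicorrelated R (a common-factor construction) and takes the empirical upper \alpha quantile of \max|t_i|. The function name and the values of \nu, \rho, n, and the sample count are arbitrary choices for the sketch.

```python
import math
import random

def simulate_qt(alpha=0.05, nu=20, rho=0.5, n=3, nsamp=20000, seed=1):
    """Crude Edwards-Berry estimate of q^t(alpha, nu, R) for an
    equicorrelated R: simulate max|t_i| and take its upper-alpha quantile."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(nsamp):
        # equicorrelated standard normals via a shared common factor
        z0 = rng.gauss(0.0, 1.0)
        x = [math.sqrt(rho) * z0 + math.sqrt(1 - rho) * rng.gauss(0.0, 1.0)
             for _ in range(n)]
        # dividing all components by the same chi-square-based scale
        # yields a joint multivariate t vector with nu degrees of freedom
        chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu df
        scale = math.sqrt(chi2 / nu)
        maxima.append(max(abs(xi) / scale for xi in x))
    maxima.sort()
    return maxima[int((1 - alpha) * nsamp)]  # empirical upper-alpha quantile
```

Because the components share one chi-square divisor, the estimate \hat{q} should land between the per-comparison t critical value and the Bonferroni bound, as the theory predicts.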

Hsu and Nelson (1998) suggest a more accurate simulation method for estimating q^t(\alpha,\nu,R), using a control variate adjustment technique. The same independent, standardized normal variates that are used to generate multivariate t vectors from a distribution with the appropriate \nu and R parameters are also used to generate multivariate t vectors from a distribution for which the exact value of q^t(\alpha,\nu,R) is known. \max(| t_1|, ... ,| t_n|) for the second sample is used as a control variate for adjusting the quantile estimate based on the first sample; refer to Hsu and Nelson (1998) for more details. The control variate adjustment has the drawback that it takes somewhat longer than the crude technique of Edwards and Berry (1987), but it typically yields an estimate that is many times more accurate. In most cases, if you are using ADJUST=SIM, then you should specify ADJUST=SIM(CVADJUST). You can also specify ADJUST=SIM(CVADJUST REPORT) to display a summary of the simulation that includes, among other things, the actual accuracy radius \gamma, which should be substantially smaller than the target accuracy radius (0.005 by default).

Multiple-Stage Tests

You can use all of the methods discussed so far to obtain simultaneous confidence intervals (Miller 1981). By sacrificing the facility for simultaneous estimation, you can obtain simultaneous tests with greater power using multiple-stage tests (MSTs). MSTs come in both step-up and step-down varieties (Welsch 1977). The step-down methods, which have been more widely used, are available in SAS/STAT software.

Step-down MSTs first test the homogeneity of all of the means at a level \gamma_k. If the test results in a rejection, then each subset of k-1 means is tested at level \gamma_{k-1}; otherwise, the procedure stops. In general, if the hypothesis of homogeneity of a set of p means is rejected at the \gamma_p level, then each subset of p-1 means is tested at the \gamma_{p-1} level; otherwise, the set of p means is considered not to differ significantly and none of its subsets are tested. The many varieties of MSTs that have been proposed differ in the levels \gamma_p and the statistics on which the subset tests are based. Clearly, the EERC of a step-down MST is not greater than \gamma_k, and the CER is not greater than \gamma_2, but the MEER is a complicated function of \gamma_p, p = 2, ... ,k.

With unequal cell sizes, PROC GLM uses the harmonic mean of the cell sizes as the common sample size. However, since the resulting operating characteristics can be undesirable, MSTs are recommended only for the balanced case. If the sample sizes are equal and the range statistic is used, you can arrange the means in ascending or descending order and test only contiguous subsets; if you specify the F statistic, this shortcut cannot be taken. For this reason, only range-based MSTs are implemented. It is common practice to report the results of an MST by writing the means in such an order and drawing lines parallel to the list of means spanning the homogeneous subsets. This form of presentation is also convenient for pairwise comparisons with equal cell sizes.

The best-known MSTs are the Duncan (the DUNCAN option) and Student-Newman-Keuls (the SNK option) methods (Miller 1981). Both use the studentized range statistic and, hence, are called multiple range tests. Duncan's method is often called the "new" multiple range test despite the fact that it is one of the oldest MSTs in current use.

The Duncan and SNK methods differ in the \gamma_p values used. For Duncan's method, they are

\gamma_p = 1 - (1 - \alpha)^{p - 1}

whereas the SNK method uses

\gamma_p = \alpha

Duncan's method controls the CER at the \alpha level. Its operating characteristics appear similar to those of Fisher's unprotected LSD or repeated t tests at level \alpha (Petrinovich and Hardyck 1969). Since repeated t tests are easier to compute, easier to explain, and applicable to unequal sample sizes, Duncan's method is not recommended. Several published studies (for example, Carmer and Swanson 1973) have claimed that Duncan's method is superior to Tukey's because of greater power without considering that the greater power of Duncan's method is due to its higher type 1 error rate (Einot and Gabriel 1975).
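The two sets of test levels are easy to compare numerically. The sketch below (plain Python; the choices of \alpha and the range of p are arbitrary for illustration) tabulates \gamma_p for both methods.

```python
def duncan_gamma(p, alpha=0.05):
    # Duncan: gamma_p = 1 - (1 - alpha)^(p - 1); grows with subset size p
    return 1 - (1 - alpha) ** (p - 1)

def snk_gamma(p, alpha=0.05):
    # SNK: every subset, whatever its size, is tested at the nominal level
    return alpha

for p in range(2, 6):
    print(p, round(duncan_gamma(p), 4), snk_gamma(p))
```

For p = 2 the two methods agree; by p = 5 Duncan is already testing at the 0.1855 level, which is why its extra power comes directly from a higher type 1 error rate.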

The SNK method holds the EERC to the \alpha level but does not control the MEER (Einot and Gabriel 1975). Consider ten population means that occur in five pairs such that means within a pair are equal, but there are large differences between pairs. If you make the usual sampling assumptions and also assume that the sample sizes are very large, all subset homogeneity hypotheses for three or more means are rejected. The SNK method then comes down to five independent tests, one for each pair, each at the \alpha level. Letting \alpha be 0.05, the probability of at least one false rejection is

1 - (1 - 0.05)^5 = 0.23

As the number of means increases, the MEER approaches 1. Therefore, the SNK method cannot be recommended.
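The arithmetic behind this example generalizes directly: with m independent pair tests, each at level \alpha, the probability of at least one false rejection is 1 - (1 - \alpha)^m. A minimal numeric check, with values chosen to match the example above:

```python
def meer(m, alpha=0.05):
    # chance of at least one false rejection among m independent tests
    return 1 - (1 - alpha) ** m

print(round(meer(5), 2))  # five pairs at alpha = 0.05 -> 0.23
for m in (5, 20, 100):
    print(m, round(meer(m), 3))  # approaches 1 as m grows
```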

A variety of MSTs that control the MEER have been proposed, but these methods are not as well known as those of Duncan and SNK. An approach developed by Ryan (1959, 1960), Einot and Gabriel (1975), and Welsch (1977) sets

\gamma_p = \begin{cases} 1 - (1 - \alpha)^{p/k} & \text{for } p \lt k - 1 \\ \alpha & \text{for } p \geq k - 1 \end{cases}

You can use range statistics, leading to what is called the REGWQ method after the authors' initials. If you assume that the sample means have been arranged in descending order from \bar{y}_1 through \bar{y}_k, the homogeneity of means \bar{y}_i, ... ,\bar{y}_j, i\lt j, is rejected by REGWQ if

\bar{y}_i - \bar{y}_j \geq q(\gamma_p;p,\nu) \frac{s}{\sqrt{n}}

where p = j - i + 1, s^2 is the mean square error with \nu degrees of freedom, and n is the common sample size (Einot and Gabriel 1975). To ensure that the MEER is controlled, the current implementation checks whether q(\gamma_p;p,\nu) is monotonically increasing in p. If not, then a set of critical values that are increasing in p is substituted instead.
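The Ryan-Einot-Gabriel-Welsch levels can be tabulated in the same way as the Duncan and SNK levels (plain Python; k and \alpha are arbitrary illustrative choices):

```python
def regwq_gamma(p, k, alpha=0.05):
    # gamma_p = 1 - (1 - alpha)^(p/k) for p < k - 1, and alpha for p >= k - 1
    if p >= k - 1:
        return alpha
    return 1 - (1 - alpha) ** (p / k)

k = 5
levels = {p: round(regwq_gamma(p, k), 4) for p in range(2, k + 1)}
print(levels)  # smaller subsets are tested at levels below alpha
```

Because \gamma_p here is nondecreasing in p, the corresponding critical values q(\gamma_p; p, \nu) are typically increasing in p as well; the implementation's monotonicity check guards the cases where they are not.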

REGWQ appears to be the most powerful step-down MST in the current literature (for example, Ramsey 1978). Use of a preliminary F-test decreases the power of all the other multiple comparison methods discussed previously except for Scheffé's test.

Bayesian Approach

Waller and Duncan (1969) and Duncan (1975) take an approach to multiple comparisons that differs from all the methods previously discussed in minimizing the Bayes risk under additive loss rather than controlling type 1 error rates. For each pair of population means \mu_i and \mu_j, null (H_0^{ij}) and alternative (H_a^{ij}) hypotheses are defined:

H_0^{ij}\colon \mu_i - \mu_j \leq 0
H_a^{ij}\colon \mu_i - \mu_j \gt 0

For any i, j pair, let d_0 indicate a decision in favor of H_0^{ij} and d_a indicate a decision in favor of H_a^{ij}, and let \delta=\mu_i-\mu_j. The loss function for the decision on the i, j pair is

L(d_0|\delta) = \begin{cases} 0 & \text{if } \delta \leq 0 \\ \delta & \text{if } \delta \gt 0 \end{cases}
L(d_a|\delta) = \begin{cases} -k \delta & \text{if } \delta \leq 0 \\ 0 & \text{if } \delta \gt 0 \end{cases}

where k represents a constant that you specify rather than the number of means. The loss for the joint decision involving all pairs of means is the sum of the losses for each individual decision. The population means are assumed to have a normal prior distribution with unknown variance, the logarithm of the variance of the means having a uniform prior distribution. For the i, j pair, the null hypothesis is rejected if

\bar{y}_i - \bar{y}_j \geq t_B s \sqrt{\frac{2}{n}}

where tB is the Bayesian t value (Waller and Kemp 1976) depending on k, the F statistic for the one-way ANOVA, and the degrees of freedom for F. The value of tB is a decreasing function of F, so the Waller-Duncan test (specified by the WALLER option) becomes more liberal as F increases.


In summary, if you are interested in several individual comparisons and are not concerned about the effects of multiple inferences, you can use repeated t tests or Fisher's unprotected LSD. If you are interested in all pairwise comparisons or all comparisons with a control, you should use Tukey's or Dunnett's test, respectively, in order to make the strongest possible inferences. If you have weaker inferential requirements and, in particular, if you don't want confidence intervals for the mean differences, you should use the REGWQ method. Finally, if you agree with the Bayesian approach and Waller and Duncan's assumptions, you should use the Waller-Duncan test.

Interpretation of Multiple Comparisons

When you interpret multiple comparisons, remember that failure to reject the hypothesis that two or more means are equal should not lead you to conclude that the population means are, in fact, equal. Failure to reject the null hypothesis implies only that the difference between population means, if any, is not large enough to be detected with the given sample size. A related point is that nonsignificance is nontransitive: that is, given three sample means, the largest and smallest may be significantly different from each other, while neither is significantly different from the middle one. Nontransitive results of this type occur frequently in multiple comparisons.

Multiple comparisons can also lead to counter-intuitive results when the cell sizes are unequal. Consider four cells labeled A, B, C, and D, with sample means in the order A>B>C>D. If A and D each have two observations, and B and C each have 10,000 observations, then the difference between B and C may be significant, while the difference between A and D is not.

Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.

Use of the Postpartum Depression Screening Scale in a Collaborative Obstetric Practice


The study setting was a high-volume collaborative obstetrics and midwifery practice in Albuquerque, New Mexico, with 11 obstetricians and nine nurse-midwives. This group attends about 2000 deliveries per year, and approximately 40% of all clients were enrolled in the Medicaid program. The race/ethnicity profile of the clinical population (Table 3) reflects the diversity of the greater Albuquerque area.[17]

The PPD screening tool was the PDSS developed by Beck and Gable.[16,18] This tool was chosen for ease of use and because phrases in the questions were taken from women's own words. Beck and Gable determined that PDSS cutoffs should be set so as to minimize false negative screens. The authors believed the potential risks of failing to detect a depressed mother far outweigh the risks involved in labeling a normal mother as depressed.[18]

The PDSS is a 35-item Likert-type instrument created for new mothers. It takes 5 to 10 minutes for women to answer, and it is written at a third grade reading level. The PDSS contains short, easy-to-understand items to which mothers respond on a 5-point scale ranging from "strongly disagree" to "strongly agree." The tool yields scores that fall into one of three ranges: normal adjustment (≤ 59), potential symptoms of PPD (60 - 79), and a positive screen for major PPD (≥ 80). A score of ≥ 80 on the PDSS does not diagnose PPD or depression equivalent to a DSM-IV-TR diagnosis of Major Depressive Disorder with Postpartum Onset. However, it does indicate that the woman has a high probability of depression and should be referred to a mental health clinician for assessment and treatment.

Beck and Gable tested the validity of the PDSS by having a nurse psychotherapist interview women using the Structured Clinical Interview for DSM-IV Axis I Disorders. They found that a PDSS cut-off score of ≥ 80 has a sensitivity of 94% and a specificity of 98%. In other words, 94% of women diagnosed with major PPD on the basis of the interviews scored ≥ 80 on the PDSS, and 98% of women who did not meet the criteria for major PPD scored < 80. A PDSS score between 60 and 79 yielded a sensitivity of 91% and specificity of 72% for a diagnosis of minor PPD.[19] The PDSS also provides scores in seven symptom areas: sleeping/eating disturbances, anxiety/insecurity, emotional lability, mental confusion, loss of self, guilt/shame, and suicidal thoughts.[16]

This study was approved by the institutional review board at Presbyterian Healthcare Services, Albuquerque, New Mexico. The PDSS tool was purchased by a grant from the Presbyterian Foundation. A training session that included an overview of PPD, the purpose of a screening tool, an overview of the research protocol, and how to administer and score the PDSS tool was held for all obstetricians, certified nurse-midwives, and medical assistants. We developed a PDSS process flow sheet (Figure 1) to facilitate the administration and scoring of the PDSS, as well as a plan to refer patients with a positive screen for PPD for definitive diagnosis and treatment by mental health professionals. Essential to our study was the creation of a list of available mental health clinicians who had expertise in the treatment of depression, particularly PPD. The project directors were a clinical nurse specialist in psychiatry and a certified nurse-midwife from the collaborative practice, and this combination provided a bridge across the disciplines.

Figure 1. Postpartum Depression Screening Scale process flow sheet. MHP = mental health provider. (Figure by C. Byrne.)

The medical assistants introduced the research study and PDSS tool to women at the 6-week postpartum office visit. A signed consent form, which included informed consent and the authorization for the disclosure of personal health information, was obtained from each participant. Medical assistants asked the woman to complete the first seven questions (short form) of the PDSS and scored the results. If a woman scored ≤ 13, the clinician discussed these normal results with the patient, and no referral or follow-up was deemed necessary. If a woman scored ≥ 14, she was asked to complete the remaining 28 questions, and the tool was scored by either the medical assistant or the clinician. If women scored ≤ 59 on the full questionnaire, the clinician discussed these normal results and no referral or follow-up was necessary. Women who scored in the "potential symptoms of PPD" category on the full screening tool (a score of 60 - 79) were informed about PPD and encouraged to increase their support, sleep, and nutrition. They were asked to notify their clinician if symptoms worsened. The score was then documented in the chart along with the plan of care. In the case of a positive screen (a score of ≥ 80, indicating a risk for major PPD), PPD was discussed with women, available treatment options were reviewed, and a referral was made to a mental health professional for further evaluation and treatment. The questionnaires and consent forms were placed in the PDSS study folder and collected by the researchers for data entry.
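The two-stage protocol above amounts to a small decision rule. The sketch below is a hypothetical illustration of that flow; the function name and return labels are ours, not part of the PDSS instrument, and actual scoring must follow the published PDSS manual.

```python
def pdss_triage(short_form_score, full_form_score=None):
    """Hypothetical triage mirroring the study flow sheet:
    short form <= 13 -> normal, stop; >= 14 -> administer the full form."""
    if short_form_score <= 13:
        return "normal - no referral"
    if full_form_score is None:
        return "administer full 35-item PDSS"
    if full_form_score <= 59:
        return "normal - no referral"
    if full_form_score <= 79:
        return "potential symptoms - educate, monitor"
    return "positive screen - refer to mental health professional"
```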

Once the study was underway, the researchers conducted a series of semi-structured interviews with clinicians and medical assistants in order to understand their experience using the screening tool and their perceptions of the benefits and difficulties of using it in a busy clinical practice. The interviews were transcribed, themes were developed, and the most frequently mentioned items were tallied and categorized.

The researchers collected the completed PDSS forms, which had the women's names on them. Each form was assigned an identification number, and the data were entered, by identification number only, into an Excel spreadsheet (Microsoft, Redmond, WA) that included the PDSS score and specific demographic and clinical items: age, parity, race/ethnicity, education, marital status, infant feeding, type of delivery, and history of depression. After data entry, all records were kept in a locked file.

The analysis was performed by experienced personnel at University of New Mexico using SAS software version 9 (SAS Institute, Inc., Cary, NC) on a mainframe computer. Analysis included descriptive statistics, the χ2 test to assess differences in proportions for category of maternal characteristics by PDSS score (negative, symptoms present, or positive screen for PPD), and logistic regression using a backward elimination strategy. This latter method included significant study variables in a regression model, and systematically eliminated one at a time based on lack of statistical significance in the presence of the other variables. The final result indicated significant predictors for a positive screen for PPD using the PDSS in our clinical setting.

Preoperative Medical Testing in Medicare Patients Undergoing Cataract Surgery

Study Oversight and Data Source

This study was approved by the institutional review board at the University of California, San Francisco. We obtained research identifiable files from the Centers for Medicare and Medicaid Services (CMS) Research Data Distribution Center for Medicare beneficiaries who underwent cataract surgery in 2011. For each beneficiary, we obtained the outpatient file, the carrier file, the Medicare Provider Analysis and Review file, and the Master Beneficiary Summary file representing all claims from January 1, 2010, through December 31, 2011.

Study Cohort

Using the outpatient and carrier files, we identified patients undergoing cataract surgery in 2011 on the basis of the Current Procedural Terminology (CPT) codes for cataract surgery (66982, 66983, 66984, 66850, 66920, 66930, and 66940).1 We included patients 66 years of age or older with at least 12 months of Medicare eligibility before surgery who were enrolled in the Medicare fee-for-service program without a concurrent health maintenance organization plan. From a total of 1,004,972 eligible persons, we obtained a random sample of 500,000 beneficiaries. We defined each beneficiary’s index surgery date as the first date of an ophthalmology claim for routine cataract surgery (CPT codes 66982–66984). We did not include beneficiaries who had an International Classification of Diseases, Ninth Revision, code indicating previous cataract surgery (V43.1, V45.61, or 379.31), who had had a cataract surgery claim in 2010, who had had an inpatient hospital stay within 30 days before the index surgery (to avoid misclassification of follow-up testing associated with the inpatient admission as preoperative testing), or who could not be linked to a hospital referral region.

Predictor Variables

The patient characteristics that we examined included age, sex, race, and health status; health status was assessed with the use of the Quan modification of the Charlson comorbidity index (scores on the Charlson comorbidity index range from 0 to 33, with higher scores indicating greater disease burden and increased risk of death within 1 year).18 Health system characteristics were derived with the use of the ophthalmologist’s ZIP Code, hospital referral region,19 and the Rural–Urban Commuting Area Codes.20

Identification of Preoperative Tests

We designated the following tests as possible preoperative tests: complete blood count, chemical analysis, coagulation studies, urinalysis, electrocardiography, echocardiography, cardiac stress tests, chest radiography, and pulmonary-function tests; we also included individual components of standard laboratory panels (listed in Table S1 in the Supplementary Appendix, available with the full text of this article). Like Steinberg et al.,16 we categorized these tests as routine preoperative tests if they occurred during the preoperative month (i.e., within 30 days before, but not including, the day of the index surgery).

Identification of Preoperative Office Visits

We defined a preoperative office visit as any new or established visit (CPT codes 99201–99205 or 99211–99215) to any general practice, anesthesiology, cardiology, family practice, internal medicine, or geriatric medicine physician, nurse practitioner, or physician assistant within 30 days before the index surgery. Office visits to ophthalmologists or optometrists were excluded.

Identification of the Test-Ordering Provider and the Location of Surgery

To assess preoperative testing according to provider, we used the operating ophthalmologist as a proxy for the group of physicians (e.g., the ophthalmologist, primary care physician, and anesthesiologist) who made up the patient’s perioperative care team, because we could not use Medicare claims data to reliably determine the specific physician ordering the test. We refer to the group of physicians who might have ordered a patient’s tests as the “care team,” but we use the term “ophthalmologist” when appropriate for clarity. We ascertained a surgical location for 98.5% of the beneficiaries by identifying an ambulatory surgery center or hospital outpatient department claim in the carrier or outpatient file submitted within 1 day before or after the ophthalmologist-submitted claim.


Costs were calculated on the basis of the national limitation amount from the 2011 Clinical Laboratory Fee Schedule21 for laboratory tests. For nonlaboratory tests and office visits, the nonfacility price from the 2011 Medicare Physician Fee Schedule was used.22

Statistical Analysis

All statistical analyses were performed with the use of SAS software, version 9.4 (SAS Institute). We calculated the proportion of beneficiaries with preoperative testing or office visits and the associated cost to Medicare. Patient and health system characteristics, including surgical setting, were assessed for patients who did and patients who did not undergo preoperative testing or office visits; chi-square tests were used to assess differences in categorical variables, and Student’s t-tests were used for continuous variables. We calculated the mean number of tests and office visits per beneficiary per month to evaluate changes over time.

For the analysis of variation among ophthalmologists, we identified the percentage of each ophthalmologist's patients who had preoperative testing or office visits and graphed this result for ophthalmologists who performed five or more surgeries in 2011. Within groups of ophthalmologists stratified according to the rate of preoperative testing, we calculated excess testing by subtracting the mean number of tests per month during the 11-month baseline period from the number of tests in the preoperative month.
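The excess-testing calculation is simple differencing; a hypothetical worked example (made-up counts, not study data):

```python
def excess_tests(preop_month_tests, baseline_monthly_counts):
    # tests in the preoperative month minus the 11-month baseline mean
    baseline_mean = sum(baseline_monthly_counts) / len(baseline_monthly_counts)
    return preop_month_tests - baseline_mean

# e.g., 9 tests in the preoperative month vs a steady baseline of 2 per month
print(excess_tests(9, [2] * 11))  # -> 7.0
```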

To estimate independent predictors of preoperative testing or office visits, using only beneficiaries with a known surgical location, we created a two-level hierarchical model with the ophthalmologist as a random effect to account for clustering of patients according to ophthalmologist. We included demographic characteristics of the patients, health status, geographic region, surgical setting, and other health system variables as fixed effects. The influence of the ophthalmologist was summarized with the median odds ratio, which is the median ratio of the odds of preoperative testing or office visits between demographically identical patients of equal health status who are the patients of two randomly selected ophthalmologists, with the clusters compared in descending order so that the odds ratios are always 1 or more.23,24 This summary statistic is a function of the estimated variance of the random effect and is directly comparable to odds ratios of fixed-effects variables. To assess the differential contributions to variation in testing attributable to the ophthalmologist, the characteristics of the patient, and the occurrence of an office visit in the preoperative month, we created multiple models of preoperative testing and calculated C-statistics for each model.
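One standard formulation of the median odds ratio (Larsen and Merlo) computes it directly from the estimated variance \sigma^2 of the cluster-level random effect: MOR = exp(\sqrt{2\sigma^2} \cdot \Phi^{-1}(0.75)). The sketch below assumes this is the formulation behind refs. 23 and 24; the function name is ours.

```python
import math
from statistics import NormalDist

def median_odds_ratio(sigma2):
    """MOR = exp(sqrt(2 * sigma2) * z_0.75), where sigma2 is the variance
    of the ophthalmologist-level random effect on the log-odds scale."""
    z75 = NormalDist().inv_cdf(0.75)  # ~0.6745
    return math.exp(math.sqrt(2 * sigma2) * z75)
```

Because the square root is nonnegative, the MOR is always 1 or more, matching the convention of comparing clusters in descending order; a variance of zero gives an MOR of exactly 1 (no between-ophthalmologist variation).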

Serum electrolyte concentrations and hydration status are not associated with exercise associated muscle cramping (EAMC) in distance runners

Exercise associated muscle cramping (EAMC) can be defined as "a painful, involuntary contraction of skeletal muscle that occurs during or immediately after exercise."1 Numerous studies have documented the serum electrolyte changes that occur with endurance exercise.2–6 Serum electrolyte and fluid disturbances have been associated with the development of muscle cramps in certain clinical conditions.7–13 It is therefore often assumed that EAMC is also caused by fluid imbalances, in particular dehydration, and serum electrolyte abnormalities.14–17 This assumption is common,15,16,18–20 despite the fact that very few studies have examined the relation between changes in serum electrolyte concentrations and the development of EAMC. There are no well-conducted studies that have documented a relation between dehydration and muscle cramping in athletes.

A prospective study of marathon runners showed no association between EAMC and plasma volume changes or changes in serum sodium, potassium, calcium, phosphate, bicarbonate, urea, or creatinine concentrations.21 Two limitations of this study were that serum magnesium concentrations were not determined and that serum electrolyte concentrations were not documented in the recovery period after the race. If abnormal serum electrolyte concentrations return to normal during recovery, and this correlates with clinical recovery from EAMC, it would support the hypothesis that abnormal serum electrolyte concentrations are related to EAMC.

Magnesium has been shown to play an important role in muscle and nerve function.22,23 Magnesium is also often promoted, mostly by the industry, as the most important electrolyte supplement for preventing skeletal muscle cramping in athletes. Thus any study examining the relation between EAMC and changes in serum electrolyte concentrations should include measurement of serum magnesium.

The relation between serum electrolytes, dehydration, and EAMC has been reported in two other case series. There was no association between EAMC and serum potassium concentrations in cyclists who cycled for between two and a half and five hours, and no association between dehydration (per cent body weight loss) and uncontrolled muscle contraction in 44 triathletes.24,25 Small subject numbers and the lack of any control groups were limitations of those case series.

There is clearly a lack of research documenting the association between EAMC, dehydration, and serum electrolyte status. The relation between EAMC and fluid and electrolytes during the recovery phase from acute cramping after exercise has also not been documented. This relation is particularly important to document because a disassociation between recovery from EAMC and changes in serum electrolyte concentrations would strongly support the hypothesis that changes in serum electrolytes are not related to the aetiology of EAMC.

Our aim in this study was therefore to document the relation between the development of EAMC in ultra-distance runners and concomitant changes in serum electrolyte concentrations and hydration status.



This prospective cohort study was conducted at the Two Oceans Ultra-marathon, a 56 km road race held annually in Cape Town, South Africa. The ethics and research committee of the University of Cape Town Medical School approved the study.

All runners who registered for the race were considered potential subjects. Through a pre-race advertisement campaign in the press and during registration for the race, a cohort of 72 runners was recruited. Forty-five of these had a history of regularly suffering from EAMC and 27 had no previous experience of muscle cramping. A regular history of EAMC was defined as a history of EAMC during at least two of every six consecutive races. Inclusion criteria were that all subjects be male, between the ages of 20 and 60, have a history of at least two years of active running as a registered runner, have no history of medical illness, and have no history of chronic or recent medicinal drug use.

A personal interview was conducted with each runner during pre-race registration by one of the investigators (JN). A questionnaire was first administered to the runners to elicit personal details, medical history, and any history of EAMC. A personal interview followed during which the information in the questionnaire was confirmed and all testing procedures were carefully explained. Written informed consent was obtained from each subject.

On race day, 21 runners from the initial subgroup of 45 runners with a history of cramping, who were all part of the cohort of 72 runners, suffered acute EAMC during or within 60 minutes of completing the race and formed the "cramp" group (n = 21). Data were collected from 22 of the 27 runners who had no past history of cramping but who were all part of the initial cohort of 72 runners. These runners formed the control group (n = 22); they had no past history of EAMC and did not suffer any form of muscle cramping during or after the race. Twenty-nine runners of the original cohort of 72 were excluded for various reasons: 16 failed to comply with the protocol during the race, nine had incomplete blood samples, and four drank significant amounts of fluid after completing the race but before arriving for immediate post-race blood sample collection.

Pre-race assessment

Pre-race weighing was conducted on the morning of the race. All runners were weighed at least 75 minutes before the start of the race (0600 hours). Body mass was measured in full race attire (running shorts, vest, and shoes). An electronic scale (Soehnle, Germany) was used for all body mass measurements. The scale was calibrated before weighing sessions. Subjects were instructed not to drink any fluids or consume any food between the weighing procedure and the start of the race.

Pre-race blood samples were collected from all subjects in the 75 minutes before the start of the race. Blood samples were taken from the antecubital veins, with subjects in the seated position. All the blood samples were clearly coded and stored for later analysis. Blood samples were analysed for haemoglobin concentration, packed cell volume, plasma proteins, serum sodium, serum potassium, serum total calcium, serum total magnesium, serum osmolality, and plasma glucose.

The temperature on race day ranged from 14.3°C to 23.8°C, with a relative humidity of 47%.

Post-race assessment

All subjects reported to the medical tent within five minutes of completing the race and before drinking any fluid or emptying their bladder. On arrival at the medical tent, blood samples were immediately collected from all the subjects. These samples were analysed for haemoglobin concentration, packed cell volume, plasma proteins, serum sodium, serum potassium, serum total calcium, serum total magnesium, serum osmolality, and plasma glucose. Thereafter subjects were weighed in full race attire (running shorts, vest, and shoes) as before the race. Subjects were requested to return 60 minutes later for a repeat blood sample and an interview to obtain race cramp history. Runners were allowed to consume liquids ad libitum in this 60 minute post-race period.

A second post-race blood sample was collected from subjects at 60 minutes after the race. This blood sample was analysed for serum sodium, serum potassium, serum total calcium, serum total magnesium, serum osmolality, and plasma glucose.

Race cramp history

All subjects were requested to monitor muscle cramp status during the race. These data were documented at the 60 minute post-race interview. The following data were collected from the EAMC group: muscle groups that cramped, the race distance when cramps started, the duration of cramping bouts, the recurrence of cramping bouts, cramping severity, and relieving factors for an acute cramp.

Muscle groups were listed as quadriceps, hamstrings, gastrocnemius, and other. Race distance was listed as less than 28 km, 28 to 42 km, or more than 42 km. Landmarks familiar to all Two Oceans marathon runners were used to identify these distances, which relate to the first half, the third quarter, and the last quarter of the race, respectively. Duration of EAMC was recorded as lasting for less than 30 seconds, 30 to 120 seconds, or more than 120 seconds. Recurrence of EAMC referred to the number of repeat episodes of the muscle cramp after the first episode. This was quantified as one to two recurrences, three to four recurrences, or more than four recurrences until completion of the race.

Cramp severity was quantified according to the time the runner was forced to stop or run slowly and the ability to continue running afterwards. Any cramping where the runner could continue running or walking within five minutes was classified as mild. Cramping that required the runner to stop for 10 to 15 minutes but with resumption of walking or running was classified as moderate, and cramping that required stopping for more than 15 minutes with inability to resume a comfortable walking or running pace was classified as severe. Relieving factors included stretching the cramping muscle, massaging the cramping muscle, and slowing of running pace.
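For illustration only (the study published no code), the severity criteria above can be expressed as a simple classification rule. The function name is ours, and the 5–10 minute gap left unstated in the criteria is treated as moderate in this sketch:

```python
def cramp_severity(minutes_stopped: float, resumed_running: bool) -> str:
    """Classify EAMC severity per the study's criteria (helper name ours):
    mild     -- able to continue running or walking within 5 minutes
    moderate -- stopped for 10-15 minutes but resumed walking or running
    severe   -- stopped for more than 15 minutes, unable to resume a
                comfortable walking or running pace
    The unstated 5-10 minute gap is treated as moderate here."""
    if minutes_stopped <= 5:
        return "mild"
    if minutes_stopped <= 15 and resumed_running:
        return "moderate"
    return "severe"
```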

Criteria for discharge

Discharge criteria were that subjects had to be pain-free, able to walk freely, and have suffered no cramping activity for 30 minutes.

Analysis of blood samples

A centrifuge and two large cooler boxes filled with ice were set up in a section of the medical tent for immediate preparation and storage of blood samples. Serum was obtained for sodium, potassium, calcium, magnesium, and osmolality analysis. Blood for glucose analysis was obtained in tubes containing potassium oxalate/sodium fluoride. Blood for haemoglobin and packed cell volume analysis was obtained in tubes containing K3 EDTA 15%. Aliquots of whole blood were removed from the purple top vacutainers and kept for later analysis of packed cell volume and haemoglobin. The remainder of the vacutainer was kept on ice until centrifugation at 2500 rpm for 10 minutes. Serum was separated within 45 minutes of obtaining the blood sample.

Storage of serum samples was done at −20°C in well sealed tubes until analysis. This storage was achieved within 90 minutes of blood sampling. The blood samples were stored at the Physiology Laboratories at the University of Cape Town (UCT) Medical School. All analyses were completed within 10 days after collection. All samples were analysed in duplicate with suitable standards and random intervals between samples.

Serum sodium and serum potassium determinations were done using ion selective electrodes on an ionised sodium/potassium analyser (KNA1, Radiometer, Copenhagen, Denmark). Serum total calcium and total magnesium concentrations were determined using atomic absorption spectrophotometry (Varian AA 1275 series atomic absorption spectrophotometer, Varian Techtron, Musgrave, Australia). The serum samples were diluted 1:40 with 0.1% lanthanum chloride solution. The wavelengths of determination were 422.7 nm for calcium and 285.2 nm for magnesium. Serum osmolality determinations were done using an automatic osmometer (Osmette A, Precision Systems, Newton, Massachusetts, USA). Particular care was taken with the storage of samples before measurement of serum osmolality, as storage may result in water evaporation and consequent measurement error. All samples were stored in paraffin wax sealed Eppendorf tubes until analysis could be undertaken. Plasma glucose determinations were done on fluoride/oxalated plasma using an automated glucose analyser (Beckman Glucose Analyzer 2, Beckman Instruments, Brea, California, USA). This glucose analyser was based upon an enzymatic glucose oxidase technique. Serum protein determinations were done on the serum samples using the standard Biuret method.

Haemoglobin determinations were done on whole blood samples using the standard cyanmethaemoglobin technique (Drabkins solution) for haemoglobin estimation. Packed cell volume determinations were carried out in duplicate using a microhaematocrit centrifuge.

Two methods were used to determine changes in the hydration status of runners in this study. The first used the change between pre-race and post-race body mass measurements. The percentage change in body weight after the race was calculated as: ([pre-race weight − post-race weight]/pre-race weight) × 100. The extent of body weight loss was used as an index of dehydration.23,26
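As a minimal sketch of this calculation (the function name and sample values are illustrative, not study data):

```python
def percent_weight_loss(pre_race_kg: float, post_race_kg: float) -> float:
    """Percentage change in body weight during the race, per the study's
    formula: [pre-race weight - post-race weight] / pre-race weight x 100.
    A positive value indicates weight loss, used as an index of dehydration."""
    return (pre_race_kg - post_race_kg) / pre_race_kg * 100
```

For example, a runner weighing 70.0 kg before and 68.0 kg after the race shows a loss of about 2.86%.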

The second method applied the Dill and Costill formula to the haematological data to calculate the changes in blood volume, plasma volume, and red cell volume that occurred during the race.27
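The Dill and Costill calculation is commonly presented in the following form. This sketch uses our own variable names and illustrative inputs; it should be checked against the original reference27 before reuse:

```python
def dill_costill(hb_pre, hb_post, hct_pre, hct_post):
    """Percentage changes in blood volume, red cell volume, and plasma
    volume from pre- and post-race haemoglobin (g/dl) and packed cell
    volume (as a fraction), after Dill and Costill (1974). Pre-race blood
    volume is set to 100 arbitrary units."""
    bv_pre = 100.0
    bv_post = bv_pre * (hb_pre / hb_post)   # blood volume scales with Hb ratio
    cv_pre = bv_pre * hct_pre               # red cell volume = BV x Hct
    cv_post = bv_post * hct_post
    pv_pre = bv_pre - cv_pre                # plasma volume = BV - CV
    pv_post = bv_post - cv_post

    def pct(post, pre):
        return 100.0 * (post - pre) / pre

    return {"blood_volume": pct(bv_post, bv_pre),
            "red_cell_volume": pct(cv_post, cv_pre),
            "plasma_volume": pct(pv_post, pv_pre)}
```

For instance, with haemoglobin rising from 15.0 to 16.0 g/dl and packed cell volume from 0.45 to 0.47, the sketch yields roughly a 6.3% fall in blood volume and a 9.7% fall in plasma volume.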

Statistical analysis

Statistical analysis of the data was conducted at the Medical Research Council in Cape Town using SAS system software on a Sun SPARC 10 server (SAS Institute Inc, SAS/STAT User’s Guide, version 6, 4th ed; Cary, North Carolina: SAS Institute Inc, 1989). Data were analysed using both parametric and non-parametric statistics; where data were skewed, non-parametric statistics were used. The Wilcoxon two sample test was used for comparisons between groups, and a Tukey-type multiple comparison post hoc test was used to determine where the differences between the groups lay. The Wilcoxon signed rank test was used to test for within group differences. Significance was accepted at the 0.05 level.
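The between-group comparison can be illustrated with a pure-Python Wilcoxon rank-sum sketch using the normal approximation. This is not the original SAS code; a real analysis would use a statistical package, and the example data are invented:

```python
import math

def rank_sum_test(x, y):
    """Two-sample Wilcoxon rank-sum test (normal approximation).
    Returns (z, two-sided p). Mid-ranks are assigned to ties."""
    n, m = len(x), len(y)
    combined = sorted((v, i) for i, v in enumerate(x + y))
    vals = [v for v, _ in combined]
    ranks = [0.0] * (n + m)
    i = 0
    while i < n + m:
        j = i
        while j + 1 < n + m and vals[j + 1] == vals[i]:
            j += 1
        mid_rank = (i + j) / 2 + 1          # average 1-based rank across tie
        for k in range(i, j + 1):
            ranks[combined[k][1]] = mid_rank
        i = j + 1
    w = sum(ranks[:n])                      # rank sum of the first sample
    mu = n * (n + m + 1) / 2                # mean of W under H0
    sigma = math.sqrt(n * m * (n + m + 1) / 12.0)
    z = (w - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))    # two-sided p value
    return z, p
```

Two clearly separated samples of eight values each give z ≈ −3.36 and p well below 0.05.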


Descriptive data

The average age (years), weight (kg), and finishing time (minutes) from the cramp and control groups are presented in table 1. There were no significant differences between the two groups for any of these variables.

Table 1

 Age, weight, and finishing time for the cramp and control groups

Race cramping history

The clinical characteristics and race history of muscle cramping in the cramp group are presented in table 2. The most common muscle groups reported by the cramp group were hamstring (48%) and quadriceps (38%). All the runners (100%) reported cramping in the latter half of the race, with most (67%) reporting cramping immediately after the race. Most cramps (62%) reportedly lasted longer than 30 seconds and most runners (71%) reported three or more cramping episodes. Most cramps (57%) were moderate to severe in intensity and were best relieved by slowing the running pace (76%) or passive stretching (52%).

Table 2

 Clinical characteristics and race history of muscle cramping in the cramping group (n = 21)

Serum electrolyte concentrations and haematological data

The pre-race, immediate post-race, and 60 minute post-race results for serum sodium, potassium, total calcium, and total magnesium concentrations (mmol/l), serum osmolality (mmol/kg), plasma glucose concentration (mmol/l), plasma proteins (g/l), packed cell volume (%), and haemoglobin (g/dl) from the cramp and control groups are given in table 3.

Table 3

 Values for pre-race, immediate post-race, and 60 minute post-race serum sodium, potassium, total calcium, and total magnesium concentrations, serum osmolality, plasma glucose concentration, plasma proteins, packed cell volume, and haemoglobin results for the cramp and control groups, expressed as mean (SD) and median (1st and 3rd quartiles)

There were no significant differences between cramp and control groups for pre-race, immediate post-race, and 60 minute post-race serum potassium and calcium concentrations, osmolality, plasma glucose, plasma proteins, packed cell volume, or haemoglobin results. The immediate post-race serum sodium concentration was lower (p = 0.004) in the cramp group (139.8 (3.1) mmol/l) than in the control group (142.3 (2.1) mmol/l). The immediate post-race serum magnesium concentration was higher (p = 0.03) in the cramp group (0.73 (0.1) mmol/l) than in the control group (0.67 (0.1) mmol/l).

Hydration status

The average per cent change in body weight, blood volume, plasma volume, and red cell volume between immediate post-race and pre-race from both the cramp and control groups are shown in table 4. There was no significant difference between the two groups for any of these variables.

Table 4

 Per cent change (pre-race to post-race) in body weight, blood volume, plasma volume, and red cell volume during the race for the cramp and control groups


This study had two main findings. First, there was no relation between any clinically significant changes in serum concentrations of sodium, potassium, total calcium, and total magnesium and the development of EAMC in ultra-distance runners before or immediately after a race, or during the period of clinical recovery from EAMC. Second, there was no relation between changes in hydration status (measured by changes in body weight, plasma volume, blood volume, or red cell volume) and the development of EAMC in ultra-distance runners during or immediately after a race. Our results also show no relation between the development of EAMC in ultra-distance runners and changes in serum osmolality, blood glucose concentration, or plasma protein concentration before and after a race.

The association between skeletal muscle cramping and disturbances of serum electrolyte concentrations has been well documented in a variety of medical conditions. These include plasma volume contraction and extracellular hypo-osmolality in patients undergoing haemodialysis, severe dehydration after sweating, vomiting or diarrhoea, hyponatraemia associated with whole body salt depletion, hypokalaemia and hyperkalaemia, hypocalcaemia, and more recently hypomagnesaemia.9–12,28 In most instances, the postulated mechanism for the development of skeletal muscle cramping is related to alterations in neuromuscular excitability which can be induced by disturbances in serum electrolyte concentrations. Clinically, patients with these abnormalities present with increased generalised neuromuscular excitability, which can lead to generalised skeletal muscle cramping.

Numerous studies have shown that disturbances in serum electrolyte concentrations and hydration status can occur following ultra-distance running.2–4,6 It is therefore commonly assumed that there is a relation between changes in serum electrolyte concentrations, hydration status, and the development of EAMC in distance runners.15,16,18–20

In the present study, serum electrolyte concentrations and hydration status were not associated with the clinical symptoms of EAMC. The statistically significant differences in serum sodium and magnesium concentrations between the cramping and control groups in the immediate post-race period were too small to be of clinical significance. Furthermore, the decrease in serum sodium concentration following the race in the cramp group is probably related to an increased fluid intake during the race in this group.29 Although drinking patterns were not measured directly, increased drinking in the cramp group is likely because of the well publicised belief that cramping is caused by dehydration. The serum electrolyte concentrations observed in this study are also similar to those reported by others and, most importantly, the changes are too small to be of any clinical significance.21,24,25

This study was not designed to measure fluid balance in the runners; therefore precise data on the type and volumes of fluid ingested during the race, as well as specific losses (sweat, urine, faeces) during the race, are not presented. However, accurate and consistent data on hydration status immediately after the race (body weight changes, changes in plasma volume, changes in blood volume) indicate that runners with EAMC were less dehydrated than non-cramping runners at the time of presentation. The per cent decrease in body weight (pre- to post-race) was less in the cramp group (2.9%) than in the control group (3.6%).

The absence of any relation between clinical recovery from EAMC and changes in serum electrolyte concentrations in the 60 minute period after the race also does not support the hypothesis that EAMC is associated with alterations in serum electrolyte concentrations. Finally, the clinical picture of EAMC is that of localised skeletal muscle cramping, which differs substantially from the generalised skeletal muscle cramping observed in patients with serum electrolyte changes secondary to systemic disease.9–12,28


The results of our study do not support the common hypotheses that EAMC is associated with either changes in serum electrolyte concentrations or changes in hydration status following ultra-distance running. An alternative hypothesis to explain the aetiology of EAMC must therefore be sought.


  1. Schwellnus MP, Derman EW, Noakes TD. Aetiology of skeletal muscle “cramps” during exercise: a novel hypothesis. J Sports Sci 1997;15:277–85.

  2. Hiller WD, O’Toole M, Massimino AF, et al. Plasma electrolyte and glucose changes during the Hawaiian Ironman triathlon. Med Sci Sports Exerc 1985;17(suppl):S219.

  3. Hiller WD, O’Toole ML, Laird RH, et al. Electrolyte and glucose changes in endurance and ultra-endurance exercise: results and medical implications. Med Sci Sports Exerc 1986;18(suppl):S62–3.

  4. Hiller WD, O’Toole M, Laird RH. Hyponatraemia and ultra marathons [letter]. JAMA 1986;11:213–14.

  5. Hiller WD, O’Toole M, Laird RH, et al. Hyponatraemia in triathletes: a prospective study of 270 athletes in events from 2 hours to 34 hours duration. Med Sci Sports Exerc 1987;10(suppl):S71.

  6. Noakes TD, Norman RJ, Buck RH, et al. The incidence of hyponatraemia during prolonged ultra endurance exercise. Med Sci Sports Exerc 1990;22:165–70.

  7. Eaton JM. Is this really a muscle cramp? Postgrad Med 1989;86:227–32.

  8. Knochel JP. Environmental heat illness: an eclectic review. Arch Intern Med 1974;133:841–64.

  9. Knochel JP. Neuromuscular manifestations of electrolyte disorders. Am J Med 1982;72:521–35.

  10. Layzer RB, Rowland LP. Cramps. N Engl J Med 1971;285:31–40.

  11. McGee SR. Muscle cramps. Arch Intern Med 1990;150:511–18.

  12. Neal CR, Resnikoff E. Treatment of dialysis-related muscle cramps with hypertonic dextrose. Arch Intern Med 1981;141:171–3.

  13. Shearer S. Dehydration and serum electrolyte changes in South African gold miners with heat disorders. Am J Indust Med 1990;17:225–39.

  14. Eichner ER. Heat cramps: salt is simplest, most effective antidote. Sports Med Dig 1999;21:88.

  15. Peterson L, Renstrom P. Sports injuries: their prevalence and treatment. 3rd ed. London: Martin Dunitz, 2001:416.

  16. Eichner ER. Should I run tomorrow? In: Tunstall-Pedoe DS, ed. Marathon medicine. London: Royal Society of Medicine Press, 2000:323–5.

  17. Maughan RJ. Exercise-induced muscle cramp: a prospective biochemical study in marathon runners. J Sports Sci 1986;4:31–4.

  18. Casoni I, Guglielmini C, Graziano L, et al. Changes of magnesium concentrations in endurance athletes. Int J Sports Med 1990;11:234–7.

  19. Wilmore JH, Costill DL. Nutrition and nutritional ergogenics. Physiology of sport and exercise. Lower Mitcham, South Australia: Human Kinetics, 1994:361.

  20. Brouns F, Beckers E, Wagenmakers AJ, et al. Ammonia accumulation during highly intensive long-lasting cycling. Int J Sports Med 1990;11:S78–84.

  21. Kantorowski PG, Hiller WD, Garrett WE, et al. Cramping studies in 2600 endurance athletes. Med Sci Sports Exerc 1990;22(suppl 4):S104.

  22. Maughan RJ. Thermoregulation in marathon competition at low ambient temperature. Int J Sports Med 1985;6:15–19.

  23. Dill DB, Costill DL. Calculation of percentage changes in volumes of blood, plasma, and red cells in dehydration. J Appl Physiol 1974;37:247–8.

  24. Milutinovich J, Graefe U, Follette WC, et al. Effect of hypertonic glucose on the muscular cramps of haemodialysis. Ann Intern Med 1979;90:926–8.

  25. Speedy DB, Noakes TD, Rogers IR, et al. Hyponatremia in ultradistance triathletes. Med Sci Sports Exerc 1999;31:809–15.

Trends in Survival after In-Hospital Cardiac Arrest

Data Source

The GWTG–Resuscitation registry is a large, prospective, hospital-based clinical registry of patients with in-hospital cardiac arrests in the United States. The design of the registry has been previously described in detail.9 Briefly, all hospitalized patients with a confirmed cardiac arrest (defined as the lack of a palpable central pulse, apnea, and unresponsiveness), without do-not-resuscitate (DNR) orders, and who have received CPR are identified and enrolled by specially trained personnel. To ensure that all cases in a hospital are captured, multiple case-finding approaches are used, including centralized collection of cardiac-arrest flow sheets, review of hospital paging-system logs, and routine checks of code carts (carts stocked with emergency medications and equipment), pharmacy tracer drug records, and hospital billing charges for use of resuscitation medications.13 The registry uses standardized Utstein-style definitions for clinical variables and outcomes.14,15 Data completeness and accuracy are ensured by rigorous training and certification of hospital staff, use of standardized software with internal data checks, and a periodic re-abstraction process, in which a random audit has revealed a mean error rate of 2.4%.9

This study was approved by the institutional review board at the University of Iowa. The requirement for informed consent was waived. The first author vouches for the integrity of the data and accuracy of the results. All analyses were prespecified and adhered to the study protocol. Although the American Heart Association oversees the GWTG–Resuscitation registry, it had no role in the study design, data analysis or interpretation, or manuscript preparation.

Study Population

Figure 1. Figure 1. Study Cohort.

ICU denotes intensive care unit.

We identified 113,514 adults at 553 hospitals participating in the GWTG–Resuscitation registry who were 18 years of age or older with an index cardiac-arrest event from January 1, 2000, through November 19, 2009 (Figure 1). For patients with multiple cardiac arrests, only the first episode was included. We restricted our sample to patients with cardiac arrests occurring in an intensive care unit or inpatient ward and excluded 24,377 patients with arrests in operating rooms, procedural suites, or emergency departments, because patients who have cardiac arrests in these settings have distinct clinical circumstances and outcomes. Because we were interested in examining trends in survival over time, we also excluded 4292 patients at 179 hospitals with fewer than 3 years of data submission or low case volumes (fewer than 5 cardiac arrests per year). Finally, we excluded patients with missing data on survival (148 patients) and calendar year (72 patients). Our final sample comprised 84,625 patients from 374 hospitals (for hospital characteristics, see Table S1 in the Supplementary Appendix, available with the full text of this article at

Study Outcomes

The primary outcome was survival to discharge. All analyses are reported for the overall cohort and separately according to the initial rhythm. To better understand which specific phase of resuscitation care may have led to improvement in survival, we separately examined rates of acute resuscitation survival (defined as the return of spontaneous circulation for at least 20 contiguous minutes at any time after the initial pulseless arrest) and postresuscitation survival (defined as survival to hospital discharge among patients who survived the acute resuscitation). We also examined temporal trends in time to defibrillation in patients with ventricular fibrillation or pulseless ventricular tachycardia.16

To confirm that any temporal trend in survival was clinically important, we also examined rates of neurologic disability among survivors. This was assessed with the use of the cerebral-performance category (CPC) scores.17 A CPC score of 1 denotes mild or no neurologic disability, 2 moderate neurologic disability, 3 severe neurologic disability, 4 coma or vegetative state, and 5 brain death. We examined temporal trends for clinically significant neurologic disability (CPC score at discharge, >1) and severe neurologic disability (CPC score at discharge, >2).16,18
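The two disability endpoints are simple threshold rules on the discharge CPC score; the helper name and output keys below are ours, for illustration only:

```python
def cpc_endpoints(cpc: int) -> dict:
    """Map a discharge cerebral-performance category (CPC) score (1-5)
    to the study's two neurologic disability endpoints."""
    labels = {1: "mild or no neurologic disability",
              2: "moderate neurologic disability",
              3: "severe neurologic disability",
              4: "coma or vegetative state",
              5: "brain death"}
    return {"label": labels[cpc],
            "clinically_significant_disability": cpc > 1,  # CPC > 1
            "severe_disability": cpc > 2}                  # CPC > 2
```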

Statistical Analysis

To evaluate changes in baseline characteristics by calendar year, we used the Mantel–Haenszel test of trend for categorical variables and linear regression for continuous variables. To assess whether survival to discharge had improved over time, multivariable regression models using generalized estimating equations were constructed for the overall cohort and according to initial rhythm. These models accounted for clustering of patients within hospitals. Because survival exceeded 10%, we used Zou's method to directly estimate rate ratios instead of odds ratios by specifying a Poisson distribution and including a robust variance estimate in our models.19,20 Our independent variable, calendar year, was included as a categorical variable, with 2000 as the reference year. We multiplied the adjusted rate ratio for each year (2001 through 2009) by the observed survival rate for the reference year to obtain yearly risk-adjusted survival rates for the study period. These rates represent the estimated survival for each year if the patient case mix were identical to that in the reference year. We also evaluated calendar year as a continuous variable to obtain adjusted rate ratios for year-to-year survival trends.
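The risk-adjustment step described here (adjusted rate ratio multiplied by reference-year survival) amounts to the following; the rate ratios passed in are illustrative values, not study estimates:

```python
def risk_adjusted_survival(reference_rate: float, rate_ratios: dict) -> dict:
    """Yearly risk-adjusted survival rates: the adjusted rate ratio for
    each year multiplied by the observed survival rate in the reference
    year (2000). Inputs here are illustrative, not study data."""
    return {year: reference_rate * rr for year, rr in rate_ratios.items()}
```

For example, a reference-year survival of 13.7% and an adjusted rate ratio of 1.60 for 2009 would imply a risk-adjusted survival of about 21.9% for that year.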

In our models, we adjusted for age, sex, race, coexisting conditions, therapeutic interventions in place at the time of cardiac arrest, characteristics of the cardiac arrest, and select hospital characteristics. A full list of the variables used in the multivariable models is provided in Table S2 in the Supplementary Appendix. To confirm that any survival trends were independent of the duration of hospital participation in the registry, we adjusted for the number of years of hospital participation for each arrest. We also examined whether survival trends differed by age group (≥65 years vs. <65 years), race, and sex by including an interaction term with calendar year in the model. Last, to exclude the possibility that our findings were due to enrollment of better-performing hospitals over time, we performed these analyses only for patients at hospitals with at least 8 years of registry participation.

Data were complete for all covariates and outcomes, except race (6.6% missing), CPC score at admission (14.6% missing), time of cardiac arrest (0.9% missing), hospital variables (4.5% missing), and CPC score at discharge (14.0% missing). Missing patient-level covariates were assumed to be missing at random and were imputed with the use of multiple imputation.21 Results with and without imputation were not meaningfully different, so only the former are presented. Imputation was not performed for the outcome of CPC score at discharge.

All statistical analyses were conducted with the use of SAS software, version 9.1.3 (SAS Institute), IVEware (University of Michigan), or R software, version 2.6.0 (Free Software Foundation). All hypothesis tests were two-sided, with a significance level of 0.05.

Provision of Specific Dental Procedures by General Dentists in the National Dental Practice-Based Research Network: Questionnaire Findings

Gregg H Gilbert1*, Valeria V Gordan2, James J Korelitz3, Jeffrey L Fellows4, Cyril Meyerowitz5, Thomas W Oates6, D Brad Rindal7 and Randall J Gregory8

1Department of Clinical and Community Sciences, School of Dentistry, University of Alabama at Birmingham, SDB Room 109, 1530 Third Avenue South, Birmingham, AL 35294-0007, USA. 2Department of Restorative Dental Sciences, College of Dentistry, University of Florida, Gainesville, FL, USA. 3Westat, Rockville, MD, USA. 4Kaiser Permanente Center for Health Research, Portland, OR, USA. 5Eastman Institute for Oral Health, Rochester, NY, USA. 6Department of Periodontics, School of Dentistry, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA. 7HealthPartners Institute for Education and Research, Minneapolis, MN, USA. 8General Dentist, Winston-Salem Dental Care, Winston-Salem, NC, USA


Competing interests
The authors declare no competing interests.

Authors' contributions
GHG, VVG, and JJK developed the study concept and led the design of the questionnaire and study protocol. JLF, CM, TWO, DBR, RJG, and NDPBRNCG contributed to questionnaire design and acquisition of data. JJK and GHG conducted the analysis and review of the data. GHG and JJK drafted the initial version of the manuscript. GHG, VVG, JJK, JLF, CM, TWO, DBR, RJG, and NDPBRNCG contributed to interpretation of data and were involved in drafting the manuscript. GHG, VVG, JJK, JLF, CM, TWO, DBR, RJG, and NDPBRNCG provided critical revisions of the manuscript for important intellectual content and approved the final manuscript.

Woodchip Bedding as Enrichment for Captive Chimpanzees in an Outdoor Enclosure

L Brent
Southwest Foundation for Biomedical Research,
Department of Laboratory Animal Medicine,
P O Box 28147, San Antonio, Texas USA 78228-0147


The use of woodchips as bedding for 16 juvenile chimpanzees (Pan troglodytes) was evaluated for the effects on behaviour, health and husbandry practices. Woodchip bedding was placed in two outdoor play areas for five consecutive days. Behavioural data were recorded in the morning and afternoon of each day, and compared to pre- and post-test data. A total of 44 hours of observations, made up of 1 hour scan sample sessions, were completed for the study. Behaviours in the following categories were measured: abnormal, affinitive, aggressive, environmental manipulation, inactivity, locomotion, play, self manipulation and woodchip manipulation. The location of each animal was also recorded. Analysis of the data indicated that the chimpanzees engaged in woodchip-related behaviours for an average of 20.52 per cent of the datapoints, and that they spent more time manipulating the substrate in the morning than in the afternoon. In addition, abnormal behaviour, environmental manipulation and affinitive behaviours were significantly lower during the woodchip condition than during pre-test and post-test conditions. The subjects spent the most time on the floor of the enclosure, and this measure did not differ between conditions. The woodchip bedding did not cause any known health problems for the chimpanzees. Although the daily addition and removal of woodchips took more time than did routine cleaning, it kept the play areas cleaner and drier. The evaluation of woodchip bedding as enrichment was favourable and indicated that bedding may be used regularly in the maintenance of captive chimpanzees.

Key words: animal welfare, chimpanzees, enrichment, husbandry, woodchip bedding

Animal welfare benefits

Woodchip bedding improved the captive environment of juvenile chimpanzees by providing them with a substrate to manipulate and to use in play activity. Abnormal behaviours, especially faeces ingestion and manipulation, were reduced when woodchip bedding was available as compared to the bare floor condition. The area with bedding remained drier, and little additional personnel effort was required to maintain the woodchips.


The captive environment of non-human primates, and especially how it relates to psychological health, has received much attention in recent years. Numerous techniques have been devised to improve different aspects of the environment, ranging from the addition of toys to the construction of naturalistic enclosures (eg Chamove 1989, Fajzi et al 1989). In choosing which are the most effective, recent research has emphasized practical implementation of environmental enrichment schemes in which the needs of the animals, the housing conditions, the animal's response, safety, cost and personnel effort are evaluated (Bloomsmith et al 1991). A floor covering is one environmental enrichment option. However, for non-human primates in standard caging systems, the floor is often neglected as an area for improvement (Chamove & Anderson 1989).

Several investigators have demonstrated that bedding has beneficial behavioural effects while providing a clean, cost effective environmental improvement. Chamove and Anderson (1979) reported that aggression was decreased in a group of stumptailed macaques when deep woodchip litter and grain were provided in the outdoor portion of their enclosure. Only a slight increase in personnel time was necessary to provide bedding, and the animal area remained drier and had less odour. These results were extended to include seven monkey and one prosimian species (Chamove et al 1982). Woodchips, or woodchips and food items, were added to the indoor enclosures of each group. The authors found that the subjects spent an increased amount of time on the ground, and decreased abnormal, aggressive and inactive behaviours during the treatment conditions. In addition, microbiological analysis indicated that the bedding became increasingly inhibitory to bacteria with time. Others have reported positive results with the use of a substrate and manipulatable objects for capuchins (Westergaard & Fragaszy 1985). The authors reported that straw and several portable objects placed in the outdoor enclosure elicited greater expression of manipulative tendencies. Individuals with the lowest object contact scores increased this behaviour the most when straw was provided. McKenzie and colleagues (1986) found that arboreal species, such as marmosets and tamarins, spent more time on the floor and decreased their inactivity when woodchips or shredded paper were used as floor covering. Long-term benefits of woodchip bedding, such as decreased stereotyped behaviour, low levels of agonism, and increased exploration have been found for a group of pigtail macaques (Boccia 1989a,b). However, differences between species and group composition may also influence the effect of bedding.
A recent study by Byrne and Suomi (1991) reported increases in activity, exploration, and foraging when woodchips and food items were provided for two groups of rhesus macaques, but no changes in abnormal, aggressive or play behaviours.

Although woodchip bedding has been reported to be an effective enrichment and husbandry practice with many small primate species, quantitative evaluation of the effects of bedding for apes is lacking. The great apes have an affinity for manipulation, and manipulatable objects are important environmental components. Wilson (1982) found that the presence of manipulatable objects had a pronounced influence on the activity level of apes. Unfortunately, manipulation may also take the form of faeces smearing or coprophagy in captive settings (Hill 1966).

A preliminary study was conducted to determine the efficacy of using woodchip bedding as enrichment for juvenile chimpanzees. Behavioural and positional measures were obtained on the subjects before, during and after bedding was provided. Health and husbandry issues were also examined as they related to this form of enrichment. We expected that the bedding would provide a material for manipulation, which might replace behavioural disorders, as well as a cleaner, drier environment for the animals.

Methods

Sixteen juvenile chimpanzees (Pan troglodytes) housed in two groups of eight served as subjects. One group consisted of four males and four females, while the other group had one male and seven females. Their ages ranged from 3.25 to 4.50 years. Seven had been raised with their mother for 5-13 months and nine were raised in peer groups in a nursery. Housing conditions were identical for each group. They were caged in pairs at night and released to a small outdoor play area (4.75m x 3.58m) for approximately 7 hours daily as weather permitted. While in the play area, the chimpanzees did not have access to the indoor portion of the enclosure. The play area consisted of chain link walls and roof attached to the primary enclosure. Corrugated tin was attached to the roof structure to provide shade, and the floor was made of treated concrete. A concrete curb and 30cm of wire mesh around the perimeter of the play area helped to keep the woodchips inside. The area was bare except for a swinging tyre, balls and a rubber pool. The chimpanzees were fed standard monkey chow, fruit and vegetables, as well as a variety of daily feeding enrichment items. The subjects were not fed during observation periods and received regular feeding while in the indoor cages.

Research design and procedure

Observational data were collected in a pre-test, treatment, post-test repeated measures design (Myers & Well 1991). One-hour scan samples with 2-minute intervals were recorded for each group. Half of the observations were completed in the morning (0900h to 1200h) and half in the afternoon (1300h to 1600h). A total of 44 hours of observation was completed for the study during autumn 1990. Pre-test and post-test conditions each consisted of 6 hours of observation per group in the outdoor play area without bedding. During the treatment condition, each group was observed in both the morning and afternoon for five consecutive days with the bedding, for a total of 10 sessions per group. For each day of the treatment condition, the drain was covered and c 0.2 cubic metres of woodchip bedding was spread in the play area before the subjects were released. The area was cleaned after the chimpanzees were returned to their sleeping quarters.

Behaviours were grouped into nine categories (see Table). Inactivity and locomotion were recorded only if they occurred in the absence of other behaviours. In addition, the subject's location was noted for each interval: fence, floor, roof or tyre swing. A portable computer was used to record the data.

Personal observations and interviews with the animal attendant in charge of the two groups of chimpanzees were conducted to determine the effect of using woodchip bedding on personnel time and animal health.

Data analysis

The mean number of occurrences of each behavioural category and location for each chimpanzee was calculated for each experimental condition. A repeated measures analysis of variance was performed on these data with the use of the SAS statistical package (SAS Institute, Cary, NC, USA). Multivariate analysis was used when a sphericity test indicated that the variables were not independent. Social group was used as a categorical factor. Contrasts between treatment and baseline conditions were then evaluated for individual behaviour categories. A value of p<0.05 was used to define significance.
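The matched-pairs logic behind the treatment-versus-baseline contrasts can be sketched in a few lines. This is an illustrative Python version, not the authors' SAS code, and the per-subject means below are hypothetical, not the study's data:

```python
import math
from statistics import mean, stdev

def paired_t(cond_a, cond_b):
    """Paired t statistic for per-subject means under two conditions.

    Each subject contributes one matched pair; returns (t, degrees of freedom).
    """
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    sd = stdev(diffs)                      # sample SD of the paired differences
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1

# Hypothetical mean occurrences per session for 8 subjects (illustration only)
baseline = [5.5, 4.0, 6.1, 3.2, 7.0, 4.8, 5.9, 6.3]
treatment = [1.4, 1.0, 2.2, 0.8, 2.5, 1.1, 1.9, 2.0]
t_stat, df = paired_t(baseline, treatment)
```

The t statistic is then referred to an F or t distribution with the stated degrees of freedom; a full repeated-measures ANOVA with a multivariate correction, as used in the study, would require a dedicated statistics package.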

Results

Woodchip use

The subjects spent an average of 20.52 per cent of possible data points in woodchip-related behaviours (range 9.57% to 44.3%). The percentage of substrate behaviour showed a gradual decline over the 10 sessions (Figure 1). In addition, a pattern of fluctuation by time of day was noted. Analyzing the difference between AM and PM data indicated that the subjects spent significantly more time manipulating the bedding in the morning sessions than in the afternoon sessions [F(1,29)=22.95, p<0.001].

Behavioural data

Multivariate analysis was applied due to the lack of independence of the behavioural variables as indicated by a sphericity test (Mauchly's W = 0.5315, p = 0.016). Initial analysis showed that the overall effect of the experimental condition was significant (Pillai's trace = 0.921, p<0.001), but the difference between groups was not (p>0.05). Further analysis of each behavioural category was completed by comparing the pre- and post-test conditions. Play behaviour significantly differed between the baseline measures [F(1,14)=11.15, p<0.005], but the other seven categories were similar in both conditions (p>0.13). Because the subjects' behaviours had returned to levels similar to pre-test measures after the treatment condition, a comparison of the average of the baselines to the experimental phase was completed for all behaviour categories.

Three behaviour categories were significantly lower during the treatment condition: abnormal behaviours [F(1,14)=72.48, p<0.001]; environmental manipulations [F(1,14)=41.52, p<0.001]; and affinitive behaviours [F(1,14)=11.97, p<0.004] (Figure 2).

Further analysis of the abnormal behaviour category indicated that the peer-reared subjects displayed more abnormal behaviours per session than the mother-reared over the course of data collection (peer n=9, mean=4.07, SD=3.21; mother n=7, mean=2.62, SD=1.17). However, the difference was not statistically significant. In addition, the changes in abnormal behaviour from baseline to woodchip condition were not related to the amount of time the subject was reared with its mother. Faeces ingestion and smearing was one of the most common abnormal behaviours for these subjects. For both groups the pre-test level of 5.5 occurrences per session for this behaviour dropped to 1.4 during the woodchip condition, and increased to 7.8 during post-test. The difference between baseline and woodchip condition was significant [F(1,14) = 37.50, p<0.001].

Location data

The overall effect of the experimental phases for location was not significant. Because trends were evident in the graphical representation, pairwise contrasts between conditions were performed. Changes in means were evident for the fence [increase from pre-test to woodchip treatment, F(1,14)=5.17, p<0.039] and the tyre swing [decrease from treatment to post-test, F(1,14)=6.05, p<0.028]. However, the amount of time spent on the floor was the category of interest. The chimpanzees were recorded to be on the floor for approximately 75 per cent of the time, and this amount did not differ when woodchip bedding was provided (pre-test 75.50%, woodchip 72.25%, post-test 73.57%).

Health and husbandry

Unlike adult chimpanzees, the juvenile chimpanzees used as subjects did not avoid walking in faeces. The outdoor play area normally had to be disinfected and scrubbed daily to remove faeces which were spread on the walls, fence and flooring. During the study, the woodchips were added and removed daily, which took approximately 30 minutes. The bedding absorbed urine and water, and kept the animals from walking in and smearing faeces around the enclosure. This decreased the amount of time needed for scrubbing the area by approximately 10 minutes per day.

After termination of the study, woodchip bedding was routinely added to the enclosure. Rather than cleaning the area and replacing all the bedding daily, it was decided that spot cleaning was sufficient when the bedding had little soiling. In this way, time was saved in cleaning and money was saved in the cost of the bedding.

The woodchip bedding did not cause any known health problems for the subjects. Although they chewed on the bedding, the chimpanzees were observed to make a wad which they spat out and little bedding was noted in their stool. The type of woodchip bedding used in this study produced dust when moved around the play area. Although this was not a problem in the outdoor area, fine bedding may not be as appropriate in a confined space. Food items that were sticky were not given to the animals in the play area with woodchips because they would often drop the item and then eat it and the clinging bedding. The bedding was advantageous during weather extremes since it kept the animals from contacting the warm or cool cement floor.

Discussion

The woodchip bedding provided young chimpanzees with an appropriate and effective enrichment option, as measured over a 5-day test period. The results of the behavioural analysis were consistent with several studies utilizing monkeys as subjects. The main finding was that abnormal behaviours were reduced during the woodchip condition. Others have reported similar results using woodchips (Chamove et al 1982) or woodchips and sunflower seeds (Boccia 1989a). However, no reduction in the abnormal behaviours of rhesus macaques as a result of bedding was found by Byrne and Suomi (1991). These authors noted that this may be related to rearing history. Their peer-reared adult subjects most often exhibited abnormal behaviours. They reasoned that since peer-reared rhesus macaques may be more reactive to stress than mother-reared macaques, the woodchip enrichment may not have been sufficient to reduce abnormal behaviour. Although in the present study the peer-reared chimpanzees did exhibit slightly higher levels of abnormal behaviour, the response to the woodchip condition was not related to rearing.

Changes in abnormal behaviour patterns may also be dependent on the type of behaviours considered, particularly in cross-species comparisons. Coprophagy and faeces manipulation were not included in the abnormal behaviours recorded in the studies on monkeys and woodchips, although this is a more common occurrence in the apes (Hill 1966, Walsh et al 1982). Obviously, providing woodchips made faeces less accessible to our subjects, which directly influenced the amount of abnormal faeces-related behaviours.

Foraging and exploratory behaviours have been noted to increase when bedding and food items were provided for several monkey species (Boccia 1989b, Byrne & Suomi 1991, Chamove et al 1982). The chimpanzees in the present study played with and manipulated the substrate for a large percentage of time, even though no food items were present in the bedding. Perhaps because the chimpanzee has a great interest in manipulating its environment (Maple 1979), woodchips alone were sufficient to incite use. The decrease in other behaviours, such as affinitive behaviours and other environmental manipulations, was probably a result of the increased attention to the substrate. Other investigators have noted results over a longer time period. Social interactions have been reported to decrease when woodchips were added to an enclosure (Byrne & Suomi 1991), and affiliative behaviours were reduced when the environment was made sufficiently interesting by adding woodchips and food items (Chamove et al 1982).

Play behaviour has been reported to increase (Boccia 1989a, Chamove & Anderson 1979, Chamove et al 1982, McKenzie et al 1986), decrease (Chamove et al 1982, McKenzie et al 1986) or remain unchanged (Byrne & Suomi 1991) when bedding and food items were provided for a variety of species. In the present preliminary study with chimpanzees, play behaviour was reduced from pre-test to treatment but did not return to baseline during post-test. This may be interpreted as the subjects' reaction to the removal of the woodchips, or a reflection of complicated influences of the social and physical environment on play behaviour. Long-term monitoring of the chimpanzees' behaviour with and without a woodchip substrate may clarify the impact on play behaviour.

The attenuation of aggression may be a primary goal of enrichment procedures for socially housed non-human primates. Bedding has been reported to reduce aggression in some cases (Boccia 1989a, Chamove et al 1982). However, Byrne and Suomi (1991) noted that in stable groups aggression may already be low and would be affected more by the social environment than by physical enrichment. The levels of aggression in our chimpanzee groups were lower than the levels of any other behaviour category, and did not fluctuate with the addition or removal of the bedding.

Several investigators have noted that both terrestrial and arboreal monkey species spent more time on the floor when it was covered with bedding (Chamove et al 1982, McKenzie et al 1986). In a study which focused on space utilization by captive chimpanzees, juvenile subjects housed in a multi-room enclosure with resting benches and straw bedding were not found to have spatial preferences (Traylor-Holzer & Fritz 1985). In our study, the subjects' predominant use of the floor recorded during baseline data collection did not change when woodchip bedding was provided. These findings are probably related to the enclosure design. The enclosure did not contain above ground resting areas, and it appeared that the fence and swing locations were used mainly for short play bouts and locomotion. It is possible that with the addition of small food items to the bedding, time spent on the floor would increase.

The effects of any new enrichment procedure should be carefully weighed with respect to husbandry and health-related issues as well as behavioural effects. Devices which may be appropriate and beneficial to the animals may also be too costly, hazardous or time consuming for the animal handler to use. As Chamove and Anderson (1979) found with macaques, the use of woodchips for juvenile chimpanzees produced a slight increase in husbandry time. The animals' health did not suffer due to the woodchip bedding, and long-term use may indicate other advantages of keeping chimpanzees off the wet, sometimes cool, cement floor. The positive results of the initial behavioural responses of the subjects, as well as consideration of husbandry and health, indicated that woodchip bedding was a viable enrichment option for juvenile chimpanzees.

Acknowledgements

I would like to thank Daniel Rodriguez for his co-operation and care of the chimpanzees during this study. Comments by Mollie Bloomsmith on an earlier version of this manuscript are appreciated. The subjects of this study were housed in a facility approved by the American Association for Accreditation of Laboratory Animal Care and registered as a United States Department of Agriculture research facility.


This article originally appeared in Animal Welfare 1, 161-170 (1992). 
Reprinted with permission of the Editor.

Longitudinal Dynamics of Circulating Tumor Cells and Circulating Tumor DNA for Treatment Monitoring in Metastatic Breast Cancer

Despite the advances in prevention and antineoplastic treatments, breast cancer (BC) is still the most frequently diagnosed cancer in women. Among all new cases, 6%-7% are diagnosed with de-novo metastatic disease and approximately 30% of patients initially diagnosed in earlier stages eventually relapse in distant sites.1-3


  • Key Objective

  • With a steady decrease in costs and the noninvasive nature of testing, liquid biopsy has a growing potential role in the management of metastatic breast cancer. However, the optimal way of integrating different biomarkers remains unclear. The study explored the different dynamics of circulating tumor DNA (ctDNA) and circulating tumor cell enumeration (nCTCs) to better describe their specific features and potential ways of integration for future clinical algorithms based on the longitudinal evolution of liquid biopsy characteristics.

  • Knowledge Generated

  • ctDNA provides a more quantitative, real-time assessment of tumor burden and treatment benefit. Furthermore, ctDNA, when analyzed at a single gene level, can provide insights on treatment resistance. nCTCs likely describe the underlying metastatic biology.

  • Relevance

  • ctDNA can be used to monitor treatment response and anticipate clinical progression; nCTCs provide an overall biologic readout of the disease's clinical aggressiveness.

The growing scalability and the steady decrease in costs have favored the investigation of new clinical algorithms based not only on baseline (BL) liquid biopsy characteristics but also on their longitudinal evolution.4

Circulating tumor cells (CTCs) were the first modern liquid biopsy marker deployed in clinical practice because of their strong prognostic value. However, their longitudinal implementation is still debated.5-7 In this context, the characterization of circulating tumor DNA (ctDNA) has proven useful for treatment selection, and encouraging data support its potential role in clinical trial enrollment. At the same time, it has been observed that different information can be obtained from ctDNA according to the analytic workflow.8-11

Notwithstanding the potentials of both liquid biopsy techniques, little is known about their dynamics across longitudinal evaluations. To better grasp potential specificities and confounding factors and therefore enhance their deployment and integration in clinical practice, we analyzed the variations in CTCs enumeration and ctDNA-detected features in patients receiving treatment for metastatic breast cancer (MBC) to comprehensively explore the composite nature of liquid biopsy biomarkers.

Methods

Study Population and Ethical Approval

The study was based on two distinct cohorts. The retrospective NU16B06 cohort was analyzed for hypothesis generation on the overall dynamics comparison of liquid biopsy–derived parameters. Patients were characterized for CTCs (Data Supplement), total plasma DNA levels (DNA yield) and ctDNA sequencing through the Guardant360 (Guardant Health, Redwood City, CA) next-generation sequencing platform (Data Supplement).12-14 Patients with ≥ 5 CTC/7.5 mL of blood were defined as stage IVaggressive, whereas patients with < 5 CTC/7.5 mL of blood were defined as stage IVindolent.6
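The stage IVaggressive versus stage IVindolent split above is a simple cutpoint on the enumeration. A minimal sketch (the 5 CTC/7.5 mL threshold comes from the text; the function and variable names are ours):

```python
def stage_iv_subgroup(n_ctcs_per_7_5ml):
    """Classify a patient by CTC enumeration in 7.5 mL of blood.

    >= 5 CTC/7.5 mL -> stage IVaggressive; otherwise stage IVindolent.
    """
    return "aggressive" if n_ctcs_per_7_5ml >= 5 else "indolent"

# Hypothetical enumerations for four patients (illustration only)
labels = [stage_iv_subgroup(n) for n in (0, 2, 5, 12)]
```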

Subsequently, the prospective CRO-2018-56 cohort was used to validate the DNA yield findings and to further characterize the different components of plasma DNA and their clinical meaning through droplet digital polymerase chain reaction (ddPCR) (Data Supplement).15

The NU16B06 cohort

This retrospective cohort consisted of 107 patients with MBC longitudinally characterized for CTCs and ctDNA at the Thomas Jefferson University (Philadelphia, PA) and the Robert H. Lurie Comprehensive Cancer Center at Northwestern University (Chicago, IL). Patients were enrolled between 2013 and 2019 under the Investigator Initiated Trial NU16B06 independently from treatment line. BL staging was performed according to the investigators' choice. CTCs and ctDNA collection were performed before treatment start (BL), at progression (progressive disease), and at the first clinical evaluation, with a median of 3 months after the BL timepoint (evaluation one [EV1]). ctDNA analysis comprised the number of detected alterations (NDA), mutant allele frequency (MAF), and DNA yield. The Investigator Initiated Study was approved by the institutional review board under the protocol number NU16B06.

The CRO-2018-56 study

To validate the findings for DNA yield, 48 women with hormone receptor–positive human epidermal growth factor receptor 2 (HER2)-negative (luminal-like) MBC were prospectively enrolled through a multicenter pragmatic study between 2018 and 2019. Patients were eligible for endocrine therapy (ET) as first-line treatment and could have received both ET and chemotherapy (CT) in the adjuvant and neoadjuvant settings. Samples were collected at BL and after 3 months concomitantly with imaging evaluation (EV1) and were analyzed through ddPCR for the detection of small, medium, and long fragments of the gene coding for Beta-Actin (ACTB). The study was approved by the ethics committee under the CEUR-2018-Sper-056-CRO protocol.

Statistical Analysis

Clinical and pathologic variables were reported using descriptive analyses. Categorical variables were reported as frequency distributions, whereas continuous variables were described through medians and interquartile ranges (IQRs). Matched-pairs variations of CTCs enumeration (nCTCs), NDA, MAF, and DNA yield were tested across three timepoints: BL, EV1, and progression. The Wilcoxon signed rank test was used for continuous variables, whereas categorical variables were investigated through the McNemar test. Progression-free survival (PFS) was defined as the time from BL to progression (defined through imaging) or death from any cause, whichever came first. Patients without an end point event at the last follow-up visit were censored. Differences in survival were tested by log-rank test and Cox regression with 95% CIs and represented by Kaplan-Meier estimator plots. Statistical analysis was conducted using Stata Statistical Software: Release 15.1 (StataCorp 2016, College Station, TX), R (version 3.3.1; The R Foundation for Statistical Computing, Vienna, Austria) and JMP (version 14; SAS Institute, Cary, NC).
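One concrete piece of the pipeline above, the Kaplan-Meier estimator used for PFS with right-censoring, can be sketched in a few lines. This is an illustrative Python version with made-up follow-up data, not the Stata/R/JMP code the study used:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve as a list of (event time, S(t)) pairs.

    times: follow-up in months; events: 1 = progression/death, 0 = censored.
    """
    estimates, surv = [], 1.0
    for t in sorted({ti for ti, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for ti in times if ti >= t)   # still under observation at t
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        surv *= 1 - deaths / at_risk                  # multiply conditional survival
        estimates.append((t, surv))
    return estimates

# Made-up PFS data: months of follow-up and event indicators (illustration only)
curve = kaplan_meier([2, 4, 4, 6, 8, 10], [1, 1, 0, 1, 0, 1])
```

Censored subjects leave the risk set without contributing an event, which is what distinguishes this estimator from a naive fraction-surviving calculation.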

Results

The 107-patient NU16B06 cohort was characterized for nCTCs and ctDNA at BL, EV1, and progression. Median age at BL was 55 years (IQR 46-63). Luminal-like was the most represented subtype (56 patients, 52%), followed by triple-negative BC (28 patients, 26%) and HER2-positive (23 patients, 22%) (Table 1). Fifty-three patients (50%) were diagnosed with inflammatory BC (Table 1). The most common metastatic site was bone (51 patients, 48%), followed by lymph nodes (44 patients, 42%) and liver (26 patients, 25%) (Table 1). The main treatment option was CT (N = 65, 60.8%), as a single agent (N = 31, 28.9%) or in association (anti-HER2 agents: N = 18, 16.8%; mammalian target of rapamycin [mTOR] inhibitors: N = 9, 8.4%) (Table 1). ET was prescribed in 33 patients (30.8%). Fulvestrant and aromatase inhibitors were the main ET backbones (N = 19, 17.8%, and N = 14, 13.1%, respectively). ET was combined with cyclin-dependent kinase (CDK) inhibitors in 21 patients (19.6%), with mTOR inhibitors in 7 (6.5%), and with anti-HER2 agents in five patients (4.7%) (multiple combinations were possible) (Table 1). CT was the most common previous treatment type (N = 88, 82.2%), 71 patients had received previous ET (66.4%), and 32 anti-HER2 agents (29.9%). Eleven patients (10.3%) were not exposed to previous treatments.


TABLE 1. Clinicopathologic Characteristics of the NU16B06 Cohort

nCTCs were assessed at BL in 74 patients; 37% were classified as stage IVaggressive (27 patients), whereas the proportion was 75% at progression (47 patients) (Table 2). Median time to the first evaluation was 3 months (IQR 2-4).


TABLE 2. Distribution of Liquid Biopsy Biomarkers Across Timepoints in the NU16B06 Cohort

nCTCs, NDA, and MAF Showed Differential Dynamics Across Timepoints

Median MAF at BL was 3, NDA was 4, and nCTCs was 2 (Table 2). At EV1, the median MAF was 0.6, NDA was 5, and nCTCs was 1 (Table 2). At progression, the median MAF was 3.8, NDA was 6, and nCTCs was 5.5 (Table 2; Figs 1A, 1C, and 1E).

With serial assessments, MAF significantly decreased and NDA significantly increased between BL and EV1 (P < .0001 for both), whereas both increased significantly between EV1 and progression (P < .0001) and between BL and progression (P = .0241 and P < .0001, respectively) (Figs 1B and 1D).

No significant variations were observed for nCTCs between BL and EV1 and BL and progression (Fig 1F). A significant increase was observed between EV1 and progression (P = .0010) (Fig 1F).

The cohort was then stratified into stage IVindolent and stage IVaggressive.6 Although the general trend was confirmed in the stage IVaggressive subgroup (Figs 1G and 1H), a significant increase was observed in the stage IVindolent subgroup both between EV1 and progression (P = .0109) (Fig 1H) and between BL and progression (P = .0027) (Fig 1G). No significant differences were observed in either subgroup between BL and EV1 (Fig 1H).

MAF Was a Composite Measure Comprising Genes With Different Dynamics

TP53, PIK3CA, MET, ERBB2, EGFR, MYC, NF1, ESR1, ARID1A, and NOTCH1 were the main altered genes at BL (Fig 2A), whereas across all timepoints, TP53, PIK3CA, ERBB2, MET, EGFR, and ESR1 were the most represented. A considerable proportion of patients at BL had more than one alteration for TP53 and PIK3CA (14 and 12, respectively) (Data Supplement).

A significant increase in MAF between EV1 and progression was observed for TP53 (P = .0053), PIK3CA (P = .0457), ERBB2 (P = .0456), and ESR1 (P = .0016) (Figs 2B-2D and 2G). Similarly, an increase between BL and progression was highlighted for TP53 (P = .0283), PIK3CA (P = .0456), and ESR1 (P = .0003) (Figs 2B, 2C, and 2G). No significant variations in MAF were observed for EGFR and MET across all timepoints (Fig 2F). Notably, no new ESR1 alteration was observed in EV1, whereas a significant occurrence of ERBB2 alterations was observed for HER2-negative patients (McNemar test P = .0253). For descriptive purposes, the median MAF across all detected gene variants is shown in Figure 2H.

Plasma DNA Yield Did Not Vary Across Timepoints and Was Not Correlated With MAF and NDA

To better understand the value of ctDNA characterization over total plasma DNA, the NU16B06 samples were also characterized for DNA yield. Median DNA yield at BL was 30.7 ng, at EV1 was 27.2 ng, and at progression was 38.55 ng (Table 2). No significant differences were observed between BL and EV1 or between EV1 and progression (Fig 3A). No correlation was observed with MAF or NDA at BL (Figs 3B and 3C).

The DNA yield dynamics were, moreover, analyzed in the prospective CRO-2018-56 cohort. Forty-eight patients with luminal-like MBC who were candidates for first-line ET were enrolled and their blood samples were collected at BL and at EV1 (first CT scan after 3 months). The most common regimen was ET plus CDK4/6 inhibitors (92%), whereas only 6% of patients received ET as a single agent. Further patients' characteristics are shown in the Data Supplement. Consistent with what was observed in the unselected NU16B06 cohort, no differences were observed for DNA yield between BL and EV1 (P = .2027) (Fig 3D).

Short Plasma DNA Fragments Showed Different Dynamics and Clinical Outcome

Plasma DNA originates from different sources through shedding or cell lysis. The former is more likely to be related to ctDNA and is composed of short fragments. The latter is predominantly derived from leukocytes and usually consists of longer fragments. For exploratory purposes, a ddPCR analysis was performed on the CRO-2018-56 cohort to evaluate potential differences within the overall DNA yield by measuring the different fractions of the ACTB DNA fragments (136 bp, 420 bp, and 2,000 bp for ACTBshort, ACTBmedium, and ACTBlong, respectively) and their proportion (ACTBshort/[ACTBshort plus ACTBmedium plus ACTBlong]).15
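The DNA proportion defined above is a simple ratio of the short-fragment signal to the total ACTB signal. A minimal sketch, using hypothetical ddPCR concentrations rather than study values:

```python
def short_fragment_proportion(actb_short, actb_medium, actb_long):
    """Short-fragment (ctDNA-enriched) share of the total ACTB signal."""
    return actb_short / (actb_short + actb_medium + actb_long)

# Hypothetical concentrations (copies/mL) for the three amplicons (illustration only)
proportion = short_fragment_proportion(300.0, 450.0, 250.0)
```

Because all three amplicons target the same gene, the ratio normalizes out assay-level scaling and isolates the fragment-length composition of the sample.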

A significant increase in DNA proportion was observed between BL and EV1 (P = .0064) (Fig 3E) because of significantly decreased levels not only of ACTBshort (71% of cases, P = .0162) (Fig 3F), but also of ACTBmedium (66% of cases, P = .0011) (Fig 3G) and ACTBlong (78% of cases, P = .0001) (Fig 3H).

Since ACTBmedium and ACTBlong were mainly derived from the cytolysis of neutrophils, white blood cell count dynamics were also investigated.15 As expected, because of the mechanism of CDK4/6 inhibitors, a significant drop in neutrophil count was observed between BL and EV1 (decrease in 94% of cases, P < .0001), whereas a decrease in lymphocyte count trended toward significance (decrease in 65% of cases, P = .0615).

For exploratory purposes, the prognostic impact of BL DNA yield was tested in terms of PFS. Patients with a DNA yield higher than the 75th percentile experienced a similar prognosis with respect to the lower percentiles (P = .9325) (Fig 4A). Patients with a DNA proportion in the top quartile at BL had a significantly worse outcome (PFS at 6 months 56% v 90%, P = .0007) (Fig 4B). The prognostic impact of a ≥ 20% decrease between BL and EV1 was then investigated across the different ACTB fragments lengths. Although a significant impact was observed for ACTBshort (hazard ratio [HR]: 3.82; 95% CI, 1.29 to 11.29; P = .0153) (Fig 4C), no significant difference was observed for ACTBmedium and ACTBlong (HR, 1.95; 95% CI, 0.67 to 5.67; P = .2212 and HR, 1.04; 95% CI, 0.13 to 8.00; P = .9712, respectively). The prognostic impact of ACTBshort was also retained after correction for ACTBmedium and ACTBlong in multivariate analysis (HR, 5.24; 95% CI, 1.06 to 25.97; P = .0423) (data not shown).

Discussion

This study analyzed the dynamic behavior of liquid biopsy biomarkers with respect to treatment response, with the goal of integrating different information that could potentially guide BL treatment choices and serial assessments after treatment initiation.

The study found significant differences between CTCs and ctDNA through MAF and NDA characterization. Importantly, MAF appeared to track treatment response and subsequent progression. By contrast, NDA increased steadily across timepoints, whereas nCTCs increased only at the time of clinical progression.

CTCs enumeration was the first clinically deployed liquid biopsy biomarker and, although the prognostic implications were consistently confirmed, monitoring results have been controversial.5-7 The SWOG 0500 phase III trial was the first attempt to use CTCs as a longitudinal clinical decision-making tool.7 The study was negative with respect to early change of CT regimen for patients with persistently high CTCs, jeopardizing the clinical utility of longitudinal CTCs characterization. However, the study lacked a precision medicine approach for treatment selection and biology-defined sampling timeframes.7

Similarly, the CirCe01 trial investigated whether it was possible to discontinue a potentially noneffective treatment based on CTCs dynamics in patients with MBC treated beyond the second line.16 The study confirmed the prognostic impact of BL stage IVaggressive on overall survival, but not for PFS.16 It, moreover, observed that patients with ≥ 5 CTCs/7.5 mL (stage IVaggressive) at BL and with either < 5 CTCs/7.5 mL at the second cycle or a relative decrease of at least 70% of the BL CTCs enumeration experienced a longer PFS.16

Interestingly, consistent results were observed in this study's NU16B06 cohort (HR, 2.04; 95% CI, 0.96 to 4.35; P = .0653).

The study reported here further highlighted a more nuanced trend in nCTCs, since an increase was observed only at progression. These results may suggest that, although MAF could be more suitable for real-time disease monitoring, nCTCs are more likely linked to metastatic biology, in particular in the stage IVindolent population. Previous studies suggested that nCTCs is a composite biomarker comprising different subpopulations at different stages of epithelial-to-mesenchymal transition and that patients who respond to therapy have a proportional decrease of the mesenchymal subpopulation, whereas patients who experience progressive disease show an increased number of mesenchymal CTCs.17,18 CTCs were, moreover, associated with distinctive biologic features such as mutations and metastatic organotropism.19,20

This study then analyzed how individual genes differentially account for the overall MAF, demonstrating that a limited set of genes (ie, TP53 and PIK3CA) actually contributed to the overall measure. Consistent results on ctDNA dynamics were reported in the BEECH study, which highlighted a decrease in ctDNA after 8 days of treatment, while the longitudinal characterization of 21 patients treated with pyrotinib confirmed that the mean allele fraction at each timepoint was correlated with tumor size by computed tomography, with a lead time of 8 to 16 weeks in progression detection.21,22
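The idea that a few genes can dominate an overall MAF summary can be illustrated with a short sketch. This assumes "overall MAF" is the mean variant allele fraction across all detected alterations in a sample; the gene names are taken from the text, but the allele-fraction values are invented for illustration and are not the study's data.

```python
def overall_maf(variants):
    """Mean variant allele fraction across all detected alterations."""
    fractions = [af for _, af in variants]
    return sum(fractions) / len(fractions)

def gene_contribution(variants, gene):
    """Share of the summed allele fraction attributable to one gene."""
    total = sum(af for _, af in variants)
    return sum(af for g, af in variants if g == gene) / total

# Hypothetical sample: two truncal drivers carry most of the allele fraction,
# while later-event genes (eg, ESR1) contribute little to the overall measure.
sample = [("TP53", 0.32), ("PIK3CA", 0.28), ("ESR1", 0.02), ("GATA3", 0.01)]
print(round(overall_maf(sample), 3))              # 0.158
print(round(gene_contribution(sample, "TP53"), 2))  # 0.51
```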

By contrast, the increasingly high sensitivity of sequencing technologies can introduce potentially confounding factors such as the detection of somatic mutations deriving from normal tissues, in particular clonal hematopoiesis of indeterminate potential (CHIP).23,24 This could also explain the higher incidence of co-occurring TP53 mutations with respect to public databases. In our study, we did not concurrently sequence paired white blood cells to rule out CHIP.

Ma et al,22 moreover, suggested that a broad gene characterization is needed to correct for potential biases deriving from genes' biologic roles and treatment-derived selective pressure. Although genes harboring truncal mutations (eg, TP53 and PIK3CA) were generally in line with the overall MAF trend, ESR1 and ERBB2 mutations were mainly later events, with rising MAF and incidence because of the onset of treatment resistance. Although genetic alterations of ESR1 have an established role as resistance biomarkers, other genes such as ERBB2 in HER2-negative patients still need to be fully explored.8,25 It has been reported that patients with luminal-like MBC who acquired ctDNA-detectable ERBB2 alterations during the course of ET had promising responses with the use of tyrosine kinase inhibitors such as neratinib.25,26

Our study, moreover, suggested that the DNA yield fluorometric measurement was not correlated with MAF and NDA and did not vary across treatment timepoints, excluding its potential as a low-cost biomarker. Based on the proportion of plasma-detectable short fragments of ACTB, the study suggested a prognostic impact of the BL DNA proportion over the total plasma concentration. By contrast, this approach showed potential caveats for longitudinal use, as the genomic DNA fraction could be affected by drug-related events such as leukopenia and neutropenia, representing a potential confounding factor in the interpretation of DNA proportion dynamics. Nonetheless, the study suggested that only the short-fragment fraction was actually linked to prognosis, independently of the genomic DNA fraction, supporting the proof of concept that the ctDNA fraction should be accurately selected for a proper liquid biopsy–based disease characterization.
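The short-fragment proportion and its confounding can be sketched in a few lines. This assumes the assay reports short-fragment and total ACTB copy counts per plasma sample; the function name and copy numbers are illustrative, not the study's actual readouts.

```python
def short_fragment_proportion(short_actb: float, total_actb: float) -> float:
    """Fraction of plasma DNA attributable to short (tumor-like) fragments."""
    if total_actb <= 0:
        raise ValueError("total ACTB copies must be positive")
    if short_actb > total_actb:
        raise ValueError("short-fragment copies cannot exceed the total")
    return short_actb / total_actb

# A drug-related rise in genomic (long-fragment) DNA, eg, after leukopenia or
# neutropenia, dilutes the proportion even if tumor-derived DNA is unchanged:
print(short_fragment_proportion(200, 1000))  # 0.2
print(short_fragment_proportion(200, 2000))  # 0.1
```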

There are several limitations of this study. Since current clinical next-generation sequencing platforms are mainly based on targeted gene panels, MAF could have been underestimated if (1) the driver gene was not included in the panel or (2) two separate subclonal populations did not share high-MAF mutations.

The retrospective cohort, moreover, was focused on standard, EpCAM-based nCTCs rather than an in-depth CTCs characterization, which could be a limiting factor as demonstrated by previous studies.17,18

The retrospective cohort was large but heterogeneous both in terms of disease subtype and treatment line. Although this may increase the generalizability of the findings, it may also have introduced potential biases derived from specific biologic features. By contrast, the prospective cohort is highly homogeneous, but this may in turn limit the applicability of the results to other treatment settings.

In conclusion, the study suggests that both CTCs and ctDNA provide complementary information about prognosis and treatment benefit. nCTCs describe the underlying metastatic biology, whereas ctDNA provides a more quantitative, real-time assessment of tumor burden and treatment benefit. In addition, serial ctDNA measurements can be analyzed for early detection of clinically significant resistance alterations.

© 2021 by American Society of Clinical Oncology

The funding sources had no role in the study design, data collection, data analysis, interpretation, or writing of the manuscript.


Supported by the Lynn Sage Cancer Research Foundation, OncoSET Precision Medicine Program, Ministry of Health Grant Ricerca Finalizzata (Grant No.: RF-2016-02362544) and Italian League for the Fight against Cancer (LILT) Healthcare research 2018—5 × mille program. REDCap support was funded in part by a Clinical and Translational Science Award (CTSA) grant from the National Institutes of Health Grant No. UL1TR001422.

The following represents disclosure information provided by the authors of this manuscript. All relationships are considered compensated unless otherwise noted. Relationships are self-held unless noted. I = Immediate Family Member, Inst = My Institution. Relationships may not relate to the subject matter of this manuscript. For more information about ASCO's conflict of interest policy, please refer to or

Open Payments is a public database containing information reported by companies about payments made to US-licensed physicians.

Lorenzo Gerratana

Consulting or Advisory Role: Lilly, Novartis

Travel, Accommodations, Expenses: Menarini Silicon Biosystems

Alessandra Franzoni

Consulting or Advisory Role: Lilly, Novartis

Lisa E. Flaum

Consulting or Advisory Role: Seattle Genetics, Novartis

Speakers' Bureau: Seattle Genetics, Novartis, AstraZeneca

William John Gradishar

Consulting or Advisory Role: Genentech/Roche, AstraZeneca, Pfizer, Puma Biotechnology

Amir Behdad

Honoraria: Thermo Fisher Scientific, Bayer, Roche China, Lilly

Speakers' Bureau: Bayer, Thermofisher Scientific Biomarkers, Lilly, Roche China

Travel, Accommodations, Expenses: Bayer, Foundation Medicine, Pfizer

Hushan Yang

Stock and Other Ownership Interests: Illumina, Pfizer, Oriomics

Travel, Accommodations, Expenses: Oriomics

Other Relationship: NIH/NCI

Fabio Puglisi

Honoraria: Roche, MSD, AstraZeneca, Novartis, Lilly, Pfizer, Pierre Fabre, Daiichi Sankyo

Consulting or Advisory Role: Roche, Amgen, Lilly, Novartis, Pfizer, Eisai

Research Funding: Eisai, AstraZeneca, Roche

Travel, Accommodations, Expenses: Roche, Celgene

Massimo Cristofanilli

Honoraria: Pfizer, Foundation Medicine

Consulting or Advisory Role: Novartis, CytoDyn, Lilly, Foundation Medicine, Menarini

Research Funding: Lilly, Angle, Merck

No other potential conflicts of interest were reported.

1. Siegel RL, Miller KD, Jemal A: Cancer statistics, 2019. CA Cancer J Clin 69:7-34, 2019
2. Bonotto M, Gerratana L, Poletto E, et al: Measures of outcome in metastatic breast cancer: Insights from a real-world scenario. Oncologist 19:608-615, 2014
3. Kennecke H, Yerushalmi R, Woods R, et al: Metastatic behavior of breast cancer subtypes. J Clin Oncol 28:3271-3277, 2010
4. Gerratana L, Zhang Q, Shah AN, et al: Performance of a novel next generation sequencing circulating tumor DNA (ctDNA) platform for the evaluation of samples from patients with metastatic breast cancer (MBC). Crit Rev Oncol Hematol 145:102856, 2020
5. Cristofanilli M, Budd GT, Ellis MJ, et al: Circulating tumor cells, disease progression, and survival in metastatic breast cancer. N Engl J Med 351:781-791, 2004
6. Cristofanilli M, Pierga J-Y, Reuben J, et al: The clinical use of circulating tumor cells (CTCs) enumeration for staging of metastatic breast cancer (MBC): International expert consensus paper. Crit Rev Oncol Hematol 134:39-45, 2019
7. Smerage JB, Barlow WE, Hortobagyi GN, et al: Circulating tumor cells and response to chemotherapy in metastatic breast cancer: SWOG S0500. J Clin Oncol 32:3483-3489, 2014
8. O'Leary B, Hrebien S, Morden JP, et al: Early circulating tumor DNA dynamics and clonal selection with palbociclib and fulvestrant for breast cancer. Nat Commun 9:896, 2018
9. Buono G, Gerratana L, Bulfoni M, et al: Circulating tumor DNA analysis in breast cancer: Is it ready for prime-time? Cancer Treat Rev 73:73-83, 2019
10. André F, Ciruelos E, Rubovszky G, et al: Alpelisib for PIK3CA-mutated, hormone receptor–positive advanced breast cancer. N Engl J Med 380:1929-1940, 2019
11. Gerratana L, Davis AA, Shah AN, et al: Emerging role of genomics and cell-free DNA in breast cancer. Curr Treat Options Oncol 20:68, 2019
12. Lanman RB, Mortimer SA, Zill OA, et al: Analytical and clinical validation of a digital sequencing panel for quantitative, highly accurate evaluation of cell-free circulating tumor DNA. PLoS One 10:e0140712, 2015
13. Forbes SA, Beare D, Boutselakis H, et al: COSMIC: Somatic cancer genetics at high-resolution. Nucleic Acids Res 45:D777-D783, 2017
14. Zill OA, Banks KC, Fairclough SR, et al: The landscape of actionable genomic alterations in cell-free circulating tumor DNA from 21,807 advanced cancer patients. Clin Cancer Res 24:3528-3538, 2018
15. van Dessel LF, Vitale SR, Helmijr JCA, et al: High-throughput isolation of circulating tumor DNA: A comparison of automated platforms. Mol Oncol 13:392-402, 2019
16. Helissey C, Berger F, Cottu P, et al: Circulating tumor cell thresholds and survival scores in advanced metastatic breast cancer: The observational step of the CirCe01 phase III trial. Cancer Lett 360:213-218, 2015
17. Bulfoni M, Gerratana L, Del Ben F, et al: In patients with metastatic breast cancer the identification of circulating tumor cells in epithelial-to-mesenchymal transition is associated with a poor prognosis. Breast Cancer Res 18:30, 2016
18. Yu M, Bardia A, Wittner BS, et al: Circulating breast tumor cells exhibit dynamic changes in epithelial and mesenchymal composition. Science 339:580-584, 2013
19. Davis AA, Zhang Q, Gerratana L, et al: Association of a novel circulating tumor DNA next-generation sequencing platform with circulating tumor cells (CTCs) and CTC clusters in metastatic breast cancer. Breast Cancer Res 21:137, 2019
20. Gerratana L, Davis AA, Polano M, et al: Understanding the organ tropism of metastatic breast cancer through the combination of liquid biopsy tools. Eur J Cancer 143:147-157, 2021
21. Hrebien S, Citi V, Garcia-Murillas I, et al: Early ctDNA dynamics as a surrogate for progression-free survival in advanced breast cancer in the BEECH trial. Ann Oncol 30:945-952, 2019
22. Ma F, Guan Y, Yi Z, et al: Assessing tumor heterogeneity using ctDNA to predict and monitor therapeutic response in metastatic breast cancer. Int J Cancer 146:1359-1368, 2020
23. Riedlinger GM, Jalloul N, Poplin E, et al: Detection of three distinct clonal populations using circulating cell-free DNA: A cautionary note on the use of liquid biopsy. JCO Precis Oncol 2019
24. Razavi P, Li BT, Brown DN, et al: High-intensity sequencing reveals the sources of plasma circulating cell-free DNA variants. Nat Med 25:1928-1937, 2019
25. Medford AJ, Dubash TD, Juric D, et al: Blood-based monitoring identifies acquired and targetable driver HER2 mutations in endocrine-resistant metastatic breast cancer. NPJ Precis Oncol 3:18, 2019
26. Ma CX, Bose R, Gao F, et al: Neratinib efficacy and circulating tumor DNA detection of HER2 mutations in HER2 nonamplified metastatic breast cancer. Clin Cancer Res 23:5687-5695, 2017