Fixing the bridge between biologists and statisticians

How do we combine errors? The linear case

Published at November 22, 2024 · 7 min read

In our research work, we usually fit models to experimental data. Our aim is to estimate some biologically relevant parameters, together with their standard errors. Very often, these parameters are interesting in themselves, as they represent means, differences, rates or other important descriptors. In other cases, we use those estimates to derive further indices by way of appropriate calculations. For example, suppose that we have two parameter estimates, say Q and W, with standard errors respectively equal to \(\sigma_Q\) and \(\sigma_W\): it might be relevant to calculate the quantity:

...

Here is why I don't care about the Levene's test

Published at March 15, 2024 · 5 min read

During my stat courses, I never give my students any information about Levene’s test (Levene and Howard, 1960), or other similar tests for homoscedasticity, unless I am specifically prompted to do so. It is not that I intend to underrate the tremendous importance of checking the basic assumptions of linear models! On the contrary, I always show my students several methods for the graphical inspection of model residuals, but I do not share the aching desire for a P-value that most of my colleagues seem to possess.

...

Factorial designs with check in pesticide research

Published at December 15, 2023 · 6 min read

In pesticide research or, in general, agricultural research, we very commonly encounter experiments with two or three crossed factors and some other treatment that is not included in the factorial structure. For example, let’s consider an experiment with two herbicides (rimsulfuron and dicamba) at two rates (40 and 60 g/ha for rimsulfuron and 0.6 and 1 kg/ha for dicamba) and with four adjuvant treatments (surfactant, frigate, mineral oil and no adjuvant). Apart from this fully crossed structure, we need to introduce, at least, an untreated control and a hand-weeded control. The design for such an experiment has been termed ‘augmented factorial’, because we are, indeed, including some extra treatment levels beyond the crossed factorial structure.

...

Regression analyses with common checks in pesticide research

Published at December 15, 2023 · 4 min read

In pesticide research or, in general, agricultural research, we very commonly encounter experiments with, e.g., several herbicides tested at different doses and in different conditions. For these experiments, the untreated control is always added and, of course, such a control is common to all herbicides.

For example, in another post (see here) we have considered an experiment with two herbicides (rimsulfuron and dicamba) at two rates (40 and 60 g/ha for rimsulfuron and 0.6 and 1 kg/ha for dicamba) and with four adjuvant treatments (surfactant, frigate, mineral oil and no adjuvant). The dataset is loaded in the box below: there are three predictors (Herbicide, Adjuvant and Dose) and two quantitative response variables (WeedCoverage and Yield).

...

Back-transformations with emmeans()

Published at November 30, 2023 · 5 min read

I am one of those old guys who still use stabilising transformations when the data do not conform to the basic assumptions for ANOVA. Indeed, apart from counts and proportions, where GLMs can be very useful, I have not yet found a simple way to deal with heteroscedasticity for continuous variables, such as yield, weight, height and so on. Yes, I know, Generalised Least Squares (GLS) can be useful to fit heteroscedastic models, but I would argue that stabilising transformations are, conceptually, much simpler and they can be easily taught to PhD students and practitioners with only a basic level of knowledge about statistics.

...

The coefficient of determination: is it the R-squared or r-squared?

Published at November 26, 2022 · 9 min read

We often use the coefficient of determination as a swift ‘measure’ of goodness of fit for our regression models. Unfortunately, there is no unique symbol for such a coefficient and both \(R^2\) and \(r^2\) are used in the literature, almost interchangeably. Such interchangeability is also endorsed by Wikipedia (see: https://en.wikipedia.org/wiki/Coefficient_of_determination), where both symbols are reported as the abbreviations for this statistical index.

As an editor of several international journals, I cannot agree with such an approach; indeed, the two symbols \(R^2\) and \(r^2\) mean two different things, and they are not necessarily interchangeable, because, depending on the setting, either of the two may be wrong or ambiguous. Let’s pay a little attention to this issue.

...

Stabilising transformations: how do I present my results?

Published at June 15, 2019 · 5 min read

ANOVA is routinely used in applied biology for data analyses, although, in some instances, the basic assumptions of normality and homoscedasticity of residuals do not hold. In those instances, most biologists would be inclined to adopt some sort of stabilising transformation (logarithm, square root, arcsin square root…) prior to ANOVA. Yes, there might be more advanced and elegant solutions, but stabilising transformations are suggested in most traditional biometry books, they are very straightforward to apply and they do not require any specific statistical software. I do not think that this traditional technique should be underrated.

...

#Linear_models
