Experimental methods in agriculture
Update: v. 0.99 (2021-12-01), compil. 2021-12-01
This is the website for the book “Experimental methods in agriculture” and deals with the organisation of experiments and data analyses in agriculture and, more generally, in biology. Experiments are the key element of scientific progress and they need to be designed so that they produce reliable data. Once this fundamental requirement has been fulfilled, statistics can be used to summarise and explore the results, separating ‘signal’ from ‘noise’ and reaching appropriate conclusions.
In this book, we will try to give some essential information to support the adoption of good research practices, with particular reference to field experiments, which are used to compare, e.g., innovative genotypes, agronomic practices, herbicides and other weed control methods. We firmly believe that the advancement of cropping techniques should always be based on the evidence produced by scientifically sound experiments.
We will follow a ‘learn-by-doing’ approach, making use of several examples and case studies, while keeping theory and maths at a minimum level; indeed, we are talking to agronomists and biologists and not to statisticians!
This website is (and will always be) free to use, and is licensed under the Creative Commons Attribution-NonCommercial-NoDerivs 3.0 License. It is written in RMarkdown with bookdown and it is rebuilt every now and then to incorporate corrections and updates. This is necessary, as R is a rapidly evolving language.
This book does not aim at completeness; rather, it is finely tuned for a 6 ECTS introductory course in biometry for master's or PhD students. It is mainly aimed at building solid foundations for starting a job in the research field and, eventually, for tackling more advanced statistical material.
How this book is organised
The first two Chapters deal with experimental design and explain how to distinguish good from bad experiments. One key aspect is that we can never be sure that data are totally reliable and, thus, we assume that they are reliable whenever we can be reasonably sure that they were obtained by using reliable methods.
In Chapter 3 we learn how to describe the results, based on some simple statistics, such as the mean, median, chi-square value and Pearson correlation coefficient. In this chapter, we stick to the observed data, as if we were not interested in anything else. In Chapter 4 we learn to see those observed data as the result of deterministic and stochastic processes, which we can describe by using statistical models.
In Chapters 5 and 6 we start recognising that the observed data are only one random sample from a wider universe of data and that we are mainly interested in that universe, as we want to use our experiment to draw general conclusions. Going from a sample to a population introduces a certain amount of uncertainty, which we have to incorporate into our conclusions.
From Chapter 7 to Chapter 12 we deal with ANOVA, which is one of the most widely used techniques of data analysis, while the last two chapters deal with regression models.
Within each chapter, we usually start with some motivating examples so that you can see the bigger picture, and then dive into the details. In the final chapter, we provide exercises for each book section, which should help you practice what you’ve learned.
In this book, we will work through the examples using the R statistical software, together with the RStudio environment. We selected R for a number of reasons: first of all, we like it very much and we think that it is a pleasure to use, once the initial difficulties have been overcome! Second, it is freeware, which is fundamental for the students. Third, in recent years the software skills of students in master programmes have notably increased and writing small chunks of code is no longer a problem for most of them. Last, but not least, we have seen that some experience with R is very often a required skill when applying for a job. We should also say that we are very much indebted to those who make these two wonderful pieces of free software available.
R is characterised by a modular structure and its basic functionalities can be widely extended by a set of add-on packages. As this is mainly an introductory course, we decided to stick, as far as possible, to the main packages, which come with the basic R installation. However, we could not avoid the use of a few very important packages, which we will indicate later on. We should also mention that this book was built by using the ‘bookdown’ package and it is hosted on the blog ‘www.statforbiology.com’, which is built by using the ‘blogdown’ package. We will not use these two packages during the course, but they were really useful to us.
We recognise that R has a steep learning curve and we will start from the very beginning, without assuming that the students have any preliminary knowledge of either statistics or R.