Course Plan

This short course introduces a statistical workflow for an experiment here on the NM-AIST campus about the effect of drought, fire, and herbivory on acacia trees. The goal for the week is to learn some shared statistical language so you can work with statisticians (like me) on your research.

Last year (2023) we focused on the first 4 steps, which help to design an experiment:

  1. What are our scientific goals ?
  2. How will we get data that informs our scientific goals ? What is our experimental design ?
  3. How do we model (i.e. tell a mathematical story about) our data ?
  4. Assuming this model, will we get precise enough estimates to achieve our scientific goals ?

Now, after collecting data, we will also take the next step:

  1. Is our model realistic enough to achieve our scientific goal ?

But first, let’s review the first 4 design steps. If you did not attend last year, we will catch you up. And if you did attend last year, this will help refresh our memories.

Team activity (2 minutes): Throughout class, we will have activities in green. You will work in pairs, and then share your work with the class. A statistical workflow is best done in teams, not alone. Please introduce yourself to your teammate !

Reading

Students are welcome to read these course materials ahead of class. Maoni karibu !

Other reading can include the online textbook Regression and Other Stories, by Gelman et al..

Other references The course materials are based mainly on Regression and Other Stories, by Gelman et al. Chapter 16 about design, Gelman (2023), and Towards A Principled Bayesian Workflow, by Betancourt. For more about step 5 and beyond, see also Bayesian Workflow, by Gelman et al.. Learning statistics with R, by Danielle Navarro is another good resource. We include further references below.

Scientific Goals

We want to estimate the effects of drought, fire, and herbivory on African acacia species. Many outcomes are interesting, but in this workflow we focus on height.

Team activity (5 minutes): Why is it important to estimate these effects ? What would you want to know ?

An answer

The height of a tree is a simple way to measure the size and growth rate of a tree. As a tree grows, its height usually increases. Trees that are in poor health or struggling to get key resources will grow very little and thus change little in height. Height may even decrease if upper branches die! So tree growth rate is a useful indicator of the health of a tree.

Tree height is not just a quick way to measure size; the height of a tree itself matters for many reasons. Taller trees can capture more light, which allows them to compete with other plants. Once a tree is tall enough, its leaves are above the height of fire and animals, and so it will escape these stresses. The height of a tree also matters aesthetically; tall trees are beautiful!

We expect that different species will have different heights, on average. We can keep this in mind for our statistical modeling later.

Perhaps you can add another reason why tree height is worth measuring?

Acacia Trees

Acacia trees (Miti Migunga) are a very important group of trees across the world, but particularly in Africa. These include such iconic species as the umbrella thorn and the fever tree.

Disney’s Lion King (umbrella thorn acacia ?)
Disney’s Lion King (umbrella thorn acacia ?)


Currently, the former genus Acacia has been divided into two genera: Vachellia and Senegalia.

The twelve Acacia trees in this study are as follows:

Vachellia nilotica, V. gerardii, V. robusta, V. seyal, V. sieberiana, V. tortilis, V. xanthophloea

Senegalia brevispica, S. mellifera, S. polyacantha, S. senegal, V. drepanolobium

These are referred to using a 6-letter code, created from the first three letters of the genus and species. For instance, Vachellia nilotica is “vacnil” and Senegalia brevispica is “senbre”.

Experimental design

We will get data from a factorial experiment (factors: drought, fire, herbivory) on the NM-AIST campus done by Arjun Potter and Charles Luchagula. They transplanted seedlings of 12 Acacia species from Serengeti National Park to the experimental site on campus.

Arjun Potter collecting a seed pod from Vachellia sieberiana on the NM-AIST campus.
Arjun Potter collecting a seed pod from Vachellia sieberiana on the NM-AIST campus.
Charles Luchagula using a porometer to measure stomatal conductance.
Charles Luchagula using a porometer to measure stomatal conductance.

Your ideas

Before we review the design of Arjun’s and Charles’ experiment, I want to hear your ideas !

Team activity (35 minutes): How would you design a study to estimate the effects of drought, fire, and herbivory on African acacia species heights ?

Layout of the site

The experimental site consists of 10 blocks (12 m × 12 m), 2 m apart. Drought was implemented via rainout shelters over each block.

Each block contains 9 subplots.

Each subplot contains 6 plant positions.

Team activity (13 minutes): Try to draw a possible layout like this on your paper. What questions do you have ? What have I not told you that you would like to know ? Why divide the rainout shelter blocks into subplots ?

An answer

Here is the layout for this experiment’s design:

As we discuss below, there was a small modification to the plant positions layout to optimally fit 6 plant positions within a square subplot. The layout of the blocks as 2 rows of 5 blocks was due to the size of rainout shelters and size of the whole experimental site.

We might want to divide the rainout shelter blocks into subplots if we choose to implement treatment at that intermediate level, rather than at the level of a whole block or the level of one plant at a time.

Photo of 1 block containing 9 subplots
Photo of 1 block containing 9 subplots

Random assignment of treatment

Team activity (8 minutes): Can each plant be individually randomized to receive fire, drought, or herbivory ? What are the implementation challenges of such an assignment ? Imagine if two plants growing right next to each other get different treatments; perhaps one gets fire and the other does not. Or one gets drought and the other does not. What might be difficult about this ?

An answer

Manipulating fire sounds difficult, especially in the small area around only one plant. This also sounds difficult for drought, where soil moisture might spill into neighboring plants. Herbivory might be possible to implement at the plant level, depending on how we do it.


Fire was randomly assigned at block-level, while drought and herbivory were randomly assigned at subplot-level. This meant that either the whole block received fire (for 5 out of 10 blocks) or the whole block got no fire (the remaining 5 blocks).

Team activity (12 minutes): Why do you think the researchers chose to assign fire at the block level and not the subplot level ? What are the advantages and disadvantages ?

An answer

As we thought above, manipulating fire is difficult. It may be difficult to do the controlled burns at the subplot level.

Later in the workflow, we will see there is more precision with random assignment at subplot-level as opposed to block-level. When we assign fire at the block level, we cannot distinguish between the effect of being in a block versus the effect of fire. For example, if block 3 got fire but block 7 did not get fire, is the difference between the plant heights because of the fire treatment ? Or because block 3 had slightly different soil than block 7 ? Or maybe their rainout shelters are built slightly differently ? When treatment (herbivory and drought) is assigned at a lower level like a subplot, we can compare within the same block and isolate the effect of treatment alone.


Of the 9 subplots per block, we put one aside for additional controls (we planted corn !). This leaves 8 subplots for experimentation in each block. Specifically:

  • 2 subplots get no drought and no herbivory,
  • 2 subplots get drought but no herbivory,
  • 2 subplots get no drought but they do get herbivory, and
  • 2 subplots get both drought and herbivory.

For each treatment combination above, there are two subplots. This pair of subplots gets all 12 Acacia species (6 in one and 6 in the other, arranged randomly among all plant positions).

Team activity (14 minutes): Focus on one block, with its 9 subplots. Draw one possible randomization on your paper, which subplots get which treatments ? Where are the 12 Acacia species planted (you can label them 1 through 12) ? What concerns do you have about the randomization ?

An answer

What if the randomization is unlucky and all the herbivory subplots are in the east or all the drought subplots are in the west ? The randomization will protect it from this unluckiness by “blocking”, randomizing drought and herbivory within each block. (In survey sampling, “blocking” is called “stratification”.)

However, the randomization of fire can definitely be unlucky and give fire to all blocks in the east, for example. In this case, we could rerandomize, following the procedure described here.

Implementation of treatment

Fire was implemented by putting dried grasses in a subplot and setting it afire with a lighter on four sides. The subplots of the 5 fire-assigned blocks were burning this way.

Drought was implemented with rainout shelters over each block. All rainfall was diverted into ditches, so no rain reaches the plants. Drought subplots received no at all water for 6 months ! This mimics the rainfall uncertainty that is common in East Africa.

To make the “no drought” (control plots), we added water to the subplots with a garden hose regularly on a set schedule. The amount of water in control plots was comparable to average rainfall conditions in East African savannas.

By measuring soil moisture, we can check exactly how dry the drought (and no drought) subplots are. Even when soil looks or feels dry, there is actually a small amount of water bound to the soil. Plants that grow in dry regions can suck this tiny amount of water from the soil !

Herbivory was implemented by cutting plants with scissors.

Team activity (9 minutes): Do you have any implementation questions or concerns ?

An answer

A control (watered) subplot might be next to a drought (no water) subplot. If the drought subplot could access the water from the control subplot, its treatment becomes more complicated. This interference between experimental units is discussed in “Causal Inference: What If” book by Hernán and Robins Fine Point 1.1 on page 5.

Because the soil is a heavy clay, water does not flow quickly through the soil. For this reason, small berms and trenches can stop water from flowing between the control (watered) and drought (no water) subplots.

Mathematical notation

To clarify our design and data, we define mathematical notation.

  • \(i =\) a tree, a row in our data.

Experimental design:

  • treatment factors:
    • \(\text{Fire}_i = 1\) if tree \(i\) gets fire treatment, \(=0\) otherwise
    • \(\text{Drought}_i = 1\) if tree \(i\) gets drought treatment, \(=0\) otherwise
    • \(\text{Herbivory}_i = 1\) if tree \(i\) gets herbivory treatment, \(=0\) otherwise
  • \(s[i] =\) species of tree \(i\), one of 12 species of Acacia being studied. The names of the trees are abbreviated; for instance, “vacxan” is Vachellia xanthophloea, known as “Mgunga Maji” in Kiswahili. Each species is given a unique number (Species_number).
  • tree location:
    • \(b[i] =\) block of tree \(i\), one of 10 blocks
    • \(sp[i] =\) subplot of tree \(i\), one of 9 subplots nested in a block
    • \(p[i] =\) plant position of tree \(i\), one of 6 positions nested in a subplot
  • Data:
    • \(y_i =\) height in centimeters on Feb 28, 2024 of tree \(i\)

Here is the experimental design. Each row is a tree, columns are variables: location, species, and treatment factors).

tree Fire Drought Herbivory Species Species_number Block Subplot PlantPosition is_corn
1 0 1 1 vacsey 9 1 1 1 0
2 0 1 1 senpol 4 1 1 2 0
3 0 1 1 vacger 6 1 1 3 0
4 0 1 1 albhar 1 1 1 4 0
5 0 1 1 senbre 2 1 1 5 0
6 0 1 1 sensen 5 1 1 6 0
7 0 0 1 albhar 1 1 2 1 0
8 0 0 1 vacrob 8 1 2 2 0
9 0 0 1 senmel 3 1 2 3 0
10 0 0 1 vactor 11 1 2 4 0
11 0 0 1 vacxan 12 1 2 5 0
12 0 0 1 vacnil 7 1 2 6 0
13 0 0 0 albhar 1 1 3 1 0
14 0 0 0 vacsie 10 1 3 2 0
15 0 0 0 vacrob 8 1 3 3 0
16 0 0 0 senmel 3 1 3 4 0
17 0 0 0 senbre 2 1 3 5 0
18 0 0 0 sensen 5 1 3 6 0
19 0 1 0 vacxan 12 1 4 1 0
20 0 1 0 vactor 11 1 4 2 0
21 0 1 0 vacrob 8 1 4 3 0
22 0 1 0 senpol 4 1 4 4 0
23 0 1 0 senbre 2 1 4 5 0
24 0 1 0 senmel 3 1 4 6 0
25 0 0 1 sensen 5 1 5 1 0
26 0 0 1 senbre 2 1 5 2 0
27 0 0 1 vacsie 10 1 5 3 0
28 0 0 1 vacsey 9 1 5 4 0
29 0 0 1 senpol 4 1 5 5 0
30 0 0 1 vacger 6 1 5 6 0
31 0 1 1 senmel 3 1 6 1 0
32 0 1 1 vacrob 8 1 6 2 0
33 0 1 1 vacnil 7 1 6 3 0
34 0 1 1 vactor 11 1 6 4 0
35 0 1 1 vacxan 12 1 6 5 0
36 0 1 1 vacsie 10 1 6 6 0
37 0 0 0 vacger 6 1 7 1 1
38 0 0 0 vacnil 7 1 8 1 0
39 0 0 0 senpol 4 1 8 2 0
40 0 0 0 vactor 11 1 8 3 0
41 0 0 0 vacsey 9 1 8 4 0
42 0 0 0 vacger 6 1 8 5 0
43 0 0 0 vacxan 12 1 8 6 0
44 0 1 0 sensen 5 1 9 1 0
45 0 1 0 vacsie 10 1 9 2 0
46 0 1 0 vacnil 7 1 9 3 0
47 0 1 0 vacger 6 1 9 4 0
48 0 1 0 albhar 1 1 9 5 0
49 0 1 0 vacsey 9 1 9 6 0
50 1 0 0 vacrob 8 2 1 1 1
51 1 0 0 vacxan 12 2 2 1 0
52 1 0 0 senbre 2 2 2 2 0
53 1 0 0 sensen 5 2 2 3 0
54 1 0 0 vacrob 8 2 2 4 0
55 1 0 0 senpol 4 2 2 5 0
56 1 0 0 vactor 11 2 2 6 0
57 1 0 1 vacsey 9 2 3 1 0
58 1 0 1 vacger 6 2 3 2 0
59 1 0 1 senpol 4 2 3 3 0
60 1 0 1 vactor 11 2 3 4 0
61 1 0 1 sensen 5 2 3 5 0
62 1 0 1 vacrob 8 2 3 6 0
63 1 1 1 vacsie 10 2 4 1 0
64 1 1 1 vacger 6 2 4 2 0
65 1 1 1 senmel 3 2 4 3 0
66 1 1 1 vacnil 7 2 4 4 0
67 1 1 1 vacrob 8 2 4 5 0
68 1 1 1 vacsey 9 2 4 6 0
69 1 1 0 albhar 1 2 5 1 0
70 1 1 0 senmel 3 2 5 2 0
71 1 1 0 vactor 11 2 5 3 0
72 1 1 0 vacrob 8 2 5 4 0
73 1 1 0 vacnil 7 2 5 5 0
74 1 1 0 vacger 6 2 5 6 0
75 1 1 0 senpol 4 2 6 1 0
76 1 1 0 vacsie 10 2 6 2 0
77 1 1 0 vacxan 12 2 6 3 0
78 1 1 0 vacsey 9 2 6 4 0
79 1 1 0 senbre 2 2 6 5 0
80 1 1 0 sensen 5 2 6 6 0
81 1 0 0 vacnil 7 2 7 1 0
82 1 0 0 albhar 1 2 7 2 0
83 1 0 0 vacsey 9 2 7 3 0
84 1 0 0 vacger 6 2 7 4 0
85 1 0 0 vacsie 10 2 7 5 0
86 1 0 0 senmel 3 2 7 6 0
87 1 0 1 vacsie 10 2 8 1 0
88 1 0 1 vacxan 12 2 8 2 0
89 1 0 1 vacnil 7 2 8 3 0
90 1 0 1 albhar 1 2 8 4 0
91 1 0 1 senmel 3 2 8 5 0
92 1 0 1 senbre 2 2 8 6 0
93 1 1 1 senbre 2 2 9 1 0
94 1 1 1 sensen 5 2 9 2 0
95 1 1 1 vactor 11 2 9 3 0
96 1 1 1 senpol 4 2 9 4 0
97 1 1 1 vacxan 12 2 9 5 0
98 1 1 1 albhar 1 2 9 6 0
99 1 0 0 vacnil 7 3 1 1 0
100 1 0 0 vacsie 10 3 1 2 0
101 1 0 0 vacger 6 3 1 3 0
102 1 0 0 vactor 11 3 1 4 0
103 1 0 0 sensen 5 3 1 5 0
104 1 0 0 vacrob 8 3 1 6 0
105 1 0 1 albhar 1 3 2 1 0
106 1 0 1 vacnil 7 3 2 2 0
107 1 0 1 vactor 11 3 2 3 0
108 1 0 1 senpol 4 3 2 4 0
109 1 0 1 vacxan 12 3 2 5 0
110 1 0 1 sensen 5 3 2 6 0
111 1 1 1 vacsie 10 3 3 1 0
112 1 1 1 vacger 6 3 3 2 0
113 1 1 1 senmel 3 3 3 3 0
114 1 1 1 senbre 2 3 3 4 0
115 1 1 1 vacsey 9 3 3 5 0
116 1 1 1 albhar 1 3 3 6 0
117 1 0 0 senbre 2 3 4 1 1
118 1 1 0 senpol 4 3 5 1 0
119 1 1 0 albhar 1 3 5 2 0
120 1 1 0 senmel 3 3 5 3 0
121 1 1 0 senbre 2 3 5 4 0
122 1 1 0 vacsey 9 3 5 5 0
123 1 1 0 vacnil 7 3 5 6 0
124 1 1 1 vacxan 12 3 6 1 0
125 1 1 1 vacnil 7 3 6 2 0
126 1 1 1 vacrob 8 3 6 3 0
127 1 1 1 sensen 5 3 6 4 0
128 1 1 1 senpol 4 3 6 5 0
129 1 1 1 vactor 11 3 6 6 0
130 1 1 0 vacrob 8 3 7 1 0
131 1 1 0 vacsie 10 3 7 2 0
132 1 1 0 sensen 5 3 7 3 0
133 1 1 0 vacger 6 3 7 4 0
134 1 1 0 vacxan 12 3 7 5 0
135 1 1 0 vactor 11 3 7 6 0
136 1 0 1 vacger 6 3 8 1 0
137 1 0 1 vacrob 8 3 8 2 0
138 1 0 1 senmel 3 3 8 3 0
139 1 0 1 senbre 2 3 8 4 0
140 1 0 1 vacsie 10 3 8 5 0
141 1 0 1 vacsey 9 3 8 6 0
142 1 0 0 senmel 3 3 9 1 0
143 1 0 0 vacxan 12 3 9 2 0
144 1 0 0 albhar 1 3 9 3 0
145 1 0 0 vacsey 9 3 9 4 0
146 1 0 0 senpol 4 3 9 5 0
147 1 0 0 senbre 2 3 9 6 0
148 0 0 0 vacxan 12 4 1 1 1
149 0 1 0 vacsey 9 4 2 1 0
150 0 1 0 vacsie 10 4 2 2 0
151 0 1 0 vactor 11 4 2 3 0
152 0 1 0 senbre 2 4 2 4 0
153 0 1 0 vacxan 12 4 2 5 0
154 0 1 0 senmel 3 4 2 6 0
155 0 0 0 sensen 5 4 3 1 0
156 0 0 0 vacger 6 4 3 2 0
157 0 0 0 vacxan 12 4 3 3 0
158 0 0 0 vacnil 7 4 3 4 0
159 0 0 0 senpol 4 4 3 5 0
160 0 0 0 vactor 11 4 3 6 0
161 0 1 1 vacger 6 4 4 1 0
162 0 1 1 senpol 4 4 4 2 0
163 0 1 1 vacsey 9 4 4 3 0
164 0 1 1 vacrob 8 4 4 4 0
165 0 1 1 senmel 3 4 4 5 0
166 0 1 1 sensen 5 4 4 6 0
167 0 1 0 albhar 1 4 5 1 0
168 0 1 0 vacrob 8 4 5 2 0
169 0 1 0 vacger 6 4 5 3 0
170 0 1 0 senpol 4 4 5 4 0
171 0 1 0 vacnil 7 4 5 5 0
172 0 1 0 sensen 5 4 5 6 0
173 0 0 1 senmel 3 4 6 1 0
174 0 0 1 vactor 11 4 6 2 0
175 0 0 1 vacrob 8 4 6 3 0
176 0 0 1 vacsey 9 4 6 4 0
177 0 0 1 senbre 2 4 6 5 0
178 0 0 1 senpol 4 4 6 6 0
179 0 0 1 vacsie 10 4 7 1 0
180 0 0 1 vacnil 7 4 7 2 0
181 0 0 1 vacxan 12 4 7 3 0
182 0 0 1 albhar 1 4 7 4 0
183 0 0 1 vacger 6 4 7 5 0
184 0 0 1 sensen 5 4 7 6 0
185 0 0 0 senbre 2 4 8 1 0
186 0 0 0 albhar 1 4 8 2 0
187 0 0 0 vacrob 8 4 8 3 0
188 0 0 0 senmel 3 4 8 4 0
189 0 0 0 vacsey 9 4 8 5 0
190 0 0 0 vacsie 10 4 8 6 0
191 0 1 1 vacsie 10 4 9 1 0
192 0 1 1 vactor 11 4 9 2 0
193 0 1 1 senbre 2 4 9 3 0
194 0 1 1 albhar 1 4 9 4 0
195 0 1 1 vacxan 12 4 9 5 0
196 0 1 1 vacnil 7 4 9 6 0
197 0 0 0 vacsey 9 5 1 1 0
198 0 0 0 vacxan 12 5 1 2 0
199 0 0 0 sensen 5 5 1 3 0
200 0 0 0 senmel 3 5 1 4 0
201 0 0 0 vacnil 7 5 1 5 0
202 0 0 0 vactor 11 5 1 6 0
203 0 0 1 vacnil 7 5 2 1 0
204 0 0 1 senbre 2 5 2 2 0
205 0 0 1 vactor 11 5 2 3 0
206 0 0 1 vacrob 8 5 2 4 0
207 0 0 1 vacxan 12 5 2 5 0
208 0 0 1 vacsey 9 5 2 6 0
209 0 1 0 vacrob 8 5 3 1 0
210 0 1 0 vactor 11 5 3 2 0
211 0 1 0 vacsey 9 5 3 3 0
212 0 1 0 vacsie 10 5 3 4 0
213 0 1 0 senmel 3 5 3 5 0
214 0 1 0 vacger 6 5 3 6 0
215 0 0 0 vacrob 8 5 4 1 0
216 0 0 0 vacger 6 5 4 2 0
217 0 0 0 senpol 4 5 4 3 0
218 0 0 0 vacsie 10 5 4 4 0
219 0 0 0 albhar 1 5 4 5 0
220 0 0 0 senbre 2 5 4 6 0
221 0 1 1 vacnil 7 5 5 1 0
222 0 1 1 albhar 1 5 5 2 0
223 0 1 1 senpol 4 5 5 3 0
224 0 1 1 sensen 5 5 5 4 0
225 0 1 1 vactor 11 5 5 5 0
226 0 1 1 senmel 3 5 5 6 0
227 0 0 1 senpol 4 5 6 1 0
228 0 0 1 vacsie 10 5 6 2 0
229 0 0 1 vacger 6 5 6 3 0
230 0 0 1 albhar 1 5 6 4 0
231 0 0 1 senmel 3 5 6 5 0
232 0 0 1 sensen 5 5 6 6 0
233 0 1 1 vacsie 10 5 7 1 0
234 0 1 1 vacsey 9 5 7 2 0
235 0 1 1 vacxan 12 5 7 3 0
236 0 1 1 vacrob 8 5 7 4 0
237 0 1 1 vacger 6 5 7 5 0
238 0 1 1 senbre 2 5 7 6 0
239 0 0 0 vactor 11 5 8 1 1
240 0 1 0 sensen 5 5 9 1 0
241 0 1 0 albhar 1 5 9 2 0
242 0 1 0 vacxan 12 5 9 3 0
243 0 1 0 senpol 4 5 9 4 0
244 0 1 0 senbre 2 5 9 5 0
245 0 1 0 vacnil 7 5 9 6 0
246 0 0 0 sensen 5 6 1 1 0
247 0 0 0 albhar 1 6 1 2 0
248 0 0 0 senbre 2 6 1 3 0
249 0 0 0 vactor 11 6 1 4 0
250 0 0 0 vacrob 8 6 1 5 0
251 0 0 0 vacxan 12 6 1 6 0
252 0 0 1 vacxan 12 6 2 1 0
253 0 0 1 vacsie 10 6 2 2 0
254 0 0 1 senbre 2 6 2 3 0
255 0 0 1 albhar 1 6 2 4 0
256 0 0 1 vacsey 9 6 2 5 0
257 0 0 1 senpol 4 6 2 6 0
258 0 1 1 vacsey 9 6 3 1 0
259 0 1 1 vacnil 7 6 3 2 0
260 0 1 1 vactor 11 6 3 3 0
261 0 1 1 senpol 4 6 3 4 0
262 0 1 1 senmel 3 6 3 5 0
263 0 1 1 sensen 5 6 3 6 0
264 0 1 0 sensen 5 6 4 1 0
265 0 1 0 vacger 6 6 4 2 0
266 0 1 0 senpol 4 6 4 3 0
267 0 1 0 vacsie 10 6 4 4 0
268 0 1 0 vacxan 12 6 4 5 0
269 0 1 0 vactor 11 6 4 6 0
270 0 0 0 senmel 3 6 5 1 1
271 0 0 1 vacger 6 6 6 1 0
272 0 0 1 sensen 5 6 6 2 0
273 0 0 1 vacrob 8 6 6 3 0
274 0 0 1 vactor 11 6 6 4 0
275 0 0 1 vacnil 7 6 6 5 0
276 0 0 1 senmel 3 6 6 6 0
277 0 1 0 senbre 2 6 7 1 0
278 0 1 0 vacsey 9 6 7 2 0
279 0 1 0 vacrob 8 6 7 3 0
280 0 1 0 albhar 1 6 7 4 0
281 0 1 0 vacnil 7 6 7 5 0
282 0 1 0 senmel 3 6 7 6 0
283 0 1 1 vacrob 8 6 8 1 0
284 0 1 1 vacger 6 6 8 2 0
285 0 1 1 albhar 1 6 8 3 0
286 0 1 1 vacxan 12 6 8 4 0
287 0 1 1 vacsie 10 6 8 5 0
288 0 1 1 senbre 2 6 8 6 0
289 0 0 0 vacsie 10 6 9 1 0
290 0 0 0 senmel 3 6 9 2 0
291 0 0 0 vacger 6 6 9 3 0
292 0 0 0 vacnil 7 6 9 4 0
293 0 0 0 senpol 4 6 9 5 0
294 0 0 0 vacsey 9 6 9 6 0
295 1 1 0 vacsey 9 7 1 1 0
296 1 1 0 senmel 3 7 1 2 0
297 1 1 0 vacrob 8 7 1 3 0
298 1 1 0 vacger 6 7 1 4 0
299 1 1 0 senpol 4 7 1 5 0
300 1 1 0 vacnil 7 7 1 6 0
301 1 0 0 senmel 3 7 2 1 0
302 1 0 0 vacsie 10 7 2 2 0
303 1 0 0 vacnil 7 7 2 3 0
304 1 0 0 vacger 6 7 2 4 0
305 1 0 0 albhar 1 7 2 5 0
306 1 0 0 vacrob 8 7 2 6 0
307 1 0 1 albhar 1 7 3 1 0
308 1 0 1 vactor 11 7 3 2 0
309 1 0 1 vacsey 9 7 3 3 0
310 1 0 1 vacger 6 7 3 4 0
311 1 0 1 vacsie 10 7 3 5 0
312 1 0 1 vacnil 7 7 3 6 0
313 1 0 0 vactor 11 7 4 1 0
314 1 0 0 vacsey 9 7 4 2 0
315 1 0 0 vacxan 12 7 4 3 0
316 1 0 0 senpol 4 7 4 4 0
317 1 0 0 sensen 5 7 4 5 0
318 1 0 0 senbre 2 7 4 6 0
319 1 1 1 senpol 4 7 5 1 0
320 1 1 1 senbre 2 7 5 2 0
321 1 1 1 sensen 5 7 5 3 0
322 1 1 1 vactor 11 7 5 4 0
323 1 1 1 vacxan 12 7 5 5 0
324 1 1 1 senmel 3 7 5 6 0
325 1 1 1 vacger 6 7 6 1 0
326 1 1 1 albhar 1 7 6 2 0
327 1 1 1 vacnil 7 7 6 3 0
328 1 1 1 vacrob 8 7 6 4 0
329 1 1 1 vacsie 10 7 6 5 0
330 1 1 1 vacsey 9 7 6 6 0
331 1 0 1 sensen 5 7 7 1 0
332 1 0 1 senpol 4 7 7 2 0
333 1 0 1 vacrob 8 7 7 3 0
334 1 0 1 senbre 2 7 7 4 0
335 1 0 1 vacxan 12 7 7 5 0
336 1 0 1 senmel 3 7 7 6 0
337 1 0 0 albhar 1 7 8 1 1
338 1 1 0 sensen 5 7 9 1 0
339 1 1 0 vactor 11 7 9 2 0
340 1 1 0 albhar 1 7 9 3 0
341 1 1 0 vacsie 10 7 9 4 0
342 1 1 0 senbre 2 7 9 5 0
343 1 1 0 vacxan 12 7 9 6 0
344 1 1 0 vacger 6 8 1 1 0
345 1 1 0 albhar 1 8 1 2 0
346 1 1 0 senpol 4 8 1 3 0
347 1 1 0 vacsie 10 8 1 4 0
348 1 1 0 senbre 2 8 1 5 0
349 1 1 0 sensen 5 8 1 6 0
350 1 0 0 sensen 5 8 2 1 0
351 1 0 0 senmel 3 8 2 2 0
352 1 0 0 senpol 4 8 2 3 0
353 1 0 0 vacsey 9 8 2 4 0
354 1 0 0 vacger 6 8 2 5 0
355 1 0 0 vactor 11 8 2 6 0
356 1 0 0 vacsey 9 8 3 1 1
357 1 1 0 vacsey 9 8 4 1 0
358 1 1 0 vactor 11 8 4 2 0
359 1 1 0 vacnil 7 8 4 3 0
360 1 1 0 vacrob 8 8 4 4 0
361 1 1 0 senmel 3 8 4 5 0
362 1 1 0 vacxan 12 8 4 6 0
363 1 0 1 senbre 2 8 5 1 0
364 1 0 1 senpol 4 8 5 2 0
365 1 0 1 senmel 3 8 5 3 0
366 1 0 1 vactor 11 8 5 4 0
367 1 0 1 vacger 6 8 5 5 0
368 1 0 1 vacsie 10 8 5 6 0
369 1 1 1 senbre 2 8 6 1 0
370 1 1 1 sensen 5 8 6 2 0
371 1 1 1 vacsie 10 8 6 3 0
372 1 1 1 vacrob 8 8 6 4 0
373 1 1 1 vacxan 12 8 6 5 0
374 1 1 1 senpol 4 8 6 6 0
375 1 0 1 vacxan 12 8 7 1 0
376 1 0 1 albhar 1 8 7 2 0
377 1 0 1 vacrob 8 8 7 3 0
378 1 0 1 sensen 5 8 7 4 0
379 1 0 1 vacnil 7 8 7 5 0
380 1 0 1 vacsey 9 8 7 6 0
381 1 1 1 albhar 1 8 8 1 0
382 1 1 1 vacnil 7 8 8 2 0
383 1 1 1 vacger 6 8 8 3 0
384 1 1 1 vactor 11 8 8 4 0
385 1 1 1 senmel 3 8 8 5 0
386 1 1 1 vacsey 9 8 8 6 0
387 1 0 0 vacrob 8 8 9 1 0
388 1 0 0 vacsie 10 8 9 2 0
389 1 0 0 senbre 2 8 9 3 0
390 1 0 0 albhar 1 8 9 4 0
391 1 0 0 vacnil 7 8 9 5 0
392 1 0 0 vacxan 12 8 9 6 0
393 0 1 0 senmel 3 9 1 1 0
394 0 1 0 vacxan 12 9 1 2 0
395 0 1 0 albhar 1 9 1 3 0
396 0 1 0 senbre 2 9 1 4 0
397 0 1 0 vacrob 8 9 1 5 0
398 0 1 0 vacnil 7 9 1 6 0
399 0 0 1 vacsey 9 9 2 1 0
400 0 0 1 sensen 5 9 2 2 0
401 0 0 1 albhar 1 9 2 3 0
402 0 0 1 senpol 4 9 2 4 0
403 0 0 1 senmel 3 9 2 5 0
404 0 0 1 vacger 6 9 2 6 0
405 0 0 0 sensen 5 9 3 1 1
406 0 0 0 sensen 5 9 4 1 0
407 0 0 0 vacsie 10 9 4 2 0
408 0 0 0 vacrob 8 9 4 3 0
409 0 0 0 senbre 2 9 4 4 0
410 0 0 0 albhar 1 9 4 5 0
411 0 0 0 vactor 11 9 4 6 0
412 0 1 1 vacxan 12 9 5 1 0
413 0 1 1 albhar 1 9 5 2 0
414 0 1 1 sensen 5 9 5 3 0
415 0 1 1 vacnil 7 9 5 4 0
416 0 1 1 vacsey 9 9 5 5 0
417 0 1 1 vactor 11 9 5 6 0
418 0 1 1 senmel 3 9 6 1 0
419 0 1 1 senbre 2 9 6 2 0
420 0 1 1 vacger 6 9 6 3 0
421 0 1 1 senpol 4 9 6 4 0
422 0 1 1 vacrob 8 9 6 5 0
423 0 1 1 vacsie 10 9 6 6 0
424 0 0 0 vacger 6 9 7 1 0
425 0 0 0 senpol 4 9 7 2 0
426 0 0 0 vacsey 9 9 7 3 0
427 0 0 0 vacxan 12 9 7 4 0
428 0 0 0 senmel 3 9 7 5 0
429 0 0 0 vacnil 7 9 7 6 0
430 0 1 0 senpol 4 9 8 1 0
431 0 1 0 vacsey 9 9 8 2 0
432 0 1 0 vactor 11 9 8 3 0
433 0 1 0 sensen 5 9 8 4 0
434 0 1 0 vacsie 10 9 8 5 0
435 0 1 0 vacger 6 9 8 6 0
436 0 0 1 vactor 11 9 9 1 0
437 0 0 1 vacxan 12 9 9 2 0
438 0 0 1 vacsie 10 9 9 3 0
439 0 0 1 senbre 2 9 9 4 0
440 0 0 1 vacrob 8 9 9 5 0
441 0 0 1 vacnil 7 9 9 6 0
442 1 1 0 senmel 3 10 1 1 0
443 1 1 0 vactor 11 10 1 2 0
444 1 1 0 albhar 1 10 1 3 0
445 1 1 0 vacsie 10 10 1 4 0
446 1 1 0 vacxan 12 10 1 5 0
447 1 1 0 vacger 6 10 1 6 0
448 1 0 0 senpol 4 10 2 1 0
449 1 0 0 senmel 3 10 2 2 0
450 1 0 0 sensen 5 10 2 3 0
451 1 0 0 senbre 2 10 2 4 0
452 1 0 0 vacsie 10 10 2 5 0
453 1 0 0 vacrob 8 10 2 6 0
454 1 0 1 vacrob 8 10 3 1 0
455 1 0 1 albhar 1 10 3 2 0
456 1 0 1 senmel 3 10 3 3 0
457 1 0 1 vactor 11 10 3 4 0
458 1 0 1 vacsie 10 10 3 5 0
459 1 0 1 vacnil 7 10 3 6 0
460 1 0 0 vacnil 7 10 4 1 1
461 1 0 1 vacger 6 10 5 1 0
462 1 0 1 vacxan 12 10 5 2 0
463 1 0 1 senpol 4 10 5 3 0
464 1 0 1 sensen 5 10 5 4 0
465 1 0 1 vacsey 9 10 5 5 0
466 1 0 1 senbre 2 10 5 6 0
467 1 1 0 sensen 5 10 6 1 0
468 1 1 0 vacrob 8 10 6 2 0
469 1 1 0 vacsey 9 10 6 3 0
470 1 1 0 senpol 4 10 6 4 0
471 1 1 0 senbre 2 10 6 5 0
472 1 1 0 vacnil 7 10 6 6 0
473 1 1 1 senbre 2 10 7 1 0
474 1 1 1 albhar 1 10 7 2 0
475 1 1 1 vactor 11 10 7 3 0
476 1 1 1 vacsey 9 10 7 4 0
477 1 1 1 vacnil 7 10 7 5 0
478 1 1 1 senmel 3 10 7 6 0
479 1 1 1 vacsie 10 10 8 1 0
480 1 1 1 senpol 4 10 8 2 0
481 1 1 1 vacger 6 10 8 3 0
482 1 1 1 vacrob 8 10 8 4 0
483 1 1 1 vacxan 12 10 8 5 0
484 1 1 1 sensen 5 10 8 6 0
485 1 0 0 vacxan 12 10 9 1 0
486 1 0 0 vactor 11 10 9 2 0
487 1 0 0 vacsey 9 10 9 3 0
488 1 0 0 vacnil 7 10 9 4 0
489 1 0 0 vacger 6 10 9 5 0
490 1 0 0 albhar 1 10 9 6 0

Team activity (6 minutes): what is \(s[3]\) ?

An answer

Tree \(i=3\) in row 3 of the data is the “vacger” species. If you look at the Species_number column, you can see that I’ve labeled this Species_number 6 (out of 12 species we are studying).

So,

\(s[3] = 6\) or vacger.

What this means is the species number for the third row (the third plant) is number 6.


Team activity (6 minutes): What is the plain English meaning of \(\text{Fire}_4\) ? And what does it equal ? And what does this mean ?

An answer

\(\text{Fire}_4\) refers to the presence or absence of fire for the 4th tree (tree \(i=4\) in row 4 of the data).

The experimental design tells us that \(\text{Fire}_4 = 0\). This means that the tree number 4 gets no fire (it was not burned).


Modeling

To estimate the effects of drought, fire, and herbivory on the acacia species, we compare heights across groups.

But we have small numbers of trees in each group.

Team activity (8 minutes): How many total trees do we have ? How many trees are Vachellia gerardii that got fire, drought, and not herbivory ?

An answer

10 blocks * (8 regular subplots * 6 trees) = 480 total trees.

5 out of the 10 blocks got fire. In each of these 5 blocks, there is a pair of subplots that got drought and not herbivory. But the species Vachellia gerardii is only in one of them, since the 12 acacia species we are studying are split across the pair of subplots. So there are only 5 Vachellia gerardii plants that got fire, drought, and not herbivory.


We could use similar groups (Vachellia gerardii plants that got fire, drought, and herbivory) to help get better estimates of effects.

Team activity (9 minutes): We want to study drought, herbivory, and fire. What are some things we don’t want to study that might affect tree height ?

An answer

We may be able to explain some of the variation in heights by a plant’s location: how far south or east is it ? where is it within a block’s rainout shelter ? And maybe some rainout shelters are slightly different than others in a way that affects plant height.


We combine data across groups and our knowledge of the experimental design together into a statistical model, a mathematical story of how the data could be generated. We use probability distributions to express both randomness in nature and our uncertainty. For example, the blocks might be a little bit different. Maybe one block’s rainout shelter might get more sunlight or less sunlight, which might make some plants a little bit taller or a little bit shorter. We express these small differences with block effects, drawn from a probability distribution. As another example, we have uncertainty about the effects of fire, so we express the fire effects as being drawn from a probability distribution.

A useful statistical model will help us get better estimates than simply comparing heights across groups.

The big picture

Tree height varies by treatment, species, and location. We might try to predict the height of a tree by:

\[\begin{align*} \text{predicted height} &= \text{average height} \\ &\qquad + \text{factor (fire, drought, herbivory) effects} \\ &\qquad + \text{species-specific effects} \\ &\qquad + \text{location effects} \end{align*}\]

Team activity (6 minutes): Which parts of the model do you think will be most important to predict acacia tree height ?

An answer

I don’t have an answer here, because I don’t know about acacia trees. But as a statistician, I imagine that the goal of the experiment is for location effects to be minimized so that we can focus on the effects of interest: fire, drought, herbivory, and variation across species.


The above model is not fully in math yet. Next we will express each part mathematically.

Above, we defined mathematical notation for our data using English letters, like measured tree height \(y_i\).

Mathematical notation for the model parameters usually uses Greek letters, like \(\mu\) and \(\beta\).

For example, let \(\mu_i =\) the model’s predicted height of tree \(i\) (measured height is the data \(y_i\)). A small model might predict tree height based on whether the tree got the fire treatment:

\[\begin{align*} \mu_i &= \beta_0 + \beta^{\text{fire}} \text{Fire}_i \end{align*}\]

We can use this model in two directions:

  • Simulating: model parameters \(\Rightarrow\) data

  • Fitting: data \(\Rightarrow\) model parameters

Last year (2023) we guessed model parameters based on what we think we know about acacia trees: how tall do we expect them to get after a few months ? how do we expect them to react to fire ? For example, \(\beta_0 = 20\) and \(\beta^{\text{fire}} = -2\).

We then used these guessed parameters to simulate fake data. For example, a plant that gets fire has a predicted height of \(\beta_0 + \beta^{\text{fire}} = 20 - 2 = 18\). This plant’s measured height will be a bit different due to model error, so we add as a random number from a probability distribution to get (for example) 17.836089.

We then fit the model to get estimated model parameters, which should be close to our original guesses. This is a good way to check that we are using model fitting software correctly. It can help us to see how precise we expect our results to be. This is very useful in experimental design so that we can see if we will have results we can publish. In reality, most researchers skip this fake data simulation step. While not ideal, it can be ok if using an experimental design and model that has been used by many past similar experiments.

This year (2024) we have real data we can use to fit the model to get estimated model parameters that are of scientific interest.

Continuing our small example, suppose we have data on the heights of 6 trees. Three are treated with fire and three are not. The heights of the control trees are \((19,20,21)\) and the fire treatments trees are \((17,18,19)\). The average height of all control trees is 20 cm. The average height of all Fire trees is 18 cm. The fire effect can be estimated as the difference \(\beta^{\text{fire}} = 18 - 20 = -2\).

Average height

Let \(\beta_0\) be the average height of the trees across all species for the control (no fire, no drought, no herbivory). Last year (2023) we guessed it would be 20 centimeters, but that was assuming a shorter experiment (ending in October 2023, not February 2024).

beta_0 = 20

Factor effects

Let’s write a small part our model for the tree heights in math:

\[\begin{align*} \mu_i &= \beta_0 + \beta^{\text{fire}} \text{Fire}_i + \beta^{\text{drought}} \text{Drought}_i + \beta^{\text{herbivory}} \text{Herbivory}_i \end{align*}\]

Our notation:

  • \(i =\) a tree
  • \(\text{Fire}_i = 1\) if tree \(i\) gets fire treatment, \(=0\) otherwise
  • \(\text{Drought}_i = 1\) if tree \(i\) gets drought treatment, \(=0\) otherwise
  • \(\text{Herbivory}_i = 1\) if tree \(i\) gets herbivory treatment, \(=0\) otherwise
  • \(\mu_i =\) the model’s predicted height of tree \(i\) (measured height is \(y_i\))

Team activity (9 minutes): What does our model predict will be the height of a tree in the control group ?

An answer

The control group gets none of the treatments, so \(\text{Fire}_i = 0\), \(\text{Drought}_i = 0\), and \(\text{Herbivory}_i = 0\). So the model predicts the control height will be:

\[\mu_i = \beta_0 + \beta^{\text{fire}}*0 + \beta^{\text{drought}}*0 + \beta^{\text{herbivory}}*0\] which is just \(\beta_0\).

Team activity (9 minutes): What does our model predict will be the height of a tree that gets fire treatment but not drought or herbivory ?

An answer

This group gets \(\text{Fire}_i = 1\), but \(\text{Drought}_i = 0\) and \(\text{Herbivory}_i = 0\). So the model predicts their height will be:

\[\mu_i = \beta_0 + \beta^{\text{fire}}*1 + \beta^{\text{drought}}*0 + \beta^{\text{herbivory}}*0\] which is \(\beta_0 + \beta^{\text{fire}}\).

Team activity (9 minutes): What does our model predict will be the height of a tree that gets fire and drought but not the herbivory treatment ?

An answer

This group gets \(\text{Fire}_i = 1\) and \(\text{Drought}_i = 1\), but \(\text{Herbivory}_i = 0\). So the model predicts their height will be:

\[\mu_i = \beta_0 + \beta^{\text{fire}}*1 + \beta^{\text{drought}}*1 + \beta^{\text{herbivory}}*0\] which is \(\beta_0 + \beta^{\text{fire}} + \beta^{\text{drought}}\).

Last year (2023) we guessed effects of fire, drought, and herbivory. We were perhaps way too optimistic about how resistant acacias are to these effects ! We said fire would reduce height by only 2 centimeters, drought by 4 centimeters, and herbivory by 2 centimeters.

beta_fire = -2
beta_drought = -4
beta_herbivory = -2

Interactions of factors

Suppose fire alone reduces height by 2 centimeters \(\beta^{\text{fire}} = -2\) and drought alone reduces height by 4 centimeters \(\beta^{\text{drought}} = -4\). If a plant is exposed to both fire and drought, do you expect its height to be reduced by 6 centimeters ? Or more ? Or less ? Is both fire and drought different than the sum of the two effects ?

Last year (2023) we guessed that having both fire and drought was even worse than the sum of both negative effects. In other words, that the interaction between them is negative, \(\beta^{\text{fire},\text{drought}} = -2\).

beta_fire_drought = -2
beta_fire_herbivory = -2
beta_drought_herbivory = -2

Again, let’s write a small part our model for the tree heights in math:

\[\begin{align*} \mu_i &= \beta_0 \\ &\qquad + \beta^{\text{fire}} \text{Fire}_i + \beta^{\text{drought}} \text{Drought}_i + \beta^{\text{herbivory}} \text{Herbivory}_i \\ &\qquad + \beta^{\text{fire},\text{drought}} \text{Fire}_i \text{Drought}_i + \beta^{\text{fire},\text{herbivory}} \text{Fire}_i \text{Herbivory}_i + \beta^{\text{drought},\text{herbivory}} \text{Drought}_i \text{Herbivory}_i \end{align*}\]

Interactions can be visualized as lines that are not parallel.

Team activity (9 minutes): What does our model predict will be the height of a tree that gets fire and drought but not the herbivory treatment ?

An answer

This group gets \(\text{Fire}_i = 1\) and \(\text{Drought}_i = 1\), but \(\text{Herbivory}_i = 0\). So the model predicts their height will be:

\[\mu_i = \beta_0 + \beta^{\text{fire}}*1 + \beta^{\text{drought}}*1 + \beta^{\text{herbivory}}*0 + \beta^{\text{fire},\text{drought}} *1*1 + \beta^{\text{fire},\text{herbivory}} *1*0 + \beta^{\text{drought},\text{herbivory}} *1*0\] which is \(\beta_0 + \beta^{\text{fire}} + \beta^{\text{drought}} + \beta^{\text{fire},\text{drought}}\).

Team activity (12 minutes): Can you redraw the interaction picture for a positive interaction between fire and drought ? What is the story ?

An answer

The 2-way interaction terms could be positive if the main effects are very large, because a plant can’t die twice. So layering two effects, the overall combined effect isn’t as bad as the two summed together.

We will see below that this ends up happening in the real data !

Species-specific effects

So far we’ve been talking in averages across 12 species of acacia trees. Last year (2023) we guessed the average height in the control group is most likely around 20 centimeters, \(\beta_0 = 20\), but with the longer experiment, the truth is closer to 100 centimeters.

What about the variability across species ? Maybe senegalia brevispica is particularly short, so it is 15 centimeters below average, \(\beta^{\text{species}}_{\text{senegalia brevispica}} = -15\). Or maybe albizia harveyi is particularly tall, so it is 17 centimeters above average, \(\beta^{\text{species}}_{\text{albizia harveyi}} = 17\).

To think about this, let’s learn about (or review) the Normal Distribution. It looks like this, where the standard deviation tells us the width (how much variation there is across species):

The drawing shows that species deviate from the average with a standard deviation of 30 centimeters, \(\beta^{\text{species}}_s \sim N(0,30)\).

Unlike is shown in the drawing, last year (2023) we guessed that the shortest species is 8 centimeters below average and the tallest is 8 centimeters above average. In other words, we expected 95% of species to be between 8 centimeters below average and 8 centimeters above average. We can express this mathematically with a Normal distribution with standard deviation 4 for the species-specific differences from the average: \(\beta^{\text{species}}_s \sim N(0,4)\).

In our mathematical story, each species’s height is the average height plus a choice from this Normal distribution:

Species-specific treatment effects

How much do the 12 species differ from each other in the effects of fire, drought, and herbivory ?

Last year (2023) we guessed that the worst effected species is 2 centimeters below average and the most resistant species is 2 centimeters above average. Then we can express this mathematically with a Normal distribution with standard deviation 2 for the species-specific differences from the average:

\[\beta^{\text{species,fire}}_s \sim N(0,1)\] And similar for drought and herbivory.

beta_species = rnorm(n = 12, mean = 0, sd = 4)
beta_species_fire = rnorm(n = 12, mean = 0, sd = 1)
beta_species_drought = rnorm(n = 12, mean = 0, sd = 1) 
beta_species_herbivory = rnorm(n = 12, mean = 0, sd = 1)

Location effects

We do not directly aim to study the effect of plant location, but it might impact plant heights so we want to model it.

Team activity (8 minutes): How might a plant’s location in the experiment affect its growth ? Look back at the experimental design layout and think about each plant’s position.

An answer

If the soil or gradient changes across the experiment, we might expect the height in centimeters to change as the plant position moves south or east. Within a block, the position of the tree within the rainout shelter could affect its growth as well. Beyond the south/east directional effects, there could be block effects due to unintended differences in the rainout shelters.

Blocks 6-10 are located just downhill of Blocks 1-5. This might influence the soil in small but measurable ways.

Storms on the Nelson Mandela campus come from the east. This means that rain tends to be blown by the wind a bit. A small amount of rain is likely to blown into the east side of the experiment (Blocks 5 and 10). A tiny amount of rain might make it to the east sides of each Block between the roofs of the shelters.

The sun rises in the east, travels approximately north (the campus being south of the equator), and sets in the west. So the eastern side of the whole experiment (and potentially the individual blocks) might get more morning sun, and the west side might get more afternoon sun.

These small differences in sunlight and rain might influence our trees. We can attempt to measure and account for this.

Last year (2023) we created variables \(\text{South}_i\), \(\text{East}_i\), \(\text{SouthInBlock}_i\), and \(\text{EastInBlock}_i\) for each tree \(i\) based on its plant position.

Details about corn controls Additionally, each block has a corn plant to capture environmental differences. Blocks with higher corn heights may have more hospitable conditions for acacia growth. For example, the coefficient of corn height, \(\beta^{\text{corn}}\), is the expected change in height in centimeters per change in corn height.

The final analysis will include all this location data. But this week we focus on the factor modeling and species variability. So we will only model the block effects due to unintended differences in the rainout shelters.

Last year we guessed this was around \(\pm 4\) centimeters (after accounting for other location data):

\[\beta^{\text{block}}_b \sim N(0,2)\]

sd_block = 2
beta_block = rnorm(n = 10, mean = 0, sd = sd_block)

Model error

Our model describes how tree height varies by treatment, species, and location. We can use the model to predict the height of a tree, but each individual tree measurement \(y_i\) will vary around this prediction \(\mu_i\). The difference \(\epsilon_i = y_i - \mu_i\) can be called “model error” or “individual tree effect”. It can be positive or negative.

\[\text{tree height } y_i = \text{model's predicted height } \mu_i + \text{model's error } \epsilon_i\]

Let’s look at model error for 3 trees: \(y_1\), \(y_2\), and \(y_3\):

Team activity (6 minutes): What might be special to a tree that isn’t in our model already ? In other words, what do you think is in the model error term \(\epsilon_i\) ?

An answer

Each seedling is genetically unique. Just like children in a family, they are all a little different ! These genetic differences might cause some trees to grow more quickly than others, resulting in differences in height.

Certain seedlings might be randomly eaten by insects or infected by various plant diseases. This might slow the growth of trees.

Some trees might get nibbled by rabbits or small antelopes (dikdik). This of course can change their height.

During the transplanting, some seedlings had their roots damaged. Other trees just were not happy to be transplanted, for unknown reasons. These transplant effects caused some individual seedlings to drop leaves or even cause branches to die. This can cause the height to decrease a bit.

Finally, measurement variation ! Different scientists may get different results when measuring the same plant at the same time.

sd_model_error = 1

Putting it together

We can now write our model for the tree heights in math (Greek letters !):

\[\begin{align*} y_i &= \beta_0 \\ &\qquad + \beta^{\text{fire}} \text{Fire}_i + \beta^{\text{drought}} \text{Drought}_i + \beta^{\text{herbivory}} \text{Herbivory}_i \\ &\qquad + \beta^{\text{fire},\text{drought}} \text{Fire}_i \text{Drought}_i + \beta^{\text{fire},\text{herbivory}} \text{Fire}_i \text{Herbivory}_i + \beta^{\text{drought},\text{herbivory}} \text{Drought}_i \text{Herbivory}_i \\ &\qquad + \beta^{\text{species}}_{s[i]} \\ &\qquad + \beta^{\text{species},\text{fire}}_{s[i]} \text{Fire}_i + \beta^{\text{species},\text{drought}}_{s[i]} \text{Drought}_i + \beta^{\text{species},\text{herbivory}}_{s[i]} \text{Herbivory}_i \\ &\qquad + \beta^{\text{block}}_{b[i]} \\ &\qquad + \epsilon_i \end{align*}\]

Consider tree \(i = 1\) in the first row of data below:

tree Fire Drought Herbivory Species Species_number Block
1 0 1 1 vacsey 9 1
2 0 1 1 senpol 4 1
3 0 1 1 vacger 6 1
4 0 1 1 albhar 1 1
5 0 1 1 senbre 2 1
6 0 1 1 sensen 5 1
7 0 0 1 albhar 1 1
8 0 0 1 vacrob 8 1
9 0 0 1 senmel 3 1
10 0 0 1 vactor 11 1
11 0 0 1 vacxan 12 1
12 0 0 1 vacnil 7 1
13 0 0 0 albhar 1 1
14 0 0 0 vacsie 10 1
15 0 0 0 vacrob 8 1
16 0 0 0 senmel 3 1
17 0 0 0 senbre 2 1
18 0 0 0 sensen 5 1
19 0 1 0 vacxan 12 1
20 0 1 0 vactor 11 1
21 0 1 0 vacrob 8 1
22 0 1 0 senpol 4 1
23 0 1 0 senbre 2 1
24 0 1 0 senmel 3 1
25 0 0 1 sensen 5 1
26 0 0 1 senbre 2 1
27 0 0 1 vacsie 10 1
28 0 0 1 vacsey 9 1
29 0 0 1 senpol 4 1
30 0 0 1 vacger 6 1
31 0 1 1 senmel 3 1
32 0 1 1 vacrob 8 1
33 0 1 1 vacnil 7 1
34 0 1 1 vactor 11 1
35 0 1 1 vacxan 12 1
36 0 1 1 vacsie 10 1
37 0 0 0 vacger 6 1
38 0 0 0 vacnil 7 1
39 0 0 0 senpol 4 1
40 0 0 0 vactor 11 1
41 0 0 0 vacsey 9 1
42 0 0 0 vacger 6 1
43 0 0 0 vacxan 12 1
44 0 1 0 sensen 5 1
45 0 1 0 vacsie 10 1
46 0 1 0 vacnil 7 1
47 0 1 0 vacger 6 1
48 0 1 0 albhar 1 1
49 0 1 0 vacsey 9 1
50 1 0 0 vacrob 8 2
51 1 0 0 vacxan 12 2
52 1 0 0 senbre 2 2
53 1 0 0 sensen 5 2
54 1 0 0 vacrob 8 2
55 1 0 0 senpol 4 2
56 1 0 0 vactor 11 2
57 1 0 1 vacsey 9 2
58 1 0 1 vacger 6 2
59 1 0 1 senpol 4 2
60 1 0 1 vactor 11 2
61 1 0 1 sensen 5 2
62 1 0 1 vacrob 8 2
63 1 1 1 vacsie 10 2
64 1 1 1 vacger 6 2
65 1 1 1 senmel 3 2
66 1 1 1 vacnil 7 2
67 1 1 1 vacrob 8 2
68 1 1 1 vacsey 9 2
69 1 1 0 albhar 1 2
70 1 1 0 senmel 3 2
71 1 1 0 vactor 11 2
72 1 1 0 vacrob 8 2
73 1 1 0 vacnil 7 2
74 1 1 0 vacger 6 2
75 1 1 0 senpol 4 2
76 1 1 0 vacsie 10 2
77 1 1 0 vacxan 12 2
78 1 1 0 vacsey 9 2
79 1 1 0 senbre 2 2
80 1 1 0 sensen 5 2
81 1 0 0 vacnil 7 2
82 1 0 0 albhar 1 2
83 1 0 0 vacsey 9 2
84 1 0 0 vacger 6 2
85 1 0 0 vacsie 10 2
86 1 0 0 senmel 3 2
87 1 0 1 vacsie 10 2
88 1 0 1 vacxan 12 2
89 1 0 1 vacnil 7 2
90 1 0 1 albhar 1 2
91 1 0 1 senmel 3 2
92 1 0 1 senbre 2 2
93 1 1 1 senbre 2 2
94 1 1 1 sensen 5 2
95 1 1 1 vactor 11 2
96 1 1 1 senpol 4 2
97 1 1 1 vacxan 12 2
98 1 1 1 albhar 1 2
99 1 0 0 vacnil 7 3
100 1 0 0 vacsie 10 3
101 1 0 0 vacger 6 3
102 1 0 0 vactor 11 3
103 1 0 0 sensen 5 3
104 1 0 0 vacrob 8 3
105 1 0 1 albhar 1 3
106 1 0 1 vacnil 7 3
107 1 0 1 vactor 11 3
108 1 0 1 senpol 4 3
109 1 0 1 vacxan 12 3
110 1 0 1 sensen 5 3
111 1 1 1 vacsie 10 3
112 1 1 1 vacger 6 3
113 1 1 1 senmel 3 3
114 1 1 1 senbre 2 3
115 1 1 1 vacsey 9 3
116 1 1 1 albhar 1 3
117 1 0 0 senbre 2 3
118 1 1 0 senpol 4 3
119 1 1 0 albhar 1 3
120 1 1 0 senmel 3 3
121 1 1 0 senbre 2 3
122 1 1 0 vacsey 9 3
123 1 1 0 vacnil 7 3
124 1 1 1 vacxan 12 3
125 1 1 1 vacnil 7 3
126 1 1 1 vacrob 8 3
127 1 1 1 sensen 5 3
128 1 1 1 senpol 4 3
129 1 1 1 vactor 11 3
130 1 1 0 vacrob 8 3
131 1 1 0 vacsie 10 3
132 1 1 0 sensen 5 3
133 1 1 0 vacger 6 3
134 1 1 0 vacxan 12 3
135 1 1 0 vactor 11 3
136 1 0 1 vacger 6 3
137 1 0 1 vacrob 8 3
138 1 0 1 senmel 3 3
139 1 0 1 senbre 2 3
140 1 0 1 vacsie 10 3
141 1 0 1 vacsey 9 3
142 1 0 0 senmel 3 3
143 1 0 0 vacxan 12 3
144 1 0 0 albhar 1 3
145 1 0 0 vacsey 9 3
146 1 0 0 senpol 4 3
147 1 0 0 senbre 2 3
148 0 0 0 vacxan 12 4
149 0 1 0 vacsey 9 4
150 0 1 0 vacsie 10 4
151 0 1 0 vactor 11 4
152 0 1 0 senbre 2 4
153 0 1 0 vacxan 12 4
154 0 1 0 senmel 3 4
155 0 0 0 sensen 5 4
156 0 0 0 vacger 6 4
157 0 0 0 vacxan 12 4
158 0 0 0 vacnil 7 4
159 0 0 0 senpol 4 4
160 0 0 0 vactor 11 4
161 0 1 1 vacger 6 4
162 0 1 1 senpol 4 4
163 0 1 1 vacsey 9 4
164 0 1 1 vacrob 8 4
165 0 1 1 senmel 3 4
166 0 1 1 sensen 5 4
167 0 1 0 albhar 1 4
168 0 1 0 vacrob 8 4
169 0 1 0 vacger 6 4
170 0 1 0 senpol 4 4
171 0 1 0 vacnil 7 4
172 0 1 0 sensen 5 4
173 0 0 1 senmel 3 4
174 0 0 1 vactor 11 4
175 0 0 1 vacrob 8 4
176 0 0 1 vacsey 9 4
177 0 0 1 senbre 2 4
178 0 0 1 senpol 4 4
179 0 0 1 vacsie 10 4
180 0 0 1 vacnil 7 4
181 0 0 1 vacxan 12 4
182 0 0 1 albhar 1 4
183 0 0 1 vacger 6 4
184 0 0 1 sensen 5 4
185 0 0 0 senbre 2 4
186 0 0 0 albhar 1 4
187 0 0 0 vacrob 8 4
188 0 0 0 senmel 3 4
189 0 0 0 vacsey 9 4
190 0 0 0 vacsie 10 4
191 0 1 1 vacsie 10 4
192 0 1 1 vactor 11 4
193 0 1 1 senbre 2 4
194 0 1 1 albhar 1 4
195 0 1 1 vacxan 12 4
196 0 1 1 vacnil 7 4
197 0 0 0 vacsey 9 5
198 0 0 0 vacxan 12 5
199 0 0 0 sensen 5 5
200 0 0 0 senmel 3 5
201 0 0 0 vacnil 7 5
202 0 0 0 vactor 11 5
203 0 0 1 vacnil 7 5
204 0 0 1 senbre 2 5
205 0 0 1 vactor 11 5
206 0 0 1 vacrob 8 5
207 0 0 1 vacxan 12 5
208 0 0 1 vacsey 9 5
209 0 1 0 vacrob 8 5
210 0 1 0 vactor 11 5
211 0 1 0 vacsey 9 5
212 0 1 0 vacsie 10 5
213 0 1 0 senmel 3 5
214 0 1 0 vacger 6 5
215 0 0 0 vacrob 8 5
216 0 0 0 vacger 6 5
217 0 0 0 senpol 4 5
218 0 0 0 vacsie 10 5
219 0 0 0 albhar 1 5
220 0 0 0 senbre 2 5
221 0 1 1 vacnil 7 5
222 0 1 1 albhar 1 5
223 0 1 1 senpol 4 5
224 0 1 1 sensen 5 5
225 0 1 1 vactor 11 5
226 0 1 1 senmel 3 5
227 0 0 1 senpol 4 5
228 0 0 1 vacsie 10 5
229 0 0 1 vacger 6 5
230 0 0 1 albhar 1 5
231 0 0 1 senmel 3 5
232 0 0 1 sensen 5 5
233 0 1 1 vacsie 10 5
234 0 1 1 vacsey 9 5
235 0 1 1 vacxan 12 5
236 0 1 1 vacrob 8 5
237 0 1 1 vacger 6 5
238 0 1 1 senbre 2 5
239 0 0 0 vactor 11 5
240 0 1 0 sensen 5 5
241 0 1 0 albhar 1 5
242 0 1 0 vacxan 12 5
243 0 1 0 senpol 4 5
244 0 1 0 senbre 2 5
245 0 1 0 vacnil 7 5
246 0 0 0 sensen 5 6
247 0 0 0 albhar 1 6
248 0 0 0 senbre 2 6
249 0 0 0 vactor 11 6
250 0 0 0 vacrob 8 6
251 0 0 0 vacxan 12 6
252 0 0 1 vacxan 12 6
253 0 0 1 vacsie 10 6
254 0 0 1 senbre 2 6
255 0 0 1 albhar 1 6
256 0 0 1 vacsey 9 6
257 0 0 1 senpol 4 6
258 0 1 1 vacsey 9 6
259 0 1 1 vacnil 7 6
260 0 1 1 vactor 11 6
261 0 1 1 senpol 4 6
262 0 1 1 senmel 3 6
263 0 1 1 sensen 5 6
264 0 1 0 sensen 5 6
265 0 1 0 vacger 6 6
266 0 1 0 senpol 4 6
267 0 1 0 vacsie 10 6
268 0 1 0 vacxan 12 6
269 0 1 0 vactor 11 6
270 0 0 0 senmel 3 6
271 0 0 1 vacger 6 6
272 0 0 1 sensen 5 6
273 0 0 1 vacrob 8 6
274 0 0 1 vactor 11 6
275 0 0 1 vacnil 7 6
276 0 0 1 senmel 3 6
277 0 1 0 senbre 2 6
278 0 1 0 vacsey 9 6
279 0 1 0 vacrob 8 6
280 0 1 0 albhar 1 6
281 0 1 0 vacnil 7 6
282 0 1 0 senmel 3 6
283 0 1 1 vacrob 8 6
284 0 1 1 vacger 6 6
285 0 1 1 albhar 1 6
286 0 1 1 vacxan 12 6
287 0 1 1 vacsie 10 6
288 0 1 1 senbre 2 6
289 0 0 0 vacsie 10 6
290 0 0 0 senmel 3 6
291 0 0 0 vacger 6 6
292 0 0 0 vacnil 7 6
293 0 0 0 senpol 4 6
294 0 0 0 vacsey 9 6
295 1 1 0 vacsey 9 7
296 1 1 0 senmel 3 7
297 1 1 0 vacrob 8 7
298 1 1 0 vacger 6 7
299 1 1 0 senpol 4 7
300 1 1 0 vacnil 7 7
301 1 0 0 senmel 3 7
302 1 0 0 vacsie 10 7
303 1 0 0 vacnil 7 7
304 1 0 0 vacger 6 7
305 1 0 0 albhar 1 7
306 1 0 0 vacrob 8 7
307 1 0 1 albhar 1 7
308 1 0 1 vactor 11 7
309 1 0 1 vacsey 9 7
310 1 0 1 vacger 6 7
311 1 0 1 vacsie 10 7
312 1 0 1 vacnil 7 7
313 1 0 0 vactor 11 7
314 1 0 0 vacsey 9 7
315 1 0 0 vacxan 12 7
316 1 0 0 senpol 4 7
317 1 0 0 sensen 5 7
318 1 0 0 senbre 2 7
319 1 1 1 senpol 4 7
320 1 1 1 senbre 2 7
321 1 1 1 sensen 5 7
322 1 1 1 vactor 11 7
323 1 1 1 vacxan 12 7
324 1 1 1 senmel 3 7
325 1 1 1 vacger 6 7
326 1 1 1 albhar 1 7
327 1 1 1 vacnil 7 7
328 1 1 1 vacrob 8 7
329 1 1 1 vacsie 10 7
330 1 1 1 vacsey 9 7
331 1 0 1 sensen 5 7
332 1 0 1 senpol 4 7
333 1 0 1 vacrob 8 7
334 1 0 1 senbre 2 7
335 1 0 1 vacxan 12 7
336 1 0 1 senmel 3 7
337 1 0 0 albhar 1 7
338 1 1 0 sensen 5 7
339 1 1 0 vactor 11 7
340 1 1 0 albhar 1 7
341 1 1 0 vacsie 10 7
342 1 1 0 senbre 2 7
343 1 1 0 vacxan 12 7
344 1 1 0 vacger 6 8
345 1 1 0 albhar 1 8
346 1 1 0 senpol 4 8
347 1 1 0 vacsie 10 8
348 1 1 0 senbre 2 8
349 1 1 0 sensen 5 8
350 1 0 0 sensen 5 8
351 1 0 0 senmel 3 8
352 1 0 0 senpol 4 8
353 1 0 0 vacsey 9 8
354 1 0 0 vacger 6 8
355 1 0 0 vactor 11 8
356 1 0 0 vacsey 9 8
357 1 1 0 vacsey 9 8
358 1 1 0 vactor 11 8
359 1 1 0 vacnil 7 8
360 1 1 0 vacrob 8 8
361 1 1 0 senmel 3 8
362 1 1 0 vacxan 12 8
363 1 0 1 senbre 2 8
364 1 0 1 senpol 4 8
365 1 0 1 senmel 3 8
366 1 0 1 vactor 11 8
367 1 0 1 vacger 6 8
368 1 0 1 vacsie 10 8
369 1 1 1 senbre 2 8
370 1 1 1 sensen 5 8
371 1 1 1 vacsie 10 8
372 1 1 1 vacrob 8 8
373 1 1 1 vacxan 12 8
374 1 1 1 senpol 4 8
375 1 0 1 vacxan 12 8
376 1 0 1 albhar 1 8
377 1 0 1 vacrob 8 8
378 1 0 1 sensen 5 8
379 1 0 1 vacnil 7 8
380 1 0 1 vacsey 9 8
381 1 1 1 albhar 1 8
382 1 1 1 vacnil 7 8
383 1 1 1 vacger 6 8
384 1 1 1 vactor 11 8
385 1 1 1 senmel 3 8
386 1 1 1 vacsey 9 8
387 1 0 0 vacrob 8 8
388 1 0 0 vacsie 10 8
389 1 0 0 senbre 2 8
390 1 0 0 albhar 1 8
391 1 0 0 vacnil 7 8
392 1 0 0 vacxan 12 8
393 0 1 0 senmel 3 9
394 0 1 0 vacxan 12 9
395 0 1 0 albhar 1 9
396 0 1 0 senbre 2 9
397 0 1 0 vacrob 8 9
398 0 1 0 vacnil 7 9
399 0 0 1 vacsey 9 9
400 0 0 1 sensen 5 9
401 0 0 1 albhar 1 9
402 0 0 1 senpol 4 9
403 0 0 1 senmel 3 9
404 0 0 1 vacger 6 9
405 0 0 0 sensen 5 9
406 0 0 0 sensen 5 9
407 0 0 0 vacsie 10 9
408 0 0 0 vacrob 8 9
409 0 0 0 senbre 2 9
410 0 0 0 albhar 1 9
411 0 0 0 vactor 11 9
412 0 1 1 vacxan 12 9
413 0 1 1 albhar 1 9
414 0 1 1 sensen 5 9
415 0 1 1 vacnil 7 9
416 0 1 1 vacsey 9 9
417 0 1 1 vactor 11 9
418 0 1 1 senmel 3 9
419 0 1 1 senbre 2 9
420 0 1 1 vacger 6 9
421 0 1 1 senpol 4 9
422 0 1 1 vacrob 8 9
423 0 1 1 vacsie 10 9
424 0 0 0 vacger 6 9
425 0 0 0 senpol 4 9
426 0 0 0 vacsey 9 9
427 0 0 0 vacxan 12 9
428 0 0 0 senmel 3 9
429 0 0 0 vacnil 7 9
430 0 1 0 senpol 4 9
431 0 1 0 vacsey 9 9
432 0 1 0 vactor 11 9
433 0 1 0 sensen 5 9
434 0 1 0 vacsie 10 9
435 0 1 0 vacger 6 9
436 0 0 1 vactor 11 9
437 0 0 1 vacxan 12 9
438 0 0 1 vacsie 10 9
439 0 0 1 senbre 2 9
440 0 0 1 vacrob 8 9
441 0 0 1 vacnil 7 9
442 1 1 0 senmel 3 10
443 1 1 0 vactor 11 10
444 1 1 0 albhar 1 10
445 1 1 0 vacsie 10 10
446 1 1 0 vacxan 12 10
447 1 1 0 vacger 6 10
448 1 0 0 senpol 4 10
449 1 0 0 senmel 3 10
450 1 0 0 sensen 5 10
451 1 0 0 senbre 2 10
452 1 0 0 vacsie 10 10
453 1 0 0 vacrob 8 10
454 1 0 1 vacrob 8 10
455 1 0 1 albhar 1 10
456 1 0 1 senmel 3 10
457 1 0 1 vactor 11 10
458 1 0 1 vacsie 10 10
459 1 0 1 vacnil 7 10
460 1 0 0 vacnil 7 10
461 1 0 1 vacger 6 10
462 1 0 1 vacxan 12 10
463 1 0 1 senpol 4 10
464 1 0 1 sensen 5 10
465 1 0 1 vacsey 9 10
466 1 0 1 senbre 2 10
467 1 1 0 sensen 5 10
468 1 1 0 vacrob 8 10
469 1 1 0 vacsey 9 10
470 1 1 0 senpol 4 10
471 1 1 0 senbre 2 10
472 1 1 0 vacnil 7 10
473 1 1 1 senbre 2 10
474 1 1 1 albhar 1 10
475 1 1 1 vactor 11 10
476 1 1 1 vacsey 9 10
477 1 1 1 vacnil 7 10
478 1 1 1 senmel 3 10
479 1 1 1 vacsie 10 10
480 1 1 1 senpol 4 10
481 1 1 1 vacger 6 10
482 1 1 1 vacrob 8 10
483 1 1 1 vacxan 12 10
484 1 1 1 sensen 5 10
485 1 0 0 vacxan 12 10
486 1 0 0 vactor 11 10
487 1 0 0 vacsey 9 10
488 1 0 0 vacnil 7 10
489 1 0 0 vacger 6 10
490 1 0 0 albhar 1 10

We can fill in variables to see the model for tree \(i=1\):

\[\begin{align*} y_1 &= \beta_0 \\ &\qquad + \beta^{\text{fire}} * 0 + \beta^{\text{drought}} * 1 + \beta^{\text{herbivory}} * 1 \\ &\qquad + \beta^{\text{fire},\text{drought}} * 0 * 1 + \beta^{\text{fire},\text{herbivory}} * 0 * 1 + \beta^{\text{drought},\text{herbivory}} * 1 * 1 \\ &\qquad + \beta^{\text{species}}_{9} \\ &\qquad + \beta^{\text{species},\text{fire}}_{9} * 0 + \beta^{\text{species},\text{drought}}_{9} * 1 + \beta^{\text{species},\text{herbivory}}_{9} * 1 \\ &\qquad + \beta^{\text{block}}_{1} \\ &\qquad + \epsilon_1 \end{align*}\]

Team activity (9 minutes): What would the model be for tree \(i=1\) if it got no treatment (no fire, no drought, no herbivory) ? Keep the species and plant position variables the same.

An answer

\[\begin{align*} y_1 &= \beta_0 \\ &\qquad + \beta^{\text{species}}_{9} \\ &\qquad + \beta^{\text{block}}_{1} \\ &\qquad + \epsilon_1 \end{align*}\]

We can also write our model for the tree heights in code and use it to simulate fake tree heights:

simulated_data = experimental_design %>% 
  mutate(linear_predictor = beta_0 + 
           beta_fire*Fire +
           beta_drought*Drought +
           beta_herbivory*Herbivory +
           beta_fire_drought*Fire*Drought +
           beta_fire_herbivory*Fire*Herbivory +
           beta_drought_herbivory*Drought*Herbivory +
           beta_species[Species_number] + 
           beta_species_fire[Species_number]*Fire +
           beta_species_drought[Species_number]*Drought +
           beta_species_herbivory[Species_number]*Herbivory +
           beta_block[Block],
         y = rnorm(n = n(), mean = linear_predictor, sd = sd_model_error))

Simulating from the model, we get a collection of tree heights in centimeters. We can make a histogram of these heights to check that the minimum and maximum are reasonable:

hist(simulated_data$y)

Team activity (8 minutes): What do you notice about these simulated fake tree heights ? What seems reasonable ? What seems unreasonable ?

An answer

Tree heights shouldn’t be negative ! Otherwise things look ok. We could modify the model to prevent negative tree heights. We can do this by transforming the height variable, or by using a model other than the Normal distribution. We will explore this in future courses.

What isn’t in the model ?

Team activity (8 minutes): We could make our mathematical story more complicated. What isn’t in the model ?

An answer

We could add a 3-way interaction between fire, drought, and herbivory. We could let all the interactions vary by species. We could include subplot effects.


More complicated models are more difficult to understand and use. If a simpler model is realistic-enough, we can start there.

Fitting the model to data

Great ! We have a mathematical story about how the data could be generated.

To fit the model, we will use an R package called brms, which stands for “Bayesian Regression Models using Stan” (Stan fits Bayesian models).

What is “Bayesian” ?

“Bayesian” is a way of thinking about what we know and what we don’t know. For example, consider the average height in the control group, \(\beta_0\).

Above, we used the Normal Distribution to describe how species vary around this average:

We can also use the Normal Distribution to describe our certainty about the average height (sometimes summarized as a 95% interval):

This is confusing ! Let’s think about it more.

Team activity (8 minutes): Suppose I collect data on millions of acacia trees. Which of the two Normal distributions above will get skinnier ?

An answer

Collecting more data helps us to become more certain about the average height of acacia trees. It will not reduce the variability across acacia species. So only the second Normal distribution (the one describing our certainty) should get skinnier.


It is useful to keep track of our certainty using probability distributions. Let’s now consider the effect of fire, \(\beta^{\text{fire}}\).

Prior to the experiment, we know something about the effect of fire. We do not expect it to be -50000 centimeters, because plants are unlikely to be that tall in the first place. Maybe we expect it to be negative, but we are not certain.

So we conduct an experiment and collect data.

After the experiment, we hopefully know more about the effect of fire.

Bayesian thinking helps us keep track of our uncertainty before and after the experiment.

Let’s look at this in a picture, where we show what we know in terms of probability distributions (like the Normal distribution we discussed above):


Multilevel modeling

A Bayesian approach uses probability distributions to express what we know, i.e. our certainty. This is especially useful when there is structure to what we know. For example, we have 12 species \(\beta^{\text{species}}_1,...,\beta^{\text{species}}_{12}\). So when we get data, we can learn about how these 12 species vary, and use them to inform each other.

This technique is called multilevel modeling (also known as hierarchical modeling) and will be covered in a future course. If you want to read ahead, see:

Fitting to simulated data

Before we have real data, we can learn a lot from fitting the data we simulated from our model.

How much precision do we have to estimate the effect of Fire \(\beta^{\text{fire}}\) ?

Estimate Est.Error
Intercept 18.4 2.0
Fire -1.2 1.6
Drought -4.2 0.3
Herbivory -1.8 0.4
Fire:Drought -2.1 0.2
Fire:Herbivory -2.2 0.2
Drought:Herbivory -1.9 0.2

Team activity (7 minutes): Which effect are we least certain about, \(\beta^{\text{fire}}\), \(\beta^{\text{drought}}\), or \(\beta^{\text{herbivory}}\) ?

An answer

We estimate \(\beta^{\text{fire}}\) to be -1.1 centimeters with estimated error of 1.5 centimeters. We estimate \(\beta^{\text{drought}}\) to be -4.2 centimeters with estimated error of 0.3 centimeters. We estimate \(\beta^{\text{herbivory}}\) to be -1.8 centimeters with estimated error of 0.4 centimeters. So we have least precision for the fire effect.


Improving precision for the fire effect

Team activity (8 minutes): Recall the experimental design. Why do you think precision is worse for the fire effect than for drought and herbivory effects ?

An answer

When we assign fire at the block level, we cannot distinguish between the effect of being in a block versus the effect of fire. For example, if block 3 got fire but block 7 did not get fire, is the difference between the plant heights because of the fire treatment ? Or because block 3 had slightly different soil than block 7 ? Or maybe their rainout shelters are built slightly differently ? When treatment (herbivory and drought) is assigned at a lower level like a subplot, we can compare within the same block and isolate the effect of treatment alone.

But manipulating fire is more difficult than manipulating drought or herbivory. It would be difficult to do the controlled burns at the subplot level.

Team activity (8 minutes): How would you propose to improve the precision of the fire effect ?

An answer

Making the rainout shelters as similar as possible will reduce the size of the block effects. This will make it possible to estimate the effect of fire with more precision. Let’s see this in the mathematical story.

Previously, we assumed block effects with standard deviation 2 centimeters:

\[\beta^{\text{block}}_b \sim N(0,2)\]

Now let’s assume block effects have standard deviation of only 0.2 centimeters:

\[\beta^{\text{block}}_b \sim N(0,0.2)\]

sd_block = 0.2
beta_block = rnorm(n = 10, mean = 0, sd = sd_block)

We refit the model with these new block effects and see these estimates for the effects:

Estimate Est.Error
Intercept 19.4 1.7
Fire -1.3 0.4
Drought -4.2 0.3
Herbivory -1.7 0.5
Fire:Drought -1.9 0.2
Fire:Herbivory -2.0 0.2
Drought:Herbivory -1.9 0.2
Randomization at block-level vs subplot-level references

Imbens (2014)

10 treatment states scattered about. In this case, assuming iid errors does not seem right: instead, one might cluster the errors by state. Now when above a certain amount increasing within-state sample size won’t do much for power…

Suppose we think of the outcomes being affected by individual level characteristics as well as state effects, with the latter modeled as random effects, independent of the treatment…the variance estimates based on assuming there are no state effects would be underestimating the true variance

Gelman et al. (2022)

As Imbens (2014) points out, with only one treatment village and one control village, the treatment e!ect cannot be separated from the difference in village effects.

Su and Ding (2021)

Cluster-randomized experiments are widely used due to their logistical convenience and policy relevance. To analyse them properly, we must address the fact that the treatment is assigned at the cluster level instead of the individual level…Feller and Gelman (2015) reviewed the standard approach of using hierarchical models for causal inference with cluster-randomized experiments.

Feller and Gelman (2015)

In cluster-randomized experiments, every unit in the cluster receives the same treatment; in other words, randomization occurs at the group level…

Fitting to real data

So let’s read in the real data (using Tanzanian football players as code names for species to protect the data before publication):

tree y Fire Drought Herbivory Species Species_number Block Subplot PlantPosition
1 80.5 0 1 1 Mao 10 1 1 1
2 20.5 0 1 1 Nyoni 4 1 1 2
3 39.9 0 1 1 Yondani 8 1 1 3
4 7.1 0 1 1 Samatta 1 1 1 4
5 0.0 0 1 1 Msuva 2 1 1 5
6 27.3 0 1 1 Kawawa 7 1 1 6
7 42.0 0 0 1 Samatta 1 1 2 1
8 86.8 0 0 1 Kapombe 9 1 2 2
9 182.0 0 0 1 Bocco 3 1 2 3
10 83.0 0 0 1 Mnoga 12 1 2 4
11 52.0 0 0 1 Ngassa 5 1 2 5
12 53.9 0 0 1 Manula 6 1 2 6
13 212.0 0 0 0 Samatta 1 1 3 1
14 94.6 0 0 0 Kaseja 11 1 3 2
15 58.1 0 0 0 Kapombe 9 1 3 3
16 240.0 0 0 0 Bocco 3 1 3 4
17 29.4 0 0 0 Msuva 2 1 3 5
18 111.0 0 0 0 Kawawa 7 1 3 6
19 23.4 0 1 0 Ngassa 5 1 4 1
20 39.2 0 1 0 Mnoga 12 1 4 2
21 31.2 0 1 0 Kapombe 9 1 4 3
22 29.0 0 1 0 Nyoni 4 1 4 4
23 27.0 0 1 0 Msuva 2 1 4 5
24 37.2 0 1 0 Bocco 3 1 4 6
25 76.5 0 0 1 Kawawa 7 1 5 1
26 31.4 0 0 1 Msuva 2 1 5 2
27 87.7 0 0 1 Kaseja 11 1 5 3
28 91.3 0 0 1 Mao 10 1 5 4
29 106.0 0 0 1 Nyoni 4 1 5 5
30 25.0 0 0 1 Yondani 8 1 5 6
31 147.3 0 1 1 Bocco 3 1 6 1
32 78.8 0 1 1 Kapombe 9 1 6 2
33 40.0 0 1 1 Manula 6 1 6 3
34 33.4 0 1 1 Mnoga 12 1 6 4
35 33.3 0 1 1 Ngassa 5 1 6 5
36 57.7 0 1 1 Kaseja 11 1 6 6
37 59.5 0 0 0 Manula 6 1 8 1
38 96.7 0 0 0 Nyoni 4 1 8 2
39 131.2 0 0 0 Mnoga 12 1 8 3
40 125.0 0 0 0 Mao 10 1 8 4
41 0.0 0 0 0 Yondani 8 1 8 5
42 64.5 0 0 0 Ngassa 5 1 8 6
43 87.3 0 1 0 Kawawa 7 1 9 1
44 88.4 0 1 0 Kaseja 11 1 9 2
45 17.5 0 1 0 Manula 6 1 9 3
46 44.0 0 1 0 Yondani 8 1 9 4
47 143.0 0 1 0 Samatta 1 1 9 5
48 95.0 0 1 0 Mao 10 1 9 6
49 23.0 1 0 0 Ngassa 5 2 2 1
50 0.0 1 0 0 Msuva 2 2 2 2
51 49.0 1 0 0 Kawawa 7 2 2 3
52 95.8 1 0 0 Kapombe 9 2 2 4
53 0.0 1 0 0 Nyoni 4 2 2 5
54 21.0 1 0 0 Mnoga 12 2 2 6
55 0.0 1 0 1 Mao 10 2 3 1
56 59.0 1 0 1 Yondani 8 2 3 2
57 0.0 1 0 1 Nyoni 4 2 3 3
58 47.5 1 0 1 Mnoga 12 2 3 4
59 51.3 1 0 1 Kawawa 7 2 3 5
60 30.5 1 0 1 Kapombe 9 2 3 6
61 0.0 1 1 1 Kaseja 11 2 4 1
62 13.3 1 1 1 Yondani 8 2 4 2
63 76.0 1 1 1 Bocco 3 2 4 3
64 0.0 1 1 1 Manula 6 2 4 4
65 18.4 1 1 1 Kapombe 9 2 4 5
66 21.0 1 1 1 Mao 10 2 4 6
67 0.0 1 1 0 Samatta 1 2 5 1
68 53.0 1 1 0 Bocco 3 2 5 2
69 0.0 1 1 0 Mnoga 12 2 5 3
70 19.8 1 1 0 Kapombe 9 2 5 4
71 25.0 1 1 0 Manula 6 2 5 5
72 0.0 1 1 0 Yondani 8 2 5 6
73 46.0 1 1 0 Nyoni 4 2 6 1
74 0.0 1 1 0 Kaseja 11 2 6 2
75 0.0 1 1 0 Ngassa 5 2 6 3
76 27.0 1 1 0 Mao 10 2 6 4
77 0.0 1 1 0 Msuva 2 2 6 5
78 22.0 1 1 0 Kawawa 7 2 6 6
79 60.4 1 0 0 Manula 6 2 7 1
80 0.0 1 0 0 Samatta 1 2 7 2
81 96.0 1 0 0 Mao 10 2 7 3
82 49.0 1 0 0 Yondani 8 2 7 4
83 53.5 1 0 0 Kaseja 11 2 7 5
84 0.0 1 0 0 Bocco 3 2 7 6
85 0.0 1 0 1 Kaseja 11 2 8 1
86 0.0 1 0 1 Ngassa 5 2 8 2
87 23.5 1 0 1 Manula 6 2 8 3
88 0.0 1 0 1 Samatta 1 2 8 4
89 121.6 1 0 1 Bocco 3 2 8 5
90 0.0 1 0 1 Msuva 2 2 8 6
91 0.0 1 1 1 Msuva 2 2 9 1
92 0.0 1 1 1 Kawawa 7 2 9 2
93 0.0 1 1 1 Mnoga 12 2 9 3
94 0.0 1 1 1 Nyoni 4 2 9 4
95 14.0 1 1 1 Ngassa 5 2 9 5
96 0.0 1 1 1 Samatta 1 2 9 6
97 71.0 1 0 0 Manula 6 3 1 1
98 50.5 1 0 0 Kaseja 11 3 1 2
99 117.5 1 0 0 Yondani 8 3 1 3
100 0.0 1 0 0 Mnoga 12 3 1 4
101 45.2 1 0 0 Kawawa 7 3 1 5
102 61.5 1 0 0 Kapombe 9 3 1 6
103 0.0 1 0 1 Samatta 1 3 2 1
104 59.0 1 0 1 Manula 6 3 2 2
105 24.8 1 0 1 Mnoga 12 3 2 3
106 46.6 1 0 1 Nyoni 4 3 2 4
107 45.0 1 0 1 Ngassa 5 3 2 5
108 49.5 1 0 1 Kawawa 7 3 2 6
109 0.0 1 1 1 Kaseja 11 3 3 1
110 11.4 1 1 1 Yondani 8 3 3 2
111 49.0 1 1 1 Bocco 3 3 3 3
112 0.0 1 1 1 Msuva 2 3 3 4
113 14.0 1 1 1 Mao 10 3 3 5
114 0.0 1 1 1 Samatta 1 3 3 6
115 34.4 1 1 0 Nyoni 4 3 5 1
116 0.0 1 1 0 Samatta 1 3 5 2
117 82.0 1 1 0 Bocco 3 3 5 3
118 0.0 1 1 0 Msuva 2 3 5 4
119 23.5 1 1 0 Mao 10 3 5 5
120 0.0 1 1 0 Manula 6 3 5 6
121 0.0 1 1 1 Ngassa 5 3 6 1
122 0.0 1 1 1 Manula 6 3 6 2
123 23.5 1 1 1 Kapombe 9 3 6 3
124 33.0 1 1 1 Kawawa 7 3 6 4
125 20.5 1 1 1 Nyoni 4 3 6 5
126 10.0 1 1 1 Mnoga 12 3 6 6
127 0.0 1 1 0 Kapombe 9 3 7 1
128 13.2 1 1 0 Kaseja 11 3 7 2
129 0.0 1 1 0 Kawawa 7 3 7 3
130 14.0 1 1 0 Yondani 8 3 7 4
131 0.0 1 1 0 Ngassa 5 3 7 5
132 0.0 1 1 0 Mnoga 12 3 7 6
133 29.5 1 0 1 Yondani 8 3 8 1
134 83.4 1 0 1 Kapombe 9 3 8 2
135 86.0 1 0 1 Bocco 3 3 8 3
136 0.0 1 0 1 Msuva 2 3 8 4
137 33.0 1 0 1 Kaseja 11 3 8 5
138 10.0 1 0 1 Mao 10 3 8 6
139 117.5 1 0 0 Bocco 3 3 9 1
140 29.0 1 0 0 Ngassa 5 3 9 2
141 0.0 1 0 0 Samatta 1 3 9 3
142 43.2 1 0 0 Mao 10 3 9 4
143 0.0 1 0 0 Nyoni 4 3 9 5
144 0.0 1 0 0 Msuva 2 3 9 6
145 79.0 0 1 0 Mao 10 4 2 1
146 24.3 0 1 0 Kaseja 11 4 2 2
147 60.3 0 1 0 Mnoga 12 4 2 3
148 42.1 0 1 0 Msuva 2 4 2 4
149 51.2 0 1 0 Ngassa 5 4 2 5
150 68.4 0 1 0 Bocco 3 4 2 6
151 89.5 0 0 0 Kawawa 7 4 3 1
152 111.8 0 0 0 Yondani 8 4 3 2
153 58.5 0 0 0 Ngassa 5 4 3 3
154 0.0 0 0 0 Manula 6 4 3 4
155 22.6 0 0 0 Nyoni 4 4 3 5
156 74.5 0 0 0 Mnoga 12 4 3 6
157 23.5 0 1 1 Yondani 8 4 4 1
158 20.6 0 1 1 Nyoni 4 4 4 2
159 24.1 0 1 1 Mao 10 4 4 3
160 14.3 0 1 1 Kapombe 9 4 4 4
161 48.2 0 1 1 Bocco 3 4 4 5
162 48.0 0 1 1 Kawawa 7 4 4 6
163 10.1 0 1 0 Samatta 1 4 5 1
164 37.0 0 1 0 Kapombe 9 4 5 2
165 52.0 0 1 0 Yondani 8 4 5 3
166 38.9 0 1 0 Nyoni 4 4 5 4
167 20.5 0 1 0 Manula 6 4 5 5
168 35.5 0 1 0 Kawawa 7 4 5 6
169 169.5 0 0 1 Bocco 3 4 6 1
170 66.7 0 0 1 Mnoga 12 4 6 2
171 64.0 0 0 1 Kapombe 9 4 6 3
172 79.4 0 0 1 Mao 10 4 6 4
173 51.8 0 0 1 Msuva 2 4 6 5
174 48.0 0 0 1 Nyoni 4 4 6 6
175 79.0 0 0 1 Kaseja 11 4 7 1
176 93.2 0 0 1 Manula 6 4 7 2
177 54.6 0 0 1 Ngassa 5 4 7 3
178 146.6 0 0 1 Samatta 1 4 7 4
179 68.4 0 0 1 Yondani 8 4 7 5
180 90.5 0 0 1 Kawawa 7 4 7 6
181 50.7 0 0 0 Msuva 2 4 8 1
182 21.6 0 0 0 Samatta 1 4 8 2
183 158.0 0 0 0 Kapombe 9 4 8 3
184 205.0 0 0 0 Bocco 3 4 8 4
185 127.0 0 0 0 Mao 10 4 8 5
186 80.6 0 0 0 Kaseja 11 4 8 6
187 44.8 0 1 1 Kaseja 11 4 9 1
188 24.5 0 1 1 Mnoga 12 4 9 2
189 39.3 0 1 1 Msuva 2 4 9 3
190 25.2 0 1 1 Samatta 1 4 9 4
191 0.0 0 1 1 Ngassa 5 4 9 5
192 94.3 0 1 1 Manula 6 4 9 6
193 35.2 0 0 0 Mao 10 5 1 1
194 27.0 0 0 0 Ngassa 5 5 1 2
195 83.2 0 0 0 Kawawa 7 5 1 3
196 175.0 0 0 0 Bocco 3 5 1 4
197 104.5 0 0 0 Manula 6 5 1 5
198 105.0 0 0 0 Mnoga 12 5 1 6
199 42.0 0 0 1 Manula 6 5 2 1
200 33.0 0 0 1 Msuva 2 5 2 2
201 44.0 0 0 1 Mnoga 12 5 2 3
202 52.0 0 0 1 Kapombe 9 5 2 4
203 19.0 0 0 1 Ngassa 5 5 2 5
204 75.0 0 0 1 Mao 10 5 2 6
205 22.0 0 1 0 Kapombe 9 5 3 1
206 38.0 0 1 0 Mnoga 12 5 3 2
207 63.2 0 1 0 Mao 10 5 3 3
208 33.0 0 1 0 Kaseja 11 5 3 4
209 45.2 0 1 0 Bocco 3 5 3 5
210 29.8 0 1 0 Yondani 8 5 3 6
211 100.8 0 0 0 Kapombe 9 5 4 1
212 48.0 0 0 0 Yondani 8 5 4 2
213 120.5 0 0 0 Nyoni 4 5 4 3
214 134.5 0 0 0 Kaseja 11 5 4 4
215 190.0 0 0 0 Samatta 1 5 4 5
216 57.6 0 0 0 Msuva 2 5 4 6
217 30.0 0 1 1 Manula 6 5 5 1
218 9.5 0 1 1 Samatta 1 5 5 2
219 20.0 0 1 1 Nyoni 4 5 5 3
220 16.5 0 1 1 Kawawa 7 5 5 4
221 40.1 0 1 1 Mnoga 12 5 5 5
222 77.2 0 1 1 Bocco 3 5 5 6
223 45.8 0 0 1 Nyoni 4 5 6 1
224 70.0 0 0 1 Kaseja 11 5 6 2
225 28.5 0 0 1 Yondani 8 5 6 3
226 21.3 0 0 1 Samatta 1 5 6 4
227 118.0 0 0 1 Bocco 3 5 6 5
228 67.0 0 0 1 Kawawa 7 5 6 6
229 46.0 0 1 1 Kaseja 11 5 7 1
230 20.0 0 1 1 Mao 10 5 7 2
231 19.3 0 1 1 Ngassa 5 5 7 3
232 23.0 0 1 1 Kapombe 9 5 7 4
233 11.0 0 1 1 Yondani 8 5 7 5
234 14.5 0 1 1 Msuva 2 5 7 6
235 51.5 0 1 0 Kawawa 7 5 9 1
236 6.3 0 1 0 Samatta 1 5 9 2
237 20.8 0 1 0 Ngassa 5 5 9 3
238 18.8 0 1 0 Nyoni 4 5 9 4
239 7.0 0 1 0 Msuva 2 5 9 5
240 19.6 0 1 0 Manula 6 5 9 6
241 158.3 0 0 0 Kawawa 7 6 1 1
242 31.3 0 0 0 Samatta 1 6 1 2
243 103.0 0 0 0 Msuva 2 6 1 3
244 120.0 0 0 0 Mnoga 12 6 1 4
245 121.0 0 0 0 Kapombe 9 6 1 5
246 115.6 0 0 0 Ngassa 5 6 1 6
247 54.0 0 0 1 Ngassa 5 6 2 1
248 85.6 0 0 1 Kaseja 11 6 2 2
249 51.5 0 0 1 Msuva 2 6 2 3
250 143.0 0 0 1 Samatta 1 6 2 4
251 140.0 0 0 1 Mao 10 6 2 5
252 69.4 0 0 1 Nyoni 4 6 2 6
253 39.0 0 1 1 Mao 10 6 3 1
254 83.0 0 1 1 Manula 6 6 3 2
255 52.0 0 1 1 Mnoga 12 6 3 3
256 34.5 0 1 1 Nyoni 4 6 3 4
257 195.0 0 1 1 Bocco 3 6 3 5
258 118.0 0 1 1 Kawawa 7 6 3 6
259 65.0 0 1 0 Kawawa 7 6 4 1
260 67.0 0 1 0 Yondani 8 6 4 2
261 43.2 0 1 0 Nyoni 4 6 4 3
262 58.0 0 1 0 Kaseja 11 6 4 4
263 27.2 0 1 0 Ngassa 5 6 4 5
264 50.0 0 1 0 Mnoga 12 6 4 6
265 58.0 0 0 1 Yondani 8 6 6 1
266 49.0 0 0 1 Kawawa 7 6 6 2
267 91.1 0 0 1 Kapombe 9 6 6 3
268 40.3 0 0 1 Mnoga 12 6 6 4
269 54.2 0 0 1 Manula 6 6 6 5
270 135.4 0 0 1 Bocco 3 6 6 6
271 0.0 0 1 0 Msuva 2 6 7 1
272 29.0 0 1 0 Mao 10 6 7 2
273 62.7 0 1 0 Kapombe 9 6 7 3
274 0.0 0 1 0 Samatta 1 6 7 4
275 0.0 0 1 0 Manula 6 6 7 5
276 139.5 0 1 0 Bocco 3 6 7 6
277 18.4 0 1 1 Kapombe 9 6 8 1
278 48.8 0 1 1 Yondani 8 6 8 2
279 18.4 0 1 1 Samatta 1 6 8 3
280 20.9 0 1 1 Ngassa 5 6 8 4
281 27.0 0 1 1 Kaseja 11 6 8 5
282 18.5 0 1 1 Msuva 2 6 8 6
283 51.2 0 0 0 Kaseja 11 6 9 1
284 121.8 0 0 0 Bocco 3 6 9 2
285 40.5 0 0 0 Yondani 8 6 9 3
286 103.0 0 0 0 Manula 6 6 9 4
287 105.5 0 0 0 Nyoni 4 6 9 5
288 90.0 0 0 0 Mao 10 6 9 6
289 40.5 1 1 0 Mao 10 7 1 1
290 0.0 1 1 0 Bocco 3 7 1 2
291 0.0 1 1 0 Kapombe 9 7 1 3
292 31.5 1 1 0 Yondani 8 7 1 4
293 0.0 1 1 0 Nyoni 4 7 1 5
294 0.0 1 1 0 Manula 6 7 1 6
295 111.0 1 0 0 Bocco 3 7 2 1
296 58.0 1 0 0 Kaseja 11 7 2 2
297 0.0 1 0 0 Manula 6 7 2 3
298 41.5 1 0 0 Yondani 8 7 2 4
299 0.0 1 0 0 Samatta 1 7 2 5
300 0.0 1 0 0 Kapombe 9 7 2 6
301 0.0 1 0 1 Samatta 1 7 3 1
302 50.9 1 0 1 Mnoga 12 7 3 2
303 63.0 1 0 1 Mao 10 7 3 3
304 61.4 1 0 1 Yondani 8 7 3 4
305 0.0 1 0 1 Kaseja 11 7 3 5
306 20.5 1 0 1 Manula 6 7 3 6
307 18.8 1 0 0 Mnoga 12 7 4 1
308 59.4 1 0 0 Mao 10 7 4 2
309 0.0 1 0 0 Ngassa 5 7 4 3
310 50.0 1 0 0 Nyoni 4 7 4 4
311 43.5 1 0 0 Kawawa 7 7 4 5
312 0.0 1 0 0 Msuva 2 7 4 6
313 0.0 1 1 1 Nyoni 4 7 5 1
314 0.0 1 1 1 Msuva 2 7 5 2
315 0.0 1 1 1 Kawawa 7 7 5 3
316 7.0 1 1 1 Mnoga 12 7 5 4
317 0.0 1 1 1 Ngassa 5 7 5 5
318 42.3 1 1 1 Bocco 3 7 5 6
319 0.0 1 1 1 Yondani 8 7 6 1
320 0.0 1 1 1 Samatta 1 7 6 2
321 0.0 1 1 1 Manula 6 7 6 3
322 0.0 1 1 1 Kapombe 9 7 6 4
323 0.0 1 1 1 Kaseja 11 7 6 5
324 0.0 1 1 1 Mao 10 7 6 6
325 37.0 1 0 1 Kawawa 7 7 7 1
326 0.0 1 0 1 Nyoni 4 7 7 2
327 32.5 1 0 1 Kapombe 9 7 7 3
328 23.4 1 0 1 Msuva 2 7 7 4
329 41.7 1 0 1 Ngassa 5 7 7 5
330 90.0 1 0 1 Bocco 3 7 7 6
331 0.0 1 1 0 Kawawa 7 7 9 1
332 14.2 1 1 0 Mnoga 12 7 9 2
333 0.0 1 1 0 Samatta 1 7 9 3
334 0.0 1 1 0 Kaseja 11 7 9 4
335 0.0 1 1 0 Msuva 2 7 9 5
336 0.0 1 1 0 Ngassa 5 7 9 6
337 0.0 1 1 0 Yondani 8 8 1 1
338 0.0 1 1 0 Samatta 1 8 1 2
339 15.0 1 1 0 Nyoni 4 8 1 3
340 0.0 1 1 0 Kaseja 11 8 1 4
341 23.0 1 1 0 Msuva 2 8 1 5
342 0.0 1 1 0 Kawawa 7 8 1 6
343 56.0 1 0 0 Kawawa 7 8 2 1
344 39.2 1 0 0 Bocco 3 8 2 2
345 0.0 1 0 0 Nyoni 4 8 2 3
346 70.5 1 0 0 Mao 10 8 2 4
347 7.8 1 0 0 Yondani 8 8 2 5
348 68.0 1 0 0 Mnoga 12 8 2 6
349 26.5 1 1 0 Mao 10 8 4 1
350 0.0 1 1 0 Mnoga 12 8 4 2
351 9.4 1 1 0 Manula 6 8 4 3
352 15.2 1 1 0 Kapombe 9 8 4 4
353 30.5 1 1 0 Bocco 3 8 4 5
354 0.0 1 1 0 Ngassa 5 8 4 6
355 0.0 1 0 1 Msuva 2 8 5 1
356 37.2 1 0 1 Nyoni 4 8 5 2
357 85.7 1 0 1 Bocco 3 8 5 3
358 49.5 1 0 1 Mnoga 12 8 5 4
359 35.5 1 0 1 Yondani 8 8 5 5
360 0.0 1 0 1 Kaseja 11 8 5 6
361 0.0 1 1 1 Msuva 2 8 6 1
362 0.0 1 1 1 Kawawa 7 8 6 2
363 0.0 1 1 1 Kaseja 11 8 6 3
364 0.0 1 1 1 Kapombe 9 8 6 4
365 15.4 1 1 1 Ngassa 5 8 6 5
366 0.0 1 1 1 Nyoni 4 8 6 6
367 27.6 1 0 1 Ngassa 5 8 7 1
368 0.0 1 0 1 Samatta 1 8 7 2
369 44.5 1 0 1 Kapombe 9 8 7 3
370 62.0 1 0 1 Kawawa 7 8 7 4
371 36.0 1 0 1 Manula 6 8 7 5
372 28.0 1 0 1 Mao 10 8 7 6
373 0.0 1 1 1 Samatta 1 8 8 1
374 14.1 1 1 1 Manula 6 8 8 2
375 0.0 1 1 1 Yondani 8 8 8 3
376 0.0 1 1 1 Mnoga 12 8 8 4
377 0.0 1 1 1 Bocco 3 8 8 5
378 15.0 1 1 1 Mao 10 8 8 6
379 0.0 1 0 0 Kapombe 9 8 9 1
380 28.2 1 0 0 Kaseja 11 8 9 2
381 0.0 1 0 0 Msuva 2 8 9 3
382 0.0 1 0 0 Samatta 1 8 9 4
383 0.0 1 0 0 Manula 6 8 9 5
384 0.0 1 0 0 Ngassa 5 8 9 6
385 134.0 0 1 0 Bocco 3 9 1 1
386 43.0 0 1 0 Ngassa 5 9 1 2
387 29.0 0 1 0 Samatta 1 9 1 3
388 57.7 0 1 0 Msuva 2 9 1 4
389 59.5 0 1 0 Kapombe 9 9 1 5
390 59.0 0 1 0 Manula 6 9 1 6
391 113.0 0 0 1 Mao 10 9 2 1
392 84.8 0 0 1 Kawawa 7 9 2 2
393 215.0 0 0 1 Samatta 1 9 2 3
394 130.4 0 0 1 Nyoni 4 9 2 4
395 163.0 0 0 1 Bocco 3 9 2 5
396 118.0 0 0 1 Yondani 8 9 2 6
397 130.0 0 0 0 Kawawa 7 9 4 1
398 113.9 0 0 0 Kaseja 11 9 4 2
399 164.5 0 0 0 Kapombe 9 9 4 3
400 95.5 0 0 0 Msuva 2 9 4 4
401 18.4 0 0 0 Samatta 1 9 4 5
402 170.0 0 0 0 Mnoga 12 9 4 6
403 65.3 0 1 1 Ngassa 5 9 5 1
404 9.5 0 1 1 Samatta 1 9 5 2
405 53.7 0 1 1 Kawawa 7 9 5 3
406 68.5 0 1 1 Manula 6 9 5 4
407 47.0 0 1 1 Mao 10 9 5 5
408 23.8 0 1 1 Mnoga 12 9 5 6
409 145.0 0 1 1 Bocco 3 9 6 1
410 19.3 0 1 1 Msuva 2 9 6 2
411 87.0 0 1 1 Yondani 8 9 6 3
412 43.6 0 1 1 Nyoni 4 9 6 4
413 40.7 0 1 1 Kapombe 9 9 6 5
414 33.2 0 1 1 Kaseja 11 9 6 6
415 112.0 0 0 0 Yondani 8 9 7 1
416 95.0 0 0 0 Nyoni 4 9 7 2
417 100.6 0 0 0 Mao 10 9 7 3
418 77.0 0 0 0 Ngassa 5 9 7 4
419 230.0 0 0 0 Bocco 3 9 7 5
420 49.9 0 0 0 Manula 6 9 7 6
421 89.8 0 1 0 Nyoni 4 9 8 1
422 142.5 0 1 0 Mao 10 9 8 2
423 47.5 0 1 0 Mnoga 12 9 8 3
424 71.0 0 1 0 Kawawa 7 9 8 4
425 58.0 0 1 0 Kaseja 11 9 8 5
426 55.0 0 1 0 Yondani 8 9 8 6
427 131.5 0 0 1 Mnoga 12 9 9 1
428 23.4 0 0 1 Ngassa 5 9 9 2
429 51.4 0 0 1 Kaseja 11 9 9 3
430 72.3 0 0 1 Msuva 2 9 9 4
431 117.5 0 0 1 Kapombe 9 9 9 5
432 48.6 0 0 1 Manula 6 9 9 6
433 93.0 1 1 0 Bocco 3 10 1 1
434 0.0 1 1 0 Mnoga 12 10 1 2
435 0.0 1 1 0 Samatta 1 10 1 3
436 0.0 1 1 0 Kaseja 11 10 1 4
437 0.0 1 1 0 Ngassa 5 10 1 5
438 58.0 1 1 0 Yondani 8 10 1 6
439 0.0 1 0 0 Nyoni 4 10 2 1
440 0.0 1 0 0 Bocco 3 10 2 2
441 26.5 1 0 0 Kawawa 7 10 2 3
442 22.0 1 0 0 Msuva 2 10 2 4
443 41.0 1 0 0 Kaseja 11 10 2 5
444 38.5 1 0 0 Kapombe 9 10 2 6
445 0.0 1 0 1 Kapombe 9 10 3 1
446 0.0 1 0 1 Samatta 1 10 3 2
447 98.5 1 0 1 Bocco 3 10 3 3
448 19.0 1 0 1 Mnoga 12 10 3 4
449 0.0 1 0 1 Kaseja 11 10 3 5
450 23.0 1 0 1 Manula 6 10 3 6
451 41.0 1 0 1 Yondani 8 10 5 1
452 0.0 1 0 1 Ngassa 5 10 5 2
453 0.0 1 0 1 Nyoni 4 10 5 3
454 0.0 1 0 1 Kawawa 7 10 5 4
455 25.3 1 0 1 Mao 10 10 5 5
456 0.0 1 0 1 Msuva 2 10 5 6
457 22.5 1 1 0 Kawawa 7 10 6 1
458 16.0 1 1 0 Kapombe 9 10 6 2
459 17.2 1 1 0 Mao 10 10 6 3
460 0.0 1 1 0 Nyoni 4 10 6 4
461 0.0 1 1 0 Msuva 2 10 6 5
462 2.1 1 1 0 Manula 6 10 6 6
463 0.0 1 1 1 Msuva 2 10 7 1
464 0.0 1 1 1 Samatta 1 10 7 2
465 8.0 1 1 1 Mnoga 12 10 7 3
466 30.5 1 1 1 Mao 10 10 7 4
467 7.9 1 1 1 Manula 6 10 7 5
468 9.5 1 1 1 Bocco 3 10 7 6
469 0.0 1 1 1 Kaseja 11 10 8 1
470 0.0 1 1 1 Nyoni 4 10 8 2
471 0.0 1 1 1 Yondani 8 10 8 3
472 0.0 1 1 1 Kapombe 9 10 8 4
473 6.5 1 1 1 Ngassa 5 10 8 5
474 13.4 1 1 1 Kawawa 7 10 8 6
475 51.7 1 0 0 Ngassa 5 10 9 1
476 19.5 1 0 0 Mnoga 12 10 9 2
477 24.0 1 0 0 Mao 10 10 9 3
478 0.0 1 0 0 Manula 6 10 9 4
479 12.3 1 0 0 Yondani 8 10 9 5
480 0.0 1 0 0 Samatta 1 10 9 6


The height of the tree is \(y\). For example, tree number 1 belongs to the “Mao” species and was actually measured as \(y_1 = 80.5\) cm tall. These are real measurements, conducted by Emily, Jacob, and others in the field.

Notice that trees that died have a height of 0. For example, tree number 5 died, so \(y_5 = 0\).

Tanzania’s national football team
Tanzania’s national football team


Our model from last year

Let’s fit our model from last year, but this time to real data.

parameter metric real_data
Intercept Estimate 96.1
Fire Estimate -62.5
Drought Estimate -45.3
Herbivory Estimate -14.5
Fire:Drought Estimate 22.5
Fire:Herbivory Estimate 9.2
Drought:Herbivory Estimate 4.8
Species_sd(Intercept) Estimate 26.8
Species_sd(Fire) Estimate 12.6
Species_sd(Drought) Estimate 7.4
Species_sd(Herbivory) Estimate 4.4
Species_cor(Intercept,Fire) Estimate -0.7
Species_cor(Intercept,Drought) Estimate -0.6
Species_cor(Fire,Drought) Estimate 0.5
Species_cor(Intercept,Herbivory) Estimate 0.1
Species_cor(Fire,Herbivory) Estimate -0.1
Species_cor(Drought,Herbivory) Estimate -0.1
Block_sd(Intercept) Estimate 10.5
sigma Estimate 28.4


So we see that we estimate \(\beta_0 = 96.1\), \(\beta^{\text{fire}} = -62.5\), etc.

Let’s compare to the simulated data:

parameter metric real_data simulated_data
Intercept Estimate 96.1 18.4
Intercept Est.Error 10.2 2.0
Fire Estimate -62.5 -1.2
Fire Est.Error 9.2 1.6
Drought Estimate -45.3 -4.2
Drought Est.Error 5.0 0.3
Herbivory Estimate -14.5 -1.8
Herbivory Est.Error 4.7 0.4
Fire:Drought Estimate 22.5 -2.1
Fire:Drought Est.Error 5.2 0.2
Fire:Herbivory Estimate 9.2 -2.2
Fire:Herbivory Est.Error 5.2 0.2
Drought:Herbivory Estimate 4.8 -1.9
Drought:Herbivory Est.Error 5.1 0.2
Species_sd(Intercept) Estimate 26.8 5.9
Species_sd(Intercept) Est.Error 6.1 1.5
Species_sd(Fire) Estimate 12.6 1.1
Species_sd(Fire) Est.Error 4.2 0.3
Species_sd(Drought) Estimate 7.4 0.8
Species_sd(Drought) Est.Error 3.7 0.3
Species_sd(Herbivory) Estimate 4.4 1.1
Species_sd(Herbivory) Est.Error 3.5 0.3
Species_cor(Intercept,Fire) Estimate -0.7 0.1
Species_cor(Intercept,Fire) Est.Error 0.2 0.3
Species_cor(Intercept,Drought) Estimate -0.6 -0.1
Species_cor(Intercept,Drought) Est.Error 0.3 0.3
Species_cor(Fire,Drought) Estimate 0.5 0.0
Species_cor(Fire,Drought) Est.Error 0.3 0.3
Species_cor(Intercept,Herbivory) Estimate 0.1 0.1
Species_cor(Intercept,Herbivory) Est.Error 0.4 0.3
Species_cor(Fire,Herbivory) Estimate -0.1 0.2
Species_cor(Fire,Herbivory) Est.Error 0.4 0.3
Species_cor(Drought,Herbivory) Estimate -0.1 -0.1
Species_cor(Drought,Herbivory) Est.Error 0.4 0.3
Block_sd(Intercept) Estimate 10.5 2.3
Block_sd(Intercept) Est.Error 3.8 0.7
sigma Estimate 28.4 1.1
sigma Est.Error 1.0 0.0

Team activity (19 minutes): What do you notice about the differences between the simulated data from last year and the real data from this year ?

An answer

Let’s rescale the simulated estimates and errors by 5 to roughly match the intercept heights:

parameter metric real_data simulated_data_times_5
Intercept Estimate 96.1 91.8
Intercept Est.Error 10.2 10.2
Fire Estimate -62.5 -5.9
Fire Est.Error 9.2 7.9
Drought Estimate -45.3 -21.0
Drought Est.Error 5.0 1.5
Herbivory Estimate -14.5 -8.8
Herbivory Est.Error 4.7 1.8
Fire:Drought Estimate 22.5 -10.5
Fire:Drought Est.Error 5.2 1.0
Fire:Herbivory Estimate 9.2 -11.0
Fire:Herbivory Est.Error 5.2 1.0
Drought:Herbivory Estimate 4.8 -9.4
Drought:Herbivory Est.Error 5.1 1.0
Species_sd(Intercept) Estimate 26.8 29.6
Species_sd(Intercept) Est.Error 6.1 7.4
Species_sd(Fire) Estimate 12.6 5.6
Species_sd(Fire) Est.Error 4.2 1.7
Species_sd(Drought) Estimate 7.4 4.1
Species_sd(Drought) Est.Error 3.7 1.3
Species_sd(Herbivory) Estimate 4.4 5.3
Species_sd(Herbivory) Est.Error 3.5 1.6
Species_cor(Intercept,Fire) Estimate -0.7 0.4
Species_cor(Intercept,Fire) Est.Error 0.2 1.4
Species_cor(Intercept,Drought) Estimate -0.6 -0.5
Species_cor(Intercept,Drought) Est.Error 0.3 1.4
Species_cor(Fire,Drought) Estimate 0.5 0.0
Species_cor(Fire,Drought) Est.Error 0.3 1.5
Species_cor(Intercept,Herbivory) Estimate 0.1 0.5
Species_cor(Intercept,Herbivory) Est.Error 0.4 1.4
Species_cor(Fire,Herbivory) Estimate -0.1 1.2
Species_cor(Fire,Herbivory) Est.Error 0.4 1.4
Species_cor(Drought,Herbivory) Estimate -0.1 -0.3
Species_cor(Drought,Herbivory) Est.Error 0.4 1.5
Block_sd(Intercept) Estimate 10.5 11.5
Block_sd(Intercept) Est.Error 3.8 3.7
sigma Estimate 28.4 5.4
sigma Est.Error 1.0 0.2

The fire effect is more than 10 times what we expected it to be ! The drought effect was about twice what we expected, and the herbivory effect was similar to what we expected. The errors for these effects are larger than we expected, especially for drought and herbivory.

The 2-way interaction terms are positive ! We expected them to be negative. But maybe this makes sense since the main effects are larger than we expected, and a plant can’t die twice. So layering two effects, the overall combined effect isn’t as bad as the two summed together.

Species variability is roughly what we expected in the control group, but is much higher than we expected for the effect of fire.

The block effect is roughly what we expected, but the model error sigma is 6 times higher than we expected ! In other words, there is a lot of variability in tree height that the model does not explain. This could explain why the errors for the drought and herbivory effects are larger than we expected. The fire effect error is driven by the block effect, so it is closer to what we expected.

How can we improve the model ?

Let’s look at what the real data heights \(y\) look like relative to 10 repeated simulations \(y_\text{rep}\) from our model:

In the real data \(y\), we see many plants at zero, since dead plants have height defined as zero. Then we see a few very tall plants. But the simulations from our model \(y_\text{rep}\) fail to capture this structure in our real data. The \(y_\text{rep}\) include some negative height values, which is not realistic. The \(y_\text{rep}\) don’t have many plants at zero, like we see in the real data. This motivates us to try adding this structure to our model.

Expanding to a 2-part model

The big picture for our model predicted height as a function of various factors:

\[\begin{align*} \text{predicted height} &= \text{average height} \\ &\qquad + \text{factor (fire, drought, herbivory) effects} \\ &\qquad + \text{species-specific effects} \\ &\qquad + \text{location effects} \end{align*}\]

But it didn’t take into account whether or not plants survived in the first place. Maybe we model survival using all these factors, and then model height if a plant survives.

\[\begin{align*} \text{predicted height} &= \text{probability of survival} * \text{predicted height if survive} \end{align*}\]

For example, suppose our model says a plant has a probability of survival of 80%. And the model says that if it survives, its predicted height is 100 cm. If the plant dies, its height is known to be 0 cm. Imagine the experiment were run many many times. For 80% of these experiments, the plant’s height is around 100 cm. For 20% of these experiments, the plant’s height is 0 cm. Then averaging, the model predicts the height to be 80 cm.

Furthermore, the predicted height if a plant survives should always be greater than zero. So we can use a log transformation, modeling \(\log y_i\) rather than \(y_i\) for surviving plants.

More reading on expanding models into two parts and log transformations

Bayesian Data Analysis p.148-149 Example “Checking the fit of hierarchical regression models for adolescent smoking”, Mixture Models, by Betancourt, Towards A Principled Bayesian Workflow, by Betancourt section 4 “Close Enough for an Effective Demonstration”.

See Regression and Other Stories, by Gelman et al. p.43, p.189-195 about log transformations.

“Savanna syndrome” ?

A syndrome in an ecological context is a collection of traits or characteristics that reflects some sort of coordinated, multi-prong adaption. A savanna syndrome describes the set of adaptations to life in savannas, which includes traits that help with drought, fire, and herbivory. We are still learning what those traits are ! Thorns, small leaves, deep roots, and thick bark could be in the mix.

Let’s see if there are species that are better-adapted to the full savanna life (fire and drought and herbivory). So we use the models to predict how each species \(s\) is affected by the combination of all three factors:

\[\begin{align*} \mu_i &= \beta_0 \\ &\qquad + \beta^{\text{fire}} + \beta^{\text{drought}} + \beta^{\text{herbivory}} \\ &\qquad + \beta^{\text{fire},\text{drought}} + \beta^{\text{fire},\text{herbivory}} + \beta^{\text{drought},\text{herbivory}} \\ &\qquad + \beta^{\text{species}}_{s} \\ &\qquad + \beta^{\text{species},\text{fire}}_{s} + \beta^{\text{species},\text{drought}}_{s} + \beta^{\text{species},\text{herbivory}}_{s} \end{align*}\]

Instead of using the model, we can just compute the averages from the data alone, \(\bar{y}_{\text{Fire}=1,\text{Drought}=1,\text{Herbivory}=1,s}\). In other words, we can take the 5 trees of species \(s\) that were assigned to fire and drought and herbivory and take their average height.

Team activity (14 minutes): Let’s try this for “species” Bocco Use the real data here to calculate Bocco’s savanna height.

An answer

When Bocco is assigned to fire and drought and herbivory, his heights are: 76 centimeters, 49 centimeters, 42 centimeters, 0 centimeters, and 9.5 centimeters. So on average, Bocco’s savanna height is 35 centimeters.

data <- c(76, 49, 42, 0, 9.5)
mean(data)
## [1] 35.3
result <- t.test(data, conf.level = 0.95)
print(result$conf.int)
## [1] -2.974683 73.574683
## attr(,"conf.level")
## [1] 0.95


The model predicts Bocco’s savanna height to be 46 centimeters, with a 95% certainty interval of (27,65) centimeters. The model uses similar groups (e.g. other Bocco plants that got only some treatments, or other species that got all savanna treatments) to help get better estimates of effects.

Let’s look at the results for all the plants:

Does expanding the model to 2 parts change this ?

Yes !

Let’s look at these predictions alongside the real data for trees that got the full savanna life (fire and drought and herbivory). To do this, we will need to zoom out a bit:

Expanding to more interactions

As we noted in above, the model is missing a 3-way interaction between fire, drought, and herbivory. Below we will think more about what this means.

Additionally, we could let all the interactions vary by species. The main effects already vary by species, so it would make sense if the interactions did as well. For example, if a species has a particularly harsh fire effect and harsh drought effect, maybe its interaction between fire and drought is more positive, since it can’t die twice.

3-way interaction

Again, let’s write a small part our model for the tree heights in math, ignoring the block effects and species-specific effects:

\[\begin{align*} y_i &= \beta_0 \\ &\qquad + \beta^{\text{fire}} \text{Fire}_i + \beta^{\text{drought}} \text{Drought}_i + \beta^{\text{herbivory}} \text{Herbivory}_i \\ &\qquad + \beta^{\text{fire},\text{drought}} \text{Fire}_i \text{Drought}_i + \beta^{\text{fire},\text{herbivory}} \text{Fire}_i \text{Herbivory}_i + \beta^{\text{drought},\text{herbivory}} \text{Drought}_i \text{Herbivory}_i \\ &\qquad + \beta^{\text{fire},\text{drought},\text{herbivory}} \text{Fire}_i \text{Drought}_i \text{Herbivory}_i \\ &\qquad + \epsilon_i \end{align*}\]

Team activity (9 minutes): What does our model predict will be the height of a tree that gets fire and drought but not the herbivory treatment ?

An answer

This group gets \(\text{Fire}_i = 1\) and \(\text{Drought}_i = 1\), but \(\text{Herbivory}_i = 0\). So the model predicts their height will be:

\[\begin{align*} \mu_i &= \beta_0 \\ &\qquad + \beta^{\text{fire}} * 1 + \beta^{\text{drought}} 1 + \beta^{\text{herbivory}} * 0 \\ &\qquad + \beta^{\text{fire},\text{drought}} * 1 * 1 + \beta^{\text{fire},\text{herbivory}} * 1 * 0 + \beta^{\text{drought},\text{herbivory}} * 1 * 0 \\ &\qquad + \beta^{\text{fire},\text{drought},\text{herbivory}} * 1 * 1 * 0 \end{align*}\]

which is \(\beta_0 + \beta^{\text{fire}} + \beta^{\text{drought}} + \beta^{\text{fire},\text{drought}}\).


Team activity (9 minutes): What does our model predict will be the height of a tree that gets fire and drought and herbivory ?

An answer

This group gets \(\text{Fire}_i = 1\) and \(\text{Drought}_i = 1\), but \(\text{Herbivory}_i = 1\). So the model predicts their height will be:

\[\begin{align*} \mu_i &= \beta_0 \\ &\qquad + \beta^{\text{fire}} * 1 + \beta^{\text{drought}} * 1 + \beta^{\text{herbivory}} * 1 \\ &\qquad + \beta^{\text{fire},\text{drought}} * 1 * 1 + \beta^{\text{fire},\text{herbivory}} * 1 * 1 + \beta^{\text{drought},\text{herbivory}} * 1 * 1 \\ &\qquad + \beta^{\text{fire},\text{drought},\text{herbivory}} * 1 * 1 * 1 \end{align*}\]

which is

\[\begin{align*} \mu_i = \beta_0 \\ &\qquad + \beta^{\text{fire}} + \beta^{\text{drought}} + \beta^{\text{herbivory}} \\ &\qquad + \beta^{\text{fire},\text{drought}} + \beta^{\text{fire},\text{herbivory}} + \beta^{\text{drought},\text{herbivory}} \\ &\qquad + \beta^{\text{fire},\text{drought},\text{herbivory}} \end{align*}\]


We can do this for all 8 treatment assignments and see what the model predicts the height will be. Let’s see what this gives according to the real estimates of the \(\beta\)s:

Fire Drought Herbivory PredictedHeight PredictedHeightModel
0 0 0 98 beta_0
1 0 0 31 beta_0 + beta_fire
0 1 0 48 beta_0 + beta_drought
1 1 0 12 beta_0 + beta_fire + beta_drought + beta_fire_drought
Fire Drought Herbivory PredictedHeight PredictedHeightModel
0 0 1 80 beta_0 + beta_herbivory
1 0 1 30 beta_0 + beta_fire + beta_herbivory + beta_fire_herbivory
0 1 1 43 beta_0 + beta_drought + beta_herbivory + beta_drought_herbivory
1 1 1 7 beta_0 + beta_fire + beta_drought + beta_herbivory + beta_fire_drought + beta_fire_herbivory + beta_drought_herbivory + beta_fire_drought_herbivory

And we can draw an accompanying picture:

We are reminded here of the problem with this model: there is nothing to stop the predicted heights from being negative !

Team activity (6 minutes): For trees that get drought, we see similar heights whether or not they get herbivory, \(43 \approx 48\) centimeters. Why might this be true in the wild ?

An answer

In the wild, could this be true if the drought makes plants grow less for animals to eat ?


Team activity (4 minutes): For trees that eventually got fire, we see similar heights whether or not they had herbivory before, \(30 \approx 31\) centimeters. Why might this be true ?

An answer

Perhaps the fire was so devastating that it wiped away any herbivory effects ?


Team activity (7 minutes): What is your interpretation of the 3-way interaction between fire, drought, and herbivory ?

An answer

It is difficult to understand 3-way interactions. The picture helps, but it really requires time for the ideas to sink in.

The 3-way interaction allows the interaction between fire and drought to differ by whether or not the tree got the herbivory treatment. We could similarly interpret the 3-way interaction as allowing the interaction between fire and herbivory to differ by whether or not the tree got the drought treatment. Or as allowing the interaction between herbivory and drought to differ by whether or not the tree got the fire treatment.

Species variability

Our model now allows effects and their interactions to vary by species \(s\):

\[\begin{align*} y_i &= \beta_0 \\ &\qquad + \beta^{\text{fire}} \text{Fire}_i + \beta^{\text{drought}} \text{Drought}_i + \beta^{\text{herbivory}} \text{Herbivory}_i \\ &\qquad + \beta^{\text{fire},\text{drought}} \text{Fire}_i \text{Drought}_i + \beta^{\text{fire},\text{herbivory}} \text{Fire}_i \text{Herbivory}_i + \beta^{\text{drought},\text{herbivory}} \text{Drought}_i \text{Herbivory}_i \\ &\qquad + \beta^{\text{fire},\text{drought},\text{herbivory}} \text{Drought}_i \text{Herbivory}_i \text{Fire}_i \\ &\qquad + \beta^{\text{species}}_{s[i]} \\ &\qquad + \beta^{\text{species},\text{fire}}_{s[i]} \text{Fire}_i + \beta^{\text{species},\text{drought}}_{s[i]} \text{Drought}_i + \beta^{\text{species},\text{herbivory}}_{s[i]} \text{Herbivory}_i \\ &\qquad + \beta^{\text{species},\text{fire},\text{drought}}_{s[i]} \text{Fire}_i \text{Drought}_i + \beta^{\text{species},\text{fire},\text{herbivory}}_{s[i]} \text{Fire}_i \text{Herbivory}_i + \beta^{\text{species},\text{drought},\text{herbivory}}_{s[i]} \text{Drought}_i \text{Herbivory}_i \\ &\qquad + \beta^{\text{species},\text{fire},\text{drought},\text{herbivory}}_{s[i]} \text{Drought}_i \text{Herbivory}_i \text{Fire}_i \\ &\qquad + \beta^{\text{block}}_{b[i]} \\ &\qquad + \epsilon_i \end{align*}\]

How much do species vary ? Let’s print out some estimates and errors from the model fit to the real data:

parameter Estimate Est.Error lowerbound of 95% certainty interval upperbound of 95% certainty interval
Species_sd(Intercept) 26 6 17 41
Species_sd(Fire) 14 5 5 24
Species_sd(Drought) 7 4 0 17
Species_sd(Herbivory) 5 4 0 14
Species_sd(Fire:Drought) 7 6 0 21
Species_sd(Fire:Herbivory) 5 4 0 15
Species_sd(Drought:Herbivory) 6 5 0 18
Species_sd(Fire:Drought:Herbivory) 8 6 0 22

Recall from above that the intercept (average height in the control group, \(\beta_0\)) is estimated to be around 100 centimeters. Let’s suppose that this estimate is the true average height and that we know it perfectly.

Above, we used the Normal Distribution to describe how species vary around this average:

Team activity (6 minutes): In the control group, how do species vary in their heights ? What is the species standard deviation ?

An answer

The species standard deviation for height in the control group is the Species_sd(Intercept) in the output, which is estimated to be 27 centimeters.


Team activity (6 minutes): What is our certainty about how much the species vary in control group heights ?

An answer

The estimation error for the species standard deviation is 7. We are 95% certain that the species standard deviation is 17 to 43 centimeters. So the Normal distribution drawn above may be skinnier or wider than shown.


Team activity (11 minutes): How do species vary in the effects of fire, drought, and herbivory ? How certain are we ?

An answer

Species vary the most in their response to fire, with a standard deviation of 14 centimeters. The standard deviations across species for the effects of drought and herbivory are less, and indeed we aren’t even certain if they are much above zero.

Team activity (11 minutes): How do species vary in the interaction effects ? How certain are we ?

An answer

The lowerbounds of the 95% certainty intervals are zero, so we can’t use the data to distinguish the species by their interaction effects.

Savanna syndrome revisited

Does expanding the model to include more interactions make a difference ? Recall our original model results:

The model with more interactions give very similar predictions for savanna heights:

Relevant versus irrelevant model expansions

We cannot model everything. With limited data and limited time for analysis, we have to choose what to focus on. You may have heard the saying “all models are wrong, but some are useful”. Which of our models are the most useful for us ? For the savanna syndrome we wish to study, the model expansion to 2 parts matters more than the model expansion that adds interactions.

Based on a drawing from Towards A Principled Bayesian Workflow, by Betancourt in 1.4.1.1 All Models Are Wrong, 1.4.1.2 But Some Are Useful:

Celebrate

Designing, implementing, and analyzing a study is hard and we deserve to celebrate.

Team activity (23 minutes): What went right in this experimental design ? What do you like about it ? How will you celebrate these design ideas in your own work ?

An answer

My favorite aspects of this experimental design:

  • treatment implementation and outcome measurements are designed to be close to the real-world aspects we want to study
  • control groups allow for estimating effects of treatments
  • randomization allows for estimating these effects without bias
  • careful setup to make groups comparable, e.g. building rainout shelters as similarly as possible, which helps improve precision
  • data collection from baseline until the endline of the study enables many analyses, a full time series ! Helps us learn a lot and be more efficient with the data by adjusting for baseline differences, see e.g. Gelman et al. (2019).