The soiltestcorr
-package also allows users to implement
the quadrants analysis approach, also known as the Cate-Nelson analysis.
This tutorial is intended to show how to deploy the
cate_nelson_1965()
function for estimating critical soil
test values based on Cate and Nelson (1965). This approach is also known
as the “graphical” version of the Cate-Nelson analysis. This method
applies an arbitrary fixed value of ry as a target (y-axis) that divides
the data into two categories (below & equal or above ry target). In
a second stage, it estimates the CSTV (x-axis) as the minimum stv that
divides the data into four quadrants (target ry level combined with STV
lower or greater than the CSTV) maximizing the number of points under
well-classified quadrants (II, stv >= CSTV & ry >= ry target;
and IV, stv < CSTV & ry < RY target).
Load your dataframe with soil test value (stv) and relative yield
(ry) data.
Specify the following arguments into the function
-cate_nelson_1965()-:
(a). data
(optional),
(b). stv
(soil test value) and ry
(relative
yield) columns or vectors,
(c). plot
TRUE (produces a ggplot as main output) or
FALSE (DEFAULT, only produces a list or data.frame),
(d). tidy
TRUE (produces a data.frame with results) or
FALSE (store results as list),
Run and check results.
Adjust plot as desired.
library(soiltestcorr)
Suggested packages
# Install if needed
library(ggplot2) # Plots
library(dplyr) # Data wrangling
library(tidyr) # Data wrangling
library(utils) # Data wrangling
library(purrr) # Mapping
This is a basic example using three different datasets:
# Example 1 dataset
# Fake dataset manually created
<- data.frame("RY" = c(65,80,85,88,90,94,93,96,97,95,98,100,99,99,100),
data_1 "STV" = c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15))
# Example 2. Native fake dataset from soiltestcorr package
<- soiltestcorr::data_test
data_2
# Example 3. Native dataset from soiltestcorr package, Freitas et al. (1966), used by Cate & Nelson (1971)
<- soiltestcorr::freitas1966 data_3
RY target = 90%, replace with your desired value
tidy
= FALSE It returns a LIST (more efficient for multiple fits at once)
# Using dataframe argument, tidy = FALSE -> return a LIST
<-
fit_1_tidy_false ::cate_nelson_1965(data = data_1,
soiltestcorrry = RY,
stv = STV,
target = 90,
tidy = FALSE,
plot = FALSE)
#> Warning in stats::chisq.test(data.frame(row.1, row.2)): Chi-squared
#> approximation may be incorrect
::head(fit_1_tidy_false)
utils#> $n
#> [1] 15
#>
#> $CRYV
#> [1] 90
#>
#> $CSTV
#> [1] 4.5
#>
#> $R2
#> [1] 0.6697782
#>
#> $AIC
#> [1] 98.39784
#>
#> $BIC
#> [1] 100.522
tidy
= TRUE It returns a data.frame (more organized results)
# Using dataframe argument, tidy = FALSE -> return a LIST
<-
fit_1_tidy_false ::cate_nelson_1965(data = data_1,
soiltestcorrry = RY,
stv = STV,
target = 90,
tidy = TRUE)
#> Warning in stats::chisq.test(data.frame(row.1, row.2)): Chi-squared
#> approximation may be incorrect
::head(fit_1_tidy_false)
utils#> # A tibble: 1 × 8
#> n CRYV CSTV R2 AIC BIC RMSE quadran…¹ $q.II $q.III $q.IV $posi…²
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int> <int>
#> 1 15 90 4.5 0.670 98.4 101. 5.26 0 11 0 4 15
#> # … with 1 more variable: quadrants$negative <int>, and abbreviated variable
#> # names ¹quadrants$q.I, ²$positive
You can call stv
and ry
vectors using the
$
.
The tidy
argument still applies for controlling the
output type
<-
fit_1_vectors_list ::cate_nelson_1965(ry = data_1$RY,
soiltestcorrstv = data_1$STV,
target=90,
tidy = FALSE)
#> Warning in stats::chisq.test(data.frame(row.1, row.2)): Chi-squared
#> approximation may be incorrect
<-
fit_1_vectors_tidy ::cate_nelson_1965(ry = data_1$RY,
soiltestcorrstv = data_1$STV,
target=90,
tidy = TRUE)
#> Warning in stats::chisq.test(data.frame(row.1, row.2)): Chi-squared
#> approximation may be incorrect
<-
fit_2 ::cate_nelson_1965(data = data_2,
soiltestcorrry = RY,
stv = STV,
target = 90,
tidy = TRUE)
::head(fit_2)
utils#> # A tibble: 1 × 8
#> n CRYV CSTV R2 AIC BIC RMSE quadran…¹ $q.II $q.III $q.IV $posi…²
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int> <int>
#> 1 137 90 23 0.394 1056. 1065. 11.2 11 67 9 50 117
#> # … with 1 more variable: quadrants$negative <int>, and abbreviated variable
#> # names ¹quadrants$q.I, ²$positive
<-
fit_3 ::cate_nelson_1965(data = data_3,
soiltestcorrry = RY,
stv = STK,
target = 90,
tidy = TRUE)
#> Warning in stats::chisq.test(data.frame(row.1, row.2)): Chi-squared
#> approximation may be incorrect
::head(fit_3)
utils#> # A tibble: 1 × 8
#> n CRYV CSTV R2 AIC BIC RMSE quadran…¹ $q.II $q.III $q.IV $posi…²
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int> <int>
#> 1 24 90 48 0.707 183. 186. 9.64 0 12 4 8 20
#> # … with 1 more variable: quadrants$negative <int>, and abbreviated variable
#> # names ¹quadrants$q.I, ²$positive
Note: the stv
column needs to have the same name for
all datasets
#
<- dplyr::bind_rows(data_1, data_2,
data.all %>% dplyr::rename(STV = STK),
data_3 .id = "id") %>%
::nest(data = c("STV", "RY")) tidyr
# Run multiple examples at once with map()
= data.all %>%
fit_multiple_map ::mutate(mod_alcc = purrr::map(data, ~ soiltestcorr::cate_nelson_1965(ry = .$RY,
dplyrstv = .$STV,
target=90,
tidy = TRUE)))
#> Warning: There were 2 warnings in `dplyr::mutate()`.
#> The first warning was:
#> ℹ In argument: `mod_alcc = purrr::map(...)`.
#> Caused by warning in `stats::chisq.test()`:
#> ! Chi-squared approximation may be incorrect
#> ℹ Run `dplyr::last_dplyr_warnings()` to see the 1 remaining warning.
::head(fit_multiple_map)
utils#> # A tibble: 3 × 3
#> id data mod_alcc
#> <chr> <list> <list>
#> 1 1 <tibble [15 × 2]> <tibble [1 × 8]>
#> 2 2 <tibble [137 × 2]> <tibble [1 × 8]>
#> 3 3 <tibble [24 × 2]> <tibble [1 × 8]>
Alternatively, with group_map, we do not require nested data.
However, it requires to dplyr::bind_rows and add an id
column specifying the name of each dataset.
This option return models as lists objects.
<-
fit_multiple_group_map %>% tidyr::unnest(data) %>%
data.all #dplyr::bind_rows(data_1, data_2, .id = "id") %>%
::group_by(id) %>%
dplyr::group_map(~ soiltestcorr::cate_nelson_1965(data = .,
dplyrry = RY,
stv = STV,
target = 90,
tidy = TRUE))
#> Warning in stats::chisq.test(data.frame(row.1, row.2)): Chi-squared
#> approximation may be incorrect
#> Warning in stats::chisq.test(data.frame(row.1, row.2)): Chi-squared
#> approximation may be incorrect
::head(fit_multiple_group_map)
utils#> [[1]]
#> # A tibble: 1 × 8
#> n CRYV CSTV R2 AIC BIC RMSE quadran…¹ $q.II $q.III $q.IV $posi…²
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int> <int>
#> 1 15 90 4.5 0.670 98.4 101. 5.26 0 11 0 4 15
#> # … with 1 more variable: quadrants$negative <int>, and abbreviated variable
#> # names ¹quadrants$q.I, ²$positive
#>
#> [[2]]
#> # A tibble: 1 × 8
#> n CRYV CSTV R2 AIC BIC RMSE quadran…¹ $q.II $q.III $q.IV $posi…²
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int> <int>
#> 1 137 90 23 0.394 1056. 1065. 11.2 11 67 9 50 117
#> # … with 1 more variable: quadrants$negative <int>, and abbreviated variable
#> # names ¹quadrants$q.I, ²$positive
#>
#> [[3]]
#> # A tibble: 1 × 8
#> n CRYV CSTV R2 AIC BIC RMSE quadran…¹ $q.II $q.III $q.IV $posi…²
#> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int> <int>
#> 1 24 90 48 0.707 183. 186. 9.64 0 12 4 8 20
#> # … with 1 more variable: quadrants$negative <int>, and abbreviated variable
#> # names ¹quadrants$q.I, ²$positive
A suitable alternative for obtaining confidence intervals for
parameters or derived quantities is bootstrapping.
Bootstrapping is a resampling technique (with replacement) that draws
samples from the original data with the same size. If you have groups
within your data, you can specify grouping variables as arguments in
order to maintain, within each resample, the same proportion of
observations than in the original dataset.
This function returns a table with as many rows as the resampling size (n) containing the results for each resample.
<- boot_cn_1965(data = data_3,
boot_cn65 ry = RY, stv = STK, target = 90,
n = 99)
#> Warning: There were 85 warnings in `dplyr::mutate()`.
#> The first warning was:
#> ℹ In argument: `model = map(...)`.
#> ℹ In group 1: `boot_id = 1`.
#> Caused by warning in `stats::chisq.test()`:
#> ! Chi-squared approximation may be incorrect
#> ℹ Run `dplyr::last_dplyr_warnings()` to see the 84 remaining warnings.
%>% dplyr::slice_head(., n=5)
boot_cn65 #> # A tibble: 5 × 9
#> boot_id n CRYV CSTV R2 AIC BIC RMSE quadran…¹ $q.II $q.III $q.IV
#> <dbl> <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int> <int> <int>
#> 1 1 24 90 48 0.683 192. 196. 11.7 0 9 2 13
#> 2 2 24 90 75 0.592 197. 200. 12.8 2 10 3 9
#> 3 3 24 90 56 0.458 184. 187. 9.86 0 15 6 3
#> 4 4 24 90 75 0.726 179. 183. 8.98 1 10 2 11
#> 5 5 24 90 93.5 0.632 191. 194. 11.3 1 6 2 15
#> # … with 2 more variables: quadrants$positive <int>, $negative <int>, and
#> # abbreviated variable name ¹quadrants$q.I
# CSTV Confidence Interval
quantile(boot_cn65$CSTV, probs = c(0.025, 0.5, 0.975))
#> 2.5% 50% 97.5%
#> 42.00 49.00 124.85
# Plot
%>%
boot_cn65 ::ggplot(aes(x = CSTV))+
ggplot2geom_histogram(color = "grey25", fill = "#9de0bf", bins = 10)
We can generate a ggplot with the same mod_alcc() function.
We just need to specify the argument plot = TRUE
.
::cate_nelson_1965(data = data_1,
soiltestcorrry = RY,
stv = STV,
target=90,
plot = TRUE)
::cate_nelson_1965(data = data_2,
soiltestcorrry = RY,
stv = STV,
target=90,
plot = TRUE)
::cate_nelson_1965(data = data_3,
soiltestcorrry = RY,
stv = STK,
target=90,
plot = TRUE)
References
Cate, R.B. Jr., and Nelson, L.A., 1965. A rapid method for
correlation of soil test analysis with plant response data. North
Carolina Agric. Exp. Stn., International soil Testing Series Bull.
No. 1.