“lfmix: Mixed frequency analysis of low flows using mixed distribution and mixed copula estimators”
library(mixdist)
#> Loading required package: lfstat
#> Loading required package: xts
#> Loading required package: zoo
#>
#> Attaching package: 'zoo'
#> The following objects are masked from 'package:base':
#>
#> as.Date, as.Date.numeric
#> Loading required package: lmom
#> Loading required package: lattice
#> Loading required package: FAdist
#library(lfstat)
#library(FAdist)Introduction to package lfmix (i.e mixdist)
This package implements mixed distribution estimators for low-flow frequency analysis. These estimators are generalizations of the common probability estimator for annual minima series (AMS) for the case that low flows are generated by different processes, violating the homogeneity assumption of frequency analysis. In such cases, the improved estimators provide a consistent framework for estimating return periods in summer and winter, as well as combined probabilities of annual events.
Theoretical background
In the approach, the annual minima series is regarded as the minima of the annual summer minima and winter minima : In this notation, refers to the original variable (daily flow), and to the annual minima for years.
The first method cdfwei_Mixed implements the mixed
distribution estimator for annual minima series of , assuming
independent and seasonally separable types of low-flow events.
Let and denote the marginal extreme value distributions of the summer and winter minima series. Under independence, the cumulative distribution function (cdf) of the annual low flow is
In our implementation for minima series, a Weibull distribution is used to model the marginal distributions of summer and winter events.
The method pobs_mixed_FUN implements an analogous
estimator for the empirical probabilities
where
and
are the respective probabilities of occurrence in the summer and winter
seasons. These are estimated by a common empirical probability
estimator.
The second method, cdfwei_MixedC, implements the mixed
copula estimator of . It is a generalization of the mixed distribution
estimator that allows for seasonal dependence.
Based on the marginal summer and winter distributions and , the probability of event occurring in any season is obtained by
As with the mixed probability estimator, a Weibull distribution is used to model the marginal distributions of summer and winter events. The method implements a Gumbel–Hougaard copula by default as the copula model . Alternatively, an Independence copula can be used, which reduces the method to the mixed distribution estimator assuming independent events.
In the same way as for the theoretical estimator, the method
pobs_mixed_FUN implements an analogous estimator for the
empirical probabilities
where
and
denote the empirical probability of
occurring in the summer and winter seasons, respectively. The
is the empirical copula, which defines the empirical multivariate
distribution.
Method weiMixT
The method weiMixT is a comprehensive function that
implements all estimators, allowing for an in-depth analysis of the
event probabilities within the sample. It also provides several flow
characteristics related to the mixed distribution approaches, including
seasonality indices, seasonal correlation, and the mixture rate of
summer and winter events in the annual sample. Additionally, it offers
an assessment of how the improved estimates of return periods (and
probabilities) differ from the common annual estimator.
The general call of the method is as follows:
output <- weiMixT(x1=xa, sortx=FALSE, T=c(100, 50, 20))
#> Warning: Probably not enough observations to calculate annual minima for the hydrological years:
#> 2010 (58 obs)
#> Warning: Probably not enough observations to calculate annual minima for the hydrological years:
#> 2010 (58 obs)
#> Warning: Probably not enough observations to calculate annual minima for the hydrological years:
#> 2010 (58 obs)
#> Warning: Probably not enough observations to calculate annual minima for the hydrological years:
#> 2010 (58 obs)The output is a list with following main elements:
$sample
A data.frame containing event characteristics of annual
low flow events with folowing colums: hyear is the hydrological year, x
is the annual minimum flow. The p. (and T.)
are event probabilities (and return periods) from various probability
estimators:
Empirical probabilities (and return periods)pobs are empirical probabilities using the classical
Weibull estimatorpobs.mix empirical return periods from the mixed
distribution approach
Theoretical probabilities (and return periods)pA (TA)… classical anual minima series
approachpMix (TMix) … mixed distribution estimator
from Laaha (2023a)pMixC (TMixC) … mixed copula estimator from
Laaha (2023b)
head(output$sample)
#> hyear x pobs pobs.mix pA TA pMix
#> 1 1976 0.2208571 0.22222222 0.22222222 0.2294649 4.357965 0.224758013
#> 2 1977 0.2180000 0.19444444 0.19753086 0.2101373 4.758794 0.205059242
#> 3 1978 0.3075714 0.75000000 0.80864198 0.7369623 1.356922 0.788882343
#> 4 1979 0.1588571 0.02777778 0.02777778 0.0000000 Inf 0.002640444
#> 5 1980 0.3672857 0.86111111 0.94135802 0.9053666 1.104525 0.951602656
#> 6 1981 0.4204286 0.97222222 0.98996914 0.9673425 1.033760 0.991134583
#> TMix pMixC TMixC
#> 1 4.449230 0.214684356 4.658001
#> 2 4.876640 0.196319195 5.093745
#> 3 1.267616 0.739387671 1.352470
#> 4 378.724227 0.002640444 378.724227
#> 5 1.050859 0.912000944 1.096490
#> 6 1.008945 0.970473554 1.030425$To_events:
A data.frame containing the nominal and improved
probabilities (p.) and return periods (T.) of
of ´design´ events with specified return periods from various
estimators. The design events are specified by argument T
with default to 100, 50 and 20 year events. To
(po) … nominal values corresponding to classical annual
minimum estimator (AMS approach)Tmix (pmix) … mixed distribution estimator
from Laaha (2023a)TmixC (pmixC) … mixed copula estimator from
Laaha (2023b)Ts (ps) … seasonal estimate from the marginal
summer distribution)Tw (pw) … seasonal estimate from the marginal
winter distribution
output$To_events
#> To po Tmix pmix TmixC pmixC Ts ps Tw
#> 1 100 0.01 37.77460 0.02647281 37.77460 0.02647281 37.77460 0.02647281 Inf
#> 2 50 0.02 30.75477 0.03251528 30.80299 0.03246438 31.38146 0.03186595 1490.983
#> 3 20 0.05 17.99798 0.05556180 18.23204 0.05484850 21.63549 0.04622036 102.102
#> pw
#> 1 0.0000000000
#> 2 0.0006706984
#> 3 0.0097941256Various other flow characteristics
Further elements are flow characteristics, including the circular
seasonality index SI, the seasonality ratio
SR, the Spearman rank correlation of summer and winter
series Rho together with its significance value
Rho_p-value, and the mixture_rate of summer
and winter events in the sample are calculated.
Steps of analysis of weiMixT
The following section shows the subsequent analysis steps to produce the output. We first refer to the data before showing the calculation of flow characteristics. The second part shows the calculation of sample probabilities using the various estimators. The final part shows the calculation of return period (and probability) estimates of design events by various estimators.
1. Data
The calculation is based on a daily time series x1 that
is stored as a object of class lfobj. For further details
use help(lfobj) from package lfstat.
x1 <- xa
head(x1)
#> day month year flow hyear
#> 91 31 3 1976 0.622 1975
#> 92 1 4 1976 0.833 1976
#> 93 2 4 1976 0.958 1976
#> 94 3 4 1976 0.971 1976
#> 95 4 4 1976 0.989 1976
#> 96 5 4 1976 1.050 19762. Flow indices and sample probabilities
Seasonality ratio (SR)
SR <- seasratio(x1, breakdays=c("01/04", "01/11"))
SR <- as.numeric(SR)
print(SR)
#> [1] 1.093863Separate the time series x1 into sum/win season
x1 <- xa
x1s <- x1[ x1$month >= 4 & x1$month <= 10 , ]
x1w <- x1[ x1$month < 4 | x1$month > 10 , ]Prepare annual and seasonal series
AM0 <- MAM(x1,n=7, yearly = TRUE)
AM <- AM0$MAn
lmom <- samlmu(AM)
x.pelwei <- pelwei(lmom); x.pelwei
#> zeta beta delta
#> 0.1759369 0.1088168 1.5195907
x.p <- cdfwei(AM, para = x.pelwei)
AM0s <- MAM(x1s,n=7, yearly = TRUE)
#if (sortx==TRUE) {AM0s <- AM0s[order(AM0s$MAn),]}
AMs <- AM0s$MAn
lmoms <- samlmu(AMs)
x.pelwei.s <- pelwei(lmoms); x.pelwei.s
#> zeta beta delta
#> 0.1470125 0.1790483 2.1855520
x.ps <- cdfwei(AM, para = x.pelwei.s)
AM0w <- MAM(x1w,n=7, yearly = TRUE)
#> Warning: Probably not enough observations to calculate annual minima for the hydrological years:
#> 2010 (58 obs)
#if (sortx==TRUE) {AM0w <- AM0w[order(AM0w$MAn),]}
AMw <- AM0w$MAn
lmomw <- samlmu(AMw)
x.pelwei.w <- pelwei(lmomw); x.pelwei.w
#> zeta beta delta
#> 0.1825804 0.1469084 1.6393752
x.pw <- cdfwei(AM, para = x.pelwei.w)Correlation and mixture rate
AM0sw <- merge(AM0s, AM0w, by = "hyear", all = TRUE)
cor_spearman <- cor.test(AM0sw$MAn.x, AM0sw$MAn.y, method = "spearman")
print(cor_spearman)
#>
#> Spearman's rank correlation rho
#>
#> data: AM0sw$MAn.x and AM0sw$MAn.y
#> S = 5564, p-value = 0.2018
#> alternative hypothesis: true rho is not equal to 0
#> sample estimates:
#> rho
#> 0.2207283
mixture_rate <- sum(AMs < AMw)/length(AM)
print(mixture_rate)
#> [1] 0.5714286Empirical probabilites (Weibull plotting positions)
# Common Weibull prob estimators -> Same as pobs(AM) from package copula!
.pobs <- (rank(AM))/(length(AM) + 1) # Weibull prob estimators -> Same as pobs(AM) from package copula!
print(.pobs)
#> [1] 0.22222222 0.19444444 0.75000000 0.02777778 0.86111111 0.97222222
#> [7] 0.41666667 0.05555556 0.08333333 0.11111111 0.52777778 0.50000000
#> [13] 0.72222222 0.27777778 0.63888889 0.36111111 0.61111111 0.83333333
#> [19] 0.33333333 0.16666667 0.38888889 0.25000000 0.88888889 0.44444444
#> [25] 0.69444444 0.47222222 0.94444444 0.13888889 0.80555556 0.66666667
#> [31] 0.58333333 0.77777778 0.55555556 0.30555556 0.91666667
# Empirical mixed prob
.pobs.mix <- pobs_mixed_FUN(AM=AM, AMs=AMs, AMw=AMw, plot=FALSE) # $AM, $p.mix and $p.mix.c are returned
print(.pobs.mix)
#> $AM
#> [1] 0.2208571 0.2180000 0.3075714 0.1588571 0.3672857 0.4204286 0.2460000
#> [8] 0.1910000 0.1981429 0.2061429 0.2671429 0.2602857 0.2914286 0.2334286
#> [15] 0.2764286 0.2440000 0.2728571 0.3610000 0.2437143 0.2175714 0.2441429
#> [22] 0.2312857 0.3850000 0.2585714 0.2810000 0.2588571 0.4171429 0.2160000
#> [29] 0.3305714 0.2805714 0.2717143 0.3211429 0.2701429 0.2368571 0.3855714
#>
#> $p.mix
#> [1] 0.22222222 0.19753086 0.80864198 0.02777778 0.94135802 0.98996914
#> [7] 0.40586420 0.05555556 0.12114198 0.13406636 0.50925926 0.48881173
#> [13] 0.75733025 0.31712963 0.62037037 0.37403549 0.58410494 0.91936728
#> [19] 0.36265432 0.17245370 0.38464506 0.29398148 0.96990741 0.44791667
#> [25] 0.69907407 0.46836420 0.97993827 0.14699074 0.87461420 0.68402778
#> [31] 0.54745370 0.83217593 0.52854938 0.33989198 0.97492284
#>
#> $p.mix.C
#> [1] 0.20753968 0.17976190 0.75357143 0.02777778 0.85674603 0.96388889
#> [7] 0.40119048 0.05555556 0.09642857 0.11031746 0.54007937 0.51230159
#> [13] 0.72817460 0.29007937 0.64960317 0.35952381 0.62261905 0.83055556
#> [19] 0.34563492 0.15198413 0.37341270 0.26230159 0.88134921 0.45674603
#> [25] 0.70277778 0.48452381 0.93690476 0.12420635 0.80595238 0.67500000
#> [31] 0.59563492 0.78055556 0.56785714 0.31785714 0.90912698Predicted mixed prob (Weibull cdf model)
# Mixed distribution approach
x.pred.mix <- cdfwei_Mixed(x=AM, y=AM, para.x = pelwei(samlmu(AMs)), para.y = pelwei(samlmu(AMw)), plot=FALSE)
print(x.pred.mix)
#> [1] 0.224758013 0.205059242 0.788882343 0.002640444 0.951602656 0.991134583
#> [7] 0.409105248 0.054201927 0.085925533 0.129396671 0.560617249 0.513075429
#> [13] 0.709684945 0.315543409 0.621511622 0.394232473 0.598611989 0.942245874
#> [19] 0.392104942 0.202145092 0.395296011 0.299733026 0.971383502 0.500902553
#> [25] 0.649796009 0.502938549 0.990049621 0.191558428 0.873560438 0.647195070
#> [31] 0.591142470 0.842686275 0.580763993 0.340993186 0.971883362
# Predicted mixed prob with copula (model)
x.pred.mixC <- cdfwei_MixedC(x=AM, y=AM, para.x = pelwei(samlmu(AMs)), para.y = pelwei(samlmu(AMw)), x1=x1, plot=FALSE)
#> Warning: Probably not enough observations to calculate annual minima for the hydrological years:
#> 2010 (58 obs)
print(x.pred.mixC)
#> [1] 0.214684356 0.196319195 0.739387671 0.002640444 0.912000944 0.970473554
#> [7] 0.384996538 0.053535891 0.083986892 0.125277485 0.524406010 0.480588513
#> [13] 0.663400781 0.298827951 0.580793022 0.371320045 0.559543095 0.900526969
#> [19] 0.369363216 0.193598346 0.372298207 0.284222651 0.938203617 0.469386959
#> [25] 0.607137286 0.471260125 0.968326591 0.183704729 0.824468634 0.604709625
#> [31] 0.552624589 0.792768451 0.543021031 0.322305288 0.938915589Build output object $sample
#obs <- data.frame(cbind(hyear=AM0$hyear, x=AM, pobs=.pobs, pobs.mix=.pobs.mix$p.mix, pobs.mix=.pobs.mixC$p.mix.c))
sample <- data.frame(hyear=AM0$hyear, x=AM, pobs=.pobs, pobs.mix=.pobs.mix$p.mix, pA=x.p, TA=1/x.p, pMix=x.pred.mix, TMix=1/x.pred.mix, "pMixC"=x.pred.mixC, "TMixC"=1/x.pred.mixC)3. Return period (and probability) estimates of design events by various estimators
Here we show how the $To_events are calculated. As
stated above, the aim is to produce a data.frame containing
the nominal and improved probabilities (p.) and return
periods (T.) of of ´design´ events with specified return
periods from various estimators. The design events are specified by
argument T with default to 100, 50 and 20 year events.
The data.frame will allow us to analyse the change in
return period when using our improved estimators
Annual Series (no seasons, common probability distribution)
T=c(100,50, 20)
.f <- 1/T # Hier Probability 1/T
.y <- quawei(.f, para = x.pelwei) # Hier Quantilwerte von T => als Eingang in andere Prob-Berechnungen
.p <- cdfwei(.y, para = x.pelwei)
pred <- data.frame(yo=.y, po=.p, To=1/.p)
print(pred)
#> yo po To
#> 1 0.1812090 0.01 100
#> 2 0.1842840 0.02 50
#> 3 0.1913476 0.05 20Seasonal distributions
# -) summer
.ys <- quawei(.f, para = x.pelwei.s)
.ps <- cdfwei(.y, para = x.pelwei.s)
pred.s <- data.frame(y_of_To=.ys, p=.ps, T=1/.ps)
print(pred.s)
#> y_of_To p T
#> 1 0.1688332 0.02647281 37.77460
#> 2 0.1770463 0.03186595 31.38146
#> 3 0.1930126 0.04622036 21.63549
# -) winter
.yw <- quawei(.f, para = x.pelwei.w)
.pw <- cdfwei(.y, para = x.pelwei.w)
pred.w <- data.frame(y_of_To=.yw, p=.pw, T=1/.pw)
print(pred.w)
#> y_of_To p T
#> 1 0.1914602 0.0000000000 Inf
#> 2 0.1961750 0.0006706984 1490.983
#> 3 0.2065797 0.0097941256 102.102Mixed distribution estimator
.pmix <- cdfwei_Mixed(x=.y, y=.y, para.x = pelwei(samlmu(AMs)), para.y = pelwei(samlmu(AMw)), plot=FALSE, pch="x")
# Ermittle Quantile der Mischverteilung mit p=.f (z.b. p=1/20=0.2)
# Dazu feine Diskretisierung
t.y1 <- par("usr")[3] ; t.y2 <- par("usr")[4] ### wie ohne plot berechnen???
t.y.qua <- seq(round(t.y1, 1), round(t.y2, 1), by=0.01)
t.pmix <- cdfwei_Mixed(x=t.y.qua, y=t.y.qua, para.x = pelwei(samlmu(AMs)), para.y = pelwei(samlmu(AMw)), plot=FALSE, pch="x")
# Nun Suche an der y-Stelle der Verteilung wo x=.pmix=.f ist
# ... d.h, größter Wert mit p <= .f
f.qmix <- vector()
for (i in 1:length(.f)){
f.qmix[i] <- max(t.y.qua[t.pmix <= .f[i]]) ### better vectorize ...
}
pred.mix <- data.frame(y_of_To=f.qmix, p=.pmix, T=1/.pmix)
print(pred.mix)
#> y_of_To p T
#> 1 0.16 0.02647281 37.77460
#> 2 0.17 0.03251528 30.75477
#> 3 0.18 0.05556180 17.99798Mixed copula estimator
.pmixC <- cdfwei_MixedC(x=.y, y=.y, para.x = pelwei(samlmu(AMs)), para.y = pelwei(samlmu(AMw)), x1=x1, plot=FALSE)
#> Warning: Probably not enough observations to calculate annual minima for the hydrological years:
#> 2010 (58 obs)
# Ermittle Quantile der Mischverteilung mit p=.f (z.b. p=1/20=0.2)
# Dazu feine Diskretisierung
t.y1 <- par("usr")[3] ; t.y2 <- par("usr")[4] ### wie ohne plot berechnen???
t.y.qua <- seq(round(t.y1, 1), round(t.y2, 1), by=0.01)
t.pmixC <- cdfwei_MixedC(x=t.y.qua, y=t.y.qua, para.x = pelwei(samlmu(AMs)), para.y = pelwei(samlmu(AMw)), x1=x1, plot=FALSE, pch="x")
#> Warning: Probably not enough observations to calculate annual minima for the hydrological years:
#> 2010 (58 obs)
# Nun Suche an der y-Stelle der Verteilung wo x=.pmix=.f ist
# ... d.h, gr??er Wert mit p <= .f
f.qmixC <- vector()
for (i in 1:length(.f)){
f.qmixC[i] <- max(t.y.qua[t.pmixC <= .f[i]]) ### better vectorize ...
}
pred.mixC <- data.frame(y_of_To=f.qmixC, p=.pmixC, T=1/.pmixC)
print(pred.mixC)
#> y_of_To p T
#> 1 0.16 0.02647281 37.77460
#> 2 0.17 0.03246438 30.80299
#> 3 0.19 0.05484850 18.23204Finally build output object $To_events
To_events <- data.frame("To"=1/.p, "po"=.p, "Tmix"=1/.pmix , "pmix"=.pmix , "TmixC"=1/.pmixC , "pmixC"=.pmixC , "Ts"=1/.ps, "ps"=.ps, "Tw"=1/.pw, "pw"=.pw)
print(To_events)
#> To po Tmix pmix TmixC pmixC Ts ps Tw
#> 1 100 0.01 37.77460 0.02647281 37.77460 0.02647281 37.77460 0.02647281 Inf
#> 2 50 0.02 30.75477 0.03251528 30.80299 0.03246438 31.38146 0.03186595 1490.983
#> 3 20 0.05 17.99798 0.05556180 18.23204 0.05484850 21.63549 0.04622036 102.102
#> pw
#> 1 0.0000000000
#> 2 0.0006706984
#> 3 0.0097941256