Module: AA.triggers
Computes triggers for SPI or Dryspell, for a given country and vulnerability mode.
Usage
Pixi
pixi run python -m AA.triggers <ISO> <SPI/DRYSPELL> <VULNERABILITY>
Docker
docker run --rm \
-e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
-e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
-e AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN} \
aa-runner:latest \
python -m AA.triggers <ISO> <SPI/DRYSPELL> <VULNERABILITY> \
--data-path <DATA_PATH> --output-path <OUTPUT_PATH>
Arguments
- <ISO>: 3-letter ISO code.
- <SPI/DRYSPELL>: Indicator family.
- <VULNERABILITY>: one of:
  - GT: General Triggers
  - NRT: Non-Regret Triggers
  - TBD: Save full list without filtering
Post-processing
After running both SPI and Dryspell triggers, use:
- merge-spi-dryspell-gt-nrt-triggers.py (Jupytext notebook) to filter and merge results.
For core data models and common utilities used in trigger computation, see HIP Analysis docs:
https://wfp-vam.github.io/hip-analysis/
brute_force(observations_val, observations_bool, prob_ready, prob_set, lead_time, issue, tolerance, filter_constraints, min_return_period, min_hit_rate, min_success_rate, max_failure_rate, out_shape, result)
Evaluate the objective function for a set of thresholds using vectorized operations.
This function computes the objective function across an array of forecast probabilities
(prob_issue0, prob_issue1) and observed values (obs_val, obs_bool).
The objective function is evaluated for a set of candidate trigger pairs (t1 and t2 ranging
between 0 and 1 with a 0.01 step). The trigger pair with the lowest score is extracted as well as the score value.
The function is optimized for parallel execution with Numba's guvectorize decorator to perform
calculations efficiently across a grid of inputs.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| prob_issue0 | np.ndarray | Forecast probabilities for the 'ready' stage. | required |
| prob_issue1 | np.ndarray | Forecast probabilities for the 'set' stage. | required |
| obs_val | np.ndarray | Array of observed continuous anomaly values. | required |
| obs_bool | np.ndarray | Array of observed binary event occurrences (0 or 1). | required |
| leadtime | int | Indicator period lead time (month). | required |
| issue | int | Forecast issue month. | required |
| tolerance | float | Tolerance threshold for acceptable false alarms. | required |
| filter_constraints | int | Flag to apply operational constraints (1 for yes, 0 for no). | required |
| min_return_period | int | Minimum acceptable return period (in years). | required |
| min_hit_rate | float | Minimum acceptable hit rate. | required |
| min_success_rate | float | Minimum acceptable success rate. | required |
| max_failure_rate | float | Maximum acceptable failure rate. | required |
| out_shape | np.ndarray | Array with the same size as result used as a trick to define the dimension of result in the decorator. | required |
| result | np.ndarray | Array to store computed objective function value. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| None | | The result array is updated in place. |
Notes
- This function uses Numba's guvectorize to enable fast parallel processing.
- It computes various performance metrics (hit rate, false alarm rate, success rate, failure rate) based on the thresholds.
- Constraints (if enabled) can penalize thresholds that don't satisfy operational limits.
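The grid evaluation described above can be sketched in plain NumPy. This is only an illustration under stated assumptions, not the module's Numba implementation: the function name grid_search, the weight alpha, and the rule that an action is taken when both the 'ready' and the 'set' probabilities exceed their thresholds are all hypothetical, and the grid is assumed to cover 0.00 to 0.99 so that it has 100 x 100 points.

```python
import numpy as np


def grid_search(obs_bool, prob_ready, prob_set, alpha=0.5):
    """Score every (t1, t2) pair on a 100 x 100 grid and return the minimizer."""
    obs = np.asarray(obs_bool).astype(bool)
    thresholds = np.arange(100) / 100  # 0.00, 0.01, ..., 0.99

    # Boolean masks of shape (n_thresholds, n_years).
    ready = np.asarray(prob_ready)[None, :] > thresholds[:, None]
    set_ = np.asarray(prob_set)[None, :] > thresholds[:, None]

    # Assumption: an action is taken only when both stages fire.
    # Resulting shape: (n_t1, n_t2, n_years).
    action = ready[:, None, :] & set_[None, :, :]

    hits = (action & obs).sum(axis=-1)
    false_alarms = (action & ~obs).sum(axis=-1)
    n_events = max(obs.sum(), 1)
    n_non_events = max((~obs).sum(), 1)

    # Assumed score: -hit rate + alpha * false alarm rate (lower is better).
    score = -hits / n_events + alpha * false_alarms / n_non_events
    i, j = np.unravel_index(np.argmin(score), score.shape)
    return np.array([thresholds[i], thresholds[j]]), score[i, j]


obs_bool = [0, 1, 0, 1, 0, 0]
prob_ready = [0.2, 0.7, 0.3, 0.8, 0.6, 0.1]
prob_set = [0.1, 0.6, 0.4, 0.9, 0.2, 0.3]
best_triggers, best_score = grid_search(obs_bool, prob_ready, prob_set)
print(best_triggers, best_score)
```

When several pairs tie on the score, np.argmin returns the first one in row-major order, so the lowest 'ready' threshold wins. The real function may break ties differently.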
compute_confusion_matrix(true, pred, out_shape, result)
Computes a confusion matrix with NumPy for two arrays, true and pred.
Results are identical to those of sklearn.metrics.confusion_matrix, with similar computation time.
However, this function avoids the dependency on sklearn and can be used with Numba in nopython mode.
Returns an array [true negatives, false positives, false negatives, true positives]
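A minimal sketch of the same counting logic, assuming binary inputs of equal length. The helper name confusion_counts is hypothetical, and unlike the documented function it returns the array instead of writing into a result argument.

```python
import numpy as np


def confusion_counts(true, pred):
    """Return [TN, FP, FN, TP] for two binary arrays of equal length."""
    true = np.asarray(true).astype(bool)
    pred = np.asarray(pred).astype(bool)
    tn = np.sum(~true & ~pred)  # event absent, no action
    fp = np.sum(~true & pred)   # event absent, action taken
    fn = np.sum(true & ~pred)   # event present, no action
    tp = np.sum(true & pred)    # event present, action taken
    return np.array([tn, fp, fn, tp])


true = np.array([0, 0, 1, 1, 1, 0])
pred = np.array([0, 1, 1, 0, 1, 0])
counts = confusion_counts(true, pred)
print(counts)  # [2 1 1 2]
```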
evaluate_grid_metrics(obs, probs_ready, probs_set)
Evaluate all metrics over the entire grid using apply_ufunc. The list of metrics is: [Correct Rejections, False Positives, False Negatives, Hits, Hit Rate, False Alarm Rate, Success Rate, Failure Rate, Return Period]
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obs | xarray.Dataset | Numerical and categorical observations, with dimensions (district, time, category, index). | required |
| probs_ready | xarray.Dataset | Forecast probabilities for the ready month, with dimensions (district, time, category, index, issue). | required |
| probs_set | xarray.Dataset | Forecast probabilities for the set month. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| metrics_da | xarray.DataArray | Structured array with grid evaluations for all metrics. |
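As a rough illustration of how the rate metrics relate to the four confusion-matrix counts, the sketch below uses common verification definitions. These formulas are assumptions, not taken from the module: in particular, the return period is assumed to be the number of years divided by the number of actions, and the failure rate is left out because it depends on the tolerance threshold. The function name derive_rates is hypothetical.

```python
def derive_rates(correct_rejections, false_alarms, misses, hits, n_years):
    """Derive rates from confusion-matrix counts (illustrative definitions)."""
    events = hits + misses
    non_events = correct_rejections + false_alarms
    actions = hits + false_alarms  # years in which the trigger fired
    return {
        "hit_rate": hits / events if events else 0.0,
        "false_alarm_rate": false_alarms / non_events if non_events else 0.0,
        "success_rate": hits / actions if actions else 0.0,
        "return_period": n_years / actions if actions else float("inf"),
    }


rates = derive_rates(correct_rejections=20, false_alarms=2, misses=1, hits=7, n_years=30)
print(rates)
```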
filter_triggers_by_window(df_leadtime, probs_ready, probs_set, obs, params)
Filters and selects the best trigger pairs for each window by evaluating the trigger values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| df_leadtime | pd.DataFrame | DataFrame containing lead time information and trigger values. | required |
| probs_ready | xarray.Dataset | Dataset containing readiness probabilities. | required |
| probs_set | xarray.Dataset | Dataset containing set probabilities. | required |
| obs | xarray.DataArray | Dataset containing observation values. | required |
| params | object | Configuration parameters including requirements for HR, SR, and FR. | required |
Returns:
| Type | Description |
|---|---|
| pd.DataFrame | DataFrame containing the best trigger pairs for each window. |
find_optimal_triggers(observations_val, observations_bool, prob_ready, prob_set, lead_time, issue, tolerance, filter_constraints, min_return_period, min_hit_rate, min_success_rate, max_failure_rate, output_shape)
Find the optimal trigger pair by evaluating the objective function on every pair of values of a 100 x 100 grid and selecting the minimizer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| observations_val | np.ndarray | Time series of the observed rainfall values (or SPI). | required |
| observations_bool | np.ndarray | Time series of categorical observations for the specified category. | required |
| prob_ready | np.ndarray | Time series of forecast probabilities for the ready month. | required |
| prob_set | np.ndarray | Time series of forecast probabilities for the set month. | required |
| lead_time | int | Lead time month. | required |
| issue | int | Issue month. | required |
| tolerance | float | Tolerance threshold for acceptable false alarms. | required |
| filter_constraints | int | Flag to apply operational constraints (1 for yes, 0 for no). | required |
| min_return_period | int | Minimum acceptable return period (in years). | required |
| min_hit_rate | float | Minimum acceptable hit rate. | required |
| min_success_rate | float | Minimum acceptable success rate. | required |
| max_failure_rate | float | Maximum acceptable failure rate. | required |
| output_shape | np.ndarray | Array with expected output size for numba compilation. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| best_triggers | np.ndarray | Array of size 2 containing best triggers for Ready / Set. |
| best_score | float | Score (mainly hit rate) corresponding to the best triggers. |
get_trigger_metrics_dataframe(obs, probs_ready, probs_set, data_path)
Compute trigger metrics for a single district and save the results as a CSV file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obs | xarray.DataArray | Observations dataset containing 'bool', 'val', 'lead_time', and 'category' variables. | required |
| probs_ready | xarray.DataArray | Dataset containing readiness probabilities with 'prob' and 'issue' variables. | required |
| probs_set | xarray.DataArray | Dataset containing set probabilities with 'prob' variable. | required |
| data_path | str | Output folder path to save the CSV file. | required |
Returns:
| Type | Description |
|---|---|
| None | Nothing is returned; the results are saved as a CSV file. |
get_window_district(area, indicator, district, params)
Determines which window (Window 1 or Window 2) an indicator belongs to for a given district.
objective(t, obs_val, obs_bool, prob_issue0, prob_issue1, leadtime, issue, tolerance, filter_constraints, min_return_period, min_hit_rate, min_success_rate, max_failure_rate, penalty, conf_matrix, constraints, result)
Compute the objective function value for a given pair of thresholds.
This function applies a pair of thresholds (t) to two forecast probability time series (prob_issue0 and prob_issue1) to generate binary predictions. It then computes the confusion matrix (correct rejections, false alarms, misses, hits) and derives key performance metrics: hit rate, false alarm rate, success rate, failure rate, and return period.
If filter_constraints is 1, it applies a set of operational constraints
(minimum hit rate, minimum success rate, maximum failure rate, minimum return period, and lead time constraint)
to determine whether the thresholds are acceptable. If all constraints are satisfied,
it returns a score combining hit rate and false alarm rate, to be minimized; otherwise, it assigns a high penalty value.
If filter_constraints is 0, it returns the full set of metrics without penalization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| t | np.ndarray | Array of size 2 containing triggers for 'ready' and 'set' forecasts. | required |
| obs_val | np.ndarray | Array of observed continuous anomaly values. | required |
| obs_bool | np.ndarray | Array of observed binary event occurrences (0 or 1). | required |
| prob_issue0 | np.ndarray | Forecast probabilities for the 'ready' stage. | required |
| prob_issue1 | np.ndarray | Forecast probabilities for the 'set' stage. | required |
| leadtime | int | Indicator period lead time (month). | required |
| issue | int | Forecast issue month. | required |
| tolerance | float | Tolerance threshold for acceptable false alarms. | required |
| filter_constraints | int | If 1, constraints are applied to filter acceptable triggers. | required |
| min_return_period | int | Minimum acceptable return period for actions (in years). | required |
| min_hit_rate | float | Minimum acceptable hit rate. | required |
| min_success_rate | float | Minimum acceptable success rate (hits relative to actions taken). | required |
| max_failure_rate | float | Maximum acceptable failure rate (tolerance-exceeding false alarms relative to actions taken). | required |
| penalty | np.ndarray | Array of penalty values assigned when constraints are not satisfied. | required |
| conf_matrix | np.ndarray | Array to store computed confusion matrix elements [correct rejections, false alarms, misses, hits]. | required |
| constraints | np.ndarray | Array to temporarily store the boolean results of constraints evaluation. | required |
| result | np.ndarray | Array to store the computed objective score or the list of metrics. | required |
Notes
- The first output when filter_constraints=True is a scalar combining hit rate and false alarm rate.
- When filter_constraints=False, the full confusion matrix and performance metrics are returned.
- Designed for use with Numba guvectorize to allow fast parallel evaluation across a large grid of thresholds.
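The constrained scoring described for objective can be sketched as follows. This is a simplified illustration, not the module's code: the name constrained_score, the PENALTY value, and the weight alpha are hypothetical, the score form (-hit rate + alpha * false alarm rate) is assumed, and the lead time constraint is omitted because its exact rule is not stated here.

```python
PENALTY = 1e6  # hypothetical value returned when a constraint fails


def constrained_score(hit_rate, false_alarm_rate, success_rate, failure_rate,
                      return_period, *, min_hit_rate, min_success_rate,
                      max_failure_rate, min_return_period, alpha=0.5):
    """Return the score if all constraints hold, otherwise a large penalty."""
    constraints = (
        hit_rate >= min_hit_rate,
        success_rate >= min_success_rate,
        failure_rate <= max_failure_rate,
        return_period >= min_return_period,
    )
    if not all(constraints):
        return PENALTY
    # Lower is better: reward hits, penalize false alarms.
    return -hit_rate + alpha * false_alarm_rate


limits = dict(min_hit_rate=0.5, min_success_rate=0.65,
              max_failure_rate=0.35, min_return_period=3)
accepted = constrained_score(0.8, 0.1, 0.7, 0.1, 4, **limits)
rejected = constrained_score(0.4, 0.1, 0.7, 0.1, 4, **limits)
print(accepted, rejected)
```

Because rejected pairs receive a value far above any attainable score, a minimizer over the threshold grid never selects them unless every pair fails the constraints.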
run_pilot_districts_metrics(obs, probs_ready, probs_set, params)
Loop through pilot districts and save trigger metrics in CSV.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obs | xarray.DataArray | Observational dataset containing 'bool', 'val', 'lead_time', and 'category' variables. | required |
| probs_ready | xarray.DataArray | Dataset containing readiness probabilities with 'prob' and 'issue' variables. | required |
| probs_set | xarray.DataArray | Dataset containing set probabilities with 'prob' variable. | required |
| params | object | Configuration parameters including 'data_path', 'iso', and 'districts_vulnerability'. | required |
Returns:
| Type | Description |
|---|---|
| None | Nothing is returned; the trigger metrics are saved in CSV. |
run_ready_set_brute_selection(obs, probs_ready, probs_set, probs, params)
Run the trigger optimization using xarray's apply_ufunc and Dask parallelization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| obs | xarray.Dataset | Dataset containing observed values ('val'), boolean event occurrence ('bool'), lead time ('lead_time'), tolerance ('tolerance'), and return period ('return_period'). | required |
| probs_ready | xarray.Dataset | Dataset containing the forecast probability used for readiness ('prob'). | required |
| probs_set | xarray.Dataset | Dataset containing the forecast probability used for activation ('prob'). | required |
| probs | xarray.Dataset | Dataset containing the forecast issue time ('issue'). | required |
| params | object | Params class containing 'vulnerability' and 'requirements' (with 'HR', 'SR', 'FR'). | required |
Returns:
| Name | Type | Description |
|---|---|---|
| triggers | xarray.DataArray | Array of optimal trigger pairs. |
| score | xarray.DataArray | Optimal score (-hit rate + alpha * false alarm rate) for each configuration. |
save_metrics_df(grid_metrics_da, data_path)
Convert grid metrics from a DataArray to a pivoted DataFrame and save it as a CSV file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| grid_metrics_da | xarray.DataArray | Contains the computed trigger metrics. | required |
| data_path | str | Output folder path. | required |
Returns:
| Name | Type | Description |
|---|---|---|
| output_path | str | Output file path. |