Module: AA.triggers

Computes triggers for SPI or Dryspell, for a given country and vulnerability mode.

Usage

Pixi

pixi run python -m AA.triggers <ISO> <SPI/DRYSPELL> <VULNERABILITY>

Docker

docker run --rm \
  -e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
  -e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
  -e AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN} \
  aa-runner:latest \
  python -m AA.triggers <ISO> <SPI/DRYSPELL> <VULNERABILITY> \
  --data-path <DATA_PATH> --output-path <OUTPUT_PATH>

Arguments

  • <ISO>: 3-letter ISO code.
  • <SPI/DRYSPELL>: Indicator family.
  • <VULNERABILITY>: Vulnerability mode, one of:
      • GT: General Triggers
      • NRT: Non-Regret Triggers
      • TBD: Save full list without filtering

Post-processing

After running both the SPI and Dryspell triggers, use the merge-spi-dryspell-gt-nrt-triggers.py Jupytext notebook to filter and merge the results.
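
Because post-processing expects both indicator runs to exist, the two Pixi invocations can be generated together. The sketch below only builds the command lines and does not execute them. The helper name build_trigger_commands and the example ISO code MOZ are illustrative assumptions, and it assumes the indicator argument is passed literally as SPI or DRYSPELL.

```python
import shlex

VULNERABILITY_MODES = {"GT", "NRT", "TBD"}  # modes listed under Arguments


def build_trigger_commands(iso, vulnerability, indicators=("SPI", "DRYSPELL")):
    """Return one Pixi command (as an argument list) per indicator family."""
    if vulnerability not in VULNERABILITY_MODES:
        raise ValueError(f"Unknown vulnerability mode: {vulnerability}")
    return [
        ["pixi", "run", "python", "-m", "AA.triggers", iso, indicator, vulnerability]
        for indicator in indicators
    ]


# "MOZ" is only an example ISO code.
for command in build_trigger_commands("MOZ", "GT"):
    print(shlex.join(command))
```

Each returned list can be passed to subprocess.run once the environment is set up.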

For core data models and common utilities used in trigger computation, see HIP Analysis docs:
https://wfp-vam.github.io/hip-analysis/

brute_force(observations_val, observations_bool, prob_ready, prob_set, lead_time, issue, tolerance, filter_constraints, min_return_period, min_hit_rate, min_success_rate, max_failure_rate, out_shape, result)

Evaluate the objective function for a set of thresholds using vectorized operations.

This function computes the objective function across an array of forecast probabilities (prob_issue0 and prob_issue1, which correspond to prob_ready and prob_set in the signature) and observed values (obs_val and obs_bool, which correspond to observations_val and observations_bool). The objective function is evaluated for a set of candidate trigger pairs (t1 and t2, each ranging between 0 and 1 in steps of 0.01). The trigger pair with the lowest score is extracted along with its score.

The function is optimized for parallel execution with Numba's guvectorize decorator to perform calculations efficiently across a grid of inputs.

Parameters:

  • prob_issue0 (np.ndarray, required): Forecast probabilities for the 'ready' stage.
  • prob_issue1 (np.ndarray, required): Forecast probabilities for the 'set' stage.
  • obs_val (np.ndarray, required): Array of observed continuous anomaly values.
  • obs_bool (np.ndarray, required): Array of observed binary event occurrences (0 or 1).
  • leadtime (int, required): Indicator period lead time (month).
  • issue (int, required): Forecast issue month.
  • tolerance (float, required): Tolerance threshold for acceptable false alarms.
  • filter_constraints (int, required): Flag to apply operational constraints (1 for yes, 0 for no).
  • min_return_period (int, required): Minimum acceptable return period (in years).
  • min_hit_rate (float, required): Minimum acceptable hit rate.
  • min_success_rate (float, required): Minimum acceptable success rate.
  • max_failure_rate (float, required): Maximum acceptable failure rate.
  • out_shape (np.ndarray, required): Array with the same size as result, used as a trick to define the dimension of result in the decorator.
  • result (np.ndarray, required): Array to store the computed objective function value.

Returns:

  • None: The result array is updated in place with the objective value.

Notes
  • This function uses Numba's guvectorize to enable fast parallel processing.
  • It computes various performance metrics (hit rate, false alarm rate, success rate, failure rate) based on the thresholds.
  • Constraints (if enabled) can penalize thresholds that don't satisfy operational limits.
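
The exhaustive search described above can be illustrated in plain NumPy, without Numba. This is a minimal sketch under stated assumptions, not the module's implementation: an action is assumed to be predicted only when both probabilities exceed their triggers, the false alarm rate is assumed to be false alarms divided by non-events, the score uses the form - hit rate + alpha * false alarm rate, and the function name grid_search_triggers and the value alpha=0.5 are hypothetical. Constraints are left out.

```python
import numpy as np


def grid_search_triggers(obs_bool, prob_ready, prob_set, alpha=0.5, step=0.01):
    """Score every (t1, t2) pair on a regular grid and return the minimizer."""
    thresholds = np.arange(0.0, 1.0, step)  # 100 candidates per axis when step=0.01
    n_events = np.sum(obs_bool == 1)
    n_non_events = np.sum(obs_bool == 0)
    best_pair, best_score = (np.nan, np.nan), np.inf
    for t1 in thresholds:
        for t2 in thresholds:
            # Assumed decision rule: act only if both stages exceed their trigger.
            pred = (prob_ready > t1) & (prob_set > t2)
            hits = np.sum(pred & (obs_bool == 1))
            false_alarms = np.sum(pred & (obs_bool == 0))
            hit_rate = hits / n_events if n_events else 0.0
            false_alarm_rate = false_alarms / n_non_events if n_non_events else 0.0
            score = -hit_rate + alpha * false_alarm_rate
            if score < best_score:  # strict comparison keeps the first minimizer
                best_pair, best_score = (t1, t2), score
    return np.array(best_pair), float(best_score)


# Toy series in which events and non-events are perfectly separable.
obs = np.array([1, 0, 1, 0, 0, 1])
ready = np.array([0.8, 0.2, 0.7, 0.3, 0.1, 0.9])
set_ = np.array([0.9, 0.1, 0.6, 0.4, 0.2, 0.8])
triggers, score = grid_search_triggers(obs, ready, set_)
print(triggers, score)
```

The double loop is written for readability. A compiled or vectorized version would evaluate the same grid far faster.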

compute_confusion_matrix(true, pred, out_shape, result)

Computes a confusion matrix with NumPy for two arrays, true and pred.

Results are identical to those of sklearn.metrics.confusion_matrix, with similar computation time.

However, this function avoids the dependency on sklearn and allows Numba to be used in nopython mode.

Returns an array [true negatives, false positives, false negatives, true positives].
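
As an illustration of the output ordering, the four counts can be obtained with boolean masks. This is a plain NumPy sketch, not the Numba-compiled function; the name confusion_counts is hypothetical and the out_shape and result arguments are omitted.

```python
import numpy as np


def confusion_counts(true, pred):
    """Return [true negatives, false positives, false negatives, true positives]."""
    true = np.asarray(true).astype(bool)
    pred = np.asarray(pred).astype(bool)
    tn = np.sum(~true & ~pred)
    fp = np.sum(~true & pred)
    fn = np.sum(true & ~pred)
    tp = np.sum(true & pred)
    return np.array([tn, fp, fn, tp])


print(confusion_counts([1, 0, 1, 0, 0, 1], [1, 0, 0, 1, 0, 1]))
```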

evaluate_grid_metrics(obs, probs_ready, probs_set)

Evaluate all metrics over the entire grid using apply_ufunc. The list of metrics is: [Correct Rejections, False Positives, False Negatives, Hits, Hit Rate, False Alarm Rate, Success Rate, Failure Rate, Return Period]

Parameters:

  • obs (xarray.Dataset, required): Numerical and categorical observations, with dimensions (district, time, category, index).
  • probs_ready (xarray.Dataset, required): Forecast probabilities for the ready month, with dimensions (district, time, category, index, issue).
  • probs_set (xarray.Dataset, required): Forecast probabilities for the set month.

Returns:

  • metrics_da (xarray.DataArray): Structured array with grid evaluations for all metrics.
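
The rate metrics in that list follow from the four confusion counts. The sketch below shows one plausible set of definitions and is not taken from the module: the false alarm rate is assumed to be false positives over non-events, the return period is assumed to be the number of years divided by the number of actions, and the failure rate is omitted because it depends on how the tolerance is applied. The function name rates_from_counts is hypothetical.

```python
def rates_from_counts(tn, fp, fn, tp, n_years):
    """Derive rate metrics from confusion counts (assumed definitions)."""
    actions = tp + fp  # number of times the trigger would have fired
    events = tp + fn
    non_events = tn + fp
    return {
        "hit_rate": tp / events if events else 0.0,
        "false_alarm_rate": fp / non_events if non_events else 0.0,
        "success_rate": tp / actions if actions else 0.0,  # hits relative to actions taken
        "return_period": n_years / actions if actions else float("inf"),
    }


print(rates_from_counts(tn=2, fp=1, fn=1, tp=2, n_years=6))
```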

filter_triggers_by_window(df_leadtime, probs_ready, probs_set, obs, params)

Filters and selects the best trigger pairs for each window by evaluating the trigger values.

Parameters:

  • df_leadtime (pd.DataFrame, required): DataFrame containing lead time information and trigger values.
  • probs_ready (xarray.Dataset, required): Dataset containing readiness probabilities.
  • probs_set (xarray.Dataset, required): Dataset containing set probabilities.
  • obs (xarray.DataArray, required): Dataset containing observation values.
  • params (object, required): Configuration parameters including requirements for HR, SR, and FR.

Returns:

  • pd.DataFrame: DataFrame containing the best trigger pairs for each window.

find_optimal_triggers(observations_val, observations_bool, prob_ready, prob_set, lead_time, issue, tolerance, filter_constraints, min_return_period, min_hit_rate, min_success_rate, max_failure_rate, output_shape)

Find the optimal trigger pair by evaluating the objective function on every pair of values in a 100 × 100 grid and selecting the minimizer.

Parameters:

  • observations_val (np.ndarray, required): Time series of the observed rainfall values (or SPI).
  • observations_bool (np.ndarray, required): Time series of categorical observations for the specified category.
  • prob_ready (np.ndarray, required): Time series of forecast probabilities for the ready month.
  • prob_set (np.ndarray, required): Time series of forecast probabilities for the set month.
  • lead_time (int, required): Lead time month.
  • issue (int, required): Issue month.
  • tolerance (float, required): Tolerance threshold for acceptable false alarms.
  • filter_constraints (int, required): Flag to apply operational constraints (1 for yes, 0 for no).
  • min_return_period (int, required): Minimum acceptable return period (in years).
  • min_hit_rate (float, required): Minimum acceptable hit rate.
  • min_success_rate (float, required): Minimum acceptable success rate.
  • max_failure_rate (float, required): Maximum acceptable failure rate.
  • output_shape (np.ndarray, required): Array with expected output size for Numba compilation.

Returns:

  • best_triggers (np.ndarray): Array of size 2 containing the best triggers for Ready / Set.
  • best_score (float): Score (mainly hit rate) corresponding to the best triggers.

get_trigger_metrics_dataframe(obs, probs_ready, probs_set, data_path)

Compute trigger metrics for a single district and save the results as a CSV file.

Parameters:

  • obs (xarray.DataArray, required): Observations dataset containing 'bool', 'val', 'lead_time', and 'category' variables.
  • probs_ready (xarray.DataArray, required): Dataset containing readiness probabilities with 'prob' and 'issue' variables.
  • probs_set (xarray.DataArray, required): Dataset containing set probabilities with 'prob' variable.
  • data_path (str, required): Output folder path to save the CSV file.

Returns:

  • None

get_window_district(area, indicator, district, params)

Determines which window (Window 1 or Window 2) an indicator belongs to for a given district.

objective(t, obs_val, obs_bool, prob_issue0, prob_issue1, leadtime, issue, tolerance, filter_constraints, min_return_period, min_hit_rate, min_success_rate, max_failure_rate, penalty, conf_matrix, constraints, result)

Compute the objective function value for a given pair of thresholds.

This function applies a pair of thresholds (t) to two forecast probability time series (prob_issue0 and prob_issue1) to generate binary predictions. It then computes the confusion matrix (correct rejections, false alarms, false negatives, hits) and derives key performance metrics: hit rate, false alarm rate, success rate, failure rate, and return period.

If filter_constraints is enabled (1), it applies a set of operational constraints (minimum hit rate, minimum success rate, maximum failure rate, minimum return period, and lead time constraint) to determine whether the thresholds are acceptable. If all constraints are satisfied, the score is a combination of hit rate and false alarm rate that is to be minimized; otherwise, a high penalty value is assigned.

If filter_constraints is disabled (0), it returns the full set of metrics without penalization.

Parameters:

  • t (np.ndarray, required): Array of size 2 containing triggers for 'ready' and 'set' forecasts.
  • obs_val (np.ndarray, required): Array of observed continuous anomaly values.
  • obs_bool (np.ndarray, required): Array of observed binary event occurrences (0 or 1).
  • prob_issue0 (np.ndarray, required): Forecast probabilities for the 'ready' stage.
  • prob_issue1 (np.ndarray, required): Forecast probabilities for the 'set' stage.
  • leadtime (int, required): Indicator period lead time (month).
  • issue (int, required): Forecast issue month.
  • tolerance (float, required): Tolerance threshold for acceptable false alarms.
  • filter_constraints (int, required): If 1, constraints are applied to filter acceptable triggers.
  • min_return_period (int, required): Minimum acceptable return period for actions (in years).
  • min_hit_rate (float, required): Minimum acceptable hit rate.
  • min_success_rate (float, required): Minimum acceptable success rate (hits relative to actions taken).
  • max_failure_rate (float, required): Maximum acceptable failure rate (tolerance-exceeding false alarms relative to actions taken).
  • penalty (np.ndarray, required): Array of penalty values assigned when constraints are not satisfied.
  • conf_matrix (np.ndarray, required): Array to store computed confusion matrix elements [correct rejections, false alarms, false negatives, hits].
  • constraints (np.ndarray, required): Array to temporarily store the boolean results of constraints evaluation.
  • result (np.ndarray, required): Array to store the computed objective score or the list of metrics.

Notes
  • The first output when filter_constraints=True is a scalar combining hit rate and false alarm rate.
  • When filter_constraints=False, the full confusion matrix and performance metrics are returned.
  • Designed for use with Numba guvectorize to allow fast parallel evaluation across a large grid of thresholds.
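
The constrained branch can be summarized as "score if every constraint holds, penalty otherwise". The sketch below illustrates that logic on precomputed metrics. It is not the module's code: the function name constrained_score, alpha=0.5, and the penalty value are hypothetical, the score form - hit rate + alpha * false alarm rate is assumed, and the lead time constraint is left out because its rule is not described here.

```python
def constrained_score(
    hit_rate,
    false_alarm_rate,
    success_rate,
    failure_rate,
    return_period,
    *,
    min_hit_rate,
    min_success_rate,
    max_failure_rate,
    min_return_period,
    alpha=0.5,      # assumed weight on the false alarm rate
    penalty=1e6,    # assumed value returned for unacceptable thresholds
):
    """Return the score when all constraints hold, else the penalty."""
    constraints = (
        hit_rate >= min_hit_rate,
        success_rate >= min_success_rate,
        failure_rate <= max_failure_rate,
        return_period >= min_return_period,
    )
    if all(constraints):
        return -hit_rate + alpha * false_alarm_rate
    return penalty


limits = dict(min_hit_rate=0.5, min_success_rate=0.5, max_failure_rate=0.3, min_return_period=3)
print(constrained_score(0.75, 0.5, 0.6, 0.2, 4, **limits))   # acceptable thresholds
print(constrained_score(0.25, 0.5, 0.6, 0.2, 4, **limits))   # hit rate too low
```

Because the penalty is much larger than any attainable score, a minimizer over the threshold grid never selects a pair that violates a constraint unless no pair satisfies them all.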

run_pilot_districts_metrics(obs, probs_ready, probs_set, params)

Loop through pilot districts and save trigger metrics in CSV.

Parameters:

  • obs (xarray.DataArray, required): Observational dataset containing 'bool', 'val', 'lead_time', and 'category' variables.
  • probs_ready (xarray.DataArray, required): Dataset containing readiness probabilities with 'prob' and 'issue' variables.
  • probs_set (xarray.DataArray, required): Dataset containing set probabilities with 'prob' variable.
  • params (object, required): Configuration parameters including 'data_path', 'iso', and 'districts_vulnerability'.

Returns:

  • None

run_ready_set_brute_selection(obs, probs_ready, probs_set, probs, params)

Run the trigger optimization using xarray's apply_ufunc and Dask parallelization.

Parameters:

  • obs (xarray.Dataset, required): Dataset containing observed values ('val'), boolean event occurrence ('bool'), lead time ('lead_time'), tolerance ('tolerance'), and return period ('return_period').
  • probs_ready (xarray.Dataset, required): Dataset containing the forecast probability used for readiness ('prob').
  • probs_set (xarray.Dataset, required): Dataset containing the forecast probability used for activation ('prob').
  • probs (xarray.Dataset, required): Dataset containing the forecast issue time ('issue').
  • params (object, required): Params class containing 'vulnerability' and 'requirements' (with 'HR', 'SR', 'FR').

Returns:

  • triggers (xarray.DataArray): Array of optimal trigger pairs.
  • score (xarray.DataArray): Optimal score (- hit rate + alpha * false alarm rate) for each configuration.

save_metrics_df(grid_metrics_da, data_path)

Convert grid metrics from a DataArray to a pivoted DataFrame and save it as a CSV file.

Parameters:

  • grid_metrics_da (xarray.DataArray, required): Computed trigger metrics.
  • data_path (str, required): Output folder path.

Returns:

  • output_path (str): Output file path.
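
The pivot-and-save step can be sketched with pandas alone, starting from metrics that are already in long form (one row per district, indicator, and metric). This is an illustration rather than the function's implementation: the column names, the file name grid_metrics.csv, and the helper name pivot_and_save are assumptions, and the conversion from the xarray.DataArray is not shown.

```python
import os
import tempfile

import pandas as pd


def pivot_and_save(long_df, data_path, filename="grid_metrics.csv"):
    """Pivot long-form metrics to one column per metric and write a CSV."""
    wide = long_df.pivot_table(
        index=["district", "indicator"], columns="metric", values="value"
    ).reset_index()
    wide.columns.name = None  # drop the leftover "metric" label on the columns
    output_path = os.path.join(data_path, filename)
    wide.to_csv(output_path, index=False)
    return output_path


# Toy records with made-up values, written to a temporary folder.
records = pd.DataFrame(
    [
        {"district": "A", "indicator": "SPI ON", "metric": "HR", "value": 0.6},
        {"district": "A", "indicator": "SPI ON", "metric": "SR", "value": 0.5},
    ]
)
with tempfile.TemporaryDirectory() as tmp:
    path = pivot_and_save(records, tmp)
    print(pd.read_csv(path))
```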