Module: `AA.triggers`

Computes triggers for SPI or Dryspell, for a given country and vulnerability mode.

Usage

Pixi

pixi run python -m AA.triggers <ISO> <SPI/DRYSPELL> <VULNERABILITY>

Docker

docker run --rm \
  -e AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID} \
  -e AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY} \
  -e AWS_SESSION_TOKEN=${AWS_SESSION_TOKEN} \
  aa-runner:latest \
  python -m AA.triggers <ISO> <SPI/DRYSPELL> <VULNERABILITY> \
  --data-path <DATA_PATH> --output-path <OUTPUT_PATH>

Arguments

<ISO>: 3-letter ISO code.
<SPI/DRYSPELL>: Indicator family.
<VULNERABILITY>:
GT — General Triggers
NRT — Non-Regret Triggers
TBD — Save full list without filtering

Post-processing

After running both SPI and Dryspell triggers, use `merge-spi-dryspell-gt-nrt-triggers.py` (a Jupytext notebook) to filter and merge the results.

For core data models and common utilities used in trigger computation, see HIP Analysis docs:
https://wfp-vam.github.io/hip-analysis/

`brute_force(observations_val, observations_bool, prob_ready, prob_set, lead_time, issue, tolerance, filter_constraints, min_return_period, min_hit_rate, min_success_rate, max_failure_rate, out_shape, result)`

Evaluate the objective function for a set of thresholds using vectorized operations.

This function computes the objective function across arrays of forecast probabilities (prob_issue0, prob_issue1) and observations (obs_val, obs_bool). It evaluates every candidate trigger pair (t1, t2), with each trigger ranging from 0 to 1 in steps of 0.01, and extracts the pair with the lowest score together with that score.

The function is optimized for parallel execution with Numba's guvectorize decorator to perform calculations efficiently across a grid of inputs.

Parameters:

Name	Description	Default
`prob_issue0`	np.ndarray, Forecast probabilities for the 'ready' stage.	required
`prob_issue1`	np.ndarray, Forecast probabilities for the 'set' stage.	required
`obs_val`	np.ndarray, Array of observed continuous anomaly values.	required
`obs_bool`	np.ndarray, Array of observed binary event occurrences (0 or 1).	required
`leadtime`	int, Indicator period lead time (month).	required
`issue`	int, Forecast issue month.	required
`tolerance`	float, Tolerance threshold for acceptable false alarms.	required
`filter_constraints`	int, Flag to apply operational constraints (1 for yes, 0 for no).	required
`min_return_period`	int, Minimum acceptable return period (in years).	required
`min_hit_rate`	float, Minimum acceptable hit rate.	required
`min_success_rate`	float, Minimum acceptable success rate.	required
`max_failure_rate`	float, Maximum acceptable failure rate.	required
`out_shape`	np.ndarray, Array with the same size as result used as a trick to define the dimension of result in the decorator.	required
`result`	np.ndarray, Array to store computed objective function value.	required

Returns:

Name	Type	Description
`None`		The `result` array is updated in place with the objective value.

Notes

This function uses Numba's guvectorize to enable fast parallel processing.
It computes various performance metrics (hit rate, false alarm rate, success rate, failure rate) based on the thresholds.
Constraints (if enabled) can penalize thresholds that don't satisfy operational limits.
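
The grid evaluation described for `brute_force` can be sketched in plain NumPy. This is a minimal illustration, not the module's Numba implementation: the function name `score_grid`, the weight `alpha`, the rule that an action requires both probabilities to exceed their thresholds, and the false alarm rate definition (false alarms over non-events) are all assumptions. The score form (- hit rate + alpha * false alarm rate) follows the description given for `run_ready_set_brute_selection`.

```python
import numpy as np

def score_grid(prob_ready, prob_set, obs_bool, alpha=0.5):
    """Score every (t1, t2) pair on a 100 x 100 grid; lower is better.

    Hypothetical helper: `alpha` and the AND rule are assumptions.
    """
    thresholds = np.arange(100) / 100.0  # 0.00, 0.01, ..., 0.99
    event = np.asarray(obs_bool).astype(bool)

    # Broadcast to shape (n_t1, n_t2, n_time).
    ready = prob_ready[None, None, :] > thresholds[:, None, None]
    set_ = prob_set[None, None, :] > thresholds[None, :, None]
    action = ready & set_  # assumed rule: both stages must exceed their trigger

    hits = (action & event).sum(axis=-1)
    false_alarms = (action & ~event).sum(axis=-1)

    # Guard against division by zero when a class is absent.
    n_events = max(int(event.sum()), 1)
    n_non_events = max(int((~event).sum()), 1)

    hit_rate = hits / n_events
    false_alarm_rate = false_alarms / n_non_events  # assumed definition
    return thresholds, -hit_rate + alpha * false_alarm_rate
```

The returned array has one score per trigger pair, so the best pair can be read off with an argmin over the grid.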

`compute_confusion_matrix(true, pred, out_shape, result)`

Computes a confusion matrix with NumPy for two arrays, true and pred.

Results are identical to those of sklearn.metrics.confusion_matrix, with similar computation time.

However, this function avoids the dependency on scikit-learn and can be used with Numba in nopython mode.

Returns an array [true negatives, false positives, false negatives, true positives]
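
The counting step can be sketched with boolean masks. This is an illustration only: `confusion_counts` is a hypothetical name, and the sketch omits the `out_shape` and `result` arguments that the module's function takes for Numba. The output order matches the one stated above, which is also the order obtained from `sklearn.metrics.confusion_matrix(true, pred).ravel()` for binary labels.

```python
import numpy as np

def confusion_counts(true, pred):
    """Return [true negatives, false positives, false negatives, true positives]."""
    true = np.asarray(true, dtype=bool)
    pred = np.asarray(pred, dtype=bool)
    tn = np.sum(~true & ~pred)  # no event, no action
    fp = np.sum(~true & pred)   # no event, action taken
    fn = np.sum(true & ~pred)   # event, no action
    tp = np.sum(true & pred)    # event, action taken
    return np.array([tn, fp, fn, tp])
```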

`evaluate_grid_metrics(obs, probs_ready, probs_set)`

Evaluate all metrics over the entire grid using apply_ufunc. The list of metrics is: [Correct Rejections, False Positives, False Negatives, Hits, Hit Rate, False Alarm Rate, Success Rate, Failure Rate, Return Period]

Parameters:

Name	Description	Default
`obs`	xarray.Dataset, containing numerical and categorical observations, with dimensions (district, time, category, index)	required
`probs_ready`	xarray.Dataset, forecast probabilities for the ready month, with dimensions (district, time, category, index, issue)	required
`probs_set`	xarray.Dataset, forecast probabilities for the set month	required

Returns:

Name	Type	Description
`metrics_da`		xarray.DataArray, structured array with grid evaluations for all metrics
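
As a rough guide to how the rate metrics relate to the four counts, the sketch below derives them for a single cell. The definitions are assumptions, not taken from the module: false alarm rate as false positives over non-events, success rate as hits over actions taken, and return period as years of record per action. Failure rate is omitted because it depends on the tolerance applied to observed values. `derive_metrics` and `n_years` are hypothetical names.

```python
import math

def derive_metrics(correct_rejections, false_positives, false_negatives, hits, n_years):
    """Derive rate metrics from confusion counts (assumed definitions)."""
    actions = hits + false_positives
    events = hits + false_negatives
    non_events = correct_rejections + false_positives
    return {
        "hit_rate": hits / events if events else math.nan,
        "false_alarm_rate": false_positives / non_events if non_events else math.nan,
        "success_rate": hits / actions if actions else math.nan,
        # Assumed: average number of years between two actions.
        "return_period": n_years / actions if actions else math.inf,
    }
```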

`filter_triggers_by_window(df_leadtime, probs_ready, probs_set, obs, params)`

Filters and selects the best trigger pairs for each window by evaluating the trigger values.

Parameters:

Name	Description	Default
`df_leadtime`	pd.DataFrame, DataFrame containing lead time information and trigger values.	required
`probs_ready`	xarray.Dataset, dataset containing readiness probabilities.	required
`probs_set`	xarray.Dataset, dataset containing set probabilities.	required
`obs`	xarray.DataArray, dataset containing observation values.	required
`params`	object, configuration parameters including requirements for HR, SR, and FR.	required

Returns:

Type	Description
	pd.DataFrame, DataFrame containing the best trigger pairs for each window.

`find_optimal_triggers(observations_val, observations_bool, prob_ready, prob_set, lead_time, issue, tolerance, filter_constraints, min_return_period, min_hit_rate, min_success_rate, max_failure_rate, output_shape)`

Find the optimal trigger pair by evaluating the objective function on every pair of values in a 100 × 100 grid and selecting the minimizer.

Parameters:

Name	Description	Default
`observations_val`	np.ndarray, Time series of the observed rainfall values (or SPI).	required
`observations_bool`	np.ndarray, Time series of categorical observations for the specified category.	required
`prob_ready`	np.ndarray, Time series of forecast probabilities for the ready month.	required
`prob_set`	np.ndarray, Time series of forecast probabilities for the set month.	required
`lead_time`	int, Lead time month.	required
`issue`	int, Issue month.	required
`tolerance`	float, Tolerance threshold for acceptable false alarms.	required
`filter_constraints`	int, Flag to apply operational constraints (1 for yes, 0 for no).	required
`min_return_period`	int, Minimum acceptable return period (in years).	required
`min_hit_rate`	float, Minimum acceptable hit rate.	required
`min_success_rate`	float, Minimum acceptable success rate.	required
`max_failure_rate`	float, Maximum acceptable failure rate.	required
`output_shape`	np.ndarray, Array with expected output size for numba compilation.	required

Returns:

Name	Type	Description
`best_triggers`		np.ndarray, Array of size 2 containing best triggers for Ready / Set.
`best_score`		float, Score (mainly hit rate) corresponding to the best triggers.
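
Selecting the minimizer from a grid of scores can be sketched as follows. `select_minimizer` is a hypothetical helper, and the score grid is assumed to be already computed with axis 0 indexing the Ready trigger and axis 1 the Set trigger.

```python
import numpy as np

def select_minimizer(scores, thresholds):
    """Return the (ready, set) trigger pair with the lowest score, and that score."""
    # argmin works on the flattened grid; unravel_index maps it back to (i, j).
    i, j = np.unravel_index(np.argmin(scores), scores.shape)
    best_triggers = np.array([thresholds[i], thresholds[j]])
    return best_triggers, float(scores[i, j])
```

When several pairs share the lowest score, `np.argmin` returns the first one in row-major order, so this sketch favours the lowest Ready trigger among ties.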

`get_trigger_metrics_dataframe(obs, probs_ready, probs_set, data_path)`

Compute trigger metrics for a single district and save the results as a CSV file.

Parameters:

Name	Description	Default
`obs`	xarray.DataArray, observations dataset containing 'bool', 'val', 'lead_time', and 'category' variables.	required
`probs_ready`	xarray.DataArray, dataset containing readiness probabilities with 'prob' and 'issue' variables.	required
`probs_set`	xarray.DataArray, dataset containing set probabilities with 'prob' variable.	required
`data_path`	str, output folder path to save the CSV file.	required

Returns:

Type	Description
	None

`get_window_district(area, indicator, district, params)`

Determines which window (Window 1 or Window 2) an indicator belongs to for a given district.

`objective(t, obs_val, obs_bool, prob_issue0, prob_issue1, leadtime, issue, tolerance, filter_constraints, min_return_period, min_hit_rate, min_success_rate, max_failure_rate, penalty, conf_matrix, constraints, result)`

Compute the objective function value for a given pair of thresholds.

This function evaluates a pair of thresholds (t) applied to two forecast probability time series (prob_issue0 and prob_issue1) to generate binary predictions. It then computes the confusion matrix (correct rejections, false alarms, misses, hits) and derives key performance metrics such as hit rate, false alarm rate, success rate, failure rate, and return period.

If filter_constraints is 1, it applies a set of operational constraints (minimum hit rate, minimum success rate, maximum failure rate, minimum return period, and a lead time constraint) to decide whether the thresholds are acceptable. If all constraints are satisfied, the score to minimize combines hit rate and false alarm rate; otherwise, a high penalty value is assigned.

If filter_constraints is 0, it returns the full set of metrics without penalization.

Parameters:

Name	Description	Default
`t`	np.ndarray, Array of size 2 containing triggers for 'ready' and 'set' forecasts.	required
`obs_val`	np.ndarray, Array of observed continuous anomaly values.	required
`obs_bool`	np.ndarray, Array of observed binary event occurrences (0 or 1).	required
`prob_issue0`	np.ndarray, Forecast probabilities for the 'ready' stage.	required
`prob_issue1`	np.ndarray, Forecast probabilities for the 'set' stage.	required
`leadtime`	int, Indicator period lead time (month).	required
`issue`	int, Forecast issue month.	required
`tolerance`	float, Tolerance threshold for acceptable false alarms.	required
`filter_constraints`	int, If 1, constraints are applied to filter acceptable triggers.	required
`min_return_period`	int, Minimum acceptable return period for actions (in years).	required
`min_hit_rate`	float, Minimum acceptable hit rate.	required
`min_success_rate`	float, Minimum acceptable success rate (hits relative to actions taken).	required
`max_failure_rate`	float, Maximum acceptable failure rate (tolerance-exceeding false alarms relative to actions taken).	required
`penalty`	np.ndarray, Array of penalty values assigned when constraints are not satisfied.	required
`conf_matrix`	np.ndarray, Array to store computed confusion matrix elements [correct rejections, false alarms, misses, hits].	required
`constraints`	np.ndarray, Array to temporarily store the boolean results of constraints evaluation.	required
`result`	np.ndarray, Array to store the computed objective score or the list of metrics.	required

Notes

When filter_constraints is 1, the first output is a scalar combining hit rate and false alarm rate.
When filter_constraints is 0, the full confusion matrix and performance metrics are returned.
Designed for use with Numba guvectorize to allow fast parallel evaluation across a large grid of thresholds.
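
The constrained branch of the objective can be sketched as below, assuming the metrics have already been computed. This is not the module's code: `constrained_score`, the weight `alpha`, and the penalty value are hypothetical, and the lead time constraint is left out because its definition is not given here.

```python
def constrained_score(hit_rate, false_alarm_rate, success_rate, failure_rate,
                      return_period, *, min_hit_rate, min_success_rate,
                      max_failure_rate, min_return_period,
                      alpha=0.5, penalty=1e6):
    """Return the score to minimize, or a large penalty if a constraint fails."""
    constraints = (
        hit_rate >= min_hit_rate,
        success_rate >= min_success_rate,
        failure_rate <= max_failure_rate,
        return_period >= min_return_period,
    )
    if not all(constraints):
        return penalty  # assumed value; pushes the pair out of contention
    # Lower is better: reward hits, discount false alarms by alpha (assumed).
    return -hit_rate + alpha * false_alarm_rate
```

Returning a large constant rather than raising keeps the function usable inside a grid search, where infeasible pairs simply lose to any feasible one.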

`run_pilot_districts_metrics(obs, probs_ready, probs_set, params)`

Loop through pilot districts and save trigger metrics in CSV.

Parameters:

Name	Description	Default
`obs`	xarray.DataArray, observational dataset containing 'bool', 'val', 'lead_time', and 'category' variables.	required
`probs_ready`	xarray.DataArray, dataset containing readiness probabilities with 'prob' and 'issue' variables.	required
`probs_set`	xarray.DataArray, dataset containing set probabilities with 'prob' variable.	required
`params`	object, configuration parameters including 'data_path', 'iso', and 'districts_vulnerability'.	required

Returns:

Type	Description
	None

`run_ready_set_brute_selection(obs, probs_ready, probs_set, probs, params)`

Run the trigger optimization using xarray's apply_ufunc and Dask parallelization.

Parameters:

Name	Description	Default
`obs`	xarray.Dataset, dataset containing observed values ('val'), boolean event occurrence ('bool'), lead time ('lead_time'), tolerance ('tolerance'), and return period ('return_period').	required
`probs_ready`	xarray.Dataset, dataset containing the forecast probability used for readiness ('prob').	required
`probs_set`	xarray.Dataset, dataset containing the forecast probability used for activation ('prob').	required
`probs`	xarray.Dataset, dataset containing the forecast issue time ('issue').	required
`params`	object, Params class containing 'vulnerability' and 'requirements' (with 'HR', 'SR', 'FR').	required

Returns:

Name	Type	Description
`triggers`		xarray.DataArray, array of optimal trigger pairs.
`score`		xarray.DataArray, optimal score (- hit rate + alpha * false alarm rate) for each configuration.
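
The `apply_ufunc` pattern used here can be illustrated on a toy problem. The sketch below runs eagerly, without Dask, and uses a coarse 10 × 10 grid to stay short. `best_pair_1d`, `select_triggers`, the dimension names, the score weights, and the rule that both probabilities must exceed their triggers are assumptions for illustration.

```python
import numpy as np
import xarray as xr

def best_pair_1d(obs_bool, prob_ready, prob_set):
    """Hypothetical kernel: coarse grid search over 1-D time series."""
    thresholds = np.arange(10) / 10.0
    best, best_score = (0.0, 0.0), np.inf
    for t1 in thresholds:
        for t2 in thresholds:
            action = (prob_ready > t1) & (prob_set > t2)  # assumed rule
            hits = np.sum(action & (obs_bool == 1))
            false_alarms = np.sum(action & (obs_bool == 0))
            score = -hits + 0.5 * false_alarms  # assumed weights
            if score < best_score:
                best, best_score = (t1, t2), score
    return np.array(best)

def select_triggers(obs_bool, prob_ready, prob_set):
    """Apply the kernel along 'time' for every remaining coordinate."""
    return xr.apply_ufunc(
        best_pair_1d, obs_bool, prob_ready, prob_set,
        input_core_dims=[["time"], ["time"], ["time"]],
        output_core_dims=[["trigger"]],  # new dimension of size 2
        vectorize=True,                  # loop the 1-D kernel over districts
    )

obs = xr.DataArray(np.array([[1, 0, 1, 0], [0, 1, 0, 1]]), dims=("district", "time"))
probs = xr.DataArray(
    np.array([[0.9, 0.2, 0.8, 0.1], [0.1, 0.7, 0.3, 0.6]]),
    dims=("district", "time"),
)
triggers = select_triggers(obs, probs, probs)
```

The core dimension `time` is consumed by the kernel, and the result gains a `trigger` dimension holding the Ready and Set values for each district.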

`save_metrics_df(grid_metrics_da, data_path)`

Convert grid metrics from a DataArray to a pivoted DataFrame and save it as a CSV file.

Parameters:

Name	Description	Default
`grid_metrics_da`	xarray.DataArray, containing the computed trigger metrics.	required
`data_path`	str, output folder path.	required

Returns:

Name	Type	Description
`output_path`		str, output file path.
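
The pivot-and-save step can be sketched with pandas, starting from a long-format table such as the one `DataArray.to_dataframe()` produces after resetting its index. The column names (`district`, `metric`, `value`), the file name `metrics.csv`, and the helper name `save_pivoted_metrics` are assumptions for illustration.

```python
import os
import tempfile

import pandas as pd

def save_pivoted_metrics(long_df, data_path, filename="metrics.csv"):
    """Pivot one-row-per-metric data to one column per metric and write a CSV."""
    wide = long_df.pivot(index="district", columns="metric", values="value")
    wide = wide.reset_index()
    output_path = os.path.join(data_path, filename)
    wide.to_csv(output_path, index=False)
    return output_path

long_df = pd.DataFrame({
    "district": ["A", "A", "B", "B"],
    "metric": ["hit_rate", "false_alarm_rate"] * 2,
    "value": [0.8, 0.1, 0.6, 0.3],
})
output_path = save_pivoted_metrics(long_df, tempfile.mkdtemp())
```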
