User Guide
Quick start
Python
Run the following example to extract commodity data:
from data_bridges_knots import DataBridgesShapes
CONFIG_PATH = r"data_bridges_api_config.yaml"
client = DataBridgesShapes(CONFIG_PATH)
# COMMODITY DATA
commodity_units_list = client.get_commodity_units_list(country_iso3="TZA", commodity_unit_name="Kg", page=1, format='json')
R
library(reticulate)
# Point to our virtual environment's Python (must be set before Python is initialized)
use_python(".venv/bin/python")
# Import the Python module through reticulate
data_bridges_knots <- import("data_bridges_knots")
# Create client instance
config_path <- "data_bridges_api_config.yaml"
client <- data_bridges_knots$DataBridgesShapes(config_path)
# COMMODITY DATA
# Get commodity unit list for Tanzania
commodity_units <- client$get_commodity_units_list(
country_code = "TZA",
commodity_unit_name = "Kg",
page = 1L,
format = "json"
)
Getting variable and choice labels
DataBridgesKnots comes with helper functions that make datasets more human-readable.
data_bridges_knots.labels.get_variable_labels(xlsform_df, format='dict')
Build a mapping between variable names and their labels from a DataBridges XLSForm and return it in the desired format.
Empty labels default to the corresponding name. For duplicate names, the latest occurrence overrides earlier values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| xlsform_df | DataFrame | DataFrame with at least the `name` and `label` columns. | required |
| format | str | One of `"dict"` (returns a dict), `"json"` (returns a JSON-formatted string), or `"df"` (returns a DataFrame). | 'dict' |
Returns:
| Type | Description |
|---|---|
| Union[dict[str, str], str, DataFrame] | `dict`, `str`, or `pandas.DataFrame`: labels mapping in the requested format. |
Raises:
| Type | Description |
|---|---|
| KeyError | If required columns are missing. |
| ValueError | If `format` is not one of the supported values. |
Examples:
>>> import pandas as pd
>>> from data_bridges_knots.labels import get_variable_labels
>>> df = pd.DataFrame({'name': ['n1', 'n2', 'n2'], 'label': ['L1', '', 'L2']})
>>> get_variable_labels(df, 'dict')
{'n1': 'L1', 'n2': 'L2'}
>>> get_variable_labels(df, 'df')
  colName label
0      n1    L1
1      n2    L2
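The defaulting and override rules described for `get_variable_labels` can be illustrated without the library. The following is a minimal sketch in plain Python, not the library's implementation: `sketch_variable_labels` is a hypothetical helper, and it takes a list of row dictionaries instead of a DataFrame so that it runs with the standard library only.

```python
import json


def sketch_variable_labels(rows, format="dict"):
    """Hypothetical stand-in that mimics the documented rules on plain row dicts."""
    labels = {}
    for row in rows:
        # An empty label falls back to the variable name itself.
        # Assigning by key means a later duplicate name overrides an earlier one.
        labels[row["name"]] = row["label"] or row["name"]
    if format == "dict":
        return labels
    if format == "json":
        return json.dumps(labels)
    raise ValueError(f"unsupported format: {format!r}")


rows = [
    {"name": "n1", "label": "L1"},
    {"name": "n2", "label": ""},    # empty label: defaults to "n2" ...
    {"name": "n2", "label": "L2"},  # ... then overridden by this later row
]
mapping = sketch_variable_labels(rows)
print(mapping)  # {'n1': 'L1', 'n2': 'L2'}
```

A mapping of this shape is convenient for renaming survey columns, for example with `DataFrame.rename(columns=mapping)` in pandas.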
data_bridges_knots.labels.get_choice_labels(xlsform_df, format='dict')
Build a mapping from each XLSForm question name to its choice value labels,
and return it as a dictionary, JSON string, or DataFrame.
The function expects an input DataFrame with
- a column `"name"` for the question (field) names, and
- a column `"choiceList"` whose rows contain a structure with a `"choices"` list. Each item in `choices` is a dict with `"name"` (the choice value/code) and `"label"` (the human-readable label).
Duplicate question names are merged, with later entries updating earlier ones.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| xlsform_df | DataFrame | Input DataFrame containing at least the columns `name` and `choiceList`. | required |
| format | str | Output format; one of `"dict"`, `"json"`, or `"df"`. | 'dict' |
Raises:
| Type | Description |
|---|---|
| KeyError | If required columns are missing. |
| ValueError | If `format` is not one of the supported values. |
Returns:
| Type | Description |
|---|---|
| Union[dict[str, str], str, DataFrame] | `dict[str, dict[str, str]]`, `str`, or `pandas.DataFrame`: labels mapping in the requested format. |
Examples:
>>> import pandas as pd
>>> from data_bridges_knots.labels import get_choice_labels
>>> df = pd.DataFrame({
... "name": ["q1", "q2"],
... "choiceList": [
... {"choices": [{"name": "yes", "label": "Yes"}, {"name": "no", "label": "No"}]},
... {"choices": [{"name": "a", "label": "Option A"}, {"name": "b", "label": "Option B"}]}
... ]
... })
>>> get_choice_labels(df, format="dict")
{'q1': {'yes': 'Yes', 'no': 'No'}, 'q2': {'a': 'Option A', 'b': 'Option B'}}
>>> print(get_choice_labels(df, format="json"))
>>> get_choice_labels(df, format="df")
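To make the merge rule for duplicate question names concrete, here is a minimal sketch in plain Python. It is not the library's implementation: `sketch_choice_labels` is a hypothetical helper operating on row dictionaries, and it assumes one reading of "merged", namely that choices are combined per question and a later label for the same code replaces the earlier one.

```python
import json


def sketch_choice_labels(rows):
    """Hypothetical stand-in: question name -> {choice code: choice label}."""
    mapping = {}
    for row in rows:
        choices = {c["name"]: c["label"] for c in row["choiceList"]["choices"]}
        # Assumption: duplicates are merged at the choice level, so a later
        # entry adds new codes and updates labels of codes already seen.
        mapping.setdefault(row["name"], {}).update(choices)
    return mapping


rows = [
    {"name": "q1", "choiceList": {"choices": [{"name": "yes", "label": "Yes"}]}},
    {"name": "q1", "choiceList": {"choices": [
        {"name": "yes", "label": "Yes!"},
        {"name": "no", "label": "No"},
    ]}},
]
merged = sketch_choice_labels(rows)
print(json.dumps(merged, indent=2))
```

Here the second `q1` row both adds the `no` choice and replaces the label for `yes`.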
data_bridges_knots.labels.map_value_labels(survey_df, xlsform_df)
Map coded (for example, numerical) choice values in a survey DataFrame to human-readable labels based on the XLSForm choices.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| survey_df | DataFrame | The survey data with coded values. | required |
| xlsform_df | DataFrame | DataFrame containing the `name` and `choiceList` columns. | required |
Raises:
| Type | Description |
|---|---|
| KeyError | If required columns are missing. |
Example
>>> import pandas as pd
>>> from data_bridges_knots.labels import map_value_labels
>>> survey = pd.DataFrame({"q1": ["yes", "no"], "q2": ["a", "b"]})
>>> xls = pd.DataFrame({
... "name": ["q1", "q2"],
... "choiceList": [
... {"choices": [{"name": "yes", "label": "Yes"}, {"name": "no", "label": "No"}]},
... {"choices": [{"name": "a", "label": "Option A"}, {"name": "b", "label": "Option B"}]}
... ]
... })
>>> map_value_labels(survey, xls)
Returns:
| Type | Description |
|---|---|
| DataFrame | `pandas.DataFrame`: A copy of `survey_df` in which columns with an XLSForm mapping have codes replaced by labels. |
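The replacement step itself can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: `sketch_map_value_labels` is a hypothetical helper, survey rows are plain dictionaries rather than a DataFrame, and values without a matching label are assumed to be left unchanged.

```python
def sketch_map_value_labels(records, choice_labels):
    """Hypothetical stand-in: replace coded values by labels, column by column."""
    mapped = []
    for record in records:
        new_record = {}
        for column, value in record.items():
            # Columns without an XLSForm mapping get an empty lookup table.
            column_labels = choice_labels.get(column, {})
            # Assumption: values without a matching label are kept unchanged.
            new_record[column] = column_labels.get(value, value)
        mapped.append(new_record)
    return mapped  # new list of new dicts; the input records are not modified


choice_labels = {
    "q1": {"yes": "Yes", "no": "No"},
    "q2": {"a": "Option A", "b": "Option B"},
}
survey = [
    {"q1": "yes", "q2": "a", "age": 31},
    {"q1": "no", "q2": "b", "age": 47},
]
labelled = sketch_map_value_labels(survey, choice_labels)
print(labelled[0])  # {'q1': 'Yes', 'q2': 'Option A', 'age': 31}
```

The `age` column has no entry in `choice_labels`, so it passes through untouched, which mirrors the "copy with mapped columns replaced" behaviour described in the Returns table.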