Stats¶

`stats` ¶

Core poverty and inequality statistics functions.

`get_stats(country='all', year='all', povline=None, popshare=None, fill_gaps=False, nowcast=False, subgroup=None, welfare_type='all', reporting_level='all', version=None, ppp_version=None, release_version=None, api_version=API_VERSION, fmt='arrow', simplify=True, server=None, dataframe_type='pandas')` ¶

Get poverty and inequality statistics from the PIP API.

This is the primary function for querying household survey-based poverty and inequality estimates. It mirrors pipr::get_stats().

Parameters:

Name	Type	Description	Default
`country`	`str \| list[str]`	ISO3 country code(s) or `"all"`.	`'all'`
`year`	`str \| int \| list[int]`	Survey year(s) or `"all"`.	`'all'`
`povline`	`float \| None`	Poverty line in 2017 PPP USD per day.	`None`
`popshare`	`float \| None`	Proportion of the population below the poverty line. When set, povline is ignored.	`None`
`fill_gaps`	`bool`	If `True`, interpolate/extrapolate values for years without survey data.	`False`
`nowcast`	`bool`	If `True`, include nowcast estimates (implies `fill_gaps=True`).	`False`
`subgroup`	`str \| None`	Pre-defined aggregation. Either `"wb_regions"` or `"none"`. When set, routes to the `pip-grp` endpoint.	`None`
`welfare_type`	`str`	Welfare concept — `"all"`, `"income"`, or `"consumption"`.	`'all'`
`reporting_level`	`str`	Geographic level — `"all"`, `"national"`, `"urban"`, or `"rural"`.	`'all'`
`version`	`str \| None`	Data version string (see :func:`~povineq.info.get_versions`).	`None`
`ppp_version`	`int \| None`	PPP base year.	`None`
`release_version`	`str \| None`	Release date in `YYYYMMDD` format.	`None`
`api_version`	`str`	API version (only `"v1"` currently).	`API_VERSION`
`fmt`	`str`	Response format — `"arrow"` (default), `"json"`, or `"csv"`.	`'arrow'`
`simplify`	`bool`	If `True` (default), return a DataFrame. If `False`, return a :class:`~povineq._response.PIPResponse` wrapper.	`True`
`server`	`str \| None`	Server target — `None`/`"prod"`, `"qa"`, or `"dev"`.	`None`
`dataframe_type`	`Literal['pandas', 'polars']`	`"pandas"` (default) or `"polars"`.	`'pandas'`

Returns:

Name	Type	Description
`A`	`DataFrame \| PIPResponse`	class:`~pandas.DataFrame` when simplify is `True`, or a
	`DataFrame \| PIPResponse`	class:`~povineq._response.PIPResponse` when simplify is `False`.

Raises:

Type	Description
`PIPValidationError`	If parameter values are invalid.
`PIPAPIError`	If the API returns a structured error response.
`PIPRateLimitError`	If the rate limit is exceeded after retries.
`PIPConnectionError`	If the network is unreachable.

Example

import povineq df = povineq.get_stats(country="AGO", year=2000) df = povineq.get_stats(country="all", year="all", fill_gaps=True) df = povineq.get_stats(country="all", subgroup="wb_regions")

Source code in src/povineq/stats.py

def get_stats(
    country: str | list[str] = "all",
    year: str | int | list[int] = "all",
    povline: float | None = None,
    popshare: float | None = None,
    fill_gaps: bool = False,
    nowcast: bool = False,
    subgroup: str | None = None,
    welfare_type: str = "all",
    reporting_level: str = "all",
    version: str | None = None,
    ppp_version: int | None = None,
    release_version: str | None = None,
    api_version: str = API_VERSION,
    fmt: str = "arrow",
    simplify: bool = True,
    server: str | None = None,
    dataframe_type: Literal["pandas", "polars"] = "pandas",
) -> pd.DataFrame | PIPResponse:
    """Get poverty and inequality statistics from the PIP API.

    This is the primary function for querying household survey-based poverty
    and inequality estimates. It mirrors ``pipr::get_stats()``.

    Args:
        country: ISO3 country code(s) or ``"all"``.
        year: Survey year(s) or ``"all"``.
        povline: Poverty line in 2017 PPP USD per day.
        popshare: Proportion of the population below the poverty line.
            When set, *povline* is ignored.
        fill_gaps: If ``True``, interpolate/extrapolate values for years
            without survey data.
        nowcast: If ``True``, include nowcast estimates (implies
            ``fill_gaps=True``).
        subgroup: Pre-defined aggregation. Either ``"wb_regions"`` or
            ``"none"``. When set, routes to the ``pip-grp`` endpoint.
        welfare_type: Welfare concept — ``"all"``, ``"income"``, or
            ``"consumption"``.
        reporting_level: Geographic level — ``"all"``, ``"national"``,
            ``"urban"``, or ``"rural"``.
        version: Data version string (see :func:`~povineq.info.get_versions`).
        ppp_version: PPP base year.
        release_version: Release date in ``YYYYMMDD`` format.
        api_version: API version (only ``"v1"`` currently).
        fmt: Response format — ``"arrow"`` (default), ``"json"``, or
            ``"csv"``.
        simplify: If ``True`` (default), return a DataFrame. If ``False``,
            return a :class:`~povineq._response.PIPResponse` wrapper.
        server: Server target — ``None``/``"prod"``, ``"qa"``, or ``"dev"``.
        dataframe_type: ``"pandas"`` (default) or ``"polars"``.

    Returns:
        A :class:`~pandas.DataFrame` when *simplify* is ``True``, or a
        :class:`~povineq._response.PIPResponse` when *simplify* is ``False``.

    Raises:
        PIPValidationError: If parameter values are invalid.
        PIPAPIError: If the API returns a structured error response.
        PIPRateLimitError: If the rate limit is exceeded after retries.
        PIPConnectionError: If the network is unreachable.

    Example:
        >>> import povineq
        >>> df = povineq.get_stats(country="AGO", year=2000)
        >>> df = povineq.get_stats(country="all", year="all", fill_gaps=True)
        >>> df = povineq.get_stats(country="all", subgroup="wb_regions")
    """
    logger.debug(
        "get_stats",
        country=country,
        year=year,
        povline=povline,
        popshare=popshare,
        fill_gaps=fill_gaps,
        nowcast=nowcast,
        subgroup=subgroup,
    )

    # Validate and apply business rules via pydantic
    params = StatsParams(
        country=country,
        year=year,
        povline=povline,
        popshare=popshare,
        fill_gaps=fill_gaps,
        nowcast=nowcast,
        subgroup=subgroup,
        welfare_type=welfare_type,
        reporting_level=reporting_level,
        version=version,
        ppp_version=ppp_version,
        release_version=release_version,
        api_version=api_version,
        format=fmt,
    )

    # Route endpoint
    if params.subgroup is not None:
        endpoint = ENDPOINT_PIP_GRP
        group_by = "wb" if params.subgroup == "wb_regions" else params.subgroup
    else:
        endpoint = ENDPOINT_PIP
        group_by = None

    # Build query params (exclude subgroup; use group_by instead)
    query = params.to_query_params()
    query.pop("subgroup", None)
    query.pop("nowcast", None)  # nowcast is not an API query param
    if group_by is not None:
        query["group_by"] = group_by

    response = build_and_execute(endpoint, query, server=server, api_version=api_version)

    out = parse_response(response, simplify=simplify, dataframe_type=dataframe_type)

    # When fill_gaps=False (and simplify=True) filter out nowcast rows
    # pipr does this because estimate_type is only returned when fill_gaps=True
    if params.nowcast is False and simplify and isinstance(out, pd.DataFrame):
        if "estimate_type" in out.columns:
            out = out[~out["estimate_type"].str.contains("nowcast", na=False)].copy()

    return out

`get_wb(year='all', povline=None, version=None, ppp_version=None, release_version=None, api_version=API_VERSION, fmt='json', simplify=True, server=None, dataframe_type='pandas')` ¶

Get World Bank regional and global aggregate statistics.

Shorthand for get_stats(subgroup="wb_regions"). Mirrors pipr::get_wb().

Parameters:

Name	Type	Description	Default
`year`	`str \| int \| list[int]`	Year(s) or `"all"`.	`'all'`
`povline`	`float \| None`	Poverty line in 2017 PPP USD per day.	`None`
`version`	`str \| None`	Data version string.	`None`
`ppp_version`	`int \| None`	PPP base year.	`None`
`release_version`	`str \| None`	Release date in `YYYYMMDD` format.	`None`
`api_version`	`str`	API version.	`API_VERSION`
`fmt`	`str`	Response format — `"json"` (default) or `"csv"`.	`'json'`
`simplify`	`bool`	If `True` (default), return a DataFrame.	`True`
`server`	`str \| None`	Server target.	`None`
`dataframe_type`	`Literal['pandas', 'polars']`	`"pandas"` or `"polars"`.	`'pandas'`

Returns:

Type	Description
`DataFrame \| PIPResponse`	A DataFrame of WB regional/global aggregates.

Example

import povineq df = povineq.get_wb()

Source code in src/povineq/stats.py

def get_wb(
    year: str | int | list[int] = "all",
    povline: float | None = None,
    version: str | None = None,
    ppp_version: int | None = None,
    release_version: str | None = None,
    api_version: str = API_VERSION,
    fmt: str = "json",
    simplify: bool = True,
    server: str | None = None,
    dataframe_type: Literal["pandas", "polars"] = "pandas",
) -> pd.DataFrame | PIPResponse:
    """Get World Bank regional and global aggregate statistics.

    Shorthand for ``get_stats(subgroup="wb_regions")``.
    Mirrors ``pipr::get_wb()``.

    Args:
        year: Year(s) or ``"all"``.
        povline: Poverty line in 2017 PPP USD per day.
        version: Data version string.
        ppp_version: PPP base year.
        release_version: Release date in ``YYYYMMDD`` format.
        api_version: API version.
        fmt: Response format — ``"json"`` (default) or ``"csv"``.
        simplify: If ``True`` (default), return a DataFrame.
        server: Server target.
        dataframe_type: ``"pandas"`` or ``"polars"``.

    Returns:
        A DataFrame of WB regional/global aggregates.

    Example:
        >>> import povineq
        >>> df = povineq.get_wb()
    """
    query: dict[str, str] = {}
    if year != "all":
        query["year"] = ",".join(str(y) for y in year) if isinstance(year, list) else str(year)
    else:
        query["year"] = "all"

    if povline is not None:
        query["povline"] = str(povline)
    if version is not None:
        query["version"] = version
    if ppp_version is not None:
        query["ppp_version"] = str(ppp_version)
    if release_version is not None:
        query["release_version"] = release_version
    query["format"] = fmt
    query["group_by"] = "wb"

    response = build_and_execute(
        ENDPOINT_PIP_GRP, query, server=server, api_version=api_version
    )
    return parse_response(response, simplify=simplify, dataframe_type=dataframe_type)

`get_agg(year='all', povline=None, version=None, ppp_version=None, release_version=None, aggregate=None, api_version=API_VERSION, fmt='json', simplify=True, server=None, dataframe_type='pandas')` ¶

Get custom aggregate statistics (FCV, regional, vintage, etc.).

Mirrors pipr::get_agg().

Parameters:

Name	Type	Description	Default
`year`	`str \| int \| list[int]`	Year(s) or `"all"`.	`'all'`
`povline`	`float \| None`	Poverty line in 2017 PPP USD per day.	`None`
`version`	`str \| None`	Data version string.	`None`
`ppp_version`	`int \| None`	PPP base year.	`None`
`release_version`	`str \| None`	Release date in `YYYYMMDD` format.	`None`
`aggregate`	`str \| None`	Aggregate name (e.g. `"fcv"`).	`None`
`api_version`	`str`	API version.	`API_VERSION`
`fmt`	`str`	Response format — `"json"` (default) or `"csv"`.	`'json'`
`simplify`	`bool`	If `True` (default), return a DataFrame.	`True`
`server`	`str \| None`	Server target.	`None`
`dataframe_type`	`Literal['pandas', 'polars']`	`"pandas"` or `"polars"`.	`'pandas'`

Returns:

Type	Description
`DataFrame \| PIPResponse`	A DataFrame of custom aggregate statistics.

Example

import povineq df = povineq.get_agg(aggregate="fcv", server="qa")

Source code in src/povineq/stats.py

def get_agg(
    year: str | int | list[int] = "all",
    povline: float | None = None,
    version: str | None = None,
    ppp_version: int | None = None,
    release_version: str | None = None,
    aggregate: str | None = None,
    api_version: str = API_VERSION,
    fmt: str = "json",
    simplify: bool = True,
    server: str | None = None,
    dataframe_type: Literal["pandas", "polars"] = "pandas",
) -> pd.DataFrame | PIPResponse:
    """Get custom aggregate statistics (FCV, regional, vintage, etc.).

    Mirrors ``pipr::get_agg()``.

    Args:
        year: Year(s) or ``"all"``.
        povline: Poverty line in 2017 PPP USD per day.
        version: Data version string.
        ppp_version: PPP base year.
        release_version: Release date in ``YYYYMMDD`` format.
        aggregate: Aggregate name (e.g. ``"fcv"``).
        api_version: API version.
        fmt: Response format — ``"json"`` (default) or ``"csv"``.
        simplify: If ``True`` (default), return a DataFrame.
        server: Server target.
        dataframe_type: ``"pandas"`` or ``"polars"``.

    Returns:
        A DataFrame of custom aggregate statistics.

    Example:
        >>> import povineq
        >>> df = povineq.get_agg(aggregate="fcv", server="qa")
    """
    params = AggParams(
        year=year,
        povline=povline,
        version=version,
        ppp_version=ppp_version,
        release_version=release_version,
        aggregate=aggregate,
        api_version=api_version,
        format=fmt,
    )

    query = params.to_query_params()
    query.pop("api_version", None)

    response = build_and_execute(
        ENDPOINT_PIP_GRP, query, server=server, api_version=api_version
    )
    return parse_response(response, simplify=simplify, dataframe_type=dataframe_type)

Stats¶

stats ¶

get_wb(year='all', povline=None, version=None, ppp_version=None, release_version=None, api_version=API_VERSION, fmt='json', simplify=True, server=None, dataframe_type='pandas') ¶

get_agg(year='all', povline=None, version=None, ppp_version=None, release_version=None, aggregate=None, api_version=API_VERSION, fmt='json', simplify=True, server=None, dataframe_type='pandas') ¶

`stats` ¶

`get_wb(year='all', povline=None, version=None, ppp_version=None, release_version=None, api_version=API_VERSION, fmt='json', simplify=True, server=None, dataframe_type='pandas')` ¶

`get_agg(year='all', povline=None, version=None, ppp_version=None, release_version=None, aggregate=None, api_version=API_VERSION, fmt='json', simplify=True, server=None, dataframe_type='pandas')` ¶