Interpolation

The interpolation module provides tools for interpolating data in pandas DataFrames. It maps data onto new x-values using methods such as linear, quadratic, or cubic interpolation. The interpolate function takes a DataFrame and returns a new DataFrame, which makes it convenient to resample datasets onto a common grid or to estimate values between measured points.

interpolate(df, x_column, onto, method='linear')

Interpolates all columns of a DataFrame onto new x values.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| df | DataFrame | The input DataFrame. | required |
| x_column | str | The name of the column to use as the x-axis. | required |
| onto | Union[ndarray, Sequence] | The new x values to interpolate onto. | required |
| method | str | The interpolation method to use. Passed to scipy.interpolate.interp1d(kind=...). | 'linear' |
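To illustrate how the choice of method changes the result, the sketch below calls scipy.interpolate.interp1d directly with two different kind values. The sample data (y = x³ on five points) is made up for illustration. Note that, as far as SciPy's behavior goes, higher-order kinds need enough points: 'quadratic' requires at least three and 'cubic' at least four.

```python
import numpy as np
from scipy.interpolate import interp1d

# Illustrative data: y = x**3 sampled at x = 0, 1, 2, 3, 4.
x = np.arange(5.0)
y = x ** 3

x_new = 1.5  # true value is 1.5**3 = 3.375

# 'linear' connects neighboring points with straight lines.
linear = float(interp1d(x, y, kind="linear")(x_new))

# 'cubic' fits a cubic spline, which follows the curvature of the data.
cubic = float(interp1d(x, y, kind="cubic")(x_new))

print(f"linear: {linear:.3f}, cubic: {cubic:.3f}")
```

On smooth, curved data the linear estimate overshoots here, while the cubic spline recovers the underlying curve much more closely. For noisy data, 'linear' is often the safer choice because splines can oscillate between points.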

Returns:

| Type | Description |
|---|---|
| DataFrame | A new DataFrame with interpolated values. |
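A minimal usage sketch follows. So that it runs on its own, it includes a local copy of the function as listed on this page; in practice you would import it from the package instead (the import path is not stated here, so it is left out). The column names and values are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d


def interpolate(df, x_column, onto, method="linear"):
    # Local copy of the function listed on this page, so the example is standalone.
    if x_column not in df.columns:
        raise ValueError(f"Column '{x_column}' not found in DataFrame.")
    onto = np.asarray(onto)
    data = {x_column: onto}
    for column in df.columns:
        if column != x_column:
            f = interp1d(df[x_column], df[column], kind=method,
                         bounds_error=False, fill_value="extrapolate")
            data[column] = f(onto)
    return pd.DataFrame(data)


# Hypothetical measurement: a temperature sweep with two signal columns.
df = pd.DataFrame({
    "temperature": [0.0, 10.0, 20.0, 30.0],
    "resistance": [1.0, 2.0, 4.0, 8.0],
    "voltage": [0.0, 1.0, 2.0, 3.0],
})

# Resample every non-x column onto the midpoints of the original grid.
result = interpolate(df, "temperature", onto=[5.0, 15.0, 25.0])
print(result)
```

The returned DataFrame has the x column first, holding the values passed as onto, followed by the remaining columns in their original order. All non-x columns are expected to be numeric, since each one is handed to interp1d.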

Source code in src/quantalyze/core/interpolation.py
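from typing import Sequence, Union

import numpy as np
import pandas as pd
from scipy.interpolate import interp1d


def interpolate(df: pd.DataFrame, x_column: str, onto: Union[np.ndarray, Sequence], method: str = 'linear') -> pd.DataFrame: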
    """
    Interpolates all columns of a DataFrame onto new x values.

    Args:
        df (pd.DataFrame): The input DataFrame.
        x_column (str): The name of the column to use as the x-axis.
        onto (Union[np.ndarray, Sequence]): The new x values to interpolate onto.
        method (str): The interpolation method to use (default is 'linear'). Passed to scipy.interpolate.interp1d(kind=...).

    Returns:
        pd.DataFrame: A new DataFrame with interpolated values.
    """
    if x_column not in df.columns:
        raise ValueError(f"Column '{x_column}' not found in DataFrame.")

    onto = np.asarray(onto)  # Ensure 'onto' is converted to a NumPy array
    interpolated_data = {x_column: onto}
    for column in df.columns:
        if column != x_column:
            interpolator = interp1d(df[x_column], df[column], kind=method, bounds_error=False, fill_value="extrapolate")
            interpolated_data[column] = interpolator(onto)

    return pd.DataFrame(interpolated_data)
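
The source above passes bounds_error=False and fill_value="extrapolate" to interp1d, so x values outside the range of the input data are extrapolated rather than rejected or set to NaN. The sketch below, using made-up data and calling interp1d directly, contrasts that setting with SciPy's default fill value.

```python
import numpy as np
from scipy.interpolate import interp1d

# Illustrative data on the range x = 0..2.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 10.0, 20.0])

# Same options as in the listing above: continue the curve past the data range.
extrapolating = interp1d(x, y, kind="linear",
                         bounds_error=False, fill_value="extrapolate")

# For comparison: without fill_value, out-of-range points become NaN.
nan_filling = interp1d(x, y, kind="linear", bounds_error=False)

outside = 3.0  # lies beyond the last data point
extrapolated = float(extrapolating(outside))
filled = float(nan_filling(outside))

print(extrapolated, filled)
```

Extrapolated values are only as trustworthy as the assumption that the trend continues, so it can be worth restricting onto to the measured x range when that assumption is doubtful.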