AutoSliceMixin - Automatic Slicing for Your Data Classes¶
AutoSliceMixin saves you from writing a repetitive __getitem__ method for each of your data classes.
Quick Start¶
from alpenstock.auto_slice import AutoSliceMixin, SliceHint
import attrs
from typing import Annotated
import numpy as np

@attrs.define
class Weather(AutoSliceMixin):
    city: str  # scalar attribute (preserved during slicing)
    postcode: int  # scalar attribute (preserved during slicing)
    temperatures: list[float]  # shape (T,) - will be sliced
    humidities: np.ndarray  # shape (T,) - will be sliced
    sitewise_temperatures: Annotated[
        np.ndarray, SliceHint(axis=1)
    ]  # shape (N, T) - will be sliced along axis 1

# Create a sample weather data instance
data = Weather(
    city="Gotham",
    postcode=12345,
    temperatures=[1, 1, 4, 5, 1, 4],
    humidities=np.array([10, 10, 40, 50, 10, 40]),
    sitewise_temperatures=np.array([[3, 1, 4, 1, 5, 9], [2, 7, 1, 8, 2, 8]]),
)
# Slicing result
subset = data[1:4]
assert subset.city == "Gotham"
assert subset.postcode == 12345
assert subset.temperatures == [1, 4, 5]
assert np.allclose(subset.humidities, np.array([10, 40, 50]))
assert np.allclose(subset.sitewise_temperatures, np.array([[1, 4, 1], [7, 1, 8]]))
subset
Weather(city='Gotham', postcode=12345, temperatures=[1, 4, 5], humidities=array([10, 40, 50]), sitewise_temperatures=array([[1, 4, 1], [7, 1, 8]]))
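For comparison, here is a rough sketch of the kind of hand-written __getitem__ that the mixin is meant to replace. It is not taken from the library: ManualWeather is a hypothetical class that uses the standard-library dataclasses module and plain lists, and it handles slice keys only.

```python
from dataclasses import dataclass, replace


@dataclass
class ManualWeather:  # hypothetical class, for illustration only
    city: str                                 # scalar, kept as-is
    temperatures: list[float]                 # shape (T,)
    sitewise_temperatures: list[list[float]]  # shape (N, T)

    def __getitem__(self, key: slice) -> "ManualWeather":
        if not isinstance(key, slice):
            raise ValueError("this sketch only handles slice keys")
        return replace(
            self,
            # slice the time axis of the 1-D attribute
            temperatures=self.temperatures[key],
            # slice every row, i.e. along axis 1
            sitewise_temperatures=[row[key] for row in self.sitewise_temperatures],
        )


manual = ManualWeather(
    city="Gotham",
    temperatures=[1, 1, 4, 5, 1, 4],
    sitewise_temperatures=[[3, 1, 4, 1, 5, 9], [2, 7, 1, 8, 2, 8]],
)
manual_subset = manual[1:4]
```

Every new sliceable attribute needs another line in this method, which is the repetition the mixin removes.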
NumPy-like Slicing¶
AutoSliceMixin also supports fancy indexing using either a list of indices or a boolean mask:
# Using a list of indices
subset = data[[0, 3, 2, -1]]
subset
Weather(city='Gotham', postcode=12345, temperatures=[1, 5, 4, 4], humidities=array([10, 50, 40, 40]), sitewise_temperatures=array([[3, 1, 4, 9], [2, 8, 1, 8]]))
# Using a boolean mask
mask = [True, False, True, False, False, True]
subset = data[mask]
subset
Weather(city='Gotham', postcode=12345, temperatures=[1, 4, 4], humidities=array([10, 40, 40]), sitewise_temperatures=array([[3, 4, 9], [2, 1, 8]]))
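Plain Python lists do not accept index lists or boolean masks as keys, so the results for temperatures above imply some extra handling. The sketch below shows one way such handling could work. It is an assumption, not the library's actual implementation, and take_from_list is a hypothetical helper name.

```python
def take_from_list(values, key):
    """Apply a slice, an index list, or a boolean mask to a plain list.

    Hypothetical helper for illustration only.
    """
    if isinstance(key, slice):
        return values[key]
    key = list(key)
    # bool is a subclass of int, so check for masks before treating keys as indices
    if key and all(isinstance(k, bool) for k in key):
        if len(key) != len(values):
            raise IndexError("boolean mask length must match the data length")
        return [v for v, keep in zip(values, key) if keep]
    return [values[i] for i in key]


temps = [1, 1, 4, 5, 1, 4]
by_indices = take_from_list(temps, [0, 3, 2, -1])
by_mask = take_from_list(temps, [True, False, True, False, False, True])
```

With the sample data, by_indices and by_mask match the temperatures shown in the two outputs above.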
Indexing is prohibited intentionally¶
AutoSliceMixin only supports slicing semantics (no single-index access). This means you cannot write data[0] to get a single item. If you try, a ValueError is raised:
@attrs.define
class SimpleData(AutoSliceMixin):
    name: str
    data: list[float]

data = SimpleData(name="no indexing", data=[0, 1, 2, 4])
try:
    data[5]
except Exception as e:
    assert isinstance(e, ValueError)
    print("ERROR: ", e)
ERROR: `AutoSliceMixin` only supports slicing semantics, but key type of <class 'int'> implies indexing
Indexing is prohibited intentionally because it reduces the dimension of an array-like object and silently changes its type. For example, indexing np.array([0, 1, 2], dtype=np.int64) with 1 yields np.int64(1), a scalar rather than an array. Type changes of this kind can introduce subtle bugs, so indexing is banned.
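The difference between the two kinds of key can be checked directly in NumPy. This short snippet uses only standard NumPy behavior; the variable names are illustrative.

```python
import numpy as np

arr = np.array([0, 1, 2], dtype=np.int64)

indexed = arr[1]   # integer key drops the axis: the result is a NumPy scalar
sliced = arr[1:2]  # slice key keeps the axis: the result is still a 1-D array

print(type(indexed).__name__, np.ndim(indexed))
print(type(sliced).__name__, sliced.shape)
```

Code that expects an array (for example, code that calls len() or iterates) keeps working on sliced but fails on indexed.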
Customization¶
AutoSliceMixin allows you to customize how slicing behaves for specific attributes using SliceHint. Here's an example:
# Define a custom slicing function for strings
def fancy_slice_for_str(value: str, key, hint: SliceHint = None):
    if isinstance(key, slice):
        return value[key]

    value = np.asarray(list(value))
    rst = "".join(value[key])
    return rst

@attrs.define
class Weather(AutoSliceMixin):
    # treated as a scalar
    city: str

    # shape (T,) array, slicing enabled
    temperatures: list[float]

    # shape (T,) string, slicing enabled manually
    raining: Annotated[str, SliceHint(func=fancy_slice_for_str)]

    # shape (H, W) array, copied to the slicing result (treated as scalar)
    site_image: Annotated[np.ndarray, SliceHint(func="copy")]

# Create sample data
data = Weather(
    city="ga kuen to shi",
    temperatures=[15, 20, 57, 15],
    raining="RSWW",  # raining, sunny, windy, windy
    site_image=np.array([[0, 1], [3, 4]]),
)

# Try different slicing operations
subset = data[1:-1]
assert subset.city == "ga kuen to shi"
assert subset.temperatures == [20, 57]
assert subset.raining == "SW"
assert np.allclose(subset.site_image, [[0, 1], [3, 4]])
In this example, we show several ways to customize slicing behavior:
- Custom Slicing Function: The raining attribute uses a custom function, fancy_slice_for_str, that handles slicing for string data. Plain slices are applied to the string directly; for other keys the string is first converted to a character array.
- Copy Instead of Slice: The site_image attribute is configured to be copied instead of sliced using SliceHint(func="copy"). This is useful for attributes that should remain unchanged across all slices.
- Default Behavior: The temperatures list is sliced using the default behavior, which works well for most array-like objects.
The SliceHint annotation provides a flexible way to customize how each attribute behaves during slicing operations.
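As a further illustration, a custom function for another container type could follow the same (value, key, hint) signature used by fancy_slice_for_str above. The sketch below is hypothetical: slice_dict_of_lists is not part of the library, and it is exercised on its own here rather than through the mixin.

```python
def slice_dict_of_lists(value, key, hint=None):
    """Apply a slice key to every list stored in a dict.

    Hypothetical custom slicing function; it mirrors the (value, key, hint)
    signature of the example above and handles slice keys only.
    """
    if not isinstance(key, slice):
        raise ValueError("this sketch only handles slice keys")
    return {name: series[key] for name, series in value.items()}


extras = {"wind": [1, 2, 3, 4], "rain": [0, 0, 1, 1]}
sliced_extras = slice_dict_of_lists(extras, slice(1, -1))
```

Following the pattern of the raining attribute, such a function would presumably be attached with Annotated[dict, SliceHint(func=slice_dict_of_lists)].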
Limitations¶
Current limitations of AutoSliceMixin:
- Framework Support: Currently, only classes decorated with attrs.define are supported. Support for pydantic.BaseModel is planned for future releases.
- Array-like Objects: While AutoSliceMixin works with most array-like objects (lists, numpy arrays, etc.), custom objects must implement proper slicing behavior to work correctly with the mixin.
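The page does not spell out what "proper slicing behavior" requires. One reasonable reading, assumed here, is that a custom container should return an object of its own type when given a slice key. Readings below is a hypothetical container written under that assumption.

```python
class Readings:
    """Hypothetical custom container that stays a Readings when sliced."""

    def __init__(self, values):
        self.values = list(values)

    def __len__(self):
        return len(self.values)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # return the same container type so the attribute's type is preserved
            return Readings(self.values[key])
        raise TypeError("this sketch only handles slice keys")


readings = Readings([1, 2, 3, 4])
readings_subset = readings[1:3]
```

Index lists and boolean masks are left out of this sketch; a container used with those keys would need to handle them as well.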