refactor(pipeline): enforce ProcessorStep inheritance for pipeline steps (#1862)

- Updated DataProcessorPipeline.__post_init__ to require that every step inherits from ProcessorStep; previously any callable was accepted. This makes the step contract explicit and improves type safety.
- Adjusted tests to utilize a MockTokenizerProcessorStep that adheres to the ProcessorStep interface, ensuring consistent behavior across tests.
- Refactored various mock step classes in tests to inherit from ProcessorStep for improved consistency and maintainability.
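
The check described above can be sketched in isolation. This is a minimal illustration, not the project's actual code: `ProcessorStep` here is a simplified stand-in for the real base class, and `MiniPipeline`, `AddOneStep`, and `try_build` are hypothetical names introduced only for this example. It shows that a plain callable, which the old `callable(step)` check would have accepted, is now rejected.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


class ProcessorStep(ABC):
    """Simplified stand-in for the real base class; only __call__ is modeled."""

    @abstractmethod
    def __call__(self, transition: dict[str, Any]) -> dict[str, Any]: ...


class AddOneStep(ProcessorStep):
    """Hypothetical step that inherits from ProcessorStep, so it is accepted."""

    def __call__(self, transition: dict[str, Any]) -> dict[str, Any]:
        return {**transition, "value": transition["value"] + 1}


@dataclass
class MiniPipeline:
    """Hypothetical, reduced pipeline that applies the same inheritance check."""

    steps: list[Any] = field(default_factory=list)

    def __post_init__(self):
        for i, step in enumerate(self.steps):
            # Being callable is no longer enough; the step must be a ProcessorStep.
            if not isinstance(step, ProcessorStep):
                raise TypeError(f"Step {i} ({type(step).__name__}) must inherit from ProcessorStep")


def try_build(steps: list[Any]) -> str:
    """Return "ok" if the pipeline accepts the steps, else the error message."""
    try:
        MiniPipeline(steps)
        return "ok"
    except TypeError as exc:
        return str(exc)


accepted = try_build([AddOneStep()])
# A lambda is callable but does not inherit from ProcessorStep.
rejected = try_build([AddOneStep(), lambda transition: transition])
```

Under these assumptions, test mocks that were previously bare callables need to subclass `ProcessorStep`, which is what the test changes in this commit do.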
Adil Zouitine
2025-09-04 16:22:03 +02:00
committed by GitHub
parent fc43246942
commit 332ca4ccc5
4 changed files with 85 additions and 38 deletions

@@ -731,11 +731,8 @@ class DataProcessorPipeline(ModelHubMixin, Generic[TOutput]):
     def __post_init__(self):
         for i, step in enumerate(self.steps):
-            if not callable(step):
-                # TODO(steven): This should instead check isinstance(step, ProcessorStep), test need to be updated
-                raise TypeError(
-                    f"Step {i} ({type(step).__name__}) must define __call__(transition) -> EnvTransition"
-                )
+            if not isinstance(step, ProcessorStep):
+                raise TypeError(f"Step {i} ({type(step).__name__}) must inherit from ProcessorStep")
     def transform_features(self, initial_features: dict[str, PolicyFeature]) -> dict[str, PolicyFeature]:
         """