module
Contains functions responsible for validating data types.
Module: aisp.utils.validation
Import: from aisp.utils import validation
Functions
detect_vector_data_type
def detect_vector_data_type(vector: npt.NDArray) -> FeatureType:
...
Detect the type of data in a vector.
The function classifies the vector as one of the following feature types:
- Binary features: boolean values or integers restricted to 0 and 1.
- Continuous features: floating-point values in the normalized range [0.0, 1.0].
- Ranged features: floating-point values outside the normalized range.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| vector | npt.NDArray | - | An array containing the data to be classified. |
Returns
| Type | Description |
|---|---|
| FeatureType | The data type of the vector: "binary-features", "continuous-features", or "ranged-features". |
Raises
| Exception | Description |
|---|---|
| UnsupportedTypeError | If the data type of the vector is not supported by the function. |
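
The classification rules above can be sketched with plain NumPy. This is a minimal illustration and not the library's implementation: the name `sketch_detect_vector_data_type` is hypothetical, and the built-in `TypeError` stands in for `UnsupportedTypeError`. Integers other than 0 and 1 are rejected here, which is an assumption, since the description does not say how they are handled.

```python
import numpy as np


def sketch_detect_vector_data_type(vector: np.ndarray) -> str:
    """Classify a vector using the three rules listed in the description."""
    # Boolean arrays are binary by definition.
    if vector.dtype == np.bool_:
        return "binary-features"
    # Integer arrays count as binary only when every value is 0 or 1.
    if np.issubdtype(vector.dtype, np.integer) and np.isin(vector, [0, 1]).all():
        return "binary-features"
    if np.issubdtype(vector.dtype, np.floating):
        # Floats entirely inside [0.0, 1.0] are treated as normalized data.
        if np.all(vector >= 0.0) and np.all(vector <= 1.0):
            return "continuous-features"
        return "ranged-features"
    # Assumption: every other dtype is unsupported (stand-in for UnsupportedTypeError).
    raise TypeError(f"Unsupported data type: {vector.dtype}")


print(sketch_detect_vector_data_type(np.array([0, 1, 1])))    # binary-features
print(sketch_detect_vector_data_type(np.array([0.1, 0.75])))  # continuous-features
print(sketch_detect_vector_data_type(np.array([-2.0, 3.5])))  # ranged-features
```
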
check_array_type
def check_array_type(x, name: str = "X") -> npt.NDArray:
...
Ensure x is a NumPy array, converting it from a list if needed.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| x | Any | - | Array containing the samples and their characteristics. Shape: (n_samples, n_features). |
| name | str | 'X' | Variable name used in error messages. |
Returns
| Type | Description |
|---|---|
| npt.NDArray | The converted or validated array. |
Raises
| Exception | Description |
|---|---|
| TypeError | If x is neither an ndarray nor a list. |
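
As a rough sketch of the behavior described above, assuming only what the tables state (lists are converted, arrays pass through, anything else raises `TypeError`). The name `sketch_check_array_type` and the error message text are hypothetical.

```python
import numpy as np


def sketch_check_array_type(x, name: str = "X") -> np.ndarray:
    """Return x as an ndarray, converting it first if it is a list."""
    if isinstance(x, list):
        x = np.array(x)
    if not isinstance(x, np.ndarray):
        # The message wording is illustrative, not the library's.
        raise TypeError(f"{name} must be a numpy.ndarray or a list.")
    return x


samples = sketch_check_array_type([[0.1, 0.2], [0.3, 0.4]])
print(type(samples).__name__, samples.shape)  # ndarray (2, 2)
```
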
check_shape_match
def check_shape_match(x: npt.NDArray, y: npt.NDArray):
...
Ensure X and y have compatible first dimensions.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| x | npt.NDArray | - | Array containing the samples and their characteristics. Shape: (n_samples, n_features). |
| y | npt.NDArray | - | Array of target classes for x. Shape: (n_samples). |
Raises
| Exception | Description |
|---|---|
| TypeError | If x and y have incompatible shapes. |
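
One way to picture the check, assuming "compatible first dimensions" means the two arrays have the same number of samples. The name `sketch_check_shape_match` is hypothetical, and the exception type follows the Raises table above.

```python
import numpy as np


def sketch_check_shape_match(x: np.ndarray, y: np.ndarray) -> None:
    """Raise if x and y do not describe the same number of samples."""
    if x.shape[0] != y.shape[0]:
        raise TypeError(
            f"X has {x.shape[0]} samples but y has {y.shape[0]} labels."
        )


x = np.zeros((4, 3))           # 4 samples, 3 features
y = np.array([0, 1, 1, 0])     # 4 labels
sketch_check_shape_match(x, y)  # passes silently
```
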
check_feature_dimension
def check_feature_dimension(x: npt.NDArray, expected: int):
...
Ensure X has the expected number of features.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| x | npt.NDArray | - | Input array for prediction, containing the samples and their characteristics. Shape: (n_samples, n_features). |
| expected | int | - | Expected number of features per sample (columns in x). |
Raises
| Exception | Description |
|---|---|
| FeatureDimensionMismatch | If the number of features in X does not match the expected number. |
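
A minimal sketch of the feature-count check, assuming x is two-dimensional so that its column count is `x.shape[1]`. Both `sketch_check_feature_dimension` and the local `SketchFeatureDimensionMismatch` class are hypothetical stand-ins; the latter only mimics the role of `FeatureDimensionMismatch`.

```python
import numpy as np


class SketchFeatureDimensionMismatch(Exception):
    """Illustrative stand-in for the library's FeatureDimensionMismatch."""


def sketch_check_feature_dimension(x: np.ndarray, expected: int) -> None:
    """Raise if the number of columns in x differs from `expected`."""
    # Assumption: x has shape (n_samples, n_features).
    if x.shape[1] != expected:
        raise SketchFeatureDimensionMismatch(
            f"Expected {expected} features, received {x.shape[1]}."
        )


sketch_check_feature_dimension(np.zeros((5, 3)), expected=3)  # passes silently
```
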
check_binary_array
def check_binary_array(x: npt.NDArray):
...
Ensure X contains only 0 and 1.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| x | npt.NDArray | - | Array containing the samples. |
Raises
| Exception | Description |
|---|---|
| ValueError | If the array contains values other than 0 and 1. |
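
The binary check could look roughly like the following, using `np.isin` to test membership in {0, 1}. The function name `sketch_check_binary_array` and the message text are hypothetical; only the `ValueError` comes from the table above.

```python
import numpy as np


def sketch_check_binary_array(x: np.ndarray) -> None:
    """Raise ValueError if x holds anything other than 0 and 1."""
    # np.isin returns a boolean mask with the same shape as x.
    if not np.isin(x, [0, 1]).all():
        raise ValueError("The array must contain only the values 0 and 1.")


sketch_check_binary_array(np.array([[0, 1], [1, 1]]))  # passes silently
```

Boolean arrays also pass this sketch, because `True` and `False` compare equal to 1 and 0.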
check_value_range
def check_value_range(
x: npt.NDArray,
name: str = 'X',
min_value: float = 0.0,
max_value: float = 1.0
) -> None:
...
Ensure all values in x fall within the range defined by min_value and max_value.
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
| x | npt.NDArray | - | Array containing the samples. |
| name | str | 'X' | Name used in the error message. |
| min_value | float | 0.0 | Minimum allowed value. |
| max_value | float | 1.0 | Maximum allowed value. |
Raises
| Exception | Description |
|---|---|
| ValueError | If any value in the array falls outside the interval (min_value, max_value). |
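
A sketch of the range check under one explicit assumption: the bounds are treated as inclusive, so values equal to `min_value` or `max_value` are accepted. The source does not state whether the real function includes the endpoints. The name `sketch_check_value_range` and the message text are hypothetical.

```python
import numpy as np


def sketch_check_value_range(
    x: np.ndarray,
    name: str = "X",
    min_value: float = 0.0,
    max_value: float = 1.0,
) -> None:
    """Raise ValueError if any value of x lies outside the allowed range."""
    # Assumption: both endpoints are allowed (inclusive comparison).
    if not np.all((x >= min_value) & (x <= max_value)):
        raise ValueError(
            f"{name} must contain only values between {min_value} and {max_value}."
        )


sketch_check_value_range(np.array([0.0, 0.4, 1.0]))                      # passes silently
sketch_check_value_range(np.array([-3.0, 2.5]), min_value=-5, max_value=5)  # passes silently
```
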