I am using the scikit-fda Python package, and I'm trying to understand the class FDataGrid. In particular, I want to manipulate a given instance of the FDataGrid class (called X_prim) so that I can remove certain functions from it.
My first intuition was to try something simple like X = np.delete(X_prim, [0,1,2]) to delete the first three functions. Indeed, when I display the values using X.tolist(), everything seems to be right (I get the same output as with X_prim.tolist(), except that the deleted arrays are gone). However, when I try to use the new FDataGrid instance, I always get some kind of error. For instance,
from skfda.preprocessing.missing import MissingValuesInterpolation
nan_interp = MissingValuesInterpolation()
X_transformed = nan_interp.fit_transform(X_prim)
works perfectly, but when I try to run
from skfda.preprocessing.missing import MissingValuesInterpolation
nan_interp = MissingValuesInterpolation()
X_transformed = nan_interp.fit_transform(X)
I get the error "AttributeError: 'numpy.ndarray' object has no attribute 'data_matrix'". So apparently X is a plain NumPy array, not an FDataGrid as I thought it would be. This led me to think that I should instead edit the data_matrix of X_prim and then manually build a new instance. For this purpose I ran
matrix=X_prim.data_matrix
matrix = np.delete(matrix, [2,13,103,111])
X=skfda.FDataGrid(data_matrix=matrix)
but then running X.tolist() reveals that this is interpreted as an FDataGrid with just one function, to which all of the remaining values of all remaining functions belong. Of course the grid_points attribute also gets ruined. So, besides my original goal of removing certain functions from the FDataGrid, I am also wondering what exactly is going on here. I am just taking the matrix from the FDataGrid itself and removing some values, yet when I create a new FDataGrid instance from it, the matrix is interpreted in a completely different way. Why is this happening and how can I prevent it?
To delete functions, you can delete the corresponding rows of data_matrix and pass the result to FDataGrid.copy(), as follows.
import numpy as np
import skfda
fd, _ = skfda.datasets.fetch_growth(return_X_y=True)
keys = [1,2] # indices of function data to be deleted
new_data_matrix = np.delete(fd.data_matrix, keys, axis=0)
new_sample_names = np.delete(np.array(fd.sample_names), keys)
new_fd = fd.copy(data_matrix=new_data_matrix, sample_names=new_sample_names)
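As for what you observed: np.delete flattens its input when no axis is given, so your matrix became one-dimensional and was read as a single function. You also didn't specify the other important parameters, such as grid_points, which copy() carries over from the original object.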
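The effect of the axis argument can be checked on a toy array. This is only a minimal sketch: the array below is a made-up stand-in for data_matrix (5 functions, 4 grid points, 1 output dimension), not the growth dataset, and it uses NumPy alone.

```python
import numpy as np

# Hypothetical stand-in for FDataGrid.data_matrix:
# 5 functions, 4 grid points each, 1 output dimension.
matrix = np.arange(20, dtype=float).reshape(5, 4, 1)

# Without axis, np.delete works on the flattened array and removes
# two single values, so the per-function structure is lost.
flat = np.delete(matrix, [0, 1])
print(flat.shape)  # (18,)

# With axis=0, the first two functions are removed as whole rows
# and the remaining dimensions are preserved.
kept = np.delete(matrix, [0, 1], axis=0)
print(kept.shape)  # (3, 4, 1)
```

Passing the one-dimensional result to a constructor that expects one row per function is consistent with everything ending up in a single function, as described in the question.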