I keep getting the error "Indexing starts at 1 but found a 0" when fitting a model where the input or output started life as R objects (vectors, matrices, or arrays), but only when using a dataloader instead of the dataset itself. Here's a toy example:
library(torch)
library(luz) # For fit() function
x <- torch_randn(1, 9)
y <- torch_tensor(as.integer(c(0,0,1)))
# y <- torch_tensor(as.integer(1)) # This doesn't give an error if out_features=1 below
xy.ds <- tensor_dataset(x,y)
xy.dl <- dataloader(xy.ds, batch_size = 1)
# Create one-layer linear model
linnet <- nn_module(
initialize = function() {
self$fc <- nn_linear(in_features = 9, out_features = 3)
},
forward = function(x) {
self$fc(x)
}
)
fitted <- linnet %>%
setup(
loss = nn_cross_entropy_loss(),
optimizer = optim_adam
) %>%
fit(xy.dl, epochs = 1) # Error in to_index_tensor(target) : Indexing starts at 1 but found a 0.
# fit(xy.ds, epochs = 1) # This doesn't give an error
Maybe this is some sort of bug in the Python-to-R port. Maybe my computer is having trouble accessing some library (I'm using the R GUI on Windows 11, with the currently newest versions of R, torch, and luz). Maybe the problem relates to nn_cross_entropy_loss() (which I need for my real project, which uses R-created arrays and nn_conv2d()). Or maybe I'm just being stupid. In any case, I can't even search for help on the function that's complaining: to_index_tensor().
torch in R should always use 1-based indices. CrossEntropyLoss targets in this configuration are index tensors (representing discrete classes like MNIST digits or as.factor(c("cat", "dog"))) and are explicitly tested for not having any zeros in to_index_tensor():
// [[Rcpp::export]]
XPtrTorchTensor to_index_tensor(XPtrTorchTensor t) {
// check that there's no zeros
bool zeros = lantern_Tensor_has_any_zeros(t.get());
if (zeros) {
Rcpp::stop("Indexing starts at 1 but found a 0.");
}
...
It is called from the internal torch_cross_entropy_loss():
torch:::torch_cross_entropy_loss
#> function (self, target, weight = list(), reduction = torch_reduction_mean(),
#> ignore_index = -100L)
#> {
#> target <- to_index_tensor(target)
#> .torch_cross_entropy_loss(self = self, target = target, weight = weight,
#> reduction = reduction, ignore_index = ignore_index)
#> }
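As an aside on the as.factor() mention above: factors are a convenient way to build valid targets, because as.integer() on an R factor returns 1-based level codes, so no manual shifting is needed. A minimal sketch:
# factor levels convert to 1-based integer codes, matching torch's indexing
labels <- as.factor(c("cat", "dog", "cat"))
as.integer(labels)
#> [1] 1 2 1
torch_tensor(as.integer(labels))
#> torch_tensor
#> 1
#> 2
#> 1
#> [ CPULongType{3} ]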
We can test this with 0 & 1 targets, without going through dataloader & luz:
library(torch)
x <- torch_rand(1,9)
nnf_cross_entropy(x, target = torch_tensor(0L))
#> Error in `to_index_tensor()`:
#> ! Indexing starts at 1 but found a 0.
#> Backtrace:
#> ▆
#> 1. └─torch::nnf_cross_entropy(x, target = torch_tensor(0L))
#> 2. └─torch:::torch_cross_entropy_loss(...)
#> 3. └─torch:::to_index_tensor(target)
nnf_cross_entropy(x, target = torch_tensor(1L))
#> torch_tensor
#> 0.965333
#> [ CPUFloatType{} ]
And now the original example, with 1 added to the labels to avoid zeros:
library(torch)
library(luz)
(x <- torch_randn(1, 9))
#> torch_tensor
#> 0.4351 -0.3599 0.5954 0.3883 -1.3409 -0.7897 0.1219 -1.2122 2.5252
#> [ CPUFloatType{1,9} ]
# add +1 to target classes to avoid "found a 0" error when using CrossEntropyLoss
(y <- torch_tensor(as.integer(c(0, 0, 1) + 1)))
#> torch_tensor
#> 1
#> 1
#> 2
#> [ CPULongType{3} ]
xy.ds <- tensor_dataset(x,y)
xy.dl <- dataloader(xy.ds, batch_size = 1)
linnet <- nn_module(
initialize = function() {
self$fc <- nn_linear(in_features = 9, out_features = 3)
},
forward = function(x) {
self$fc(x)
}
)
fitted <- linnet %>%
setup(
loss = nn_cross_entropy_loss(),
optimizer = optim_adam
) %>%
fit(xy.dl, epochs = 1)
fitted
#> A `luz_module_fitted`
#> ── Time ────────────────────────────────────────────────────────────────────────
#> • Total time: 172ms
#> • Avg time per training epoch: 165ms
#>
#> ── Results ─────────────────────────────────────────────────────────────────────
#> Metrics observed in the last epoch.
#>
#> ℹ Training:
#> loss: 2.366
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#> An `nn_module` containing 30 parameters.
#>
#> ── Modules ─────────────────────────────────────────────────────────────────────
#> • fc: <nn_linear> #30 parameters
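If you want a quick sanity check of the fitted model, luz also provides a predict() method; a sketch, assuming predict() accepts a plain tensor as newdata through luz's as_dataloader() conversion:
# sketch: raw logits for the single observation, then the 1-based class
preds <- predict(fitted, x)   # shape {1,3}
torch_argmax(preds, dim = 2)  # index of the most likely class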
You may also want to check whether the resulting dataset and the values returned by the dataloader are as expected; currently only the first y value is actually used:
# test dataset & dataloader iterator (re-run without a fixed seed, so values differ from above):
xy.ds$.getitem(1)
#> [[1]]
#> torch_tensor
#> 0.0689
#> 0.3410
#> 0.1130
#> -0.0152
#> 0.2865
#> 0.4140
#> 1.4283
#> 0.4356
#> 0.3323
#> [ CPUFloatType{9} ]
#>
#> [[2]]
#> torch_tensor
#> 1
#> [ CPULongType{} ]
xy.dl$.iter()$.next()
#> [[1]]
#> torch_tensor
#> 0.0689 0.3410 0.1130 -0.0152 0.2865 0.4140 1.4283 0.4356 0.3323
#> [ CPUFloatType{1,9} ]
#>
#> [[2]]
#> torch_tensor
#> 1
#> [ CPULongType{1} ]
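Lengths tell the same story; a quick check, assuming the objects from the session above (the dataset length is driven by the first tensor's first dimension):
length(xy.ds) # 1 item, from x's first dimension of size 1
#> [1] 1
length(xy.dl) # hence a single batch before the iterator is exhausted
#> [1] 1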
This is because of dimensionality: the dataset is built by combining x of shape {1,9} (one observation) and y of shape {3} (three observations). The first iteration returns the first observation of each, and it also exhausts the dataloader, as there is nothing else to pull from x. So you probably want to make sure the tensor_dataset() input tensors are properly shaped to match your network architecture and loss function, either through the shapes of the input R objects:
torch_tensor(matrix(as.integer(c(0, 0, 1) + 1), ncol = 3, byrow = TRUE))
#> torch_tensor
#> 1 1 2
#> [ CPULongType{1,3} ]
And/or by reshaping tensors:
y$reshape(c(1,3))
#> torch_tensor
#> 1 1 2
#> [ CPULongType{1,3} ]
Now we can have a dataloader that returns a target batch shaped as {1,3}:
ds_ <- tensor_dataset(x,y$reshape(c(1,3)))
dl_ <- dataloader(ds_, batch_size = 1)
dl_$.iter()$.next()
#> [[1]]
#> torch_tensor
#> 0.9468 -0.7732 0.7157 -0.6058 -1.7488 -0.7722 -0.0210 -1.6659 0.8572
#> [ CPUFloatType{1,9} ]
#>
#> [[2]]
#> torch_tensor
#> 1 1 2
#> [ CPULongType{1,3} ]
Just note that this dataloader will not work with linnet in this example, as the target shape/type no longer matches CrossEntropyLoss target expectations.
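If the three labels were actually meant to be three separate observations, the other direction of the fix is to give x three rows instead, so the shapes line up as {3,9} and {3} and still satisfy CrossEntropyLoss; a sketch with hypothetical replacement data x3/y3:
# sketch: three observations, 1-based class labels
x3 <- torch_randn(3, 9)
y3 <- torch_tensor(as.integer(c(0, 0, 1) + 1))
ds3 <- tensor_dataset(x3, y3)
dl3 <- dataloader(ds3, batch_size = 1)
length(dl3) # 3 batches now, one per observation
The same principle carries over to the conv case from the question: nn_conv2d() expects the observation dimension first, so R arrays should be shaped as {N, C, H, W}.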
What about fit(..., data = <dataset>)?
fit(xy.ds, epochs = 1) # This doesn't give an error
According to ?fit.luz_module_generator it should work just fine with different data types (dataloader, dataset, list); internally luz converts datasets & lists to dataloaders. But unless you set options (fit(xy.ds, dataloader_options = list(...))), it uses different defaults than torch::dataloader(): batch_size = 32, shuffle = TRUE, drop_last = TRUE.
In this case it results in an empty dataloader, as a single observation can never fill a batch of 32 and drop_last = TRUE discards the partial batch:
luz:::apply_dataloader_options(xy.ds, valid_data = NULL, dataloader_options = NULL)
#> [[1]]
#> <dataloader>
#> Public:
#>
#> [[2]]
#> NULL
and the training loop exits before the first training batch starts, so the loss function is never called and there is no "Indexing starts at 1 but found a 0" error.
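To push the dataset input through the same path as the explicit dataloader, you can override those defaults via the dataloader_options argument of fit(); a sketch (with the original 0-based y this reproduces the indexing error, with the +1-shifted y it trains):
fitted <- linnet %>%
  setup(
    loss = nn_cross_entropy_loss(),
    optimizer = optim_adam
  ) %>%
  fit(xy.ds, epochs = 1,
      dataloader_options = list(batch_size = 1, shuffle = FALSE, drop_last = FALSE))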