deeplearning4j dl4j nd4j

How can a set of sequential signals, rather than a single signal, be classified into several label groups with DL4J?


I have samples that each consist of 60 signal sequences of length 200. Each sample is labeled in 6 label groups, and in each group the label takes one of 10 values. I'd like to get a prediction for every label group when feeding a 200-step, or even shorter, sample to the network.

I tried to build my own network based on the https://github.com/eclipse/deeplearning4j-examples/blob/master/dl4j-examples/src/main/java/org/deeplearning4j/examples/recurrent/seqclassification/UCISequenceClassificationExample.java example, which, however, pads the labels. I do not pad my labels, and I get an exception like this:

Exception in thread "main" java.lang.IllegalStateException: Sequence lengths do not match for RnnOutputLayer input and labels:Arrays should be rank 3 with shape [minibatch, size, sequenceLength] - mismatch on dimension 2 (sequence length) - input=[1, 200, 12000] vs. label=[1, 1, 10]

Solution

  • In fact, the labels are required to have a time dimension of the same length as the features, which is 200 steps here. So I have to use some technique such as zero padding in all 6 label channels. On the other hand, my input was also wrong: I put all 60*200 values into a single sequence, whereas the features should have shape [1, 60, 200] (60 signals, 200 time steps) in the [minibatch, size, sequenceLength] order used by the exception message, and each of the 6 labels should be [1, 10, 200].

    The open question is where in the 200-step label sequence the real label value should go: at index [0], at index [199], or at the parts of the signals that the label is typically associated with. The training runs that should check this are still in progress. Which kind of padding is better, zero padding or repeating the label value? That is still unclear to me, and I could not find a paper that explains which is best.
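
The array layouts discussed above can be sketched in plain Java, without ND4J, to make the shapes concrete. This is only an illustration under assumptions: the class and method names are hypothetical, the flat input is assumed to be stored signal by signal, and in a real DL4J setup these arrays would be wrapped in INDArrays. The "label at the last step plus mask" variant mirrors what the linked UCI example gets from AlignmentMode.ALIGN_END: the single label sits at the final time step, and a label mask of shape [minibatch, sequenceLength] marks every other step as ignored, so the zero padding should not be scored as a target.

```java
public class SequenceLabelLayout {

    /**
     * Reshapes a flat recording into [1][signals][steps].
     * Assumption: the flat array is stored signal by signal
     * (all steps of signal 0, then all steps of signal 1, ...).
     */
    static float[][][] toFeatures(float[] flat, int signals, int steps) {
        if (flat.length != signals * steps) {
            throw new IllegalArgumentException("expected " + (signals * steps) + " values");
        }
        float[][][] features = new float[1][signals][steps];
        for (int s = 0; s < signals; s++) {
            for (int t = 0; t < steps; t++) {
                features[0][s][t] = flat[s * steps + t];
            }
        }
        return features;
    }

    /** One-hot label placed only at the last time step; all other steps stay zero. */
    static float[][][] labelAtLastStep(int classIndex, int numClasses, int steps) {
        float[][][] labels = new float[1][numClasses][steps];
        labels[0][classIndex][steps - 1] = 1.0f;
        return labels;
    }

    /** Label mask of shape [1][steps]: 1 at the last step (scored), 0 elsewhere (ignored). */
    static float[][] lastStepMask(int steps) {
        float[][] mask = new float[1][steps];
        mask[0][steps - 1] = 1.0f;
        return mask;
    }

    /** Alternative: the same one-hot label repeated at every time step (no mask needed). */
    static float[][][] labelAtEveryStep(int classIndex, int numClasses, int steps) {
        float[][][] labels = new float[1][numClasses][steps];
        for (int t = 0; t < steps; t++) {
            labels[0][classIndex][t] = 1.0f;
        }
        return labels;
    }

    public static void main(String[] args) {
        int signals = 60, steps = 200, numClasses = 10, labelGroups = 6;

        float[][][] features = toFeatures(new float[signals * steps], signals, steps);
        System.out.println("features: [" + features.length + ", "
                + features[0].length + ", " + features[0][0].length + "]");

        // One label array per label group; class indices here are made-up placeholders.
        for (int g = 0; g < labelGroups; g++) {
            float[][][] labels = labelAtLastStep(g, numClasses, steps);
            System.out.println("labels[" + g + "]: [" + labels.length + ", "
                    + labels[0].length + ", " + labels[0][0].length + "]");
        }
    }
}
```

With six label groups, this amounts to six label arrays (and, for the masked variant, six mask arrays) per sample. In DL4J that would most likely mean a ComputationGraph with six output layers fed from a MultiDataSet, though the exact wiring depends on the network and is not shown here.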