ELKI: Normalization undo for result


I am using the ELKI MiniGUI to run LOF. I have figured out how to normalize the data before the run with -dbc.filter, but in the output I would like to see the original data records, not the normalized ones.

There seems to be a flag called -normUndo that can be set on the command line, but I cannot figure out how to use it in the MiniGUI.


Solution

  • This functionality used to exist in ELKI, but has effectively been removed (for now), for several reasons:

    1. Only a few normalizations ever supported this; most would fail.
    2. With the visualization, there is no longer a well-defined "end" at which to undo the normalization. Some users will want to visualize the normalized data, others the original.
    3. It requires carrying the normalization information along, which makes the data structures more complex (although the hierarchical approach we have now would allow this again).
    4. Because of the numerical imprecision of floating-point math, you would frequently not get back exactly the values you put in.
    5. Keeping the original data in memory may be too expensive for some use cases, so we would need to add another parameter, "keep non-normalized data". You would also need to choose which data (normalized or non-normalized) to use for analysis and which for visualization. This would not be hard with a full-blown GUI, but you are looking at a command-line interface. (This is easy to do with Java, too.)
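
To illustrate the floating-point issue from point 4, here is a minimal Python sketch (not ELKI code). It applies a z-score normalization and then inverts it; the function name `zscore_round_trip` and the random test data are made up for this illustration. The restored values are very close to the originals but are not guaranteed to be bit-identical.

```python
import math
import random


def zscore_round_trip(values):
    """Normalize values to zero mean / unit variance, then undo it."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    normalized = [(v - mean) / std for v in values]
    # "Undo" step: invert the normalization using the stored mean and std.
    return [z * std + mean for z in normalized]


# Hypothetical data, generated only for this illustration.
random.seed(0)
original = [random.uniform(0.0, 1000.0) for _ in range(1000)]
restored = zscore_round_trip(original)

# Count values that did not come back exactly, and the largest deviation.
mismatches = sum(1 for a, b in zip(original, restored) if a != b)
max_error = max(abs(a - b) for a, b in zip(original, restored))
print(f"{mismatches} of {len(original)} values differ; max error {max_error:.3e}")
```

Any deviations are tiny, but they are enough to make an exact comparison against the original records unreliable.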

    We would of course appreciate patches that contribute such functionality to ELKI.

    The easiest workaround is to add a (non-numeric) label column to your input. You can then use this label to identify each object in your original data.
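
A minimal Python sketch of this label approach, done outside ELKI. It assumes the input parser treats a non-numeric column as a label, as the answer suggests. The helper names, the `id` prefix, the sample rows, and the `scores` dictionary are all hypothetical; in practice the scores would be parsed from your actual ELKI output.

```python
import csv
import io


def add_label_column(rows, prefix="id"):
    """Append a non-numeric label ('id0', 'id1', ...) to every row."""
    return [list(row) + [f"{prefix}{i}"] for i, row in enumerate(rows)]


def index_by_label(labeled_rows):
    """Map each label to the original (non-normalized) attribute values."""
    return {row[-1]: row[:-1] for row in labeled_rows}


# Hypothetical original records, kept as strings so they are written verbatim.
original = [["5.1", "3.5"], ["4.9", "3.0"], ["50.0", "30.0"]]
labeled = add_label_column(original)

# Text to save as the input file: whitespace-separated, label in the last column.
buf = io.StringIO()
csv.writer(buf, delimiter=" ", lineterminator="\n").writerows(labeled)
elki_input = buf.getvalue()

# Made-up outlier scores keyed by label, standing in for parsed results.
scores = {"id0": 1.02, "id1": 0.98, "id2": 7.50}

# Look up the original record of the highest-scoring object by its label.
lookup = index_by_label(labeled)
top = max(scores, key=scores.get)
print(top, lookup[top])
```

Because the label travels with each object through the normalization filter, the lookup does not depend on inverting the normalization at all.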