
Heatmap to Show Filesystem diffs on Server?


I've been tasked with creating a simple Python web-based app to graphically represent the "differences" between various servers, in terms of the contents of various key config files throughout the filesystem.

1. High Level Heatmap

For a high-level overview, I was thinking of creating a heatmap of each system (e.g. http://www.jjguy.com/heatmap/). Each system is compared to a golden-source image (the original), and then we use colours to represent the degree of differentiation from this image.

The filesystem is mapped to the x-y axis, so that the same coordinates on the heatmap for each system represent the same files.

My first question here is, do you have any advice on a good algorithm for mapping the filesystem to the x-y coordinates? Bear in mind that while each server should have more-or-less the same filesystem hierarchy, this may not necessarily be true, and I still need to find a way to represent missing or added files/directories. I'm not sure if that's possible alongside the first requirement of mapping equivalent files/directories to matching (or at least similar) x-y points on each system map. Any novel approaches/algos here?
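One simple way to keep coordinates stable is to build the layout once from the union of all paths seen on any system, then reuse it for every map. The sketch below assumes each system's files are available as a set of path strings; `build_layout` and `classify` are hypothetical helper names, and the square-grid placement is just one possible choice.

```python
import math

def build_layout(golden_paths, *server_path_sets):
    """Assign every path seen on any system one fixed (x, y) grid cell."""
    all_paths = set(golden_paths)
    for paths in server_path_sets:
        all_paths.update(paths)
    # Sort by path components so files in the same directory end up adjacent.
    ordered = sorted(all_paths, key=lambda p: p.strip("/").split("/"))
    width = max(1, math.ceil(math.sqrt(len(ordered))))
    return {path: (i % width, i // width) for i, path in enumerate(ordered)}

def classify(path, golden_paths, server_paths):
    """Report whether a path is present, missing, or added on a server."""
    in_golden = path in golden_paths
    in_server = path in server_paths
    if in_golden and in_server:
        return "present"
    if in_golden:
        return "missing"
    if in_server:
        return "added"
    return "absent"

golden = {"/etc/hosts", "/etc/ssh/sshd_config"}
server = {"/etc/hosts", "/etc/motd"}
layout = build_layout(golden, server)
for path, cell in sorted(layout.items()):
    print(path, cell, classify(path, golden, server))
```

Because the layout comes from the union, a file that exists on only one server still has a reserved cell on every map, and can be drawn in a distinct colour for "missing" or "added".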

Then I need a way of quantifying the degree of change in each file (number of changed lines?), and then passing this on to the heatmap. Bear in mind I'd need to differentiate between, say, a single file with 10 changed lines and 10 files with 1 changed line each, as the two have different ramifications.
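As a rough sketch of the line-count idea, Python's standard `difflib` can count changed lines per file, and the per-file counts can then be reported separately from the number of files touched. The function names and the dict-of-lines input format here are assumptions made for illustration.

```python
import difflib

def changed_line_count(golden_lines, server_lines):
    """Count lines that differ from the golden copy (replaced, added, or removed)."""
    matcher = difflib.SequenceMatcher(None, golden_lines, server_lines, autojunk=False)
    changed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A replaced block counts as the longer of its two sides.
            changed += max(i2 - i1, j2 - j1)
    return changed

def summarize(golden_files, server_files):
    """Return per-file counts plus the two aggregate figures.

    Both arguments map a path to its list of lines; a path missing on
    one side is treated as an empty file.
    """
    per_file = {}
    for path in set(golden_files) | set(server_files):
        per_file[path] = changed_line_count(
            golden_files.get(path, []), server_files.get(path, [])
        )
    files_changed = sum(1 for count in per_file.values() if count > 0)
    lines_changed = sum(per_file.values())
    return per_file, files_changed, lines_changed
```

Keeping `files_changed` and `lines_changed` as separate numbers is what lets the overview distinguish one heavily edited file from many lightly edited ones: the per-file count drives the colour of each cell, while the file count describes how widely the change is spread.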

2. Drill-Down to show File Diffs

I'm also hoping to offer the ability in the webapp to drill down into individual files/directories and see the changes between them.

I've been using a combination of KDiff3 and Meld for visual code diffs, and I'm quite impressed by the way Meld displays changes.

http://meld.sourceforge.net/meld_file1.png

I couldn't seem to find any standalone web libraries that provide a visual diff mechanism on their own. The closest I found is jsdifflib (http://snowtide.com/jsdifflib), but it doesn't seem to match the functionality (or the aesthetics, I suppose) of something like Meld. Any advice here?
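Since the app is in Python, one server-side option worth noting is the standard library's `difflib.HtmlDiff`, which renders a side-by-side HTML table. It is plainer than Meld, so treat this as a minimal starting point rather than an equivalent; `render_diff_table` and its labels are hypothetical names.

```python
import difflib

def render_diff_table(golden_lines, server_lines,
                      golden_label="golden", server_label="server"):
    """Return an HTML <table> showing a side-by-side diff of two files."""
    differ = difflib.HtmlDiff(wrapcolumn=80)
    return differ.make_table(
        golden_lines,
        server_lines,
        fromdesc=golden_label,
        todesc=server_label,
        context=True,   # show only changed regions plus surrounding lines
        numlines=3,     # lines of context around each change
    )

html = render_diff_table(
    ["Port 22", "PermitRootLogin no"],
    ["Port 22", "PermitRootLogin yes"],
)
```

`make_table` returns only the table markup, so the page needs its own CSS for the diff classes; `make_file` returns a complete HTML document with a default stylesheet instead.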

(Revisionist looks cool actually - http://benfry.com/revisionist/ - but I can't seem to find any public code for it).

Cheers, Victor


Solution

  • For mapping the filesystem, take a look at WinDirStat (http://windirstat.info/). It was originally a tool for assessing disk usage, but you could define your own size calculation. The treemap helps group files that are in the same folder or folder tree, and it is somewhat robust to changes in the disk contents.
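To illustrate the treemap idea, here is a minimal slice-and-dice layout. This is a simpler scheme than the one WinDirStat uses, chosen only because it is short. The nested-dict tree format and the function names are assumptions, and the weights stand in for whatever "size" you define.

```python
def tree_weight(node):
    """A node is a number (file weight) or a dict of name -> node (directory)."""
    if isinstance(node, dict):
        return sum(tree_weight(child) for child in node.values())
    return node

def slice_and_dice(node, x, y, w, h, path="", horizontal=True, out=None):
    """Split the rectangle (x, y, w, h) among children in proportion to weight.

    The split direction alternates at each directory level. Returns a dict
    mapping each file path to its rectangle.
    """
    if out is None:
        out = {}
    if not isinstance(node, dict):
        out[path] = (x, y, w, h)
        return out
    total = tree_weight(node)
    if total == 0:
        return out
    offset = 0.0
    for name in sorted(node):
        child = node[name]
        share = tree_weight(child) / total
        child_path = f"{path}/{name}"
        if horizontal:
            slice_and_dice(child, x + offset * w, y, share * w, h,
                           child_path, False, out)
        else:
            slice_and_dice(child, x, y + offset * h, w, share * h,
                           child_path, True, out)
        offset += share
    return out

# Hypothetical weights, e.g. line counts taken from the golden image.
tree = {"etc": {"hosts": 1, "ssh": {"sshd_config": 3}}}
rects = slice_and_dice(tree, 0, 0, 100, 100)
```

If the weights are taken from the golden image rather than from each server, every server gets the same rectangles, and the amount of change can be shown purely through colour.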