wpfmvvmdata-virtualization

MVVM vs Data Virtualization


I have a TreeView that is bound to a tree of ViewModel instances. The problem is that the model data is coming from a slow repository so I need data virtualization. The list of sub ViewModel below a node should only be loaded when the parent tree view node is expanded and it should be unloaded when it is collapsed.

How can this be implemented while adhering to MVVM principles? How can the ViewModel get notified that it needs to load or unload subnodes? That is when a node was expanded or collapsed without knowning anything about the treeview's existence?

Something makes me feel that data virtualization doesn't go well with MVVM. Since in data virtualization the ViewModel generally needs to know quite a lot about the current state of the UI and aslo needs to control quite a lot of aspects in the UI. Take another example:

A listview with data virtualization. The ViewModel would need to control the length of the ListView's scrollthumb since it depends on the number of items in the Model. Also when the user scrolls, the ViewModel would need to known to what position he scrolled to and how big the listview is (how many items currently fit in) to be able to load the right portion of Model data form the repository.


Solution

  • The easy way to solve this is with a "virtualizing collection" implementation that maintains weak references to its items along with an algorithm for fetching / creating items. The code for this collection is rather complex, what with all the interfaces required and the data structures to efficiently track ranges of loaded data but here is a partial API for a class that virtualized based on indexes:

    public class VirtualizingCollection<T>
      : IList<T>, ICollection<T>, IEnumerable<T>,
        IList, ICollection, IEnumerable,
        INotifyPropertyChanged, INotifyCollectionChanged
    {
      protected abstract void FetchItems(int requestedIndex, int gapStartIndex, int gapEndIndex);
      protected void RecordFetchedItems(int startIndex, int count, IEnumerable items) ...
      protected void RecordInsertOrDelete(int startIndex, int countPlusOrMinus) ...
      protected virtual void OnCollectionChanged(CollectionChangedEventArgs e) ...
      protected virtual void Cleanup();
    }
    

    The internal data structure here is a balanced tree of data ranges, with each data range containing a start index and an array of weak references.

    This class is designed to be subclassed to provide the logic for actually loading the data. Here is how it works:

    Note that FetchItems is passed a range of unloaded items so it can use a heuristic to load multiple items at once. A simple such heuristic would be loading the next 100 items or up to the end of the current gap, whichever comes first.

    With a VirtualizingCollection, WPF's built-in virtualization will cause data loading at the appropriate times for ListBox, ComboBox, etc, as long as you are using eg. VirtualizingStackPanel instead of StackPanel.

    For a TreeView, one more step is required: In the HierarchicalDataTemplate set a MultiBinding for ItemsSource that binds to your real ItemsSource and also to IsExpanded on the templated parent. The converter for the MultiBinding returns its first value (the ItemsSource) if the second value (the IsExpanded value) is true, otherwise it returns null. What this does is makes it so that when you collapse a node in the TreeView all references to the collection contents are immediately dropped so that VirtualizingCollection can clean them up.

    Note that virtualization need not be done based on indexes. In a tree scenario, it can be all-or-nothing, and in a list scenario an estimated count can be used and ranges filled in as necessary using a "start key" / "end key" mechanism. This is useful when the underlying data may change and the virtualized view should track its current location based on which key is at the top of the screen.