c++ text word-processor

What's a better way to store text for a word processor?


The usual way is to store the characters in a string. But while writing, the user often deletes or adds characters in the middle of the text, so perhaps it is better to use std::list<char> to hold the characters, since inserting a character in the middle of a list is not a costly operation.


Solution

  • The following paper summarizes the data structures used in word processors: http://www.cs.unm.edu/~crowley/papers/sds.pdf

    Data Structures for Text Sequences. Charles Crowley, University of New Mexico, 1998

    The data structure used to maintain the sequence of characters is an important part of a text editor. This paper investigates and evaluates the range of possible data structures for text sequences. The ADT interface to the text sequence component of a text editor is examined. Six common sequence data structures (array, gap, list, line pointers, fixed size buffers and piece tables) are examined and then a general model of sequence data structures that encompasses all six structures is presented. The piece table method is explained in detail and its advantages are presented. The design space of sequence data structures is examined and several variations on the ones listed above are presented. These sequence data structures are compared experimentally and evaluated based on a number of criteria. The experimental comparison is done by implementing each data structure in an editing simulator and testing it using a synthetic load of many thousands of edits. We also report on experiments on the sensitivity of the results to variations in the parameters used to generate the synthetic editing load.
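
To make the piece table idea from the abstract more concrete, here is a minimal sketch in C++. It is not code from the paper: the class name `PieceTable`, its members, and the helper `split_at` are all hypothetical names chosen for illustration, and the sketch assumes single-byte characters and ignores undo, line tracking, and file I/O. The original text is kept read-only, inserted text is appended to a second buffer, and the document is described by a list of pieces that point into one of the two buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <utility>

// Hypothetical minimal piece table (illustrative names, not from the paper).
class PieceTable {
public:
    explicit PieceTable(std::string original) : original_(std::move(original)) {
        if (!original_.empty())
            pieces_.push_back({Source::Original, 0, original_.size()});
    }

    // Insert `text` so that its first character lands at position `pos`.
    // Positions past the end are treated as "append".
    void insert(std::size_t pos, const std::string& text) {
        if (text.empty()) return;
        Piece added{Source::Add, add_.size(), text.size()};
        add_ += text;                      // the add buffer only ever grows
        pieces_.insert(split_at(pos), added);
    }

    // Erase up to `count` characters starting at `pos`.
    void erase(std::size_t pos, std::size_t count) {
        auto first = split_at(pos);
        auto last = split_at(pos + count); // std::list keeps `first` valid
        pieces_.erase(first, last);
    }

    // Materialize the current text by concatenating all pieces.
    std::string str() const {
        std::string out;
        for (const Piece& p : pieces_)
            out.append(buffer(p.source), p.start, p.length);
        return out;
    }

private:
    enum class Source { Original, Add };
    struct Piece {
        Source source;       // which buffer the piece refers to
        std::size_t start;   // offset into that buffer
        std::size_t length;  // number of characters
    };

    const std::string& buffer(Source s) const {
        return s == Source::Original ? original_ : add_;
    }

    // Make sure a piece boundary exists at `pos` and return an iterator to
    // the piece that starts there (or end() if `pos` is at or past the end).
    std::list<Piece>::iterator split_at(std::size_t pos) {
        std::size_t offset = 0;
        for (auto it = pieces_.begin(); it != pieces_.end(); ++it) {
            if (pos == offset) return it;
            if (pos < offset + it->length) {
                std::size_t left = pos - offset;
                Piece tail{it->source, it->start + left, it->length - left};
                it->length = left;         // `it` becomes the head part
                return pieces_.insert(std::next(it), tail);
            }
            offset += it->length;
        }
        return pieces_.end();
    }

    std::string original_;     // never modified after construction
    std::string add_;          // append-only buffer for inserted text
    std::list<Piece> pieces_;  // document order
};
```

In this sketch, an insert or delete never moves existing characters; it only splits and relinks pieces, so the cost of an edit depends on the number of pieces rather than on the length of the text. Finding the piece for a position is a linear walk over the piece list here, which a real editor would likely replace with something faster.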