
Efficient Implementation of Text Processing Application Text Save Function


Some word processors and source code editors implement their save function so that if I type some text into the editor field and then delete it, making zero net changes to the current document, a save still performs a file write. OpenOffice is one example, and the phenomenon can be observed without even delving into the source code. Let us leave aside the fact that this is a minor issue on modern hardware.

I would think that a proper save would keep track of the editor content as of load or last save and compare it with the current editor content. Only a difference should warrant a file write. A drawback of this method is that the comparison runs in O(n) time, which would be significant for an extremely long document. It would still pay off if the user (like me) has a habit of saving often by repeatedly and impulsively pressing Ctrl + S: the comparison would consume fewer resources than a file write, where we would have to resolve the path, create and/or open the file stream, receive the content to write, write it, and finally close the file.
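As a rough illustration of the comparison approach, here is a minimal Python sketch. The class name `SnapshotEditor` and its attributes are hypothetical, and it stores a SHA-256 digest of the last saved text rather than a full copy, which is an assumption made here to save memory (keeping the full string and comparing it directly would work as well). Either way the check is O(n) in the document length.

```python
import hashlib
import os
import tempfile


class SnapshotEditor:
    """Hypothetical editor model that writes only when the content differs from the last save."""

    def __init__(self, path, text=""):
        self.path = path
        self.text = text
        self.saved_digest = self._digest(text)  # content as of load
        self.writes = 0  # number of real file writes, kept only for illustration

    @staticmethod
    def _digest(text):
        # Hashing reads the whole document, so this is O(n).
        return hashlib.sha256(text.encode("utf-8")).digest()

    def save(self):
        """Write the file only if the content changed; return True if a write happened."""
        current = self._digest(self.text)
        if current == self.saved_digest:
            return False  # zero net changes, so skip the write
        with open(self.path, "w", encoding="utf-8") as handle:
            handle.write(self.text)
        self.saved_digest = current
        self.writes += 1
        return True


if __name__ == "__main__":
    editor = SnapshotEditor(os.path.join(tempfile.mkdtemp(), "note.txt"), "hello")
    editor.text += "x"               # type a character
    editor.text = editor.text[:-1]   # delete it again
    print(editor.save())             # prints False: no net change, no write
```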

I believe the software in question uses a seemingly more efficient method: it attaches some form of keyboard input listener to the editor component, removes the listener the first time it fires, and performs the file write on save only if a visible keyboard input was detected. This would bring the check down to O(1), leaving aside listener registration and de-registration. But I'm fairly sure registering and de-registering a listener is not as resource-consuming as a file write.

However, this algorithm would write regardless of actual content change, as long as a visible keyboard input has been provided. Therefore, if someone keeps saving the same content after touching the keyboard, each save costs O(n) again. Perhaps this is a rare issue that only affects people like me who save compulsively.
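To make the listener idea concrete, here is a minimal Python sketch of what I imagine such software does. All names (`Document`, `DirtyTracker`, `notifications`) are hypothetical, and the "listener" is just a callback list rather than a real GUI toolkit API. The listener removes itself the first time it fires and is re-registered after each save.

```python
class Document:
    """Hypothetical text buffer that notifies listeners on every edit."""

    def __init__(self, text=""):
        self.text = text
        self.listeners = []

    def insert(self, s):
        self.text += s
        self._notify()

    def delete_last(self, count=1):
        self.text = self.text[:len(self.text) - count]
        self._notify()

    def _notify(self):
        # Iterate over a copy because a listener may remove itself while running.
        for listener in list(self.listeners):
            listener(self)


class DirtyTracker:
    """One-shot listener: the first edit sets a flag, then the listener de-registers."""

    def __init__(self, document):
        self.document = document
        self.dirty = False
        self.notifications = 0  # how often the listener ran, for illustration
        self._arm()

    def _arm(self):
        self.document.listeners.append(self._on_edit)

    def _on_edit(self, document):
        self.notifications += 1
        self.dirty = True  # O(1): no content is inspected
        document.listeners.remove(self._on_edit)

    def save(self, write):
        """Call write(text) only if an edit was seen since the last save."""
        if not self.dirty:
            return False
        write(self.document.text)
        self.dirty = False
        self._arm()  # listen again for the next edit
        return True


if __name__ == "__main__":
    doc = Document("hello")
    tracker = DirtyTracker(doc)
    doc.insert("x")
    doc.delete_last()           # text is "hello" again
    print(tracker.save(print))  # writes "hello" anyway, then prints True
```

In this sketch, leaving the listener registered would only cost one cheap callback per edit, since all it does is set a boolean.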

Which method is ideal in terms of efficiency? Not from an opinion point of view, but strictly an analytical one. If it is the latter method, how does a listener component work such that it makes the save more efficient? What effect would it have on efficiency if I did not de-register the listener?


Solution

  • First, make your “save” operation fast and atomic. Atomic means there is no disaster if your app crashes during “save”. Fast means the chance of a crash (such as a power cut) during the save is reduced, and fast is good anyway. Don’t worry about the time for opening and closing files; that’s basically nothing.

    If you have huge documents, consider a database as the underlying format, so that changes in a small area of the document affect only a small area of the file.

    The totally simple approach: keep a flag recording whether the document has changed. Set it on every change, save only if it is set, and clear it after opening and after saving the file.

    A better approach, combined with undo/redo: you should support undo and redo anyway. Say I open a file and make five changes. You should know that I have five undos. If I undo five times, the file is unchanged again. If I instead undo twice, I have three undos and two redos. If I save now, you can be quite sure that the file has changed, and after the save you remember “the saved file is the one with three undos”. If I then redo + undo, or undo + redo, I am again at the state with three undos, so the document is unchanged.

    You have to get the logic for this right, but you have the advantage that you know the document is unchanged very quickly. So you can enable/disable “Save” very quickly, and show a dialog “do you want to save your changes” very quickly.

    And a keyboard listener is much too low-level. I have a menu item “add page break”; that doesn’t go through the keyboard. A spelling check may or may not change the text.
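The atomic save recommended at the start of this answer is commonly done by writing to a temporary file and renaming it over the target. Here is a minimal Python sketch of that pattern; the function name `atomic_save` is hypothetical, and the sketch assumes the temporary file can live in the same directory as the target.

```python
import os
import tempfile


def atomic_save(path, text):
    """Replace the file at path so readers see either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".save-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to disk before renaming
        os.replace(tmp_path, path)  # the rename is atomic on POSIX systems
    except BaseException:
        # On any failure, remove the temp file and leave the original untouched.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "doc.txt")
    atomic_save(target, "first version")
    atomic_save(target, "second version")
    with open(target, encoding="utf-8") as fh:
        print(fh.read())  # prints: second version
```

If the app crashes before `os.replace`, the old file is still intact and only a stray temporary file is left behind. Note that `tempfile.mkstemp` creates the file with restrictive permissions, so a real implementation may want to copy the original file's mode.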
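The undo-based bookkeeping described above can be sketched as follows. This is a minimal Python illustration, not a definitive design: the class `UndoTrackingDocument` and its members are hypothetical, and each change is stored as a full (before, after) text pair purely to keep the example short. Every change, whether from the keyboard, a menu item, or a spelling check, is assumed to go through `apply`.

```python
class UndoTrackingDocument:
    """Hypothetical document that knows in O(1) whether it differs from the saved state."""

    def __init__(self, text=""):
        self.text = text
        self.undo_stack = []  # (before, after) pairs for changes that can be undone
        self.redo_stack = []  # changes that were undone and can be redone
        self.saved_depth = 0  # undo depth at load or last save; None if unreachable

    def apply(self, new_text):
        """Single entry point for every change, regardless of where it comes from."""
        if self.saved_depth is not None and self.saved_depth > len(self.undo_stack):
            # The saved state sits in the redo branch that is about to be discarded.
            self.saved_depth = None
        self.undo_stack.append((self.text, new_text))
        self.redo_stack.clear()
        self.text = new_text

    def undo(self):
        if not self.undo_stack:
            return False
        before, after = self.undo_stack.pop()
        self.redo_stack.append((before, after))
        self.text = before
        return True

    def redo(self):
        if not self.redo_stack:
            return False
        before, after = self.redo_stack.pop()
        self.undo_stack.append((before, after))
        self.text = after
        return True

    @property
    def is_modified(self):
        # Comparing two integers: no document content is read.
        return self.saved_depth != len(self.undo_stack)

    def mark_saved(self):
        self.saved_depth = len(self.undo_stack)


if __name__ == "__main__":
    doc = UndoTrackingDocument("")
    for i in range(5):
        doc.apply(doc.text + str(i))  # five changes
    doc.undo()
    doc.undo()                        # three undos and two redos remain
    doc.mark_saved()
    doc.redo()
    doc.undo()
    print(doc.is_modified)            # prints False: back at the saved state
```

One limitation of this sketch: typing a character and then deleting it by hand produces two undo steps, so the document counts as modified even though the text is identical. Catching that case would need a content comparison on top.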