data-structureshashmapcountingspace-complexity

Why isn't the space complexity of program adding values to a letter keyed hash map linear?


Imagine a program that takes a string of length p as an argument, and counts each letter by adding 1 for every occurrence using a hash map that only uses the 26 letters of the English alphabet as keys. I've seen this type program described as O(1) in terms of space complexity (see the quote below). The reasoning being that since we know the size of the key set is capped, the space complexity is constant.

From Tech Interview Handbook:

Often you will need to count the frequency of characters in a string. The most common way of doing that is by using a hash table/map in your language of choice. If your language has a built-in Counter class like Python, ask if you can use that instead.

If you need to keep a counter of characters, a common mistake is to say that the space complexity required for the counter is O(n). The space required for a counter of a string of latin characters is O(1) not O(n). This is because the upper bound is the range of characters, which is usually a fixed constant of 26. The input set is just lowercase Latin characters.

How can we say this, when it seems to me that the size of the map in memory will obviously grow proportionally to the size of p? If we give the program an input of length p+1, won't the size of the map (specifically the integers containing the count of the characters in the input, thus the map itself) necessarily be larger than if we provide an input of length p?


Solution

  • No, that is not how HashMaps works.

    Why O(1) Space Complexity?

    Remember: Space complexity is extra-space(thus don't include input space) your algorithm needs depending on the input size.

    HashMap<Character, Integer> here has fixed size containing only 26 keys representing each Character(1 byte), and 26 values corresponding to each key (which can be a 4 byte integer thus can store values upto 2 Billion).

    Irrespective of your input string size, you only need this constant space of HashMap to keep count of characters. Suppose if your string becomes significantly large say in trillions, then you can just change the value type of HashMap to Long(16 Byte) to hold even much bigger values for each character, again still your extra space is independent of input size and constant, thus has Space Complexity of O(1).