I am starting to research content indexing implementations and have been looking at Whoosh (https://pypi.python.org/pypi/Whoosh/).
I am curious where Whoosh physically stores its content. Does it use files?
Whoosh uses a pluggable storage system; if you use the create_in() function, a FileStorage object is used, which stores the index as files in a directory.
See the Whoosh quickstart:
Once you have the schema, you can create an index using the create_in function:

    import os.path
    from whoosh.index import create_in

    if not os.path.exists("index"):
        os.mkdir("index")
    ix = create_in("index", schema)

(At a low level, this creates a Storage object to contain the index. A Storage object represents that medium in which the index will be stored. Usually this will be FileStorage, which stores the index as a set of files in a directory.)