I am searching for sample .txt files for information retrieval. It would be nice to have sets of documents (around 20 documents each) on a single topic, e.g., sports, music, etc.
There are many datasets available, for instance:
Datasets used to evaluate IR systems: http://www.daviddlewis.com/resources/testcollections/
More IR datasets: http://boston.lti.cs.cmu.edu/callan/Data/
A comprehensive list of several datasets: http://zitnik.si/mediawiki/index.php?title=Datasets
The classic 20 Newsgroups dataset: http://scikit-learn.org/stable/datasets/twenty_newsgroups.html
A much bigger collection of news articles: http://research.signalmedia.co/newsir16/signal-dataset.html
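
Since you asked for plain .txt files grouped by topic, here is a minimal sketch of how the 20 Newsgroups data could be dumped to disk that way, assuming scikit-learn is installed. The helper names `write_topic_samples` and `export_newsgroups`, the output directory layout, and the `doc_NN.txt` file naming are my own choices for illustration, not part of any library. Only `fetch_20newsgroups` comes from scikit-learn, and it downloads the data on first use.

```python
from collections import defaultdict
from pathlib import Path


def write_topic_samples(texts, labels, out_dir, per_topic=20):
    """Write up to `per_topic` documents per label to out_dir/<label>/doc_NN.txt.

    Hypothetical helper: `texts` and `labels` are parallel sequences of strings.
    Returns the list of paths that were written.
    """
    out_dir = Path(out_dir)
    counts = defaultdict(int)  # number of documents written so far, per label
    written = []
    for text, label in zip(texts, labels):
        if counts[label] >= per_topic:
            continue  # this topic already has enough documents
        topic_dir = out_dir / label
        topic_dir.mkdir(parents=True, exist_ok=True)
        path = topic_dir / f"doc_{counts[label]:02d}.txt"
        path.write_text(text, encoding="utf-8")
        counts[label] += 1
        written.append(path)
    return written


def export_newsgroups(categories, out_dir, per_topic=20):
    """Fetch the chosen 20 Newsgroups categories and write them as .txt files.

    Hypothetical wrapper; requires scikit-learn and a network connection
    the first time the dataset is fetched.
    """
    from sklearn.datasets import fetch_20newsgroups

    data = fetch_20newsgroups(
        subset="train",
        categories=categories,
        remove=("headers", "footers", "quotes"),  # keep only the message body
    )
    # Map each integer target to its category name, e.g. "rec.sport.hockey".
    labels = [data.target_names[i] for i in data.target]
    return write_topic_samples(data.data, labels, out_dir, per_topic)
```

For example, `export_newsgroups(["rec.sport.hockey", "rec.sport.baseball"], "newsgroups_txt")` should leave you with one folder per category, each holding up to 20 text files. 20 Newsgroups has sports categories but, as far as I know, no dedicated music one, so for music you would need one of the other collections above.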