I would like to get all subcategories (87) and all pages (200) listed in the "Pages in category "Masculine given names"" section on this page: https://en.wikipedia.org/wiki/Category:Masculine_given_names
I tried it with the following code:
import pywikibot
site = pywikibot.Site("en", "wikipedia")
page = pywikibot.Page(site, 'Category:Masculine_given_names')
print(list(page.categories()))
But with that I only get the categories at the very bottom of the page. How can I get the subcategories and (sub)pages listed on this page?
How can I get the subcategories and (sub)pages of a given category?
First you have to use the Category class instead of the Page class. You create it in much the same way:
>>> import pywikibot
>>> site = pywikibot.Site("en", "wikipedia")
>>> cat = pywikibot.Category(site, 'Masculine_given_names')
A Category object has additional methods; refer to the documentation for further information and the available parameters. The categoryinfo property, for example, gives a short overview of the category content:
>>> cat.categoryinfo
{'size': 1425, 'pages': 1336, 'files': 0, 'subcats': 89}
There are 1425 entries in this category: 1336 pages and 89 subcategories in this case.
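As a quick sanity check, the counts add up. The following is only a sketch that reuses the dictionary printed above as a literal; with a live category you would read `cat.categoryinfo` instead.

```python
# Values copied from the categoryinfo output shown above.
info = {'size': 1425, 'pages': 1336, 'files': 0, 'subcats': 89}

# 'size' is the total number of members: pages + files + subcategories.
total = info['pages'] + info['files'] + info['subcats']
print(total == info['size'])
```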
To get all subcategories, use the subcategories() method:
>>> gen = cat.subcategories()
Note that this is a generator. As shown below, it yields all of the subcategories, matching the count in categoryinfo above:
>>> len(list(gen))
89
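Keep in mind that a generator can only be consumed once, so calling list() on it a second time gives an empty list. The sketch below shows one way to peek at the first few titles without fetching everything. The helper name `first_titles` and the `StubPage` class are hypothetical; the stub only stands in for real page objects so the example runs without network access.

```python
from itertools import islice


def first_titles(pages, limit=5):
    """Return the titles of at most `limit` items from an iterable of page objects."""
    return [page.title() for page in islice(pages, limit)]


# With pywikibot (needs network access), usage would look like:
#   cat = pywikibot.Category(site, 'Masculine_given_names')
#   print(first_titles(cat.subcategories()))


class StubPage:
    """Hypothetical offline stand-in that only provides title()."""

    def __init__(self, title):
        self._title = title

    def title(self):
        return self._title


gen = (StubPage(t) for t in ["Category:A", "Category:B", "Category:C"])
print(first_titles(gen, limit=2))  # takes two items from the generator
remaining = list(gen)              # only the unconsumed rest is left
print(len(remaining))
```

If you need the results more than once, store them in a list first and work with that list.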
To get all pages (articles), use the articles() method, e.g.
>>> gen = cat.articles()
Guess how many entries the corresponding list will have.
Finally, the members() method returns all members of the category, which includes pages, files and subcategories:
>>> gen = cat.members()
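If you iterate over members() and still want the subcategories separated from everything else, you can check each item with is_categorypage(). This is a minimal sketch: `split_members` and `StubMember` are hypothetical names, and the stub replaces real pywikibot pages so the example runs offline. Files, if any, end up in the second list here.

```python
def split_members(members):
    """Split an iterable of page objects into (subcategory titles, other titles)."""
    subcats, others = [], []
    for member in members:
        if member.is_categorypage():
            subcats.append(member.title())
        else:
            others.append(member.title())
    return subcats, others


# With pywikibot (needs network access), usage would look like:
#   subcats, pages = split_members(cat.members())


class StubMember:
    """Hypothetical offline stand-in providing title() and is_categorypage()."""

    def __init__(self, title):
        self._title = title

    def title(self):
        return self._title

    def is_categorypage(self):
        return self._title.startswith("Category:")


demo = [StubMember("Category:Example names"), StubMember("Aaron"), StubMember("Abel")]
subcats, pages = split_members(demo)
print(subcats)
print(pages)
```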