angular typescript wikipedia-api

Fetch specific Wikipedia list


How can I fetch these records from Wikipedia as easily as possible? For each of these subcategories, I need the displayed names in a JSON file: https://en.wikipedia.org/wiki/Category:Surnames_by_language

Example

[
 {
  name: "Agalliu",
  language: "Albanian"
 },
 {
  name: "Agolli",
  language: "Albanian"
 }
 ...
]

I'm working with Angular 5.

Also: Is it legal for me to create a database with the information that the data is from Wikipedia?


Solution

  • I don't work with Angular 5 or TypeScript, so I can't tell you at a technical level how to write the specific code you need, but I think you should have a look at the HttpClient documentation. A search on GitHub might help you find a module that has already been developed. Angular seems very well documented, which is nice. So my answer is more theoretical than technical.

    About the data you want in the JSON file (the surname and the language of that surname): if you only want to work with the pages in the category, I think the best way might be to take the name from the title of each page and the language from the title of the subcategory being analyzed. If you want to do it:
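The category approach can be sketched roughly as follows, using the MediaWiki API's `list=categorymembers` query. This is only a sketch under assumptions, not tested inside Angular: the helper names (`buildCategoryUrl`, `languageFromCategory`, `nameFromTitle`, `toRecords`, `fetchSurnames`) are made up for illustration, and it assumes that the subcategories are titled like "Category:Albanian-language surnames" and that page titles may carry a disambiguation suffix such as "(surname)".

```typescript
// Shape of one output record, as in the example in the question.
interface SurnameRecord {
  name: string;
  language: string;
}

// Only the parts of the MediaWiki response that this sketch reads.
interface CategoryMembersResponse {
  continue?: { cmcontinue: string };
  query: { categorymembers: { pageid: number; ns: number; title: string }[] };
}

// Any function that GETs a URL and resolves to parsed JSON.
type GetJson = (url: string) => Promise<CategoryMembersResponse>;

const API = 'https://en.wikipedia.org/w/api.php';

// Build the URL for one batch of pages in a category.
function buildCategoryUrl(category: string, cmcontinue?: string): string {
  const params: [string, string][] = [
    ['action', 'query'],
    ['list', 'categorymembers'],
    ['cmtitle', category],
    ['cmtype', 'page'],   // pages only, no subcategories or files
    ['cmlimit', '500'],   // maximum batch size for anonymous requests
    ['format', 'json'],
    ['origin', '*'],      // needed for anonymous cross-origin browser requests
  ];
  if (cmcontinue) {
    params.push(['cmcontinue', cmcontinue]);
  }
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join('&');
  return `${API}?${query}`;
}

// Assumed naming scheme: "Category:Albanian-language surnames" -> "Albanian".
function languageFromCategory(category: string): string {
  return category.replace(/^Category:/, '').replace(/-language surnames$/, '');
}

// Drop a trailing parenthesised suffix: "Agolli (surname)" -> "Agolli".
function nameFromTitle(title: string): string {
  return title.replace(/\s*\([^)]*\)$/, '');
}

// Convert one API response into records for the JSON file.
function toRecords(category: string, res: CategoryMembersResponse): SurnameRecord[] {
  const language = languageFromCategory(category);
  return res.query.categorymembers.map(member => ({
    name: nameFromTitle(member.title),
    language,
  }));
}

// Follow the API's continuation token until every batch has been read.
async function fetchSurnames(category: string, getJson: GetJson): Promise<SurnameRecord[]> {
  const records: SurnameRecord[] = [];
  let cmcontinue: string | undefined;
  do {
    const res = await getJson(buildCategoryUrl(category, cmcontinue));
    records.push(...toRecords(category, res));
    cmcontinue = res.continue ? res.continue.cmcontinue : undefined;
  } while (cmcontinue);
  return records;
}
```

The HTTP call is passed in as `getJson` so that the sketch does not depend on a particular client. In an Angular service it could probably be something like `url => this.http.get<CategoryMembersResponse>(url).toPromise()`, with `this.http` being an injected `HttpClient`. The list of subcategories themselves can be read with the same query by using `cmtype=subcat` on the parent category.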

    I think another good way to do it is to query Wikidata. Many of the Wikipedia pages have very different structures and there is no infobox common to all of them, which would otherwise have made it easier to get the data by scraping a specific field (language or whatever it may be). However, extracting it from Wikidata rather than from the category has disadvantages too:

    Take a look at MediaWiki API and Wikidata:Data Access.
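For the Wikidata route, a query against the public SPARQL endpoint might look roughly like the sketch below. Treat the modelling as an assumption to check against real items: it assumes surnames are items that are an instance of (P31) family name (Q101352) and that their language is recorded with "language of work or name" (P407), which will not hold for every item. The helper names (`buildSparqlUrl`, `bindingsToRecords`) and the `LIMIT` value are made up for illustration.

```typescript
// Shape of one output record, as in the example in the question.
interface SurnameRow {
  name: string;
  language: string;
}

// Only the parts of the SPARQL JSON result that this sketch reads.
interface SparqlResponse {
  results: {
    bindings: {
      nameLabel: { value: string };
      languageLabel: { value: string };
    }[];
  };
}

const ENDPOINT = 'https://query.wikidata.org/sparql';

// Assumed modelling: P31 = instance of, Q101352 = family name,
// P407 = language of work or name. LIMIT keeps the query small.
const SURNAME_QUERY = `
SELECT ?nameLabel ?languageLabel WHERE {
  ?name wdt:P31 wd:Q101352 ;
        wdt:P407 ?language .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 100
`;

// Build a GET URL that asks the endpoint for JSON results.
function buildSparqlUrl(query: string): string {
  return `${ENDPOINT}?format=json&query=${encodeURIComponent(query)}`;
}

// Flatten the SPARQL bindings into plain records.
function bindingsToRecords(res: SparqlResponse): SurnameRow[] {
  return res.results.bindings.map(binding => ({
    name: binding.nameLabel.value,
    language: binding.languageLabel.value,
  }));
}
```

The URL from `buildSparqlUrl(SURNAME_QUERY)` can be requested with whatever HTTP client you use, and the parsed response passed to `bindingsToRecords`.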

    "Is it legal for me to create a database with the information that the data is from Wikipedia?"

    Yes, it is perfectly legal. What you have to do is respect the license. The English Wikipedia is licensed under Creative Commons Attribution-ShareAlike 3.0 Unported. This license allows you to reuse and change the content for both commercial and non-commercial purposes, but you must attribute the authorship and share any derivatives under the same license.

    In the case of Wikidata, everything in the item and property namespaces (Q:* and P:*) is in the public domain and marked as CC0, a Creative Commons tool used to show that a work is in the public domain. What can you do with the data? Whatever you want.

    I recommend reading the Creative Commons FAQ about CC0 and the legal code of Creative Commons Attribution-ShareAlike 3.0 Unported.