I am interested in retrieving machine-readable metadata about R packages.
For example, when I go to CRAN I can see a short description of the package before I download it: https://cran.r-project.org/web/packages/MASS/
I could not find any way to get anything other than HTML from the CRAN server. I would like to avoid parsing HTML and instead retrieve the package metadata in a more convenient format (e.g., JSON).
As far as I know, each R package has a YAML-like (?) description file, called DESCRIPTION, inside its source package. However, so far I have only found this file inside the tar archives, which means that I would have to download the package before I can access its description.
Here is an example, the DESCRIPTION file from the MASS package:
Package: MASS
Priority: recommended
Version: 7.3-55
Date: 2022-01-12
Revision: $Rev: 3559 $
Depends: R (>= 3.3.0), grDevices, graphics, stats, utils
Imports: methods
Suggests: lattice, nlme, nnet, survival
Authors@R: c(person("Brian", "Ripley", role = c("aut", "cre", "cph"),
    email = "ripley@stats.ox.ac.uk"),
    person("Bill", "Venables", role = "ctb"),
    person(c("Douglas", "M."), "Bates", role = "ctb"),
    person("Kurt", "Hornik", role = "trl",
    comment = "partial port ca 1998"),
    person("Albrecht", "Gebhardt", role = "trl",
    comment = "partial port ca 1998"),
    person("David", "Firth", role = "ctb"))
Description: Functions and datasets to support Venables and Ripley,
    "Modern Applied Statistics with S" (4th edition, 2002).
Title: Support Functions and Datasets for Venables and Ripley's MASS
LazyData: yes
ByteCompile: yes
License: GPL-2 | GPL-3
URL: http://www.stats.ox.ac.uk/pub/MASS4/
Contact: <MASS@stats.ox.ac.uk>
NeedsCompilation: yes
Packaged: 2022-01-13 05:06:37 UTC; ripley
Author: Brian Ripley [aut, cre, cph],
    Bill Venables [ctb],
    Douglas M. Bates [ctb],
    Kurt Hornik [trl] (partial port ca 1998),
    Albrecht Gebhardt [trl] (partial port ca 1998),
    David Firth [ctb]
Maintainer: Brian Ripley <ripley@stats.ox.ac.uk>
Repository: CRAN
Date/Publication: 2022-01-13 08:05:04 UTC
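To illustrate why this format is not convenient to consume directly, here is a minimal Python sketch of a parser for it. It assumes the usual layout of such a file: one `Key: value` field per line, with continuation lines starting with whitespace. The function name `parse_description` is hypothetical and only for illustration; this is not an official parser, and it ignores details such as encodings or multiple records.

```python
from typing import Dict


def parse_description(text: str) -> Dict[str, str]:
    """Parse 'Key: value' fields; indented lines continue the previous field."""
    fields: Dict[str, str] = {}
    current_key = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t":
            # Continuation line: append it to the field we are currently reading.
            if current_key is None:
                raise ValueError("continuation line before any field")
            fields[current_key] += " " + line.strip()
        else:
            # New field: split on the first colon only, since values
            # (URLs, timestamps) may themselves contain colons.
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError("expected 'Key: value', got: " + repr(line))
            current_key = key.strip()
            fields[current_key] = value.strip()
    return fields
```

Even with such a parser, values like `Depends` or `Authors@R` would still need further processing, which is why a ready-made JSON source would be preferable.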
Any suggestions on how to get this metadata directly, in a convenient machine-readable form?
I tried to look it up, but search engines have not turned up anything useful so far.
Edit / Clarification: I am looking for a solution that does not rely on R, but rather a web API for metadata retrieval that is independent of the framework or language used.
An acceptable solution is the METACRAN API that is available here: https://crandb.r-pkg.org/
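As a rough sketch of how that API could be used from outside R, the following Python snippet requests the JSON record for a package using only the standard library. It assumes that `https://crandb.r-pkg.org/<package>` returns a JSON object for the latest version and that its keys mirror the DESCRIPTION field names (`Package`, `Version`, `Title`, `License`); both points are assumptions to verify against the live service. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL taken from the question; the path layout below is an assumption.
CRANDB_BASE_URL = "https://crandb.r-pkg.org"


def package_url(name: str) -> str:
    """Build the URL assumed to serve the JSON record of a package."""
    return CRANDB_BASE_URL + "/" + quote(name, safe="")


def fetch_package_metadata(name: str, timeout: float = 10.0) -> dict:
    """Download and decode the JSON record for one package (needs network)."""
    with urllib.request.urlopen(package_url(name), timeout=timeout) as response:
        return json.load(response)


def summarize(metadata: dict) -> dict:
    """Pick a few fields, assuming keys follow the DESCRIPTION field names."""
    wanted = ("Package", "Version", "Title", "License")
    return {key: metadata.get(key) for key in wanted}
```

Calling `summarize(fetch_package_metadata("MASS"))` should then give a small dictionary with the selected fields; any key missing from the response comes back as `None` instead of raising an error.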