
Format for adding references in R package DESCRIPTION?


I just submitted an R package to CRAN. I got this comment back:

If there are references describing the methods in your package, please add these in the description field of your DESCRIPTION file in the form
authors (year) <doi:...>
authors (year) <arXiv:...>
authors (year, ISBN:...)
or if those are not available: <https:...>
with no space after 'doi:', 'arXiv:', 'https:' and angle brackets for auto-linking.
(If you want to add a title as well please put it in quotes: "Title") 

But I thought the Description field was limited to a single paragraph, which would leave no room for a separate reference list after it. So I am unsure exactly how references should be formatted in that field. My guess is below, but it produces a NOTE saying that the Description is malformed.

Description: Text describing the package, blah blah blah.
    More text goes here, etc etc etc.
    Foo, B., and J. Baz. (1999) <doi:23232/xxxxx.00>
    Smith, C. (2021) <https://something.etc/foo>

Note returned when running R CMD check:

checking DESCRIPTION meta-information ... NOTE
Malformed Description field: should contain one or more complete sentences.

This question is related but does not have a satisfactory answer, so I am asking again.


Solution

  • I started with the approach in Julia Silge's blog post, pulling the CRAN package metadata and searching the Description fields for DOIs:

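    # Download the metadata for all current CRAN packages (needs internet access)
    cran <- tools::CRAN_package_db()
    # Keep only the Description fields that mention a DOI
    desc_with_doi <- grep("doi:", cran$Description, value = TRUE)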
    

    Here are some examples:

    Given a protein multiple sequence alignment, it is daunting task to assess the effects of substitutions along sequence length. 'aaSEA' package is intended to help researchers to rapidly analyse property changes caused by single, multiple and correlated amino acid substitutions in proteins. Methods for identification of co-evolving positions from multiple sequence alignment are as described in : Pelé et al., (2017) <doi:10.4172/2379-1764.1000250>.

    or

    Estimate parameters of accumulated damage (load duration) models based on failure time data under a Bayesian framework, using Approximate Bayesian Computation (ABC). Assess long-term reliability under stochastic load profiles. Yang, Zidek, and Wong (2019) <doi:10.1080/00401706.2018.1512900>.

    Using a similar filter for "https" shows (unsurprisingly) a lot more generic website links than scholarly references, but e.g.:

    Designed for studies where animals tagged with acoustic tags are expected\n to move through receiver arrays. This package combines the advantages of automatic sorting and checking \n of animal movements with the possibility for user intervention on tags that deviate from expected \n behaviour. The three analysis functions (explore(), migration() and residency()) \n allow the users to analyse their data in a systematic way, making it easy to compare results from \n different studies.\n CJS calculations are based on Perry et al. (2012) <https://www.researchgate.net/publication/256443823_Using_mark-recapture_models_to_estimate_survival_from_telemetry_data>.

    arXiv (only 24 of the 17962 packages on CRAN at present have such links):

    Provides functions for model fitting and selection of generalised hypergeometric ensembles of random graphs (gHypEG).\n To learn how to use it, check the vignettes for a quick tutorial.\n Please reference its use as Casiraghi, G., Nanumyan, V. (2019) doi:10.5281/zenodo.2555300\n together with those relevant references from the one listed below.\n The package is based on the research developed at the Chair of Systems Design, ETH Zurich.\n Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2016) <arXiv:1607.02441>.\n Casiraghi, G., Nanumyan, V., Scholtes, I., Schweitzer, F. (2017) <doi:10.1007/978-3-319-67256-4_11>.\n Casiraghi, G., (2017) <arxiv:1702.02048>\n Casiraghi, G., Nanumyan, V. (2018) <arXiv:1810.06495>.\n Brandenberger, L., Casiraghi, G., Nanumyan, V., Schweitzer, F. (2019) <doi:10.1145/3341161.3342926>\n Casiraghi, G. (2019) <doi:10.1007/s41109-019-0241-1>.
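
The common pattern in these examples is that each reference sits inside a complete sentence within the single Description paragraph, possibly on a continuation line, and the paragraph ends with a full stop. Below is a minimal sketch of how the Description from the question could be rewritten along those lines. The package name `examplepkg` and the helpers `read_description()` and `ends_like_sentence()` are hypothetical, the references are the placeholders from the question, and the final-punctuation test is only my approximation of what triggers the NOTE, not the actual rule used by R CMD check.

```r
# Hypothetical DESCRIPTION files: the references are the placeholders from the
# question, not real publications.
malformed <- c(
  "Package: examplepkg",
  "Description: Text describing the package, blah blah blah.",
  "    More text goes here, etc etc etc.",
  "    Foo, B., and J. Baz. (1999) <doi:23232/xxxxx.00>",
  "    Smith, C. (2021) <https://something.etc/foo>"
)
rewritten <- c(
  "Package: examplepkg",
  "Description: Text describing the package, blah blah blah.",
  "    More text goes here, etc etc etc. The methods are described in",
  "    Foo and Baz (1999) <doi:23232/xxxxx.00> and",
  "    Smith (2021) <https://something.etc/foo>."
)

# Read the Description field and collapse its continuation lines
# into a single paragraph.
read_description <- function(lines) {
  path <- tempfile()
  on.exit(unlink(path))
  writeLines(lines, path)
  desc <- read.dcf(path, fields = "Description")[1, 1]
  trimws(gsub("\\s+", " ", desc))
}

# Assumption: approximate the "complete sentences" test by checking whether
# the paragraph ends with sentence-final punctuation.
ends_like_sentence <- function(desc) grepl("[.!?]$", desc)

ends_like_sentence(read_description(malformed))  # FALSE
ends_like_sentence(read_description(rewritten))  # TRUE
```

This is only a quick local sanity check. Running R CMD check (ideally with `--as-cran`) on the built package remains the real test of whether the NOTE has gone away.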