
Adding a data folder with `pyproject.toml`


I would like to package some legacy code as a Python package named hello. It contains only one module file (hello.py) in the top-level directory, alongside some data in a folder called my_data, and I want to keep the folder structure unchanged:

hello/
|-hello.py
|-pyproject.toml
|-my_data/
  |-my_data.csv

Packaging the Python source code with the following pyproject.toml file is surprisingly simple (even without any prior knowledge of packaging), but running pip install . -vvv fails to copy the data:

[project]
name = "hello"
version = "0.1"

[tool.setuptools]
py-modules = ['hello']

[tool.setuptools.package-data]
hello = ['my_data/*']

The content of hello.py could be minimal:

def hello():
    print('Hello, world!')

I tried multiple variants of this pyproject.toml file, following the documentation at http://setuptools.pypa.io/en/stable/userguide/datafiles.html as well as a related question, "Specifying package data in pyproject.toml", but none of them copied the my_data/ folder directly into the site-packages folder (which is what I intend, though it is probably bad practice).
I also found documentation suggesting a MANIFEST.in file:

graft my_data

but this doesn't result in the data being installed alongside the code either.


Solution

  • The package-data configuration of setuptools applies only when you have a package (an importable directory, typically containing an __init__.py). You instead have a single-file module, so package-data is incompatible with your current directory structure.

    I suggest rearranging the files as follows:

    hello/
    |-pyproject.toml
    |-hello/
      |-__init__.py  # <----- previously named `hello.py`
      |-my_data.csv
    
    # pyproject.toml diff
    ...
    
      [tool.setuptools]
    - py-modules = ['hello']
    + packages = ['hello']
    
      [tool.setuptools.package-data]
    - hello = ['my_data/*']
    + "" = ['*.csv']  # <--- "" (empty string) means "all packages".
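
Once the data file lives inside the package, the code can locate it at runtime without hard-coding a site-packages path. The following is a minimal sketch, assuming Python 3.9+ and the rearranged layout above (my_data.csv directly inside the hello package). The helper name load_rows is hypothetical and not part of the original code.

```python
import csv
import io
from importlib.resources import files  # available since Python 3.9


def load_rows(package="hello", filename="my_data.csv"):
    """Return the rows of a CSV file bundled inside an importable package.

    `package` and `filename` default to the names used in the layout above;
    both are assumptions about how the project is arranged.
    """
    # files() imports the package and returns a handle to its directory,
    # wherever pip installed it.
    text = files(package).joinpath(filename).read_text(encoding="utf-8")
    # Parse the CSV text into a list of rows (each row is a list of strings).
    return list(csv.reader(io.StringIO(text)))
```

After pip install ., calling load_rows() from any working directory should return the contents of my_data.csv. If it raises FileNotFoundError, the data file was most likely not included in the installed package.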