This is just a small bit of research to figure out how to best use the
importlib.resources system and provide example code that has been
confirmed (through tests) to work, including with resources that are
not present as files in the filesystem (i.e., in a ZIP file library).
We're actually testing the backport importlib-resources from PyPI
(imported as importlib_resources instead of importlib.resources)
because using this gives us the same API everywhere, rather than the mess
of various APIs and deprecations we've seen in Pythons 3.7 through
3.12, when the API finally settled down.
We could also add tests showing how to use importlib.resources in ways
that will work across various versions of Python, but this doesn't seem
worthwhile as it's fairly difficult. (E.g., APIs that are good in Pythons
3.7–3.10 and ≥ 3.13 are deprecated in 3.11 and 3.12, so one would want to
suppress the warning messages about those there, or have some flag that
uses different APIs depending on the version of Python.)
Run ./Test to run the tests; the -c option will do a clean build
(rebuilding the virtualenv, etc.).
pylib contains the sample library code and the resources; client
contains tests that import and call the library.
The documentation for the importlib.resources API is found at:
- Using importlib_resources: General usage overview.
- importlib.resources – Package resource reading, opening and access: API details for the current version of Python. (Use the version number drop-down menu at the upper left of the page to see APIs for other versions of Python.)
- importlib.resources.abc.Traversable: API of the Traversable object (which offers a subset of the pathlib.Path interface) returned by importlib_resources.files().
Most functions take an anchor (type importlib_resources.Anchor), a
start point for a tree of resources, which may be either a module object or
a module name as a string (Union[str, ModuleType]).
- If no anchor is supplied, the current module is used. (Since 3.12.)
- If the anchor is an (import) package, that package is used as the root of
  the resource tree. (Since 3.9. Pre-3.12 the param name was package.)
- If the anchor is a non-package module (e.g., foo read from foo.py) the
  "directory" containing the module is the root of the resource tree (i.e.,
  the resources are adjacent to the module, not below it). (Since 3.12.)
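The anchor forms listed above can be sketched roughly as follows. This is only an illustration, not code from this repository: the package name anchor_demo and its files are hypothetical and are created in a temporary directory, and the fallback to the standard library's importlib.resources is an assumption made so the sketch runs without the backport installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

try:
    import importlib_resources            # the PyPI backport, if installed
except ImportError:                       # assumption: fall back to the stdlib
    from importlib import resources as importlib_resources

# Build a throwaway package (hypothetical names, not cleaned up here):
#   anchor_demo/__init__.py, anchor_demo/mod.py, anchor_demo/data.txt
tmp = tempfile.mkdtemp()
pkg_dir = Path(tmp, 'anchor_demo')
pkg_dir.mkdir()
pkg_dir.joinpath('__init__.py').write_text('')
pkg_dir.joinpath('mod.py').write_text('')
pkg_dir.joinpath('data.txt').write_bytes(b'hello\n')
sys.path.insert(0, tmp)
importlib.invalidate_caches()

# Anchor given as a module name (a string).
by_name = importlib_resources.files('anchor_demo')
text_by_name = by_name.joinpath('data.txt').read_text(encoding='utf-8')

# Anchor given as a module object.
anchor_demo = importlib.import_module('anchor_demo')
by_object = importlib_resources.files(anchor_demo)
text_by_object = by_object.joinpath('data.txt').read_text(encoding='utf-8')

# Anchor that is a non-package module: resources are looked up next to
# mod.py. Implementations older than those described above raise TypeError.
try:
    adjacent = importlib_resources.files('anchor_demo.mod')
    text_adjacent = adjacent.joinpath('data.txt').read_text(encoding='utf-8')
except TypeError:
    text_adjacent = None
```

With a recent backport or Python 3.12 and later, all three lookups should find the same data.txt.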
The core API became available in Python 3.9, but is usable in any version
of Python ≥ 3.7 with the importlib_resources package.
The primary core API function is
importlib_resources.files(anchor: Anchor | None = None). This is the only
function where (as of Python 3.12 or in the compatibility
importlib_resources package) the anchor is optional, defaulting to the
current module. On the Traversable return value you may:
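- Navigate the resource tree using Path.joinpath().
- Iterate directories using Path.iterdir(), Path.glob(), Path.rglob(), and
  Path.walk(). (Only iterdir() is part of the Traversable interface; the
  others are available only when the underlying object provides them, as
  pathlib.Path does.)
- Open files using Path.open().
- Read file contents using Path.read_text() or Path.read_bytes().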
(Forward slashes in path components given to these functions work as directory separators on all systems, including Windows.)
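As a rough illustration of these operations on resources that are not files in the filesystem, the sketch below builds a small ZIP "library", imports a package from it and reads its resources through files(). The package name zip_demo and its contents are hypothetical, and the standard library fallback import is an assumption made so the sketch runs without the backport.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

try:
    import importlib_resources            # the PyPI backport, if installed
except ImportError:                       # assumption: fall back to the stdlib
    from importlib import resources as importlib_resources

# Build a ZIP file holding a hypothetical package with two resources.
# (The temporary directory is not cleaned up in this sketch.)
tmpdir = tempfile.mkdtemp()
zip_path = Path(tmpdir, 'zip_demo_lib.zip')
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('zip_demo/__init__.py', '')
    zf.writestr('zip_demo/data/a.txt', 'alpha\n')
    zf.writestr('zip_demo/data/b.txt', 'beta\n')
sys.path.insert(0, str(zip_path))         # import directly from the ZIP file
importlib.invalidate_caches()

root = importlib_resources.files('zip_demo')

# Navigate with joinpath() and list a "directory" with iterdir().
data = root.joinpath('data')
names = sorted(entry.name for entry in data.iterdir())

# A forward slash inside a single path component also works.
alpha = root.joinpath('data/a.txt').read_text(encoding='utf-8')

# Open a resource as a binary stream.
with data.joinpath('b.txt').open('rb') as f:
    beta = f.read()
```

Nothing is extracted to disk here; the reads go straight to the archive.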
You may not, however, assume that the return value refers to a path in the
file system. If the package was loaded from a ZIP file, for example, it
will be a reference to the ZIP file and path within it. When you need a
reference to a directory or file in the filesystem, use as_file() to
temporarily generate one (if necessary) from a files() return value. Any
temporary file or directory will be cleaned up when you exit the with
context.
    import importlib_resources

    t = importlib_resources.files()
    with importlib_resources.as_file(t) as p:
        ...     # operations on `Path` object `p` here
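To make the temporary extraction visible, here is a minimal sketch that applies as_file() to a single resource inside a ZIP file. The package name asfile_demo and the resource config.txt are hypothetical, and the standard library fallback import is an assumption made so the sketch runs without the backport. It uses a file rather than a directory, since directory support in as_file() varies between versions.

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

try:
    import importlib_resources            # the PyPI backport, if installed
except ImportError:                       # assumption: fall back to the stdlib
    from importlib import resources as importlib_resources

# Hypothetical package stored in a ZIP file, so its resources have no
# path of their own in the filesystem.
tmpdir = tempfile.mkdtemp()
zip_path = Path(tmpdir, 'asfile_demo_lib.zip')
with zipfile.ZipFile(zip_path, 'w') as zf:
    zf.writestr('asfile_demo/__init__.py', '')
    zf.writestr('asfile_demo/data/config.txt', 'key=value\n')
sys.path.insert(0, str(zip_path))
importlib.invalidate_caches()

resource = importlib_resources.files('asfile_demo').joinpath('data/config.txt')

with importlib_resources.as_file(resource) as p:
    # Inside the block, p is a real pathlib.Path to a temporary copy.
    is_real_path = isinstance(p, Path)
    existed_inside = p.is_file()
    content = p.read_text(encoding='utf-8')

# After the block the temporary copy has been removed again.
existed_after = p.exists()
```

For a package installed as ordinary files, as_file() should hand back the existing path without copying anything.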
The "functional" API is just a bit of syntactic sugar over the core API
above. It suffers from two issues: it always requires an Anchor argument
(you cannot use None) and it's been on-and-off deprecated over time.
However, these functions do work in older versions of Python if you're not
using importlib_resources: they were introduced in 3.7; the core API
above not until 3.9. Note that these were deprecated in Python 3.11, and
undeprecated in Python 3.13.
To have these functions use resources relative to the current module, you
cannot pass None (as you can with importlib_resources.files()), so you
need to pass (generally) __name__ to do that.
The following functions do not need to extract anything to the filesystem:
- is_resource(anchor, *path_components). Note that directories are not considered to be resources.
- read_binary(anchor, *path_components). Returns bytes.
- open_binary(anchor, *path_components). Returns BinaryIO.
- read_text(anchor, *path_components, encoding='utf-8', errors='strict'). Returns str.
- open_text(anchor, *path_components, encoding='utf-8', errors='strict'). Returns TextIO. Always give the encoding parameter name explicitly, or the third argument will be taken as encoding. (Until Python 3.15.)
Two further functions:
- path(anchor, *path_components). Context manager returning a pathlib.Path. May extract some dirs/files to the filesystem, cleaning them up after the context manager exits.
- contents(anchor, *path_components). Still deprecated since 3.11; use iterdir().
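A short sketch of the functional API in use follows. The package func_demo is hypothetical and is created in a temporary directory. Falling back to the standard library when the backport is missing (or lacks these functions) is an assumption, and on Pythons where these functions are deprecated the calls will emit DeprecationWarning. Each call passes one path component and names encoding explicitly, which should behave the same way across versions.

```python
import importlib
import sys
import tempfile
from pathlib import Path

try:
    from importlib_resources import is_resource, path, read_binary, read_text
except ImportError:                       # assumption: fall back to the stdlib
    from importlib.resources import is_resource, path, read_binary, read_text

# Hypothetical package: func_demo/__init__.py, func_demo/data.txt and an
# empty directory func_demo/sub. (Not cleaned up in this sketch.)
tmp = tempfile.mkdtemp()
pkg = Path(tmp, 'func_demo')
pkg.joinpath('sub').mkdir(parents=True)
pkg.joinpath('__init__.py').write_text('')
pkg.joinpath('data.txt').write_bytes(b'hello\n')
sys.path.insert(0, tmp)
importlib.invalidate_caches()

# An anchor is always required; code inside the package itself would
# generally pass __name__ here instead of a literal.
anchor = 'func_demo'

has_file = is_resource(anchor, 'data.txt')
has_dir = is_resource(anchor, 'sub')      # directories are not resources
text = read_text(anchor, 'data.txt', encoding='utf-8')
raw = read_binary(anchor, 'data.txt')

# path() is a context manager yielding a pathlib.Path.
with path(anchor, 'data.txt') as p:
    via_path = p.read_text(encoding='utf-8')
```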
Command-line programs that take arguments of files to read can be more quickly tested by using input files stored as resources without either extracting the resources to the filesystem or starting a subprocess to run the program.
It's generally advisable to avoid using global variables in modules related to these programs, especially if you call them more than once or also unit test any of these modules, since globals are initialised only when the module is first loaded and will not be reset to their original values between calls.
The standard pattern is to provide a main() function for your program
that takes optional parameters to override the arguments and file handles
for any files you need to read:
    import sys
    from argparse import ArgumentParser

    def main(argv1=None, input_override=None):
        args = parseargs(argv1)
        ...
        if input_override is not None:
            f = input_override          # e.g. a stream opened from a resource
        elif args.input == '-':
            f = sys.stdin.buffer
        else:
            try:
                f = open(args.input, 'rb')
            except FileNotFoundError as ex:
                print(str(ex))
                sys.exit(1)

    def parseargs(argv1):       # `None` to use `sys.argv[1:]`
        p = ArgumentParser()
        ...
        return p.parse_args(argv1)
This can then be called from the tests with:
    def test_command_line_program(capsys):
        main(argv1=['somefile'],
            input_override=resfiles().joinpath('somefile').open('rb'))
        stdout, stderr = capsys.readouterr()
        assert ('expected output', 'expected err') == (stdout, stderr)
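The pattern can be tried end to end with the self-contained sketch below. It is a stand-in rather than the code from this repository: the program simply counts lines, the positional argument name input is an assumption, and an io.BytesIO object takes the place of the binary stream that opening a resource with open('rb') would provide. Output is captured with contextlib.redirect_stdout instead of pytest's capsys so that the sketch runs on its own.

```python
import contextlib
import io
import sys
from argparse import ArgumentParser


def parseargs(argv1):                     # None means use sys.argv[1:]
    p = ArgumentParser()
    p.add_argument('input', help="file to read, or '-' for stdin")
    return p.parse_args(argv1)


def main(argv1=None, input_override=None):
    args = parseargs(argv1)
    opened_here = False
    if input_override is not None:
        f = input_override                # the test supplies the stream
    elif args.input == '-':
        f = sys.stdin.buffer
    else:
        f = open(args.input, 'rb')
        opened_here = True
    try:
        count = sum(1 for _ in f)         # hypothetical work: count lines
    finally:
        if opened_here:                   # only close what main() opened
            f.close()
    print(f'{args.input}: {count} lines')


# In a real test the stream would come from a resource; BytesIO stands in.
fake_resource = io.BytesIO(b'one\ntwo\nthree\n')
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    main(argv1=['somefile'], input_override=fake_resource)
output = captured.getvalue()
```

Because the stream is passed in, the file named in argv1 never has to exist on disk and no subprocess is started.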
The command line program itself, built by the distribution packaging
system, calls main() with no parameters, thus using sys.argv[1:] and
opening the filename given on the command line. This is usually configured
in pyproject.toml:
    [project.scripts]
    myprogram = 'mypackage.cli.myprogram:main'