Description
There is a persistent need for R packages that depend on reticulate to automatically set up a Python environment. While reticulate::configure_environment() aimed to enable this, it had design decisions that caused issues in practice and is now soft-deprecated. However, the need still exists, and reticulate should aim to make this possible. There are several approaches to achieve this, possibly by modifying reticulate::configure_environment() or designing a new approach.
Issues with reticulate::configure_environment():
- Conda as the default environment: Using Conda by default often caused binary incompatibilities for R users. For example, this issue is consistently the most visited via Google searches. New solutions should default to virtual environments instead.
- Installing into a shared environment: Automatically installing all package dependencies into a shared Python environment caused conflicts when two R packages required incompatible Python packages. This led to a frustrating failure loop where each triggered installation could break a previously functioning setup. Additionally, no output was shown to indicate what commands were executed or how to disable the "automagic" behavior. Users had to discover the RETICULATE_AUTOCONFIGURE=FALSE environment variable on their own.
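For reference, the opt-out mentioned in the last item can be written as below. This is a minimal sketch: the variable name is the one quoted above, while `somepackage` is a hypothetical placeholder for any R package that triggers automatic configuration.

```r
# Opt out of reticulate's automatic environment configuration.
# Environment variables are stored as strings, so the value is quoted here.
Sys.setenv(RETICULATE_AUTOCONFIGURE = "FALSE")

# The variable has to be set before the dependent package is loaded.
# library(somepackage)  # hypothetical package name, left commented out
```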
Proposed solutions:
- Isolated environments: R packages should install dependencies into a stand-alone Python environment specific to that R package. To combine the dependencies of multiple packages, users could opt in with a function like configure_environment(packages = c("pkg1", "pkg2")).
- More visible output: Informative messages should clearly indicate which commands are being run, helping users diagnose issues.
- Automatic installation messages: Whenever an automatic install is triggered, a message should inform users how to disable this behavior, e.g., "You can disable automatic install by setting Sys.setenv(RETICULATE_AUTOCONFIGURE=FALSE) before loading packagefoo."
- Selective automatic installation: Only trigger automatic installation if the newly installed environment would actually be used in the current R session. For instance, do not trigger an installation if any of conditions 1-7 in reticulate's order of Python discovery are true.
- Dynamic Python package dependencies and post-install steps: The current implementation of configure_environment() requires Python package names to be provided as a static list, which proved too rigid in practice. For example, the TensorFlow R package requires different Python packages depending on the platform (tensorflow, tensorflow-gpu, or tensorflow-metal on macOS). Additionally, after installing the Python package, TensorFlow needs to set up symlinks for GPU functionality (as described here), which the current configure_environment() does not support.
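The proposals above can be sketched in base R as follows. Every function name here is hypothetical and none of them is part of reticulate. The platform-to-package mapping is illustrative only, the way RETICULATE_AUTOCONFIGURE is interpreted is an assumption, and RETICULATE_PYTHON merely stands in for the full list of discovery conditions. Nothing is installed; the sketch only decides what would happen and prints it.

```r
# Hypothetical helpers; none of these are part of reticulate.

# Per-package environment name, so two R packages never share an environment.
isolated_envname <- function(pkg) {
  paste0("r-", pkg)
}

# Decide at install time which Python packages to request, instead of
# declaring a static list. The mapping is illustrative, not TensorFlow's rule.
select_python_packages <- function(sysname = Sys.info()[["sysname"]],
                                   gpu = FALSE) {
  if (identical(sysname, "Darwin")) {
    c("tensorflow", "tensorflow-metal")
  } else if (isTRUE(gpu)) {
    "tensorflow-gpu"
  } else {
    "tensorflow"
  }
}

# Skip the automatic install when the user opted out, or when the user has
# already pointed reticulate at a Python (assumed stand-in for the full
# order-of-discovery check).
should_autoinstall <- function(
    autoconfigure = Sys.getenv("RETICULATE_AUTOCONFIGURE", "TRUE"),
    pinned_python = Sys.getenv("RETICULATE_PYTHON")) {
  !identical(toupper(autoconfigure), "FALSE") && !nzchar(pinned_python)
}

# Message shown before anything is installed: what will happen, and how to
# turn it off.
autoinstall_message <- function(pkg, packages) {
  paste0(
    "Installing ", paste(packages, collapse = ", "),
    " into environment '", isolated_envname(pkg), "'.\n",
    "You can disable automatic install by setting ",
    "Sys.setenv(RETICULATE_AUTOCONFIGURE=FALSE) before loading ", pkg, "."
  )
}

if (should_autoinstall()) {
  message(autoinstall_message("tensorflow", select_python_packages()))
}
```

Keeping the decision (`should_autoinstall()`) separate from the reporting (`autoinstall_message()`) means the opt-out hint is always printed alongside the action it describes.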
Ideal solution:
An ideal implementation could specify this setup in an R package’s DESCRIPTION file, with flexible syntax options like:
Package: tensorflow
Type: Package
...
Config/reticulate/r-command: install_tensorflow(envname = "r-tensorflow")
or
Config/reticulate/pip-requirements: tensorflow==2.16.*, keras>=3
This would allow for dynamic post-install actions, like configuring symlinks for GPU support, and future syntax options could be added based on community feedback.
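As a rough illustration of how such fields could be consumed, the sketch below reads them with base R's read.dcf(). Neither field is supported by reticulate today; the field names are the ones proposed above, and `read_reticulate_config()` is a hypothetical helper. Both fields appear in one file only for demonstration, whereas the proposal presents them as alternatives. The command is parsed but deliberately not evaluated.

```r
# Throwaway DESCRIPTION file carrying both proposed (not yet supported) fields.
desc_path <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: tensorflow",
  "Type: Package",
  "Config/reticulate/r-command: install_tensorflow(envname = \"r-tensorflow\")",
  "Config/reticulate/pip-requirements: tensorflow==2.16.*, keras>=3"
), desc_path)

# Hypothetical helper: extract the proposed fields from a DESCRIPTION file.
read_reticulate_config <- function(path) {
  fields <- c("Config/reticulate/r-command",
              "Config/reticulate/pip-requirements")
  # read.dcf() returns a character matrix; absent fields come back as NA.
  dcf <- read.dcf(path, fields = fields)
  command <- dcf[[1, fields[1]]]
  reqs <- dcf[[1, fields[2]]]
  list(
    # Parse the command into an unevaluated call; running it is left to
    # whatever policy decides that an automatic install is appropriate.
    command = if (is.na(command)) NULL else str2lang(command),
    # Split the comma-separated requirement specifiers.
    requirements = if (is.na(reqs)) character() else
      trimws(strsplit(reqs, ",", fixed = TRUE)[[1]])
  )
}

cfg <- read_reticulate_config(desc_path)
print(cfg$requirements)
print(cfg$command)
```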