Endpoints¶
An endpoint is a persistent service launched by the user on a compute system to serve as a conduit for executing functions on that computer. funcX supports a range of target systems, enabling an endpoint to be deployed on a laptop, the login node of a campus cluster, a cloud instance, or a Kubernetes cluster, for example.
The endpoint requires outbound network connectivity. That is, it must be able to connect to funcX at funcx.org.
The funcX endpoint is available on pypi.org (and thus installable via pip). However, we strongly recommend installing the funcX endpoint into an isolated virtual environment. Pipx automatically manages package-specific virtual environments for command-line applications, so install the funcX endpoint via:
$ python3 -m pipx install funcx_endpoint
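If pipx is not available, installing into a dedicated virtual environment with pip achieves the same isolation. The following is a minimal sketch; the environment name is arbitrary:
$ python3 -m venv funcx-endpoint-env
$ source funcx-endpoint-env/bin/activate
$ python3 -m pip install funcx_endpoint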
Note
Please note that the funcX endpoint is only supported on Linux.
After installing the funcX endpoint, use the funcx-endpoint command to manage existing endpoints.
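The subcommands used throughout this guide include, for example:
$ funcx-endpoint configure <ENDPOINT_NAME>   # create a new endpoint profile
$ funcx-endpoint start <ENDPOINT_NAME>       # start (and register) an endpoint
$ funcx-endpoint stop <ENDPOINT_NAME>        # stop a running endpoint
$ funcx-endpoint list                        # list endpoints on this system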
First time setup¶
You will be required to authenticate the first time you run funcx-endpoint. If you have authenticated previously, the endpoint will cache access tokens in the local configuration file.
funcX requires authentication in order to associate endpoints with users and ensure only authorized users can run tasks on that endpoint. As part of this step, we request access to your identity and Globus Groups.
To get started, you will first want to configure a new endpoint.
$ funcx-endpoint configure
Once you’ve run this command, a directory will be created at $HOME/.funcx and a set of default configuration files will be generated.
You can also set up auto-completion for the funcx-endpoint commands in your shell by using the command:
$ funcx-endpoint --install-completion [zsh bash fish ...]
Configuring an Endpoint¶
funcX endpoints act as gateways to diverse computational resources, including clusters, clouds, supercomputers, and even your laptop. To make the best use of your resources, the endpoint must be configured to match the capabilities of the resource on which it is deployed.
funcX provides a Python class-based configuration model that allows you to specify the shape of the resources (number of nodes, number of cores per worker, walltime, etc.) as well as allowing you to place limits on how funcX may scale the resources in response to changing workload demands.
To generate the appropriate directories and default configuration template, run the following command:
$ funcx-endpoint configure <ENDPOINT_NAME>
This command will create a profile for your endpoint in $HOME/.funcx/<ENDPOINT_NAME>/ and will instantiate a config.py file. This file should be updated with the appropriate configurations for the computational system you are targeting before you start the endpoint.
funcX is configured using a Config object. funcX uses Parsl to manage resources. For more information, see the Config class documentation and the Parsl documentation.
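For illustration, a minimal config.py that runs tasks directly on the endpoint host might look like the sketch below. It uses the LocalProvider and only options that appear in the examples later in this document; real deployments should use a provider matched to their scheduler.
from parsl.providers import LocalProvider

from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor

config = Config(
    executors=[
        HighThroughputExecutor(
            # Run at most two workers on the endpoint host
            max_workers_per_node=2,
            provider=LocalProvider(
                init_blocks=1,
                min_blocks=0,
                max_blocks=1,
            ),
        )
    ],
)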
Note
If the ENDPOINT_NAME is not specified, a default endpoint named “default” is configured.
Starting an Endpoint¶
To start a new endpoint run the following command:
$ funcx-endpoint start <ENDPOINT_NAME>
Note
If the ENDPOINT_NAME is not specified, a default endpoint named “default” is started.
Starting an endpoint will perform a registration process with funcX. The registration process provides funcX with information regarding the endpoint. The endpoint also establishes an outbound connection to RabbitMQ to retrieve tasks, send results, and communicate command information. Thus, the funcX endpoint requires outbound access to the funcX services over HTTPS (port 443) and AMQPS (port 5671).
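A quick way to sanity-check this connectivity from the machine that will host the endpoint is a short TCP probe such as the sketch below. The HTTPS host comes from this guide; the AMQPS hostname is a placeholder and must be replaced with the RabbitMQ host your endpoint actually connects to.
import socket

# Hosts to probe: funcx.org over HTTPS (from this guide) and a placeholder
# for the RabbitMQ (AMQPS) host, which is deployment specific.
targets = [
    ("funcx.org", 443),
    ("<YOUR_AMQPS_HOSTNAME>", 5671),
]

for host, port in targets:
    try:
        with socket.create_connection((host, port), timeout=10):
            print(f"OK: can reach {host}:{port}")
    except OSError as err:
        print(f"FAILED: cannot reach {host}:{port} ({err})")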
Once started, the endpoint uses a daemon process to run in the background.
Note
If the endpoint was not stopped correctly previously (e.g., after a computer restart while the endpoint was running), the endpoint directory will be cleaned up to allow a fresh start.
Warning
Only the owner of an endpoint is authorized to start it. Thus, if you register an endpoint using one identity and try to start an endpoint owned by another identity, the start will fail.
To start an endpoint using a client identity, rather than as a user, you can export the FUNCX_SDK_CLIENT_ID and FUNCX_SDK_CLIENT_SECRET environment variables. This is explained in detail in Client Credentials with FuncXClients.
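For example (a sketch; substitute your own client ID and secret):
$ export FUNCX_SDK_CLIENT_ID="<YOUR_CLIENT_ID>"
$ export FUNCX_SDK_CLIENT_SECRET="<YOUR_CLIENT_SECRET>"
$ funcx-endpoint start <ENDPOINT_NAME>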
Stopping an Endpoint¶
To stop an endpoint, run the following command:
$ funcx-endpoint stop <ENDPOINT_NAME>
If the endpoint is not running and was stopped correctly previously, this command does nothing.
If the endpoint is not running but was not stopped correctly previously (e.g., after a computer restart when the endpoint was running), this command will clean up the endpoint directory such that the endpoint can be started cleanly again.
Note
If the ENDPOINT_NAME is not specified, the default endpoint is stopped.
Warning
Run the funcx-endpoint stop command twice to ensure that the endpoint is shut down.
Listing Endpoints¶
To list available endpoints on the current system, run:
$ funcx-endpoint list
+---------------+-------------+--------------------------------------+
| Endpoint Name | Status | Endpoint ID |
+===============+=============+======================================+
| default | Active | 1e999502-b434-49a2-a2e0-d925383d2dd4 |
+---------------+-------------+--------------------------------------+
| KNL_test | Inactive | 8c01d13c-cfc1-42d9-96d2-52c51784ea16 |
+---------------+-------------+--------------------------------------+
| gpu_cluster | Initialized | None |
+---------------+-------------+--------------------------------------+
Endpoints can be in one of the following states:
Initialized: The endpoint has been created and configured, but not yet started, and is not registered with the funcX service.
Running: The endpoint is active and available for executing functions.
Stopped: The endpoint was stopped by the user. It is not running and therefore cannot service any functions. It can be started again without issues.
Disconnected: The endpoint disconnected unexpectedly. It is not running and therefore cannot service any functions. Starting this endpoint will first invoke the necessary cleanup, since it was not stopped correctly previously.
Container behaviors and routing¶
The funcX endpoint can run functions using independent Python processes or optionally inside containers. funcX supports various container technologies (e.g., docker and singularity) and different routing mechanisms for different use cases.
Raw worker processes (worker_mode=no_container):
Hard routing: All worker processes are of the same type, “RAW”. In this case, the funcX endpoint simply routes tasks to any available worker process. This is the default mode of a funcX endpoint.
Soft routing: Same as hard routing.
Kubernetes (docker):
Hard routing: Both the manager and the worker are deployed within a pod, so the manager cannot change the type of worker container. In this case, a set of managers is deployed with specific container images and the funcX endpoint simply routes tasks to the corresponding managers (matching their types).
Soft routing: NOT SUPPORTED.
Native container support (docker, singularity, shifter):
Hard routing: In this case, each manager (on a compute node) can only launch worker containers of a specific type and thus each manager can serve only one type of function.
Soft routing: When receiving a task for a specific container type, the funcX endpoint attempts to send the task to a manager that has a suitable warm container in order to minimize the total number of container cold starts. If no connected manager has a warm container of that type, the funcX endpoint chooses one manager at random to dispatch the task.
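These behaviors are selected through the executor’s container-related options. As a sketch drawn from the Midway and Theta Singularity examples later in this document, a native-container setup with soft routing sets the following options (the provider is omitted here because it is site-specific):
from funcx_endpoint.executors import HighThroughputExecutor

# Sketch only: container-related options; pair with a site-specific provider.
executor = HighThroughputExecutor(
    scheduler_mode='soft',                # soft routing
    worker_mode='singularity_reuse',      # reuse warm Singularity containers
    container_type='singularity',
    container_cmd_options="-H /home/$USER",
)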
Example configurations¶
funcX has been used on various systems around the world. Below are example configurations for commonly used systems. If you would like to add your system to this list please contact the funcX team via Slack.
Note
All configuration examples below must be customized for the user’s allocation, Python environment, file system, etc.
Blue Waters (NCSA)¶

The following snippet shows an example configuration for executing remotely on Blue Waters, a supercomputer at the National Center for Supercomputing Applications.
The configuration assumes the user is running on a login node, uses the TorqueProvider to interface with the scheduler, and uses the AprunLauncher to launch workers.
from parsl.addresses import address_by_hostname
from parsl.launchers import AprunLauncher
from parsl.providers import TorqueProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'bluewaters': {
'worker_init': 'module load bwpy;source anaconda3/etc/profile.d/conda.sh;conda activate funcx_testing_py3.7', # noqa: E501
'scheduler_options': '',
}
}
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=1,
worker_debug=False,
address=address_by_hostname(),
provider=TorqueProvider(
queue='normal',
launcher=AprunLauncher(overrides="-b -- bwpy-environ --"),
# string to prepend to #PBS blocks in the submit script
scheduler_options=user_opts['bluewaters']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load bwpy; source activate funcx env'.
worker_init=user_opts['bluewaters']['worker_init'],
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=2,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 30 minutes
walltime='00:30:00'
),
)
],
)
# fmt: on
UChicago AI Cluster¶

The following snippet shows an example configuration for the University of Chicago’s AI Cluster.
The configuration assumes the user is running on a login node and uses the SlurmProvider to interface with the scheduler and launch onto the GPUs. Link to docs.
from parsl.addresses import address_by_interface
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE CONFIGURE THESE OPTIONS BEFORE USE
NODES_PER_JOB = 2
GPUS_PER_NODE = 4
GPUS_PER_WORKER = 2
# Do not modify:
TOTAL_WORKERS = int((NODES_PER_JOB * GPUS_PER_NODE) / GPUS_PER_WORKER)
WORKERS_PER_NODE = int(GPUS_PER_NODE / GPUS_PER_WORKER)
GPU_MAP = ','.join([str(x) for x in range(1, TOTAL_WORKERS + 1)])
config = Config(
executors=[
HighThroughputExecutor(
label="fe.cs.uchicago",
worker_debug=False,
address=address_by_interface('ens2f1'),
provider=SlurmProvider(
partition='general',
# Launch 4 managers per node, each bound to 1 GPU
# This is a hack. We use hostname ; to terminate the srun command, and
# start our own
#
# DO NOT MODIFY unless you know what you are doing.
launcher=SrunLauncher(
overrides=(
f'hostname; srun --ntasks={TOTAL_WORKERS} '
f'--ntasks-per-node={WORKERS_PER_NODE} '
f'--gpus-per-task=rtx2080ti:{GPUS_PER_WORKER} '
f'--gpu-bind=map_gpu:{GPU_MAP}'
)
),
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=NODES_PER_JOB,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 30 minutes
walltime='00:30:00',
),
)],
)
# fmt: on
Midway (RCC, UChicago)¶

The Midway cluster is a campus cluster hosted by the Research Computing Center at the University of Chicago.
The snippet below shows an example configuration for executing remotely on Midway.
The configuration assumes the user is running on a login node, uses the SlurmProvider to interface with the scheduler, and uses the SrunLauncher to launch workers.
from parsl.addresses import address_by_hostname
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'midway': {
'worker_init': 'source ~/setup_funcx_test_env.sh',
'scheduler_options': '',
}
}
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=2,
worker_debug=False,
address=address_by_hostname(),
provider=SlurmProvider(
partition='broadwl',
launcher=SrunLauncher(),
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler eg: '#SBATCH --constraint=knl,quad,cache'
scheduler_options=user_opts['midway']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init=user_opts['midway']['worker_init'],
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=2,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 30 minutes
walltime='00:30:00'
),
)
],
)
# fmt: on
The following configuration is an example of using a Singularity container on Midway.
from parsl.addresses import address_by_hostname
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'midway': {
'worker_init': 'source ~/setup_funcx_test_env.sh',
'scheduler_options': '',
}
}
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=10,
address=address_by_hostname(),
scheduler_mode='soft',
worker_mode='singularity_reuse',
container_type='singularity',
container_cmd_options="-H /home/$USER",
provider=SlurmProvider(
partition='broadwl',
launcher=SrunLauncher(),
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler eg: '#SBATCH --constraint=knl,quad,cache'
scheduler_options=user_opts['midway']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init=user_opts['midway']['worker_init'],
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=2,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 30 minutes
walltime='00:30:00'
),
)
],
)
# fmt: on
Kubernetes Clusters¶

Kubernetes is an open-source system for container management that automates the deployment and scaling of containers.
The snippet below shows an example configuration for deploying pods as workers on a Kubernetes cluster.
The KubernetesProvider exploits the Python Kubernetes API, which assumes that you have a kube config in ~/.kube/config.
from parsl.addresses import address_by_route
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
from funcx_endpoint.providers.kubernetes.kube import KubernetesProvider
from funcx_endpoint.strategies import KubeSimpleStrategy
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'kube': {
'worker_init': 'pip install --force-reinstall funcx_endpoint>=0.2.0',
'image': 'python:3.8-buster',
'namespace': 'default',
}
}
config = Config(
executors=[
HighThroughputExecutor(
label='Kubernetes_funcX',
max_workers_per_node=1,
address=address_by_route(),
scheduler_mode='hard',
container_type='docker',
strategy=KubeSimpleStrategy(max_idletime=3600),
provider=KubernetesProvider(
init_blocks=0,
min_blocks=0,
max_blocks=2,
init_cpu=1,
max_cpu=4,
init_mem="1024Mi",
max_mem="4096Mi",
image=user_opts['kube']['image'],
worker_init=user_opts['kube']['worker_init'],
namespace=user_opts['kube']['namespace'],
incluster_config=False,
),
)
],
heartbeat_period=15,
heartbeat_threshold=200,
log_dir='.',
)
# fmt: on
Theta (ALCF)¶

The following snippet shows an example configuration for executing on Argonne Leadership Computing Facility’s Theta supercomputer. This example uses the HighThroughputExecutor and connects to Theta’s Cobalt scheduler using the CobaltProvider. This configuration assumes that the script is being executed on the login nodes of Theta.
from parsl.addresses import address_by_interface
from parsl.launchers import AprunLauncher
from parsl.providers import CobaltProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'theta': {
'worker_init': 'source ~/setup_funcx_test_env.sh',
'scheduler_options': '',
# Specify the account/allocation to which jobs should be charged
'account': '<YOUR_THETA_ALLOCATION>'
}
}
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=1,
worker_debug=False,
address=address_by_interface('vlan2360'),
provider=CobaltProvider(
queue='debug-flat-quad',
account=user_opts['theta']['account'],
launcher=AprunLauncher(overrides="-d 64"),
# string to prepend to #COBALT blocks in the submit
# script to the scheduler eg: '#COBALT -t 50'
scheduler_options=user_opts['theta']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate funcx_env'.
worker_init=user_opts['theta']['worker_init'],
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=2,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 30 minutes
walltime='00:30:00'
),
)
],
)
# fmt: on
The following configuration is an example of using a Singularity container on Theta.
from parsl.addresses import address_by_interface
from parsl.launchers import AprunLauncher
from parsl.providers import CobaltProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'theta': {
'worker_init': 'source ~/setup_funcx_test_env.sh',
'scheduler_options': '',
# Specify the account/allocation to which jobs should be charged
'account': '<YOUR_THETA_ALLOCATION>'
}
}
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=1,
worker_debug=False,
address=address_by_interface('vlan2360'),
scheduler_mode='soft',
worker_mode='singularity_reuse',
container_type='singularity',
container_cmd_options="-H /home/$USER",
provider=CobaltProvider(
queue='debug-flat-quad',
account=user_opts['theta']['account'],
launcher=AprunLauncher(overrides="-d 64"),
# string to prepend to #COBALT blocks in the submit
# script to the scheduler eg: '#COBALT -t 50'
scheduler_options=user_opts['theta']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate funcx_env'.
worker_init=user_opts['theta']['worker_init'],
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=2,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 30 minutes
walltime='00:30:00'
),
)
],
)
# fmt: on
Cooley (ALCF)¶

The following snippet shows an example configuration for executing on Argonne Leadership Computing Facility’s Cooley cluster. This example uses the HighThroughputExecutor and connects to Cooley’s Cobalt scheduler using the CobaltProvider. This configuration assumes that the script is being executed on the login nodes of Cooley.
from parsl.addresses import address_by_interface
from parsl.launchers import MpiExecLauncher
from parsl.providers import CobaltProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'cooley': {
'worker_init': 'source ~/setup_funcx_test_env.sh',
'scheduler_options': '',
# Specify the account/allocation to which jobs should be charged
'account': '<YOUR_COOLEY_ALLOCATION>'
}
}
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=2,
worker_debug=False,
address=address_by_interface('ib0'),
provider=CobaltProvider(
queue='default',
account=user_opts['cooley']['account'],
launcher=MpiExecLauncher(),
# string to prepend to #COBALT blocks in the submit
# script to the scheduler eg: '#COBALT -t 50'
scheduler_options=user_opts['cooley']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate funcx_env'.
worker_init=user_opts['cooley']['worker_init'],
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=2,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 30 minutes
walltime='00:30:00',
),
)
],
)
# fmt: on
Polaris (ALCF)¶

The following snippet shows an example configuration for executing on Argonne Leadership Computing Facility’s Polaris cluster. This example uses the HighThroughputExecutor and connects to Polaris’s PBS scheduler using the PBSProProvider. This configuration assumes that the script is being executed on the login node of Polaris.
from parsl.addresses import address_by_interface
from parsl.launchers import SingleNodeLauncher
from parsl.providers import PBSProProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
from funcx_endpoint.strategies import SimpleStrategy
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'polaris': {
# Node setup: activate necessary conda environment and such.
'worker_init': '',
'scheduler_options': '#PBS -l filesystems=home:grand:eagle\n#PBS -k doe',
# ALCF allocation to use
'account': '',
}
}
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=1,
strategy=SimpleStrategy(max_idletime=300),
address=address_by_interface('bond0'),
provider=PBSProProvider(
launcher=SingleNodeLauncher(),
account=user_opts['polaris']['account'],
queue='preemptable',
cpus_per_node=32,
select_options='ngpus=4',
worker_init=user_opts['polaris']['worker_init'],
scheduler_options=user_opts['polaris']['scheduler_options'],
walltime='01:00:00',
nodes_per_block=1,
init_blocks=0,
min_blocks=0,
max_blocks=2,
),
)
],
)
# fmt: on
Cori (NERSC)¶

The following snippet shows an example configuration for accessing NERSC’s Cori supercomputer. This example uses the HighThroughputExecutor and connects to Cori’s Slurm scheduler.
It is configured to request 2 nodes per block, each running up to 2 workers. Finally, it includes scheduler options to request a particular node type (KNL, via the Slurm constraint shown below) and a worker_init command to configure a specific Python environment on the worker nodes.
from parsl.addresses import address_by_interface
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'cori': {
'worker_init': 'source ~/setup_funcx_test_env.sh',
'scheduler_options': '#SBATCH --constraint=knl,quad,cache',
}
}
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=2,
worker_debug=False,
address=address_by_interface('bond0.144'),
provider=SlurmProvider(
partition='debug', # Partition / QOS
# We request all hyperthreads on a node.
launcher=SrunLauncher(overrides='-c 272'),
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler eg: '#SBATCH --constraint=knl,quad,cache'
scheduler_options=user_opts['cori']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init=user_opts['cori']['worker_init'],
# Increase timeout as Cori's scheduler may be slow
# to respond
cmd_timeout=120,
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=2,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 10 minutes
walltime='00:10:00',
),
),
],
)
# fmt: on
Perlmutter (NERSC)¶

The following snippet shows an example configuration for accessing NERSC’s Perlmutter supercomputer. This example uses the HighThroughputExecutor and connects to Perlmutter’s Slurm scheduler.
It is configured to request 2 nodes per block. Finally, it includes scheduler options to request GPU nodes (via the Slurm constraint shown below) and a worker_init command to configure a specific Python environment on the worker nodes.
Note
Please run module load cgpu prior to executing funcx-endpoint start <endpoint_name> on the Cori login nodes to access the Perlmutter queues.
from parsl.addresses import address_by_interface
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'perlmutter': {
'worker_init': 'source ~/setup_funcx_test_env.sh',
'scheduler_options': '#SBATCH -C gpu'
}
}
config = Config(
executors=[
HighThroughputExecutor(
worker_debug=False,
address=address_by_interface('nmn0'),
provider=SlurmProvider(
partition='GPU', # Partition / QOS
# We request all hyperthreads on a node.
launcher=SrunLauncher(overrides='-c 272'),
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler eg: '#SBATCH --constraint=gpu'
scheduler_options=user_opts['perlmutter']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init=user_opts['perlmutter']['worker_init'],
# Slurm scheduler on Cori can be slow at times,
# increase the command timeouts
cmd_timeout=120,
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=2,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 10 minutes
walltime='00:10:00',
),
),
],
)
# fmt: on
Frontera (TACC)¶

The following snippet shows an example configuration for accessing the Frontera system at TACC. The configuration below assumes that the user is running on a login node, uses the SlurmProvider to interface with the scheduler, and uses the SrunLauncher to launch workers.
from parsl.addresses import address_by_interface
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'frontera': {
'worker_init': 'source ~/setup_funcx_test_env.sh',
'account': 'EAR22001',
'partition': 'development',
'scheduler_options': '',
}
}
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=2,
worker_debug=False,
address=address_by_interface('ib0'),
provider=SlurmProvider(
account=user_opts['frontera']['account'],
partition=user_opts['frontera']['partition'],
launcher=SrunLauncher(),
# Enter scheduler_options if needed
scheduler_options=user_opts['frontera']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init=user_opts['frontera']['worker_init'],
# Add extra time for slow scheduler responses
cmd_timeout=60,
# Scale between 0-1 blocks with 2 nodes per block
nodes_per_block=2,
init_blocks=0,
min_blocks=0,
max_blocks=1,
# Hold blocks for 30 minutes
walltime='00:30:00',
),
)
],
)
# fmt: on
Bebop (LCRC, ANL)¶

The following snippet shows an example configuration for accessing the Bebop system at Argonne’s LCRC. The configuration below assumes that the user is running on a login node, uses the SlurmProvider to interface with the scheduler, and uses the SrunLauncher to launch workers.
from parsl.addresses import address_by_interface
from parsl.launchers import SrunLauncher
from parsl.providers import SlurmProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
# fmt: off
# PLEASE UPDATE user_opts BEFORE USE
user_opts = {
'bebop': {
'worker_init': '',
'scheduler_options': '',
'partition': 'bdws',
}
}
config = Config(
executors=[
HighThroughputExecutor(
address=address_by_interface('ib0'),
provider=SlurmProvider(
partition=user_opts['bebop']['partition'],
launcher=SrunLauncher(),
nodes_per_block=1,
init_blocks=1,
# string to prepend to #SBATCH blocks in the submit
# script to the scheduler eg: '#SBATCH --constraint=knl,quad,cache'
scheduler_options=user_opts['bebop']['scheduler_options'],
# Command to be run before starting a worker, such as:
# 'module load Anaconda; source activate parsl_env'.
worker_init=user_opts['bebop']['worker_init'],
min_blocks=0,
max_blocks=1,
walltime='00:30:00'
),
)
],
)
# fmt: on
Pinning Workers to devices¶
Many modern clusters provide multiple accelerators per compute node, yet many applications are best suited to using a single accelerator per task. funcX supports pinning each worker to a different accelerator using the available_accelerators option of the HighThroughputExecutor. Provide either the number of accelerators (funcX will assume they are numbered as integers starting from zero) or a list of the names of the accelerators available on the node. Each funcX worker will have the following environment variables set to the identity of the accelerator assigned to it: CUDA_VISIBLE_DEVICES, ROCR_VISIBLE_DEVICES, SYCL_DEVICE_FILTER.
# fmt: off
from parsl.providers import LocalProvider
from funcx_endpoint.endpoint.utils.config import Config
from funcx_endpoint.executors import HighThroughputExecutor
config = Config(
executors=[
HighThroughputExecutor(
max_workers_per_node=4,
# `available_accelerators` may be a natural number or a list of strings.
# If an integer, then each worker launched will have an automatically
# generated environment variable. In this case, one of 0, 1, 2, or 3.
# Alternatively, specific strings may be utilized.
available_accelerators=4,
# available_accelerators=['opencl:gpu:1', 'opencl:gpu:2'] # alternative
provider=LocalProvider(
init_blocks=1,
min_blocks=0,
max_blocks=1,
),
)
],
)
# fmt: on