How to fix intermittent "No nodes available to run query" errors in Trino/Presto on co-hosted nodes?
05:47 25 May 2026

I am running a Trino / PrestoDB cluster where multiple worker nodes are co-hosted on the same physical hardware. Intermittently, my queries fail with the following error:

No nodes available to run query

It appears that the coordinator loses contact with the worker nodes during spikes in resource usage and then concludes that the cluster has no active workers.
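One way this suspicion could be checked is to ask the coordinator what its failure detector currently sees for each worker. The following is only a minimal sketch under assumptions: the `/v1/node` path, the `X-Trino-User` header, and the `uri` and `recentFailureRatio` field names are what I believe the coordinator exposes, but they should be verified against the version in use, and secured clusters will need real authentication. The function names are hypothetical.

```python
import json
import urllib.request


def fetch_node_stats(coordinator_url, user="admin"):
    """Fetch per-worker heartbeat stats from the coordinator.

    Assumption: the coordinator serves a JSON list at /v1/node and accepts
    an X-Trino-User header; adjust both for your version and auth setup.
    """
    request = urllib.request.Request(
        f"{coordinator_url}/v1/node",
        headers={"X-Trino-User": user},
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def nodes_over_threshold(stats, threshold):
    """Return the URIs of workers whose recent failure ratio exceeds the threshold.

    Assumption: each entry carries 'uri' and 'recentFailureRatio' keys.
    Entries without a ratio are treated as healthy.
    """
    return [
        node["uri"]
        for node in stats
        if node.get("recentFailureRatio", 0.0) > threshold
    ]


# Example usage (hypothetical coordinator address, not executed here):
#   stats = fetch_node_stats("http://coordinator.example:8080")
#   print(nodes_over_threshold(stats, threshold=0.5))
```

If every worker shows up in that list during a spike, that would support the idea that the heartbeats themselves are failing rather than the workers actually going down.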

To mitigate this on our shared hardware, I have already tried loosening the failure detector threshold (the default of 0.1 was too aggressive) by updating my config.properties as follows:

# === Failure detector ===
# threshold is a ratio 0.0-1.0; default 0.1 is too aggressive for co-hosted nodes.
failure-detector.enabled=true
failure-detector.threshold=0.5
failure-detector.heartbeat-interval=1s
failure-detector.warmup-interval=30s

# === Discovery HTTP client ===
discovery.http-client.request-timeout=5s
discovery.http-client.idle-timeout=10s

# === Exchange HTTP client ===
exchange.http-client.connect-timeout=10s
exchange.http-client.read-timeout=30s

Raising failure-detector.threshold to 0.5 helped slightly, but the "No nodes available to run query" error still occurs under heavy load.
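For reference, this is how I understand the threshold, written as a simplified sketch. It assumes the detector compares the fraction of failed heartbeats in a recent window against failure-detector.threshold. As far as I know the real detector uses time-decayed counters rather than a fixed window, so the helper names and the 10-heartbeat window below are purely illustrative.

```python
from collections import deque


def failure_ratio(outcomes):
    """Fraction of failed heartbeats in the window (True = success, False = failure)."""
    if not outcomes:
        return 0.0
    failures = sum(1 for ok in outcomes if not ok)
    return failures / len(outcomes)


def is_marked_failed(outcomes, threshold):
    """Assumed rule: a worker is treated as failed once its ratio exceeds the threshold."""
    return failure_ratio(outcomes) > threshold


# Hypothetical window: 10 heartbeats at a 1s interval, 3 of them missed during a spike.
window = deque(
    [True, True, False, True, False, True, True, False, True, True],
    maxlen=10,
)

print(failure_ratio(window))          # 0.3
print(is_marked_failed(window, 0.1))  # True: the default threshold drops the worker
print(is_marked_failed(window, 0.5))  # False: the loosened threshold keeps it
```

Under this reading, a threshold of 0.5 only helps as long as fewer than half of the recent heartbeats fail, which would explain why the error still appears when all co-hosted workers stall at the same time.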

Are these values loose enough to avoid the "No nodes available to run query" error entirely, or are they still too tight?

configuration cluster-computing presto trino