Title:
Best way to scale LiveKit Egress for recordings (private meetings + livestream platform)?
Post:
Hi everyone,
I’m building a live streaming + private meeting platform and looking for some architecture advice around scaling LiveKit egress.
Current stack
Angular frontend
.NET backend
Self-hosted LiveKit server running on an Ubuntu EC2 instance
Redis for coordination
AWS infrastructure (EC2 / containerized services)
Recording use cases
The platform supports two types of sessions:
Private meetings → using RoomComposite Egress
Livestream classes → using Participant Egress (record only instructor)
Recording is optional and triggered by the instructor, so demand can vary a lot. For example, multiple instructors might start recordings at the same time.
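For context, the trigger itself is one server-side call per session type. Here’s a minimal sketch of how I’m thinking about it. The EgressLike interface mirrors livekit-server-sdk’s EgressClient methods (startRoomCompositeEgress / startParticipantEgress) without importing the SDK, the output is loosely typed to stand in for the SDK’s EncodedFileOutput, and the S3 key scheme is just an example of mine:

```typescript
// Sketch: picking the egress type per session. EgressLike mirrors
// livekit-server-sdk's EgressClient shape; our .NET backend would issue
// the equivalent requests through the Egress API.
type SessionKind = 'private-meeting' | 'livestream-class';

interface EgressLike {
  startRoomCompositeEgress(roomName: string, output: unknown): Promise<{ egressId: string }>;
  startParticipantEgress(roomName: string, identity: string, output: unknown): Promise<{ egressId: string }>;
}

// Hypothetical S3 key scheme for recordings.
export function recordingPath(kind: SessionKind, roomName: string, startedAt: Date): string {
  const day = startedAt.toISOString().slice(0, 10); // YYYY-MM-DD
  return `recordings/${kind}/${day}/${roomName}.mp4`;
}

// Private meetings record the whole room (heavier, full composite);
// livestream classes record only the instructor (one participant).
export async function startRecording(
  egress: EgressLike,
  kind: SessionKind,
  roomName: string,
  instructorIdentity: string,
): Promise<string> {
  const output = { filepath: recordingPath(kind, roomName, new Date()) };
  const info =
    kind === 'private-meeting'
      ? await egress.startRoomCompositeEgress(roomName, output)
      : await egress.startParticipantEgress(roomName, instructorIdentity, output);
  return info.egressId;
}
```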
The problem I'm trying to solve
I haven’t implemented autoscaling yet, and I’m trying to design the right architecture before moving forward.
My main concern is handling situations where many recordings start at once. Since egress workers handle the recording jobs, I want to avoid requests failing or timing out because there isn’t enough capacity.
What I'm trying to achieve
Ideally the system should:
Scale egress workers automatically when recording demand increases
Scale down when idle to save infrastructure cost
Handle bursts where many recordings start simultaneously
Support both RoomComposite and Participant egress jobs efficiently
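One thing I’ve noticed that seems relevant to the last point: the egress service can weight job types differently via cpu_cost settings in its config, so a worker advertises less spare capacity for a RoomComposite job than a Participant one. A rough config fragment of what I mean (key names are my reading of the egress config docs, and the numbers are placeholders, so please correct me if these are off):

```yaml
# Illustrative egress config fragment; values are placeholders.
ws_url: wss://livekit.example.com
redis:
  address: redis:6379
# Estimated CPU per job type, used when deciding whether a worker
# can accept another request. Participant egress is much cheaper
# than a full room composite.
cpu_cost:
  room_composite_cpu_cost: 3.0
  participant_cpu_cost: 1.0
```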
Questions
For anyone running LiveKit in production:
What is the recommended way to scale LiveKit egress workers?
Should scaling be based on:
CPU usage
number of active recordings
pending egress jobs
pipelines per worker
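To make the comparison concrete: a job-count-derived target is easy to compute (e.g. from ListEgress results) and reacts faster than CPU, which lags behind burst starts. Here’s the kind of signal I’m imagining; the status names and the pipelinesPerWorker / headroom parameters are my own simplification, not LiveKit terms:

```typescript
// Sketch: a scale target derived from egress job counts instead of CPU.
// Status names are simplified stand-ins for the SDK's EgressStatus values.
type JobStatus = 'starting' | 'active' | 'ending' | 'complete' | 'failed';

export function desiredWorkerCount(
  jobs: JobStatus[],
  pipelinesPerWorker: number, // concurrent jobs one worker can run
  minWorkers = 1,             // warm capacity kept while idle
  headroom = 1,               // spare workers so a burst never queues
): number {
  const busy = jobs.filter(s => s === 'starting' || s === 'active').length;
  return Math.max(minWorkers, Math.ceil(busy / pipelinesPerWorker) + headroom);
}
```

So with 7 busy jobs and 2 pipelines per worker, the target would be ceil(7/2) + 1 = 5 workers, and it never drops below minWorkers while idle.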
Has anyone implemented autoscaling egress workers successfully on AWS (ECS / EC2 / Kubernetes)?
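On the Kubernetes side, my current thinking is a plain CPU-based HPA as a starting point, something like the fragment below. The Deployment name is an assumption of mine (not from LiveKit’s Helm chart), and my understanding is that scale-down needs care because the egress process finishes in-flight recordings on shutdown, so a long stabilization window and a generous terminationGracePeriodSeconds seem necessary; corrections welcome:

```yaml
# Illustrative only: CPU-based HPA for an egress Deployment whose name
# ("livekit-egress") is assumed, not taken from the official chart.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: livekit-egress
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: livekit-egress
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 600  # avoid churning workers between bursts
```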
If LiveKit server load increases (many rooms), how do you typically scale the LiveKit media servers alongside egress workers?
I’m still in the architecture design stage, so any suggestions, reference architectures, or lessons learned would be really helpful.
Thanks!