
The benchmark was run on a virtual server configured with 8 CPU cores (2 GHz) and 16 GB RAM. Resource limits by service:
| Service | CPU Limit (cores) | RAM Limit (GB) | Replicas |
|---|---|---|---|
| VL2 | 1 | 1 | 1 |
| algos | 4 | 7 | 1 |
| algos-slim | 1 | 0.5 | 1 |
| celery-beat | 0.2 | 0.2 | 1 |
| worker | 0.2 | 0.2 | 1 |
| redis | 0.5 | 1 | 1 |
| mongoDb | 1 | 2 | 1 |
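As a quick sanity check, the per-service limits can be summed and compared against the server's capacity. The sketch below is illustrative only: the dictionary restates the table, and the names `LIMITS`, `SERVER_CPU`, `SERVER_RAM`, and `total_limits` are hypothetical, not part of the product.

```python
# Per-service limits restated from the table above: (CPU cores, RAM in GB).
LIMITS = {
    "VL2": (1, 1),
    "algos": (4, 7),
    "algos-slim": (1, 0.5),
    "celery-beat": (0.2, 0.2),
    "worker": (0.2, 0.2),
    "redis": (0.5, 1),
    "mongoDb": (1, 2),
}

SERVER_CPU = 8   # cores on the benchmark server
SERVER_RAM = 16  # GB on the benchmark server


def total_limits(limits):
    """Return (total CPU cores, total RAM in GB), rounded to hide float noise."""
    cpu = round(sum(c for c, _ in limits.values()), 2)
    ram = round(sum(r for _, r in limits.values()), 2)
    return cpu, ram


cpu, ram = total_limits(LIMITS)
print(f"CPU limits: {cpu} of {SERVER_CPU} cores")
print(f"RAM limits: {ram} of {SERVER_RAM} GB")
```

With these values the limits add up to 7.9 cores and 11.9 GB, which fits within the 8 cores and 16 GB of the benchmark server.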
A single load test iteration simulates one user performing one liveness session and one face-matching request.
Based on the configuration above:
Recommended Load: 10 simultaneous users.
Maximum Capacity: 15 simultaneous users.
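To make the notion of "simultaneous users" concrete, the following is a minimal sketch of how one load test iteration per user could be driven concurrently. It is not the vendor's load-testing utility: `liveness_session` and `face_match` are hypothetical placeholders that only sleep, and would need to be replaced with real requests against your deployment.

```python
import asyncio
import time


async def liveness_session(user_id: int) -> None:
    """Hypothetical placeholder for one liveness session."""
    await asyncio.sleep(0.01)  # stand-in for the real request(s)


async def face_match(user_id: int) -> None:
    """Hypothetical placeholder for one face-matching request."""
    await asyncio.sleep(0.01)  # stand-in for the real request


async def one_iteration(user_id: int) -> float:
    """One user: a liveness session followed by a face-matching request."""
    start = time.perf_counter()
    await liveness_session(user_id)
    await face_match(user_id)
    return time.perf_counter() - start


async def run_load(simultaneous_users: int) -> list[float]:
    """Run one iteration per user concurrently; return per-user durations."""
    tasks = [one_iteration(u) for u in range(simultaneous_users)]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    durations = asyncio.run(run_load(10))  # 10 = recommended load above
    print(f"{len(durations)} iterations, slowest {max(durations):.3f}s")
```

Raising the argument of `run_load` from 10 towards 15 while watching latency and error rates is one way to observe where a given environment starts to degrade.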
Note: These benchmarks were conducted with SIMD instruction sets (such as AVX-512 and SSE) disabled. Because the CPU was restricted to scalar operations, these figures represent a worst-case scenario. On modern hardware with these parallel computation instructions enabled, throughput can be significantly higher (up to 10x).
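To see whether a target host exposes SIMD extensions, the CPU feature flags can be inspected. The sketch below assumes a Linux x86 host, where the kernel lists features on the `flags` line of `/proc/cpuinfo`; the helper name `simd_flags` is hypothetical. It only reports what the CPU advertises and says nothing about whether a given build of the services uses those instructions.

```python
def simd_flags(cpuinfo_text: str) -> dict[str, bool]:
    """Report which SIMD extensions appear in a /proc/cpuinfo dump."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        # x86 Linux kernels list CPU features on lines starting with "flags".
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            flags.update(value.split())
            break  # all cores normally report the same flags
    wanted = ("sse", "sse2", "sse4_2", "avx", "avx2", "avx512f")
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(simd_flags(f.read()))
    except FileNotFoundError:
        print("/proc/cpuinfo is not available on this platform")
```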
For a use case with up to 50 simultaneous users, the benchmark above suggests the following resource allocation:
| Service | CPU Cores | RAM (GB) | Replicas |
|---|---|---|---|
| VL2 | 3 | 3 | 3 |
| algos | 24 | 42 | 6 |
| algos-slim | 3 | 1.5 | 12 |
| worker | 1.5 | 1.5 | 3 |
| celery-beat | 0.5 | 0.5 | 1 |
| redis (cache + vector) | 6 | 32* | 3 |
| mongoDB | 8 | 48 | 3 |
| Total | 46 Cores | 118 GB | - |
\* Redis RAM should be adjusted based on the total number of face embeddings stored (approx. 6 GB per 1 million embeddings).
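The Redis sizing rule can be expressed as a small calculation. This sketch assumes memory grows linearly with the number of embeddings at the approximate rate given above; the function name is hypothetical, and memory used for caching or other overhead is assumed to come on top of this figure.

```python
GB_PER_MILLION_EMBEDDINGS = 6  # approximate figure from the note above


def redis_embedding_ram_gb(num_embeddings: int) -> float:
    """Approximate RAM (GB) for stored face embeddings, assuming linear scaling."""
    return num_embeddings / 1_000_000 * GB_PER_MILLION_EMBEDDINGS


for n in (500_000, 1_000_000, 5_000_000):
    print(f"{n:>9,} embeddings -> ~{redis_embedding_ram_gb(n):.0f} GB")
```

For example, 5 million stored embeddings would call for roughly 30 GB under this rule.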
To maintain optimal performance as your user base grows, use the following ratios to scale your service components:
AppServer (VL2) & Algos: Maintain a ratio of 1 VL2 instance for every 2 algos instances.
Algos & Algos-slim: Deploy 3 algos-slim instances for every 1 algos instance.
Workers: Deploy 1 worker instance per VL2 instance to handle asynchronous tasks effectively.
Celery Beat: This service does not scale; always maintain exactly 1 instance to avoid duplicate task scheduling.
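The ratios above can be turned into a small planning helper. This is a sketch under stated assumptions: it takes the number of algos instances as the input, rounds the VL2 count up when the algos count is odd (the rounding rule is an assumption, not stated above), and the function name `replica_plan` is hypothetical.

```python
import math


def replica_plan(algos: int) -> dict[str, int]:
    """Derive replica counts from the number of algos instances."""
    if algos < 1:
        raise ValueError("at least one algos instance is required")
    vl2 = math.ceil(algos / 2)  # 1 VL2 per 2 algos, rounded up (assumption)
    return {
        "VL2": vl2,
        "algos": algos,
        "algos-slim": 3 * algos,  # 3 algos-slim per algos
        "worker": vl2,            # 1 worker per VL2
        "celery-beat": 1,         # never scaled
    }


print(replica_plan(4))
```

For 4 algos instances this yields 2 VL2, 12 algos-slim, 2 worker, and 1 celery-beat instance.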
Disclaimer and Customization. These recommendations are provided for guidance purposes only. Every deployment should be tailored to its specific environment; on modern hardware, resource requirements may be significantly lower than these estimates.
To assist with your planning, we provide load-testing utilities designed to help you benchmark and optimize resource allocation for your specific infrastructure. For access to these tools or for further assistance, please contact our support team at [email protected].