Fine Tuning Resource Requests and Limits
This page provides guidelines for calculating resource needs under various circumstances.
To provide hardware recommendations to users, a series of tests was performed on T4K to measure the memory usage of the control-plane, analyzer, web-backend, and exporter components. The tests primarily varied:
Number of Kubernetes resources
Number of backups
To analyze the impact of resources on T4K, an instance was set up with 1000 active namespace-level backups, 50 backup plans (not running on a schedule or creating backups), and 10k resources, which initially contained 1 deployment, 10k services, 1500 config maps, and 750 secrets. In each test iteration, the backup count was held at 1000 and 10k resources were added, until the cluster reached 50k resources and 1000 backups.
The table and chart below provide memory insights for the control-plane, analyzer, web-backend, and exporter (all values in MB):
Metrics
| Load (Resources/Backups) | Control Plane (idle) | Control Plane (spike) | Analyzer (idle) | Analyzer (spike) | Web Backend (idle) | Web Backend (spike) | Exporter (idle) | Exporter (spike) |
|---|---|---|---|---|---|---|---|---|
| 10k resources (1000 backups) | 263 | 263 | 107 | 107 | 272 | 443 | 526 | 723 |
| 20k resources (1000 backups) | 320 | 320 | 102 | 102 | 344 | 575 | 480 | 718 |
| 30k resources (1000 backups) | 342 | 480 | 104 | 107 | 398 | 1130 | 486 | 650 |
| 40k resources (1000 backups) | 391 | 553 | 111 | 159 | 443 | 1250 | 460 | 716 |
| 50k resources (1000 backups) | 462 | 650 | 140 | 179 | 515 | 1640 | 530 | 670 |
| 50k resources (1000 backups), after restart | 449 | 671 | 142 | 142 | 483 | 1370 | - | - |
*(Chart: memory usage of each component as the number of resources increases.)*
Adding 10k resources increases memory usage by up to 150 MB in the control-plane, 52 MB in the analyzer, and 555 MB in the web-backend, while the exporter's memory consumption is largely unaffected by the number of resources.
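As a rough sizing aid, the per-resource growth above can be extrapolated linearly. The sketch below is an illustration only: the baselines are the 10k-resource spike measurements from the table, and the per-10k increments are the worst-case deltas quoted above, treated as a conservative upper-bound slope rather than a measured one.

```python
# Conservative, linear extrapolation of memory needs as resources grow.
# Baselines are the 10k-resource (1000-backup) spike measurements from the
# table above; the per-10k increments are the worst-case deltas quoted in
# the text (an assumption for sizing, not a measured slope).
BASELINE_SPIKE_MB = {"control-plane": 263, "analyzer": 107, "web-backend": 443}
PER_10K_RESOURCES_MB = {"control-plane": 150, "analyzer": 52, "web-backend": 555}

def estimate_limit_mb(component: str, resources: int) -> int:
    """Rough upper-bound memory (MB) for `component` at `resources` objects."""
    extra_10k = max(0, (resources - 10_000) / 10_000)
    return round(BASELINE_SPIKE_MB[component] + extra_10k * PER_10K_RESOURCES_MB[component])

print(estimate_limit_mb("web-backend", 30_000))  # 443 + 2 * 555 = 1553
```

At 30k resources this yields 1553 MB for the web-backend, comfortably above the 1130 MB spike actually measured, which is the intended behavior when sizing a limit.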
To analyze the impact of backups on T4K, an instance was initially set up with 10k resources (1 deployment, 10k services, 1500 config maps, and 750 secrets), ~50 active backups, and 50 backup plans creating backups at the same time. In each test iteration, the resource count was held at 10k and 1000 backups were added, until the cluster reached 6k backups and 10k resources.
The table and chart below provide memory insights for the control-plane, analyzer, web-backend, and exporter (all values in MB):
Metrics
| Load (Resources/Backups) | Control Plane (idle) | Control Plane (spike) | Analyzer (idle) | Analyzer (spike) | Web Backend (idle) | Web Backend (spike) | Exporter (idle) | Exporter (spike) |
|---|---|---|---|---|---|---|---|---|
| 10k resources (0-50 backups) | 330 | 387 | 44 | 51 | 380 | 620 | 87 | 93 |
| 10k resources (1000 backups) | 476 | 626 | 164 | 191 | 457 | 860 | 420 | 650 |
| 10k resources (2000 backups) | 630 | 780 | 210 | 255 | 757 | 1190 | 1150 | 1410 |
| 10k resources (3300 backups) | 750 | 960 | 310 | 400 | 912 | 1490 | 1500 | 1800 |
| 10k resources (6000 backups) | 1060 | 1320 | 796 | 1002 | 1030 | 1700 | - | - |
| 10k resources (6000 backups), after restart | 552 | - | 382 | - | 558 | 1300 | - | - |
*(Chart: memory usage of each component as the number of backups increases.)*
Note: No spike was seen after the restart in the chart above because all schedule-based backups were paused at that time.
Adding 1k backups increases memory usage by up to 239 MB, 200 MB, 330 MB, and 750 MB in the control-plane, analyzer, web-backend, and exporter respectively.
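The per-backup growth can be sketched the same way. A minimal illustration, assuming linear growth above the ~50-backup baseline; the per-1k increments are the worst-case deltas quoted above, so the result is a conservative upper bound rather than a prediction:

```python
# Conservative memory estimate as backups grow, extrapolated from the
# 10k-resource backup test above. Baselines are the 0-50 backup spike
# values; per-1k-backup increments are the worst-case deltas quoted in
# the text (an assumed linear upper bound, not a measured slope).
BASELINE_SPIKE_MB = {"control-plane": 387, "analyzer": 51, "web-backend": 620, "exporter": 93}
PER_1K_BACKUPS_MB = {"control-plane": 239, "analyzer": 200, "web-backend": 330, "exporter": 750}

def estimate_backup_limit_mb(component: str, backups: int) -> int:
    """Rough upper-bound memory (MB) for `component` at `backups` backups."""
    return round(BASELINE_SPIKE_MB[component] + (backups / 1000) * PER_1K_BACKUPS_MB[component])

print(estimate_backup_limit_mb("exporter", 2000))  # 93 + 2 * 750 = 1593
```

At 2000 backups this yields 1593 MB for the exporter, slightly above the 1410 MB spike actually measured, which is the intended margin when sizing a limit.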
The control-plane, analyzer, and exporter components consume more memory with increasing backups than with increasing resources. Since the number of resources has only a small effect on these components, the number of backups should be weighted more heavily when setting their memory limits. The control-plane spikes appear to be related to scheduled backup plans creating 50 backups every hour; the number of backups being created at the same time also contributes to those spikes. The same spikes are largely absent in the resource-based tests because the backup plans there were not creating backups every hour.
The web-backend, on the other hand, consumes memory with both increasing backups and increasing resources, as it needs to cache every resource in the cluster.
Based on the metrics from the tests above, the following recommendations can be used to configure the memory limits of the control-plane, analyzer, web-backend, and exporter:
| Component | Recommendation |
|---|---|
| control-plane | 1 GB for 10k resources, 1k backups |
| analyzer | 512 MB for 10k resources, 1k backups |
| web-backend | 1 GB for 10k resources, 1k backups |
| exporter | 1 GB for 1k backups |