Hi everyone,
I have a kubeadm-based cluster with 1 master and 2 workers. I have already set up the built-in Horizontal Pod Autoscaling (based on CPU utilization and memory), and now I want to autoscale on custom metrics (response time in my case).
I am using Prometheus Adapter for custom metrics, but I could not find any metric named response_time in Prometheus.
Is there a metric available in Prometheus that can be used to scale the application based on response time, and what is its name?
Will I need to edit the default horizontal autoscaling algorithm, or will I have to write an autoscaling algorithm from scratch that scales my application on the basis of response time?
Prometheus has only 4 metric types: Counter, Gauge, Histogram and Summary.
I guess a Histogram is what you need.
A histogram samples observations (usually things like request durations or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.
A histogram with a base metric name of <basename> exposes multiple time series during a scrape:
- cumulative counters for the observation buckets, exposed as <basename>_bucket{le="<upper inclusive bound>"}
- the total sum of all observed values, exposed as <basename>_sum
- the count of events that have been observed, exposed as <basename>_count (identical to <basename>_bucket{le="+Inf"} above)
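To make the relationship between these series and a response time more concrete, here is a rough Python sketch. It assumes you have already read the values of <basename>_sum, <basename>_count and the cumulative buckets at two points in time; the function names and the sample numbers are made up for illustration, and the quantile estimate uses simple linear interpolation inside a bucket, which is only similar in spirit to what Prometheus does in a query.

```python
import math


def average_latency(sum_before, count_before, sum_after, count_after):
    """Mean observed value between two scrapes of <basename>_sum and <basename>_count."""
    delta_count = count_after - count_before
    if delta_count <= 0:
        # No new observations in the window (or the counter was reset).
        return None
    return (sum_after - sum_before) / delta_count


def estimate_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) pairs, sorted by
    upper_bound and ending with the +Inf bucket (hypothetical input format).
    """
    total = buckets[-1][1]
    if total == 0:
        return None
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, cumulative in buckets:
        if cumulative >= rank:
            if math.isinf(upper_bound):
                # The quantile falls in the +Inf bucket: report the highest finite bound.
                return lower_bound
            in_bucket = cumulative - lower_count
            if in_bucket == 0:
                return lower_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, cumulative
    return None


# Illustrative numbers only: seconds as bucket bounds, request counts as values.
sample_buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
print(average_latency(10.0, 100, 25.0, 150))
print(estimate_quantile(0.95, sample_buckets))
```

Either value (the average or a high quantile) could serve as the "response time" you scale on; which one fits better depends on your application.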
There is a Stack Overflow question where you can find a query for latency (response time), so I think it might be useful for you.
I don't know if I understand you correctly, but if you want to edit the HPA, you can edit the YAML file, delete the previous HPA, and create a new one instead.
kubectl delete -f <name.yaml>
kubectl apply -f <name.yaml>
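As for whether the default algorithm has to be changed: as far as I understand from the Kubernetes documentation, the HPA applies the same ratio rule to any metric, so a response-time metric exposed through the adapter should work without writing an algorithm from scratch. Below is a minimal Python sketch of that rule. The function name and the numbers are made up for illustration, and the real controller does more than this (min/max replicas, pod readiness, stabilization and so on).

```python
import math


def desired_replicas(current_replicas, current_value, target_value, tolerance=0.1):
    """Sketch of the documented HPA rule:
    desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue)
    """
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: no scaling action (0.1 is the documented default tolerance).
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Illustrative values: response time in milliseconds, target of 200 ms per pod.
print(desired_replicas(2, 400, 200))  # response time is twice the target, so scale up
print(desired_replicas(4, 100, 200))  # response time is half the target, so scale down
print(desired_replicas(3, 210, 200))  # within tolerance, so replicas stay the same
```

So what you would normally change is the metric and target value in the HPA YAML, not the algorithm itself.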
There is a good article about autoscaling on custom metrics with custom Prometheus metrics.