I have been testing OutlierDetection with a single upstream pod. I expected Istio to stop sending requests to the failing upstream pod once it was ejected.
My test environment:
while [ true ]; do date; curl -v 'http://http-echo-svc.trafficmgmt:80/500'; sleep 1; done
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: dr-status-echo
spec:
  host: http-echo-svc.trafficmgmt
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 6
      interval: 30s
      baseEjectionTime: 2m
      maxEjectionPercent: 100
      minHealthPercent: 0
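With this rule, 6 consecutive 5xx responses should get the endpoint ejected for 2m, and maxEjectionPercent: 100 with minHealthPercent: 0 should allow ejecting even the only host. To double-check that the rule actually reached the client sidecar, something like the following should work (curl-client is a placeholder for whatever pod runs the test loop):
# Dump the client sidecar's cluster config and look for the
# outlierDetection block pushed by the DestinationRule.
# "curl-client" is a placeholder pod name; adjust to your client pod.
istioctl proxy-config cluster curl-client.trafficmgmt \
  --fqdn http-echo-svc.trafficmgmt.svc.cluster.local -o json \
  | grep -A5 outlierDetection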
And the pod & service YAML:
apiVersion: v1
kind: Pod
metadata:
  name: status-echo
  labels:
    app.kubernetes.io/name: echo-pod
spec:
  containers:
    - name: status-echo
      image: status-echo:0.0.3
      imagePullPolicy: Never
      ports:
        - containerPort: 8087
          name: http-echo-port
---
apiVersion: v1
kind: Service
metadata:
  name: http-echo-svc
spec:
  selector:
    app.kubernetes.io/name: echo-pod
  ports:
    - name: http-echo
      protocol: TCP
      port: 80
      targetPort: http-echo-port
The upstream pod logged every request, which means the circuit breaker was not working.
Does OutlierDetection not work with a single upstream, or is my configuration wrong?
The configuration looks right. I tried it with fake-service, configured to always return 500.
Here's my configuration (equivalent to yours, except for the image):
apiVersion: v1
kind: Pod
metadata:
  name: status-echo
  labels:
    app.kubernetes.io/name: echo-pod
spec:
  containers:
    - name: fake-service
      image: nicholasjackson/fake-service:v0.25.2
      env:
        - name: ERROR_RATE
          value: "1"
        - name: ERROR_CODE
          value: "500"
      imagePullPolicy: Always
      ports:
        - containerPort: 9090
          name: http-echo-port
---
apiVersion: v1
kind: Service
metadata:
  name: http-echo-svc
spec:
  selector:
    app.kubernetes.io/name: echo-pod
  ports:
    - name: http-echo
      protocol: TCP
      port: 80
      targetPort: 9090
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: dr-status-echo
spec:
  host: http-echo-svc
  trafficPolicy:
    outlierDetection:
      consecutive5xxErrors: 6
      interval: 30s
      baseEjectionTime: 2m
      maxEjectionPercent: 100
      minHealthPercent: 0
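One thing worth noting: outlier detection is enforced by the client-side sidecar, so that's where the evidence shows up. As a sanity check, the ejection counters in the client proxy's Envoy stats can be inspected roughly like this (the pod name is a placeholder):
# Ejection counters live on the *client* sidecar, not the server's.
# "curl-client" is a placeholder for the pod running the curl loop.
kubectl exec curl-client -c istio-proxy -- \
  pilot-agent request GET stats \
  | grep http-echo-svc | grep outlier_detection
# outlier_detection.ejections_active > 0 means a host is currently ejected;
# ejections_enforced_total counts how often an ejection was enforced.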
I am running curl from a pod inside the cluster, and on the first 6 tries this is the response I get from the status-echo pod:
< HTTP/1.1 500 Internal Server Error
< date: Wed, 26 Jul 2023 20:35:18 GMT
< content-length: 164
< content-type: text/plain; charset=utf-8
< x-envoy-upstream-service-time: 4
< server: envoy
<
{
  "name": "Service",
  "uri": "/",
  "type": "HTTP",
  "ip_addresses": [
    "10.42.0.64"
  ],
  "code": 500,
  "error": "Service error automatically injected"
}
On the 7th request, the response changes and looks like this:
< HTTP/1.1 503 Service Unavailable
< content-length: 19
< content-type: text/plain
< date: Wed, 26 Jul 2023 20:35:55 GMT
< server: envoy
<
* Connection #0 to host http-echo-svc left intact
no healthy upstream
This means that outlier detection kicked in and ejected the one failing host (hence the no healthy upstream
response). Likewise, if you look at the logs from the client-side istio-proxy, you'll see the corresponding error:
"GET / HTTP/1.1" 503 UH no_healthy_upstream