authenticationistiorbackubeflowopenid-dex

istio getting "RBAC: access denied" even the servicerolebinding checked to be allowed


I've been struggleing with istio... So here I am seeking help from the experts!

Background

I'm trying to deploy my kubeflow application for multi-tenency with dex. Refering to the kubeflow offical document with the manifest file from github

Here is a list of component/version information

With the manifest file I successfully deployed the kubeflow on my cluster. Here's what I've done.

Here is a verification of step 3 and 4 Check RBAC enabled and envoyfilter added for kubeflow-userid

[root@gke-client-tf leilichao]# k get clusterrbacconfigs -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
  kind: ClusterRbacConfig
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"rbac.istio.io/v1alpha1","kind":"ClusterRbacConfig","metadata":{"annotations":{},"name":"default"},"spec":{"mode":"ON"}}
    creationTimestamp: "2020-07-04T01:28:52Z"
    generation: 2
    name: default
    resourceVersion: "5986075"
    selfLink: /apis/rbac.istio.io/v1alpha1/clusterrbacconfigs/default
    uid: db70920e-f364-40ec-a93b-a3364f88650f
  spec:
    mode: "ON"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
[root@gke-client-tf leilichao]# k get envoyfilter -n istio-system -o yaml
apiVersion: v1
items:
- apiVersion: networking.istio.io/v1alpha3
  kind: EnvoyFilter
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"networking.istio.io/v1alpha3","kind":"EnvoyFilter","metadata":{"annotations":{},"labels":{"app.kubernetes.io/component":"oidc-authservice","app.kubernetes.io/instance":"oidc-authservice-v1.0.0","app.kubernetes.io/managed-by":"kfctl","app.kubernetes.io/name":"oidc-authservice","app.kubernetes.io/part-of":"kubeflow","app.kubernetes.io/version":"v1.0.0"},"name":"authn-filter","namespace":"istio-system"},"spec":{"filters":[{"filterConfig":{"httpService":{"authorizationRequest":{"allowedHeaders":{"patterns":[{"exact":"cookie"},{"exact":"X-Auth-Token"}]}},"authorizationResponse":{"allowedUpstreamHeaders":{"patterns":[{"exact":"kubeflow-userid"}]}},"serverUri":{"cluster":"outbound|8080||authservice.istio-system.svc.cluster.local","failureModeAllow":false,"timeout":"10s","uri":"http://authservice.istio-system.svc.cluster.local"}},"statusOnError":{"code":"GatewayTimeout"}},"filterName":"envoy.ext_authz","filterType":"HTTP","insertPosition":{"index":"FIRST"},"listenerMatch":{"listenerType":"GATEWAY"}}],"workloadLabels":{"istio":"ingressgateway"}}}
    creationTimestamp: "2020-07-04T01:40:43Z"
    generation: 1
    labels:
      app.kubernetes.io/component: oidc-authservice
      app.kubernetes.io/instance: oidc-authservice-v1.0.0
      app.kubernetes.io/managed-by: kfctl
      app.kubernetes.io/name: oidc-authservice
      app.kubernetes.io/part-of: kubeflow
      app.kubernetes.io/version: v1.0.0
    name: authn-filter
    namespace: istio-system
    resourceVersion: "4715289"
    selfLink: /apis/networking.istio.io/v1alpha3/namespaces/istio-system/envoyfilters/authn-filter
    uid: e599ba82-315a-4fc1-9a5d-e8e35d93ca26
  spec:
    filters:
    - filterConfig:
        httpService:
          authorizationRequest:
            allowedHeaders:
              patterns:
              - exact: cookie
              - exact: X-Auth-Token
          authorizationResponse:
            allowedUpstreamHeaders:
              patterns:
              - exact: kubeflow-userid
          serverUri:
            cluster: outbound|8080||authservice.istio-system.svc.cluster.local
            failureModeAllow: false
            timeout: 10s
            uri: http://authservice.istio-system.svc.cluster.local
        statusOnError:
          code: GatewayTimeout
      filterName: envoy.ext_authz
      filterType: HTTP
      insertPosition:
        index: FIRST
      listenerMatch:
        listenerType: GATEWAY
    workloadLabels:
      istio: ingressgateway
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

RBAC Issue problem analysis

After I finished my deployment. I performed below functional testing:

RBAC Issue investigation

I'm getting "RBAC: access denied" error after I successfully created the notebook server on kubeflow and trying to connect the notebook server. I managed to updated the envoy log level and get the log below.

[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:64] checking request: remoteAddress: 10.1.1.2:58012, localAddress: 10.1.2.66:8888, ssl: none, headers: ':authority', 'compliance-kf-system.ml'
':path', '/notebook/roger-l-c-lei/aug06/'
':method', 'GET'
'user-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36'
'accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9'
'accept-encoding', 'gzip, deflate'
'accept-language', 'en,zh-CN;q=0.9,zh;q=0.8'
'cookie', 'authservice_session=MTU5NjY5Njk0MXxOd3dBTkZvMldsVllVMUZPU0VaR01sSk5RVlJJV2xkRFVrRTFTVUl5V0RKV1EwdEhTMU5QVjFCVlUwTkpSVFpYUlVoT1RGVlBUa0U9fN3lPBXDDSZMT9MTJRbG8jv7AtblKTE3r84ayeCYuKOk; _xsrf=2|1e6639f2|10d3ea0a904e0ae505fd6425888453f8|1596697030'
'referer', 'http://compliance-kf-system.ml/jupyter/'
'upgrade-insecure-requests', '1'
'x-forwarded-for', '10.10.10.230'
'x-forwarded-proto', 'http'
'x-request-id', 'babbf884-4cec-93fd-aea6-2fc60d3abb83'
'kubeflow-userid', 'roger.l.c.lei@XXXX.com'
'x-istio-attributes', 'CjAKHWRlc3RpbmF0aW9uLnNlcnZpY2UubmFtZXNwYWNlEg8SDXJvZ2VyLWwtYy1sZWkKIwoYZGVzdGluYXRpb24uc2VydmljZS5uYW1lEgcSBWF1ZzA2Ck4KCnNvdXJjZS51aWQSQBI+a3ViZXJuZXRlczovL2lzdGlvLWluZ3Jlc3NnYXRld2F5LTg5Y2Q0YmQ0Yy1kdnF3dC5pc3Rpby1zeXN0ZW0KQQoXZGVzdGluYXRpb24uc2VydmljZS51aWQSJhIkaXN0aW86Ly9yb2dlci1sLWMtbGVpL3NlcnZpY2VzL2F1ZzA2CkMKGGRlc3RpbmF0aW9uLnNlcnZpY2UuaG9zdBInEiVhdWcwNi5yb2dlci1sLWMtbGVpLnN2Yy5jbHVzdGVyLmxvY2Fs'
'x-envoy-expected-rq-timeout-ms', '300000'
'x-b3-traceid', '3bf35cca1f7b75e7a42a046b1c124b1f'
'x-b3-spanid', 'a42a046b1c124b1f'
'x-b3-sampled', '1'
'x-envoy-original-path', '/notebook/roger-l-c-lei/aug06/'
'content-length', '0'
'x-envoy-internal', 'true'
, dynamicMetadata: filter_metadata {
  key: "istio_authn"
  value {
  }
}

[2020-08-06 13:32:43.290][26][debug][rbac] [external/envoy/source/extensions/filters/http/rbac/rbac_filter.cc:108] enforced denied

From the source code it looks like the allowed function is returnning false so it's giving the "RBAC: access denied" response.

  if (engine.has_value()) {
    if (engine->allowed(*callbacks_->connection(), headers,
                        callbacks_->streamInfo().dynamicMetadata(), nullptr)) {
      ENVOY_LOG(debug, "enforced allowed");
      config_->stats().allowed_.inc();
      return Http::FilterHeadersStatus::Continue;
    } else {
      ENVOY_LOG(debug, "enforced denied");
      callbacks_->sendLocalReply(Http::Code::Forbidden, "RBAC: access denied", nullptr,
                                 absl::nullopt);
      config_->stats().denied_.inc();
      return Http::FilterHeadersStatus::StopIteration;
    }
  }

I took a search on the dumped envoy, it looks like the rule should be allowing any request with a header key as my mail address. Now I can confirm I've got that in my header from above log.

{
 "name": "envoy.filters.http.rbac",
 "config": {
  "rules": {
   "policies": {
    "ns-access-istio": {
     "permissions": [
      {
       "and_rules": {
        "rules": [
         {
          "any": true
         }
        ]
       }
      }
     ],
     "principals": [
      {
       "and_ids": {
        "ids": [
         {
          "header": {
           "exact_match": "roger.l.c.lei@XXXX.com"
          }
         }
        ]
       }
      }
     ]
    }
   }
  }
 }
}

With the understand that the envoy config that's been used to validate RBAC authz is from this config. And it's distributed to the sidecar by mixer, The log and code leads me to the rbac.istio.io config of servicerolebinding.

[root@gke-client-tf leilichao]# k get servicerolebinding -n roger-l-c-lei -o yaml
apiVersion: v1
items:
- apiVersion: rbac.istio.io/v1alpha1
  kind: ServiceRoleBinding
  metadata:
    annotations:
      role: admin
      user: roger.l.c.lei@XXXX.com
    creationTimestamp: "2020-07-04T01:35:30Z"
    generation: 5
    name: owner-binding-istio
    namespace: roger-l-c-lei
    ownerReferences:
    - apiVersion: kubeflow.org/v1
      blockOwnerDeletion: true
      controller: true
      kind: Profile
      name: roger-l-c-lei
      uid: 689c9f04-08a6-4c51-a1dc-944db1a66114
    resourceVersion: "23201026"
    selfLink: /apis/rbac.istio.io/v1alpha1/namespaces/roger-l-c-lei/servicerolebindings/owner-binding-istio
    uid: bbbffc28-689c-4099-837a-87a2feb5948f
  spec:
    roleRef:
      kind: ServiceRole
      name: ns-access-istio
    subjects:
    - properties:
        request.headers[]: roger.l.c.lei@XXXX.com
  status: {}
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

I wanted to have a try updating this ServiceRoleBinding to validate some assumption since I can't debug the envoy source code and there's not enough log to show why exactly is the "allow" method returnning false.

However I find myself cannot update the servicerolebinding. It resumes to its original version everytime right after I finish editing it.

I find that there's this istio-galley validatingAdmissionConfiguration(Code block below) that monitors these istio rbac resources.

[root@gke-client-tf leilichao]# k get validatingwebhookconfigurations istio-galley -oyaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  creationTimestamp: "2020-08-04T15:00:59Z"
  generation: 1
  labels:
    app: galley
    chart: galley
    heritage: Tiller
    istio: galley
    release: istio
  name: istio-galley
  ownerReferences:
  - apiVersion: extensions/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: Deployment
    name: istio-galley
    uid: 11fef012-4145-49ac-a43c-2e1d0a460ea4
  resourceVersion: "22484680"
  selfLink: /apis/admissionregistration.k8s.io/v1beta1/validatingwebhookconfigurations/istio-galley
  uid: 6f485e28-3b5a-4a3b-b31f-a5c477c82619
webhooks:
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    caBundle: 
    .
    .
    .
    service:
      name: istio-galley
      namespace: istio-system
      path: /admitpilot
      port: 443
  failurePolicy: Fail
  matchPolicy: Exact
  name: pilot.validation.istio.io
  namespaceSelector: {}
  objectSelector: {}
  rules:
  - apiGroups:
    - config.istio.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    - UPDATE
    resources:
    - httpapispecs
    - httpapispecbindings
    - quotaspecs
    - quotaspecbindings
    scope: '*'
  - apiGroups:
    - rbac.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - '*'
    scope: '*'
  - apiGroups:
    - authentication.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - '*'
    scope: '*'
  - apiGroups:
    - networking.istio.io
    apiVersions:
    - '*'
    operations:
    - CREATE
    - UPDATE
    resources:
    - destinationrules
    - envoyfilters
    - gateways
    - serviceentries
    - sidecars
    - virtualservices
    scope: '*'
  sideEffects: Unknown
  timeoutSeconds: 30
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    caBundle: 
    .
    .
    .
    service:
      name: istio-galley
      namespace: istio-system
      path: /admitmixer
      port: 443
  failurePolicy: Fail
  matchPolicy: Exact
  name: mixer.validation.istio.io
  namespaceSelector: {}
  objectSelector: {}
  rules:
  - apiGroups:
    - config.istio.io
    apiVersions:
    - v1alpha2
    operations:
    - CREATE
    - UPDATE
    resources:
    - rules
    - attributemanifests
    - circonuses
    - deniers
    - fluentds
    - kubernetesenvs
    - listcheckers
    - memquotas
    - noops
    - opas
    - prometheuses
    - rbacs
    - solarwindses
    - stackdrivers
    - cloudwatches
    - dogstatsds
    - statsds
    - stdios
    - apikeys
    - authorizations
    - checknothings
    - listentries
    - logentries
    - metrics
    - quotas
    - reportnothings
    - tracespans
    scope: '*'
  sideEffects: Unknown
  timeoutSeconds: 30

Long story short

I've been banging my head over this istio issue for more than 2 weeks. I'm sure there's plenty of people felting the same trying to trouble shoot istio on k8s. Any suggestion is welcomed! Here's how I understand the problem, please correct me if I'm wrong:

I've run into below problems where

I cannot update the ServiceRoleBinding even after I deleted the validating webhook

I tried to delete this validating webhook to update the serviceRoleBinding. The resource resumes right after I save the edit. The validating webhook is actually generated automatically from a configmap so I had to update that to update the webhook.

Is there some kind of cache in galley that mixer uses to distribute the config

I can't find any relevant log that indicates the rbac.istio.io resource is protected/validated by any service in the istio-system namespace.

How can I get the log of the MIXER

I need to understand which component exactly controls the policy. I managed to update the log level but failed to find anything useful

Most importantly How do I debug an envoy container

I need to debug the envoy app to understand why it's returning false for the allow function. If we can not debug it easily. Is there a document that lets me update the code to add more log and build a new image to GCR so I can have another run and based on the log to see what's going on behind the scene.


Solution

  • Answering my own question since I've made some progress on them.

    I cannot update the ServiceRoleBinding even after I deleted the validating webhook

    That's because the ServiceRoleBinding is actually generated/monitored/managed by the profile controller in the kubeflow namespace instead of the validating webhook.

    I'm having this rbac issue because based on the params.yaml in the profiles manifest folder the rule is generated as

    request.headers[]: roger.l.c.lei@XXXX.com
    

    instead of

    request.headers[kubeflow-userid]: roger.l.c.lei@XXXX.com
    

    Due to I mis-configed the value as blank instead of userid-header=kubeflow-userid in the params.yaml