Error in OpenCensus in Google Kubernetes Engine with Python


I am deploying containers with Python apps to GKE and get an error when I try to use OpenCensus to send trace messages:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/opencensus/metrics/transport.py", line 59, in func
    return self.func(*aa, **kw)
  File "/usr/local/lib/python3.7/site-packages/opencensus/metrics/transport.py", line 113, in export_all
    export(itertools.chain(*all_gets))
  File "/usr/local/lib/python3.7/site-packages/opencensus/ext/stackdriver/stats_exporter/__init__.py", line 162, in export_metrics
    self.client.project_path(self.options.project_id), ts_batch)
  File "/usr/local/lib/python3.7/site-packages/google/cloud/monitoring_v3/gapic/metric_service_client.py", line 1024, in create_time_series
    request, retry=retry, timeout=timeout, metadata=metadata
  File "/usr/local/lib/python3.7/site-packages/google/api_core/gapic_v1/method.py", line 143, in __call__
    return wrapped_func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/google/api_core/retry.py", line 273, in retry_wrapped_func
    on_error=on_error,
  File "/usr/local/lib/python3.7/site-packages/google/api_core/retry.py", line 182, in retry_target
    return target()
  File "/usr/local/lib/python3.7/site-packages/google/api_core/timeout.py", line 214, in func_with_timeout
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 59, in error_remapped_callable
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.InvalidArgument: 400 One or more TimeSeries could not be written: The set of resource labels is incomplete. Missing labels: (container_name namespace_name).: timeSeries[0-199]

The interesting part seems to be this sentence: Missing labels: (container_name namespace_name).

When I run the exact same code locally, I get no errors and my tracing data appears in Stackdriver Metrics Explorer, so the problem seems specific to running inside a container in GKE.

Is there something specific that is required to get OpenCensus working in a GKE container?


Solution

  • You need to manually set two environment variables in your container: CONTAINER_NAME and NAMESPACE. I believe GKE should be setting these and isn't, so OpenCensus can't find the expected values. A sample fix is to include those two variables in the pod spec:

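            spec:
              containers:
              - env:  # merge into your existing container entry
                # Namespace of the pod, injected via the Downward API
                - name: NAMESPACE
                  valueFrom:
                    fieldRef:
                      fieldPath: metadata.namespace
                # Template placeholders; substitute your container's name
                - name: CONTAINER_NAME
                  value: "{{ APP }}-collectors-{{ NAME }}"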
    

    More details: https://github.com/census-instrumentation/opencensus-python/issues/796#issuecomment-539109321
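
To confirm that the pod spec change took effect, the app can check both variables at startup, before any exporter is created. The following is only a minimal sketch: the variable names CONTAINER_NAME and NAMESPACE come from the answer above, while the helper name `missing_k8s_env` is hypothetical and not part of OpenCensus.

```python
import os
import sys

# Variable names taken from the answer above; the rest is illustrative.
REQUIRED_ENV_VARS = ("CONTAINER_NAME", "NAMESPACE")


def missing_k8s_env(environ=None):
    """Return the required variables that are unset or empty (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


# Warn early instead of waiting for the exporter to fail with
# "Missing labels: (container_name namespace_name)".
missing = missing_k8s_env()
if missing:
    print("Missing environment variables: " + ", ".join(missing), file=sys.stderr)
```

If the warning is printed inside the container, the env entries are probably not reaching the pod, which `kubectl describe pod` can help verify.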