python kubernetes yaml ruamel.yaml multi-document

How to edit a file with multiple YAML documents in Python


I have the following YAML file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs
  namespace: test
  labels:
    app: hello-world
spec:
  selector:
    matchLabels:
      app: hello-world
  replicas: 100
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: test/first:latest
        ports:
        - containerPort: 80
        resources:
          limits:
            memory: 2500Mi
            cpu: "2500m"
          requests:
            memory: 12Mi
            cpu: "80m"
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs
spec:
  selector:
    app: hello-world
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
      nodePort: 30082   
  type: NodePort

I need to edit the YAML file using Python. I have tried the code below, but it does not work for a file with multiple YAML documents:

import ruamel.yaml

yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.explicit_start =  True

with open(r"D:\deployment.yml") as stream:
   data = yaml.load_all(stream)

test = data[0]['metadata']
test.update(dict(name="Tom1"))
test.labels(dict(name="Tom1"))

test = data['spec']
test.update(dict(name="sfsdf"))

with open(r"D:\deploymentCopy.yml", 'wb') as stream:
    yaml.dump(data, stream)

For more information, see the related question: Python: Replacing a String in a YAML file


Solution

  • "It is not working" is not a very specific description of the problem.

    load_all() yields each document in turn, so you would normally iterate over it:

    for data in yaml.load_all(stream):
        # work on the data of each individual document
    

    If you want all the data in an indexable list, as your code assumes, you have to call list() on the generator that load_all() returns:

    data = list(yaml.load_all(stream))
    

    If you load a number of documents into the variable data with .load_all(), you most likely don't want to dump data as a single object (using .dump()). Use .dump_all() instead, so that each element of data is dumped as a separate document:

    with open(r"D:\deploymentCopy.yaml", 'wb') as stream:
        yaml.dump_all(data, stream)
    

    ruamel.yaml cannot distinguish between dumping a data structure that has a list (i.e. a YAML sequence) at its root and dumping a list of data structures that should go into different documents. You have to make that distinction yourself by calling .dump() or .dump_all(), respectively.

    Apart from that, the official YAML FAQ on the yaml.org website indicates that the recommended extension for files with YAML documents is .yaml . There are probably some projects that have not been updated since this became the recommendation (16 years ago, i.e. at least since September 2006).