python mlrun nuclio

MLRun, Issue with slow response times


I see higher throughput together with a long average response delay (requests wait 20-50 seconds for a worker); see the output from Grafana:

[Grafana screenshot]

I know that part of the optimization can be resource tuning, so I have already increased the resources and the number of pods/replicas:

# increase resources (for faster execution)
fn.with_requests(mem="500Mi", cpu=0.5)  # requested resources
fn.with_limits(mem="2Gi", cpu=1)        # maximum resources
    
# increase parallel execution by adding pods/replicas
fn.spec.replicas = 2        # default replicas
fn.spec.min_replicas = 2    # min replicas
fn.spec.max_replicas = 5    # max replicas

Do you know how I can increase the number of workers, and what impact I should expect on CPU/memory?


Solution

  • I got it. Each worker runs in its own separate scope. This means that each worker has a copy of all variables, and all changes stay within that worker (a change made by worker x does not affect worker y). Because every worker holds its own copy, it is useful to increase the request/limit resources, at least for memory, at the pod/replica level.

    You can set the number of workers for the HTTP trigger with fn.with_http(workers=<n>); see the documentation for more information. I updated the code with the matching resource tuning:

    # increase the number of workers (two workers) for each pod/replica
    fn.with_http(workers=2)
    
    # increase resources (for faster execution)
    fn.with_requests(mem="1Gi", cpu=0.7)    # memory doubled and cpu raised slightly, because of two workers
    fn.with_limits(mem="2Gi", cpu=1)        # maximum resources (unchanged)
        
    # increase parallel execution by adding pods/replicas
    fn.spec.replicas = 2        # default replicas 
    fn.spec.min_replicas = 2    # min replicas
    fn.spec.max_replicas = 5    # max replicas
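
As a rough way to reason about these numbers, the sketch below estimates how many requests can be handled at once and how much memory to request per pod. It is only a back-of-the-envelope calculation in plain Python, not part of the MLRun or Nuclio API: the function names are made up for illustration, the 512 Mi per-worker footprint is a hypothetical figure, and the worker estimate assumes Little's law (average concurrency = arrival rate x handler time) applies to the workload.

```python
import math


def concurrent_capacity(replicas: int, workers_per_replica: int) -> int:
    """Requests that can be processed at the same time across all pods."""
    return replicas * workers_per_replica


def workers_needed(requests_per_second: float, avg_handler_seconds: float) -> int:
    """Estimate total workers from Little's law (assumes a steady request rate)."""
    return math.ceil(requests_per_second * avg_handler_seconds)


def memory_request_mi(workers_per_replica: int, per_worker_mi: int) -> int:
    """Memory request per pod, assuming each worker keeps its own copy of all data."""
    return workers_per_replica * per_worker_mi


# Settings from the answer: 2 workers per pod, 2 to 5 replicas.
min_capacity = concurrent_capacity(replicas=2, workers_per_replica=2)
max_capacity = concurrent_capacity(replicas=5, workers_per_replica=2)

# Hypothetical load: 8 requests per second, 0.5 s per request.
needed = workers_needed(requests_per_second=8, avg_handler_seconds=0.5)

# Hypothetical footprint of 512 Mi per worker.
pod_memory = memory_request_mi(workers_per_replica=2, per_worker_mi=512)

print(min_capacity, max_capacity, needed, pod_memory)
```

With these assumed inputs, the minimum of two replicas already covers the estimated concurrency, and two workers at 512 Mi each match the 1Gi memory request used above. If the measured per-worker footprint or handler time differs, the same arithmetic gives the adjusted values.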