
How to use libvips to shrink giant images with limited memory


I have a Ruby on Rails web application that allows users to upload images, which are then automatically resized into small thumbnails using libvips and the ImageProcessing Ruby gem. Sometimes users legitimately need to upload 100MP+ images. These large images break our server, which has only 1GB of RAM. If it's relevant, these images are almost always JPEGs.

What I'm hoping is to use libvips to first scale down these images to a size that my server can handle--maybe like under 8,000x8,000 pixels--without using lots of RAM. Then I would use that image to do the other things we already do, like change the colorspace to sRGB and resize and strip metadata, etc.

Is this possible? If so, can you give an example of a vips or vipsthumbnail Linux CLI command?

I found a feature in ImageMagick that should theoretically solve this issue, mentioned in the two links below. But I don't want to have to switch the whole system to ImageMagick just for this.

https://legacy.imagemagick.org/Usage/formats/#jpg_read
https://github.com/janko/image_processing/wiki/Improving-ImageMagick-performance

P.S.: I'm using Heroku, so the action should still work even if RAM usage briefly peaks at up to 2GB.

(I've always been confused about why image processing seems to always require loading the entire image in RAM at once...)

UPDATE:

I'm providing more context because jcupitt's command is still failing for me.

This is the main software that is installed on the Docker container that is running libvips, as defined in the Dockerfile:

FROM ruby:3.1.2
RUN apt-get update -qq && apt-get install -y postgresql-client 

# uglifier requires nodejs -- `apt-get install nodejs`  only installs older version by default
RUN apt-get install -y curl
RUN curl -sL https://deb.nodesource.com/setup_14.x | bash -
RUN apt-get install -y nodejs

RUN apt-get install -y libvips libvips-dev libvips-tools
# install pdftotext
RUN apt-get install -y xpdf

I am limiting the memory usage of the sidekiq container to 500MB to more closely match the production server. (I also tried limiting both memory and reserved memory to 1GB, and the same thing happens.) This is the config as specified in docker-compose.yml:

  sidekiq:
    depends_on:
      - db
      - redis
    build: .
    command: sidekiq -c 1 -v -q mailers -q default -q low -q searchkick
    volumes:
      - '.:/myapp'
    env_file:
      - '.env'
    deploy:
      resources:
        limits:
          memory: 500M
        reservations:
          memory: 500M

This is the exact command I'm trying, based on the command that jcupitt suggested.

First I run docker stats --all to see the sidekiq container's memory usage after booting up, before running libvips:

CONTAINER ID   NAME                    CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
4d7e9ff9c7c7   sidekiq_1               0.48%     210.2MiB / 500MiB     42.03%    282kB / 635kB     133MB / 0B        7

I also check docker-compose exec sidekiq top, which shows a higher RAM limit; I think that is normal for Docker:

top - 18:39:48 up 1 day,  3:21,  0 users,  load average: 0.01, 0.08, 0.21
Tasks:   3 total,   1 running,   2 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.2 us,  1.5 sy,  0.0 ni, 97.1 id,  0.2 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   3929.7 total,    267.4 free,   1844.1 used,   1818.1 buff/cache
MiB Swap:    980.0 total,     61.7 free,    918.3 used.   1756.6 avail Mem 

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                       
      1 root      20   0  607688 190620  12848 S   0.3   4.7   0:10.31 ruby                                                                                                          
     54 root      20   0    6984   3260   2772 R   0.3   0.1   0:00.05 top                                                                                                           
     39 root      20   0    4092   3256   2732 S   0.0   0.1   0:00.03 bash                                                                                                          

Then I run the command:

docker-compose exec sidekiq bash

root@4d7e9ff9c7c7:/myapp# vipsheader /tmp/shrine20220728-1-8yqju5.jpeg
/tmp/shrine20220728-1-8yqju5.jpeg: 23400x15600 uchar, 3 bands, srgb, jpegload

VIPS_CONCURRENCY=1 vipsthumbnail /tmp/shrine20220728-1-8yqju5.jpeg --size 500x500

Then in another terminal window I check docker stats --all again.

Within about 0.5s the memory usage shoots up to 500MB, and the vipsthumbnail process dies and just prints "Killed".


Solution

  • libvips will almost always stream images rather than loading them in memory, so you should not see high memory use.

    For example:

    $ vipsheader st-francis.jpg
    st-francis.jpg: 30000x26319 uchar, 3 bands, srgb, jpegload
    $ ls -l st-francis.jpg
    -rw-rw-r-- 1 john john 227612475 Sep 17  2020 st-francis.jpg
    $ /usr/bin/time -f %M:%e vipsthumbnail st-francis.jpg --size 500x500
    87412:2.57
    

    So 87MB of memory and 2.5s. The image is around 2.4GB uncompressed (30000 x 26319 pixels x 3 bytes). You should get the same performance from Ruby through the ImageProcessing gem.

    In fact there's not much useful concurrency for this sort of operation, so you can run libvips with a small threadpool.

    $ VIPS_CONCURRENCY=1 /usr/bin/time -f %M:%e vipsthumbnail st-francis.jpg --size 500x500
    52624:2.49
    

    So with one thread in the threadpool it's about the same speed, but memory use is down to 50MB.

    There are a few cases when this will fail. One is with interlaced (also called progressive) images.

    These represent the image as a series of passes of increasingly higher detail. This can help when displaying an image to the user (the image appears in slowly increasing detail, rather than as a line moving down the screen), but unfortunately this also means that you don't get the final value of the first pixel until the entire image has been decompressed. This means you have to decompress the whole image into memory, and makes this type of file extremely unsuitable for large images of the sort you are handling.
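
    As a rough sense of scale, the sketch below estimates the size of the fully decoded pixel data for the 23400x15600 image from the question. It assumes 8-bit samples (the uchar reported by vipsheader) and counts pixel data only; the decoder's real working set for a progressive file may well differ. decoded_bytes is a hypothetical helper, not part of libvips.

    ```ruby
    # Rough size of the fully decoded pixel data.
    # Assumption: 1 byte per sample, i.e. the "uchar" format shown by vipsheader.
    def decoded_bytes(width, height, bands, bytes_per_sample = 1)
      width * height * bands * bytes_per_sample
    end

    MIB = 1024.0 * 1024.0

    # Dimensions and band count reported by vipsheader for the failing upload.
    bytes = decoded_bytes(23_400, 15_600, 3)
    puts format("%.0f MiB", bytes / MIB)   # prints "1044 MiB"
    ```

    That is already about twice the 500MB container limit, before any other allocations.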

    You can detect an interlaced image in ruby-vips with:

    # The "interlaced" metadata field is only present on interlaced (progressive) images.
    if image.get_typeof("interlaced") != 0
      raise "argh! can't handle this"
    end
    

    I would do that test early on in your application and block upload of this type of file.
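
    One way to run that check before libvips even opens the file is to scan the JPEG segment headers for a progressive start-of-frame marker. This is a minimal sketch in plain Ruby, not something ruby-vips provides: progressive_jpeg? and the two constants are hypothetical names, and it assumes a well-formed JPEG whose frame header comes before the scan data.

    ```ruby
    require "stringio"

    # Start-of-frame markers that declare a progressive frame (e.g. SOF2 = 0xC2).
    PROGRESSIVE_SOF = [0xC2, 0xC6, 0xCA, 0xCE].freeze
    # Every other start-of-frame marker (baseline, extended sequential, lossless).
    NON_PROGRESSIVE_SOF = [0xC0, 0xC1, 0xC3, 0xC5, 0xC7, 0xC9, 0xCB, 0xCD, 0xCF].freeze

    # Returns true if the stream looks like a JPEG with a progressive frame header.
    # Only segment headers are read, so memory use does not grow with image size.
    def progressive_jpeg?(io)
      # A JPEG must begin with the SOI marker, 0xFF 0xD8.
      return false unless io.read(2)&.bytes == [0xFF, 0xD8]

      loop do
        # Every segment starts with 0xFF, possibly followed by 0xFF fill bytes.
        return false unless io.getbyte == 0xFF
        marker = io.getbyte
        marker = io.getbyte while marker == 0xFF
        return false if marker.nil?

        return true if PROGRESSIVE_SOF.include?(marker)
        # Another frame type, start of scan (0xDA) or end of image (0xD9): stop.
        return false if NON_PROGRESSIVE_SOF.include?(marker) || marker == 0xDA || marker == 0xD9

        # TEM and RSTn markers stand alone and carry no length field.
        next if marker == 0x01 || (0xD0..0xD7).cover?(marker)

        # Other segments carry a 2-byte big-endian length that includes itself.
        length_bytes = io.read(2)
        return false if length_bytes.nil? || length_bytes.bytesize < 2
        length = length_bytes.unpack1("n")
        return false if length < 2
        io.seek(length - 2, IO::SEEK_CUR)
      end
    end

    # Synthetic header for illustration: SOI, a 16-byte APP0 segment, then SOF2.
    header = [0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10].pack("C*") + ("\x00".b * 14) +
             [0xFF, 0xC2, 0x00, 0x11].pack("C*") + ("\x00".b * 15)
    puts progressive_jpeg?(StringIO.new(header))   # prints "true"
    ```

    In an upload handler this could be called as File.open(path, "rb") { |f| progressive_jpeg?(f) }, where path is wherever the upload was stored. The get_typeof("interlaced") test above remains the check that reflects what libvips itself sees.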