I have a Ruby on Rails web application that allows users to upload images, which are then automatically resized to small thumbnails using libvips and the ImageProcessing Ruby gem. Sometimes users legitimately need to upload 100MP+ images. These large images break our server, which only has 1GB of RAM. If it's relevant, these images are almost always JPEGs.
What I'm hoping is to use libvips to first scale down these images to a size that my server can handle--maybe like under 8,000x8,000 pixels--without using lots of RAM. Then I would use that image to do the other things we already do, like change the colorspace to sRGB and resize and strip metadata, etc.
Is this possible? If so can you give an example of a vips or vipsthumbnail linux CLI command?
I found a feature in ImageMagick that should theoretically solve this issue, mentioned in the two links below. But I don't want to have to switch the whole system to ImageMagick just for this.
https://legacy.imagemagick.org/Usage/formats/#jpg_read
https://github.com/janko/image_processing/wiki/Improving-ImageMagick-performance
P.S.: I'm using Heroku so if the RAM usage peaks at up to 2GB the action should still work.
(I've always been confused about why image processing seems to always require loading the entire image in RAM at once...)
UPDATE:
I'm providing more context because jcupitt's command is still failing for me.
This is the main software that is installed on the Docker container that is running libvips, as defined in the Dockerfile:
FROM ruby:3.1.2
RUN apt-get update -qq && apt-get install -y postgresql-client
# uglifier requires nodejs -- `apt-get install nodejs` only installs older version by default
RUN apt-get install -y curl
RUN curl -sL https://deb.nodesource.com/setup_14.x | bash -
RUN apt-get install -y nodejs
RUN apt-get install -y libvips libvips-dev libvips-tools
# install pdftotext
RUN apt-get install -y xpdf
I am limiting the memory usage of the sidekiq container to 500MB to more closely match the production server. (I also tried limiting memory and reserved memory to 1GB, and the same thing happens.) This is the config as specified in docker-compose.yml:
sidekiq:
  depends_on:
    - db
    - redis
  build: .
  command: sidekiq -c 1 -v -q mailers -q default -q low -q searchkick
  volumes:
    - '.:/myapp'
  env_file:
    - '.env'
  deploy:
    resources:
      limits:
        memory: 500M
      reservations:
        memory: 500M
This is the exact command I'm trying, based on the command that jcupitt suggested:
First I run docker stats --all to see the sidekiq container's memory usage after booting up, before running libvips:
CONTAINER ID NAME CPU % MEM USAGE / LIMIT MEM % NET I/O BLOCK I/O PIDS
4d7e9ff9c7c7 sidekiq_1 0.48% 210.2MiB / 500MiB 42.03% 282kB / 635kB 133MB / 0B 7
I also check docker-compose exec sidekiq top, which reports a higher RAM total. I think that is normal for Docker:
top - 18:39:48 up 1 day, 3:21, 0 users, load average: 0.01, 0.08, 0.21
Tasks: 3 total, 1 running, 2 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.2 us, 1.5 sy, 0.0 ni, 97.1 id, 0.2 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 3929.7 total, 267.4 free, 1844.1 used, 1818.1 buff/cache
MiB Swap: 980.0 total, 61.7 free, 918.3 used. 1756.6 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 root 20 0 607688 190620 12848 S 0.3 4.7 0:10.31 ruby
54 root 20 0 6984 3260 2772 R 0.3 0.1 0:00.05 top
39 root 20 0 4092 3256 2732 S 0.0 0.1 0:00.03 bash
then I run the command
docker-compose exec sidekiq bash
root@4d7e9ff9c7c7:/myapp# vipsheader /tmp/shrine20220728-1-8yqju5.jpeg
/tmp/shrine20220728-1-8yqju5.jpeg: 23400x15600 uchar, 3 bands, srgb, jpegload
VIPS_CONCURRENCY=1 vipsthumbnail /tmp/shrine20220728-1-8yqju5.jpeg --size 500x500
Then in another terminal window I check docker stats --all again. Within about 0.5s the memory usage shoots up to 500MB, and the vipsthumbnail process dies and just prints "Killed".
libvips will almost always stream images rather than loading them in memory, so you should not see high memory use.
For example:
$ vipsheader st-francis.jpg
st-francis.jpg: 30000x26319 uchar, 3 bands, srgb, jpegload
$ ls -l st-francis.jpg
-rw-rw-r-- 1 john john 227612475 Sep 17 2020 st-francis.jpg
$ /usr/bin/time -f %M:%e vipsthumbnail st-francis.jpg --size 500x500
87412:2.57
So 87MB of memory and 2.5s. The image is about 2.4GB uncompressed (30000 x 26319 pixels x 3 bytes). You should get the same performance with the ImageProcessing gem, since it drives libvips through ruby-vips.
In fact there's not much useful concurrency for this sort of operation, so you can run libvips with a small threadpool.
$ VIPS_CONCURRENCY=1 /usr/bin/time -f %M:%e vipsthumbnail st-francis.jpg --size 500x500
52624:2.49
So with one thread in the threadpool it's about the same speed, but memory use is down to 50MB.
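From Ruby, the same single-threaded invocation could be sketched with Open3 from the standard library. This is only a sketch: the helper names thumbnail_command and run_thumbnail are made up for illustration, and it assumes vipsthumbnail is on the PATH. It uses nothing beyond the --size option and the VIPS_CONCURRENCY variable shown above.

```ruby
require "open3"

# Build the environment and argument list for a vipsthumbnail run with a
# small threadpool. Hypothetical helper, not part of any gem.
def thumbnail_command(path, size: "500x500", concurrency: 1)
  env  = { "VIPS_CONCURRENCY" => concurrency.to_s }
  argv = ["vipsthumbnail", path, "--size", size]
  [env, argv]
end

# Run the command without going through a shell, so unusual characters in
# the filename are not interpreted. Raises if vipsthumbnail exits with an
# error or is killed by a signal (for example by the OOM killer).
def run_thumbnail(path, **opts)
  env, argv = thumbnail_command(path, **opts)
  _out, err, status = Open3.capture3(env, *argv)
  raise "vipsthumbnail failed: #{err}" unless status.success?
  true
end
```

Keeping the command builder separate from the runner makes it easy to check the arguments in a unit test without needing libvips installed.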
There are a few cases when this will fail. One is with interlaced (also called progressive) images.
These represent the image as a series of passes of increasingly higher detail. This can help when displaying an image to the user (the image appears in slowly increasing detail, rather than as a line moving down the screen), but unfortunately this also means that you don't get the final value of the first pixel until the entire image has been decompressed. This means you have to decompress the whole image into memory, and makes this type of file extremely unsuitable for large images of the sort you are handling.
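To get a feel for what "the whole image in memory" means, here is a rough arithmetic sketch. It assumes 8-bit samples (uchar, as vipsheader reports) and ignores any working memory libvips needs on top of the pixels. The helper names are hypothetical, and the dimensions are the ones from the vipsheader output in the question. Whether that particular file is progressive is not stated, so treat this as a what-if.

```ruby
# Bytes needed to hold a fully decoded 8-bit image in memory.
def decoded_bytes(width, height, bands = 3)
  width * height * bands
end

# True when a full decode would fit inside the given memory budget.
def fits_in_memory?(width, height, budget_bytes:, bands: 3)
  decoded_bytes(width, height, bands) <= budget_bytes
end

# 23400x15600, 3 bands, as reported by vipsheader in the question.
needed = decoded_bytes(23_400, 15_600)   # 1_095_120_000 bytes, about 1.1GB
budget = 500 * 1024 * 1024               # the 500M container limit

puts needed
puts fits_in_memory?(23_400, 15_600, budget_bytes: budget)   # prints false
```

A progressive file of that size would need more than twice the container's limit just for the pixels, which is consistent with a process being killed almost immediately.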
You can detect an interlaced image in ruby-vips with:
# get_typeof returns 0 when the metadata field is absent
if image.get_typeof("interlaced") != 0
  raise "argh! can't handle this"
end
I would do that test early on in your application and block upload of this type of file.
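If you would rather reject such files before libvips touches them at all, one alternative (not what the check above does) is to read the JPEG segment headers directly and look at the start-of-frame marker. The sketch below is a minimal, assumption-laden version: progressive_jpeg? is a hypothetical helper, it relies on the JPEG marker layout (SOF2, SOF6, SOF10 and SOF14 being the progressive frame types), and it gives up with nil on anything it does not understand.

```ruby
require "stringio"

# Returns true for a progressive JPEG, false for a non-progressive one,
# and nil when the data is not a JPEG or no frame header is found.
# Only segment headers are read; no pixel data is decoded.
def progressive_jpeg?(io)
  progressive_sof = [0xC2, 0xC6, 0xCA, 0xCE]
  # All start-of-frame markers: 0xC0..0xCF minus DHT, JPG and DAC.
  all_sof = (0xC0..0xCF).to_a - [0xC4, 0xC8, 0xCC]

  return nil unless io.read(2)&.bytes == [0xFF, 0xD8]   # SOI marker

  loop do
    return nil unless io.getbyte == 0xFF
    marker = io.getbyte
    marker = io.getbyte while marker == 0xFF            # skip fill bytes
    return nil if marker.nil?
    return progressive_sof.include?(marker) if all_sof.include?(marker)
    return nil if [0xDA, 0xD9].include?(marker)         # scan data or end of image
    next if marker == 0x01 || (0xD0..0xD7).cover?(marker) # markers with no length

    len_bytes = io.read(2)
    return nil if len_bytes.nil? || len_bytes.bytesize < 2
    length = len_bytes.unpack1("n")                     # big-endian, includes itself
    return nil if length < 2
    io.seek(length - 2, IO::SEEK_CUR)                   # jump over the payload
  end
end

# Tiny in-memory demo: SOI followed directly by a SOF2 marker.
demo = StringIO.new([0xFF, 0xD8, 0xFF, 0xC2, 0x00, 0x02].pack("C*"))
puts progressive_jpeg?(demo)   # prints true
```

For a real upload you would pass a file opened in binary mode, for example File.open(path, "rb") { |f| progressive_jpeg?(f) }.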