I use chromedp with Go to convert HTML pages into images (using the screenshot method), but it uses a lot of CPU. Is that normal, and can I optimize it? Here is how I use it. (I tried opening the first tab, but everything works the same: after the .Run method the endpoint stops responding, although the app keeps running normally and parallel healthcheck requests are OK.)
Here is the code:
func (cdp *ChromeDP) ConvertToImage(html string) ([]byte, error) {
	var res []byte
	// You should use same context from ChromeDP struct, to avoid zombie chrome process
	ctx1, cancel1 := chromedp.NewContext(cdp.ctx)
	defer cancel1()
	var wg sync.WaitGroup
	wg.Add(1)
	if err := chromedp.Run(ctx1,
		chromedp.Navigate("about:blank"),
		chromedp.ActionFunc(func(ctx context.Context) error {
			lctx, cancel := context.WithCancel(ctx)
			chromedp.ListenTarget(lctx, func(ev interface{}) {
				if _, ok := ev.(*page.EventLoadEventFired); ok {
					wg.Done()
					cancel()
				}
			})
			return nil
		}),
		chromedp.ActionFunc(func(ctx context.Context) error {
			frameTree, err := page.GetFrameTree().Do(ctx)
			if err != nil {
				return err
			}
			return page.SetDocumentContent(frameTree.Frame.ID, html).Do(ctx)
		}),
		chromedp.ActionFunc(func(ctx context.Context) error {
			wg.Wait()
			return nil
		}),
		chromedp.ActionFunc(func(ctx context.Context) error {
			result, err := page.CaptureScreenshot().Do(ctx)
			if err != nil {
				return err
			}
			res = result
			return nil
		}),
	); err != nil {
		return nil, err
	}
	return res, nil
}
(It's not a wg problem, I tested that :D)
I had the same problem using an Alpine Docker image with Chromium installed. chromedp can drive any Chrome instance to execute commands, so the browser build it talks to matters.
I changed the Dockerfile to use chromedp/headless-shell:
# Build
FROM golang:latest AS build
RUN mkdir /go/src/app
COPY . /go/src/app
WORKDIR /go/src/app
RUN go mod tidy
RUN go build -o /go/src/app/main main.go
# Run
FROM chromedp/headless-shell:latest
COPY --from=build /go/src/app/main /go/src/app/
COPY --from=build /go/src/app/templates /go/src/app/templates
WORKDIR /go/src/app
EXPOSE 8080
RUN apt-get update && apt-get install -y dumb-init
ENTRYPOINT ["dumb-init", "--"]
CMD ["/go/src/app/main"]
I replaced the Alpine image with a Debian-based image (golang:latest in the build stage) because chromedp/headless-shell (the second stage of the image) is based on Debian.
In the cluster, the service used to require 400% of CPU. After the change it dropped to 2%. You can use any AWS node with it.
Another approach is to always keep a browser instance running: the first chromedp.NewContext(cdp.ctx) should open an about:blank tab that stays open, so the browser is never closed when the other tabs are closed.