Building a scalable Screenshot Service with Go, ChromeDP, and Docker

January 12, 2025

At one of my previous jobs, we faced the challenge of taking screenshots of posts at scale. Our goal was to capture specific content from web pages, such as blog posts or articles, to generate previews, thumbnails, or for further processing. The volume of requests made it necessary to find a lightweight and scalable solution that could handle this task efficiently.

We explored several options for headless browsers, including Puppeteer and Playwright, but found that they were heavier than what we needed. They often involved a lot of overhead, particularly in terms of memory usage and startup time, which became a bottleneck when scaling up.

In the search for a more efficient solution, we came across ChromeDP. ChromeDP is a Go package that drives Chrome directly over the Chrome DevTools Protocol, so it needs no Node.js runtime alongside the browser. That kept memory usage and startup time low, and by building on it in Go we were able to create a service that could handle thousands of screenshot requests with minimal resource consumption, which made it a good fit for our needs.

In this post, we will walk through the creation of a Go-based screenshot service that uses the ChromeDP package to capture screenshots of a webpage. The service lets users send a URL and a CSS class name, and returns a screenshot of the matching element on the page. We will also set up the service inside a Docker container so that Chrome, its fonts, and the Go binary ship together as a single image.

Table of Contents

  1. Service Overview
  2. Code Explanation
  3. Running the Service
  4. Browser Inconsistencies and Unexpected Results
  5. Conclusion

Service Overview

How the Screenshot Service Works

The service uses Go to accept HTTP POST requests for screenshot generation. Each request supplies a URL and a CSS class name; the code can be modified to accept any selector you prefer.
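As a rough illustration of the class-name-to-selector step, here is a minimal sketch. The helper name classSelector is hypothetical and not part of the original service, and it assumes class names that need no CSS escaping.

```go
package main

import (
	"fmt"
	"strings"
)

// classSelector turns a class attribute value such as "card featured"
// into a CSS selector such as ".card.featured". A leading dot on a
// class is tolerated. Hypothetical helper, not from the original code.
func classSelector(classname string) string {
	var b strings.Builder
	for _, class := range strings.Fields(classname) {
		b.WriteString(".")
		b.WriteString(strings.TrimPrefix(class, "."))
	}
	return b.String()
}

func main() {
	fmt.Println(classSelector("post-content"))
	fmt.Println(classSelector("card featured"))
}
```

Accepting a raw selector instead would simply mean skipping this step and passing the user's string through after validation.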

The Endpoint

The server listens on a specified port (3000 in this case) and exposes an endpoint /screenshot. This endpoint is designed to handle POST requests that contain two key pieces of data in the request body: a URL and a CSS class name. The request body is expected to be in JSON format, with fields for the url and classname (the CSS class of the element to capture).
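To make the expected body concrete, here is a small sketch of decoding and validating it. The type and function names (screenshotRequest, parseRequest) and the example.com URL are illustrative assumptions; only the url and classname field names come from the service itself.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// screenshotRequest mirrors the JSON body described above.
// The type name is an assumption made for this sketch.
type screenshotRequest struct {
	URL       string `json:"url"`
	Classname string `json:"classname"`
}

// parseRequest decodes a JSON body and rejects missing fields.
func parseRequest(body string) (screenshotRequest, error) {
	var req screenshotRequest
	if err := json.NewDecoder(strings.NewReader(body)).Decode(&req); err != nil {
		return req, fmt.Errorf("invalid JSON body: %w", err)
	}
	if req.URL == "" {
		return req, errors.New("url is required")
	}
	if req.Classname == "" {
		return req, errors.New("classname is required")
	}
	return req, nil
}

func main() {
	body := `{"url": "https://example.com/blog/post-1", "classname": "post-content"}`
	req, err := parseRequest(body)
	fmt.Println(req.URL, req.Classname, err)
}
```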

Handling Requests

Once the server receives a request, the following steps are taken:

  1. Input Validation:

    • The server checks that the incoming request method is a POST. If it isn’t, it returns a 405 Method Not Allowed error.
    • It then attempts to decode the JSON body into a Go struct. If decoding fails or if the URL or CSS class name is missing, the server returns a 400 Bad Request error with the appropriate message.
  2. Taking the Screenshot:

    • If the request is valid, the server invokes the takeScreenshot function, which uses the ChromeDP Go package to interact with a headless Chrome browser.
    • ChromeDP allows the server to programmatically load the provided URL in Chrome, scroll to the specified element identified by the provided CSS class name, and take a screenshot of that element.
    • The captured image is converted to WebP, a compact image format well suited to web usage.
  3. Retries and Error Handling:

    • The service includes logic to handle retries in case the screenshot attempt fails. For example, if the page doesn’t load properly or if the screenshot is too small (less than 60 pixels in height), the service will attempt to take the screenshot again, up to a maximum of five attempts.
    • If there is a fatal error (such as a 404 error when accessing the page), the service returns the appropriate HTTP status (e.g., 404 Not Found).
  4. Returning the Screenshot:

    • Once a valid screenshot is successfully captured, the service sends the screenshot back to the user in the HTTP response with the correct Content-Type (image/webp), indicating that the image is in WebP format.
  5. Cleanup:

    • After each request, temporary files and resources (such as the user data directory used by ChromeDP) are cleaned up to ensure the service remains efficient and does not consume unnecessary disk space.

This approach allows the service to be lightweight and efficient, providing fast, reliable, and scalable screenshot functionality for capturing specific elements on web pages. The use of retries and error handling ensures the service can handle different failure scenarios and return high-quality images when successful.
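The retry behaviour in step 3 can be sketched on its own, separate from Chrome. This is a simplified illustration, not the service's actual loop: the retry function and both sentinel errors are names introduced here, and comparing with errors.Is is a substitute for the string comparison the handler uses.

```go
package main

import (
	"errors"
	"fmt"
)

// Sentinel errors introduced for this sketch.
var (
	errPageNotFound = errors.New("page returned 404")
	errTooSmall     = errors.New("screenshot below height threshold")
)

// retry calls attempt up to maxAttempts times. It stops early on success
// or on a fatal error (a 404 will not fix itself on retry). It returns
// the number of attempts made and the last error.
func retry(maxAttempts int, attempt func() error) (int, error) {
	if maxAttempts < 1 {
		return 0, errors.New("maxAttempts must be at least 1")
	}
	var err error
	for n := 1; n <= maxAttempts; n++ {
		err = attempt()
		if err == nil {
			return n, nil
		}
		if errors.Is(err, errPageNotFound) {
			return n, err
		}
	}
	return maxAttempts, err
}

func main() {
	calls := 0
	n, err := retry(5, func() error {
		calls++
		if calls < 3 {
			return errTooSmall // simulate two undersized captures
		}
		return nil
	})
	fmt.Println(n, err)
}
```

Using a sentinel error keeps the fatal-versus-retryable decision in one place, which is easier to extend than matching on error text.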

Code Explanation

Main Server

The main server is responsible for accepting HTTP requests and forwarding them to the screenshot handler. It listens for POST requests at the /screenshot endpoint.

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

func main() {
	port := 3000
	http.HandleFunc("/screenshot", screenshotHandler)
	log.Printf("Server started on port %d\n", port)
	go cleanupTempFolders() // Run cleanup function concurrently
	err := http.ListenAndServe(fmt.Sprintf(":%d", port), nil)
	if err != nil {
		log.Fatalf("Server error: %v", err)
	}
}

func screenshotHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "Method not allowed", http.StatusMethodNotAllowed)
		return
	}

	// Decode into a per-request value. A shared package-level variable
	// would be a data race when requests arrive concurrently.
	var request struct {
		URL       string `json:"url"`
		CLASSNAME string `json:"classname"`
	}
	if err := json.NewDecoder(r.Body).Decode(&request); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		log.Println("Error decoding request:", err)
		return
	}
	if request.URL == "" {
		http.Error(w, "URL is required", http.StatusBadRequest)
		log.Println("Empty URL provided")
		return
	}
	if request.CLASSNAME == "" {
		http.Error(w, "CLASSNAME is required", http.StatusBadRequest)
		log.Println("Empty CLASSNAME provided")
		return
	}
	log.Println("Received screenshot request for URL:", request.URL)

	var (
		screenshotData []byte
		userDataDir    string
		err            error
		height         int
		ok             bool
	)
	// Try at most five times, as described in the overview.
	for attempts := 1; attempts <= 5; attempts++ {
		log.Println("Running attempt", attempts)
		screenshotData, userDataDir, err = takeScreenshot(request.URL, request.CLASSNAME)
		if err != nil {
			// A missing page will not appear on retry, so fail fast.
			if err.Error() == "page returned 404" {
				http.Error(w, "Page returned 404", http.StatusNotFound)
				log.Println("Page returned 404 for URL:", request.URL)
				return
			}
			log.Println("Error taking screenshot for URL:", request.URL, "Error:", err)
			continue
		}
		height, err = getImageHeight(screenshotData)
		if err != nil {
			log.Println("Error decoding screenshot data:", err)
			continue
		}
		if height >= 60 {
			ok = true
			break
		}
		log.Printf("Screenshot height below threshold (%d pixels), retrying...\n", height)
	}
	cleanup(userDataDir)

	// Do not answer 200 with an empty or undersized image.
	if !ok {
		http.Error(w, "Failed to capture screenshot", http.StatusInternalServerError)
		log.Println("All attempts failed for URL:", request.URL)
		return
	}

	w.Header().Set("Content-Type", "image/webp")
	w.WriteHeader(http.StatusOK)
	w.Write(screenshotData)
	log.Println("Screenshot sent for URL:", request.URL)
}
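The cleanupTempFolders function started from main is not shown in this post. As a guess at what such a sweeper could look like, here is a minimal sketch that removes stale Chrome profile directories from the system temp folder. The removeStaleDirs and demoCleanup helpers, the ten-minute interval, and the five-minute age limit are all assumptions; only the chromedp-userData prefix comes from the capture code.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// removeStaleDirs deletes directories directly under root whose names start
// with prefix and whose modification time is at least maxAge in the past.
// It returns the number of directories removed.
func removeStaleDirs(root, prefix string, maxAge time.Duration) (int, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return 0, err
	}
	removed := 0
	for _, entry := range entries {
		if !entry.IsDir() || !strings.HasPrefix(entry.Name(), prefix) {
			continue
		}
		info, err := entry.Info()
		if err != nil {
			continue // entry vanished between ReadDir and Info
		}
		if time.Since(info.ModTime()) < maxAge {
			continue // recent enough that a request may still be using it
		}
		if err := os.RemoveAll(filepath.Join(root, entry.Name())); err != nil {
			return removed, err
		}
		removed++
	}
	return removed, nil
}

// cleanupTempFolders sweeps the temp directory on a fixed interval.
// Both durations are illustrative values, not taken from the original.
func cleanupTempFolders() {
	ticker := time.NewTicker(10 * time.Minute)
	defer ticker.Stop()
	for range ticker.C {
		n, err := removeStaleDirs(os.TempDir(), "chromedp-userData", 5*time.Minute)
		if err != nil {
			fmt.Println("cleanup error:", err)
		} else if n > 0 {
			fmt.Println("removed", n, "stale profile directories")
		}
	}
}

// demoCleanup exercises removeStaleDirs against a throwaway directory.
func demoCleanup() (int, error) {
	root, err := os.MkdirTemp("", "cleanup-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(root)
	for _, name := range []string{"chromedp-userData1", "chromedp-userData2", "unrelated"} {
		if err := os.Mkdir(filepath.Join(root, name), 0o755); err != nil {
			return 0, err
		}
	}
	return removeStaleDirs(root, "chromedp-userData", 0)
}

func main() {
	n, err := demoCleanup()
	fmt.Println(n, err)
}
```

The age check matters because a sweep that ignored it could delete the profile directory of a request that is still in flight.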

Screenshot Capture Logic

The core logic for capturing the screenshot is handled by the takeScreenshot function. It uses the ChromeDP package to start a headless Chrome browser, navigate to the requested URL, and take a screenshot of the specified element using the provided class name.

package main

// Some of these imports are presumably used by helpers not shown here
// (elementScreenshot, getImageHeight, convertToWebP).
import (
	"bytes"
	"context"
	"fmt"
	"image"
	"image/png"
	"log"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
	"time"

	"github.com/chai2010/webp"
	"github.com/chromedp/cdproto/page"
	"github.com/chromedp/chromedp"
)

func cleanup(folder string) {
	os.RemoveAll(folder)
}

func takeScreenshot(url string, classname string) ([]byte, string, error) {
	// Create a temporary folder for user data
	userDataDir, err := os.MkdirTemp("", "chromedp-userData")
	if err != nil {
		return nil, "", err
	}
	defer cleanup(userDataDir)

	opts := append(chromedp.DefaultExecAllocatorOptions[:],
		chromedp.UserDataDir(userDataDir),
		chromedp.Flag("headless", true),
		chromedp.Flag("disable-extensions", true),
	)
	allocatorCtx, allocatorCancel := chromedp.NewExecAllocator(context.Background(), opts...)
	defer allocatorCancel()

	ctx, cancel := chromedp.NewContext(allocatorCtx)
	defer cancel()

	// Start Chrome and capture the PID.
	// Note: pgrep -f chrome matches every Chrome process in the container and
	// only the first PID is kept, so under concurrent requests this may not be
	// the browser started above.
	var chromePID int
	if err := chromedp.Run(ctx, chromedp.ActionFunc(func(ctx context.Context) error {
		cmd := exec.CommandContext(ctx, "pgrep", "-f", "chrome")
		output, err := cmd.Output()
		if err != nil {
			return err
		}
		pidStr := strings.TrimSpace(string(output))
		if pidStr != "" {
			fmt.Sscanf(pidStr, "%d", &chromePID)
		}
		return nil
	})); err != nil {
		return nil, "", err
	}

	// killChrome force-kills the recorded process if one was found.
	killChrome := func() {
		if chromePID > 0 {
			_ = exec.Command("kill", "-9", fmt.Sprintf("%d", chromePID)).Run()
		}
	}

	// Bound the capture to 20 seconds and always kill Chrome on the way out.
	ctx, timeoutCancel := context.WithTimeout(ctx, 20*time.Second)
	defer func() {
		timeoutCancel()
		killChrome()
	}()

	var buf []byte
	if err := chromedp.Run(ctx, elementScreenshot(url, classname, 100, &buf)); err != nil {
		return nil, "", err
	}

	// Check the height of the image. Undersized captures are reported as an
	// error so that the caller's bounded retry loop handles them; retrying
	// recursively here could loop forever on an element that is always small.
	height, err := getImageHeight(buf)
	if err != nil {
		return nil, "", err
	}
	if height < 60 { // Adjust this value as needed
		return nil, "", fmt.Errorf("screenshot height %d is below the 60-pixel threshold", height)
	}

	var webpImg []byte
	webpImg, err = convertToWebP(buf)
	if err != nil {
		return nil, "", err
	}
	// userDataDir is removed by the deferred cleanup before the caller sees it.
	return webpImg, userDataDir, nil
}
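The getImageHeight helper used above is not shown in this post. One plausible way to write it, offered as a sketch rather than the original implementation, is to read only the image header with the standard library. The blankPNG helper exists purely to give the example something to measure.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/png" // importing this also registers the PNG decoder
)

// getImageHeight reads just the image header and returns its height in
// pixels. It works for any format whose decoder is registered with the
// image package; WebP would need a WebP decoder package imported as well.
func getImageHeight(data []byte) (int, error) {
	cfg, _, err := image.DecodeConfig(bytes.NewReader(data))
	if err != nil {
		return 0, fmt.Errorf("decode image header: %w", err)
	}
	return cfg.Height, nil
}

// blankPNG builds an empty PNG of the given size for demonstration.
func blankPNG(width, height int) ([]byte, error) {
	var buf bytes.Buffer
	img := image.NewRGBA(image.Rect(0, 0, width, height))
	if err := png.Encode(&buf, img); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := blankPNG(200, 75)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	height, err := getImageHeight(data)
	fmt.Println(height, err)
}
```

image.DecodeConfig avoids decoding the full pixel data, which keeps the threshold check cheap even for tall screenshots.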

Docker Configuration

To deploy the service in a Docker container, we need to set up a Dockerfile that installs all necessary dependencies, including Google Chrome and the Go runtime, and builds the Go application.

# Use the official Golang image as the base image
FROM golang:1.22.3-bookworm

# Update package list and install necessary packages
RUN apt-get update && \
    apt-get install -y wget unzip gnupg fonts-noto fonts-noto-cjk fonts-noto-core fonts-noto-ui-core fonts-noto-unhinted fonts-noto-color-emoji && \
    rm -rf /var/lib/apt/lists/*

# Install google fonts
RUN wget -q https://github.com/google/fonts/archive/main.zip -O /tmp/fonts.zip \
    && unzip -q /tmp/fonts.zip -d /tmp \
    && mkdir -p /usr/share/fonts/truetype/google-fonts \
    && find /tmp/fonts-main -type f -name "*.ttf" -exec cp {} /usr/share/fonts/truetype/google-fonts/ \; \
    && rm -rf /tmp/fonts.zip /tmp/fonts-main

# Add Google's public key and set up the repository
RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/google-chrome-keyring.gpg && \
    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome-keyring.gpg] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google-chrome.list

# Install Google Chrome
RUN apt-get update && \
    apt-get install -y google-chrome-stable fontconfig && \
    rm -rf /var/lib/apt/lists/*

# Set the Current Working Directory inside the container
WORKDIR /app

# Copy fonts folder contents to /usr/share/fonts
COPY fonts/. /usr/share/fonts/

# Refresh font cache
RUN fc-cache -fv

# Copy go.mod and go.sum files
COPY go.mod go.sum ./

# Download all dependencies. Dependencies will be cached if the go.mod and go.sum files are not changed
RUN go mod download

# Copy the source code into the container
COPY . .

# Build the Go application
RUN go build -o main .

# Command to run the executable
CMD ["./main"]

Explanation

Use the official Golang image as the base image

FROM golang:1.22.3-bookworm

This line specifies that the base image for our Docker container is the official Go image, version 1.22.3, built on Debian 12 (Bookworm). It includes the Go toolchain needed to build and run the application.

Install dependencies

RUN apt-get update && \
    apt-get install -y wget unzip gnupg fonts-noto fonts-noto-cjk fonts-noto-core fonts-noto-ui-core fonts-noto-unhinted fonts-noto-color-emoji && \
    rm -rf /var/lib/apt/lists/*

Here, we update the package list and install several dependencies required for the container to run smoothly:

  • wget and unzip are used to download and extract files.
  • gnupg is used for handling GPG keys.
  • Various Noto font packages are installed to support text rendering in Chrome.

After the installations are completed, we clean up the package list (/var/lib/apt/lists/*) to reduce the image size.

Download and install Google Fonts

RUN wget -q https://github.com/google/fonts/archive/main.zip -O /tmp/fonts.zip \
    && unzip -q /tmp/fonts.zip -d /tmp \
    && mkdir -p /usr/share/fonts/truetype/google-fonts \
    && find /tmp/fonts-main -type f -name "*.ttf" -exec cp {} /usr/share/fonts/truetype/google-fonts/ \; \
    && rm -rf /tmp/fonts.zip /tmp/fonts-main

This section downloads and installs Google fonts from their official repository:

  • Download a ZIP file containing all the Google fonts.
  • Extract the fonts to a temporary folder.
  • Copy the TrueType font files into the /usr/share/fonts/truetype/google-fonts directory.
  • Clean up the temporary files to reduce the size of the container.

Add Google Chrome repository and install Google Chrome

RUN wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/google-chrome-keyring.gpg && \
    echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome-keyring.gpg] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google-chrome.list

RUN apt-get update && \
    apt-get install -y google-chrome-stable fontconfig && \
    rm -rf /var/lib/apt/lists/*

The first command adds the Google public key to the system's keyring and registers the Chrome repository. The public key is used to verify the authenticity of the Chrome package. The second command installs google-chrome-stable together with fontconfig, which provides the fc-cache command used later.

Set the working directory for the application

WORKDIR /app

This line sets the working directory inside the container to /app. Any subsequent commands will be executed in this directory.

Copy the fonts into the container

COPY fonts/. /usr/share/fonts/

This command copies the fonts directory from the build context on the host machine into the container, making any custom fonts available to Chrome.

Refresh font cache

RUN fc-cache -fv

This command refreshes the font cache inside the container so that Chrome can use the newly copied fonts.

Copy Go modules and download dependencies

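COPY go.mod go.sum ./
RUN go mod download

Here, we copy the go.mod and go.sum files into the container and download all the Go dependencies. Doing this before copying the rest of the source lets Docker reuse the cached dependency layer as long as these two files are unchanged.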

Copy the application source code into the container

COPY . .

We copy the entire source code into the container.

Build the Go application

RUN go build -o main .

This line builds the Go application inside the container, producing the executable named main.

Set the default command to run the application

CMD ["./main"]

Finally, this is the command that runs when the Docker container starts. It executes the main binary (the Go application) to start the screenshot service.

Running the Service

To run the service, simply build and run the Docker container:

  1. Build the Docker image:
docker build -t screenshot-service .
  2. Run the container:
docker run -p 3000:3000 screenshot-service

The service will be available at http://localhost:3000. You can send a POST request to the /screenshot endpoint with a JSON body containing the url and classname.
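For a quick end-to-end check, a small Go client might look like the sketch below. It assumes the container is reachable on localhost port 3000, as mapped by the docker run command above. The function names, the example.com URL, the post-content class, the two-minute timeout, and the screenshot.webp output file are placeholders chosen for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// buildPayload encodes the JSON body expected by the /screenshot endpoint.
func buildPayload(pageURL, classname string) ([]byte, error) {
	return json.Marshal(struct {
		URL       string `json:"url"`
		Classname string `json:"classname"`
	}{URL: pageURL, Classname: classname})
}

// fetchScreenshot posts the request and returns the image bytes.
func fetchScreenshot(endpoint, pageURL, classname string) ([]byte, error) {
	payload, err := buildPayload(pageURL, classname)
	if err != nil {
		return nil, err
	}
	// Generous timeout, since the service may retry the capture several times.
	client := &http.Client{Timeout: 2 * time.Minute}
	resp, err := client.Post(endpoint, "application/json", bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("screenshot service returned %s: %s", resp.Status, bytes.TrimSpace(body))
	}
	return body, nil
}

func main() {
	img, err := fetchScreenshot("http://localhost:3000/screenshot",
		"https://example.com/blog/post-1", "post-content")
	if err != nil {
		fmt.Fprintln(os.Stderr, "request failed:", err)
		return
	}
	if err := os.WriteFile("screenshot.webp", img, 0o644); err != nil {
		fmt.Fprintln(os.Stderr, "could not save image:", err)
		return
	}
	fmt.Println("saved screenshot.webp,", len(img), "bytes")
}
```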

Browser Inconsistencies and Unexpected Results

While our screenshot service is powerful and efficient, it's important to note that browser behavior can sometimes be unpredictable. Various factors, such as rendering differences, network issues, or resource-intensive pages, may cause a small percentage of screenshots to be inaccurate or incomplete. These inconsistencies are generally rare but can occur, especially when dealing with dynamic content or animations.

To mitigate these issues, consider implementing additional techniques, such as:

  • Retrying Failed Screenshots: Automatically retry capturing a screenshot if the first attempt fails.
  • Adding Wait Times for Content Rendering: Introducing a delay before capturing to ensure all elements are fully loaded.
  • Customizing Timeout Settings: Adjusting timeout values for network requests or element visibility checks.
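One way to combine retries with wait times is to pause a little longer before each new attempt. The sketch below uses capped exponential backoff, which is a technique added here for illustration and not something the service described above does. The backoffDelay name and the 500 ms and 4 s values are assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// backoffDelay returns how long to wait before retry number attempt
// (0-based): base, 2*base, 4*base, and so on, never exceeding max.
func backoffDelay(attempt int, base, max time.Duration) time.Duration {
	if base >= max {
		return max
	}
	delay := base
	for i := 0; i < attempt; i++ {
		delay *= 2
		if delay >= max {
			return max
		}
	}
	return delay
}

func main() {
	for attempt := 0; attempt < 5; attempt++ {
		fmt.Println(attempt, backoffDelay(attempt, 500*time.Millisecond, 4*time.Second))
	}
}
```

Giving a slow page progressively more time to settle tends to help with late-loading content, at the cost of a longer worst-case response time.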

Despite these potential challenges, the service remains robust for most use cases. However, being aware of these limitations can help manage expectations and guide future enhancements.

Conclusion

In this tutorial, we built a screenshot service using Go, ChromeDP, and Docker. The service captures screenshots of specific elements on a webpage and returns them in WebP format. We also learned how to set up a Docker container to run the service, ensuring portability and easy deployment. This service can be extended to support additional functionalities, such as dynamic resizing or advanced error handling.