Running Puppeteer on a Server: A Complete Guide

hao123数码, 22/12/2024 18:46:36

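Puppeteer is a Node.js library that provides a high-level API for controlling Chrome or Chromium over the DevTools Protocol. It is a powerful tool for web scraping, automated testing, capturing screenshots, and more. Using Puppeteer locally is straightforward, but running it on a server requires some extra consideration. This guide walks you through the steps to get Puppeteer up and running on a server.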

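Preparing the Server for Puppeteer

Update the Server

This step is essential for Puppeteer to run successfully, since the browser dependencies installed in the next step come from the system package repositories. Run the following commands.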

sudo apt update -y
sudo apt upgrade -y


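Install Dependencies

Install the following system libraries so that the browser Puppeteer launches can start without errors. The package names are for Debian/Ubuntu. Note that libasound2t64 is the name used on Ubuntu 24.04; on older releases the same library is packaged as libasound2.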

sudo apt-get install libpangocairo-1.0-0 libx11-xcb1 libxcomposite1 libxcursor1 libxdamage1 libxi6 libxtst6 libnss3 libcups2 libxss1 libxrandr2 libatk1.0-0 libgtk-3-0 libasound2t64


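Install Puppeteer

Run the following command to install the latest version of Puppeteer. Installing the latest version is generally recommended for the best performance.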

npm i puppeteer
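
If the browser later fails to start with a shared-library error, it can help to check which libraries are still missing. The snippet below is a minimal sketch of one way to do that, not part of the original guide: it runs the standard Linux `ldd` tool on a binary and collects the entries reported as "not found". The helper names `findMissingLibs` and `checkBinary` are made up for this illustration, and the usage comment assumes Puppeteer has already downloaded a browser.

```javascript
const { spawnSync } = require("node:child_process");

// Extracts the names of shared libraries that `ldd` reports as "not found".
function findMissingLibs(lddOutput) {
  return lddOutput
    .split("\n")
    .filter((line) => line.includes("=> not found"))
    .map((line) => line.trim().split(/\s+/)[0]);
}

// Runs `ldd` on a binary and returns the list of unresolved libraries.
// spawnSync is used so a non-zero exit status from ldd does not throw.
function checkBinary(binaryPath) {
  const result = spawnSync("ldd", [binaryPath], { encoding: "utf8" });
  if (result.error) {
    throw result.error; // for example, ldd is not installed
  }
  return findMissingLibs(result.stdout);
}

// Usage (assumes Puppeteer is installed and has downloaded a browser):
//   const puppeteer = require("puppeteer");
//   console.log(checkBinary(puppeteer.executablePath()));
```

An empty array suggests that every library the binary links against was resolved. Any names that do appear can be matched to their packages and installed with apt.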


Using Puppeteer

You can verify that Puppeteer is working correctly by calling the function in the following snippet from whichever route you choose.

const puppeteer = require("puppeteer");

/**
 * Launches a Puppeteer browser, navigates to a webpage, and then closes the browser.
 *
 * Launch Options:
 * - headless: Run the browser in headless mode (no GUI).
 * - args:
 *   - "--no-sandbox": Required if running as the root user.
 *   - "--disable-setuid-sandbox": Optional, try if you encounter sandbox errors.
 */

const runPuppeteer = async () => {
  let browser;
  try {
    // Launch a Puppeteer browser instance with custom arguments
    browser = await puppeteer.launch({
      headless: true,
      args: [
        "--no-sandbox",
        "--disable-setuid-sandbox",
      ],
    });

    // Open a new page in the browser
    const page = await browser.newPage();

    // Navigate to the specified URL
    await page.goto("https://www.google.com");

    console.log("Navigation to Google completed.");
  } catch (error) {
    console.error("An error occurred:", error);
  } finally {
    // Always close the browser, even if navigation failed,
    // so no orphaned Chrome process is left on the server
    if (browser) {
      await browser.close();
      console.log("Browser closed successfully.");
    }
  }
};

// Execute the function
runPuppeteer();
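
The comments in the snippet above note that "--no-sandbox" is only required when running as root. One possible refinement, sketched here as an assumption rather than something the original code does, is to add the sandbox flags only in that case, so the sandbox stays enabled for unprivileged users. The helper name `buildLaunchOptions` is hypothetical.

```javascript
// Builds Puppeteer launch options, disabling the sandbox only when needed.
// `isRoot` defaults to a check of the current process's user ID
// (process.getuid exists on POSIX platforms only).
function buildLaunchOptions(
  isRoot = typeof process.getuid === "function" && process.getuid() === 0
) {
  const args = [];
  if (isRoot) {
    // Chrome will not start with its sandbox when run as root,
    // so the sandbox is switched off in that case only.
    args.push("--no-sandbox", "--disable-setuid-sandbox");
  }
  return { headless: true, args };
}

// Usage (assumes the `puppeteer` import from the snippet above):
//   const browser = await puppeteer.launch(buildLaunchOptions());
```

Running the Node.js process as a dedicated non-root user and keeping the sandbox on is generally the safer setup when the pages being loaded are not trusted.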

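
The guide suggests calling the check from a route of your choice. As a rough sketch of what that might look like with Node's built-in `http` module, the handler below runs a check function when a specific path is requested. The path `/puppeteer-check`, the port, and the name `createHandler` are arbitrary choices for this illustration. It also assumes the check function rejects on failure; `runPuppeteer` as written above logs errors instead of rethrowing them, so it would need to rethrow for the 500 branch to be reached.

```javascript
// Creates a request handler that runs `check` on GET /puppeteer-check.
// `check` is any async function that rejects when the check fails.
function createHandler(check) {
  return async (req, res) => {
    if (req.method !== "GET" || req.url !== "/puppeteer-check") {
      res.writeHead(404, { "Content-Type": "text/plain" });
      res.end("Not found");
      return;
    }
    try {
      await check();
      res.writeHead(200, { "Content-Type": "text/plain" });
      res.end("Puppeteer is working");
    } catch (error) {
      res.writeHead(500, { "Content-Type": "text/plain" });
      res.end(`Puppeteer check failed: ${error.message}`);
    }
  };
}

// Usage (port 3000 is an arbitrary choice):
//   const http = require("node:http");
//   http.createServer(createHandler(runPuppeteer)).listen(3000);
```

With the server running, requesting the path (for example with curl) triggers one browser launch per request, so a route like this is better suited to a one-off check than to public traffic.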