class: Page

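  • extends: EventEmitter
    In Chromium, Page provides methods to interact with a single tab or extension background page. One Browser instance might have multiple Page instances.
    This example creates a page, navigates it to a URL, and then saves a screenshot: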
    const puppeteer = require('puppeteer');
    (async () => {
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto('https://example.com');
      await page.screenshot({path: 'screenshot.png'});
      await browser.close();
    })();

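The Page class emits various events (described below) which can be handled using any of Node's native EventEmitter methods, such as on, once or removeListener.
This example logs a message for a single page load event: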

page.once('load', () => console.log('Page loaded!'));

To unsubscribe from events, use the removeListener method:

function logRequest(interceptedRequest) {
  console.log('A request was made:', interceptedRequest.url());
}
page.on('request', logRequest);
// Sometime later...
page.removeListener('request', logRequest);
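
The once and removeListener methods can be combined to wait for a single event with a timeout. The following is a minimal sketch, not part of the Puppeteer API: waitForEvent is a hypothetical helper, and a plain Node EventEmitter stands in for a Page so the snippet runs without a browser.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: resolves with the payload of the first `eventName`
// event, or rejects after `timeoutMs`. The listener is removed either way.
function waitForEvent(emitter, eventName, timeoutMs = 1000) {
  return new Promise((resolve, reject) => {
    const onEvent = payload => {
      clearTimeout(timer);
      resolve(payload);
    };
    const timer = setTimeout(() => {
      emitter.removeListener(eventName, onEvent);
      reject(new Error(`Timed out waiting for '${eventName}'`));
    }, timeoutMs);
    // `once` unsubscribes automatically after the first event.
    emitter.once(eventName, onEvent);
  });
}

// Stand-in for a Page (assumption for illustration only).
const fakePage = new EventEmitter();
const loaded = waitForEvent(fakePage, 'load');
fakePage.emit('load');
loaded.then(() => console.log('Page loaded!'));
```

Since Page extends EventEmitter, the same helper should accept a real page in place of fakePage.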

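event: 'close'

Emitted when the page closes.

event: 'console'

  • <[ConsoleMessage]>

Emitted when JavaScript within the page calls one of the console API methods, e.g. console.log or console.dir. Also emitted if the page throws an error or a warning. The arguments passed into console.log appear as arguments on the event handler.

An example of handling the console event: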

page.on('console', msg => {
  for (let i = 0; i < msg.args().length; ++i)
    console.log(`${i}: ${msg.args()[i]}`);
});
page.evaluate(() => console.log('hello', 5, {foo: 'bar'}));

event: 'dialog'

  • <[Dialog]>

Emitted when a JavaScript dialog appears, such as alert, prompt, confirm or beforeunload. Puppeteer can respond to the dialog via [Dialog]'s accept or dismiss methods.
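
A dialog handler might look like the sketch below. autoHandleDialogs and its options are hypothetical names, and the policy (dismiss alerts, answer prompts, dismiss confirms unless told otherwise) is only one possible choice; the Dialog methods used are type, message, accept and dismiss.

```javascript
// Hypothetical helper: respond to every dialog and keep a log of what was seen.
function autoHandleDialogs(page, { acceptConfirms = false, promptText = '' } = {}) {
  const seen = [];
  page.on('dialog', async dialog => {
    seen.push({ type: dialog.type(), message: dialog.message() });
    switch (dialog.type()) {
      case 'alert':
        await dialog.dismiss();
        break;
      case 'prompt':
        // The text passed to accept() is entered into the prompt.
        await dialog.accept(promptText);
        break;
      default: // 'confirm' and 'beforeunload'
        if (acceptConfirms) await dialog.accept();
        else await dialog.dismiss();
    }
  });
  return seen;
}

// Usage (inside an async function, with a real page):
//   const seen = autoHandleDialogs(page, { promptText: 'Ada' });
//   await page.evaluate(() => prompt('Name?'));
```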

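event: 'domcontentloaded'

Emitted when the JavaScript [DOMContentLoaded](developer.mozilla.org/en-US/docs/W... "DOMContentLoaded") event is dispatched.

event: 'error'

  • <[Error]>

Emitted when the page crashes.

NOTE error event has a special meaning in Node, see error events for details.

event: 'frameattached'

  • <[Frame]>

Emitted when a frame is attached.

event: 'framedetached'

  • <[Frame]>

Emitted when a frame is detached.

event: 'framenavigated'

  • <[Frame]>

Emitted when a frame is navigated to a new url.

event: 'load'

Emitted when the JavaScript load event is dispatched.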

event: 'metrics'

  • <[Object]>
    • title <[string]> The title passed to console.timeStamp.
    • metrics <[Object]> Object containing metrics as key/value pairs. The values
      of metrics are of <[number]> type.

Emitted when the JavaScript code makes a call to console.timeStamp. For the list
of metrics see page.metrics.

event: 'pageerror'

  • <[Error]> The exception message

Emitted when an uncaught exception happens within the page.

event: 'popup'

  • <[Page]> Page corresponding to "popup" window

Emitted when the page opens a new tab or window.

// Example 1: popup opened by clicking a link
const [popup] = await Promise.all([
  new Promise(resolve => page.once('popup', resolve)),
  page.click('a[target=_blank]'),
]);

// Example 2: popup opened by calling window.open
const [popup] = await Promise.all([
  new Promise(resolve => page.once('popup', resolve)),
  page.evaluate(() => window.open('https://example.com')),
]);

event: 'request'

  • <[Request]>

Emitted when a page issues a request. The [request] object is read-only.
In order to intercept and mutate requests, see page.setRequestInterception.

event: 'requestfailed'

  • <[Request]>

Emitted when a request fails, for example by timing out.

NOTE HTTP Error responses, such as 404 or 503, are still successful responses from HTTP standpoint, so request will complete with 'requestfinished' event and not with 'requestfailed'.
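
The distinction in the note above can be made explicit by listening to both 'requestfailed' and 'response'. The sketch below is illustrative only: trackFailures is a hypothetical helper, and treating every status of 400 or above as an HTTP error is an assumption.

```javascript
// Hypothetical helper: record transport failures and HTTP error statuses separately.
function trackFailures(page) {
  const report = { failed: [], httpErrors: [] };

  // Network-level failures (timeouts, aborted requests, ...).
  page.on('requestfailed', request => {
    const failure = request.failure(); // null unless the request failed
    report.failed.push({
      url: request.url(),
      error: failure ? failure.errorText : 'unknown',
    });
  });

  // 404 or 503 responses are still delivered as ordinary responses.
  page.on('response', response => {
    if (response.status() >= 400)
      report.httpErrors.push({ url: response.url(), status: response.status() });
  });

  return report;
}

// Usage (with a real page): const report = trackFailures(page);
```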

event: 'requestfinished'

  • <[Request]>

Emitted when a request finishes successfully.

event: 'response'

  • <[Response]>

Emitted when a [response] is received.

event: 'workercreated'

  • <[Worker]>

Emitted when a dedicated WebWorker is spawned by the page.

event: 'workerdestroyed'

  • <[Worker]>

Emitted when a dedicated WebWorker is terminated.

page.$(selector)

  • selector <[string]> A [selector] to query page for
  • returns: <[Promise]<?[ElementHandle]>>

The method runs document.querySelector within the page. If no element matches the selector, the return value resolves to null.

Shortcut for page.mainFrame().$(selector).
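
Because page.$ resolves to null when nothing matches, callers usually need a null check before using the handle. A minimal sketch, where elementExists is a hypothetical helper name:

```javascript
// Hypothetical helper: true if at least one element matches `selector`.
async function elementExists(page, selector) {
  const handle = await page.$(selector);
  if (handle === null) return false; // no element matched
  await handle.dispose(); // release the handle once it is no longer needed
  return true;
}

// Usage (inside an async function, with a real page):
//   if (await elementExists(page, '#search')) { /* ... */ }
```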

page.$$(selector)

  • selector <[string]> A [selector] to query page for
  • returns: <[Promise]<[Array]<[ElementHandle]>>>

The method runs document.querySelectorAll within the page. If no elements match the selector, the return value resolves to [].

Shortcut for page.mainFrame().$$(selector).

page.$$eval(selector, pageFunction[, ...args])

  • selector <[string]> A [selector] to query page for
  • pageFunction <[function]([Array]<[Element]>)> Function to be evaluated in browser context
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction
  • returns: <[Promise]<[Serializable]>> Promise which resolves to the return value of pageFunction

This method runs Array.from(document.querySelectorAll(selector)) within the page and passes it as the first argument to pageFunction.

If pageFunction returns a [Promise], then page.$$eval would wait for the promise to resolve and return its value.

Examples:

const divCount = await page.$$eval('div', divs => divs.length);
const options = await page.$$eval('div > span.options', options => options.map(option => option.textContent));

page.$eval(selector, pageFunction[, ...args])

  • selector <[string]> A [selector] to query page for
  • pageFunction <[function]([Element])> Function to be evaluated in browser context
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction
  • returns: <[Promise]<[Serializable]>> Promise which resolves to the return value of pageFunction

This method runs document.querySelector within the page and passes it as the first argument to pageFunction. If there's no element matching selector, the method throws an error.

If pageFunction returns a [Promise], then page.$eval would wait for the promise to resolve and return its value.

Examples:

const searchValue = await page.$eval('#search', el => el.value);
const preloadHref = await page.$eval('link[rel=preload]', el => el.href);
const html = await page.$eval('.main-container', e => e.outerHTML);

Shortcut for page.mainFrame().$eval(selector, pageFunction).

page.$x(expression)

  • expression <[string]> Expression to evaluate.
  • returns: <[Promise]<[Array]<[ElementHandle]>>>

The method evaluates the XPath expression.

Shortcut for page.mainFrame().$x(expression)

page.accessibility

  • returns: <[Accessibility]>

page.addScriptTag(options)

  • options <[Object]>
    • url <[string]> URL of a script to be added.
    • path <[string]> Path to the JavaScript file to be injected into frame. If path is a relative path, then it is resolved relative to current working directory.
    • content <[string]> Raw JavaScript content to be injected into frame.
    • type <[string]> Script type. Use 'module' in order to load a JavaScript ES6 module. See script for more details.
  • returns: <[Promise]<[ElementHandle]>> which resolves to the added tag when the script's onload fires or when the script content was injected into frame.

Adds a <script> tag into the page with the desired url or content.

Shortcut for page.mainFrame().addScriptTag(options).

page.addStyleTag(options)

  • options <[Object]>
    • url <[string]> URL of the <link> tag.
    • path <[string]> Path to the CSS file to be injected into frame. If path is a relative path, then it is resolved relative to current working directory.
    • content <[string]> Raw CSS content to be injected into frame.
  • returns: <[Promise]<[ElementHandle]>> which resolves to the added tag when the stylesheet's onload fires or when the CSS content was injected into frame.

Adds a <link rel="stylesheet"> tag into the page with the desired url or a <style type="text/css"> tag with the content.

Shortcut for page.mainFrame().addStyleTag(options).

page.authenticate(credentials)

  • credentials <?[Object]>
    • username <[string]>
    • password <[string]>
  • returns: <[Promise]>

Provide credentials for HTTP authentication.

To disable authentication, pass null.
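
One way to scope credentials to a single operation is to set them, run the operation, and then pass null again. This is a sketch under that assumption; withHttpAuth is a hypothetical helper and the credentials in the usage comment are placeholders.

```javascript
// Hypothetical helper: run `action` with HTTP credentials set, then clear them.
async function withHttpAuth(page, credentials, action) {
  await page.authenticate(credentials);
  try {
    return await action(page);
  } finally {
    // Passing null disables authentication again, even if `action` threw.
    await page.authenticate(null);
  }
}

// Usage (inside an async function, with a real page):
//   await withHttpAuth(page, { username: 'user', password: 'pass' },
//     p => p.goto('https://example.com'));
```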

page.bringToFront()

  • returns: <[Promise]>

Brings page to front (activates tab).

page.browser()

  • returns: <[Browser]>

Get the browser the page belongs to.

page.browserContext()

  • returns: <[BrowserContext]>

Get the browser context that the page belongs to.

page.click(selector[, options])

  • selector <[string]> A [selector] to search for element to click. If there are multiple elements satisfying the selector, the first will be clicked.
  • options <[Object]>
    • button <"left"|"right"|"middle"> Defaults to left.
    • clickCount <[number]> defaults to 1. See [UIEvent.detail].
    • delay <[number]> Time to wait between mousedown and mouseup in milliseconds. Defaults to 0.
  • returns: <[Promise]> Promise which resolves when the element matching selector is successfully clicked. The Promise will be rejected if there is no element matching selector.

This method fetches an element with selector, scrolls it into view if needed, and then uses page.mouse to click in the center of the element.
If there's no element matching selector, the method throws an error.

Bear in mind that if click() triggers a navigation event and there's a separate page.waitForNavigation() promise to be resolved, you may end up with a race condition that yields unexpected results. The correct pattern for click and wait for navigation is the following:

const [response] = await Promise.all([
  page.waitForNavigation(waitOptions),
  page.click(selector, clickOptions),
]);

Shortcut for page.mainFrame().click(selector[, options]).

page.close([options])

  • options <[Object]>
    • runBeforeUnload <[boolean]> Defaults to false. Whether to run the before unload page handlers.
  • returns: <[Promise]>

By default, page.close() does not run beforeunload handlers.

NOTE if runBeforeUnload is passed as true, a beforeunload dialog might be summoned
and should be handled manually via page's 'dialog' event.
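
Handling that dialog might look like the following sketch. closeRunningBeforeUnload is a hypothetical helper, and accepting the dialog (that is, confirming that the page may be left) is an assumption about the desired behavior.

```javascript
// Hypothetical helper: close a page while letting its beforeunload handlers run.
async function closeRunningBeforeUnload(page) {
  // Register the handler before closing, so a dialog summoned by close()
  // is not left unanswered. The listener stays attached because the dialog
  // may appear after close() has resolved.
  page.on('dialog', async dialog => {
    if (dialog.type() === 'beforeunload') await dialog.accept();
  });
  await page.close({ runBeforeUnload: true });
}

// Usage (inside an async function, with a real page):
//   await closeRunningBeforeUnload(page);
```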

page.content()

  • returns: <[Promise]<[string]>>

Gets the full HTML contents of the page, including the doctype.

page.cookies([...urls])

  • ...urls <...[string]>
  • returns: <[Promise]<[Array]<[Object]>>>
    • name <[string]>
    • value <[string]>
    • domain <[string]>
    • path <[string]>
    • expires <[number]> Unix time in seconds.
    • size <[number]>
    • httpOnly <[boolean]>
    • secure <[boolean]>
    • session <[boolean]>
    • sameSite <"Strict"|"Lax"|"Extended"|"None">

If no URLs are specified, this method returns cookies for the current page URL.
If URLs are specified, only cookies for those URLs are returned.
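
The returned array can be post-processed with ordinary JavaScript. The helpers below are hypothetical and only illustrate the object shape listed above: name and value for lookup, expires (Unix time in seconds) and session for expiry checks.

```javascript
// Hypothetical helper: index cookie objects (as returned by page.cookies()) by name.
function cookiesByName(cookies) {
  const byName = {};
  for (const cookie of cookies) byName[cookie.name] = cookie.value;
  return byName;
}

// Hypothetical helper: non-session cookies whose `expires` lies in the past.
// `expires` is in seconds, while Date.now() is in milliseconds.
function expiredCookies(cookies, nowMs = Date.now()) {
  return cookies.filter(c => !c.session && c.expires * 1000 < nowMs);
}

// Usage (inside an async function, with a real page):
//   const byName = cookiesByName(await page.cookies('https://example.com'));
```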

page.coverage

  • returns: <[Coverage]>

page.deleteCookie(...cookies)

  • ...cookies <...[Object]>
    • name <[string]> required
    • url <[string]>
    • domain <[string]>
    • path <[string]>
  • returns: <[Promise]>

page.emulate(options)

  • options <[Object]>
    • viewport <[Object]>
      • width <[number]> page width in pixels.
      • height <[number]> page height in pixels.
      • deviceScaleFactor <[number]> Specify device scale factor (can be thought of as dpr). Defaults to 1.
      • isMobile <[boolean]> Whether the meta viewport tag is taken into account. Defaults to false.
      • hasTouch <[boolean]> Specifies if viewport supports touch events. Defaults to false.
      • isLandscape <[boolean]> Specifies if viewport is in landscape mode. Defaults to false.
    • userAgent <[string]>
  • returns: <[Promise]>

Emulates given device metrics and user agent. This method is a shortcut for calling two methods:

  • page.setUserAgent(userAgent)
  • page.setViewport(viewport)

To aid emulation, puppeteer provides a list of device descriptors which can be obtained via puppeteer.devices.

page.emulate will resize the page. A lot of websites don't expect phones to change size, so you should emulate before navigating to the page.

const puppeteer = require('puppeteer');
const iPhone = puppeteer.devices['iPhone 6'];

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.emulate(iPhone);
  await page.goto('https://www.google.com');
  // other actions...
  await browser.close();
})();

The list of all available devices can be found in the source code: src/DeviceDescriptors.ts.

page.emulateMedia(type)

  • type <?[string]> Changes the CSS media type of the page. The only allowed values are 'screen', 'print' and null. Passing null disables CSS media emulation.
  • returns: <[Promise]>

Note: This method is deprecated, and only kept around as an alias for backwards compatibility. Use page.emulateMediaType(type) instead.

page.emulateMediaFeatures(features)

  • features <?[Array]<[Object]>> Given an array of media feature objects, emulates CSS media features on the page. Each media feature object must have the following properties:
    • name <[string]> The CSS media feature name. Supported names are 'prefers-color-scheme' and 'prefers-reduced-motion'.
    • value <[string]> The value for the given CSS media feature.
  • returns: <[Promise]>

await page.emulateMediaFeatures([{ name: 'prefers-color-scheme', value: 'dark' }]);
await page.evaluate(() => matchMedia('(prefers-color-scheme: dark)').matches);
// → true
await page.evaluate(() => matchMedia('(prefers-color-scheme: light)').matches);
// → false
await page.evaluate(() => matchMedia('(prefers-color-scheme: no-preference)').matches);
// → false

await page.emulateMediaFeatures([{ name: 'prefers-reduced-motion', value: 'reduce' }]);
await page.evaluate(() => matchMedia('(prefers-reduced-motion: reduce)').matches);
// → true
await page.evaluate(() => matchMedia('(prefers-reduced-motion: no-preference)').matches);
// → false

await page.emulateMediaFeatures([
  { name: 'prefers-color-scheme', value: 'dark' },
  { name: 'prefers-reduced-motion', value: 'reduce' },
]);
await page.evaluate(() => matchMedia('(prefers-color-scheme: dark)').matches);
// → true
await page.evaluate(() => matchMedia('(prefers-color-scheme: light)').matches);
// → false
await page.evaluate(() => matchMedia('(prefers-color-scheme: no-preference)').matches);
// → false
await page.evaluate(() => matchMedia('(prefers-reduced-motion: reduce)').matches);
// → true
await page.evaluate(() => matchMedia('(prefers-reduced-motion: no-preference)').matches);
// → false

page.emulateMediaType(type)

  • type <?[string]> Changes the CSS media type of the page. The only allowed values are 'screen', 'print' and null. Passing null disables CSS media emulation.
  • returns: <[Promise]>

await page.evaluate(() => matchMedia('screen').matches);
// → true
await page.evaluate(() => matchMedia('print').matches);
// → false

await page.emulateMediaType('print');
await page.evaluate(() => matchMedia('screen').matches);
// → false
await page.evaluate(() => matchMedia('print').matches);
// → true

await page.emulateMediaType(null);
await page.evaluate(() => matchMedia('screen').matches);
// → true
await page.evaluate(() => matchMedia('print').matches);
// → false

page.emulateTimezone(timezoneId)

  • timezoneId <?[string]> Changes the timezone of the page. See ICU’s metaZones.txt for a list of supported timezone IDs. Passing null disables timezone emulation.
  • returns: <[Promise]>
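
A small sketch of emulating a timezone and then reading back what the page reports. timezoneSeenByPage is a hypothetical helper, and 'America/New_York' in the usage comment is an assumed example ID; Intl.DateTimeFormat is a standard JavaScript API evaluated inside the page.

```javascript
// Hypothetical helper: emulate a timezone and report what the page now sees.
async function timezoneSeenByPage(page, timezoneId) {
  await page.emulateTimezone(timezoneId);
  // Evaluated in the page context, not in Node.
  return page.evaluate(() => Intl.DateTimeFormat().resolvedOptions().timeZone);
}

// Usage (inside an async function, with a real page):
//   const zone = await timezoneSeenByPage(page, 'America/New_York');
//   await page.emulateTimezone(null); // disable timezone emulation again
```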

page.evaluate(pageFunction[, ...args])

  • pageFunction <[function]|[string]> Function to be evaluated in the page context
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction
  • returns: <[Promise]<[Serializable]>> Promise which resolves to the return value of pageFunction

If the function passed to the page.evaluate returns a [Promise], then page.evaluate would wait for the promise to resolve and return its value.

If the function passed to the page.evaluate returns a non-[Serializable] value, then page.evaluate resolves to undefined. DevTools Protocol also supports transferring some additional values that are not serializable by JSON: -0, NaN, Infinity, -Infinity, and bigint literals.

Passing arguments to pageFunction:

const result = await page.evaluate(x => {
  return Promise.resolve(8 * x);
}, 7);
console.log(result); // prints "56"

A string can also be passed in instead of a function:

console.log(await page.evaluate('1 + 2')); // prints "3"
const x = 10;
console.log(await page.evaluate(`1 + ${x}`)); // prints "11"

[ElementHandle] instances can be passed as arguments to the page.evaluate:

const bodyHandle = await page.$('body');
const html = await page.evaluate(body => body.innerHTML, bodyHandle);
await bodyHandle.dispose();

Shortcut for page.mainFrame().evaluate(pageFunction, ...args).

page.evaluateHandle(pageFunction[, ...args])

  • pageFunction <[function]|[string]> Function to be evaluated in the page context
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction
  • returns: <[Promise]<[JSHandle]>> Promise which resolves to the return value of pageFunction as in-page object (JSHandle)

The only difference between page.evaluate and page.evaluateHandle is that page.evaluateHandle returns in-page object (JSHandle).

If the function passed to the page.evaluateHandle returns a [Promise], then page.evaluateHandle would wait for the promise to resolve and return its value.

A string can also be passed in instead of a function:

const aHandle = await page.evaluateHandle('document'); // Handle for the 'document'

[JSHandle] instances can be passed as arguments to the page.evaluateHandle:

const aHandle = await page.evaluateHandle(() => document.body);
const resultHandle = await page.evaluateHandle(body => body.innerHTML, aHandle);
console.log(await resultHandle.jsonValue());
await resultHandle.dispose();

Shortcut for page.mainFrame().executionContext().evaluateHandle(pageFunction, ...args).

page.evaluateOnNewDocument(pageFunction[, ...args])

  • pageFunction <[function]|[string]> Function to be evaluated in browser context
  • ...args <...[Serializable]> Arguments to pass to pageFunction
  • returns: <[Promise]>

Adds a function which would be invoked in one of the following scenarios:

  • whenever the page is navigated
  • whenever the child frame is attached or navigated. In this case, the function is invoked in the context of the newly attached frame

The function is invoked after the document was created but before any of its scripts were run. This is useful to amend the JavaScript environment, e.g. to seed Math.random.

An example of overriding the navigator.languages property before the page loads:

// preload.js

// overwrite the `languages` property to use a custom getter
Object.defineProperty(navigator, "languages", {
  get: function() {
    return ["en-US", "en", "bn"];
  }
});

// In your puppeteer script, assuming the preload.js file is in the same folder as the script
const fs = require('fs');
const preloadFile = fs.readFileSync('./preload.js', 'utf8');
await page.evaluateOnNewDocument(preloadFile);

page.exposeFunction(name, puppeteerFunction)

  • name <[string]> Name of the function on the window object
  • puppeteerFunction <[function]> Callback function which will be called in Puppeteer's context.
  • returns: <[Promise]>

The method adds a function called name on the page's window object.
When called, the function executes puppeteerFunction in node.js and returns a [Promise] which resolves to the return value of puppeteerFunction.

If the puppeteerFunction returns a [Promise], it will be awaited.

NOTE Functions installed via page.exposeFunction survive navigations.

An example of adding an md5 function into the page:

const puppeteer = require('puppeteer');
const crypto = require('crypto');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  page.on('console', msg => console.log(msg.text()));
  await page.exposeFunction('md5', text =>
    crypto.createHash('md5').update(text).digest('hex')
  );
  await page.evaluate(async () => {
    // use window.md5 to compute hashes
    const myString = 'PUPPETEER';
    const myHash = await window.md5(myString);
    console.log(`md5 of ${myString} is ${myHash}`);
  });
  await browser.close();
})();

An example of adding a window.readfile function into the page:

const puppeteer = require('puppeteer');
const fs = require('fs');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  page.on('console', msg => console.log(msg.text()));
  await page.exposeFunction('readfile', async filePath => {
    return new Promise((resolve, reject) => {
      fs.readFile(filePath, 'utf8', (err, text) => {
        if (err)
          reject(err);
        else
          resolve(text);
      });
    });
  });
  await page.evaluate(async () => {
    // use window.readfile to read contents of a file
    const content = await window.readfile('/etc/hosts');
    console.log(content);
  });
  await browser.close();
})();

page.focus(selector)

  • selector <[string]> A [selector] of an element to focus. If there are multiple elements satisfying the selector, the first will be focused.
  • returns: <[Promise]> Promise which resolves when the element matching selector is successfully focused. The promise will be rejected if there is no element matching selector.

This method fetches an element with selector and focuses it.
If there's no element matching selector, the method throws an error.

Shortcut for page.mainFrame().focus(selector).

page.frames()

  • returns: <[Array]<[Frame]>> An array of all frames attached to the page.

page.goBack([options])

  • options <[Object]> Navigation parameters which might have the following properties:
    • timeout <[number]> Maximum navigation time in milliseconds, defaults to 30 seconds, pass 0 to disable timeout. The default value can be changed by using the page.setDefaultNavigationTimeout(timeout) or page.setDefaultTimeout(timeout) methods.
    • waitUntil <"load"|"domcontentloaded"|"networkidle0"|"networkidle2"|Array> When to consider navigation succeeded, defaults to load. Given an array of event strings, navigation is considered to be successful after all events have been fired. Events can be either:
      • load - consider navigation to be finished when the load event is fired.
      • domcontentloaded - consider navigation to be finished when the DOMContentLoaded event is fired.
      • networkidle0 - consider navigation to be finished when there are no more than 0 network connections for at least 500 ms.
      • networkidle2 - consider navigation to be finished when there are no more than 2 network connections for at least 500 ms.
  • returns: <[Promise]<?[Response]>> Promise which resolves to the main resource response. In case of multiple redirects, the navigation will resolve with the response of the last redirect. If the page cannot go back, resolves to null.

Navigate to the previous page in history.

page.goForward([options])

  • options <[Object]> Navigation parameters which might have the following properties:
    • timeout <[number]> Maximum navigation time in milliseconds, defaults to 30 seconds, pass 0 to disable timeout. The default value can be changed by using the page.setDefaultNavigationTimeout(timeout) or page.setDefaultTimeout(timeout) methods.
    • waitUntil <"load"|"domcontentloaded"|"networkidle0"|"networkidle2"|Array> When to consider navigation succeeded, defaults to load. Given an array of event strings, navigation is considered to be successful after all events have been fired. Events can be either:
      • load - consider navigation to be finished when the load event is fired.
      • domcontentloaded - consider navigation to be finished when the DOMContentLoaded event is fired.
      • networkidle0 - consider navigation to be finished when there are no more than 0 network connections for at least 500 ms.
      • networkidle2 - consider navigation to be finished when there are no more than 2 network connections for at least 500 ms.
  • returns: <[Promise]<?[Response]>> Promise which resolves to the main resource response. In case of multiple redirects, the navigation will resolve with the response of the last redirect. If the page cannot go forward, resolves to null.

Navigate to the next page in history.

page.goto(url[, options])

  • url <[string]> URL to navigate page to. The url should include scheme, e.g. https://.
  • options <[Object]> Navigation parameters which might have the following properties:
    • timeout <[number]> Maximum navigation time in milliseconds, defaults to 30 seconds, pass 0 to disable timeout. The default value can be changed by using the page.setDefaultNavigationTimeout(timeout) or page.setDefaultTimeout(timeout) methods.
    • waitUntil <"load"|"domcontentloaded"|"networkidle0"|"networkidle2"|Array> When to consider navigation succeeded, defaults to load. Given an array of event strings, navigation is considered to be successful after all events have been fired. Events can be either:
      • load - consider navigation to be finished when the load event is fired.
      • domcontentloaded - consider navigation to be finished when the DOMContentLoaded event is fired.
      • networkidle0 - consider navigation to be finished when there are no more than 0 network connections for at least 500 ms.
      • networkidle2 - consider navigation to be finished when there are no more than 2 network connections for at least 500 ms.
    • referer <[string]> Referer header value. If provided it will take preference over the referer header value set by page.setExtraHTTPHeaders().
  • returns: <[Promise]<?[Response]>> Promise which resolves to the main resource response. In case of multiple redirects, the navigation will resolve with the response of the last redirect.

page.goto will throw an error if:

  • there's an SSL error (e.g. in case of self-signed certificates).
  • target URL is invalid.
  • the timeout is exceeded during navigation.
  • the remote server does not respond or is unreachable.
  • the main resource failed to load.

page.goto will not throw an error when any valid HTTP status code is returned by the remote server, including 404 "Not Found" and 500 "Internal Server Error". The status code for such responses can be retrieved by calling response.status().

NOTE page.goto either throws an error or returns a main resource response. The only exceptions are navigation to about:blank or navigation to the same URL with a different hash, which would succeed and return null.

NOTE Headless mode doesn't support navigation to a PDF document. See the upstream issue.

Shortcut for page.mainFrame().goto(url, options)
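
The three possible outcomes of page.goto described above (a thrown error, a null response, or a response with any HTTP status) can be folded into one result object. This is only a sketch; gotoChecked and the shape of its return value are hypothetical.

```javascript
// Hypothetical helper: classify the outcome of a navigation.
async function gotoChecked(page, url, options) {
  let response;
  try {
    response = await page.goto(url, options);
  } catch (error) {
    // Thrown for SSL errors, invalid URLs, timeouts, unreachable servers, ...
    return { ok: false, status: null, error: error.message };
  }
  if (response === null) {
    // about:blank, or the same URL with a different hash
    return { ok: true, status: null, error: null };
  }
  // 404 or 500 do not throw, so the status has to be checked explicitly.
  const status = response.status();
  const ok = response.ok(); // true for statuses in the 200-299 range
  return { ok, status, error: ok ? null : `HTTP ${status}` };
}

// Usage (inside an async function, with a real page):
//   const result = await gotoChecked(page, 'https://example.com', { waitUntil: 'load' });
```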

page.hover(selector)

  • selector <[string]> A [selector] to search for element to hover. If there are multiple elements satisfying the selector, the first will be hovered.
  • returns: <[Promise]> Promise which resolves when the element matching selector is successfully hovered. Promise gets rejected if there's no element matching selector.

This method fetches an element with selector, scrolls it into view if needed, and then uses page.mouse to hover over the center of the element.
If there's no element matching selector, the method throws an error.

Shortcut for page.mainFrame().hover(selector).

page.isClosed()

  • returns: <[boolean]>

Indicates that the page has been closed.

page.keyboard

  • returns: <[Keyboard]>

page.mainFrame()

  • returns: <[Frame]> The page's main frame.

Page is guaranteed to have a main frame which persists during navigations.

page.metrics()

  • returns: <[Promise]<[Object]>> Object containing metrics as key/value pairs.
    • Timestamp <[number]> The timestamp when the metrics sample was taken.
    • Documents <[number]> Number of documents in the page.
    • Frames <[number]> Number of frames in the page.
    • JSEventListeners <[number]> Number of events in the page.
    • Nodes <[number]> Number of DOM nodes in the page.
    • LayoutCount <[number]> Total number of full or partial page layouts.
    • RecalcStyleCount <[number]> Total number of page style recalculations.
    • LayoutDuration <[number]> Combined durations of all page layouts.
    • RecalcStyleDuration <[number]> Combined duration of all page style recalculations.
    • ScriptDuration <[number]> Combined duration of JavaScript execution.
    • TaskDuration <[number]> Combined duration of all tasks performed by the browser.
    • JSHeapUsedSize <[number]> Used JavaScript heap size.
    • JSHeapTotalSize <[number]> Total JavaScript heap size.

NOTE All timestamps are in monotonic time: monotonically increasing time in seconds since an arbitrary point in the past.
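
Since most of these values are cumulative, a common pattern is to sample page.metrics() before and after an action and compare the two samples. The sketch below assumes that pattern; metricsDelta is a hypothetical helper and the numbers in the example are made up.

```javascript
// Hypothetical helper: per-key difference between two page.metrics() samples.
function metricsDelta(before, after) {
  const delta = {};
  for (const key of Object.keys(after)) {
    if (typeof before[key] === 'number') delta[key] = after[key] - before[key];
  }
  return delta;
}

// Made-up samples for illustration.
const sampleBefore = { Timestamp: 10, Nodes: 100, JSHeapUsedSize: 2000 };
const sampleAfter = { Timestamp: 12.5, Nodes: 150, JSHeapUsedSize: 1500 };
console.log(metricsDelta(sampleBefore, sampleAfter));
// → { Timestamp: 2.5, Nodes: 50, JSHeapUsedSize: -500 }

// Usage (inside an async function, with a real page):
//   const before = await page.metrics();
//   await page.click('#button');
//   console.log(metricsDelta(before, await page.metrics()));
```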

page.mouse

  • returns: <[Mouse]>

page.pdf([options])

  • options <[Object]> Options object which might have the following properties:
    • path <[string]> The file path to save the PDF to. If path is a relative path, then it is resolved relative to current working directory. If no path is provided, the PDF won't be saved to the disk.
    • scale <[number]> Scale of the webpage rendering. Defaults to 1. Scale amount must be between 0.1 and 2.
    • displayHeaderFooter <[boolean]> Display header and footer. Defaults to false.
    • headerTemplate <[string]> HTML template for the print header. Should be valid HTML markup with following classes used to inject printing values into them:
      • date - formatted print date
      • title - document title
      • url - document location
      • pageNumber - current page number
      • totalPages - total pages in the document
    • footerTemplate <[string]> HTML template for the print footer. Should use the same format as the headerTemplate.
    • printBackground <[boolean]> Print background graphics. Defaults to false.
    • landscape <[boolean]> Paper orientation. Defaults to false.
    • pageRanges <[string]> Paper ranges to print, e.g., '1-5, 8, 11-13'. Defaults to the empty string, which means print all pages.
    • format <[string]> Paper format. If set, takes priority over width or height options. Defaults to 'Letter'.
    • width <[string]|[number]> Paper width, accepts values labeled with units.
    • height <[string]|[number]> Paper height, accepts values labeled with units.
    • margin <[Object]> Paper margins, defaults to none.
      • top <[string]|[number]> Top margin, accepts values labeled with units.
      • right <[string]|[number]> Right margin, accepts values labeled with units.
      • bottom <[string]|[number]> Bottom margin, accepts values labeled with units.
      • left <[string]|[number]> Left margin, accepts values labeled with units.
    • preferCSSPageSize <[boolean]> Give any CSS @page size declared in the page priority over what is declared in width and height or format options. Defaults to false, which will scale the content to fit the paper size.
  • returns: <[Promise]<[Buffer]>> Promise which resolves with PDF buffer.

NOTE Generating a pdf is currently only supported in Chrome headless.

page.pdf() generates a pdf of the page with print css media. To generate a pdf with screen media, call page.emulateMediaType('screen') before calling page.pdf():

NOTE By default, page.pdf() generates a pdf with modified colors for printing. Use the -webkit-print-color-adjust property to force rendering of exact colors.

// Generates a PDF with 'screen' media type.
await page.emulateMediaType('screen');
await page.pdf({path: 'page.pdf'});

The width, height, and margin options accept values labeled with units. Unlabeled values are treated as pixels.

A few examples:

  • page.pdf({width: 100}) - prints with width set to 100 pixels
  • page.pdf({width: '100px'}) - prints with width set to 100 pixels
  • page.pdf({width: '10cm'}) - prints with width set to 10 centimeters.

All possible units are:

  • px - pixel
  • in - inch
  • cm - centimeter
  • mm - millimeter

The format options are:

  • Letter: 8.5in x 11in
  • Legal: 8.5in x 14in
  • Tabloid: 11in x 17in
  • Ledger: 17in x 11in
  • A0: 33.1in x 46.8in
  • A1: 23.4in x 33.1in
  • A2: 16.54in x 23.4in
  • A3: 11.7in x 16.54in
  • A4: 8.27in x 11.7in
  • A5: 5.83in x 8.27in
  • A6: 4.13in x 5.83in

NOTE headerTemplate and footerTemplate markup have the following limitations:

  1. Script tags inside templates are not evaluated.
  2. Page styles are not visible inside templates.
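
Given these limitations, templates typically carry their own inline styles. The following sketch builds an options object for page.pdf(); pdfOptionsWithPageNumbers is a hypothetical helper, and the font size and margins are assumed values chosen only to leave visible room for the header and footer.

```javascript
// Hypothetical helper: build page.pdf() options with a simple header and footer.
function pdfOptionsWithPageNumbers(path) {
  // Styles are inlined because page styles are not visible inside templates.
  const style = 'font-size: 10px; width: 100%; text-align: center;';
  return {
    path,
    displayHeaderFooter: true,
    // Elements with the classes below receive the corresponding print values.
    headerTemplate: `<div style="${style}"><span class="title"></span></div>`,
    footerTemplate:
      `<div style="${style}">` +
      '<span class="pageNumber"></span> / <span class="totalPages"></span>' +
      '</div>',
    // Assumption: margins large enough for the header and footer to fit.
    margin: { top: '2cm', bottom: '2cm' },
  };
}

// Usage (inside an async function, with a real page):
//   await page.pdf(pdfOptionsWithPageNumbers('page.pdf'));
```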

page.queryObjects(prototypeHandle)

  • prototypeHandle <[JSHandle]> A handle to the object prototype.
  • returns: <[Promise]<[JSHandle]>> Promise which resolves to a handle to an array of objects with this prototype.

The method iterates the JavaScript heap and finds all the objects with the given prototype.

// Create a Map object
await page.evaluate(() => window.map = new Map());
// Get a handle to the Map object prototype
const mapPrototype = await page.evaluateHandle(() => Map.prototype);
// Query all map instances into an array
const mapInstances = await page.queryObjects(mapPrototype);
// Count amount of map objects in heap
const count = await page.evaluate(maps => maps.length, mapInstances);
await mapInstances.dispose();
await mapPrototype.dispose();

Shortcut for page.mainFrame().executionContext().queryObjects(prototypeHandle).

page.reload([options])

  • options <[Object]> Navigation parameters which might have the following properties:
    • timeout <[number]> Maximum navigation time in milliseconds, defaults to 30 seconds, pass 0 to disable timeout. The default value can be changed by using the page.setDefaultNavigationTimeout(timeout) or page.setDefaultTimeout(timeout) methods.
    • waitUntil <"load"|"domcontentloaded"|"networkidle0"|"networkidle2"|Array> When to consider navigation succeeded, defaults to load. Given an array of event strings, navigation is considered to be successful after all events have been fired. Events can be either:
      • load - consider navigation to be finished when the load event is fired.
      • domcontentloaded - consider navigation to be finished when the DOMContentLoaded event is fired.
      • networkidle0 - consider navigation to be finished when there are no more than 0 network connections for at least 500 ms.
      • networkidle2 - consider navigation to be finished when there are no more than 2 network connections for at least 500 ms.
  • returns: <[Promise]<[Response]>> Promise which resolves to the main resource response. In case of multiple redirects, the navigation will resolve with the response of the last redirect.
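
As a rough sketch of how these options combine, the hypothetical helper below reloads with a shorter timeout and waits for two events. The name reloadAndGetStatus and the chosen values are assumptions for illustration.

```javascript
// Hypothetical helper: reloads the page and reports the HTTP status
// of the main resource response.
async function reloadAndGetStatus(page) {
  const response = await page.reload({
    timeout: 10000, // give up after 10 seconds instead of the default 30
    // Given an array, all listed events must fire before navigation succeeds.
    waitUntil: ['domcontentloaded', 'networkidle2'],
  });
  // Guard against a missing response rather than assuming one is always there.
  return response ? response.status() : null;
}

// Usage, given a Puppeteer `page`:
//   const status = await reloadAndGetStatus(page);
```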

page.screenshot([options])

  • options <[Object]> Options object which might have the following properties:
    • path <[string]> The file path to save the image to. The screenshot type will be inferred from file extension. If path is a relative path, then it is resolved relative to current working directory. If no path is provided, the image won't be saved to the disk.
    • type <[string]> Specify screenshot type, can be either jpeg or png. Defaults to 'png'.
    • quality <[number]> The quality of the image, between 0-100. Not applicable to png images.
    • fullPage <[boolean]> When true, takes a screenshot of the full scrollable page. Defaults to false.
    • clip <[Object]> An object which specifies clipping region of the page. Should have the following fields:
      • x <[number]> x-coordinate of top-left corner of clip area
      • y <[number]> y-coordinate of top-left corner of clip area
      • width <[number]> width of clipping area
      • height <[number]> height of clipping area
    • omitBackground <[boolean]> Hides default white background and allows capturing screenshots with transparency. Defaults to false.
    • encoding <[string]> The encoding of the image, can be either base64 or binary. Defaults to binary.
  • returns: <[Promise]<[string]|[Buffer]>> Promise which resolves to buffer or a base64 string (depending on the value of encoding) with captured screenshot.

NOTE Screenshots take at least 1/6 second on OS X. See crbug.com/741689 for discussion.
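
The clip and encoding options can be combined to grab part of the page without writing a file. The sketch below assumes a hypothetical helper name, captureRegionAsDataUrl, and simply wraps the base64 result in a data URL.

```javascript
// Hypothetical helper: captures a rectangular region and returns a data URL.
async function captureRegionAsDataUrl(page, clip) {
  const base64 = await page.screenshot({
    type: 'png',
    clip,                 // {x, y, width, height} of the region to capture
    omitBackground: true, // keep transparency instead of the white default
    encoding: 'base64',   // resolve to a string rather than a Buffer
  });
  // No `path` was given, so nothing is saved to disk.
  return `data:image/png;base64,${base64}`;
}

// Usage, given a Puppeteer `page`:
//   const url = await captureRegionAsDataUrl(page, {x: 0, y: 0, width: 200, height: 100});
```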

page.select(selector, ...values)

  • selector <[string]> A [selector] to query page for
  • ...values <...[string]> Values of options to select. If the <select> has the multiple attribute, all values are considered, otherwise only the first one is taken into account.
  • returns: <[Promise]<[Array]<[string]>>> An array of option values that have been successfully selected.

Triggers a change and input event once all the provided options have been selected.
If there's no <select> element matching selector, the method throws an error.

page.select('select#colors', 'blue'); // single selection
page.select('select#colors', 'red', 'green', 'blue'); // multiple selections

Shortcut for page.mainFrame().select()

page.setBypassCSP(enabled)

  • enabled <[boolean]> sets bypassing of page's Content-Security-Policy.
  • returns: <[Promise]>

Toggles bypassing page's Content-Security-Policy.

NOTE CSP bypassing happens at the moment of CSP initialization rather than evaluation. Usually this means that page.setBypassCSP should be called before navigating to the domain.
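
The ordering requirement can be illustrated with a short sketch. The helper name openWithInjectedScript and the window.__injected flag are hypothetical; page.addScriptTag is assumed to accept a content option.

```javascript
// Hypothetical helper: opens a URL with CSP bypassed, then injects an inline script.
async function openWithInjectedScript(page, url) {
  // Call this before navigating, because the bypass is decided when CSP is initialized.
  await page.setBypassCSP(true);
  await page.goto(url);
  // Without the bypass, a strict Content-Security-Policy could block this inline script.
  await page.addScriptTag({content: 'window.__injected = true;'});
  // Report whether the injected script actually ran in the page.
  return page.evaluate(() => window.__injected === true);
}

// Usage, given a Puppeteer `page`:
//   const injected = await openWithInjectedScript(page, 'https://example.com');
```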

page.setCacheEnabled([enabled])

  • enabled <[boolean]> sets the enabled state of the cache.
  • returns: <[Promise]>

Toggles ignoring cache for each request based on the enabled state. By default, caching is enabled.

page.setContent(html[, options])

  • html <[string]> HTML markup to assign to the page.
  • options <[Object]> Parameters which might have the following properties:
    • timeout <[number]> Maximum time in milliseconds for resources to load, defaults to 30 seconds, pass 0 to disable timeout. The default value can be changed by using the page.setDefaultNavigationTimeout(timeout) or page.setDefaultTimeout(timeout) methods.
    • waitUntil <"load"|"domcontentloaded"|"networkidle0"|"networkidle2"|Array> When to consider setting markup succeeded, defaults to load. Given an array of event strings, setting content is considered to be successful after all events have been fired. Events can be either:
      • load - consider setting content to be finished when the load event is fired.
      • domcontentloaded - consider setting content to be finished when the DOMContentLoaded event is fired.
      • networkidle0 - consider setting content to be finished when there are no more than 0 network connections for at least 500 ms.
      • networkidle2 - consider setting content to be finished when there are no more than 2 network connections for at least 500 ms.
  • returns: <[Promise]>
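
A minimal sketch of setting markup and waiting for its resources, assuming a hypothetical helper named renderMarkup and an arbitrary 5 second timeout:

```javascript
// Hypothetical helper: assigns HTML to the page and returns the resulting title.
async function renderMarkup(page, html) {
  await page.setContent(html, {
    // Wait until no network connections remain, so referenced resources have loaded.
    waitUntil: 'networkidle0',
    timeout: 5000, // fail after 5 seconds instead of the default 30
  });
  return page.title();
}

// Usage, given a Puppeteer `page`:
//   const title = await renderMarkup(page, '<title>Demo</title><h1>Hello</h1>');
```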

page.setCookie(...cookies)

  • ...cookies <...[Object]>
    • name <[string]> required
    • value <[string]> required
    • url <[string]>
    • domain <[string]>
    • path <[string]>
    • expires <[number]> Unix time in seconds.
    • httpOnly <[boolean]>
    • secure <[boolean]>
    • sameSite <"Strict"|"Lax">
  • returns: <[Promise]>
await page.setCookie(cookieObject1, cookieObject2);
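
One detail that is easy to get wrong is that expires is expressed in seconds, while Date.now() returns milliseconds. The sketch below uses a hypothetical buildCookie helper; the cookie name, value and URL in the usage comment are placeholders.

```javascript
// Hypothetical helper: builds a cookie object that expires `ttlSeconds` after `nowMs`.
function buildCookie(name, value, url, ttlSeconds, nowMs = Date.now()) {
  return {
    name,  // required
    value, // required
    url,
    // `expires` is Unix time in seconds; Date.now() is in milliseconds.
    expires: Math.floor(nowMs / 1000) + ttlSeconds,
    httpOnly: true,
    sameSite: 'Lax',
  };
}

// Usage, given a Puppeteer `page`:
//   await page.setCookie(buildCookie('session', 'abc123', 'https://example.com', 3600));
```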

page.setDefaultNavigationTimeout(timeout)

  • timeout <[number]> Maximum navigation time in milliseconds

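This setting will change the default maximum navigation time for the following methods and related shortcuts:

  • page.goBack([options])
  • page.goForward([options])
  • page.goto(url[, options])
  • page.reload([options])
  • page.setContent(html[, options])
  • page.waitForNavigation([options])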

NOTE page.setDefaultNavigationTimeout takes priority over page.setDefaultTimeout

page.setDefaultTimeout(timeout)

  • timeout <[number]> Maximum time in milliseconds

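This setting will change the default maximum time for the following methods and related shortcuts:

  • page.goBack([options])
  • page.goForward([options])
  • page.goto(url[, options])
  • page.reload([options])
  • page.setContent(html[, options])
  • page.waitFor(selectorOrFunctionOrTimeout[, options[, ...args]])
  • page.waitForFileChooser([options])
  • page.waitForFunction(pageFunction[, options[, ...args]])
  • page.waitForNavigation([options])
  • page.waitForRequest(urlOrPredicate[, options])
  • page.waitForResponse(urlOrPredicate[, options])
  • page.waitForSelector(selector[, options])
  • page.waitForXPath(xpath[, options])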

NOTE page.setDefaultNavigationTimeout takes priority over page.setDefaultTimeout
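
The precedence rule can be sketched as follows. The helper name configureTimeouts and the two values are assumptions chosen for illustration.

```javascript
// Hypothetical helper: a short general timeout plus a longer navigation timeout.
function configureTimeouts(page) {
  // Applies to waiting methods such as page.waitFor.
  page.setDefaultTimeout(5000);
  // Takes priority for navigations such as page.reload, which then allow 60 seconds.
  page.setDefaultNavigationTimeout(60000);
}

// Usage, given a Puppeteer `page`:
//   configureTimeouts(page);
//   await page.reload();       // navigation: bounded by 60000 ms
//   await page.waitFor('.foo') // wait: bounded by 5000 ms
```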

page.setExtraHTTPHeaders(headers)

  • headers <[Object]> An object containing additional HTTP headers to be sent with every request. All header values must be strings.
  • returns: <[Promise]>

The extra HTTP headers will be sent with every request the page initiates.

NOTE page.setExtraHTTPHeaders does not guarantee the order of headers in the outgoing requests.
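
Since all header values must be strings, a small coercion step can help when values come from numbers or booleans. The helper toStringHeaders and the header name in the usage comment are hypothetical.

```javascript
// Hypothetical helper: coerces every header value to a string.
function toStringHeaders(headers) {
  const result = {};
  for (const [name, value] of Object.entries(headers)) {
    result[name] = String(value);
  }
  return result;
}

// Usage, given a Puppeteer `page`:
//   await page.setExtraHTTPHeaders(toStringHeaders({'X-Request-Attempt': 2}));
```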

page.setGeolocation(options)

  • options <[Object]>
    • latitude <[number]> Latitude between -90 and 90.
    • longitude <[number]> Longitude between -180 and 180.
    • accuracy <[number]> Optional non-negative accuracy value.
  • returns: <[Promise]>

Sets the page's geolocation.

await page.setGeolocation({latitude: 59.95, longitude: 30.31667});

NOTE Consider using browserContext.overridePermissions to grant permissions for the page to read its geolocation.
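
A sketch of the combination suggested in the note, assuming page.browserContext() returns the page's browser context and that overridePermissions takes an origin and a list of permission names. The helper name mockLocation is hypothetical.

```javascript
// Hypothetical helper: grants geolocation permission, then sets the position.
async function mockLocation(page, origin, coords) {
  // Grant the permission first so the page is allowed to read its position.
  await page.browserContext().overridePermissions(origin, ['geolocation']);
  // coords is {latitude, longitude} with an optional accuracy.
  await page.setGeolocation(coords);
}

// Usage, given a Puppeteer `page`:
//   await mockLocation(page, 'https://example.com', {latitude: 59.95, longitude: 30.31667});
```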

page.setJavaScriptEnabled(enabled)

  • enabled <[boolean]> Whether or not to enable JavaScript on the page.
  • returns: <[Promise]>

NOTE Changing this value won't affect scripts that have already been run. It will take full effect on the next navigation.

page.setOfflineMode(enabled)

  • enabled <[boolean]> When true, enables offline mode for the page.
  • returns: <[Promise]>

page.setRequestInterception(value)

  • value <[boolean]> Whether to enable request interception.
  • returns: <[Promise]>

Activating request interception enables the request.abort, request.continue and request.respond methods. This provides the capability to modify network requests that are made by a page.

Once request interception is enabled, every request will stall unless it is continued, responded to, or aborted.
An example of a naïve request interceptor that aborts all image requests:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setRequestInterception(true);
  page.on('request', interceptedRequest => {
    if (interceptedRequest.url().endsWith('.png') || interceptedRequest.url().endsWith('.jpg'))
      interceptedRequest.abort();
    else
      interceptedRequest.continue();
  });
  await page.goto('https://example.com');
  await browser.close();
})();

NOTE Enabling request interception disables page caching.

page.setUserAgent(userAgent)

  • userAgent <[string]> Specific user agent to use in this page
  • returns: <[Promise]> Promise which resolves when the user agent is set.

page.setViewport(viewport)

  • viewport <[Object]>
    • width <[number]> page width in pixels. required
    • height <[number]> page height in pixels. required
    • deviceScaleFactor <[number]> Specify device scale factor (can be thought of as dpr). Defaults to 1.
    • isMobile <[boolean]> Whether the meta viewport tag is taken into account. Defaults to false.
    • hasTouch <[boolean]> Specifies if viewport supports touch events. Defaults to false.
    • isLandscape <[boolean]> Specifies if viewport is in landscape mode. Defaults to false.
  • returns: <[Promise]>

NOTE In certain cases, setting viewport will reload the page in order to set the isMobile or hasTouch properties.

In the case of multiple pages in a single browser, each page can have its own viewport size.

page.setViewport will resize the page. A lot of websites don't expect phones to change size, so you should set the viewport before navigating to the page.

const page = await browser.newPage();
await page.setViewport({
  width: 640,
  height: 480,
  deviceScaleFactor: 1,
});
await page.goto('https://example.com');

page.tap(selector)

  • selector <[string]> A [selector] to search for element to tap. If there are multiple elements satisfying the selector, the first will be tapped.
  • returns: <[Promise]>

This method fetches an element with selector, scrolls it into view if needed, and then uses page.touchscreen to tap in the center of the element.
If there's no element matching selector, the method throws an error.

Shortcut for page.mainFrame().tap(selector).

page.target()

  • returns: <[Target]> a target this page was created from.

page.title()

  • returns: <[Promise]<[string]>> The page's title.

Shortcut for page.mainFrame().title().

page.touchscreen

  • returns: <[Touchscreen]>

page.tracing

  • returns: <[Tracing]>

page.type(selector, text[, options])

  • selector <[string]> A [selector] of an element to type into. If there are multiple elements satisfying the selector, the first will be used.
  • text <[string]> Text to type into a focused element.
  • options <[Object]>
    • delay <[number]> Time to wait between key presses in milliseconds. Defaults to 0.
  • returns: <[Promise]>

Sends a keydown, keypress/input, and keyup event for each character in the text.

To press a special key, like Control or ArrowDown, use keyboard.press.

await page.type('#mytextarea', 'Hello'); // Types instantly
await page.type('#mytextarea', 'World', {delay: 100}); // Types slower, like a user

Shortcut for page.mainFrame().type(selector, text[, options]).

page.url()

  • returns: <[string]>

This is a shortcut for page.mainFrame().url()

page.viewport()

  • returns: <?[Object]>
    • width <[number]> page width in pixels.
    • height <[number]> page height in pixels.
    • deviceScaleFactor <[number]> Specify device scale factor (can be thought of as dpr). Defaults to 1.
    • isMobile <[boolean]> Whether the meta viewport tag is taken into account. Defaults to false.
    • hasTouch <[boolean]> Specifies if viewport supports touch events. Defaults to false.
    • isLandscape <[boolean]> Specifies if viewport is in landscape mode. Defaults to false.

page.waitFor(selectorOrFunctionOrTimeout[, options[, ...args]])

  • selectorOrFunctionOrTimeout <[string]|[number]|[function]> A [selector], predicate or timeout to wait for
  • options <[Object]> Optional waiting parameters
    • visible <[boolean]> wait for element to be present in DOM and to be visible. Defaults to false.
    • timeout <[number]> maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout. The default value can be changed by using the page.setDefaultTimeout(timeout) method.
    • hidden <[boolean]> wait for element to not be found in the DOM or to be hidden. Defaults to false.
    • polling <[string]|[number]> An interval at which the pageFunction is executed, defaults to raf. If polling is a number, then it is treated as an interval in milliseconds at which the function would be executed. If polling is a string, then it can be one of the following values:
      • raf - to constantly execute pageFunction in requestAnimationFrame callback. This is the tightest polling mode which is suitable to observe styling changes.
      • mutation - to execute pageFunction on every DOM mutation.
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction
  • returns: <[Promise]<[JSHandle]>> Promise which resolves to a JSHandle of the success value

This method behaves differently with respect to the type of the first parameter:

  • if selectorOrFunctionOrTimeout is a string, then the first argument is treated as a [selector] or [xpath], depending on whether or not it starts with '//', and the method is a shortcut for page.waitForSelector or page.waitForXPath.
  • if selectorOrFunctionOrTimeout is a function, then the first argument is treated as a predicate to wait for and the method is a shortcut for page.waitForFunction().
  • if selectorOrFunctionOrTimeout is a number, then the first argument is treated as a timeout in milliseconds and the method returns a promise which resolves after the timeout.
  • otherwise, an exception is thrown
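
The dispatch rules above can be sketched as a plain function. This is not Puppeteer's implementation; describeWaitTarget is a hypothetical name and the returned labels are chosen only for illustration.

```javascript
// Hypothetical helper: reports how page.waitFor would interpret its first argument,
// following the rules listed above.
function describeWaitTarget(target) {
  if (typeof target === 'string') {
    // Strings starting with '//' are treated as XPath, all others as selectors.
    return target.startsWith('//') ? 'xpath' : 'selector';
  }
  if (typeof target === 'function') return 'predicate';
  if (typeof target === 'number') return 'timeout';
  // Any other type is rejected.
  throw new Error(`Unsupported target type: ${typeof target}`);
}

console.log(describeWaitTarget('.foo'));  // selector
console.log(describeWaitTarget('//div')); // xpath
console.log(describeWaitTarget(1000));    // timeout
```
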
// wait for selector
await page.waitFor('.foo');
// wait for 1 second
await page.waitFor(1000);
// wait for predicate
await page.waitFor(() => !!document.quer

The 'metrics' event is emitted when JavaScript code calls console.timeStamp. For the list of metrics, see page.metrics.

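This article was first published on LearnKu.com.

This translation is intended for learning and exchange only. When reposting, be sure to credit the translator, the source, and the link to this article.
Our translation work follows the CC license. If our work infringes on your rights, please contact us promptly.

Original: https://learnku.com/docs/puppeteer/3.1.0...

Translation: https://learnku.com/docs/puppeteer/3.1.0...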
