class: Frame


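At every point in time, the page exposes its current frame tree via the page.mainFrame() and frame.childFrames() methods.

The lifecycle of a [Frame] object is controlled by three events, dispatched on the page object:

  • 'frameattached' - fired when the frame gets attached to the page. A frame can be attached to the page only once.
  • 'framenavigated' - fired when the frame commits navigation to a different URL.
  • 'framedetached' - fired when the frame gets detached from the page. A frame can be detached from the page only once.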
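
As a rough illustration of the three lifecycle events, the sketch below mirrors them into a Set of currently attached frames. The helper name trackFrames is hypothetical and not part of Puppeteer; it only assumes that page exposes on(eventName, handler) and that frames expose url(). Frames attached before the listeners are registered are not seen by it.

```javascript
// Hypothetical helper (not part of Puppeteer): records frame lifecycle
// events emitted on the page and keeps a Set of attached frames.
function trackFrames(page) {
  const attached = new Set();
  const log = [];
  page.on('frameattached', frame => {
    attached.add(frame);
    log.push('attached');
  });
  page.on('framenavigated', frame => {
    log.push('navigated: ' + frame.url());
  });
  page.on('framedetached', frame => {
    attached.delete(frame);
    log.push('detached');
  });
  return {attached, log};
}
```

Calling trackFrames(page) right after browser.newPage() and before page.goto() gives the listeners the best chance of observing child frames as they appear.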

An example of dumping the frame tree:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://www.google.com/chrome/browser/canary.html');
  dumpFrameTree(page.mainFrame(), '');
  await browser.close();

  function dumpFrameTree(frame, indent) {
    console.log(indent + frame.url());
    for (const child of frame.childFrames()) {
      dumpFrameTree(child, indent + '  ');
    }
  }
})();

An example of getting text from an iframe element:

  const frame = page.frames().find(frame => frame.name() === 'myframe');
  const text = await frame.$eval('.selector', element => element.textContent);
  console.log(text);

frame.$(selector)

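  • selector <[string]> A [selector] to query the frame for.
  • returns: <[Promise]<?[ElementHandle]>> Promise which resolves to an ElementHandle pointing to the frame element.

The method queries the frame for the selector. If there is no such element within the frame, the method resolves to null.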
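
Because frame.$ resolves to null rather than throwing when nothing matches, callers usually need an explicit null check. The following is a minimal sketch of that pattern; textOrNull is a hypothetical helper name, and it relies only on frame.$, frame.evaluate, and elementHandle.dispose as they are used elsewhere in this reference.

```javascript
// Hypothetical helper: returns the text of the first match, or null when
// frame.$ resolves to null because nothing matches the selector.
async function textOrNull(frame, selector) {
  const handle = await frame.$(selector);
  if (handle === null) return null;
  try {
    return await frame.evaluate(el => el.textContent, handle);
  } finally {
    // Release the handle once the text has been read.
    await handle.dispose();
  }
}
```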

frame.$$(selector)

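  • selector <[string]> A [selector] to query the frame for.
  • returns: <[Promise]<[Array]<[ElementHandle]>>> Promise which resolves to ElementHandles pointing to the frame elements.

The method runs document.querySelectorAll within the frame. If no elements match the selector, the return value resolves to [].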

frame.$$eval(selector, pageFunction[, ...args])

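  • selector <[string]> A [selector] to query the frame for.
  • pageFunction <[function]([Array]<[Element]>)> Function to be evaluated in browser context.
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction.
  • returns: <[Promise]<[Serializable]>> Promise which resolves to the return value of pageFunction.

This method runs Array.from(document.querySelectorAll(selector)) within the frame and passes it as the first argument to pageFunction.

If pageFunction returns a [Promise], then frame.$$eval waits for the promise to resolve and returns its value.

Example: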

const divsCounts = await frame.$$eval('div', divs => divs.length);

frame.$eval(selector, pageFunction[, ...args])

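  • selector <[string]> A [selector] to query the frame for.
  • pageFunction <[function]([Element])> Function to be evaluated in browser context.
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction.
  • returns: <[Promise]<[Serializable]>> Promise which resolves to the return value of pageFunction.

This method runs document.querySelector within the frame and passes the result as the first argument to pageFunction. If there is no element matching selector, the method throws an error.

If pageFunction returns a [Promise], then frame.$eval waits for the promise to resolve and returns its value.

Examples: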

const searchValue = await frame.$eval('#search', el => el.value);
const preloadHref = await frame.$eval('link[rel=preload]', el => el.href);
const html = await frame.$eval('.main-container', e => e.outerHTML);

frame.$x(expression)

  • expression <[string]> Expression to evaluate.
  • returns: <[Promise]<[Array]<[ElementHandle]>>>

The method evaluates the XPath expression.
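
A possible way to use frame.$x is sketched below: it evaluates an XPath expression and reads the text of each matching element. The helper name textsByXPath is hypothetical; the sketch assumes, per the signature above, that frame.$x resolves to an array of element handles.

```javascript
// Hypothetical helper: evaluates an XPath expression in the frame and
// returns the text content of every matching element.
async function textsByXPath(frame, expression) {
  const handles = await frame.$x(expression);
  const texts = [];
  for (const handle of handles) {
    texts.push(await frame.evaluate(el => el.textContent, handle));
    // Dispose each handle as soon as its text has been read.
    await handle.dispose();
  }
  return texts;
}
```

For instance, textsByXPath(frame, '//h2') would collect the text of all h2 elements in the frame.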

frame.addScriptTag(options)

  • options <[Object]>
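    • url <[string]> URL of the script to be added.
    • path <[string]> Path to the JavaScript file to be injected into the frame. If path is a relative path, it is resolved relative to the current working directory.
    • content <[string]> Raw JavaScript content to be injected into the frame.
    • type <[string]> Script type. Use 'module' to load a JavaScript ES6 module. See script for more details.
  • returns: <[Promise]<[ElementHandle]>> Promise which resolves to the added tag when the script's onload fires or when the script content has been injected into the frame.

Adds a <script> tag into the page with the desired url or content.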

frame.addStyleTag(options)

  • options <[Object]>
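    • url <[string]> URL of the <link> tag.
    • path <[string]> Path to the CSS file to be injected into the frame. If path is a relative path, it is resolved relative to the current working directory.
    • content <[string]> Raw CSS content to be injected into the frame.
  • returns: <[Promise]<[ElementHandle]>> Promise which resolves to the added tag when the stylesheet's onload fires or when the CSS content has been injected into the frame.

Adds a <link rel="stylesheet"> tag into the page with the desired url, or a <style> tag with the content.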

frame.childFrames()

  • returns: <[Array]<[Frame]>>

frame.click(selector[, options])

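  • selector <[string]> A [selector] to search for the element to click. If there are multiple elements satisfying the selector, the first will be clicked.
  • options <[Object]>
    • button <"left"|"right"|"middle"> Defaults to left.
    • clickCount <[number]> Defaults to 1. See [UIEvent.detail].
    • delay <[number]> Time to wait between mousedown and mouseup in milliseconds. Defaults to 0.
  • returns: <[Promise]> Promise which resolves when the element matching selector is successfully clicked. The Promise is rejected if there is no element matching selector.

This method fetches an element with selector, scrolls it into view if needed, and then uses page.mouse to click in the center of the element. If there is no element matching selector, the method throws an error.

Bear in mind that if click() triggers a navigation event and there is a separate page.waitForNavigation() promise to be resolved, you may end up with a race condition that yields unexpected results. The correct pattern for click and wait for navigation is the following: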

const [response] = await Promise.all([
  page.waitForNavigation(waitOptions),
  frame.click(selector, clickOptions),
]);

frame.content()

  • returns: <[Promise]<[string]>>

Gets the full HTML contents of the frame, including the doctype.

frame.evaluate(pageFunction[, ...args])

  • pageFunction <[function]|[string]> Function to be evaluated in browser context
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction
  • returns: <[Promise]<[Serializable]>> Promise which resolves to the return value of pageFunction

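If the function passed to frame.evaluate returns a [Promise], then frame.evaluate waits for the promise to resolve and returns its value.

If the function passed to frame.evaluate returns a non-[Serializable] value, then frame.evaluate resolves to undefined. The DevTools Protocol also supports transferring some additional values that are not serializable by JSON: -0, NaN, Infinity, -Infinity, and bigint literals.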

const result = await frame.evaluate(() => {
  return Promise.resolve(8 * 7);
});
console.log(result); // prints "56"

A string can also be passed in instead of a function.

console.log(await frame.evaluate('1 + 2')); // prints "3"

[ElementHandle] instances can be passed as arguments to the frame.evaluate:

const bodyHandle = await frame.$('body');
const html = await frame.evaluate(body => body.innerHTML, bodyHandle);
await bodyHandle.dispose();

frame.evaluateHandle(pageFunction[, ...args])

  • pageFunction <[function]|[string]> Function to be evaluated in the page context
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction
  • returns: <[Promise]<[JSHandle]>> Promise which resolves to the return value of pageFunction as in-page object (JSHandle)

The only difference between frame.evaluate and frame.evaluateHandle is that frame.evaluateHandle returns an in-page object (JSHandle).

If the function passed to frame.evaluateHandle returns a [Promise], then frame.evaluateHandle waits for the promise to resolve and returns its value.

const aWindowHandle = await frame.evaluateHandle(() => Promise.resolve(window));
aWindowHandle; // Handle for the window object.

A string can also be passed in instead of a function.

const aHandle = await frame.evaluateHandle('document'); // Handle for the 'document'.

[JSHandle] instances can be passed as arguments to frame.evaluateHandle:

const aHandle = await frame.evaluateHandle(() => document.body);
const resultHandle = await frame.evaluateHandle(body => body.innerHTML, aHandle);
console.log(await resultHandle.jsonValue());
await resultHandle.dispose();

frame.executionContext()

  • returns: <[Promise]<[ExecutionContext]>>

Returns a promise that resolves to the frame's default execution context.

frame.focus(selector)

  • selector <[string]> A [selector] of an element to focus. If there are multiple elements satisfying the selector, the first will be focused.
  • returns: <[Promise]> Promise which resolves when the element matching selector is successfully focused. The promise will be rejected if there is no element matching selector.

This method fetches an element with selector and focuses it. If there is no element matching selector, the method throws an error.

frame.goto(url[, options])

  • url <[string]> URL to navigate frame to. The url should include scheme, e.g. https://.
  • options <[Object]> Navigation parameters which might have the following properties:
    • timeout <[number]> Maximum navigation time in milliseconds, defaults to 30 seconds, pass 0 to disable timeout. The default value can be changed by using the page.setDefaultNavigationTimeout(timeout) or page.setDefaultTimeout(timeout) methods.
    • waitUntil <"load"|"domcontentloaded"|"networkidle0"|"networkidle2"|Array> When to consider navigation succeeded, defaults to load. Given an array of event strings, navigation is considered to be successful after all events have been fired. Events can be either:
      • load - consider navigation to be finished when the load event is fired.
      • domcontentloaded - consider navigation to be finished when the DOMContentLoaded event is fired.
      • networkidle0 - consider navigation to be finished when there are no more than 0 network connections for at least 500 ms.
      • networkidle2 - consider navigation to be finished when there are no more than 2 network connections for at least 500 ms.
    • referer <[string]> Referer header value. If provided it will take preference over the referer header value set by page.setExtraHTTPHeaders().
  • returns: <[Promise]<?[Response]>> Promise which resolves to the main resource response. In case of multiple redirects, the navigation will resolve with the response of the last redirect.

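frame.goto will throw an error if:

  • there is an SSL error (e.g. in case of self-signed certificates).
  • the target URL is invalid.
  • the timeout is exceeded during navigation.
  • the main resource failed to load.

frame.goto will not throw an error when any valid HTTP status code is returned by the remote server, including 404 "Not Found" and 500 "Internal Server Error". The status code for such responses can be retrieved by calling response.status().

NOTE frame.goto either throws an error or returns a main resource response. The only exceptions are navigation to about:blank or navigation to the same URL with a different hash, which succeed and return null.

NOTE Headless mode doesn't support navigation to a PDF document. See the upstream issue.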
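
Since HTTP error statuses do not make frame.goto throw, and some navigations resolve to null, a caller may want to inspect the result explicitly. The sketch below shows one way to do that; gotoAndReport and the shape of its return value are hypothetical, and it uses only response.status() and response.ok() on the resolved response.

```javascript
// Hypothetical helper: navigates the frame and reports the outcome without
// treating HTTP error statuses as exceptions.
async function gotoAndReport(frame, url) {
  const response = await frame.goto(url);
  if (response === null) {
    // about:blank, or the same URL with a different hash: no response object.
    return {url, status: null, ok: true};
  }
  return {url, status: response.status(), ok: response.ok()};
}
```

Errors such as SSL failures or timeouts still propagate as rejections, so the call site can wrap gotoAndReport in try/catch if those need handling too.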

frame.hover(selector)

  • selector <[string]> A [selector] to search for element to hover. If there are multiple elements satisfying the selector, the first will be hovered.
  • returns: <[Promise]> Promise which resolves when the element matching selector is successfully hovered. Promise gets rejected if there's no element matching selector.

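This method fetches an element with selector, scrolls it into view if needed, and then uses page.mouse to hover over the center of the element.

If there is no element matching selector, the method throws an error.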

frame.isDetached()

  • returns: <[boolean]>

Returns true if the frame has been detached, or false otherwise.

frame.name()

  • returns: <[string]>

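Returns the frame's name attribute as specified in the tag.

If the name is empty, returns the id attribute instead.

NOTE This value is calculated once when the frame is created, and will not update if the attribute is changed later.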

frame.parentFrame()

  • returns: <?[Frame]> Returns parent frame, if any. Detached frames and main frames return null.
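
Because parentFrame() returns null for main frames, a frame's ancestry can be walked with a simple loop. The following sketch collects URLs from a frame up to the root; ancestorUrls is a hypothetical helper name. For a detached frame the walk stops immediately, since its parentFrame() is also null.

```javascript
// Hypothetical helper: collects frame URLs from the given frame up to the
// root by following parentFrame() until it returns null.
function ancestorUrls(frame) {
  const urls = [];
  for (let current = frame; current !== null; current = current.parentFrame()) {
    urls.push(current.url());
  }
  return urls;
}
```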

frame.select(selector, ...values)

  • selector <[string]> A [selector] to query frame for
  • ...values <...[string]> Values of options to select. If the <select> has the multiple attribute, all values are considered, otherwise only the first one is taken into account.
  • returns: <[Promise]<[Array]<[string]>>> Returns an array of option values that have been successfully selected.

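Triggers a change and input event once all the provided options have been selected.

If there is no <select> element matching selector, the method throws an error.

frame.select('select#colors', 'blue'); // single selection
frame.select('select#colors', 'red', 'green', 'blue'); // multiple selections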

frame.setContent(html)

  • html <[string]> HTML markup to assign to the page.
  • returns: <[Promise]>

frame.tap(selector)

  • selector <[string]> A [selector] to search for element to tap. If there are multiple elements satisfying the selector, the first will be tapped.
  • returns: <[Promise]>

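This method fetches an element with selector, scrolls it into view if needed, and then uses page.touchscreen to tap in the center of the element.

If there is no element matching selector, the method throws an error.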

frame.title()

  • returns: <[Promise]<[string]>> Returns page's title.

frame.type(selector, text[, options])

  • selector <[string]> A [selector] of an element to type into. If there are multiple elements satisfying the selector, the first will be used.
  • text <[string]> A text to type into a focused element.
  • options <[Object]>
    • delay <[number]> Time to wait between key presses in milliseconds. Defaults to 0.
  • returns: <[Promise]>

Sends a keydown, keypress/input, and keyup event for each character in the text.

To press a special key, like Control or ArrowDown, use keyboard.press.

frame.type('#mytextarea', 'Hello'); // Types instantly
frame.type('#mytextarea', 'World', {delay: 100}); // Types slower, like a user

frame.url()

  • returns: <[string]>

Returns the frame's url.

frame.waitFor(selectorOrFunctionOrTimeout[, options[, ...args]])

  • selectorOrFunctionOrTimeout <[string]|[number]|[function]> A [selector], predicate or timeout to wait for
  • options <[Object]> Optional waiting parameters
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction
  • returns: <[Promise]<[JSHandle]>> Promise which resolves to a JSHandle of the success value

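This method behaves differently depending on the type of the first parameter:

  • if selectorOrFunctionOrTimeout is a string, it is treated as a [selector] or an [xpath], depending on whether it starts with '//', and the method is a shortcut for frame.waitForSelector or frame.waitForXPath.
  • if selectorOrFunctionOrTimeout is a function, it is treated as a predicate to wait for, and the method is a shortcut for frame.waitForFunction().
  • if selectorOrFunctionOrTimeout is a number, it is treated as a timeout in milliseconds, and the method returns a promise which resolves after the timeout.
  • otherwise, an error is thrown.

// wait for selector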
await page.waitFor('.foo');
// wait for 1 second
await page.waitFor(1000);
// wait for predicate
await page.waitFor(() => !!document.querySelector('.foo'));
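
The type-based dispatch described for frame.waitFor can be pictured with the small sketch below. It is illustrative only and is not Puppeteer's source code; classifyWaitTarget and its return labels are hypothetical names chosen for this example.

```javascript
// Illustrative only: classifies the first argument according to the rules
// described for frame.waitFor. Not Puppeteer's actual implementation.
function classifyWaitTarget(target) {
  if (typeof target === 'string') {
    // Strings starting with '//' are treated as XPath, others as selectors.
    return target.startsWith('//') ? 'xpath' : 'selector';
  }
  if (typeof target === 'number') return 'timeout';
  if (typeof target === 'function') return 'function';
  throw new Error('Unsupported target type: ' + typeof target);
}
```

One consequence of the '//' rule is that an XPath expression that does not begin with '//' would be classified as a selector; calling frame.waitForXPath directly avoids that ambiguity.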

To pass arguments from node.js to the predicate of the page.waitFor function:

const selector = '.foo';
await page.waitFor(selector => !!document.querySelector(selector), {}, selector);

frame.waitForFunction(pageFunction[, options[, ...args]])

  • pageFunction <[function]|[string]> Function to be evaluated in browser context
  • options <[Object]> Optional waiting parameters
    • polling <[string]|[number]> An interval at which the pageFunction is executed, defaults to raf. If polling is a number, then it is treated as an interval in milliseconds at which the function would be executed. If polling is a string, then it can be one of the following values:
      • raf - to constantly execute pageFunction in requestAnimationFrame callback. This is the tightest polling mode which is suitable to observe styling changes.
      • mutation - to execute pageFunction on every DOM mutation.
    • timeout <[number]> maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.
  • ...args <...[Serializable]|[JSHandle]> Arguments to pass to pageFunction
  • returns: <[Promise]<[JSHandle]>> Promise which resolves when the pageFunction returns a truthy value. It resolves to a JSHandle of the truthy value.

waitForFunction can be used to observe a viewport size change:

const puppeteer = require('puppeteer');

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  const watchDog = page.mainFrame().waitForFunction('window.innerWidth < 100');
  page.setViewport({width: 50, height: 50});
  await watchDog;
  await browser.close();
});

To pass arguments from node.js to the predicate of the page.waitForFunction function:

const selector = '.foo';
await page.waitForFunction(selector => !!document.querySelector(selector), {}, selector);

frame.waitForNavigation(options)

  • options <[Object]> Navigation parameters which might have the following properties:
    • timeout <[number]> Maximum navigation time in milliseconds, defaults to 30 seconds, pass 0 to disable timeout. The default value can be changed by using the page.setDefaultNavigationTimeout(timeout) method.
    • waitUntil <[string]|[Array]<[string]>> When to consider navigation succeeded, defaults to load. Given an array of event strings, navigation is considered to be successful after all events have been fired. Events can be either:
      • load - consider navigation to be finished when the load event is fired.
      • domcontentloaded - consider navigation to be finished when the DOMContentLoaded event is fired.
      • networkidle0 - consider navigation to be finished when there are no more than 0 network connections for at least 500 ms.
      • networkidle2 - consider navigation to be finished when there are no more than 2 network connections for at least 500 ms.
  • returns: <[Promise]<?[Response]>> Promise which resolves to the main resource response. In case of multiple redirects, the navigation will resolve with the response of the last redirect. In case of navigation to a different anchor or navigation due to History API usage, the navigation will resolve with null.

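This resolves when the frame navigates to a new URL. It is useful when you run code that will indirectly cause the frame to navigate. Consider this example: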

const [response] = await Promise.all([
  frame.waitForNavigation(), // The navigation promise resolves after navigation has finished
  frame.click('a.my-link'), // Clicking the link will indirectly cause a navigation
]);

NOTE Usage of the History API to change the URL is considered a navigation.

frame.waitForSelector(selector[, options])

  • selector <[string]> A [selector] of an element to wait for
  • options <[Object]> Optional waiting parameters
    • visible <[boolean]> wait for element to be present in DOM and to be visible, i.e. to not have display: none or visibility: hidden CSS properties. Defaults to false.
    • hidden <[boolean]> wait for element to not be found in the DOM or to be hidden, i.e. have display: none or visibility: hidden CSS properties. Defaults to false.
    • timeout <[number]> maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.
  • returns: <[Promise]<[ElementHandle]>> Promise which resolves when element specified by selector string is added to DOM.

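Waits for the selector to appear in the page. If the selector already exists at the moment of calling the method, the method returns immediately. If the selector doesn't appear after the timeout milliseconds of waiting, the function throws.

This method works across navigations: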

const puppeteer = require('puppeteer');

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  let currentURL;
  page.mainFrame()
    .waitForSelector('img')
    .then(() => console.log('First URL with image: ' + currentURL));
  for (currentURL of ['https://example.com', 'https://google.com', 'https://bbc.com'])
    await page.goto(currentURL);
  await browser.close();
});
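
When a missing element is an expected outcome rather than a failure, the timeout rejection from frame.waitForSelector can be turned into a null result. The sketch below is one way to do this; waitForOptional is a hypothetical helper, and it assumes the rejection is an Error whose name is 'TimeoutError'. Any other error is rethrown.

```javascript
// Hypothetical helper: waits for a selector but resolves to null on timeout
// instead of throwing. Assumes timeout rejections carry the name
// 'TimeoutError'; any other error is rethrown unchanged.
async function waitForOptional(frame, selector, timeout = 30000) {
  try {
    return await frame.waitForSelector(selector, {timeout});
  } catch (error) {
    if (error.name === 'TimeoutError') return null;
    throw error;
  }
}
```

A call such as waitForOptional(frame, '.banner', 2000) then lets the script continue when the optional element never shows up.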

frame.waitForXPath(xpath[, options])

  • xpath <[string]> A [xpath] of an element to wait for
  • options <[Object]> Optional waiting parameters
    • visible <[boolean]> wait for element to be present in DOM and to be visible, i.e. to not have display: none or visibility: hidden CSS properties. Defaults to false.
    • hidden <[boolean]> wait for element to not be found in the DOM or to be hidden, i.e. have display: none or visibility: hidden CSS properties. Defaults to false.
    • timeout <[number]> maximum time to wait for in milliseconds. Defaults to 30000 (30 seconds). Pass 0 to disable timeout.
  • returns: <[Promise]<[ElementHandle]>> Promise which resolves when element specified by xpath string is added to DOM.

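Waits for the xpath to appear in the page. If the xpath already exists at the moment of calling the method, the method returns immediately. If the xpath doesn't appear after the timeout milliseconds of waiting, the function throws.

This method works across navigations: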

const puppeteer = require('puppeteer');

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  let currentURL;
  page.mainFrame()
    .waitForXPath('//img')
    .then(() => console.log('First URL with image: ' + currentURL));
  for (currentURL of ['https://example.com', 'https://google.com', 'https://bbc.com'])
    await page.goto(currentURL);
  await browser.close();
});


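This article was first published on LearnKu.com.

This translation is intended for learning and exchange purposes only. When reposting, please credit the translator, the source, and the link to this article.
Our translation work follows the CC license. If our work infringes on your rights, please contact us promptly.

Original: https://learnku.com/docs/puppeteer/3.1.0...

Translation: https://learnku.com/docs/puppeteer/3.1.0...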
