Best Python code snippet using playwright-python
test_dialog.py
Source: test_dialog.py
...12# See the License for the specific language governing permissions and13# limitations under the License.14import pytest15from playwright.async_api import Dialog, Page16async def test_should_fire(page: Page, server):17 result = []18 async def on_dialog(dialog: Dialog):19 result.append(True)20 assert dialog.type == "alert"21 assert dialog.default_value == ""22 assert dialog.message == "yo"23 await dialog.accept()24 page.on("dialog", on_dialog)25 await page.evaluate("alert('yo')")26 assert result27async def test_should_allow_accepting_prompts(page: Page, server):28 result = []29 async def on_dialog(dialog: Dialog):30 result.append(True)...
How to scrape data via scrapy python correctly from a dynamically(?) created table
Navigating to "url", waiting until "load" - Python Playwright Issue
Playwright wait on custom events
How to find partial text using Playwright
How to scrape dynamic content from a website?
How to locate a changing element in playwright?
How to start playwright outside 'with' without context managers
Playwright and PM2 Issue - hang while creating PlaywrightContextManager
In Playwright (Python) when there are multiple buttons on a page, all with the same name, how do i select correct button?
Receiving response from python as an array - PHP
Your xpath selector is not correct. Try this
'Year of Est.': response.xpath("//td[contains(text(), 'Year Established')]/following-sibling::td/div/div/div/text()").extract()
I also note some errors in your code such as the line below which will raise an error. You may want to recheck how you extract links from the search page.
data = self.link_extractor.extract(response.text, base_url=response.url)
Edit:
The year of establishment is loaded once the company tab is clicked. You have to simulate the click using selenium or scrapy-playwright. My simple implementation using scrapy-playwright
is as below.
import scrapy
from scrapy.crawler import CrawlerProcess
import os
from selectorlib import Extractor
from scrapy_playwright.page import PageCoroutine
class Spider(scrapy.Spider):
name = 'alibaba_crawler'
allowed_domains = ['alibaba.com']
start_urls = ['http://alibaba.com/']
link_extractor = Extractor.from_yaml_file(os.path.join(os.path.dirname(__file__), "../resources/search_results_searchpage.yml"))
def start_requests(self):
search_text = "Headphones"
url = "https://www.alibaba.com/trade/search?fsb=y&IndexArea=product_en&CatId=&SearchText={0}&viewtype=G".format(
search_text)
yield scrapy.Request(url, callback=self.parse, meta={"search_text": search_text})
def parse(self, response):
data = self.link_extractor.extract(
response.text, base_url=response.url)
for product in data['products']:
parsed_url = product["link"]
yield scrapy.Request(parsed_url, callback=self.crawl_mainpage, meta={"playwright": True, 'playwright_page_coroutines': {
"click": PageCoroutine("click", selector="//span[@title='Company Profile']"),
},})
def crawl_mainpage(self, response):
yield {
'name': response.xpath("//h1[@class='module-pdp-title']/text()").extract(),
'Year of Establishment': response.xpath("//td[contains(text(), 'Year Established')]/following-sibling::td/div/div/div/text()").extract()
}
if __name__ == "__main__":
process = CrawlerProcess(settings={
'DOWNLOAD_HANDLERS': {
"https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
},
'TWISTED_REACTOR' :"twisted.internet.asyncioreactor.AsyncioSelectorReactor"
})
process.crawl(Spider)
process.start()
Below is a sample log of running the scraper using python crawler.py
. The year 2010
is shown in the output
Check out the latest blogs from LambdaTest on this topic:
A good User Interface (UI) is essential to the quality of software or application. A well-designed, sleek, and modern UI goes a long way towards providing a high-quality product for your customers − something that will turn them on.
One of the biggest problems I’ve faced when building a test suite is not the writing of the tests but the execution. How can I execute 100s or 1000s of tests in parallel?If I try that on my local machine, it would probably catch fire – so we need a remote environment to send these to.
With the rapid evolution in technology and a massive increase of businesses going online after the Covid-19 outbreak, web applications have become more important for organizations. For any organization to grow, the web application interface must be smooth, user-friendly, and cross browser compatible with various Internet browsers.
Web applications continue to evolve at an unbelievable pace, and the architecture surrounding web apps get more complicated all of the time. With the growth in complexity of the web application and the development process, web application testing also needs to keep pace with the ever-changing demands.
LambdaTest’s Playwright tutorial will give you a broader idea about the Playwright automation framework, its unique features, and use cases with examples to exceed your understanding of Playwright testing. This tutorial will give A to Z guidance, from installing the Playwright framework to some best practices and advanced concepts.
Get 100 minutes of automation test minutes FREE!!