python 3.x - How to scrape a web page that is not written directly in HTML, but is auto-generated using JavaScript?


I am trying to scrape the http://washingtonmonthly.com/college_guide?ranking=2016-rankings-national-universities website.

The page is auto-generated by JavaScript that updates the DOM tree. I have tried the Selenium code below to get the elements inside the table, but it returns an empty list.

    from selenium import webdriver
    import time

    driver = webdriver.Chrome(executable_path="c:\\chrme\\chromedriver")
    driver.get('http://washingtonmonthly.com/college_guide?ranking=best-colleges-for-adult-learners-4-year-colleges')
    time.sleep(5)
    test = driver.execute_script("return document.getElementsByClassName('tablesaw tablesaw-swipe')")
    print(test)

Is there a way to run the scripts and get the resulting HTML code? I am using Python 3.6.

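Run the script below; I suppose it will give you everything the table contains, including a CSV output. The table is rendered inside an iframe, so the script switches into that frame before looking for the table.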

    import csv
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.wait import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    driver = webdriver.Chrome()
    wait = WebDriverWait(driver, 10)  # wait up to 10 seconds for each condition
    outfile = open('table_data.csv', 'w', newline='')
    writer = csv.writer(outfile)
    driver.get("http://washingtonmonthly.com/college_guide?ranking=2016-rankings-national-universities")

    # The table lives inside an iframe: switch into it, then wait for the table
    wait.until(EC.frame_to_be_available_and_switch_to_it("iFrameResizer0"))
    wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, 'table.tablesaw')))

    # find_element(By..., ...) is available in both Selenium 3 and Selenium 4
    tab_data = driver.find_element(By.CSS_SELECTOR, 'table.tablesaw')
    list_rows = [[cell.text for cell in row.find_elements(By.CSS_SELECTOR, 'td')]
                 for row in tab_data.find_elements(By.CSS_SELECTOR, 'tr')]
    for data in list_rows:
        writer.writerow(data)
        print(data)

    outfile.close()  # flush the CSV to disk before quitting
    driver.quit()
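A possible refinement of the CSV step, sketched under one assumption: if the table's header row is built from th cells rather than td cells, the td-only lookup produces an empty list for that row, which would end up as a blank line in the file. The helper below (write_rows is a hypothetical name, not part of Selenium or the csv module) skips empty rows and uses a with block so the file is closed even if writing fails.

```python
import csv


def write_rows(rows, path):
    """Write the non-empty rows to a CSV file and return how many were written.

    `rows` is any iterable of lists of strings, such as the nested list
    built from the table cells. `write_rows` is a hypothetical helper name.
    """
    written = 0
    # newline='' lets the csv module control line endings itself
    with open(path, 'w', newline='') as outfile:
        writer = csv.writer(outfile)
        for row in rows:
            if not row:  # e.g. a header row that contains no <td> cells
                continue
            writer.writerow(row)
            written += 1
    return written
```

It could be called as write_rows(list_rows, 'table_data.csv') in place of the explicit open/writerow loop.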

Btw, I'm assuming you have the lxml library installed.
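On the part of the question about getting the HTML itself: once the driver has switched into the frame and the table is visible, driver.page_source should return the rendered markup as a string, which can then be parsed without the browser. The following is only a minimal sketch, assuming lxml is not at hand and using the standard library's html.parser instead; TableRows and extract_rows are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of <td>/<th> cells, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the <tr> being read, or None
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == 'tr':
            self._row = []
        elif tag in ('td', 'th') and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ('td', 'th') and self._cell is not None and self._row is not None:
            self._row.append(''.join(self._cell).strip())
            self._cell = None
        elif tag == 'tr' and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in this row


def extract_rows(html):
    """Return a list of rows, each a list of cell texts, from an HTML string."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return parser.rows
```

With the Selenium session from the answer still open, this would be used as rows = extract_rows(driver.page_source). Unlike the td-only lookup, it also keeps header cells.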

