R Web Scraping Quick Start Guide by Olgun Aydin

Author: Olgun Aydin
Language: eng
Format: epub, mobi
Tags: COM018000 - COMPUTERS / Data Processing, COM021030 - COMPUTERS / Databases / Data Mining, COM062000 - COMPUTERS / Data Modeling and Design
Publisher: Packt
Published: 2018-10-31T06:41:38+00:00


Article preview

In the following part, we will write XPath rules to collect this information, then write an R script that applies them, and in the final step we will work with the data to create some charts and plots.

Writing XPath rules

In this part, we are going to create our XPath rules to parse the HTML document we will collect:

First of all, we will write XPath rules to collect information from the left-hand side of the web page, in other words, to find out how many articles there are in each category.

Let's navigate to the landing page of the website devveri.com. As we practiced in the previous chapter, we will use Google Developer Tools to create and test XPath rules.

To use Google Developer Tools, we right-click on the element that we are interested in extracting data from.

Then we click Inspect Element. In the following screenshot, we have marked the elements related to the categories:

[Screenshot: category elements marked in Developer Tools]
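Once the category elements have been located in Developer Tools, an XPath rule for them can also be tried from R. The following is only a minimal sketch under stated assumptions: the HTML snippet, the class name `cat-item`, and the category names and counts are all hypothetical stand-ins and are not taken from devveri.com, and the `xml2` package is used here as one possible choice rather than as the book's own method.

```r
library(xml2)

# Hypothetical sidebar markup (an assumption, NOT copied from devveri.com).
html <- '
<html><body>
  <ul class="categories">
    <li class="cat-item"><a href="/cat/big-data">Big Data</a> (12)</li>
    <li class="cat-item"><a href="/cat/hadoop">Hadoop</a> (7)</li>
    <li class="cat-item"><a href="/cat/spark">Spark</a> (3)</li>
  </ul>
</body></html>'

page <- read_html(html)

# XPath rule: every <li> whose class attribute is exactly "cat-item".
items <- xml_find_all(page, "//li[@class='cat-item']")

# The category name is the text of the link inside each list item.
category <- xml_text(xml_find_first(items, "./a"))

# The full item text looks like "Big Data (12)"; keep only the number
# inside the trailing parentheses.
item_text <- xml_text(items)
article_count <- as.integer(
  sub(".*\\(([0-9]+)\\)[[:space:]]*$", "\\1", item_text)
)

categories <- data.frame(
  category = category,
  article_count = article_count,
  stringsAsFactors = FALSE
)
print(categories)
```

On the real page, the XPath expression would have to be adjusted to whatever element and attribute names Developer Tools shows for the category list.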


