Data Scraping — Octoparse 8 with data driven scenarios

Parama Kalyani KS
Apr 16, 2021

Data scraping is a technique in which a computer program extracts data from a website and imports it into a file saved on your computer. It is used for various purposes, such as:

a. research for business intelligence,

b. competitor monitoring,

c. pricing negotiation,

d. product optimization,

e. investment decisions,

f. gathering public opinion,

g. improving online reputation,

h. social media insights,

i. lead generation,

j. fake review detection

How Does Scraping Work?
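As a rough illustration of the idea, a scraper reads a page's HTML, picks out the fields of interest, and writes them to a file. The sketch below uses only the Python standard library. The HTML snippet, the `hotel`, `name`, and `price` class names, and the `HotelParser` and `scrape_to_csv` helpers are all assumptions made up for this example; a real scraper would download the page from a URL and match that site's actual markup.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would download this from a URL.
SAMPLE_HTML = """
<ul>
  <li class="hotel"><span class="name">Hotel A</span><span class="price">$100</span></li>
  <li class="hotel"><span class="name">Hotel B</span><span class="price">$120</span></li>
</ul>
"""


class HotelParser(HTMLParser):
    """Collects the text of the name and price spans for each hotel item."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per hotel
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "hotel":
            self.rows.append({})
        elif tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a span we care about.
        if self._field and self.rows:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def scrape_to_csv(html):
    """Extract hotel rows from the HTML and return them as CSV text."""
    parser = HotelParser()
    parser.feed(html)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["name", "price"])
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()


print(scrape_to_csv(SAMPLE_HTML))
```

Tools such as the ones listed in the next section take care of these steps (fetching, field detection, export) without requiring code like this.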

Tools for Data Scraping

There are various tools available for data scraping. Some of the tools are listed below:

1. ParseHub

2. Scrapy

3. Octoparse

4. Scraper API

5. Mozenda

6. Webhose.io

7. Content Grabber

8. Common Crawl

In this post, let us look at the Octoparse tool in detail.

Octoparse 8

Octoparse is a widely used data scraping tool that does not require writing a single line of code. It is a modern tool that both experienced and inexperienced users will find easy to use for extracting data, and it lets the user save the result as clean, structured data in a format of their choice.

Let us see a simple example of extracting data using Octoparse 8:

1. Copy and paste the URL you want to scrape into the home page of the Octoparse tool.

2. Once you click Start, the web page data is auto-detected.

3. Once auto-detection completes, you can see the scraped data at the bottom of the page. You can delete or rename the columns. Then click Save and then Run.

4. Click Run on your device.

5. Once the process is completed, an option to export the data appears. You can export the data and save it in the desired format.
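Once the data is exported, it can be loaded into a script for further checks. The sketch below assumes a CSV export with columns named `Hotel_Name`, `Price`, and `Rating`; these column names, the inline sample content, and the `load_rows` helper are hypothetical, and a real export will use whatever column names were kept or renamed in step 3.

```python
import csv
import io

# Hypothetical export content; a real run would open the exported file instead,
# for example open("octoparse_export.csv", newline="", encoding="utf-8").
EXPORTED_CSV = """Hotel_Name,Price,Rating
Hilton Garden Inn Outer Banks/Kitty Hawk,$342,4.5
"""


def load_rows(handle):
    """Read exported rows into dicts and convert the price to an integer."""
    rows = []
    for row in csv.DictReader(handle):
        # Strip the currency symbol and thousands separators before converting.
        row["Price"] = int(row["Price"].lstrip("$").replace(",", ""))
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(EXPORTED_CSV))
print(rows[0]["Hotel_Name"], rows[0]["Price"])
```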

Data Driven Scenarios

Let us see some data-driven scenarios for the above case study.

Feature: As a traveler, I would like to know the Outer Banks Hilton hotel details so that I can plan my vacation.

Scenario 1: Searching Hotel Name

Given I am on https://tinyurl.com/ve9wy632

When searching for hotel name

Then show “Hilton Garden Inn Outer Banks/Kitty Hawk”

Scenario 2: Searching Price

Given I am on https://tinyurl.com/ve9wy632

When searching for price

Then show “$342”

Scenario 3: Searching Site

Given I am on https://tinyurl.com/ve9wy632

When searching for site

Then show “official”

Scenario 4: Searching Cancellation Fee

Given I am on https://tinyurl.com/ve9wy632

When searching for cancellation fee

Then show “free cancellation”

Scenario 5: Searching Review Rating

Given I am on https://tinyurl.com/ve9wy632

When searching for review rating

Then show “4.5 star”
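The five scenarios above share one shape: look up a field, then compare it with an expected value. One possible way to express that is a small table of (field, expected value) pairs checked in a loop. The sketch below is only an illustration: the `scraped` record is a hypothetical stand-in for one row of exported data, and the field keys and the `run_scenarios` helper are assumptions rather than anything Octoparse provides.

```python
# Hypothetical scraped record; in practice this would come from the exported data.
scraped = {
    "hotel name": "Hilton Garden Inn Outer Banks/Kitty Hawk",
    "price": "$342",
    "site": "official",
    "cancellation fee": "free cancellation",
    "review rating": "4.5 star",
}

# One (field, expected value) pair per scenario above.
SCENARIOS = [
    ("hotel name", "Hilton Garden Inn Outer Banks/Kitty Hawk"),
    ("price", "$342"),
    ("site", "official"),
    ("cancellation fee", "free cancellation"),
    ("review rating", "4.5 star"),
]


def run_scenarios(record, scenarios):
    """Return a list of (field, expected, actual, passed) tuples."""
    results = []
    for field, expected in scenarios:
        actual = record.get(field)  # None if the field was not scraped
        results.append((field, expected, actual, actual == expected))
    return results


for field, expected, actual, passed in run_scenarios(scraped, SCENARIOS):
    status = "PASS" if passed else "FAIL"
    print(f"{status}: {field} -> {actual!r}")
```

Adding a new scenario then means adding one more pair to the table rather than writing a new check.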

Summary

Octoparse is an easy-to-use data scraping tool that suits both experienced and inexperienced users. Let’s scrape data with Octoparse!

Originally published at https://www.numpyninja.com on April 16, 2021.
