Web Scraping with AWS: Intro
Cloud-based web scraping platforms are convenient for "self-service" scraping, provided you have the technical knowledge to build web scrapers and want to try scraping yourself. Although such platforms offer a friendly user interface, even the simplest scraping task shows that a fair amount of technical knowledge is still required. In this article, we'll explore web scraping on AWS by running WebHarvy from the cloud on the Amazon Elastic Compute Cloud (EC2) platform.
WebHarvy – A Powerful Web Scraper
WebHarvy is a web scraper that extracts web content (emails, URLs, HTML, and images) from target websites and saves the data in various formats. With WebHarvy, there is no need to write any code to scrape data; to extract the data you need, you simply select it with a mouse click. WebHarvy detects data patterns automatically: if you need to scrape several items, such as a name, price, or email address, from a target page, the required configuration is generated for you.
Web Scraping from Cloud
If you do not want to run WebHarvy on your local computer, you can run it from the cloud using the AWS Elastic Compute Cloud (EC2) platform, which provides secure compute capacity in the cloud.
Amazon EC2 lets you run a remote Windows instance in the cloud and access it via Remote Desktop. Note that EC2 usage incurs minimal charges, but before that you can enjoy a free tier for 12 months.
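For readers who prefer to script the instance launch rather than use the AWS console, here is a minimal sketch using boto3, the AWS SDK for Python. It is not part of the WebHarvy workflow itself. The AMI ID, key pair name, and region are placeholders you would supply, and the t2.micro default is an assumption about which instance type is free-tier eligible in your account and region.

```python
from typing import Any, Dict


def build_launch_request(ami_id: str, key_name: str,
                         instance_type: str = "t2.micro") -> Dict[str, Any]:
    """Build keyword arguments for the boto3 EC2 client's run_instances call."""
    return {
        "ImageId": ami_id,              # placeholder: a Windows Server AMI ID for your region
        "InstanceType": instance_type,  # assumption: free-tier eligible in your account
        "KeyName": key_name,            # key pair used to retrieve the Administrator password
        "MinCount": 1,
        "MaxCount": 1,
    }


def launch_windows_instance(ami_id: str, key_name: str, region: str) -> str:
    """Launch one instance and return its ID. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the helper above works without boto3 installed

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(**build_launch_request(ami_id, key_name))
    return response["Instances"][0]["InstanceId"]
```

To reach the instance over Remote Desktop, its security group also has to allow inbound RDP traffic (TCP port 3389) from your IP address. This sketch leaves that setup out.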
Once you are connected to the Windows instance through Remote Desktop, download and install WebHarvy. Make sure that .NET Framework 3.5 is also installed on the Windows instance, since WebHarvy needs it to run.
Once you have installed WebHarvy, you can start extracting data right away:
- Open WebHarvy.
- Navigate to the target page.
- Click Start Config on the toolbar and select the data items to capture.
- The captured data is shown below, in the Captured Data Preview pane.
- Click Start Mine on the toolbar.
- Once the mining process has finished, click the Export button.
- Select the desired format and export the extracted data.
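After the export step, you may want to tidy the data before using it. The sketch below assumes you chose CSV as the export format. The column names (`name`, `price`) and the sample rows are hypothetical and not taken from a real WebHarvy export. It trims stray whitespace and drops duplicate rows.

```python
import csv
import io
from typing import Dict, List


def load_exported_rows(csv_text: str) -> List[Dict[str, str]]:
    """Parse exported CSV text, trim whitespace, and drop duplicate rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    rows = []
    for raw in reader:
        # Skip the None key that DictReader uses for surplus fields,
        # and treat missing values as empty strings.
        row = {k.strip(): (v or "").strip()
               for k, v in raw.items() if k is not None}
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            rows.append(row)
    return rows


# Hypothetical sample standing in for an exported file's contents.
sample = "name,price\nWidget , 9.99\nWidget,9.99\nGadget,19.50\n"
rows = load_exported_rows(sample)
print(rows)
```

For a real export, read the file's text with `open(path, newline="", encoding="utf-8").read()` and pass it to `load_exported_rows`. The file's actual encoding is an assumption you should verify.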
For more insight into using WebHarvy, read the WebHarvy Web Scraper Review from DataOx.
At DataOx, we are always happy to help with data scraping services and advice on how to scrape from the cloud yourself. Schedule a free consultation with our expert and find out how web scraping can help your business grow, whatever type of scraping you need.