Scraping Data from Websites and Consolidating It in Excel
In this article we look at our next practice task for UiPath.
Working through it will teach you more and help prepare you for real-world work, whether you join a company or work as a freelancer.
This task focuses on data scraping concepts.
What will you learn by completing this task?
- What is data scraping in UiPath?
- How can it be used to scrape data?
- How do you scrape multiple pages of data?
- How do you consolidate data in Excel after scraping it from different websites?
What is Data Scraping?
Data scraping is a technique where a computer program extracts human-readable data from other programs or sites.
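To make the definition concrete, here is a minimal sketch outside UiPath, using only Python's standard library. The HTML snippet, the `title` class name, and the `ProductTitleParser` and `extract_titles` names are all made up for illustration. A real site will have different markup.

```python
from html.parser import HTMLParser


class ProductTitleParser(HTMLParser):
    """Collects the text of every <span class="title"> element (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs
        if tag == "span" and ("class", "title") in attrs:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_title = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a title span
        if self._in_title and data.strip():
            self.titles.append(data.strip())


# Invented sample of what a search results page might contain
SAMPLE_HTML = """
<ul>
  <li><span class="title">USB-C Cable</span> <span class="price">9.99</span></li>
  <li><span class="title">Wireless Mouse</span> <span class="price">19.50</span></li>
</ul>
"""


def extract_titles(html):
    """Return the product titles found in the given HTML text."""
    parser = ProductTitleParser()
    parser.feed(html)
    return parser.titles
```

Calling `extract_titles(SAMPLE_HTML)` turns text that was laid out for a human reader into a plain list a program can work with, which is the same idea UiPath's scraping wizard applies to a live page.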
Now that we know what data scraping is, let’s look at the task we are going to practice.
- Log in to a website (if it requires a login), e.g. Amazon or Flipkart.
- Search for a product in the search bar.
- Once the search results appear, perform data scraping.
- Scrape all the results on that page and on the remaining pages as well.
- Consolidate all the scraped data into an Excel file.
- In that Excel file, create a separate tab for each website.
- For example, when extracting data from Amazon, name the tab Amazon.
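The one-tab-per-website step can be sketched in plain Python as a grouping problem. This is only an illustration of the idea, not how UiPath does it internally. The row layout (a `site` key on every row) and the function names are assumptions. The sketch also cleans up tab names, since Excel limits sheet names to 31 characters and rejects the characters `[ ] : * ? / \`.

```python
import re
from collections import defaultdict

# Characters Excel does not allow in a sheet (tab) name
INVALID_SHEET_CHARS = re.compile(r"[\[\]:*?/\\]")


def safe_sheet_name(site):
    """Turn a website name into a valid Excel sheet name."""
    cleaned = INVALID_SHEET_CHARS.sub("_", site).strip()
    # Excel allows at most 31 characters; fall back to a default if nothing is left
    return cleaned[:31] or "Sheet1"


def group_rows_by_site(rows):
    """Group scraped rows so that each website gets its own tab.

    Each row is assumed to be a dict with a "site" key (hypothetical layout).
    Returns {sheet_name: [row_without_site, ...]}.
    """
    sheets = defaultdict(list)
    for row in rows:
        sheet = safe_sheet_name(row["site"])
        sheets[sheet].append({k: v for k, v in row.items() if k != "site"})
    return dict(sheets)


# Invented sample data standing in for real scraping results
scraped = [
    {"site": "Amazon", "title": "Wireless Mouse", "price": "19.50"},
    {"site": "Flipkart", "title": "Wireless Mouse", "price": "18.00"},
    {"site": "Amazon", "title": "USB-C Cable", "price": "9.99"},
]
sheets = group_rows_by_site(scraped)
```

Each key of `sheets` would become one tab in the workbook, and each list would become the rows written to that tab.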
This is also one of the best scenarios to practice and learn from because, whatever business process you are automating, you will need to send a report by the end of the day.
When scraping data across multiple pages, I often run into one problem.
It is easy to miss because no error is raised or displayed while the data is being extracted from the pages.
The fix is in the Properties panel, where there is an option to add a delay between two pages while extracting data (the DelayBetweenPagesMS property of the Extract Structured Data activity).
Set a delay there, e.g. 5000 ms, which works well for me.
Adjust the delay to the loading time of the website you are scraping.
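The same idea, waiting between pages so the next one has time to load, can be sketched in Python. This is a rough analogue, not UiPath's implementation. `fetch_page` is a hypothetical function you would supply that returns the rows for a page number, or an empty list when there are no more pages.

```python
import time


def scrape_all_pages(fetch_page, delay_seconds=5.0, max_pages=100, sleep=time.sleep):
    """Collect rows from page 1 onward, pausing before each page after the first.

    fetch_page(page_number) is a caller-supplied (hypothetical) function that
    returns a list of rows, or an empty list once the pages run out.
    """
    all_rows = []
    for page_number in range(1, max_pages + 1):
        if page_number > 1:
            # Give the next page time to load before reading it
            sleep(delay_seconds)
        rows = fetch_page(page_number)
        if not rows:
            break
        all_rows.extend(rows)
    return all_rows


# Invented stand-in for a website with two pages of results
FAKE_PAGES = {1: ["Mouse A", "Mouse B"], 2: ["Mouse C"]}


def fake_fetch(page_number):
    return FAKE_PAGES.get(page_number, [])
```

Passing `sleep` as a parameter makes it possible to try the loop without actually waiting, for example `scrape_all_pages(fake_fetch, sleep=lambda seconds: None)`.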
That’s it for this post. If you want more practice tasks like this one, comment below and let me know which topics you would like to see.