Data Scraping is a powerful component of UiPath Studio that lets us extract structured data from a browser, application, or document into a database, CSV file, or Excel spreadsheet.
Structured data is information that is highly organized and presented in a predictable pattern.
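To make "predictable pattern" concrete, here is a minimal Python sketch of structured data being written out as CSV. The product names and prices are made-up sample values, and the `to_csv` helper is hypothetical; it only illustrates the kind of row-and-column output a scraping run ends up with, not anything UiPath does internally.

```python
import csv
import io

# Hypothetical sample records: every row has the same two fields,
# which is what makes the data "structured".
products = [
    {"Name": "Phone A", "Price": "49,999"},
    {"Name": "Phone B", "Price": "59,999"},
]

def to_csv(rows):
    """Serialize a list of uniform dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["Name", "Price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

csv_text = to_csv(products)
print(csv_text)
```

Because each record has the same fields, the header row can be fixed up front and every record maps cleanly onto one line of the file.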
Nowadays, a lot of useful data is available online. Where a website offers a web API, we can use it to retrieve the data, but most websites offer no feature for exporting a copy of it. Manually copying and pasting the data into a local file is error-prone, boring, and time-consuming. Data Scraping in UiPath can extract anything we see in a web browser, such as statistics, finance and stock information, real-estate data, product catalogs, search-engine results, job listings, social network feeds, customer opinions, and competitive pricing. Within a corporation, data scraping can handle an even larger variety of data that we need to transform and migrate, including reports, dashboards, and customer, employee, finance, and medical data.
Let’s consider a scenario where we search for a product such as iPhone mobiles on an e-commerce site like Flipkart and extract the name and price of every iPhone listed.
Step 1: Open a browser, navigate to Flipkart, and search for iPhone Mobile.
Step 2: Create a Blank Process in UiPath Studio and open the scraping wizard from the Design tab by clicking the Data Scraping button.
Step 3: After we click Data Scraping, the wizard displays a message box.
Step 4: Click the ‘Next’ button and it will give us the option to select the first item on the web page. In this example, we will select the first mobile name.
Step 5: Once we have finished selecting the first element, a dialog box prompts us to select a second element.
For our scenario, we will select a second mobile name as the second element, in the same way as we selected the first.
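Selecting two similar elements gives the wizard two examples from which a common pattern can be derived. The source does not describe how UiPath does this internally, so the following Python sketch is only an illustration of the general idea under simple assumptions: each element is identified by a slash-separated path, and the two paths differ only in a positional index. The path strings and the `generalize` function are hypothetical.

```python
def generalize(first_path, second_path):
    """Turn two example element paths into one pattern.

    Steps that are identical are kept; steps that differ only in their
    index are replaced by a wildcard index.
    """
    first = first_path.split("/")
    second = second_path.split("/")
    if len(first) != len(second):
        raise ValueError("the two paths must have the same depth")

    pattern = []
    for a, b in zip(first, second):
        if a == b:
            pattern.append(a)
            continue
        tag_a, tag_b = a.split("[")[0], b.split("[")[0]
        if tag_a != tag_b:
            raise ValueError("paths differ in more than an index")
        pattern.append(tag_a + "[*]")
    return "/".join(pattern)

# Hypothetical paths for the first and second product names.
pattern = generalize(
    "html/body/div[3]/div[1]/a",
    "html/body/div[3]/div[2]/a",
)
print(pattern)
```

The resulting pattern matches every element in the same position within a repeated container, which is why two examples are enough to describe the whole list.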
Step 6: After finishing the second element, the Configure Columns wizard step is displayed.
Here we will set the column name as “Name”. If we want to extract the URL of the product then we need to check the Extract URL checkbox.
Step 7: Next, UiPath Studio shows the Extract Wizard with a preview of the data. Here we can increase the maximum number of results if more than 100 products are available on the web page.
We can now choose either to Extract Correlated Data or to Finish the extraction. If we choose Extract Correlated Data, the wizard takes us back to the web page from which we want to extract the data. In our case, we will select the price on the web page.
After we select the correlated data, the wizard shows a preview of the extracted data.
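Conceptually, extracting correlated data adds a second column to the table that already holds the names. The Python sketch below illustrates that idea by pairing two equally long lists position by position. The sample values, the `correlate` helper, and the assumption that pairing is purely positional are illustrative only and are not taken from UiPath.

```python
def correlate(names, prices):
    """Pair each name with the price at the same position."""
    if len(names) != len(prices):
        raise ValueError("each name needs exactly one price")
    return [
        {"Name": name, "Price": price}
        for name, price in zip(names, prices)
    ]

# Hypothetical values for the two extracted columns.
names = ["Phone A", "Phone B"]
prices = ["49,999", "59,999"]

table = correlate(names, prices)
for row in table:
    print(row["Name"], row["Price"])
```

The length check matters in practice: if one product on the page has no price, a purely positional pairing would shift every later price onto the wrong product.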
Step 8: Once we finish the extraction, the wizard asks, “Is data spanning multiple pages?” If we are extracting data from multiple pages, we click Yes; otherwise, we click No. Because our results span multiple pages, we will click Yes and then select the “Next” button on the web page.
Step 9: Finally, UiPath Studio creates the corresponding activity sequence.
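Answering Yes and pointing at the “Next” button amounts to repeating the extraction on each page until there is no next page or the result limit is reached. The following Python sketch shows that loop against a small in-memory stand-in for a website. `FAKE_SITE`, `fetch_page`, and `scrape_all_pages` are hypothetical names, and the default cap of 100 simply mirrors the figure mentioned in Step 7.

```python
# Hypothetical stand-in for a paginated site: each page holds some rows
# and the key of the next page (None when there is no "Next" link).
FAKE_SITE = {
    "page1": (["Phone A", "Phone B"], "page2"),
    "page2": (["Phone C", "Phone D"], "page3"),
    "page3": (["Phone E"], None),
}

def fetch_page(page_key):
    """Return (rows, next_page_key) for one page of the fake site."""
    return FAKE_SITE[page_key]

def scrape_all_pages(first_page, max_results=100):
    """Collect rows page by page until the pages or the cap run out."""
    rows = []
    page = first_page
    while page is not None and len(rows) < max_results:
        page_rows, page = fetch_page(page)
        rows.extend(page_rows)
    return rows[:max_results]

all_rows = scrape_all_pages("page1")
first_three = scrape_all_pages("page1", max_results=3)
print(all_rows)
print(first_three)
```

With the cap set to 3, the loop stops after the second page and trims the surplus row, so no further pages are visited once enough results have been collected.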