Data Scraping with UiPath

Data Scraping is a powerful feature of UiPath Studio that lets us extract structured data from a browser, application, or document into a database, CSV file, or Excel spreadsheet.

Structured data is a specific kind of information that is highly organized and presented in a predictable pattern.
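As a small illustration of what "structured" means here, the sketch below writes a list of same-shaped records to CSV text using Python. The product names, prices, and the `to_csv` helper are hypothetical examples and are not part of UiPath.

```python
import csv
import io

# Hypothetical records: every row has the same fields in the same order,
# which is what makes the data "structured".
products = [
    {"Name": "Example Phone A", "Price": "49999"},
    {"Name": "Example Phone B", "Price": "59999"},
]


def to_csv(rows):
    """Serialize a list of same-shaped dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["Name", "Price"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(products))
```

Because each record follows the same pattern, it maps directly onto the rows and columns of a CSV file or spreadsheet.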

Nowadays, a lot of useful data is available online. Ideally we would use a web API to extract it, but most websites do not offer an API or any other feature for exporting a copy of their data. Manually copying and pasting it into a local file is error-prone, boring, and time-consuming. Data Scraping in UiPath can extract the data we see in a web browser, such as statistics, finance and stock information, real-estate data, product catalogs, search-engine results, job listings, social network feeds, customer opinions, and competitive pricing. Within a corporation, there is an even larger variety of data that data scraping can handle, including reports, dashboards, and customer, employee, finance, and medical data that needs to be transformed and migrated.

Let’s consider a scenario where we search for a product such as iPhone Mobile on an e-commerce site like Flipkart and extract all the iPhone mobile names and prices.

Step 1: Open a browser, navigate to Flipkart, and search for iPhone Mobile.

Step 2: Create a Blank Process in UiPath Studio and open the scraping wizard by clicking the Data Scraping button on the Design tab.

UiPath Studio

Step 3: After clicking Data Scraping, we will get the following message box:

Extract wizard

Step 4: Click the ‘Next’ button. The wizard then lets us select the first item on the web page. In this example, we will select the first mobile name.

select the first mobile name

Step 5: Once we have finished selecting the first element, a dialog box prompts us to select the second element, as follows:

Extract Wizard select second element

For our scenario, we will select the second mobile name as the second element, in the same way as the first element.
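One way to see why the wizard asks for two examples is that two similar elements are enough to infer a pattern that matches the whole list. The Python sketch below illustrates that idea only: the element paths and the `generalize` function are hypothetical, and this article does not describe how UiPath actually derives its pattern.

```python
import re


def generalize(first_path, second_path):
    """Return a pattern matching both paths, with differing indexes as '*'."""
    first_parts = first_path.split("/")
    second_parts = second_path.split("/")
    if len(first_parts) != len(second_parts):
        raise ValueError("paths have different depths")

    pattern = []
    for a, b in zip(first_parts, second_parts):
        if a == b:
            pattern.append(a)
            continue
        # Compare tag names with the trailing [index] stripped.
        tag_a = re.sub(r"\[\d+\]$", "", a)
        tag_b = re.sub(r"\[\d+\]$", "", b)
        if tag_a != tag_b:
            raise ValueError("paths differ in more than an index")
        pattern.append(tag_a + "[*]")
    return "/".join(pattern)


# Hypothetical paths for the first and second product names.
print(generalize("html/body/div[3]/div[1]/a", "html/body/div[3]/div[2]/a"))
```

The two paths differ only in one index, so that index is replaced with a wildcard and the resulting pattern covers every item in the list.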

Step 6: After we select the second element, the Configure Columns wizard step is displayed.

Extract wizard configure columns

Here we will set the column name as “Name”. If we want to extract the URL of the product then we need to check the Extract URL checkbox.

Step 7: Next, UiPath Studio shows the preview step of the Extract Wizard. Here we can increase the maximum number of results if more than 100 products are available on the web page.

Extract wizard preview data

We can now choose either to Extract Correlated Data or to Finish the extraction here. If we choose to extract correlated data, the wizard takes us back to the web page so that we can indicate the additional data to extract. In our case, we will select the price from the web page.

After selecting the correlated data, the preview step of the wizard is shown as follows:

Extract Correlated data
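The result of extracting correlated data is a table in which each row pairs a product name with its price. The Python sketch below shows the same idea outside UiPath, using only the standard library. The sample markup, class names, and the `ProductParser` and `extract_rows` names are assumptions made for illustration; real product pages are structured differently.

```python
from html.parser import HTMLParser

# Hypothetical markup; real product pages use different tags and classes.
SAMPLE_HTML = """
<div class="product"><span class="name">Example Phone A</span><span class="price">49999</span></div>
<div class="product"><span class="name">Example Phone B</span><span class="price">59999</span></div>
"""


class ProductParser(HTMLParser):
    """Collect the text of name and price spans into one row per product."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # column currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "product":
            # A new product starts a new row.
            self.rows.append({"Name": "", "Price": ""})
        elif tag == "span" and css_class in ("name", "price") and self.rows:
            self._field = css_class.capitalize()

    def handle_data(self, data):
        if self._field:
            self.rows[-1][self._field] += data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def extract_rows(html):
    """Return a list of {"Name": ..., "Price": ...} dicts found in html."""
    parser = ProductParser()
    parser.feed(html)
    parser.close()
    return parser.rows


for row in extract_rows(SAMPLE_HTML):
    print(row["Name"], row["Price"])
```

Keeping the name and price in the same row is what makes the two columns correlated rather than two unrelated lists.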

Step 8: Once we finish the extraction, the wizard asks “Is data spanning multiple pages?” If the data spans multiple pages, click Yes; otherwise, click No. In our case the results span multiple pages, so we will click Yes and then select the “Next” button on the web page.

Indicate next link
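Conceptually, indicating the next link lets the extraction repeat on each page until there is no further page or the maximum number of results is reached. The Python sketch below models that loop with a hypothetical in-memory set of pages. The `PAGES` data and the `scrape_all` function are illustrative assumptions, not UiPath's implementation.

```python
# Hypothetical in-memory "site": each page lists items and names the next page.
PAGES = {
    "page1": {"items": ["Example Phone A", "Example Phone B"], "next": "page2"},
    "page2": {"items": ["Example Phone C"], "next": "page3"},
    "page3": {"items": ["Example Phone D"], "next": None},
}


def scrape_all(start, pages, max_results=100):
    """Follow 'next' links from start, collecting items up to max_results."""
    results = []
    visited = set()  # guards against a 'next' link that loops back
    current = start
    while current is not None and current not in visited:
        visited.add(current)
        page = pages[current]
        for item in page["items"]:
            if len(results) >= max_results:
                return results
            results.append(item)
        current = page["next"]
    return results


print(scrape_all("page1", PAGES))
print(scrape_all("page1", PAGES, max_results=3))
```

The `max_results` parameter plays the same role as the maximum number of results set in the preview step: the loop stops early once that many items have been collected.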

Step 9: Finally, UiPath Studio creates the activity sequence as follows:

data scraping

