Scrape an HTML data table

Greetings!

Here is what we are trying to do:

  1. Log into a protected web page - check
  2. Enter filter criteria in various fields - check
  3. Gather data that gets listed below - How? (Note that this result can span multiple pages)
  4. Click on each of the rows generated in step 3 (this opens a “popup” window). We want to store all that data (parsed into FN, LN, Address, etc.) along with the header info in an Excel sheet

Any help to get this done will be hugely appreciated!


Welcome to our community, @Shankar!

Did I get it right that “check” means you figured out how to do it?

First, I will explain how to interact with items on web pages:
Open the web page in a browser and switch to developer mode (press F12, or right-click on the page and select “Inspect”). The page's HTML appears in the developer tools panel, usually docked on the right. Next, click the element-picker icon in the top-left corner of that panel, or press the Ctrl + Shift + C hotkey.

Then hover over the element you need on the web page; the corresponding line of code is highlighted. Left-click the element to select it. Then right-click the highlighted line of code, copy either the selector or the XPath, and paste it into the block properties of the relevant function from the “Browser” group.


As for your example, it looks simple enough. You can select each of the items you need (FN, LN, address, etc.) using attributes, a CSS selector, or an XPath, and paste it into the function settings in the platform. You can also tick the “Calculate a value” checkbox and insert a variable inside the selector. This lets you build a loop in which the robot opens the pop-up window, extracts the necessary information, and repeats these actions once for each data row.
For example, you can get the value of each news header using a dynamic CSS selector.
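
To make the idea of a dynamic selector more concrete outside the platform, here is a minimal Python sketch. It is not the platform's own code: the table markup, the id "results", and the function names are all made up for illustration, and it uses the standard library's ElementTree (which understands a small XPath subset) on a well-formed snippet rather than a live page. The point is only that the loop variable is inserted into the selector on each pass.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the results table on the page.
SAMPLE = """
<table id="results">
  <tr><th>FN</th><th>LN</th><th>Address</th></tr>
  <tr><td>Ann</td><td>Lee</td><td>1 Main St</td></tr>
  <tr><td>Bob</td><td>Ray</td><td>2 Oak Ave</td></tr>
</table>
"""


def row_selector(row_index: int) -> str:
    """CSS selector for one row; row_index is the loop variable."""
    return f"table#results tr:nth-child({row_index})"


def row_xpath(row_index: int) -> str:
    """Equivalent relative XPath, in the subset ElementTree supports."""
    return f"./tr[{row_index}]"


def extract_rows(markup: str) -> list:
    """Walk the data rows one by one until the selector finds nothing."""
    table = ET.fromstring(markup.strip())
    headers = [th.text for th in table.findall("./tr[1]/th")]
    rows = []
    row_index = 2  # row 1 holds the headers
    while True:
        row = table.find(row_xpath(row_index))
        if row is None:  # no more rows: stop the loop
            break
        values = [td.text for td in row.findall("td")]
        rows.append(dict(zip(headers, values)))
        row_index += 1
    return rows


if __name__ == "__main__":
    for record in extract_rows(SAMPLE):
        print(record)
```

In the robot, the same pattern would apply to the click that opens each pop-up: the selector stays the same except for the row number supplied by the loop.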


After each extraction, you can use “Append row to Excel file” or save the data to an array and then use the “Write Excel file” function to write all the values at once.
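
The two saving strategies can be sketched in Python as follows. This is only an illustration under stated assumptions: it writes a CSV file (which Excel can open) with the standard csv module instead of a real Excel workbook, and the names append_row and write_all are hypothetical, not the platform's functions.

```python
import csv
from pathlib import Path


def append_row(path, header, row):
    """Strategy 1: append one row right after each extraction."""
    new_file = not Path(path).exists()
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        if new_file:  # write the header only once
            writer.writerow(header)
        writer.writerow(row)


def write_all(path, header, rows):
    """Strategy 2: collect rows in a list, then write them all at once."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
```

Appending row by row keeps whatever was already extracted if the robot stops halfway; writing everything at once opens the file only one time, which is usually simpler when the run is short.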

If you have any questions, @Shankar, please ask!

Hi Art,
Thanks for the detailed explanation.
Although I get the main idea, it would be really helpful if you could share some sample code. For example, in the Udemy course that explains price comparison using web scraping, I see “subprogram_read_config” and “subprogram_loop_through_items”. Are these routines available somewhere for reference? It would be great if you could share the .neek file.

Thanks again!
Shankar.

@Shankar, sorry, but we do not distribute algorithms. I can only offer you a simple robot that we use during technical demonstrations. I have sent it to you by e-mail.
As I said, this process is simple, so if any questions come up during development, please ask.