When creating workflows for scraping or manipulating web data, you are very likely to use some activities from the 'Browser' group of activities. Most likely, you will want to extract data from certain HTML elements, or from elements with specific classes or IDs. ElectroNeek uses CSS selectors or XPath to specify elements. Both are capable of finding almost any HTML element on a web page.
Let's consider the Google.com page as an example. If we intend to create a simple bot that opens google.com in a browser and searches for some particular information, the bot should at least be able to:
Locate the input area in the HTML code
Type the request into the area
Locate the Google Search button in the HTML code
Press the button
Assume we first want to locate the main input element on the page and get its selector.
Open google.com in the browser manually.
Press F12 to inspect the HTML code.
Go to the Elements tab.
You should see something like
That is the HTML structure of the web page. The next step is to navigate to a particular element and get the attributes that identify it. To do so, press the following icon in the upper-left corner of the inspector.
After activating this mode, move the cursor over an element on the left side, and its corresponding code will be highlighted on the right.
Thus, we can see the code for any element on the page just by hovering the mouse cursor over it.
Cascading Style Sheets (CSS) is a style sheet language used for describing the look and formatting of a document written in HTML or XML. In CSS, selectors are patterns used to select the styled element(s).
A CSS selector is shown on the screen as soon as you hover the mouse over an element (see the picture above).
In the example with google.com, the main input field had the following CSS selector
Use the CSS Selector Tester to experiment with different selectors.
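To make the idea of a selector as a pattern more concrete, here is a minimal sketch of how tag, ID, and class selectors pick out elements. It uses only Python's standard library and a hand-written matcher for compound selectors such as input.wide; it is not how ElectroNeek or a browser implements selectors. The page fragment, the names ElementCollector, matches, and select, and all IDs and classes in it are hypothetical and chosen only for illustration.

```python
import re
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Collects (tag, attributes) for every start tag in the document."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        self.elements.append((tag, dict(attrs)))


def matches(element, selector):
    """Check one element against a compound selector like 'input.wide#query'."""
    tag, attrs = element
    classes = (attrs.get("class") or "").split()
    # Split the selector into (prefix, name) parts: '' = tag, '#' = id, '.' = class.
    for prefix, name in re.findall(r"([#.]?)([\w-]+)", selector):
        if prefix == "#" and attrs.get("id") != name:
            return False
        if prefix == "." and name not in classes:
            return False
        if prefix == "" and tag != name:
            return False
    return True


def select(html, selector):
    """Return every element in the HTML text that matches the selector."""
    collector = ElementCollector()
    collector.feed(html)
    return [el for el in collector.elements if matches(el, selector)]


# Hypothetical page fragment, not the real markup of any site.
PAGE = """
<form id="search-form">
  <input id="query" class="field wide" name="q" type="text">
  <input class="field" name="lang" type="hidden">
  <button id="go" class="btn">Search</button>
</form>
"""

by_id = select(PAGE, "#query")                 # a single element, matched by ID
by_class = select(PAGE, ".field")              # both inputs share this class
by_tag_and_class = select(PAGE, "input.wide")  # tag and class combined
```

An ID is expected to be unique on a page, so an ID selector is usually the most reliable choice when the element has one; class selectors often match several elements.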
XPath, the XML Path Language, is a query language for selecting nodes from an XML document. Locating elements with XPath works well and offers a lot of flexibility. XPath uses path expressions to navigate through elements and attributes in an XML document.
Some cases are hard to handle with CSS selectors alone. Take a look at the following HTML code
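As an illustration of path expressions, the sketch below runs a few of them with Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. The document, its IDs, and its class names are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed document used only for illustration.
DOC = """
<html>
  <body>
    <div id="header"><a href="/home">Home</a></div>
    <div id="content">
      <p class="intro">Welcome</p>
      <p>Details</p>
    </div>
  </body>
</html>
""".strip()

root = ET.fromstring(DOC)  # root is the <html> element

# Step through children one level at a time, starting from the root.
first_link = root.find("./body/div/a")

# './/' searches all descendants; '[@id=...]' filters on an attribute value.
content_div = root.find(".//div[@id='content']")

# Positional predicate: the second <p> child of the content block.
second_p = root.find(".//div[@id='content']/p[2]")

# '[@class]' keeps only elements that carry the attribute at all.
with_class = root.findall(".//p[@class]")
```

Each expression reads left to right as a route through the tree, with predicates in square brackets narrowing the match at that step.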
<p> First </p><p> Second </p><p> Third. Some text in Paragraph </p>
The XPath for getting the content of the third <p> tag is
//p[contains(text(), 'Some text in Paragraph')]
But there is no way to match the content inside a <p> tag with a pure CSS selector.
There are no content selectors in the CSS3 specification. We can match on an element, on the name of an attribute in the element, and on the value of a named attribute in an element. There is nothing for matching the content within an element, though.
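The text-matching case above can be reproduced on the three-paragraph fragment. This is only a sketch: Python's standard xml.etree.ElementTree does not implement the XPath contains() function, so the hypothetical helper find_by_text below emulates //p[contains(text(), '...')] with a plain substring test. A full XPath engine, such as the third-party lxml library, could evaluate the expression directly.

```python
import xml.etree.ElementTree as ET

FRAGMENT = "<p> First </p><p> Second </p><p> Third. Some text in Paragraph </p>"

# Wrap the fragment in a single root so it parses as well-formed XML.
root = ET.fromstring(f"<body>{FRAGMENT}</body>")


def find_by_text(root, tag, needle):
    """Emulate //tag[contains(text(), needle)] with a substring test."""
    return [el for el in root.iter(tag) if needle in (el.text or "")]


matched = find_by_text(root, "p", "Some text in Paragraph")
for element in matched:
    print(element.text.strip())
```

Only the third paragraph contains the search text, so it is the only element returned.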
To get the XPath quickly, right-click the highlighted part of the code and navigate to Copy XPath
In the example with google.com, the main input field had the following XPath