With Google Spreadsheets, we can easily retrieve data from the pages we browse on the internet. We can take data from a second-hand site we look at, examine it, and even show it on a graph. All we need to know here is the function syntax.
The IMPORTXML function requires the URL and Xpath of the website. We can easily get the URL with copy-paste. But we cannot access Xpat directly. So what is Xpath?
XPath allows reading data stored as XML, displayed using HTML. If we want to get information from a website, we need to use the Xpath language.
In this article, I will tell you how to do it in three ways:
- How can we get Xpath manually?
- How can we get Xpath with right-click inspection?
- How can we get Xpath with Chrome extension?
How can we get Xpath manually?
We want to retrieve data from a website. For this, Google Sheets has developed a very easy function for us. The syntax is as follows.
For example; I want to get the titles used on my own website. I wonder what titles were used.
Note: We use 'h1' for the main title while writing on the site. We use 'h2' for the sub-title.
If I'm curious about the main titles, the Xpath will be like this→ “//h1” and the subtitles→ “//h2”
Let's put them inside the function now:
- URL: "https://northstaranalytics.co.uk/","//
- Xpath: "//h2"
The results come as follows:
It's done. After typing “//, it is enough to write whatever we want. Incoming data will not fit in the cell. You can find a detailed article on this subject here: ”How to wrap text in google sheets?
How can we get Xpath with right-click inspect?
We can write manually to get Xpath, or we can get it using the website. For example, we want to reach the list of best-selling books on Amazon.com. We can do this easily:
Type "best selling books of 2021" in the search section. Come to the section where the books appear on the next page.
We must make sure that the area indicated in blue on the screen shows the area we want to search.
Right-click on the blue point ->Copy->Xpath
Now we can write this copied field instead of in the function. For convenience, insert the URL into cell A1.
I paste the XPath into cell A2. I'm ready to write my function in another cell.
After following the steps in order, the information coming to our worksheet will be as follows.
This information came to a single cell. We need to apply the Split function to separate the information into cells. You can find detailed information about it here→ “How to split cells in Google Sheets?”
If we want the information not to go out of the cell, but to stay inside the cell, we need to say fit the texts. Detailed article about it→ “How to wrap text in Google Sheets?”
This is what our worksheet looks like after doing the last two steps.
How can we get Xpath with Chrome extension?
Another way to get Xpath through the website is to use a Chrome extension. The plugin we will use: Selector Gadget
- Type Selector Gadget on the page you are searching for
- Add Chrome extension
3. Come to the page where we want to get data. (I'm curious about the 2021 best-selling books on Amazon)-Come to Amazon.com
4. Extensions🡪Selector Gadget
Books will appear in green after the plugin is activated.
5. Go to the bottom right of the page where it says Xpath and click
6. Copy, paste into the function in your worksheet. (Or paste in another cell as above, write the name of this cell in the function.)
After following the steps, our worksheet will look like this;
If we just click on the prices after activating the extension, the prices will appear in yellow. If we click on Xpath, only the price list will come.