How to minimize the work required to collect data

Quick guide

1.  Click LISTLY WHOLE on Alibaba

2.  Click add setting on Databoard and set how many times the auto-scroll should repeat

3.  Click Latest and hit + Group

4.  Copy and paste all the URLs to scrape into the ADD URL section

5.  Click Group Excel and download the data into a single spreadsheet




Need to organize your data collection? Keep reading for web scraping tips and tricks! This tutorial walks you through collecting product information on Alibaba, one of the world's largest e-commerce companies, and creating a scheduled task so that you get the new information you need every single day.



Go to Alibaba and enter a keyword to search for the products you want.

Hit the LISTLY WHOLE button.

Double-check that all the data you need has been collected.

Alibaba ads often appear above the search results, so some irrelevant product information may be collected.



Auto-scroll


Click add setting on your Databoard.

Increase the number of times to repeat scrolling actions at the bottom of the page.

Auto-scroll scrolls down the web page automatically so that more data loads. For Alibaba, you may want to set the number of scroll repeats to three.



Click Save and hit the Refresh button on your Databoard.

Once the data collection is done, click Latest and then + Group.

+ Group lets you group all the URLs you would like to scrape and save their data into a single spreadsheet.

Copy the URLs you want to scrape and paste them into the ADD URL section.

How to get the addresses of pages 1, 2, 3 ...


The Alibaba online marketplace splits its product listings across multiple numbered pages. To collect as much product information as possible, you need to scrape each of those pages.


Click pages 1, 2, 3 ... one by one and check how each page's URL changes. In general, only the number that follows page= changes (e.g. page=2, page=3, page=4 ...), so you can create new URLs simply by changing that number. You can also copy each page's address and paste it one by one!
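If you have many pages, the page-number substitution described above can be scripted instead of typed by hand. The following is a minimal Python sketch, not part of Listly itself. It assumes the page number lives in a query parameter named page, and the example.com address and the build_page_urls helper are hypothetical placeholders; check the real parameter name in your browser's address bar first.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_page_urls(base_url, first_page, last_page, param="page"):
    """Return one URL per page by rewriting the page-number query parameter.

    `param` is an assumption: confirm the actual parameter name by clicking
    through the result pages and watching how the address changes.
    """
    parts = urlsplit(base_url)
    # Keep every query parameter except the page number itself.
    other_params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    urls = []
    for number in range(first_page, last_page + 1):
        query = urlencode(other_params + [(param, str(number))])
        urls.append(urlunsplit(parts._replace(query=query)))
    return urls


# Hypothetical search URL used only for illustration.
example = "https://www.example.com/search?SearchText=desk+lamp&page=1"
for url in build_page_urls(example, 1, 3):
    print(url)
```

Each printed line is one URL, so the output can be copied and pasted straight into the ADD URL section.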

Click Latest and check that the data extraction is complete.

Hit Group Excel to download all the data into a single Excel spreadsheet!

Scheduler


Set the time and date to collect data by clicking the add scheduler button on your Databoard.

Get email notifications and stay up to date with the latest data extraction!