BitSky has a desktop application for macOS, Windows, and Ubuntu that comes with the packages you need pre-installed, so you don't need to spend time installing or configuring an environment for Python, NodeJS, or other programming languages.
BitSky supports all programming languages (e.g. Python, Java, NodeJS, and so on), so you can use the programming language you are already familiar with and don't need to learn a new one just for web crawling.
In addition to those unique features, BitSky also has the following features:
Crawls any type of website. BitSky can crawl static websites or single-page applications.
Built on a microservices architecture, so it naturally supports distributed deployment and is easy to scale and extend.
With BitSky you only need to focus on extracting data; BitSky takes care of the rest of the work for you.
Try the FAQ - lists the most common questions
Try the How-Tos - lists the most common solutions
Report bugs or request features in our issue tracker
BitSky is based on a microservices architecture, so Retailer, Producer, and Supplier are all microservices.
A Supplier is the link between Retailers and Producers. A Supplier includes all the functions that manage Retailer Configurations, manage Producer Configurations, receive Tasks from a Retailer, assign Tasks to suitable Producers, and move successful or failed Tasks to Task History.
A Producer Configuration is the configuration for a Producer. It controls whether a Producer can execute Tasks and how it executes them. A Producer MUST connect to a Producer Configuration before it can be assigned Tasks, and a Producer Configuration has a one-to-one relationship with a Producer.
A Retailer Configuration is the configuration for a Retailer. It holds information about a Retailer, for example its Base URL, Health Check URL, and receive Tasks URL. A Retailer MUST connect to a Retailer Configuration before it can create and receive Tasks, and a Retailer Configuration has a one-to-one relationship with a Retailer.
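To make the Retailer Configuration fields concrete, here is a minimal sketch of how such a record might be modeled. The class name, field names, and example values are hypothetical and are not BitSky's actual schema. It also assumes the Health Check and receive Tasks URLs are paths relative to the Base URL, which the description above does not specify.

```python
from dataclasses import dataclass
from urllib.parse import urljoin


@dataclass(frozen=True)
class RetailerConfiguration:
    """Hypothetical shape of a Retailer Configuration (illustrative only)."""

    name: str
    base_url: str
    health_check_path: str   # assumption: relative to base_url
    receive_tasks_path: str  # assumption: relative to base_url

    def health_check_url(self) -> str:
        # Resolve the health check path against the base URL.
        return urljoin(self.base_url, self.health_check_path)

    def receive_tasks_url(self) -> str:
        # Resolve the receive Tasks path against the base URL.
        return urljoin(self.base_url, self.receive_tasks_path)


# Example values are made up for illustration.
config = RetailerConfiguration(
    name="example-retailer",
    base_url="http://localhost:8081/",
    health_check_path="health",
    receive_tasks_path="receive-tasks",
)
```

Keeping the three URLs in one immutable record mirrors the one-to-one relationship between a Retailer and its configuration.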
A Producer MUST connect to a Producer Configuration, and the Producer Configuration and the Producer must have the same type.
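The two connection rules for Producers (matching type, one-to-one relationship) can be sketched as a small check. This is an illustration under assumptions, not BitSky's implementation: the class names, the `connect` function, and the type strings `"HTTP"` and `"HEADLESS"` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProducerConfiguration:
    config_id: str
    producer_type: str                        # illustrative values: "HTTP", "HEADLESS"
    connected_producer: Optional[str] = None  # id of the single connected Producer


@dataclass
class Producer:
    producer_id: str
    producer_type: str


def connect(producer: Producer, config: ProducerConfiguration) -> None:
    """Connect a Producer to a configuration, enforcing both rules."""
    # Rule 1: the Producer and its configuration must have the same type.
    if producer.producer_type != config.producer_type:
        raise ValueError("Producer type does not match Producer Configuration type")
    # Rule 2: one-to-one, so a configuration accepts at most one Producer.
    if config.connected_producer not in (None, producer.producer_id):
        raise ValueError("Producer Configuration is already connected to another Producer")
    config.connected_producer = producer.producer_id


http_config = ProducerConfiguration(config_id="cfg-1", producer_type="HTTP")
connect(Producer(producer_id="p-1", producer_type="HTTP"), http_config)
```

A second Producer, or a Producer of a different type, would be rejected by `connect` with a `ValueError`.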
A Retailer creates Tasks and sends them to the Supplier, and the Supplier assigns the Tasks to suitable Producers. After a Producer successfully executes a Task, it sends the Task back to the Retailer; the returned Task contains the crawled data (e.g. HTML). The Retailer can extract useful information from the received Tasks or create more Tasks. The Retailer also decides where to store the extracted data and in what format. Most of your time will be spent creating your own Retailer.
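The flow described above can be simulated in memory to show how the pieces fit together. This is only a sketch under assumptions: in BitSky the Retailer, Supplier, and Producer are separate microservices, while here they are plain Python objects, and every name (`Task`, `Supplier`, `produce`, `extract_title`, the `"HTTP"` type) is hypothetical. Network access is replaced by a dictionary so the example runs offline.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Task:
    url: str
    producer_type: str          # hypothetical field used to pick a suitable Producer
    data: Optional[str] = None  # crawled data (e.g. HTML), filled in by a Producer


class Supplier:
    """Queues Tasks from a Retailer and hands them to suitable Producers."""

    def __init__(self) -> None:
        self.pending: List[Task] = []
        self.history: List[Task] = []

    def receive(self, tasks: List[Task]) -> None:
        self.pending.extend(tasks)

    def assign(self, producer_type: str) -> List[Task]:
        # Hand out the pending Tasks that match the Producer type and dequeue them.
        matched = [t for t in self.pending if t.producer_type == producer_type]
        self.pending = [t for t in self.pending if t.producer_type != producer_type]
        return matched

    def archive(self, task: Task) -> None:
        # Move a finished Task to Task History.
        self.history.append(task)


def produce(tasks: List[Task], fetch: Callable[[str], str]) -> List[Task]:
    """Producer step: execute each Task and attach the crawled data."""
    for task in tasks:
        task.data = fetch(task.url)
    return tasks


def extract_title(task: Task) -> Optional[str]:
    """Retailer step: pull one piece of information out of the returned HTML."""
    match = re.search(r"<title>(.*?)</title>", task.data or "")
    return match.group(1) if match else None


# Stand-in for real network access.
FAKE_WEB = {"http://example.com/": "<html><title>Example</title></html>"}

supplier = Supplier()
# Retailer creates a Task and sends it to the Supplier.
supplier.receive([Task(url="http://example.com/", producer_type="HTTP")])
# Supplier assigns matching Tasks; the Producer executes them.
returned = produce(supplier.assign("HTTP"), FAKE_WEB.__getitem__)
for task in returned:
    supplier.archive(task)
# Retailer extracts data from the returned Tasks.
titles = [extract_title(task) for task in returned]
```

In a real Retailer, the extraction step is where most of the custom work goes: choosing what to parse, whether to create follow-up Tasks, and where and in what format to store the results.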