Recently I wanted to check the response times of all category and product detail pages of a Magento 2 shop that I am working on.
So I searched for some tools for load-testing and crawling:
- First of all, there is Apache Bench (ab). But you have to give it a list of URLs to test, and at least for Magento 2.0 the XML sitemap does not contain all possible link combinations.
- There is Apache JMeter. It is a great tool, but I don't know how to crawl a site with it.
But as far as I know, none of these tools can easily combine load-testing and crawling.
So I built a little web crawler in Go that
- reads the XML sitemap of a website,
- then visits all links in the XML sitemap,
- and, while reading the response for each sitemap entry, looks for new links and crawls those too.
… and all of that concurrently, with 5, 50, or even 100 workers at the same time, to finish quickly and/or to create a bit of load, so I can see how the site responds to a bit of stress:
gargantua crawl --url https://www.sitemaps.org/sitemap.xml --workers 5
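The crawling loop described above can be sketched in Go roughly like this: a worker pool fed from a channel, seeded with the sitemap URLs, where each worker extracts new links from the pages it fetches and enqueues them. This is a minimal sketch under my own simplifying assumptions, not gargantua's actual implementation; the regex-based link extraction, the `urlset` struct, and the in-process test server exist only to make the example self-contained.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"regexp"
	"sync"
)

// urlset mirrors the <urlset><url><loc> structure of a standard XML sitemap.
type urlset struct {
	URLs []struct {
		Loc string `xml:"loc"`
	} `xml:"url"`
}

// hrefPattern is a deliberately simplistic link extractor; a real crawler
// would use an HTML parser and resolve relative URLs.
var hrefPattern = regexp.MustCompile(`href="([^"]+)"`)

// extractLinks returns all href values found in an HTML page.
func extractLinks(page string) []string {
	var links []string
	for _, m := range hrefPattern.FindAllStringSubmatch(page, -1) {
		links = append(links, m[1])
	}
	return links
}

// crawl fetches the sitemap, then lets `workers` goroutines fetch every URL,
// enqueueing any new links they discover. It returns the set of visited URLs.
func crawl(sitemapURL string, workers int) map[string]bool {
	resp, err := http.Get(sitemapURL)
	if err != nil {
		panic(err)
	}
	body, _ := io.ReadAll(resp.Body)
	resp.Body.Close()

	var s urlset
	if err := xml.Unmarshal(body, &s); err != nil {
		panic(err)
	}

	jobs := make(chan string, 4096) // buffered so workers can enqueue without blocking
	seen := map[string]bool{}
	var mu sync.Mutex
	var wg sync.WaitGroup

	enqueue := func(u string) {
		mu.Lock()
		defer mu.Unlock()
		if !seen[u] {
			seen[u] = true
			wg.Add(1)
			jobs <- u
		}
	}

	for i := 0; i < workers; i++ {
		go func() {
			for u := range jobs {
				if r, err := http.Get(u); err == nil {
					page, _ := io.ReadAll(r.Body)
					r.Body.Close()
					for _, link := range extractLinks(string(page)) {
						enqueue(link) // crawl newly discovered links too
					}
				}
				wg.Done()
			}
		}()
	}

	for _, u := range s.URLs {
		enqueue(u.Loc)
	}
	wg.Wait()
	close(jobs)
	return seen
}

func main() {
	// A tiny in-process site so the sketch runs without network access:
	// the sitemap lists /a, and /a links to /b.
	var base string
	mux := http.NewServeMux()
	mux.HandleFunc("/sitemap.xml", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, `<?xml version="1.0"?><urlset><url><loc>%s/a</loc></url></urlset>`, base)
	})
	mux.HandleFunc("/a", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, `<a href="%s/b">b</a>`, base)
	})
	mux.HandleFunc("/b", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "leaf page")
	})
	ts := httptest.NewServer(mux)
	base = ts.URL
	defer ts.Close()

	seen := crawl(base+"/sitemap.xml", 5)
	fmt.Println(len(seen)) // /a from the sitemap plus /b discovered on /a
}
```

The channel-plus-WaitGroup pairing is the idiomatic Go way to express this kind of pipeline: the `seen` map, guarded by a mutex, doubles as the deduplication set, and the WaitGroup tracks in-flight pages so the program knows when the whole frontier is exhausted.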
「 gargantua 」 is just a prototype, but it works pretty well and is really easy to use.
You can get the latest code and binaries at github.com/andreaskoch/gargantua.
If I missed any tools that can do the same tasks, please let me know via Twitter (@andreaskoch). If you find bugs or have ideas for feature requests, please create an issue in the GitHub repository: github.com/andreaskoch/gargantua.