pavuk ( http://www.pavuk.org/ ) is a very sophisticated web spider with some very nice configuration options. pavuk carries on where wget et al. throw in the towel. ;-)
These URLs can be used to test specific pavuk features.
This includes some of the tests that come with pavuk itself and are run by the
'make check' command, which you run after building pavuk.
If you want to run all tests at once without letting pavuk spider the complete site (which I'd rather frown upon, if you get my drift), note that these tests are all located inside this directory. Specify the URL of this page as the starting URL for grabbing the test cases, and add the '-dont_leave_dir' command line option to prevent pavuk from traveling outside this directory.
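As a rough sketch of the invocation described above: the START_URL value below is a made-up placeholder (this page's real URL is not given here), and the RUN switch is only a convenience added for this example, not a pavuk feature. The only pavuk option used is '-dont_leave_dir', as named above.

```shell
#!/bin/sh
# Hypothetical placeholder: replace with the real URL of this test page.
START_URL="${START_URL:-http://www.example.com/pavuk-tests/index.html}"

# -dont_leave_dir keeps pavuk inside the directory of the starting URL,
# so only the test cases are fetched, not the rest of the site.
# (CMD is left unquoted when run; this assumes the URL contains no spaces.)
CMD="pavuk -dont_leave_dir $START_URL"

# Print the command by default; set RUN=1 to actually execute it.
if [ "${RUN:-0}" = "1" ]; then
    $CMD
else
    echo "$CMD"
fi
```

Running the script as-is only prints the command line, so you can check it before letting pavuk loose; run it with RUN=1 in the environment once the URL is right.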
We offer additional test cases for use with pavuk ('make check') on this site. These include:
Test URL: a Chinese web forum (chinese_bbs_test1.html)
(c)Copyright 2001-2009, Gerrit E.G. Hobbelt (Ger Hobbelt a.k.a. [i_a] ) - Hebbut.Net