To give specific recommendations we'd have to see your code. However, HTML parsing is a very difficult problem; be sure to use an existing parsing library and don't attempt to create your own.
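As a rough illustration of relying on an existing parser, here is a minimal sketch that pulls links out of a page with Python's standard-library `html.parser` module. The names `LinkExtractor` and `extract_links` are made up for this example, and it assumes you only care about `href` attributes on `<a>` tags; it is not a suggestion about how your code is actually structured.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags (hypothetical helper)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the URL of the page itself.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return every link found in `html`, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

The parser deals with mixed-case tags, quoting, and similar details, which are the sort of thing a hand-rolled regex or string search tends to get wrong.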
In general, it is better to implement web crawlers using breadth-first search rather than depth-first search. A depth-first crawler often makes many consecutive requests to the same domain and path; web sites can detect this, and your crawler may be throttled or even blocked. A breadth-first crawler avoids this and has more opportunities for optimization, for example recognizing that two sites are copies of each other and abandoning the slower one.
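A breadth-first crawl can be sketched with a simple FIFO queue. This is only an outline under some assumptions: `crawl_bfs` and its `get_links` callback are hypothetical names, and `get_links` stands in for whatever you use to download a page and extract its links. Passing it in as a parameter keeps the traversal logic separate from the network code.

```python
from collections import deque


def crawl_bfs(start_url, get_links, max_pages=100):
    """Visit pages breadth-first and return the URLs in visiting order.

    `get_links` is a hypothetical callback: given a URL, it returns the
    list of URLs that page links to.
    """
    visited = {start_url}        # every URL ever queued, to avoid repeats
    queue = deque([start_url])   # FIFO queue is what makes this breadth-first
    order = []

    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                queue.append(link)

    return order
```

Replacing `queue.popleft()` with `queue.pop()` would turn this into a depth-first crawl, which is the behaviour to avoid. A real crawler would probably also want a per-domain delay between requests, which this sketch leaves out.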