Recursive web crawling using VBA

shahin

Hi there! Is it possible to crawl a web page recursively? With two or three requests I can already produce lots of links, but that is not what I want. I was thinking of doing it myself, but I don't know how to feed each newly found link back into a function (or something similar) so that it keeps running until every link on a page reaches its dead end. Here is what I wrote to extract the links from a page. I hope somebody can give me an idea of how to follow those links recursively. Thanks in advance.

Code:
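Sub RecursiveCrawling()
    Const url As String = "https://en.wikipedia.org/wiki/Main_Page"
    Const mainlink As String = "https://en.wikipedia.org"
    Dim html As Object, topics As Object, post As Object
    Dim x As Long, link As String

    ' Download the page synchronously (a GET has no body, so no Content-Type header is needed)
    With CreateObject("MSXML2.ServerXMLHTTP")
        .Open "GET", url, False
        .send
        Set html = CreateObject("htmlfile")
        html.body.innerHTML = .responseText
    End With

    ' Write the href of every anchor to column A of the active sheet
    Set topics = html.getElementsByTagName("a")
    For Each post In topics
        link = post.href
        ' htmlfile typically reports site-relative links as "about:/...": prepend the host
        If Left$(link, 7) = "about:/" And Mid$(link, 8, 1) <> "/" Then link = mainlink & Mid$(link, 7)
        x = x + 1
        Cells(x, 1).Value = link
    Next post
    Set topics = Nothing
End Sub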
 
Hi!

As this may be a never-ending story, it could turn into a mess!
Or run out of memory …

Create a request procedure that takes a URL as a parameter.
Within this procedure, for each link found, call the same procedure
with the link's URL.

The main procedure just calls this request procedure with the main URL …
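
The outline above could be sketched roughly as follows. This is only an illustration, not a demo from anyone in this thread: the names Crawl, NormalizeLink, visited, HOST, MAX_DEPTH and MAX_PAGES are all made up for the example, and the depth and page limits are assumptions added so the "never-ending story" stops at some point. It also assumes that htmlfile reports site-relative links as "about:/...", which is its usual behaviour when the document has no base URL.

```vba
Option Explicit

' Assumed limits and host, not from the thread: adjust as needed
Private Const MAX_DEPTH As Long = 2
Private Const MAX_PAGES As Long = 50
Private Const HOST As String = "https://en.wikipedia.org"

Private visited As Object   ' Scripting.Dictionary used as a set of URLs already requested

Sub Main()
    Set visited = CreateObject("Scripting.Dictionary")
    Crawl HOST & "/wiki/Main_Page", 0
End Sub

' The request procedure: takes a URL and calls itself for each link it finds
Sub Crawl(ByVal url As String, ByVal depth As Long)
    Dim html As Object, a As Object, link As String

    ' Stop conditions: too deep, too many pages, or page already requested
    If depth > MAX_DEPTH Or visited.Count >= MAX_PAGES Then Exit Sub
    If visited.Exists(url) Then Exit Sub
    visited.Add url, depth
    Cells(visited.Count, 1).Value = url   ' log the page in column A

    Set html = CreateObject("htmlfile")
    With CreateObject("MSXML2.ServerXMLHTTP")
        .Open "GET", url, False
        .send
        If .Status <> 200 Then Exit Sub   ' skip pages that did not load
        html.body.innerHTML = .responseText
    End With

    For Each a In html.getElementsByTagName("a")
        link = NormalizeLink(a.href)
        If Len(link) > 0 Then Crawl link, depth + 1   ' the recursive call
    Next a
End Sub

' Returns an absolute URL on HOST without its #fragment, or "" for anything else
Function NormalizeLink(ByVal href As String) As String
    Dim p As Long
    ' Site-relative link ("about:/wiki/..."): prepend the host
    If Left$(href, 7) = "about:/" And Mid$(href, 8, 1) <> "/" Then href = HOST & Mid$(href, 7)
    ' Ignore links that leave the host
    If Left$(href, Len(HOST) + 1) <> HOST & "/" Then Exit Function
    p = InStr(href, "#")
    If p > 0 Then href = Left$(href, p - 1)
    NormalizeLink = href
End Function
```

Without the visited dictionary the same page would be requested over and over, and without the limits the recursion would only end when the site runs out of links. The sketch sends requests back to back, so a pause between them would probably be wise on a real site.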
 
OMG!!! Look who I got in the loop!!!! Happy to hear from you. Dear Marc L, what you just suggested might be easy to follow, but the problem is that I'm very weak with functions. So if you have already created a demo for somebody, I would be very happy to have it. Thanks in advance. I was away from my PC, which is why I responded late.
 

Sorry, I was in a hurry: it is not a function but a procedure! :confused:
With a URL as a parameter (previous post edited) …
 
Hi Marc L, it's not good to keep nagging a busy man like you, but I really want to learn this. So once again I ask you to provide me with a demo, in your spare time, of how to make a call that crawls recursively in VBA. No offense meant.
 
What is a recursive function or procedure?
It's just a process which can call itself …

A demonstration of a recursive procedure:
in a blank worksheet, enter 1 to 3 in A1 to A3:
Code:
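Sub RecProc(Rg As Range)
    ' Column B: double the value of the current cell
    Rg(1, 2).Value = Rg.Value * 2
    ' Rg(2) is the cell directly below: recurse while it holds a positive number
    If Rg(2).Value > 0 Then
        RecProc Rg(2)
        ' Runs only after the recursive call returns: column C = this row's B + next row's B
        Rg(1, 3).Value = Rg(1, 2).Value + Rg(2, 2).Value
    End If
End Sub

Sub Main()
    RecProc [A1]
End Sub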
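The Main procedure just calls RecProc with cell A1,
and within RecProc, if the condition is true, RecProc calls itself.
With 1, 2, 3 in A1:A3 it fills B1:B3 with 2, 4, 6, then C2 with 10 and C1 with 6 as the calls return; C3 stays empty because A4 is blank.
Follow the code's progression in step-by-step mode via the F8 key …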
 