When I tried to clone my website http://researchweb.iiit.ac.in/~sanskar.tibrewal, the cloner fetched researchweb.iiit.ac.in instead of the link I provided. After digging in, I found that the issue comes from how yarl joins URLs.
For example:
```python
import yarl

a = yarl.URL("../index.html")
b = yarl.URL("http://researchweb.iiit.ac.in/~sanskar.tibrewal")
b.join(a)
# Output:
# URL('http://researchweb.iiit.ac.in/index.html')
```
As a result, the ~sanskar.tibrewal part gets cut out of the joined URL, because the `..` in the relative link is resolved against the base path.
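For comparison, here is a minimal sketch of the same join using only the standard library's `urllib.parse.urljoin` (not the cloner's actual code). It suggests the result comes from ordinary relative-reference resolution, and that adding a trailing slash to the base does not change the outcome for a `../` link. The variable names are just for illustration.

```python
from urllib.parse import urljoin

base = "http://researchweb.iiit.ac.in/~sanskar.tibrewal"

# Base without a trailing slash: the last segment is treated like a file name.
without_slash = urljoin(base, "../index.html")

# Base with a trailing slash: ".." still climbs out of the ~sanskar.tibrewal directory.
with_slash = urljoin(base + "/", "../index.html")

print(without_slash)  # http://researchweb.iiit.ac.in/index.html
print(with_slash)     # http://researchweb.iiit.ac.in/index.html
```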
Similar code can be found at:

Here `self.root` corresponds to `b` and `url` corresponds to `a` in the example above.
One possible fix is to rewrite the code so that `self.root` is kept up to date as links are followed. Another is to use regular expressions or a URL parser to handle the dot segments and work out the correct links.
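A rough sketch of the parser-based idea, assuming the goal is to keep every resolved link under the directory of the URL being cloned. `resolve_within_root` is a hypothetical helper, not part of yarl or the cloner, and it uses `urllib.parse` from the standard library rather than yarl so that it runs on its own.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit


def resolve_within_root(root: str, link: str) -> Optional[str]:
    """Resolve link against root; return None if the result leaves root's path.

    Hypothetical helper for illustration only.
    """
    # Treat the root as a directory so plain relative links resolve inside it.
    base = root if root.endswith("/") else root + "/"
    resolved = urljoin(base, link)

    base_parts = urlsplit(base)
    res_parts = urlsplit(resolved)

    same_host = (res_parts.scheme, res_parts.netloc) == (base_parts.scheme, base_parts.netloc)
    inside_root = res_parts.path.startswith(base_parts.path)
    return resolved if same_host and inside_root else None


root = "http://researchweb.iiit.ac.in/~sanskar.tibrewal"
print(resolve_within_root(root, "index.html"))
print(resolve_within_root(root, "../index.html"))
```

With this check, a link such as `index.html` stays under `~sanskar.tibrewal`, while `../index.html` is reported as escaping the root (`None`), so the caller can decide whether to skip it or remap it.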