This repository was archived by the owner on Jul 30, 2024. It is now read-only.

Description
After rendering, a website can have tens of thousands of URLs or more, and these URLs are hierarchical. If a parent-level URL is judged to be a duplicate, its child URLs are ignored outright. Can a Bloom filter solve this problem? Where would I need to make changes? Could anyone suggest a good approach?
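To make the scenario concrete, here is a minimal sketch of per-URL deduplication with a Bloom filter, in which every child URL is tested under its own full URL rather than being dropped along with its parent. It is not this project's code: `BloomFilter`, `crawl`, and `get_children` are hypothetical names, and the bit-array size and hash count are arbitrary assumptions. A Bloom filter can also return false positives, so on its own it may still occasionally skip a URL that was never visited.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, size_bits=1 << 20, num_hashes=5):
        # Sizes are illustrative assumptions, not tuned values.
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def crawl(start_url, get_children):
    """Visit each URL at most once; dedup is decided per full URL.

    get_children is a hypothetical callback returning the links
    found on a rendered page.
    """
    seen = BloomFilter()
    seen.add(start_url)
    stack = [start_url]
    visited = []
    while stack:
        url = stack.pop()
        visited.append(url)
        for child in get_children(url):
            # Each child is checked on its own key, so a parent that was
            # already seen does not cause unseen children to be skipped.
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return visited
```

For example, with `site = {"/a": ["/a/1", "/a/2"], "/a/2": ["/a/2/x"]}`, calling `crawl("/a", lambda u: site.get(u, []))` visits all four URLs once each. If the existing dedup step works on a normalized or similarity-based key instead of the full URL, that key function is probably the place to change, since a Bloom filter only replaces the membership store.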