sub-link-extractor

sub-link-extractor is a Java library that extracts all the links from a given webpage. Its DefaultLinkExtractor class uses jsoup to parse the HTML content.

Installation

For those using Gradle, add the following to your build.gradle file:

implementation 'io.github.revfactory:sub-link-extractor:0.1.1'

For Maven users, add the following to your pom.xml:

<dependency>
    <groupId>io.github.revfactory</groupId>
    <artifactId>sub-link-extractor</artifactId>
    <version>0.1.1</version>
</dependency>

Usage

import java.util.List;

import io.github.revfactory.LinkExtractorStrategy;
import io.github.revfactory.DefaultLinkExtractor;

// ...

LinkExtractorStrategy extractor = new DefaultLinkExtractor(1500);  // 1.5 second delay
List<String> links = extractor.extractLinks("http://example.com/docs");

To customize how links are extracted, implement the LinkExtractorStrategy interface and use your own strategy in place of DefaultLinkExtractor.
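As a rough illustration of a custom strategy, the sketch below wraps another strategy and keeps only the links that start with the requested URL. The interface declared here is a stand-in so the example compiles on its own; it assumes the library's interface has a single extractLinks(String) method that returns List<String> and throws IOException, as the usage snippet suggests. PrefixFilteringExtractor is a hypothetical name and is not part of the library.

```java
import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;

// Stand-in for the library's interface so this sketch compiles on its own.
// Assumption: the real interface declares a single method with this shape.
interface LinkExtractorStrategy {
    List<String> extractLinks(String url) throws IOException;
}

// Hypothetical strategy: delegates to another strategy, then keeps only
// the links that start with the URL that was requested.
class PrefixFilteringExtractor implements LinkExtractorStrategy {
    private final LinkExtractorStrategy delegate;

    PrefixFilteringExtractor(LinkExtractorStrategy delegate) {
        this.delegate = delegate;
    }

    @Override
    public List<String> extractLinks(String url) throws IOException {
        return delegate.extractLinks(url).stream()
                .filter(link -> link.startsWith(url))
                .collect(Collectors.toList());
    }
}
```

In a real project you would drop the stand-in interface, import io.github.revfactory.LinkExtractorStrategy instead, and pass a DefaultLinkExtractor as the delegate.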

Error Handling

The extractLinks method can throw an IOException. Catch it, or declare it on the calling method, so that network issues and other related problems are handled explicitly.
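One possible way to handle the exception is to catch it at the call site and fall back to an empty list. This is only a sketch: SafeExtraction and extractOrEmpty are hypothetical names, and the interface is again a stand-in that assumes the method shape shown in the usage snippet.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.List;

// Stand-in for the library's interface so this sketch compiles on its own.
// Assumption: the real interface declares a single method with this shape.
interface LinkExtractorStrategy {
    List<String> extractLinks(String url) throws IOException;
}

// Hypothetical helper, not part of the library.
class SafeExtraction {
    // Returns the extracted links, or an empty list if extraction
    // fails with an IOException (for example, a network failure).
    static List<String> extractOrEmpty(LinkExtractorStrategy extractor, String url) {
        try {
            return extractor.extractLinks(url);
        } catch (IOException e) {
            System.err.println("Could not extract links from " + url + ": " + e.getMessage());
            return Collections.emptyList();
        }
    }
}
```

Whether an empty list is an acceptable fallback depends on the caller; rethrowing or wrapping the exception may fit better when a failed fetch must not go unnoticed.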

Licensing

This project is distributed under the Apache 2.0 license.
