2. Sources
Sources are used to provide information from external websites, RSS feeds, or APIs on Ferrite. Prefer APIs and RSS feeds over scraping, as those methods do not overload a website.

    - name: My cool source
      version: 1
      minVersion: '0.7'
      about: >-
        # Folded about description
      website: https://mycoolwebsite.com
      tags:
        # YAML sequence of tag blocks
        - name: A cool tag
          color: # A hex value
      trackers:
        # YAML sequence of URLs
        - udp://tracker1.com/announce
        - http://tracker2.net/announce
      api:
        # Add API info here
      jsonParser:
        # Add JSON parser here
      rssParser:
        # Add RSS parser here
      htmlParser:
        # Add HTML parser here

## Fields

- `name` (Required String): The source's name.
- `version` (Required Integer): The version number of the source. Each update to the source increments the version by 1. Only increment the version when you are sure that the source is ready to be published.
- `minVersion` (Optional String): The minimum app version this source can run on. YAML plugins only run on v0.7 or above by default.
- `about` (Optional String): A short description of the source, shown in source-specific settings. Line folding, as shown in the template, is recommended.
- `website` (Optional String): The base URL of the website. For example, https://google.com is the base URL of Google. Do not include a trailing slash, otherwise the source will break. Required if `dynamicWebsite` is not used.
- `dynamicWebsite` (Optional Boolean): Marks whether the user can fill in the website in the source settings. Used for locally hosted sources.
- `tags` (Optional Array<Tag>): Please see the Tag documentation.
- `trackers` (Optional Array<String>): If only a magnet hash field is provided, this field is used to construct a magnet link. Trackers are provided as an array of strings, and the best way to get them is to decode a magnet link from the website itself. Magnet links usually contain trackers after an `&tr` part of the URL.
- `jsonParser` (Optional JsonParser): Allows for parsing of API payloads. Always prefer this over HTML parsers!
- `rssParser` (Optional RssParser): Allows for parsing of RSS feeds. Always prefer this over HTML parsers!
- `htmlParser` (Optional HtmlParser): The web scraping module for a source. Use this if a source does not have an API and allows scraping!

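As a sketch of how the top-level fields fit together, the following fills in the template for a placeholder site. Everything here is hypothetical: example.org, the tracker hosts, and the magnet link in the comment are made up for illustration, and the parser blocks are left out. The `trackers` list is what you would get by URL-decoding each `tr` parameter of a magnet link found on the site.

```yaml
# Hypothetical magnet link copied from the site (hash elided):
#   magnet:?xt=urn:btih:<hash>&tr=udp%3A%2F%2Ftracker.example.org%3A1337%2Fannounce&tr=http%3A%2F%2Ftracker.example.net%2Fannounce
- name: Example source
  version: 1
  minVersion: '0.7'
  about: >-
    Searches example.org, a placeholder site
    used only for illustration.
  # No trailing slash on the base URL
  website: https://example.org
  trackers:
    # Each tr= value from the magnet link above, URL-decoded
    - udp://tracker.example.org:1337/announce
    - http://tracker.example.net/announce
```

For a locally hosted source, `website` could be dropped in favor of `dynamicWebsite: true`, so that the user supplies the URL in the source settings.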
API information is required if you want to query from a website that contains API routes. These are also used for aggregate sources that run a local server.

    api:
      apiUrl: # base URL for API routes
      clientId: # client ID or username credential
      clientSecret: # client secret or password credential

- `apiUrl` (Optional String): The base URL for API routes.
- `clientId` (Optional ApiCredential): If an API wants a username or client ID, add a block here.
- `clientSecret` (Optional ApiCredential): If an API wants a password or client secret, add a block here.

JSON parsers are used for parsing JSON APIs from certain websites. Making this module requires some understanding of JSON fields. It is paired with the API module.
Here is a template for a JSON parser with all the fields filled out. If a field is optional, remove it.

    jsonParser:
      searchUrl: # Search path
      results: # Results located on first JSON layer
      subResults: # Results located on second JSON layer
      magnetHash:
        # Complex query
      magnetLink:
        # Complex query
      subName:
        # Complex query
      title:
        # Complex query
      size:
        # Complex query
      sl:
        seeders: # Seeder key name
        leechers: # Leecher key name

- `searchUrl` (Optional String): The URL path used when querying an API. For example, given a URL such as https://www.google.com/search?q=hello, the search URL is whatever comes after the base URL (in this case `/search?q=hello`). Include the slash at the beginning, otherwise the source will break.
- `results` (Optional String): JSON field for a results array. Most API results are compacted in an array of JSON objects.
- `subResults` (Optional String): JSON field for results on the second layer of JSON for a result.
- `magnetHash` (Optional ComplexQuery): JSON key for the location of a magnet hash.
- `magnetLink` (Optional ComplexQuery): JSON key for the location of a magnet link.
- `subName` (Optional ComplexQuery): Used for aggregate sources. Adds the original website name where the aggregate source fetched the item from.
- `title` (Optional ComplexQuery): JSON key for the location of a result title.
- `size` (Optional ComplexQuery): JSON key for the size of a result.
- `sl` (Optional): Used to get seeder and leecher values for an item. All the properties below are optional.
  - `seeders` (String): The JSON result key for seeders.
  - `leechers` (String): The JSON result key for leechers.

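To make the field mapping concrete, here is a sketch for a made-up API. The response shape and key names (`data`, `name`, `hash`, `bytes`, `seeds`, `peers`) are assumptions for illustration. So is the use of `{query}` in `searchUrl`: that placeholder is documented for the RSS and HTML parsers, and this sketch assumes it behaves the same way here. It also assumes a complex query for JSON needs only the key name in `query`.

```yaml
# Hypothetical response for a search request:
#   {
#     "data": [
#       { "name": "Example release", "hash": "<hash>", "bytes": 1468006400,
#         "seeds": 42, "peers": 7 }
#     ]
#   }
jsonParser:
  searchUrl: /search?q={query}
  # Key of the array that holds one object per result
  results: data
  magnetHash:
    query: hash
  title:
    query: name
  size:
    query: bytes
  sl:
    seeders: seeds
    leechers: peers
```

Because this sketch provides only a magnet hash, the source would also need a top-level `trackers` list so that a magnet link can be constructed.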
RSS parsers are used for parsing RSS feeds from websites. Making this module doesn't require any prior knowledge and shouldn't be difficult to pick up.
Here is a template for an RSS parser with all the fields filled out. If a field is optional, remove it.

    rssParser:
      searchUrl: # Search path
      items: # Items selector
      magnetHash:
        # Complex query
      magnetLink:
        # Complex query
      subName:
        # Complex query
      title:
        # Complex query
      size:
        # Complex query
      sl:
        seeders: # Seeder tag name/value (if discriminator present)
        leechers: # Leecher tag name/value (if discriminator present)
        discriminator: # Complex query discriminator
        attribute: # Seeder and leecher value tag name

- Optional String: Base URL of the RSS feed. Some websites use domains such as feed.website.com for RSS feeds as opposed to website.com.
- `searchUrl` (Optional String): The URL path used when searching content on a feed. For example, given a URL such as https://www.google.com/search?q=hello, the search URL is whatever comes after the base URL (in this case `/search?q=hello`). Include the slash at the beginning, otherwise the source will break. Parameters:
  - `{query}` (Var): Replaced with the user's URL-encoded search query.
- `items` (Required String): The tag name for items in an RSS XML document. This is usually called `item`.
- `magnetHash` (Optional ComplexQuery): The tag name for a magnet hash (if present).
- `magnetLink` (Optional ComplexQuery): The tag name for a magnet link (if present). If a `magnetLink` field isn't provided, a `magnetHash` field must be provided with trackers.
- `subName` (Optional ComplexQuery): Used for aggregate sources. Adds the original website name where the aggregate source fetched the item from.
- `title` (Optional ComplexQuery): The tag name for the title of an item.
- `size` (Optional ComplexQuery): The tag name for the size of an item. It doesn't matter if the size is in an integer format; Ferrite will convert it into the proper format (GB, MB, etc).
- `sl` (Optional): Used to get seeder and leecher values for an item. All the properties below are optional. Arguments:
  - `seeders` (String): The tag name for seeders.
  - `leechers` (String): The tag name for leechers.
  - `discriminator` (String): Mirrors the `discriminator` parameter of complex queries.
  - `attribute` (String): Tag name used for both seeders and leechers.

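The following sketch maps a made-up feed item onto an RSS parser. The feed layout, tag names, and the `/rss?q=` path are all hypothetical. It assumes the simple tags can rely on the default `text` attribute, and it reads the template comments to mean that, once a `discriminator` is set, `seeders` and `leechers` hold the values of that discriminator attribute rather than tag names.

```yaml
# Hypothetical feed item:
#   <item>
#     <title>Example release</title>
#     <link>magnet:?xt=urn:btih:...</link>
#     <size>1468006400</size>
#     <attr name="seeders" value="42"/>
#     <attr name="peers" value="7"/>
#   </item>
rssParser:
  searchUrl: /rss?q={query}
  items: item
  magnetLink:
    query: link
  title:
    query: title
  size:
    query: size
  sl:
    # Matched against the name attribute, since a discriminator is present
    seeders: seeders
    leechers: peers
    discriminator: name
    # Attribute that carries the actual number
    attribute: value
```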
HTML parsers are used for web scraping. Making this module requires understanding web scraping and basic DOM methods such as querySelector and querySelectorAll for testing.
Here is a template for an HTML parser with all the fields filled out. If a field is optional, remove it.

    htmlParser:
      searchUrl: # Search URL path
      rows: # Row selector
      magnet:
        # Complex query
        externalLinkQuery: # If a magnet is on a different webpage
      subName:
        # Complex query
      title:
        # Complex query
      size:
        # Complex query
      sl:
        seeders: # Seeder selector
        leechers: # Leecher selector
        combined: # Combined seeder and leecher string selector
        attribute: # Complex query attribute
        seederRegex: # Regex for seeder string
        leecherRegex: # Regex for leecher string

- `searchUrl` (Required String): The URL path used when searching content on a website. For example, given a URL such as https://www.google.com/search?q=hello, the search URL is whatever comes after the base URL (in this case `/search?q=hello`). Include the slash at the beginning, otherwise the source will break. Parameters:
  - `{query}` (Var): Replaced with the user's URL-encoded search query.
- `rows` (Required String): The CSS selector for selecting a table row. Most of these sites use HTML tables. Please consult this while web scraping.
- `magnet` (Required ComplexQuery): The HTML parser only looks for magnet links. Extra arguments:
  - `externalLinkQuery` (String): If a magnet link is located on a different page, this fetches the URL required to navigate to that page and fetch the magnet link. This makes the source slower as the number of search results grows; please add the `Slow` tag if using this argument.
- `subName` (Optional ComplexQuery): Used for aggregate sources. Adds the original website name where the aggregate source fetched the item from.
- `title` (Optional ComplexQuery): Follows complex query spec. No unique parameters.
- `size` (Optional ComplexQuery): Follows complex query spec. No unique parameters.
- `sl` (Optional): Used to get seeder and leecher values on a website. All the properties below are optional. Arguments:
  - `seeders` (String): The seeder CSS selector.
  - `leechers` (String): The leecher CSS selector.
  - `combined` (String): A CSS selector used when seeders and leechers are in one string (ex. `Seeders: 100 / Leechers: 200`).
  - `attribute` (String): Tag name used for both seeders and leechers.
  - `seederRegex` (String): Regex used to strip the seeder value from a string (follows the same rules as complex query regexes).
  - `leecherRegex` (String): Regex used to strip the leecher value from a string (follows the same rules as complex query regexes).

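As an illustration of the selectors and the combined seeder/leecher string, here is a sketch against a made-up results table. The HTML, class names, and search path are hypothetical, and the sketch assumes each selector is evaluated relative to the element matched by `rows`.

```yaml
# Hypothetical result row:
#   <tr class="result">
#     <td class="name"><a href="/torrent/123">Example release</a></td>
#     <td class="dl"><a href="magnet:?xt=urn:btih:...">Magnet</a></td>
#     <td class="size">1.4 GB</td>
#     <td class="peers">Seeders: 100 / Leechers: 200</td>
#   </tr>
htmlParser:
  searchUrl: /search?q={query}
  rows: tr.result
  magnet:
    query: td.dl a
    # The magnet link lives in the href attribute
    attribute: href
  title:
    query: td.name a
    attribute: text
  size:
    query: td.size
    attribute: text
  sl:
    # Both values share one cell, so select it once and split with regexes
    combined: td.peers
    attribute: text
    seederRegex: "Seeders: ([0-9]+)"
    leecherRegex: "Leechers: ([0-9]+)"
```

The selectors can be tried in a browser console with `document.querySelectorAll('tr.result')` before committing them to the source. The regexes use `[0-9]` rather than `\d` to sidestep backslash escaping.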
These are generic queries used by Ferrite for keys that require a little more information when parsing the contents.
Any key that has ComplexQuery as a tag will always use these parameters.
Here is a template for a complex query with all the parameters filled in. If a parameter is optional, remove it.

    query: # CSS selector for the scraper
    discriminator: # For RSS and JSON - tag discrimination
    attribute: # Name or value of tag depending on discriminator presence
    regex: # Regex string

- `query` (Required String): The CSS selector for selecting the element in question.
- `attribute` (Required String): The attribute to look for after selecting the query (ex. href, title, span). The default value is `text` for getting a tag's textContent.
- `discriminator` (Optional String): Used with RSS and JSON parsers.
  - If an RSS tag is formatted such as `<attr name="magnet" value="magnetLinkHere">` and you want the value, the discriminator would be `name` and the attribute would be `value`.
  - If some JSON parameters return similar values, such as a title, a discriminator can be used to separate entries that have the same title but other differing parameters.
- `regex` (Optional String): Runs regex on the query result before presentation to the user.
  - Do not include the beginning and end slashes in this string (ex. `/regex/`).
  - When using a `\` character, escape it using `\\`.
  - If a regex does not have a capturing group, it is assumed to check for a match.
  - This regex must have only one capturing group if you want to return data. A capturing group is the part of a pattern wrapped in parentheses.

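A small sketch of a complex query that uses a regex, here placed under an HTML parser's `size` key. The element and class name are hypothetical. The pattern has exactly one capturing group and no surrounding slashes, so the intent is that only the captured part is kept.

```yaml
# Hypothetical element: <td class="size">Size: 1.4 GB</td>
size:
  query: td.size
  attribute: text
  # One capturing group around the part to keep, no surrounding slashes
  regex: "Size: ([0-9.]+ [KMGT]B)"
```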
These are generic queries used for grabbing and storing API credentials. Most of these fields are not needed, but some APIs require way too much authentication for simple queries.

    url: # URL to grab the credential
    dynamic: # Should the credential be dynamically set by the user
    expiryLength: # How long the credential will last
    responseType: # Response type from the URL
    query: # Where in the response the credential will be located

- `url` (Optional String): If there is a separate URL to grab the credential from, enter it here.
- `dynamic` (Optional Boolean): Indicates if the credential should be entered by the user in the source's settings.
- `expiryLength` (Optional Integer): How long the credential lasts.
- `responseType` (Optional String): The type of response returned by the credential URL (ex. json).
- `query` (Optional String): The field in the URL response where the credential is located.

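To show how credential blocks might nest inside the API module, here is a sketch with one user-supplied credential and one fetched from an endpoint. The host, the token endpoint, and its response shape are hypothetical. The sketch also assumes `url` takes a full URL, and the `expiryLength` value is a placeholder because its unit is not stated here.

```yaml
api:
  apiUrl: https://api.example.org
  clientId:
    # Entered by the user in the source's settings
    dynamic: true
  clientSecret:
    # Hypothetical endpoint that returns {"token": "..."}
    url: https://api.example.org/token
    responseType: json
    # Key in the response that holds the credential
    query: token
    # Placeholder value; the unit is an assumption to verify
    expiryLength: 3600
```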