- `POST /login?username=:username&password=:password`: Log in.

      {
        "message": "Login successfully.",
        "access_token": "xxx"
      }

- `POST /logout`: Log out.
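As a rough illustration of the login call above, the sketch below builds the `POST /login` URL with percent-encoded credentials and reads `access_token` from the documented response body. The base URL and the helper names `build_login_url` and `extract_access_token` are assumptions made for this example, not part of the API, and no request is actually sent.

```python
import json
from urllib.parse import urlencode

# Assumption: placeholder host; replace with the real API address.
BASE_URL = "http://localhost:5000"


def build_login_url(username: str, password: str, base_url: str = BASE_URL) -> str:
    """Return the POST /login URL with percent-encoded query parameters."""
    query = urlencode({"username": username, "password": password})
    return f"{base_url}/login?{query}"


def extract_access_token(response_body: str) -> str:
    """Read access_token from the JSON body documented for POST /login."""
    return json.loads(response_body)["access_token"]


# The sample body mirrors the response shown in the endpoint description.
sample_body = '{"message": "Login successfully.", "access_token": "xxx"}'
login_url = build_login_url("alice", "p@ss word")
token = extract_access_token(sample_body)
```

Encoding the query string matters because a password may contain characters such as `@`, `&`, or spaces that would otherwise corrupt the URL.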
- `GET /health`: Check whether the scheduled scraping activity is executing as expected.

      {
        "discover": "okay",  // or "not okay"
        "update": "okay"     // or "not okay"
      }

- `GET /variables`: Get all variables.

      {
        "message": "returning all variables",
        "result": [
          { "key": "discover:pid", "value": "4396" },
          {variable info},
          ...
        ]
      }

- `GET /variables?key=:key`: Get variables by key.

      {
        "message": "returning variables with key :key",
        "result": {variable info}
      }
- `GET /stats`: Get stats for all days and all sites.

      {
        "message": "Returning all stats",
        "result": [
          {
            "date": "2020-03-01",
            "site_id": 1,
            "new_article_count": 134,
            "updated_article_count": 0
          },
          {stats},
          {stats},
          ...
        ]
      }

- `GET /stats?date=:date`: Get stats for all sites on a given day, e.g. `GET /stats?date=2020-04-05`.

      {
        "message": "Return stats of all sites on :date",
        "result": [
          {stats},
          {stats},
          ...
        ]
      }

- `GET /stats?site_id=:id`: Get stats for a single site. Only stats from the last 30 days are returned.

      {
        "message": "Returning stats of site :id from the last 30 days.",
        "result": [
          {stats},
          {stats},
          ...
        ]
      }
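To show how the `result` list from `GET /stats` might be consumed, here is a minimal sketch that sums `new_article_count` per site. The helper name `total_new_articles_by_site` is hypothetical, and only the first sample row comes from the response shown above; the other two rows are made up for illustration.

```python
from collections import defaultdict


def total_new_articles_by_site(stats):
    """Sum new_article_count for each site_id across all returned days."""
    totals = defaultdict(int)
    for row in stats:
        totals[row["site_id"]] += row["new_article_count"]
    return dict(totals)


# First row is the documented example; the remaining rows are illustrative only.
sample_response = {
    "message": "Returning all stats",
    "result": [
        {"date": "2020-03-01", "site_id": 1, "new_article_count": 134, "updated_article_count": 0},
        {"date": "2020-03-02", "site_id": 1, "new_article_count": 10, "updated_article_count": 3},
        {"date": "2020-03-01", "site_id": 2, "new_article_count": 7, "updated_article_count": 1},
    ],
}

totals = total_new_articles_by_site(sample_response["result"])
```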
- `GET /articles`: Get the 10 most recent articles.

      {
        "message": "Returning 10 most recent articles.",
        "result": [
          {
            "article_id": 3939173,
            "article_type": "Article",
            "first_snapshot_at": 1588107403,
            "last_snapshot_at": 1588107403,
            "next_snapshot_at": 1588193803,
            "redirect_to": null,
            "site_id": 105,
            "snapshot_count": 1,
            "url": "xxx",
            "url_hash": "43445058"
          },
          {article info},
          {article info},
          ...
        ]
      }

- `GET /articles?url=:url`: Get articles by URL. Only exact matches are returned. The query finds matches in both the requested URL and the redirected URL.

      {
        "message": "Returning articles that matches url :url",
        "result": [
          {article info},
          {article info},
          ...
        ]
      }

- `GET /articles/:id`: Get article info by `article_id`.

      {
        "message": "Returning article with id :id",
        "result": {article info}
      }
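The `*_snapshot_at` fields in the article info look like Unix timestamps in seconds. That is an assumption based on the sample values, not something the API description states. Under that assumption, the sketch below converts them to UTC datetimes and computes the gap between the last and the next snapshot; the helper name `to_utc` is hypothetical.

```python
from datetime import datetime, timezone

# Subset of the article info shown in the GET /articles sample response.
article = {
    "article_id": 3939173,
    "first_snapshot_at": 1588107403,
    "last_snapshot_at": 1588107403,
    "next_snapshot_at": 1588193803,
}


def to_utc(epoch_seconds: int) -> datetime:
    """Interpret a value as Unix epoch seconds (assumption) and return a UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


last_snapshot = to_utc(article["last_snapshot_at"])
next_snapshot = to_utc(article["next_snapshot_at"])

# Time until the next scheduled snapshot, measured from the last one.
snapshot_interval = next_snapshot - last_snapshot
```

For the sample article, the interval works out to exactly one day.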
- `GET /sites`: Get all sites.

      {
        "message": "Returning all sites",
        "result": [
          {
            "airtable_id": "xxx",
            "config": "{...}",
            "is_active": 1,
            "last_crawl_at": 1588123660,
            "name": "yyy",
            "site_id": 100,
            "site_info": "{...}",
            "type": "zzz",
            "url": "ooo"
          },
          {site info},
          {site info},
          ...
        ]
      }

- `GET /sites/active`: Get all active sites.

      {
        "message": "Returning active sites",
        "result": [
          {site info},
          {site info},
          ...
        ]
      }

- `GET /sites/:id/article_count`: Get the article count of a site.

      {
        "message": "Returning article count of site :id",
        "result": {
          "site": {site :id info},
          "article_count": 100
        }
      }

- `GET /sites/:id/latest_article`: Get the most recently added article of a site.

      {
        "message": "Returning latest article from site :id",
        "result": {
          "latest_article": {article info},
          "site": {site :id info}
        }
      }
- `GET /publications?q=:search_string`: Get publications where the title or text contains the search string.

      {
        "message": "Return publications that matches :search_string",
        "result": [
          {publication info},
          {publication info},
          ...
        ]
      }
- `GET /playground/random`: Get a random publication title.

      {
        "publication_id": "xxx",
        "text": "yyy"
      }

- `POST /playground/add_record`: Add a record.

      {
        "message": "Add new record successfully.",
        "record_id": 123
      }
- Log in first: open `GET /login`, fill out the form, and submit.
- Log in first: run `python ns-api.py login`. If the login succeeds, the credential is saved in `secrets.json`.
- To get site stats:

      $ python ns-api.py stats

  Optional arguments:
  - `--site-id`: view stats of a particular site.
  - `--date`: view stats of a particular date, e.g. `2020-04-03`.
  - `-o` / `--output`: filename to save the JSON output.

- To get variables:

      $ python ns-api.py variables

  Optional arguments:
  - `--key`: variable key.
  - `-o` / `--output`: filename to save the JSON output.
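A file saved with `-o` / `--output` can be read back with the standard `json` module. The sketch below assumes the saved file mirrors the API response shape (an object with a `result` field), which the CLI description does not confirm, so it falls back to returning the whole document otherwise. The helper name `load_result` and the file name `variables.json` are hypothetical, and a temporary file stands in for real CLI output.

```python
import json
import tempfile
from pathlib import Path


def load_result(path):
    """Load a saved JSON file and return its "result" field when present."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict) and "result" in data:
        return data["result"]
    # Assumption did not hold: hand back the parsed document unchanged.
    return data


# Stand-in for a file written by `python ns-api.py variables -o variables.json`.
sample_output = {
    "message": "returning all variables",
    "result": [{"key": "discover:pid", "value": "4396"}],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    saved_file = Path(tmp_dir) / "variables.json"
    saved_file.write_text(json.dumps(sample_output), encoding="utf-8")
    variables = load_result(saved_file)
```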