Dydyshko Andrey #35
Open
InvokerAndrey wants to merge 28 commits into introduction-to-python-bsuir-2019:master from InvokerAndrey:final_task
Changes from 24 commits
Commits
28 commits
64066b4 Init commit (InvokerAndrey)
3d5dc1d Added argumet parser (InvokerAndrey)
b291b7f Added rss reader class and main function (InvokerAndrey)
2872722 Implemented --json argument processing (InvokerAndrey)
247aae4 Refactored invoke mothods (InvokerAndrey)
4abd526 Implemented human-readable format (InvokerAndrey)
f14e675 Added --version argument (InvokerAndrey)
f1f134d Implemented --verbose argument (InvokerAndrey)
5cb2846 Delete rss_reader.py (InvokerAndrey)
83343d7 Create README.md (InvokerAndrey)
d7eec79 Create requirements.txt (InvokerAndrey)
934e2da Merge branch 'final_task' of https://github.com/BntuHater/PythonHomew… (InvokerAndrey)
5d1778d Implemented [Iteration 2] Distribution (InvokerAndrey)
71ab39b Implemented [Iteration 3] News caching (InvokerAndrey)
0ae55a5 Refactored [Iteration 3] News caching (InvokerAndrey)
10ab7b2 Now cached news dasplay specifically from the transmitted URL (InvokerAndrey)
c5b01e9 Implented --to-pdf argument (InvokerAndrey)
e4cdb8b Implemented --to-pdf argument (InvokerAndrey)
f4f457d Refactored code (InvokerAndrey)
304ddb9 fixed rss-reader --help (InvokerAndrey)
177135c Implemented --to-html argument (InvokerAndrey)
656376a added --to-html argument (InvokerAndrey)
47245d4 Included fonts for pdf into setup.py (InvokerAndrey)
bfe71bc implemented couple exceptions (InvokerAndrey)
589dc8d RSSException, --colorize, test_RSSReader (InvokerAndrey)
12f9051 refactored args (InvokerAndrey)
b2fce55 version 0.5.0 (InvokerAndrey)
e7e4e7c exception and requests (InvokerAndrey)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -102,3 +102,6 @@ venv.bak/ | |
|
|
||
| # mypy | ||
| .mypy_cache/ | ||
|
|
||
| # IDE | ||
| .idea | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| Copyright (c) 2018 The Python Packaging Authority | ||
|
|
||
| Permission is hereby granted, free of charge, to any person obtaining a copy | ||
| of this software and associated documentation files (the "Software"), to deal | ||
| in the Software without restriction, including without limitation the rights | ||
| to use, copy, modify, merge, publish, distribute, sublicense, and/or sell | ||
| copies of the Software, and to permit persons to whom the Software is | ||
| furnished to do so, subject to the following conditions: | ||
|
|
||
| The above copyright notice and this permission notice shall be included in all | ||
| copies or substantial portions of the Software. | ||
|
|
||
| THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR | ||
| IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, | ||
| FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE | ||
| AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER | ||
| LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, | ||
| OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE | ||
| SOFTWARE. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,49 @@ | ||
| # PythonHomework | ||
| [Introduction to Python] Homework Repository | ||
|
|
||
| # How to use | ||
| * pip install -r requirements.txt | ||
| * python rss-reader.py "https://news.yahoo.com/rss/" --limit 2 --json | ||
| * --date prints cached news that was previously parsed from the given URL. | ||
| The reader creates a cache folder and saves news as JSON files; | ||
| the file name is the publication date (for example, 20191125.json). | ||
| * For the --to-pdf argument, specify the path to the folder | ||
| where the 'news.pdf' or 'cached_news.pdf' file will be saved. | ||
| The file is overwritten the next time the program runs, | ||
| so copy it elsewhere if you need to keep it. The same applies to the --to-html argument. | ||
| Also, --to-html uses images hosted on the source websites, so they won't be displayed without | ||
| an internet connection. | ||
| * I bundle fonts for .pdf files to avoid encoding issues; | ||
| hopefully they are installed correctly by 'pip install .' | ||
| * P.S. Man, guys, I'm in my 4th year at BSUIR, job placement is already in full swing, and I badly need a job | ||
|
|
||
|
|
||
| # Parameters | ||
| * --help (show this help message and exit) | ||
| * --limit LIMIT (limit news topics if this parameter provided) | ||
| * --json (prints result as JSON in stdout) | ||
| * --verbose (outputs verbose status messages) | ||
| * --version (print version info) | ||
| * --date (It should take a date in YYYYmmdd format. For example: | ||
| --date 20191020. The news from the specified day will be printed out. | ||
| If no news is found, an error will be returned.) | ||
| * --to-pdf TO_PDF (It should take the path of the directory where new PDF file will be saved) | ||
| * --to-html TO_HTML (It should take the path of the directory where new HTML file will be saved) | ||
|
|
||
| # JSON structure | ||
| feed = { | ||
| 'Title': 'feed title', | ||
| 'Published': 'date', | ||
| 'Summary': 'news description', | ||
| 'Link': 'original link to news', | ||
| 'Url': 'url of rss feed', | ||
| 'Image': 'original link to the image' | ||
| } | ||
|
|
||
| # Progress | ||
| - [x] [Iteration 1] One-shot command-line RSS reader. | ||
| - [x] [Iteration 2] Distribution | ||
| - [x] [Iteration 3] News caching | ||
| - [x] [Iteration 4] Format converter | ||
| - [ ] * [Iteration 5] Output colorization | ||
| - [ ] * [Iteration 6] Web-server | ||
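The README's "JSON structure" and cache file naming can be illustrated with a small sketch. It is not the PR's code: the helper name `cache_file_name` is hypothetical, and it parses the date with the standard library's `email.utils.parsedate_to_datetime` instead of the `dateutil` parser that the PR uses, so it only handles RFC 822 style dates such as those commonly found in RSS feeds.

```python
import json
from email.utils import parsedate_to_datetime


def cache_file_name(published: str) -> str:
    """Map an RFC 822 date string to a cache file name such as '20191125.json'.

    Hypothetical helper; the PR itself uses dateutil's fuzzy parser.
    """
    return parsedate_to_datetime(published).strftime('%Y%m%d') + '.json'


# A record shaped like the README's "JSON structure" section (placeholder values).
record = {
    'Title': 'feed title',
    'Published': 'Mon, 25 Nov 2019 10:00:00 +0000',
    'Summary': 'news description',
    'Link': 'original link to news',
    'Url': 'url of rss feed',
    'Image': 'original link to the image',
}

# Serialize the record the way a cache file entry might look.
serialized = json.dumps(record, indent=2, ensure_ascii=False)
file_name = cache_file_name(record['Published'])
print(file_name)
print(serialized)
```

Note that every key/value pair except the last needs a trailing comma, which is easy to miss when writing the structure by hand.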
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,142 @@ | ||
| """ | ||
| Contains the RSSReader class, which receives command-line arguments, | ||
| parses an RSS feed from a URL, and prints it to stdout | ||
| in different formats | ||
| """ | ||
|
|
||
| import os | ||
| import json | ||
|
|
||
| import feedparser | ||
| from bs4 import BeautifulSoup | ||
| import dateutil.parser as dateparser | ||
|
|
||
|
|
||
| class RSSReader: | ||
| """ Reads news from RSS url and prints them """ | ||
|
|
||
| def __init__(self, args, logger): | ||
|
Collaborator
I recommend replacing the argument
|
||
| self.args = args.get_args() | ||
| self.logger = logger | ||
|
|
||
| def get_feed(self): | ||
| """ Returns parsed feed and caches it""" | ||
|
|
||
| news_feed = feedparser.parse(self.args.url) | ||
| for entry in news_feed.entries[:self.args.limit]: | ||
| self.cache_news_json(entry) | ||
| self.logger.info('News has been cached') | ||
| return news_feed.entries[:self.args.limit] | ||
|
|
||
| def print_feed(self, entries): | ||
| """ Prints feed in stdout """ | ||
|
|
||
| self.logger.info('Printing feed') | ||
|
|
||
| for entry in entries: | ||
| print('========================================================') | ||
| print(f'Title: {entry.title}') | ||
| print(f'Published: {entry.published}', end='\n\n') | ||
| print(f'Summary: {BeautifulSoup(entry.summary, "html.parser").text}', end='\n\n') | ||
| print(f'Image: {self.get_img_url(entry)}') | ||
| print(f'Link: {entry.link}') | ||
| print('========================================================') | ||
|
|
||
| def get_img_url(self, entry): | ||
| """ Parses image url from <description> in rss feed """ | ||
| soup = BeautifulSoup(entry.summary, 'html.parser') | ||
| img = soup.find('img') | ||
| if img: | ||
| img_url = img['src'] | ||
| return img_url | ||
| else: | ||
| return None | ||
|
|
||
| def print_feed_json(self, entries): | ||
| """ Prints feed in stdout in JSON format """ | ||
|
|
||
| self.logger.info('Printing feed in JSON format') | ||
|
|
||
| for entry in entries: | ||
| feed = self.to_dict(entry) | ||
| print(json.dumps(feed, indent=2, ensure_ascii=False), ',', sep='') | ||
|
|
||
| def to_dict(self, entry): | ||
| """ Converts entry to dict() format """ | ||
|
|
||
| feed = dict() | ||
| feed['Title'] = entry.title | ||
| feed['Published'] = entry.published | ||
| feed['Summary'] = BeautifulSoup(entry.summary, "html.parser").text | ||
| feed['Link'] = entry.link | ||
| feed['Url'] = self.args.url | ||
| feed['Image'] = self.get_img_url(entry) | ||
| return feed | ||
|
|
||
| def cache_news_json(self, entry): | ||
| """ Saves all printed news in JSON format (path = 'cache/{publication_date}.json')""" | ||
|
|
||
| date = dateparser.parse(entry.published, fuzzy=True).strftime('%Y%m%d') | ||
| directory_path = 'cache' + os.path.sep | ||
| if not os.path.exists(directory_path): | ||
| self.logger.info('Creating directory cache') | ||
| os.mkdir(directory_path) | ||
|
|
||
| file_path = directory_path + date + '.json' | ||
|
|
||
| feed = self.to_dict(entry) | ||
| news = list() | ||
| try: | ||
| with open(file_path, encoding='utf-8') as rf: | ||
| news = json.load(rf) | ||
| if feed in news: | ||
| # already cached | ||
| return | ||
| except FileNotFoundError: | ||
| self.logger.info('Creating new .json file') | ||
| except json.JSONDecodeError: | ||
| self.logger.info('Empty JSON file') | ||
|
|
||
| with open(file_path, 'w', encoding='utf-8') as wf: | ||
| news.append(feed) | ||
| json.dump(news, wf, indent=2) | ||
|
|
||
| def get_cached_json_news(self): | ||
| """ Returns the list of cached news with date from arguments """ | ||
|
|
||
| file_path = 'cache' + os.path.sep + self.args.date + '.json' | ||
| cached_news = list() | ||
| try: | ||
| with open(file_path, encoding='utf-8') as rf: | ||
| news = json.load(rf) | ||
| for new in news: | ||
| if new['Url'] == self.args.url: | ||
| cached_news.append(new) | ||
| if not cached_news: | ||
| # News with such url have not been found | ||
| raise FileNotFoundError | ||
| return cached_news[:self.args.limit] | ||
| except FileNotFoundError: | ||
| print('There are no cached news with such date by this url') | ||
| except json.JSONDecodeError: | ||
| # Empty json file | ||
| # Or no news by needed url | ||
| print('There are no cached news with such date by this url') | ||
| return False | ||
|
|
||
| def print_cached_feed(self, cached_feed): | ||
| """ Prints saved news in stdout """ | ||
|
|
||
| self.logger.info('Printing cached feed') | ||
| for new in cached_feed: | ||
| print('---------------------------------------------------------') | ||
| for key, value in new.items(): | ||
| print(f'{key}: {value}') | ||
| print('---------------------------------------------------------') | ||
|
|
||
| def print_cached_feed_json(self, cached_feed): | ||
| """ Prints saved news in stdout in JSON format """ | ||
|
|
||
| self.logger.info('Printing cached feed in JSON format') | ||
| for new in cached_feed: | ||
| print(json.dumps(new, indent=2), ',', sep='') | ||
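One possible review note on `print_feed_json` and `print_cached_feed_json`: printing each entry followed by a comma produces output that is not a single valid JSON document, so it cannot be piped into a JSON parser. A minimal sketch of an alternative, assuming the entries have already been converted to dicts (as `to_dict` does), is to serialize the whole list at once. The function name `feed_to_json` and the sample data are hypothetical.

```python
import json


def feed_to_json(items):
    """Serialize a list of news dicts as one JSON array (a single valid document)."""
    return json.dumps(items, indent=2, ensure_ascii=False)


# Placeholder entries standing in for the dicts produced by to_dict().
items = [
    {'Title': 'first', 'Link': 'https://example.com/1'},
    {'Title': 'second', 'Link': 'https://example.com/2'},
]

output = feed_to_json(items)
print(output)
```

With this shape, the output of `--json` can be loaded back with `json.loads` without any post-processing.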
Empty file.
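The caching flow in `cache_news_json` and `get_cached_json_news` (create the directory, load the day's file, skip duplicates, filter by feed URL) could be sketched as below. This is an illustration under assumptions, not the PR's implementation: the function names `cache_entry` and `load_cached` are hypothetical, the cache directory is passed in explicitly, and an empty list is returned instead of printing a message when nothing is found.

```python
import json
import os
import tempfile


def cache_entry(feed, date, cache_dir):
    """Append feed to <cache_dir>/<date>.json unless it is already stored.

    Returns True if the entry was written, False if it was a duplicate.
    """
    os.makedirs(cache_dir, exist_ok=True)  # no separate existence check needed
    file_path = os.path.join(cache_dir, date + '.json')
    try:
        with open(file_path, encoding='utf-8') as rf:
            news = json.load(rf)
    except (FileNotFoundError, json.JSONDecodeError):
        news = []  # missing or empty file: start a new list
    if feed in news:
        return False
    news.append(feed)
    with open(file_path, 'w', encoding='utf-8') as wf:
        json.dump(news, wf, indent=2, ensure_ascii=False)
    return True


def load_cached(date, url, cache_dir, limit=None):
    """Return cached news for the given date and feed URL (empty list if none)."""
    file_path = os.path.join(cache_dir, date + '.json')
    try:
        with open(file_path, encoding='utf-8') as rf:
            news = json.load(rf)
    except (FileNotFoundError, json.JSONDecodeError):
        return []
    return [item for item in news if item.get('Url') == url][:limit]


with tempfile.TemporaryDirectory() as tmp:
    entry = {'Title': 'feed title', 'Url': 'https://news.yahoo.com/rss/'}
    first = cache_entry(entry, '20191125', tmp)
    second = cache_entry(entry, '20191125', tmp)  # same entry again: skipped
    cached = load_cached('20191125', entry['Url'], tmp)
    other = load_cached('20191125', 'https://example.com/rss', tmp)
```

Returning an empty list keeps the "nothing cached" case out of the exception path, so the caller can decide how to report it.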
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,6 @@ | ||
| """ Package entry point """ | ||
|
|
||
| from app.rss_reader import main | ||
|
|
||
| if __name__ == '__main__': | ||
| main() |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,72 @@ | ||
| """ | ||
| Contains the ArgParser class, which parses arguments from the command line | ||
| """ | ||
|
|
||
| import argparse | ||
|
|
||
|
|
||
| __version__ = '0.4.0' | ||
|
|
||
|
|
||
| class ArgParser: | ||
| """ Reads arguments """ | ||
|
|
||
| def __init__(self): | ||
| self.args = self.parse_args() | ||
|
|
||
| def parse_args(self): | ||
| """ Reads arguments from the cmd and returns them """ | ||
|
|
||
| argparser = argparse.ArgumentParser(description='One-shot command-line RSS reader', prog='rss-reader') | ||
| argparser.add_argument( | ||
| 'url', | ||
| type=str, | ||
| help='Input RSS url containing news' | ||
| ) | ||
| argparser.add_argument( | ||
| '--limit', | ||
| type=int, | ||
| default=None, | ||
| help='Sets a limit for news output (default - no limit)' | ||
| ) | ||
| argparser.add_argument( | ||
| '--json', | ||
| action='store_true', | ||
| help='Prints feed in JSON format in stdout' | ||
| ) | ||
| argparser.add_argument( | ||
| '--version', | ||
| action='version', | ||
| version=f'%(prog)s version {__version__}', | ||
| default=None, | ||
| help='Prints version of program' | ||
| ) | ||
| argparser.add_argument( | ||
| '--verbose', | ||
| action='store_true', | ||
| help='Prints all logs in stdout' | ||
| ) | ||
| argparser.add_argument( | ||
| '--date', | ||
| type=str, | ||
| help='It should take a date in YYYYmmdd format. For example: --date 20191020. ' | ||
| 'The news from the specified day will be printed out. If no news is found, an error will be returned.' | ||
| ) | ||
| argparser.add_argument( | ||
| '--to-pdf', | ||
| dest='to_pdf', | ||
| type=str, | ||
| help='It should take the path of the directory where new PDF file will be saved' | ||
| ) | ||
| argparser.add_argument( | ||
| '--to-html', | ||
| dest='to_html', | ||
| type=str, | ||
| help='It should take the path of the directory where new HTML file will be saved' | ||
| ) | ||
| args = argparser.parse_args() | ||
| return args | ||
|
|
||
| def get_args(self): | ||
| """ Returns arguments """ | ||
| return self.args |
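Since `--date` is accepted as a plain string, a malformed value only surfaces later as a missing cache file. A minimal sketch of validating it at parse time with an argparse `type` callable follows. The names `yyyymmdd` and `build_parser` are hypothetical and only a subset of the PR's arguments is reproduced.

```python
import argparse
from datetime import datetime


def yyyymmdd(value):
    """Validate that value is an 8-digit date in YYYYmmdd format and return it unchanged."""
    if len(value) != 8 or not value.isdigit():
        raise argparse.ArgumentTypeError(f'expected a date in YYYYmmdd format, got {value!r}')
    try:
        datetime.strptime(value, '%Y%m%d')
    except ValueError:
        raise argparse.ArgumentTypeError(f'not a valid calendar date: {value!r}') from None
    return value


def build_parser():
    """Build a parser with a subset of the rss-reader arguments."""
    parser = argparse.ArgumentParser(prog='rss-reader', description='One-shot command-line RSS reader')
    parser.add_argument('url', type=str, help='Input RSS url containing news')
    parser.add_argument('--limit', type=int, default=None,
                        help='Sets a limit for news output (default - no limit)')
    parser.add_argument('--date', type=yyyymmdd,
                        help='Date in YYYYmmdd format, for example: --date 20191020. '
                             'Cached news from the specified day will be printed out.')
    return parser


args = build_parser().parse_args(['https://news.yahoo.com/rss/', '--date', '20191020', '--limit', '2'])
print(args.date, args.limit)
```

An invalid value such as `--date 2019-10-20` makes argparse print a usage error and exit, so the rest of the program never sees a bad date.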
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Better to delete this line :)