snowy is a simple Python script that generates an HTML encyclopedia from a wordlist using the Wikipedia/MediaWiki API.
Follow these steps to set up and run snowy on your machine.
Open your terminal (or Command Prompt) and run:

    git clone https://github.com/saifnama/snowy.git
    cd snowy

Install the required libraries using the provided requirements file:

    pip install -r requirements.txt

Create a text file (e.g., words.txt) with one word per line, then run the script:

    python snowy.py -i words.txt -o my_encyclopedia.html

- Async Fetching: Snowy uses an asynchronous architecture (built on asyncio) to fetch data for multiple words in parallel. This makes it about 3-4x faster than fetching words one at a time.
- API Safety: To respect Wikipedia's servers, Snowy uses a Concurrency Limit (max 3 simultaneous connections) and "Smart Retries" (exponential backoff). This keeps throughput high without overwhelming the API or risking a block.
- Summaries: Fetches the first paragraph from Wikipedia for a concise overview.
- Auto-Image Discovery: Finds a high-quality image for each entry, along with its caption.
- Linked Data: Includes Wikidata IDs and direct links to Wikipedia for further reading.
- Duplicate Merging: Handles duplicate words automatically (e.g., "India" and "india" are merged).
- Safe & Fast: Built-in rate limiting (0.5 s) keeps requests within Wikipedia's API guidelines.
- Requirements: Python 3.7+
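
The Async Fetching, API Safety, and Duplicate Merging features listed above can be sketched roughly as follows. This is a minimal illustration and not snowy's actual code: the names `merge_duplicates`, `fetch_with_retries`, `fetch_all`, and `fake_fetch` are hypothetical, the retry count and starting delay are assumptions, and `fake_fetch` is an offline stand-in for the real Wikipedia request. Only the limit of 3 simultaneous connections comes from the description above.

```python
import asyncio
import random

MAX_CONCURRENCY = 3  # "max 3 simultaneous connections", as described above
MAX_RETRIES = 4      # assumption: the real retry count is not documented here
BASE_DELAY = 0.5     # assumption: starting backoff delay, in seconds


def merge_duplicates(words):
    """Drop case-insensitive duplicates, keeping the first spelling seen."""
    seen = set()
    unique = []
    for word in words:
        cleaned = word.strip()
        key = cleaned.casefold()
        if key and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return unique


async def fetch_with_retries(word, fetch, semaphore, base_delay=BASE_DELAY):
    """Call fetch(word) under the semaphore, backing off exponentially on failure."""
    for attempt in range(MAX_RETRIES):
        try:
            # The semaphore is held only while the request is in flight,
            # not while sleeping between retries.
            async with semaphore:
                return await fetch(word)
        except ConnectionError:
            if attempt == MAX_RETRIES - 1:
                raise
            # Delay doubles on each attempt, plus a little random jitter.
            jitter = random.uniform(0, base_delay / 10)
            await asyncio.sleep(base_delay * 2 ** attempt + jitter)


async def fetch_all(words, fetch, base_delay=BASE_DELAY):
    """Fetch every unique word concurrently, at most MAX_CONCURRENCY at a time."""
    semaphore = asyncio.Semaphore(MAX_CONCURRENCY)
    unique = merge_duplicates(words)
    results = await asyncio.gather(
        *(fetch_with_retries(w, fetch, semaphore, base_delay) for w in unique)
    )
    return dict(zip(unique, results))


async def fake_fetch(word):
    """Hypothetical stand-in for the real API call, so the sketch runs offline."""
    await asyncio.sleep(0.01)
    return f"summary of {word}"


if __name__ == "__main__":
    pages = asyncio.run(fetch_all(["India", "india", "Python"], fake_fetch))
    print(list(pages))  # ['India', 'Python']
```

Releasing the semaphore before sleeping means a word that is waiting to retry does not occupy one of the three connection slots.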
This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.
