Releases: buptanswer/pyimport2pkg

🎉 PyImport2Pkg v1.0.0 - First Stable Release

06 Dec 04:07

Release Date: December 6, 2025

We're excited to announce the first stable release of PyImport2Pkg! This release marks the transition from alpha to production-ready status, with comprehensive internationalization, API stability improvements, and bug fixes.


🌟 What is PyImport2Pkg?

PyImport2Pkg solves a common problem in the AI-assisted coding era:

Given Python code with import statements, quickly identify which pip packages need to be installed.

import cv2           # → pip install opencv-python
from PIL import Image  # → pip install Pillow
import sklearn       # → pip install scikit-learn

Perfect for when AI generates code with lots of imports and you need to set up dependencies!
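
To make the idea concrete, here is a minimal sketch of import-to-package mapping using only the standard library. It is not PyImport2Pkg's implementation: the `IMPORT_TO_PACKAGE` table and both function names are hypothetical, and unknown modules simply fall back to their own name.

```python
import ast

# Illustrative mapping only (hypothetical); a real tool needs a far larger table.
IMPORT_TO_PACKAGE = {
    "cv2": "opencv-python",
    "PIL": "Pillow",
    "sklearn": "scikit-learn",
}


def top_level_imports(source: str) -> set[str]:
    """Return the top-level module names imported by `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports (level > 0); they refer to local code.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


def guess_packages(source: str) -> dict[str, str]:
    """Map each imported module to a pip package name (same name if unknown)."""
    return {
        name: IMPORT_TO_PACKAGE.get(name, name)
        for name in sorted(top_level_imports(source))
    }


example = "import cv2\nfrom PIL import Image\nimport sklearn.linear_model\n"
print(guess_packages(example))
```

The sketch does not filter out standard-library or local modules, which a real tool would need to do before suggesting anything to install.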


✨ Highlights of v1.0.0

🌍 Full Internationalization

All CLI output has been translated from Chinese to English for better international accessibility.

Before (v0.3.0):

构建状态:
  状态: completed
  总包数: 5000

After (v1.0.0):

Build Status:
  Status: completed
  Total packages: 5000

📦 Stable Python API

Core classes are now exported from the package root for easier imports:

# Now you can do:
from pyimport2pkg import Scanner, Parser, Filter, Mapper, Exporter

# Instead of:
from pyimport2pkg.scanner import Scanner
from pyimport2pkg.parser import Parser
# ...

🔧 Bug Fixes

  1. Dynamic Version in JSON Export - Fixed hardcoded "0.2.0" in JSON export metadata
  2. Complete JSON Export - Unresolved imports now properly included in JSON output
  3. Documentation Accuracy - Fixed all Python API examples in README
  4. CLI Documentation - Corrected parameter names (--python-version, requirements format)
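
The first two fixes can be illustrated with a short sketch. It assumes the version is looked up from installed package metadata at runtime; the function names and JSON field names (`meta`, `required`, `unresolved`) are hypothetical and may not match the actual export format.

```python
import json
from importlib.metadata import PackageNotFoundError, version


def export_metadata(dist_name: str = "pyimport2pkg") -> dict:
    """Build export metadata with the version looked up at runtime."""
    try:
        tool_version = version(dist_name)
    except PackageNotFoundError:
        # Distribution not installed (e.g. running from a source checkout).
        tool_version = "unknown"
    return {"tool": dist_name, "version": tool_version}


def export_json(required: list[str], unresolved: list[str]) -> str:
    """Serialize results, keeping unresolved imports in the output."""
    payload = {
        "meta": export_metadata(),
        "required": sorted(required),
        "unresolved": sorted(unresolved),
    }
    return json.dumps(payload, indent=2)


print(export_json(["Pillow", "opencv-python"], ["mystery_module"]))
```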

📚 Production Ready

  • Development Status: Alpha → Production/Stable
  • All 304 tests passing
  • Comprehensive documentation
  • English + Chinese README

📥 Installation

pip install pyimport2pkg

Or upgrade from earlier versions:

pip install --upgrade pyimport2pkg

🚀 Quick Start

# Analyze a project
pyimport2pkg analyze /path/to/project

# Generate requirements.txt
pyimport2pkg analyze . -o requirements.txt

# Query a specific module
pyimport2pkg query cv2

📋 What's Changed

Code Changes

  • Update version to 1.0.0
  • Internationalize all CLI output (Chinese → English)
  • Export core classes from package root (__init__.py)
  • Fix dynamic version in JSON export (was hardcoded 0.2.0)
  • Fix missing unresolved parameter in JSON export
  • Fix mapper.py documentation and comment numbering
  • Remove unused Path import from mapper.py
  • Update development status to Production/Stable

Documentation

  • Fix README Python API examples (correct method names)
  • Fix README CLI documentation (--python-version, requirements format)
  • Update both English and Chinese READMEs
  • Add comprehensive v1.0.0 changelog
  • Create v1.0.0 user guide

Testing

  • All 304 tests passing
  • Updated test assertions to use dynamic version

📖 Documentation


🔄 Upgrade Guide

From v0.3.0

Simply upgrade via pip - fully backward compatible:

pip install --upgrade pyimport2pkg

No configuration or code changes required for CLI usage.

For Python API Users

If you import classes from submodules, you can now use the cleaner root imports:

# Old (still works)
from pyimport2pkg.scanner import Scanner

# New (recommended)
from pyimport2pkg import Scanner

🙏 Acknowledgments

Thank you to everyone who tested the v0.x releases and provided feedback. Your input helped shape this stable release!


🔗 Links


📦 Release Assets

  • Source code (zip)
  • Source code (tar.gz)
  • PyPI packages: pyimport2pkg-1.0.0.tar.gz and pyimport2pkg-1.0.0-py3-none-any.whl

Full Changelog: v0.3.0...v1.0.0

Made with ❤️ for developers using AI code generators

🤖 This release was prepared with assistance from Claude Code

PyImport2Pkg v0.3.0

06 Dec 01:08

Release Date: December 6, 2025

🎉 Overview

v0.3.0 represents a major performance and reliability upgrade focused on making large-scale database builds practical and user-friendly. With intelligent incremental updates, true parallel processing, batch database writes, and smart error recovery, building a comprehensive mapping database is now faster and more robust than ever.

Key Highlights:

  • 🚀 10-50x faster database writes (batch processing)
  • 📈 Default concurrency of 50 parallel workers (up from 20)
  • 💾 Seamless interrupt & resume capability
  • 🧠 Intelligent incremental updates (smart delta logic)
  • 🛡️ Rate limit detection & graceful handling
  • ⚙️ Memory-optimized chunked processing for 15000+ packages

✨ New Features

1. Intelligent Incremental Updates (Default Behavior)

No need for manual tracking. Extend your database with a single command:

# Database has 500 packages, want to expand to 1000
pyimport2pkg build-db --max-packages 1000

# Output:
# Database contains 500 packages
# Target: top 1000 packages
# Will process 500 new packages only

Why this matters:

  • Smart package matching (by name, not order)
  • Safe expansion workflow
  • Automatic recovery of previously failed packages
  • No duplication, no data loss

Use Cases:

  • Incrementally build larger databases
  • Regular database maintenance and updates
  • Add newly popular packages
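
The delta logic described above might look roughly like the following sketch. The function name and arguments are hypothetical; the only behavior taken from these notes is that packages are matched by name rather than by position.

```python
def plan_incremental_build(
    ranked_packages: list[str],
    in_database: set[str],
    max_packages: int,
) -> list[str]:
    """Return the packages that still need processing, in ranking order."""
    target = ranked_packages[:max_packages]
    # Match by name so a reshuffled ranking does not cause reprocessing.
    # Packages that failed earlier never reached the database, so they
    # are picked up again automatically.
    return [name for name in target if name not in in_database]


ranking = [f"pkg{i}" for i in range(1000)]
database = set(ranking[:500])
todo = plan_incremental_build(ranking, database, max_packages=1000)
print(f"Will process {len(todo)} new packages only")
```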

2. Build Progress Tracking

Persistent build state with BuildProgress class:

  • Track processed and failed packages
  • Automatic progress snapshots
  • Batch saves (every 100 packages) for performance
  • Safe interruption and recovery

from pyimport2pkg.database import get_build_progress

progress = get_build_progress()
status = progress.get_status()
# {'status': 'in_progress', 'total': 5000, 'processed': 2500, 'failed': 10, ...}

3. Interrupt & Resume (--resume)

Resume from exactly where the build stopped:

# Start building
pyimport2pkg build-db --max-packages 14100

# Network issue? Interrupted?
# Later, simply resume (remembers your target):
pyimport2pkg build-db --resume

Smart Recovery:

  • Automatically remembers --max-packages value
  • Works with --retry-failed too
  • No reprocessing of completed packages
  • Data always safe

4. Failed Package Retry (--retry-failed)

Intelligently retry only failed packages from previous runs:

# First attempt: 5000 packages, 860 failed, 4140 succeeded
pyimport2pkg build-db --max-packages 5000

# Retry only the 860 that failed
pyimport2pkg build-db --retry-failed
# 834 succeed now, only 26 remain failed

# Try again
pyimport2pkg build-db --retry-failed
# Only 26 packages processed this time

Smart Tracking:

  • Successful packages auto-removed from failed list
  • Reduces retry time with each iteration
  • Automatic --max-packages preservation
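
As a rough illustration of why each retry pass gets shorter, consider this sketch. `retry_failed` and the `process` callback are hypothetical names, not part of the tool's API.

```python
from typing import Callable


def retry_failed(failed: set[str], process: Callable[[str], bool]) -> set[str]:
    """Re-run `process` on failed packages; return those that failed again."""
    still_failed = set()
    for name in sorted(failed):
        if not process(name):
            still_failed.add(name)
    return still_failed


# Simulated run: only names ending in "0" keep failing.
failed = {f"pkg{i}" for i in range(100)}
failed = retry_failed(failed, lambda name: not name.endswith("0"))
print(f"{len(failed)} packages remain failed")
```

Because successes are dropped from the set, the next pass only touches what is left.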

5. Force Rebuild (--rebuild)

Clean slate database construction:

# Delete old mapping.db and start fresh
pyimport2pkg build-db --rebuild --max-packages 5000

6. Memory-Optimized Chunked Processing

Handle 15000+ packages without memory bloat:

  • Processes 500 packages per chunk
  • Memory freed after each chunk completes
  • Interrupt-safe: no chunk loss on resume
  • Perfect for large-scale builds

# Build database with 20000 packages (memory efficient)
pyimport2pkg build-db --max-packages 20000 --concurrency 50
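
For readers curious how chunking keeps memory flat, here is a minimal sketch. The helper names are hypothetical and the per-package work is a stand-in; only the chunk size of 500 comes from these notes.

```python
from itertools import islice
from typing import Iterable, Iterator


def chunked(items: Iterable[str], size: int = 500) -> Iterator[list[str]]:
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while chunk := list(islice(iterator, size)):
        yield chunk


def build_in_chunks(package_names: Iterable[str]) -> int:
    """Process packages chunk by chunk; return how many were handled."""
    handled = 0
    for chunk in chunked(package_names):
        results = [name.lower() for name in chunk]  # stand-in for real work
        handled += len(results)
        # `results` is rebound on the next iteration, so the previous
        # chunk's data becomes eligible for garbage collection.
    return handled


print(build_in_chunks(f"Pkg{i}" for i in range(1200)))
```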

7. Rate Limit Detection & Auto-Recovery

Automatic detection and graceful handling of PyPI rate limits:

  • Detects 20 consecutive failures as potential rate limiting
  • Auto-pauses 30 seconds, then retries
  • Up to 5 pause attempts before stopping
  • Progress saved during pauses (safe to interrupt)

Detected 20 consecutive failures - possible rate limiting.
Pausing 30 seconds before retry (pause 1/5)...
Resuming...
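
The detection rule can be sketched as a consecutive-failure counter. This is an assumption about the approach, not the tool's code: the function name and parameters are hypothetical, and for simplicity the sketch records failed packages instead of retrying them right after the pause. The `sleep` argument is injectable so the logic can be exercised without waiting.

```python
import time
from typing import Callable, Iterable


def run_with_rate_limit_guard(
    names: Iterable[str],
    fetch: Callable[[str], bool],
    failure_threshold: int = 20,
    pause_seconds: float = 30.0,
    max_pauses: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[list[str], list[str], bool]:
    """Return (succeeded, failed, stopped_early)."""
    succeeded: list[str] = []
    failed: list[str] = []
    consecutive_failures = 0
    pauses = 0
    for name in names:
        if fetch(name):
            succeeded.append(name)
            consecutive_failures = 0  # any success resets the streak
            continue
        failed.append(name)
        consecutive_failures += 1
        if consecutive_failures >= failure_threshold:
            if pauses >= max_pauses:
                return succeeded, failed, True  # out of pause attempts
            pauses += 1
            print(f"Detected {failure_threshold} consecutive failures - possible rate limiting.")
            print(f"Pausing {pause_seconds:.0f} seconds before retry (pause {pauses}/{max_pauses})...")
            sleep(pause_seconds)
            consecutive_failures = 0
    return succeeded, failed, False
```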

8. Performance Optimization

Batch Database Writes

  • 100 packages per batch commit (vs. one-at-a-time in v0.2.0)
  • Uses executemany() for batch inserts
  • SQLite WAL mode + optimized cache settings
  • Result: 10-50x faster writes
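
A minimal sketch of batched SQLite writes with `executemany()` and WAL mode follows. The table schema and function name are hypothetical; the actual mapping.db layout may differ.

```python
import sqlite3


def write_in_batches(
    conn: sqlite3.Connection,
    rows: list[tuple[str, str]],
    batch_size: int = 100,
) -> int:
    """Insert rows in batches, committing once per batch; return commit count."""
    # WAL lets readers proceed during writes (in-memory databases ignore it).
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("CREATE TABLE IF NOT EXISTS mapping (module TEXT, package TEXT)")
    commits = 0
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany("INSERT INTO mapping VALUES (?, ?)", batch)
        conn.commit()  # one commit per batch instead of one per row
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
rows = [(f"module{i}", f"package{i}") for i in range(250)]
print(write_in_batches(conn, rows))
```

Most of the gain comes from committing once per batch: each commit forces a sync to disk, so fewer commits means far less waiting.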

Batch Progress Saves

  • 100 packages per progress file update
  • Immediate saves on interrupt/complete
  • Balances safety and performance
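
The save-every-100 behavior could be sketched like this. `ProgressTracker` is a hypothetical stand-in, not the real BuildProgress class, and the JSON layout is assumed.

```python
import json
from pathlib import Path


class ProgressTracker:
    """Hypothetical tracker that flushes to disk every `save_every` records."""

    def __init__(self, path: Path, save_every: int = 100):
        self.path = path
        self.save_every = save_every
        self.processed: list[str] = []
        self.failed: list[str] = []
        self.saves = 0
        self._unsaved = 0

    def record(self, name: str, ok: bool) -> None:
        (self.processed if ok else self.failed).append(name)
        self._unsaved += 1
        if self._unsaved >= self.save_every:
            self.save()

    def save(self) -> None:
        """Write the full state; call directly on interrupt or completion."""
        state = {"processed": self.processed, "failed": self.failed}
        self.path.write_text(json.dumps(state))
        self._unsaved = 0
        self.saves += 1
```

Calling `save()` explicitly at the end covers the records that arrived since the last automatic flush.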

Increased Concurrency

  • Default concurrency: 50 (up from 20)
  • Uses httpx.Limits for optimized connection pooling
  • Result: 2-3x faster package fetching
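
To show the concurrency cap in isolation, the sketch below uses a standard-library `asyncio.Semaphore` rather than httpx connection limits, so it is a swapped-in technique and not how the tool does it. `fetch_all` and `fake_fetch` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable


async def fetch_all(
    names: list[str],
    fetch_one: Callable[[str], Awaitable[str]],
    concurrency: int = 50,
) -> list[str]:
    """Run `fetch_one` for every name with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(name: str) -> str:
        async with semaphore:
            return await fetch_one(name)

    # gather() preserves the order of `names` in its result list.
    return await asyncio.gather(*(guarded(name) for name in names))


async def fake_fetch(name: str) -> str:
    await asyncio.sleep(0)  # stand-in for a network request
    return name.upper()


results = asyncio.run(fetch_all([f"pkg{i}" for i in range(200)], fake_fetch))
print(len(results))
```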

9. Graceful Interrupt Handling (Ctrl+C)

Press Ctrl+C and the tool handles it elegantly:

^C
Saving progress, please wait... (Ctrl+C again to force quit)

Build interrupted. Processed 2500/5000 packages.
Use --resume to continue building, or --retry-failed to retry failures.

Progress is always saved safely.
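
One common way to get this behavior is a `try`/`finally` around the main loop, sketched below. The function and callback names are hypothetical, and the second-Ctrl+C force quit is not modeled.

```python
from typing import Callable, Iterable


def run_build(
    names: Iterable[str],
    process: Callable[[str], None],
    save_progress: Callable[[list[str]], None],
) -> bool:
    """Process packages; return True if finished, False if interrupted."""
    processed: list[str] = []
    try:
        for name in names:
            process(name)
            processed.append(name)
        return True
    except KeyboardInterrupt:
        print("Saving progress, please wait...")
        return False
    finally:
        # Runs on both normal completion and Ctrl+C.
        save_progress(list(processed))
```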


10. Build Status Command (NEW)

Check current or last build status:

pyimport2pkg build-status

# Output:
# Build Status: in_progress
# Total: 5000
# Processed: 2500
# Failed: 12
# Success Rate: 99.5%
# Last Updated: 2025-12-06 10:30:45
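
The success rate shown above is consistent with the share of processed packages that did not fail. That formula is an assumption, not something these notes state; a quick check:

```python
def success_rate(processed: int, failed: int) -> float:
    """Percentage of processed packages that did not fail (assumed formula)."""
    if processed == 0:
        return 0.0
    return 100.0 * (processed - failed) / processed


print(f"Success Rate: {success_rate(2500, 12):.1f}%")  # Success Rate: 99.5%
```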

11. Timestamped Error Logs

Error logs no longer overwrite each other:

data/
├── mapping.db
├── build_progress.json
├── build_errors_20251206_082307.json  (timestamp included)
└── build_errors_20251206_100000.json  (timestamp included)
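
File names in this pattern can be produced with `strftime`. The helper below is a hypothetical sketch, with the name format inferred from the listing above.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def error_log_path(data_dir: Path, now: Optional[datetime] = None) -> Path:
    """Return a per-run error log path such as build_errors_20251206_082307.json."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return data_dir / f"build_errors_{stamp}.json"


print(error_log_path(Path("data"), datetime(2025, 12, 6, 8, 23, 7)).name)
```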

📋 CLI Changes

build-db Command Options

| Option | Description | Default |
|---|---|---|
| --max-packages | Target number of PyPI packages | 5000 |
| --concurrency | Number of parallel workers | 50 |
| --resume | Resume interrupted build | |
| --retry-failed | Retry only failed packages | |
| --rebuild (NEW) | Force rebuild (delete old DB) | |
| --db-path | Custom database file path | data/mapping.db |

Removed Options

  • --incremental - No longer needed; incremental updates are now the default

New Command: build-status

pyimport2pkg build-status

📊 Performance Improvements

Speed

| Operation | v0.2.0 | v0.3.0 | Improvement |
|---|---|---|---|
| 5000 packages | 50-100 min | 10-20 min | 5-10x |
| Database writes | Per-package | Batch (100) | 10-50x |
| Concurrency (workers) | 20 | 50 | 2.5x |

Memory Usage

| Dataset | Memory Footprint |
|---|---|
| 5000 packages | ~200 MB |
| 10000 packages | ~400 MB |
| 20000 packages | ~600 MB (chunked, not monolithic) |

🔄 Typical Workflows

Build and Expand Incrementally

# Start small
pyimport2pkg build-db --max-packages 500

# Expand later (only new packages processed)
pyimport2pkg build-db --max-packages 5000

# Keep growing
pyimport2pkg build-db --max-packages 10000

Robust Large-Scale Build

# Build 20000 packages with resume capability
pyimport2pkg build-db --max-packages 20000

# If interrupted:
pyimport2pkg build-db --resume

# If failures occur:
pyimport2pkg build-db --retry-failed

Network Recovery

# Build starts
pyimport2pkg build-db --max-packages 5000

# Repeated failures are detected automatically and the build pauses,
# then resumes on its own after the pause

# Or manually retry failures
pyimport2pkg build-db --retry-failed

🧪 Testing

  • Total Test Cases: 304 (↑ from 263)
  • New Test Cases: 41
  • Test Coverage: BuildProgress, batch operations, CLI options, chunked processing, rate limiting
  • Status: All tests passing ✅

📝 Breaking Changes

⚠️ Important for v0.2.0 Users:

  1. --incremental option removed - Use --max-packages on its own instead
  2. Default concurrency increased to 50 - Faster, but uses more bandwidth
  3. Progress save frequency changed - Now every 100 packages (was every package)

Migration Guide

# OLD (v0.2.0)
pyimport2pkg build-db --max-packages 5000 --incremental

# NEW (v0.3.0)
pyimport2pkg build-db --max-packages 5000  # incremental is default now!

# Expand database
pyimport2pkg build-db --max-packages 10000  # smart delta applied automatically

🐛 Bug Fixes & Stability

  • Fixed edge cases in incremental update logic
  • Improved error handling for network timeouts
  • Better memory cleanup in long-running builds
  • More informative error messages for debugging

📚 Documentation Updates

  • ✅ New comprehensive README.md
  • ✅ Updated USER_GUIDE with all v0.3.0 features
  • ✅ CLI help messages improved
  • ✅ API documentation updated

🚀 Upgrade Instructions

From v0.2.0

# Update via pip
pip install --upgrade pyimport2pkg

# Or reinstall from source
pip install -e ".[dev]"

# Verify installation
pyimport2pkg --version
# pyimport2pkg 0.3.0

Recommended Actions:

  1. Backup existing data/mapping.db if valuable
  2. Run pyimport2pkg build-status to check build state
  3. Try pyimport2pkg build-db --max-packages 5000 to rebuild efficiently

📋 What's Next? (v0.4.0 Roadmap)

  • Conda environment detection and analysis
  • Version constraint inference from code patterns
  • pyproject.toml output format support
  • Interactive candidate selection UI
  • Custom mapping configuration file support
  • Plugin system for third-party mapping providers

🙏 Thanks

This release includes improvements suggested by users running large-scale database builds. Special thanks to the community for feedback and testing!


📞 Support
