The current README provides installation instructions and links to API documentation, but it lacks a clear, practical demonstration of how to use the package for real-world data analysis.
For a data-focused library like malariagen_data, users (especially students and new contributors) benefit significantly from seeing how the package is used in practice. At the moment, users are redirected to multiple documentation pages, which can make onboarding fragmented and less intuitive.
Steps followed and expected result
Steps followed:
1. Installed the package using pip
2. Opened the README to understand how to start using it
3. Navigated to the API documentation links
4. Tried to identify a basic workflow for data analysis
Observed result:
- No simple example demonstrating how to load and analyze data
- No end-to-end workflow (e.g., from data access to analysis output)
- Heavy reliance on external documentation
- Difficult for beginners to get started quickly
Expected result:
The README should include:
- A minimal working example (Python code snippet)
- A short workflow such as:
  1. Load a dataset
  2. Perform a basic query/filter
  3. Run a simple analysis
  4. Visualize or interpret the results
- Optional: a link to a Colab notebook for hands-on experimentation
Proposed Solution
Add a “Quick Start Example” section to the README:
- An example using one dataset (e.g., Ag3 or Pf7)
- A simple Python snippet demonstrating:
  - Importing the package
  - Connecting to the dataset
  - Running a basic analysis
- The expected output, or a short explanation of it

Also consider:
- Linking a ready-to-run Google Colab notebook
- Adding a short explanation of common use cases
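
To make the proposal concrete, here is a rough sketch of what such a Quick Start snippet could look like. It is a suggestion, not the maintainers' recommended workflow. It assumes the `Ag3` class and its `sample_metadata()` method behave as described in the API documentation, that the metadata table has `country`, `taxon` and `year` columns, and that the machine running it has network access and whatever credentials the data release requires. `count_samples` and `quick_start` are hypothetical helper names introduced here for illustration.

```python
import pandas as pd


def count_samples(df: pd.DataFrame, query: str, by: list) -> pd.DataFrame:
    """Filter a sample-metadata table and count the samples in each group.

    `query` is a pandas query string; `by` lists the columns to group on.
    """
    filtered = df.query(query)
    # groupby sorts the group keys, so the result is ordered by `by`.
    return filtered.groupby(by).size().reset_index(name="n_samples")


def quick_start() -> pd.DataFrame:
    """Load Ag3 sample metadata and summarise one country (needs network access)."""
    # Imported here so that count_samples stays usable without the package.
    import malariagen_data

    # Step 1: connect to the dataset and load sample metadata.
    ag3 = malariagen_data.Ag3()
    df_samples = ag3.sample_metadata(sample_sets="3.0")

    # Steps 2 and 3: filter to one country, then count samples per taxon and year.
    # The column names used here are assumptions about the metadata table.
    return count_samples(df_samples, "country == 'Burkina Faso'", ["taxon", "year"])
```

Calling `quick_start()` in a notebook would return a small table with one row per taxon and year and an `n_samples` column, which covers the "interpret results" step. A bar chart of that table could serve as the visualization step.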
Expected Impact
- Significantly improves the onboarding experience
- Helps users understand practical usage quickly
- Reduces dependency on navigating multiple documentation pages
- Encourages more contributors and users to adopt the package
Additional Context
Since malariagen_data is designed to work with large-scale genomics datasets in cloud environments, providing a simple and clear entry point is essential for accessibility, especially for students and researchers new to the ecosystem.