Add End-to-End Usage Examples and Real-World Analysis Workflow in README #1238

@lovelymahor

Description

The current README provides installation instructions and links to API documentation, but it lacks a clear, practical demonstration of how to use the package for real-world data analysis.

For a data-focused library like malariagen_data, users (especially students and new contributors) benefit significantly from seeing how the package is used in practice. At the moment, users are redirected to multiple documentation pages, which can make onboarding fragmented and less intuitive.

Steps followed and expected result

Steps followed:

Installed the package using pip
Opened the README to understand how to start using it
Navigated to API documentation links
Tried to identify a basic workflow for data analysis

Observed result:

No simple example demonstrating how to load and analyze data
No end-to-end workflow (e.g., from data access to analysis output)
Heavy reliance on external documentation
Difficult for beginners to quickly get started

Expected result:

README should include:
A minimal working example (Python code snippet)
A short workflow such as:
    Load dataset
    Perform a basic query/filter
    Run a simple analysis
    Visualize or interpret results
Optional: Link to a Colab notebook for hands-on experimentation

Proposed Solution

Add a “Quick Start Example” section in the README:

Example using one dataset (e.g., Ag3 or Pf7)
Simple Python snippet demonstrating:
    Importing the package
    Connecting to the dataset
    Running a basic analysis
Include expected output or explanation
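
As a starting point for discussion, here is a rough sketch of what such a snippet might look like. It assumes that `malariagen_data.Ag3()` and `Ag3.sample_metadata(sample_sets=...)` behave as described in the package's API docs (returning a pandas DataFrame with columns such as `country` and `taxon`). The helper names `load_sample_records` and `count_by`, and the demo rows, are hypothetical and only there for illustration. The package import is done lazily so the counting step can be tried offline.

```python
from collections import Counter


def load_sample_records(sample_sets="3.0"):
    """Fetch Ag3 sample metadata as a list of dicts (one per sample).

    Assumption: the calls below match the package's documented API and
    return a pandas DataFrame. Requires the package and network access.
    """
    import malariagen_data  # lazy import, so count_by() works without it

    ag3 = malariagen_data.Ag3()
    df = ag3.sample_metadata(sample_sets=sample_sets)
    return df.to_dict("records")


def count_by(records, key, **filters):
    """Count records per value of `key`, keeping rows that match `filters`."""
    selected = (
        row for row in records
        if all(row.get(col) == val for col, val in filters.items())
    )
    return Counter(row[key] for row in selected)


# Made-up stand-in rows (NOT real data) so the analysis step runs offline.
demo = [
    {"sample_id": "S1", "country": "Ghana", "taxon": "gambiae"},
    {"sample_id": "S2", "country": "Ghana", "taxon": "coluzzii"},
    {"sample_id": "S3", "country": "Ghana", "taxon": "gambiae"},
    {"sample_id": "S4", "country": "Mali", "taxon": "coluzzii"},
]

# Basic query/filter plus a simple analysis: taxon counts within one country.
print(count_by(demo, "taxon", country="Ghana"))
```

With the package installed and network access available, `count_by(load_sample_records(), "country")` would be the equivalent call on real metadata. The final README example would more likely work on the DataFrame directly (for instance with a pandas `groupby`) and add a plot. The version above only illustrates the load, filter, and summarise steps.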

Also consider:

Linking a ready-to-run Google Colab notebook
Adding a short explanation of common use cases

Expected Impact
Significantly improves onboarding experience
Helps users understand practical usage quickly
Reduces dependency on navigating multiple documentation pages
Encourages more contributors and users to adopt the package

Additional Context

Since malariagen_data is designed to work with large-scale genomics datasets in cloud environments, providing a simple and clear entry point is essential for accessibility, especially for students and researchers new to the ecosystem.
