From a79c8d43b74fc0c5015070c003760ffa2e19e0c5 Mon Sep 17 00:00:00 2001 From: PowerChell Date: Thu, 5 Feb 2026 09:45:39 -0700 Subject: [PATCH 1/4] WIP --- docs/about-source/core-concepts.md | 165 +++++++++++++++++++++++ docs/about-source/what-is-source.md | 44 ++++++ docs/core-concepts.md | 92 ------------- docs/index.md | 1 + docs/using-source/creating-an-account.md | 99 ++++++++++++++ docs/{ => using-source}/data-proxy.md | 4 +- docs/{ => using-source}/data-upload.md | 7 +- docs/{ => using-source}/web-ui.md | 0 sidebars.ts | 23 +++- 9 files changed, 333 insertions(+), 102 deletions(-) create mode 100644 docs/about-source/core-concepts.md create mode 100644 docs/about-source/what-is-source.md delete mode 100644 docs/core-concepts.md create mode 100644 docs/using-source/creating-an-account.md rename docs/{ => using-source}/data-proxy.md (97%) rename docs/{ => using-source}/data-upload.md (97%) rename docs/{ => using-source}/web-ui.md (100%) diff --git a/docs/about-source/core-concepts.md b/docs/about-source/core-concepts.md new file mode 100644 index 0000000..937b30b --- /dev/null +++ b/docs/about-source/core-concepts.md @@ -0,0 +1,165 @@ +--- +title: Core Concepts +id: core-concepts +slug: /core-concepts +sidebar_position: 2 +draft: true +--- + +Source is a data publishing utility designed to make data shared in object stores easier to find, explore, and share on the web. Understanding these fundamental concepts will help you navigate and use Source Cooperative effectively. + +## Overview + +Source allows individuals or organizations to publish files to the web, collected in data products. Every data product is owned by either an individual or an organization. A data product has a title and description, can contain any number of files (or objects), and can contain a README file. + +## Accounts + +**Individual Accounts**: When you create an account on Source, you get a personal namespace where you can publish and manage your own data products. 
Your individual account is identified by your username (e.g., `username`). This username becomes part of your data product URLs. + +### Individual Benefits +- **Easy Publishing**: Simple tools for publishing data without technical infrastructure +- **Professional Presence**: Establish a professional presence in the data community +- **Attribution**: Clear attribution for data contributions and ownership +- **Collaboration**: Connect with other data publishers and users +- **Impact**: Increase the visibility and impact of research and data work + +**Organizational Accounts**: Organizations can create shared accounts that multiple individuals can manage collaboratively. Organizational accounts have their own namespace and allow teams to publish data under a shared identity. For example, the Harvard Library Innovation Lab publishes data under the `harvard-lil` namespace. + +### Organization Features +- **Branded Presence**: Organizations have branded profile pages +- **Team Management**: Multiple individuals can contribute to organization data products +- **Governance**: Organizational policies and standards can be applied to data publishing +- **Analytics**: Insights into data usage and impact + +### Organization Benefits +- **Data Strategy**: Implement a comprehensive data publishing strategy +- **Compliance**: Meet open data requirements and transparency goals +- **Engagement**: Increase engagement with stakeholders and the public +- **Efficiency**: Streamline data sharing processes without maintaining custom infrastructure + +## Data Products + +Data products (also called repositories in some contexts) are the primary organizational unit in Source. They serve as containers for related data files and provide a way to group and organize information logically. + +A data product is a collection of related data files with associated metadata and documentation. 
Each data product consists of: + +- **A unique identifier**: Following the pattern `account-name/data-product-name` (e.g., `cholmes/eurocrops`) +- **Title**: A descriptive, human-readable name for the data product +- **Description**: A detailed explanation of what the data product contains and its purpose +- **Owner**: Either an individual or an organization +- **Metadata**: Including tags, license information, and other descriptive details +- **Documentation**: Typically a README.md file at the data product root that explains the data +- **Objects**: The actual data files stored in the data product (any number of files) +- **Visibility settings**: Controlling who can access the data + +Data products are built entirely on cloud object storage, which allows Source to host very large volumes of data. While platforms like GitHub limit repositories to around 5GB, Source data products can be hundreds of terabytes. For example, the RapidAI4EO dataset on Source is over 100TB. + +### Data Product URLs + +Each data product has both a web view and a data access view: +- **Web view**: `https://source.coop/account-name/data-product-name/` - Browse files in your browser +- **Data view**: `https://data.source.coop/account-name/data-product-name/` - Programmatic access via JSON + +Everything in Source is designed to be linkable. You can navigate deeper into data products with URLs like `https://source.coop/account-name/data-product-name/subdirectory/`. 
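The two URL patterns above can be sketched in a few lines of Python. This is an illustrative sketch only: the helper names `web_url`, `data_url`, and `_join` are hypothetical and not part of any Source tooling, and the only things taken from the docs are the two hostnames and the `account-name/data-product-name/` path layout.

```python
from urllib.parse import quote

# Hostnames as given in the docs; everything else below is a hypothetical helper.
WEB_BASE = "https://source.coop"
DATA_BASE = "https://data.source.coop"


def _join(base: str, account: str, product: str, path: str = "") -> str:
    """Join a base URL with an account, a data product, and an optional path."""
    # Drop empty segments so leading, trailing, or doubled slashes do not matter.
    parts = [account, product] + [p for p in path.split("/") if p]
    url = base + "/" + "/".join(quote(p, safe="") for p in parts)
    # Keep a trailing slash for the product root and for directory-style paths.
    if not path or path.endswith("/"):
        url += "/"
    return url


def web_url(account: str, product: str, path: str = "") -> str:
    """URL for browsing a data product (or a location inside it) in the browser."""
    return _join(WEB_BASE, account, product, path)


def data_url(account: str, product: str, path: str = "") -> str:
    """URL for programmatic access to a data product or one of its objects."""
    return _join(DATA_BASE, account, product, path)


if __name__ == "__main__":
    print(web_url("cholmes", "eurocrops"))
    print(web_url("account-name", "data-product-name", "subdirectory/"))
    print(data_url("account-name", "data-product-name", "path/to/file.ext"))
```

Percent-encoding each segment separately is a design choice in this sketch so that names containing spaces or other reserved characters still yield a valid URL; whether Source permits such names is not stated in the docs.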
+ +### Data Product Features +- **Public Access**: All data products are publicly accessible via HTTP +- **Unlisted Data Products**: While all data products are publicly accessible, users can opt to leave them unlisted to prevent them from appearing in search results or lists of data products on the Source website +- **Web Interface**: Each data product has a dedicated web page for browsing and exploring +- **Direct HTTP Access**: Every object can be accessed directly via its URL + +### Future Data Product Features +- **Restricted Access**: The ability to restrict data product access based on identity and access rules +- **Data Product Monetization**: The ability to charge for access to data products +- **Versioning**: Support for tracking changes made to data products + +## Roles and Permissions + +**Account Roles**: Individuals can have different roles within organizational accounts: +- **Owner**: Full control over the organization and all its data products +- **Administrator**: Can manage data products and organization members +- **Member**: Can contribute to the organization's data products based on assigned permissions + +**Data Product Roles**: Access to individual data products can be managed separately: +- **Owner**: Full control over the data product, including deletion and access management +- **Contributor**: Can upload and modify data within the product +- **Viewer**: Can view and download data (relevant for restricted access data products) + +## Objects + +Objects are the individual files or data items stored within data products. They represent the actual data that users want to access and analyze. + +### Object Characteristics + +In Source: +- **File Types**: Objects can be any file type (GeoTIFF, CSV, Parquet, JSON, NetCDF, PMTiles, images, documents, etc.) 
+- **Size**: Objects can be any size, from kilobytes to terabytes (no practical limits) +- **Organization**: Objects are organized using path prefixes to create virtual directory structures +- **Direct Access**: Each object has a direct URL for access: `https://data.source.coop/account/data-product/path/to/file.ext` + +### Object Storage vs File Storage + +Source uses object storage rather than traditional file storage. This has important implications: + +- **Scalability**: Object storage can handle massive volumes of data efficiently +- **No version control**: Unlike Git-based systems, Source doesn't provide granular version control on individual objects. Source is designed for publishing "fully baked" data products +- **Flat namespace**: While you can organize objects with path prefixes, the underlying storage is flat rather than hierarchical + +## Visibility Options + +When creating a data product, you can set one of three visibility levels: + +**Public**: Anyone can discover and access your data product without authentication. This is ideal for open data that you want to share broadly with the research community and public. Most data products on Source are public. + +**Internal**: Only members of your organization can access the data product. Useful for data that should be shared within your team but not with external users. + +**Private**: Only specific individuals you grant access to can view and download the data. Best for sensitive data, work-in-progress datasets, or data that requires explicit permission for access. + +Note: Visibility can be changed at any time, allowing you to develop datasets privately before making them public. + +## Object Previewers + +Source provides built-in preview functionality for common data formats directly in the browser. This allows users to visualize and explore data before downloading. When you navigate to an individual file in Source, you'll see a preview along with metadata rather than immediately downloading the file. 
+ +Currently supported preview formats include: +- **Geospatial vector data**: PMTiles +- **Cloud-optimized rasters**: Cloud Optimized GeoTIFFs (COG) +- **Vector data**: GeoJSON, FlatGeobuf +- **Tabular data**: CSV, Parquet +- **Metadata and documentation**: JSON, XML, Markdown + +The preview system is extensible, and the community can propose solutions for additional file formats as needs arise. + +## Data Access Methods + +Source provides multiple ways to access data: + +1. **Web Browser**: Browse and download files through the web interface at `https://source.coop` +2. **Direct HTTP**: Access individual files directly via `https://data.source.coop` +3. **AWS CLI**: Use the S3-compatible API to list, upload, download, and manage objects programmatically +4. **SDKs**: Use AWS SDKs (boto3 for Python, AWS SDK for JavaScript, etc.) with Source's endpoint +5. **Direct Cloud Access**: Authenticated users can generate credentials to access data directly from the underlying cloud storage + +## Tags and Discoverability + +Data products can be tagged with relevant keywords to improve discoverability. Common tags include: +- Data types: `vector`, `raster`, `tabular` +- Themes: `agriculture`, `climate`, `conservation`, `land cover` +- Formats: `geoparquet`, `cog`, `pmtiles`, `netcdf` +- Applications: `machine learning`, `segmentation`, `time series` + +Tags help users find relevant datasets through search and browsing, and improve Source's visibility in search engines. + +## Key Principles + +### Open Access +All data published through Source is publicly accessible, promoting transparency and collaboration. + +### Simplicity +Source eliminates the complexity of building custom data portals or APIs, making data publishing accessible to everyone. + +### Standards-Based +Source leverages existing web standards and object storage protocols, ensuring compatibility and longevity. 
+ +### Community-Driven +Source fosters a community of data publishers and users, creating opportunities for collaboration and knowledge sharing. diff --git a/docs/about-source/what-is-source.md b/docs/about-source/what-is-source.md new file mode 100644 index 0000000..67ba958 --- /dev/null +++ b/docs/about-source/what-is-source.md @@ -0,0 +1,44 @@ +--- +title: What is Source Cooperative? +id: what-is-source +slug: /what-is-source +sidebar_position: 1 +--- + +Source Cooperative is a data publishing utility for the web that allows trusted organizations and individuals to publish data of any kind at any scale without needing to build or maintain their own infrastructure. Built on cloud object storage, Source provides a public catalog, standardized access, and community visibility for open scientific and geospatial data. + +## Why Source Cooperative? + +**Built for Data Publishing, Not Just Storage**: While cloud object storage services like Amazon S3, Google Cloud Storage, and Azure Blob Storage can store data, they don't make it discoverable or accessible to others. Source is a utility built on top of cloud object storage that provides a public catalog, standardized access, and community visibility that raw cloud storage can't offer. + +**Focus on Your Data, Not Infrastructure**: Instead of building and maintaining data portals, custom APIs, and hosting infrastructure, Source lets you focus on creating high-quality data products that are easy to publish and easy to use. + +**No Lock-In**: Source respects the Community Right to Replicate. Data providers are never locked into Source – you can always move your data elsewhere and host it independently if needed. + +**Cost-Effective at Scale**: Source hosts over 1 petabyte of data across 300+ data products. Whether you're publishing a few gigabytes or hundreds of terabytes, Source provides cost-effective hosting without requiring you to manage cloud infrastructure. 
+ +**Cloud-Native Access**: Data on Source is stored in S3-compatible object storage, enabling efficient programmatic access through standard tools like the AWS CLI, Python's boto3, and various other programming libraries. Access data via the web interface or bring your compute directly to the data in the cloud. + +**Built for the Research Community**: Source is developed and maintained by Radiant Earth, a 501(c)(3) non-profit organization. As a non-profit utility, Source aims to provide the best service to its members at the lowest possible cost, without seeking arbitrary profits or vendor lock-in. + +## Real-World Impact + +Organizations already using Source include: + +- **Bridges to Prosperity** uses Source to enable AI-powered global water mapping, tripling the known coverage of mapped waterways worldwide +- **Earth Genome** shares 60+ terabytes of processed satellite imagery and 3.5 billion vector embeddings through Source +- **Dynamical.org** provides fast, easy access to weather data, serving 13,000 unique visitors and 31.3 million API requests +- **Auspatious** publishes cloud-optimized geospatial datasets, making high-resolution data accessible without requiring large downloads + +## Current Status + +Source is currently in beta. While all data hosted in Source is available to the public, publishing data requires applying to be a beta tester. To apply, visit [the beta tester application form](https://forms.gle/4weS1hkRjZhQLoPE9). + +Source currently: +- Hosts over 1 petabyte of data +- Serves approximately 500 terabytes of data transfer per month +- Logs an average of 126 million data requests per month +- Supports over 300 data products from 66+ organizations + +Source is funded by the Navigation Fund's Open Science Initiative, with in-kind support from AWS and Azure for data hosting. 
+ diff --git a/docs/core-concepts.md b/docs/core-concepts.md deleted file mode 100644 index a9a1255..0000000 --- a/docs/core-concepts.md +++ /dev/null @@ -1,92 +0,0 @@ ---- -title: Core Concepts -id: core-concepts -slug: /core-concepts -sidebar_position: 2 -draft: true ---- - -# Core Concepts - -Source is a data publishing utility designed to make data shared in object stores easier to find, explore, and share on the web. - -## Overview - -The best way to summarize Source is that it allows individuals or organizations to publish files to the web, collected in repositories. Every repository is owned by either an individual or an organization. A repository has a title and description, can contain any number of files (or objects), and can contain a README file. - -## Repositories - -Repositories are the primary organizational unit in Source. They serve as containers for related data files and provide a way to group and organize information logically. - -### Repository Structure -- **Title**: A descriptive name for the repository -- **Description**: A detailed explanation of what the repository contains -- **Owner**: Either an individual or an organization -- **Files**: Any number of data objects can be stored within a repository -- **README**: An optional markdown file that provides additional context and documentation - -### Repository Features -- **Public Access**: All repositories are publicly accessible via HTTP -- **Unlisted Repositories**: While all repositories are publicly accessible, users can opt to leave them unlisted to prevent them from appearing in search results or lists of repositories on the Source website -- **Web Interface**: Each repository has a dedicated web page for browsing and exploring - -### Future Respository Features -- **Restricted Access**: The ability to restrict repository access based on identity and access rules -- **Repository Monetization**: The ability to charge for access to repositories -- **Versioning**: Support for tracking 
changes made to repositories - -## Objects - -Objects are the individual files or data items stored within repositories. They represent the actual data that users want to access and analyze. - -### Object Characteristics -- **File Types**: Support for any file format (CSV, JSON, images, documents, etc.) -- **Size**: No practical limits on file size -- **Direct Access**: Objects can be accessed directly via HTTP URLs -- **Organization**: Objects can be organized in folders or with descriptive naming - -## Individuals - -Individuals can own and manage repositories on Source, making it easy for researchers, data scientists, and other professionals to share their work. - -### Individual Accounts -- **Profile**: Each individual has a profile page showcasing their repositories -- **Ownership**: Individuals can own multiple repositories -- **Collaboration**: Individuals can collaborate with organizations on shared repositories -- **Attribution**: Clear attribution for data contributions and ownership - -### Individual Benefits -- **Easy Publishing**: Simple tools for publishing data without technical infrastructure -- **Professional Presence**: Establish a professional presence in the data community -- **Collaboration**: Connect with other data publishers and users -- **Impact**: Increase the visibility and impact of research and data work - -## Organizations - -Organizations can use Source to establish a structured approach to data publishing, making their data assets more discoverable and accessible. 
- -### Organization Features -- **Branded Presence**: Organizations have branded profile pages -- **Team Management**: Multiple individuals can contribute to organization repositories -- **Governance**: Organizational policies and standards can be applied to data publishing -- **Analytics**: Insights into data usage and impact - -### Organization Benefits -- **Data Strategy**: Implement a comprehensive data publishing strategy -- **Compliance**: Meet open data requirements and transparency goals -- **Engagement**: Increase engagement with stakeholders and the public -- **Efficiency**: Streamline data sharing processes without maintaining custom infrastructure - -## Key Principles - -### Open Access -All data published through Source is publicly accessible, promoting transparency and collaboration. - -### Simplicity -Source eliminates the complexity of building custom data portals or APIs, making data publishing accessible to everyone. - -### Standards-Based -Source leverages existing web standards and object storage protocols, ensuring compatibility and longevity. - -### Community-Driven -Source fosters a community of data publishers and users, creating opportunities for collaboration and knowledge sharing. diff --git a/docs/index.md b/docs/index.md index affc325..74ed8b8 100644 --- a/docs/index.md +++ b/docs/index.md @@ -6,6 +6,7 @@ id: index slug: / sidebar_label: Home sidebar_position: 1 + --- import ThemedImage from '@theme/ThemedImage'; diff --git a/docs/using-source/creating-an-account.md b/docs/using-source/creating-an-account.md new file mode 100644 index 0000000..3e8862e --- /dev/null +++ b/docs/using-source/creating-an-account.md @@ -0,0 +1,99 @@ +--- +title: Creating an Account on Source Cooperative +id: create-an-account +slug: /create-an-account +sidebar_position: 2 +draft: true +--- + +Getting started with Source Cooperative requires creating an account. Currently, Source is in beta, so publishing data requires being accepted as a beta tester. 
+ +## Prerequisites + +Before creating an account, be aware that: +- **Source is currently in beta**: All data hosted in Source is publicly accessible, but publishing data requires applying to be a beta tester +- **Beta tester application**: To publish data to Source, apply at [https://forms.gle/4weS1hkRjZhQLoPE9](https://forms.gle/4weS1hkRjZhQLoPE9) +- **Browsing and downloading**: You can browse and download public data on Source without an account + +## Step 1: Navigate to Source Cooperative + +Visit [https://source.coop](https://source.coop) and click "Log In / Register" in the top navigation. + +## Step 2: Choose Your Sign-Up Method + +Source offers authentication through third-party providers. Select your preferred authentication method and follow the sign-in flow. Common options may include: +- GitHub authentication +- Google authentication +- Other OAuth providers + +## Step 3: Set Up Your Profile + +After authentication, you'll be able to: +- Set your username (this becomes your namespace: `username/`) +- Add a display name +- Provide biographical information (optional) +- Add profile links and additional information + +## Step 4: Apply as a Beta Tester (For Data Publishers) + +If you want to publish data on Source: + +1. Complete the beta tester application at [https://forms.gle/4weS1hkRjZhQLoPE9](https://forms.gle/4weS1hkRjZhQLoPE9) +2. Wait for approval from the Radiant Earth team +3. Once approved, you'll be able to create data products and upload data + +## Creating an Organizational Account + +If you need to publish data under an organizational identity rather than your personal account: + +1. First create your individual account +2. Contact the Source team at [hello@source.coop](mailto:hello@source.coop) to request creation of an organizational account +3. Provide: + - Desired organization name and ID + - Organization description + - Initial organization members and their roles +4. 
The Source team will create the organizational account and grant you access + +Note: Self-service organizational account creation is on the roadmap but not yet available. + +## Tips for Account Setup + +- **Choose your username carefully**: Your username becomes part of all your data product URLs (e.g., `https://source.coop/username/dataset-name/`). Choose something professional and recognizable. + +- **Use your real identity**: Source is built for trusted organizations and individuals. Using your real name or organizational identity helps build trust in the data you publish. + +- **Complete your profile**: A complete profile with biographical information helps other users understand who is publishing the data and builds confidence in your data products. + +- **Link to your work**: Include links to your organization, research profile (ORCID), or relevant work to establish credibility. + +## What You Can Do With an Account + +Once you have an account and beta access: + +- **Publish data products**: Create data products to host your datasets +- **Generate access credentials**: Get AWS CLI credentials for programmatic data access +- **Manage your data**: Update, organize, and document your data products +- **Collaborate**: Add team members to organizational accounts +- **Track usage**: Monitor how your data products are being accessed (feature in development) + +## Data Access Without an Account + +You don't need an account to: +- Browse public data products on Source +- Download data through your web browser +- Access data via direct HTTP URLs +- View data previews + +However, you will need an account for: +- Accessing data programmatically via AWS CLI +- Generating S3-compatible access credentials +- Publishing your own data products +- Accessing private or restricted data products (if granted permission) + +## Next Steps + +After creating your account: +1. Explore existing data products to see examples +2. 
Review the [Core Concepts](core-concepts.md) to understand how Source works +3. If approved as a beta tester, proceed to [Publishing a Data Product](publishing-a-data-product.md) +4. Learn about [uploading data](uploading-data.md) to your data products \ No newline at end of file diff --git a/docs/data-proxy.md b/docs/using-source/data-proxy.md similarity index 97% rename from docs/data-proxy.md rename to docs/using-source/data-proxy.md index 8a4f616..47427f6 100644 --- a/docs/data-proxy.md +++ b/docs/using-source/data-proxy.md @@ -1,12 +1,10 @@ --- -title: Data Proxy +title: Accessing Data Through the Source Data Proxy id: data-proxy slug: /data-proxy sidebar_position: 3 --- -# Accessing Data Through the Source Data Proxy - The Source Data Proxy provides S3-compatible access to all data hosted on Source Cooperative. You can access data through the proxy without authentication, making it easy to programmatically download datasets using standard AWS CLI commands. ## Using the AWS CLI diff --git a/docs/data-upload.md b/docs/using-source/data-upload.md similarity index 97% rename from docs/data-upload.md rename to docs/using-source/data-upload.md index d5e48d6..e5e18f8 100644 --- a/docs/data-upload.md +++ b/docs/using-source/data-upload.md @@ -1,4 +1,9 @@ -# How to Upload Your Data to Source Cooperative +--- +title: How to Upload Your Data to Source Cooperative +id: data-upload +slug: /data-upload +sidebar_position: 2 +--- This guide explains how to deliver your data to Source Cooperative in a secure and simple way. It is written for data providers and does not require deep Amazon Web Services (AWS) knowledge. 
diff --git a/docs/web-ui.md b/docs/using-source/web-ui.md similarity index 100% rename from docs/web-ui.md rename to docs/using-source/web-ui.md diff --git a/sidebars.ts b/sidebars.ts index 6b770d1..919c5e9 100644 --- a/sidebars.ts +++ b/sidebars.ts @@ -15,13 +15,24 @@ import type {SidebarsConfig} from '@docusaurus/plugin-content-docs'; const sidebars: SidebarsConfig = { tutorialSidebar: [ 'index', - 'core-concepts', - 'data-proxy', - 'web-ui', { - type: 'doc', - id: 'data-upload', - label: 'Data Upload', + type: 'category', + label: 'About Source', + items: [ + 'about-source/what-is-source', + 'about-source/core-concepts', + ], + }, + { + type: 'category', + label: 'Using Source', + items: [ + 'using-source/create-an-account', + 'using-source/data-upload', + 'using-source/web-ui', + 'using-source/data-proxy', + + ], }, { type: 'category', From 4b391f3494b6e3d61f221403d32449819733cfea Mon Sep 17 00:00:00 2001 From: PowerChell Date: Fri, 6 Feb 2026 08:05:51 -0700 Subject: [PATCH 2/4] WIP --- docs/about-source/core-concepts.md | 132 ++++++++++++----------- docs/about-source/what-is-source.md | 12 +-- docs/using-source/creating-an-account.md | 23 ++-- docs/using-source/data-upload.md | 2 +- 4 files changed, 86 insertions(+), 83 deletions(-) diff --git a/docs/about-source/core-concepts.md b/docs/about-source/core-concepts.md index 937b30b..60697f4 100644 --- a/docs/about-source/core-concepts.md +++ b/docs/about-source/core-concepts.md @@ -3,43 +3,19 @@ title: Core Concepts id: core-concepts slug: /core-concepts sidebar_position: 2 -draft: true --- Source is a data publishing utility designed to make data shared in object stores easier to find, explore, and share on the web. Understanding these fundamental concepts will help you navigate and use Source Cooperative effectively. ## Overview -Source allows individuals or organizations to publish files to the web, collected in data products. Every data product is owned by either an individual or an organization. 
A data product has a title and description, can contain any number of files (or objects), and can contain a README file. +Source allows individuals or organizations to publish files to the web, collected in data products. Every data product is owned by either an individual or an organization. A data product has a title and description, can contain any number of files (or objects), and can contain a README file (the file you see displayed at the root of the data product). -## Accounts +## The data -**Individual Accounts**: When you create an account on Source, you get a personal namespace where you can publish and manage your own data products. Your individual account is identified by your username (e.g., `username`). This username becomes part of your data product URLs. +### Data Products -### Individual Benefits -- **Easy Publishing**: Simple tools for publishing data without technical infrastructure -- **Professional Presence**: Establish a professional presence in the data community -- **Attribution**: Clear attribution for data contributions and ownership -- **Collaboration**: Connect with other data publishers and users -- **Impact**: Increase the visibility and impact of research and data work - -**Organizational Accounts**: Organizations can create shared accounts that multiple individuals can manage collaboratively. Organizational accounts have their own namespace and allow teams to publish data under a shared identity. For example, the Harvard Library Innovation Lab publishes data under the `harvard-lil` namespace. 
- -### Organization Features -- **Branded Presence**: Organizations have branded profile pages -- **Team Management**: Multiple individuals can contribute to organization data products -- **Governance**: Organizational policies and standards can be applied to data publishing -- **Analytics**: Insights into data usage and impact - -### Organization Benefits -- **Data Strategy**: Implement a comprehensive data publishing strategy -- **Compliance**: Meet open data requirements and transparency goals -- **Engagement**: Increase engagement with stakeholders and the public -- **Efficiency**: Streamline data sharing processes without maintaining custom infrastructure - -## Data Products - -Data products (also called repositories in some contexts) are the primary organizational unit in Source. They serve as containers for related data files and provide a way to group and organize information logically. +Data products are the primary organizational unit in Source. They serve as containers for related data files and provide a way to group and organize information logically. A data product is a collection of related data files with associated metadata and documentation. Each data product consists of: @@ -54,50 +30,42 @@ A data product is a collection of related data files with associated metadata an Data products are built entirely on cloud object storage, which allows Source to host very large volumes of data. While platforms like GitHub limit repositories to around 5GB, Source data products can be hundreds of terabytes. For example, the RapidAI4EO dataset on Source is over 100TB. -### Data Product URLs +#### Data Product URLs Each data product has both a web view and a data access view: + - **Web view**: `https://source.coop/account-name/data-product-name/` - Browse files in your browser - **Data view**: `https://data.source.coop/account-name/data-product-name/` - Programmatic access via JSON Everything in Source is designed to be linkable. 
You can navigate deeper into data products with URLs like `https://source.coop/account-name/data-product-name/subdirectory/`. -### Data Product Features +#### Data Product Features + - **Public Access**: All data products are publicly accessible via HTTP - **Unlisted Data Products**: While all data products are publicly accessible, users can opt to leave them unlisted to prevent them from appearing in search results or lists of data products on the Source website - **Web Interface**: Each data product has a dedicated web page for browsing and exploring - **Direct HTTP Access**: Every object can be accessed directly via its URL -### Future Data Product Features +#### Future Data Product Features + - **Restricted Access**: The ability to restrict data product access based on identity and access rules - **Data Product Monetization**: The ability to charge for access to data products - **Versioning**: Support for tracking changes made to data products -## Roles and Permissions - -**Account Roles**: Individuals can have different roles within organizational accounts: -- **Owner**: Full control over the organization and all its data products -- **Administrator**: Can manage data products and organization members -- **Member**: Can contribute to the organization's data products based on assigned permissions - -**Data Product Roles**: Access to individual data products can be managed separately: -- **Owner**: Full control over the data product, including deletion and access management -- **Contributor**: Can upload and modify data within the product -- **Viewer**: Can view and download data (relevant for restricted access data products) - -## Objects +### Objects Objects are the individual files or data items stored within data products. They represent the actual data that users want to access and analyze. 
-### Object Characteristics +#### Object Characteristics In Source: + - **File Types**: Objects can be any file type (GeoTIFF, CSV, Parquet, JSON, NetCDF, PMTiles, images, documents, etc.) - **Size**: Objects can be any size, from kilobytes to terabytes (no practical limits) - **Organization**: Objects are organized using path prefixes to create virtual directory structures - **Direct Access**: Each object has a direct URL for access: `https://data.source.coop/account/data-product/path/to/file.ext` -### Object Storage vs File Storage +#### Object Storage vs File Storage Source uses object storage rather than traditional file storage. This has important implications: @@ -105,19 +73,7 @@ Source uses object storage rather than traditional file storage. This has import - **No version control**: Unlike Git-based systems, Source doesn't provide granular version control on individual objects. Source is designed for publishing "fully baked" data products - **Flat namespace**: While you can organize objects with path prefixes, the underlying storage is flat rather than hierarchical -## Visibility Options - -When creating a data product, you can set one of three visibility levels: - -**Public**: Anyone can discover and access your data product without authentication. This is ideal for open data that you want to share broadly with the research community and public. Most data products on Source are public. - -**Internal**: Only members of your organization can access the data product. Useful for data that should be shared within your team but not with external users. - -**Private**: Only specific individuals you grant access to can view and download the data. Best for sensitive data, work-in-progress datasets, or data that requires explicit permission for access. - -Note: Visibility can be changed at any time, allowing you to develop datasets privately before making them public. 
- -## Object Previewers +### Object Previewers Source provides built-in preview functionality for common data formats directly in the browser. This allows users to visualize and explore data before downloading. When you navigate to an individual file in Source, you'll see a preview along with metadata rather than immediately downloading the file. @@ -130,7 +86,7 @@ Currently supported preview formats include: The preview system is extensible, and the community can propose solutions for additional file formats as needs arise. -## Data Access Methods +### Data Access Methods Source provides multiple ways to access data: @@ -140,7 +96,55 @@ Source provides multiple ways to access data: 4. **SDKs**: Use AWS SDKs (boto3 for Python, AWS SDK for JavaScript, etc.) with Source's endpoint 5. **Direct Cloud Access**: Authenticated users can generate credentials to access data directly from the underlying cloud storage -## Tags and Discoverability +### Tags and Discoverability + +## Accounts + +### Individual Accounts + +When you create an account on Source, you get a personal namespace where you can publish and manage your own data products. Your individual account is identified by your username (e.g., `username`). This username becomes part of your data product URLs. + +#### Individual Account Benefits + +- **Easy Publishing**: Simple tools for publishing data without technical infrastructure +- **Professional Presence**: Establish a professional presence in the data community +- **Attribution**: Clear attribution for data contributions and ownership +- **Collaboration**: Connect with other data publishers and users +- **Impact**: Increase the visibility and impact of research and data work + +### Organizational Accounts + +Organizations can create shared accounts that multiple individuals can manage collaboratively. Organizational accounts have their own namespace and allow teams to publish data under a shared identity. 
For example, the Harvard Library Innovation Lab publishes data under the `harvard-lil` namespace. + +#### Organization Features + +- **Branded Presence**: Organizations have branded profile pages +- **Team Management**: Multiple individuals can contribute to organization data products +- **Governance**: Organizational policies and standards can be applied to data publishing +- **Analytics**: Insights into data usage and impact + +#### Organization Benefits + +- **Data Strategy**: Implement a comprehensive data publishing strategy +- **Compliance**: Meet open data requirements and transparency goals +- **Engagement**: Increase engagement with stakeholders and the public +- **Efficiency**: Streamline data sharing processes without maintaining custom infrastructure + +#### Roles and Permissions + +**Account Roles**: Individuals can have different roles within organizational accounts: + +- **Owner**: Full control over the organization and all its data products +- **Administrator**: Can manage data products and organization members +- **Member**: Can contribute to the organization's data products based on assigned permissions + +**Data Product Roles**: Access to individual data products can be managed separately: + +- **Owner**: Full control over the data product, including deletion and access management +- **Contributor**: Can upload and modify data within the product +- **Viewer**: Can view and download data (relevant for restricted access data products) + + Data products can be tagged with relevant keywords to improve discoverability. Common tags include: - Data types: `vector`, `raster`, `tabular` @@ -153,13 +157,17 @@ Tags help users find relevant datasets through search and browsing, and improve ## Key Principles ### Open Access + All data published through Source is publicly accessible, promoting transparency and collaboration. ### Simplicity + Source eliminates the complexity of building custom data portals or APIs, making data publishing accessible to everyone. 
### Standards-Based + Source leverages existing web standards and object storage protocols, ensuring compatibility and longevity. ### Community-Driven -Source fosters a community of data publishers and users, creating opportunities for collaboration and knowledge sharing. + +Source fosters a community of data publishers and users, creating opportunities for collaboration and knowledge sharing. \ No newline at end of file diff --git a/docs/about-source/what-is-source.md b/docs/about-source/what-is-source.md index 67ba958..7437af4 100644 --- a/docs/about-source/what-is-source.md +++ b/docs/about-source/what-is-source.md @@ -25,20 +25,20 @@ Source Cooperative is a data publishing utility for the web that allows trusted Organizations already using Source include: -- **Bridges to Prosperity** uses Source to enable AI-powered global water mapping, tripling the known coverage of mapped waterways worldwide -- **Earth Genome** shares 60+ terabytes of processed satellite imagery and 3.5 billion vector embeddings through Source -- **Dynamical.org** provides fast, easy access to weather data, serving 13,000 unique visitors and 31.3 million API requests -- **Auspatious** publishes cloud-optimized geospatial datasets, making high-resolution data accessible without requiring large downloads +- **[Bridges to Prosperity](/case-studies/bridges-to-prosperity.md)** uses Source to enable AI-powered global water mapping, tripling the known coverage of mapped waterways worldwide +- **[Earth Genome](/case-studies/earth-genome.md)** shares 60+ terabytes of processed satellite imagery and 3.5 billion vector embeddings through Source +- **[Dynamical.org](/case-studies/dynamical.md)** provides fast, easy access to weather data, serving 13,000 unique visitors and 31.3 million API requests +- **[Auspatious](/case-studies/auspatious.md)** publishes cloud-optimized geospatial datasets, making high-resolution data accessible without requiring large downloads ## Current Status Source is currently 
in beta. While all data hosted in Source is available to the public, publishing data requires applying to be a beta tester. To apply, visit [the beta tester application form](https://forms.gle/4weS1hkRjZhQLoPE9). Source currently: + - Hosts over 1 petabyte of data - Serves approximately 500 terabytes of data transfer per month - Logs an average of 126 million data requests per month - Supports over 300 data products from 66+ organizations -Source is funded by the Navigation Fund's Open Science Initiative, with in-kind support from AWS and Azure for data hosting. - +Source is funded by Taylor Geospatial, with in-kind support from AWS and Azure for data hosting. \ No newline at end of file diff --git a/docs/using-source/creating-an-account.md b/docs/using-source/creating-an-account.md index 3e8862e..5734a19 100644 --- a/docs/using-source/creating-an-account.md +++ b/docs/using-source/creating-an-account.md @@ -1,16 +1,14 @@ --- -title: Creating an Account on Source Cooperative +title: Creating an Account id: create-an-account slug: /create-an-account -sidebar_position: 2 -draft: true +sidebar_position: 1 --- Getting started with Source Cooperative requires creating an account. Currently, Source is in beta, so publishing data requires being accepted as a beta tester. -## Prerequisites - Before creating an account, be aware that: + - **Source is currently in beta**: All data hosted in Source is publicly accessible, but publishing data requires applying to be a beta tester - **Beta tester application**: To publish data to Source, apply at [https://forms.gle/4weS1hkRjZhQLoPE9](https://forms.gle/4weS1hkRjZhQLoPE9) - **Browsing and downloading**: You can browse and download public data on Source without an account @@ -22,6 +20,7 @@ Visit [https://source.coop](https://source.coop) and click "Log In / Register" i ## Step 2: Choose Your Sign-Up Method Source offers authentication through third-party providers. 
Select your preferred authentication method and follow the sign-in flow. Common options may include: + - GitHub authentication - Google authentication - Other OAuth providers @@ -29,6 +28,7 @@ Source offers authentication through third-party providers. Select your preferre ## Step 3: Set Up Your Profile After authentication, you'll be able to: + - Set your username (this becomes your namespace: `username/`) - Add a display name - Provide biographical information (optional) @@ -48,13 +48,7 @@ If you need to publish data under an organizational identity rather than your pe 1. First create your individual account 2. Contact the Source team at [hello@source.coop](mailto:hello@source.coop) to request creation of an organizational account -3. Provide: - - Desired organization name and ID - - Organization description - - Initial organization members and their roles -4. The Source team will create the organizational account and grant you access - -Note: Self-service organizational account creation is on the roadmap but not yet available. +3. Once granted permission, you can create the organizational account at source.coop. 
## Tips for Account Setup @@ -74,25 +68,26 @@ Once you have an account and beta access: - **Generate access credentials**: Get AWS CLI credentials for programmatic data access - **Manage your data**: Update, organize, and document your data products - **Collaborate**: Add team members to organizational accounts -- **Track usage**: Monitor how your data products are being accessed (feature in development) ## Data Access Without an Account You don't need an account to: + - Browse public data products on Source - Download data through your web browser - Access data via direct HTTP URLs - View data previews However, you will need an account for: + - Accessing data programmatically via AWS CLI - Generating S3-compatible access credentials - Publishing your own data products -- Accessing private or restricted data products (if granted permission) ## Next Steps After creating your account: + 1. Explore existing data products to see examples 2. Review the [Core Concepts](core-concepts.md) to understand how Source works 3. 
If approved as a beta tester, proceed to [Publishing a Data Product](publishing-a-data-product.md) diff --git a/docs/using-source/data-upload.md b/docs/using-source/data-upload.md index e5e18f8..5db475f 100644 --- a/docs/using-source/data-upload.md +++ b/docs/using-source/data-upload.md @@ -1,5 +1,5 @@ --- -title: How to Upload Your Data to Source Cooperative +title: Upload Your Data id: data-upload slug: /data-upload sidebar_position: 2 From 707ecdc69410536cae4747bc8f81f7d76ece360c Mon Sep 17 00:00:00 2001 From: PowerChell Date: Mon, 9 Feb 2026 23:01:50 -0700 Subject: [PATCH 3/4] more cleanup --- docs/about-source/core-concepts.md | 5 ++- docs/using-source/create-a-data-product.md | 30 +++++++++++++ ...ing-an-account.md => create-an-account.md} | 13 +++--- docs/using-source/data-proxy.md | 2 +- docs/using-source/data-upload.md | 2 + docs/using-source/web-ui.md | 42 ------------------- sidebars.ts | 3 +- 7 files changed, 44 insertions(+), 53 deletions(-) create mode 100644 docs/using-source/create-a-data-product.md rename docs/using-source/{creating-an-account.md => create-an-account.md} (86%) delete mode 100644 docs/using-source/web-ui.md diff --git a/docs/about-source/core-concepts.md b/docs/about-source/core-concepts.md index 60697f4..8b8b733 100644 --- a/docs/about-source/core-concepts.md +++ b/docs/about-source/core-concepts.md @@ -28,7 +28,7 @@ A data product is a collection of related data files with associated metadata an - **Objects**: The actual data files stored in the data product (any number of files) - **Visibility settings**: Controlling who can access the data -Data products are built entirely on cloud object storage, which allows Source to host very large volumes of data. While platforms like GitHub limit repositories to around 5GB, Source data products can be hundreds of terabytes. For example, the RapidAI4EO dataset on Source is over 100TB. 
+Data products are built entirely on cloud object storage, which allows Source to host very large volumes of data. While platforms like GitHub limit project size to around 5GB, Source data products can be hundreds of terabytes. For example, the RapidAI4EO dataset on Source is over 100TB. #### Data Product URLs @@ -52,6 +52,8 @@ Everything in Source is designed to be linkable. You can navigate deeper into da - **Data Product Monetization**: The ability to charge for access to data products - **Versioning**: Support for tracking changes made to data products +For how to create and manage data products in the web interface, see [Create a Data Product](/create-a-data-product). + ### Objects Objects are the individual files or data items stored within data products. They represent the actual data that users want to access and analyze. @@ -78,6 +80,7 @@ Source uses object storage rather than traditional file storage. This has import Source provides built-in preview functionality for common data formats directly in the browser. This allows users to visualize and explore data before downloading. When you navigate to an individual file in Source, you'll see a preview along with metadata rather than immediately downloading the file. Currently supported preview formats include: + - **Geospatial vector data**: PMTiles - **Cloud-optimized rasters**: Cloud Optimized GeoTIFFs (COG) - **Vector data**: GeoJSON, FlatGeobuf diff --git a/docs/using-source/create-a-data-product.md b/docs/using-source/create-a-data-product.md new file mode 100644 index 0000000..d459f45 --- /dev/null +++ b/docs/using-source/create-a-data-product.md @@ -0,0 +1,30 @@ +--- +title: Create a Data Product +id: create-a-data-product +slug: /create-a-data-product +sidebar_position: 2 +--- + +To create a data product, you need an account and [beta access](/create-an-account). 
After approval, sign out and sign back in—the option to create a new data product will then appear in the dropdown at the top right of the navigation bar. + +Data products can be owned by an organization or an individual. Organization creation and management is currently disabled, so new data products are created under your individual account. + +## When creating a data product + +- **Identifier**: 3–39 characters, alphanumeric and hyphens only (A–Z, 0–9, -). No consecutive hyphens, and it cannot start or end with a hyphen. The identifier appears in the URL. +- **Title**: Maximum 200 characters. +- **Description**: Optional; maximum 500 characters. Use it for a short overview; put detailed documentation in the README. +- **Tags**: Comma-separated, up to 20 tags. They help others discover your data. +- **Visibility**: New data products are created **Unlisted** (not shown in search). When ready to publish, open the data product page, click **Edit** in the sidebar, and set the state to **Listed**. + +## Editing a data product + +To change the title, description, tags, or visibility later, open your data product page and click **Edit** in the sidebar. + +## README and documentation + +The landing page for a data product renders a `README.md` file from the root of the product. You can use standard markdown. Include contact information so users know who to reach for support. If your README does not appear after uploading, check that the file is at the root and allow a few minutes for the cache to update. + +## Next steps + +After creating your data product, see [Upload Your Data](/data-upload) to add files. 
diff --git a/docs/using-source/creating-an-account.md b/docs/using-source/create-an-account.md similarity index 86% rename from docs/using-source/creating-an-account.md rename to docs/using-source/create-an-account.md index 5734a19..cf4ba20 100644 --- a/docs/using-source/creating-an-account.md +++ b/docs/using-source/create-an-account.md @@ -1,5 +1,5 @@ --- -title: Creating an Account +title: Create an Account id: create-an-account slug: /create-an-account sidebar_position: 1 @@ -10,7 +10,7 @@ Getting started with Source Cooperative requires creating an account. Currently, Before creating an account, be aware that: - **Source is currently in beta**: All data hosted in Source is publicly accessible, but publishing data requires applying to be a beta tester -- **Beta tester application**: To publish data to Source, apply at [https://forms.gle/4weS1hkRjZhQLoPE9](https://forms.gle/4weS1hkRjZhQLoPE9) +- **Beta tester application**: To publish data on Source, apply at [https://forms.gle/4weS1hkRjZhQLoPE9](https://forms.gle/4weS1hkRjZhQLoPE9) - **Browsing and downloading**: You can browse and download public data on Source without an account ## Step 1: Navigate to Source Cooperative @@ -40,7 +40,7 @@ If you want to publish data on Source: 1. Complete the beta tester application at [https://forms.gle/4weS1hkRjZhQLoPE9](https://forms.gle/4weS1hkRjZhQLoPE9) 2. Wait for approval from the Radiant Earth team -3. Once approved, you'll be able to create data products and upload data +3. Once approved, sign out and sign back in so the option to create a new data product appears in the top-right menu. 
You can then create data products and upload data ## Creating an Organizational Account @@ -64,7 +64,7 @@ If you need to publish data under an organizational identity rather than your pe Once you have an account and beta access: -- **Publish data products**: Create repositories to host your datasets +- **Publish data products**: Create data products to host your datasets - **Generate access credentials**: Get AWS CLI credentials for programmatic data access - **Manage your data**: Update, organize, and document your data products - **Collaborate**: Add team members to organizational accounts @@ -89,6 +89,5 @@ However, you will need an account for: After creating your account: 1. Explore existing data products to see examples -2. Review the [Core Concepts](core-concepts.md) to understand how Source works -3. If approved as a beta tester, proceed to [Publishing a Data Product](publishing-a-data-product.md) -4. Learn about [uploading data](uploading-data.md) to your repositories \ No newline at end of file +2. Review [Core Concepts](/core-concepts) to understand how Source works +3. If approved as a beta tester, follow [Create a Data Product](/create-a-data-product), then learn about [uploading your data](/data-upload) diff --git a/docs/using-source/data-proxy.md b/docs/using-source/data-proxy.md index 47427f6..4e5ff06 100644 --- a/docs/using-source/data-proxy.md +++ b/docs/using-source/data-proxy.md @@ -1,5 +1,5 @@ --- -title: Accessing Data Through the Source Data Proxy +title: Access Data Through the Source Data Proxy id: data-proxy slug: /data-proxy sidebar_position: 3 diff --git a/docs/using-source/data-upload.md b/docs/using-source/data-upload.md index 5db475f..b611a97 100644 --- a/docs/using-source/data-upload.md +++ b/docs/using-source/data-upload.md @@ -8,6 +8,8 @@ sidebar_position: 2 This guide explains how to deliver your data to Source Cooperative in a secure and simple way. 
It is written for data providers and does not require deep Amazon Web Service (AWS) knowledge. +If you do not see the option to upload (for example, Edit Mode or View Credentials on your product page), contact [hello@source.coop](mailto:hello@source.coop) to request upload access. + --- ## The short version (what you need to do) diff --git a/docs/using-source/web-ui.md b/docs/using-source/web-ui.md deleted file mode 100644 index 41fc89c..0000000 --- a/docs/using-source/web-ui.md +++ /dev/null @@ -1,42 +0,0 @@ ---- -title: Web Interface -id: web-ui -slug: /web-ui -sidebar_position: 2 ---- - -## Repositories - -Repositories are where data products live on Source. Currently, new data products will be provisioned a path within an Amazon S3 bucket for you will be given credentials that allow you to upload your data. We recommend that you upload a README markdown file that contains detailed documentation about the data product and contact information that data users can use if they need support for the data product. - -## Creating a Repository - -To create a repository, you must first create an account on Source Cooperative at [https://auth.source.coop/ui/login](https://auth.source.coop/ui/login) and have approval from us to create new repositories after having been approved for [beta access](https://forms.gle/4weS1hkRjZhQLoPE9). - -Once you have received confirmation that your account has been enabled for repository creation, sign out of your Source Cooperative account and sign back in. You should now have a "New Repository" button in the dropdown menu at the top right of the navigation bar. - -Repositories may be associated with either an Organization or Individual. Repositories created within organizations allow all members of the organization to generate read / write credentials for the underlying S3 bucket. Currently, Organization creation and management is disabled so repositories must be created under your individual account. 
We will enable this feature when development of our Organization management feature is complete. - -Repository IDs appear in the URL for your repository. Repository IDs must be between 3 and 39 characters in length, contain only alphanumeric characters as well as hyphens (A-Z, 0-9, -), may not contain consecutive hyphens, and may not begin or end with a hyphen. - -The title for your repository must be less than 200 characters in length. - -The description for your repository, if provided, must be less than 500 characters in length. Repository descriptions should be used to give a quick overview of the data product and more information should be provided in the README. - -Repository tags allow users to filter for specific data products matching their needs. Tags must be comma separated and you may not have more than 20 tags. - -Your repository will be created in the "Unlisted" state. This state means that your repository will not appear in the search results but users with direct links to the repository page will be able to view it. This is useful for when you have not uploaded your entire data product yet or it is not ready for release yet. Once your repository is ready to be published, you can set the state to "Listed" in the repository edit page. - -### Uploading Data - -Please write to hello@source.coop if you do not already have the ability to upload data to Source. - -### Editing Repositories - -After your repository has been created, you may edit the title, description, and tags by navigating to your repository's page and clicking the "Edit" link in the sidebar. You may also change the state of your repository from "Unlisted" to "Listed" here or vice versa. - -### README Markdown Files - -The landing page for your repository attempts to render a `README.md` located at the root of your bucket. If no `README.md` file is found, a message will be displayed indicating so. 
If you have uploaded a `README.md` file and this message is showing, check that you have uploaded it to the correct location and wait a few minutes for the cache to update. - -You may put whatever content you wish in this document to describe your data product. It's also recommended to include contact information so that if users have questions they know who to reach out to. Standard markup syntax is supported in this file. \ No newline at end of file diff --git a/sidebars.ts b/sidebars.ts index 919c5e9..6bbf47f 100644 --- a/sidebars.ts +++ b/sidebars.ts @@ -28,10 +28,9 @@ const sidebars: SidebarsConfig = { label: 'Using Source', items: [ 'using-source/create-an-account', + 'using-source/create-a-data-product', 'using-source/data-upload', - 'using-source/web-ui', 'using-source/data-proxy', - ], }, { From 7498e904f183c4fec1dba356965354325f1f8d0a Mon Sep 17 00:00:00 2001 From: PowerChell Date: Tue, 10 Feb 2026 20:48:16 -0700 Subject: [PATCH 4/4] a few fixes --- docs/about-source/core-concepts.md | 23 +++++++++++----------- docs/using-source/create-a-data-product.md | 2 +- docs/using-source/create-an-account.md | 12 +++++------ 3 files changed, 18 insertions(+), 19 deletions(-) diff --git a/docs/about-source/core-concepts.md b/docs/about-source/core-concepts.md index 8b8b733..4cb4b11 100644 --- a/docs/about-source/core-concepts.md +++ b/docs/about-source/core-concepts.md @@ -81,11 +81,12 @@ Source provides built-in preview functionality for common data formats directly Currently supported preview formats include: -- **Geospatial vector data**: PMTiles +- **Geospatial vector tiles**: PMTiles - **Cloud-optimized rasters**: Cloud Optimized GeoTIFFs (COG) - **Vector data**: GeoJSON, FlatGeobuf - **Tabular data**: CSV, Parquet - **Metadata and documentation**: JSON, XML, Markdown +- **3D data**: 3D data files (e.g. Harvard Smithsonian data archive) The preview system is extensible, and the community can propose solutions for additional file formats as needs arise. 
@@ -101,11 +102,19 @@ Source provides multiple ways to access data: ### Tags and Discoverability +Data products can be tagged with relevant keywords to improve discoverability. Common tags include: +- Data types: `vector`, `raster`, `tabular` +- Themes: `agriculture`, `climate`, `conservation`, `land cover` +- Formats: `geoparquet`, `cog`, `pmtiles`, `netcdf` +- Applications: `machine learning`, `segmentation`, `time series` + +Tags help users find relevant datasets through search and browsing, and improve Source's visibility in search engines. + ## Accounts ### Individual Accounts -When you create an account on Source, you get a personal namespace where you can publish and manage your own data products. Your individual account is identified by your username (e.g., `username`). This username becomes part of your data product URLs. +When you create an account on Source, you get a personal namespace where you can publish and manage your own data products. Your individual account is identified by your username (e.g., `source.coop/cholmes`). This username becomes part of your data product URLs. #### Individual Account Benefits @@ -147,16 +156,6 @@ Organizations can create shared accounts that multiple individuals can manage co - **Contributor**: Can upload and modify data within the product - **Viewer**: Can view and download data (relevant for restricted access data products) - - -Data products can be tagged with relevant keywords to improve discoverability. Common tags include: -- Data types: `vector`, `raster`, `tabular` -- Themes: `agriculture`, `climate`, `conservation`, `land cover` -- Formats: `geoparquet`, `cog`, `pmtiles`, `netcdf` -- Applications: `machine learning`, `segmentation`, `time series` - -Tags help users find relevant datasets through search and browsing, and improve Source's visibility in search engines. 
- ## Key Principles ### Open Access diff --git a/docs/using-source/create-a-data-product.md b/docs/using-source/create-a-data-product.md index d459f45..74e5567 100644 --- a/docs/using-source/create-a-data-product.md +++ b/docs/using-source/create-a-data-product.md @@ -7,7 +7,7 @@ sidebar_position: 2 To create a data product, you need an account and [beta access](/create-an-account). After approval, sign out and sign back in—the option to create a new data product will then appear in the dropdown at the top right of the navigation bar. -Data products can be owned by an organization or an individual. Organization creation and management is currently disabled, so new data products are created under your individual account. +Data products can be owned by an organization or an individual. When creating a data product, use the dropdown to choose who is displayed as its host (you or one of your organizations). ## When creating a data product diff --git a/docs/using-source/create-an-account.md b/docs/using-source/create-an-account.md index cf4ba20..fbac3ba 100644 --- a/docs/using-source/create-an-account.md +++ b/docs/using-source/create-an-account.md @@ -5,12 +5,12 @@ slug: /create-an-account sidebar_position: 1 --- -Getting started with Source Cooperative requires creating an account. Currently, Source is in beta, so publishing data requires being accepted as a beta tester. +Getting started with sharing data on Source Cooperative requires creating an account. Currently, Source is in beta, so publishing data requires being accepted as a beta tester.
Before creating an account, be aware that: - **Source is currently in beta**: All data hosted in Source is publicly accessible, but publishing data requires applying to be a beta tester -- **Beta tester application**: To publish data on Source, apply at [https://forms.gle/4weS1hkRjZhQLoPE9](https://forms.gle/4weS1hkRjZhQLoPE9) +- **Beta tester application**: To publish data on Source, apply via [this form](https://forms.gle/4weS1hkRjZhQLoPE9) - **Browsing and downloading**: You can browse and download public data on Source without an account ## Step 1: Navigate to Source Cooperative @@ -32,15 +32,15 @@ After authentication, you'll be able to: - Set your username (this becomes your namespace: `username/`) - Add a display name - Provide biographical information (optional) -- Add profile links and additional information +- Add a profile photo, links, and additional information ## Step 4: Apply as a Beta Tester (For Data Publishers) If you want to publish data on Source: -1. Complete the beta tester application at [https://forms.gle/4weS1hkRjZhQLoPE9](https://forms.gle/4weS1hkRjZhQLoPE9) -2. Wait for approval from the Radiant Earth team -3. Once approved, sign out and sign back in so the option to create a new data product appears in the top-right menu. You can then create data products and upload data +1. Complete the beta tester application via [this form](https://forms.gle/4weS1hkRjZhQLoPE9) +2. Wait for approval from the Source Cooperative team +3. Once approved, refresh the page or sign out and sign back in so the option to create a new data product appears in the top-right menu. You can then create data products and upload data ## Creating an Organizational Account