feat: add dataset sharing, public preview page, and import #136

Open
balasiddarthan22 wants to merge 8 commits into tinyfish-io:main from balasiddarthan22:feat/share-dataset

Conversation

@balasiddarthan22 balasiddarthan22 commented Jun 9, 2026

Summary

  • Adds a Share button to the dataset page header so owners can toggle a dataset between public and private
  • Introduces a public /share/[id] preview page showing the dataset name, description, columns, and a live 5-row data preview with no login required
  • Adds an Import button to the dashboard where anyone can paste a share link, preview the schema, and clone it into their account
  • Supports cross-instance import: paste a share link from any BigSet instance, fetch the schema live, and import it in one click
  • Share links embed the schema as a base64 fallback so imports work even if the source instance is offline
  • Adds a backend /share/:id endpoint with CORS so any BigSet instance can fetch dataset metadata
  • Schema inference now extracts a suggested row count from the prompt (e.g. "top 10") to pre-fill the max rows field in the wizard
  • Adds Open Graph meta tags on /share/[id] so links preview richly on WhatsApp, Telegram, Slack, etc.
  • Guards against importing your own dataset, checks quota before import, and surfaces errors inline
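The base64 fallback described in the summary could work roughly as follows. This is a minimal sketch, not the PR's code: only the `?schema=` parameter name comes from the commit notes, while `SharedSchema`, `buildShareLink`, `readEmbeddedSchema`, and the choice to URI-encode the JSON before base64 are assumptions made for illustration.

```typescript
// Hypothetical schema shape; the column fields mirror those named in the review
// (name, type, description, isPrimaryKey) but are not copied from the PR.
interface SharedColumn {
  name: string;
  type: string;
  description?: string;
  isPrimaryKey?: boolean;
}

interface SharedSchema {
  name: string;
  description: string;
  columns: SharedColumn[];
}

// Build a share link that carries the schema in a ?schema= query parameter.
function buildShareLink(origin: string, id: string, schema: SharedSchema): string {
  const url = new URL(`/share/${id}`, origin);
  // encodeURIComponent keeps the input ASCII-only, which btoa requires.
  url.searchParams.set("schema", btoa(encodeURIComponent(JSON.stringify(schema))));
  return url.toString();
}

// Recover the embedded schema, or null if the link has none or it is malformed.
function readEmbeddedSchema(link: string): SharedSchema | null {
  try {
    const encoded = new URL(link).searchParams.get("schema");
    if (!encoded) return null;
    return JSON.parse(decodeURIComponent(atob(encoded))) as SharedSchema;
  } catch {
    return null;
  }
}
```

With this shape, an importer can try the live endpoint first and fall back to `readEmbeddedSchema(link)` when the source instance does not respond. A real implementation would also validate the decoded payload, since anyone can edit a link by hand.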

Test plan

  • Click Import on dashboard, paste a share link from a different origin, confirm schema preview loads cross-instance
  • Open the share link in incognito and confirm "Sign in to add this dataset" appears for unauthenticated users
  • Open a dataset, click Share, toggle to Public and confirm the share link appears with Copy working
  • Type "top 10 companies" in the new dataset wizard and confirm Max rows pre-fills to 10
  • Paste the /share/[id] link in a new tab and confirm the preview page loads with columns and row data
  • Click Add to my BigSet and confirm a new dataset is created and you are redirected to it
  • As the owner visiting your own share link, confirm it shows "View your dataset" instead of the import CTA
  • Toggle a dataset back to Private and confirm the share page shows "Dataset not found"
  • Try importing your own public dataset and confirm it shows an error instead of creating a duplicate
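The PR text does not show how the suggested row count is derived; the commit notes only mention a `suggested_row_count` value coming out of schema inference. To illustrate the behaviour that the "top 10 companies" check expects, a simple regex heuristic might look like this. The function name `suggestRowCount` and the keyword list are hypothetical.

```typescript
// Hypothetical heuristic: pull a count out of phrases like "top 10" or "first 25".
// Returns null when the prompt gives no usable hint.
function suggestRowCount(prompt: string): number | null {
  const match = /\b(?:top|first)\s+(\d{1,6})\b/i.exec(prompt);
  if (!match) return null;
  const count = Number.parseInt(match[1] ?? "", 10);
  // Zero (or an unparsable value) is not a useful default for the Max rows field.
  return count > 0 ? count : null;
}
```

The wizard would then pre-fill Max rows only when the result is non-null and leave its existing default otherwise.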

coderabbitai Bot commented Jun 9, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR adds public dataset sharing and import flows: a metadata fetch utility, public /share preview pages and API, dashboard Import modal, dataset-page ShareModal and "Add to my BigSet" flow, Convex mutations for visibility/import with quota checks, backend share endpoints, middleware change to allow /share/* routes without authentication, and docker-compose dev env updates for internal service URLs.

Sequence Diagram

sequenceDiagram
  participant User
  participant Dashboard
  participant SharePage
  participant ConvexAPI
  participant Backend
  participant Database

  User->>Dashboard: Click Import
  Dashboard->>Dashboard: Open ImportModal
  User->>Dashboard: Enter share URL
  Dashboard->>ConvexAPI: datasets.get(id) for preview (same-instance)
  Dashboard->>Backend: GET /api/share/:id (cross-instance preview)
  ConvexAPI->>Database: fetch dataset metadata
  Database-->>ConvexAPI: dataset + visibility
  ConvexAPI-->>Dashboard: preview data
  User->>Dashboard: Click Import
  Dashboard->>ConvexAPI: importDataset(sourceId) or importDatasetFromSchema(schema)
  ConvexAPI->>Database: validate quota/owner & insert new dataset
  Database-->>ConvexAPI: newId
  ConvexAPI-->>Dashboard: return newId
  Dashboard->>User: navigate to /dataset/[newId]

  User->>SharePage: Open shared link
  SharePage->>ConvexAPI: datasets.get(id) + datasetRows.listByDataset
  ConvexAPI->>Database: fetch metadata + rows
  Database-->>ConvexAPI: dataset + preview rows
  ConvexAPI-->>SharePage: data
  User->>SharePage: Click "Add to my BigSet"
  SharePage->>ConvexAPI: importDataset(sourceId)
  ConvexAPI->>Database: insert new dataset
  Database-->>ConvexAPI: newId
  ConvexAPI-->>SharePage: return newId
  SharePage->>User: navigate to /dataset/[newId]
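The "validate quota/owner" step in the diagram can be pictured as a pure check that runs before the insert. This is a sketch under stated assumptions rather than the actual Convex mutation: the types, the `checkImport` name, the reason codes, and the `datasetCount`/`datasetLimit` parameters are all hypothetical.

```typescript
type Visibility = "public" | "private";

// Hypothetical view of the source dataset as seen by the import mutation.
interface SourceDataset {
  ownerId: string;
  visibility: Visibility;
}

type ImportCheck =
  | { ok: true }
  | { ok: false; reason: "not_found" | "own_dataset" | "quota_exceeded" };

function checkImport(
  source: SourceDataset | null,
  userId: string,
  datasetCount: number,
  datasetLimit: number,
): ImportCheck {
  // Private datasets are reported the same way as missing ones,
  // matching the "Dataset not found" behaviour in the test plan.
  if (source === null || source.visibility !== "public") return { ok: false, reason: "not_found" };
  // Importing your own dataset would only create a duplicate.
  if (source.ownerId === userId) return { ok: false, reason: "own_dataset" };
  // The quota is checked before anything is written.
  if (datasetCount >= datasetLimit) return { ok: false, reason: "quota_exceeded" };
  return { ok: true };
}
```

Returning a reason code instead of throwing makes it straightforward to surface the error inline in the modal, as the summary describes.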

Possibly related PRs

  • tinyfish-io/bigset#9: Earlier middleware and Convex/dev wiring that this PR extends for /share routes and docker-compose Convex env.
  • tinyfish-io/bigset#30: Related quota system changes that importDataset now enforces.
  • tinyfish-io/bigset#15: Prior public-datasets and middleware routing work related to /share path handling.

Suggested reviewers

  • simantak-dabhade
  • manav-tf
  • hwennnn
🚥 Pre-merge checks: ✅ 4 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main changes: adding dataset sharing (public toggle), a public preview page, and import functionality. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description clearly describes the dataset sharing feature, public preview page, import functionality, and related guards, directly aligning with all significant changes in the changeset. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@frontend/app/dataset/`[id]/layout.tsx:
- Line 17: The ternary using `dataset.rowCount ? ... : null` drops valid zero
values; update the metadata expression in layout.tsx to check for null/undefined
explicitly (e.g., test `dataset.rowCount` against null/undefined) and return
`${dataset.rowCount} rows` when a numeric value including 0 is present,
otherwise return null; locate the expression that builds the metadata (the
`dataset.rowCount` ternary) and replace it with an explicit null/undefined check
so zero row counts are preserved.

In `@frontend/app/dataset/`[id]/page.tsx:
- Around line 970-974: The handleCopy function currently calls
navigator.clipboard.writeText(shareUrl) without awaiting or handling rejection;
make handleCopy async, await navigator.clipboard.writeText(shareUrl) inside a
try/catch, on success setCopied(true) and setTimeout to clear it, and on failure
set an error state or show user feedback (e.g., setCopyError or call a toast) so
users know the copy failed; optionally provide a fallback (e.g., open a prompt
with the shareUrl selected) in the catch block. Reference: handleCopy,
setCopied, shareUrl.

In `@frontend/lib/fetch-dataset-meta.ts`:
- Around line 15-20: The fetch call in fetch-dataset-meta.ts that posts to
`${convexUrl}/api/query` must include an abort timeout to avoid blocking page
rendering; update the fetch invocation inside the function (the const res =
await fetch(...)) to pass a signal (use AbortSignal.timeout(5000) or an
AbortController with a chosen ms) and wrap the fetch in try/catch so that when
it throws an AbortError or times out you return null (treat timeout as a null
result) instead of letting the error propagate; ensure you still handle other
errors appropriately and keep the existing next: { revalidate: 60 } option.
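The timeout handling requested for `fetch-dataset-meta.ts` might look like the sketch below. It is not the file's actual contents: the `datasets:get` function path, the `FetchLike` injection parameter (added so the function can be exercised without a network), and the response handling are assumptions, and the Next.js-specific `next: { revalidate: 60 }` option is left out so the snippet type-checks outside Next.js.

```typescript
type FetchLike = (input: string, init?: RequestInit) => Promise<Response>;

// Query Convex over HTTP, treating a timeout or abort as "no data" (null).
async function fetchDatasetMeta(
  convexUrl: string,
  id: string,
  fetchImpl: FetchLike = fetch,
  timeoutMs = 5000,
): Promise<unknown> {
  try {
    const res = await fetchImpl(`${convexUrl}/api/query`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // "datasets:get" is an assumed function path for illustration.
      body: JSON.stringify({ path: "datasets:get", args: { id }, format: "json" }),
      // Abort the request instead of blocking page rendering indefinitely.
      signal: AbortSignal.timeout(timeoutMs),
    });
    if (!res.ok) return null;
    const payload = (await res.json()) as { status?: string; value?: unknown };
    return payload.status === "success" ? (payload.value ?? null) : null;
  } catch (err) {
    const name =
      typeof err === "object" && err !== null && "name" in err
        ? String((err as { name: unknown }).name)
        : "";
    // Timeouts and aborts become a null result; anything else still propagates.
    if (name === "TimeoutError" || name === "AbortError") return null;
    throw err;
  }
}
```

Rethrowing non-timeout errors is one reading of "handle other errors appropriately"; a caller that prefers a silent fallback could return null there as well.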

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6bca3c71-b083-48d6-be7f-6173ae13bfd6

📥 Commits

Reviewing files that changed from the base of the PR and between 625866d and 71b1e66.

📒 Files selected for processing (10)
  • docker-compose.dev.yml
  • frontend/app/dashboard/page.tsx
  • frontend/app/dataset/[id]/layout.tsx
  • frontend/app/dataset/[id]/page.tsx
  • frontend/app/share/[id]/error.tsx
  • frontend/app/share/[id]/layout.tsx
  • frontend/app/share/[id]/page.tsx
  • frontend/convex/datasets.ts
  • frontend/lib/fetch-dataset-meta.ts
  • frontend/proxy.ts

Comment thread frontend/app/dataset/[id]/layout.tsx Outdated
Comment thread frontend/app/dataset/[id]/page.tsx Outdated
Comment thread frontend/lib/fetch-dataset-meta.ts
balasiddarthan22 and others added 2 commits June 9, 2026 21:25
- Fix rowCount falsy check: use != null so zero rows are not dropped from OG description
- Make handleCopy async with try/catch; show error if clipboard write fails
- Add AbortSignal.timeout(5000) to Convex metadata fetch to prevent blocking page render

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
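The first two fixes in this commit can be sketched as small helpers. These are illustrative only: `rowCountLabel` and `copyShareUrl` are hypothetical names, and the clipboard writer is passed in as a parameter so the logic does not depend on a browser.

```typescript
// `!= null` rejects both null and undefined but keeps 0,
// so an empty dataset still yields "0 rows".
function rowCountLabel(rowCount: number | null | undefined): string | null {
  return rowCount != null ? `${rowCount} rows` : null;
}

// Await the clipboard write and report success or failure instead of
// leaving a rejected promise unhandled.
async function copyShareUrl(
  shareUrl: string,
  writeText: (text: string) => Promise<void>,
): Promise<boolean> {
  try {
    await writeText(shareUrl);
    return true;
  } catch {
    return false;
  }
}
```

In the component, `handleCopy` could call `copyShareUrl(shareUrl, (text) => navigator.clipboard.writeText(text))` and then set either the copied state or an error state depending on the result.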
Add GET /api/share/[id] Next.js route that returns public dataset schema
as JSON with CORS * headers, so any BigSet instance can fetch it.

Add importDatasetFromSchema Convex mutation that creates a local dataset
from a name/description/columns payload (no source Convex ID required).

Update ImportModal to detect when a pasted URL belongs to a different
origin: fetches schema from the remote /api/share/:id endpoint and
uses importDatasetFromSchema to clone it locally. Same-origin imports
continue to use the existing Convex-based flow.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
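The origin check described in this commit might be structured as below. This is a sketch, not the ImportModal code: `planImport` and `ImportPlan` are hypothetical names, and only the `/share/:id` and `/api/share/:id` paths come from the PR text.

```typescript
type ImportPlan =
  | { kind: "same-origin"; datasetId: string }
  | { kind: "cross-origin"; datasetId: string; schemaUrl: string };

// Decide how to import a pasted share link, or return null if it is not one.
function planImport(pastedUrl: string, currentOrigin: string): ImportPlan | null {
  try {
    const url = new URL(pastedUrl.trim());
    const match = /^\/share\/([^/]+)\/?$/.exec(url.pathname);
    if (!match) return null;
    const datasetId = decodeURIComponent(match[1] ?? "");
    if (url.origin === currentOrigin) {
      // Same instance: the existing Convex-based import can use the ID directly.
      return { kind: "same-origin", datasetId };
    }
    // Different instance: fetch the schema from the remote public endpoint.
    return {
      kind: "cross-origin",
      datasetId,
      schemaUrl: `${url.origin}/api/share/${encodeURIComponent(datasetId)}`,
    };
  } catch {
    // Invalid URL or malformed percent-encoding.
    return null;
  }
}
```

A cross-origin plan would lead to `importDatasetFromSchema`, while a same-origin plan would keep using `importDataset` with the source ID.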

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@frontend/app/api/share/`[id]/route.ts:
- Around line 1-27: Move the API route out of the frontend tree: create a new
backend route that exports the same handlers (GET and OPTIONS) and re-use
fetchPublicDatasetMeta there; specifically, copy the logic in the GET handler
(await params, call fetchPublicDatasetMeta(id), return 404 or dataset with CORS
headers) and the OPTIONS handler to the backend service, remove this file from
frontend, and update the frontend code that used this API to call the new
backend endpoint URL instead of the local frontend route.
- Around line 11-15: The GET handler currently uses fetchPublicDatasetMeta which
performs a Convex fetch with next: { revalidate: 60 }, allowing cached public
metadata to be served after visibility changes; update the share-related reads
so they use a non-cached fetch by switching the fetch options to cache:
"no-store" (or removing next.revalidate) for calls used by
frontend/app/api/share/[id]/route.ts and frontend/app/share/[id]/layout.tsx
(generateMetadata) — e.g., modify fetchPublicDatasetMeta in
frontend/lib/fetch-dataset-meta.ts to accept an options flag to disable
revalidation or add an alternative function (e.g.,
fetchPublicDatasetMetaNoCache) that calls Convex/fetch with cache: "no-store",
then replace the usages in route.ts and generateMetadata to call the non-cached
variant so visibility revocations take effect immediately.

In `@frontend/convex/datasets.ts`:
- Around line 533-544: There is a duplicate declaration of the constant
columnValidator; remove the second declaration (the one inside the shown diff)
so the module uses the original columnValidator defined earlier, and verify any
downstream uses still reference the existing columnValidator and that its shape
(name, type union, description, isPrimaryKey) matches expected validation.
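One way to offer both a cached and a non-cached read, as the caching comment suggests, is to build the fetch options from a flag. This is a hedged sketch: `metaFetchInit` and `MetaFetchInit` are hypothetical names, and the `next` field is typed locally because it is a Next.js extension rather than part of the standard `RequestInit`.

```typescript
// Standard RequestInit plus the Next.js-specific `next` option.
type MetaFetchInit = RequestInit & { next?: { revalidate: number } };

// noCache = true is intended for share-related reads, so a dataset that
// was just made private is not served from a stale cache entry.
function metaFetchInit(body: string, noCache: boolean): MetaFetchInit {
  const base: RequestInit = {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body,
  };
  return noCache
    ? { ...base, cache: "no-store" as const }
    : { ...base, next: { revalidate: 60 } };
}
```

The share API route and `generateMetadata` would pass `noCache: true`, while callers that tolerate a minute of staleness keep the revalidating variant.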

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5d80f798-1c5a-438b-a0a1-01fb0641410c

📥 Commits

Reviewing files that changed from the base of the PR and between 675e227 and 1104417.

📒 Files selected for processing (4)
  • frontend/app/api/share/[id]/route.ts
  • frontend/app/dashboard/page.tsx
  • frontend/convex/datasets.ts
  • frontend/lib/fetch-dataset-meta.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • frontend/lib/fetch-dataset-meta.ts
  • frontend/app/dashboard/page.tsx

Comment thread frontend/app/api/share/[id]/route.ts
Comment thread frontend/app/api/share/[id]/route.ts Outdated
Comment thread frontend/convex/datasets.ts Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/src/index.ts (1)

11-11: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Import api from ./convex.js — current code won’t compile.

api.datasets.get is used on Line 689, but api is not imported.

Suggested fix
-import { convex, internal } from "./convex.js";
+import { convex, api, internal } from "./convex.js";

As per coding guidelines, backend/src/**/*.ts should import { convex, api, internal } from ./convex.js.

Also applies to: 689-689

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@backend/src/index.ts` at line 11, The top-level import only brings in convex
and internal but the file uses api (e.g., api.datasets.get around the code that
calls datasets.get), so update the import from "./convex.js" to also include
api; specifically modify the import statement that currently imports { convex,
internal } to import { convex, api, internal } so the api symbol is available
wherever api.datasets.get is invoked.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@backend/src/index.ts`:
- Around line 699-700: The catch block in the /share/:id route handler currently
maps all exceptions to a 404; update the error handling in that catch (the catch
for the /share/:id route handler) to only return 404 when you can positively
identify a true “not found” or permission error (e.g., a specific
ConvexNotFound/NotAuthorized indicator or a check that dataset is null/private),
and for all other unexpected/operational exceptions return 502 (or 500) and log
the full error; ensure you preserve existing behavior for legitimate
missing/private datasets while surfacing/recording real backend failures instead
of masking them as 404.

In `@frontend/app/api/share/`[id]/route.ts:
- Around line 18-25: The code currently maps any upstream failure to a 404;
update the response logic in the handler in frontend/app/api/share/[id]/route.ts
so that you only return NextResponse.json(..., { status: 404, headers: CORS })
when the fetched response status is actually 404 (res.status === 404), but for
any other non-ok upstream response return a 502 Bad Gateway (e.g.,
NextResponse.json({ error: "Upstream service error" }, { status: 502, headers:
CORS })), and change the catch block to return a 502 as well to surface
network/timeouts; keep using the same CORS headers and the same data flow
(res.json()) when res.ok is true.

---

Outside diff comments:
In `@backend/src/index.ts`:
- Line 11: The top-level import only brings in convex and internal but the file
uses api (e.g., api.datasets.get around the code that calls datasets.get), so
update the import from "./convex.js" to also include api; specifically modify
the import statement that currently imports { convex, internal } to import {
convex, api, internal } so the api symbol is available wherever api.datasets.get
is invoked.
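The status mapping requested for the frontend proxy can be expressed independently of Next.js. The sketch below is an assumption-laden illustration: `proxyShare`, `ProxyResult`, and the minimal `UpstreamResponse` shape are hypothetical, and the real route would wrap the result in `NextResponse.json` with the CORS headers.

```typescript
interface UpstreamResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

interface ProxyResult {
  status: number;
  body: unknown;
}

// Forward a successful upstream payload; return 404 only when the upstream
// actually said 404, and 502 for every other failure.
async function proxyShare(fetchUpstream: () => Promise<UpstreamResponse>): Promise<ProxyResult> {
  try {
    const res = await fetchUpstream();
    if (res.ok) return { status: 200, body: await res.json() };
    if (res.status === 404) return { status: 404, body: { error: "Dataset not found" } };
    return { status: 502, body: { error: "Upstream service error" } };
  } catch {
    // Network failures and timeouts are upstream problems, not missing datasets.
    return { status: 502, body: { error: "Upstream service error" } };
  }
}
```

Keeping 404 for genuinely missing or private datasets preserves the existing user-facing behaviour, while 502 makes real backend failures visible in logs and monitoring.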

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 5ce136c1-bf68-42c3-bb4d-d9dba98236da

📥 Commits

Reviewing files that changed from the base of the PR and between 31f937d and 22d8304.

📒 Files selected for processing (3)
  • backend/src/index.ts
  • docker-compose.dev.yml
  • frontend/app/api/share/[id]/route.ts
✅ Files skipped from review due to trivial changes (1)
  • docker-compose.dev.yml

Comment thread backend/src/index.ts Outdated
Comment thread frontend/app/api/share/[id]/route.ts Outdated
Add public GET /share/:id backend endpoint with CORS for cross-instance schema fetch
Add frontend proxy at /api/share/[id] that forwards to backend
Import modal detects cross-instance links and fetches schema preview from source
Share links embed schema as base64 ?schema= param for offline fallback
importDatasetFromSchema Convex mutation creates dataset from raw schema
Schema inference extracts suggested_row_count from prompt to pre-fill max rows
Add BACKEND_URL to docker-compose frontend env for internal proxy calls
Add localhost to allowedDevOrigins for local dev
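For the public endpoint with CORS, the response helpers might look like this. The sketch uses the standard Fetch API `Response` purely for illustration; the backend's actual framework is not shown in this conversation, and everything except the `*` origin (helper names, allowed methods and headers) is an assumption.

```typescript
// Wildcard origin so any BigSet instance can read public dataset metadata.
const CORS_HEADERS: Record<string, string> = {
  "Access-Control-Allow-Origin": "*",
  "Access-Control-Allow-Methods": "GET, OPTIONS",
  "Access-Control-Allow-Headers": "Content-Type",
};

// JSON response carrying the CORS headers, used for both success and errors.
function corsJson(body: unknown, status = 200): Response {
  return new Response(JSON.stringify(body), {
    status,
    headers: { ...CORS_HEADERS, "Content-Type": "application/json" },
  });
}

// Reply to the browser's OPTIONS preflight with no body.
function corsPreflight(): Response {
  return new Response(null, { status: 204, headers: CORS_HEADERS });
}
```

Error responses need the same headers as successful ones; otherwise a browser on another origin reports an opaque CORS failure instead of the 404 or 502.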
@balasiddarthan22

Author
Tinyfish.Bigset.demo.mp4

Demo of cross-instance dataset sharing and import: Person A shares a public dataset; Person B, on a separate BigSet instance, pastes the link, gets a live schema preview, and imports it in one click. Share links also embed the schema as a fallback, so imports work even if the source instance is offline.
