feat: add download guide pages for anvil-cmg #4758
Merged
11 commits
3fff405
feat: add download guide pages for anvil-cmg (#4757)
frano-m bc01f85
feat: add sidebar navigation to guide pages (#4757)
frano-m ca4510e
fix: add exact selectedMatch to guides overview nav item (#4757)
frano-m 4c801b3
fix: rename guides overview label to "About AnVIL Explorer" (#4757)
frano-m 556a261
fix: use portalURL variable for requesting data access link (#4757)
frano-m bb91ca9
feat: replace curl command page with file manifest download page (#4757)
frano-m 0e93892
feat: data download options option to file manifest (#4757)
frano-m a14704a
fix: change file manifest to data download (#4757)
frano-m c7d386d
feat: tsv file download paths (#4757)
frano-m 0c48aae
fix: return notFound from getContentStaticProps for proper 404 status…
frano-m ae8bf4e
fix: rename GUIDES nav item to ABOUT_ANVIL_EXPLORER (#4757)
frano-m
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| <Breadcrumbs | ||
| breadcrumbs={[ | ||
| { path: "/datasets", text: "AnVIL Data Explorer" }, | ||
| { path: "/guides", text: "Guides" }, | ||
| { path: "", text: "Data Download Options" }, | ||
| ]} | ||
| /> | ||
|
|
||
| # Data Download Options | ||
|
|
||
| There are several ways to download files for use on local, institutional, or other computational services. | ||
|
|
||
| With support from Amazon's Open Data Sponsorship Program, the AnVIL open-access datasets are available with no-cost egress and for use within the AWS environment. | ||
|
|
||
| Managed-access datasets can be downloaded from Google Cloud Platform on a requester-pays basis. For more information, refer to the ["Requesting Data Access"]({portalURL}/learn/find-data/requesting-data-access#requester-pays) document. | ||
|
|
||
| The following options are available through the AnVIL Data Explorer: | ||
|
|
||
| - **TSV File Manifest Downloads** | ||
| - Available for all datasets, including open and managed access datasets. | ||
| - Manifest can include one or more datasets based on the search criteria used in the AnVIL Data Explorer. | ||
|
|
||
| - **Data Download via curl** | ||
| - Available only for open-access datasets. | ||
| - `curl` command downloads can cover full datasets or only selected file types from one or more open-access datasets. | ||
|
|
||
| - **Individual File Download** | ||
| - Available only for files in open-access datasets. |
103 changes: 103 additions & 0 deletions
app/content/anvil-cmg/guides/data-download-via-curl.mdx
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,103 @@ | ||
| <Breadcrumbs | ||
| breadcrumbs={[ | ||
| { path: "/datasets", text: "AnVIL Data Explorer" }, | ||
| { path: "/guides", text: "Guides" }, | ||
| { path: "", text: "Data Download via curl" }, | ||
| ]} | ||
| /> | ||
|
|
||
| # Data Download via curl | ||
|
|
||
| The **Download Open-Access Data (curl Command)** option lets you select the organism type and file formats to transfer to a local or institutional system. To download a complete dataset, select all available file types. | ||
|
|
||
| **NOTE:** At this time, this option is available only for open-access datasets. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| curl must be installed on the destination system where the command will be run. Most macOS, Linux, and Windows 10 and 11 systems include curl by default. Users of older Windows versions can download it from the curl website or use Windows Subsystem for Linux (WSL). | ||
|
|
||
| ## Example | ||
|
|
||
| ### Downloading The Full Dataset | ||
|
|
||
| 1. Visit the dataset of interest by clicking on the dataset name in the Data Explorer. | ||
|
|
||
| <Figure | ||
| alt="Visit the dataset of interest" | ||
| src="/guides/curl-command-download/single-dataset-download-01.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 2. On the dataset description page, click on the "Export" button in the upper right-hand corner of that page. | ||
|
|
||
| <Figure | ||
| alt="Click the Export button" | ||
| src="/guides/curl-command-download/single-dataset-download-02.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 3. Then click on "Download Open-Access Data Files (No Data Transfer Fees)" in the "Download" section near the bottom of the page. | ||
|
|
||
| <Figure | ||
| alt="Click Download Open-Access Data Files" | ||
| src="/guides/curl-command-download/single-dataset-download-03.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 4. This will display a screen that allows some refinement of the data to download. | ||
|
|
||
| <Figure | ||
| alt="Refine the data to download" | ||
| src="/guides/curl-command-download/single-dataset-download-04.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 5. Select all of the organism type(s) at the top of the page. | ||
|
|
||
| 6. Check the box next to the Name heading. This will select all of the file types. | ||
| - To download only specific file types, select those file types and leave the others unchecked. | ||
|
|
||
| 7. Select Bash<sup>[1](#footnote-1)</sup> if you are on Mac, Linux, or Windows Subsystem for Linux; select cmd.exe if you are on Windows Command Prompt. | ||
|
|
||
| 8. Click on the Request curl Command button. | ||
|
|
||
| <Figure | ||
| alt="Click the Request curl Command button" | ||
| src="/guides/curl-command-download/single-dataset-download-08.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| This will generate a curl manifest and the command needed to transfer the files. The resulting command will be similar to this: | ||
|
|
||
| ``` | ||
| curl --location --fail https://service.explore.anvilproject.org/manifest/files/ksQylKdhbnZpbDEzpGN1cmzEEKxolyZNG12_p9nHuKrRpbDEEH2f6ZDL2lSzofvXZ80pfgXEIJHlLajfJ07ut9ZEMwSwDDAdmSZQam5pZbCxG3WZeFBl | curl --retry 15 --retry-delay 10 --config - | ||
| ``` | ||
|
|
||
| On the destination system, issue the specified curl command. Clicking the text box containing the curl command copies it to your clipboard so you can paste it into a terminal window. | ||
|
|
||
| <Figure | ||
| alt="Copy the curl command to clipboard" | ||
| src="/guides/curl-command-download/single-dataset-download-08b.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| For single-dataset downloads, a series of subdirectories will be created containing the selected files from that dataset. | ||
|
|
||
| ### Downloading Files From Multiple Datasets | ||
|
|
||
| Downloading files from multiple datasets works the same way as downloading from a single dataset, except for how you select the datasets. | ||
|
|
||
| In this case, on the Data Explorer's main page, use the faceted search feature in the right-hand column to select the datasets of interest and then click on the "Export" button on the top right of the page. | ||
|
|
||
| <Figure | ||
| alt="Select datasets and click Export" | ||
| src="/guides/curl-command-download/multiple-datasets-download-01.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| From this point on, the interface is the same as the single dataset download above. Continue with Step 3 above. | ||
|
|
||
| --- | ||
|
|
||
| <sup id="footnote-1">1</sup> The Bash shell will work for most of the common | ||
| Unix/Linux command-line shells. |
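The command generated in the curl guide is a two-stage pipeline: the first `curl` fetches a manifest, and the second reads it through `--config -`, which tells curl to take its options from standard input. The sketch below is one way to preview what such a manifest would download before running it. It assumes the manifest is a curl config file made of `url` and `output` entries, with each `output` following its `url`; the guide does not confirm that layout, and `parseCurlConfig`, `unquote`, and `CurlTransfer` are hypothetical names that are not part of this PR.

```typescript
// Hypothetical helper for illustration; not part of the AnVIL Data Explorer codebase.
interface CurlTransfer {
  url: string;
  output: string | null;
}

// Strip one pair of surrounding double quotes from a curl config value.
// Escaped quotes inside the value are not handled in this sketch.
function unquote(value: string): string {
  const trimmed = value.trim();
  return trimmed.length >= 2 && trimmed.startsWith('"') && trimmed.endsWith('"')
    ? trimmed.slice(1, -1)
    : trimmed;
}

// Parse curl config text into url/output pairs.
// Assumption: every `output` entry belongs to the most recent `url` entry.
function parseCurlConfig(text: string): CurlTransfer[] {
  const transfers: CurlTransfer[] = [];
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // blank line or comment
    // Accept `url=...`, `url = ...`, `url: ...`, and `--url ...` forms.
    const match = /^(?:--)?(url|output)\s*[=:\s]\s*(.+)$/.exec(line);
    if (!match) continue; // any other option is ignored here
    const [, key, value] = match;
    if (key === "url") {
      transfers.push({ url: unquote(value), output: null });
    } else if (transfers.length > 0) {
      transfers[transfers.length - 1].output = unquote(value);
    }
  }
  return transfers;
}
```

To use it, save the manifest by running only the first half of the generated command with its output redirected to a file, then pass the file contents to `parseCurlConfig` and review the list before starting the transfer.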
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,47 @@ | ||
| <Breadcrumbs | ||
| breadcrumbs={[ | ||
| { path: "/datasets", text: "AnVIL Data Explorer" }, | ||
| { path: "/guides", text: "Guides" }, | ||
| { path: "", text: "Individual File Download" }, | ||
| ]} | ||
| /> | ||
|
|
||
| # Individual File Download | ||
|
|
||
| Individual file downloads from the AnVIL Data Explorer are available for open-access files. Files can also be downloaded using the information in the dataset manifests. | ||
|
|
||
| ## Example | ||
|
|
||
| Downloading individual files. | ||
|
|
||
| 1. Use the faceted search in the left-hand column to limit the scope of the files listed. | ||
|
|
||
| <Figure | ||
| alt="Use the faceted search to limit the scope of files" | ||
| src="/guides/individual-download/individual-files-01.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 2. Select the "Files" tab at the top of the list of datasets. This will change the display to list the available files. | ||
|
|
||
| <Figure | ||
| alt="Select the Files tab" | ||
| src="/guides/individual-download/individual-files-02.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 3. To download the file, click the download button next to the file of interest. This will start the download to the Downloads folder specified in the browser configuration. | ||
|
|
||
| <Figure | ||
| alt="Click the download button next to the file" | ||
| src="/guides/individual-download/individual-files-03.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| Note that if there are files in the list that are not available for download (e.g., files from a managed access dataset), the download icon will be grayed out. | ||
|
|
||
| <Figure | ||
| alt="Disabled download icon for files that are not available for download" | ||
| src="/guides/individual-download/individual-files-04.webp" | ||
| width="100%" | ||
| /> |
86 changes: 86 additions & 0 deletions
app/content/anvil-cmg/guides/tsv-file-manifest-download.mdx
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,86 @@ | ||
| <Breadcrumbs | ||
| breadcrumbs={[ | ||
| { path: "/datasets", text: "AnVIL Data Explorer" }, | ||
| { path: "/guides", text: "Guides" }, | ||
| { path: "", text: "TSV File Manifest Download" }, | ||
| ]} | ||
| /> | ||
|
|
||
| # TSV File Manifest Download | ||
|
|
||
| Manifest downloads are available for all of the datasets listed in the AnVIL Data Explorer, including both open-access and managed-access datasets. A tab-separated-value file (.tsv) is generated based on the data selected. | ||
|
|
||
| The downloaded manifest contains a number of columns. Depending on how the data will be accessed and used, some of the key columns are: | ||
|
|
||
| - **dataset.title**, which contains the name of the dataset that the file belongs to. | ||
| - A manifest can contain files from multiple datasets, depending on how the file is generated. | ||
| - **datasets.consent_group** and **datasets.data_use_permission**, which contain the dataset's consent and use codes. | ||
| - **files.file_size**, which contains the file size in bytes. | ||
| - **files.name**, which contains the file name. | ||
| - **files.drs_url**, which contains the DRS URL for use within the Terra environment. | ||
| - **files.azul_url**, which is a URL that allows HTTP access to the individual file. | ||
| - Files in open-access datasets are available via this link. | ||
| - At this time, AnVIL requires requester-pays for managed-access datasets, so the files are not accessible through this URL. | ||
| - **files.azul_mirror_url**, which contains the URI to the Amazon Web Services S3 bucket for that file. | ||
| - Please note that the file name in the bucket is a hash to reduce storage requirements in case there is file duplication. | ||
| - This field will be blank if the file is not present through the AWS Open Data Sponsorship Program. | ||
|
|
||
| ## Example | ||
|
|
||
| ### Downloading The Manifest For A Single Dataset | ||
|
|
||
| 1. Visit the dataset of interest by clicking on the dataset name in the Data Explorer. | ||
|
|
||
| <Figure | ||
| alt="Visit the dataset of interest" | ||
| src="/guides/dataset-manifest-download/single-dataset-download-01.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 2. On the dataset description page, click on the "Export" button in the upper right-hand corner of that page. | ||
|
|
||
| <Figure | ||
| alt="Click the Export button" | ||
| src="/guides/dataset-manifest-download/single-dataset-download-02.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 3. Then click on "Download TSV Manifest" in the "Download" section near the bottom of the page. | ||
|
|
||
| <Figure | ||
| alt="Click Download TSV Manifest" | ||
| src="/guides/dataset-manifest-download/single-dataset-download-03.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 4. This will display a screen to request the generation of the manifest. Click on the "Request Link" button. | ||
|
|
||
| <Figure | ||
| alt="Click the Request Link button" | ||
| src="/guides/dataset-manifest-download/single-dataset-download-04.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| 5. Once the manifest is generated, you can either download it directly by clicking the download icon or copy its URL by clicking the copy icon. | ||
|
|
||
| <Figure | ||
| alt="Download or copy the manifest link" | ||
| src="/guides/dataset-manifest-download/single-dataset-download-05.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| The manifest can be viewed with any utility that can import tab-separated-value files. It can also be processed with scripts as needed. | ||
|
|
||
| ### Downloading A Manifest For Multiple Datasets | ||
|
|
||
| Downloading a manifest for multiple datasets works the same way as downloading one for a single dataset, except for how you select the datasets. | ||
|
|
||
| In this case, on the Data Explorer's main page, use the faceted search feature in the right-hand column to select the datasets of interest and then click on the "Export" button on the top right of the page. | ||
|
|
||
| <Figure | ||
| alt="Select datasets and click Export" | ||
| src="/guides/dataset-manifest-download/multiple-datasets-download-01.webp" | ||
| width="100%" | ||
| /> | ||
|
|
||
| From this point on, the interface is the same as the single dataset download above. Continue with Step 3 above. |
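The manifest guide notes that the TSV can be processed with scripts. As a minimal sketch of that idea, the code below parses manifest text and reports the file count, the total size from `files.file_size`, and how many rows have a non-blank `files.azul_mirror_url`. The column names come from the guide; `parseManifest`, `summarizeManifest`, and the two types are hypothetical names, and the parser assumes plain tab-separated cells with no quoted fields that contain tabs or newlines.

```typescript
// Hypothetical helpers for illustration; not part of the AnVIL Data Explorer codebase.
type ManifestRow = Record<string, string>;

interface ManifestSummary {
  fileCount: number;
  totalBytes: number;
  mirroredCount: number;
}

// Split TSV text into rows keyed by the header names.
// Assumption: no cell contains an embedded tab or newline.
function parseManifest(tsv: string): ManifestRow[] {
  const lines = tsv.split(/\r?\n/).filter((line) => line.length > 0);
  if (lines.length === 0) return [];
  const header = lines[0].split("\t");
  return lines.slice(1).map((line) => {
    const cells = line.split("\t");
    const row: ManifestRow = {};
    header.forEach((name, i) => {
      row[name] = cells[i] ?? ""; // missing trailing cells become empty strings
    });
    return row;
  });
}

// Total the file sizes and count rows that carry a mirror URI.
function summarizeManifest(rows: ManifestRow[]): ManifestSummary {
  let totalBytes = 0;
  let mirroredCount = 0;
  for (const row of rows) {
    const size = Number(row["files.file_size"]);
    if (Number.isFinite(size)) totalBytes += size; // skip rows without a usable size
    const mirror = row["files.azul_mirror_url"];
    if (mirror !== undefined && mirror.trim() !== "") mirroredCount += 1;
  }
  return { fileCount: rows.length, totalBytes, mirroredCount };
}
```

In Node.js the manifest text can be loaded with `fs.readFileSync(path, "utf8")` and passed to `parseManifest`. Checking the total size first is a cheap way to confirm there is enough local storage before starting a large transfer.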
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| import { Main } from "@databiosphere/findable-ui/lib/components/Layout/components/ContentLayout/components/Main/main"; | ||
| import { Nav } from "@databiosphere/findable-ui/lib/components/Layout/components/Nav/nav"; | ||
| import { ContentView } from "@databiosphere/findable-ui/lib/views/ContentView/contentView"; | ||
| import { GetStaticProps, InferGetStaticPropsType } from "next"; | ||
| import { MDXRemote } from "next-mdx-remote"; | ||
| import { JSX } from "react"; | ||
| import { Content } from "../../app/components/Layout/components/Content/content"; | ||
| import { MDX_COMPONENTS } from "../../app/content/common/constants"; | ||
| import { getContentStaticProps } from "../../app/content/common/contentPages"; | ||
| import { | ||
| ABOUT_ANVIL_EXPLORER, | ||
| DATA_DOWNLOAD_OPTIONS, | ||
| DATA_DOWNLOAD_VIA_CURL, | ||
| INDIVIDUAL_FILE_DOWNLOAD, | ||
| TSV_FILE_MANIFEST_DOWNLOAD, | ||
| } from "../../site-config/anvil-cmg/dev/layout/navigationItem"; | ||
| const slug = ["guides", "data-download-options"]; | ||
|
|
||
| export const getStaticProps: GetStaticProps = async () => { | ||
| return getContentStaticProps({ params: { slug } }, "Data Download Options"); | ||
| }; | ||
|
|
||
| const Page = ({ | ||
| layoutStyle, | ||
| mdxSource, | ||
| }: InferGetStaticPropsType<typeof getStaticProps>): JSX.Element => { | ||
| return ( | ||
| <ContentView | ||
| content={ | ||
| <Content> | ||
| <MDXRemote {...mdxSource} components={MDX_COMPONENTS} /> | ||
| </Content> | ||
| } | ||
| navigation={ | ||
| <Nav | ||
| navigation={[ | ||
| ABOUT_ANVIL_EXPLORER, | ||
| { active: true, ...DATA_DOWNLOAD_OPTIONS }, | ||
| TSV_FILE_MANIFEST_DOWNLOAD, | ||
| DATA_DOWNLOAD_VIA_CURL, | ||
| INDIVIDUAL_FILE_DOWNLOAD, | ||
| ]} | ||
| /> | ||
| } | ||
| layoutStyle={layoutStyle ?? undefined} | ||
| /> | ||
| ); | ||
| }; | ||
|
|
||
| Page.Main = Main; | ||
|
|
||
| export default Page; | ||
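In the page above, the active entry is written as `{ active: true, ...DATA_DOWNLOAD_OPTIONS }`. In an object literal, later properties win, so if the spread constant ever carried its own `active` field it would override the `true` written before it. Whether the constant has such a field is not visible in this diff. The sketch below shows one hedged alternative that derives the flag from the item's URL and could be shared by the guide pages. `NavItem` and `markActive` are hypothetical, and the real navigation item type in `@databiosphere/findable-ui` may have a different shape.

```typescript
// Hypothetical shape for illustration; the findable-ui navigation item type may differ.
interface NavItem {
  active?: boolean;
  label: string;
  url: string;
}

// Return a copy of the list with `active` set only on the item matching activeUrl.
// `active` is written after the spread, so it always wins over any inherited value.
function markActive(items: NavItem[], activeUrl: string): NavItem[] {
  return items.map((item) => ({ ...item, active: item.url === activeUrl }));
}
```

Each guide page could then pass the same ordered list and its own URL instead of repeating the array with a different entry wrapped by hand.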