diff --git a/docs/images/Q-IAM-role.png b/docs/images/Q-IAM-role.png new file mode 100644 index 0000000..04f7a9f Binary files /dev/null and b/docs/images/Q-IAM-role.png differ diff --git a/docs/images/Q-R-script.png b/docs/images/Q-R-script.png new file mode 100644 index 0000000..25d191f Binary files /dev/null and b/docs/images/Q-R-script.png differ diff --git a/docs/images/Q-amazon-q-jup.png b/docs/images/Q-amazon-q-jup.png new file mode 100644 index 0000000..32f3c58 Binary files /dev/null and b/docs/images/Q-amazon-q-jup.png differ diff --git a/docs/images/Q-code-completion-1.png b/docs/images/Q-code-completion-1.png new file mode 100644 index 0000000..c619e8a Binary files /dev/null and b/docs/images/Q-code-completion-1.png differ diff --git a/docs/images/Q-code-completion.png b/docs/images/Q-code-completion.png new file mode 100644 index 0000000..46fa22a Binary files /dev/null and b/docs/images/Q-code-completion.png differ diff --git a/docs/images/Q-domain-name.png b/docs/images/Q-domain-name.png new file mode 100644 index 0000000..2217d2d Binary files /dev/null and b/docs/images/Q-domain-name.png differ diff --git a/docs/images/Q-explain.png b/docs/images/Q-explain.png new file mode 100644 index 0000000..323d3b9 Binary files /dev/null and b/docs/images/Q-explain.png differ diff --git a/docs/images/Q-fix.png b/docs/images/Q-fix.png new file mode 100644 index 0000000..d3257d4 Binary files /dev/null and b/docs/images/Q-fix.png differ diff --git a/docs/images/Q-iam-policy-review.png b/docs/images/Q-iam-policy-review.png new file mode 100644 index 0000000..c63bf08 Binary files /dev/null and b/docs/images/Q-iam-policy-review.png differ diff --git a/docs/images/Q-jupy-lab.png b/docs/images/Q-jupy-lab.png new file mode 100644 index 0000000..9971068 Binary files /dev/null and b/docs/images/Q-jupy-lab.png differ diff --git a/docs/images/Q-optimize-script.png b/docs/images/Q-optimize-script.png new file mode 100644 index 0000000..7c327eb Binary files /dev/null and b/docs/images/Q-optimize-script.png differ diff --git a/docs/images/Q-optimize.png b/docs/images/Q-optimize.png new file mode 100644 index 0000000..14b337e Binary files /dev/null and b/docs/images/Q-optimize.png differ diff --git a/docs/images/Q-parallel-processing.png b/docs/images/Q-parallel-processing.png new file mode 100644 index 0000000..47afb79 Binary files /dev/null and b/docs/images/Q-parallel-processing.png differ diff --git a/docs/images/Q-role-policy.png b/docs/images/Q-role-policy.png new file mode 100644 index 0000000..29dbcb3 Binary files /dev/null and b/docs/images/Q-role-policy.png differ diff --git a/docs/images/Q-send-cell-with-prompt.png b/docs/images/Q-send-cell-with-prompt.png new file mode 100644 index 0000000..d172a5d Binary files /dev/null and b/docs/images/Q-send-cell-with-prompt.png differ diff --git a/docs/images/Q-snakemake-cloud.png b/docs/images/Q-snakemake-cloud.png new file mode 100644 index 0000000..8ef60a8 Binary files /dev/null and b/docs/images/Q-snakemake-cloud.png differ diff --git a/docs/images/Q-snakemake-cluod.png b/docs/images/Q-snakemake-cluod.png new file mode 100644 index 0000000..8ef60a8 Binary files /dev/null and b/docs/images/Q-snakemake-cluod.png differ diff --git a/docs/images/Q-snakemake-wf.png b/docs/images/Q-snakemake-wf.png new file mode 100644 index 0000000..9c636bc Binary files /dev/null and b/docs/images/Q-snakemake-wf.png differ diff --git a/notebooks/GenAI/AWS_Amazon_Q_Developer.ipynb b/notebooks/GenAI/AWS_Amazon_Q_Developer.ipynb new file mode 100644 index 0000000..bd3974c --- /dev/null +++ b/notebooks/GenAI/AWS_Amazon_Q_Developer.ipynb @@ -0,0 +1,547 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Introduction to Amazon Q Developer\n", + "\n", + "**Skill Level: Beginner**\n", + "\n", + "## Overview\n", + "\n", + "Amazon Q Developer is a generative artificial intelligence (AI) powered conversational assistant designed to enhance the \n", + "software development process, particularly within the AWS ecosystem. Amazon Q can assist with various coding tasks such as providing inline code completions, generating new code, and scanning for security vulnerabilities. Amazon Q Developer can be accessed through an IDE such as VSCode and JupyterLab or through the command line. \n", + "\n", + "The model that Amazon Q utilizes has been supplemented with the high-quality AWS content, allowing users to ask questions\n", + "about AWS architecture, AWS resources, best practices, documentation, and support. \n", + "\n", + "In this tutorial, we will be running the **free tier of Amazon Q Developer**. You will not incur charges for the first 50 interactions with the chatbot each month. To understand the pricing plan of Amazon Q, please refer to the information found in this [link](https://aws.amazon.com/q/developer/pricing/).\n", + "\n", + "## Learning Objectives\n", + "\n", + "* Understand the capabilities of Amazon Q Developer and learn how to access the tool through an IDE \n", + "* Engage with Amazon Q Developer and ask code-related queries\n", + "* Implement inline code completions \n", + "* Generate new code snippets based on specific requirements \n", + "\n", + "## Prerequisites\n", + "\n", + "To complete this tutorial, you will need access to an Integrated Development Environment (IDE) popular options include SageMaker AI Studio, VSCode, and JupyterLab. Note that installation steps, pricing, and certain features of Amazon Q Developer may vary across different IDEs and pricing tiers. While we are using the free tier, check out the comparison below to understand the differences in requirements between using the free tier in SageMaker AI, running workflows locally, and utilizing the pro tier.\n", + "\n", + "**Amazon Q Developer Free Tier**\n", + "- SageMaker AI\n", + " - Requires an AWS account\n", + " - Supported IDEs include Jupyter Lab and Code Editor\n", + " - Will need to add the following role policy:\n", + "\n", + " ```JSON\n", + " {\n", + " \"Effect\": \"Allow\",\n", + " \"Action\": [\n", + " \"q:SendMessage\"\n", + " ],\n", + " \"Resource\": [\n", + " \"*\"\n", + " ]\n", + " }\n", + " {\n", + " \"Sid\": \"AmazonQDeveloperPermissions\",\n", + " \"Effect\": \"Allow\",\n", + " \"Action\": [\n", + " \"codewhisperer:GenerateRecommendations\"\n", + " ],\n", + " \"Resource\": \"*\"\n", + " }\n", + " ```\n", + "\n", + "- Local IDE\n", + " - AWS account is not required \n", + " - Required to create an AWS Builder ID profile which includes creating a username and password. \n", + " - To learn more visit the following link [here](https://docs.aws.amazon.com/signin/latest/userguide/create-aws_builder_id.html).\n", + " - Supported IDEs include VS Code, Visual Studio and JetBrains\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Amazon Q Developer Pro Tier**\n", + "- Requires AWS account as a prerequisite \n", + "- Needed integration with IAM Identity Center (IDC) requiring **adminstator approval** \n", + "- Costs $19/mo per user" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "**Warning:** SageMaker AI is not to be confused by Sagemaker NextGen. Sagemaker AI offers a wide range of machine learning functionalities, including tools for training and tuning models. However, deploying endpoints in SageMaker Studio can be very expensive, we strongly recommend using Amazon Bedrock for model testing and Generative AI (GenAI) work to manage costs more effectively.\n", + "\n", + "## Get Started\n", + "\n", + " In this tutorial, we will run through use cases that showcase many features Amazon Q Developer provides such as inline code completions, generating new code, and scanning for security vulnerabilities through the use of 'Quick Actions' listed below:\n", + "\n", + "* **/fix:** Corrects errors in code.\n", + "* **/optimize:** Suggests optimization methods.\n", + "* **/explain:** Explains code snippets and provides enhancement suggestions.\n", + "* **/test:** Builds unit tests to evaluate and test code.\n", + "\n", + "**Note:** Generative AI responses may differ for each user, even with the same input, due to its dynamic and context-sensitive nature." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "## Use Cases" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### Use Case 1: Utilize in-line code completion.\n", + "\n", + "In-line code completion is a feature that helps you write code faster and with fewer errors. As you type, it suggests possible ways to complete your code based on what you've started to write. \n", + "\n", + "Try it out! Let's try adding a sixth step to the following script as shown in the image below. As you type, you will notice that Amazon Q will attempt to auto-complete your code. \n", + "\n", + "![alt text](../../docs/images/Q-code-completion-1.png)\n", + "\n", + "**Original Script**\n", + "\n", + "You can add the sixth step to the code cell below." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import pandas as pd\n", + "import subprocess\n", + "\n", + "# Step 1: Read the sample sheet\n", + "sample_sheet = pd.read_csv('samplesheet.csv')\n", + "\n", + "# Step 2: Run FastQC\n", + "for index, row in sample_sheet.iterrows():\n", + " fastqc_command = f\"fastqc {row['file_path']} -o ./fastqc_results/\"\n", + " subprocess.run(fastqc_command, shell=True)\n", + "\n", + "# Step 3: Run MultiQC\n", + "multiqc_command = \"multiqc ./fastqc_results/ -o ./multiqc_report/\"\n", + "subprocess.run(multiqc_command, shell=True)\n", + "\n", + "# Step 4: Run STAR aligner\n", + "for index, row in sample_sheet.iterrows():\n", + " star_command = f\"STAR --genomeDir /path/to/genome --readFilesIn {row['file_path']} --outFileNamePrefix ./star_results/{row['sample_id']}\"\n", + " subprocess.run(star_command, shell=True)\n", + "\n", + "# Step 5: Index BAM files with Samtools\n", + "for index, row in sample_sheet.iterrows():\n", + " bam_file = f\"./star_results/{row['sample_id']}.bam\"\n", + " samtools_command = f\"samtools index {bam_file}\"\n", + " subprocess.run(samtools_command, shell=True) " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### Use Case 2: Python Notebook functionalities\n", + "\n", + "The quick actions menu provides a list of ways that you may prompt the coding assistant. In this use case, we will attach code snippets found in a python notebook and test the `/fix`, `optimize`, and `explain` quick actions. \n", + "\n", + "\n", + "##### **`/fix` Prompt**\n", + "\n", + "The '/fix' quick action lets you correct errors in your code. Lets try the example below which will run into an error when you run the `describe.df()`. \n", + "\n", + "Lets first set up our dataframe before we run the next line of code." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#Cell 1\n", + "\n", + "#import libraries\n", + "import pandas as pd\n", + "\n", + "# Initialize data of lists\n", + "data = {\n", + " 'Gene': ['GeneA', 'GeneB', 'GeneC', 'GeneD'],\n", + " 'Expression_Level': [12.5, 8.3, 15.2, 7.8],\n", + " 'Sample_ID': ['S1', 'S2', 'S3', 'S4'],\n", + " 'Condition': ['Control', 'Treated', 'Control', 'Treated']\n", + "}\n", + "\n", + "# Create DataFrame\n", + "df = pd.DataFrame(data)\n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Now that we have our dataframe set up you can run the next code cell. You should run into an error that states the code is wrong. To have Amazon Q Developer fix this error follow the steps below:\n", + "\n", + "1. Select the cell below, navigate to the Amazon Q Developer search bar and type in `/fix`.\n", + "2. Click on the down arrow next to the send button and select \"Send message with selection\". " + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Error debugging test /fix\n", + "# View summary statistics\n", + "describe().df" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The response may look like the image below and should contain the corrected code, a description of what the code does and suggestions for improving it. You can easily implement the suggested changes by clicking on the three dots at the top of the response and selecting \"Replace selection\". Lets try the **/optimize** quick action now!\n", + "\n", + "![alt text](../../docs/images/Q-fix.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### **`/optimize` Prompt**\n", + "\n", + "1. Select the code cell below, navigate to the Amazon Q Developer search bar and type in `/optimize`.\n", + "2. Click on the down arrow next to the send button and select \"Send message with selection\", you should now suggestions from Amazon Q on how you can optimize the code cell below.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "#Cell 2\n", + "\n", + "# Optimize selection test /optimize\n", + "# Add additional data to the dataframe\n", + "df['Sample_Type'] = ['Tissue1', 'Tissue2', 'Tissue1', 'Tissue2']\n", + "df['P_Value'] = [0.05, 0.01, 0.03, 0.07]" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### **`/explain` Prompt**\n", + "1. Select the code cell below, navigate to the Amazon Q Developer search bar and type in `/explain`\n", + "2. Click on the down arrow next to the send button and select \"Send message with selection\" \n", + "\n", + "Similar to the image below you should see that Amazon Q Developer has explained the code cell. Additionally, suggestions to enhance the code are provided." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "Cell 3\n", + "#Explain selection test /explain\n", + "#Plot results\n", + "import matplotlib.pyplot as plt\n", + "\n", + "df.plot(x='Sample_ID', y='Expression_Level', kind='line')\n", + "plt.show()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "![alt text](../../docs/images/Q-explain.png)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### Use Case 3: Modify an existing script. \n", + "\n", + "Prompting can be used to modify an existing script with new functions and code. The coding assistant may also provide detailed explanations and examples, making it easier to understand and implement the changes.\n", + "\n", + "In this script we will incorporate parallel processing. Parallel processing allows a program to execute multiple tasks simultaneously, which can significantly speed up the execution time, especially for tasks that are computationally intensive. \n", + "\n", + "The response from Amazon Q Developer may include the following elements:\n", + "\n", + "- **Modified Python Script:** The complete modified script with parallel processing for specific steps.\n", + "- **Key Improvements:** A list of the main improvements made in the modified script.\n", + "- **Customization Instructions:** Instructions on how to customize the parallel processing by specifying the number of processes.\n", + "- **Notes:** Considerations and caveats related to the script and parallel processing.\n", + "- **Performance Optimization Tips:** Tips for optimizing performance when using the modified script.\n", + " \n", + "Lets start by entering the following prompt in the search bar of your Amazon Q assistant. \n", + "\n", + "**Note:** In this prompt, we include the script. When utilizing Amazon Q Developer in other IDEs such as VSCode or Code Editor, you may reference files through their filepaths.\n", + "\n", + "\n", + "##### **Prompt** " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "```markdown\n", + "I would like to use parallel processing in my Python script. Can you modify this script to do so?\n", + "\n", + "Script: \n", + "```\n", + "\n", + "```python\n", + "import pandas as pd\n", + "import subprocess\n", + "\n", + "# Step 1: Read the sample sheet\n", + "sample_sheet = pd.read_csv('samplesheet.csv')\n", + "\n", + "# Step 2: Run FastQC\n", + "for index, row in sample_sheet.iterrows():\n", + " fastqc_command = f\"fastqc {row['file_path']} -o ./fastqc_results/\"\n", + " subprocess.run(fastqc_command, shell=True)\n", + "\n", + "# Step 3: Run MultiQC\n", + "multiqc_command = \"multiqc ./fastqc_results/ -o ./multiqc_report/\"\n", + "subprocess.run(multiqc_command, shell=True)\n", + "\n", + "# Step 4: Run STAR aligner\n", + "for index, row in sample_sheet.iterrows():\n", + " star_command = f\"STAR --genomeDir /path/to/genome --readFilesIn {row['file_path']} --outFileNamePrefix ./star_results/{row['sample_id']}\"\n", + " subprocess.run(star_command, shell=True)\n", + "\n", + "# Step 5: Index BAM files with Samtools\n", + "for index, row in sample_sheet.iterrows():\n", + " bam_file = f\"./star_results/{row['sample_id']}.bam\"\n", + " samtools_command = f\"samtools index {bam_file}\"\n", + " subprocess.run(samtools_command, shell=True)\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "#### Use Case 4: Code conversion\n", + "\n", + "Code conversion is often necessary when you want to adapt existing scripts to different environments, workflows, or tools. A common use-case is converting scripts into a workflow language. \n", + "\n", + "The response you may see can include elements found in a workflow and explanations on how to run the provided scripts. For example, it may contain the following elements:\n", + "\n", + "- Scripts: Workflow file, config file, cluster configuration, and submission script, defining rules, configurations, resource requirements, and execution commands.\n", + "- Key Features: The key features of the workflow are highlighted and explained.\n", + "- Running Instructions: Instructions on how to run the workflow, with multiple options provided.\n", + "- Setup Instructions: Instructions to create necessary directories, modify configuration files, and run the workflow using a cluster profile or direct submission.\n", + "- Enhancement Suggestions: Suggestions for adding rules for quality control or downstream analysis, defining dependencies, and including quality control outputs.\n", + "\n", + "\n", + "Please paste the following prompt into your coding assistant search bar. \n", + "\n", + "##### **Prompt** " + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "```markdown\n", + "Convert this slurm script into a Snakemake workflow.\n", + "\n", + "Script: \n", + "```\n", + "```bash\n", + "#!/bin/bash\n", + "#SBATCH --job-name=star_alignment\n", + "#SBATCH --output=star_output.txt\n", + "#SBATCH --error=star_error.txt\n", + "#SBATCH --nodes=1\n", + "#SBATCH --ntasks=1\n", + "#SBATCH --cpus-per-task=8\n", + "#SBATCH --mem=32G\n", + "#SBATCH --time=02:00:00\n", + "\n", + "module load star\n", + "\n", + "INPUT_DIR=/path/to/input\n", + "OUTPUT_DIR=/path/to/output\n", + "GENOME_DIR=/path/to/genome\n", + "\n", + "STAR --genomeDir $GENOME_DIR --readFilesIn $INPUT_DIR/sample_R1.fastq $INPUT_DIR/sample_R2.fastq --outFileNamePrefix $OUTPUT_DIR/sample\n", + "\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Use Case 5: Cloud migration assistance. \n", + "\n", + "Cloud migration of bioinformatics pipelines involves moving data and computational workflows to the cloud. This allows researchers to use scalable and powerful cloud resources, making it easier to process large datasets and perform complex analyses efficiently and cost-effectively. Prompting can be used to facilitate the migration of pipelines to the cloud. As Amazon Q Developer specializes in queries and tasks related to AWS, let's prompt the model to facilitate migration to AWS infrastructure. \n", + "\n", + "The response may include modifications to enable the pipeline to run using cloud infrastructure, such as integrating cloud-specific URLs and parameters, defining access policies, and creating scripts for submitting workflows to cloud-based job scheduling services. Key features highlighted may be cloud integration for storage and job scheduling, security best practices (access policies, secrets management, network security, data encryption, logging/monitoring), and resource management through cloud services. Instructions for configuring cloud settings, defining access policies, and running the workflow may be provided, along with suggestions for additional security measures like endpoint configurations, data lifecycle policies, encryption, and access logging. Additionally, Amazon Q Developer has the latest information on AWS updates, ensuring that the implementation leverages the most current features and best practices.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "We will be continuing to ask questions in the same chat thread that created the snakemake script earlier. Referencing your chat history and fine-tuning the response to best suit your needs is an example of chain-of-thought prompting. \n", + "\n", + "Please paste the following prompt in your coding assistant search bar. \n", + "\n", + "\n", + "\n", + "##### **Prompt** \n", + "```markdown\n", + "Modify this Snakemake workflow to run using cloud resources in AWS. What are the best practices for securing Snakemake workflows when using AWS cloud resources? \n", + "```\n", + "\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### Use Case 6: Write code from scratch\n", + "\n", + "Prompting an AI tool like Amazon Q Developer to write a script can be incredibly useful for several reasons. It significantly enhances efficiency by reducing the time and effort needed to create code from scratch. Additionally, it serves as a valuable learning aid, helping users understand coding practices and library usage through generated examples. The generated scripts provide a flexible starting point that can be easily customized to meet specific requirements, allowing users to quickly adapt and expand their projects.\n", + "\n", + "The response should include an R script that accomplishes the prompt instructions, with comments to help users understand the logic. \n", + "\n", + "Please paste the following prompt in your coding assistant search bar. \n", + "\n", + "##### **Prompt** \n", + "```markdown\n", + "Can you assist me in writing an R script that generates a plot of gene expression levels from a dataset? Please use the ggplot2 library for visualization. The script should read a CSV file containing gene expression data and produce a bar plot showing the expression levels of each gene.\n", + "``` " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### Use Case 7: Error debugging\n", + "\n", + "Amazon Q Developer can also be used to identify and fix errors in your code. This is highly beneficial as it can save time and identify errors that may have been difficult resolve. " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "vscode": { + "languageId": "plaintext" + } + }, + "source": [ + "### Prompting Best Practices \n", + "\n", + "Here are some tips to help you in your prompting journey!\n", + "\n", + "1. Be Specific: Clearly state your request or question. Provide specific details to avoid ambiguity.\n", + "\n", + "2. Provide Context: Provide background information that would be relevant to your prompt.\n", + "\n", + "3. Break Down Complex Requests: Divide complex tasks into smaller, manageable parts.\n", + "\n", + "5. Iterate and Refine: Refine your prompts based on the responses you receive. Provide additional context to fine-tune the responses according to your end goal. \n", + "\n", + "6. Validate the Response: Always use a human-in-the-loop approach to validate the responses you receive. \n" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Conclusions\n", + "\n", + "Congrats! You have successfully experimented with the features of Amazon Q. We hope you continue leveraging the powers of GenAI and Amazon Q Developer to drive impactful results in your projects.\n", + "\n", + "## Clean Up\n", + "\n", + "Once you have completed the tutorial, you may stop the JupyterLab application and delete the SageMaker AI Studio domain. Please note that this action will delete the JupyterLab environment created for this tutorial and any other applicatios running in your SageMaker AI Studio domain. \n", + "\n", + "### Stopping domain applications\n", + "\n", + "1. Navigate to Amazon SageMaker AI\n", + "2. Click on Admin configurations > Domains in the left menu bar\n", + "3. Select the SageMaker AI Studio domain by clicking on the circle found on the right side of the domain name\n", + "4. Scroll down to the available applications \n", + "5. Select any applications you have created and click Stop\n", + "\n", + "### Deleting the SageMaker AI Studio domain\n", + "\n", + "1. Navigate to Amazon SageMaker AI\n", + "2. Click on Admin configurations > Domains in the left menu bar\n", + "3. Click on the domain name \n", + "4. Scroll down to the Delete domain box and select Delete domain" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "base", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python", + "version": "3.11.7" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/notebooks/GenAI/example_scripts/bioinformatics_testing.py b/notebooks/GenAI/example_scripts/bioinformatics_testing.py new file mode 100644 index 0000000..57373a5 --- /dev/null +++ b/notebooks/GenAI/example_scripts/bioinformatics_testing.py @@ -0,0 +1,26 @@ +import pandas as pd +import subprocess + +# Step 1: Read the sample sheet +sample_sheet = pd.read_csv('samplesheet.csv') + +# Step 2: Run FastQC +for index, row in sample_sheet.iterrows(): + fastqc_command = f"fastqc {row['file_path']} -o ./fastqc_results/" + subprocess.run(fastqc_command, shell=True) + +# Step 3: Run MultiQC +multiqc_command = "multiqc ./fastqc_results/ -o ./multiqc_report/" +subprocess.run(multiqc_command, shell=True) + +# Step 4: Run STAR aligner +for index, row in sample_sheet.iterrows(): + star_command = f"STAR --genomeDir /path/to/genome --readFilesIn {row['file_path']} --outFileNamePrefix ./star_results/{row['sample_id']}" + subprocess.run(star_command, shell=True) + +# Step 5: Index BAM files with Samtools +for index, row in sample_sheet.iterrows(): + bam_file = f"./star_results/{row['sample_id']}.bam" + samtools_command = f"samtools index {bam_file}" + subprocess.run(samtools_command, shell=True) + diff --git a/notebooks/GenAI/example_scripts/quick-actions-testing.ipynb b/notebooks/GenAI/example_scripts/quick-actions-testing.ipynb new file mode 100644 index 0000000..bb8cf6b --- /dev/null +++ b/notebooks/GenAI/example_scripts/quick-actions-testing.ipynb @@ -0,0 +1,133 @@ +{ + "cells": [ + { + "cell_type": "code", + "execution_count": 1, + "id": "bed9c0c9-4756-4161-b4be-e32ce3a58bff", + "metadata": {}, + "outputs": [], + "source": [ + "#Cell 1\n", + "#import libraries\n", + "\n", + "import pandas as pd\n", + "import numpy as np" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "64f8d006-28f3-4d83-ae3a-9e23cccff5d7", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + " Gene Expression_Level Sample_ID Condition\n", + "0 GeneA 12.5 S1 Control\n", + "1 GeneB 8.3 S2 Treated\n", + "2 GeneC 15.2 S3 Control\n", + "3 GeneD 7.8 S4 Treated\n" + ] + } + ], + "source": [ + "#Cell 2\n", + "\n", + "# Initialize data of lists\n", + "data = {\n", + " 'Gene': ['GeneA', 'GeneB', 'GeneC', 'GeneD'],\n", + " 'Expression_Level': [12.5, 8.3, 15.2, 7.8],\n", + " 'Sample_ID': ['S1', 'S2', 'S3', 'S4'],\n", + " 'Condition': ['Control', 'Treated', 'Control', 'Treated']\n", + "}\n", + "\n", + "# Create DataFrame\n", + "df = pd.DataFrame(data)\n", + "\n", + "# Display the DataFrame\n", + "print(df)" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "8ca19300-1635-4a8a-9ef8-f9554bc1baac", + "metadata": {}, + "outputs": [ + { + "ename": "NameError", + "evalue": "name 'describe' is not defined", + "output_type": "error", + "traceback": [ + "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", + "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", + "Cell \u001b[0;32mIn[5], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[38;5;66;03m# View summary statistics\u001b[39;00m\n\u001b[0;32m----> 2\u001b[0m \u001b[43mdescribe\u001b[49m()\u001b[38;5;241m.\u001b[39mdf\n", + "\u001b[0;31mNameError\u001b[0m: name 'describe' is not defined" + ] + } + ], + "source": [ + "#Cell 3\n", + "\n", + "# Error debugging test /fix\n", + "# View summary statistics\n", + "describe().df" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "550a402e-66bd-4890-a063-e3d82679c0a8", + "metadata": {}, + "outputs": [], + "source": [ + "#Cell 4\n", + "\n", + "# Optimize selection test /optimize\n", + "# Add additional data to the dataframe\n", + "df['Sample_Type'] = ['Tissue1', 'Tissue2', 'Tissue1', 'Tissue2']\n", + "df['P_Value'] = [0.05, 0.01, 0.03, 0.07]" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0d45c1aa-2075-4c8a-9ecc-94fb03a71f78", + "metadata": {}, + "outputs": [], + "source": [ + "#Cell 5\n", + "\n", + "#Explain selection test /explain\n", + "#Plot results\n", + "import matplotlib.pyplot as plt\n", + "\n", + "df.plot(x='Sample_ID', y='Expression_Level', kind='line')\n", + "plt.show()" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.10" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}