Skip to content

Emnyldrmdgn/02-Query-files-using-a-serverless-SQL-pool

Repository files navigation

Query-files-using-a-serverless-SQL-pool

Azure Synapse Analytics provides serverless SQL pools that enable you to decouple the SQL query engine from the data storage and run queries against data files in common file formats such as delimited text and Parquet.

Azure Synapse Analytics - Data Querying and Visualization

This project involves querying, analyzing, and visualizing data using Azure Synapse Analytics. Various data file formats (CSV, JSON, Parquet) were queried using SQL, and meaningful insights were derived from the data. Additionally, the results were visualized to present the data analysis more effectively.

Contents

  1. Azure Synapse Analytics Workspace Setup
    An Azure Synapse Analytics workspace was created in the Azure portal, and the Azure Data Lake Storage connection was established.

  2. Data Querying and File Types

    • CSV Files: SQL queries were executed on CSV files to retrieve the first 100 rows of data.
    • JSON Files: SQL queries were run on JSON files to extract values from the JSON data.
    • Parquet Files: SQL queries were executed on Parquet files to analyze the data.
  3. Data Visualization

    • SQL query results were visualized using the integrated charting tools in Synapse Studio.
    • Line and column charts were used to visualize the annual revenue trend.
  4. Data Sources and External Databases
    External data sources were created using PolyBase to reference data stored in the data lake for more complex queries.

    • External Data Source: An external data source was set up to allow SQL querying of data from the data lake.
    • External Table: External tables were created to query data from CSV files.
  5. Resource Cleanup
    After completing the project, Azure Synapse Analytics resources were cleaned up to avoid unnecessary costs.

Requirements

  • Azure Subscription (Administrator-level access)
  • Azure Synapse Analytics Workspace
  • Azure Data Lake Storage Gen2
  • SQL Knowledge

Technologies Used

  • Azure Synapse Analytics
  • SQL (Structured Query Language)
  • Azure Data Lake Storage Gen2
  • PolyBase
  • Synapse Studio

Steps

  1. Create an Azure Synapse Analytics workspace.
  2. Upload data files to Azure Data Lake.
  3. Query the data using SQL in Synapse Studio.
  4. Visualize the query results.
  5. Clean up resources once the project is completed.

Outcomes

This project demonstrates how data querying and visualization processes can be simplified and how Azure Synapse Analytics can be used effectively for data analysis.

ScreenShots

lab21 lab22 lab23 lab24 lab25 lab26 lab27 lab27properties lab28code lab28code2 lab28code3 lab28code4 lab29code lab29code2 lab210code lab2sqldatacode1-2 lab2sqldatacode3 lab2sqlchart1 lab2sqlchart2 lab2sqlchart3

About

Azure Synapse Analytics provides serverless SQL pools that enable you to decouple the SQL query engine from the data storage and run queries against data files in common file formats such as delimited text and Parquet.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors