Course
machine-learning-zoomcamp
Question
How we can estimate the cost of executed queries within a specified range of time in bigquery, and determine what users and when in time this executions were made.
Answer
The metadata table that contains all the history related with query and Jobs executions is called 'JOBS_BY_PROJECT' and we can find it in the path <your_project_id>.<your_region>.INFORMATION_SCHEMA.JOBS_BY_PROJECT; the following query must be executed within the bigquery query editor.
SELECT
DATE(creation_time) as date,
job_type,
statement_type,
user_email,
SUM(total_bytes_billed) / 1099511627776 * 6.25 as estimated_cost_usd, -- on-demand model costs are defined US$ 6.25 per TB
COUNT(*) as query_count
FROM project-0c3c5223-416f-4242-b0f.region-us.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time >= DATE_ADD(CURRENT_TIMESTAMP(), INTERVAL -60 DAY) --Interval can be modified
GROUP BY date, user_email, job_type, statement_type
ORDER BY estimated_cost_usd DESC -- Order by the highest cost execution based on total bytes billed
;
Output:
With this query we will be able to find insights about in which dates and what users did query executions with highest costs and optimize ones that are not efficient.
Checklist
Course
machine-learning-zoomcamp
Question
How we can estimate the cost of executed queries within a specified range of time in bigquery, and determine what users and when in time this executions were made.
Answer
The metadata table that contains all the history related with query and Jobs executions is called 'JOBS_BY_PROJECT' and we can find it in the path
<your_project_id>.<your_region>.INFORMATION_SCHEMA.JOBS_BY_PROJECT; the following query must be executed within the bigquery query editor.SELECT
DATE(creation_time) as date,
job_type,
statement_type,
user_email,
SUM(total_bytes_billed) / 1099511627776 * 6.25 as estimated_cost_usd, -- on-demand model costs are defined US$ 6.25 per TB
COUNT(*) as query_count
FROM
project-0c3c5223-416f-4242-b0f.region-us.INFORMATION_SCHEMA.JOBS_BY_PROJECTWHERE creation_time >= DATE_ADD(CURRENT_TIMESTAMP(), INTERVAL -60 DAY) --Interval can be modified
GROUP BY date, user_email, job_type, statement_type
ORDER BY estimated_cost_usd DESC -- Order by the highest cost execution based on total bytes billed
;
Output:
With this query we will be able to find insights about in which dates and what users did query executions with highest costs and optimize ones that are not efficient.
Checklist