DE101 comment for 1.7 states that the expected output is 25 lines, but got 399 instead with LIMIT 10 removed. Provided my notebook below, please advise if mistaken. Thank you.
Setup
%%capture
%%bash
python ./generate_data.py
python ./run_ddl.py
Run Python code as shown below
Run SQL code as shown below, with the %%sql called magics
25/12/18 13:23:57 WARN SparkSession: Using an existing Spark session; only runtime SQL configurations will take effect.
We use the prod.db schema where all our tables are create by run_ddl.py
Your code below
# THIS IS THE ACTUAL STATEMENT FROM 1.7:
# %%sql
# -- UNION will remove duplicate rows; the below query will produce 25 rows
# SELECT c_custkey, c_name FROM customer WHERE c_name LIKE'%_91%'
# UNION
# SELECT c_custkey, c_name FROM customer WHERE c_name LIKE '%_91%'
# UNION
# SELECT c_custkey, c_name FROM customer WHERE c_name LIKE '%_91'
# LIMIT 10;
%%sql
-- UNION will remove duplicate rows; the below query will produce 25 rows
SELECT COUNT(*) FROM(
SELECT c_custkey, c_name FROM customer WHERE c_name LIKE'%_91%'
UNION
SELECT c_custkey, c_name FROM customer WHERE c_name LIKE '%_91%'
UNION
SELECT c_custkey, c_name FROM customer WHERE c_name LIKE '%_91');
%%sql
-- UNION will remove duplicate rows; the below query will produce 25 rows
SELECT c_custkey, c_name FROM customer WHERE c_name LIKE'%_91%'
UNION
SELECT c_custkey, c_name FROM customer WHERE c_name LIKE '%_91%'
UNION
SELECT c_custkey, c_name FROM customer WHERE c_name LIKE '%_91'
LIMIT 30;
| c_custkey |
c_name |
| 8991 |
Customer#000008991 |
| 2916 |
Customer#000002916 |
| 7911 |
Customer#000007911 |
| 9140 |
Customer#000009140 |
| 3291 |
Customer#000003291 |
| 5914 |
Customer#000005914 |
| 9160 |
Customer#000009160 |
| 14991 |
Customer#000014991 |
| 9103 |
Customer#000009103 |
| 11191 |
Customer#000011191 |
| 11991 |
Customer#000011991 |
| 9116 |
Customer#000009116 |
| 9125 |
Customer#000009125 |
| 9186 |
Customer#000009186 |
| 9115 |
Customer#000009115 |
| 9121 |
Customer#000009121 |
| 9158 |
Customer#000009158 |
| 9139 |
Customer#000009139 |
| 9171 |
Customer#000009171 |
| 9197 |
Customer#000009197 |
| 9916 |
Customer#000009916 |
| 14091 |
Customer#000014091 |
| 1910 |
Customer#000001910 |
| 10910 |
Customer#000010910 |
| 14914 |
Customer#000014914 |
| 4915 |
Customer#000004915 |
| 7691 |
Customer#000007691 |
| 8910 |
Customer#000008910 |
| 12915 |
Customer#000012915 |
| 13391 |
Customer#000013391 |
DE101 comment for 1.7 states that the expected output is 25 lines, but got 399 instead with
LIMIT 10removed. Provided my notebook below, please advise if mistaken. Thank you.Setup
Run Python code as shown below
Run SQL code as shown below, with the
%%sqlcalled magicsWe use the
prod.dbschema where all our tables are create byrun_ddl.pyYour code below