USE ExampleDatabase;First of all a warning: UPDATE queries will permanently overwrite data in a column! Therefore, it is often best to create a new (empty) column first and to update this column, rather than overwriting existing data. UPDATE queries can be used with mathematical, text, date and logical functions. They can also add data from another table, which will be covered in the section on JOINS. The mathematical, text, date and logical functions shown in this section can also be used in SELECT queries. As shown in the previous section, SELECT queries do not change the exsiting tables; therefore, it can be helpful to run a SELECT query first, to check if the function creates the desired result and then to modify the syntax into an UPDATE query.
In it's most basic application, an UPDATE query can be used to copy data from one origin column to another target column.
ALTER TABLE questionnaire.recruitment --create new column
ADD smoker_2 int NULL; --same data type as original column
GO
UPDATE questionnaire.recruitment
SET smoker_2=smoker; --target column = origin columnIf the target column has the same data type as the origin column, the UPDATE function should run without problems. If the target column has a different data type than the origin column, SQL Server will try to convert the original data type to work with the target data type. This conversion is invisble to the user and is called an implicit conversion. This link https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-conversion-database-engine?view=sql-server-2017 shows a matrix of the conversion between data types.
For more control, explicit data conversion can be used via the CAST or CONVERT function. In most cases CAST and CONVERT are interchangeable, i.e. they will produce the same result. Therefore, it often comes down to personal preference; however, one exception is dates. For coverting date or datetime columns, the CONVERT function is recommended, because it allows an output style to be specfied (see https://teamsql.io/blog/?p=1455 and https://docs.microsoft.com/en-us/sql/t-sql/functions/cast-and-convert-transact-sql?view=sql-server-2017).
ALTER TABLE person.info
ADD dob_text varchar(10);
--CAST
UPDATE person.info
SET dob_text=CAST(dob AS varchar(10)); CAST will convert the date into a string in the same format as it is stored as a date.
ALTER TABLE person.info
ADD dob_date2 varchar(10);
--CONVERT
UPDATE person.info
SET dob_date2=CONVERT(varchar(10),dob,103);By adding style 103 to the CONVERT statement the string uses the dd/mm/yyyy output format.
UPDATE can also be used with a where clause to selectively change rows:
ALTER TABLE person.info
ADD gender varchar(6);
UPDATE person.info
SET gender='female'
WHERE sex=1;
UPDATE person.info
SET gender='male'
WHERE sex=0;The arithmetic operators are:
| Operator | Use |
|---|---|
| + | addition |
| - | subtraction |
| * | multiplication |
| / | division |
| % | modulo |
ALTER TABLE questionnaire.recruitment
ADD smokerPlusExercise int;
GO
UPDATE questionnaire.recruitment
SET smokerPlusExercise=smoker+exercise;If one of the values in an arithmetic operation is missing (i.e. NULL), the result will be NULL.
ALTER TABLE questionnaire.recruitment
ADD smokerPlusalcohol int;
GO
UPDATE questionnaire.recruitment
SET smokerPlusalcohol=smoker+alcohol;Examples of the other mathematical operators using a SELECT query:
SELECT *
,BP_systolic-BP_diastolic AS subtraction
,BP_systolic*BP_diastolic AS multiplication
,BP_systolic/BP_diastolic AS division
,BP_systolic%BP_diastolic AS modulo
FROM health.bloodpressureNote that the division results are all 1 or 2. This has happened, because both BP_systolic and BP_diastolic have data type int and therefore the resulting column is also an int. To get a result with decimals, one or both columns need to be converted into numeric data type first:
SELECT *
,CAST(BP_systolic AS NUMERIC(4,1))/CONVERT(numeric(5,2),BP_diastolic) AS division_decimals
FROM health.bloodpressureThe result is shown with a greater number of decimal places than specified in the CAST or CONVERT statement. This link https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-2017 shows how the precision is set for different arithmetic operations. For divisions the minimum precision is 6 decimal places.
| Function | Meaning | Example |
|---|---|---|
| ABS | absolute value of a number |
SELECT*
,ABS(temperature) as abs_temp
FROM env.temperature;
|
| POWER | raises value to specified power |
SELECT*
,POWER(temperature,3) as power_temp
FROM env.temperature;
SELECT*
,POWER(concentration,0.5) as squareroot_conc
FROM env.pollutant
|
| SQUARE | square of a number |
SELECT*
SQUARE(concentration) as square_conc
FROM env.pollutant
|
| SQRT | square root of a number |
SELECT*
SQRT(concentration) as squareroot_conc
FROM env.pollutant
|
| EXP | exponential (raises value to base e) |
SELECT*
,EXP(concentration) AS exp_conc
FROM env.pollutant
|
| LOG | logarithm of a number, base is optional, by default the natural logarithm will be calculated |
SELECT*
,LOG(concentration) AS ln_conc
,LOG(concentration,5) AS logbase5_conc
FROM env.pollutant
|
| LOG10 | logarithm of a number with base 10 |
SELECT*
,LOG10(concentration) AS lg_conc
FROM env.pollutant
|
| SIGN | returns -1,0 or +1 |
SELECT*
,SIGN(temperature)
FROM env.temperature
|
The ROUND function will round to a specified number of decimal places. If a negative number of decimal places is specified SQL will round to that number of places to the left of the decimal point, i.e. -1 will round to tens, -2 will round to hundreds and so on. The FLOOR and CEILING functions can be used to always round down or up. However, these always round to an integer.
SELECT *
,ROUND(CAST(BP_systolic AS NUMERIC(4,1))/CONVERT(numeric(5,2),BP_diastolic),2) AS division_decimals
,ROUND(BP_systolic,-1) AS negative_round
,FLOOR(CAST(BP_systolic AS NUMERIC(4,1))/CONVERT(numeric(5,2),BP_diastolic)) AS rounddown
,CEILING(CAST(BP_systolic AS NUMERIC(4,1))/CONVERT(numeric(5,2),BP_diastolic)) AS roundup
FROM health.bloodpressureThe RAND function will return a pseudo-random number between 0 and 1.
SELECT *
,RAND(100) AS random
FROM person.infoHowever, every row in the random column, shows the same number, because the RAND function only ran once. To add a different random number in each, you could add a loop that runs RAND() multiple times or you could use the quick fix below. CRYPT_GEN_RANDOM() generates cryptographic randomly generated hexadecimal numbers. The number between the brackets is the length of the hexadecimal in bytes. The modulo is used to convert the hexadecimal into an integer. The random numbers generated range from 0 to n-1, where n is the number used in the modulo.
SELECT *
,CRYPT_GEN_RANDOM(2)%100 AS random
FROM person.infoAdd two columns to the env.temperature table. In the first column calculate the temperature in Kelvin. In the second column calculate the temperature in Fahrenheit.
Click to see solution
ALTER TABLE env.temperature
ADD kelvin decimal(6,2) NULL
,fahrenheit decimal(6,2) NULL;
GO
UPDATE env.temperature
SET kelvin=temperature+273.15
,fahrenheit=temperature*1.8+32;
SELECT *
FROM env.temperatureThe CONCAT function can be used to concatenate two or more columns or user specified strings. The columns can have any data type; however, SQL will convert the data into strings, therefore the resulting columns will be a text column. If a user speficied string is used, it must be enclosed in single quotes.
ALTER TABLE questionnaire.recruitment
ADD fav_col_fruit varchar(20) NULL
,fav_col_date varchar(20) NULL
,smoker_alcohol varchar(3) NULL;
GO
UPDATE questionnaire.recruitment
SET fav_col_fruit = CONCAT(fav_col,fav_fruit)
,fav_col_date = CONCAT(fav_col,'2019-01-23')
,smoker_alcohol = CONCAT(smoker,'_',alcohol); The LEN function can be used to count the number of characters in a cell. It only works with text data type, not with numeric or date data types. It will count ALL characters, including white spaces.
SELECT *
,LEN(fav_col) AS length_colour
,LEN(fav_fruit) AS length_fruit
,LEN(fav_col_fruit) AS length_both
,LEN(fav_col)+LEN(fav_fruit) AS length_check
,LEN(smoker_alcohol) AS length_smok_alc
FROM questionnaire.recruitment;
SELECT LEN('1 2 3 4 5 ')--white spaces are countedTo remove leading or trailing white spaces from a string the LTRIM and RTRIM functions can be used. LTRIM will remove all white spaces to the left of the first character, RTRIM will remove all white spaces to the right of the last character. ``SQL SELECT LEN(LTRIM(' 12345')) AS fromLeft; SELECT LEN(RTRIM('12345 ')) AS fromRight;
SQL Server 2017 upwards also has the TRIM function. TRIM will remove white space from anywhere in a string. It can also be used to remove specific characters from a string. In SQL Server 2012, the same can be achived with the REPLACE function. The REPLCAE function uses the following syntax REPLACE('a string','characters to be replaced','new characters'). So to remove all white spaces you could use:
```SQL
SELECT REPLACE(' 1 2 3 4 5 ',' ','')
REPLACE can change any character pattern:
SELECT*
,REPLACE(fav_col,'bl','xx') as new_col
FROM questionnaire.recruitment;Useful functions to extract part of a string are LEFT,RIGHT, SUBSTRING and CHARINDEX. LEFT extracts the first n characters from a string by using LEFT('a string', number of characters). The RIGHT syntax works in the same way, but extracts the last n characters from a string. The SUBSTRING function can extract from anywhere in a string by specifying a starting position and the number of characters after the starting position. CHARINDEX is useful when the starting position or the number of characters to be extracted varies. For example, the first part of a postcode can vary in length from 2 to 4 characters; therefore to extract the first part of a postcode, you could use CHARINDEX to find the white space separating the first and second part of the postcode and then use this with a LEFT function. The CHARINDEX syntax requires a character to be found, a string to be searched, and can optionally have a starting position: CHARINDEX('character to be found','a string to be searched','optional starting position'.
SELECT *
,LEFT(fav_col_fruit,5) as leftstring
,RIGHT(fav_col_date,10) as rightstring -- the result shown is a still a string, not a date!
FROM questionnaire.recruitment;
SELECT*
,LEN(postcode) as lengthPostcode
,CHARINDEX(' ',postcode) as lengthFirstPart
,LEFT(postcode,CHARINDEX(' ',postcode)-1) AS firstPart
,SUBSTRING(postcode,CHARINDEX(' ',postcode)+1,LEN(postcode)-CHARINDEX(' ',postcode)) as secondPart
FROM person.info For a list of all string functions go to: https://docs.microsoft.com/en-us/sql/t-sql/functions/string-functions-transact-sql?view=sql-server-2017
SQL can add timestamps to data. The most commonly functions for this are SYSDATETIME and CURRENT_TIMESTAMP. They only differ in the number of fractional seconds they provide. Obvioulsy, these functions are most useful when adding rows to a table, but they can be used at any time.
SELECT*
,SYSDATETIME() AS sysdate
,CURRENT_TIMESTAMP as curr_timestamp
FROM person.infoThe DATEADD function can add or subtract timeperiods specified in a number of units ranging from nanoseconds to years. For a full list of available units see: https://docs.microsoft.com/en-us/sql/t-sql/functions/dateadd-transact-sql?view=sql-server-2017 . The function uses the following syntax: DATEADD('unit of timeperiod to be added, e.g. year or minute', 'number of timeperiods to be added', 'date to which timeperiod should be added').
SELECT *
,DATEADD(month,12,dob) as months12after
,DATEADD(month,-12,dob) as months12before
FROM person.info;
SELECT *
,DATEADD(minute,10,vDateTime) as plus10minutes
FROM env.pollutant;The DATEDIFF function can calculate the difference between two dates or datetimes in a number of units. It is similar to the DATEADD function in that you need to specify a unit for the timeperiod. The syntax is DATEDIFF('unit of timeperiod, e.g. day or hour','start date','end date').
SELECT *
,DATEDIFF(year,dob,SYSDATETIME()) as agenow
FROM person.info;
SELECT *
,DATEDIFF(year,t2.dob,t1.vDate) as ageAtVisit
FROM person.schedule AS t1
LEFT JOIN person.info AS t2
on t1.subjectID=t2.subjectID;Dates can be created by providing integers for the year, month, and day using the DATEFROMPARTS(year,month,day) syntax. This can be useful when dates in the original dataset are recorded in an awkward format and need to be imported as a text string.
--Example of awkward dates
CREATE TABLE #tempdate
(weirddate varchar(10) NULL);
GO
INSERT INTO #tempdate (weirddate)
VALUES ('12/31/2019'),('5/6/19');
SELECT*
FROM #tempdate;
ALTER TABLE #tempdate
ADD vYear int
,vMonth int
,vDay int
,vDate date;
UPDATE #tempdate
SET vMonth = CAST(LEFT(weirddate,CHARINDEX('/',weirddate)-1) AS int);
UPDATE #tempdate
SET vDay = CAST(SUBSTRING(weirddate,LEN(vMonth)+2,CHARINDEX('/',weirddate,LEN(vMonth)+1)-1) AS int);
UPDATE #tempdate
SET vYear = CAST(RIGHT(weirddate,LEN(weirddate)-(LEN(vMonth)+LEN(vDay)+2)) AS int);
UPDATE #tempdate
SET vDate = DATEFROMPARTS(vYear,vMonth,vDay);
UPDATE #tempdate
SET vDate = DATEADD(year,2000,vDate)
WHERE vDate < '2000-01-01';The DATEPART function can be used on existing dates. It can extract any part of a date, but is particularly useful to extract the weekday or the day of the year.
SELECT*
,DATEPART(weekday,dob) as dobWeekday --in the default setting Sunday = 1!
,DATEPART(quarter,dob) as dobQuarter
FROM person.info;The CHOOSE function can be used to select something from a list based on an integer value. The syntax is CHOOSE(integer index, 'list of things to choose from separated by commas').
SELECT*
,CHOOSE(DATEPART(weekday,dob),'Sun','Mon','Tue','Wed','Thu','Fri','Sat') as dobWeekday
FROM person.info;The most common logical expressions are conditional statements (aka if...else statements). SQL Server uses the CASE syntax for conditional statements.
CASE 'name of a column to be checked or can be left blank'
WHEN 'condition to be checked'
THEN 'what to do if condition is true
ELSE 'what to do if condition is false, this is optional, if no else option is specified, cells where the condition is false will be set to NULL'
END
You can use multiple WHEN ... THEN... statements.
ALTER TABLE person.info
ADD gender_case varchar(6);
GO
UPDATE person.info
SET gender_case =
CASE sex
WHEN 1 THEN 'female'
ELSE 'male'
END;
ALTER TABLE health.physical
ADD bmi_cat_number int NULL;
GO
UPDATE health.physical
SET bmi_cat_number =
CASE
WHEN bmi_cat = 'normal' THEN 0
WHEN bmi_cat = 'overweight' THEN 1
WHEN bmi_cat = 'underweight' THEN -1
END;The same operators that are used in where clauses, can be used in CASE statements.
SELECT *
,CASE
WHEN height>190 AND weight<60 THEN 'tall & light'
END AS bmi_text
FROM health.physicalAdd a column to the person.info table called postcode8. In this column show the postcode with 8 characters,i.e. if a postcode is already 8 characters long, copy it as is, if it is shorter add more whitespace to the whitespace in the middle, so that it becomes 8 characters long.
Click to see solution
ALTER TABLE person.info
ADD postcode8 varchar(8) NULL;
GO
UPDATE person.info
SET postcode8 =
CASE
WHEN LEN(postcode) = 8 THEN postcode
WHEN LEN(postcode) = 7 THEN REPLACE(postcode,' ',' ')
WHEN LEN(postcode) = 6 THEN REPLACE(postcode,' ',' ')
END;
SELECT *
,LEN(postcode8)
FROM person.info;