Updating records in the Database

The usual form of an UPDATE statement is
UPDATE [LOW_PRIORITY] tablename
SET column1=expression1 [, column2=expression2, ...]
[ WHERE condition]
[ LIMIT number ];
An expression can be a constant or a mathematical/string/date expression that can include numeric, date, string, or flow control functions and also the values from the row under consideration. Let's consider a simple example where all employees are getting their salary increased by 15%:
 UPDATE employee
    SET salary = 1.15 * salary;
Please note that each employee has 15% of his/hers own salary.

In WHERE clause we can set some conditions that allow us to modify not all but selected records only. Let's try to increase the salary for only those employees who were hired more than 5 years ago:
 UPDATE employee
    SET salary = 1.15 * salary
  WHERE (now()-interval 5 year) > hired;
Of course we can change only a single row if we want.

Using the LIMIT section we can specify how many rows we want to change. Let's increase salary for only three employees:
 UPDATE employee
    SET salary = 1.05 * salary
  LIMIT 3;

Deleting Records from the Database

Deleting rows from databases is very simple. You can do this using the DELETE statement, which generally looks like this:
DELETE FROM tablename
[ WHERE condition]
[ ORDER BY columnname]
[ LIMIT number] ;
If you use the DELETE statement without WHERE clause MySQL will delete all records from the specified table. Using WHERE clause you can specify what exactly you want to delete. Let's delete all employees with salary more than $50,000.00
 DELETE FROM employee
  WHERE salary > 50000.00;

ORDER BY instructs the MySQL database in what order these records must be deleted. Of course there is no reason to use this option except with LIMIT section of the statement. Let's remove the first two employees with salary less than $10,000.00 (the first two from alphabetically ordered list):
 DELETE FROM employee
  WHERE salary < 10000.00
  ORDER BY name
  LIMIT 2;

Starting from version 4.0.0 the multi table delete format is supported in MySQL. The general form for this format is:

DELETE tablename [, tablename, ...]
FROM list_of_tables
[ WHERE condition];
The idea is that only matching rows from the tables listed before the FROM clause is deleted. The effect is that you can delete rows from many tables at the same time and also have additional tables that are used for searching. Example:
 DELETE t1, t2
   FROM t1, t2, t3
  WHERE t1.id = t2.id AND t2.id=t3.id;
In this example we remove matching rows from tables t1 t2.

Retrieving Data from the Database

To retrieve data from tables the SELECT statement is used. The basic form of this statement is:
SELECT item_list
FROM tablelist
[ WHERE conditions ]
[ GROUP BY group_type]
[ HAVING where_definition ]
[ ORDER BY order_type [ASC | DESC]]
[ LIMIT [start,] howmany]

Retrieving data from all rows

The simplest way to obtain data from a table is the following:
 SELECT *
   FROM students;
This query will bring all information from table students. If we are interested in only some particular attributes (not all of them), then we can specify what attributes we want to see:
 SELECT id, lastname, firstname, dateofbirth
   FROM students;

Retrieving data with specific criteria

Most of the time queries without any additional restrictions bring too many rows. If we know what we are looking for we can specify some selection criteria. For example, let's retrieve the names of all students with GPA greater than 3.5:
 SELECT firstname, lastname
   FROM students
  WHERE gpa > 3.5;
The WHERE clause specifies the criteria used to select particular rows. Table below contains different types of operators, which can be used in WHERE clause:
Operator Description Example
= Tests whether two values are equal
SELECT name
  FROM customers
 WHERE id=17;
> Tests whether the first value is greater than the other
SELECT name
  FROM employee
 WHERE salary > 45000.00
< Tests whether the first value is less than the other
SELECT name
  FROM employee
 WHERE salary < 45000.00
>= Tests whether the first value is greater than or equal to the other
SELECT name
  FROM employee
 WHERE salary >= 55000.00
<= Tests whether the first value is less than or equal to the other
SELECT name
  FROM employee
 WHERE salary <= 55000.00
!=
<>
Tests whether two values are not equal
SELECT lastname
  FROM students
 WHERE gpa != 2; 
IS NULL Tests whether field actually contains a value. Please note that you can not use ... WHERE retired = NULL
SELECT lastname
  FROM students
 WHERE graduated IS NULL; 
IS NOT NULL Tests whether field doesn't contain a value. Please note that you can not use ... WHERE retired <> NULL
SELECT lastname
  FROM students
 WHERE graduated IS NOT NULL; 
BETWEEN Tests whether a value is greater or equal to a minimum value and less than or equal to a maximum value
SELECT lastname
  FROM students
 WHERE graduated IS NOT NULL
       AND
       gpa BETWEEN 2.5 AND 3.9;
IN Tests whether a value is in a particular set
SELECT lastname
  FROM students
 WHERE MONTH(graduated) IN (10, 11, 12);
NOT IN Tests whether a value is not in a particular set
SELECT lastname
  FROM students
 WHERE MONTH(graduated) IN (6, 7, 8);
LIKE Checks whether a value matches a pattern using simple SQL pattern matching. You can use two special symbols here:
  • _ - matches any single character
  • % - matches any sequence of characters
  • SELECT lastname
      FROM students
     WHERE lastname LIKE 'Bo%'
           OR
           lastname LIKE '_o%';
    NOT LIKE Checks whether a value does not match a pattern
    SELECT lastname
      FROM students
     WHERE lastname NOT LIKE '_____';
    REGEXP [BINARY] Checks whether a value matches a regular expression. This is much more powerful than simple match using LIKE. For more information see MySQL documentation
    SELECT lastname
      FROM students
     WHERE lastname REGEXP BINARY 'L.*r';

    Of course you don't have to use only one condition in the WHERE clause. You can have a whole logical expression containing simple logical conditions combined with AND, OR, NOT, and parenthesis.

    Aliases

    Assume a have table employee and in this table I have name attribute that contains both first and last name separated with space. I would like to print only the first names of my employees. To do that I will use two string functions instr() and substring() that will be discussed later in more detailed way. So, my query will be something like this:
     SELECT substring(name, 1, instr(name, ' ')-1)
       FROM employee;
    
    and the result:
    
    substring(name, 1, instr(name, ' ')-1)
    --------------------------------------
    John
    Bob
    Annie
    Bob
    
    4 row(s) selected
    Everything works just fine. The only thing I'm not happy about is the name of the column I see in the output. It's hard to understand at a glance what substring(name, 1, instr(name, ' ')-1) means. We can easily avoid this kind of things using column aliases:
     SELECT substring(name, 1, instr(name, ' ')-1) AS 'First name',
            substring(name, instr(name, ' ')+1) AS 'Last name'
       FROM employee;
    
    In this case the output will be:
    First name	Last name
    ------------- ----------------
    John          Smith
    Bob           Smith
    Annie         Lancaster
    Bob           Bimon
    
    4 row(s) selected
    You can even use these aliases in ORDER BY and GROUP BY clauses:
    SELECT name, 1.15*salary as newsalary
      FROM employee
     ORDER by newsalary DESC;
    

    Retrieving data in a particular order

    If we need to sort the output in a specific order we can use ORDER BY clause. In this clause we can specify a list of columns in the order of we want them to be sorted and modifiers ASC and/or DESC. Let's look at the example:
     SELECT *
       FROM employee
      ORDER BY hired DESC, salary ASC;
    
    This query will print all information about all employees and the rows will be sorted by hiring date (in descending order), if several employees have the same hiring date, then they will be sorted by their salary (in ascending order). Please note that you can use aliases (assign in SELECT clause) he instead of real column names.

    Retrieving just a portion of data

    We can use LIMIT section of SELECT statement to retrieve only a portion of data from a table. This query produces only three rows as its result even if there are many more in the table.
     SELECT *
       FROM employee
      LIMIT 3;
    
    Using LIMIT with two arguments we can control at what record the query will start printing the rows and how many of it'll print. So, the next query will print 5 more records from the employee table starting from the fourth one:
     SELECT *
       FROM employee
      LIMIT 3, 5;
    
    More exactly, it will print no more than 5 rows (how many depends on how many records you have in your table). This feature is very convenient for web output when you don't wont to put too many information on one page.

    Retrieving data from multiple tables

    Often, to answer a question from the database, you will need to use data more than from one table. For example, if you want to print all employees from a given department it is not enough for you to select data from only employee table because this table does not contain department names. Fortunately, SELECT statement allows us to use a list of tables in its FROM clause:
     SELECT *
       FROM employee, department;
    
    Please note than this query produces too many extra rows because it performs cartesian product. To avoid that we need to specify that we want to join these tables. The JOIN methods will be discussed later for now it's enough to specify additional restriction on printer rows:
     SELECT *
       FROM employee, department
      WHERE id = depid;
    
    Unfortunately this query does not work. Why?
    To avoid that error we have to specify which table we want to get id attribute from:
     SELECT *
       FROM employee, department
      WHERE department.id = depid;
    

    As a matter of fact sometime we need to select from two copies of the same table. Let's consider the following example:

  • Select names and salaries for all employees whose salary is greater than John's.
    First of all we need to use employee table to know what's John's salary (SELECT salary FROM employee WHERE name LIKE 'John%'), then we need to use employee table to find all people whose salary is greater. But we can not use the same table since this query produces only one row (we have only one John in our company) so we have no other salary to compare. That means we need to select from 2 employee tables like this:
     SELECT name, salary
       FROM employee, employee
      WHERE name LIKE 'John%'
            AND
            salary > salary;
    
    Obviously, this query will not work. Why ?
    We can't avoid that ambiguity using the same method we used before put table name in front of attribute names to specify which table this attribute is coming from) because we have the same tables. In situation such as this we need to use aliases for table names:
     SELECT e2.name, e2.salary
       FROM employee e1, employee e2
      WHERE e1.name LIKE 'John%'
            AND
            e2.salary > e1.salary;
    
    Some time we need to select from more than two tables. For example:
  • Print the names of all employees and their bosses.

    Joining tables

    In the section above we saw how we can join two tables. There is a different syntax for the same query:
    SELECT * 
      FROM department JOIN employee 
     WHERE department.id = employee.depid;
    
    Notice, we are using key word JOIN instead of comma. In this example only the rows where there was a match between the tables were included. Sometimes we especially interested in the rows where there is no match -- for example, departments that have no employees, books that have never been ordered, customers who never placed an order. The easiest way to answer this type of question in MySQL is to use a left join:
    SELECT d.id, d.name, e.name 
      FROM department  d LEFT JOIN employee e
        ON d.id = e.depid;
    
    Please note the difference :
    Left join Cross join (usual)
    SELECT d.id, d.name, e.name 
      FROM department  d LEFT JOIN employee e
        ON d.id = e.depid;
    
    SELECT d.id, d.name, e.name 
      FROM department  d, employee e
     WHERE d.id = e.depid;
    
    id  name            name    
    SY  Systems         John Smith  
    SY  Systems         Bob Smith  
    SY  Systems         Bob Bimon  
    SY  Systems         Tom Jeckson  
    PH  Physics         Anny Lancaster  
    PH  Physics         Sam Tinker  
    NW  New department  NULL
      
    7 row(s) selected
    
    id  name     name    
    SY  Systems  John Smith  
    SY  Systems  Bob Smith  
    PH  Physics  Anny Lancaster  
    SY  Systems  Bob Bimon  
    SY  Systems  Tom Jeckson  
    PH  Physics  Sam Tinker  
    
    6 row(s) selected
    

    If there is no matching row in the right table, a row will be added to the result that contains NULL values in the right columns. Using this we can find all departments that have no employees:
    SELECT department.id, department.name, 'Has no employee' AS Employment
      FROM department LEFT JOIN employee 
        ON employee.depid = department.id
     WHERE employee.id IS NULL;
    Look at the difference between department LEFT JOIN employee and employee LEFT JOIN department.

    Grouping and aggregating data

    We often want to know some information about data in a table, not data itself, for example, how mane employees in this department, the average salary in the company, etc. MySQL has a set of aggregate functions that are useful for answering this type of query:

    Name Description
    AVG(column) Average of values in the specified column
    COUNT(items) If you specify a column, this will give you the number of non-NULL values on that column. If you add the word DISTINCT in front of the column name, you will get a count of the distinct values in that column only. If you specify COUNT(*), you will get a row count regardless of NULL values.
    MIN(column) Minimum of values in the specified column
    MAX(column) Maximum of values in the specified column
    STD(column) Standard deviation of values in the specified column
    STDDEV(column) Same as STD(column)
    SUM(column) Sum of values in the specified column

    These aggregate functions can be applied to a table as a whole, or to groups of data within a table.
    Examples:
  • Count the number of all employees:
    SELECT COUNT(*)
      FROM employee;
    
  • Count the number of all employees in every department:
    SELECT depid, COUNT(id)
      FROM employee
     GROUP BY depid;
    
  • Print name of each department, number of employees in it, and their average salary:
    SELECT D.name, COUNT(E.id), AVG(salary)
      FROM department D LEFT JOIN employee E
        ON D.id = E.depid
     GROUP BY D.name;	
    
  • Or improved version of the previous query:
    SELECT D.name, 
           COUNT(E.id) AS 'Emp #', 
           TRUNCATE(IFNULL(AVG(salary), 0), 2) AS 'Average Salary'
      FROM department D LEFT JOIN employee E
        ON D.id = E.depid
     GROUP BY D.name;	
    
    Note: in ANSI SQL, if you use an aggregate function or GROUP BY clause, the only things that can appear in your SELECT clause are the aggregate function(s) and the columns named in the GROUP BY clause. Also, if you want to use a column in a GROUP BY clause, it must be listed in the SELECT clause.
    In addition to grouping and aggregating data, we can also test the result of an aggregate function using HAVING clause. Remember: you can not use aggregate functions in WHERE clause. For example, print the name of every department with more than 2 employees:
    SELECT D.name, COUNT(E.id) as empnum
      FROM department D, employee E
     WHERE D.id = E.depid
     GROUP BY D.name
    HAVING empnum > 2;	
    

    Using User Variables

    There queries that can not be created using only one MySQL select statement. Let's consider an example:
  • Print the names of all departments with average salary (for the department) greater than average salary of the company.
    We know how to find average salary of the company:
    SELECT AVG(salary)
      FROM employee;
    
    and we know how to find all department with the average salary greater than a specified value:
    SELECT D.name
      FROM department D, employee E
     WHERE D.id = E.depid
     GROUP BY D.name
    HAVING AVG(salary) > specified_value;    
    
    What we can not do is to substitute the first query for that specified_value into the second one. So, we need a way to remember the result of the first query and use it in the second one. MySQL offers a way to do it. This way uses user variables. We need to run these two queries to achieve the goal:
    SELECT @avg_salary:=avg(salary) 
      FROM employee;
    
    SELECT D.name AS Department,  
           truncate(avg(salary),2) AS ASalary 
      FROM department D, employee E 
     WHERE D.id = E.depid 
     GROUP BY D.name 
    HAVING ASalary > @avg_salary;
    


    For additional information see MySQL documentation.