What HAVING means in SQL, with SUM and other aggregates. The HAVING section of the SELECT command. Write queries with the SQL HAVING clause yourself, and then look at the solutions

SQL has many powerful tools for manipulating data stored in tables.

The ability to group data by some criterion when selecting it is undoubtedly one of these tools. HAVING, together with the WHERE clause, lets you define selection conditions for data that has already been grouped.

The SQL HAVING clause: description

First of all, note that this clause is optional and is normally used together with the GROUP BY clause. As you remember, GROUP BY is used when the SELECT list contains aggregate functions whose results must be computed for separate groups. While WHERE sets selection conditions before the data is grouped, HAVING holds conditions that apply to the groups themselves. For a better understanding, let's look at the example shown in the figure below.

This is an excellent example of how HAVING works. A table is given with a list of product names, the companies that produce them, and some other fields. The query in the upper right corner counts how many products each company produces and displays only those companies that produce more than 2 items. The GROUP BY clause forms three groups, one per company name, and the number of products (rows) is counted for each. The HAVING condition then removes one group from the result because it does not satisfy the condition. As a result, we get two groups, corresponding to the companies with 5 and 3 products.
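The query described above can be sketched as follows. This is a minimal, runnable illustration using Python's built-in sqlite3 module; the table name `products`, its columns, and the sample rows are assumptions, chosen so that three companies produce 5, 3 and 2 items.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, company TEXT)")

# Hypothetical sample data: company A makes 5 items, B makes 3, C makes 2.
rows = [(f"item{i}", company)
        for company, count in (("A", 5), ("B", 3), ("C", 2))
        for i in range(count)]
conn.executemany("INSERT INTO products VALUES (?, ?)", rows)

# GROUP BY builds one group per company; HAVING keeps groups with more than 2 rows.
result = conn.execute(
    """
    SELECT company, COUNT(*) AS items
    FROM products
    GROUP BY company
    HAVING COUNT(*) > 2
    ORDER BY items DESC
    """
).fetchall()
print(result)  # [('A', 5), ('B', 3)]
```

Company C forms a group too, but its count of 2 fails the HAVING condition, so it is dropped from the result.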

One might wonder why HAVING is needed when SQL already has WHERE. WHERE is evaluated for each row before any grouping takes place, so it cannot test a value such as the number of rows in a group, and the condition would make no sense there. However, the two clauses quite often coexist in one query.

In the example above, the rows are first selected by the employee names given in the WHERE clause, and then the result grouped by GROUP BY goes through an additional check on the salary amount for each employee.

The SQL HAVING clause: examples, syntax

Let's look at some features of the HAVING syntax, which is quite simple. First, as already noted, HAVING is normally used together with the GROUP BY clause and is written immediately after it and before ORDER BY, if the query has one. This is understandable, since HAVING defines conditions for data that is already grouped. Second, the condition may use only aggregate functions and the fields listed in the GROUP BY clause. Conditions in this clause are written in exactly the same way as in WHERE.

Conclusion

As you can see, there is nothing complicated about this clause. Syntactically, it is used in the same way as WHERE. The important difference is that WHERE applies to all selected rows, while HAVING applies only to the groups defined by the GROUP BY clause. This description of HAVING should be enough for you to work with it confidently.

How can you find out the number of PC models produced by a particular supplier? How do you determine the average price of computers that have the same specifications? These and many other questions about summary statistics can be answered using aggregate functions. The standard provides the following aggregate functions: COUNT, SUM, AVG, MIN and MAX.

All these functions return a single value. The functions COUNT, MIN and MAX are applicable to any data type, while SUM and AVG are used only for numeric fields. The difference between COUNT(*) and COUNT(<field name>) is that the second one ignores NULL values when counting.
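The difference between the two forms of COUNT can be checked with a short sketch. It uses Python's built-in sqlite3 module; the `pc` table and its three rows are assumptions made for illustration, with one price deliberately left as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pc (model TEXT, price REAL)")

# Hypothetical rows; the second PC has no price (NULL).
conn.executemany("INSERT INTO pc VALUES (?, ?)",
                 [("1121", 850.0), ("1232", None), ("1233", 600.0)])

# COUNT(*) counts rows, COUNT(price) counts only non-NULL prices.
count_all, count_price = conn.execute(
    "SELECT COUNT(*), COUNT(price) FROM pc"
).fetchone()
print(count_all, count_price)  # 3 2
```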

Example. Find the minimum and maximum price for personal computers:

Example. Find the available number of computers produced by manufacturer A:

Example. If we are interested in the number of different models produced by manufacturer A, the query can be formulated as follows (using the fact that each model is recorded once in the Product table):

Example. Find the number of different models produced by manufacturer A that are available. The query is similar to the previous one, which determined the total number of models produced by manufacturer A. Here we also need to find the number of different models in the PC table (i.e., those available for sale).

To ensure that only unique values are used when computing statistics, the DISTINCT parameter can be applied to the argument of an aggregate function. The other parameter, ALL, is the default and means that all returned values in the column are counted.
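The four examples above can be sketched in one runnable script. It uses Python's built-in sqlite3 module. The column layout of the `Product` and `PC` tables, the subquery used to select manufacturer A's models, and all sample rows are assumptions, since the text does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (maker TEXT, model TEXT, type TEXT);
    CREATE TABLE PC (code INTEGER, model TEXT, price REAL);
    -- Hypothetical data: maker A lists three models, only two of them are in stock.
    INSERT INTO Product VALUES ('A', '1232', 'PC'), ('A', '1233', 'PC'),
                               ('A', '1298', 'PC'), ('B', '1121', 'PC');
    INSERT INTO PC VALUES (1, '1232', 600), (2, '1232', 400),
                          (3, '1233', 600), (4, '1121', 850);
""")

def scalar(sql):
    """Run a query that returns a single row and return that row."""
    return conn.execute(sql).fetchone()

# Condition that restricts PC rows to models made by manufacturer A.
maker_a = "model IN (SELECT model FROM Product WHERE maker = 'A')"

price_range = scalar("SELECT MIN(price), MAX(price) FROM PC")
pcs_in_stock = scalar(f"SELECT COUNT(*) FROM PC WHERE {maker_a}")[0]
models_listed = scalar("SELECT COUNT(model) FROM Product WHERE maker = 'A'")[0]
models_in_stock = scalar(f"SELECT COUNT(DISTINCT model) FROM PC WHERE {maker_a}")[0]

print(price_range, pcs_in_stock, models_listed, models_in_stock)
# (400.0, 850.0) 3 3 2
```

Without DISTINCT the last query would return 3, because model 1232 appears twice in the PC table.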

If we need to get the number of PC models produced by each manufacturer, we have to use the GROUP BY clause, which syntactically follows the WHERE clause.

GROUP BY clause

The GROUP BY clause defines groups of output rows to which aggregate functions (COUNT, MIN, MAX, AVG and SUM) can be applied. If this clause is missing and aggregate functions are used, then every column mentioned in SELECT must appear inside an aggregate function, and these functions are applied to the entire set of rows that satisfy the query predicate. Otherwise, every column of the SELECT list that is not inside an aggregate function must be specified in the GROUP BY clause. As a result, all output rows are divided into groups that share the same combination of values in these columns, and the aggregate functions are then applied to each group. Note that GROUP BY treats all NULL values as equal, i.e. when grouping by a field that contains NULL values, all such rows fall into one group.
If a GROUP BY clause is present but the SELECT clause contains no aggregate functions, the query simply returns one row from each group. This feature, like the DISTINCT keyword, can be used to eliminate duplicate rows in a result set.
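The equivalence between GROUP BY without aggregates and DISTINCT can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the reduced `PC` table and its rows are assumptions, and ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PC (model TEXT, price REAL)")

# Hypothetical rows in which each model appears twice.
conn.executemany("INSERT INTO PC VALUES (?, ?)",
                 [("1121", 850.0), ("1232", 600.0), ("1232", 400.0), ("1121", 850.0)])

# Both queries return each model exactly once.
by_group = conn.execute("SELECT model FROM PC GROUP BY model ORDER BY model").fetchall()
by_distinct = conn.execute("SELECT DISTINCT model FROM PC ORDER BY model").fetchall()
print(by_group)  # [('1121',), ('1232',)]
```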
Let's look at a simple example:
SELECT model, COUNT(model) AS Qty_model, AVG(price) AS Avg_price
FROM PC
GROUP BY model;

This query determines the number of PCs and their average price for each model. All rows with the same model value form a group, and SELECT computes the number of values and the average price for each group. The result of the query is the following table:
model Qty_model Avg_price
1121 3 850.0
1232 4 425.0
1233 3 843.33333333333337
1260 1 350.0
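To try the grouping query end to end, the sketch below loads a small `PC` table with Python's built-in sqlite3 module. The individual prices are assumptions: they are not given in the text and were chosen only so that the counts and averages match the table above. ORDER BY is added for a predictable row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PC (model TEXT, price REAL)")

# Hypothetical prices per model, picked to reproduce the counts and averages above.
prices = {"1121": [850, 850, 850], "1232": [600, 400, 350, 350],
          "1233": [600, 950, 980], "1260": [350]}
conn.executemany("INSERT INTO PC VALUES (?, ?)",
                 [(model, price) for model, plist in prices.items() for price in plist])

summary = conn.execute(
    "SELECT model, COUNT(model) AS Qty_model, AVG(price) AS Avg_price "
    "FROM PC GROUP BY model ORDER BY model"
).fetchall()
for model, qty, avg_price in summary:
    print(model, qty, round(avg_price, 2))
```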

If the SELECT list also included a date column, these figures could be calculated for each specific date. To do this, add date as a grouping column, and the aggregate functions will then be calculated for each combination of values (model, date).

There are several specific rules for performing aggregate functions:

  • If the query returns no rows (or there are no rows for a given group), there is no source data for calculating any of the aggregate functions. In this case, the result of the COUNT function is zero, and the result of all other functions is NULL.
  • The argument of an aggregate function cannot itself contain aggregate functions (a function of a function). That is, a single query cannot, say, obtain the maximum of average values.
  • The result of the COUNT function is an integer (INTEGER). Other aggregate functions inherit the data types of the values they process.
  • If the SUM function produces a result that is greater than the maximum value of the data type used, an error occurs.
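The first two rules above can be checked with a short sketch using Python's built-in sqlite3 module. The table and its rows are assumptions. The subquery used to get the maximum of the averages is one common workaround, not something the text prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pc (model TEXT, price REAL)")
conn.executemany("INSERT INTO pc VALUES (?, ?)",
                 [("1121", 850.0), ("1232", 400.0), ("1232", 450.0)])

# No row matches the predicate, so COUNT yields 0 and the other
# aggregates yield NULL (None in Python).
empty = conn.execute(
    "SELECT COUNT(*), SUM(price), AVG(price), MIN(price) FROM pc WHERE price > 10000"
).fetchone()
print(empty)  # (0, None, None, None)

# Aggregates cannot be nested directly, so the averages are computed in a
# subquery and the outer query takes their maximum.
max_avg = conn.execute(
    "SELECT MAX(avg_price) FROM "
    "(SELECT AVG(price) AS avg_price FROM pc GROUP BY model)"
).fetchone()[0]
print(max_avg)  # 850.0
```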

So, if the query does not contain a GROUP BY clause, the aggregate functions in the SELECT clause are executed over all resulting rows of the query. If the query contains a GROUP BY clause, each set of rows that has the same values of the column or group of columns specified in GROUP BY makes up a group, and the aggregate functions are computed for each group separately.

HAVING clause

While the WHERE clause defines a predicate for filtering rows, the HAVING clause is applied after grouping to define a similar predicate that filters groups by the values of aggregate functions. This clause is needed to check values that an aggregate function computes not from individual rows of the record source defined in the FROM clause, but from groups of such rows. Therefore, such a check cannot be placed in the WHERE clause.

The HAVING clause is used in combination with the GROUP BY clause. It can be used in a SELECT statement to filter the records returned by the GROUP BY clause.

HAVING clause syntax

aggregate_function may be a function like SUM, COUNT, MIN, or MAX.

Example of using the SUM function
For example, you can use the SUM function to look up the department name and the sales amount (for the relevant departments). The HAVING clause keeps only those departments whose total sales are more than $1000.

SELECT department, SUM(sales) AS "Total sales" FROM order_details GROUP BY department HAVING SUM(sales) > 1000;
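A minimal way to run this query is sketched below with Python's built-in sqlite3 module. The two-column layout of `order_details` and the sample sales figures are assumptions. The result is sorted in Python because the query itself does not specify an order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_details (department TEXT, sales REAL)")

# Hypothetical sales: Electronics totals 1300, Toys 700, Books 1200.
conn.executemany("INSERT INTO order_details VALUES (?, ?)", [
    ("Electronics", 700.0), ("Electronics", 600.0),
    ("Toys", 400.0), ("Toys", 300.0),
    ("Books", 1200.0),
])

totals = sorted(conn.execute(
    'SELECT department, SUM(sales) AS "Total sales" FROM order_details '
    "GROUP BY department HAVING SUM(sales) > 1000"
).fetchall())
print(totals)  # [('Books', 1200.0), ('Electronics', 1300.0)]
```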

Example of using the COUNT function
For example, you can use the COUNT function to retrieve the name of the department and the number of employees (in the relevant department) who earn more than $25,000 per year. The HAVING clause selects only those departments where there are more than 10 such employees.
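The text does not show the query for this case, so the following is only a sketch of how it might look, run with Python's built-in sqlite3 module. The `employees` table layout, the column alias, and the generated staff list are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER)")

# Hypothetical staff: (department, head count, salary for everyone in that batch).
batches = [("Sales", 12, 30000), ("Sales", 3, 20000),
           ("Support", 8, 28000), ("Support", 7, 22000)]
staff = [(f"{dept}-{salary}-{i}", dept, salary)
         for dept, count, salary in batches for i in range(count)]
conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", staff)

result = conn.execute(
    """
    SELECT department, COUNT(*) AS "Number of employees"
    FROM employees
    WHERE salary > 25000          -- row filter: only employees above $25,000
    GROUP BY department
    HAVING COUNT(*) > 10          -- group filter: more than 10 such employees
    """
).fetchall()
print(result)  # [('Sales', 12)]
```

Support has 15 employees in total, but only 8 of them pass the WHERE filter, so the group fails the HAVING condition.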

Example of using the MIN function
For example, you can use the MIN function to return the department name and the minimum salary for that department. The HAVING clause returns only those departments whose minimum salary is exactly $35,000.

SELECT department, MIN(salary) AS "Lowest salary" FROM employees GROUP BY department HAVING MIN(salary) = 35000;

Example of using the MAX function
For example, you can also use the MAX function to retrieve the department name and the department's maximum salary. The HAVING clause returns only those departments whose maximum salary is less than $50,000.

SELECT department, MAX(salary) AS "Highest salary" FROM employees GROUP BY department HAVING MAX(salary) < 50000;
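Both the MIN and the MAX query can be run against the same sample table. The sketch below uses Python's built-in sqlite3 module; the two-column `employees` table and the salaries in it are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (department TEXT, salary INTEGER)")

# Hypothetical salaries, two employees per department.
conn.executemany("INSERT INTO employees VALUES (?, ?)", [
    ("Sales", 35000), ("Sales", 48000),
    ("IT", 40000), ("IT", 90000),
    ("HR", 30000), ("HR", 45000),
])

lowest = conn.execute(
    'SELECT department, MIN(salary) AS "Lowest salary" FROM employees '
    "GROUP BY department HAVING MIN(salary) = 35000"
).fetchall()
# Sorted in Python because the query does not specify an order.
highest = sorted(conn.execute(
    'SELECT department, MAX(salary) AS "Highest salary" FROM employees '
    "GROUP BY department HAVING MAX(salary) < 50000"
).fetchall())
print(lowest)   # [('Sales', 35000)]
print(highest)  # [('HR', 45000), ('Sales', 48000)]
```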

Last update: 07/19/2017

T-SQL uses the GROUP BY and HAVING statements to group data, using the following formal syntax:

SELECT columns FROM table

GROUP BY

The GROUP BY clause determines how the rows will be grouped.

For example, let's group products by manufacturer:

SELECT Manufacturer, COUNT(*) AS ModelsCount FROM Products GROUP BY Manufacturer

The first column in the SELECT statement, Manufacturer, represents the name of the group, and the second column, ModelsCount, represents the result of the COUNT function, which calculates the number of rows in the group.

Keep in mind that any column used in the SELECT statement (not counting columns that hold the result of aggregate functions) must be listed in the GROUP BY clause. So, for example, in the case above, the Manufacturer column is specified in both the SELECT and GROUP BY clauses.

And if the SELECT statement selects one or more columns and also uses aggregate functions, then you must use the GROUP BY clause. Thus, the following example will not work because it does not contain a grouping expression:

SELECT Manufacturer, COUNT(*) AS ModelsCount FROM Products

As another example, let's add a grouping by the number of products:

SELECT Manufacturer, ProductCount, COUNT(*) AS ModelsCount FROM Products GROUP BY Manufacturer, ProductCount

The GROUP BY clause can group on multiple columns.

If the column you are grouping on contains a NULL value, the rows with the NULL value will form a separate group.
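The NULL behaviour can be seen in a small sketch using Python's built-in sqlite3 module. The reduced `Products` table, the product names, and the manufacturer names are assumptions. The result is turned into a dictionary so that the NULL group can be looked up under the key None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, Manufacturer TEXT)")

# Hypothetical rows; two products have no manufacturer (NULL).
conn.executemany("INSERT INTO Products VALUES (?, ?)", [
    ("A1", "Alpha"), ("B1", "Beta"), ("X1", None), ("X2", None),
])

groups = dict(conn.execute(
    "SELECT Manufacturer, COUNT(*) FROM Products GROUP BY Manufacturer"
).fetchall())
print(groups[None])  # 2
```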

Note that the GROUP BY clause must come after the WHERE clause but before the ORDER BY clause:

SELECT Manufacturer, COUNT(*) AS ModelsCount FROM Products WHERE Price > 30000 GROUP BY Manufacturer ORDER BY ModelsCount DESC

Group filtering. HAVING

The HAVING clause determines which groups will be included in the output result, that is, it filters groups.

The use of HAVING is in many ways similar to the use of WHERE. The difference is that WHERE filters rows, while HAVING filters groups.

For example, let’s find all product groups by manufacturer for which more than 1 model is defined:

SELECT Manufacturer, COUNT(*) AS ModelsCount FROM Products GROUP BY Manufacturer HAVING COUNT(*) > 1

A single statement can use both WHERE and HAVING expressions:

SELECT Manufacturer, COUNT(*) AS ModelsCount FROM Products WHERE Price * ProductCount > 80000 GROUP BY Manufacturer HAVING COUNT(*) > 1

That is, in this case, the rows are first filtered: those products are selected whose total cost is more than 80,000. Then the selected products are grouped by manufacturer. And then the groups themselves are filtered - those groups that contain more than 1 model are selected.
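The order of these steps can be observed by running the query with and without its HAVING part. The sketch uses Python's built-in sqlite3 module. The `ProductName` column, the manufacturer names, and all sample rows are assumptions. Results are sorted in Python because the query does not specify an order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products "
    "(ProductName TEXT, Manufacturer TEXT, ProductCount INTEGER, Price REAL)"
)
# Hypothetical stock: name, manufacturer, units, price per unit.
conn.executemany("INSERT INTO Products VALUES (?, ?, ?, ?)", [
    ("A1", "Alpha", 2, 60000), ("A2", "Alpha", 3, 41000), ("A3", "Alpha", 1, 36000),
    ("B1", "Beta", 2, 46000), ("B2", "Beta", 1, 40000),
    ("G1", "Gamma", 5, 36000), ("G2", "Gamma", 2, 50000),
])

base = ("SELECT Manufacturer, COUNT(*) AS ModelsCount FROM Products "
        "WHERE Price * ProductCount > 80000 GROUP BY Manufacturer")

# Steps 1 and 2: rows filtered by WHERE, then grouped by manufacturer.
after_where = sorted(conn.execute(base).fetchall())
# Step 3: groups filtered by HAVING.
after_having = sorted(conn.execute(base + " HAVING COUNT(*) > 1").fetchall())
print(after_where)   # [('Alpha', 2), ('Beta', 1), ('Gamma', 2)]
print(after_having)  # [('Alpha', 2), ('Gamma', 2)]
```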

If it is necessary to sort, then the ORDER BY expression comes after the HAVING expression:

SELECT Manufacturer, COUNT(*) AS Models, SUM(ProductCount) AS Units FROM Products WHERE Price * ProductCount > 80000 GROUP BY Manufacturer HAVING SUM(ProductCount) > 2 ORDER BY Units DESC

In this case, the grouping is by manufacturer, and the number of models for each manufacturer (Models) and the total number of all products for all these models (Units) are also selected. At the end, the groups are sorted by number of products in descending order.
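The full chain of WHERE, GROUP BY, HAVING and ORDER BY can be tried with the sketch below, which uses Python's built-in sqlite3 module. As before, the `ProductName` column, the manufacturer names, and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products "
    "(ProductName TEXT, Manufacturer TEXT, ProductCount INTEGER, Price REAL)"
)
# Hypothetical stock: name, manufacturer, units, price per unit.
conn.executemany("INSERT INTO Products VALUES (?, ?, ?, ?)", [
    ("A1", "Alpha", 2, 60000), ("A2", "Alpha", 3, 41000), ("A3", "Alpha", 1, 36000),
    ("B1", "Beta", 2, 46000), ("B2", "Beta", 1, 40000),
    ("G1", "Gamma", 5, 36000), ("G2", "Gamma", 2, 50000),
])

report = conn.execute(
    """
    SELECT Manufacturer, COUNT(*) AS Models, SUM(ProductCount) AS Units
    FROM Products
    WHERE Price * ProductCount > 80000   -- keep rows worth more than 80,000
    GROUP BY Manufacturer                -- one group per manufacturer
    HAVING SUM(ProductCount) > 2         -- keep groups with more than 2 units
    ORDER BY Units DESC                  -- largest stock first
    """
).fetchall()
print(report)  # [('Gamma', 2, 7), ('Alpha', 2, 5)]
```

Beta is dropped because only one of its rows passes the WHERE filter, leaving a group with 2 units.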

In the previous article we looked at the GROUP BY construct. There I wrote that this construct lets you select separate groups and calculate the functions specified after SELECT for each group. HAVING, in turn, lets you filter out unneeded groups based on the results of those functions. Let's look at this in more detail.

Let's recall our previous problem, where we calculated the average price of milk for each supermarket chain. This time, let's not just look at the average price but list only those supermarket chains where the average price is below 38.

To filter on the result of an aggregate function like this, SQL provides the HAVING command:

SELECT `shop_id`, AVG(`price`) FROM `table` GROUP BY `shop_id` HAVING AVG(`price`) < 38

As a result, instead of 4 rows we get only 3:

shop_id AVG(`price`)
1 37.5
2 36.0
3 37.0
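The milk example can be reproduced with a short sketch using Python's built-in sqlite3 module. The table name `milk_prices` is a stand-in for the `table` used above, and the individual prices are assumptions chosen so that the averages match the result shown. ORDER BY is added for a predictable row order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE milk_prices (shop_id INTEGER, price REAL)")

# Hypothetical prices: shops 1-3 average below 38, shop 4 averages 40.
conn.executemany("INSERT INTO milk_prices VALUES (?, ?)", [
    (1, 37.0), (1, 38.0), (2, 35.0), (2, 37.0), (3, 37.0), (4, 39.0), (4, 41.0),
])

cheap = conn.execute(
    "SELECT shop_id, AVG(price) FROM milk_prices "
    "GROUP BY shop_id HAVING AVG(price) < 38 ORDER BY shop_id"
).fetchall()
print(cheap)  # [(1, 37.5), (2, 36.0), (3, 37.0)]
```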

If there is no GROUP BY construct, HAVING applies not to a specific group but to the entire result set. This means that if the HAVING condition is satisfied, it has no effect, and if it is not satisfied, the result contains no rows at all.