Autonumber field: Avegage from 4 columns

I have 4 columns and i need average from this columns.
I can sum values from this columns and divide by 4, but this is inproper in case of nulls.

Is there any easy way to do that?

Right now I'am using:

((SELECT CASE WHEN Column1 IS NOT null
OR Column2 IS NOT NULL
OR Column3 IS NOT NULL
OR Column4 IS NOT NULL THEN
(SELECT
(CASE WHEN Column1 IS NOT null THEN Column1 ELSE 0 END +
CASE WHEN Column2 IS NOT null THEN Column2 ELSE 0 END +
CASE WHEN Column3 IS NOT null THEN Column3 ELSE 0 END +
CASE WHEN Column4 IS NOT null THEN Column4 ELSE 0 END)
/ (0.0 +
CASE WHEN Column1 IS NOT null THEN 1 ELSE 0 END +
CASE WHEN Column2 IS NOT null THEN 1 ELSE 0 END +
CASE WHEN Column3 IS NOT null THEN 1 ELSE 0 END +
CASE WHEN Column4 IS NOT null THEN 1 ELSE 0 END))
ELSE
null
END

Can you post a small sample with some numbers demonstrating your problem, along with the desired output and also an explanation with rules of exactly how you want your averages calculated?

/Kenneth

|||

This is probably the best way to do it. It is straightforward and easy to debug, if a pain to write.

It is never going to be easy when working with a vector of values like this in SQL. SQL is optimized solely for working with sets of data, so if you had another table with this value in it and a key to group on, you could have a million values and it would work. The first normal form deals with this situation of having a variable number of values in a row for this very reason.

This kind of thing isn't always wrong of course, I have a place where I have to compare date values from a join to see when any values in the row were last updated. It is a big red flag though.

|||

You can use some other syntax to shorten the notation, but you will have to do essentially the same logic. If you have the choice to reorganize you tables, you might want to put the four columns in a separate table. That would allow you to use the Aggregate functions, which already have the NULL handling.

Here is an example that shows how to use other functions to shorten the notation. In order to prevent a divide by zero in the case where all 4 columns are null. I chose to return 0, but you could also return null.

drop table #colAvg

create table #colAvg(

pkid int not null Identity(1,1),

colA float null,

colB float null,

colC float null,

colD float null

)

insert #colAvg values( 1.0, null, 3.0, 4.0 )

insert #colAvg values( null, null, 3.0, 4.0 )

insert #colAvg values( 1.0, null, null, 4.0 )

insert #colAvg values( null, null, null, 4.0 )

insert #colAvg values( null, null, null, null )

Selectpkid,

numerator = (IsNull( colA, 0 ) + IsNull( colB, 0 ) + IsNull( colC, 0 ) + IsNull( colD, 0 )),

denominator = IsNull( NullIf( IsNull( Sign( colA ), 0 ) + IsNull( Sign( colB), 0 ) + IsNull( Sign( colC ), 0 ) + IsNull( Sign( colD ), 0 ), 0 ), 1 ),

average = (IsNull( colA, 0 ) + IsNull( colB, 0 ) + IsNull( colC, 0 ) + IsNull( colD, 0 )) /

IsNull( NullIf( IsNull( Sign( colA ), 0 ) + IsNull( Sign( colB), 0 ) + IsNull( Sign( colC ), 0 ) + IsNull( Sign( colD ), 0 ), 0 ), 1 )

From#colAvg

pkid numerator denominator average

-- -- -

1 8.0 3.0 2.6666666666666665

2 7.0 2.0 3.5

3 5.0 2.0 2.5

4 4.0 1.0 4.0

5 0.0 1.0 0.0|||table (Column1, Column2, Column3, Column4).
Table always have only 1 row.

Example1:
values (1,2,3,4) - average 2.5

Example2:
values(1,2,3,null) - average 2

Example3:
values(1,null,null,9) - average 5|||

Wow. Totally suggest you create 1-4 rows instead of 1-4 scalar values like this. I don't know what the statement used to load the data, but I will be changing it to insert into a table would be far easier in the long run (especially when the annoying user realizes that actually they need 10+ values)

For one row, you could do this (for > 1 row you would need a bit more in the where clause or a group on the outside)

select sum(columnValue)
from (select column1 as columnValue
from table
where column1 is not null
UNION ALL
select column2
from table
where column2 is not null
UNION ALL
select column3
from table
where column3 is not null
UNION ALL
select column4
from table
where column4 is not null) as values

|||

Another way to write the query is to do below. The cross join might be faster depending on the data.

select avg(case c.n

when 1 then t.column1

when 2 then t.column2

when 3 then t.column3

when 4 then t.column4

end)

from table as t

cross join (select 1 union all select 2 union all select 3 union all select 4) as c(n)

|||

Nice solution to a unnice problem :) I am always impressed by the many ways one can write SQL to get past improper design. It just goes to show you that a few minutes upfront can often save you hours later. If the data was in rows:

select sum(value)
from table

Oh well, certainly not as many creative solutions required :)

|||

select (isnull(column1,0)+isnull(column1,0)+isnull(column1,0)+isnull(column1,0))/4

FROM TABLEX

|||

I use the same temp table from anomolous :

For a unpivot solution for Sql Server 2005:

drop table #colAvg
create table #colAvg(
pkid int not null Identity(1,1),
colA float null,
colB float null,
colC float null,
colD float null
)
insert #colAvg values( 1.0, null, 3.0, 4.0 )
insert #colAvg values( null, null, 3.0, 4.0 )
insert #colAvg values( 1.0, null, null, 4.0 )
insert #colAvg values( null, null, null, 4.0 )
insert #colAvg values( null, null, null, null )
SELECT pkid, AVG(newValues) as avgFrom4Columns FROM
(SELECT pkid, colA, colB, colC, colD
FROM #colAvg) p
UNPIVOT
(newValues FOR cols4 IN (colA, colB, colC, colD)
) as unpvt
GROUP BY pkid

By the way, the Isnull(colnameX,0) solution does not provide the right answer .

|||Good use of the new unpivot operator. Btw, it does the same as the CROSS JOIN technique although easier to express in terms of syntax.|||Data are in columns not in rows.|||I can't divide by 4. I must count not null columns. I must divide by 4 or 3 or 2 or 1 (depends on values).|||

This is a second way I want to do this, but instead of "sum" you must use "avg" and you can remove "where" statement (avg function automaticaly removes null values).

Here is my another question. Witch way will be faster? Yours with "union" or my with "case"?|||Case approach is faster since the query using CASE expression does less I/O than UNION ALL approach or CROSS JOIN or UNPIVOT. You can perform the calculations in one pass on each row.

Monday, February 13, 2012

Avegage from 4 columns

No comments:

Post a Comment

Autonumber field

Blog Archive

About Me