Thursday, February 16, 2012

Averaging a set of columns in the same table

I have a table which has 100 columns with appx 20000 rows
Each column represents a set of data from a different source.
Anywhere from 1 to 100 of the columns may contain data (empty will be null)
I need to find a way to average accross the rows depending on the number of
rows which have data
Ie for first row
(Col1 + Col2 + Ccol3...Col34) / 34 where only col1 to col34 have data.
or
(Col1 + Col2 + Ccol3) / 3 where only col1 to col3 have data.
Note number of rows will always be the same for every column appx 20000
I know I could convert the horizontal data into vertical but this will mean
a table of over 2million rows. Other constarints force me to aviod this.
Any ideas ?Hi
Normalize the table and then you can do averages over the groups (as you now
have rows) using the group by clause.
Your table design is not conducive to easy programming.
Regards
--
Mike Epprecht, Microsoft SQL Server MVP
Zurich, Switzerland
IM: mike@.epprecht.net
MVP Program: http://www.microsoft.com/mvp
Blog: http://www.msmvps.com/epprecht/
"Jinx1966" <Jinx1966@.discussions.microsoft.com> wrote in message
news:D9C74420-C0A1-400B-8518-3CE1BE0B19D5@.microsoft.com...
>I have a table which has 100 columns with appx 20000 rows
> Each column represents a set of data from a different source.
> Anywhere from 1 to 100 of the columns may contain data (empty will be
> null)
> I need to find a way to average accross the rows depending on the number
> of
> rows which have data
> Ie for first row
> (Col1 + Col2 + Ccol3...Col34) / 34 where only col1 to col34 have data.
> or
> (Col1 + Col2 + Ccol3) / 3 where only col1 to col3 have data.
>
> Note number of rows will always be the same for every column appx 20000
> I know I could convert the horizontal data into vertical but this will
> mean
> a table of over 2million rows. Other constarints force me to aviod this.
> Any ideas ?|||> I know I could convert the horizontal data into vertical but this will mean
> a table of over 2million rows.
And... ?
> Other constarints force me to aviod this.
What constraints? This design is likely to be an awful nightmare and
performance bottleneck. The cost of supporting it is surely far greater
than that of fixing it.
but ...
(COALESCE(col1,0)+COALESCE(col2,0)+COALESCE(col3,0) ...)/
(CASE WHEN col1 IS NOT NULL THEN 1 END +
CASE WHEN col2 IS NOT NULL THEN 1 END +
CASE WHEN col3 IS NOT NULL THEN 1 END +
...)
(yuck!)
--
David Portas
SQL Server MVP
--|||Thanks Mike.
"Mike Epprecht (SQL MVP)" wrote:
> Hi
> Normalize the table and then you can do averages over the groups (as you now
> have rows) using the group by clause.
> Your table design is not conducive to easy programming.
> Regards
> --
> Mike Epprecht, Microsoft SQL Server MVP
> Zurich, Switzerland
> IM: mike@.epprecht.net
> MVP Program: http://www.microsoft.com/mvp
> Blog: http://www.msmvps.com/epprecht/
> "Jinx1966" <Jinx1966@.discussions.microsoft.com> wrote in message
> news:D9C74420-C0A1-400B-8518-3CE1BE0B19D5@.microsoft.com...
> >I have a table which has 100 columns with appx 20000 rows
> > Each column represents a set of data from a different source.
> > Anywhere from 1 to 100 of the columns may contain data (empty will be
> > null)
> >
> > I need to find a way to average accross the rows depending on the number
> > of
> > rows which have data
> >
> > Ie for first row
> > (Col1 + Col2 + Ccol3...Col34) / 34 where only col1 to col34 have data.
> > or
> > (Col1 + Col2 + Ccol3) / 3 where only col1 to col3 have data.
> >
> >
> > Note number of rows will always be the same for every column appx 20000
> >
> > I know I could convert the horizontal data into vertical but this will
> > mean
> > a table of over 2million rows. Other constarints force me to aviod this.
> >
> > Any ideas ?
>
>|||Thanks David. I quite agree with your comments. Only you know how it can be,
small machines with small resources run by people with small brains and ears.
Now I have your and Mikes opinions I can get this issue fixed the proper
way...
Appreciate your time.
Regards
"David Portas" wrote:
> > I know I could convert the horizontal data into vertical but this will mean
> > a table of over 2million rows.
> And... ?
> > Other constarints force me to aviod this.
> What constraints? This design is likely to be an awful nightmare and
> performance bottleneck. The cost of supporting it is surely far greater
> than that of fixing it.
> but ...
> (COALESCE(col1,0)+COALESCE(col2,0)+COALESCE(col3,0) ...)/
> (CASE WHEN col1 IS NOT NULL THEN 1 END +
> CASE WHEN col2 IS NOT NULL THEN 1 END +
> CASE WHEN col3 IS NOT NULL THEN 1 END +
> ...)
> (yuck!)
> --
> David Portas
> SQL Server MVP
> --
>

No comments:

Post a Comment