Author Topic: numerically stable average  (Read 2563 times)

0 Members and 1 Guest are viewing this topic.

Offline codeguy

  • Forum Regular
  • Posts: 174
numerically stable average
« on: October 11, 2018, 07:23:33 pm »
In this post are two methods for calculating the average of a dataset. The standard method proves to be problematic for some cases, where the element values are large or the range is large. These two algorithms agree UNTIL n becomes large, then the truncation errors start creeping in for the standard method, using a running sum of all the elements in the range. The numerically stable method avoids this error quite well. In this particular instance, the numerically stable version lags the standard method by 50% or so, but the fact that overflow is completely avoided may be of benefit in some cases when dealing with large datasets or numbers that are large or perhaps even both.
Code: QB64: [Select]
  1. ntest& = 0
  2.     ntest& = ntest& + 1
  3.     REDIM a(0 TO ntest&) AS DOUBLE
  4.     FOR q& = LBOUND(a) TO UBOUND(a)
  5.         a(q&) = ntest& / 2 + .125
  6.     NEXT
  7.     s0! = TIMER(.001)
  8.     AverageArrayNS a(), LBOUND(a), ntest&, average#
  9.     f0! = TIMER(.001)
  10.     PRINT "numerically stable "; ntest&; average#; f0! - s0!
  11.     s1! = TIMER(.001)
  12.     AverageStandardX a(), LBOUND(a), ntest&, averagx#
  13.     f1! = TIMER(.001)
  14.     PRINT "standard not stable"; ntest&; averagx#; f1! - s1!
  15.     ntest& = ntest& * 2
  16.  
  17. '****************
  18. '* this method is subject to truncation/overflow errors -- fails at large N.
  19. '****************
  20. SUB AverageStandardX (a() AS DOUBLE, start AS LONG, finish AS LONG, a#)
  21. sum# = 0
  22. FOR c& = start TO finish
  23.     sum# = sum# + a(c&)
  24. a# = sum# / (finish - start + 1)
  25.  
  26. '****************
  27. '* a numerically stable way to calculate the average of elements in an array without the dreaded overflow.
  28. '****************
  29. SUB AverageArrayNS (a() AS DOUBLE, start AS LONG, finish AS LONG, average AS DOUBLE)
  30. DIM aa_int AS DOUBLE: aa_int = 0
  31. DIM aa_temp AS DOUBLE: aa_temp = 0
  32. DIM StatN AS LONG: StatN = finish - start + 1
  33.  
  34. FOR c& = start TO finish
  35.     aa_temp = a(c&) + aa_temp
  36.     DO
  37.         IF aa_temp < StatN THEN
  38.             EXIT DO
  39.         ELSE
  40.             aa_int = aa_int + 1
  41.             aa_temp = aa_temp - StatN
  42.         END IF
  43.     LOOP
  44. average = aa_int + aa_temp / StatN
  45.  
Just another perhaps boring but useful algo to add to your code arsenal if overflow is to be avoided at all costs and results MUST be accurate.
« Last Edit: October 11, 2018, 08:07:01 pm by codeguy »