python - Query long lists -
i query value of exponentially weighted moving average @ particular points. inefficient way follows. l
list of times of events , queries
has times @ want value of average.
a=0.01 l = [3,7,10,20,200] y = [0]*1000 item in l: y[int(item)]=1 s = [0]*1000 in xrange(1,1000): s[i] = a*y[i-1]+(1-a)*s[i-1] queries = [23,68,103] q in queries: print s[q]
outputs:
0.0355271185019 0.0226018371526 0.0158992102478
in practice l
large , range of values in l
huge. how can find values @ times in queries
more efficiently, , without computing potentially huge lists y
, s
explicitly. need in pure python can use pypy.
is possible solve problem in time proportional
len(l)
, notmax(l)
(assuminglen(queries) < len(l)
)?
here code doing this:
def ewma(l, queries, a=0.01): def decay(t0, x, t1, a): math import pow return pow((1-a), (t1-t0))*x assert l == sorted(l) assert queries == sorted(queries) samples = [] try: t0, x0 = (0.0, 0.0) = iter(queries) q = it.next()-1.0 t1 in l: # new value decayed previous value, plus x1 = decay(t0, x0, t1, a) + # take care of queries between t0 , t1 while q < t1: samples.append(decay(t0, x0, q, a)) q = it.next()-1.0 # take care of queries equal t1 while q == t1: samples.append(x1) q = it.next()-1.0 # update t0, x0 t0, x0 = t1, x1 # take care of remaining queries while true: samples.append(decay(t0, x0, q, a)) q = it.next()-1.0 except stopiteration: return samples
i've uploaded fuller version of code unit tests , comments pastebin: http://pastebin.com/shhaz710
edit: note same thing chris pak suggests in answer, must have posted typing this. haven't gone through details of code, think mine bit more general. code supports non-integer values in l
, queries
. works kind of iterables, not lists since don't indexing.
Comments
Post a Comment