Since yesterday’s post I’ve spent a lot more time beating on my video averaging script. Actually, I rewrote the thing from scratch, and it’s looking pretty good. For my second proof-of-concept (above) I used all 28 Ask a Ninja videos available on Revver. Thanks to the fine folks in #ffmpeg on Freenode, I managed to really speed up the video capture portion of the script. This time I went for 800×600 and captured a frame every 20 seconds. Capturing 272 frames from almost four hours of video took six and a half minutes and averaging them took just over four minutes. Once again, I tweaked the contrast of the final image a bit.
The real bottleneck in the process is memory. I’m using RMagick, and although its built-in ImageList#average method makes my job easy, it’s far from ideal. To average a bunch of images you have to load them all into an ImageList, and RMagick stores each and every one in memory as uncompressed RGB data. There’s no way around this, and there’s no easy alternative.
I’m considering writing my own averaging code. Writing code that would take little memory would be a fairly simple task (to calculate an average you don’t need to hold every value, just a running sum and a count), but I’m worried I’ll just be trading memory for speed. If I have 200 800×600 images that’s almost 100 million pixels to iterate through, one at a time, which could be hell in an interpreted language like Ruby. However, I think I’ll do some benchmarks and see if maybe Ruby won’t surprise me once again.
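A minimal sketch of the running-sum idea, in plain Ruby. It assumes each frame has already been decoded into a flat array of 0..255 channel values; the decoding step and the name average_frames are hypothetical and not part of RMagick. Only the totals array and the current frame need to be in memory at once.

```ruby
# Running-sum averaging sketch.
# Assumption: a frame is a flat Array of Integer channel values (0..255).
# `average_frames` is an illustrative name, not an RMagick method.
def average_frames(frames, size)
  sums = Array.new(size, 0) # one running total per channel value
  count = 0

  frames.each do |frame|
    raise ArgumentError, "frame size mismatch" unless frame.length == size

    i = 0
    while i < size
      sums[i] += frame[i]
      i += 1
    end
    count += 1
  end

  raise ArgumentError, "no frames given" if count.zero?

  # Divide each total by the number of frames to get the average.
  sums.map { |total| (total.to_f / count).round }
end

# Tiny example: three "frames" of four values each.
frames = [
  [0, 10, 20, 30],
  [10, 20, 30, 40],
  [20, 30, 40, 50]
]
result = average_frames(frames, 4)
```

Because frames only needs to respond to each, it could just as well be an Enumerator that loads one image at a time, so nothing forces the whole set into memory.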

©1999-2006 Jordan B. Running
I don’t know how Ruby handles arrays, but an idea to try (if you can) is to have an 800×600 array for totals and then one for the image (maybe loop so that you load one frame into the image array). Add that image array to the total, then load another image, add it, etc.
It might be quicker to do an entire array at a time than to loop through each pixel. I know it is with perl and IDL, but I know nothing of Ruby.
No matter what, I’m still doing 100 million addition operations, which takes a lot of CPU cycles. I’ve started working on my own averaging code, and my benchmarks so far show that averaging 100 million pixels would take about half an hour on my machine. However, that was with randomly generated pixels; actually loading each image into an array of pixels will probably multiply that time by a nontrivial amount. But it takes up very little memory, so I’ll probably keep working on it.
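For anyone who wants to try a similar measurement, here is a rough sketch using Ruby’s standard Benchmark module. The frame size and count are deliberately tiny and are assumptions chosen so it finishes quickly; the timing includes generating the random pixels, and the extrapolation to 200 frames of 800×600 is only a linear guess.

```ruby
require "benchmark"

# Assumed small test dimensions so the benchmark finishes quickly.
width = 80
height = 60
frame_count = 20
pixels_per_frame = width * height

sums = Array.new(pixels_per_frame, 0)

# Time the generation of random frames plus the element-by-element summing.
elapsed = Benchmark.realtime do
  frame_count.times do
    frame = Array.new(pixels_per_frame) { rand(256) }
    frame.each_with_index { |value, i| sums[i] += value }
  end
end

# Linear extrapolation to 200 frames of 800x600 (a rough guess only).
measured_pixels = pixels_per_frame * frame_count
full_job_pixels = 200 * 800 * 600
estimate = elapsed * full_job_pixels / measured_pixels

puts format("%d pixels in %.4f s, estimated %.1f s for the full job",
            measured_pixels, elapsed, estimate)
```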
Oh, ok.
For some reason, when I had to do massive operations involving arrays of satellite data, it seemed to run faster when I added an entire array to an entire array (something like sum_array=sum_array+array1) instead of adding each element, one at a time:
for i=0,500 do begin
  for j=0,500 do begin
    sum_array(i,j)=sum_array(i,j)+array1(i,j)
  endfor
endfor
Of course, that might just be the for loops in there. I’m not entirely sure. Either way, the first method was a drastic improvement in IDL. It seems to work similarly in PERL (although the improvement didn’t seem anywhere near as drastic).
Of course, you’re using Ruby and not IDL, so it might not matter since you’re using a language that actually makes sense.
Yeah, Ruby unfortunately doesn’t have a way to add two arrays in that manner, so you have to do it element by element, which takes its toll.
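To illustrate with a small sketch: Array#+ in Ruby concatenates rather than adds, so the closest plain-Ruby equivalent of sum_array=sum_array+array1 is something like zip plus map. The helper name add_arrays is made up for this example, and the work is still done one element at a time in the interpreter.

```ruby
# Element-wise addition of two equal-length arrays.
# `add_arrays` is a hypothetical helper, not a built-in Ruby method.
def add_arrays(a, b)
  raise ArgumentError, "length mismatch" unless a.length == b.length

  # zip pairs up matching elements; map adds each pair.
  a.zip(b).map { |x, y| x + y }
end

sum_array = [1, 2, 3]
array1 = [10, 20, 30]

concatenated = sum_array + array1        # Array#+ joins the arrays end to end
sum_array = add_arrays(sum_array, array1) # element-wise totals
```

Note that this allocates new arrays on every call, so an in-place loop is likely to be kinder to memory for large frames.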
Looking good! I like this one better than the average girl. As before, I’m anxious to see the code when you feel comfortable with sharing it! :D And, 10 minutes is not bad at all! But, yeah, killing memory sucks. What about doing a bit of both methods: load up, say, 1/10th of the frames and average those, then repeat that step, continually adding to the sum? Just a thought.
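The batch idea might look roughly like the sketch below. It assumes frames are flat arrays of channel values, and average_of is a hypothetical stand-in for whatever averages one batch (for example ImageList#average on a tenth of the frames). Each batch average is weighted by its size so a short final batch doesn’t skew the result.

```ruby
# Stand-in for averaging one batch of frames (hypothetical helper).
def average_of(frames)
  sums = Array.new(frames.first.length, 0)
  frames.each { |frame| frame.each_with_index { |v, i| sums[i] += v } }
  sums.map { |total| total.to_f / frames.length }
end

# Average in batches so only `batch_size` frames are held at once.
def batched_average(frames, batch_size)
  weighted = nil
  total = 0

  frames.each_slice(batch_size) do |batch|
    avg = average_of(batch)
    weighted ||= Array.new(avg.length, 0.0)
    # Weight each batch average by how many frames it covers.
    avg.each_with_index { |v, i| weighted[i] += v * batch.length }
    total += batch.length
  end

  raise ArgumentError, "no frames given" if total.zero?

  weighted.map { |w| (w / total).round }
end

# Five tiny "frames", processed two at a time (last batch has one frame).
frames = [[0, 100], [10, 110], [20, 120], [30, 130], [40, 140]]
result = batched_average(frames, 2)
```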
Cheers!
M.T.