Wednesday, April 28, 2010

Experiments with Avisynth and VirtualDub pt.1

I was given the task of creating a number of videos to use as stimuli for a psychological experiment, so I decided to use AviSynth to script the basic parameters, then load the resulting script files into VirtualDub to turn them into real-life, playable videos.

This blog entry details the difficulties encountered and what resulted.

First off, a warning: these programs are not for the faint-of-heart.

All 32-bit MS Windows (95/98/NT/2000/XP

Avisynth is a script-driven video frame-server with its own particular syntax, so you're effectively writing a program; if you don't like writing code then leave well alone!

'Frame serving' is a way of getting around the built-in restrictions occurring in many video editing programs and means you can use Avisynth (along with its comprehensive built-in filters) to wrap media (audio, video, pictures) into a file format that can be recognised by other programs.

There's no graphic front-end, so it's just you and a text editor:
  • write the code
  • test it
  • find out why it didn't work
  • test it again.
It has loads of parameters, but also has a good online reference with code examples and instructions.

While researching how to create my videos I discovered that a number of people have created front-ends for generating scripts with Avisynth, but after having a go with one of them (AVSGenie) I found you really had to know Avisynth beforehand so it seemed simpler to just wade in and have a go.
Luckily there is an active user-base so for most queries there's normally some useful help already on-line.


Windows 32-bit and 64-bit platforms (98/ME/NT4/2000/XP/Vista/7)

VirtualDub is an open-source video editor, but not in the Final Cut Pro, Vegas, Premiere kind of way.

It will not open MPEG files, so no editing of your ripped DVDs (but there are ways around this; see VirtualDubMod and VirtualDub-MPEG, both now quite old), nor will it open many other file formats (that's why we have to use AviSynth). But it's usable and robust, has a number of very decent filters and applications built in, has an enthusiastic user-base, and it's free.

As a long-term user of Sony Vegas, I have to say VirtualDub lacks the immediacy of a commercial video editor, but makes up for that with the number of features it includes. As it can also frame-serve, like Avisynth, I have occasionally used it to frame-serve video to Vegas. It makes the editing a bit laggy, but works well!


The Stimuli

The experiment was to find out people's perceptions of music and movement. The stimuli comprised five ascending notes (in fourths) and five images of a person rising from crouched to fully outstretched. Each note (1 to 5, with 1 being the lowest) would play while one of the images was shown.

In the first set of videos the notes directly corresponded to a body position, so assuming 1 is the lowest note, then it would correspond to the crouched body position, going up to note 5 playing with the fully outstretched body position.
Got it?

But, the order the notes play is not just straight up and down, oh no, but runs through a regular sequence, such as:

Notes Images
13452 13452
14532 14532
15324 15324
14352 14352

then ..
25431 25431
24531 24531
23415 23415
24351 24351

and so on.

Then there are the variations.

Instead of each note corresponding to a body position, the note would correspond to its inverse within the list, so:
note 1: body position 5; note 2: body position 4; 3:3; 4:2; 5:1; etc.
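
For anyone who prefers code to lists, the two mappings so far can be sketched in a few lines of Python. This is only an illustration, not how the videos were actually made; the function names and the "direct"/"inverse" labels are invented for the example.

```python
def image_for_note(note, variation, count=5):
    """Return the body position (1..count) shown while `note` plays."""
    if variation == "direct":
        # Note 1 (lowest) goes with the crouched position 1, and so on.
        return note
    if variation == "inverse":
        # Note 1 goes with position 5, note 2 with position 4, etc.
        return count + 1 - note
    raise ValueError("unknown variation: " + variation)


def images_for(notes, variation):
    """Map a whole sequence of notes to its sequence of body positions."""
    return [image_for_note(n, variation) for n in notes]
```

So the note sequence 1 3 4 5 2 keeps the image sequence 1 3 4 5 2 in the first set of videos, and becomes 5 3 2 1 4 in the inverse set.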

With me so far?

And the third variation would be a random series of notes and body positions.

There were twenty videos for each variation, so sixty videos in all, and the timing had to be accurate for each one.
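
The random variation could be sketched like this in Python. It assumes that "random" means each of the five notes and each of the five body positions is used exactly once per video, in independently shuffled orders; that is a guess at the design rather than something stated above, and the function names and the seed are made up for the example.

```python
import random


def random_trial(rng, count=5):
    """Return one (notes, images) pair, each a shuffled ordering of 1..count."""
    values = range(1, count + 1)
    notes = rng.sample(values, count)
    images = rng.sample(values, count)
    return notes, images


def random_trials(how_many=20, seed=0):
    """Return a reproducible list of random trials for one variation."""
    rng = random.Random(seed)  # fixed seed so the same set can be regenerated
    return [random_trial(rng) for _ in range(how_many)]
```

Fixing the seed means the same twenty sequences can be regenerated later if a batch has to be redone.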

Now, I could have just put them together using Sony Vegas, editing together each picture and note for the sixty videos, but it would have taken a lot of time, would have needed a lot of checking, and would not have been very adaptable.

Using Avisynth was ... well, more elegant. And, in the end, was very usable. Plus it got me out of trouble: problems came up later that I could sort out with nothing more than a 'search and replace text' tool, and that would have been impossible to correct easily in a normal video editing program without re-editing all the videos.

As is the way with experiments, various things conspired to make me trash the first batch of videos, so in the end I was glad that I'd chosen this way of producing them right from the very start.

Onto Part 2: Writing the code for Avisynth

Experiments with Avisynth and VirtualDub pt.2


Writing the code for Avisynth


Once you've installed Avisynth, creating a text file with an .AVS extension will enable you to play a video in Windows Media Player (note: this uses a setting within Avisynth, and is particular to Windows Media Player. No other media player like Quicktime or VLC will be able to play the file).

The easiest way to find out if it's all working is to put these two lines into a text file, save it with an .AVS extension, then right-click on it and 'play' it in Windows Media Player. It should play a 10-second video with the text "Hello, world!".


BlankClip()
Subtitle("Hello, world!")


If it doesn't work then I'm not sure I can help you; try Googling for some help, as that's what I'd do.
If it does work, then we can continue.

I'd already prepared my five sound files (WAV, 16-bit/48 kHz) containing single notes and five image files (JPG) with the pictures to be used, so the only thing to do now was put them together. This took a bit of experimentation, but finally I ended up with an AVS file like this (explanation at the end):



# AVS – avisynth file for generating notes for Jordan sequence

# filename: T_Test_up_1.00s.avs T:12345 I:12345
# variables
# lenB is blank period before and after sequence of pictures, in frames
lenB=25
# lenP is length each picture is shown, in frames
lenP=25
#
# Blank intro
video = ImageSource("C:\Black.JPG", end=lenB, fps=25)
audio = WAVSource("C:\Blank-1s0.wav")
clip0 = AudioDub(video, audio)
# 1
video = ImageSource("C:\1.jpg", end=lenP, fps=25)
audio = WAVSource("C:\1.wav")
clip1 = AudioDub(video, audio)
#
# 2
video = ImageSource("C:\2.jpg", end=lenP, fps=25)
audio = WAVSource("C:\2.wav")
clip2 = AudioDub(video, audio)
#
# 3
video = ImageSource("C:\3.jpg", end=lenP, fps=25)
audio = WAVSource("C:\3.wav")
clip3 = AudioDub(video, audio)
#
# 4
video = ImageSource("C:\4.jpg", end=lenP, fps=25)
audio = WAVSource("C:\4.wav")
clip4 = AudioDub(video, audio)
#
# 5
video = ImageSource("C:\5.jpg", end=lenP, fps=25)
audio = WAVSource("C:\5.wav")
clip5 = AudioDub(video, audio)
#
# blank Outro
video = ImageSource("C:\Black.JPG", end=lenB, fps=25)
audio = WAVSource("C:\Blank-1s0.wav")
clip6 = AudioDub(video, audio)
#
#
clip0 ++clip1 ++clip2 ++clip3 ++clip4 ++clip5 ++clip6
#
# end

So, what's going on here:

  • Everything after a hash ('#') is a comment
  • I've given a bit of explanation at the beginning, plus the sequence of notes and images (in T: and I:)
  • Then there are two variables to set the length of time (lenB) of the blank bits at the beginning and end of the whole sequence, and then the length of time (lenP) each image shows along with the sound. I've chosen 25 here, as in 25 frames-per-second to keep it nice and simple. I did some tests that said a second looked about right; any shorter made the video too jerky and any longer made it drag.
  • Then we're on to the first bit of video, which is a blank frame, just using a JPG formatted to 720 pixels wide by 576 pixels high. Why those sizes, you ask?
Well, that's the standard frame size for a PAL DV codec, and having played around with some of the testing software (E-Prime and SuperLab), I found that one of them (SuperLab) had problems with showing out-of-spec videos. So, I made them in-specification.

  • Avisynth has problems showing a video without audio, so I created a blank audio file 1 second long.
  • And the instruction 'AudioDub' puts them together, and then I assigned it the name 'clip0'
  • The same instructions are repeated for all the other clips, and another blank section is added to the end. Then the final instruction:
clip0 ++clip1 ++clip2 ++clip3 ++clip4 ++clip5 ++clip6
is the one that concatenates all the clips together for the video.

  • And that's it! This .AVS file will happily play in Windows Media Player, so you can test it straight away.
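
As a quick sanity check on the timing, the arithmetic behind the script can be sketched in Python. It assumes every clip lasts exactly lenB or lenP frames, which is worth confirming against the frame count VirtualDub reports for the finished file; the function names are invented for this example.

```python
def total_frames(len_b=25, len_p=25, pictures=5):
    """Frames in one video: blank intro + one clip per picture + blank outro."""
    return 2 * len_b + pictures * len_p


def duration_seconds(frames, fps=25):
    """Convert a frame count to seconds at the given frame rate."""
    return frames / fps
```

With the values used above (lenB=25, lenP=25, 25 frames per second) that comes to 175 frames, or 7 seconds per video.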

Having created the template for the AVS file, I then used Excel to generate all the different combinations of which image (n.JPG) was to display with which sound (n.WAV), and simply copied and pasted all sixty results from Excel into text files. (I won't bore you with the details of the Excel file, other than to say that it was very simply done by manually stepping through a list of the sequences.)
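
The same templating could also be done with a short script instead of Excel. What follows is a hedged Python sketch, not the method used for these videos: it reuses the file names and layout from the AVS template above, while the function names `clip_lines` and `build_avs` are made up for the example.

```python
def clip_lines(name, image, sound, length_var, fps=25):
    """Return the three script lines that build one picture-plus-sound clip."""
    return [
        f'video = ImageSource("C:\\{image}", end={length_var}, fps={fps})',
        f'audio = WAVSource("C:\\{sound}")',
        f"{name} = AudioDub(video, audio)",
    ]


def build_avs(notes, images, len_b=25, len_p=25, fps=25):
    """Return the text of an .avs script that plays notes[i] over images[i]."""
    if len(notes) != len(images):
        raise ValueError("need exactly one image per note")

    note_str = "".join(str(n) for n in notes)
    image_str = "".join(str(i) for i in images)
    lines = [
        f"# T:{note_str} I:{image_str}",
        f"lenB={len_b}",
        f"lenP={len_p}",
    ]

    # Blank intro, as in the template (clip0).
    lines += clip_lines("clip0", "Black.JPG", "Blank-1s0.wav", "lenB", fps)

    # One clip per note/image pair (clip1, clip2, ...).
    for position, (note, image) in enumerate(zip(notes, images), start=1):
        lines += clip_lines(f"clip{position}", f"{image}.jpg",
                            f"{note}.wav", "lenP", fps)

    # Blank outro, then the line that joins everything together.
    last = len(notes) + 1
    lines += clip_lines(f"clip{last}", "Black.JPG", "Blank-1s0.wav", "lenB", fps)
    lines.append(" ++ ".join(f"clip{k}" for k in range(last + 1)))
    return "\n".join(lines) + "\n"
```

Calling `build_avs([1, 3, 4, 5, 2], [5, 3, 2, 1, 4])` returns the script text for one inverse-variation video, ready to be saved under an .AVS file name of your choosing.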

Now to make them into real videos, and for that we need VirtualDub.

Onto Part 3: Batch encoding AVS files into AVI files