Making lyric videos

For some time now I had wanted to find a way to create lyric videos without much effort. I really didn’t want a custom video per song, with hand-made everything. Just an easy way to call some scripts and do some tweaks by hand, and get a lyric video.

My first attempts with ImageMagick and animated GIFs didn’t quite work out because the animated GIFs, at least when converted to a video (to add the audio track), didn’t keep the timing they were supposed to. I was about to give up, but Luka gave me a very good idea: make subtitles out of the lyrics, and then use ffmpeg to burn the subtitles into the video itself. So I got to work.

Note: This whole process is absolutely for nerds, as it involves the command line, Linux, and some hand-made scripts and tweaks in Aegisub, the subtitling program I used. If you’re looking for an easy way to do it with a graphical or web program, you’re out of luck (there might be ways, but this is certainly not one of them).

Preparing the subtitles

The first step is to create the subtitles for the lyrics. Astonishingly, I start with a simplified text file (in a made-up format), then convert it with a hand-made script into an .srt file… then I convert that to the final .ass format in Aegisub. So, yes, the lyrics go through three different formats in total.

The first, made-up lyric format looks like this:

00:52,500 Another gal asked him to please everyone
00:55,000 what an impossible burden to bear
00:59,000
01:09,000 Bunch of Satan suckers
01:11,500 Selling cola products
01:14,500 Are you Christians? Please forgive me
01:17,500 If you didn't like what I said

As you can probably see, it’s a very compact format that can be written easily in a text editor. The idea is that each line stays on screen until the next one starts, so I add a line with only a timestamp and no text (the third line) to make the previous lyric show for 4 seconds instead of 14. With a hand-made script I convert this format to an .srt file, by calling it like so:

./lyrics-to-srt.sh melvin.txt >melvin.srt
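
The script itself is not reproduced in this post. As a rough sketch of what such a conversion could look like (this is not the actual lyrics-to-srt.sh), here is a shell function built around awk. The function name lyrics_to_srt, the 4-second duration given to the very last lyric, and the rule that completely blank lines are ignored are all assumptions made for the example.

```shell
# Hypothetical stand-in for lyrics-to-srt.sh; the original script is not shown.
# Reads "MM:SS,mmm lyric text" lines and prints SRT cues on standard output.
lyrics_to_srt() {
    awk '
    # Convert "MM:SS,mmm" into milliseconds.
    function to_ms(ts,    p, q) {
        split(ts, p, ":")
        split(p[2], q, ",")
        return (p[1] * 60 + q[1]) * 1000 + q[2]
    }
    # Convert milliseconds into the SRT form "HH:MM:SS,mmm".
    function to_srt(ms,    h, m, s) {
        h = int(ms / 3600000); ms -= h * 3600000
        m = int(ms / 60000);   ms -= m * 60000
        s = int(ms / 1000);    ms -= s * 1000
        return sprintf("%02d:%02d:%02d,%03d", h, m, s, ms)
    }
    # Print the pending lyric (if any) as a cue that ends at end_ms.
    function emit(end_ms) {
        if (text != "") {
            printf "%d\n%s --> %s\n%s\n\n", ++n, to_srt(start), to_srt(end_ms), text
        }
        text = ""
    }
    /^[ \t]*$/ { next }    # assumption: completely blank lines are ignored
    {
        now = to_ms($1)
        emit(now)          # the previous lyric ends where this line starts
        start = now
        text = $0
        sub(/^[^ \t]+[ \t]*/, "", text)   # drop the timestamp, keep the lyric
    }
    # Assumption: the last lyric stays on screen for 4 seconds.
    END { emit(start + 4000) }
    ' "$@"
}
```

After loading the function into a shell, lyrics_to_srt melvin.txt >melvin.srt would play the same role as the command above. A line that has a timestamp but no text only closes the previous cue, which is how the 4-second trick described earlier falls out of the logic.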

Now you might be wondering: why don’t you convert them directly into the final format? The reason is that the .ass format is a bit more involved, and it contains formatting information, too, like the font used, font size, position of the subtitle lines, etc. It was easier for me to convert to .srt first, do the visual stuff in Aegisub, and save the file as .ass in Aegisub.

So how does that work, exactly? Aegisub has a “Styles Manager” (available from the menu “Subtitle”) in which you can define a subtitle style. In it you define the font and other things. You define that once, and that style will be available for you in any file you open with Aegisub. So once I have the subtitles in .srt (a format that Aegisub can open), I open the file there, select all subtitle lines with Ctrl-A, then choose the style I want in the UI:

Selecting all subtitle lines and choosing style

After clicking on the style, all lines get marked as having that style, as you can see here (notice the change in the “Style” column, to the left of the lyric text itself):

All subtitle lines now with the new style

Now I can choose “Save as…” and save the file as, e.g., melvin.ass.

Preparing the background video

Once I get the subtitles in the appropriate format, I can make the base video (without lyrics) in Kdenlive. The process is as follows:

  1. I load the song with “Add Clip” on the left-hand pane.
  2. I click “Add Title Clip” on the left-hand pane, and I put in whatever static text or images I want for the background, including the name of the song.
  3. I add both to “Audio 1” and “Video 1” respectively, and make sure that the title clip is as long as the audio track.
  4. I render it to an MP4 video file.

This will give me a video file that has the title of the song and whatever background I want, but no lyrics or subtitles.

Putting it all together

At this point I have two things: the final subtitles in .ass format, and the basic video without the lyrics. With these, I can use ffmpeg to produce a second video with the subtitles already on it, in whatever font, size, and position I chose in Aegisub. The magic command is this:

ffmpeg -i background-melvin.mp4 -vf ass=melvin.ass melvin.avi

The result will be a video with the subtitles rendered on it, like the Melvin lyric video you can see on YouTube.
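
For a batch of several songs, the command could be wrapped in a small shell function. The following is only a sketch: the name burn_subtitles and the FFMPEG variable (a hook for swapping in another program, for example echo for a dry run) are made up for this example, and the ffmpeg invocation itself is the same one shown above.

```shell
# Hypothetical wrapper around the ffmpeg command above; the function name
# and the FFMPEG override variable are assumptions, not part of the workflow.
burn_subtitles() {
    video=$1   # background video rendered in Kdenlive
    subs=$2    # styled subtitles saved from Aegisub (.ass)
    out=$3     # output file, e.g. melvin.avi

    # Fail early with a readable message if an input file is missing.
    for f in "$video" "$subs"; do
        if [ ! -f "$f" ]; then
            echo "burn_subtitles: no such file: $f" >&2
            return 1
        fi
    done

    # Assumes file names without characters that are special in ffmpeg
    # filter syntax (such as ":" or ","), which would need escaping.
    "${FFMPEG:-ffmpeg}" -i "$video" -vf "ass=$subs" "$out"
}
```

With this loaded, burn_subtitles background-melvin.mp4 melvin.ass melvin.avi runs the same command as before, and setting FFMPEG=echo first prints the arguments instead of encoding anything.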


Although it involves several steps and it’s pretty dirty, this method of creating the videos is good enough for me. I wasn’t going to create that many (only four in this batch) and I’m comfortable using these tools.

It should be possible to simplify the workflow by typing the subtitles directly in Aegisub, instead of typing them in a text file and then converting them. I might do exactly that the next time I have to make another batch of these, but this time I already had the lyrics in the first format (due to my previous attempts with ImageMagick), so I figured I’d convert them to be able to open them in Aegisub instead of typing them there again. I hope this post was useful!