using ffmpeg for cutting media files – and the gotchas involved

So here I was, trying to complete some ordinary transcoding / media file cutting tasks with ffmpeg. Turns out there are some weird gotchas when using some versions of ffmpeg (in my case, SVN-r10571) on the command line.

Problem 1: transcoding to mp3 insists on using 64kbits/s or 128kbits/s bitrate

So a simple task is to transcode an audio file of some arbitrary format to MP3. In my case, I only wanted a 25-second piece of the original file, starting at the 30-second mark, converted to MP3. The original was in 256kbits/s, and I decided to use the same bitrate for the output, just for kicks. According to the man page, -b is for video bitrate, and -ab is for audio bitrate. So I executed:

wrong:  ffmpeg -ss 00:00:30.00 -t 25 -i foo.mp3 -ab 256 foo-new.mp3

The output, of course, was this:

Input #0, mp3, from 'foo.mp3':
...
mdb:109, lastbuf:0 skipping granule 0
size= 393kB time=25.2 bitrate= 128.0kbits/s
video:0kB audio:393kB global headers:0kB muxing overhead 0.007950%

128 kbits/s. That was…not what I wanted.

Turns out, as the ffmpeg man page hints but does not specify clearly, a “k” is required to label the units of the new bitrate. As in:

right:  ffmpeg -ss 00:00:30.00 -t 25 -i foo.mp3 -ab 256k foo-new.mp3

This time, the output was:
Input #0, mp3, from 'foo.mp3':
...
mdb:109, lastbuf:0 skipping granule 0
size= 782kB time=25.0 bitrate= 256.0kbits/s
video:0kB audio:782kB global headers:0kB muxing overhead 0.003996%

Much better. It’s very strange that if I leave off the trailing “k” but give it a value higher than 64, it always bumps the bitrate up to 128kbits/s from the default of 64k (but not to the number I actually wanted).
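If you script this, one way to avoid the trap is to never hand -ab a bare number. Here is a minimal sketch, assuming a bare number always means kbits/s to you; normalize_bitrate is a made-up helper name, not anything that ships with ffmpeg:

```shell
# Hypothetical helper (not part of ffmpeg): make the bitrate unit explicit.
normalize_bitrate() {
  case "$1" in
    '')
      echo "normalize_bitrate: empty value" >&2
      return 1
      ;;
    *[!0-9]*)
      # Already contains a non-digit such as the "k" suffix: pass it through.
      printf '%s\n' "$1"
      ;;
    *)
      # Bare number: assume kbits/s was meant and append the "k".
      printf '%sk\n' "$1"
      ;;
  esac
}
```

You would then call it inline, as in: ffmpeg -ss 00:00:30.00 -t 25 -i foo.mp3 -ab "$(normalize_bitrate 256)" foo-new.mp3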

 

Problem 2: cutting media file without re-encoding

So ffmpeg can be used to cut a media file without re-encoding the media stream. You simply pass copy as the codec to -acodec (or -vcodec for video) at encode time. For my MP3, I thought it was a pretty trivial problem, so I issued:

wrong: ffmpeg -i bar.mp3 -ss 00:00:30.00 -t 25 -acodec copy bar-new.mp3

This seemed to process correctly, except it created a 555-byte file with no audio in it. Even weirder is what happened when I issued:

wrong: ffmpeg -i bar.mp3 -ss 00:00:10.00 -t 25 -acodec copy bar-new.mp3

Here, -ss 00:00:10.00 seeks to the 10-second mark, and -t 25 should record 25 seconds’ worth of audio. Strangely enough, the output file had 15 seconds of audio: 25 minus 10. Curiouser and curiouser. Now,

wrong: ffmpeg -i bar.mp3 -ss 00:00:10.00 -t 25 -acodec mp3 bar-new.mp3

does in fact create a 25-second MP3 file. Only, of course, a re-encoded one at 64kbits/s. Something that we are not looking for. But strange.
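For what it’s worth, both copy-codec results (nothing at all with -ss 30, and 15 seconds with -ss 10) fit one simple reading: with both options placed after -i, -t acts like an end timestamp rather than a duration. That is only my guess from the two observations, not something I checked against the ffmpeg source. The arithmetic, as a sketch with a made-up function name and whole seconds only:

```shell
# Assumption (unverified): with -ss and -t both after -i, output stops at
# timestamp t, so the resulting length is t minus ss, or nothing if ss >= t.
observed_length() {
  ss_sec=$1
  t_sec=$2
  if [ "$t_sec" -gt "$ss_sec" ]; then
    echo $((t_sec - ss_sec))
  else
    echo 0
  fi
}

observed_length 10 25   # prints 15, matching the 15-second file
observed_length 30 25   # prints 0, matching the empty file
```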

A post on ffmpeg-devel finally provided enough hints to clue me into the problem. Unlike in many command line apps, the order of arguments passed to ffmpeg seems to be silently significant. The correct incantation is:

right: ffmpeg -ss 00:00:30.00 -t 25 -i bar.mp3 -acodec copy bar-new.mp3

Note the switched order: -ss and -t now come first, and -i must follow the other two. Now the desired 25-second file, cut from the original, is correctly produced.
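To keep from fat-fingering the order again, the working incantation can be wrapped in a tiny function. This is just a sketch: cut_copy_cmd is a name I made up, and it only prints the command for inspection (it does no quoting, so file names with spaces would need more care before you actually run the result):

```shell
# Hypothetical helper: print a stream-copy cut command with -ss and -t
# placed before -i, which is the order that worked above.
# $1 = start position, $2 = duration in seconds, $3 = input, $4 = output
cut_copy_cmd() {
  printf 'ffmpeg -ss %s -t %s -i %s -acodec copy %s\n' "$1" "$2" "$3" "$4"
}

cut_copy_cmd 00:00:30.00 25 bar.mp3 bar-new.mp3
# prints: ffmpeg -ss 00:00:30.00 -t 25 -i bar.mp3 -acodec copy bar-new.mp3
```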

When an argument is missing its unit suffix or the arguments are in the wrong order, ffmpeg doesn’t sanity-check or warn you…it just silently proceeds and does some very strange things. Things that make you scratch your head and wonder, “wtf? did I mistype an argument somewhere?”
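The kind of check I have in mind is not complicated. Below is a rough sketch of a pre-flight lint you could run on an argument list before handing it to ffmpeg. check_seek_order is a hypothetical name, the check is deliberately naive (it only looks at the literal tokens -i, -ss and -t), and it never runs ffmpeg itself:

```shell
# Hypothetical pre-flight check: warn when -ss or -t appears after -i.
# Returns 0 if the order looks fine, 1 if at least one warning was printed.
check_seek_order() {
  seen_input=0
  rc=0
  for arg in "$@"; do
    case "$arg" in
      -i)
        seen_input=1
        ;;
      -ss|-t)
        if [ "$seen_input" -eq 1 ]; then
          echo "warning: $arg comes after -i; did you mean to put it before?" >&2
          rc=1
        fi
        ;;
    esac
  done
  return "$rc"
}
```

For example, check_seek_order -i bar.mp3 -ss 00:00:30.00 -t 25 -acodec copy bar-new.mp3 prints two warnings to standard error and returns 1, while the reordered argument list passes silently.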

Ah, and of course, you make me track down all these little idiosyncrasies, wading through blog posts and mailing lists and forums. Agh.

19 Replies to “using ffmpeg for cutting media files – and the gotchas involved”

  1. Thanks for taking the time to delve into this. Glad I found this after an hour of pounding my head against the keyboard.

  2. Read the manual! It clearly states that options, like the above one, apply to the first file following the options. So in “ffmpeg -i bar.mp3 -ss 00:00:10.00 -t 25 -acodec mp3 bar-new.mp3” the -ss -t -acodec options all apply to the file “bar-new.mp3”.

    In “ffmpeg -ss 00:00:30.00 -t 25 -i bar.mp3 -acodec copy bar-new.mp3” the -ss and -t options apply to bar.mp3 and the -acodec applies to “bar-new.mp3”.

    Look at this one (capture from webcam to DVD MPEG): “ffmpeg -f mjpeg -s 464x480 -vcodec mjpeg -r 14 -i /dev/video0 -target pal-dvd ~/stream.mpeg”

    1. Thanks for the pointer. I would disagree with your conclusion that a manual is the end-all solution to the issue. The problem is that this sort of user interface design is unintuitive (yes, even CLIs are user interfaces). When specifying arguments on a command line, the majority of programs treat it as a hash of options, where order is unimportant. It would not occur to most users, even seasoned command-line jockeys, that it is the order of arguments that caused strange behavior.

      This sort of problem should be detected, and a warning printed to standard error. In most cases, it makes no sense to accept -t or -ss options for an output file rather than an input file, making detection quite easy for the majority of cases. It is the responsibility of ffmpeg to do this, because here it is ffmpeg that is behaving out of the norm, not the user.

      Imagine if I designed a doorknob — which looks like an ordinary doorknob — but requires that you pull/push on the knob to open, rather than turning it. I would expect that many users would be confused when they first try any door with this knob installed. If my oddly designed doorknob is used in all major buildings, users may indeed write blog posts about how unintuitive it is. And assuming that I cannot redesign my doorknob at all, I would post a sign that says “Pull to open”, rather than putting a 10-page operating manual (complete with diagrams) next to every such door, with a guy next to it yelling “Read the F’ing Manual!” 🙂

  3. THANK YOU for publishing this little and so precious knowledge!

    I’ve lost hours trying to cut the useless parts of a .TS file, and I couldn’t understand why ffmpeg was re-encoding everything (in crappy quality). I could have managed to find the “copy” codecs, but putting -ss at the beginning was beyond my imagination.

    I agree with yiming: command line should be as much as possible order-neutral, and sanity checks MUST occur.

    1. I disagree with yiming and Krysztof von Murphy about the sanity checks. If a person doesn’t understand how to use an interface, don’t break the operation of scripts which are able to use the CLI intelligently, so that these 2 guys don’t have to read a manual. (Maybe yiming and Krysztof are just not man compatible!) For example, imagine that all commands only took 1 file argument like Windows’ right-click menus. Then a simple operation in Linux like file copying becomes two GUI operations: Cut and Paste. Why? Because you cannot specify both source and destination. Point: not every doorknob is a simple turn-to-open latch. Some “Doorknobs” require you to enter a holding area, then select which door you want to exit and wait until you reach the selected floor. Some people made a career and retired operating a doorknob that other people could not reliably operate. It’s cool to be an elevator operator until you get stuck between floors. It’s cool to transcode music and lectures and video with freedom until you have to learn a little more.

      The difference between Propaganda and Education is that Propaganda tells you what to think and Education (and Open Source) teaches you how to think.

      1. It’s this kind of engineering-centric thinking (as opposed to user-centric or human-centric), so prevalent within our industry, that allows HCI designers to be paid so well. This kind of thinking misses the point: a user should *not* be forced to do unintuitive things, simply because it makes engineering easy, or because the engineer does not like to pay attention to users.

        Being open source does not excuse bad user interface design. Actively confusing interfaces (interfaces that function contrary to user expectations) create errors where none needed to exist. If you check the About page here for my background, you’ll find my “cred” on software and technology issues to be significant (Berkeley PhD work, startup founder, engineer, etc.). Nevertheless, I made this error, because my expectations of how a CLI should behave, drawn from years of experience, were violated in idiosyncratic and largely needless ways. Ironically, had I *not* been a regular CLI user, I might not have formed this particular expectation. Nevertheless, the key target audience of a tool like ffmpeg is precisely an experienced CLI user.

        The man page is a red herring. The need to refer to or post signage is practically *already admitting* that you have made a design error, or at least made some incomprehensible design choices in the eyes of your typical user. An intuitively designed system does not require warnings, because the affordances of such a system naturally guide users into making correct choices or decisions. I won’t even get into the matter of requiring the reading of an entire document vs simple signage, which at least has the possibility of being effective at educating a user.

        I highly recommend Don Norman’s book “The Design of Everyday Things” (and its companion, Emotional Design: Why We Love Or Hate Everyday Things). This is a book that every would-be software engineer should read before they start taking on any sort of software design or architect role. Every HCI designer already starts with it as the introductory text, and I think every engineer should have to read this as well. It explains in an extremely lucid and well-reasoned way why certain designs succeed and others fail. It certainly changed my views (similar to your own, once) on how human-facing interfaces should be designed, and where the responsibility for user errors lies.

        And yes, I will reiterate: even CLIs are user interfaces.

      2. What you are complaining about is called the specialization of labor. You know, the innovation that brought us the industrial revolution, ownership of capital, and the employee/employer relationship.

        I personally want the guy who writes FFmpeg to focus on what he is good at and ignore graphical user interface issues. Somebody else can work on that in parallel with Fabrice Bellard and friends continuing to make a more functional / flexible CLI tool and handling new containers and data streams. FFmpeg itself is a super powerful CLI tool as it stands. No GUI designed for it could ever capture all of its flexibility at the command-line. However, being an important piece of software, FFmpeg has had several GUIs made for it addressing various common workflows and usage.

        Maybe you should investigate using one of those.

        I am not surprised that even those with a Computer Science PhD background have to learn a few new tricks when dealing with audio and video stream processing. I know I did. One user I heard of was piping the output of ffmpeg directly into speexenc. Ingenious! Every new area of knowledge requires learning. Be open to it. The standard university way is to put such info into a man page (instead of making you read the source code). Nobody reads the whole man page every time. And finding the information you want is a life skill. However, End Users such as yourself are not the only users of FFmpeg and similar CLI tools. Scripts use FFmpeg also. So the CLI has to service the needs of CLI people and those that want to automate the process or those that want to create a GUI for it. Why spend the time creating the GUI portion of a program that doesn’t work or a software system that nobody uses or wants to use?

        That’s what you are advocating. Cart before horse.

        Which is more important: the quality inside or the outside veneer? Our society already puts too much emphasis on the glitz. As I said previously, important software like FFmpeg causes software developers to embed it in scripts which make it easier to use and very important CLI utilities get several Graphical User Interfaces like FFmpeg has.

        Maybe you should use the software that Sansa supplies with e2xx players for video processing .avi files into the unit’s proprietary video format. It’s got a nice interface! No thinking required. No unexpected surprises.

        I agree that a CLI is a user interface. One where the explanation of behaviors is documented concisely or in a form that you unfortunately can’t speed read. FFmpeg is not a system. (Quoting from your response: “intuitively designed system…”) It is a component — a part of a system. If you need a system, maybe you should investigate using an ffmpeg GUI.

        The beauty of FFmpeg is that it is simple enough for you to begin using and to be quickly rewarded with fantastic results. You were hooked by the power of FFmpeg — as am I. It is worthy of our discussion. The unexpected surprise with FFmpeg is how versatile and capable a command line interface can be reusing the same switches multiple times for different purposes. Truly it is the swiss-army knife of audio/video processing. Be glad these Linux utilities and libraries are now available on Windows. And that OSS developers have the freedom to work on what they deem most important.

      3. I don’t think you’ve quite understood my points. It seems that you believe that fundamentally, functionality is orthogonal to system design. It is not. No matter how many features you place in a system, it remains unusable until you are able to provide a coherent information architecture to allow users to use them in a reasonable way. The onus of that is on *you*, the system designer, and not on the user. The fact that you seem to want to separate what is “function” from “design” simply illustrates how far apart our views are.

        The point about a GUI is, again, a lovely attempt at a red herring. We’re discussing the user expectations of command line interfaces, which still require design consideration, since there are norms and expectations in such interfaces. Snide insinuations that users who somehow can’t hack a command line should go look for a nice safe GUI are unnecessary and unproductive. I built a horizontally scalable, distributed video transcoding system complete with a RESTful web API while working for Yahoo, under the guidance of an experienced team lead who is now a director at another venture-funded video startup. Allow me to say that I know exactly what goes on under the hood here.

        I’m glad you’re friends with developers of this package. They do excellent work; that has *never* been in question. If I didn’t care about or understand the ffmpeg package, I wouldn’t give two cents about their software issues; I have neither the free time nor the inclination for that. I enjoy working with ffmpeg, make no mistake. The developers have simply neglected a key consideration, which is fine given their disinclination or lack of expertise in design issues; I have now raised it as a critique for improvement. I don’t think they need you to defend their expertise (nor is such a defense required).

        On the other hand, I certainly do not need another dose of “Free Software is awesome; to hell with the user” propaganda, or for that matter, amateur social criticism either. Fundamentally I believe in user-centered design, and it seems that you do not.

        A blog comment thread is not an appropriate place for extended debate (we’ve already hit the thread nesting limit :)). Please contact me by email (you can find my email at the bottom of my UC Berkeley web page) if you genuinely want to discuss the various merits of HCI principles as applied to CLI.

  4. Thanks a lot!
    You have helped a lot of people!
    If one day I become rich I will express my gratitude in $$$!
    Promise is a promise!

    The ffmpeg man page specifies that parameters concerning the source file (-i SOURCE_FILE) need to come before it.
    But people do not read man pages.
    They use google and copy-paste.

  5. Thank you! I was already starting to go mad over the parameters for cutting and imagining all kinds of weird possible bugs there. I wish the order of parameters wouldn’t matter as much as their name (i.e. -itsoffset for input, -otsoffset for output, possibly).
    Thanks again!

  6. Very useful page (except the rants from merciful/othniel, self-appointed open-source know-it-alls), many thanks! Saved me lots of time!

  7. Thanks for this!! Something as simple as rearranging the position of the -ss and -t can affect the whole splicing issue.

    I gave up last week, thinking that ffmpeg is flawed. But true to form, someone was able to figure this out, so I have to thank you deeply, as you’ve helped put a big project back on track! 🙂

  8. While
    ffmpeg -ss 00:00:30.00 -t 25 -i bar.mp3 -acodec copy bar-new.mp3

    will work in most cases, you will get an error message pointing out that “-t” is not an input command.

    The correct syntax is:
    ffmpeg -ss 00:00:30.00 -i bar.mp3 -t 25 -acodec copy bar-new.mp3

    where “-t” is after the input command.
