Video datasets

For videos with audio tracks, we recommend encoding with ffmpeg using the following settings:

ffmpeg -i INPUT \
  -preset veryslow \
  -keyint_min 2 -g 24 -sc_threshold 0 \
  -c:v libx264 -pix_fmt yuv420p -crf 17 \
  -c:a aac -b:a 256k -ac 2 -ar 48000 \
  -movflags +frag_keyframe+empty_moov+default_base_moof \
  OUTPUT.mp4

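These settings (H.264 video in the yuv420p pixel format, stereo AAC audio) ensure wide browser compatibility. A CRF value below 18 (lower values mean higher quality) is generally considered visually lossless, which is why the command uses 17. For audio, compression at 256 kbps or above is generally considered imperceptible [1, 2].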

For videos without audio, use the following settings:

ffmpeg -i INPUT \
  -preset veryslow \
  -keyint_min 2 -g 24 -sc_threshold 0 \
  -c:v libx264 -pix_fmt yuv420p -crf 17 \
  -an \
  -movflags +frag_keyframe+empty_moov+default_base_moof \
  OUTPUT.mp4
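
The two commands above differ only in their audio options, so a small wrapper can choose between them automatically. The following is a minimal sketch, assuming bash and that ffprobe is installed alongside ffmpeg; the function names (count_audio_streams, audio_args, encode_video) are hypothetical and not part of our platform.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative, not part of any platform API.

# Print the number of audio streams in the file $1 (requires ffprobe).
count_audio_streams() {
  ffprobe -v error -select_streams a \
    -show_entries stream=index -of csv=p=0 "$1" | wc -l | tr -d ' '
}

# Print the recommended audio options for a given audio stream count $1.
audio_args() {
  if [ "$1" -gt 0 ]; then
    echo "-c:a aac -b:a 256k -ac 2 -ar 48000"
  else
    echo "-an"
  fi
}

# Encode INPUT ($1) to OUTPUT ($2) with the recommended settings.
encode_video() {
  local n
  n=$(count_audio_streams "$1")
  # The unquoted substitution is intentional: the options must be split
  # into separate arguments.
  # shellcheck disable=SC2046
  ffmpeg -i "$1" \
    -preset veryslow \
    -keyint_min 2 -g 24 -sc_threshold 0 \
    -c:v libx264 -pix_fmt yuv420p -crf 17 \
    $(audio_args "$n") \
    -movflags +frag_keyframe+empty_moov+default_base_moof \
    "$2"
}
```

After sourcing the script, a call such as `encode_video INPUT OUTPUT.mp4` runs the first command when INPUT has at least one audio stream and the second command otherwise.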

The -movflags option creates fragmented MP4 files, which some of our experiments require. If a video is not fragmented, our platform will automatically fragment it using ffmpeg, but it will not transcode it.
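
To fragment an existing MP4 yourself without re-encoding, the streams can be copied into a new container. The sketch below assumes bash; the function names are hypothetical, and the stream-copy command is our illustration of fragmenting without transcoding, not necessarily the exact command the platform runs. The fragmentation check is a crude heuristic: it only looks for the bytes "moof" (the movie fragment box type) anywhere in the file, so it can report false positives.

```shell
#!/usr/bin/env bash
# Sketch only: function names are illustrative, not part of any platform API.

# Return success if the file $1 appears to contain a 'moof' box.
# Heuristic: a plain byte search, which may match unrelated data.
is_fragmented() {
  LC_ALL=C grep -a -q 'moof' "$1"
}

# Remux INPUT ($1) into a fragmented MP4 at OUTPUT ($2).
# -c copy copies the audio and video streams without transcoding.
fragment_video() {
  ffmpeg -i "$1" -c copy \
    -movflags +frag_keyframe+empty_moov+default_base_moof \
    "$2"
}

# Fragment INPUT ($1) into OUTPUT ($2) only if it is not already fragmented.
fragment_if_needed() {
  if is_fragmented "$1"; then
    echo "already fragmented: $1"
  else
    fragment_video "$1" "$2"
  fi
}
```

Because stream copying leaves the encoded frames untouched, fragments can only start at keyframes that already exist in the file.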

References

[1] Pras et al. (2009). Subjective Evaluation of MP3 Compression for Different Musical Genres.
[2] Hines et al. (2014). Perceived Audio Quality for Streaming Stereo Music.