Hikvision Intercom Ringtone audio format

Discussion in 'Alarm & Security Systems' started by mgomespt, Nov 3, 2016.

  1. mgomespt

    Just thought I'd post a little tutorial on how to properly format audio files for the new ringtone upload feature found in firmware version 1.4.0. It took me a while to figure out, and I couldn't find any proper documentation on it, unless you count a brief snippet in the latest user guide as useful.

    The stock "ringtones" are OK-ish, but I found them all to have a bit too much of a "sense of urgency" for my home doorbell tastes. I eventually dabbled in Freesound.org and found a lively community of field and studio recordists, with many freely available, high-quality recordings of a wide variety of specialised objects...such as the noble tubular doorbells :)

    I used two applications to get the ringtone file into a format the indoor station accepts:

    - ffmpeg, to re-encode the downloaded MP3 file into the WAV format the indoor station accepts. ffmpeg leaves a metadata tag in its output, which has to be removed with another tool;
    - Audacity (free, donationware), to strip all metadata from the resulting file, since the indoor station rejects ringtone files that contain any. ffmpeg stubbornly insists on populating at least an "encoder" metadata field in its output files. The whole transcoding process can probably be done in Audacity alone, as it offers some customisation of its exported audio files, but the custom export feature seems to require integrating ffmpeg with Audacity, a path I didn't venture down.

    The indoor station is very strict about the audio format it needs: the ringtone WAV file MUST have the following properties:

    - "pcm_s16le" encoded;
    - mono;
    - 8000 Hz sampling rate;
    - 128 kb/s bitrate (this follows from the properties above: 8000 samples/s x 16 bits x 1 channel = 128,000 bits/s);
    - zero metadata accepted.

    (I'm also suspicious that it may have a hard limit on how long the audio file can be, but I haven't tested this above ~10 seconds).
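    To check a file against the list above before uploading it, something like the following Python sketch may help. It uses only the standard library. The function names are hypothetical, and it assumes that "zero metadata" means the file contains no RIFF chunks other than "fmt " and "data"; I have not confirmed that this is exactly what the indoor station checks.

```python
import struct
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file plus its top-level RIFF chunk IDs."""
    # wave.open raises wave.Error if the file is not uncompressed PCM.
    with wave.open(path, "rb") as w:
        channels = w.getnchannels()
        sample_bytes = w.getsampwidth()
        rate = w.getframerate()
        frames = w.getnframes()

    # Walk the top-level chunks to see whether anything besides audio is present.
    chunks = []
    with open(path, "rb") as f:
        f.read(12)  # skip 'RIFF', total size, 'WAVE'
        while True:
            head = f.read(8)
            if len(head) < 8:
                break
            chunk_id, size = struct.unpack("<4sI", head)
            chunks.append(chunk_id.decode("ascii", "replace"))
            f.seek(size + (size & 1), 1)  # chunks are padded to an even length

    return {
        "channels": channels,
        "sample_bits": sample_bytes * 8,
        "rate_hz": rate,
        "duration_s": frames / rate,
        "bitrate_bps": rate * sample_bytes * 8 * channels,
        "chunks": chunks,
    }


def ringtone_problems(info):
    """List the ways a file deviates from the format described in this post."""
    problems = []
    if info["channels"] != 1:
        problems.append("not mono")
    if info["sample_bits"] != 16:
        problems.append("not 16-bit")
    if info["rate_hz"] != 8000:
        problems.append("sampling rate is not 8000 Hz")
    # Assumption: any chunk other than 'fmt ' and 'data' counts as metadata.
    extra = [c for c in info["chunks"] if c not in ("fmt ", "data")]
    if extra:
        problems.append("extra chunks: " + ", ".join(extra))
    return problems
```

    Calling ringtone_problems(describe_wav("tubular_doorbell1.wav")) should return an empty list for a file that matches the list. The "duration_s" value is reported but not checked, since I don't know what the length limit is, if there is one.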

    It's a bit of a shame that the indoor station's audio playback is limited to this low-quality format, though I guess there's not much point in pushing better-quality files through such a small, low-quality speaker. I think that in the long run I'll have the indoor station trigger a proper physical tubular doorbell contraption "just because".


    1 - Re-encode the audio file using ffmpeg. In my example the source MP3 file has only a single audio channel, so I didn't need to tell ffmpeg to combine stereo channels into mono (for a stereo source, adding "-ac 1" forces a single output channel; an official ffmpeg webpage describes other approaches to this):
    $ ffmpeg -i pipe.mp3 -acodec pcm_s16le -ar 8000 -ab 128k tubular_doorbell1.wav
    ffmpeg version 2.3.3 Copyright (c) 2000-2014 the FFmpeg developers
      built on Aug 20 2014 13:28:07 with llvm-gcc 4.2.1 (LLVM build 2336.11.00)
      configuration: --prefix=/Volumes/Ramdisk/sw --enable-gpl --enable-pthreads --enable-version3 --enable-libspeex --enable-libvpx --disable-decoder=libvpx --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libx264 --enable-avfilter --enable-libopencore_amrwb --enable-libopencore_amrnb --enable-filters --enable-libgsm --enable-libvidstab --enable-libx265 --arch=x86_64 --enable-runtime-cpudetect
      libavutil      52. 92.100 / 52. 92.100
      libavcodec     55. 69.100 / 55. 69.100
      libavformat    55. 48.100 / 55. 48.100
      libavdevice    55. 13.102 / 55. 13.102
      libavfilter     4. 11.100 /  4. 11.100
      libswscale      2.  6.100 /  2.  6.100
      libswresample   0. 19.100 /  0. 19.100
      libpostproc    52.  3.100 / 52.  3.100
    [mp3 @ 0x7fce8a01e000] Estimating duration from bitrate, this may be inaccurate
    Input #0, mp3, from 'pipe.mp3':
      Duration: 00:00:09.46, start: 0.000000, bitrate: 96 kb/s
        Stream #0:0: Audio: mp3, 44100 Hz, mono, s16p, 96 kb/s
    Output #0, wav, to 'tubular_doorbell1.wav':
        ISFT            : Lavf55.48.100
        Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 8000 Hz, mono, s16, 128 kb/s
          encoder         : Lavc55.69.100 pcm_s16le
    Stream mapping:
      Stream #0:0 -> #0:0 (mp3 (native) -> pcm_s16le (native))
    Press [q] to stop, [?] for help
    size=     148kB time=00:00:09.45 bitrate= 128.1kbits/s
    video:0kB audio:148kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.051553%
    2 - Open the ffmpeg-transcoded file in Audacity and click "File->Export Audio";
    2.1 - Choose "WAV (Microsoft) signed 16-bit PCM" as the file type and click the "Save" button;
    2.2 - In the metadata window that follows, select the "encoder" row and click "Remove" (this removes the metadata field from the exported audio file), then click "OK" to save the exported file.
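    As a possible alternative to the Audacity step, the metadata can also be dropped with a few lines of Python. This is only a sketch: strip_wav_metadata is a hypothetical name, it relies on the standard library "wave" module writing nothing but the "fmt " and "data" chunks, and I have not tested its output on the indoor station.

```python
import wave


def strip_wav_metadata(src, dst):
    """Copy the audio of a PCM WAV file to dst, leaving metadata chunks behind."""
    # The reader skips chunks it does not understand (such as ffmpeg's LIST/INFO).
    with wave.open(src, "rb") as reader:
        params = reader.getparams()
        frames = reader.readframes(reader.getnframes())

    # The writer emits only the RIFF header, a 'fmt ' chunk and a 'data' chunk.
    with wave.open(dst, "wb") as writer:
        writer.setnchannels(params.nchannels)
        writer.setsampwidth(params.sampwidth)
        writer.setframerate(params.framerate)
        writer.writeframes(frames)
```

    For example, strip_wav_metadata("tubular_doorbell1.wav", "tubular_doorbell1_clean.wav") would write a copy containing the same samples without the "encoder" field. It does not resample or downmix, so the input must already be 8000 Hz, mono, 16-bit.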

    Now you can launch iVMS-4200 (tested on release …) and upload the ringtone under the "Remote Configuration" area of your indoor station (tested on firmware version 1.4.0). The newly uploaded ringtone is then available on the touchscreen UI of your indoor station.
    Last edited: Nov 7, 2016