
Question about post-processing: getting audio-bombed

OkamotoKeitaSin

When recording a focal species I sometimes get "audiobombed" (like photobombed but audio) by various factors, but generally due to

1) Digital/actual noise. Digital ones being unexplained interferences resulting in sudden loud clips, and actual ones when a fruit/leaf falls and makes a loud sound.
2) Non-focal bird. Say I'm recording a sub-song (often rather soft) and a different species comes by, squeaks loudly just for several seconds, then flies off.

I've so far been using "pop-mute" in Audacity to reduce the amplitude of such sections. For "type 1" noises I see no harm in doing this since that audio information is useless anyway, and at times I delete those loud clips if they are just several milliseconds long. For "type 2" noises (i.e. real birds) I'm concerned that reducing their amplitude via "pop mute" might distort the sounds.

The general rule of thumb I understand is to normalise the recording so that the focal bird vocalisation is at -3dB. So for each type of "noise", what's the recommended workflow (apart from giving up) to make the best out of the recording?
 
For transient sounds such as the type 1 sounds you mention, I would try the following
1/ use a spectral editing tool to copy a very short interval from a quiet area of the sonogram, ideally from immediately before the click or pop, then paste this over the top of the unwanted sound. This is easiest when the unwanted sound does not cross the bird vocal on the spectrogram - if it crosses, you can try to copy and paste smaller ‘blocks’ over the click while avoiding the part of the spectrogram occupied by the bird sound.
2/ normalise the level of the recording after editing the click or pop out - often a click or pop is the loudest sound and will therefore dominate the normalisation process unless dealt with first. -3 dB is the guidance from Xeno-canto, but if the main vocal is quiet you may not be able to achieve this without also boosting the noise too much. I think -3 dB should really be thought of as the maximum for a good recording, rather than the rule.
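
To illustrate why the order in 2/ matters, here is a minimal Python sketch on a synthetic signal. It is not what Wavelab or Audacity do internally: the function names (`patch_from_before`, `normalise_peak`, `peak_dbfs`) and the test tone are made up for the example, and the patch is a plain time-domain copy (every frequency in that interval gets replaced), which is a simplification of the spectral copy and paste described in 1/.

```python
import math

def peak_dbfs(samples):
    """Peak level in dBFS for float samples where full scale is 1.0."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak) if peak > 0 else float("-inf")

def patch_from_before(samples, start, end):
    """Overwrite samples[start:end] with the equally long stretch just before it."""
    length = end - start
    if start < length:
        raise ValueError("not enough audio before the click to copy from")
    patched = list(samples)
    patched[start:end] = samples[start - length:start]
    return patched

def normalise_peak(samples, target_dbfs=-3.0):
    """Scale the whole recording so its loudest sample sits at target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    gain = 10 ** (target_dbfs / 20) / peak
    return [s * gain for s in samples]

# Synthetic example: a quiet 1 kHz tone at 48 kHz with a 5-sample click in it.
rate = 48000
tone = [0.05 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(4800)]
clicked = list(tone)
for n in range(2400, 2405):
    clicked[n] = 0.9

# Normalising first: the click sets the gain, so the tone stays quiet.
naive = normalise_peak(clicked)
# Patching first: the tone itself is what gets raised to -3 dBFS.
fixed = normalise_peak(patch_from_before(clicked, 2400, 2405))

tone_peak_naive = max(abs(s) for s in naive[:2400])
tone_peak_fixed = max(abs(s) for s in fixed[:2400])
print("tone peak, click left in:", round(20 * math.log10(tone_peak_naive), 1), "dBFS")
print("tone peak, click patched:", round(20 * math.log10(tone_peak_fixed), 1), "dBFS")
```

A hard paste like this can leave a small discontinuity at the edges, which is one reason a proper spectral editor usually gives a cleaner result.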

Copy and paste editing is available in many software suites. I use Wavelab Pro, but there are cheaper options.

I find declick functions much less reliable than manually editing out clicks and would not recommend them.

Another option with spectral editing tools is to select the unwanted sound and change the level, but I find this problematic. In Wavelab at least, the level change softens at the edges, so the edit is less precise.

You can use the same copy and paste process for knocking out other sounds such as a human voice or other bird sounds, but this can be challenging if the unwanted sound and wanted sound cross each other. In general the process works best to cover very short sounds. For larger copy and paste actions the edit can be more noticeable (particularly if there is a background sound in the copy zone). Copying and pasting a short clip (which appears quiet) multiple times to cover a larger area normally doesn’t work well, as you can get a repetitive noise by doing this.

Another option is to use something like Steinberg SpectralLayers Pro 8. With this you can split a recording into transients (short vertical elements on the sonogram), tonals (long horizontal elements) and noise. The software works a bit like Adobe Photoshop with the option to create layers. You can therefore select sounds from the sonogram and move them to new layers. You can then change the level for this layer only. You could try to use this technique to separate out and tone down a species sound bombing the recording. The selection tools in Spectral Layers are very good and allow very precise edits to be performed from the sonogram.

If you are concerned that the end edit has gone too far and lost something, then prior to normalising levels you can always try phase-inverting the edit and mixing it with the original. If the edit is good you should only hear the unwanted clicks and other sounds you have deleted - if you can hear elements of the target bird’s vocals, the edit is no good.
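
The phase-inversion check can be sketched in a few lines of Python. This is only an illustration on made-up data: `null_test` is a hypothetical helper, and it assumes the original and the edit are sample-aligned, the same length, and have not had their levels changed yet (which is why the check comes before normalising).

```python
import math

def null_test(original, edited):
    """Sum the original with a phase-inverted copy of the edit (original - edited)."""
    if len(original) != len(edited):
        raise ValueError("both versions must have the same length")
    return [o - e for o, e in zip(original, edited)]

# Made-up data: a 3 kHz "bird" tone with a single-sample click at index 500.
rate = 48000
bird = [0.1 * math.sin(2 * math.pi * 3000 * n / rate) for n in range(960)]
original = list(bird)
original[500] += 0.8

# A clean edit removes only the click.
edited_good = list(original)
edited_good[500] = bird[500]

# A heavy-handed edit silences a wider stretch and takes the bird with it.
edited_bad = list(original)
edited_bad[490:510] = [0.0] * 20

residual_good = null_test(original, edited_good)
residual_bad = null_test(original, edited_bad)

# Anything in the residual besides the click is material the edit removed.
leak_good = sum(abs(r) for i, r in enumerate(residual_good) if i != 500)
leak_bad = sum(abs(r) for i, r in enumerate(residual_bad) if i != 500)
print("bird leaked by clean edit:", leak_good)
print("bird leaked by heavy edit:", round(leak_bad, 3))
```

In an editor, the equivalent is to invert one track, play both together, and listen to what remains.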

Of course the best option is to try and avoid unwanted sounds in the first place. If you have a condenser mic and are recording in humid conditions this may explain clicks and pops. I think a foam mic cover can help protect the capsule from humidity, but this provides limited wind protection. An RF condenser mic should be less susceptible to humidity, but the mics are far more expensive.

For audio-bombing, a good directional mic, well aimed at the target bird, should attenuate off-axis bird sounds. Of course this will not solve the problem if a bird pops up directly behind or in front of the target and blasts its head off.
 
Thanks for the detailed advice Jon!

I post-process in Audacity, which also has copy and paste functions, so I will use them next time to remove loud clicks and other human noise. It seems a little bit like cloning in Photoshop; it'll probably take me a bit of practice to make things sound smooth for longer noises.

SpectralLayers looks expensive, but I'm thinking that I might be able to apply a similar method using the "Equaliser" in Audacity to selectively reduce the amplitude of certain frequencies (if they don't overlap with the main bird)? I do that right now to deal with loud cicadas.

Never knew about the phase inverting method, that's very clever. I'm sometimes concerned that my high/low pass filters are too strict and are affecting the main vocalisation, so I shall incorporate that into my workflow too. And I now know why I might be experiencing frequent "pops" - I use a condenser mic with a foam cover, but it's impossible to escape humidity in Southeast Asia where I live!

Going off on a bit of a tangent, but regarding the -3 dB guidance: when the bird song is too soft (but is of a precious species/sound type worth contributing online), is it generally better to upload a softer overall recording than one that is amplified to a certain extent? I'm a little confused, because while there's the concern of amplifying noise when forcefully normalising to -3 dB (or something slightly softer), on the flip side, if it's too soft anyway, the user will still end up amplifying this noise when using the recording.

For audio-bombing, a good directional mic, well aimed at the target bird, should attenuate off-axis bird sounds. Of course this will not solve the problem if a bird pops up directly behind or in front of the target and blasts its head off.

It has generally been fine, but just recently a flowerpecker did exactly what you described when I was recording a nice flycatcher subsong! Such is life I guess 😅
 
You can try using an equaliser function, but I think this works best for low frequency knocks (from handling) and wind noise. With the equaliser that I use, you get a display of the sound intensity against frequency, so you can listen a few times, see the frequencies where the vocals are, then start manipulating the equaliser to knock out the low or high frequencies without impacting the bird vocals. I recently tried this for Kittiwakes in Norway, and Dartford Warbler in the UK, which were both recorded on windy days, and was quite pleased with the results.

I would avoid dropping any frequencies to zero, as I think this sounds a bit artificial. I prefer using a low shelf for low frequencies. If you drastically reduce certain frequencies the resultant spectrogram looks a bit off, with large white bands present (normally at the top and bottom of the spectrogram). I have seen threads on Xeno-canto where people have complained about some users' excessive post-processing - white bands on the sonograms are a giveaway.

Using an equaliser is rather an imprecise process to reduce (or increase) certain frequencies over the whole recording. I think trying to divide a recording, apply different equaliser effects then piece it back together, would not work - or it would take a lot of work, a certain amount of genius, and probably a lot of luck. The use of an equaliser is therefore a lot different to something like SpectralLayers, which I think uses AI to pull apart the structure of the recording into separate layers, and allows each layer to be precisely edited and the loudness altered.

There are also noise reduction tools, but these can impact bird vocals and I think work best on constant noise. I have had limited success with trying to reduce the noise of passing vehicles or aircraft, when the sound changes in intensity and pitch over time. That said the guys at Cedar processed a file for me as a trial and did a really good job of reducing the noise from a passing truck - the software they used however was outside my budget.

With regard to the -3 dB rule: I recall that Xeno-canto stated that this was a standard, so that the listener knows what to expect. As 3 dB is approximately a doubling of sound intensity, I suppose that if you listened to one recording at -3 dB then listened to another recording at 0 dB it could come as a bit of a shock!
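
For anyone who wants to check the decibel arithmetic, a short Python sketch follows. The function names are just illustrative; the formulas are the standard definitions of the decibel for power-like quantities (intensity) and for amplitude.

```python
def db_to_intensity_ratio(db):
    """Ratio of sound intensity (power) for a level difference in dB."""
    return 10 ** (db / 10)

def db_to_amplitude_ratio(db):
    """Ratio of waveform amplitude (sample values) for a level difference in dB."""
    return 10 ** (db / 20)

# The step from -3 dB to 0 dB is a 3 dB difference.
intensity_ratio = db_to_intensity_ratio(3)
amplitude_ratio = db_to_amplitude_ratio(3)
print("intensity ratio:", round(intensity_ratio, 3))
print("amplitude ratio:", round(amplitude_ratio, 3))
```

So 3 dB is very nearly a factor of two in intensity, but only a factor of about 1.41 in the sample values you see on the waveform; doubling the waveform height takes about 6 dB.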

I agree that if the listener turns up the volume then they will still get the hiss, but then at least this is the listener's choice. I would prefer to label the recording as a quiet or distant recording, then let the listener decide whether to turn up the recording. I think this would be better than the listener having to turn down the recording to bury the excess noise.

The problem with quiet recordings is that environmental noise, mic self noise and pre-amp noise all become more evident compared to the target recording. These are generally wide band, so are hard to separate out - even a tool like SpectralLayers won't perform magic with a quiet recording with lots of noise - I have tried this with NocMig recordings with disappointing results. With distance you also get attenuation of high frequencies, and any completely lost frequencies cannot ever be recovered. There has been plenty of discussion regarding close placement of microphones not representing what the observer hears, but the converse of this is that a bird recorded at distance will have the tonal quality of a bird recorded at distance (subject to variation from temperature and humidity) regardless of how loud we make it. I think it is best to say that the recording is quiet and made at distance and be done with it.
 
Looking at Audacity, I don’t think the spectral editing tools include a copy and paste function. You only seem to be able to select a region of the sonogram and delete the audio or apply effects.

If this is the limit of Audacity’s function, you would need to be a bit more careful, as alteration or deletion of a region may result in a more noticeable edit (particularly on the sonogram).

In my experience using software where you can copy and paste small adjacent regions to mask out clicks and pops results in a better ‘seamless’ edit.

I know you consider SpectralLayers too expensive, but it does have great selection tools (not just a rectangular selection tool) so precise deletions are a doddle. It can also recreate noise ‘behind’ the tonal or transient, so when you make a deletion you are not left with an annoying void in the sonogram. With this software you don’t need to copy and paste, just split the recording into stems then select and delete the tonals and transients (clicks and pops) you don’t want.
 
Thanks again Jon for the highly specific advice. So much to learn!

I've seen large white bands on sound clips online (and am myself guilty of contributing some) and at times find them somewhat inevitable, especially for birds like doves that are so soft and low in frequency that everything else - especially cicadas - drowns them out. Guess it'll take me more practice to find the right balance when post-processing so that things don't sound unnatural.
I think it is best to say that the recording is quiet and made at distance and be done with it.
Mm that's true, at times I'll just need to learn that certain recordings simply aren't good enough and excessive post-processing won't do any good.

Have only played around with the noise reduction in Audacity but am not quite a fan of it because it frequently creates a muddled and "watery" sounding result. Didn't know that the layering in SpectralLayers was so intricate; it seems like trying to emulate it in Audacity with equalisers might not work out well indeed. RE: cut and paste - I am able to copy/cut and paste a duration from the sound clip but not specific frequencies (if I'm understanding how SpectralLayers works correctly). I suppose it should be a reasonable stopgap for removing the pops and clicks for the time being. Will probably have to make do with that for now until I can eventually invest in a more expensive program!
 
Hi
I often reduce the lowest 400-600 Hz in Audacity by 6 or 12 dB to reduce annoying hums and rumble in the lower part. Audacity has a smooth transition from 0 to 500 Hz, making it sound quite OK to my ears. For water and wind I use the Noise Reduction filter with a maximum of 5 dB.
Stein
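
As a rough illustration of a gentle low-frequency cut of this kind (lows turned down by a fixed amount rather than removed), here is a minimal Python sketch of a first-order low shelf. It is not Audacity's filter: `low_shelf_cut`, the 500 Hz corner and the test tones are assumptions chosen for the example.

```python
import math

def low_shelf_cut(samples, rate, corner_hz=500.0, cut_db=6.0):
    """Turn content well below corner_hz down by about cut_db; leave highs alone."""
    gain = 10 ** (-cut_db / 20)                             # linear gain for the lows
    alpha = 1 - math.exp(-2 * math.pi * corner_hz / rate)   # one-pole smoothing factor
    low = 0.0
    out = []
    for s in samples:
        low += alpha * (s - low)            # running low-pass estimate of the signal
        out.append(s - (1 - gain) * low)    # subtract part of the lows, never all of them
    return out

def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))

rate = 48000
rumble = [math.sin(2 * math.pi * 20 * n / rate) for n in range(rate)]    # 20 Hz rumble
song = [math.sin(2 * math.pi * 8000 * n / rate) for n in range(rate)]    # 8 kHz "song"

rumble_change_db = 20 * math.log10(rms(low_shelf_cut(rumble, rate)) / rms(rumble))
song_change_db = 20 * math.log10(rms(low_shelf_cut(song, rate)) / rms(song))
print("rumble level change:", round(rumble_change_db, 1), "dB")
print("song level change:", round(song_change_db, 1), "dB")
```

Because the lows are only reduced, not zeroed, the bottom of the spectrogram fades rather than turning into a blank band.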
 
I have a question that comes to mind: I recorded a thunderstorm with a Zoom F3 (32-bit float) and the thunder broke (+10 kHz) and distorted the sound. Could it be fixed in editing with Audacity? What hidden tools or steps should I follow (tutorials)?
 
When you say you have distortion, have you normalised the level to less than 0 dBFS? Or are you trying to play a clipped file? I'm not an Audacity user, but I think the menu selection to normalise is
  • Effect -> Volume and compression -> Loudness Normalization... to open the Loudness Normalization window
Solving clipping issues should be a real benefit of 32-bit float.

If the recording isn’t clipped, has the thunder caused mic distortion? I don’t think you could fix the issue if the mics have been overloaded - i.e. the noise level was louder than the max SPL for the mic.
 
It distorts the speakers, so I think there is no solution anymore. I did not normalize the audio or edit it.
 
Not sure this is true. If you play a clipped file it will be distorted on playback, even with a 32-bit float recording.

Clipping is when the recording goes above 0 dBFS. With 32-bit float, although playback will be distorted, the data above 0 dBFS is not lost (it is with non-float recordings).

So to fix it you must normalise the track to below 0 dBFS (a common value is -3 dBFS). Once normalised, listen again to see if the distortion has been fixed and, if so, save the normalised file.
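
To show the difference between float and fixed-point data in this situation, here is a small Python sketch on a synthetic burst. It is an illustration only, not what the Zoom F3 or Audacity do internally; `to_int16`, `normalise_peak` and the test signal are made up for the example, and samples are treated as floats with full scale at 1.0.

```python
import math

def to_int16(samples):
    """Quantise to 16-bit integers; anything beyond full scale gets clamped."""
    return [max(-32768, min(32767, round(s * 32767))) for s in samples]

def normalise_peak(samples, target_dbfs=-3.0):
    """Scale the whole track so its loudest sample sits at target_dbfs."""
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)
    gain = 10 ** (target_dbfs / 20) / peak
    return [s * gain for s in samples]

# A thunder-like burst peaking at 2.0, i.e. roughly 6 dB over full scale.
rate = 48000
loud = [2.0 * math.sin(2 * math.pi * 100 * n / rate) for n in range(480)]

# Float data: the values above 1.0 are still there, so scaling down recovers the shape.
recovered = normalise_peak(loud)
recovered_peak = max(abs(s) for s in recovered)

# Fixed-point data: the tops were flattened at capture and cannot be restored.
clipped = [v / 32767 for v in to_int16(loud)]
flat_top = sum(1 for v in clipped if abs(v) >= 0.9999)

print("float peak after normalising:", round(20 * math.log10(recovered_peak), 1), "dBFS")
print("samples stuck at full scale in the 16-bit copy:", flat_top)
```

None of this helps if the microphone itself was overloaded, since that distortion is already in the waveform before it reaches the recorder.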
 
I'm going to normalize it, just as you say, to see if it's possible to save the file. Thank you!

This is the damaged audio (at thirty seconds), recorded with a Zoom F3:
 
There are a few high peaks in this recording and ‘distortion’ at one point. I haven’t checked the peak level, but I think the audio is clipped.

Thunderstorms are quite hard to record as there is a large dynamic range between the quiet and loud bits. Obviously, if you normalise the whole file, the rain noise will be quieter, so you need to see how this goes - you probably don’t want near silence and then the clap of thunder. If the rain noise ends up too quiet, you can try bringing down the peaks only. I think to do this you would zoom in on the peaks in the waveform, make a selection of the thunderclap and normalise to less than 0 dBFS. You would not want to select too much, as the problem could then be a noticeable step in the ambient noise level. I don’t think Audacity has the capability to fade in and out a change to the sound level.

I would suggest trying a global normalisation first and seeing whether the noise of the rain sounds OK, but if this doesn’t sound good, then trying to normalise just the peaks.
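
The peaks-only idea, with a fade on either side to avoid a step in the ambient level, can be sketched in Python as below. This works on raw sample values outside any editor; `region_gain_for_peak`, `duck_region` and the numbers in the example are hypothetical and only meant to show the principle.

```python
def region_gain_for_peak(samples, start, end, target_dbfs=-3.0):
    """Gain that brings the peak of samples[start:end] down to target_dbfs (never boosts)."""
    peak = max(abs(s) for s in samples[start:end])
    if peak == 0:
        return 1.0
    return min(1.0, 10 ** (target_dbfs / 20) / peak)

def duck_region(samples, start, end, gain, fade):
    """Scale samples[start:end] by gain, ramping the change in and out over `fade` samples."""
    out = list(samples)
    for n in range(max(0, start - fade), min(len(samples), end + fade)):
        if n < start:
            w = (n - (start - fade)) / fade      # ramps 0 -> 1 approaching the region
        elif n >= end:
            w = ((end + fade) - n - 1) / fade    # ramps 1 -> 0 leaving the region
        else:
            w = 1.0                              # full gain change inside the region
        out[n] = samples[n] * (1 + (gain - 1) * w)
    return out

# Made-up track: steady "rain" at 0.05 with a "thunderclap" at 1.8 in the middle.
track = [0.05] * 1000
track[400:600] = [1.8] * 200

gain = region_gain_for_peak(track, 400, 600)
fixed = duck_region(track, 400, 600, gain, fade=100)

print("thunder peak after:", round(max(fixed[400:600]), 3))
print("rain far from the clap:", fixed[0])
```

The rain away from the thunderclap is untouched, and within the fade zones the level changes gradually instead of jumping.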
 
