Fix One Way VoIP Audio (SIP, NAT and STUN)

The Problem - When making VoIP calls (particularly with SIP), you can ring phone numbers, but once the call is answered there is either no audio at all or audio in one direction only.

The Cause - I expect the cause to be the same whichever protocol your VoIP solution uses, but my experience is with SIP only. So this is definitely an issue with SIP; I haven't confirmed it with other protocols.

The problem arises because VoIP uses dynamically chosen UDP ports for each call. This causes trouble when traversing a NAT device for two reasons. First, the NAT device changes the source port of outbound packets as part of the NAT process. Second, UDP is connectionless. Where TCP traffic is bi-directional across a single connection, a UDP call is really two independent streams, one inbound and one outbound, and they can use different ports. If the inbound stream arrives on a port other than the one the outbound stream left from, it is dropped because the NAT device has no mapping for it in its NAT table. If you are confused by now I suggest you read up on NAT first.

What is SIP and why is it important to VoIP

Just as TCP/IP is not a single protocol but a family of protocols (TCP, IP, PPP, PPTP, ARP etc.), so is VoIP. There are several protocols you can use with VoIP, each with its own pros and cons. The one we will focus on in this article is SIP. SIP stands for Session Initiation Protocol. It is responsible for setting up the call: ringing, signalling, engaged tones etc.

In most SIP environments there will be several VoIP calls in use concurrently. Every one of these calls will be managed through the VoIP switch, each one requiring its own voice channel. Each channel (or phone call to look at it another way) must use a unique port. If there are 100 concurrent VoIP calls in use there must be 100 ports available for the VoIP switch to allocate to each call. This is where SIP comes in. It basically controls everything that is needed in setting up the call. For each call SIP will find a spare port, allocate it, send these details to all parties, set the call up and ring the phones. Once the call has finished SIP terminates the session and informs the phone switch that this port can be reassigned to another call.

The range of ports is usually configurable. Avaya, for example, lets you configure it in the VoIP portion of the system config. The default range for Avaya VoIP is 49152 to 53246. Counting both ends, that is 4095 ports, so at one port per call roughly 4,000 concurrent VoIP calls are possible, licensing permitting.
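To make the "one port per call" idea concrete, here is a minimal sketch of a port pool. The class name `VoicePortPool` and the call IDs are made up for illustration; this is not how Avaya (or any real switch) implements it, just the bookkeeping described above.

```python
class VoicePortPool:
    """Toy model: hands out one UDP port per call from an inclusive range."""

    def __init__(self, first_port, last_port):
        self.free = list(range(first_port, last_port + 1))
        self.in_use = {}  # call id -> port

    def allocate(self, call_id):
        if not self.free:
            raise RuntimeError("no free voice ports, call cannot be set up")
        port = self.free.pop(0)
        self.in_use[call_id] = port
        return port

    def release(self, call_id):
        # Call finished: the port can be reassigned to another call.
        self.free.append(self.in_use.pop(call_id))


pool = VoicePortPool(49152, 53246)
print("ports available:", len(pool.free))
port = pool.allocate("call-1")
print("call-1 uses port", port)
pool.release("call-1")
```

Counting both ends of the range, the pool holds 4095 ports.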

In a LAN environment this is not a problem as firewalls usually permit all traffic on all ports for all devices. Once the internet is involved where the traffic has to traverse a NAT and firewall we start to run into problems. In the Avaya example above it can pick a port anywhere in the range of 49152 to 53246. You can't just open this port range to the internet. A range of 4000 ports open isn't very secure.

How SIP is meant to work on the internet

As with all network traffic, one endpoint must initiate the connection. This means at least one port must be open, using port forwarding to the VoIP switch. SIP usually runs on port 5060. For the two offices to call each other, both sites must forward this port to their phone switch. Most SIP documentation will tell you that this is all you need to do... but in all likelihood it is not.

The following happens when you dial a VoIP number:

    * You dial the number and your local VoIP switch matches this up with a site ID which locates the public IP address of the remote location.
    * Your local VoIP will connect to the remote IP on port 5060 using SIP (which is why the port must be open).
    * The two phone switches now negotiate and set up the phone call. Several things are agreed in the negotiation, but the most important one for this article is the ports they will use to transmit the UDP voice streams.

The problem here is that SIP doesn't know it is behind a NAT. Say your local switch and the remote switch each have a private IP address on their own LAN. Although NAT rewrites the addresses in the headers of the SIP packets to the public IPs as they head out to the internet, it does not change the data inside the SIP packets (the payload). It is the payload that contains the IP addresses and ports to use for the actual phone call. So the local VoIP switch tells the remote one (via SIP) to send voice data to its private IP, and vice versa. This is never going to work, because internet routers drop packets to and from private IP addresses. Once the call is set up and the UDP voice data starts flowing, it is addressed to private IPs and consequently dropped. So how do we fix this?
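If you capture the SIP traffic you can see this for yourself. The sketch below pulls the media address and port out of a session description (the SDP body that SIP carries) and checks whether the address is private. The SDP text and the address 192.168.1.10 are invented examples, not values from my setup, and the parser only handles the two lines it needs.

```python
import ipaddress

# Hypothetical SDP body as it might appear inside a SIP INVITE.
# 192.168.1.10 and port 49152 are assumed example values.
SDP_BODY = """v=0
o=- 0 0 IN IP4 192.168.1.10
s=call
c=IN IP4 192.168.1.10
t=0 0
m=audio 49152 RTP/AVP 0
"""


def media_target(sdp):
    """Return (ip, port) that the far end is being told to send voice to."""
    ip, port = None, None
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            ip = line.split()[2]
        elif line.startswith("m=audio "):
            port = int(line.split()[1])
    if ip is None or port is None:
        raise ValueError("SDP has no IPv4 connection line or audio media line")
    return ip, port


ip, port = media_target(SDP_BODY)
if ipaddress.ip_address(ip).is_private:
    print(f"Payload advertises private address {ip}:{port}; voice sent there is lost")
else:
    print(f"Payload advertises public address {ip}:{port}")
```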

STUN

STUN stands for Session Traversal Utilities for NAT and, as you may have guessed from the name, it is a collection of utilities to aid in the traversal of NAT devices.

STUN (as in our case) helps a program or device learn whether it is behind a NAT and modify packets accordingly. It requires the help of a 3rd party server on the internet known as a STUN server. This now means that our VoIP phones can modify their SIP content to contain the public IP instead of the private one. Some of you may be thinking this same problem also affects ports.

It is common for NAT to also change the source port of an outbound packet to a new, randomly chosen one. When the remote device responds, it does so to this new port, and NAT lets the reply through because it has mapped this port to the internal client. As you might have guessed, this is also an issue for SIP. STUN takes this into account too. The STUN client (the VoIP switch) sends a UDP packet to the STUN server from the port it wishes to use for the VoIP call. On the way out this is NATted to the public IP and a new port number. The STUN server sends this information back, allowing the VoIP switch to learn its public IP and the mapped (modified) external port for the voice traffic. Now we have all the info we need to put the correct details in the SIP data. The local switch contacts the remote switch via SIP and tells it to send the UDP voice stream to its public IP and public port. When this data comes in, the NAT has a mapping for it in the NAT table and passes it to the internal VoIP switch.

This is how I thought it should work... Have you found what is wrong with this yet? I was stuck on this for a while...
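For the curious, the STUN exchange itself is small. The sketch below builds a binding request and decodes the XOR-MAPPED-ADDRESS attribute from the reply, following my reading of the STUN message layout in RFC 5389. The function names are my own, `discover_mapping` needs the address of a STUN server that you supply, and a real client should also check the transaction ID and cope with older servers that only return MAPPED-ADDRESS.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020


def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, length 0, magic cookie, 12-byte transaction id."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_xor_mapped_address(response):
    """Return (public_ip, public_port) from a binding success response."""
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN binding success response")
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and value[1] == 0x01:  # 0x01 = IPv4
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    raise ValueError("no IPv4 XOR-MAPPED-ADDRESS attribute found")


def discover_mapping(local_port, server):
    """Send the request FROM the port we want to use for voice.

    `server` is a (host, port) tuple for a STUN server of your choosing.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", local_port))
        sock.settimeout(3)
        sock.sendto(build_binding_request(), server)
        data, _ = sock.recvfrom(2048)
    return parse_xor_mapped_address(data)
```

Calling `discover_mapping(49152, (host, 3478))` with a reachable STUN server should return the public IP and port that the NAT assigned to local port 49152.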

The reason I was stuck was not a lack of understanding of the technologies (honest), it was the stupid documentation (from Avaya) I had on setting up SIP and my confidence that it was right. I checked everything again and found I had done everything correctly, then it hit me... I thought "Hold on, when the UDP voice packets start coming in ON A RANDOM port, how do they get through the NAT device when the only port forwarding I have is 5060 for SIP???"

I misled you a bit above, on purpose, to see if you could spot it yourself. I said there was a mapping for the incoming UDP traffic in the NAT table, but there isn't. You, like me, may have assumed there was because you don't have to forward any other ports. The only way traffic can come into your network through a NAT without port forwarding is if it answers an outbound connection. The outbound connection adds the entry to the NAT table that maps incoming packets on this port to the internal client. This added to my confusion. The documentation clearly states you only need to port forward 5060, but the voice calls use random UDP ports, so how do these get past the NAT? If you are still confused, it will be because you don't understand (or have forgotten) one fundamental difference between UDP and TCP, which is very important for us here.

TCP requires that one endpoint first establishes a connection before data can be sent back. As we know, you have inbound and outbound connections. If I am making an outbound connection, it is an inbound connection at the other end. An inbound connection requires port forwarding, which we don't have set up in this scenario. Also, for data to be sent back the socket MUST BE ESTABLISHED. This is very important, as it is not a requirement of UDP. UDP is connectionless, remember (see The Differences Between TCP and UDP for more info). It can send data without any confirmation that the remote end is there. It is this key difference between TCP and UDP that allows you to traverse a NAT using UDP without port forwarding. The technique is called UDP hole punching.
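You can see this difference on your own machine. The sketch below sends to a local port that nothing is listening on, first with UDP and then with TCP. The helper names are my own and everything stays on 127.0.0.1; I'd expect the UDP send to go through without complaint and the TCP attempt to be refused before any data leaves.

```python
import socket


def unused_port(kind):
    """Ask the OS for a free port, then release it so nothing is listening."""
    with socket.socket(socket.AF_INET, kind) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def udp_send(port):
    # No connection needed: the datagram is simply handed to the network.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        return s.sendto(b"voice", ("127.0.0.1", port))


def tcp_send(port):
    # The connection must be established before send() is ever reached.
    with socket.create_connection(("127.0.0.1", port), timeout=2) as s:
        return s.send(b"voice")


sent = udp_send(unused_port(socket.SOCK_DGRAM))
print("UDP: sent", sent, "bytes with nobody listening")

try:
    tcp_send(unused_port(socket.SOCK_STREAM))
except OSError as exc:
    print("TCP: failed before any data was sent:", exc)
```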

UDP Hole Punching

Let's put all the technologies so far together to get a working solution. The two VoIP switches each learn their own public IP and port from the STUN server. They then use SIP on port 5060 to send this information to each other, and finally they use UDP hole punching to deliver the VoIP packets.

UDP hole punching is a clever technique. It works by "punching" holes through the NAT device to create the NAT mappings. The local VoIP switch sends UDP packets to the public IP and port it was given in the SIP data. When these hit the NAT device at the remote location they are not delivered, because there is no port forwarding in place and the remote side has not yet sent anything out to that address. The same thing happens in the other direction if the remote switch starts sending before a mapping exists at your end. The purpose of these first packets, though, is not to be delivered. It is to "punch" a hole through the sender's own NAT: each outbound packet creates a mapping from the external IP and port to the internal IP and port, which then allows incoming traffic on that port. As this happens at both ends, both NAT devices now have mappings for these ports to the internal clients, treat the incoming voice packets as replies to outbound traffic, and let them through.

So in summary, the opening packets are lost, but they "punch" holes through the NAT that allow all subsequent traffic to pass. This is why you don't need to port forward these ports when using UDP.

This simple form of the technique relies on UDP, because UDP doesn't guarantee or even check that packets arrive. When the first packet fails it doesn't matter, because the sender never learns that it failed (UDP has no acknowledgements); it just sends more UDP packets. An ordinary TCP connection can't do this, because TCP must complete its handshake before it sends any data. Since the initial packet is dropped, the client just keeps retrying the handshake, the connection is never established, and no data is sent.
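Here is a toy simulation of hole punching. `RestrictedNat` is a made-up model, not a real device: it lets a packet in only if the inside host has already sent a packet out to that exact address and port. To keep it short, the public ports (40000 and 50000) are assumed to be the ones already exchanged via STUN and SIP, and the IPs are documentation-range examples.

```python
class RestrictedNat:
    """Toy NAT: inbound is allowed only from an address we already sent to."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.holes = set()  # (our public port, remote ip, remote port)

    def outbound(self, public_port, remote):
        # Sending out "punches the hole": remember who we sent to.
        self.holes.add((public_port, remote[0], remote[1]))

    def inbound(self, public_port, source):
        return (public_port, source[0], source[1]) in self.holes


def send(src_nat, src_port, dst_nat, dst_port):
    """Send one packet; return True if the destination NAT lets it in."""
    src_nat.outbound(src_port, (dst_nat.public_ip, dst_port))
    return dst_nat.inbound(dst_port, (src_nat.public_ip, src_port))


local = RestrictedNat("198.51.100.1")
remote = RestrictedNat("203.0.113.1")

results = [
    send(local, 40000, remote, 50000),   # remote has not sent yet: dropped
    send(remote, 50000, local, 40000),   # local already punched its hole: delivered
    send(local, 40000, remote, 50000),   # remote has now punched too: delivered
]
print(results)
```

In this model the very first packet is dropped at the remote NAT, but it opens the hole in the local NAT, so everything after that gets through in both directions.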

So Why Does The Thing Still Fail??

OK, sorry for the long post, but I am a big believer that the best way to learn is for the teacher (me, ha) to lead you down the path so that you solve it yourself. This is the last bit now, I promise.

If you never knew about UDP hole punching then you would naturally think that you need to open ports to allow the UDP traffic through. This would explain why you get no voice at all. But what about one way traffic? This means that the port is open at one end and not the other. How is it possible to have UDP hole punching working at one end and not at the other when both NAT devices are configured the same?

In all likelihood you have different types of NAT at each site. To complicate things more NAT isn't standardised and there are various implementations of it. In an ideal world the documentation I read about setting up SIP would be correct because UDP hole punching would take care of the port forwarding of the UDP traffic. But as we often find out this is never the case...

It gets complicated and I am not going to re-invent the wheel. What you need to find out is what type of NAT device you have. It is probably a symmetric NAT, as this is the type that is incompatible with STUN. Yes, this is the problem!! STUN doesn't work with a symmetric NAT, and here is why.

All the other types of NAT allow traffic from different IPs to come back into the network as long as it arrives on the mapped port, regardless of where I sent my packets to. So if I contact the STUN server to learn the external IP and port to use for VoIP, this mapping now exists. A DIFFERENT IP can send packets to me as long as it uses the same port my UDP packets went out on. In other words, once a mapping has been created and linked to the internal client, it will accept traffic from any IP as long as it is on this port.

This is not allowed in a symmetric NAT. An outbound packet sent to a specific IP and port only allows packets back in from that IP and port. So, we do the same as above and contact the STUN server to get our public IP and port. This info is sent to the remote VoIP switch via SIP. It now tries to send data back to your local VoIP switch on this port, but because it comes from a different IP a symmetric NAT blocks it. This NAT mapping is exclusive to the STUN server. To allow data to come in from the remote VoIP switch, which has a different IP, a new mapping must be created, and that mapping uses a different port... As you can see, this is a problem, because the port used for the actual UDP voice call is different from the one the STUN server detected. Because the ports are dynamic and STUN can't report the right one, your local VoIP switch can never learn which external port will be used for the traffic to and from the remote VoIP switch.
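The difference is easy to show with two toy NAT models. Both classes below are invented for illustration, as are the addresses: the cone-style NAT reuses one public port per internal socket, while the symmetric NAT creates a new mapping (and so a new public port) for every destination.

```python
import itertools


class ConeNat:
    """Toy NAT: one public port per internal socket, whatever the destination."""

    def __init__(self):
        self._next_port = itertools.count(40000)
        self.mappings = {}

    def _key(self, internal, destination):
        return internal

    def public_port(self, internal, destination):
        key = self._key(internal, destination)
        if key not in self.mappings:
            self.mappings[key] = next(self._next_port)
        return self.mappings[key]


class SymmetricNat(ConeNat):
    """Toy NAT: a separate mapping for every (internal socket, destination) pair."""

    def _key(self, internal, destination):
        return (internal, destination)


internal = ("192.168.1.10", 49152)      # assumed private address of the switch
stun_server = ("198.51.100.7", 3478)    # assumed STUN server address
remote_switch = ("203.0.113.1", 50000)  # assumed public address of the far end

for nat in (ConeNat(), SymmetricNat()):
    advertised = nat.public_port(internal, stun_server)  # what STUN reports
    actual = nat.public_port(internal, remote_switch)    # what the call really uses
    print(type(nat).__name__, advertised, actual, "match" if advertised == actual else "MISMATCH")
```

With the cone NAT the port STUN reports is the port the call uses. With the symmetric NAT they differ, so the port advertised in SIP is the wrong one.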

This is why you get one-way traffic in some scenarios. If both NAT devices are non-symmetric NATs, both sides get the correct information through STUN and voice flows both ways OK. If one device is symmetric and the other is non-symmetric, only one of them can get the correct info through STUN, so data can pass in one direction only, producing the one-way audio. If both are symmetric you can't hear anything at all, because traffic can't get through either NAT device.
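The three cases can be written down as a tiny decision table. This follows the simplified model in this article (a site can receive voice only if STUN gave it the right port), so treat it as a summary of the reasoning rather than a diagnostic tool. The function name is my own.

```python
def audio_outcome(local_symmetric, remote_symmetric):
    """Predict call audio from whether each site sits behind a symmetric NAT."""
    # A site receives voice only if the port it advertised via SIP was correct,
    # which in this model means its NAT is not symmetric.
    local_hears = not local_symmetric
    remote_hears = not remote_symmetric
    if local_hears and remote_hears:
        return "two-way audio"
    if local_hears or remote_hears:
        return "one-way audio"
    return "no audio"


for local in (False, True):
    for remote in (False, True):
        print(f"local symmetric={local}, remote symmetric={remote}: "
              f"{audio_outcome(local, remote)}")
```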

So How Do I Fix It!?!?

Buy a new NAT device! One that isn't a symmetric one!!

Replacing your NAT device is one solution, but the other is far simpler than you might think. All you need to do is the following:

    * On your phone switch (Avaya in my case) reduce the dynamic port range. How many VoIP calls do you think you will have going at any one time, max? For most of you reading this it will be 10 at a guess, maybe 20. In my case the range was 49152 to 53246, so I reduced the upper limit to 49162, giving me about 10 ports.
    * On your NAT device set up port forwarding for the 10 ports to your VoIP switch.
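The two steps above can be sketched as a list of forwarding rules. Every router has its own syntax for this, so the sketch only produces a generic description of each rule. The function name, the dictionary keys and the switch address 192.168.1.10 are assumptions for illustration.

```python
def forwarding_rules(switch_ip, first_port, last_port):
    """One UDP rule per voice port, external port mapped to the same internal port."""
    return [
        {
            "protocol": "udp",
            "external_port": port,
            "internal_ip": switch_ip,
            "internal_port": port,
        }
        for port in range(first_port, last_port + 1)
    ]


# 49152 to 49162 is the reduced range from step one (11 ports counting both ends).
rules = forwarding_rules("192.168.1.10", 49152, 49162)
for rule in rules:
    print(f"{rule['protocol']} {rule['external_port']} -> "
          f"{rule['internal_ip']}:{rule['internal_port']}")
```

The point to notice is that the external and internal port numbers are identical in every rule, so the ports the switch advertises in SIP are the ones that are open on the NAT device.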

The reason this works is that you are effectively mapping your external port numbers to the same internal port numbers (remember that, left to itself, NAT replaces port numbers with random ones). You now know that your VoIP switch will only use a range of about 10 ports, and that STUN will fail. This means the SIP information sent to the remote VoIP switch lists the internal ports and not the NATted ones. Your traffic still goes out on random ports (because it is NATted), but the remote VoIP switch sends back to ports in the range you configured on your local switch. There is no NAT mapping for this traffic, so it would normally be blocked, which is exactly why you set up port forwarding instead. Have Fun!