ClearFoundation


Taking QoS For A Spin

Overview

Published: Feb 1, 2013

For the hackers out there, the new ClearOS QoS engine is ready for testing!

A few years ago, the old Bandwidth Manager was introduced as a way to manage Internet bandwidth usage for a local network. The tool allowed administrators to reserve enough bandwidth for high priority traffic, and shuffle low priority traffic into a small slice of the pipe. It works, but it only solves half the problem and arguably the less important half. The other half is QoS (quality of service) – a way to prioritize network traffic that needs to move quickly. VoIP is probably the most important service that benefits from QoS, but terminal services and other traffic can benefit from QoS as well.

If Netflix, YouTube, and (pick your audio streaming service) are causing hiccups with your VoIP system, then ClearOS QoS will be your new best friend.

Ready for Testing

The instructions below have been updated since the original post.

The first thing you will need to know is the actual upload and download speeds of your Internet connection. What your ISP tells you is not always (ahem) accurate, so use one of the web-based speed test tools to verify your speeds if necessary. Once you know your Internet speed, go ahead and install the QoS app from Marketplace or via the command line:

yum install app-qos

Go to the QoS web management interface to configure your upload and download speeds.

The Bandwidth Manager and QoS apps do not play well together and only one or the other can be enabled for now. If you are using the existing Bandwidth Manager, you will need to disable it via the web interface, or on the command line – in /etc/clearos/bandwidth.conf, set the following:

BANDWIDTH_QOS="off"
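For those who prefer to stay on the command line end to end, the same change can be scripted with sed. This is a sketch: it assumes the stock single-line BANDWIDTH_QOS entry, and it operates on a temporary copy here so it is safe to try anywhere; on a real ClearOS box, point CONF at /etc/clearos/bandwidth.conf.

```shell
# Sketch: flip BANDWIDTH_QOS to "off" with sed. A temporary copy is used
# here; on a real ClearOS system, set CONF=/etc/clearos/bandwidth.conf.
CONF=$(mktemp)
echo 'BANDWIDTH_QOS="on"' > "$CONF"

# Replace the whole BANDWIDTH_QOS line in place.
sed -i 's/^BANDWIDTH_QOS=.*/BANDWIDTH_QOS="off"/' "$CONF"

cat "$CONF"   # BANDWIDTH_QOS="off"
rm -f "$CONF"
```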

QoS in Action

The following shows results from a handy command line tool called mtr, a network diagnostic tool that is helpful for tracking down latency on the network. One word of caution: mtr uses ICMP (ping) to collect its data, and some devices out there might be purposefully lowering the priority of this type of traffic. However, the most important thing about this exercise is not the absolute value of the measured latency, but the relative values shown in the different scenarios.

Quiet Network

To get a good baseline, try to find a relatively quiet moment on the network. Run the mtr command specifying a remote host. In the example below, the Google public DNS server is used: 8.8.8.8.

 # mtr 8.8.8.8
 HOST: 192.168.4.1                 Loss%   Snt   Last   Avg  Best  Wrst StDev
   1. 10.126.12.129                 0.0%    25    7.1   7.3   6.0  13.9   1.5
   2. 67.231.220.197                0.0%    25    7.4  10.0   7.0  11.9   1.4
   3. yorkmills1.cable.teksavvy.co  0.0%    25    7.8  12.5   6.7  92.9  17.7
   4. yorkmills7.cable.teksavvy.co  0.0%    25    9.0  13.1   7.2  58.2  11.3
   5. 72.14.212.134                 0.0%    25    8.2  12.4   7.1  47.7  11.1
   6. 209.85.255.232                0.0%    25   10.7   8.7   7.1  11.5   1.1
   7. 72.14.236.226                 0.0%    25   27.1  27.1  26.1  28.4   0.6
   8. 209.85.249.11                 0.0%    25   33.1  32.3  31.1  34.3   0.8
   9. 72.14.238.16                  0.0%    25   33.0  33.9  32.0  38.5   1.6
  10. 216.239.49.145                0.0%    25   35.0  37.9  31.9  45.9   4.8
  11. google-public-dns-a.google.c  0.0%    25   33.0  34.6  29.0  59.1   5.7

On my quiet network, the average latency was in the mid-30 millisecond range and the worst was just shy of 60 milliseconds. According to the experts, VoIP calls start getting annoying at 250 ms, and 150 ms is the VoIP standard that is thrown around the most. So far, so good.
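Those guidelines can be checked mechanically. The short awk sketch below flags any hop in an mtr report whose worst-case latency (the Wrst column, field 8) exceeds the 150 ms VoIP figure; the two sample lines are copied from the quiet-network and file-upload runs in this article.

```shell
# Flag hops whose worst-case latency (Wrst, field 8) exceeds the 150 ms
# VoIP guideline. Sample lines are from the mtr runs shown in this article.
awk '$8 + 0 > 150 { print "hop", $1, "worst", $8, "ms -- over the 150 ms VoIP guideline" }' <<'EOF'
11. google-public-dns-a.google.c  0.0%    25   33.0  34.6  29.0  59.1   5.7
11. google-public-dns-a.google.c  0.0%    25   64.1 897.9  64.1 3807. 1115.4
EOF
```

Only the second line (from the saturated-upload run) trips the check; the quiet-network hop sails through.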

Uploading a File

Though the download speed on my connection here is 25 Mbps, only 1 Mbps is available for upload. This test demonstrates that saturating the upstream bandwidth with a simple file upload absolutely destroys latency times on this connection.

HOST: 192.168.4.1                 Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. 10.126.12.129                 0.0%    25  212.5 830.8  36.9 3691. 1115.2
  2. 67.231.220.197                0.0%    25  127.4 854.1  41.8 3661. 1141.2
  3. yorkmills1.cable.teksavvy.co  0.0%    25   37.5 867.1  37.5 3762. 1180.9
  4. yorkmills7.cable.teksavvy.co  0.0%    25  141.4 821.9  43.1 3799. 1185.2
  5. 72.14.212.134                 0.0%    25   43.3 796.9  38.7 3972. 1191.5
  6. 209.85.255.232                0.0%    25  160.9 861.7  47.5 4101. 1203.0
  7. 72.14.236.226                 0.0%    25   92.5 829.9  59.5 4168. 1166.6
  8. 209.85.249.11                 0.0%    25  332.5 855.4  61.9 4077. 1133.9
  9. 72.14.238.16                  0.0%    25  244.1 890.5  63.5 3988. 1102.3
 10. 72.14.232.21                  0.0%    25  155.3 846.7  50.1 3900. 1085.5
 11. google-public-dns-a.google.c  0.0%    25   64.1 897.9  64.1 3807. 1115.4

The average latency was nearly 900 ms, while the worst reached over 4 seconds. Wah? Really? That's brutal. A VoIP call would have no chance. Now it is time to run the same test with the QoS engine enabled. In /etc/clearos/qos.conf, set the following parameter:

QOS_ENABLE="on"

And restart the firewall:

service firewall restart

Running the same file upload test, the results improve dramatically:

HOST: 192.168.4.1                 Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. 10.126.12.129                 0.0%    25    6.0  14.4   6.0  97.1  18.3
  2. 67.231.220.197                0.0%    25   22.5  16.1   7.6  33.9   8.1
  3. yorkmills1.cable.teksavvy.co  0.0%    25    6.8  16.4   6.8  55.4  13.0
  4. yorkmills7.cable.teksavvy.co  0.0%    25   17.0  14.0   6.9  57.2  12.6
  5. 72.14.212.134                 0.0%    25    7.5  16.7   7.3  43.7  11.2
  6. 209.85.255.232                0.0%    25    8.2  23.9   6.2 262.4  50.3
  7. 72.14.236.226                 0.0%    25   27.1  38.1  26.2 203.1  35.3
  8. 209.85.249.11                 0.0%    25   35.2  40.7  31.2 104.8  16.1
  9. 72.14.238.16                  0.0%    25   32.0  37.6  32.0  50.2   5.3
 10. 216.239.49.145                0.0%    25   34.0  41.8  32.1  74.0  10.5
 11. google-public-dns-a.google.c  0.0%    25   32.3  44.1  32.2 213.1  35.8

The average is excellent at just north of 40 ms. The worst ping time is a bit of a concern, but certainly a huge improvement over 4 seconds.

IMAP Client

With the file upload test under my belt, it was time to test a real world problem that we had in our old Toronto office. We always had trouble with our VoIP system when the Thunderbird IMAP client was busy doing something with the IMAP server - synchronizing mail, deleting messages, etc. The following data was from a previous test on a slower network, so these should not really be compared with the other results here. For the record, this was done using the Evolution mail client (not Thunderbird).

HOST: 192.168.4.1                 Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. ???                          100.0    25    0.0   0.0   0.0   0.0   0.0
  2. ???                          100.0    25    0.0   0.0   0.0   0.0   0.0
  3. gw-primus.torontointernetxch  0.0%    25  466.8 324.1  68.7 929.2 242.5
  4. gw-google.torontointernetxch  0.0%    25  382.8 316.7 100.0 878.1 249.6
  5. 216.239.47.114                0.0%    25  301.5 282.1  61.5 883.0 244.2
  6. 72.14.236.224                 0.0%    25  276.4 285.7  86.8 889.9 243.3
  7. 72.14.239.93                  0.0%    25  285.4 326.0 115.4 874.7 255.5
  8. 72.14.238.18                  0.0%    25  258.4 319.8 115.2 908.5 246.1
  9. 72.14.232.21                  4.0%    25  814.4 343.1 116.8 908.4 231.3
 10. google-public-dns-a.google.c  4.0%    25  725.3 318.9 132.6 867.5 211.4

Latency was 300+ ms, with worst-case performance nearing a full second. This was caused by a single IMAP client which was not even using all that much bandwidth. Ouch. Sure enough, with the QoS engine enabled, the latency problem went away. What is interesting here is that latency times suffered even though there was significant bandwidth available.

Downloads and Streaming

With the file upload and noisy IMAP client tests under my belt, I moved on to downloads. If you are able to saturate your download speed with a few video streams and/or downloads, you will see that latency times start to suffer. I did not capture the data for this test, but the QoS engine certainly improved the situation just like the previous two scenarios.

BitTorrent - Doh

There is one very big gotcha - BitTorrent. Though there was a mild improvement using the QoS engine, BitTorrent still managed to stretch latency to nearly a full second.

A gateway running QoS will see all sorts of traffic going through it. At a fundamental level, a QoS engine has to figure out which network packets are important and which network packets can chill out for a bit. Standard port numbers (80 for web traffic, 22 for SSH, etc.) help with identification, but other protocols like BitTorrent are hard to identify.
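To make the port-based approach concrete, here is a generic sketch of prioritization with the Linux tc tool. To be clear, this is not the actual ClearOS QoS implementation: the interface name, rates, and class layout are illustrative assumptions based on the 1 Mbps uplink used in the tests above.

```shell
# Generic port-based QoS sketch with tc (HTB) -- NOT the ClearOS
# implementation. Assumes interface eth0 and a ~1 Mbps uplink.

# HTB root: cap traffic just under the real uplink speed so queuing
# happens here, where we control it, rather than in the modem's buffer.
tc qdisc add dev eth0 root handle 1: htb default 30
tc class add dev eth0 parent 1: classid 1:1 htb rate 950kbit

# A high-priority class for interactive traffic and a default class
# for everything else; both may borrow up to the full uplink.
tc class add dev eth0 parent 1:1 classid 1:10 htb rate 400kbit ceil 950kbit prio 0
tc class add dev eth0 parent 1:1 classid 1:30 htb rate 550kbit ceil 950kbit prio 1

# Classify by well-known port: steer SSH (port 22) into the fast class.
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dport 22 0xffff flowid 1:10
```

This works well for protocols that live on predictable ports, which is exactly why BitTorrent, with its randomized ports and encrypted streams, slips past this kind of classification.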

In the early part of 2013, work will begin on improving and integrating the l7-filter (Layer 7 Filter) engine. This will help with identifying traffic, which in turn will improve the QoS engine.

But that's an article for another day.

Please feel free to post any comments or feedback in the forums.

- Peter Baldwin (“Pinky”)

- Darryl Sokoloski (“The Brain” behind QoS)



Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 3.0 Unported
Video demonstrations - Copyright © 2010 ClearCenter Corporation