Do you use virtio-net on your VM?
Hi guys,
I am new to this forum. I was wondering if anyone got better performance by disabling hyper-threading. I will need to take down production routers to do this. We have a 9-core Cloud Core Router doing 5x more traffic, and its CPU usage is around 40-50% at peak. I have 2x CHR routers with 8 cores of a 2.6 GHz Xeon processor, doing around 15 Mbps of traffic per interface (WAN and LAN), and the CPU is sitting at around 15-25%, which I feel is very high for the CPU spec.
These are VMs running on ESXi 6.0.0. I have used VMXNET3 and the licence is P1; I won't need more than 1 Gbps. My firewall rules are limited to dropping invalid connections and anything that doesn't match my allow rules; there are 11 rules. No NAT rules, no mangle rules.
I am keen to find a solution to this issue; any input would be appreciated. I have attached some screenshots below:
![]()
I am not sure, but you can check it online. I use VirtualBox; when I use virtio-net, the CPU usage drops by 20%.
I don't see that setting, are you sure it is available in ESXi 6.0.0?
Re: I was wondering if anyone got better performance by disabling hyper threading
Disable hyper-threading on the physical computer in the BIOS.
That's the official Intel recommendation if virtualization is used. Hyper-Threading does more harm than good in this case.
A method to get more speed out of a very busy CHR router:
On the physical computer, in the BIOS, disable hyper-threading and set it for maximum performance.
Isn't that mainly because of security (Meltdown & co)?
No. This is from far earlier than that. It's about performance: this kind of workload runs better without Hyper-Threading.
Hey Tom,
Pirlet,
FYI - I assume you are running a MikroTik CHR (64-bit ROS).
Heads up - I don't think you need the "SCSI controller 0" in your configuration. If I am correct, the CHR does not even have SCSI drivers and the virtual CHR hard disk is actually IDE.
By removing the SCSI controller, you free up some resources and at least one interrupt.
Here is my configuration on my CHR, which happens to be one of my BGP routers:
CHR-NoScsi.png
I use only Intel Xeon processors.
Are you on an AMD platform or Intel platform?
Can I PM you so maybe we could compare configs? That would really help me out massively.
How much traffic and how many users do you run on a system like this? What do you do on it? QoS, firewall, NAT, BGP...?
Getting the most performance out of a CHR (in my case, two CHRs on a single physical computer with two 10-core Intel Xeon CPUs and Hyper-Threading disabled)
OK - so I have been blabbing quite a lot about CPU cache. Well, here is another trick to get even more out of a couple of CHRs running on the same physical box.
Use CPU affinity (specify which CPU cores a hosted CHR can use).
Info: I have a SuperMicro box with two 10-core Xeon processors (cores 0 through 19), Hyper-Threading disabled, and I am using VMware ESXi as my hypervisor.
CHR #1 - Set host CPU affinity to 1-9 (9 cores reserved on physical CPU1)
CHR #2 - Set host CPU affinity to 11-19 (9 cores reserved on physical CPU2)
CHR #1 does not share any CPU cache with CHR #2 (each running CHR has its own semi-private CPU cache and dedicated cores).
The result: both CHRs now run faster.
Note: CHR #2 might be a little faster because it is likely not sharing any CPU cache with the VMware ESXi host hypervisor operating system.
Note: CHR #1 might share some CPU cache with the host VMware ESXi hypervisor operating system.
Note: If you do not define CPU affinity, you can end up with some CPU cores sitting at almost zero percent usage while other cores run at 50 to 100 percent, handling multiple hosted VMs at the same time (swapping jobs).
One thing I have not tried yet is to define which CPU cores the VMware ESXi system itself can use for its own processes.
North Idaho Tom Jones
EDIT - additional info added
If you have more than one physical CPU in your hypervisor system (for example, two 10-core CPUs with 30 MB of CPU cache per Xeon CPU) and you run only a single hosted CHR with no other VMs running, you can also do this:
--> use CPU affinity 6-14 (the range spans both CPUs, so you end up with 60 MB of CPU cache on your single CHR)
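The affinity ranges above come down to a little arithmetic. Here is a minimal Python sketch of it, assuming cores are numbered contiguously per socket (0-9 on the first CPU, 10-19 on the second), as the post implies, and that one core per socket is left for the hypervisor. The function names `affinity_plan` and `sockets_spanned` are made up for illustration; they are not part of ESXi or RouterOS.

```python
from typing import Dict, List


def affinity_plan(sockets: int, cores_per_socket: int,
                  reserved_per_socket: int = 1) -> Dict[int, str]:
    """Return one affinity range per socket.

    The first `reserved_per_socket` cores of every socket are skipped so
    they stay free for the hypervisor (an assumption, not an ESXi rule).
    """
    plan = {}
    for socket in range(sockets):
        first = socket * cores_per_socket + reserved_per_socket
        last = (socket + 1) * cores_per_socket - 1
        plan[socket] = f"{first}-{last}"
    return plan


def sockets_spanned(first_core: int, last_core: int,
                    cores_per_socket: int) -> List[int]:
    """List the sockets touched by an inclusive core range."""
    return sorted({core // cores_per_socket
                   for core in range(first_core, last_core + 1)})


if __name__ == "__main__":
    # Two 10-core CPUs, one VM per socket.
    print(affinity_plan(sockets=2, cores_per_socket=10))
    # A single VM pinned to cores 6-14 touches both sockets.
    print(sockets_spanned(6, 14, cores_per_socket=10))
```

Under these assumptions the plan for two 10-core sockets is 1-9 and 11-19, and a 6-14 range touches both sockets, which matches the post's point that a single CHR pinned that way can draw on the cache of both CPUs.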
Re: How much traffic/users do you run on system like this & BGP
I have a similar opinion: single-core performance is critical
I'm on my way to switching to CHR too. I'm still testing, and what I have learned so far is that single-thread performance is a big factor in bandwidth and convergence. I did a small bandwidth test on Hyper-V on my 2600X workstation and got around 40 Gbps. So I'm looking at the Intel 8600K, which has the best price-to-single-thread-performance ratio. I ordered it today, so I will test it and post the results as soon as I get it. Second, I found a MUM document on Google saying that you get more performance with Hyper-V than with ESXi or Proxmox KVM. https://mum.mikrotik.com/presentations/ ... 562405.pdf
CHR supports PVSCSI
Hi,
What is the brand/model of the server? It happened to me once that the PCIe slot was x8, but in the specs I found out it was wired as x4, so the card was not able to use its full bandwidth.
Your CPU has 4 CCDs, 8 CCXs and two active cores per CCX. ( https://en.wikipedia.org/wiki/Epyc#Seco ... pyc_(Rome) )
Bandwidth test to 127.0.0.1 is giving about 180 Gbps with 14 vCPUs; strangely enough, with only 2 vCPUs I get 800 Gbps.
Hi Paternot,
I have 4 ideas about this.
1) With just two vCPUs, the OS puts the whole VM on the same CCX. This allows the vCPUs to share L3 cache and speeds up the communication.
2) The bandwidth test will use a fair chunk of the CPU just to manage everything. Maybe give the VM just 12 cores and leave the rest for the host server?
3) If I'm not mistaken, the Epyc CPU uses 4 memory channels. You do have 4 (or 8) memory sticks (and in the right slots), don't you?
4) Have you looked at the CPU usage while packets are being dropped, just to make sure the machine is CPU bound?
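To make idea 1 concrete, here is a rough Python sketch of the CCX arithmetic. It takes the figures from the post (4 CCDs, 8 CCXs, two active cores per CCX) and assumes, purely for illustration, that the host numbers its cores sequentially CCX by CCX. Real core numbering depends on the host and firmware, so treat the mapping as hypothetical and check the actual topology on the machine.

```python
from collections import defaultdict
from typing import Dict, List

CORES_PER_CCX = 2  # from the post: two active cores per CCX
CCX_PER_CCD = 2    # from the post: 8 CCXs spread over 4 CCDs


def ccx_of(core: int) -> int:
    """CCX index of a core, assuming sequential core numbering."""
    return core // CORES_PER_CCX


def ccd_of(core: int) -> int:
    """CCD index of a core, assuming sequential core numbering."""
    return ccx_of(core) // CCX_PER_CCD


def cores_by_ccx(total_cores: int) -> Dict[int, List[int]]:
    """Group core numbers by the CCX (shared L3 slice) they belong to."""
    groups: Dict[int, List[int]] = defaultdict(list)
    for core in range(total_cores):
        groups[ccx_of(core)].append(core)
    return dict(groups)


def min_ccx_needed(vcpus: int) -> int:
    """Smallest number of CCXs a VM with `vcpus` pinned cores must span."""
    return -(-vcpus // CORES_PER_CCX)  # ceiling division


if __name__ == "__main__":
    print(cores_by_ccx(16))
    print(min_ccx_needed(2), min_ccx_needed(14))
```

With two cores per CCX, a 2-vCPU VM can fit inside a single CCX and share one L3 slice, while a 14-vCPU VM has to span at least seven of them, so traffic between its vCPUs crosses CCX boundaries.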
Have you tried to pass traffic through the host server, to check whether it's just the VM or both that are losing packets? Have you pinned the VM CPUs to the host CPU cores?
We have 4 memory sticks, and I have checked: no CPU core on the host is maxing out during production traffic, yet we still see packet loss.
I have not seen how it's possible to pin CPU cores on a VM with Proxmox.
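For what it's worth, a Proxmox VM runs as an ordinary QEMU/KVM process on a Linux host, so one generic approach is to set the CPU affinity of that process from the host. Below is a minimal Python sketch of the idea using the standard library's `os.sched_setaffinity`. It is not a Proxmox feature, and the helper names `parse_cpu_list` and `pin_process` are made up here. Finding the PID of the VM's QEMU process is left out, the script would need root, and the pinning does not survive a VM restart.

```python
import os
from typing import List, Set


def parse_cpu_list(spec: str) -> Set[int]:
    """Turn a string such as "0-3,8" into {0, 1, 2, 3, 8}."""
    cpus: Set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        elif part:
            cpus.add(int(part))
    return cpus


def pin_process(pid: int, cpus: Set[int]) -> List[int]:
    """Pin every thread of `pid` to `cpus` (Linux only).

    Returns the thread IDs that were pinned. Threads that exit while
    we iterate are skipped.
    """
    pinned = []
    for entry in os.listdir(f"/proc/{pid}/task"):
        tid = int(entry)
        try:
            os.sched_setaffinity(tid, cpus)
        except ProcessLookupError:
            continue  # thread ended between listing and pinning
        pinned.append(tid)
    return pinned


if __name__ == "__main__":
    # Harmless demo: re-apply this script's own current affinity.
    current = os.sched_getaffinity(0)
    print(pin_process(os.getpid(), current))
```

In practice you would call `pin_process(qemu_pid, parse_cpu_list("0-7"))`, where `qemu_pid` and the core list are placeholders for your own values. Each vCPU is a thread of the QEMU process, which is why the sketch walks `/proc/<pid>/task` instead of pinning only the main thread.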
Are you sure the packet loss is not caused by lag on the Internet side? Sometimes that happens.
Hi,