vSphere: Working with traffic filtering in the vNetwork Distributed Switch

Introduction

Within a physical and virtual infrastructure there are several options to limit the inbound and outbound traffic from and to a network node, a part of the network, or an entire network (security zone). Such a limit can be filtering (allowing or dropping certain traffic) or prioritization of traffic (QoS / DSCP tagging of the data), where one type of traffic is limited in favour of traffic with a higher priority.

Options include filtering with ACLs, tagging and handling classes of traffic with QoS / DSCP devices, firewalling (physical or virtual appliances), physical or logical separation, or Private VLANs (PVLANs for short). Furthermore, an often overlooked point: keep all your layers in view when designing the required security. Say you need to filter traffic from a specific data source to a specific group of hosts, with the requirement that those VMs are not allowed to see or influence the other hosts. Traffic filters set up on the physical network layer will not always be able to “see” that traffic: blade servers in certain blade chassis can access the same trunked switch ports / VLANs, and VMs on the same port group / VLAN can reach each other’s network without the traffic ever reaching, or being redirected to, the physical network infrastructure where those filters are in place. That is, when not using a local firewall in the OS. You could say this is bad design, but I have seen these described “flaws” pop up a little too often.

Options in the VMware virtual infrastructure

You have the option to use third-party virtual appliances as firewalls, vCloud Suite components, or network virtualization via NSX (SDN), for example. These are not always implemented, due to constraints you hear around, like: the overhead of traffic handled by the virtual firewall (sizing), a single point of failure when using just one appliance, added complexity for certain IT ops teams where networking and virtualization are strictly separated (bad bad bad), or simply no budget or intention to implement a solution that goes further than the host virtualization the organization is at (as they probably just started). These are just a few, and not all of them are valid in my opinion…

From vSphere 5.5 onwards there is another, mostly unknown and therefore rarely used, option: the traffic filtering and tagging engine in the vNetwork Distributed Switch (vDS or dvSwitch). That is, when you have an Enterprise Plus edition, but hey, without that a vDS is not available in the first place. Traffic filtering was introduced in version 5.5 and can therefore only be implemented on vSphere 5.5+ hosts that are members of a 5.5+ version of the vDS. This vDS option is the one I want to show you in this blog post.

Traffic filters, or ACLs, control which network traffic is allowed to enter or return (ingress and/or egress rules) from a VM, a group of VMs, or a network, via the port group or an uplink (vmnic). The filters are configured on the uplink or port group, and an unlimited number of rules can be set at this level. They handle the traffic from VM to port group and/or the traffic from port group to the physical uplink port, and vice versa. The rules are processed in the VMkernel, which is fast, and no external appliance is needed. For outgoing traffic, rule processing happens before the traffic leaves the vSphere host, which can also save on ACLs in the physical layer and on network traffic when only certain traffic types or specific destinations are allowed.
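The rules themselves are defined in the Web Client on the vDS, but since they are enforced per host in the VMkernel, it can be handy to see the host-side view of the distributed switch and its uplinks. A quick check from the ESXi shell (not required for the filtering itself, just for orientation):

# List the VMware distributed switches this host is a member of,
# including the uplink vmnics attached to them:
esxcli network vswitch dvs vmware list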

With the traffic filter we have the option to set allow or drop rules (for ACL) based on the following qualifiers:

vDS - image1

The tag action allows tagging of the traffic (QoS / DSCP values). For this example we don’t use the tag action.

System traffic covers the vSphere traffic types you will likely see around, where we can allow a certain type of traffic to a specific network. MAC lets us filter on layer 2, on specific source and/or destination MAC addresses or VLAN IDs. IP lets us filter on layer 3, on the IP traffic types TCP, UDP and ICMP for IPv4 and IPv6.

The following system traffic types are predefined:

vDS - image2

Make it so, number One

I will demonstrate the filtering option by creating a vDS and adding an ESXi host and a VM to this configuration. Just a simple setup to get the concept across.

My test lab vDS is set up with a VM as in this screenshot:

vDS - image3

I have a DSwitch-Testlab vD switch with a dvPortgroup VM-DvS (tsk tsk, I made a typo and am therefore not consistent with the casing, please don’t follow this example ;-)). A VM, Windows Server 2012 – SRDS, is connected to this port group.

The VM details are as follows:

vDS - image4

The IP address we will be looking at is 192.168.243.165.

At the VM-DvS port group, going to the Manage tab, we can choose Policies. When we push the Edit button we can add or change the traffic filtering (just look for the clever name).

vDS - image5
vDS - image6

As you can see, I have already created an IP ICMP rule whose action currently says the complete opposite of the rule name. This is on purpose, to show the effect when I change this action. When I ping the VM from a network outside of the ESXi host, I get a nice ICMP response:

vDS - image7

When we change the ICMP rule to the drop action, we get the following response:

vDS - image8

 

That’s what we want from the action. Other protocols are still available as there are no other rules yet; I can open an RDP session to this Windows Server.

vDS - image9
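For reference, the checks from a client outside the host boil down to the following (a sketch in Linux client syntax; the IP is the lab VM’s address from earlier, and the netcat probe of TCP 3389 is just one way to confirm RDP still answers):

# With the ICMP rule action set to Allow, replies come back:
ping -c 4 192.168.243.165

# With the rule action changed to Drop, the same ping times out,
# because ICMP is now filtered at the port group:
ping -c 4 192.168.243.165

# Other traffic is untouched as long as no further rules exist,
# for example a quick probe of the RDP port:
nc -zv 192.168.243.165 3389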

When you want to allow certain traffic and not other traffic, you will have to create several rules. The network traffic rules are applied in a strict order (which you can arrange). If a packet already satisfies a rule, it might not be passed to the next rule in the policy. If, for example, an “allow RDP to this VM” rule sits above a “drop all IP traffic” rule, RDP traffic gets through while everything else is dropped; with the order reversed, the drop rule matches first and RDP is blocked as well. This concept does not differ from filtering on most physical network devices. Document and draw out your rules and traffic flows carefully, or implementation and troubleshooting will be a pain in the $$.

This concludes my simple demonstration.

 – Enjoy!

Sources: vmware.com

Learned Lessons – Nexus 1000V and the manual VEM installation

In an implementation project we deployed 24 ESXi hosts and used the Nexus 1000V to have a consistent network configuration, feature set and provisioning throughout the physical and the virtual infrastructure. When I tried to add the last host to the Nexus, it failed on me with the InstallerApp (the Cisco-provided Java installer that installs the VEM and adds the ESXi host to the configured DVS and groups). Another option is to use Update Manager (the Nexus VEM is delivered as an appliance that can be installed and updated by Update Manager), but that one threw an error code at me. I will describe the symptoms a little later; first some quick Nexus product architecture, so you have a bit of understanding of how the components work, where they are, and how they interact.

Nexus 1000V Switch Product Architecture

Cisco Nexus 1000V Series Switches have two major components: the Virtual Ethernet Module (VEM), which runs inside the hypervisor (in other words, on the ESXi host), and the external Virtual Supervisor Module (VSM), which manages the VEMs (see the figure below). The VSM can be either a virtual appliance or a module in a hardware device (for example in a physical Nexus switch).
The Nexus 1000V replaces the VMware virtual switches and adds a Nexus distributed switch (Enterprise Plus required). Uplink and port group definitions are bound to the Cisco port profile configurations. The VEM and VSM use a control and packet link to exchange configuration items.

Configuration is performed on the VSM and is automatically propagated to the VEMs. Virtualization admins can then pick up these configurations to select the port groups at VM provisioning.

image
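As a side note on that architecture: once a host’s VEM has control connectivity, it registers as a module on the VSM, which you can verify from the VSM’s NX-OS CLI (a sketch; the switch prompt is a placeholder):

# On the VSM console, list the registered modules; every ESXi host with a
# working VEM shows up as a Virtual Ethernet Module entry:
n1000v# show module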

Symptoms of a failing installation

The problem occurred as follows:

– Tried to install the VEM with the InstallerApp. The installer app finds the host, and when the deployment is done it stops while adding the host to the existing Nexus DVS. This happens somewhere around the step that moves the existing vSwitches to the DVS. The error presented is: got (vim.fault.PlatformConfigFault) exception.

– Checked the status of the host in Update Manager, which showed a green, compliant Cisco Nexus appliance. This probably delayed me a bit, because it really wasn’t compliant.

– Tried to manually add the host to the Nexus DVS in the vSphere Web Client. This gave an error on the task. Further investigation led me to this line in the vmkernel.log: invalid Net_Create: class cisco_nexus_1000v not supported. Say what?

– With the Cisco support site and a network engineer I tried some VEM CLI on the host. But hey, wait: vem isn’t there. An esxcli software vib list | grep cisco doesn’t show anything either (duh), while on an ESXi host with the VEM installed this shows the installed VEM software version. So Update Manager was screwing with me.
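Both of those checks are easy to reproduce from the ESXi shell. A sketch (the log path is the standard one on ESXi 5.x; the VIB name matches the install command further down):

# Search the VMkernel log for the failing DVS class registration:
grep -i "Net_Create" /var/log/vmkernel.log
#   invalid Net_Create: class cisco_nexus_1000v not supported

# Check whether the Cisco VEM package is installed at all:
esxcli software vib list | grep -i cisco
# No output on the problem host; a healthy host lists the cisco-vem-v160-esx
# package and its version.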

Manual Installation

That left me with manually installing the VEM. With the Cisco Nexus 1000V installation guides, the working solution is as follows:

  • The preferred Update Manager route does not work; it fails with error 99.
  • Copy the VIB file containing the VEM software from the VSM homepage, using the following URL: http://Your_VSM_IP_Address/. Check an ESXi host that is already installed for the running version (esxcli software vib list | grep cisco). Download this file (save as).
  • Upload this file to a location the host can access: on the host itself or on a datastore accessible from the host. I did the latter, as the host did not have direct storage, and used WinSCP to transfer the file to a datastore directory ManualCiscoNexus (a scp/wget equivalent is sketched after this list).
  • On the host I added this VIB by issuing the following command:

esxcli software vib install -v /vmfs/volumes/<datastore>/ManualCiscoNexus/Cisco_bootbank_cisco-vem-v160-esx_4.2.1.2.2.1.0-3.1.1.vib

  • vem status -v now gives output. Look for “VEM Agent is running” in the output of the vem status command.
  • vemcmd show port vlans only shows the standard switches at this point; communication with the VSM is not there yet.
  • I added the host manually to the Nexus DVS, and success. After migrating the standard vmkernel management port to the DVS port groups, the host is also visible on the VSM. Communication is flowing and the host is part of the Nexus 1000V.
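For completeness, staging the VIB without WinSCP can look like this from a Linux admin box (a sketch: SSH needs to be enabled on the ESXi host, <esxi-host> is a placeholder, and the exact file name offered on the VSM homepage may differ from the VIB name above, so check the link on that page and the running version on a working host first):

# Pull the VEM VIB from the VSM homepage:
wget http://Your_VSM_IP_Address/Cisco_bootbank_cisco-vem-v160-esx_4.2.1.2.2.1.0-3.1.1.vib

# Copy it to the datastore directory the host can reach:
scp Cisco_bootbank_cisco-vem-v160-esx_4.2.1.2.2.1.0-3.1.1.vib root@<esxi-host>:/vmfs/volumes/<datastore>/ManualCiscoNexus/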

I hope this post helps when you experience the same problem, and also teaches you a little about the Nexus 1000V product architecture.