VIMAGE Jails on FreeBSD-8

While FreeBSD-9.1 has recently been released, I find myself still working in FreeBSD-8 (and at the time of this writing, 8.4 is soon to be released). This article is based on work performed in FreeBSD-8.1 (only because we froze our development environment for several years to prove the technology that I’m going to talk about in this article).

In FreeBSD-8 there is a new feature for “jails” that is not enabled by default (for base information on standard FreeBSD jails, see the Jail Section of the FreeBSD Handbook).

The new optional feature is called “VNET” and it allows each jail to have a private networking stack. However useful and exciting this may sound, these VNET jails require you to “Bring Your Own Network Interface.”

The VNET feature allows you to move network interfaces in/out of the view of a jail. When a network interface is moved into the view of a jail, it is no longer visible to the host of said jail.

A VNET jail starts with an empty network stack.

For example, you can create an ad hoc persistent VNET jail using the following command (requires VIMAGE option enabled in kernel — more on that later):

jail -c vnet name=vj1 host.hostname=vj1 path=/ persist

NOTE: If your kernel is not compiled with the options VIMAGE line, you’ll get the error jail: unknown parameter: vnet.

This jail is now a special jail, called a VNET jail. In this jail, ifconfig shows a private network stack which is currently empty except for an unconfigured local-loopback interface (lo0).

You can now move network interfaces into that jail using the below ifconfig syntax from the host:

ifconfig <interface_name> vnet <jail_id>

Meanwhile, you can move the network interface back out of that jail using the below syntax from the host:

ifconfig <interface_name> -vnet <jail_id>

Notice the -vnet to recover an interface versus vnet to relinquish an interface to a VNET jail.

In both commands, <interface_name> is something like fxp0, em0, bge0, or igb0, and <jail_id> is the value shown in the “JID” column of the output produced by the jls utility (executed without arguments).
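A minimal sketch of the two commands above, wrapped in small sh functions. The names vnet_give, vnet_take, and run, and the DRY_RUN switch, are hypothetical and not part of FreeBSD; with DRY_RUN=1 (the default here) the commands are only printed, so the script is safe to try on a machine without VIMAGE.

```shell
#!/bin/sh
# Sketch only: vnet_give, vnet_take, run and DRY_RUN are hypothetical names.
# With DRY_RUN=1 (default) commands are printed instead of executed.
: "${DRY_RUN:=1}"

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Move an interface into the VNET jail with the given JID.
vnet_give() {
    run ifconfig "$1" vnet "$2"
}

# Pull the interface back out of the jail, returning it to the host.
vnet_take() {
    run ifconfig "$1" -vnet "$2"
}

# Example: lend em0 to the jail with JID 1, then recover it.
vnet_give em0 1
vnet_take em0 1
```

On a real host, set DRY_RUN=0 and take the JID from the jls output described above.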

However, the usefulness of this alone is still very limited. With the VNET feature by itself, all one is afforded is the ability to move whole network interfaces into and out of VNET jails. When one does this, the jail and host do not share the network interface; rather, the host can no longer see (using ifconfig, for example) or use the network interface.

To alleviate this limitation, a bridging technology is required to allow the host and VNET jail to share a network interface.

Enter netgraph(4), another optional component in FreeBSD: a graph-based kernel networking subsystem, publicly available since FreeBSD-3.4. By adding a few NETGRAPH options to the kernel, we can give VNET jails a solid bridging layer that provides them with fresh, functional network interfaces.

However, we’re still not at the point of usefulness because the following gaps still exist:

  • How to automate the bootup, shutdown, and management of VNET jails?
  • How to automate the creation, destruction, and bridging of netgraph devices?
  • How to make sure bridged interfaces have unique MAC addresses?
  • How to allow multiple VNET jails to have the same root directory? (considering each sets its own IP from rc.conf(5))

Enter vimage, software I’ve written to tie VNET jails together using netgraph.

NOTE: My “vimage” boot/management script (which lives in /etc/rc.d and is often executed via “service”) should not be confused with the “vimage” utility from the VirtNet project (imunes.tel.fer.hr/virtnet/).

My vimage is a fork of the jail rc.d script — rewritten to work only with VNET jails and help you harness the most flexibility in network topology and jail structure.

HINT: If you are familiar with the IMUNES project (http://www.imunes.tel.fer.hr), both their work and my own performance-test results were factored into the ultimate choice to settle on netgraph for our bridging needs.

My vimage boot/mgmt script is available here:
http://druidbsd.sourceforge.net/download.shtml#vimage

Let’s get started…

First, you’ll need a custom kernel for your FreeBSD host before you can run any VNET jails, let alone vimage jails (VNET jails with netgraph).

Add the following to a GENERIC FreeBSD kernel config as a minimum:

makeoptions NO_MODULES=yes
options VIMAGE
options NETGRAPH
options NETGRAPH_ETHER
options NETGRAPH_BRIDGE
options NETGRAPH_EIFACE
options NETGRAPH_SOCKET
nooptions SCTP

NOTE: In the future, when VIMAGE is compatible with SCTP, you will not need the nooptions SCTP line.

UPDATE: A known issue is that the non-default options IPFILTER conflicts with VIMAGE in stable/9 (and possibly HEAD too), causing a kernel panic during boot. A backtrace shows the trap originating from fr_resolvenic() in sys/contrib/ipfilter/netinet/fil.c. My current recommendation for running a kernel with both firewall abilities and VIMAGE is to use the IPFIREWALL family of options and ipfw(8).
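One way to apply the options above without editing GENERIC itself is a small config file that pulls GENERIC in with the include directive from config(5). The sketch below writes such a file and prints the usual build commands. The config name VIMAGE, the helper names, and the temporary output path are illustrative assumptions; on a real system the file would live under /usr/src/sys/<arch>/conf/.

```shell
#!/bin/sh
# Sketch only: write_kernconf and show_build_steps are hypothetical helpers,
# and "VIMAGE" as a config name is an assumption.

# Write a kernel config that includes GENERIC and adds the options above.
write_kernconf() {
    cat > "$1" <<'EOF'
include GENERIC
ident VIMAGE
makeoptions NO_MODULES=yes
options VIMAGE
options NETGRAPH
options NETGRAPH_ETHER
options NETGRAPH_BRIDGE
options NETGRAPH_EIFACE
options NETGRAPH_SOCKET
nooptions SCTP
EOF
}

# Print (rather than run) the usual build and install steps for a config.
show_build_steps() {
    echo "cd /usr/src"
    echo "make buildkernel KERNCONF=$1"
    echo "make installkernel KERNCONF=$1"
}

conf="${TMPDIR:-/tmp}/VIMAGE"
write_kernconf "$conf"
show_build_steps VIMAGE
```

The build steps are printed only so they can be reviewed before being run by hand as root.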

Compile your kernel and boot it. Next, grab the FreeBSD package for vimage-1.4 and use the following command to install two files (/etc/rc.d/vimage and /etc/rc.conf.d/vimage):

pkg_add vimage-1.4.tbz

Now we’re ready to build a jail. Personally, I use my own “jail_build” script to build jails from binary releases. You can get jail_build from my SourceForge page:

http://druidbsd.sourceforge.net/download.shtml#jail_build

NOTE: If you are planning on running i386 jails under an amd64 kernel, be advised that you may need a patch to your amd64 kernel to solve a problem with applying a default gateway using the 32-bit route command (as discussed here).

Currently jail_build works with any release, 8.x and older (sorry, no 9.x support yet).

How you use jail_build in a nutshell:

fetch http://druidbsd.sourceforge.net/download/jail_build.txt
mv jail_build.txt jail_build
chmod +x jail_build
mkdir -p /usr/repos
cd /usr/repos
# For example
mkdir 8.3-RELEASE
cd 8.3-RELEASE

# Now pick a single base-URL from below:
# ftp://ftp.freebsd.org/pub/FreeBSD/releases/i386/8.3-RELEASE/
# ftp://ftp.freebsd.org/pub/FreeBSD/releases/amd64/8.3-RELEASE/
# ftp://ftp-archive.freebsd.org/pub/FreeBSD-Archive/old-releases/i386/8.3-RELEASE/
# ftp://ftp-archive.freebsd.org/pub/FreeBSD-Archive/old-releases/amd64/8.3-RELEASE/

# Download "base" directory from FTP to local "."
# Download "doc" directory from FTP to local "."
# Download "dict" directory from FTP to local "."
# Download "games" directory from FTP to local "."
# Download "info" directory from FTP to local "."
# Download "manpages" directory from FTP to local "."
# Download "proflibs" directory from FTP to local "."
# Download "kernels" directory from FTP to local "."

You should now have:

/usr/repos/8.3-RELEASE/{base,doc,dict,games,info,manpages,proflibs,kernels}/
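Before running jail_build, it can help to confirm that all eight distribution directories actually arrived. The following is a small sketch for that check; check_repo is a hypothetical helper and is not part of jail_build.

```shell
#!/bin/sh
# Sketch only: check_repo is a hypothetical helper, not part of jail_build.
# It prints each distribution directory missing from the given repository
# path and returns non-zero if at least one is missing.
check_repo() {
    repo=$1
    missing=0
    for dist in base doc dict games info manpages proflibs kernels; do
        if [ ! -d "$repo/$dist" ]; then
            echo "missing: $repo/$dist"
            missing=1
        fi
    done
    return "$missing"
}

check_repo /usr/repos/8.3-RELEASE || echo "repository incomplete"
```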

TIP: jail_build will automatically probe for /usr/repos/*-{RELEASE,STABLE,CURRENT}

NOTE: You can certainly run older releases as a jail, but generally speaking you should not run newer releases than what your host is running. So if you build an 8.3-RELEASE jail, your host should be running an 8.3 or newer kernel.

Executing jail_build brings up a menu of repositories to select from (living in /usr/repos) and allows you to select one before prompting you to enter the directory you wish to unpack the contents to (creating a new jail root directory). The default directory is /usr/jail/<jail_hostname> (change <jail_hostname> to the fully qualified hostname of the jail — this is suggested for easy tracking but feel free to make this whatever you like).

NOTE: You don’t have to use my jail_build technique to build jails; in fact, there is a long-standing tradition of populating jails from source code. For details on how to build a jail from source, see the Creating and Controlling Jails section of the FreeBSD Handbook.

After you’ve created a jail, it’s time to configure it as a vimage.

In the host’s /etc/rc.conf file, add the following:

vimage_enable="YES" # Set to NO to disable starting of any vimages
vimage_list="vj1" # Space separated list of names of vimages

vimage_vj1_rootdir="/usr/jail/vj1" # vimage's root directory
vimage_vj1_hostname="vj1" # vimage's hostname
vimage_vj1_devfs_enable="YES" # mount devfs in the vimage

This configures your basic vimage (we’ll cover how to give it a network interface in a moment). You can compare this directly to the procedures documented in the Creating and Controlling Jails section of the FreeBSD Handbook.

The same rules that apply to jail_list in the Handbook apply to vimage_list. The methodology is the same because again, /etc/rc.d/vimage is a fork of the documented /etc/rc.d/jail script.
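To make the naming convention concrete, here is a sketch of how per-vimage settings can be looked up from the names in vimage_list. This is not the code of /etc/rc.d/vimage; vimage_setting is a hypothetical helper that only illustrates the vimage_<name>_<setting> pattern.

```shell
#!/bin/sh
# Sketch only: not the actual /etc/rc.d/vimage code. vimage_setting is a
# hypothetical helper illustrating the vimage_<name>_<setting> convention.
vimage_list="vj1"
vimage_vj1_rootdir="/usr/jail/vj1"
vimage_vj1_hostname="vj1"
vimage_vj1_devfs_enable="YES"

# Print the value of vimage_<name>_<setting> (empty line if unset).
vimage_setting() {
    eval "printf '%s\n' \"\${vimage_${1}_${2}:-}\""
}

for name in $vimage_list; do
    echo "$name: rootdir=$(vimage_setting "$name" rootdir)" \
         "hostname=$(vimage_setting "$name" hostname)" \
         "devfs=$(vimage_setting "$name" devfs_enable)"
done
```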

Next, if you only want to move whole network interfaces into the vimage when it is started (and, by implication, automatically move them back out when the vimage is stopped), add the following to the same rc.conf file as above:

vimage_vj1_vnets="igb0" # list of interfaces to give to the vimage

However, the real flexibility comes from using the bridging option based on netgraph. To create a bridged interface for the vimage — leaving the original interface on the host unaffected — add the following instead of the above:

vimage_vj1_bridges="igb0 igb0" # list of interfaces to bridge

In the above example, a single port physical Intel Gigabit network adapter is listed twice — this will create two unique bridged interfaces from the same physical interface. This is not an error and is perfectly valid (imagine simulating a router that runs on two subnets but over the same physical wire).

If you start the vimage at this point (using service vimage start) you can get a list of the bridged interfaces from the jail’s point of view by executing:

jexec vj1 ifconfig

You’ll see (in the bridging example) two network interfaces, one named ng0_vj1 and the other ng1_vj1. The naming convention for bridged interfaces is ng#_NAME where # is a counter that starts at zero and increases by one for each bridge (regardless of the device being bridged) and NAME is the vimage name as seen in vimage_list.

NOTE: Due to an internal limitation, the name of any network interface in FreeBSD cannot exceed 15 characters. For long vimage names, be aware that ng#_NAME will be truncated to be less than 16 characters if necessary.
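As a rough illustration of the naming rule, the sketch below builds ng#_NAME and cuts it at 15 characters. The helper name bridge_ifname is hypothetical, and cutting from the right is an assumption; the exact truncation the script performs is not spelled out here.

```shell
#!/bin/sh
# Sketch only: bridge_ifname is hypothetical. Truncating from the right at
# 15 characters is an assumption about how over-long names are shortened.
bridge_ifname() {
    printf '%.15s\n' "ng${1}_${2}"
}

bridge_ifname 0 vj1
bridge_ifname 1 vj1
bridge_ifname 0 a_rather_long_vimage_name
```

With long vimage names, check the resulting interface names with ifconfig inside the jail before writing the matching ifconfig_ lines.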

PRO-TIP: By placing the vimage name within the name of the bridged network interface, it makes it simple to configure multiple vimages to use the same root directory.

The network configuration for the above vimage is configured within the vimage’s own /etc/rc.conf file, example below:

ifconfig_ng0_vj1="inet 192.168.1.100 netmask 255.255.255.0"
ifconfig_ng1_vj1="inet 10.0.0.1 netmask 255.255.255.0"
defaultrouter="192.168.1.1"

Let’s say you had a second vimage named “vj2” pointed at the same root directory (and therefore the same /etc/rc.conf file). You would then add the following, for example:

ifconfig_ng0_vj2="inet 10.0.0.2 netmask 255.255.255.0"

The vj1 vimage jail would ignore the vj2 entries and vice-versa.
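The reasoning can be simulated in a few lines of sh. The sketch assumes that only ifconfig_<ifname> entries whose interface exists inside the jail are acted upon; plan_ifconfig is a hypothetical helper that prints what would be configured for a given set of visible interfaces.

```shell
#!/bin/sh
# Sketch only: simulates a shared rc.conf. The assumption is that only
# ifconfig_<ifname> entries whose interface exists in the jail are used.
# plan_ifconfig is a hypothetical helper.
ifconfig_ng0_vj1="inet 192.168.1.100 netmask 255.255.255.0"
ifconfig_ng1_vj1="inet 10.0.0.1 netmask 255.255.255.0"
ifconfig_ng0_vj2="inet 10.0.0.2 netmask 255.255.255.0"

# Given the interfaces visible in a jail, print the matching settings.
plan_ifconfig() {
    for ifn in "$@"; do
        eval "args=\${ifconfig_${ifn}:-}"
        if [ -n "$args" ]; then
            echo "ifconfig $ifn $args"
        fi
    done
}

echo "vj1:"; plan_ifconfig lo0 ng0_vj1 ng1_vj1
echo "vj2:"; plan_ifconfig lo0 ng0_vj2
```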

At this point, let’s recap — you should have done the following:

  1. Compiled, installed, and booted a custom kernel that enables VIMAGE and the netgraph(4) options.
  2. Installed my vimage package from druidbsd.sf.net.
  3. Created a jail using either the source-method mentioned in the FreeBSD Handbook or using jail_build and a binary release.
    PRO-TIP: You can actually get by in testing without building a discrete jail but instead use “/” as the jail root path. This is perfectly valid and acceptable.
  4. Configured the vj1 test vimage in /etc/rc.conf
  5. Configured vj1 to bridge at least one physical interface using vimage_vj1_bridges (in the example above, we use igb0 — twice)
  6. Configured ifconfig_ng0_vj1 and [optionally] defaultrouter in the /etc/rc.conf file within the vj1 root directory.

At this point, the vimage will boot with the rest of the machine and can be controlled after boot with the following commands:

# Stop, start, or restart all vimages
service vimage stop
service vimage start
service vimage restart

# ... versus: Stop, start, or restart just the vj1 vimage
service vimage stop vj1
service vimage start vj1
service vimage restart vj1

When the vj1 vimage starts, the ng*_vj1 network interfaces will automatically be created with unique MAC addresses and moved into the VNET jail before kickstarting the FreeBSD boot process within the jail. As the jail boots, it will automatically configure the network interfaces through rc.conf(5). You can use jls(8) to see the running vimages and you can use jexec(8) to execute processes inside them (like tcsh(1), ps(1), and ifconfig(8)).
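A small sketch of running the same command in several jails with jexec(8). The helper each_jail is hypothetical: it reads JIDs on stdin, one per line, and prints the jexec command for each instead of executing it.

```shell
#!/bin/sh
# Sketch only: each_jail is hypothetical. It reads JIDs (one per line) on
# stdin and prints the jexec command for each rather than executing it.
each_jail() {
    while read -r jid; do
        echo "jexec $jid $*"
    done
}

# Example with two made-up JIDs.
printf '1\n2\n' | each_jail ifconfig
```

On a real host the JIDs would come from the JID column of jls, and the printed lines could be piped to sh once they look right.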

PRO-TIP: ps(1) can produce the JID of all running processes with the syntax:
ps axopid,jid,command

However, what if you want to SSH into the vimage jail? SSH is not automatically started inside the vimage (in fact, only the network services required to get the vimage talking to the net are started). The answer is to add a new configuration line to the host machine’s /etc/rc.conf so that sshd is started in every vimage:

vimage_afterstart_services="sshd" # set new default for all vimages

The default for vimage_afterstart_services is NULL, but it can be set to a space-separated list of services (names of scripts living in /etc/rc.d, like sshd). If you want to leave the default alone and have only the vj1 vimage start sshd, you can instead opt for the following line:

vimage_vj1_afterstart_services="sshd"
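The relationship between the global default and the per-vimage setting can be sketched as a simple fallback. This assumes that a per-vimage value, when set, takes precedence over the global one; afterstart_services_for is a hypothetical helper and not the script's actual code.

```shell
#!/bin/sh
# Sketch only: assumes a per-vimage value, when set, overrides the global
# default. afterstart_services_for is a hypothetical helper.
vimage_afterstart_services=""            # global default (empty)
vimage_vj1_afterstart_services="sshd"    # setting for vj1 only

# Print the services for one vimage, falling back to the global default.
afterstart_services_for() {
    eval "printf '%s\n' \"\${vimage_${1}_afterstart_services:-\$vimage_afterstart_services}\""
}

echo "vj1 starts: $(afterstart_services_for vj1)"
echo "vj2 starts: $(afterstart_services_for vj2)"
```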

So now we have a usable framework for creating multiple vimages that can be SSH’d into as (let’s say) development environments. When you scale this out, netgraph starts to shine. After all, netgraph is based on graphs.

Without any additional work, simply by using vimage to produce VNET jails with netgraph bridged interfaces, we can produce graphs of the vimage network topology with the following syntax:

ngctl dot | dot -Tsvg -o vgraph.svg
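If more than one output format is wanted, the graph description can be saved once and rendered several times. The sketch below only prints the commands it would run; render_cmds and the vgraph.dot file name are illustrative assumptions, and it presumes Graphviz is installed on the host.

```shell
#!/bin/sh
# Sketch only: render_cmds is hypothetical and prints commands instead of
# running them. vgraph.dot is an assumed intermediate file name.
render_cmds() {
    echo "ngctl dot > vgraph.dot"
    for fmt in "$@"; do
        echo "dot -T$fmt -o vgraph.$fmt vgraph.dot"
    done
}

render_cmds svg png
```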

PRO-TIP: The dot(1) utility can produce more than SVG (like PNG, JPEG, and GIF), but I find SVG to be the most scalable and informative (modern SVG viewers, such as recent browsers, display much of the information as tool-tips, which I find helpful).

Here are some graphs of different topologies we’ve used over the years:

[Figure: warden0.jbsd netgraph of bridges for vimage jails]

[Figure: folsom netgraph of bridges for vimage jails]

It’s worth noting that if an interface is not bridged, it is shown in the “disconnected” cluster. This does not imply that the network interface is unused within FreeBSD — just that it has not been connected to any nodes within the netgraph layer.

That’s it for now, thank you for reading. Comments welcome. Depending on comments, I may do another installment showing the more exotic things you can do with this configuration.

Cheers,
Devin
