A while ago I wrote about my VyOS config for Init7’s Fiber7-X product. Since then there have been a number of breaking changes, and a few additions that I would like to cover.
I will copy/paste a lot of the narrative from that post, and avoid a bit of the abstract conversation that went with it, so that this stands on its own.
If you have questions or comments, hit me up.
Last week I had some fun in Amsterdam, and at no point was there any debauchery.
For those who are unaware, Autocon is a conference put together by the Network Automation Forum. Autocon1 was the second event ever, and the first in Europe. They ran Autocon0 in Denver in the autumn of 2023.
TL;DR - if you are in the network automation space, you have to try and get yourself there.
As part of my VPP Adventures series, we have talked about what VPP is, why it’s interesting, and how we can prove it works. Today we spend a bit of time on what we can actually do with it.
Who actually uses MACSEC these days? My first interest for a real world test of VPP was straight BGP routing for DFZ connected services. Kinda obvious, no? For long and complicated reasons, it actually wasn’t (more specifically it couldn’t - we use IS-IS as part of our edge routing environment and VPP has issues there).
So far we have covered what VPP is, and why it’s interesting to us.
Part of the story with any new service/implementation always centres around testing. How do you prove, definitively, that something does what it says on the tin? RFC 2544 outlines a series of testing strategies, and for the purpose of this work we try to keep it simple.
I have deployed a TRex traffic generator on Debian 11 (OFED 5.
In the previous post we were talking about what VPP was. Here we explore a little why it matters.
What’s the point anyways? It’s a fair question. Surely it’s not logical to invest so much time and effort into something that has been described numerous times as “janky”. One of my engineers even now says, “I understand why you want to do it, but I don’t agree that this is the right solution”.
Linux Routing is becoming a thing with me. I can’t decide if the motivation is the extreme cost of dedicated hardware, or the knowledge that with a little effort you can make a free/cheap thing into a giant killer. David and Goliath is a fun story I guess.
VPP has been on my radar now for a few years. I have tried and failed a few times to get it into production, typically on the internet edge of a datacentre in place of something expensive like a Cisco ASR or a Juniper MX.
Last year I wrote extensively about my experience with deploying VyOS to support my new uber fast internet connection.
I learned a lot in the process, and for this past year it has mostly worked fine. I am one of those people that can’t leave things alone however, and I was always tinkering with the setup. The VyOS box itself was happily communicating at 10G, but I would find the internal LAN would get choked up a lot and rarely hit 1G even with extreme threading (say 50-60 conns).
Weird. In my first hugo post just two days ago I speculated it would take me weeks to get this content off blogger and into hugo.
It took me three evenings.
All the DNS has been switched and we are now fully on hugo with a gitops workflow and previews on a branch push. I was even able to retain the old URL paths as an alias, so anyone googling my stupid opinions can still read them on the new or the old path.
Over the past year or so I found myself returning to my own blog to remind myself how I did things in VyOS and how I configured this or that thing or whatever. Each time I sort of hated the fact the theme was crap. Blogger themes are so dated, they remind me a little of Geocities now. I know that dates me somewhat too, but anyways. They suck and browsing for something that isn’t terrible seems more and more futile.
So it turns out that if you want metrics from VyOS, your two options are SNMP or Telegraf (towards InfluxDB).
SNMP is one of those things that has been around so long, you think it’s good, but really, it’s trash. It’s a 1990s technology that is mostly single-threaded and provides you very very fuzzy numbers. 5 min averages are not that useful in situations like today where clients plausibly have access to gigabit+ grades of connectivity.
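To put a rough number on why a 5 min average is fuzzy, here is a minimal sketch in Python. The traffic figures are made up for illustration (a 20 Mbit/s baseline with a single 30 second burst at 1000 Mbit/s), and `summarise` is a hypothetical helper, not anything from VyOS, SNMP or Telegraf.

```python
# Hypothetical per-second throughput samples (Mbit/s) for one 5 minute window.
# All numbers here are assumptions chosen for illustration only.
WINDOW_SECONDS = 300
BASELINE_MBPS = 20
BURST_MBPS = 1000
BURST_START = 60
BURST_SECONDS = 30


def build_samples():
    """Return one sample per second: quiet baseline plus a single burst."""
    samples = [BASELINE_MBPS] * WINDOW_SECONDS
    for second in range(BURST_START, BURST_START + BURST_SECONDS):
        samples[second] = BURST_MBPS
    return samples


def summarise(samples):
    """Return (window average, per-second peak) in Mbit/s."""
    return sum(samples) / len(samples), max(samples)


average_mbps, peak_mbps = summarise(build_samples())
print(f"5 min average: {average_mbps} Mbit/s, per-second peak: {peak_mbps} Mbit/s")
```

With these assumed numbers the link was saturated at gigabit for half a minute, yet the single 5 min data point comes out at 118 Mbit/s, which looks like a mostly idle connection.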