
Nexus 9000: CLI Wrapper or Model-driven API?

If you want to programmatically interact with NX9K switches, you’re given a few options. You can either send the device CLI commands through the CLI wrapper, or you can use the device’s OM (Object Manager) and pass in object models. What’s the difference? Why would you want to use one over the other?

Comparing The Approaches

The managed object approach uses a session via APIC-cookie, while the CLI wrapper uses HTTP basic auth. Another difference is that the CLI wrapper applies the commands sequentially in the order provided, not in a single atomic transaction. Looking at the documentation for the MO approach:

The DME validates and rejects incorrect attributes. The API is also atomic. If multiple MOs are being configured, and any of the MOs cannot be configured, the API stops its operation. It returns the configuration to its prior state, stops the API operation that listens for API requests, and returns an error code.

Seems ideal, right? Why would anyone use the CLI wrapper when managed objects handle data validation and atomic operations for us? If I pass in invalid configuration, it shouldn’t apply.
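For reference, here’s roughly what the two authentication styles look like if you build the requests by hand. This is a sketch, not a client: nothing is sent over the wire, the function names are hypothetical, and it assumes the MO login is a POST of an aaaUser object to /api/aaaLogin.json that returns a token to be replayed as the APIC-cookie cookie.

```python
import base64
import json


def basic_auth_header(username: str, password: str) -> dict:
    """CLI wrapper style: credentials travel with every request."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def login_body(username: str, password: str) -> str:
    """MO style: body of the one-time login call (endpoint is an assumption)."""
    return json.dumps({"aaaUser": {"attributes": {"name": username, "pwd": password}}})


def session_cookie_header(login_response: dict) -> dict:
    """MO style: turn the login response into the cookie reused on later calls."""
    token = login_response["imdata"][0]["aaaLogin"]["attributes"]["token"]
    return {"Cookie": f"APIC-cookie={token}"}


# Assumed shape of a login response, trimmed to the one field used here.
fake_login_response = {"imdata": [{"aaaLogin": {"attributes": {"token": "abc123"}}}]}
cookie = session_cookie_header(fake_login_response)
```

The practical difference is that the MO session has a lifetime you need to manage, while the CLI wrapper is stateless from the caller’s point of view.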


Let’s look at why that’s not exactly the case. On the device we have VLAN 1660 already, with a VNI assigned.

vlan 1660
  name THIS_VLAN
  vn-segment 700070

We’ll send in a new bdEntity object defining another VLAN with the same VNI:

{
  "topSystem": {
    "children": [
      {
        "bdEntity": {
          "children": [
            {
              "l2BD": {
                "attributes": {
                  "accEncap": "vxlan-700070",
                  "fabEncap": "vlan-3000",
                  "name": "VNI_ALREADY_EXISTS",
                  "pcTag": "1"
                }
              }
            }
          ]
        }
      }
    ]
  }
}

Now, this should fail. VNI 700070 belongs to VLAN 1660, but we’re also trying to assign VNI 700070 to VLAN 3000. A VNI can only be assigned to one VLAN. But here’s the response we get from the API:

{"imdata": []}
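An empty imdata tells the caller nothing either way, so one option is to read the object back after the POST and compare it with what was sent. Below is a minimal sketch of that comparison. The helper name is hypothetical, and the read-back dictionary is made-up sample data (modeled on the CLI output in this post), not a captured API response.

```python
def unapplied_attributes(desired: dict, readback: dict) -> dict:
    """Return {attribute: (wanted, found)} for every attribute that did not stick."""
    return {
        key: (wanted, readback.get(key))
        for key, wanted in desired.items()
        if readback.get(key) != wanted
    }


desired = {
    "accEncap": "vxlan-700070",
    "fabEncap": "vlan-3000",
    "name": "VNI_ALREADY_EXISTS",
}
# Hypothetical read-back of the l2BD attributes: the VLAN and its name
# exist, but the VNI is absent.
readback = {"fabEncap": "vlan-3000", "name": "VNI_ALREADY_EXISTS"}

missing = unapplied_attributes(desired, readback)
print(missing)
```

Anything left in the result is configuration the device accepted without complaint but did not actually apply.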


Let’s try this via CLI now.

NX9K-DEV# sh run vlan 3000

!Command: show running-config vlan 3000
!Time: Fri Mar 30 16:29:16 2018

version 7.0(3)I6(1)
vlan 3000
vlan 3000
  name VNI_ALREADY_EXISTS


NX9K-DEV# conf t
Enter configuration commands, one per line. End with CNTL/Z.
NX9K-DEV(config)# vlan 3000
NX9K-DEV(config-vlan)# vn-segment 700070
NX9K-DEV(config-vlan)# Add segment id node failed as the same id is used by vlan     1660

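Hm, we got the error we expected to see. Notice also that VLAN 3000 already existed with its name set before we touched the CLI, so the MO call was partially applied rather than rolled back. Well, does the CLI wrapper act any differently? Will it throw an error if we delete VLAN 3000 and make the same call?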

{
  "ins_api": {
    "version": "1.0",
    "type": "cli_conf",
    "chunk": "0",
    "sid": "1",
    "input": "vlan 3000 ;  name VNI_ALREADY_EXISTS ;  vn-segment 700070",
    "output_format": "json"
  }
}

{
  "ins_api": {
    "sid": "eoc",
    "type": "cli_conf",
    "version": "1.0",
    "outputs": {
      "output": [
        {
          "code": "200",
          "msg": "Success",
          "body": {}
        },
        {
          "code": "200",
          "msg": "Success",
          "body": {}
        },
        {
          "code": "200",
          "msg": "Success",
          "body": {}
        }
      ]
    }
  }
}

Nope. It sure doesn’t. It says all three commands were applied without issue.
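Since a success response can’t be trusted to mean the VNI was applied, one option is to check for the conflict yourself before sending anything. Here’s a minimal sketch, assuming you’ve already pulled the VLAN section of the running config as text. The helper names (vni_owners, vni_conflict) are made up for illustration and are not part of NX-API.

```python
import re


def vni_owners(running_config: str) -> dict:
    """Map each VNI to the VLAN that owns it, from 'show running-config vlan' style text."""
    owners = {}
    current_vlan = None
    for line in running_config.splitlines():
        vlan_match = re.match(r"^vlan (\d+)\s*$", line)
        vni_match = re.match(r"^\s+vn-segment (\d+)\s*$", line)
        if vlan_match:
            current_vlan = int(vlan_match.group(1))
        elif vni_match and current_vlan is not None:
            owners[int(vni_match.group(1))] = current_vlan
    return owners


def vni_conflict(running_config: str, vlan: int, vni: int):
    """Return the VLAN already using the VNI, or None if this VLAN may use it."""
    owner = vni_owners(running_config).get(vni)
    return owner if owner is not None and owner != vlan else None


config = """vlan 1660
  name THIS_VLAN
  vn-segment 700070
"""
print(vni_conflict(config, 3000, 700070))  # 1660
print(vni_conflict(config, 1660, 700070))  # None
```

A check like this only narrows the window. Something else can still claim the VNI between the check and the change, so reading the config back afterwards is still worth doing.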

More Gotchas to Consider

VLAN management isn’t the only thing you’re probably considering automating on your switches, but it’s more than likely the configuration element you do the most CRUD operations on day-to-day. Another scenario you’ll probably look to handle via an automated workflow, if you’re doing VLAN pruning, is trunk interface management. Anyone who has been through someone missing the ‘add’ keyword in ‘switchport trunk allowed vlan add ${VLAN}’ knows what the blast radius of a human error can be. The object model that controls this is pcAggrIf. Let’s take note of what happens if you use the DME to update this without considering the object already in place.

Before:

interface port-channel34
  description TEST-TRUNK
  switchport mode trunk
  switchport trunk allowed vlan 1,1006,1033,1037

Payload sent in defining VLAN 3000 in trunkVlans:

{
  "topSystem": {
    "children": [
      {
        "interfaceEntity": {
          "children": [
            {
              "pcAggrIf": {
                "attributes": {
                  "id": "po34",
                  "trunkVlans": "3000"
                }
              }
            }
          ]
          }
      }
    ]
  }
}

After:

interface port-channel34
  description TEST-TRUNK
  switchport mode trunk
  switchport trunk allowed vlan 3000

WTF just happened? Well, trunkVlans on the port-channel model is just a string of comma-separated VLAN definitions, the same as you would see in the CLI config. It’s not an array of VLAN objects that are allowed on that interface object. Since we didn’t query the existing object, pull its trunkVlans value, append our VLAN to it, then post the object back, we overwrote the allowed VLANs on the trunk with only the VLAN we passed in. Not ideal. Now, you could do exactly that, but you’d definitely want to make sure your unit/acceptance testing is up to snuff and you’re testing all your failure scenarios. In this case the CLI wrapper has a leg up.
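The read-modify-write step described above could be sketched like this. It assumes you’ve already fetched the current trunkVlans string from the pcAggrIf object, and that the string may contain ranges such as 10-12. The function names are hypothetical; the payload shape is copied from the example earlier in this post.

```python
def expand_vlans(trunk_vlans: str) -> set:
    """Expand a string such as '1,10-12,1006' into a set of VLAN IDs."""
    vlans = set()
    for part in trunk_vlans.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


def compress_vlans(vlans: set) -> str:
    """Collapse VLAN IDs back into a sorted, range-compressed string."""
    ordered = sorted(vlans)
    parts = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current run while the next ID is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        parts.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(parts)


def add_trunk_vlan(existing: str, vlan: int) -> str:
    """Merge one VLAN into the existing allowed list instead of replacing it."""
    return compress_vlans(expand_vlans(existing) | {vlan})


def trunk_payload(port_channel: str, trunk_vlans: str) -> dict:
    """Build the pcAggrIf payload with the full, merged VLAN list."""
    return {"topSystem": {"children": [{"interfaceEntity": {"children": [
        {"pcAggrIf": {"attributes": {"id": port_channel, "trunkVlans": trunk_vlans}}}
    ]}}]}}


merged = add_trunk_vlan("1,1006,1033,1037", 3000)
payload = trunk_payload("po34", merged)
print(merged)
```

With the CLI wrapper, the same change is just ‘switchport trunk allowed vlan add 3000’ under the interface, with no read step and no merge logic to get wrong.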

Takeaways

At first, I was very skeptical about the CLI wrapper. You’re putting a shiny new coat on a crappy configuration interface (the CLI) that everyone is trying to move away from. On the surface it seemed really backwards, and using the MO API was clearly the way to go. After coding against both for some common workflows, that may not be the case. In the case of trunk VLANs, why introduce the potential of messing up trunked VLANs via MO when it’s already abstracted away from us if we simply tell the CLI wrapper to add it via commands we’re already used to? The idea that the MO API is atomic and won’t apply bad configurations doesn’t appear to be totally true. Maybe in this case, using the previous configuration abstraction is good enough. In either case, as usual, validation is key.