Lightweight service orchestrator
This page describes the inner workings of the Lightweight Service Orchestrator (LSO), which handles the interaction between GSO and Ansible.
Motivation
For the deployment of new services in the GÉANT network, Ansible playbooks are used to deploy configuration statements onto remote devices. To make this interaction possible, LSO exposes an API that allows for the remote execution of playbooks.
This interaction is externalised because the Python library used to execute playbooks brings its own dependencies, whose versions could conflict with those of GSO. To prevent this, GSO and LSO are separate Python packages, each with its own independent library dependencies.
Inner workings
LSO uses ansible-runner for the execution of Ansible playbooks. This package fully dictates the way in which GAP interacts with Ansible itself. LSO only introduces an API with a single REST endpoint that exposes its functionality.
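As a rough illustration of what executing a playbook through ansible-runner can look like, the sketch below collects the arguments for `ansible_runner.run()` and then invokes it. This is not LSO's actual implementation: the helper names `build_runner_kwargs` and `run_playbook` are hypothetical, and the playbook path is assumed to be absolute.

```python
from typing import Any


def build_runner_kwargs(
    playbook_path: str,
    inventory: dict[str, Any],
    extra_vars: dict[str, Any],
) -> dict[str, Any]:
    """Collect the keyword arguments handed to ansible_runner.run()."""
    return {
        "playbook": playbook_path,
        # ansible-runner accepts a dict, a file path, or a raw inventory string.
        "inventory": inventory,
        # ansible-runner spells this parameter without an underscore.
        "extravars": extra_vars,
    }


def run_playbook(
    playbook_path: str,
    inventory: dict[str, Any],
    extra_vars: dict[str, Any],
) -> tuple[str, int]:
    """Run a playbook and return ansible-runner's status string and return code."""
    # Imported lazily so the helper above can be used without ansible-runner installed.
    import ansible_runner

    result = ansible_runner.run(
        **build_runner_kwargs(playbook_path, inventory, extra_vars)
    )
    return result.status, result.rc
```

Keeping the argument assembly separate from the call itself makes it possible to inspect what would be sent to Ansible without running anything.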
In the case of GAP, all Ansible playbooks operate without a static inventory that contains all relevant group_vars and host_vars. Instead, the inventory is passed to the API endpoint that executes a playbook, and it contains all required host_vars. Any other information relevant to the playbook is passed through the API as extra_vars. In virtually all cases, extra_vars consists at least of the subscription object that is being deployed and assisting variables, such as 'verb', which expresses the operation to perform.
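To make the idea of a per-request inventory concrete, the following sketch builds an inventory dictionary in which every host sits in the `all` group and carries its own host_vars. The function name `build_inventory` and the host names are placeholders chosen for illustration, not part of LSO.

```python
from typing import Any, Optional


def build_inventory(hosts: dict[str, Optional[dict[str, Any]]]) -> dict[str, Any]:
    """Build an Ansible inventory dict with every host in the 'all' group.

    Hosts without host_vars are mapped to None, which is how a host
    without variables is expressed in a YAML or JSON inventory.
    """
    return {
        "all": {
            "hosts": {
                name: (host_vars or None) for name, host_vars in hosts.items()
            }
        }
    }


inventory = build_inventory(
    {
        "edge1-host": {"example-var": "A value"},
        "edge2-host": None,  # no host_vars for this host
    }
)
```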
As an example, the following object is passed to the Ansible playbook for the deployment of a new router in the network.
extra_vars = {
    "subscription": {
        "product": {
            "product_id": "27c9dc35-f0fa-4901-bda4-65df5bb7499d",
            "name": "Router",
            "description": "A Router",
            "product_type": "Router",
            "tag": "RTR",
            "status": "active",
            "created_at": "2024-01-24T15:47:13+00:00",
            "end_date": None,
        },
        "customer_id": "8f0df561-ce9d-4d9c-89a8-7953d3ffc961",
        "subscription_id": "b57cbbc8-e8d1-47f8-add6-7923ecd7e3d5",
        "description": "Router SrzptDtKBIFGijnHrglQ.flores.bb.geant.net",
        "status": "provisioning",
        "insync": False,
        "start_date": None,
        "end_date": None,
        "note": None,
        "router": {
            "name": "RouterBlock",
            "subscription_instance_id": "09d6bea9-8c79-4e75-9a69-ef249bb9de5e",
            "owner_subscription_id": "b57cbbc8-e8d1-47f8-add6-7923ecd7e3d5",
            "label": None,
            "router_fqdn": "SrzptDtKBIFGijnHrglQ.flores.bb.geant.net",
            "router_ts_port": 4223,
            "router_access_via_ts": True,
            "router_lo_ipv4_address": "74.95.57.63",
            "router_lo_ipv6_address": "ac6f:7008:40d3:d431:bcc4:2eac:b443:f6b8",
            "router_lo_iso_address": "49.51e5.0001.0740.9505.7063.00",
            "router_role": "amt",
            "router_site": {
                "name": "SiteBlock",
                "subscription_instance_id": "874ffb0b-cf55-49ea-810f-7268c02891fa",
                "owner_subscription_id": "324239ea-555b-464d-bfde-54666470d71d",
                "label": None,
                "site_name": "flores",
                "site_city": "Whitemouth",
                "site_country": "Zimbabwe",
                "site_country_code": "BB",
                "site_latitude": "45.39258",
                "site_longitude": "137.727838",
                "site_internal_id": 9881,
                "site_bgp_community_id": 8738,
                "site_tier": "1",
                "site_ts_address": "137.105.143.190",
            },
            "vendor": "nokia",
        },
    },
    "dry_run": True,
    "verb": "deploy",
    "commit_comment": "GSO_PROCESS_ID: 549aae60-0574-4c5a-a736-00c83fdb446a - "
    "TT_NUMBER: TT#1987043028032905 - Deploy base config",
}
In this example, four top-level keys are included: subscription, dry_run, verb, and commit_comment. In order, these are used as follows.
The subscription key includes a dictionary representation of the subscription that is being provisioned. In the case of a router, the router key contains information about the subscription object, and its child key router_site contains information about the site at which this router is deployed. Information about this router site comes from the related site subscription, which is already 'deployed' in GSO.
To distinguish between practice runs and actual deployments, the variable dry_run is included. The difference between a dry run and a regular execution is whether configuration is committed. With a dry run, configuration is only checked and not committed to the remote machine. When dry_run is set to False, the configuration is checked and then committed.
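One way a caller could use this flag is to request a dry run first and only then the committing run. The sketch below prepares both variants of extra_vars; it assumes such a two-step flow, which the text above does not prescribe, and the function name `dry_run_then_commit` is hypothetical.

```python
import copy
from typing import Any


def dry_run_then_commit(extra_vars: dict[str, Any]) -> list[dict[str, Any]]:
    """Return two copies of extra_vars: a dry run followed by the committing run."""
    check = copy.deepcopy(extra_vars)
    check["dry_run"] = True  # configuration is checked only

    commit = copy.deepcopy(extra_vars)
    commit["dry_run"] = False  # configuration is checked and then committed

    return [check, commit]
```

Deep copies are used so that neither variant shares nested objects, such as the subscription dictionary, with the input.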
To distinguish between different actions that can be taken with service deployments, 'verbs' are introduced. In the example, the verb is set to 'deploy' to provision a new service. Other examples of verbs can include 'deactivate', 'modify', or 'terminate'.
The commit_comment is used for bookkeeping on the remote machines, for example for debugging or accounting. It always includes the process ID of the workflow that is related to an operation, and the associated trouble ticket number.
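The four keys described above can be assembled with a small helper. The sketch below is an illustration only: the function names are hypothetical, and the commit comment layout is inferred from the single example shown earlier rather than from a documented format.

```python
from typing import Any


def build_commit_comment(process_id: str, tt_number: str, comment: str) -> str:
    """Format a commit comment in the style of the example on this page."""
    return f"GSO_PROCESS_ID: {process_id} - TT_NUMBER: {tt_number} - {comment}"


def build_extra_vars(
    subscription: dict[str, Any],
    verb: str,
    dry_run: bool,
    process_id: str,
    tt_number: str,
    comment: str,
) -> dict[str, Any]:
    """Assemble the extra_vars dictionary that accompanies a playbook run."""
    return {
        "subscription": subscription,
        "dry_run": dry_run,
        "verb": verb,
        "commit_comment": build_commit_comment(process_id, tt_number, comment),
    }
```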
The full API request
The extra_vars object from the previous section is only one piece of the puzzle. The following is an example of a full API request to LSO.
{
    "playbook_name": "deploy_a_service.yaml",
    "callback": "https://orchestrator.gap.geant.org/api/processes/(…)/callback/(…)",
    "inventory": {
        "all": {
            "hosts": {
                "edge1-host": {
                    "example-var": "A value",
                    "another-var": "Totally optional, and can differ per host"
                },
                "edge2-host": null  // Note that the `null` is a mandatory YAML-restriction
            }
        }
    },
    "extra_vars": {
        …as shown above
    }
}
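For readers who want to send such a request from Python, the sketch below builds the request body and wraps it in a POST request using only the standard library. It assumes that extra_vars sits beside inventory at the top level of the body. The endpoint URL in `LSO_PLAYBOOK_URL` and all function names are hypothetical; the real URL depends on how LSO is deployed.

```python
import json
import urllib.request
from typing import Any

# Hypothetical endpoint, used for illustration only.
LSO_PLAYBOOK_URL = "http://localhost:8000/api/playbook"


def build_payload(
    playbook_name: str,
    callback_url: str,
    inventory: dict[str, Any],
    extra_vars: dict[str, Any],
) -> dict[str, Any]:
    """Assemble the JSON body of a playbook execution request."""
    return {
        "playbook_name": playbook_name,
        "callback": callback_url,
        "inventory": inventory,
        "extra_vars": extra_vars,
    }


def build_request(
    payload: dict[str, Any], url: str = LSO_PLAYBOOK_URL
) -> urllib.request.Request:
    """Wrap the payload in a JSON POST request without sending it."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status
```

Python's `None` is serialised as JSON `null`, so a host without host_vars can be written as `"edge2-host": None` in the inventory dictionary.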
Code documentation
Code documentation for LSO can be found here.
Deployment within GÉANT
For the deployment in GÉANT, LSO runs inside a Docker container. The Dockerfile used to build this container is available here.
When the Docker image is built, the Ansible roles and collections required for interacting with Juniper and Nokia equipment are installed. Other organisations that want to use LSO in their own deployment are strongly recommended to use this Dockerfile as a starting point. From it, another Docker image can be built with custom Ansible requirements pre-installed.
A custom image also makes it possible to include an Ansible inventory, if so desired. Note, however, that this requires either rebuilding the LSO image every time the inventory is updated, or mounting the inventory as a volume inside the running container. Including a dynamic inventory with every API call is therefore the recommended approach.