14 May

Manage the Docker Swarm nodes – new Ansible 2.8 modules developed by me!

Ansible 2.8 introduces a huge update for Docker Swarm modules

There are several good things about open source projects, but one of them makes the good projects grow quickly: if you miss a feature and you have some programming skills, you can write the extension yourself. Then you can propose the change to get it merged into the official repository. This is how my journey as a community Ansible developer started.

Lately, I have worked more and more with Docker Swarm clusters and wanted to automate some tasks. Unfortunately, Ansible 2.7 had only three modules available: docker_swarm, docker_swarm_service and docker_secret. The tasks I wanted to automate required features such as reading the Swarm configuration (this was partially covered by docker_swarm), reading the node status and changing the node configuration. I had three options: use the shell module and execute the CLI command on the remote host, wait for someone to provide the module, or write the module myself.

I may not be a professional developer, but I had a great teacher when I started coding in the last class of primary school, and I have some experience in programming. You can’t really be an engineer if you cannot write scripts or simple apps. I think it is no surprise that I decided to test myself in a professional project 😛

In the previous post, I presented the docker_swarm_facts module, of which I am a co-author. Let me now show you the Swarm node related modules that I fully developed and now maintain. All of them will be available in the Ansible 2.8 release in the next few weeks, but you can test them now using the devel branch code on GitHub!

Tell me about my Docker Swarm node…

Before you start changing the configuration of any device or software, you really should get and store some information about its current state. To read the essentials about a Docker Swarm node, use the docker_node_facts module. The module inherits all the default Docker module options, like the TLS configuration. You can execute it locally using the Docker socket or remotely using the Docker API. Remember that by default Docker listens only on a local socket, so you need to reconfigure your daemon to support remote management. And one of the most important requirements: you need to run the module against a Docker Swarm manager node.

    - name: Get the swarm node facts
      docker_node_facts:
        self: yes
        docker_host: "tcp://172.10.10.10:2376"
        tls: true

By default, Ansible opens an SSH connection to the remote host and executes the module there, connecting to the local Docker socket. The host is usually an inventory entry. Using the API requires providing the URL of the management interface as the docker_host parameter. You can still execute the module on a remote host (the same one as in docker_host or a different one), but in such a case the playbook is usually set with the parameter connection: local and run on the localhost. Understanding this difference is crucial for running any Swarm module and avoiding problems.
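The two execution modes described above can be sketched as two plays. This is only an illustration: the inventory group name swarm_managers is an assumption, and the docker_host URL is the one from the earlier example.

```yaml
# Mode 1: Ansible connects over SSH and the module uses the local Docker socket.
- hosts: swarm_managers        # hypothetical inventory group
  tasks:
    - name: Get the swarm node facts over the local socket
      docker_node_facts:
        self: yes

# Mode 2: the module runs on the control machine and talks to the remote Docker API.
- hosts: localhost
  connection: local
  tasks:
    - name: Get the swarm node facts over the Docker API
      docker_node_facts:
        self: yes
        docker_host: "tcp://172.10.10.10:2376"
        tls: true
```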

You also need to tell the module which Swarm nodes you want the manager to return information about. The default behavior is to return facts about all Swarm nodes. If you know the names of the nodes you are interested in, you can provide them as a list in the name parameter. Setting the self: yes option tells the module that you want facts only about the node the module communicates with.

The module output matches the output of the docker node inspect CLI command. The nodes key in the returned structure contains an array of dictionaries, where each element matches the CLI command output for one node. Its structure may vary depending on the version of the Docker daemon and the Docker API, so you need to check the documentation for details.
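A short sketch of selecting nodes by name and reading the result. The node names and the node_facts variable are made up for illustration; only the name parameter and the nodes key come from the description above.

```yaml
- name: Get facts about selected swarm nodes
  docker_node_facts:
    name:
      - mynode          # hypothetical node names
      - othernode
  register: node_facts  # hypothetical variable name

- name: Show the returned list of node dictionaries
  debug:
    var: node_facts.nodes
```

Since the inner structure depends on the Docker version, it is safer to print the whole list first and only then pick the keys you need.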

Changing the node configuration

The docker_node module allows changing some parameters of the Swarm node configuration, like the node role or the availability. The role defines whether the node is a manager or a worker. You define it initially when you add a node to the cluster. The availability defines how the swarm cluster will use the node for work distribution. The three allowed states are active (accepting new containers), pause (operating the existing containers but not accepting new ones) and drain (not accepting new containers, and the existing ones will restart on another node). The last state is useful during upgrades or when you want to remove a node from the cluster.
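As a minimal sketch, the tasks below change the availability and the role of a node. The node name mynode is a placeholder, and the values shown are the ones listed above.

```yaml
- name: Drain the node before an upgrade
  docker_node:
    hostname: mynode
    availability: drain

- name: Promote the node to a manager
  docker_node:
    hostname: mynode
    role: manager
```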

The docker_node module also allows changing the labels assigned to the Swarm node. Labels are key-value pairs that are useful for management and automated work distribution. The default module behavior is to merge the labels provided in the playbook with those already assigned to the node, updating the value of a label if it is already defined.

    - name: Merge node labels and new labels
      docker_node:
        hostname: mynode
        labels:
          key: value

To override the default behavior and replace the labels with new ones, you must set labels_state: replace. To remove all assigned labels you also have to use this option, but without providing any new labels.

    - name: Remove all labels assigned to node
      docker_node:
        hostname: mynode
        labels_state: replace

If you want to remove only specific labels, you need to provide the list of label keys in the labels_to_remove parameter.
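For example, a task removing a single label could look like this, reusing the placeholder node name and label key from the tasks above:

```yaml
- name: Remove selected labels from the node
  docker_node:
    hostname: mynode
    labels_to_remove:
      - key
```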

11 Mar

Using Docker Swarm? You gonna love Ansible 2.8!

Ansible 2.8 introduces a huge update for Docker Swarm modules

Ansible is something of an icon among automation platforms. It is owned by Red Hat but available as a free product under a public license, and it is developed both by Red Hat employees and by the community. Docker itself is an icon of containerization. If you use containers, you know that automation is the key to simplifying the management of dockerized infrastructure. In Ansible 2.x many modules covering Docker operations have been introduced, but Docker Swarm was not really covered. However, this is going to change soon! There is a huge update coming with the Ansible 2.8 release!

Lately, I started missing some features in Ansible that would allow me to perform some operations on Docker Swarm clusters. I try to avoid using the command or shell modules as much as possible, because running CLI commands on a remote host is asking for trouble. I decided to fill this gap myself, and as a result I can say I am now an Ansible community developer, the author and maintainer of a few modules (docker_swarm_facts, docker_node_facts, docker_host_facts, docker_node) and a co-author of docker_swarm and the ansible.docker.swarm library.

In the next posts, I wanna show you how those modules work. However, if you are looking forward to using them, I strongly advise you to give them a try now and report any bugs you find so we can fix them before the Ansible 2.8 release. You can find them in the devel branch of the Ansible repository.

31 Jan

DHCP domain-search on Juniper SRX

DHCP Option 119

In the last few years, the DHCP service in enterprises has become more and more integrated into complex management and provisioning systems, or has become a part of Active Directory solutions. Routers, switches, wireless controllers and other devices in such networks act as a DHCP proxy instead of a DHCP server, just forwarding the requests to the central server. But the feature itself is not dead! Sometimes you need to run it locally and provide the client device not only with the IP address and default gateway but also with DNS server information or domain-search parameters. The last one is DHCP Option 119, and I put it in the “tricky, non-obvious DHCP config” category, together with a bunch of other DHCP options.

The central point of my home and lab networks is the Juniper SRX220H. It is quite old, but its services and performance have fulfilled my requirements so far. One of the services it provides is a DHCP server for some of the VLANs. I was not really missing the domain-search feature until now, but I decided to add it to make my work more convenient. Its configuration is not straightforward, and there are many misleading guides about setting it up on different platforms or operating systems. Let me show you how it works on Juniper.

17 Jan

Accessing host socket when using namespaces for Docker isolation

Socat will connect two sockets together

Nothing is secure by default, and Docker is no exception. One of the recommended changes to improve Docker security is isolating the containers in a user namespace, a feature introduced in Docker Engine 1.10. Namespaces make a process running on the host think that it has its own private view of some global resources, like the PIDs. User namespaces provide a mechanism for remapping container resources to host resources, limiting the container's access to the host system.

In some cases you may need to access a host resource from the container, like Docker's own socket. This is required if you run Docker inside a Docker container, or if you deploy a tool that will manage your Docker hosts or Docker Swarm cluster. The problem is that the socket on the host is owned by root, while the root UID inside your container is remapped to a non-root UID on the host. This makes the host socket inaccessible to the processes inside the container. But there is an easy workaround for this problem.

04 Jan

Observium pollers vertical scaling

Observium in containers

There are several free monitoring tools available for commercial or non-commercial use. I decided to use Observium to monitor my home and remote labs. I had no specific requirements except that it had to be able to run in Docker containers on a Raspberry Pi (ARMv6/8 CPUs). The downside of all monitoring solutions is their resource requirements. There are three resources that matter: storage, memory, and CPU. Vertical scaling of Observium pollers and other processes is one of the solutions if your hardware resources are limited.

Vertical scaling here means adding more workers that run the same process in parallel, while the tasks assigned to each process should not overlap. In my case, I wanted to spread the polling across my Raspberry Pi cluster. The polling process can consume all the CPU and make a Raspberry Pi unresponsive. In the worst scenario, if you have many devices or you are polling lots of data from them, the polling processes may not finish their work within 5 minutes (this is how often devices have to be queried), so you will miss some data. You may of course tune the number of threads the polling process starts, but every platform has resource limits you cannot bypass.

In this article I will present my solution based on a Docker Swarm cluster. It has some limitations and downsides, but it may work for many people, not only on Raspberry Pi.

11 Dec

Upgrading the VMware Harbor

VMware Harbor is a Docker image registry. You can use it instead of the Docker registry from the official repository.

I recently decided to upgrade my local Docker registry installation. I use VMware Harbor as a Docker registry. In my opinion, it is much better and easier to use than the official registry software. I upgraded it from version 1.5.1 to 1.6.2.

VMware Harbor runs in containers, which simplifies managing the software, but the upgrade is not as straightforward as you may think. The most significant change is database consolidation. Instead of separate databases for Harbor, Clair and Notary, version 1.6.0 introduces a single database engine for all components: PostgreSQL. Harbor 1.5.0 uses MySQL, while Clair already uses PostgreSQL. The developers prepared a dedicated container with a migration engine that performs all the work. However, I found that the upgrade documentation is missing crucial explanations of steps and commands, which may lead to the loss of your data. I will try to cover my findings in this post.

26 Nov

Conditional parameter value in Ansible playbooks

Juniper automation with Ansible

An Ansible playbook is just a list of tasks executed one by one in the order you define them in the playbook code. Using conditional statements you may skip the execution of some tasks, but there is one general rule that should apply to all playbooks you create: keep the number of tasks to a minimum.

Execution of each task takes time, and unless you apply additional optimization, Ansible will establish the connection, perform the authentication process and then terminate the connection with the remote device for every task. The longer this process takes, the more time you waste when you execute the playbook. It is a good practice to consolidate the tasks: if you need to run multiple commands on a remote network device, you should define them as a parameter of one task instead of running them in separate tasks. In rare cases this may lead to unexpected problems, like the one I described in my post Automated scripts can send commands faster than RP can process.

Sometimes you need to vary the value of an option provided to the task module. If you need to get the output of a CLI command on a Juniper device, you will use the junos_command module, which is part of the standard Ansible library. Using the display parameter, you can specify whether the command output will be encoded in XML or JSON format. JSON is a more flexible format, but it is not a supported output format on older JunOS versions. If you request it and the firmware does not support it, your task and the whole playbook will fail. Most developers will create two separate tasks and use the when option on each to test the version. However, let me show you the other, not that well known, way.
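One possible way to do this in a single task is an inline Jinja2 conditional in the parameter value. This is only a sketch and not necessarily the approach the full post describes. The junos_supports_json variable is a hypothetical boolean you would set yourself, for example from the gathered firmware version, and the command is just an example.

```yaml
- name: Get the interface list in the best supported format
  junos_command:
    commands:
      - show interfaces terse
    # junos_supports_json is a hypothetical variable; when undefined, fall back to XML
    display: "{{ 'json' if (junos_supports_json | default(false)) else 'xml' }}"
  register: result
```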

14 Nov

Should vendors ask for a license to enable the API?

My friend let me use one of his older, unused servers in his data center, so I have a place to run some virtual machines for my automation projects. As you may expect, there is no SLA for this server, no redundancy, and limited storage. For me, it is still better to have a lab running 24/7 there than to use my desktop PC or pay for resources in the public cloud. The server is running the ESXi 6.0 hypervisor and has a free license installed.

The latest release of new VMware modules for Ansible was a trigger to develop a playbook that would let me back up a virtual machine from the datastore on the remote server to my home storage. A quite simple and straightforward idea led to some unexpected problems and questions. Where should the limit of features available in free licenses be? Should automation, or at least some automation tasks, be blocked in free licenses?

05 Nov

Ansible can’t read some facts from Juniper devices

Juniper automation with Ansible

It is really amazing how fast Ansible has been developing lately. Stable versions are released more often and contain more of the changes required by IT professionals. Many of them fill the gaps between two worlds: the developers and the operations engineers. Unfortunately, some modules are not catching up as fast as they should, which causes problems when developing even simple tasks. I ran into this when I was working on a playbook example for my latest articles for ‘IT Professional’ magazine. The default Ansible junos_facts module couldn’t correctly read the JunOS version on some devices, usually those running an older firmware release. This can be a real problem if the execution of some tasks depends on the firmware version on the router or switch.

Besides the official modules and the many roles available in the Ansible Galaxy repository, many vendors have developed their own modules and made them available for free. In many cases this should be considered a better, more secure approach, as long as the vendor repository is still maintained. In my situation it was the easiest workaround for my problem.

08 Oct

REST API in VMware Workstation 15

VMware Workstation

If you have a small home lab or use virtualization on your desktop PC or laptop, you must have heard of VMware Workstation, a hosted hypervisor that runs on the x64 versions of Windows or Linux. It is a really good product for all engineers and enthusiasts who do not have or don’t need dedicated server-class hardware for their work. You can even run the ESXi hypervisor as a VMware Workstation virtual machine. What you could not do is manage the configuration and virtual machines in a programmable way. You had to do everything manually via the GUI. Not anymore! The gap is filled by the REST API in the VMware Workstation 15 release that hit the market in late September.

The REST API features are limited to 20 operations including the most essential ones and match the features in VMware Fusion 10. This includes VM management, VM power management as well as host and guest virtual networking. Let’s take a quick look at how it works.
