
Docker SWARM on Photon Controller

Continuing on my journey of getting familiar with all things “Photon Controller” related, I wanted to take you through the process, step-by-step, of getting Docker SWARM running on top of Photon Controller. Now, my good pal William Lam has already described the process in a lot of detail over on his virtually ghetto blog. I thought I might try to expand on that a bit more, and highlight where things might go wrong (if you are a newbie like me to this stuff). I also wanted to do everything from the Photon CLI, rather than going through the UI for any of the steps.

*** Please note that at the time of writing, Photon Controller is still not GA ***

*** The steps highlighted here may change in the GA version of the product ***

My Setup

1. Initial Configuration – deploy Photon Controller “Installer” OVA

The first step is to deploy the Photon Controller “installer” OVA. You can get v0.8 from this link.  This appliance is the mechanism by which you can deploy Photon Controller.

I also have Photon CLI deployed on my desktop, which I will use for creating the Photon Controller, and then deploying a Docker SWARM cluster on top. You can get the CLI tools from the same link as Photon Controller v0.8.
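
Once the CLI is on your desktop, it is worth a quick sanity check before going any further. Running photon with no arguments (or with --help) should print the list of available sub-commands; the exact output varies by CLI build, so treat this as a sketch:

C:\Users\chogan>photon --help
(sub-commands such as target, system, tenant, resource-ticket, project, image, deployment and cluster should be listed)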

2. Deploying Photon Controller via Photon CLI

Once the “installer” is deployed (its full name is ESXCloud Installer) and is up and running, you can roll out Photon Controller in a few ways: via the UI, or via the Photon CLI tools. I will use the Photon CLI tools in this example. First, note the IP address of the installer appliance – you will need it shortly. In my setup, I have 4 hosts. I am going to use one of these as my management (MGMT) host, and the other 3 as CLOUD hosts for deploying my containers. I created a YAML file which contains all the relevant information about my hosts (although I have put xxxxxx where the actual passwords should be below). Note that the Photon Controller management VM needs a static IP – here it is set to 10.27.51.117. The netmask, DNS and gateway all pertain to the management VM, and not the ESXi host.

hosts:
  - metadata:
      MANAGEMENT_DATASTORE: esxi-hp-05-local
      MANAGEMENT_PORTGROUP: VM Network
      MANAGEMENT_NETWORK_NETMASK: 255.255.255.0
      MANAGEMENT_NETWORK_DNS_SERVER: 10.27.51.252
      MANAGEMENT_NETWORK_GATEWAY: 10.27.51.254
      MANAGEMENT_VM_IPS: 10.27.51.117
    address_ranges: 10.27.51.5
    username: root
    password: xxxxxxxxxx
    usage_tags:
      - MGMT
  - address_ranges: 10.27.51.8,10.27.51.7,10.27.51.6
    username: root
    password: xxxxxxxxx
    usage_tags:
      - CLOUD
deployment:
  resume_system: true
  image_datastores: isilon-nfs-01
  auth_enabled: false
  syslog_endpoint: 
  ntp_endpoint: 10.27.51.252
  use_image_datastore_for_vms: false
  loadbalancer_enabled: true

All CLOUD hosts have access to the NFS share, isilon-nfs-01, where images are to be stored. The first step is to point the target at the Photon Controller installer using the Photon CLI. In this example, the installer IP is 10.27.51.33.

C:\Users\chogan>photon target set http://10.27.51.33
Using target 'http://10.27.51.33'
API target set to 'http://10.27.51.33'
C:\Users\chogan>

The next step is to run a Photon CLI command to create my deployment. That command is “photon system deploy”, as shown below, and I’ve included the different steps that the command goes through:

C:\Users\chogan>photon system deploy Downloads\my_config.yaml
Using target 'http://10.27.51.33'
Created deployment c85aef0d-f79b-4271-b706-da987117ca9c

0h 0m 6s [= ] CREATE_HOST : CREATE_HOST | Step 1/1

0h 0m 2s [==     ] PERFORM_DEPLOYMENT : PROVISION_CONTROL_PLANE_HOSTS | Step 2/6

0h12m42s [===    ] PERFORM_DEPLOYMENT : PROVISION_CONTROL_PLANE_VMS | Step 3/6

0h31m44s [====   ] PERFORM_DEPLOYMENT : PROVISION_CLOUD_HOSTS | Step 4/6

0h41m24s [=====  ] PERFORM_DEPLOYMENT : PROVISION_CLUSTER_MANAGER | Step 5/6

0h42m44s [====== ] PERFORM_DEPLOYMENT : MIGRATE_DEPLOYMENT_DATA | Step 6/6

Deployment 'c85aef0d-f79b-4271-b706-da987117ca9c' is complete.

Once the deployment succeeds, we can set our target to the Photon Controller rather than the installer. Since we enabled the load balancer option in the YAML file, we include the load-balancer port, 28080, in the target:

C:\Users\chogan>photon target set http://10.27.51.117:28080
Using target 'http://10.27.51.117:28080'
API target set to 'http://10.27.51.117:28080'

The load-balancer port is important or you might run into the issue described here when trying to upload images.
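
As a quick sanity check that the CLI is now talking to the Photon Controller control plane rather than the installer, you can list the deployments against the new target; we will need the deployment ID again in step 5 anyway:

C:\Users\chogan>photon deployment list
(this should return the same deployment ID reported earlier by "photon system deploy")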

3. Create tenant, project and resources

We are now at the point where we can begin to consume some of the resources on the CLOUD hosts for our particular cluster deployment. To do this we need to create a tenant, a resource ticket, and a project. First we will create the tenant, and set it.

C:\Users\chogan>photon tenant create Cormac
Comma-separated security group names, or hit enter for no security groups):
Using target 'http://10.27.51.117:28080'
Created tenant 'Cormac' ID: d3ca1f91-5a54-4128-9af9-bf4512570dde
C:\Users\chogan>photon tenant set Cormac
Using target 'http://10.27.51.117:28080'
Tenant set to 'Cormac'

The next step is to create a resource ticket. This resource ticket sets limits of 100 VMs and 16GB of memory. We will call it gold:

C:\Users\chogan>photon resource-ticket create
Using target 'http://10.27.51.117:28080'
Resource ticket name: gold

Limit 1 (ENTER to finish)
Key: VM
Value: 100
Unit: COUNT

Limit 2 (ENTER to finish)
Key: VM.Memory
Value: 16
Unit: GB

Limit 3 (ENTER to finish)
Key: <hit ENTER>

Tenant name: Cormac
Creating resource ticket name: gold

Please make sure limits below are correct:
1: VM, 100, COUNT
2: VM.Memory, 16, GB
Are you sure [y/n]? y
Resource ticket created: ID = 1297060f-1f3f-44c6-b64a-9d082624810f
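
If you would rather avoid the interactive prompts, the CLI can also take the ticket name and limits as command line flags. I have not verified the exact flag syntax against the v0.8 build, so treat the following as a hedged sketch and check the built-in help first:

C:\Users\chogan>photon resource-ticket create --help
C:\Users\chogan>photon resource-ticket create --name gold --limits "VM 100 COUNT, VM.Memory 16 GB"
(flag names and limit syntax assumed – adjust to whatever the help output shows)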

And then finally we will create a project to consume some of those resources in the resource ticket. In fact, this project will claim all of the resources in the resource ticket, but of course you could have multiple projects consuming resources from the same ticket. We will also set our project to this newly created project, called SWARM:

C:\Users\chogan>photon project create -r gold
Using target 'http://10.27.51.117:28080'
Project name: SWARM

Limit 1 (ENTER to finish)
Key: VM
Value: 100
Unit: COUNT

Limit 2 (ENTER to finish)
Key: VM.Memory
Value: 16
Unit: GB

Limit 3 (ENTER to finish)
Key: <hit ENTER>

Tenant name: Cormac
Resource ticket name: gold
Creating project name: SWARM

Please make sure limits below are correct:
1: VM, 100, COUNT
2: VM.Memory, 16, GB
Are you sure [y/n]? y
Project created: ID = f9807ab2-186b-4564-ae23-4acb5a09dbb4
C:\Users\chogan>photon project set SWARM
Using target 'http://10.27.51.117:28080'
Project set to 'SWARM'
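
The same applies to project creation: the prompts can likely be skipped by passing the name and limits on the command line. Again, a hedged sketch – the -r flag for the resource ticket is confirmed above, the rest is assumed:

C:\Users\chogan>photon project create -r gold --name SWARM --limits "VM 100 COUNT, VM.Memory 16 GB"
(verify --name and --limits with "photon project create --help")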

Excellent, we can now move on to building our SWARM cluster.

4. Create a SWARM image

Before building our cluster, we first need an image that can be consumed by the nodes deployed in the cluster. We are deploying SWARM, so we need a Docker SWARM image. At present, only the photon management image is available:

C:\Users\chogan>photon image list
Using target 'http://10.27.51.117:28080'
ID                                    Name                             \
State  Size(Byte)   Replication_type  ReplicationProgress  SeedingProgress
465ee30f-7b53-41b3-93a9-3e4ab4b7355c  photon-management-vm-disk1.vmdk  \
READY  85899345972  ON_DEMAND         20.0%                100.0%
Total: 1

The next step is to create a Docker SWARM image using the one provided on github here. Just download it to your desktop, and create it as follows (assuming it is in the Downloads folder):

C:\Users\chogan>photon image create Downloads\photon-swarm-vm-disk1.vmdk \
-n swarm-vm.vmdk
Image replication type (default: EAGER):
Using target 'http://10.27.51.117:28080'
Created image 'swarm-vm.vmdk' ID: 83dbd815-3638-40cd-a553-dac48efcfe8f

Now there is a second image available:

C:\Users\chogan>photon image list
Using target 'http://10.27.51.117:28080'
ID                                    Name                             \
State  Size(Byte)   Replication_type  ReplicationProgress  SeedingProgress
465ee30f-7b53-41b3-93a9-3e4ab4b7355c  photon-management-vm-disk1.vmdk  \
READY  85899345972  ON_DEMAND         20.0%                100.0%
83dbd815-3638-40cd-a553-dac48efcfe8f  swarm-vm.vmdk                    \
READY  85899345968  EAGER             20.0%                100.0%
Total: 2

Caution: There are two things to highlight here. The first is the image create command. Note that there is no command line option to provide the location of the image; you simply put the path in, which in my case was the Downloads folder. There is also a -i option (which I did not include) that specifies the replication type of the image, e.g. -i EAGER or -i ON_DEMAND, which replicates the image up-front or on demand, respectively.
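
In other words, had I wanted the on-demand behaviour rather than being prompted for (or defaulting to) EAGER, the create command would have looked something like this:

C:\Users\chogan>photon image create Downloads\photon-swarm-vm-disk1.vmdk -n swarm-vm.vmdk -i ON_DEMAND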

The second issue here is that the image now needs to be transferred to the image datastore. This can take some time, and it is probably worth waiting for these transfers to complete before trying to build the cluster. This is because the cluster build process may time out if the images are not yet available. I use the HTML5 Host Client to verify the progress of the transfer.

I’ve requested that we be able to track this progress from the Photon CLI.

5. Allow deployment to support Docker SWARM cluster

For this part of the process, you need both the ID of the SWARM image and the ID of the deployment. The image ID is above. The deployment ID can be captured as follows:

C:\Users\chogan>photon deployment list
Using target 'http://10.27.51.117:28080'
ID
c85aef0d-f79b-4271-b706-da987117ca9c
Total: 1

The command that enables the deployment to support SWARM clusters with a particular image is as follows:

C:\Users\chogan>photon deployment enable-cluster-type \
c85aef0d-f79b-4271-b706-da987117ca9c -k SWARM \
-i 83dbd815-3638-40cd-a553-dac48efcfe8f
Are you sure [y/n]? y
Using target 'http://10.27.51.117:28080'
Cluster Type: SWARM
Image ID:     83dbd815-3638-40cd-a553-dac48efcfe8f
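
If you want to double-check that the cluster type has been registered against the deployment, the deployment details should reflect it. I believe the command for this is "photon deployment show", but I have not confirmed it on this build, so treat it as an assumption:

C:\Users\chogan>photon deployment show c85aef0d-f79b-4271-b706-da987117ca9c
(assumed command – the enabled cluster types and their image IDs should appear in the output)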

We are now ready to deploy the SWARM cluster.

6. Create a Docker SWARM cluster

As I mentioned at the beginning of the post, I have a pretty simple setup with only a single VM network. If you are deploying in an environment that has multiple VM networks, you will have to select the appropriate one by following the guidance in this article.

However, since we only have a single VM network, we do not need to worry about this. Let’s run the cluster create command. This is where the other static IP is needed, for the first etcd container, which is used for discovering the SWARM cluster nodes (master, slaves). We’re also going with the simplest setup, where there is a single master and a single slave in our SWARM cluster.

C:\Users\chogan>photon cluster create -n SWARM -k SWARM \
--dns 10.27.51.252 --gateway 10.27.51.254 --netmask 255.255.255.0 \
 --etcd1 10.27.51.138
Using target 'http://10.27.51.117:28080'
Slave count: 1
etcd server 2 static IP address (leave blank for none):
Creating cluster: SWARM (SWARM)
  Slave count: 1

Are you sure [y/n]? y
Cluster created: ID = 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
Note: the cluster has been created with minimal resources. You can use the cluster now.
A background task is running to gradually expand the cluster to its target capacity.
You can run 'cluster show 0b8b8639-5335-4b97-b8ab-5f90a6155cf3' to see the state of the cluster.

7. Verifying the state of the cluster

Excellent. The SWARM cluster has been created. Let’s check a few things using these useful Photon CLI commands:


C:\Users\chogan>photon cluster list
Using target 'http://10.27.51.117:28080'
ID                                    Name   Type   State  Slave Count
0b8b8639-5335-4b97-b8ab-5f90a6155cf3  SWARM  SWARM  READY  1
Total: 1
READY: 1
C:\Users\chogan>photon cluster list_vms 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
Using target 'http://10.27.51.117:28080'
ID                                    Name                                         State
0010d6bc-84f0-4183-a77f-990399b96460  slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4   STARTED
09029e50-d348-40b7-a33a-5ddf82e1c2e6  etcd-f1c423a2-4852-4f42-94ea-b30259da9800    STARTED
44c18c9b-e070-439d-ac25-a69c7be6474e  master-0362f6f7-0905-44dd-8382-d37be65df0c4  STARTED
Total: 3
STARTED: 3
C:\Users\chogan>photon cluster show 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
Using target 'http://10.27.51.117:28080'
Cluster ID:             0b8b8639-5335-4b97-b8ab-5f90a6155cf3
  Name:                 SWARM
  State:                READY
  Type:                 SWARM
  Slave count:          1
  Extended Properties:  map[netmask:255.255.255.0 dns:10.27.51.252 \
etcd_ips:10.27.51.138 gateway:10.27.51.254]

VM ID                                 VM Name                                      \
VM IP
09029e50-d348-40b7-a33a-5ddf82e1c2e6  etcd-f1c423a2-4852-4f42-94ea-b30259da9800    \
10.27.51.138
44c18c9b-e070-439d-ac25-a69c7be6474e  master-0362f6f7-0905-44dd-8382-d37be65df0c4  \
10.27.51.39

C:\Users\chogan>

And this final output shows us the IP address of the master node, which is what we will need to run some docker commands.

Caution: I had some issues deploying SWARM whereby the cluster create failed with “VmProvisionTaskService failed with error VM failed to acquire an IP address”. Sometimes this happened when only the etcd VM had been deployed, sometimes after etcd and master, and other times after etcd, master and slave. My gut feeling is that etcd discovery was not working correctly. As part of the troubleshooting effort, I switched to another VLAN, and my SWARM cluster deployed without incident. I am not sure what the root cause is, but I continue to investigate.

If this happens, delete the cluster, and retry the create command. If that does not work, try an alternative network for the cluster.
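
For completeness, deleting the failed cluster and retrying looked something like this in my case. The delete sub-command is the obvious counterpart to create, but confirm with the CLI help if your build differs:

C:\Users\chogan>photon cluster delete 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
C:\Users\chogan>photon cluster create -n SWARM -k SWARM --dns 10.27.51.252 \
--gateway 10.27.51.254 --netmask 255.255.255.0 --etcd1 10.27.51.138
(the cluster ID is whatever "photon cluster list" reports for the failed cluster)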

8. Working with Docker SWARM

Now we need to log onto a host that is running Docker. The easiest way to do this is to log onto the Photon Controller “installer” which already has docker installed.

Once logged in, point the DOCKER_HOST variable at the SWARM master IP address and port 8333. Then you can start to run some docker commands, and “docker ps -a” should show you the nodes that make up the SWARM cluster. Ignore the fact that some of the IDs in the outputs may not match up with what was shown earlier – I went through the exercise a number of times, which is the reason for that.

esxcloud [ ~ ]$ export DOCKER_HOST=tcp://10.27.51.39:8333
esxcloud [ ~ ]$ docker info
Containers: 3
Images: 4
Role: primary
Strategy: spread
Filters: affinity, health, constraint, port, dependency
Nodes: 2
 master-0362f6f7-0905-44dd-8382-d37be65df0c4: 10.27.51.39:2375
  └ Containers: 2
  └ Reserved CPUs: 0 / 4
  └ Reserved Memory: 0 B / 8.187 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=4.0.9, \
operatingsystem=VMware Photon/Linux, storagedriver=overlay
 slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4: 10.27.51.40:2375
  └ Containers: 1
  └ Reserved CPUs: 0 / 1
  └ Reserved Memory: 0 B / 4.053 GiB
  └ Labels: executiondriver=native-0.2, kernelversion=4.0.9, \
operatingsystem=VMware Photon/Linux, storagedriver=overlay
CPUs: 5
Total Memory: 12.24 GiB
Name: 47786077a0a0
esxcloud [ ~ ]$ docker version
Client:
 Version:      1.8.1
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   d12ea79
 Built:        Thu Aug 13 02:49:29 UTC 2015
 OS/Arch:      linux/amd64

Server:
 Version:      swarm/0.4.0
 API version:  1.16
 Go version:   go1.4.2
 Git commit:   d647d82
 Built:
 OS/Arch:      linux/amd64

esxcloud [ ~ ]$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED             \
STATUS              PORTS                       NAMES
85577bf5fa55        swarm:0.4.0         "/swarm join --addr=1"   5 minutes ago       \
Up 5 minutes        2375/tcp                    slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/elegant_nobel
47786077a0a0        swarm:0.4.0         "/swarm manage etcd:/"   6 minutes ago       \
Up 6 minutes        10.27.51.39:8333->2375/tcp   master-0362f6f7-0905-44dd-8382-d37be65df0c4/goofy_wozniak
03db84efe339        swarm:0.4.0         "/swarm join --addr=1"   6 minutes ago       \
Up 6 minutes        2375/tcp                     master-0362f6f7-0905-44dd-8382-d37be65df0c4/trusting_newton
esxcloud [ ~ ]$
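
As an aside, if you prefer not to export DOCKER_HOST, the standard -H flag on the docker client achieves the same thing on a per-command basis:

esxcloud [ ~ ]$ docker -H tcp://10.27.51.39:8333 ps -a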

Now let’s see what happens when I run a simple “hello-world” container a few times. These containers should be balanced across the nodes in the SWARM cluster:

esxcloud [ ~ ]$ docker run hello-world

Hello from Docker.
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.
To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash
Share images, automate workflows, and more with a free Docker Hub account:
 https://hub.docker.com
For more examples and ideas, visit:
 https://docs.docker.com/engine/userguide/

Now I run it a few more times, and then take a look at my containers:

esxcloud [ ~ ]$ docker ps -a
CONTAINER ID        IMAGE               COMMAND                  CREATED               \
   STATUS                      PORTS                        NAMES
214c689f5d93        hello-world         "/hello"                 Less than a second ago\
   Exited (0) 2 seconds ago                                 master-0362f6f7-0905-44dd-8382-d37be65df0c4/focused_mestorf
03dc237d3c5d        hello-world         "/hello"                 19 seconds ago        \
   Exited (0) 20 seconds ago                                slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/ecstatic_mestorf
4a9a914b4794        hello-world         "/hello"                 29 seconds ago        \
   Exited (0) 30 seconds ago                                slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/prickly_bhabha
85577bf5fa55        swarm:0.4.0         "/swarm join --addr=1"   6 minutes ago         \
   Up 6 minutes                2375/tcp                     slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4/elegant_nobel
47786077a0a0        swarm:0.4.0         "/swarm manage etcd:/"   7 minutes ago         \
   Up 7 minutes                10.27.51.39:8333->2375/tcp   master-0362f6f7-0905-44dd-8382-d37be65df0c4/goofy_wozniak
03db84efe339        swarm:0.4.0         "/swarm join --addr=1"   7 minutes ago         \
   Up 7 minutes                2375/tcp                     master-0362f6f7-0905-44dd-8382-d37be65df0c4/trusting_newton

esxcloud [ ~ ]$

And if we look closely, we can see that some of the “hello-world” containers ran on the master, and others ran on the slave.
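
Since standalone SWARM supports the constraint filter (listed under “Filters” in the docker info output above), you can also pin a container to a specific node by passing a constraint as an environment variable. A hedged example, using the slave node name from the output above:

esxcloud [ ~ ]$ docker run -e constraint:node==slave-a34d23ca-45aa-4aec-8a64-f93f85b3d2e4 hello-world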

9. Scale out the SWARM cluster

We can also scale out the number of slave nodes in the SWARM cluster using the Photon CLI tools. Here is how:

C:\Users\chogan>photon cluster resize 0b8b8639-5335-4b97-b8ab-5f90a6155cf3 3
Using target 'http://10.27.51.117:28080'

Resizing cluster 0b8b8639-5335-4b97-b8ab-5f90a6155cf3 to slave count 3
Are you sure [y/n]? y
RESIZE_CLUSTER completed for '' entity
Note: A background task is running to gradually resize the cluster \
to its target capacity.
You may continue to use the cluster. You can run 'cluster show '
to see the state of the cluster. If the resize operation is still \
in progress, the cluster state
will show as RESIZING. Once the cluster is resized, the cluster \
state will show as READY.

Let’s examine the state of the cluster once more:

C:\Users\chogan>photon cluster show 0b8b8639-5335-4b97-b8ab-5f90a6155cf3
Using target 'http://10.27.51.117:28080'
Cluster ID:           0b8b8639-5335-4b97-b8ab-5f90a6155cf3
  Name:                 SWARM
  State:                RESIZING
  Type:                 SWARM
  Slave count:          3
  Extended Properties:  map[netmask:255.255.255.0 dns:10.27.51.252 \
etcd_ips:10.27.51.118 gateway:10.27.51.254]

VM ID                                 VM Name                                      VM IP
40255f5c-0d63-4ad9-a6ac-4c95f7f583e3  etcd-b3e288c8-0151-49a1-9cc4-1754f2b62060    10.27.51.118
f1cbe821-f9bf-4fa5-93e6-3b55a5f52c91  master-184431d2-d4ff-4f21-a16d-66190ac95021  10.27.51.48

And let’s look at the new slave nodes:

C:\Users\chogan>photon cluster list_vms 7e14ce09-f6ab-4ece-bb73-ca2c96459495
Using target 'http://10.27.51.117:28080'
ID                                    Name                                         State
40255f5c-0d63-4ad9-a6ac-4c95f7f583e3  etcd-b3e288c8-0151-49a1-9cc4-1754f2b62060    STARTED
40fc5084-6e57-44b3-97c5-ac6a37fd0583  slave-4aeb2353-0720-417f-ba52-7603d1f781ad   STARTED
7dbcfd3a-5919-4912-8c8a-7d4b2478a017  slave-39b26ca6-4752-4d27-b481-32544e35d254   STARTED
edd38a15-a588-423a-809b-d03c0256a4dc  slave-5fdc2fe3-64a7-4790-b2a1-841e04dab7e8   STARTED
f1cbe821-f9bf-4fa5-93e6-3b55a5f52c91  master-184431d2-d4ff-4f21-a16d-66190ac95021  STARTED
Total: 5
STARTED: 5

C:\Users\chogan>

And if you now run the docker commands we ran previously, you should see the additional slave nodes in the SWARM cluster, and additional containers should be balanced across the master and the new slaves.
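
For example, re-running docker info against the master should now report the extra nodes. This is just a sketch of what to look for – node names, counts and resources will obviously differ:

esxcloud [ ~ ]$ docker info
(check the Nodes: line, which should now show the master plus the three slaves)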
