Portal Stack

Thursday, January 22, 2015

CICD and Continuous Communication (CICDCC)

This is a presentation on test-driven infrastructure using continuous integration and delivery tools, with real-time communication through Slack.

madisonhub/mean-skel

Just like most CI/CD systems, I can push code and have the system build it. What's cool with Slack is that I get notifications, in real time, from GitLab and GitLab-CI.

The GitLab links take you directly to the project and commit diffs. Clicking on the GitLab-CI links lets you watch build output in real time, as well as build charts and graphs.

Since GitLab-CI shows the build is passing, it's likely the build will work in a production container. I'll create a container and add it to my salt-master environment.

# docker run -d --name mean-skel -h mean-skel.docker01 -t docker-minion

# docker inspect mean-skel | grep IPAddress
"IPAddress": "172.17.0.14",

Now that I know the IP address of the container, I need to add it to my servers pillar and tell nagios to reload.

/srv/pillar/servers.sls:

mean-skel:
  ip: 172.17.0.14
  node_type: container
  nagioscfg:
    use: linux-server
    parents: docker01

salt 'nagios*' state.sls nagios.running container

I am now monitoring it in real time.

By default, I monitor for the salt-minion process and users. Since the new container isn't in its final highstate, I am getting notifications.

Since I decided not to install the nrpe agent automatically, I am getting those errors. Notice that "CRITICAL" is highlighted in yellow. That is a custom filter I watch for specifically in the alert channel on Slack. If I am not actively in Slack, the Slack app on my phone notifies me of critical events in my infrastructure.

salt 'mean-skel*' state.sls nagios.nrpe container

Slack has notified me that services have come back. Taking a look at the Tactical Overview in nagios, I can see that everything is OK.

What's next? What isn't shown here is the salt-master interface and its hooks into Slack for the state of minions. That is next. Stay tuned.

Thursday, September 11, 2014

vagrant-gitlab

The vagrant-gitlab project is a self-serve GitLab install for your local machine. It uses a combination of open source technologies to deploy easily in seconds.

First, clone the project:

git clone https://github.com/MadisonHub/vagrant-gitlab.git

Then start the virtual machine:

vagrant up gitlab --provision

At this point, saltstack will be bootstrapped. Ignore any errors about salt-minion trying to start. We will call salt locally inside the virtual machine.

vagrant ssh gitlab
sudo salt-call -l debug --local state.highstate

Gitlab should be installed and running on http://localhost:8888. Use the username root and the password 5iveL!fe. You will be prompted to change your password.

You can now create groups and projects.

Each project has a wiki. You can even publish to the wiki using git.

Each project also has project management with milestones and issues.

For backup and rake tasks, head to the github repo and look at the readme.md.

https://github.com/MadisonHub/vagrant-gitlab

Saturday, August 2, 2014

Building my own CI/CD Platform

Building my own personal CI/CD Platform

One of the more interesting concepts I have been focusing on lately is continuous integration and development [and (re)deployment] of web based solutions and services. Combined with agile methodologies, social networking, and test driven development, it promotes healthy techniques and strategies to iterate successful solutions based on code for teams of developers and sysadmins alike.

In this post I will discuss my journey from Vagrant and SaltStack to my new adventure into Mesos. I'll briefly cast some shame on the service discovery movement, and then close with my progress and current project path researching Mesosphere and Software Defined Networking.

Agile Methodologies

A common theme in my posts and work ethic includes the concept of agile software development. It's a modern way to manage a technical project with many moving parts. Breaking down goals and milestones into iterative code changes and pushing them as merge requests is extremely self documenting and keeps others informed and engaged.

Agile Test Driven Infrastructure
Interestingly, this agile software development strategy can be applied to infrastructure as well. With Salt recipes, you can fully provision and manage infrastructure and network devices through a unified control interface. These recipes, which consist of yaml code, can be added to a git workflow. When recipes are changed, gitlab can hook into the commit and merge events and run test cases against the infrastructure with the new recipes. Since Salt recipes consist of states, they will either pass, change, or fail.
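
To make the pass, change, or fail idea concrete, here is a minimal sketch in JavaScript that tallies the outcome of a state run. It assumes the run was captured as JSON (for example with Salt's --out=json option) and that each state entry carries a result flag and a changes object. The sample data and the summarizeStates helper are hypothetical; they are not part of Salt or GitLab.

```javascript
// Hypothetical sample shaped like one minion's state return:
// each state reports a boolean `result` and a `changes` object.
const sampleRun = {
  "pkg_|-nginx_|-nginx_|-installed": { result: true, changes: {} },
  "file_|-site_conf_|-/etc/nginx/conf.d/site.conf_|-managed": {
    result: true,
    changes: { diff: "New file" },
  },
  "service_|-nginx_|-nginx_|-running": { result: false, changes: {} },
};

// Classify every state as passed (already correct), changed, or failed.
function summarizeStates(run) {
  const summary = { passed: 0, changed: 0, failed: 0 };
  for (const state of Object.values(run)) {
    if (state.result === false) {
      summary.failed += 1;
    } else if (Object.keys(state.changes || {}).length > 0) {
      summary.changed += 1;
    } else {
      summary.passed += 1;
    }
  }
  return summary;
}

const summary = summarizeStates(sampleRun);
// A CI job could fail the build whenever any state failed.
const buildOk = summary.failed === 0;
console.log(summary, buildOk ? "build ok" : "build failed");
```

A CI job could use a check like buildOk to decide whether a merge request with changed recipes is safe to accept.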

Since we can test the state of infrastructure against code changes, we can treat infrastructure like software and attack the project management of infrastructure with scrum and agile test driven development (but let's call it agile test driven infrastructure).

SaltStack standardizes the way you manage infrastructure. Because of this, it's possible to gauge the time effort necessary for completing a task. For scrum to be effective, tasks need to be scored on a time effort scale (1, 2, or 4 hours, for example) and assigned so that team members can work in parallel and fill up the total time effort allotment.

Social Networking

You might not think social networking plays a part here. Ever hear of github? Github drastically changed the way people code by making it cool and fun. Think of facebook, but for coders. The addition of the concept of merge/pull requests forces social interaction and code review. There is no better way to increase code quality than to have developers socialize and discuss the code in real-time.

Test Driven Development

I have been combining a number of open source projects to find a quick way to build an end-to-end platform for continuous integration and development (CI/CD). The use case was to embed CI/CD into my development work while also using linux namespaces to save on compute and memory resources, so that I could run it locally and also save on cost when running a "production" version with a provider such as AWS or DigitalOcean, or on-premise.

I want to push code, have it run tests, and show me the output in real time. When the tests pass, it should restart my web service with the new code. Sounds simple, right? Actually, when you use SaltStack, Docker and Open vSwitch, it is!

Using the following open source projects, I was able to create my developer environment not only locally, but also in the cloud at less than the anticipated cost: GitLab, SaltStack, Docker, Open vSwitch, Pipework, and Mesosphere.

The Journey

The journey began with a simple goal of building a development environment using Vagrant and have it run in much the same way a production system would. With my personal introduction to SaltStack, Software Defined Networking and Docker containers the journey turned into a twist of configuration management, cluster orchestration, resource management, containers, and software defined networking.

The goal was to create a super project centered on vagrant that was reusable and could be deployed in minutes. It had to be configured to mimic provider environments such as AWS and Digital Ocean as well as on-premise. It needed to be deployed, configured and run end-to-end locally as well as on any provider with the same effort and ease as issuing vagrant up.

End-to-end implies that the solution deploys, configures, and starts up a complete infrastructure to develop applications using continuous integration and development. This applies to any provider in the cloud or on-premise.

Enter Vagrant:

The Vagrantfile stores useful building steps for an environment and its application. It typically instructs vagrant to build a virtual machine similar to a production environment. My blog post on Vagrant + VirtualBox + Ubuntu for linux development details working with Vagrant for locally running a development branch of an application. Since then, I've put a focus on scaling containers with as little effort as possible. Scale in this scenario means having docker containers communicate with each other across n docker hosts where n is an arbitrary number greater than one.

Enter Docker:
When I first started using docker containers to consolidate servers, I ran into the issue of streamlining them because of the need to pipe and forward ports. This caused complexity in a number of services including riak clusters, percona clusters, and other cluster type services. There was no straightforward pre-baked solution to pipe all the necessary components together across docker hosts. Therefore it was difficult to create a completely containerized environment for certain services.

Enter Service Discovery:
With the introduction of scale to the overall design, containers needed a way to easily communicate with each other across docker hosts, especially for database clusters and web applications with multiple components. Many service discovery projects started popping up on github. Some examples include skydock, etcd, and basic Docker patterns such as ambassador linking or even just port forwarding.

The problem I have with this shift is that it's not solving the actual problem. A new problem is created because communication is now a crazy rat's nest of port forwarding and a reliance on yet another service, which could fail, to keep track of the information. Besides, how do you scale service discovery? Scaling service discovery, now that's a whole issue in itself.

My Path to Service Discovery:
I believe we already have a reliable service discovery technology stack. It exists in the networking layer! Combine mac addresses with DHCP to get IP addresses matched with DNS entries and your service discovery is an arp table, available to any device attached to the network. Now, instead of re-inventing service discovery, we can simply "grease the wheel" by adding logic to Software Defined Networking such as customizing network flow and flow rates.

I chose to take a different path conceptualizing service discovery by using Software Defined Networking. Just like virtual machines run inside physical machines, SDN lets you create networks inside networks. Pretty cool, huh? Open vSwitch, in particular, has support for tunnel bridges, meaning you can virtually plug servers into each other like one big network switch. Then, you add docker containers to this bridged network, and they all communicate over layer 2 networking. No port forwarding or central discovery service required. All instances (virtual and containerized) that have an interface on the virtual switch bridge can talk to each other.

Enter SaltStack:
SaltStack is a great project. As defined on its website, it's a "Fast, scalable and flexible systems management software for data center automation, cloud orchestration, server provisioning, configuration management and more". It's a configuration management system built on ZeroMQ and written in python. Recipes are written in yaml and the DSL is easy to pick up. It's extremely modular and supports many different aspects of linux systems. State recipes provide a way to daisy chain installation and configuration of dependencies such as packages, services, files, etc. for service components such as Docker and Open vSwitch.

I now use SaltStack with Vagrant to automatically enforce state recipes on a schedule. These run simple bash scripts to set up and maintain the health of an Open vSwitch network and its bridges, as well as dependencies for docker and its services. So, with salt I can deploy an end-to-end environment for CI/CD, but I still need to pre-determine which docker host my containers run on, keep an eye on resource utilization, and manually shuffle containers around if needed. Besides that, containers always use the same IP, so if I needed to re-deploy a component, it would be available once the ARP table was updated.

I started looking around at existing multi-host compute systems and realized that big data and compute clusters were exactly what I needed to magically place docker containers where they needed to go based on rules and resources.

Resource Management and Orchestration:
Where docker containers end up being run is pre-determined and written to salt recipes for the most effective stacking. This allows for iterative changes and additions when scale is needed. The only problem with this approach is the complexity of the switch interconnects defined in Open vSwitch. As the number of docker hosts increases, the number of interconnects grows much faster: if every host is linked to every other, n hosts need n(n-1)/2 links. Having a large number of interconnects is not the problem; writing them manually is.

This is currently where I am in my journey, and after looking around for a while, I think Mesos and its deimos plugin for docker containers fit the bill quite nicely. Mesos is a cluster technology that supports pretty much all existing cluster engines including Hadoop, Google's Kubernetes, and others.

What do I want to do now? I want to incorporate Open vSwitch and pipework in some way so that bridged networking can be deployed with containers kind of like how openstack works with virtual machines and their networks.

The Prototypes

Vagrant Prototype v1

I started with my first prototype inside Vagrant on my local workstation. I wasn't ready to throw anything up in the cloud because I wasn't sure how many resources I would need and thus how much it would cost me.

As things progressed and all the networking was configured properly, I began to notice that I could run all the services I needed for the full life cycle of an application with very few resources (as little as 4GB of RAM). This included nagios monitoring, the java based kibana logging stack (elasticsearch and logstash), GitLab, GitLab-CI, salt-master and the Open vSwitch network itself.

Network saturation was low. Overall cpu load was also low. The only thing that I had to keep an eye on was memory utilization. Even that was alleviated with a little bit of swap and proper planning.

I could clearly run this on Digital Ocean for around $50/mo no problem.

Digital Ocean Prototype v2

The first cloud prototype was launched on Digital Ocean and worked flawlessly. With the use of UFW and changing the SSH ports, the environment was, for the most part, secure and safe from outside trouble. All cluster networking was contained to just the private network which was physically located inside the data center. There was a limitation, though. The private networking did not span across data centers so there was no way to securely enable multi-zone high availability. Servers would go down bringing services down with them, but because of the way containers were spread out, impact was minimal and services were quick to restore.

AWS Prototype v3

The latest working prototype was launched on AWS after new EC2 pricing was announced. The pricing is actually cheaper than Digital Ocean, so it was a no brainer to move back over and take advantage of virtual private cloud (VPC) networks. VPC is Amazon's own implementation of software defined networking, allowing you to create a restricted environment not exposed to the outside world. This allowed me to create a multi-zone highly available docker cluster completely isolated from the public internet. The only exposure into the cluster is through ssl load balancers, and only to expose specific services on specific ports.

What's Next?

After a number of successful prototypes and months of rock solid stability in terms of uptime and network performance, I decided to move up on the technology stack and focus on auto provisioning and management of compute resources. At some point, manually adding hosts will become a challenge due to the rapid growth in the number of switch interconnects. This is not as complicated a problem as service discovery, however. A simple algorithm can be written to deal with switch interconnects.
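
As a rough sketch of such an algorithm, assume every docker host is linked directly to every other host (a full mesh; the actual setup may be wired differently). The list of interconnects can then be generated rather than written by hand. The host names and the fullMeshLinks helper below are made up for illustration.

```javascript
// Return every unordered pair of hosts: one interconnect per pair.
function fullMeshLinks(hosts) {
  const links = [];
  for (let i = 0; i < hosts.length; i += 1) {
    for (let j = i + 1; j < hosts.length; j += 1) {
      links.push([hosts[i], hosts[j]]);
    }
  }
  return links;
}

// Hypothetical host names.
const hosts = ["docker01", "docker02", "docker03", "docker04"];
const links = fullMeshLinks(hosts);

console.log(links.length); // n * (n - 1) / 2 = 6 links for 4 hosts
links.forEach(([a, b]) => console.log(`${a} <-> ${b}`));
```

The generated pairs could then feed a template that writes the bridge configuration for each host, so adding a host means adding one name to the list.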

This is where Mesosphere comes in. Mesos has plugins for spawning docker containers. Marathon is a simple interface which allows you to specify how much cpu, memory, and number of containers to create for a particular application. It then goes out and spawns a container where it fits resource wise.

I want to figure out a way to extend Marathon and Mesos (particularly deimos), so that creating containers includes steps for adding virtual networks (using pipework).

I get really excited thinking about the potential applications of this sort of setup. I know there are a lot of "docker clusters" out there, but I feel like exploiting traditional networking technologies and building a cluster on SDN makes more sense.

Looking forward to writing more, as things progress. Until then.

Friday, November 15, 2013

Vagrant + VirtualBox + Ubuntu for linux development

The combination of Vagrant, VirtualBox and Ubuntu allows for some interesting potential. First of all, it's a simple way to build and deploy cloud images similar to Amazon Web Services' AMIs, but all on your local machine. It can also customize the virtual machine settings through a configuration file, which gives us the opportunity to create full linux development desktop experiences. Finally, it allows us to run linux containers (such as docker) on Windows and OS X environments in an extremely simple way.

The use case I will be showing in this post is for people who prefer bare metal installs of either Windows or OS X, but would like to have a full screen linux environment such as ubuntu running gnome. With Vagrant and VirtualBox, this is universally possible.

First, please head to http://git-scm.com and download the git installer. For Windows, make sure to install the unix tools; it's worth it. The installer for Windows will get OpenSSH installed, which is required by Vagrant.

Next step: install VirtualBox. Once VirtualBox is installed, also install the extension pack.

Finally, install Vagrant.

Once all the dependencies are installed, open up a shell window and let's get started. We will use a base cloud image from Ubuntu 13.04 found on http://vagrantbox.es. Vagrant works in project directories, so first create a project folder. Inside the project folder we will initialize a default configuration file and start an instance. Once the instance is ready we will install ubuntu-gnome-desktop.

$ cd ~
$ mkdir ubuntudesktop; cd ubuntudesktop

$ vagrant box add ubuntu http://cloud-images.ubuntu.com/vagrant/raring/current/raring-server-cloudimg-amd64-vagrant-disk1.box

$ vagrant init ubuntu

$ vagrant up

Once you issue vagrant up, a new virtual machine will be provisioned inside virtualbox. Once the instance loads, vagrant will complete its tasks. At this point we can use vagrant to SSH into the instance using public keys.

$ vagrant ssh

We have lift off. Now let's install the desktop environment. While inside the virtual machine instance through SSH, do the following:

vagrant@vagrant-ubuntu-raring-64:~$ sudo apt-get update; sudo apt-get -y install ubuntu-gnome-desktop

Wait for a while at this point. It's going to install quite a bit of stuff (1.6GB to be exact). Once it finishes, exit the ssh session. Now that we are back on the host terminal session, let's halt the virtual machine and replace its configuration file with a more appropriate, desktop-capable set of configurations.

$ vagrant halt

Edit the Vagrantfile located inside the current working directory, which should be your project folder. Replace its contents with the following:

Now that we have updated our configuration, start the instance back up and be amazed. Yesss.

Now, login as vagrant (password vagrant).

Wednesday, October 23, 2013

Docker on EC2 Micro and maintaining storage growth

If you use docker for integration or continuous testing, you may be re-building images. Every command issued through docker keeps a commit of the fs changes, so disk space can fill up fast; extremely fast on an EC2 micro instance with 8GB of EBS.

Scrounging around the issues in the docker project on github, I ran across a thread talking about solutions for the storage growth. I took it and expanded a bit.

Below is the output of past and present docker commands. Anything with a status of "Up for x minutes" is a docker command that is presently running. Anything with Exit 0 or another Exit value has ended and can be discarded if it is not needed, for example when you do not need to commit its changes to a new image. In the screenshot below, you can see a re-build of the mongodb image. Two commands issued by the Dockerfile stored commits, and they can be discarded.

$ sudo docker ps -a

TO DELETE UNUSED CONTAINERS:

$ sudo docker ps -a | grep 'Exit 0' | awk '{print $1}' | xargs sudo docker rm

This will find any container with a "I have no error" exit status and delete them. Note: There may be other exit statuses depending on how well an image build went. If some of the commands issued in the Dockerfile are bad or fail, the status field will have a different Exit value, so just update the piped grep command with that string.

TO DELETE UNUSED IMAGES:

$ sudo docker images | grep 'none' | awk '{print $3}' | xargs sudo docker rmi

This will find any images that are not tagged or named. This is typical when an image is re-built. For example: if you have a running container based on an image that was re-built, `docker ps` will show that container with a hash for its image name. That just means the container is running an old (and referenced) image. Once you stop the container and replace it with a new one, running the above command will find it and remove it from the file system.

Saturday, October 19, 2013

Closures, javascript and how

From the wiki article, a closure (computer science) is a function or reference to a function together with a referencing environment-- a table storing a reference to each of the non-local variables of that function.

Closure-like constructs include callbacks and as such, are important in asynchronous programming. Here is a simple example in PHP that uses a closure as a callback to compute the total price of a shopping cart by defining a reference table for the callback function and including variables tax and total:
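
As a stand-in, the same idea can be sketched in JavaScript rather than PHP: a callback passed to forEach closes over the tax rate and the running total from the enclosing function. The cart contents, the tax rate, and the cartTotal name are assumptions for illustration, not the original example.

```javascript
// Prices are in cents to keep the arithmetic exact.
function cartTotal(items, taxPercent) {
  let total = 0;
  // The callback is a closure: it reads taxPercent and updates total,
  // both of which live in the enclosing cartTotal scope.
  items.forEach(function (item) {
    const subtotal = item.price * item.quantity;
    total += subtotal + Math.round((subtotal * taxPercent) / 100);
  });
  return total;
}

// Hypothetical cart.
const cart = [
  { name: "butter", price: 100, quantity: 1 },
  { name: "milk", price: 300, quantity: 3 },
  { name: "eggs", price: 700, quantity: 6 },
];

console.log(cartTotal(cart, 5)); // total in cents, tax included
```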

The concept of closures in javascript is important to understand because you might not even know you're using it. If you write in coffee-script classes or do classical inheritance patterns in vanilla javascript or even write callbacks in general for asynchronous programming, you are probably using closures. The following example is a starting point for classical inheritance in javascript. This shows how to hide private variables. It doesn't use "new" but the pattern is very similar.
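
A minimal sketch of that kind of pattern, assuming a simple factory function (makeAccount and its fields are hypothetical names, not the original listing): the balance variable is private because only the returned functions can reach it.

```javascript
// A factory function: no `new`, but each call gets its own private state.
function makeAccount(owner) {
  let balance = 0; // private: reachable only through the closures below

  return {
    owner: owner,
    deposit: function (amount) {
      balance += amount;
      return balance;
    },
    getBalance: function () {
      return balance;
    },
  };
}

const account = makeAccount("alice");
account.deposit(50);
account.deposit(25);

console.log(account.getBalance()); // 75
console.log(account.balance); // undefined: the variable is not a property
```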

According to Effective JavaScript: 68 Specific Ways to Harness the Power of JavaScript, there are three essential facts regarding closures:

JavaScript allows you to refer to variables that were defined outside of the current function
Functions can refer to variables defined in outer functions even after those outer functions have returned
Closures can update values of outer variables
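
The three facts can be illustrated one at a time. This is a small sketch with made-up names (greet, makePrefixer, makeBox), not code taken from the book.

```javascript
// Fact 1: a function can refer to a variable defined outside of it.
const greeting = "hello";
function greet(name) {
  return greeting + ", " + name;
}

// Fact 2: the returned function still sees `prefix`
// after makePrefixer has returned.
function makePrefixer(prefix) {
  return function (word) {
    return prefix + word;
  };
}
const addUn = makePrefixer("un");

// Fact 3: closures can update the outer variable, not only read it.
function makeBox() {
  let value; // shared by set and get
  return {
    set: function (newValue) {
      value = newValue;
    },
    get: function () {
      return value;
    },
  };
}
const box = makeBox();
box.set(42);

console.log(greet("world")); // hello, world
console.log(addUn("locked")); // unlocked
console.log(box.get()); // 42
```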

Knowing this, we can do some fun stuff in Node.js with asynchronous programming. With closures, we can pull a document collection from a NoSQL database, manipulate the results, and push them to an array stored via closure in the parent scope.
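
Here is a sketch of that flow that runs without a database. The fakeCollection object is a hypothetical stand-in for a driver, and deliverResults plays the role of the event loop handing back the query result later; a real driver would call the callback on its own once the data arrived.

```javascript
// Hypothetical stand-in for a database driver: find() delivers its result
// later, through a callback, the way a Node.js driver would.
const pendingCallbacks = [];
const fakeCollection = {
  find: function (callback) {
    const docs = [
      { name: "ada", visits: 3 },
      { name: "bob", visits: 7 },
    ];
    pendingCallbacks.push(function () {
      callback(null, docs);
    });
  },
};

// Stands in for the event loop delivering the query result.
function deliverResults() {
  while (pendingCallbacks.length > 0) {
    pendingCallbacks.shift()();
  }
}

const results = []; // lives in the parent scope

fakeCollection.find(function (err, docs) {
  if (err) throw err;
  // The callback closes over `results` and pushes into it when it runs.
  docs.forEach(function (doc) {
    results.push(doc.name.toUpperCase() + ":" + doc.visits);
  });
});

console.log(results.length); // 0, the callback has not run yet
deliverResults();
console.log(results); // [ 'ADA:3', 'BOB:7' ]
```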

Hopefully you will use closures to your advantage, especially when developing in javascript, be it server side or client side or even in the database (Postgres with v8).

Tuesday, September 17, 2013

SharePoint 2013 - 503 Unavailable after March 2013 PU

SharePoint 2013 recently had two updates released: the mandatory March 2013 PU and the June 2013 CU. I won't go into the details of obtaining these patches or running them. Basically, you'll need a lot of down time to run the patches, as they take a while. With that said, let's assume the patches ran, updated and completed. Let's also assume you have already run the product configuration wizard after each patch to update the database.

For me, I had to run the patch a few times. Due to randomness (or maybe just sloppy closed source coding), SharePoint CUs tend to fail. The good thing is that they either succeed 100% or fail completely (leaving SharePoint more or less in an OK state). Luckily for me, I did not have any issues running the product configuration wizard. I've seen and heard of instances where the product configuration wizard fails, but it's usually a clean-up task that fails and isn't a big deal.

After I updated the March 2013 PU, I ran into a 503 unavailable for central admin and my site collections. Instead of finding an immediate resolution, I plowed forth and installed the June 2013 CU with success. Unfortunately after the database upgrade and a SharePoint server reboot, I was still getting a 503 error when trying to access SharePoint.

After a bit of googling, I found a working solution to this particular problem. Load up IIS Manager and head to the server's Application Pools. As described in this post, and as was the case for me, all of my Application Pools were stopped. Without hesitation, I started every stopped pool, restarted IIS, and was once again able to access Central Admin and my Site Collections.