Saturday, May 4, 2019

DevOps 101








So, you've heard the term DevOps and are curious: why are people going bananas over it?! What really is DevOps? Is there an industry-standard definition for it? Does it relate to Agile? Is it based on principles, like Agile? My teams are already having a hard time delivering their best, so why do I have to hire a DevOps person now?

All very valid questions. And guess what? I will have some answers here and some more, later.

Let's begin with the difficult part. There is no universal definition of DevOps and, unlike Agile, no DevOps Manifesto. When you talk to people, each will have a similar but slightly different definition of it. In fact, in the organization that I consult for, there are 172 groups with "DevOps" in their name, under varying management! So, does that mean nobody can say what DevOps is? No, that wouldn't be true. There are broad principles associated with DevOps practices and culture, and each of these 172 groups would fit somewhere on the spectrum.

But wait.. culture? Did I just say culture, what has DevOps got to do with culture? Isn't it about using those work-in-progress fancy tools that evolve so fast that nobody can keep up with? Well, my esteemed reader, it is so much more than tools. It is about practices and culture and more. But I am getting ahead of myself here.

The term "DevOps" itself combines "Developers", the people who write code, and "Operations", the people who keep that code and the infrastructure under it running across environments, essentially implying close collaboration between these folks. When we discuss some of the practices, probably in a future blog, we'll see how this collaboration is reflected in various DevOps activities. Generally speaking, DevOps implies the use of engineering practices that enable quicker delivery of well-tested, good-quality code to a robust production environment, with significant automation tied into the process.

With my high-level definition out of the way, let's have a quick chat about some more details:

§  A brief History


Patrick Debois, a Belgian consultant, is credited with coining the term DevOps, implying collaboration between developers and operations. Apparently, the term was first used for the DevOps Days 2009 conference in Belgium. The idea of DevOps formed and spread like wildfire as various industry veterans came together to share their learning and passion. At the Velocity 2009 conference, John Allspaw and Paul Hammond presented "10+ Deploys Per Day: Dev and Ops Cooperation at Flickr" (link in references), which started to shake up how the world looked at deployments. Time to market has been shrinking ever since; according to a 2016 report, Amazon deploys code to prod every 11.7 seconds on average.

This feat of continuous deployment is achieved through architectural and engineering choices that have to be made at the start of application development. No wonder you hear more DevOps stories from new-age companies. However, it would be factually incorrect to assume that DevOps can be applied only to greenfield projects, or only at new or smaller companies. Most of the biggest organizations across various sectors, including highly controlled ones such as the federal government, are adopting DevOps practices for better profitability and competitiveness, or to drive innovation faster.

You might wonder: why deploy faster? Great question, and a really good topic for a future blog.


§  What are DevOps practices?

How do we say a team is adopting DevOps? What do they do when they "do DevOps"? From a 10,000 ft perspective, it comes down to CALMS:


§  C for Culture

      • DevOps aims to establish motivated teams with shared pride, ownership and responsibility of product, that work with a growth mindset.

§  A for Automation

      • Automation is a cornerstone of the DevOps movement and facilitates collaboration. Automating tasks such as testing, configuration and deployment frees people up to focus on other valuable activities and reduces the chance of human error.

§  L for Lean

      • Team members are able to visualize work in progress (WIP), limit batch sizes and manage queue lengths. Again, we depend on our partners from the Agile community to help with this.

§  M for Measurement

      • DevOps teams measure a lot, from the performance of the delivery pipeline itself to application and infrastructure health. This includes things like CPU/memory monitoring, JVM monitoring or Change Lead Time. The Four Key Metrics, now in the "Adopt" section of the ThoughtWorks radar, are, as the name suggests, the key metrics for measuring DevOps itself.

§  S for Share

      • Share Success, Failure, Feedback - between and across the teams and members
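
To make the Measurement point a little more concrete, here is a minimal sketch of computing Change Lead Time, taken here as the time from a commit to its deployment in production, averaged over a set of changes. The class name, the fields and the sample timestamps are all made up for illustration; a real pipeline would pull these timestamps from its source control and deployment tooling.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;

final class ChangeLeadTime {

    /** One change, with the time it was committed and the time it reached production. */
    static final class Change {
        final Instant committedAt;
        final Instant deployedAt;

        Change(Instant committedAt, Instant deployedAt) {
            this.committedAt = committedAt;
            this.deployedAt = deployedAt;
        }
    }

    /** Mean time from commit to production deployment across the given changes. */
    static Duration average(List<Change> changes) {
        if (changes.isEmpty()) {
            throw new IllegalArgumentException("at least one change is required");
        }
        Duration total = Duration.ZERO;
        for (Change c : changes) {
            total = total.plus(Duration.between(c.committedAt, c.deployedAt));
        }
        return total.dividedBy(changes.size());
    }

    public static void main(String[] args) {
        // Made-up sample data: two changes that took 2 and 4 hours to reach production.
        Instant t0 = Instant.parse("2019-05-01T09:00:00Z");
        List<Change> changes = Arrays.asList(
                new Change(t0, t0.plus(Duration.ofHours(2))),
                new Change(t0, t0.plus(Duration.ofHours(4))));
        System.out.println(average(changes)); // prints PT3H
    }
}
```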

§  How do I learn DevOps

Ok, all that mumbo-jumbo is good. Now, where do I start learning DevOps?

I will give you three paths :
    1. Or, wait for more blogs 
    2. Or, look at this learning path 

I know this was a bad joke section. Let's move back to serious stuff :)

§  DevOps Thought Leaders

Fortunately, there are many folks in DevOps who really love to share their awesome work. Some of the folks I follow are listed below. This is by no means an exhaustive list, just the ones I follow.

      • James Turnbull
      • Chris Riley
      • Kelsey Hightower
      • Sean Hull

References:
https://devops.com/the-origins-of-devops-whats-in-a-name/
https://newrelic.com/devops/what-is-devops
https://www.devopsdays.org/about/
https://techbeacon.com/devops/10-companies-killing-it-devops
https://docs.microsoft.com/en-us/azure/devops/learn/what-is-devops-culture
https://martinfowler.com/bliki/DevOpsCulture.html
https://whatis.techtarget.com/definition/CALMS
https://www.scaledagileframework.com/devops/
https://www.thoughtworks.com/radar/techniques/four-key-metrics
https://medium.com/@fabiojose/devops-kpi-in-practice-chapter-2-change-lead-time-and-volume-9e80ac7ca54
https://www.agilealliance.org/glossary/lead-time
https://github.com/kamranahmedse/developer-roadmap
https://sweetcode.io/top-10-thought-leaders-devops/

Friday, April 5, 2019

Jenkinsfile -- To collocate or not to collocate


To collocate or not to collocate Jenkinsfile

Problem

While building Pipeline-as-Code recently for one of our projects, we were faced with a conundrum: whether to co-locate our Jenkinsfiles with the application code or not. Or does it even matter?


Default Solution

Our default opinion was to co-locate the Jenkinsfile with the application code, as that's the whole point: we build and deploy the code from the same code base.



This idea had some advantages. With just a default checkout, Jenkins is able to find the code as well as the pipeline to build and deploy it. We use Bitbucket for our development, so this approach comes with the added advantage that we can use multibranch pipelines without any additional effort.






Challenges

However, pretty soon after we started doing this, we ran into some challenges. While the DevOps Engineer was modifying the Jenkinsfile (remember, we were the first ones to build it) and the application developers were simultaneously modifying the code base, we got multiple deployments, aka server restarts, while the developers were checking whether their code was working in development. At times this also resulted in broken builds while the DevOps Engineer was trying to fix the pipeline, for example by adding a Sonar scan. We knew that, as the first people to use Jenkins Pipeline in the enterprise, we would face challenges, and we chose to live with them.

Application development continued rapidly and then stabilized. Things looked good, and deployments were happening to dev and test as expected. However, we felt we were not doing the right thing. But why? We couldn't really put it into words, until we wanted to deploy to the Acceptance environment, which we thought would be uneventful. Except that whenever we modified our pipeline, for example to build a deployment stage for ACPT, we were modifying the code base. That's when we confirmed our problem: we were violating the principle of keeping code and configuration separate (ref https://12factor.net/config). This meant that whenever we changed our pipeline, we would have to build the code again, which is not what we wanted. The code smell was obvious.




Final Approach



By now, we had realized that Jenkinsfile should not really be co-located, but we still wanted developers to be able to build code, run various tests on it, check code quality, and potentially deploy to a dev-like environment themselves. It was a choice between giving more powers to developers versus following sane conventions and keeping production deployments in the hands of people more experienced with doing that.

We eventually decided to have two kinds of Jenkinsfiles:

  1. A usual Jenkinsfile, called just that, which does a build and runs tests on it (and potentially deploys to dev in a future state), used on feature branches.
      • This was configured on Jenkins to run as multibranch as well, ensuring that we are able to run those tests for each feature branch (which is created per story).
      • It sends emails to developers and culprits upon failure.
      • Developers have full control over it and can change it as needed. For example, when one of our developers was working on a story to fix code quality issues, she was running Sonar and Nexus IQ scans from it, which we generally don't run on feature branches.
  2. Another set of Jenkinsfiles, kept separate from the code in a different repository, and used to build AND deploy code from master.
      • This is really our deployment pipeline, which builds, deploys, and performs the whole nine yards of activities needed for taking code to production.
      • This ensures that our pipeline, which is configuration, remains separate from our code and can be built and modified without impacting the code base.
      • It sees more changes, especially now that we are doing this for the first time, although it will eventually stabilize too.
      • It is a little more controlled, and usually modified by the DevOps Engineer only. However, developers have permissions to modify it.
      • Failures of this pipeline should trigger emails to the entire team.
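
As a rough illustration of the split described above, the two kinds of Jenkinsfile might look something like the declarative pipelines below. These are sketches under stated assumptions, not our actual pipelines: the Maven command, the repository URL, the deploy script and the team address are all hypothetical, and the emailext step assumes the Email Extension plugin is installed.

```groovy
// Jenkinsfile kept in the application repository, picked up by a multibranch job
pipeline {
    agent any
    stages {
        stage('Build and test') {
            steps {
                // Assumption: a Maven build; substitute the project's real build command
                sh 'mvn -B clean verify'
            }
        }
    }
    post {
        failure {
            // Notify the developers and culprits of the failing build
            emailext(
                subject: "Build failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
                body: "See ${env.BUILD_URL}",
                recipientProviders: [developers(), culprits()]
            )
        }
    }
}
```

The deployment pipeline lives in its own repository and checks out the application from master itself:

```groovy
// Jenkinsfile kept in a separate pipeline repository
pipeline {
    agent any
    stages {
        stage('Checkout application') {
            steps {
                dir('app') {
                    // Hypothetical repository URL
                    git url: 'https://bitbucket.example.com/scm/proj/app.git', branch: 'master'
                }
            }
        }
        stage('Build') {
            steps {
                dir('app') {
                    sh 'mvn -B clean verify'
                }
            }
        }
        stage('Deploy') {
            steps {
                // Hypothetical deploy script stored alongside this Jenkinsfile
                sh './deploy.sh dev'
            }
        }
    }
    post {
        failure {
            // Hypothetical team-wide address
            emailext(
                subject: "Deployment pipeline failed: ${env.JOB_NAME} #${env.BUILD_NUMBER}",
                body: "See ${env.BUILD_URL}",
                to: 'team@example.com'
            )
        }
    }
}
```

With this arrangement, a change to the deployment stages is a commit in the pipeline repository only, so the application code base is left untouched.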





We did consider having a single Jenkinsfile that builds off of both master and feature branches, with different workflows for feature vs master branches. However, we chose not to go this route: given our inexperience with Jenkinsfiles, it would probably have made our Jenkinsfile more complex than we wanted. We want our developers to be able to understand and modify the Jenkinsfile, but we don't want to burden them with information that they usually don't need to dig into.
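
For comparison, the single-Jenkinsfile alternative we decided against could be sketched with a branch condition on the deploy stage. This is only an illustration: the build command and deploy script are hypothetical, and the branch condition relies on the job being a multibranch pipeline.

```groovy
pipeline {
    agent any
    stages {
        stage('Build and test') {
            // Runs for every branch
            steps {
                sh 'mvn -B clean verify'
            }
        }
        stage('Deploy') {
            // Runs only for master; feature branches stop after build and test
            when { branch 'master' }
            steps {
                sh './deploy.sh dev'
            }
        }
    }
}
```

Even in this small form, every reader of the file has to keep both workflows in mind, and each new environment or branch type adds another condition.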




Looking forward

I believe we will eventually move to a single Jenkinsfile, kept separate from the code base, with different workflows for master, feature branches and release branches. This may happen after we, developers and DevOps engineers alike, become more proficient with Jenkinsfile usage.

We don't have any workflows for Pull Requests and neither are we using shared libraries at the moment, but both of these are on our bucket list. We don't think either of them would impact where we keep our Jenkinsfiles.

Friday, January 4, 2019

Managing Jar Hell in Tomcat 8



Classloading in Tomcat 8

Problem

We recently began development on a new microservice that connects to an existing Sybase database and is deployed on tcServer 4.0.1, which bundles Tomcat 8. For reference, we already had a similar microservice that connects to an Oracle database, and a legacy monolith that connects to the Sybase database but is deployed on a JBoss container. We were not doing anything fundamentally new, and we expected this development to be quite straightforward.

However, when we deployed this application, we started running into weird issues with the Sybase jar (JConn4) and a CyberArk provider jar, which masks the connection to the database by using its own driver to fetch connection details from the vault.
We spent quite some time with various teams trying to figure out what was going wrong. We also have a third-party jar, integrated as a handler through logging.properties (in tcserver/conf), that sends alerts when it finds errors in the logs, and that just complicated things more.
I guess this is that lucky time in my career when I finally ran into Jar Hell!

Analysis

Our default setup loads some jars from a given file path, instead of from within the war or tcServer/lib, as these are expected to be consistent across the multiple applications we deploy on tcServer.
We experimented with changing where these jars are declared versus where they are kept (external file path / tcServer/lib / inside the war), and got the sense that the issues were due to classes not being loaded by the time they were invoked.
This led us to analyze and understand classloading in Tomcat 8. Here are the details of how Tomcat 8 loads classes and the tools we used to debug it.

Classloaders in Tomcat 8

When Tomcat is started, it creates a set of class loaders that are organized into the following parent-child relationships, where the parent class loader is above the child class loader. The default classloaders are:



  • Bootstrap : This class loader contains the basic runtime classes provided by the Java Virtual Machine, plus any classes from JAR files present in the System Extensions directory ($JAVA_HOME/jre/lib/ext). Generally speaking, you wouldn’t set up anything here.
  • System : This class loader is normally initialized from the contents of the CLASSPATH environment variable. In our context, this one is actually important, as it loads all the jars we put on the classpath. It is also responsible for loading Tomcat's logging implementation, which implies that it loads logging.properties; that was important for us too.
  • Common : By default, this classloader loads classes, resources and jars from the tcServer/lib directory. Per the Tomcat documentation, application classes should normally NOT be placed here. However, our enterprise bundle adds tcapp/lib to it. The jars it loads can be configured through the common.loader property in tcserver/conf/catalina.properties.
  • WebappX : This loads classes from WEB-INF/classes and WEB-INF/lib.
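
The parent-child relationships between these class loaders can be inspected from code. The sketch below walks the parent chain of a class loader and lists each loader's class name. ClassLoaderChain is a hypothetical helper written for this post, not part of Tomcat, and it shows the parent chain only, which is not necessarily the order in which Tomcat searches the loaders.

```java
import java.util.ArrayList;
import java.util.List;

final class ClassLoaderChain {

    /**
     * Returns the class name of the given loader followed by those of its parents.
     * Java represents the bootstrap loader as null, so it is added as a label at the end.
     */
    static List<String> describe(ClassLoader start) {
        List<String> chain = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            chain.add(cl.getClass().getName());
        }
        chain.add("bootstrap (represented as null)");
        return chain;
    }

    public static void main(String[] args) {
        // Inside a web application, the context class loader is normally the Webapp loader.
        ClassLoader start = Thread.currentThread().getContextClassLoader();
        for (String name : describe(start)) {
            System.out.println(name);
        }
    }
}
```

Called from inside a deployed application (for example from a servlet) with the thread's context class loader, the first entry should be the Webapp class loader, followed by its parents up to the bootstrap loader.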

Classloading Hierarchy :

By default, classes are looked up through these classloaders in the following order:
  1. Bootstrap
  2. Webapp
  3. System
  4. Common
That does seem a little weird to me, but I guess there must be a good reason why it is the default. Also, the order in which jars are loaded by a given classloader is not defined (see bug in references below). Coming from a Spring background, where bean initialization and dependency resolution are deferred as late as possible so that other beans can be loaded first, this was a surprising fact. (And yes, I am aware that loading and wiring beans is an entirely different thing from loading the classes themselves.)

Customization 1: Specifying Loader Delegate as True

This hierarchy can be changed by specifying <Loader delegate="true"/> in context.xml; the order then becomes:
  1. Bootstrap
  2. System
  3. Common
  4. Webapp
We tried some funny combinations of the three jars, varying where they were placed and which loader delegate setting was used, and almost hacked ourselves to death trying to figure out what was going on with classloading. Sometimes the classes would load, but at other times, with what seemed a reasonable approach, they would not. We tried loading all three under the same classloader, but that failed too, which seemed weird; but remember, the order in which jars are loaded by a given classloader is not defined.
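
The effect of the delegate flag on which copy of a class wins can be illustrated with a toy model. This is not Tomcat's implementation, just the two lookup orders listed above applied to a made-up setup in which the same class is visible to both the Common and the Webapp class loaders; all class names here are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LookupOrderModel {

    /** The lookup order described in this post, with and without delegate="true". */
    static List<String> searchOrder(boolean delegate) {
        return delegate
                ? Arrays.asList("Bootstrap", "System", "Common", "Webapp")
                : Arrays.asList("Bootstrap", "Webapp", "System", "Common");
    }

    /** Returns the first loader in the search order that can see the class, or null if none can. */
    static String resolvedBy(String className, Map<String, List<String>> visibleClasses, boolean delegate) {
        for (String loader : searchOrder(delegate)) {
            List<String> classes = visibleClasses.get(loader);
            if (classes != null && classes.contains(className)) {
                return loader;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical setup: the same driver class sits in tcServer/lib and inside the war.
        Map<String, List<String>> visible = new LinkedHashMap<>();
        visible.put("Common", Arrays.asList("com.example.Driver"));
        visible.put("Webapp", Arrays.asList("com.example.Driver", "com.example.App"));

        System.out.println(resolvedBy("com.example.Driver", visible, false)); // prints Webapp
        System.out.println(resolvedBy("com.example.Driver", visible, true));  // prints Common
    }
}
```

The model only covers which class loader is consulted first. It says nothing about the order of jars within a single class loader, which, as noted above, is not defined.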

Customization 2 : Explicitly loading classes before/after in Webapp classloader

Tomcat's position is that if one class needs to be loaded before another, it should be placed so that it is loaded by a classloader earlier in the hierarchy, and that relying on the ordering of jars within a single classloader is a smell. However, if we really need this, it can be achieved by adding Pre/Post Resources to the <Resources> element in context.xml. So, in order to load files from a given path first, we could use a block like the following:
    <Resources>
      <PreResources className="org.apache.catalina.webresources.DirResourceSet"
                    base="/Users/theuser/mypictures" webAppMount="/pictures" />
    </Resources>

Note here that we don't need a custom class for this. Tomcat ships a couple of classes for looking at a directory, file or jar (DirResourceSet / FileResourceSet / JarResourceSet). The webAppMount attribute sets the path within the web application at which these resources appear; whether they are available to one context or to all of them depends on which context.xml the block is placed in.

Debug Tools

To our rescue, we added -verbose:class to JAVA_OPTS in ApplicationEnv. With this we could see the actual order in which classes were getting loaded, and that really helped us understand what was going on. Although the logs from the System and Webapp classloaders were interleaved to some extent, overall it was a big help.
The second thing we did (although it makes the logs very, very confusing) was plain and simple: enable logging for Tomcat, by adding the following to logging.properties in tcserver/conf:
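
A minimal sketch of such a configuration, assuming the goal is to see the lookup decisions made by Tomcat's class loaders. The logger name is Tomcat's org.apache.catalina.loader package; the level and the choice of handler shown here are illustrative assumptions rather than our exact settings.

```properties
# Log the lookup decisions made by Tomcat's class loaders (very verbose)
org.apache.catalina.loader.level = FINE

# A handler only prints messages at or above its own level, so it must allow FINE as well
java.util.logging.ConsoleHandler.level = FINE
```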



With the help of these, we were eventually able to resolve our classloading issues. We also learned that there is no single way to achieve the result. We could have used PreResources with the default classloading hierarchy, but we ended up using delegate=true and customizing the load order.

References