
Ice cream sales break microservices, Hystrix to the rescue

In November 2015, we had the opportunity to spend three days with a greenfield project in order to get to know Spring Cloud Netflix. At comSysto, we always try to evaluate technologies before their potential use in customer projects to make sure we know their pros and cons. Of course, we had read about several aspects, but we never really got our hands dirty using it. This had to change!

Besides coming up with a simple scenario that can be completed within a few days, our main focus was on understanding potential problems in distributed systems. First of all, any distributed system comes with the ubiquitous problem of failing services that should not break the entire application. This is most prominently addressed by Netflix’ “Simian Army” which intentionally breaks random parts of the production environment.

However, we rather wanted to provoke problems arising under heavy load due to capacity limitations. Therefore, we intentionally designed a distributed application with a bottleneck that turned into an actual problem with many simultaneous requests.

Our Use Case

Our business case is about an ice-selling company operating at locations all over the world. At each location there are ice-selling robots. At the company’s headquarters, we want to show an aggregated report about the ice-selling activities for each country.

All our components are implemented as dedicated microservices using Spring Boot and Spring Cloud Netflix. Service discovery is implemented using Eureka server. The communication between the microservices is RESTful.
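
As a quick illustration, here is a minimal sketch of turning one of these Spring Boot services into a Eureka client (the class name is invented, not taken from the project); the logical name it registers under comes from spring.application.name in its configuration:

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.client.discovery.EnableDiscoveryClient;

// Minimal sketch of one of our services as a Eureka client; the class name is illustrative.
@SpringBootApplication
@EnableDiscoveryClient // registers the service with the Eureka server under its spring.application.name
public class CurrentDataServiceApplication {

    public static void main(String[] args) {
        SpringApplication.run(CurrentDataServiceApplication.class, args);
    }
}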

architecture

Architecture overview of our distributed system with the deployment setup during the experiments.

There is a basic location-service which knows about all locations equipped with ice-selling-robots. The data from all of these locations has to be part of the report.

For every location, there is one instance of the corresponding microservice representing an ice-selling-robot. Every ice-selling-robot locally stores the total amount of ice cream it has sold and its remaining stock. Each of them continuously pushes this data to the central current-data-service. These pushes fail at a certain rate, which is configured via a central Config Server.
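
A hedged sketch of that push loop (property name, endpoint, interval and DTO are our own illustrative choices; the injected RestTemplate is assumed to be a @LoadBalanced bean so the Eureka name resolves, and @EnableScheduling is assumed to be present on a configuration class):

import java.util.Random;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

// Sketch of the periodic push from an ice-selling-robot; names and values are illustrative.
@Component
public class CurrentDataPusher {

    private final RestTemplate restTemplate;
    private final Random random = new Random();

    // failure rate distributed centrally, e.g. via the Config Server
    @Value("${robot.failure-rate:0.0}")
    private double failureRate;

    public CurrentDataPusher(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Scheduled(fixedRate = 5000)
    public void pushCurrentData() {
        if (random.nextDouble() < failureRate) {
            throw new IllegalStateException("simulated push failure");
        }
        CurrentData data = new CurrentData(42L, 100L);
        restTemplate.postForObject("http://current-data-service/current-data", data, Void.class);
    }

    static class CurrentData {
        public final long totallySold;
        public final long remainingStock;

        CurrentData(long totallySold, long remainingStock) {
            this.totallySold = totallySold;
            this.remainingStock = remainingStock;
        }
    }
}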

For the sake of simplicity, the current-data-service stores this information in-memory. Every time it receives an update from one of the ice-selling-robots, it takes the new value and forgets about the old one. Old values are also forgotten if their timestamp is too old.
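
A minimal sketch of such a store, with invented names and a single value per location for brevity:

import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.stereotype.Component;

// Sketch of the in-memory store of the current-data-service: each update simply
// overwrites the previous value, and values older than MAX_AGE are treated as absent.
@Component
public class CurrentDataStore {

    private static final Duration MAX_AGE = Duration.ofSeconds(60); // illustrative expiry

    private final Map<String, TimestampedValue> valuesByLocation = new ConcurrentHashMap<>();

    public void update(String locationId, long value) {
        valuesByLocation.put(locationId, new TimestampedValue(value, Instant.now()));
    }

    public Optional<Long> currentValue(String locationId) {
        TimestampedValue entry = valuesByLocation.get(locationId);
        if (entry == null || entry.timestamp.isBefore(Instant.now().minus(MAX_AGE))) {
            return Optional.empty(); // unknown or too old
        }
        return Optional.of(entry.value);
    }

    private static class TimestampedValue {
        final long value;
        final Instant timestamp;

        TimestampedValue(long value, Instant timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }
}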

The current-data-service offers an interface through which the current value of either the total amount of ice cream sold or the remaining stock can be retrieved for a single location. This interface is used by an aggregator-service, which generates and delivers an aggregated report on demand: for all locations provided by the location-service, the current data is retrieved from the current-data-service and aggregated by summing up the individual values grouped by the locations’ country. The resulting report consists of the summed-up values per country and data type (total amount of ice cream sold and remaining stock).
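
Conceptually, the aggregation itself is a plain group-and-sum, performed once per data type. A sketch with an illustrative stand-in type:

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Sketch of the aggregation step: sum the per-location values grouped by country.
// LocationValue is an illustrative stand-in for the real DTOs.
public class ReportAggregator {

    public Map<String, Long> sumByCountry(List<LocationValue> values) {
        return values.stream()
                .collect(Collectors.groupingBy(
                        LocationValue::getCountry,
                        Collectors.summingLong(LocationValue::getValue)));
    }

    public static class LocationValue {
        private final String country;
        private final long value;

        public LocationValue(String country, long value) {
            this.country = country;
            this.value = value;
        }

        public String getCountry() {
            return country;
        }

        public long getValue() {
            return value;
        }
    }
}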

Because the connection between the aggregator-service and the current-data-service is quite slow, generating the report takes a lot of time (we simply simulated this slow connection with a wifi link, which is slow compared to an internal service call on the same machine). Therefore, an aggregated-report-cache has been implemented as a fallback; switching to this fallback is handled by Hystrix. A simple scheduled job provides the cache with the most recent report at fixed intervals.
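
A sketch of that scheduled job (the wiring is an assumption on our side; Report is the project’s report DTO, the Eureka names match the ones used throughout this post, and the RestTemplate is again assumed to be @LoadBalanced):

import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestTemplate;

// Sketch of the job that keeps the aggregated-report-cache warm; interval and endpoints are illustrative.
@Component
public class ReportHistorizeJob {

    private final RestTemplate restTemplate;

    public ReportHistorizeJob(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @Scheduled(fixedRate = 30000) // the 30s historize-job-rate we later used in our tests
    public void historizeCurrentReport() {
        // fetch a freshly generated report and push it into the cache service
        Report report = restTemplate.getForObject("http://aggregator-service/", Report.class);
        restTemplate.postForObject("http://aggregated-report-cache/", report, Void.class);
    }
}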

The reporting-service is the only service with a graphical user interface. It generates a very simplistic HTML-based dashboard, which can be used by the business side of our company to get an overview of all the different locations. The data presented to the user is retrieved from the aggregator-service. Because this service is expected to be slow and prone to failure, a fallback is implemented which retrieves the last report from the aggregated-report-cache. With this, the user can always request a report within an acceptable response time, even though it might be slightly outdated. This is a typical example of maintaining maximum service quality in case of partial failure.

report

The reporting “dashboard”.

We used a Spring Cloud Dashboard from the open source community for showing all registered services:

cloud-dashboard

Spring Cloud Dashboard in action.

The circuit breaker within the reporting-service can be monitored on the Hystrix dashboard.

hystrix-dashboard

Hystrix dashboard for reporting service under load. All circuits are closed, but 19% of all getReport requests failed and were hence successfully redirected to the cached version.

Understanding the Bottleneck

When using Hystrix, all connectors to external services typically get a thread pool of limited size to isolate system resources. As a result, the number of concurrent (or “parallel”) calls from the reporting-service to the aggregator-service is limited by the size of that thread pool. This way we can easily overstress the capacity for reports generated on demand, forcing the system to fall back to the cached report.

The relevant part of the reporting-service’s internal declaration looks as depicted in the following code snippet (note the descriptive URLs that are resolved by Eureka). The primary method getReport() is annotated with @HystrixCommand and configured to use the cached report as fallbackMethod:

@HystrixCommand(
    fallbackMethod = "getCachedReport",
    threadPoolKey = "getReportPool"
)
public Report getReport() {
    // primary path: ask the aggregator-service for a freshly generated report
    return restTemplate.getForObject("http://aggregator-service/", Report.class);
}

public Report getCachedReport() {
    // fallback path: serve the last report historized by the scheduled job
    return restTemplate.getForObject("http://aggregated-report-cache/", Report.class);
}

In order to be able to distinguish primary and fallback calls from the end user’s point of view, we decided to include a timestamp in every served report to indicate the delta between the creation and serving time of a report. Thus, as soon as the reporting-service delegates incoming requests to the fallback method, the age of the served report starts to increase.
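
In code, this is nothing more than carrying a creation timestamp in the report and computing the delta when it is served; a tiny sketch with invented names:

import java.time.Duration;
import java.time.Instant;

// Sketch: the report remembers when it was created, so the reporting-service
// can expose how old the data it serves actually is. Names are illustrative.
public class ServedReport {

    private final Instant createdAt; // set by the aggregator (or the historize job) at creation time

    public ServedReport(Instant createdAt) {
        this.createdAt = createdAt;
    }

    // stays close to zero for freshly generated reports and grows as soon as
    // the fallback starts serving cached data
    public Duration age() {
        return Duration.between(createdAt, Instant.now());
    }
}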

Testing

With our bottleneck set up, testing and observing the runtime behavior is fairly easy. Using JMeter we configured a testing scenario with simultaneous requests to the reporting-service.

Basic data of our scenario:

  • aggregator-service instances: 1
  • test duration: 60s
  • hit rate per thread: 500ms
  • historize-job-rate: 30s
  • thread pool size for the getReport command: 5 (see the configuration sketch right after this list)
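
Pinning the getReportPool to 5 threads is a one-liner on the command shown earlier. A hedged sketch using javanica’s threadPoolProperties (Report is the project’s DTO; the same effect can be achieved with the external property hystrix.threadpool.getReportPool.coreSize=5):

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;
import org.springframework.web.client.RestTemplate;

// Sketch: the same getReport command as above, with its thread pool explicitly limited to 5 threads.
public class ReportClient {

    private final RestTemplate restTemplate;

    // assumed to be the @LoadBalanced RestTemplate used throughout the service
    public ReportClient(RestTemplate restTemplate) {
        this.restTemplate = restTemplate;
    }

    @HystrixCommand(
            fallbackMethod = "getCachedReport",
            threadPoolKey = "getReportPool",
            threadPoolProperties = {
                    @HystrixProperty(name = "coreSize", value = "5")
            }
    )
    public Report getReport() {
        return restTemplate.getForObject("http://aggregator-service/", Report.class);
    }

    public Report getCachedReport() {
        return restTemplate.getForObject("http://aggregated-report-cache/", Report.class);
    }
}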

Using the described setup, we conducted different test runs with a JMeter thread pool size (i.e. the number of concurrent simulated users) of 3, 5 and 7. Analyzing the timestamps of the served reports leads us to the following conclusions:

Using a JMeter thread count below the size of the service’s thread pool results in a 100% success rate for the reporting-service calls. Setting both pool sizes equal already produces a small but noticeable error rate. Finally, setting the JMeter thread count higher than the service’s thread pool size results in growing numbers of failures and fallbacks, also forcing the circuit breaker into short-circuit states.

Our measured results are as follows (note that the average report age would be 15s when always using the cached version given our historize-job-rate of 30s):

  • 3 JMeter threads: 0.78s average report age
  • 5 JMeter threads: 1.08s average report age
  • 7 JMeter threads: 3.05s average report age

After gaining these results, we changed the setup to eliminate the slow connection by deploying the current-data-service to the same machine as the aggregator-service, thus replacing the slow connection with a fast, machine-internal one. With the new setup we conducted an additional test run and obtained the following result:

  • 7 JMeter threads, fast network: 0.74s average report age

By eliminating this part of our bottleneck, the report age drops significantly, to a value just below that of the first test run.

Remedies

The critical point of the entire system is the aggregation due to its slow connection. To address the issue, different measures can be taken.

First, it is possible to scale out by adding additional service instances. Unfortunately, this was hard to test given the hardware at hand.

Second, another approach would be to optimize the slow connection, as seen in our additional measurements.

Last but not least, we could also design our application for always using the cache assuming that all users should see the same report. In our simplistic scenario this would work, but of course that is not what we wanted to analyze in the first place.

Our Lessons Learned

Instead, let us explain a few take-aways based on our humble experience of building a simple example from scratch.

Spring Boot makes it really easy to build and run dozens of services, but really hard to figure out what is wrong when things do not work out of the box. Unfortunately, the available Spring Cloud documentation is not always sufficient. Nevertheless, Eureka works like a charm when it comes to service discovery. Simply use the name of the target service in a URL and pass it to a RestTemplate. That’s all! Everything else is handled transparently, including client-side load balancing with Ribbon! In another lab on distributed systems, we spent a lot of time solving exactly this problem ourselves. This time, everything was just right.

Furthermore, our poor deployment environment (3 MacBooks…) made serious performance analysis very hard. Measuring the effect of scaling out is nearly impossible on a developer machine due to its physical resource limitations. Having multiple instances of the same services doesn’t give you anything if one of them already pushes the CPU to its limits. Luckily, there are almost infinite resources in the cloud nowadays which can be allocated in no time if required. It could be worth considering this option right away when working on microservice applications.

In Brief: Should you use Spring Cloud Netflix?

So what is our recommendation after all?

First, we were totally impressed by the way Eureka makes service discovery as easy as it can be. Given you are running Spring Boot, starting the Eureka server and making each microservice a Eureka client is nothing more than adding a few dependencies and annotations. On the other hand, we did not evaluate its integration in other environments.

Second, Hystrix is very useful for preventing cascading errors throughout the system, but it cannot be used in a production environment without suitable monitoring unless you have a soft spot for flying blind. It also introduces a few pitfalls during development. For example, when you pause a Hystrix command in the debugger, the calling code will probably hit its timeout in the meantime, which can lead to completely different behavior. However, if you have the tools and skills to handle the additional complexity, Hystrix is definitely a winner.
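
For the debugging pitfall specifically, one workaround we would consider (values and the development-only scope are our assumptions, not something we actually set up in the lab) is to relax the command timeout in a local profile:

import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import com.netflix.hystrix.contrib.javanica.annotation.HystrixProperty;

// Sketch: a generous command timeout for local debugging sessions only,
// so a breakpoint inside the command does not immediately trigger the fallback.
public class DebugFriendlyClient {

    @HystrixCommand(
            fallbackMethod = "fallback",
            commandProperties = {
                    // alternatively, disable the timeout entirely with "execution.timeout.enabled" = "false"
                    @HystrixProperty(name = "execution.isolation.thread.timeoutInMilliseconds", value = "300000")
            }
    )
    public String slowCall() {
        return "result computed while you sit on a breakpoint";
    }

    public String fallback() {
        return "fallback";
    }
}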

In fact, this need for additional tooling and skills applies to microservice architectures in general. You have to go a long way to be able to run one – but once you are there, you can scale almost infinitely. Feel free to have a look at the code we produced on github or discuss whatever you are up to at one of our user groups.


Developing a Modern Distributed System – Part II: Provisioning with Docker

As described in the earlier blog post “Bootstrapping the Project”, comSysto’s performance and continuous delivery guild members are currently evaluating several aspects of distributed architectures in the spare time left alongside their customer project work. In the initial lab, a few colleagues of mine built the starting point for a project called Hash-Collision:

Hash-collision's system structure

They focused on the basic application structure and architecture to get a showcase up and running as quickly as possible and left us the following tools for running it:

  • one simple shell script to set up the environment locally based on many assumptions
  • one complex shell script that builds and runs all services
  • hardcoded dependency on a local RabbitMQ installation

First Attempt: Docker containers as a runtime environment

In search of a more sophisticated runtime environment, we went down the most obvious path and chose to get our hands on the hype technology of late 2014: Docker. I assume that most people have a basic understanding of Docker and what it is, so I will not spend too much time on its motivation here. Basically, it is a tool inspired by the idea of ‘write once, run anywhere’, but on a higher level of abstraction than that famous programming language. Docker can not only make an application portable, it also allows you to ship all dependencies such as web servers, databases and even operating systems as one or multiple well-defined images and to use the very same configuration from development all the way to production. Even though we did not even have any production or pre-production environments, we wanted to give it a try. Being totally enthusiastic about containers, we chose the most container-like place we could find and locked ourselves in there for 2 days.

impact-hub-munich

One of the nice things about Docker is that it encourages developers to re-use existing infrastructure components by design. Images are defined incrementally by selecting a base image, and building additional functionality on top of it. For instance, the natural way to create a Tomcat image would be to choose a base image that already brings a JDK and install Tomcat on top of it. Or even simpler, choose an existing Tomcat image from the Docker Hub. As our services are already built as fat JARs with embedded web servers, things were almost trivial.

Each service should run in a standardized container with the executed JAR file being the sole difference. Therefore, we chose to use only one service image and inject the correct JAR using Docker volumes for development. On top of that, we needed additional standard containers for nginx (dockerfile/nginx) and RabbitMQ (dockerfile/rabbitmq). Each service container has a dependency on RabbitMQ to enable communication, and the nginx container needs to know where the Routing service resides to fulfill its role as a reverse proxy. All other dependencies can be resolved at runtime via any service discovery mechanism.

As a first concrete example, this is the Dockerfile for our service image. Based on Oracle’s JDK 8, there is not much left to do except for running the JAR and passing in a few program arguments:

FROM dockerfile/java:oracle-java8
MAINTAINER Christian Kroemer (christian.kroemer@comsysto.com)
CMD /bin/sh -c 'java -Dport=${PORT} -Damq_host=${AMQ_PORT_5672_TCP_ADDR} -Damq_port=${AMQ_PORT_5672_TCP_PORT} -jar /var/app/app.jar'

After building this image, it is ready for usage in the local Docker repository and can be used like this to run a container:

# start a new container based on our docker-service-image
docker run docker-service-image
# link it with a running rabbitmq container to resolve the amq dependency
docker run --link rabbitmq:amq docker-service-image
# do not block and run it in background
docker run --link rabbitmq:amq -d docker-service-image
# map the container http port 7000 to the host port 8000
docker run --link rabbitmq:amq -d -p 8000:7000 docker-service-image
# give an environment parameter to let the embedded server know it has to start on port 7000
docker run --link rabbitmq:amq -d -e "PORT=7000" -p 8000:7000 docker-service-image
# inject the user service fat jar
docker run --link rabbitmq:amq -d -e "PORT=7000" -v HASH_COLLISION_PATH/user/build/libs/user-1.0-all.jar:/var/app/app.jar -p 8000:7000 docker-service-image

Very soon we ended up with a handful of such bash commands that we pasted into our shells over and over again. Obviously, we were not exactly happy with that approach, so we started to look for more powerful tools in the Docker ecosystem and stumbled upon fig (which had not yet been deprecated in favor of docker-compose at that time).

Moving on: Docker Compose for some degree of service orchestration

Docker-compose is a tool that simplifies the orchestration of Docker containers all running on the same host system based on a single docker installation. Any `docker run` command can be described in a structured `docker-compose.yml` file and a simple `docker-compose up` / `docker-compose kill` is enough to start and stop the entire distributed application. Furthermore, commands such as `docker-compose logs` make it easy to aggregate information for all running containers.

fig-log-output

Here is an excerpt from our `docker-compose.yml` that illustrates how self-explanatory those files can be:

rabbitmq:
  image: dockerfile/rabbitmq
  ports:
    - ":5672"
    - "15672:15672"
user:
  build: ./service-image
  ports:
    - "8000:7000"
  volumes:
    - ../user/build/libs/user-1.0-all.jar:/var/app/app.jar
  environment:
    - PORT=7000
  links:
    - rabbitmq:amq

Semantically, the definition of the user service is equivalent to the last sample command given above except for the handling of the underlying image. The value given for the `build` key is the path to a directory that contains a `Dockerfile` which describes the image to be used. The AMQ service, on the other hand, uses a public image from the Docker Hub and hence uses the key `image`. In both cases, docker-compose will automatically make sure the required image is ready to use in the local repository before starting the container. A single `docker-compose.yml` file consisting of one such entry for each service is now sufficient for starting up the entire distributed application.

An Aside: Debugging the application within a Docker container

To be able to debug an application running in a Docker container from the IDE, we need to use remote debugging, just as for any physical remote server. For that purpose, we defined a second, debug-enabled service image with the following `Dockerfile`:

FROM dockerfile/java:oracle-java8
MAINTAINER Christian Kroemer (christian.kroemer@comsysto.com)
CMD /bin/sh -c 'java -Xdebug -Xrunjdwp:transport=dt_socket,address=10000,server=y,suspend=n -Dport=${PORT} -Damq_host=${AMQ_PORT_5672_TCP_ADDR} -Damq_port=${AMQ_PORT_5672_TCP_PORT} -jar /var/app/app.jar'

This will make the JVM listen for a remote debugger on port 10000 which can be mapped to any desired host port as shown above.

What we got so far

With a local installation of Docker (on a Mac using boot2docker http://boot2docker.io/) and docker-compose, starting up the whole application after checking out the sources and building all JARs is now as easy as:

  • boot2docker start (follow instructions)
  • docker-compose up -d (this will also fetch / build all required images)
  • open http://$(boot2docker ip):8080/overview.html

Note that several problems originate from boot2docker on the Mac. For example, containers cannot be accessed via `localhost`, but only via the IP of the boot2docker VM, since Docker actually runs inside a VirtualBox image.

In an upcoming blog post, I will outline one approach to migrate this setup to a more realistic environment with multiple hosts using Vagrant and Ansible on top. Until then, do not forget to check if Axel Fontaine’s Continuous Delivery Training at comSysto is just what you need to push your knowledge about modern deployment infrastructure to the next level.

Developing a Modern Distributed System – Part III: Provisioning with Docker, Vagrant and Ansible

Part II of our blog post series on ‘Developing a Modern Distributed System’ featured our first steps with Docker. In a second lab in early 2015, we tried to better understand the required changes in a production-like deployment. Without the assumption of all containers running on the same host – which makes no sense for a scalable architecture – Docker links and docker-compose are no longer valid approaches. We wanted to get the following three-node setup to work:

3-node-architecture

First of all, we created an automated Docker-Hub build linked to our github repository for rebuilding images on each commit. With that, the machines running the containers no longer had to build the images from Dockerfiles themselves. We used Vagrant to run three standard Ubuntu VMs and Ansible to provision them which included:

  • install Docker
  • upload service JARs that should be linked into the containers
  • upload static resources for nginx’s `/var/www` folder
  • run docker containers with correct parameterization (as we did with docker-compose before), still with some hardcoding to wire up different hosts

Why Ansible? First, some tool is required to avoid manually typing commands via ssh on multiple simultaneous sessions in an environment with multiple hosts. Second, Ansible was an easy choice because some of us already had experience using it while others wanted to give it a try. And last but not least, labs at comSysto are just the right place to experiment with unconventional combinations, see where their limitations are and prove they can still work! We actually achieved that, but after a `vagrant destroy` it took a full 20min on a developer machine to be up and running again. Not exactly the round-trip time you want to have while crafting your code… We needed to optimize.

The biggest improvement we were able to achieve came from using a custom Vagrant base box with a ready-to-go Docker environment. Besides installing Docker, we also pre-fetched all images from the Docker Hub right away, which brings a huge productivity boost on slow internet connections. Even if images change, the large base images are typically pretty stable, hence download times could be reduced dramatically. The Docker image itself could also be optimized by using a minimal JDK base image such as jeanblanchard/busybox-java:8 instead of dockerfile/java:oracle-java8, which is built on top of Ubuntu.

Furthermore, we used CoreOS instead of Ubuntu as the operating system to get the base box smaller and faster to start up. CoreOS is a minimal OS designed to run Docker containers and do pretty much nothing on top of that. That also means it does not contain Python which is required to provision the VM using Ansible. Fortunately, Ansible can be installed using a specific coreos-bootstrap role.

Provisioning the running VMs with updated versions of our services, instead of destroying and rebuilding them from scratch, brought the round-trip time down to little more than a minute, of which around 30s were required to rebuild all fat JARs.

Let’s have a closer look at a few aspects of that solution. First, we start a standard CoreOS box with Vagrant, and provision it with the following Ansible playbook:

- hosts: all
  gather_facts: False
  roles:
    - defunctzombie.coreos-bootstrap

- hosts: all
  gather_facts: False
  tasks:
    - name: Prepare latest Docker images to make start of fresh VMs fast in case these are still up-to-date
      command: docker pull {{item}}
      with_items:
        - dockerfile/rabbitmq
        - chkcomsysto/hash-collision-service
        - chkcomsysto/hash-collision-service-debug
        - chkcomsysto/hash-collision-nginx

Using `vagrant package` and `vagrant box add` we immediately create a snapshot of that VM state and make it available as a base box for further usage. After that, the VM has fulfilled its destiny and is destroyed right away. The new base box is used in the `Vagrantfile` of the environment for our application which, again, is provisioned using Ansible. Here is a snippet of the corresponding playbook:

- hosts: service
  sudo: yes
  gather_facts: False
  tasks:
    - name: Create directory for runnable JARs
      file:
        path: /var/hash-collision
        state: directory
    - name: Upload User service runnable JAR
      copy:
        src: ../../../user/build/libs/user-1.0-all.jar
        dest: /var/hash-collision/user-1.0-all.jar
    - name: Pull latest Docker images for all services
      command: docker pull chkcomsysto/hash-collision-service-debug
    - name: Start User service as a Docker container
      command: >
        docker run -d
        -p 7000:7000 -p 17000:10000
        --expose 7000 --expose 10000
        -e PORT=7000 -e HOST=192.168.60.6 -e AMQ_PORT_5672_TCP_ADDR=192.168.60.5 -e AMQ_PORT_5672_TCP_PORT=5672
        -v /var/hash-collision/user-1.0-all.jar:/var/app/app.jar
        chkcomsysto/hash-collision-service-debug

Where this leaves us

As we have virtualized pretty much everything, the only prerequisite left was a local Vagrant installation based on VirtualBox. After running a `quickstart-init-box.sh` script once to build the Vagrant base box from scratch, executing a `quickstart-dev-mode.sh` script was sufficient to build the application, start up three VMs with Vagrant, provision them with Ansible, and insert sample data. For a full development round-trip on a running system, another `refresh-dev-mode.sh` script was meant to build the application, provision the running VMs with Ansible, and again insert sample data (note that this is always required, as we were still using in-memory storage without any persistence).

This setup allows us to run the entire distributed application in a multi-host environment during development. A more realistic approach would of course be to work on one component at a time, verify its implementation via unit tests and integrate it by starting this particular service from the IDE configured against an environment that contains the rest of the application. It is easy to see how our solution could be modified to support that use case.

Next Steps & Challenges

For several reasons, the current state feels pretty immature. First, we need to inject its own IP and global port into each container. It is questionable whether a container should even need to know its identity. An alternative would be to get rid of the heartbeat approach, in which every service registers itself, and instead build service discovery based on Docker metadata with etcd or the like.

Another area for improvement is our almost non-existent delivery pipeline. While uploading JARs into VMs is suitable for development, it is far from ideal for a production delivery. The Docker image should be self-contained, but this requires a proper build pipeline and artifact repository that automates everything from changes in the service code to built JARs and fully functional Docker images ready to be started anywhere. Non-local deployments, e.g. on AWS, are also an interesting area of research in which the benefits of Docker are supposed to shine.

Last but not least, we need to work on all kinds of monitoring which is a critical part of any distributed application. Instead of using ssh to connect to a VM or remote server and then ssh again into the containers running there to see anything, it would be more appropriate to include a dedicated log management service (e.g. ELK) and send all logs there right away. On top of that, well-defined metrics to monitor the general health and state of services can be an additional source of information.

Obviously, there is a lot left to explore and learn in upcoming labs! Also crazy about DevOps, automation and self-contained deployment artifacts instead of 20th-century-style-delivery? See it all in action at our Continuous Delivery Training!

Free Wicket Guide

Our lean Java Expert Andrea Del Bene has written an excellent user guide about the Apache Wicket web framework. Being heavy users of the framework ourselves, we have blogged about Apache Wicket earlier and described its general advantages and best practices as well as concrete scenarios such as integration with Google Calendar. On top of Andrea’s efforts, we have included this expertise to provide a comprehensive documentation covering all the way from the framework’s basic structure and concepts to advanced topics such as complex forms, security, AJAX, or Spring integration.

The result has recently been published on http://wicketguide.comsysto.com/ for easy online access.