Messaging for IoT

Deploying a message broker for an IoT use case introduces some new challenges for broker scalability. We’re now talking about thousands of connections, consumers and destinations, which makes us think much more carefully about how we provision, configure and monitor our messaging infrastructure. In this post I’ll try to sum up some of the techniques that can be used with the current Apache ActiveMQ to scale it better for IoT deployments. I’ll also describe some of the new features we developed for the 5.12.0 release to make it a better fit in this new world. And finally I’ll try to explain where we can go from here and what we can work on in the future.

ActiveMQ Vertical Scaling

The two most common messaging protocols used for IoT are MQTT and AMQP, and we spent significant effort making them rock-solid in the latest releases. But protocols aren’t everything, and every time someone starts to play with the broker, the same question arises: How can I get the maximum scalability from a single broker instance? So, here are some tips:

  • Always start with the general broker scaling techniques. That basically means using as few threads as possible, no matter how many connections and destinations your broker handles. So use the NIO transport and turn off the thread-per-destination setup.
  • The first implementation of the MQTT protocol for ActiveMQ mapped QoS 1 and QoS 2 subscribers internally to JMS durable subscribers. JMS durable subscribers carry heavy state and don’t scale very well. In later versions, you can opt for another implementation that uses virtual topics instead and should scale much better.
    
    <transportConnectors>
      <transportConnector name="mqtt+nio"
        uri="mqtt+nio://0.0.0.0:1883?transport.subscriptionStrategy=mqtt-virtual-topic-subscriptions"/>
    </transportConnectors>
    
  • One more reason to try the new 5.12.0 release is the improved KahaDB message store, which can now preallocate journal files. You can find more information about this in this post; on some file systems these tweaks can bring significant performance improvements.
  • All these little configuration tweaks are summed up in the new example configuration file, which you can find in the

    examples/conf/activemq-mqtt.xml

    file in the new 5.12.0 distribution.
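
The tips above might translate into broker configuration along these lines. This is only a sketch, not the contents of the shipped example file: the broker name, port and directory are placeholders, and the chosen preallocation values are assumptions that should be checked against the KahaDB documentation for your version and file system.

```xml
<!-- Sketch only: brokerName, port and directory are placeholders. -->
<broker xmlns="http://activemq.apache.org/schema/core"
        brokerName="iot-broker"
        dedicatedTaskRunner="false"> <!-- no dedicated thread per destination -->

  <destinationPolicy>
    <policyMap>
      <policyEntries>
        <!-- dispatch from queues without a separate dispatch thread -->
        <policyEntry queue=">" optimizedDispatch="true"/>
      </policyEntries>
    </policyMap>
  </destinationPolicy>

  <persistenceAdapter>
    <!-- preallocation values are assumptions; benchmark on your own file system -->
    <kahaDB directory="${activemq.data}/kahadb"
            preallocationScope="entire_journal"
            preallocationStrategy="zeros"/>
  </persistenceAdapter>

  <transportConnectors>
    <!-- NIO transport: a shared thread pool instead of a thread per connection -->
    <transportConnector name="openwire+nio" uri="nio://0.0.0.0:61616"/>
  </transportConnectors>
</broker>
```

The same thread-per-destination switch is commonly set as a JVM system property instead, -Dorg.apache.activemq.UseDedicatedTaskRunner=false, in which case the broker attribute is not needed.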

    Vertically scaling the broker is only one part of the equation. There are a couple more important questions that need to be answered for a successful IoT deployment.

    SSL

    Many IoT devices depend on SSL certificates for authentication. This is nothing new; we saw (and supported) it in traditional messaging setups as well, but the difference is, again, in the scale. It’s easy to manually maintain a keystore with a handful of certificates in it. It’s an entirely different story when the number of certificates starts to rise.

    In 5.12.0 we added some new features to help people deal with this. There are a couple of standardised tools for solving these problems that are supported in the JDK. So now you can use a Certificate Revocation List (CRL), which provides an easy way to revoke invalid certificates at runtime.

    You can also use OCSP (Online Certificate Status Protocol), which provides an even more automated way to communicate with your certificate authority. You can find more information about these features here.
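
As a rough sketch of the CRL part, the revocation list is pointed to from the broker’s SSL context. The file names, passwords and port below are placeholders, and the needClientAuth option on the connector is my assumption about how client certificates would be required in such a setup; treat it as an illustration rather than a reference configuration.

```xml
<!-- Sketch only: key store, trust store, CRL file and port are placeholders. -->
<broker xmlns="http://activemq.apache.org/schema/core" brokerName="iot-broker">

  <sslContext>
    <sslContext keyStore="file:${activemq.conf}/broker.ks"
                keyStorePassword="password"
                trustStore="file:${activemq.conf}/broker.ts"
                trustStorePassword="password"
                crlPath="${activemq.conf}/revoked.crl"/> <!-- certificates listed here are rejected -->
  </sslContext>

  <transportConnectors>
    <!-- assumption: devices authenticate with client certificates -->
    <transportConnector name="mqtt+ssl"
        uri="mqtt+ssl://0.0.0.0:8883?needClientAuth=true"/>
  </transportConnectors>
</broker>
```

With a setup like this, revoking a device means publishing an updated CRL rather than editing the trust store by hand.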

    I think that SSL certificate provisioning is a much bigger story for IoT deployments (and clouds in general), and there are already some interesting projects emerging to solve it, like pki.io. We’ll try to support whatever people use in this space, and right now, with the support for CRL and OCSP, you have a little more flexibility when dealing with your certificates.

    Monitoring under stress

    One topic I often encounter when talking about IoT deployments is how to monitor a broker that is at the limits of its vertical scaling (whatever they are). We usually monitor the broker behaviour using JMX or advisory messages. While these are perfect tools when the broker is well within its limits, things get less optimal when you’re on the edge.

    With a large number of destinations and connections coming and going, registering MBeans and firing advisory messages can become very expensive, especially when done in high volume. It can get in the way of the actual work the broker needs to do.

    Folks that want to get the maximum from their broker instance usually just turn everything off, like

    <broker useJmx="false" advisorySupport="false" ...>

    but that leaves us with no insight into the state of the broker, beyond peeking into the logs.

    One solution we came up with is making MBean registration more selective.

    Starting with 5.12.0, you can define the MBean names you want to suppress from registering by defining them on the management context, like

    <managementContext>
      <managementContext
        suppressMBean="endpoint=dynamicProducer,endpoint=Consumer,
                       connectionName=*,destinationName=ActiveMQ.Advisory.*"/>
    </managementContext>
    

    Using this feature, you can customize the view of the broker you need, and by suppressing registration/unregistration of high-frequency ephemeral MBeans you can help your broker deal with the load. So even in a high scale/load scenario, you can still get the basic metrics of the broker. There is much more we can do in this space, by defining custom views and such, so stay tuned.

    Legacy MQTT

    Apache ActiveMQ implements the MQTT 3.1.1 specification, but MQTT is not a new protocol and there is already a vast number of deployed devices that use older (3.1) clients. We made an effort to enable known use cases where older clients expect different behaviour from what’s in the 3.1.1 specification. So, for example, you can enable publishing to the “dollar topics” and see a difference in behaviour on unsuccessful subscription attempts. We’ll try to cover all these corner cases and provide support for legacy clients where it is sensible to do so.

    ActiveMQ Artemis

    In case you weren’t paying attention, there has been a bit of consolidation in the land of Java message brokers. The HornetQ broker has been donated to Apache and is now part of the ActiveMQ project. Its asynchronous core gives us a good base for the next-generation ActiveMQ, which should scale better and perform better than the current broker. It already has initial support for the AMQP and MQTT protocols. With these protocols hardened and the features above fully implemented, it should be a very good message broker for the backbone of your IoT infrastructure.

    Help from the friends

    Having a good message broker is surely an important piece of the puzzle. But for truly large IoT deployments we’ll need more than that. We need a more complex infrastructure that will allow us to partition our traffic (in terms of connections, destinations, etc.) and provide fault tolerance and high availability. There are a couple of interesting projects that can help build an elastic messaging infrastructure for IoT needs.

    Qpid Dispatch Router provides broker-less routing of messages between clients, brokers and other endpoints based on AMQP. It helps build optimal topologies and route messages from clients to their final destination. For example, a dispatch router can serve as a gateway between clients and brokers, helping a large number of connections or destinations be concentrated and partitioned over multiple brokers without client awareness. That’s only one example of where adding a router to your messaging network can help. It’s an interesting topic and you’ll hear much more about this space in the future.

    Fabric8 and OpenShift, on the other hand, provide us with an easy way to provision and manage this messaging infrastructure. You can use them to easily deploy new brokers, routers and gateways and to discover existing components. Fabric8 also provides a gateway that can be used to partition the traffic among the endpoints.

    There are many ways to slice this problem, and the final solution certainly depends on the actual use case. But having all these components rock solid and working nicely together is crucial in architecting the final solution.

    Conclusion

    With this post I tried to give some perspective on where we are and where we’re heading with support for IoT use cases. I hope you enjoyed it and found it useful. I’ll cover this topic in much more detail at Voxxed Days Belgrade and ApacheCon in October. To finish, take a look at the awesome Red Hat Summit IoT demo, which shows some of this stuff in practice.

ActiveMQ and HawtIO

We introduced the HawtIO console as a tech preview in the ActiveMQ 5.9.0 release, with the idea of replacing the old and rusty web console in the distribution. Unfortunately, that idea didn’t go down well with the rest of the Apache community, so it was voted out and 5.9.1 was released without it. You can read more on the topic of distributing non-Apache developed web consoles in Apache projects in this (lengthy) thread if you’re interested.

Anyhow, there are a lot of people out there who liked HawtIO and are asking how to use it with future (and some old) releases. So here I’ll try to sum up the different ways ActiveMQ and HawtIO can be used together.

HawtIO is a pure JavaScript application that doesn’t have any server-side component. It uses the Jolokia REST API to access managed servers. As it is a pure JavaScript application, it’s possible to package it as a Chrome application, so you can run HawtIO locally in your browser. Take a look at the HawtIO Get Started guide to see how to do this.

Once you have your console running, you can use its connect form to reach any remote broker that runs the management REST API (5.8.0 and newer).

Notice that the management API uses the /api/jolokia/ path and that by default the ActiveMQ web server listens on port 8161. Just click Connect to remote server and you’ll have access to the broker.

The nice thing is that you can save different broker settings in the application, so it’s easy to connect to any of the brokers you have in your environment with a single click.

So, if you’re a Chrome user or willing to use Chrome apps in this way, there’s really nothing stopping you from accessing remote brokers from your local HawtIO instance.

If this solution is not ideal for you, you can always embed the console back in your ActiveMQ installation. Luckily, it’s a very easy thing to do.

First, you need to download the HawtIO default war, preferably into the webapps/ directory of your installation

cd webapps
wget http://central.maven.org/maven2/io/hawt/hawtio-default/1.3.1/hawtio-default-1.3.1.war

Now add the appropriate web application context to the web server by adding something like

<bean class="org.eclipse.jetty.webapp.WebAppContext">
    <property name="contextPath" value="/hawtio" />
    <property name="war" value="${activemq.home}/webapps/hawtio-default-1.3.1.war" />
    <property name="logUrlOnStart" value="true" />
</bean>

to the etc/jetty.xml

The final step is to configure HawtIO authentication and align it with the broker’s. This is done by providing the following system properties

-Dhawtio.realm=activemq -Dhawtio.role=admins 
-Dhawtio.rolePrincipalClasses=org.apache.activemq.jaas.GroupPrincipal

The easiest way to do it is to add them to the ACTIVEMQ_OPTS variable in the bin/activemq startup script.

Now, run your broker and enjoy the hawtness.

Finally, if you’re interested in a great platform for running ActiveMQ and other integration technologies (HawtIO included), you should definitely give fabric8 a try. It provides an easy way to provision, configure and manage a vast array of integration endpoints (brokers included).

Or, if you prefer a standalone broker installation, you can try the Red Hat distributions, which still come with HawtIO included by default.

So, even if HawtIO is not distributed with ActiveMQ, you can easily use it in a number of different setups, depending solely on your preference.

MQTT over WebSocket transport in ActiveMQ

So, more and more of our users want to connect to ActiveMQ directly from the browser using WebSockets. For quite a while now we have supported Stomp clients, which are really easy to use from JavaScript. Now, as more mobile users are trying the same approach, we have added support for the very efficient binary MQTT protocol to the mix for the upcoming 5.9.0 version.

The good thing is that you really don’t have to change anything on the broker side to support it. Both Stomp and MQTT can work over the same connector, as clients identify the protocol they want to use when they initialize the connection. We also provided a nice little demo using the Eclipse Paho JavaScript client. You can play with the demo if you start the broker with the activemq-demo.xml config, like

bin/activemq console xbean:conf/activemq-demo.xml

And go to

http://localhost:8161/demo/mqtt/

You can also peek at the source code
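
For reference, the broker side of this comes down to a single WebSocket transport connector. The sketch below is not taken from activemq-demo.xml; the connector name and port are placeholders.

```xml
<transportConnectors>
  <!-- one WebSocket connector; Stomp and MQTT clients both connect here
       and pick their protocol when they open the connection -->
  <transportConnector name="ws" uri="ws://0.0.0.0:61614"/>
</transportConnectors>
```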

If you’re interested in messaging for web and mobile, I’ll be talking more about Stomp, MQTT, WebSocket and stuff at OSCON later this July, so pop by if you can. Happy messaging!

Lightweight Messaging For Web And Mobile With Apache ActiveMQ

Messaging was once a thing of “enterprises”, but times are changing fast and devs now want to use it from virtually any environment. I thought it was important to talk about messaging technologies available for web and mobile, so I’ll give a talk about it at CamelOne and OSCON. If you’re attending one of those, give me a ping so we can have a chat over some beers.

Apache ActiveMQ 5.7.0 released

We managed to stick to our goal of making more frequent releases, and today we’re happy to announce Apache ActiveMQ 5.7.0. The main goal of this release was Java 7 compatibility. The project is built using JDK 6, but it has been tested to work properly with Java 7. This was needed as we now use Camel 2.10, which also added support for Java 7.

Besides this, there are a couple of new features and close to two hundred bug fixes in this release. Some of the prominent new features are:

  • Secure WebSocket (wss) transport – which means that you can now securely connect to the broker directly from your browser. You can find more information about it here
  • Broker Redelivery – which allows you to define a redelivery policy such that the broker will resend a message to a different consumer in case processing fails. Here you can find more information on when you’d want to use this feature and how.
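
To give a feel for broker redelivery, here is a minimal sketch of how such a policy might be configured as a broker plugin. The queue name and the delay and retry values are made up for illustration; check the redelivery documentation for the options your version supports.

```xml
<!-- Sketch only: queue name, delays and retry counts are illustrative. -->
<broker xmlns="http://activemq.apache.org/schema/core"
        schedulerSupport="true"> <!-- delayed redelivery relies on the scheduler -->
  <plugins>
    <redeliveryPlugin fallbackToDeadLetter="true"
                      sendToDlqIfMaxRetriesExceeded="true">
      <redeliveryPolicyMap>
        <redeliveryPolicyMap>
          <redeliveryPolicyEntries>
            <!-- policy for one specific (hypothetical) queue -->
            <redeliveryPolicy queue="SpecialQueue"
                              maximumRedeliveries="4"
                              redeliveryDelay="10000"/>
          </redeliveryPolicyEntries>
          <defaultEntry>
            <!-- fallback policy for all other destinations -->
            <redeliveryPolicy maximumRedeliveries="4"
                              initialRedeliveryDelay="5000"
                              redeliveryDelay="10000"/>
          </defaultEntry>
        </redeliveryPolicyMap>
      </redeliveryPolicyMap>
    </redeliveryPlugin>
  </plugins>
</broker>
```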

We also improved our storage locking mechanism, so now it’s completely pluggable. This means that locking is not tied to the store itself, but is configured separately. You can also tune it or implement new locking strategies to suit your environment. We also provided a new database locker, called Lease Database Locker, which should do a much better job in JDBC master/slave scenarios.

So, that’s about it. Give 5.7.0 a try and let us know what you want to see in 5.8.0.

And while I have your attention, there’re two upcoming sessions where you can learn more about ActiveMQ:

Pluggable ActiveMQ Storage Lockers

Shared storage master/slave broker topologies depend on successful storage locking, meaning that only a single broker (the master) is active and uses the message database. So far locking was tied to a specific message store, so KahaDB used shared file locking while the JDBC store used a specialized database table to keep slaves from starting. This works fine for most use cases, but sometimes folks need to use a custom locker (like when standard file locking doesn’t work on their NFS) or tune the existing solutions.

For the upcoming 5.7.0 release we introduced pluggable storage lockers, which means that message storage locking is totally separated from the store itself. That means that you can now use any locking strategy with any store. An example configuration is shown below:

<persistenceAdapter>
    <kahaDB directory = "target/activemq-data">
        <locker>
            <shared-file-locker lockAcquireSleepInterval="5000"/>
        </locker>
    </kahaDB>
</persistenceAdapter>

You can find more details on this new feature here. It will also allow us to implement new locking strategies, based on ZooKeeper for example, which will make high-availability setups even easier. So stay tuned.
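
As a second illustration of mixing stores and lockers, a JDBC store using the Lease Database Locker might be configured roughly as follows. The data source reference and the interval values are assumptions made for the sake of the example; as far as I know the keep-alive period should be shorter than the locker’s acquire interval.

```xml
<persistenceAdapter>
    <!-- "#mysql-ds" is a hypothetical data source bean defined elsewhere -->
    <jdbcPersistenceAdapter dataDirectory="${activemq.data}"
                            dataSource="#mysql-ds"
                            lockKeepAlivePeriod="5000">
        <locker>
            <!-- slaves retry acquiring the lease every 10 seconds -->
            <lease-database-locker lockAcquireSleepInterval="10000"/>
        </locker>
    </jdbcPersistenceAdapter>
</persistenceAdapter>
```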

Messaging Anti-Patterns: Part 3

OK, after the basic anti-patterns discussed in parts 1 and 2 of this series, it’s time to discuss somewhat more sophisticated messaging anti-patterns and how to write better messaging-oriented applications.

Using appropriate message type

So let’s start with the first principles of messaging. Why do we want to use a message broker in our architecture? The most probable answer is to exchange data between our systems in an asynchronous and loosely-coupled way. In those terms, we should see how our data is best represented as messages. The JMS specification defines a few message types, which I’d classify into two groups. In the first group I’ll put TextMessages and BytesMessages, which provide a kind of blank sheet in terms of what kind of data is carried in the message body. I think that you should strongly consider using only these message types, as they provide a framework for loosely-coupled data exchange without introducing unnecessary complexity. Let’s briefly cover the other message types and discuss what they bring to the picture:

ObjectMessages

If you haven’t already, be sure to read the post Jeff Mesnil wrote on this subject. It sums up pretty well what’s wrong with the ObjectMessage type. In a nutshell:

  • You can get into a classloading mess, as your systems need to share classpath information to be able to (de)serialize the same objects. This increases coupling between the systems, which is what we wanted to avoid in the first place.

  • It adds unnecessary performance penalties for serializing and transferring whole objects, instead of only the valuable data

StreamMessages

In a similar fashion, StreamMessage adds some semantics over the basic BytesMessage. Instead of treating all bytes equally, you can now write and read strings, integers, objects, etc. This was very valuable at the time the JMS API was designed, but today, in the age of all these advanced serialization frameworks, both binary (Protobuf and co.) and text-based (Jackson and co.), I think you’d better forget about it. Encode your data with some of the tools you’re probably already using in your applications and transfer it using BytesMessages or TextMessages.

MapMessages

MapMessages are just one more example of an overly specialized interface. It’s true that the map (properties) collection format is commonly used, but it’s just one of many, so why depend on it and couple your application to it?

Additionally, using ObjectMessages, StreamMessages and MapMessages ties your solution to JMS land and prevents you from exchanging messages with Stomp-based clients, for example.

Avoid fat-messages (Blobs)

While it’s certainly possible to move large messages through message brokers, you should think twice about whether you want to do that. And when I say large messages, I mean gigabytes and gigabytes of data in a single message. If you find yourself with a requirement like that, ask yourself whether you really need a messaging service to move this data. Broker internals are optimized to move large numbers of messages, and adding big blobs to the mix can cause various side effects (connection controls kicking in, blocking other clients, exhausting resources, etc).

The usual pattern suggested for this use case is to use some traditional transport, like FTP, for moving large data, and to use the messaging service to notify clients when data is ready for download (and where to find it). ActiveMQ even provides an API which will do this for you under the cover of the JMS API (http://activemq.apache.org/blob-messages.html)

That’s it for today, choose your message type and size wisely so you don’t end up with tightly coupled systems and brokers/clients struggling with oversized messages.

Messaging Anti-Patterns: Part 2

OK, now that you promised that you won’t store your messages in the broker (see part one of this post series), let’s consider one more thing that you should avoid when dealing with messaging systems.

Short-lived connections

One thing that recurs regularly is folks (knowingly or unknowingly) creating and tearing down connections to the broker for every message they produce or consume.

This anti-pattern is especially common in two environments: Stomp and Spring. Stomp is a very lightweight text-oriented messaging protocol, which makes it really easy to write clients in almost any programming language available. This is one of the main strengths of Stomp, and if you’re not a JVM-exclusive shop I strongly recommend taking a detailed look at it. But this poses a problem, of course: a large number of clients are poorly written and/or used inappropriately. For example, you can have a script which, when run, sends or consumes some messages from the broker. So if you don’t take much care, you’ll open a new connection every time, send and consume messages and (hopefully) disconnect from the broker. Then you’ll put your script under load, which will then transfer that load to the broker.

Spring, on the other hand, offers nice abstractions for dealing with JMS brokers, but the problem is that it was designed to be run in a container of some kind, and it expects the container to manage resources for it. So when you write a standalone Java application and don’t take care of this, you end up with messaging clients behaving badly. New connection, session, producer and consumer objects will be created for every message exchange, and that is far from optimal.

Why is all this such a problem? Well, first of all, opening and closing connections requires the broker to do some work, and having a large number of clients opening and closing connections to exchange a single message produces a huge overhead and steals resources the broker could use for other useful work. This is not specific to messaging and brokers. You don’t open and close a database connection for every query (hopefully) for the same reasons. Also, a spike in load in this case can cause a spike in the number of sockets used (as they need some time to shut down on the system) and eventually exhaust system resources.

Besides this, messaging services are all about long-lasting connections and clients. ActiveMQ implements various concepts, like producer flow control and consumer message prefetch, which are aimed at improving the messaging experience and the performance of the whole system. Not only are these mechanisms meaningless in a short-lived connection scenario, they can also introduce additional overhead in the system.

Finally, if you have a network of brokers, information about a large number of message consumers coming and going will be propagated through the network. This can significantly increase broker-to-broker traffic and even impact the stability of remote brokers.

So what’s to be done? In Spring it’s easy: just use some kind of cached connection factory and you’ll be sorted. There’ll be no more creating a new connection, session, producer and consumer for every message exchanged. More resources on ActiveMQ Spring support can be found here, so please give it a read if you’re using JMS Spring clients in your applications.
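
A minimal sketch of what that can look like in a Spring XML context is shown below. The bean ids, broker URL and cache size are placeholders chosen for illustration, and Spring’s CachingConnectionFactory is only one option; a pooled connection factory would serve the same purpose.

```xml
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">

  <!-- plain ActiveMQ connection factory; the broker URL is a placeholder -->
  <bean id="amqConnectionFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
    <property name="brokerURL" value="tcp://localhost:61616"/>
  </bean>

  <!-- wraps the factory above, keeping one connection and caching sessions -->
  <bean id="cachingConnectionFactory"
        class="org.springframework.jms.connection.CachingConnectionFactory">
    <property name="targetConnectionFactory" ref="amqConnectionFactory"/>
    <property name="sessionCacheSize" value="10"/>
  </bean>

  <!-- the template uses the caching factory, not the raw one -->
  <bean id="jmsTemplate" class="org.springframework.jms.core.JmsTemplate">
    <property name="connectionFactory" ref="cachingConnectionFactory"/>
  </bean>
</beans>
```

The important bit is that JmsTemplate is wired to the caching factory; pointing it at the raw ActiveMQConnectionFactory brings back the connection-per-message behaviour described above.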

For Stomp, there’s no silver bullet. But for starters, be aware of what you’re doing and try to reuse your resources smartly. If you do, messages will flow smoothly and you’ll have a stable system.

Messaging Anti-Patterns: Part 1

If you have a hammer, everything looks like a nail, right? We have all witnessed people trying to solve a problem with the wrong technology. Heck, we have probably all done it at one point or another. Common reasons are familiarity with an existing technology stack we have at hand, or the perception that some of its features will justify it all. But using any technology in a way it wasn’t designed for will lead to all kinds of problems, and you’ll eventually be forced to do it properly (probably cursing at the project in question because it hasn’t met your wrong expectations).

In this and a couple of follow-up posts I’ll try to sum up some things we saw in mailing lists, Jiras, etc. related to improper usage of messaging systems (ActiveMQ in particular). Hopefully, it will help people who are considering messaging in their architecture to see if it is the right tool for solving their particular problem.

So let’s kick off with one common mistake people make

Using message queue as a database

Messaging systems are built to asynchronously connect multiple systems, by passing messages between them. So everything is designed with that in mind; how to most efficiently pass messages from producers to consumers. This means that messages are expected to be reasonably short-lived, and not stored in a queue.

From time to time we see people trying to keep application state in the broker: put some messages in a queue, then browse them, cherry-pick just some of them, delete others and similar stuff. While most messaging systems have some kind of support for this, it’s not what they’re primarily designed for. Client APIs, the internal storage system, client-server contracts, etc. are optimized for an entirely different set of tasks.

An example could be a system that wants to keep the single most recent piece of data as a queue message (timestamped or versioned somehow), so that the application can find that message (usually by browsing) and do the housekeeping by deleting stale data. People are sometimes inclined to do this because brokers provide high availability and reconnection logic and can be geographically distributed, which is all fine and well. But brokers are a poor choice for maintaining application state.

The fix is, of course, not to keep any state in the broker. Either use a centralized high-availability database, or keep a local copy of the data and use the messaging system to propagate changes.

So if you find yourself wanting to store some messages in a queue and then later browse them, query them or maintain them, please don’t. Get yourself a database of some kind (relational or not) and manipulate data there. Queues (and topics) are for data that should be consumed as it comes and moved from one system to another as fast as possible. Messages should live in the broker only as long as it takes to consume them, or when something has gone wrong and they cannot be consumed, which should be an exceptional situation rather than what we design for.

If there are any messaging anti-patterns you observed (or designed yourself – c’mon don’t be ashamed), send them to me. I’ll gladly put them on my list and document them in the coming days.

Conference week wrap-up

I had a blast of a week at CamelOne and JEEConf. Both were organized perfectly, with an awesome crowd all around.

CamelOne was packed with FuseSource customers and users, with great feedback on the things we do. There was a lot of interest in Fuse Fabric, which should help folks provision their integration infrastructure with ease. I covered how Fabric can help with complex ActiveMQ deployments and you can find the slides embedded below.

If you have to deploy and manage more than one broker, I strongly recommend taking a look at Fabric and FuseESB Enterprise 7.0, which incorporates it.

JEEConf was a more general Java conference, quite a bit larger than last year. Most of the sessions were in Russian, so I couldn’t actually follow them, but at least I had time to write this blog post :)
I gave a talk on the ActiveMQ Apollo subproject, which is packed with cool new stuff that should bring open source messaging to the next level. Take a look at the slides here

It was well received and I’m looking forward to more feedback as people start playing with it more.
All in all, it was a great week, and it’s always a pleasure to present and get feedback on our projects. Can’t wait for Monday when the new cycle of development begins.