Recently, Cloud Foundry developer advocate, Alvaro Videla, gave a talk at the Norwegian Developer’s Conference, Norway’s largest conference covering .NET and Agile development. Alvaro co-authored the book, RabbitMQ in Action, and his talk shows us why we should decouple our code and how to insert a messaging architecture with RabbitMQ into a Node.js application. He provides sample code to run the app on CloudFoundry with several other components—Sock.js, MongoDB and Redis. This article summarizes Alvaro’s talk and uses his example app, Cloudstagram, to show a similar, basic version of how Instagram scales with RabbitMQ, Redis, and Celery. If you would like to watch the full video of Alvero’s presentation, you can find it and the sample code below.
Why do we need messaging?
If we look at a simple example of a classic web application, like implementing a photo gallery, we could decompose it into two parts—uploading an image and viewing the gallery. This is a fairly simple app, but it gets more complex as new requirements arrive, and they always do. First, we might have a product manager add the requirement for notifying the friends of a user who uploaded. Then, a social media guru gets requirements approved for a point system to award badges to users who use the app as well as adding each post to twitter. The sysadmin finds that full size images are tripling the bandwidth used, and they need to be resized. And, another app developer needs to call your app from Python or Java. Of course, the user doesn’t care what goes on in the background—they just don’t want to wait for the application.
Without a messaging paradigm, applications like this end up being developed to execute the above functions synchronously, which means the user waits for all steps to complete. Let’s take a look at how this code might be written, using Erlang in the examples.
How the code typically evolves through each new requirement
First, we write a program to move the file up to a temp file system.
Then, we add the requirement for resizing the picture by calling a second function.
Then, we were asked to notify friends when a new image is uploaded. For this, we add a third function.
Then, we needed to add points to the user to build up their badges.
Lastly, we add a way for the user’s image to be tweeted.
How the code evolves with messaging
In the code above, there may be a way to scale some by adding web servers, but what happens if we need to add 30 more functions and many if then statements? What happens if we need to speed up image conversion? What happens if we need to wait to post to Twitter? What happens if we have to add another image size or format? What happens if we need to add an email notification? What if we decide to move a function to a different language without downtime? Each time we keep adding to the code invoked by the user, the harder we make it to scale. With messaging, we can scale much more easily.
With messaging, the code becomes decoupled. Below, we transform the previous example with a common publish/subscribe pattern. The first implementation has code to put the image on a file system as before, but, then, we send a message with the user identifier and the image identifier as metadata on a queue or exchange. The code doesn’t care what happens after that, and there is no waiting for other if-then statements or functions to run while the user waits. Once the message fires, you could say that the system is now globally aware that the event happened for a specific user and image—both user and app can go on about their business. There is no second implementation of the image_controller code, and you don’t have to update the first implementation of the code to get the other functionality deployed.
We can add the additional functions to our application and trigger them from the events sitting on a message queue. In the code above, a friends notifier consumes the message from the queue, grabs the friend list, and sends out friend notifications. The points manager sees the message on the queue, and updates the user’s points, calculating whatever it needs to. The resizer gets a message with the user and image ID, grabs the original image, and processes it.
Each of these functions as well as any additional functions can subscribe to a message queue and process the message separately and asynchronously. Since the functions are now decoupled, they can run at different times, on different servers, scale up, scale down, or go down without bringing down the whole system. As well, we could wait to run some processes at night or prioritize more compute power for image resizing versus Twitter updates or point calculation.
The Cloudstagram Architecture and Example
The example functions above are collected into Alvaro’s Cloudstagram sample app and mock some key features within Instagram to show how messaging can be implemented in a cloud-based architecture. From a user experience perspective, the app allows us to see resized images in real time—as they are uploaded, they are resized, land in our timeline, update followers, and more. The application’s architecture includes the following:
- Node.js: The web application portion is written using the Express.js framework
- MongoDB: Images are stored and served from GridFS, and each image upload creates an imageid to be used later in the exchanges and queues
- Redis: Acts as primary data store for the main application data, including user sessions
- RabbitMQ: Provides the queues for image processing and real-time notification broadcasting
From a messaging perspective, the application uses three exchanges—Image Upload, New Image Event, and Image Broadcast. These are also linked together for some use cases.
The Image Upload Exchange
Whenever a user uploads an image, it is first stored in MongoDB. If successful, a message is sent to a new image exchange with the userid, filename, comment, date, and mime type. In the image below, many users can upload images, and these are the producers that connect to the upload exchange and resize queue. The image resizers were written in Clojure, and they consume from this queue, resize the images, and rebroadcast information to the New Image Event Exchange. It’s important to note that the resizers act as a consumer, do their job, and act as a producer for a new event.
The New Image Event Exchange
In this exchange, the resizers trigger the new image exchange and populate three queues—add image to user, new image, and image to followers. The new image queue will push the image to a latest images list in Redis. The followers queue adds the images to the user’s followers. Lastly, the to user queue will register the image in Redis for the user’s timeline and republish the image to the Broadcast Exchange.
The Image Broadcast Exchange
As we saw before, the consumer of the prior queue becomes a producer for this exchange. Similarly, this exchange also updates three queues—broadcast to uploader, broadcast to anonymous users, and broadcast to followers. To broadcast this information, we use the Thumper library’s methods to send information to users with WebSockets via Sock.js. In this broadcast model, you can also add Node.js servers, and they will grab the message if they own the session for a given user, allowing for scale.
What we’ve accomplished with this architecture
With this architecture, we have achieved a significant level of loose coupling to provide the scale of distributed computing—what was once executed in one single chunk of code on one server is now split across many exchanges and queues. Now, users can go do other things without waiting on the app. At virtually any point, we can also add or remove servers to increase throughput. Importantly, we have captured an event that can be used to invoke any other code that comes up in future requirements.
There are additional benefits. Because RabbitMQ is multi-protocol, we can now message with MQTT, STOMP, and AMQP or even HTTP. Since RabbitMQ is written in Erlang, we have a highly concurrent platform with message passing and supervisory processes built in. There tons of developer tools and a plug-in architecture that makes RabbitMQ a true, polyglot platform—you can develop in PHP, Node.js, Erlang, java, Ruby, .NET, Haskell, Clojure, Akka, Scala, and many other clients. There is even an ability to make Hadoop talk with RabbitMQ via AMQP. As well, Spring has an AMQP project, and Grails also has a plugin.
- Learn more about Cloudstagram and download the source
- Learn how real-time geo-tracking can be accomplished with Python, Celery, and RabbitMQ
- Read how Roblox scaled their architecture with RabbitMQ
- Find out how Nokia, Socialvane, and Superfeedr use RabbitMQ.
- Read over 50 similar articles on RabbitMQ at the vFabric Blog
- Watch Alvaro’s presentation:
About the Author