
DevOps Journal: Article

Using Docker For a Complex "Internet of Things" Application

The goal of any DevOps solution is to optimize multiple processes in an organization

View Aater Suleman's @ThingsExpo session here

The goal of any DevOps solution is to optimize multiple processes in an organization. Success does not require that every step of the strategy be automated; an effective plan can leave some work manual. It is important, however, to put processes in place to handle the necessary items.

Flux7 is a consulting group with a focus on helping organizations build, maintain and optimize DevOps processes. The group has a wide view across DevOps challenges and benefits, including:

  • The distinct challenge of a skills shortage in this area, and how organizations are coping with demand on limited resources.
  • The technical requirements: From stacks to scripts, and what works.
  • The practical and political challenges: Beyond the stacks, the human element is a critical success factor in DevOps.

Recently at Flux7, we developed an end-to-end Internet of Things project that received sensor data to provide reports to service-provider end users. Our client asked us to support multiple service providers for his new business venture. We knew that rearchitecting the application to incorporate major changes would prove to be both time-consuming and expensive for our client. It also would have required a far more complicated, rigid and difficult-to-maintain codebase.

We had been exploring the potential of using Docker to set up Flux7's internal development environments and, based on our findings, believed we could use it to avoid a major application rewrite. So we decided to use Docker containers to provide quick, easy, and inexpensive multi-tenancy by creating isolated environments in which to run a separate instance of the app tier for each provider.

What is Docker?
Docker provides a user-friendly layer on top of Linux Containers (LXCs). LXCs provide operating-system-level virtualization by limiting a process's resources. In addition to using the chroot command to change accessible directories for a given process, Docker effectively provides isolation of one group of processes from other files and system processes without the expense of running another operating system.

In the Beginning
The "single provider" version of our app had three components:

  1. Cassandra for data persistence, which we later use for generating each gateway's report.
  2. A Twisted TCP server listening at PORT 6000 for data ingestion from a provider's multiple gateways.
  3. A Flask app at PORT 80 serving as the admin panel for setting customizations and for viewing reports.

In the past, we'd used the following to launch the single-provider version of the app:

nohup python tcp_server.py &  # For firing up the TCP server
nohup python flask_app.py &   # For firing up the admin panel

The Cassandra KEYSPACE was hard-coded in both code bases.

Our New Approach
While Docker is an intriguing emerging technology, it's still in the early stages of development. As might be expected, it has issues remaining to be resolved. The biggest issue for us was that, at this point, Docker can't support multiple Cassandra instances running on a single machine. Consequently, we couldn't provide multi-tenancy by running a separate Cassandra instance for each provider. Another issue was that hosting multiple database instances on a single machine can quickly cause resource shortages.

We addressed both issues with a fairly traditional approach to making an application multi-tenant. We used KEYSPACE as the namespace for each provider in the data store. We also made corresponding code changes to both the data ingestion and web servers by adding the keyspace parameter to the DB accesses. We passed the Cassandra KEYSPACE (the provider ID) to each app instance on the command line, which makes it possible to use custom skins and other features in the future. Thus, we were able to create a separate namespace for each provider in the data store without making changes to the column family schema.
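Reading the keyspace from the command line might look something like the minimal sketch below. The helper name `keyspace_from_argv` and the naming pattern it enforces are illustrative assumptions, not the project's actual code; the returned string would then be handed to whichever Cassandra client the app uses.

```python
import re

# Assumed naming rule for this sketch: a letter followed by letters,
# digits or underscores. The real project may have used a different rule.
_KEYSPACE_RE = re.compile(r"^[A-Za-z][A-Za-z0-9_]*$")


def keyspace_from_argv(argv):
    """Return the provider ID passed as the first command-line argument.

    The provider ID doubles as the Cassandra keyspace name, so it is
    validated before it reaches any database call.
    """
    if len(argv) < 2:
        raise ValueError("usage: <script> <provider_id>")
    provider_id = argv[1]
    if not _KEYSPACE_RE.match(provider_id):
        raise ValueError("invalid provider ID: %r" % provider_id)
    return provider_id
```

A server script would call `keyspace_from_argv(sys.argv)` once at startup, so that launching it as `python flask_app.py provider1` selects the `provider1` keyspace.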

The beauty of our approach was that, by using Docker to provide multi-tenancy, the only code changes needed to make the app multi-tenant were those described above. Had we not used Docker in this way, we'd have had to make major code changes bordering on a total application rewrite.

How We Did It

[Figure: Docker diagram 1]

First, we created a Docker container for the new software version, setting up all of its environments and dependencies. Next, we started a Cassandra container. Even though we weren't running multiple instances of Cassandra, we wanted to make use of Docker's security, administrative and easy configuration features. You can download our Cassandra file from our GitHub here. We started a locally running container serving at PORT 9160 by using this command:

docker run -d -p 9160:9160 --name db flux7/cassandra

We then created a keyspace "provider1" using pycassaShell.
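The same step could also be scripted rather than typed into pycassaShell. The sketch below is an assumption about how that might look, not what we ran: `ensure_keyspace` is a hypothetical helper, and it expects an object that behaves like pycassa's `SystemManager`, offering `list_keyspaces()` and `create_keyspace()`.

```python
def ensure_keyspace(system_manager, provider_id, replication_factor=1):
    """Create the provider's keyspace if it does not exist yet.

    `system_manager` is assumed to behave like pycassa's SystemManager.
    Returns True when a keyspace was created, False when it already existed.
    """
    if provider_id in system_manager.list_keyspaces():
        return False
    system_manager.create_keyspace(
        provider_id,
        replication_strategy="SimpleStrategy",
        # Strategy options are passed as strings in this sketch.
        strategy_options={"replication_factor": str(replication_factor)},
    )
    return True


# Possible usage against the container started above (requires pycassa):
#   from pycassa.system_manager import SystemManager
#   ensure_keyspace(SystemManager("localhost:9160"), "provider1")
```

Checking for the keyspace first keeps the call safe to repeat when a provider is set up more than once.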

We fired up our two code bases on two separate containers like this:

docker run --name remote_server_1 --link db:cassandra -p 6001:6000 flux7/labs python software/remote_server.py provider1
docker run --name flask_app_1 --link db:cassandra -p 8081:80 flux7/labs python software/flask_app.py provider1

Voila! We had a provider1 instance running in no time.

Automation
We found Docker-py extremely useful for automating all of these processes and used:

# Yes. We love Python!
import docker


def start_provider(provider_id, gateway_port, admin_port):
    docker_client = docker.Client(base_url='unix://var/run/docker.sock',
                                  version='1.6',
                                  timeout=100)

    # start a docker container for consuming gateway data at gateway_port
    start_command = 'python software/remote_server.py ' + provider_id
    remote_server = docker_client.create_container(
        'flux7/labs',                         # docker image
        command=start_command,                # start command contains the keyspace parameter, keyspace is the provider_id
        name='remote_server_' + provider_id,  # name the container, name is provider_id
        ports=[(6000, 'tcp'),])               # open port for binding, remote_server.py listens at 6000
    docker_client.start(remote_server,
                        port_bindings={6000: ('0.0.0.0', gateway_port)},
                        links={'db': 'cassandra'})

    # start a docker container for serving admin panel at admin_port
    start_command = 'python software/flask_app.py ' + provider_id
    admin_panel = docker_client.create_container(
        'flux7/labs',                         # docker image
        command=start_command,                # start command contains the keyspace parameter, keyspace is the provider_id
        name='admin_panel_' + provider_id,    # name the container, name is provider_id
        ports=[(80, 'tcp'),])                 # open port for binding, flask_app.py listens at 80
    docker_client.start(admin_panel,
                        port_bindings={80: ('0.0.0.0', admin_port)},
                        links={'db': 'cassandra'})

To complete the solution, we added a small piece of logic to allocate ports for newly added providers and to create a Cassandra keyspace for each one.
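The port allocation can be very simple. The following is a minimal sketch under stated assumptions, not our production code: `allocate_ports` is a hypothetical helper, and the base ports are chosen only so that the first provider receives 6001 and 8081, matching the manual example earlier.

```python
GATEWAY_BASE_PORT = 6000  # host gateway ports start just above this value
ADMIN_BASE_PORT = 8080    # assumed base, so the first provider gets 8081


def allocate_ports(used_ports):
    """Pick the first free (gateway_port, admin_port) pair.

    `used_ports` is the set of host ports already bound to providers.
    Slot n maps to gateway port 6000 + n and admin port 8080 + n; a slot
    is skipped if either of its two ports is already taken.
    """
    slot = 1
    while (GATEWAY_BASE_PORT + slot in used_ports
           or ADMIN_BASE_PORT + slot in used_ports):
        slot += 1
    return GATEWAY_BASE_PORT + slot, ADMIN_BASE_PORT + slot
```

The returned pair can be passed straight to `start_provider(provider_id, gateway_port, admin_port)`. Keeping both ports on the same slot number makes it easy to tell which containers belong to which provider.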

Conclusion
In the end, we quickly brought up a multi-tenant solution for our client, built on one key idea: run each provider's app in a contained space. We couldn't use virtual machines to provide that isolation because a VM requires too many resources and too much dedicated memory. In fact, Google is now switching away from using VMs and has become one of the largest contributors to Linux containers, the technology that forms the basis of Docker. We could have used multiple instances, but then we'd have significantly over-allocated resources. Changing the app would also have added unnecessary complexity, expense and implementation time.

At the project's conclusion, our client was extremely pleased that we'd developed a solution that met his exact requirements, while also saving him money. And we were pleased that we'd created a solution that can be applied to future customers' needs.

More Stories By Aater Suleman

Aater Suleman, CEO & Co-Founder at Flux7, is an industry veteran in performance optimization on servers and distributed systems. He earned his PhD at the University of Texas at Austin, where he also currently teaches computer systems design and architecture. His current interests are in optimizing DevOps and reducing cloud costs.
