OneUptime provides a many-in-one server and data monitoring and management service for its customers.
Its open source design and scalability made it a strong alternative to other single-service providers – but there was one catch: its services relied on a single public cloud hyperscaler.
About OneUptime
• OneUptime is an open source observability platform.
• OneUptime’s comprehensive monitoring provides an all-in-one platform that replaces multiple web monitoring, status updates, and other services.
• OneUptime is built on a robust foundation of open source software such as Redis, Postgres, ClickHouse, Docker, and Node.js – and now uses Kubernetes and Ceph for its infrastructure.
When clients had concerns about downtime affecting their systems, OneUptime began looking for ways to become a truly independent monitoring service.
The answer it found was as simple as it was effective: a migration to bare metal infrastructure using Canonical’s Kubernetes and Ceph distributions.
Thanks to its investment in its own server infrastructure and adoption of MicroK8s and MicroCeph, OneUptime has taken back control of its services, reduced its reliance on third parties, streamlined operations, and cemented customer faith – all while saving over a third of a million dollars.
Migrating to bare metal infrastructure using Canonical’s open source technologies gave monitoring services provider OneUptime greater control over its infrastructure and services – while saving over 76% of its cloud costs and freeing up budget for hiring and growth.
Highlights
• Moving from the public cloud to bare metal saved OneUptime $352,500 a year, or around 76% of its total cloud costs.
• OneUptime started with a 28-node managed Kubernetes cluster on the public cloud, costing approximately $456,000 a year to operate.
• Open source technologies like Kubernetes, Helm, and Ceph gave OneUptime a fast, high-performance, and easily managed way to offer data reliability and hosting.
• The move improved stability and performance and gave OneUptime confidence to offer full data ownership and server independence to clients.
“The benefits have been astounding. Transitioning to bare-metal servers has provided us with dedicated resources [… and] complete control over our hardware, and this autonomy allows us to fine-tune our servers to meet our specific needs, optimising performance and efficiency. We can customise every aspect of our infrastructure, from the operating system and network architecture to the type and amount of storage used. […] It’s honestly magical, I’m not sure how [Canonical] did it.”
– Nawaz Dhandala, Founder and CEO of OneUptime.
How a single question changed everything
OneUptime began with a simple premise: you shouldn’t need five different technologies or applications just to monitor and manage your servers and data.
The start-up found rapid success with its low-cost open source alternative, and its launch on a large public cloud gave it everything a start-up could ask for: scalability, flexibility, and big-name backing.
Over time, however, a question began popping up in client conversations: what happens if the cloud goes down?
The OneUptime team started looking for ways to become a truly independent monitoring service. What they discovered along the way was game-changing: a migration to bare metal infrastructure wouldn’t just solve their main issues with dependence and service reliability, it also held incredible savings potential.
Following a pain-free migration to bare metal infrastructure using Canonical’s Kubernetes and Ceph distributions, OneUptime has taken back control of its services, reduced its reliance on third parties, streamlined operations, and cemented customer faith – all while saving over a third of a million dollars.
“We wanted to empower our customers with the ability to self-host OneUptime on their own clusters and avoid reliance on any proprietary cloud technology. And to do that, we realised that we had to move away from these clouds.”
– Nawaz Dhandala, Founder and CEO of OneUptime.
Challenge
In the beginning, OneUptime relied on a public cloud hyperscaler to provide its all-in-one monitoring and management services.
Their systems started as a 28-node Kubernetes cluster.
The setup worked, but over time a few drawbacks became clear.
“The cloud offers scalability and flexibility, especially in the start-up phase when you don’t know your load requirements. However, with block storage and network fees included, our monthly bills amounted to over $38,000, which brought our annual expenditure to over $456,000 a year,” said Nawaz Dhandala, Founder and CEO of OneUptime. “We began to realise that these benefits could be achieved elsewhere and at a fraction of the cost. This realisation sparked a shift in our approach and led us to explore more cost-effective yet equally efficient alternatives.”
Cost wasn’t the only factor pushing the OneUptime team to look for alternatives.
Many of its customers also ran on the public cloud, often on the same infrastructure as OneUptime. This shared dependency led to hard questions.
“We realised that if the cloud goes down, not only do we go down with our clients, but we can’t even tell them that everything is down,” said Nawaz Dhandala. “We wanted to avoid reliance on any proprietary cloud technology. And to do that, we realised that we had to move away from these clouds.”
In the background, OneUptime also faced a serious deadline: the end of its hyperscaler credits.
“We were approaching a situation where our credits would run out and cloud costs would literally double,” said Nawaz Dhandala. “It made me think, ‘How do we test this out as fast as possible?’”
Internal testing began, and it quickly showed that Kubernetes and Ceph were the best and fastest options for meeting their goals.
Solution
After some intensive small-scale experimentation and testing of different options in their offices, Canonical’s distributions (MicroK8s and MicroCeph) impressed the OneUptime team.
“We tried other distributions like K3s, but these presented various difficulties, either in installing, deploying, or customer management; for instance, adding nodes and updating them was a pain. And then we tried MicroK8s,” said Nawaz Dhandala.
“MicroK8s takes care of all of this automatically – even magically, honestly.
At the same time, we were looking to deal with our storage requirements – we store a lot of data and telemetry data and logs and all of that stuff, so we need massive amounts of storage – and MicroK8s brought us to test bare metal Ceph, and if you’re doing MicroK8s, why not do MicroCeph?”
After thorough research of the options available to them, OneUptime chose to run a MicroK8s cluster in a co-location facility, using a single rack configuration with 40 rack units of usable space.
Each server gave a potential of 18 units, with a 2-socket, 128-core CPU configuration, about 80 TB of usable storage, and RAM upgradeable to over 1 TB.
“There is a common misconception that MicroK8s is only for edge computing or development purposes,” said Nawaz Dhandala.
“However, this is not true at all. MicroK8s is a small, fast, and single-package Kubernetes distribution that can run on any platform.
Many companies, including ours, are using MicroK8s in production environments and the official documentation supports this use case.
We have been very satisfied with MicroK8s so far, but we are also flexible to change to another Kubernetes distribution if needed.”
MicroK8s, MicroCeph, and Helm made the migration a breeze.
Kubernetes’ automation of deploying, scaling, and operating application containers made it easier to manage applications on their own servers, while Helm (a package manager for Kubernetes that simplifies the process of defining, installing, and upgrading complex Kubernetes applications) made it faster to package and deploy onto their cluster.
MicroCeph took care of volume provisioning, as it makes it extremely simple to set up a production-ready Ceph cluster.
Benefits
At first glance, a migration to bare metal might seem expensive; OneUptime invested $150,000 in new servers to make the transition.
However, the service, business, and financial benefits have eclipsed this up-front cost.
“The benefits have been astounding. Transitioning to bare-metal servers has provided us with dedicated resources, effectively eliminating the ‘noisy neighbour’ issue often experienced in shared hosting environments, where multiple customers share the same server resources. No more performance degradation!” said Nawaz Dhandala. “Now, we have complete control over our hardware, and this autonomy allows us to fine-tune our servers to meet our specific needs, optimising performance and efficiency. We can customise every aspect of our infrastructure, from the operating system and network architecture to the type and amount of storage used.”
The savings have paid for better compute and freed up budget to expand the business and hire more team members.
“Now, our monthly operational expenditure, including power, cooling, energy, and remote hands, is approximately $5,500. When compared to our previous public cloud costs, we’re saving over $352,500 per year if you amortise the CapEx costs of the server over 5 years,” said Nawaz Dhandala. “This substantial reduction in expenses has enabled us to allocate resources to other critical areas of our business and has facilitated the hiring of more engineers. That’s over 76% in savings for better compute!”
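The savings arithmetic can be sketched from the rounded figures quoted in this case study. This is an illustrative estimate only: the constant and function names are hypothetical, straight-line amortisation is an assumption, and the exact inputs behind the published $352,500 figure are not given in the text. With these rounded inputs the estimate comes out at roughly $360,000 a year, or about 79% of the previous cloud bill, which is consistent with the "over $352,500" and "over 76%" wording.

```python
# Rounded figures quoted in the case study (all in US dollars).
CLOUD_MONTHLY_USD = 38_000       # "over $38,000" per month on the public cloud
BARE_METAL_MONTHLY_USD = 5_500   # "approximately $5,500" per month in colocation opex
SERVER_CAPEX_USD = 150_000       # up-front investment in new servers
AMORTISATION_YEARS = 5           # amortisation period named in the quote


def annual_savings(cloud_monthly, bare_metal_monthly, capex, years):
    """Return (yearly savings, yearly cloud cost).

    Assumes straight-line amortisation of the server CapEx; the source
    does not say which amortisation method was used.
    """
    cloud_annual = cloud_monthly * 12
    bare_metal_annual = bare_metal_monthly * 12 + capex / years
    return cloud_annual - bare_metal_annual, cloud_annual


savings, cloud_annual = annual_savings(
    CLOUD_MONTHLY_USD, BARE_METAL_MONTHLY_USD, SERVER_CAPEX_USD, AMORTISATION_YEARS
)
print(f"Annual cloud cost:        ${cloud_annual:,.0f}")
print(f"Estimated annual savings: ${savings:,.0f} ({savings / cloud_annual:.0%})")
```

Changing `AMORTISATION_YEARS` or the monthly figures shows how sensitive the estimate is to each input; for example, a shorter amortisation period lowers the yearly savings.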
Finally, the move gave their clients better peace of mind and assurances that, if one of the internet giants goes down, their services will keep going.
“Thanks to the migration, our system and our clients’ systems are not affected by the possible failure of larger service providers,” said Nawaz Dhandala. “We can continue to provide our status reports, monitoring, and all our services reliably and independently of the public cloud’s status.”