This is a simplified description but it captures the heart of the app. Beyond that we make use of a few other tools and services such as Elasticsearch and Ansible, but before I get onto those I’d first like to take you on a journey through the core of the application.
MongoDB is a document database, which is to say it specialises in storing unstructured data in a document format. This is a different approach to more traditional relational databases, where data is structured into a rigid schema. Unsurprisingly this leads to trade-offs that have sparked the occasional holy war among developers, but MongoDB makes sense for us because we rarely need to query the database for aggregated (joined) data, which is a primary strength of a relational database.
On the other hand we do make heavy use of Mongo’s ability to store documents with varying properties within the same collection. When we have needed to perform more complex queries we’ve found the aggregation framework to be more than capable of meeting our requirements, and in many places we’re able to streamline transfer of data between the client and the database since both support the JSON format. The nature of the data we’re currently concerned with requires little to no normalisation, which also fits in with MongoDB’s ethos. Quite simply we’ve found MongoDB to be a natural fit for the way we want the app to work.
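As a rough illustration of what "varying properties within the same collection" means in practice, here is a minimal Python sketch that uses plain dictionaries in place of a real MongoDB collection. The documents, field names and the `find`, `to_wire` and `count_by_type` helpers are all invented for the example; they only imitate the behaviour of a simple equality query, a JSON round trip and a `$group`-style count.

```python
import json
from collections import Counter

# Hypothetical documents: differently shaped records side by side in one "collection".
collection = [
    {"_id": 1, "type": "note", "title": "Q3 earnings call", "tags": ["earnings"]},
    {"_id": 2, "type": "note", "title": "Site visit", "attachments": [{"name": "photo.jpg"}]},
    {"_id": 3, "type": "task", "title": "Follow up with CFO", "due": "2015-03-01"},
]


def find(docs, criteria):
    """Return the documents whose fields equal every key/value pair in criteria."""
    return [
        doc for doc in docs
        if all(doc.get(key) == value for key, value in criteria.items())
    ]


def to_wire(doc):
    """Serialise a document to JSON, the format it would travel in to the client."""
    return json.dumps(doc, sort_keys=True)


def count_by_type(docs):
    """Count documents per type, similar in spirit to a $group stage."""
    return dict(Counter(doc["type"] for doc in docs))


only_notes = find(collection, {"type": "note"})
restored = json.loads(to_wire(only_notes[0]))  # same structure on both sides of the wire
counts = count_by_type(collection)
```

Neither note needs to declare the fields the other one uses, and the same structure survives the trip to the client and back without any mapping layer in between.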
The other feature of MongoDB we’ve embraced is Replica Sets, which is Mongo’s failover implementation that provides redundancy at the storage level. Here data is quickly made consistent across a cluster of nodes, which means that should one or more of our database instances fail, another is automatically elected to take its place. The importance of this provision in an application where data loss is abhorrent is clear.
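The failover idea can be sketched with a toy model. This is a deliberate simplification and not MongoDB's actual election protocol, which also involves heartbeats, votes and member priorities; the `Node` class, the `last_applied` counter and the `elect_primary` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    healthy: bool = True
    last_applied: int = 0  # how far this node has got through the replicated log


def elect_primary(nodes):
    """Pick the healthy node with the most recent data, provided a majority is up."""
    healthy = [node for node in nodes if node.healthy]
    if len(healthy) * 2 <= len(nodes):
        return None  # no majority, so no primary can be elected
    return max(healthy, key=lambda node: node.last_applied)


cluster = [
    Node("db1", last_applied=42),
    Node("db2", last_applied=42),
    Node("db3", last_applied=41),
]

primary = elect_primary(cluster)
primary.healthy = False               # simulate the primary failing
replacement = elect_primary(cluster)  # a surviving, up-to-date node takes over
replacement.healthy = False           # a second failure leaves one node of three
after_two_failures = elect_primary(cluster)
```

With three members the set survives the loss of one; once two are down there is no majority left and the toy model, like a real replica set, declines to elect a primary.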
Finally, we’ve enjoyed the flexibility a schema-less database affords. As we transition out of the startup phase and our schema becomes more rigid this becomes less useful, but in earlier stages of development we’ve appreciated the ability to dramatically change our data structures without having to worry about complex schema migrations; a small thing perhaps, but nevertheless it has represented one less task to concern ourselves with.
PHP on the server
Our team are comfortable with a broad range of server-side languages, including Java and Ruby, but we chose PHP, partly because we wanted something ubiquitous and familiar and partly because we’ve found it easier to hire talented PHP developers in the past, which will continue to be important as the company grows. The language is currently going through something of a renaissance, improving rapidly in quality and performance, which addresses many of the complaints traditionally levelled against it.
We chose to use the Slim micro framework, complemented with other high quality third party libraries, because it’s not prescriptive and provides us with the tools we need to build an MVC framework and nothing more. Where it makes sense, we’ve striven to own our code; we don’t want to rely on third parties unnecessarily.
We don’t have a great deal of logic on the server side; it’s architected as a two-layer system. The uppermost layer is responsible for communication with the client, and it makes requests over HTTP to a second layer, which is essentially a secure, private API. Only this second layer is able to communicate with our data stores, a restriction that enforces a separation of application logic from the technical details of storage and helps us keep our code clean.
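The layering described above could be sketched roughly as follows. This is an in-process Python imitation rather than the PHP and Slim code itself: the class names, the `/notes/<id>` route and the sample note are all assumptions made for the example, and the direct call to `handle` stands in for what would be a real HTTP request between the two layers.

```python
import json


class DataStore:
    """Stand-in for the database; only the private API holds a reference to it."""

    def __init__(self):
        self._notes = {1: {"id": 1, "title": "Q3 earnings call"}}

    def get_note(self, note_id):
        return self._notes.get(note_id)


class PrivateApi:
    """Second layer: the only code that talks to the data store."""

    def __init__(self, store):
        self._store = store

    def handle(self, method, path):
        """Return (status, json_body) for a request such as GET /notes/1."""
        parts = path.strip("/").split("/")
        if (method == "GET" and len(parts) == 2
                and parts[0] == "notes" and parts[1].isdigit()):
            note = self._store.get_note(int(parts[1]))
            if note is not None:
                return 200, json.dumps(note)
        return 404, json.dumps({"error": "not found"})


class ClientService:
    """Upper layer: serves the client and knows nothing about storage."""

    def __init__(self, api):
        self._api = api  # in a real deployment this would be an HTTP client

    def show_note(self, note_id):
        status, body = self._api.handle("GET", f"/notes/{note_id}")
        if status != 200:
            return {"ok": False}
        return {"ok": True, "note": json.loads(body)}


service = ClientService(PrivateApi(DataStore()))
found = service.show_note(1)
missing = service.show_note(2)
```

Because `ClientService` only ever sees status codes and JSON, the storage behind the private API could change without the upper layer noticing.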
We also have a public API which is available to users depending on their package, and is separated from the data stores in the same way the app’s client service is. We’re able to tailor this to our clients’ needs, and we can replicate this approach for other services too, like mobile app APIs. Our aim is always to keep our backend code as flexible as possible, and to introduce new services to power additional features, instead of developing a monolithic backend.
These attributes are crucial as Bipsync’s design and features have changed dramatically over the two years of its development; had we been wedded to a prescriptive, volatile framework a significant amount of refactoring work would have been related to third party code, which is to say working around someone else’s vision. By keeping our code lean from both a first and third-party point of view we’ve been able to change the product frequently with minimum fuss.
We’ve also fallen in love with d3, the data visualisation library that powers the beautiful charts and diagrams within the app. We use it to render aggregated data dashboards within Bipsync, and it is also used in our admin application to allow advanced users like IT administrators to visualise analytics of the users in their team. d3 is extremely easy to use, and has had an incredible impact on the application interface.
Services and infrastructure
No content-focussed web app would be complete without a Lucene-based search tool, and Bipsync is no different. We chose to use Elasticsearch because it’s very easy to scale; it’s distributed by nature, so our application doesn’t need to concern itself with how many instances of it are running or where each of them is. Like MongoDB it uses JSON natively, so it’s consistent with the rest of our services and familiar to the team, and most importantly, its search capabilities are first class.
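To show what "uses JSON natively" looks like, here is a small Python sketch that builds the JSON body of a full-text search. The `match` query and the `from` and `size` keys are part of Elasticsearch's query DSL, but the `build_search_body` helper, the `content` field name and the search text are hypothetical, and the snippet only constructs the request body without sending it anywhere.

```python
import json


def build_search_body(text, field="content", size=10, offset=0):
    """Build an Elasticsearch request body for a full-text `match` query."""
    if size < 1 or offset < 0:
        raise ValueError("size must be positive and offset non-negative")
    return {
        "query": {"match": {field: text}},
        "from": offset,  # how many hits to skip, for paging
        "size": size,    # how many hits to return
    }


body = build_search_body("earnings call", size=5)
request_json = json.dumps(body, sort_keys=True)
print(request_json)
```

The request is an ordinary JSON document, much like the ones stored in the database, which is a large part of why the two services feel consistent to work with.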
We’ve put a lot of effort into making Bipsync a product of high quality and we want it to remain so. To this end we have an extensive test suite and a continuous integration pipeline that checks our code on every commit; those checks include a set of browser-based integration tests which confirm that the app works in our supported browsers. This complements our unit test coverage to give us confidence in each release we make. Sean, our Head of Test Automation, plans to expand on this topic on this blog in the near future, so watch this space.
Away from the application itself we’ve also invested significantly in tooling to manage our infrastructure. Ansible is an automation platform which allows us to orchestrate our systems using simple, expressive scripts written in YAML. These scripts form living documentation for our infrastructure and allow us to create production environments composed of numerous machines in a matter of minutes.
Our production systems are hosted on Amazon’s EC2 service and we make use of several other Amazon Web Services (AWS) features. Ansible offers several modules that interact with AWS, which have made it easy for us to automate common tasks like machine provisioning or load balancer pooling. We’re also able to use the same Ansible scripts to provision our development and integration environments, so we can be sure the code we write is being executed in identical circumstances right through our pipeline, from development to testing to live.
Hopefully I’ve given you an idea of what’s running under the hood of Bipsync. We’ll shortly be sharing more technical details about the application, as well as about our soon-to-be-released iOS app, so please stick around to learn more.