Our Work

Examples of previous projects the team has completed

Analysis

Data Integration & Visualisation

Situation
    A client ran an early access programme for its first consumer product, involving multiple stages that required close tracking. The programme covered lead generation, applications to the programme, product shipments, customer usage and performance, and, ideally, culminated in customers repurchasing the product. This process generated data that was recorded across multiple systems.


Task

  • Data Integration: Integrate the systems to create a detailed, interrogable data model capable of identifying trends and providing detail on individual customers.
  • Analysis: Develop visualisations, tools, and more advanced analyses on top of the data model.
  • Deployment: Automate the loading of the data model and the feeds into downstream systems.
     

Action

    The first step involved collaborating with stakeholders to understand business processes, and with systems teams to identify the source data. A suite of simple functions was created to load the source data into a SQL database. Data integration was achieved with a view that joined the tables and added calculations, creating a single object containing all the data required for the project.
    Flexible visualisations, tracking customer progress and filtering to particular cohorts, were built on top of the data model. Additional tools were developed to extract detailed customer data and to create hit-lists.
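    To give a flavour of the join-and-calculate step described above (which in the actual project was implemented as a SQL view), the sketch below merges hypothetical source tables into a single customer-level object and adds derived columns. All table and column names are illustrative assumptions, not the client's schema.

```python
import pandas as pd

def build_programme_view(leads: pd.DataFrame,
                         applications: pd.DataFrame,
                         shipments: pd.DataFrame,
                         usage: pd.DataFrame) -> pd.DataFrame:
    """Join the source tables into one customer-level object and add calculated fields.

    Column names (customer_id, lead_date, applied_date, shipped_date, runs, reorders)
    are placeholders used purely for illustration.
    """
    view = (
        leads
        .merge(applications, on="customer_id", how="left")
        .merge(shipments, on="customer_id", how="left")
        .merge(usage, on="customer_id", how="left")
    )

    # Example calculated columns: time spent between stages and a simple stage label.
    view["days_to_apply"] = (view["applied_date"] - view["lead_date"]).dt.days
    view["days_to_ship"] = (view["shipped_date"] - view["applied_date"]).dt.days
    view["stage"] = "lead"
    view.loc[view["applied_date"].notna(), "stage"] = "applied"
    view.loc[view["shipped_date"].notna(), "stage"] = "shipped"
    view.loc[view["runs"].fillna(0) > 0, "stage"] = "active"
    view.loc[view["reorders"].fillna(0) > 0, "stage"] = "repurchased"
    return view
```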

 

Result
    A critical and complex system was modelled and visualised to enable monitoring and interrogation by non-data staff. This allowed the various paths through the programme to be measured and understood, leading to better decision making at both a strategic and an individual-customer level. As a result, the customer onboarding process was refined and demonstrably improved.

Failure Mode Classifications & Regression Tests

Situation
    A client sold a consumable product with numerous intricate failure modes. Each consumable had a large number of sensors; the majority of sensors started in a functional state. However, as the consumable was used, these sensors transitioned into various bad states, ranging from temporarily faulty to permanently dead. Minor changes in the consumable's manufacture and configuration could lead to big shifts in the evolution of these failure modes.

 

Task

  • Develop an Algorithm: Create an algorithm to classify the different failure modes accurately. 
  • Trace Root Causes: Link the identified failure modes to changes in the manufacturing process and software configuration.
  • Build Monitoring Tools: Create tools to track failure mode distributions over time and generate alerts when significant changes are seen.

Action
    Initially, a simple approach was taken to classify the different failure modes. The count of sites in each state at multiple points during the consumables’ life was calculated and passed through a clustering algorithm. While not perfect, this method effectively matched known past change points and was later refined.
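    A minimal sketch of this style of classification is shown below, assuming each row holds the counts of sites in each state at a few fixed points in the consumable's life. The feature layout and the choice of k-means are illustrative assumptions; the text above only specifies "a clustering algorithm".

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

def classify_failure_modes(state_counts: np.ndarray, n_modes: int = 5) -> np.ndarray:
    """Cluster consumables by their state-count profiles.

    state_counts: array of shape (n_consumables, n_features), where each row contains
    the count of sites in each state at several points during the consumable's life.
    Returns a cluster label per consumable; labels act as candidate failure modes.
    """
    # Normalise so that counts on different scales contribute comparably.
    features = StandardScaler().fit_transform(state_counts)
    model = KMeans(n_clusters=n_modes, n_init=10, random_state=0)
    return model.fit_predict(features)
```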
    Reports were created to track the classified failure-mode distributions across various ordered cohorts (e.g. software releases ordered by release date), enabling R&D teams to periodically check for change points. Automated alerts were implemented to trigger when changes in the distribution exceeded certain thresholds.
    Additionally, MLOps processes were established to monitor the distributions within each failure mode and alert when new data had shifted too far from the initial training distribution.
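    As a rough illustration of this kind of threshold-based alert, the snippet below compares a cohort's failure-mode distribution against a reference distribution and flags a shift when the difference is significant. The chi-square test, minimum sample size and p-value threshold are illustrative assumptions, not the client's exact monitoring rule.

```python
import numpy as np
from scipy.stats import chi2_contingency

def distribution_shift_alert(reference_counts: np.ndarray,
                             cohort_counts: np.ndarray,
                             p_threshold: float = 0.01,
                             min_samples: int = 50) -> bool:
    """Return True when a cohort's failure-mode distribution differs from the reference.

    reference_counts / cohort_counts: counts per failure-mode class (same length).
    """
    if cohort_counts.sum() < min_samples:
        return False  # too little data to call a change point reliably
    table = np.vstack([reference_counts, cohort_counts])
    table = table[:, table.sum(axis=0) > 0]  # drop classes unseen in both cohorts
    _, p_value, _, _ = chi2_contingency(table)
    return p_value < p_threshold
```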

 

Result
    These reports allowed R&D to quickly check that changes to the platform didn't negatively impact the consumable, preventing problematic releases. The automated tests caught issues that had gone unnoticed internally, allowing for proactive customer and stock management and avoiding reputational damage.
 

Data Quality From Edge Devices

Situation 
    While working with a client, we noticed a subset of outlier devices that would produce non-physical or incomplete data. This problematic data resulted in these devices being excluded from monitoring and analysis, meaning that critical issues could not be effectively diagnosed from the data received from these devices.

 

Task
    Systematically produce a categorised list of outlying devices, grouped by the type of data problem they exhibit, enabling engineers to address the issues either remotely or in the field.

 

Action
    We developed a suite of outlier tests: some were analytical, for example identifying devices reporting negative or otherwise non-physical values, and some were statistical, requiring fine-tuning and experimentation to ensure problematic devices were not missed while minimising false positives.

    These tests were then automated to run daily, delivering categorised lists of outlier devices to engineers.
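    The sketch below shows the flavour of those checks, combining a simple analytical rule (non-physical readings) with a statistical one (a robust z-score against the rest of the fleet). The column names, categories and thresholds are illustrative assumptions.

```python
import pandas as pd

def categorise_outlier_devices(daily: pd.DataFrame, z_threshold: float = 5.0) -> pd.DataFrame:
    """Return one row per flagged device with the category of data problem.

    daily: one row per device per day with illustrative columns
    'device_id', 'temperature_c', 'throughput', 'expected_records', 'received_records'.
    """
    checks = {}

    # Analytical checks: values that are simply impossible or incomplete.
    checks["non_physical"] = (daily["temperature_c"] < -273.15) | (daily["throughput"] < 0)
    checks["incomplete"] = daily["received_records"] < daily["expected_records"]

    # Statistical check: robust z-score of throughput against the fleet median.
    median = daily["throughput"].median()
    mad = (daily["throughput"] - median).abs().median()
    if mad > 0:
        checks["statistical_outlier"] = ((daily["throughput"] - median).abs() / mad) > z_threshold
    else:
        checks["statistical_outlier"] = pd.Series(False, index=daily.index)

    flagged = []
    for category, mask in checks.items():
        hits = daily.loc[mask, ["device_id"]].drop_duplicates()
        hits["category"] = category
        flagged.append(hits)
    return pd.concat(flagged, ignore_index=True)
```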

 

Result
    The engineering team received daily categorised lists of devices requiring attention. This allowed them to quickly address the devices that could easily be fixed remotely and arrange for site visits to fix the others. As a result, data quality was greatly improved, leading to genuine problems being identified and fixed much faster. Ultimately, this led to higher customer satisfaction.

Change Point Detection

Situation
    A client manufactured a product under continuous development, which therefore required constant monitoring. While some attributes of the product could be measured at manufacturing, others could only be measured destructively, and so could only be recorded during customer use. Additionally, the product's performance varied significantly under different manufacturing conditions and use cases.

 

Task
    The client needed to measure multiple attributes, identify change points, and link them to manufacturing processes. The solution had to account for data from a batch, manufactured on a particular day, being received over a period of months. Additionally, it needed to minimise false triggers caused by batches with limited data or those dominated by use cases known to perform poorly.

 

Action
    The necessary data was already stored in a data warehouse, enabling easy retrieval via a few SQL queries.
    A simple framework was developed to replicate the gradual arrival of past data, facilitating the testing of change point detection algorithms. We started with basic algorithms, such as step changes and forward/backward windowed averages, progressing to more complex approaches as required.
    The system included a backend to store configurations of metrics and algorithms, lists of stakeholders to notify, and previously raised change points. It was set up to run on a schedule, automatically alerting relevant stakeholders when a new change point was identified.
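    A minimal version of the forward/backward windowed-average idea mentioned above is sketched below. The window size and trigger threshold are illustrative defaults, and the real system layered more sophisticated algorithms on top of this kind of baseline.

```python
import numpy as np

def windowed_change_points(values: np.ndarray, window: int = 10, threshold: float = 3.0) -> list:
    """Flag indices where the mean of the next `window` points differs from the mean
    of the previous `window` points by more than `threshold` pooled standard errors.

    `values` is an ordered series of a metric (e.g. batch-level performance ordered
    by manufacture date).
    """
    change_points = []
    for i in range(window, len(values) - window):
        before = values[i - window:i]
        after = values[i:i + window]
        pooled_se = np.sqrt(before.var(ddof=1) / window + after.var(ddof=1) / window)
        if pooled_se == 0:
            continue  # flat data on both sides, nothing to compare
        if abs(after.mean() - before.mean()) / pooled_se > threshold:
            change_points.append(i)
    return change_points
```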

 

Result
    The system was adopted across the organisation, saving resources spent manually looking for change points, while simultaneously expanding the number of metrics being monitored. This led to a better understanding of the product, as more change points were mapped, and improved customer experience through informed stock management.
 

Engineering

Flexible Analysis Execution & Storage Platform

Situation
     A previous client had a wealth of valuable analyses created by a diverse group of analysts and scientists. However, as data volumes grew and the complexity of the analyses increased, running them in an ad-hoc setting (i.e. notebooks) became unviable.


Task
    Develop a system that enables a wide range of complex analyses to be executed and their results stored. It had to be designed so that scripts written by research scientists could be quickly ported across, and the framework needed to allow data volumes and complexity to grow further.


Action
    The key to this project was identifying common performance bottlenecks in existing processes and producing a framework that addressed them, while maintaining maximum flexibility in the outputs it could produce to allow for varied downstream analysis.
    To achieve this, we developed a lightweight framework that enabled the arbitrary analysis of granular scientific data, under the constraint of a fixed final level of aggregation suitable for business analytics. 
    It provided controlled access to internal databases for input, and leveraged a NoSQL database for storing results, ensuring maximum flexibility of output. The system was designed to run frequent incremental loads, enabling it to push high data volumes through complex pipelines while comfortably keeping pace with incoming data.
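    To give a feel for the shape of such a framework (not the client's actual implementation), the sketch below lets analysts register ordinary functions as analyses; a runner pulls an incremental slice of input data, executes each analysis, and hands the results to a store. The `analysis` decorator, the fetch/store callables and the example analysis are all illustrative placeholders.

```python
from datetime import datetime
from typing import Callable, Dict, Iterable

# Registry of analyses: each is an ordinary function taking a batch of input
# records and yielding result documents ready to store.
ANALYSES: Dict[str, Callable[[Iterable[dict]], Iterable[dict]]] = {}

def analysis(name: str):
    """Decorator so a scientist's script can be registered with one line."""
    def register(func):
        ANALYSES[name] = func
        return func
    return register

def run_incremental_load(fetch_since: Callable[[datetime], Iterable[dict]],
                         store_results: Callable[[str, Iterable[dict]], None],
                         since: datetime) -> None:
    """Run every registered analysis over records that arrived after `since`.

    `fetch_since` and `store_results` stand in for the controlled database access
    layer and the NoSQL result store described above.
    """
    batch = list(fetch_since(since))
    for name, func in ANALYSES.items():
        documents = list(func(batch))
        store_results(name, documents)

# Example: a scientist's script becomes a registered analysis via the decorator.
@analysis("mean_signal_per_run")
def mean_signal_per_run(records):
    for record in records:
        signal = record.get("signal", [])
        if signal:
            yield {"run_id": record["run_id"], "mean_signal": sum(signal) / len(signal)}
```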


Result
    By storing a full history of results, this system unlocked many new, highly detailed analyses that were previously impossible due to long processing times. It allowed significantly longer histories to be explored, resulting in more thorough and accurate science being performed.
    Additionally, the system served as a centralised, version-controlled repository for company analyses, ensuring consistency, traceability and collaboration across teams. The framework allowed for a seamless transition from initial analysis code to an automated production setting. This enabled scientists and analysts to continue writing their scripts while being able to quickly move them into production, fostering innovation and accelerating product development.

Data Driven Alerting System

Situation
    A lot of good work was being done at a previous client to identify and address product issues, but once an issue was resolved, focus would quickly shift and past issues were often forgotten. Additionally, there were known issues with the product that could not be fixed at the time and therefore required ongoing management.


Task
    Develop a system that runs on a schedule, monitors known issues, and alerts interested stakeholders when an issue recurs, worsens or significantly impacts an individual customer.


Action
    We implemented a lightweight and flexible framework that allowed arbitrary checks to be performed on the data and exceptions to be raised as alerts to stakeholders; a minimal sketch of this pattern follows the examples below.
    Examples of the diverse issues monitored were: 

  • RAM disks filling up and causing experiments to crash.
  • Devices overheating when operating at high concurrency levels.
  • Customers having an excessively high proportion of consumables failing QC checks within a specific time window.
  • Change points in product quality.
  • Failure modes actively under investigation by R&D.    
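
    The sketch below shows the general shape of such a framework, not the client's code: checks are plain functions returning alert messages, and a failing check is itself surfaced as an alert. The `check` decorator, the `query` helper, the example check and its threshold are assumptions for illustration.

```python
from typing import Callable, Dict, List

# Each check is an ordinary function returning a list of alert messages
# (empty when everything is healthy).
CHECKS: Dict[str, Callable[[], List[str]]] = {}

def check(name: str):
    """Decorator used to register a new check with the scheduled runner."""
    def register(func):
        CHECKS[name] = func
        return func
    return register

def query(sql: str) -> List[dict]:
    """Placeholder for the real data-warehouse access layer."""
    return []

def run_all_checks(notify: Callable[[str, str], None]) -> None:
    """Run every registered check and pass alerts to `notify(check_name, message)`."""
    for name, func in CHECKS.items():
        try:
            for message in func():
                notify(name, message)
        except Exception as exc:  # surface broken checks instead of silently skipping them
            notify(name, f"check failed: {exc}")

# Example check built on the placeholder query helper.
@check("ram_disk_usage")
def ram_disk_usage() -> List[str]:
    rows = query("SELECT device_id, ram_disk_pct FROM telemetry_latest")
    return [f"device {row['device_id']} RAM disk at {row['ram_disk_pct']}%"
            for row in rows if row["ram_disk_pct"] > 90]
```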


Result
    The system transformed processes around the business, driving efficiency in targeted areas. Customers could now be managed proactively, with the business often knowing about issues before the customer did. Previously resolved issues that recurred were identified and investigated quickly, preventing them from becoming widespread and avoiding reputational damage. Unseen performance issues caused by changes in manufacturing were quickly detected, allowing faulty stock to be managed proactively. R&D could be immediately directed to devices showing particular faults, streamlining processes and reducing idle time.
    A key factor in the system’s success was its flexibility. The platform was designed in such a way that scripts produced by research scientists could be quickly integrated as a new alert. This fostered a positive development cycle, where new use cases could be continually integrated into the system.

Transformation

Building a Data Platform For The Analysis of Edge Devices

Situation
    A client had thousands of edge devices deployed at customer sites. These devices generated telemetry data that summarised their utilisation, performance, and any errors encountered during operation.
    Although it was known that this telemetry data contained valuable insights into operational performance, which was highly variable across the fleet, a significant amount of what was collected was unused.

 

Task
    Our goal was to leverage the collected telemetry data to analyse the performance of the client’s suite of products and build the analytics outputs into a production-grade data platform. 
    Key components of the project included:

  • Conducting iterative exploratory analysis to uncover new insights and define the required analytics outputs.
  • Collaborating with the instrument software team to define and implement additional data requirements.
  • Developing and deploying scalable data pipelines that feed the analytics outputs.
  • Supporting end-to-end data flow through the system, and re-engineering the pipelines as data volumes and analytical complexity increase.


Action

    When embarking on a far-reaching data transformation project, we always begin by collaborating with stakeholders to identify small, focused, and fundamental analysis tasks that can be completed quickly. This approach delivers immediate value while building a strong foundation of knowledge that supports the longer-term goals of the project.
    We began by working closely with stakeholders to quickly address their initial questions, gaining a deeper understanding of the available data and how to access and analyse it effectively, while identifying what data was missing. As our understanding grew, we transitioned to asking more complex questions of the data.
    This exploratory process was repeated as we answered more and more questions. Over time, it became clear what data formed the core of the business and what state it needed to be in to become information-rich and flexible. Once this core data model was defined, we automated its construction, which allowed more complex analyses to be built around a common, reliable foundation.
    The iterative process of exploring new data, extracting the value, defining how it should be modelled, and then automating the loading of the data model is always ongoing. Over time, the initially small collection of tables evolves into a comprehensive data warehouse.
    At this stage, the iterative process of exploratory analysis had yielded numerous valuable insights and tools, and their automation had naturally created a data warehouse. Self-service access points were established on top of this foundation. Tools were developed as web apps or simply as notebooks, BI tools showed KPIs and supported user-led analysis, and views and data marts were built to allow direct access to the data via SQL.
    With the basic data platform in place, we moved on to develop more advanced analytical systems on top of the data warehouse. Examples of these systems include rule-based alerts, change-point detection, and automated data quality checks.

 

Result
    By providing fully automated and detailed visibility into product performance, this project transformed the client's analytical capability. Insights and results that were previously impossible were now available at the click of a button. Some practical benefits of the project included:

  • Visibility: Performance of the product with customers became fully visible and comparable to internal performance.
  • Strategic Prioritisation: R&D teams were prioritised at a strategic level based on the insights from the analyses.
  • Operational Efficiency: Tools developed during the project were used daily by operational staff, streamlining numerous processes.
  • Proactive Issue Resolution: Checks and alerts enabled the identification and resolution of issues before they became widespread.
  • Centralised Analytics: Analytical efforts became centralised and controlled, improving consistency and governance.

 

Conclusion
    When executed well, a project like this can fundamentally shift the way a company operates. It brings reliable, well managed and easily understandable data to the forefront of all projects. This ensures that all work becomes aligned and based on consistent and comparable foundations.
    The key takeaway is that this project applied an exploratory approach to a transformative data initiative. The first phase of every new piece of work was to analyse the data and prove its value before moving on to subsequent steps. This approach offers multiple advantages: 

  • Projects either quickly provide useful outputs or fail fast. 
  • Anything that progresses to engineering has already demonstrated its value.
  • Prioritisation becomes simpler, as the business value of a particular piece of work can be assessed much earlier in the process.
  • Lots of unexpected insights and valuable tools emerge when data is explored. 

    In contrast, an engineering-led approach typically requires everything to be specified and designed up front, with the system built before any real value has been demonstrated. Changes are then difficult to make, as the system is live and the code already abstracted and committed. By taking an exploratory approach, organisations can ensure that data projects are efficient, impactful, and aligned with business objectives from the beginning.

