Industry
Genetics
Technology used
Airflow, Kubernetes, MLflow, DVC, Jenkins, Terraform, AWS

Biotech, and genetics in particular, continues to evolve in several areas that help patients, from drug discovery to early identification of diseases (or even just a predisposition to them). Machine learning and artificial intelligence are now often applied to these problems, but the problems are very challenging, and companies working on such solutions also face obstacles specific to the field. These hurdles stem mostly from the large volume of data involved and the complexity of ML development, so organizations often struggle to move their ML models into production because of the intricacies of data management, model deployment, and scalability.

Our company, as an expert in the MLOps area, recently partnered with a genetics company to help them overcome these challenges and bring their ML solution closer to the market and to the release of a user-facing product. The collaboration began with a comprehensive analysis of the client's requirements, existing infrastructure, goals, and technical constraints, after which we designed a target solution architecture and later implemented it. The implementation encompassed various MLOps components, which streamline ML development and also help fulfill some regulatory requirements (such as data traceability). These components included data versioning, experiment tracking, orchestration of data and processing pipelines, and more. All of them were built on cloud capabilities, following the infrastructure-as-code paradigm and using cloud-agnostic technologies such as Kubernetes, which gives the client the option to switch seamlessly between cloud providers if necessary.

The project can be considered a success: the implemented MLOps components bring structure and efficiency to data management and processing, as well as to the ML development lifecycle, and facilitate collaboration among the client's data scientists and engineers. Furthermore, using cloud technologies and infrastructure as code enables the client to reproduce the environment as needed and to adapt easily to future growth. In future collaboration, we look forward to following up by serving the trained models to end users in a scalable and cost-efficient way, helping bring innovative biotech use cases to a wider audience.


Let's talk about how datarabbit can help your company.
