When is it worth it for your team to build an MLOps tool from scratch? Interview with Jakub Czakon, Chief Marketing Officer and Data Scientist at Neptune.ai.

Regular readers of my website will probably remember my article from several months ago, in which I experimented with Microsoft MLFlow. If not, I highly recommend starting there, as it explains the MLOps concept in detail. As this topic is definitely not losing popularity these days, it was extremely important for me to find someone experienced in it to talk to here on my blog. While listening to one of my favourite Polish podcasts about ML (shoutout to Michał Dulemba and his “Nieliniowy”!) I came across an interview with Jakub Czakon, Chief Marketing Officer and Data Scientist at Neptune.ai.

Neptune.ai is a Polish startup that helps enterprises manage model metadata. According to the product description, “Neptune is a metadata store for MLOps, built for research and production teams that run a lot of experiments. It gives you a central place to log, store, display, organize, compare, and query all metadata generated during the machine learning lifecycle. Individuals and organizations use Neptune for experiment tracking and model registry to have control over their experimentation and model development.” Not long after, I heard about this Polish startup again while listening to the Super Data Science podcast hosted by Jon Krohn, which I think is really huge! Come on, they are promoting Polish data scientists in the United States! 🙂

You can only imagine how happy and pleasantly surprised I was when Jakub agreed to an interview for DataScientistDiary. I really hope you get as much valuable knowledge from this article as I did.

Sandra: Hello Jakub! Thank you so much for agreeing to this interview. I came across an interview with you on one of the Polish podcasts dedicated to ML topics and immediately thought that you would be a perfect person to talk to about MLOps! But before I start firing off all the questions swirling in my head, could you please say a few words about yourself to the readers of my blog? What do you do currently? What is your background? Why did you choose this field and what are your past experiences?

Jakub: I am Jakub, a co-founder of Neptune.ai where I focus on the marketing side of things, making sure that the right people hear about us and understand how Neptune can solve their problems. I studied Theoretical Physics and Finance and Accounting, which probably tells you something about me: I really love learning things, different things! That curiosity is what made Data Science so interesting for me and it is what makes working at a startup so fulfilling.

Before Neptune I was a data scientist at Deepsense.ai where I was building models and sometimes talked about them at conferences. The fact that I worked in ML really helps me when it comes to communicating how what we do can help ML folks. Now I actually focus my time on getting a deeper understanding of folks in MLOps and, more generally, of what good developer marketing should look like. I even write about it on developermarkepear.com.

Sandra: While doing a little research about you, I also noticed that you have been an international chess master since 2005! That’s huge! Is there anything you find in common between those two disciplines in your life? I assume that your passion for chess started earlier than the one for ML and AI. How does your chess mind help you in your career?

Jakub: Haha, yeah I was even a Polish chess champion in blitz games at one point. Feels like a lifetime ago. Yeah, I think there are things that help especially when it comes to solving open problems. There is this chess rule I love to use when I don’t know “how to find a winning solution” to a startup problem. So basically in chess sometimes there is no clear path to winning a game. But you can improve your position and maybe then the winning path will be clear. And so I like to use the “rule of the weakest piece” where I try to figure out what is the weakest part of the current “business position” and improve that. At some point you can see clear winning paths but there is no way you could have seen those at the beginning.

Coincidentally, I think it maps really nicely to ML, where we do that all the time when training deep learning models with gradient descent. You don’t know where the winning path is (the global minimum) but you are constantly improving your “network’s position”.
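As a rough illustration of that analogy (not something from the interview itself), here is a minimal sketch of gradient descent on a toy one-dimensional loss. The loss function, starting point, learning rate and step count are all assumptions chosen only to keep the example small; each step simply improves the current position a little, without knowing the full path in advance.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly take a small step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


# Illustrative loss f(x) = (x - 3)^2; its gradient is 2 * (x - 3).
def loss(x):
    return (x - 3) ** 2


def loss_grad(x):
    return 2 * (x - 3)


start = 10.0
end = gradient_descent(loss_grad, start)
print(f"loss at start: {loss(start):.4f}, loss at end: {loss(end):.4f}")
```

On this toy problem the minimum happens to be global; for real deep learning losses there is no such guarantee, which is exactly the “no clear path to winning” situation described above.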

Sandra: This reference resonates with me so much. I am definitely stealing this for the future! It’s always good to focus on small improvements first, before throwing yourself in at the deep end. As we have already moved on to ML thanks to this nice comparison, could you please share a few words about your current company, Neptune.ai? How did it all start? What is its mission? And most importantly, who is the product for?

Jakub: Yeah, so it all started over at Deepsense.ai. During one of the Kaggle competitions the team (who eventually finished first) needed a tool to keep track of all the experiment configurations. Believe it or not, there were no experiment tracking tools back then in 2017 and so they created an MVP themselves. Later Piotr Niedzwiedz, then CTO at Deepsense.ai, saw what they created, talked to some other ML teams out there and realized that this could actually be a global product. We spun off neptune.ai, got VC funding and, as they say, the rest is history.

When it comes to the mission, we want to provide production ML teams with the same level of control and confidence when developing and deploying models as software developers have when shipping applications. ML fueled products are even more complex than standard software applications and we as a community need proper tooling to make this happen. We also believe in a pragmatic approach to solving any problem where you don’t blindly take “industry best practices” and try to copy paste what folks over at Google or Netflix are doing. That is why we talk a lot about companies doing ML and MLOps at a reasonable scale (vs hyperscale of those super advanced companies). And so we build our product for data scientists and ML engineers at those reasonable scale companies. We help them with managing model-building metadata.

Sandra: What are the components you need to consider when you aim to build an MLOps tool on your own?

Jakub: Yeah, it seems that there are five main pillars of MLOps that you need to implement somehow:

  • Data ingestion (and optionally feature store)
  • Pipeline and orchestration
  • Model registry and experiment tracking
  • Model deployment and serving
  • Model monitoring

Each of those can be solved with a simple script or a full-blown solution depending on your needs. Depending on its use case, a team may have something as basic as bash scripts for most of its ML operations and get something more advanced for the one area where it needs it.

For example:

  • You port your models to native mobile apps. You probably don’t need model monitoring but may need advanced model packaging and deployment.
  • You have complex pipelines with many models working together. Then you probably need some advanced pipelining and orchestration.
  • You need to experiment heavily with various model architectures and parameters. You probably need a solid experiment tracking tool.

By pragmatically focusing on the problems you actually have right now, you don’t overengineer solutions for the future. You put the limited resources that you, as a team doing ML at a reasonable scale, have into things that make a difference for your team/business.

Sandra: What are the crucial characteristics of an MLOps tool that we need to take into account (like scalability, etc.)? What should we consider regarding the ongoing support of a self-built tool? Is it possible to estimate how much people capacity we will need to allocate to it?

Jakub: It depends on the particular pillar but in our case of experiment tracking and model registry what you need to think about is:

  • Flexibility and expressibility: can I log/display metadata the way I want to (not just the way the tool allows)?
  • Scalability and reliability: can I rely on this tool to log the data, will the backend and frontend work, will it slow down my training, what do I do when the internet goes down, what happens when I hit the limits of the DB holding metadata, etc.
  • Great UI: with experiment tracking the core “job to be done” is to organize, debug and compare different runs. You just need great experience here.

Now, when it comes to the biggest costs, those are somewhat invisible at the beginning imho, especially if you have less experience building tools.

The question is not if you can build an MVP of a tool that does XYZ but can you maintain it over time:

  • improving/fixing all the scalability issues
  • creating documentation and examples for other folks to understand how to use it
  • solving bugs and adding improvements requested by the users
  • infrastructure costs

This all sums up to a lot of your time spent on maintaining this tool. And this is the real cost, your time. If you feel that this is the best place to put it, alright. But often you can add more value to your organization by solving ML problems.
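To make the “MVP is easy, maintenance is hard” point concrete, here is a minimal sketch of what a home-grown experiment tracker MVP might look like. It is not Neptune’s implementation or API; the `LocalRunLogger` class, the `best_metric` helper and the JSON-lines file layout are all hypothetical names and choices made for this example. It writes to a local file, so logging does not depend on a network connection.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path


class LocalRunLogger:
    """Hypothetical MVP tracker: appends one JSON record per event to a local file."""

    def __init__(self, log_dir="runs", params=None):
        self.run_id = uuid.uuid4().hex[:8]
        self.path = Path(log_dir) / f"{self.run_id}.jsonl"
        self.path.parent.mkdir(parents=True, exist_ok=True)
        # Store the run configuration as the first record of the file.
        self._write({"type": "params", "values": params or {}})

    def _write(self, record):
        record["time"] = time.time()
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")

    def log_metric(self, name, value, step):
        self._write({"type": "metric", "name": name, "value": value, "step": step})


def best_metric(path, name):
    """Return the lowest logged value of a metric in one run file."""
    values = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            if record["type"] == "metric" and record["name"] == name:
                values.append(record["value"])
    return min(values)  # raises ValueError if the metric was never logged


# Usage with made-up loss values, written to a temporary directory.
logger = LocalRunLogger(log_dir=tempfile.mkdtemp(), params={"lr": 0.01})
for step, loss_value in enumerate([0.9, 0.5, 0.4]):
    logger.log_metric("loss", loss_value, step)
best = best_metric(logger.path, "loss")
print(best)
```

Everything listed above is exactly what this sketch leaves out: there is no UI for comparing runs, no handling of concurrent writers or very large logs, no documentation and nobody to fix bugs. That gap is where the maintenance time goes.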

Sandra: What are the most common mistakes that people make when trying to build their own MLOps tool? What do they forget about during planning?

Jakub: That you should first understand what you actually need, whether there is a tool that already solves it, and whether you should build it at all. No but srsly, building tools is great, especially if you are an MLOps tooling company. 🙂

But if you are solving some ML problems you should really try and put all your time into those ML problems and leverage tools, both open source and paid to do the things that are not core to your business. It is not about whether you could build an MVP over a few weeks. It’s about whether the cost of building/fixing/maintaining it is higher than using something that is already out there.

Think about it: say you put two people on this tooling project. What is the benefit of doing that vs “hiring” a company or a community that is putting all their time and effort into solving this or that infrastructure problem? How is the time of those two internal people well spent when there is something out there that solves your problem? Obviously it could be that there is nothing that solves your problem, and then you may have to build it yourself or glue different solutions together. That is fine, but be really pragmatic about it and build only what you have to.

Sandra: When is it worth it for a team to build its own tool from scratch? What skills do you need to achieve this? When is it better to choose an external vendor?

Jakub: Again, I’d first understand what you actually need, not what “industry best practices” are. I’d also look into the tools that solve it and see if anything fits the bill. Only when I have to would I go build something myself. Imagine asking this question about the deep learning framework or code repository. When would it make sense to build another PyTorch or GitHub for you? Only if PyTorch or GitHub really suck for your problem 🙂 So yeah, probably not too often.

Sandra: The product delivered by Neptune.ai doesn’t cover the entire end-to-end ML lifecycle, compared to vendors like DataRobot or Valohai. If I understand correctly, it’s more of a managed service that takes maintenance tasks off your hands. Can I ask why you decided not to cover it all? Which components did you leave out of the product and why? What is the value proposition of Neptune.ai?

Jakub: We want to solve one problem well, managing model building metadata, which is covered by our experiment tracking tool and model registry. We help folks log, display, organize, compare and share all the model metadata in a single place. And we only want to do that. But do it really really well. The way I see it you shouldn’t try to do it all when building for developers. In software engineering, there are no end-to-end platforms but rather an ecosystem of well-integrated tools. I think, in the long run, ML and MLOps will be the same. You will connect your experiment tracking tool with a component for workflow orchestration or model serving. You will connect them to your GitHub or reports in Notion. You will not do everything with one platform.

There are many teams who try to do it all and I believe this is a very risky approach. There are so many problems in each and every part of the MLOps stack that solving it all and making everyone happy is super hard, if not impossible, imho. And if it didn’t happen in software development I don’t believe it will happen in ML, which is even more complex. So yeah, that is why we focus on doing one thing really really well.

Sandra: Are there any specific industries where adopting MLOps is more problematic? What are the “easier” and “harder” cases among your customers?

Jakub: Hmm, interesting question. I think all industries need it and looking at our clients it seems to be true. We have folks from self-driving cars, finance, healthcare, industrial optimization, and more. I’m not sure which ones are the hardest but there are certain industries that need more flexible tools because the way models are built there is slightly different than the classical ML lifecycle.

For example, folks who do time series forecasting need very flexible experiment tracking. That is because in forecasting you rarely build one model. The algorithm may be the same but you train it on tens of time series, one model per product or per location for example. You also want to visualize and compare across those locations, look at a particular time range, or update the models when new data comes in. All that puts additional requirements on your experiment tracking or monitoring tool. You need to be very flexible and customizable to support that.
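A minimal sketch of that “same algorithm, one model per series” pattern is shown below. The location names and demand numbers are made up, and the “model” is a deliberately trivial mean forecaster standing in for a real forecasting algorithm; the point is only that the tracked metadata ends up keyed by series, which is what the tracking tool then has to organize and compare.

```python
from statistics import mean


def fit_mean_model(history):
    """Toy stand-in for a forecasting model: always predicts the training mean."""
    level = mean(history)
    return lambda horizon: [level] * horizon


def mean_absolute_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def train_per_series(series_by_key, holdout=2):
    """Train one model per series and collect per-series metadata."""
    results = {}
    for key, values in series_by_key.items():
        train, test = values[:-holdout], values[-holdout:]
        model = fit_mean_model(train)
        results[key] = {
            "n_train": len(train),
            "mae": mean_absolute_error(test, model(len(test))),
        }
    return results


# Made-up demand numbers for two hypothetical locations.
sales = {
    "warsaw": [10, 12, 11, 13, 12, 12],
    "krakow": [5, 5, 6, 4, 9, 11],
}
metrics = train_per_series(sales)
# Comparing across series: which location is forecast worst?
worst = max(metrics, key=lambda k: metrics[k]["mae"])
print(metrics, worst)
```

With tens of series and repeated retraining, this dictionary quickly becomes a table of runs per location and per date range, which is where flexible logging and comparison views start to matter.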

Another example is drug discovery and chemical modeling, where your models are “generating” possible chemical compounds. You need chemistry subject matter experts to look at those and see if they make sense. So you need to support human-in-the-loop MLOps where you can let SMEs visualize and comment on those compounds. Again, being very flexible and focused on doing one part really well helps a lot in supporting those teams.

Sandra: How many external vendors are now on the market, both on the Polish scene and globally? What should you pay attention to when choosing one?

Jakub: At this point it’s probably in the hundreds globally, not sure if we have any other MLOps company in Poland (but hopefully I am wrong 🙂 ). I’d just focus on understanding your minimal viable MLOps setup and choose tools where you actually need them. Remember that a lot of the time you don’t need auto-retraining of models triggered by advanced anomaly detection in production. It is often more than enough to just run prod evaluation every week with a simple cron job, look at the results and, if needed, run a bash script that re-trains and updates models. 🙂
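The weekly check described above could be as small as the following sketch. The accuracy metric, the baseline of 0.85, the tolerance of 0.05 and the sample labels are all assumptions for illustration; in practice the labels and predictions would come from your own production data.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def needs_retraining(labels, predictions, baseline, tolerance=0.05):
    """Flag the model when accuracy drops more than `tolerance` below the baseline."""
    return accuracy(labels, predictions) < baseline - tolerance


# Made-up labels and predictions standing in for last week's production data.
last_week_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
last_week_preds = [1, 0, 0, 1, 0, 0, 0, 1, 1, 1]

weekly_accuracy = accuracy(last_week_labels, last_week_preds)
retrain = needs_retraining(last_week_labels, last_week_preds, baseline=0.85)
print(f"accuracy={weekly_accuracy:.2f}", "retrain" if retrain else "ok")
```

Saved as a script (say, a hypothetical `weekly_eval.py`), this could be scheduled with a crontab entry such as `0 6 * * 1 python weekly_eval.py`, which runs it every Monday at 06:00. A person then looks at the output and decides whether to kick off the retraining script.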

Sandra: What are Neptune.ai’s goals for this year? What are its goals in the longer term (like a 2025 horizon)?

Jakub: Our vision is to become the go-to solution for reasonable-scale ML teams for the experiment tracking and model registry components of their MLOps stacks.

To support this vision, in 2022 we will focus on three things:

  • Deliver the best developer experience around experiment tracking. We’ll improve the organization, visualization, and comparison for specific “machine learning verticals,” including computer vision, time series forecasting, and reinforcement learning.
  • Support core model registry use cases. We’ll add better organization of model versions, stage transitions, reviews and approvals, and better access to packaged models.
  • Create more integrations with tools in the MLOps ecosystem. We’ll add integrations with tools for model deployment, pipelining and orchestration, and production model monitoring.

I mention those teams doing ML and MLOps at a reasonable scale a bit. But I do believe that as a community we should focus on those teams of 4 people and 7 production models more. Teams that are at hyperscale with hundreds of ML engineers and 1000s of models will take care of themselves really. 🙂

Sandra: The ML world is evolving so quickly. Skills we are learning now might be automated and no longer necessary in the future, thanks to new tools like the one you are offering. What would you advise data scientists and machine learning engineers at the beginning of their careers? What would you advise students? In which skills and courses should we invest our time and money? How much can we rely on the Polish education system? How can we best invest in ourselves to find a lucrative place on the job market in several years?

Jakub: Good question. I think overall what helps a lot is to understand that ML (or pretty much any other technology) is a tool that helps people solve problems. For example, I want to have great tutorial videos that help people actively use our tool. I don’t really care if those are human-generated videos, AI-generated videos, or if the videos are crowdsourced and are verified by blockchain. All approaches have their problems, risks, and costs. At the end of the day, I want great tutorial videos that activate my users. That is it. That is the problem. I think this problem thinking is what can push you in your career a lot. Looking into product management helps with that. I think the book “Building ML-powered applications” is a great step in connecting product thinking with ML as a tool.

Sandra: I haven’t heard about this book yet. Thanks for the recommendation! And I totally agree. What you’ve just said reminds me of a quote that “Data science is not about complex models, but about solving problems”. 🙂 Now another challenge for you: could you please tell us about the most interesting project you have ever done? Do you have a dream project? Let’s get a little dreamy now: if there were no limitations in the world, and all data and technical requirements were accessible, what would you like to work on? What world problems would you like to solve?

Jakub: I am actually working on such a project right now 🙂 I am really passionate about helping people who build tools for devs get devs to use those tools. I have seen the power that great marketing that actually educates and inspires can have. And so I really want to change how dev community looks at marketing. I want to change how marketers approach talking to devs about their products. I know to imagine a world where developers respect marketing is very “dreamy” but hey, I am a dreamer. That’s why I started developermarkepear.com and I think this will be one of my passion projects for years to come.

Sandra: I know you’ve already mentioned one book, but apart from this one, are there any titles which have made a great impact on your career or life and which you could recommend? They don’t need to be data oriented, but that would definitely be great.

Jakub: Yeah sure there are many but let me give you three from completely different fields:

  • “Building ML powered applications” -> product approach to solving things with ML
  • “Made to stick storytelling” -> how to tell powerful stories that people remember
  • “Every page is page one” -> how to write content that is designed for the web

Sandra: Any final words? Recommendations, advice for the future or calls to action? How can readers reach out to you?

Jakub: Embrace the pragmatic reasonable scale MLOps mindset (join the mlops.community slack, come to our biweekly MLOps live event, or check out our MLOps blog). Remember that ML is a tool that helps solve “real” problems. Understand those problems first. Love the problem, not the solution (which may or may not be ML models). Don’t hate on marketers. Most often we are really trying our best to tell the story of our product.


I would like to thank Jakub so much for this interview. I feel it’s a great repository of knowledge for all teams struggling to implement MLOps in their own organization. Should we build our own tool from scratch? Do we have the necessary skills to do it? Or would it be a better idea to rely on an external vendor? What possibilities do we currently have on the market?

I hope that you’ll agree with me that Jakub has covered those questions pretty well, showing us the difficulties we need to prepare ourselves for at each stage of development. No matter which option you choose, it is a huge investment, not only of money but also of your team’s time. Learn from Jakub: first understand your problem very well and listen to your people. Ask the right questions and answer them together. Are you ready for such a challenge? Is it worth reinventing another wheel (or GitHub? 🙂 )?

Please let me know your thoughts and experiences in the comments below!
