Making ML services means bridging development cultures

Yair Morgenstern
3 min read · Nov 5, 2021

Around 2017, there were fierce debates at my company regarding which development teams were responsible for what parts of development. Most discussions revolved around the job of The Framework vs the job of The Algorithm, and it took us a month to realize that every person in the room was using these terms differently.

What is The Algorithm?

For a User, there is no framework. I want to run an algorithm: I trigger it, and then I see results.

Oh, silly users. Obviously we Full-stack Devs know that The Algorithm is just the part in the middle, the service that we call after the trigger.

Poor misguided full-stack souls. We Backend Devs know that The Algorithm is only *part* of the backend service: we activate it only after preparing the data, and when we get results back from it, we save them. You don’t really think The Algorithm actually accesses any databases, do you?

Wait a second. “The Algorithm”? Do backend devs really think there’s just one block that’s The Algorithm? We Algorithm Devs know that there’s an entire pipeline of different algorithms, each with its own specific part to play. Honestly, the lengths to which they’ll go to hide the complexity of our work are astounding.

So, then, what is The Framework?

First off, why do we care what the framework is? We care because the assumption is that when introducing a NEW algorithm, we can use the existing framework to Do Less Work.

As such, every party subconsciously (or perhaps consciously) adopts a view of The Framework that absolves them of as much work as possible.

For full-stack devs, a New Algorithm that requires minimal work is one that takes requests from the same requests DB and puts results into the same results DB. If the requests or responses are different, that’s extra UI work, and it’s the responsibility of the algorithm to deal with that.

For backend devs, minimal work also means that the pre-processing and post-processing stay the same. If they don’t, that’s the algorithm’s responsibility.

For the algorithm developer, this is baffling. You call this a framework? I, a Matlab/R/Tensor scientist, need to handle UI work and backend work? What kind of a cut-rate production is this? Who’s in charge of the algorithm pipeline, anyway?

Now obviously, there’s a LOT to unpack here. Shared responsibility never has a one-size-fits-all approach, and when all’s said and done, some people will be leaving the room much more annoyed than they entered it.

But to even start having these hard discussions, there needs to be a baseline agreement at the level of language. We can’t even argue if we use the same words for different things.

For our discussions, we split the architecture into:

  • The Process (end-to-end, from trigger to user-displayed results)
  • The Service (backend service that processes requests and stores results)
  • The Pipeline (receives input parameters, returns output, made of multiple components)
  • The Algorithmic Component (self-contained)
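To make the four layers above a bit more concrete, here is a minimal Python sketch of how they might nest. Everything in it is an illustrative assumption rather than our actual code: the names (`normalize`, `clip_negative`, `Pipeline`, `Service`, `run_process`), the toy numeric payload, and the in-memory dicts standing in for the requests and results DBs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# The Algorithmic Component: self-contained, data in, data out, no I/O.
Component = Callable[[List[float]], List[float]]


def normalize(values: List[float]) -> List[float]:
    """Hypothetical component: scale values by the largest absolute value."""
    peak = max((abs(v) for v in values), default=0.0)
    return [v / peak for v in values] if peak else list(values)


def clip_negative(values: List[float]) -> List[float]:
    """Hypothetical component: replace negative values with zero."""
    return [max(v, 0.0) for v in values]


@dataclass
class Pipeline:
    """The Pipeline: receives input, returns output, made of components."""
    components: List[Component]

    def run(self, values: List[float]) -> List[float]:
        for component in self.components:
            values = component(values)
        return values


@dataclass
class Service:
    """The Service: reads requests, runs the pipeline, stores results."""
    pipeline: Pipeline
    # Plain dicts stand in for the requests DB and the results DB.
    requests_db: Dict[str, List[float]] = field(default_factory=dict)
    results_db: Dict[str, List[float]] = field(default_factory=dict)

    def handle(self, request_id: str) -> None:
        raw = self.requests_db[request_id]      # prepare the data
        result = self.pipeline.run(raw)         # the only "algorithm" call
        self.results_db[request_id] = result    # save the results


def run_process(service: Service, request_id: str,
                user_input: List[float]) -> List[float]:
    """The Process: end-to-end, from trigger to user-displayed results."""
    service.requests_db[request_id] = user_input  # the user's trigger
    service.handle(request_id)
    return service.results_db[request_id]         # what the user sees


service = Service(Pipeline([normalize, clip_negative]))
shown_to_user = run_process(service, "r1", [2.0, -4.0, 1.0])
```

In this sketch only `Service` touches the stand-in databases, and the components know nothing about where their input came from. Where exactly those boundaries should sit in a real system is, of course, the argument itself.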

I hope this helps you establish a working language to start arguing about the important things faster :)

Written by Yair Morgenstern

Creator of Unciv, an open-source multiplatform reimplementation of Civ V https://github.com/yairm210/Unciv
