When you fall in love with a model

“The first principle is that you must not fool yourself and you are the easiest person to fool.” -
Richard Feynman, Nobel physicist

“Pygmalion”

Carbon-neutral flying cars are an admirable goal, but are they realistic? Be honest with yourself about what you can reasonably achieve with AI.

When a model flatters your ego

When you build a machine learning model, how do you measure how good it is? Oh sure, you’re a maths whizz so you know exactly which algorithms to employ. But it’s surprisingly easy to become emotionally attached to a model and ignore the warnings when it’s cheating on you.

I once worked at a retail bank where we built models that classified customer transactions. Things were complicated by the categories being ambiguous. For instance, was a parking fine to be classified as an “automobile”, “municipal” or “discretionary” cost?

The bank’s own customers could not agree. They allocated their transactions equally to all three. Many payments were even self-classified as “gifts” - possibly somebody paying the fine of their adult child - who knows? As a result, building a model that had laser accuracy was challenging to say the least.

To solve this problem, a data scientist on the team tested his model against his own personal transactions. Since he knew them intimately, he reasoned, he could evaluate his model accurately. And what do you know? The model was a huge success. However, he only had a hundred or so transactions and was ignoring the millions of other customers the bank had.

It was the devil’s own work to persuade him that this was not reasonable and that, when his model was tested on millions of other customers’ data, it did not perform quite as well.
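To put a number on why a hundred transactions is thin evidence, here is a minimal sketch using the Wilson score interval for a proportion. The 90% accuracy figure and the sample sizes are made up for illustration, not numbers from the bank, and the function name is hypothetical. It only captures the uncertainty that comes from sample size. It says nothing about the bigger problem, which is that one person's transactions are not representative of millions of customers.

```python
import math

def wilson_interval(correct, total, z=1.96):
    """Approximate 95% Wilson score interval for an observed accuracy."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width

# Illustrative numbers only: the same 90% observed accuracy at two sample sizes.
small = wilson_interval(90, 100)
large = wilson_interval(900_000, 1_000_000)
print(f"n=100:       {small[0]:.3f} to {small[1]:.3f}")
print(f"n=1,000,000: {large[0]:.3f} to {large[1]:.3f}")
```

With a hundred transactions the plausible range for the true accuracy is more than ten percentage points wide. With a million it shrinks to a small fraction of a percentage point.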

So, agree on what the criteria for success are before you start working on your next model.
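One way to make that agreement concrete is to write the criteria down as code before any model exists. The sketch below is an assumption about how that might look, not something the bank did. The `SuccessCriteria` class, its field names, the `evaluate` function and the thresholds are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SuccessCriteria:
    """Hypothetical criteria, agreed with stakeholders before modelling starts."""
    min_accuracy: float   # lowest acceptable accuracy on held-out data
    min_test_size: int    # smallest test set accepted as evidence
    min_customers: int    # test data must span at least this many customers

def evaluate(criteria, accuracy, test_size, customers):
    """Return a list of failed criteria; an empty list means success."""
    failures = []
    if accuracy < criteria.min_accuracy:
        failures.append(f"accuracy {accuracy:.2f} is below {criteria.min_accuracy:.2f}")
    if test_size < criteria.min_test_size:
        failures.append(f"test set of {test_size} is below {criteria.min_test_size}")
    if customers < criteria.min_customers:
        failures.append(f"{customers} customer(s) is below {criteria.min_customers}")
    return failures

# Illustrative thresholds only.
criteria = SuccessCriteria(min_accuracy=0.85, min_test_size=100_000, min_customers=1_000)

# A model scored on one person's hundred-odd transactions fails, however good it looks.
for failure in evaluate(criteria, accuracy=0.97, test_size=100, customers=1):
    print(failure)
```

Making the dataclass frozen is a deliberate choice here: once the criteria are agreed, nobody can quietly move the goalposts by mutating them after the results come in.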

Be Honest With Others

There are very few consultants out there telling clients that their idea for a machine learning pipeline is unrealistic. Why should they? That’s not how they make their money.

I’m old enough to have been part of the Dotcom boom in the 90s. Consultants back then were telling their clients that, if they paid them lots of money, they would have a multi-billion-dollar IPO in no time. Nobody was making any money by being reasonable. That’s how I successfully remained poor during the biggest economic boom in my lifetime.

(“And remember to click like and subscribe for more of my great career advice”).

Similarly, it’s clear that some people today are building internal corporate empires by promising their new team, department or project is going to shake their industry to its foundations. Without success criteria that are agreed up front, how can they be proved wrong? When there are no strict criteria for truth, charlatans thrive - homeopaths, fortune tellers, economists - you know the types.

What we’re currently experiencing in this industry is Dotcom v2.0. So, be realistic about what can be achieved by an ML project. You’ll sleep better.

“This is my truth, tell me yours.”

Software is binary and, generally speaking, so are software projects. It either works or it fails. Of course, there is more nuance in the real world, but it’s generally clear when something is wrong.

Machine learning is more subtle. If, as mentioned above, the bank’s customers themselves cannot agree on how to classify their own transactions, how can your machine learning pipeline?

The ancient Greeks knew how you can fall in love with your own work. They gave us the story of Pygmalion, who fell in love with the sculpture he created. So, fall in love with your models if you like. But, as with all love, set boundaries.