Quality (AI) Tools require Quality Use

I have been working on an ML-based algorithmic trading system. My models—a simple linear/MLP model and a decoder-only transformer based on the "Attention Is All You Need" paper (among others)—produce quality predictions, meaning they perform well both in validation and in real-world use. I wrote these models in Python using PyTorch, and I'm pleased with their relative accuracy. It feels good to make something and see it work.
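For the curious, here's roughly what the simpler of the two looks like. This is a minimal sketch, not my production code; the feature count, layer sizes, and the idea of predicting a single next-period return are all illustrative assumptions:

```python
import torch
import torch.nn as nn

class PricePredictorMLP(nn.Module):
    """Tiny MLP mapping a vector of engineered features to a predicted return.

    The sizes here are placeholders, not a production configuration.
    """

    def __init__(self, n_features: int = 32, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # predicted next-period return
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = PricePredictorMLP()
prediction = model(torch.randn(1, 32))  # one batch of 32 dummy features
```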

BUT—and you probably guessed there would be a but—it's one thing to make decent price predictions, and quite another to execute profitable trades based on those predictions. In practice, the software around the models—the trade execution system, the logic for setting targets and stops, the rules for trading frequency, the compliance monitoring, and so much more—is just as important as the models themselves. It turns out it's pretty easy to make terrible trades based on great predictions.
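To make that concrete, here's a toy sketch of the kind of gating logic that sits between a prediction and an order. Every threshold and rule below is a made-up placeholder, not a real strategy value; the point is simply that this layer exists, and getting it wrong can sink a good model:

```python
from dataclasses import dataclass

@dataclass
class Order:
    side: str       # "buy" or "sell"
    size: float
    stop: float     # stop-loss price
    target: float   # take-profit price

def decide(prediction: float, price: float, trades_today: int) -> Order | None:
    """Turn a predicted return into an order, or decline to trade.

    All thresholds are illustrative placeholders.
    """
    if trades_today >= 10:          # frequency rule: stop overtrading
        return None
    if abs(prediction) < 0.002:     # signal too weak to cover costs and slippage
        return None
    side = "buy" if prediction > 0 else "sell"
    stop = price * (0.99 if side == "buy" else 1.01)    # cap the downside
    target = price * (1.02 if side == "buy" else 0.98)  # take profits somewhere
    return Order(side=side, size=1.0, stop=stop, target=target)
```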

A quality machine learning model must be used in a quality way. Algorithmic trading efficiently, brutally, and clearly reinforces this point. A trading system either makes money or it loses money, and the results come from the model, the system around the model, or some combination of the two. I’ve seen the same truth in a number of other projects, though in more subtle ways. 

Take writing blog posts, for example. Midway through this article, I decided to give Claude this prompt: "Write an article based on this title: A quality machine learning model must be used in a quality way." Here's a paragraph from Claude's response:

The organizations deploying machine learning systems have an ethical responsibility to monitor models across various slices of data, enable avenues for redress, document decisions, and continually check models for real-world performance across all user groups. They also must establish oversight procedures involving stakeholders beyond just technical teams, including domain experts in law and ethics.

Blah. Actually, worse than blah. It's pablum. It's LLM vomit. It's lifeless and barren, and if I published an article based on this prompt I would be: 1) lying about my authorship; and 2) furthering the AI-induced stupefaction of the internet (some might argue the internet was already stupid/stupefied, but that's probably another post).

It’s not Claude’s fault. My prompt was terrible (intentionally so). I shouldn’t be surprised when a machine produces something mechanical. So again, a quality tool—and I argue Claude is an excellent tool for many tasks—requires quality use. Here are some better prompts to get Claude (or Bard or ChatGPT) to help you while writing an article:

Given this outline, suggest additional points I should make.

Analyze this article and look for logical fallacies. Pay specific attention to any paragraph whose tone is sarcastic or undercutting.

Summarize the main points in this article and suggest two to three counter arguments to each point. 

I could further engineer these prompts to coax better responses from the model. In fact, each of the above is a shortened version of a much longer prompt I've used while writing. If I treat Claude as a writer, I get poor results. If I treat Claude as a writing assistant, I get much better results.
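If you'd rather run prompts like these programmatically, here's a minimal sketch using the Anthropic Python SDK. The model name is just an example; swap in whatever is current, and note that draft.md is a hypothetical file standing in for your article:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

draft = open("draft.md").read()  # your article in progress

response = client.messages.create(
    model="claude-3-5-sonnet-20240620",  # use whatever model is current
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": (
            "Analyze this article and look for logical fallacies. "
            "Pay specific attention to any paragraph whose tone is "
            f"sarcastic or undercutting.\n\n{draft}"
        ),
    }],
)
print(response.content[0].text)
```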

[Side note: I’ve read several prompt-engineering research papers in which the authors attempt to teach an LLM basic logic and arithmetic. These papers are fascinating from the perspective of emergent behavior—unexpected capabilities that arise from a system designed to do something else—but vexing from the perspective of practicality. We already built computers that are quite efficient at both math and logic. Let’s not make LLMs a hammer in search of a nail. Luckily, the advent of LLM tool use and retrieval-augmented generation (RAG) is a big step toward avoiding this.]
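Tool use is simpler than it sounds. Stripped of any particular SDK, the pattern is: the model emits a structured request, and ordinary code does the actual computation. Here's a toy sketch; the request format is invented for illustration, not any vendor's real protocol:

```python
import ast
import operator

# Plain Python does the arithmetic; the LLM only has to *ask* for it.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def evaluate(node):
    """Safely evaluate a parsed arithmetic expression (numbers and + - * / only)."""
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in OPS:
        return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    raise ValueError("unsupported expression")

def handle_tool_call(call: dict) -> str:
    """Route a model-emitted tool request to real code."""
    if call.get("tool") == "calculator":
        expr = ast.parse(call["expression"], mode="eval").body
        return str(evaluate(expr))
    raise ValueError(f"unknown tool: {call.get('tool')}")

# e.g. the model, asked "what is 1234 * 5678?", emits a request like this:
print(handle_tool_call({"tool": "calculator", "expression": "1234 * 5678"}))
```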

Again, quality tools require quality use. I also saw this while training a private LLM in an effort to increase my productivity as a novelist. And I’ve seen it while using Copilot during coding sessions. So I’ll end with this bit of wisdom: if you’re having trouble getting what you want from an ML model, the problem is just as likely to be you as it is the model. At least, that’s how it’s been for me. 

[LATE ADDITION]

I used Stable Diffusion to generate a hero image for this post. Here’s the evolution of prompts and resulting images:

PROMPT: transformer-based LLM, with a technology background
IMAGE:

PROMPT: an LLM graph, with a technology background
IMAGE:

PROMPT: a transformer-based neural network diagram, with a technology background
IMAGE:

PROMPT: a hero image for a blog post about neural networks and how to use them in effective ways. the image should have a dark blue theme with white accents, and the background of the image should look like interconnected graph elements.
IMAGE:
