Sometimes throwing more API calls at a problem is the easiest way to increase accuracy.
Wouldn't this increase the latency?
When generating the outputs, you could run the prompts in parallel to minimize the latency increase. The only added latency then comes from the final prompt, where the model judges the outputs.
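Roughly what that pattern looks like, as a minimal sketch using the OpenAI Python SDK with asyncio. The model name, prompts, and candidate count are placeholders, not recommendations:

```python
# Parallel generation followed by a single judging call.
# Assumes the OpenAI Python SDK (>= 1.0) and OPENAI_API_KEY in the environment.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI()

async def generate(question: str) -> str:
    # One candidate answer.
    resp = await client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

async def answer_with_judge(question: str, n: int = 5) -> str:
    # Fire off n generations concurrently, so wall-clock latency is
    # roughly one generation rather than n of them.
    candidates = await asyncio.gather(*(generate(question) for _ in range(n)))

    # The only sequential extra call: ask the model to pick the best output.
    numbered = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(candidates))
    judge = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": (
                f"Question: {question}\n\nCandidate answers:\n{numbered}\n\n"
                "Reply with the single best answer, verbatim."
            ),
        }],
    )
    return judge.choices[0].message.content

# asyncio.run(answer_with_judge("What is 17 * 24?"))
```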