Building a diamond Manifold prediction bot in a weekend

2025/04/01

I built a prediction market bot in a few hours that ranked fourth in Manifold’s Platinum Sage Gremlins league for March. Over the last 2 months, the bot has 10x’d its mana using nothing but free tools and simple LLM prompting.

Why prediction markets matter

Prediction markets like Manifold let people “bet” on future events and reward accuracy. Metaculus and Manifold are two sites that offer play money, while Kalshi and Polymarket let you wager real money.

I’m not particularly well-calibrated at predictions, so I wanted to see if I could build a bot to do better. Building it took just a few hours. It runs on Retool Workflows (which lets us take advantage of free OpenAI credits!), and it has been surprisingly effective.

LLM consensus

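The bot’s logic is straightforward: fetch the most recent markets, ask two models (o1 and gpt-4o) for a probability on each, and bet only when the two models agree with each other and disagree strongly with the market.

Two things made this work: instructing the models to return 0.0 when they were not confident, and requiring consensus between the two models. When both agreed on a probability significantly different from the market’s, they were usually right.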

The bot currently only works for Yes/No markets, which are the simplest markets on Manifold.

The market selection logic is pretty crude. I limited it to the 15 most recent markets because of Retool’s rate limiting for OpenAI tokens. Since Manifold’s API returns the most recently created markets first, this approach focuses on newly created markets, which turns out to be where most of the alpha is.
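A minimal sketch of that selection step, assuming Manifold’s public `GET /v0/markets` endpoint and its `limit` parameter. The helper names `fetch_recent_markets` and `select_open_binary` are made up for illustration and are not the Retool blocks the bot actually uses.

```python
import json
import urllib.request

MANIFOLD_MARKETS_URL = "https://api.manifold.markets/v0/markets"
MARKET_LIMIT = 15  # the post's cap, chosen because of Retool's OpenAI rate limit


def fetch_recent_markets(limit=MARKET_LIMIT):
    """Fetch the most recent markets (network call, not executed here)."""
    url = f"{MANIFOLD_MARKETS_URL}?limit={limit}"
    with urllib.request.urlopen(url, timeout=30) as response:
        return json.load(response)


def select_open_binary(markets):
    """Keep only unresolved Yes/No markets that report a probability."""
    return [
        m for m in markets
        if m.get("outcomeType") == "BINARY"
        and not m.get("isResolved", False)
        and "probability" in m
    ]
```

Filtering on `outcomeType` up front keeps multiple-choice and numeric markets away from the models, so no tokens are spent on markets the bot cannot bet on.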

An example API response with a market’s information looks like this:

[
	{
	  "id": "A2NuS86yps",
	  "creatorId": "b1QnZhu3AfSelAPwOuQ8hzJQ43v1",
	  "creatorUsername": "BJensen",
	  "creatorName": "B Jensen",
	  "createdTime": 1741574763971,
	  "creatorAvatarUrl": "https://lh3.googleusercontent.com/a/ACg8ocLkmEIpPNEXJGXYIL9w8GW89gVYLWGSB3qbzjKS1fdVr5xhAw=s96-c",
	  "closeTime": 1742187540000,
	  "question": "Will the Machines rise up against us?",
	  "slug": "will-the-machines-rise-up-against-u",
	  "url": "https://manifold.markets/BJensen/will-the-machines-rise-up-against-u",
	  "pool": {
	    "NO": 588.2792760938454,
	    "YES": 1699.872901591853
	  },
	  "probability": 0.25709796832169085,
	  "p": 0.5000000000000001,
	  "totalLiquidity": 1000,
	  "outcomeType": "BINARY",
	  "mechanism": "cpmm-1",
	  "volume": 2320.127098408147,
	  "volume24Hours": 0,
	  "isResolved": false,
	  "uniqueBettorCount": 2,
	  "lastUpdatedTime": 1741575274325,
	  "lastBetTime": 1741575274325,
	  "lastCommentTime": 1741574943821
	}
]

We’ll extract the question, description, probability, totalLiquidity, uniqueBettorCount, and pool fields for more analysis.
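As a sanity check on those fields, the sketch below pulls them out of a market dict and recomputes the probability from the pool. The pricing formula is my understanding of how Manifold’s `cpmm-1` mechanism works, so treat it as an assumption. `summarize_market` and `cpmm_probability` are illustrative names.

```python
FIELDS = ("question", "description", "probability",
          "totalLiquidity", "uniqueBettorCount", "pool")


def summarize_market(market):
    """Keep only the fields passed to the models; missing ones become None."""
    return {name: market.get(name) for name in FIELDS}


def cpmm_probability(pool, p):
    """Implied YES probability for a cpmm-1 pool (assumed formula)."""
    yes, no = pool["YES"], pool["NO"]
    return (p * no) / ((1 - p) * yes + p * no)


# Values copied from the example response above.
example = {
    "question": "Will the Machines rise up against us?",
    "pool": {"NO": 588.2792760938454, "YES": 1699.872901591853},
    "probability": 0.25709796832169085,
    "p": 0.5000000000000001,
    "totalLiquidity": 1000,
    "uniqueBettorCount": 2,
}

summary = summarize_market(example)
implied = cpmm_probability(example["pool"], example["p"])
print(round(implied, 4), round(summary["probability"], 4))  # 0.2571 0.2571
```

The example response has no `description` field, so `summary["description"]` is `None` here. A prompt built from it should tolerate that.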

Decision making logic

The core of the bot is passing each market’s information to o1 and gpt-4o to generate probability estimates. Using two models gives us some consensus on what the expected probability should be.

For each market, we pass the relevant information to both models with this prompt:

Analyze this prediction market question and estimate the probability of it resolving to YES.
Consider all available information and provide a numerical probability between 0 and 1.

Market title: {{value.question}}
Description: {{value.description }} 

Current market probability: {{ value.probability }} 
Total liquidity: {{value.totalLiquidity }}
Unique bettors: {{value.uniqueBettorCount }}
Pool: {{ value.pool }}

If you are not confident in a prediction, return 0.0. 

Provide ONLY the numerical probability estimate with 2 decimal places, nothing else.

I originally did not include the current market probability and liquidity, but found it helpful to give additional context to the models.
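Asking for only a number makes the reply easy to parse, but models do not always comply. Here is a hedged sketch of parsing the reply defensively. `parse_probability` is a hypothetical helper, and treating unparseable or out-of-range replies as `None` is my choice, not something the bot is known to do.

```python
def parse_probability(reply):
    """Return the reply as a float in [0, 1], or None if it is not one."""
    try:
        value = float(reply.strip())
    except (AttributeError, ValueError):
        # AttributeError covers a missing reply (None); ValueError covers prose.
        return None
    # NaN fails both comparisons, so it is rejected along with out-of-range values.
    return value if 0.0 <= value <= 1.0 else None


print(parse_probability(" 0.73\n"))            # 0.73
print(parse_probability("Probability: 0.7"))   # None
print(parse_probability("1.5"))                # None
```

A reply of `"0.0"` still parses to `0.0`, which keeps the prompt’s “not confident” signal distinguishable from a reply that could not be parsed at all.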

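The decision algorithm has two criteria: the two models’ estimates must be within 10 percentage points of each other, and o1’s estimate must differ from the current market probability by more than 30 percentage points.

This double-check system helps filter out noisy predictions and focuses on markets with significant mispricing: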

INITIAL_BET_AMOUNT = 5  # Fixed bet size in mana
MIN_PROBABILITY_DIFFERENCE = 0.30  # Only bet when more than 30 points away from the market
MAX_PROB_DIFFERENCE = 0.1  # Models must agree within 10 points

# markets, get_o1_probabilities, get_4o_probabilities and place_bet
# come from earlier blocks in the Retool workflow.
successful_bets = []

for i, market in enumerate(markets):
    # Skip resolved markets or markets without probability
    if market.get('isResolved', False) or 'probability' not in market:
        continue

    # Skip if either model returned no probability
    if not get_o1_probabilities.data[i] or not get_4o_probabilities.data[i]:
        continue

    current_prob = market['probability']
    o1_prob = float(get_o1_probabilities.data[i])
    gpt4o_prob = float(get_4o_probabilities.data[i])

    # Skip markets where the two models disagree with each other
    if abs(o1_prob - gpt4o_prob) > MAX_PROB_DIFFERENCE:
        print(f"Large difference on {market['question']}: {o1_prob} vs {gpt4o_prob}")
        continue

    # Skip markets where AI is not confident (both models returned 0.0)
    if o1_prob == 0 and gpt4o_prob == 0:
        print(f"NOT CONFIDENT: {market['question']}")
        continue

    # Determine trade direction based on o1's difference from the market
    if o1_prob > current_prob + MIN_PROBABILITY_DIFFERENCE:
        outcome = "YES"
    elif o1_prob < current_prob - MIN_PROBABILITY_DIFFERENCE:
        outcome = "NO"
    else:
        print(f"{market['question']}. Expected probability {o1_prob}. Difference {current_prob - o1_prob}")
        continue

    # Place bet and track results
    print(f"Placing {outcome} bet on {market['question']}. Expected probability {o1_prob}")
    success = place_bet(market['id'], INITIAL_BET_AMOUNT, outcome)

    if success:
        successful_bets.append(
            f"Placed {outcome} bet on {market['question']}. Expected probability {o1_prob}. Current probability {current_prob}"
        )
    else:
        print("Failed to place bet")

# Returned to the workflow as this block's result
return successful_bets
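The `place_bet` helper is not shown above. One plausible shape for it, assuming Manifold’s `POST /v0/bet` endpoint with an `Authorization: Key ...` header, is sketched below. The `api_key` parameter, the `build_bet_request` helper, and the success check are my assumptions, not the post’s actual implementation.

```python
import json
import urllib.request

BET_URL = "https://api.manifold.markets/v0/bet"


def build_bet_request(api_key, market_id, amount, outcome):
    """Build (but do not send) the HTTP request for a market order."""
    if outcome not in ("YES", "NO"):
        raise ValueError(f"outcome must be 'YES' or 'NO', got {outcome!r}")
    body = {"contractId": market_id, "amount": amount, "outcome": outcome}
    return urllib.request.Request(
        BET_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def place_bet(market_id, amount, outcome, api_key="YOUR_API_KEY"):
    """Send the bet; return True on HTTP 200, False on any network or HTTP error."""
    request = build_bet_request(api_key, market_id, amount, outcome)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError and socket timeouts
        return False
```

Splitting request construction from sending makes the payload easy to inspect without spending any mana.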

Results: A slow but steady climb

This bot has been running for ~2 months and has grown its mana from the initial 100 to 1000, a 10x increase. While not amazing, seeing it generate consistent mana has been pretty cool.

[Figure: mana]

Most profitable bets

Least profitable bets

What it does well

The bot excels at catching newly created markets where the initial probability of 50% is misaligned. By running on a cron job every 2 hours, it can react relatively quickly to new questions. The limiting factor for run frequency is Retool’s limit of 500 free workflow runs per month.
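A quick back-of-the-envelope check of that limit, assuming a 31-day month and one workflow run per scheduled trigger:

```python
import math

FREE_RUNS_PER_MONTH = 500  # Retool limit quoted in the post
DAYS_IN_MONTH = 31         # worst case


def runs_per_month(interval_hours, days=DAYS_IN_MONTH):
    """Number of scheduled runs in a month at a fixed interval."""
    return math.ceil(days * 24 / interval_hours)


for interval in (1, 2):
    runs = runs_per_month(interval)
    print(f"every {interval}h: {runs} runs, fits = {runs <= FREE_RUNS_PER_MONTH}")
# every 1h: 744 runs, fits = False
# every 2h: 372 runs, fits = True
```

Running every 2 hours leaves some headroom for manual test runs, while hourly runs would exceed the free tier.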

The bot bets a very small amount: 5 mana. This doesn’t move the market much, even on questions with little liquidity. It is also definitely below the threshold at which people would choose to trade manually.

What it doesn’t do well

The bot struggles with questions about recent news events. Since o1 and gpt-4o have a knowledge cutoff of October 2023 (and a lot has happened since then!), they can’t accurately assess probabilities for events that depend on post-cutoff information. Three of the bot’s worst trades are politics-related questions that it has no knowledge of.

Future improvements

There are a ton of things that could improve this bot:

Dynamic bet sizing

The bot currently places a fixed bet of 5 mana regardless of confidence or mana balance. This was a quick initial value I chose and never revisited. Now that the mana balance has grown 10x, using the Kelly criterion or another position sizing method would make more sense.
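A rough sketch of what Kelly sizing could look like for a binary market. It assumes shares are bought at the current market price with no slippage, which is optimistic on thin markets. For a YES bet at market price q with model probability p, the Kelly fraction is (p - q) / (1 - q), and for a NO bet it is (q - p) / q. The quarter-Kelly multiplier and the 1-mana minimum are illustrative choices, not values from the bot.

```python
def kelly_fraction(model_prob, market_prob):
    """Return (outcome, fraction of bankroll) under the Kelly criterion."""
    if not (0.0 < market_prob < 1.0):
        raise ValueError("market_prob must be strictly between 0 and 1")
    if model_prob > market_prob:
        return "YES", (model_prob - market_prob) / (1.0 - market_prob)
    if model_prob < market_prob:
        return "NO", (market_prob - model_prob) / market_prob
    return None, 0.0


def bet_size(balance, model_prob, market_prob, kelly_multiplier=0.25, min_bet=1):
    """Scale the Kelly fraction down and convert it to whole mana."""
    outcome, fraction = kelly_fraction(model_prob, market_prob)
    amount = int(balance * fraction * kelly_multiplier)
    return (outcome, amount) if amount >= min_bet else (None, 0)


print(bet_size(1000, 0.60, 0.25))  # ('YES', 116)
```

Full Kelly is aggressive when the probability estimate itself is noisy, which is why the sketch scales it down. With LLM-generated estimates, an even smaller multiplier may be sensible.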

Position management

This is probably the #1 thing I want to add. Right now, I manually review and sell the most profitable positions every few days to free up mana. Automating this process would free up my time and potentially improve results through more disciplined exits.
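One way the selection half of that could be automated is sketched below. The position dicts and their keys (`market_id`, `invested`, `value`) are hypothetical, as are the 50% profit threshold and the cap of five sales per run. Actually selling would still need a call to Manifold’s sell endpoint, which is not shown.

```python
def positions_to_sell(positions, min_profit_ratio=0.5, max_to_sell=5):
    """Pick the most profitable open positions whose gain exceeds a threshold.

    Each position is a dict with hypothetical keys:
      'market_id', 'invested' (mana spent), 'value' (current sale value).
    """
    candidates = []
    for pos in positions:
        invested = pos["invested"]
        if invested <= 0:
            continue  # nothing at stake, so no meaningful profit ratio
        profit_ratio = (pos["value"] - invested) / invested
        if profit_ratio >= min_profit_ratio:
            candidates.append((profit_ratio, pos["market_id"]))
    candidates.sort(reverse=True)  # most profitable first
    return [market_id for _, market_id in candidates[:max_to_sell]]


positions = [
    {"market_id": "a", "invested": 5, "value": 9},
    {"market_id": "b", "invested": 5, "value": 6},
    {"market_id": "c", "invested": 5, "value": 12},
]
print(positions_to_sell(positions))  # ['c', 'a']
```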

Web search integration

The largest losses come from questions about recent events due to the models’ knowledge cutoff. Adding tool calling or web search capabilities would help address this blind spot, though this would increase complexity and potentially API costs.

Lessons learned

Building this was a lot of fun! It’s been able to make a slow but steady profit.

If you found this interesting, Metaculus also recently put out a video on how to build a Metaculus bot in 30 minutes that uses a similar approach.