eph baum dot dev


Don't Trust AI - An Advent of Code Tale

Published on 12/09/2023 11:25 AM by Eph Baum


Or: Trust but Verify AI

I’ve been working on Advent of Code 2023 and having a good time. I might even complete them all this time; historically I’ve fallen off due to the typical hustle and bustle that December creates. You can find my code for this year here on Repl.it, which I’ve been using because I don’t want to melt my own computer trying to brute force a problem like an idiot on a first pass of reusing my part 1 code on part 2. (If you’re not familiar: each day of AoC presents you with a programming puzzle to solve using any language, or any method, even manually if you’re nasty, in two parts. The second part often ramps up the complexity significantly, sometimes in an “infinite loops & cycles might melt your silicon” kind of way.)

Recently I was working on part 2 of day 7 (spoilers ahead for day 7’s solution) on Repl.it. Sometimes I work locally and copy pasta my work over to Repl.it when I’m done or as I go, and sometimes I just work right in the Repl.it environment. I’m easy breezy like that.

In this case I had refactored my working part 2 code in the quest for optimization: my solution worked against the example inputs but took too long to run against the full input. The logic I’d already written didn’t need to change; it was just being copied and pasted between locations. It seems that somewhere in that shuffle, their Co-Pilot-esque AI generated a less than helpful line of code, and I don’t even recall accepting it. In its defense, I don’t pay for Repl.it Core, so I only have access to their “basic” level AI.

So, without giving away too many spoilers, part of Day Seven’s challenge was to determine the rank of card hands based on some traditional poker hands, so I had a determineBestHand method:

function determineBestHand(cardCounts, jokerCount) {
  if (jokerCount === 5) {
    return { type: 'Five of a Kind', weight: 1 };
  }

  const frequencies = Object.values(cardCounts).sort((a, b) => b - a);

  if (frequencies[0] + jokerCount >= 5) {
    return { type: 'Five of a Kind', weight: 1 };
  }
  
  if (frequencies[0] + jokerCount >= 4) {
    return { type: 'Four of a Kind', weight: 2 };
  }

  if ((frequencies[0] === 3 && (frequencies[1] >= 2 || jokerCount > 0)) 
      || (frequencies[0] === 2 && frequencies[1] === 2 && jokerCount > 0)) {
    return { type: 'Full House', weight: 3 };
  }

  if (frequencies[0] + jokerCount === 3) {
    return { type: 'Three of a Kind', weight: 4 };
  }

  if ((frequencies[0] === 2 && frequencies.length > 1)
      || (frequencies[0] === 1 && jokerCount > 0)) {
    return { type: 'Two Pair', weight: 5 };
  }

  if (frequencies[0] === 2 || 
      (frequencies[0] === 1 && jokerCount >= 1)) {
    return { type: 'One Pair', weight: 6 };
  }

  return { type: 'High Card', weight: 7 };
}
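For reference, the function expects a frequency map of the non-joker cards plus a separate joker count. The parsing code isn’t shown in this post, but a hypothetical helper (not the code from my repo) that builds those inputs might look like:

```javascript
// Hypothetical helper showing the shape of inputs determineBestHand expects:
// a frequency map of non-joker cards plus a separate joker count.
function countCards(hand) {
  const cardCounts = {};
  let jokerCount = 0;
  for (const card of hand) {
    if (card === 'J') {
      jokerCount++; // part 2: J is a joker, tallied separately
    } else {
      cardCounts[card] = (cardCounts[card] || 0) + 1;
    }
  }
  return { cardCounts, jokerCount };
}

// e.g. the example hand "T55J5" yields counts { T: 1, '5': 3 } and jokerCount 1
```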

The issue was not super obvious at first glance. I kept missing it because I assumed the logic was sound; I’d used the exact same code in the previous part’s challenge with success.

Thankfully someone was able to spot the issue for me: my results no longer included “One Pair” results at all.

That made the bug obvious: the condition (frequencies[0] === 2 && frequencies.length > 1) will be true for any situation where there is at least one pair, regardless of whether there’s a second pair or not.
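To see it concretely, here’s a minimal repro (frequencies is just the sorted array of card counts, as in the function above):

```javascript
// Sorted card frequencies for a hand like A-A-K-Q-T: one pair, no jokers.
const frequencies = [2, 1, 1];

// The buggy "Two Pair" test: frequencies.length > 1 is true for ANY hand
// with more than one distinct card, so a lone pair matches too.
const buggyTwoPair = frequencies[0] === 2 && frequencies.length > 1; // true

// The intended test requires a second pair explicitly.
const realTwoPair = frequencies[0] === 2 && frequencies[1] === 2; // false
```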

The correct logic for ‘Two Pair’ was:

if (frequencies[0] === 2 && frequencies[1] === 2) {
  return { type: 'Two Pair', weight: 5 };
}
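With the fix in, a quick sanity check (pairType here is a hypothetical stand-in for just the pair-related branches, not my actual solution) classifies hands the way you’d expect:

```javascript
// Stand-in for the pair-related slice of determineBestHand,
// using the corrected Two Pair condition.
function pairType(frequencies) {
  if (frequencies[0] === 2 && frequencies[1] === 2) return 'Two Pair';
  if (frequencies[0] === 2) return 'One Pair';
  return 'High Card';
}

// pairType([2, 2, 1]) -> 'Two Pair'
// pairType([2, 1, 1]) -> 'One Pair'  (the case the buggy condition swallowed)
```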

🤦

I use generative AI now. I really like it, though it’s not always helpful or accurate. I’ve been using GitHub’s Co-Pilot for a while, and it’s mostly useful for saving me from typing out repetitive, basic code like for loops or switch or match statements. In short: I think I’m still smarter than “AI”. Co-Pilot Chat and ChatGPT are still helpful for reasoning through problems sometimes; it’s like rubberducking without having to waste a real person’s time.

Some things that drive me nuts about AI:

In conclusion, while AI in forms like GitHub’s Co-Pilot or Repl.it’s coding assistant offers undeniable convenience and speed in coding, it’s not a panacea. As my experience with Advent of Code highlights, AI can sometimes lead us astray with its suggestions. At its current stage it’s more of a co-pilot than an autonomous driver: it requires our expertise, vigilance, and, occasionally, our skepticism. The key is to use AI as a tool to enhance our skills, not replace them. Trust, but verify, remains a prudent approach when navigating the evolving landscape of AI in software development, and it’s on us to keep our critical problem-solving abilities sharp while we leverage AI’s efficiency.

