
Responder: When the Interview Is a Bot, So Am I

Responder is a small tool and a statement. It automatically sits through AI-powered audio job interviews, the ones where no human is on the other end, where a model asks questions and another model decides whether you proceed. Built in Go, with Google Speech-to-Text, Gemini, and a virtual microphone on Linux.

The Problem with Automated Interviews

At some point in the last few years, a format quietly spread through hiring pipelines: the fully automated audio interview. You click a link, a bot asks you questions, you answer into the void, and a model on the other end decides whether a human ever sees your application. No conversation, no context, no one to ask a question back to.

This is disrespectful to candidates. It treats a conversation as a data extraction problem and hides the fact that no one at the company cared enough to show up. It filters for people who perform well in front of language models, which is not the same thing as filtering for people who do good work.

So I Built the Other Side

Responder captures the system audio of the call, uses Google Speech-to-Text to transcribe what the bot says, sends the transcript along with my CV to Gemini, and speaks the response back through a virtual microphone that the call app picks up as the input device. Bot on one end, bot on the other. If the format works, it should not matter who is answering.
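To make the shape of one question-and-answer turn concrete, here is a minimal sketch in Go. It is not the code from the repository. The names answerTurn, buildSystemPrompt, generateFunc and speakFunc are hypothetical, and the model and speech stages are replaced by stubs so the example runs without any cloud credentials.

```go
package main

import (
	"fmt"
	"strings"
)

// generateFunc and speakFunc are hypothetical stand-ins for the LLM call
// and the text-to-speech stage. The real signatures may look different.
type generateFunc func(system, user string) (string, error)
type speakFunc func(text string) error

// buildSystemPrompt combines fixed instructions with the CV text.
func buildSystemPrompt(instructions, cv string) string {
	return strings.TrimSpace(instructions) + "\n\nCandidate CV:\n" + strings.TrimSpace(cv)
}

// answerTurn handles one transcribed question: it asks the model for an
// answer and hands that answer to the speech stage.
func answerTurn(question, instructions, cv string, generate generateFunc, speak speakFunc) (string, error) {
	question = strings.TrimSpace(question)
	if question == "" {
		return "", nil // nothing was asked, so stay silent
	}
	answer, err := generate(buildSystemPrompt(instructions, cv), question)
	if err != nil {
		return "", fmt.Errorf("generate: %w", err)
	}
	if err := speak(answer); err != nil {
		return "", fmt.Errorf("speak: %w", err)
	}
	return answer, nil
}

// demoTurn runs one turn with stub stages and returns what was "spoken".
func demoTurn(question string) string {
	spoken := ""
	generate := func(system, user string) (string, error) {
		return "stub: " + user, nil // a real stage would call the model here
	}
	speak := func(text string) error { spoken = text; return nil }
	if _, err := answerTurn(question, "Answer briefly.", "Go developer.", generate, speak); err != nil {
		panic(err)
	}
	return spoken
}

func main() {
	fmt.Println(demoTurn("  Why Go?  "))
}
```

Passing the stages in as functions keeps the turn logic testable without audio hardware or API keys. Swapping the stubs for real clients would not change answerTurn.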

It is not designed to fool humans, and it cannot. It is built specifically for the automated case, as a demonstration that if both sides of an "interview" can be a language model, the interview was never doing what it claimed to.

Responder running in a terminal: transcribed questions from the bot and generated answers side by side.

💡 The Point

Automated AI interviews reduce a conversation to a one-sided interrogation by a system that cannot actually listen. Responder is a proof, not a product: it shows that the format is hollow. Use a real human, or do not be surprised when candidates stop showing up as humans either.

How It Works

The pipeline is straightforward. A virtual PipeWire sink captures the audio coming out of the call app. parec pipes raw PCM into a Go process, which streams it to Google's Speech-to-Text v2 API using the chirp_2 model. Streams rotate every few minutes to stay within session limits. When a final transcript arrives, it goes to Gemini with a system prompt and my CV as context. The response is spoken through Google Text-to-Speech and routed into a virtual microphone, which the call app sees as a normal input device.
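The stream rotation mentioned above can be sketched roughly as follows. This is an illustration under assumptions, not the repository's implementation. The chunkSink interface, the rotator type and the simulate helper are hypothetical, and the four-minute limit is a made-up value standing in for "every few minutes". The clock is injected so the behaviour can be checked without waiting.

```go
package main

import (
	"fmt"
	"time"
)

// chunkSink is a hypothetical stand-in for one streaming recognition session.
type chunkSink interface {
	Send(pcm []byte) error
	Close() error
}

// rotator forwards PCM chunks to the current session and opens a fresh
// session once the current one has reached maxAge.
type rotator struct {
	open    func() (chunkSink, error)
	now     func() time.Time
	maxAge  time.Duration
	cur     chunkSink
	started time.Time
}

func (r *rotator) Send(pcm []byte) error {
	if r.cur != nil && r.now().Sub(r.started) >= r.maxAge {
		if err := r.cur.Close(); err != nil {
			return err
		}
		r.cur = nil
	}
	if r.cur == nil {
		s, err := r.open()
		if err != nil {
			return err
		}
		r.cur, r.started = s, r.now()
	}
	return r.cur.Send(pcm)
}

// countingSink records how many chunks one session received.
type countingSink struct{ chunks int }

func (c *countingSink) Send([]byte) error { c.chunks++; return nil }
func (c *countingSink) Close() error      { return nil }

// simulate feeds n chunks spaced stepSeconds apart on a fake clock and
// returns the number of chunks each session received.
func simulate(n, stepSeconds, maxAgeSeconds int) []int {
	clock := time.Unix(0, 0)
	var sessions []*countingSink
	r := &rotator{
		open: func() (chunkSink, error) {
			s := &countingSink{}
			sessions = append(sessions, s)
			return s, nil
		},
		now:    func() time.Time { return clock },
		maxAge: time.Duration(maxAgeSeconds) * time.Second,
	}
	for i := 0; i < n; i++ {
		if err := r.Send(make([]byte, 3200)); err != nil {
			panic(err)
		}
		clock = clock.Add(time.Duration(stepSeconds) * time.Second)
	}
	counts := make([]int, len(sessions))
	for i, s := range sessions {
		counts[i] = s.chunks
	}
	return counts
}

func main() {
	// Ten chunks, one per simulated minute, rotating after four minutes.
	fmt.Println(simulate(10, 60, 240)) // prints [4 4 2]
}
```

This sketch rotates on a hard cut. A real pipeline would probably want to rotate during silence or replay a short overlap, so that a question spoken across the boundary is not split.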

There is a small buffering layer in front of the model: if the bot asks a multi-part question in quick succession, fragments are collected for a couple of seconds before a single response is generated. Otherwise you end up interrupting yourself.
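A buffering layer like this is essentially a debounce. The sketch below shows one way to write it in Go. The function name debounce, the demo helper and the 50 ms quiet window are assumptions chosen so the example finishes quickly. The text above only says "a couple of seconds", and the actual implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// debounce collects fragments from in and emits them as one joined question
// once no new fragment has arrived for the duration quiet. Closing in
// flushes whatever is still pending.
func debounce(in <-chan string, quiet time.Duration, emit func(string)) {
	var parts []string
	var timeout <-chan time.Time // nil (blocks forever) while nothing is pending
	flush := func() {
		if len(parts) > 0 {
			emit(strings.Join(parts, " "))
			parts = nil
		}
		timeout = nil
	}
	for {
		select {
		case frag, ok := <-in:
			if !ok {
				flush()
				return
			}
			parts = append(parts, frag)
			timeout = time.After(quiet) // every fragment restarts the quiet window
		case <-timeout:
			flush()
		}
	}
}

// demo sends two fragments back to back, pauses, then sends a third.
func demo() []string {
	in := make(chan string)
	done := make(chan struct{})
	var got []string
	go func() {
		debounce(in, 50*time.Millisecond, func(q string) { got = append(got, q) })
		close(done)
	}()
	in <- "Tell me about your last project."
	in <- "And what was your role?"
	time.Sleep(300 * time.Millisecond) // longer than the quiet window
	in <- "Why Go?"
	close(in)
	<-done
	return got
}

func main() {
	for i, q := range demo() {
		fmt.Printf("%d: %s\n", i+1, q)
	}
}
```

The first two fragments arrive inside one quiet window and are answered as a single question. The third arrives after the pause and becomes its own turn.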

The Other Point

A job interview is not a one-way street, at least not one I want to drive down. I want to get to know the team too, hear what you are building, and figure out if we would actually enjoy working together. If you ask me a technical question, you might even get one back. 😉

As a freelancer I can afford to be picky about this. And honestly, so should you. The best hires come from conversations, not from interrogations conducted by a language model at 2 a.m. because it was cheaper than scheduling a call.

Tech Stack

  • Language: Go, single binary, no runtime dependencies beyond the system audio tools
  • Speech-to-Text: Google Cloud Speech v2 with the chirp_2 model, streaming with interim results
  • LLM: Gemini 2.5 Flash with a system prompt and the candidate CV as context
  • Text-to-Speech: Google Cloud Text-to-Speech, piped directly into the virtual microphone sink
  • Audio: PipeWire / PulseAudio virtual source and sink, parec for capture
  • Platform: Linux only, self-hosted, no cloud orchestration
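
For the capture side of the stack, a minimal sketch of reading raw PCM from parec in a Go process could look like this. The sink name responder_in, the 16 kHz mono format and the 100 ms frame size are assumptions for illustration and are not taken from the repository. The parec flags used here (--device, --format, --rate, --channels, --raw) are standard pacat options.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"os/exec"
	"time"
)

const (
	sampleRate = 16000 // Hz, assumed
	frameMs    = 100   // frame length in milliseconds, assumed
)

// chunkBytes returns the size of one frame of 16-bit mono PCM.
func chunkBytes(rate, ms int) int { return rate * 2 * ms / 1000 }

// parecArgs builds the parec arguments for recording the monitor of a sink.
func parecArgs(sink string, rate int) []string {
	return []string{
		"--device=" + sink + ".monitor",
		"--format=s16le",
		fmt.Sprintf("--rate=%d", rate),
		"--channels=1",
		"--raw",
	}
}

// capture runs parec until ctx is done and passes fixed-size frames to handle.
func capture(ctx context.Context, sink string, handle func([]byte)) error {
	cmd := exec.CommandContext(ctx, "parec", parecArgs(sink, sampleRate)...)
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return err
	}
	if err := cmd.Start(); err != nil {
		return err
	}
	buf := make([]byte, chunkBytes(sampleRate, frameMs))
	for {
		if _, err := io.ReadFull(stdout, buf); err != nil {
			break // EOF, or parec was stopped
		}
		frame := make([]byte, len(buf))
		copy(frame, buf) // copy so the handler may keep the frame
		handle(frame)
	}
	return cmd.Wait()
}

func main() {
	fmt.Println("parec", parecArgs("responder_in", sampleRate))
	if _, err := exec.LookPath("parec"); err != nil {
		fmt.Println("parec not found; skipping live capture")
		return
	}
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	frames := 0
	err := capture(ctx, "responder_in", func([]byte) { frames++ })
	fmt.Println("frames:", frames, "err:", err)
}
```

When the context expires, parec is killed, so capture returns a non-nil error even on a clean stop. A caller that cares would check ctx.Err() before treating that as a failure.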

Source and Disclaimer

The code is on my Gitea at gitea.karlbreuer.com/karl/responder. It is a demonstration and a statement. No responsibility is assumed for any use or consequences thereof. You are responsible for complying with the terms of service, laws, and ethical norms that apply to you.

If you are hiring and you want to actually meet the people you might work with, drop me a line at mail@karlbreuer.com. I take on architecture, development, and technical leadership, and I promise to show up in person. ;-)