What 2,800 AI Conversations Taught Me About My Users
What I learned running topic modeling on 2,800+ user messages, and how you can do it too.
You’ve shipped an AI prototype. People are using it. Now what?
I shipped an AI language conversation partner a few months ago. People were asking questions, practicing phrases, having full conversations with my AI tutor. I implemented traffic analytics to tell me where they came from, and session tools to show me where they clicked.
But none of that answered the question that actually drives product decisions:
What are people actually trying to accomplish when they use it?
That’s when I turned to topic modeling to help surface the answer. It clusters raw user queries into themes, so you can see which use cases dominate, which segments you’re under-serving, and where your roadmap should go next.
Here’s what I found when I ran it on my dataset of 2,800+ user interactions.
Why Do Topic Modeling
Topic modeling treats each query as a point in an embedding space and clusters them by semantic similarity. So “how do I say chicken in Cantonese?” and “我鍾意食雞飯” (I like eating chicken rice) land close together even though they’re in different languages and share almost no vocabulary.
Older NLP methods like LDA count word frequencies, so they'd treat Cantonese and English as separate universes with no overlap. Embeddings capture meaning instead, so code-switched conversations cluster by what they're about, regardless of language.
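To make that concrete, here's a toy sketch of why embeddings group cross-language queries. The vectors below are made-up 4-dimensional stand-ins for real sentence-transformer embeddings (a model like all-MiniLM-L6-v2 produces 384 dimensions); the point is that cosine similarity scores the two chicken queries as close despite zero shared vocabulary.

```python
import numpy as np

# Hypothetical toy vectors standing in for real sentence embeddings.
queries = {
    "how do I say chicken in Cantonese?": np.array([0.9, 0.1, 0.2, 0.0]),
    "我鍾意食雞飯 (I like eating chicken rice)": np.array([0.8, 0.2, 0.3, 0.1]),
    "what's the weather tomorrow?": np.array([0.1, 0.9, 0.0, 0.3]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, ~0 means unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

texts = list(queries)
same_topic = cosine(queries[texts[0]], queries[texts[1]])
diff_topic = cosine(queries[texts[0]], queries[texts[2]])
print(f"chicken vs chicken rice: {same_topic:.2f}")  # high
print(f"chicken vs weather:      {diff_topic:.2f}")  # low
```

With real embeddings the geometry is higher-dimensional, but the mechanism is the same: distance in the space tracks meaning, not word overlap.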
I used BERTopic because it's built on sentence transformers and handles the embedding and clustering pipeline in one go. The hard part isn't the algorithm; it's the filtering, the label quality, and the visualization choices that turn raw clusters into something valuable.
Visualization 1: What People Actually Talk About
This treemap shows topic categories sized by interaction count. Each rectangle’s area is proportional to how many conversations fell into that theme.

Food & Drink is 23% of all conversations. People are rehearsing how to order dim sum, explain what they want at a restaurant, talk about cooking with family. Food is a central part of culture.
Language & Translation at 15% was surprising. A meaningful chunk of users treat my AI tutor like a dictionary, asking “how do I say X in Cantonese?” I intended for it to be a conversation partner, but some people use it as a lookup tool.
Greetings & Social at 15% told me the complete beginner segment is larger than I assumed. They practice new year wishes, self-introductions, and basic small talk. My product targets diaspora speakers, but beginners are finding value too. That’s worth leaning into.
Family & Relationships at 12% is where the emotional core lives: heritage speakers reconnecting with their roots. Someone practicing how to explain their job to relatives they’ve lost touch with. Someone learning what to say to their toddler. This is the segment that stays longest and engages most deeply.
Visualization 2: The Conversation Map
Next, I wanted to see how topics relate spatially, by visualizing which conversations sit next to which. This scatter plot does that: each dot is one conversation turn, and dots that land near each other are semantically similar messages. I color-coded them by category and tuned the projection until the clusters were informative.

You'll notice the colors don't separate cleanly. That's because some conversations cross category boundaries: a query like "how do I say fish in Cantonese?" sits at the intersection of Food and Translation.
That’s a useful reminder that product features shouldn’t be siloed either. Someone practicing food vocabulary is simultaneously doing translation, learning new words, and talking about culturally specific dishes.
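The mechanics of a conversation map are simple: project high-dimensional embeddings down to 2D and scatter them colored by category. BERTopic's default projection is UMAP; the sketch below swaps in PCA from scikit-learn to avoid an extra dependency, and uses random blobs as stand-ins for real per-topic embedding clusters.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
categories = ["Food & Drink", "Translation", "Greetings"]

# Three loose blobs in 384-d space, mimicking per-topic embedding clusters.
emb = np.vstack(
    [rng.normal(loc=i, scale=0.8, size=(50, 384)) for i in range(3)]
)

# Collapse 384 dimensions down to 2 for plotting.
coords = PCA(n_components=2).fit_transform(emb)

fig, ax = plt.subplots()
for i, name in enumerate(categories):
    pts = coords[i * 50:(i + 1) * 50]
    ax.scatter(pts[:, 0], pts[:, 1], s=8, label=name)
ax.legend()
fig.savefig("conversation_map.png", dpi=150)
```

On real data the blobs bleed into each other exactly as described above, and those overlaps are where the cross-category queries live.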
What I Learned about User Profiles and Product Usage
The analysis surfaced three distinct usage patterns that map to different learner profiles:
Diaspora — Heritage speakers bringing their actual lives into practice: family stories, hobbies (sports, piano, pets), cultural context. This is the segment I built for, and they’re the most engaged.
Utility seekers — People who want translation or pronunciation help: “how do I say X?”. I built a conversation partner, and this segment wants a lookup tool. The product should embrace it instead of pushing everyone toward open conversation.
Complete beginners — This is a larger segment than I originally assumed, practicing greetings, self-introductions, and basic small talk. They need a different pace; some told the AI to “slow down” even after toggling playback speed.
I originally designed for heritage speakers, but different segments have found value in different ways. None of this would have been obvious from gut feeling.
Topic Modeling Beats Surveys and NPS
If you’re building an AI product and sitting on conversation logs, you’re sitting on the most direct signal of user intent you’ll ever get.
I find that surveys and live interviews have a performative problem: people sometimes tell you what they think you want to hear. NPS scores compress everything into a single number and lose the nuance. What people actually type when they're trying to get something done suffers from neither problem: it's messy, but it's exactly the signal you need to shape a roadmap.
Running topic modeling on 2,800+ turns involved clustering by semantic similarity, visualizing the output, then iterating on labels and layout until it was legible. The model fit in under a minute; making the output readable and turning it into a few informative visualizations is where the work lives.
Most builders never look at that signal. If you're sitting on conversation logs, customer feedback, or support tickets, start there.
Next week's paid post is the exact pipeline I used: the algorithms, the parameter choices, the label-filtering system that took four rounds to get right, and copy-pasteable code. If you want to run this on your own data this weekend, that's where to go.
If you’ve tried topic modeling on your own data, what have you found?