How to Analyze a WhatsApp Chat: Export, Tools, and What the Data Shows

Berk Güneş · Jun 03, 2026 · 8 min read

Short answer: Export the chat from WhatsApp as a .txt file (Settings is the wrong place — the option lives inside the conversation), then either drop the file into a recap tool for results in a couple of minutes, or parse it yourself in Python with pandas. The no-code path is faster and survives the export quirks; the Python path is more flexible if you want to write your own questions.

I exported and parsed several of my own WhatsApp chats to write this: a two-person thread, a noisy 40-person group, and one chat full of voice notes and photos. The export step is nearly the same on iOS and Android, but what comes out the other side is messier than people expect. Three things quietly wreck most DIY scripts, and almost nobody warns you about them up front. Below is the exact path that worked, and the spots where it broke.

Get the export right first, or nothing downstream works

WhatsApp exports a single chat as a plain-text file (optionally with a media folder). The control is not in the app's main settings; it is inside the individual conversation. The idea is the same on both platforms: open the chat, open its menu (on iOS, tap the contact or group name at the top; on Android the option usually sits under the three-dot menu, then More), find the export option, then choose whether to include media. The official steps are documented in the WhatsApp Help Center under "Export chat history," and the menu wording differs slightly between iOS and Android, so check there if your menu looks different.

One decision matters more than it looks: "Without Media" vs "Include Media." For analysis, choose without media. You still keep every message — each photo or voice note simply becomes a placeholder line in the text — and you avoid exporting a folder of files you don't need for counting words, timestamps, and who-said-what. If you want the images too, that's a separate decision, not an analysis requirement.

Claim: The WhatsApp export is plain text, so any tool that reads text can analyze it.
Evidence: WhatsApp's own Help Center describes the export as a chat-history file you can email or save; opening it in a text editor shows each message starting on a new line with a timestamp prefix.
Limit: The line format is not standardized across regions and OS versions, which is exactly what trips up parsers.
Action: Open the .txt in any editor and look at the first ten lines before you choose a tool — the format you see decides the rest.
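
If you'd rather do that first look from a script than from an editor, here is a minimal sketch. The function name peek and the path chat.txt are placeholders I made up, not part of any tool. It prints each line with repr() so that any invisible characters in the timestamp prefix show up instead of hiding.

```python
from itertools import islice
from pathlib import Path


def peek(path, n=10):
    """Return the first n lines of the export without loading the whole file."""
    # utf-8-sig drops a byte-order mark if the file happens to start with one.
    with Path(path).open(encoding="utf-8-sig") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


if __name__ == "__main__":
    # "chat.txt" is a placeholder path; point it at your own export.
    if Path("chat.txt").exists():
        for line in peek("chat.txt"):
            print(repr(line))
```

Compare what it prints with the timestamp layouts your parser or tool expects before going further.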

Path A — the no-code route (the one most people actually want)

If you don't write code, the realistic goal is a readable recap: message counts per person, busiest hours and days, who starts conversations, most-used words and emoji, maybe a sentiment trend. A dedicated recap tool reads the exported file and renders these directly. In my test the slow part was not the analysis — it was finding the export button. Once the .txt was in hand, getting to a finished recap took a couple of minutes, not a coding afternoon.

This is the gap Wrapped AI is built to close. It takes the chat you exported from your own account, parses the timestamp and sender format for you, and turns it into a shareable "Wrapped"-style recap — without asking you to install Python or learn pandas. The honest framing: it is a recap and analysis tool, not a magic mind-reader. It counts and visualizes what is in the text; it doesn't infer things the messages don't contain.

Path B — the Python route (more control, more sharp edges)

If you do code, the open-source approach is well documented. The common stack is Python with pandas for the dataframe and a plotting library for charts; several public "whatsapp-chat-analysis" projects on GitHub publish notebooks that read the export, split each line into timestamp, sender, and message, and compute the same metrics. The pandas documentation covers the parsing and grouping you'll lean on: regex extraction with Series.str.extract, grouping with groupby, and datetime handling with to_datetime.

The flexibility is real: you can ask any question you can express in code. But the export format fights you. Here is where it broke in my runs, and how to handle each.

Troubleshooting box: the three things that break DIY parsers

  1. Timestamp format varies by region and OS. Some exports use [DD.MM.YY, HH:MM:SS], others MM/DD/YY, HH:MM with AM/PM, others a 24-hour clock with no brackets. A regex tuned to one device silently drops or misaligns lines from another. Detect the format from the first valid line instead of hardcoding it.
  2. System messages aren't real messages. "Messages and calls are end-to-end encrypted," "You added X," "Missed voice call," and "This message was deleted" all appear as timestamped lines, some with no sender at all and some with a sender but no real content. Count them as messages and your per-person totals are wrong. Filter them before you tally.
  3. Multi-line messages and media placeholders. A long message with line breaks spills across several text lines — only the first has a timestamp. Naive line-by-line parsing treats the continuation as a new (broken) message. And media becomes a placeholder like <Media omitted> or a localized equivalent, which you must decide to count or exclude. Append non-timestamped lines to the previous message; bucket placeholders separately.
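
The three fixes above can be sketched with nothing but the standard library. Treat this as an illustration under assumptions, not a complete parser: the LAYOUTS table covers only the three layouts named in the list, the marker strings are English examples, and every name here (LAYOUTS, MEDIA_MARKERS, DELETED_MARKERS, clean, parse, and so on) is made up for this sketch. The AM/PM layout also relies on strptime's %p, which depends on your locale.

```python
import re
from datetime import datetime

# (prefix regex, strptime format) per layout. Assumption: only the three
# layouts named in the list above; extend the table for your own export.
LAYOUTS = [
    (re.compile(r"^\[(\d{2}\.\d{2}\.\d{2}, \d{2}:\d{2}:\d{2})\] (.*)$"),
     "%d.%m.%y, %H:%M:%S"),
    (re.compile(r"^(\d{1,2}/\d{1,2}/\d{2}, \d{1,2}:\d{2} [AP]M) - (.*)$"),
     "%m/%d/%y, %I:%M %p"),
    (re.compile(r"^(\d{2}/\d{2}/\d{4}, \d{2}:\d{2}) - (.*)$"),
     "%d/%m/%Y, %H:%M"),
]

# Hypothetical English marker strings; localized exports use other wording.
MEDIA_MARKERS = ("<Media omitted>",)
DELETED_MARKERS = ("This message was deleted", "You deleted this message")


def clean(line):
    # Precaution: drop invisible marks and normalise non-breaking spaces.
    line = line.replace("\u200e", "").replace("\u202f", " ")
    return line.replace("\xa0", " ").rstrip("\n")


def try_layout(layout, line):
    """Return (datetime, rest of line) if the line fits the layout, else None."""
    pattern, fmt = layout
    match = pattern.match(line)
    if not match:
        return None
    try:
        return datetime.strptime(match.group(1), fmt), match.group(2)
    except ValueError:
        return None


def detect_layout(lines):
    # Gotcha 1: pick the layout from the first line that parses at all.
    for line in lines:
        for layout in LAYOUTS:
            if try_layout(layout, line):
                return layout
    raise ValueError("no known timestamp layout found")


def classify(text):
    if text in MEDIA_MARKERS:
        return "media"
    if text in DELETED_MARKERS:
        return "deleted"
    return "message"


def parse(raw_lines):
    lines = [clean(line) for line in raw_lines]
    layout = detect_layout(lines)
    records = []
    for line in lines:
        hit = try_layout(layout, line)
        if hit is None:
            # Gotcha 3: no timestamp, so this line continues the previous message.
            if records:
                records[-1]["text"] += "\n" + line
            continue
        when, rest = hit
        sender, sep, text = rest.partition(": ")
        if not sep:
            # Gotcha 2: no "Sender: " part, so treat it as a system line.
            records.append({"when": when, "sender": None, "text": rest, "kind": "system"})
            continue
        records.append({"when": when, "sender": sender, "text": text, "kind": classify(text)})
    return records
```

Tally only the records whose kind is "message" and the per-person totals stop drifting. Two limits worth knowing: a single line cannot tell MM/DD from DD/MM when both numbers are 12 or less, and a system line that happens to contain a colon followed by a space would be misread as a sender, so spot-check the output against the raw file.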

None of these are hard once you know they exist. They are just invisible until a chart looks wrong and you can't tell why. That's the actual difference between the two paths: the no-code tool absorbs these quirks for you, while the Python route hands you full control and the responsibility for getting them right.

What the data actually shows — and what it doesn't

The reliable outputs are the ones grounded in raw counts: total messages, messages per person, message length, active days, hour-of-day and day-of-week patterns, emoji frequency, and word frequency. These are arithmetic on timestamps and text, so they're trustworthy as long as the parsing above is clean.
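
To show how little arithmetic is involved, here is a small pandas sketch of those counts. The three rows are invented sample data standing in for the output of your own parser, and the column names when, sender, and text are my own choice, not a format WhatsApp or pandas defines.

```python
from collections import Counter
import re

import pandas as pd

# Invented sample rows; in practice these come from your own parsed export.
rows = [
    {"when": "2023-12-31 21:42", "sender": "Alice", "text": "Happy new year everyone"},
    {"when": "2023-12-31 21:43", "sender": "Bob", "text": "happy new year"},
    {"when": "2024-01-01 09:05", "sender": "Alice", "text": "Brunch later?"},
]
df = pd.DataFrame(rows)
df["when"] = pd.to_datetime(df["when"])

per_person = df.groupby("sender").size()                  # messages per person
by_hour = df.groupby(df["when"].dt.hour).size()           # hour-of-day pattern
by_weekday = df.groupby(df["when"].dt.day_name()).size()  # day-of-week pattern
avg_length = df["text"].str.len().groupby(df["sender"]).mean()  # characters per message
active_days = df["when"].dt.date.nunique()                # distinct calendar days

# Word frequency: lowercase, split on word characters, count.
words = Counter(
    word
    for text in df["text"]
    for word in re.findall(r"\w+", text.lower())
)

print(per_person)
print(words.most_common(3))
```

Every number here is a direct count or average over rows, which is why these outputs are only as good as the parsing that produced the rows.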

Sentiment is the part to treat carefully. Automated sentiment analysis assigns a positive/negative/neutral score to text, and the academic literature on text and sentiment analysis is candid about its limits: lexicon-based and even model-based scoring struggles with sarcasm, inside jokes, code-switching between languages, and emoji that flip a sentence's meaning. So read a sentiment trend as a rough mood signal, not a verdict on a relationship. It suggests a pattern; it does not prove one.
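
A toy example makes the sarcasm problem concrete. The six-word lexicon below is invented for illustration and is far smaller than anything a real tool would use; the point is only how a word-counting scorer behaves, not how any particular product scores text.

```python
import re

# Tiny made-up lexicon for illustration only; real lexicons are far larger.
LEXICON = {"great": 1, "love": 1, "thanks": 1, "bad": -1, "hate": -1, "late": -1}


def lexicon_score(text):
    """Sum word polarities: above 0 reads positive, below 0 negative, 0 neutral."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)


sincere = "Thanks, I love this place"
sarcastic = "Oh great, you're late again. Love that."

print(lexicon_score(sincere))    # 2
print(lexicon_score(sarcastic))  # 1
```

The sarcastic line still scores positive, because "great" and "love" outvote "late" even though a person would read it as annoyed. That is the kind of error to keep in mind when reading a mood trend.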

The privacy part, said plainly

A chat export contains other people's words, not just yours. Analyze your own conversations, and treat a group export as something you should only run with the awareness of the people in it, the same courtesy you'd want. Practically: prefer a tool that processes the file without uploading your messages to a server you don't control, delete the .txt when you're done, and don't post a recap of a private one-on-one chat without the other person's okay. WhatsApp's end-to-end encryption protects messages in transit; once you export, that protection is in your hands. None of this requires bypassing WhatsApp or reading anyone's private account. It works only on a file you legitimately exported yourself.

FAQ

How do I export a WhatsApp chat to analyze it?

Open the conversation, open its menu (tap the contact or group name at the top on iOS; on Android the option is usually under the three-dot menu, then More), choose "Export chat," and pick "Without Media" for analysis. WhatsApp then lets you email or save a .txt file in which each message starts on a new line with a timestamp. The exact menu wording differs slightly between iOS and Android; the WhatsApp Help Center lists the current steps for both.

Do I need to know Python to analyze my WhatsApp chat?

No. A recap tool reads the exported .txt and produces counts, charts, and word/emoji frequencies directly, usually in a couple of minutes. Python with pandas is worth it only if you want to write custom questions the tool doesn't answer — and you'll then own the parsing quirks yourself.

Why do my message counts look wrong in a DIY script?

Usually one of the parser gotchas above: a timestamp pattern that doesn't match your export and silently drops lines, system lines ("X joined," "Missed call," encryption notices) counted as messages, multi-line messages split into fake extra messages, or <Media omitted> placeholders inflating totals. Detect the timestamp format from the file itself, filter system messages, append non-timestamped lines to the previous message, and decide explicitly whether media placeholders count.

Can chat analysis tell how someone really feels?

Not reliably. Frequency, timing, and length are solid because they're just counts. Sentiment scoring is approximate — research on text sentiment analysis documents weak performance on sarcasm, mixed languages, and emoji. Read mood trends as directional, not as proof of anything about a relationship.

Is it private to analyze a WhatsApp export?

It depends on the tool. The export is a plain file once it leaves WhatsApp, so end-to-end encryption no longer applies. Choose a tool that doesn't ship your messages to a third-party server, analyze only chats you're part of, delete the file afterward, and don't publish a private chat's recap without consent.

What I'd do

If you just want the insights and a shareable recap, take the no-code path: export without media, drop the file into a recap tool, and you're done before a Python environment would finish installing. If you genuinely want to ask custom questions, go Python — but write the parser around those three gotchas first, because every chart you build sits on top of them. Either way, start by opening the raw .txt and reading the first ten lines. That one habit tells you which format you have and saves the most time. Wrapped AI is made by DynApps, the studio behind the parser, for people who want the recap without the regex.
