OpenAI Won't Give You Your Data (Not Really)

You can export your ChatGPT conversations. You just can't read them. Here's why OpenAI's data export is technically compliant but practically useless, and what I did about it.

I had 813 conversations in ChatGPT. Research sessions, architecture planning, code debugging, business strategy. Months of thinking captured in dialogue. When I decided to move my workflow to Claude, I wanted to take that knowledge with me. So I clicked the export button.

OpenAI sent me a zip file. Inside: hundreds of JSON files with names like unknown_Cerberus_Ops_architecture_planning_69348361.json. I opened one. Here’s what “your data” looks like:

{
  "mapping": {
    "aaa5e6f2-d1c8-4b3a-9f2e-847291c3d5a1": {
      "id": "aaa5e6f2-d1c8-4b3a-9f2e-847291c3d5a1",
      "parent": "bbb7f3a1-e2d9-4c4b-a03f-958302d4e6b2",
      "children": ["ccc8g4b2-f3e0-5d5c-b14g-069413e5f7c3"],
      "message": {
        "content": {
          "content_type": "text",
          "parts": ["Your actual message buried in here"]
        }
      }
    }
  }
}

That’s not a conversation. That’s a tree structure with UUID-based parent-child pointers. OpenAI stores conversations as directed graphs, not sequential text. Every message is a node in a tree, linked to its parent and children by 36-character identifiers. To read a single conversation, you’d need to find the root node, walk the tree through each child pointer, extract the parts array from the content object inside the message object at each node, and reassemble the whole thing in order.

For 813 conversations.
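The walk described above can be sketched in a few lines of Python. This is a hypothetical reconstruction, not the script itself: the field names match the export format shown above, and the choice to follow the last child at each node is my assumption about how regenerated branches work (the final child is the thread the user actually continued).

```python
def extract_thread(conversation):
    """Reconstruct the linear message thread from the mapping tree.

    Sketch only: finds the root node (no parent), then follows child
    pointers downward, collecting each node's text parts in order.
    """
    mapping = conversation["mapping"]
    # The root is the one node with no parent pointer.
    root_id = next(nid for nid, node in mapping.items()
                   if node.get("parent") is None)
    thread = []
    node_id = root_id
    while node_id is not None:
        node = mapping[node_id]
        msg = node.get("message")
        if msg:
            parts = msg.get("content", {}).get("parts", [])
            text = "".join(p for p in parts if isinstance(p, str))
            if text.strip():
                thread.append(text)
        children = node.get("children", [])
        # Assumption: on regenerated branches, the last child is the
        # path the conversation actually continued along.
        node_id = children[-1] if children else None
    return thread
```

Run that over every JSON file in the export and you have ordered message lists instead of pointer soup.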

Technically compliant, practically useless

OpenAI isn’t violating any data portability laws. GDPR says they have to give you your data, and they do. They just give it to you in a format that requires software engineering skills to read. For most people, “your data” is a zip file that sits in their Downloads folder forever.

This is a pattern. Companies satisfy the letter of data portability requirements while making the exported data as inconvenient as possible to actually use. The format isn’t wrong — it’s the internal representation of how they store conversations. But “we gave you our database schema” isn’t the same as “we gave you your conversations.”

What I actually wanted

I wanted to search my old conversations. I wanted to find that architecture discussion from November. I wanted to know which sessions were worth revisiting and which were throwaway questions. I wanted my conversations in a format that any text editor, any search tool, any note-taking app could work with.

Markdown. That’s it. One file per conversation, with the title and date at the top, clean You/ChatGPT turns, and a folder structure organized by month. Something I could drop into Obsidian, grep through in a terminal, or just open in Notepad.

So I built it

I wrote a Python script that walks OpenAI’s tree structure, extracts the conversation thread in order, and writes clean Markdown files. No dependencies beyond Python’s standard library. No API keys. No internet connection. Everything runs locally.

Point it at your export folder:

python chatgpt_to_markdown.py /path/to/your/chatgpt/export/

It generates an output folder organized by month, with a master index linking every conversation and a statistics file showing your most active months, longest conversations, and total word counts. My 813 conversations converted in under 30 seconds.
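The month grouping and statistics are straightforward once each conversation is flat text. Here is a minimal sketch of how that bookkeeping could work, assuming each conversation dict carries a `create_time` UNIX timestamp (as the export does) and a precomputed `text` field holding the full thread; the function name and the `text` field are my own, not taken from the script.

```python
from collections import Counter
from datetime import datetime, timezone

def monthly_stats(conversations):
    """Count conversations and total words per month.

    The "YYYY-MM" key doubles as the output folder name.
    """
    per_month = Counter()
    words_per_month = Counter()
    for conv in conversations:
        ts = datetime.fromtimestamp(conv["create_time"], tz=timezone.utc)
        month = ts.strftime("%Y-%m")  # e.g. "2024-11"
        per_month[month] += 1
        words_per_month[month] += len(conv["text"].split())
    return per_month, words_per_month
```

Sorting `words_per_month` gives you the most active months; sorting individual conversations by word count surfaces the longest sessions.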

The output looks like this:

# Homelab Docker Architecture

**Date:** 2025-11-15 14:32 UTC
**Messages:** 42
**Words:** 12,847

---

**You:**

I'm setting up Docker on my homelab server running Ubuntu 24.
I want to run Portainer, Uptime Kuma, and a reverse proxy...

**ChatGPT:**

Great setup. Here's how I'd approach this...

That’s a conversation I can read. I can search it. I can reference it six months from now without remembering which chat window it was in.

What the numbers looked like

Running the stats on my full export was revealing. 813 conversations. Over 2 million words. Some sessions ran 30,000+ words: deep architecture planning that would have been lost in an unusable JSON tree.

The monthly activity chart showed exactly when I was leaning on ChatGPT hardest: late 2024 when I was building out my e-commerce infrastructure, and early 2025 during a round of business planning. That usage pattern alone was worth seeing.

The data portability problem is bigger than ChatGPT

This isn’t just an OpenAI problem. Every AI platform, whether Claude, Gemini, or Copilot, stores your conversations in proprietary formats. Your thinking, your questions, and your problem-solving process are locked inside platforms that could change their terms, raise their prices, or shut down.

If you’ve spent months having substantive conversations with an AI, that conversation history is a knowledge asset. Treating it as disposable because it’s stored in an inconvenient format is accepting a loss you don’t have to accept.

Get the tool

I packaged the script as a simple download. Python 3.8+, zero dependencies, works on Windows, macOS, and Linux.

ChatGPT Export to Markdown Converter →

If you’ve been meaning to export your ChatGPT data and actually do something with it, this is the fastest path from “zip file in Downloads” to “searchable knowledge library.”

Your conversations are yours. You should be able to read them.

Adam Bishop

Veteran, entrepreneur, and independent researcher. Writing about formal methods, AI governance, production systems, and the operational discipline that connects them. Every project here demonstrates hard thinking on simple infrastructure.