What Is Ghidra? A Beginner’s Guide to Reverse Engineering and Firmware Analysis

Ghidra is one of the most powerful free reverse engineering tools available today. Originally developed by the NSA and released as open source in 2019, it allows security researchers to analyze binaries, firmware, malware, and embedded systems across architectures like x86, ARM, and MIPS.

In this guide, we explain what Ghidra is, how it works, and why it matters for vulnerability research, IoT firmware analysis, and modern binary exploitation.

:satellite_antenna: Ghidra Reverse Engineering Series Overview

Phase                    #  Title                                                 Status
:green_circle: Phase 1   1  What Is Ghidra and Why Should I Care?                 :white_check_mark: You are here
:green_circle: Phase 1   2  Installing Ghidra - Java, Download, First Launch      coming soon
:green_circle: Phase 1   3  The CodeBrowser - Every Window Explained              coming soon
:green_circle: Phase 1   4  Importing Your First Binary - What Ghidra Detects     coming soon
:green_circle: Phase 1   5  Finding Your Way Around - Strings, XREFs, Navigation  coming soon
:green_circle: Phase 1   6  Ghidra Projects - Saving, Exporting, Organizing       coming soon

:pushpin: Series Index | Next → Part 2: Installing Ghidra (coming soon)


What Is Ghidra and Why Should I Care?

(told by someone who found it intimidating at first)

There are a lot of tools in security that look complicated from the outside. Ghidra was one of those for me. I kept seeing it mentioned in CVE writeups, firmware analysis blogs, and malware reports, and I kept skipping past it because it felt like something I wasn’t ready for yet.

Then I actually opened it. And I realized: I wasn’t skipping it because it was hard. I was skipping it because nobody had explained the why before throwing me into the how.

This part fixes that. No binary, no screenshots of the tool yet. Just: what problem does Ghidra solve, and why should you care?


What Is a Binary and Why Does It Look Like Gibberish?

You already know that computers don’t understand English. You write code in a language like C, but that’s for you, not for the CPU. The compiler takes your human-readable code and translates it into something the CPU actually understands: raw bytes. Machine instructions. 1s and 0s.

That translation is one-way. The compiler throws away almost everything that was meant for humans: variable names, comments, structure, and (once the binary is stripped) most function names. What’s left is a binary file containing little more than CPU instructions and data in byte form.

Now open that binary in Notepad++.

You will see something that looks completely broken. Random characters, symbols, unreadable noise. That is not corruption. That is the program. The CPU reads those bytes perfectly. Your text editor is trying to display bytes as characters, but those bytes were never meant to be characters, so you get gibberish.
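To make that concrete, here is a minimal Python sketch. The eight bytes are a typical x86-64 function prologue and epilogue that I picked for illustration (an assumption, not taken from any particular program). The sketch shows the same bytes twice: once the way a hex editor shows them, and once the way a text editor tries to draw them.

```python
# Eight bytes of x86-64 machine code (illustrative sample, assumed for this sketch):
# push rbp; mov rbp, rsp; xor eax, eax; pop rbp; ret
code = bytes([0x55, 0x48, 0x89, 0xE5, 0x31, 0xC0, 0x5D, 0xC3])

def as_hex(data: bytes) -> str:
    """What a hex editor shows: every byte as two hex digits."""
    return " ".join(f"{b:02x}" for b in data)

def as_text(data: bytes) -> str:
    """What a text editor attempts: one character per byte (Latin-1 here)."""
    return data.decode("latin-1")

print(as_hex(code))         # 55 48 89 e5 31 c0 5d c3
print(repr(as_text(code)))  # a mix of letters, accents, and control characters
```

Same bytes, two renderings. Neither one is "broken"; the text view is just the wrong lens for machine code.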

The information is still there. Every loop, every function, every condition the programmer wrote is still in those bytes. Only the human-readable structure is gone. Ghidra’s job is to reconstruct that structure.

It looks at the bytes, recognizes patterns (“this sequence looks like a function entry”, “this looks like a loop”, “these bytes are a string”), and rebuilds something a human can navigate.

Not perfectly. Not the original source code. But close enough that you can understand what the program does.
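The “these bytes are a string” kind of pattern recognition is the easiest one to sketch. The Python below is a toy version of the idea (similar in spirit to the Unix `strings` utility), not how Ghidra implements it. The `blob` contents and the `find_strings` helper are made up for illustration.

```python
import re

def find_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

# Made-up blob: machine-code-like bytes wrapped around two embedded strings.
blob = b"\x55\x48\x89\xe5" + b"login failed\x00" + b"\xc3\x90\x00" + b"/bin/sh\x00"

print(find_strings(blob))  # ['login failed', '/bin/sh']
```

Note the minimum length: the bytes `\x55\x48` happen to be the printable characters "UH", and the length threshold is what keeps that kind of coincidence out of the results.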


What Is Reverse Engineering?

Imagine you have a samosa in front of you. No recipe. You’ve never made one before. But you want to know exactly what’s inside: the spices, the proportions, how it was put together.

So you eat it slowly. You taste each layer. You work backwards from the final product to figure out how it was made.

That is reverse engineering.

In security research, it looks like this: you have a binary. Maybe it’s router firmware, maybe a piece of malware, maybe a proprietary protocol implementation. No source code. You need to understand what it does.

So you load it into Ghidra. You pick a string, maybe an error message you’ve seen the device produce. You search for that string. You find the function that uses it. You trace who calls that function. You follow the data: where does your input go? What functions does it pass through? Is there anywhere along that path where something dangerous could happen?

You’re not recreating source code. You’re building understanding. Working backwards from the compiled result to the logic the programmer originally wrote.
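The string-to-function-to-callers walk described above can be sketched as a tiny graph search. Everything in this Python example is hypothetical: the function names, the string, and the call graph are invented, and in practice Ghidra builds these cross-references for you. The point is only to show the shape of the reasoning.

```python
from collections import deque

# Hypothetical findings from a firmware image (all names are made up):
# which function references a string, and which functions call which.
string_refs = {"Invalid hostname": "validate_host"}
calls = {
    "main": ["http_server"],
    "http_server": ["handle_ping", "handle_login"],
    "handle_ping": ["validate_host", "run_command"],
    "handle_login": ["check_password"],
}

def callers_of(target: str) -> list[str]:
    """Functions that call target directly."""
    return [caller for caller, callees in calls.items() if target in callees]

def trace_callers(start: str) -> list[str]:
    """Start function plus every function that can reach it, nearest first."""
    seen, queue = [start], deque([start])
    while queue:
        for caller in callers_of(queue.popleft()):
            if caller not in seen:
                seen.append(caller)
                queue.append(caller)
    return seen

func = string_refs["Invalid hostname"]
print(trace_callers(func))  # ['validate_host', 'handle_ping', 'http_server', 'main']
```

Read the result right to left and you have a path from the program’s entry point down to the code that handles your input.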

This is what vulnerability researchers do. This is what malware analysts do. This is what firmware security people do, including everything I’ve been building toward with the Mirai series, the fuzzing series, and the FirmX work.


What Is Ghidra — and Where Did It Come From?

Ghidra was built by the NSA, the United States National Security Agency. They used it internally for intelligence and security work for years. Nobody outside knew it existed.

Then in 2017, WikiLeaks released the Vault 7 documents, which included references to Ghidra as an NSA tool. The secret was out.

In 2019, the NSA released it as open source at the RSA Conference. Free. Publicly available. Anyone can download it.

Think about what that means. A tool that the NSA trusted for serious national-security-level binary analysis, and it costs nothing.

Here is what Ghidra gives you:

Binary file (.exe / .elf / .bin / firmware)
            |
            ▼
        [ Ghidra ]
            |
     ┌──────┴──────┐
     ▼             ▼
Disassembly    Decompiler
(Assembly)     (Pseudo-C)

Two views of the same binary. Two levels of readability. We’ll get to both in a moment.

Ghidra runs on Windows, Linux, and Mac. It supports almost every architecture you’ll encounter: x86, x64, ARM, MIPS, and more. It’s scriptable in Python and Java. And for IoT firmware research specifically, the ARM and MIPS support is genuinely excellent. Not just “free alternative” excellent. Actually excellent.


Disassembly vs Decompiler Output - What’s the Difference?

This confused me until I thought about it like this.

Imagine a Nepali movie. Someone gives you two English versions:

Version 1 - word-for-word literal translation. Every Nepali word converted directly to English. Grammatically strange, hard to read naturally, but 100% accurate to exactly what was said.

Version 2 - proper English dub. Flows naturally, easy to understand. But the translator made interpretation choices. Might not be word-for-word exact.

Disassembly is Version 1. Raw assembly instructions. Exactly what the CPU executes. Hard to read, but perfectly accurate. What’s in the binary is what you see.

Decompiler output is Version 2. Ghidra looks at the assembly and guesses what the original C code might have looked like. Easier to read. But Ghidra is making interpretations. Sometimes it gets it wrong. Sometimes it gets it very wrong.

In Ghidra, you see both side by side. Assembly on the left, pseudo-C on the right. The habit you need to build from the start: never trust the decompiler blindly. When something looks wrong in the pseudo-C, go look at the assembly. The assembly is always the truth.

We will come back to this in Part 10. For now just remember: decompiler is convenient, assembly is correct.
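If you want to feel the two levels before touching Ghidra, Python’s own `dis` module gives a small-scale analogy. To be clear about the assumption: this is Python bytecode, not machine code, and `check_pin` is a made-up function. It only illustrates how one readable line turns into several literal low-level instructions.

```python
import dis

def check_pin(pin):
    return pin == 1234

# "Version 1": the literal instructions the Python VM executes for check_pin.
# Exact instruction names vary between Python versions.
opnames = [ins.opname for ins in dis.Bytecode(check_pin)]
print(opnames)

# "Version 2" is the source line itself: return pin == 1234
```

The instruction list is exact but tedious to read. The source line is pleasant but is one step removed from what actually runs. Disassembly and decompiler output have the same relationship, with the difference that the decompiler has to guess its way back to the readable version.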


Static Analysis vs Dynamic Analysis - and Why You Need Both

If you’ve been following the OllyDbg series, you’ve been doing dynamic analysis - you opened the program, ran it, and watched what happened in real time. What went into registers. What was on the stack. How memory changed during execution.

Ghidra is the opposite. You never run the binary. You read it like a book. That is static analysis.

STATIC ANALYSIS                    DYNAMIC ANALYSIS
─────────────────                  ─────────────────
Tool: Ghidra, IDA Pro              Tool: OllyDbg, GDB, WinDbg
Binary: not running                Binary: running
See: full code structure           See: real values, real execution
Risk: safe — binary never runs     Risk: malware could execute
Works on: any architecture         Works on: must match your system
Limitation: no runtime values      Limitation: only one execution path

Neither one is complete on its own.

Ghidra gives you the map - the full picture, every function, every string, every possible path through the code. But it’s a static map. You’re making educated guesses about what values will actually be at runtime.

OllyDbg gives you the territory - real execution, real register values, real memory at that exact moment. But you only see what happens in that one run with those specific inputs.

Both go hand in hand. Use Ghidra to understand the structure. Use a debugger to confirm what actually happens. This is the workflow real researchers use.
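Here is the map-versus-territory difference in a few lines of Python. This is a sketch with a made-up `handle` function, using Python source as a stand-in for a binary: the static half reads the code without running it, the dynamic half runs it once with one input.

```python
import ast

source = '''
def handle(cmd):
    if cmd == "ping":
        return "run ping"
    elif cmd == "reboot":
        return "run reboot"
    return "unknown command"
'''

# Static view: read the code without running it and list every string it mentions.
tree = ast.parse(source)
static_strings = sorted(
    node.value for node in ast.walk(tree)
    if isinstance(node, ast.Constant) and isinstance(node.value, str)
)

# Dynamic view: run it once and observe only what this one input triggers.
namespace = {}
exec(source, namespace)
observed = namespace["handle"]("ping")

print(static_strings)  # ['ping', 'reboot', 'run ping', 'run reboot', 'unknown command']
print(observed)        # run ping
```

The static pass sees the reboot branch even though nothing ever triggered it. The dynamic run tells you for certain what happened, but only for the input "ping".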


Why Not Just Use AI?

I use AI. I use it constantly. Anyone telling you not to use AI in 2025 is giving you bad advice: the field moves too fast to ignore tools that make you faster.

But here’s the trap I’ve seen, and fallen into myself:

Paste disassembly into AI
        |
        ▼
Get explanation
        |
        ▼
Understand nothing personally
        |
        ▼
Paste next function
        |
        ▼
Same result
        |
        ▼
Never improve

AI only helps if you know enough to check it. If AI gives you a wrong explanation and you have no foundation to catch it, you follow the wrong path the whole way down.

If you build the skill first, then AI becomes a multiplier. You look at a function, form your own hypothesis, then use AI to verify or challenge it. You can push back. You can say “wait, that doesn’t match what I see in the assembly.” That’s the difference between someone who uses AI as a crutch and someone who uses it as a tool.

This series builds the foundation. Use AI to verify, not to replace thinking.


Ghidra vs IDA Pro vs Binary Ninja

Three tools. Same category. Different situations.

Tool          Cost              Strengths                                                                 Who Uses It
IDA Pro       $$$$ (thousands)  Industry standard, extremely powerful, best plugin ecosystem              Professional malware labs, large security firms
Binary Ninja  $$ (hundreds)     Excellent scripting API, popular in exploit dev                           CTF players, exploit developers
Ghidra        Free              NSA quality, excellent ARM/MIPS, Python/Java scripting, active community  Everyone who isn’t paid to use IDA

The concepts are the same across all three. If you learn Ghidra properly, the way we’re doing it in this series, switching to IDA Pro later is just an interface change. The thinking is identical.

For IoT firmware research, Ghidra is not just the free option. It is the right option. ARM and MIPS support is where Ghidra genuinely shines, and those are exactly the architectures you find in routers, embedded devices, and IoT hardware.


What You Will Be Able to Do After This Series

I want to be honest with you about something. I’m not writing this series because Ghidra is on some job requirement list. I’m writing it because IoTSec.in exists for one reason: someone should be able to land here at 2am, completely lost, and find a series that actually makes sense to them.

Every series on this forum (Mirai, ChipWhisperer, OllyDbg, NRF52840) exists because I needed it and it didn’t exist. This one is the same.

By Part 60, you will be able to:

  • Load any binary (ELF, PE, raw firmware) and navigate it without panicking
  • Read disassembly in x86, ARM, and MIPS
  • Understand what bare-metal embedded firmware does without source code
  • Find stack overflows, command injections, and hardcoded credentials in real router firmware
  • Script Ghidra to automate repetitive analysis tasks
  • Diff two firmware versions to find what a security patch actually fixed
  • Walk into any security role using IDA Pro or Binary Ninja and adapt in a day

And if an interviewer asks why you learned on Ghidra and not IDA Pro, you actually have a good answer. Ghidra is free, open source, supports almost every architecture you’ll encounter, and if you can use it well, you understand the concepts. That’s what they’re actually testing for.


What I Found Confusing (And Now Don’t)

1. “Ghidra” vs other similar tools: are they all the same thing?

I knew tools like jadx existed for Android decompilation and assumed Ghidra was basically the same category. It is, but the key difference I didn’t appreciate at first is architecture support. jadx is built for Android’s Java/Dalvik bytecode. Ghidra handles x86, ARM, MIPS, and dozens more. That’s what makes it powerful for IoT work specifically.

2. Disassembly vs decompiler output: I kept mixing these up

I thought “decompiler” meant the tool that converts binary back to readable code, and I assumed that output was accurate. The thing that made it click: disassembly is the Version 1 translation (literal, accurate, hard to read), decompiler output is Version 2 (natural, easy, but interpreted). The assembly is always the truth. The decompiler is a convenience. Never confuse the two.

3. Static analysis felt “incomplete” compared to dynamic

When I first understood static analysis, my instinct was: but you can’t see real values, so what’s the point? The map vs territory analogy fixed this. Static gives you the full map. Dynamic gives you one path through the territory. You need both. Neither replaces the other.


Hands-On Summary

No Ghidra this part. That comes in Part 2. But here’s what you should do right now to make Part 1 concrete:

  1. Find any .exe on your Windows machine. C:\Windows\system32\notepad.exe works perfectly
  2. Open it in Notepad++
  3. Look at what you see: that gibberish is the program. The CPU reads it perfectly.
  4. Remember this moment. Part 2 is where we make that readable.
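If you have Python handy, you can go one step further than Notepad++ and look at the very first bytes of the file. This is a small sketch, not a real format parser: the `identify` and `identify_file` helpers are my own names, and the check only looks at the well-known signatures at the start of PE files ("MZ") and ELF files ("\x7fELF").

```python
def identify(header: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    if header.startswith(b"MZ"):
        return "PE (Windows executable)"
    if header.startswith(b"\x7fELF"):
        return "ELF (Linux/embedded executable)"
    return "unknown (possibly raw firmware)"

def identify_file(path: str) -> str:
    """Read the first four bytes of a file on disk and classify them."""
    with open(path, "rb") as f:
        return identify(f.read(4))

print(identify(b"MZ\x90\x00"))          # PE (Windows executable)
print(identify(b"\x7fELF\x02\x01"))     # ELF (Linux/embedded executable)
print(identify(b"\x00\x00\xa0\xe1"))    # unknown (possibly raw firmware)
```

On Windows, try `identify_file(r"C:\Windows\system32\notepad.exe")`. Those two letters, "MZ", are also the first thing you see in the Notepad++ gibberish.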

What We Learned

Term                 What it means
Binary               A compiled program file: raw CPU instructions in byte form, not human-readable
Reverse engineering  Working backwards from a compiled binary to understand what it does, without source code
Static analysis      Analyzing a binary without running it (Ghidra’s approach)
Dynamic analysis     Running a binary and observing its behavior (OllyDbg’s approach)
Disassembly          The binary converted to assembly instructions: accurate, hard to read
Decompiler output    Ghidra’s guess at what the original C code looked like: readable, but interpreted
Ghidra               Free, open-source reverse engineering tool originally built by the NSA, released in 2019

Coming Up Next

Part 2 is where we stop talking about Ghidra and actually install it. Java dependency, download, first launch, first look at the interface. If anything goes wrong during setup, good. That’s content. We document it as it happens.

Enough theory. Let’s see what we’re working with.

→ Part 2: Installing Ghidra - Java, Download, First Launch (coming soon)


Resources