Knowledge Engine (ke-simple)

Why ke-simple

What makes it different

identity

Content is identity

Files are addressed by SHA-256 and stored once — duplicates collapse automatically. Names, buckets, and locations are just mutable metadata around the hash.
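As a rough illustration of the idea, here is a minimal in-memory sketch of content addressing. The `ContentStore` class and its fields are hypothetical and are not ke-simple's actual code; the point is only that bytes are keyed by their SHA-256 digest, while bucket and key names are mutable pointers to that digest.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ContentStore:
    """Toy content-addressed store (illustrative only, not ke-simple's code)."""
    blobs: dict = field(default_factory=dict)  # sha256 hex -> bytes, stored once
    names: dict = field(default_factory=dict)  # (bucket, key) -> sha256 hex

    def put(self, bucket: str, key: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)    # identical bytes collapse here
        self.names[(bucket, key)] = digest     # the name is only a pointer
        return digest

    def get_by_hash(self, digest: str) -> bytes:
        return self.blobs[digest]


store = ContentStore()
first = store.put("samples", "dropper.exe", b"MZ\x90\x00payload")
second = store.put("triage", "unknown_01.bin", b"MZ\x90\x00payload")
```

Both uploads return the same digest and only one copy of the bytes is kept, while the two names remain as separate metadata entries.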

operate

One service, one Postgres

No Celery, Meilisearch, ChromaDB, or Redis to run. Full-text and vector search live in Postgres; bytes live on disk. Stand it up with a single command.

analyze

Similarity across the corpus

Find functions and binaries that resemble a known sample — by microcode structure, flow-graph embeddings, shared strings, or FLIRT signatures.
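To make the shared-strings signal concrete, the sketch below ranks samples by Jaccard similarity over their string sets. Jaccard is an assumption chosen for illustration; the source does not say which metric ke-simple uses, and the sample names and strings are made up.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_shared_strings(known: set, corpus: dict) -> list:
    """Return (name, score) pairs, most similar first."""
    scores = [(name, jaccard(known, strings)) for name, strings in corpus.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical data: strings extracted from a known sample and two others.
known_sample = {"cmd.exe", "http://", "mutex_a"}
corpus = {
    "sample_1": {"cmd.exe", "http://", "mutex_a", "extra"},
    "sample_2": {"kernel32"},
}
ranking = rank_by_shared_strings(known_sample, corpus)
```

Structural signals such as microcode shape or flow-graph embeddings would need different features and distance functions; this only covers the simplest case.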

ingest

Git-native projection

Push binaries and IDBs to a hosted repo; a small .ke/actions.yml maps files into buckets with tags and processing — one repo can feed many buckets.
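The schema of `.ke/actions.yml` is not shown here, so the following is only a sketch of the projection idea under invented assumptions: each rule has a glob pattern, a target bucket, and tags, and a file is projected into every bucket whose rule matches. The `RULES` structure and the `project` helper are hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical rules, standing in for whatever .ke/actions.yml really contains.
RULES = [
    {"match": "samples/*.exe", "bucket": "malware", "tags": ["windows"]},
    {"match": "*.i64", "bucket": "idbs", "tags": ["idb"]},
    {"match": "samples/*", "bucket": "triage", "tags": []},
]


def project(path: str, rules: list = RULES) -> list:
    """Return (bucket, tags) for every rule whose pattern matches the path."""
    return [(r["bucket"], r["tags"]) for r in rules if fnmatch(path, r["match"])]


exe_targets = project("samples/a.exe")
idb_targets = project("db/a.exe.i64")
```

Because a path can match more than one rule, a single repository file can land in several buckets, which is the "one repo can feed many buckets" behavior described above.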

provenance

Links that self-resolve

An IDB records the source binary it was built from. The link is keyed by content hash, so it resolves whichever of the two is ingested first, with no back-fill step, and tags are inherited across the link.
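One way to picture order-independent resolution is to store the link as a plain hash reference and resolve it at query time. The `Corpus` class below is a hypothetical sketch, and the direction of tag inheritance (the IDB picks up the tags of its source binary) is an assumption made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Corpus:
    objects: dict = field(default_factory=dict)  # sha256 -> set of tags
    links: dict = field(default_factory=dict)    # IDB sha256 -> source sha256

    def ingest(self, sha256: str, tags=(), source_sha256=None) -> None:
        self.objects.setdefault(sha256, set()).update(tags)
        if source_sha256 is not None:
            # Record the hash even if the source has not been ingested yet.
            self.links[sha256] = source_sha256

    def source_of(self, idb_sha256: str):
        target = self.links.get(idb_sha256)
        return target if target in self.objects else None

    def effective_tags(self, sha256: str) -> set:
        tags = set(self.objects.get(sha256, ()))
        source = self.source_of(sha256)
        if source is not None:
            tags |= self.objects[source]  # assumed inheritance direction
        return tags


binary = hashlib.sha256(b"binary bytes").hexdigest()
idb = hashlib.sha256(b"idb bytes").hexdigest()

corpus = Corpus()
corpus.ingest(idb, tags=["analyzed"], source_sha256=binary)  # IDB arrives first
before = corpus.source_of(idb)                               # link is dangling
corpus.ingest(binary, tags=["family-x"])                     # binary arrives later
after = corpus.source_of(idb)                                # link now resolves
```

Nothing has to be rewritten when the binary shows up: the link was always a hash, and the lookup simply starts succeeding.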

automate

Hash-addressable API

Pipelines fetch any object's exact bytes by hash over plain HTTP — no UI, no bucket/key needed. The web UI and API share one origin.
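A pipeline client for this could look roughly like the sketch below. The base URL and the `/api/objects/<sha256>` route are placeholders, not the documented API; check the API reference for the real path. The useful property of hash addressing is that the client can verify the bytes it received against the hash it asked for.

```python
import hashlib
import urllib.request

BASE_URL = "http://localhost:8080"  # assumption: wherever the service is hosted


def object_url(sha256: str, base_url: str = BASE_URL) -> str:
    # Hypothetical route; the real path may differ.
    return f"{base_url}/api/objects/{sha256.lower()}"


def verify(data: bytes, sha256: str) -> bytes:
    """Raise ValueError unless the bytes hash to the requested digest."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != sha256.lower():
        raise ValueError(f"hash mismatch: expected {sha256}, got {actual}")
    return data


def fetch_by_hash(sha256: str, base_url: str = BASE_URL) -> bytes:
    """Download an object by hash and check its integrity."""
    with urllib.request.urlopen(object_url(sha256, base_url)) as response:
        return verify(response.read(), sha256)
```

Calling `fetch_by_hash(digest)` in a CI job or sandbox would return the exact bytes or fail loudly, with no bucket or key involved.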

Who it's for

Reverse-engineering teams that want a shared, searchable, de-duplicated corpus of binaries and IDBs instead of scattered folders and ad-hoc naming.
Malware analysts triaging large sample sets — cluster by similarity, tag by family, and pivot by hash, string, or function name.
Tooling engineers who need a hash-addressable HTTP corpus to wire into CI, sandboxes, or IDA plugins.

How it fits together

One front door, one datastore. Every ingest path funnels into a single storage service; a reactor turns changes into jobs; workers run extractors and import their results; operations answer queries.

CLI ingest / REST PUT / Git push ──► storage (sha256 + bucket/key + event)
        │
reactor ──► jobs ──► workers ──► extractors ──► plugin tables
        │
search · similarity · provenance
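The flow above can be sketched as a toy, single-process pipeline. Every name here (`store`, `reactor`, `worker`, the `strings` extractor, the event shape) is hypothetical; the real service persists this state in Postgres and runs workers separately, whereas this sketch uses in-memory queues.

```python
import hashlib
import re
from collections import deque

events = deque()                  # storage -> reactor
jobs = deque()                    # reactor -> workers
objects = {}                      # sha256 -> bytes
plugin_tables = {"strings": {}}   # extractor name -> {sha256: result}


def extract_strings(data: bytes) -> list:
    """Hypothetical extractor: printable ASCII runs of four or more bytes."""
    return [m.decode("ascii") for m in re.findall(rb"[ -~]{4,}", data)]


EXTRACTORS = {"strings": extract_strings}


def store(bucket: str, key: str, data: bytes) -> str:
    """Single front door: save the bytes and emit a change event."""
    digest = hashlib.sha256(data).hexdigest()
    objects[digest] = data
    events.append({"sha256": digest, "bucket": bucket, "key": key})
    return digest


def reactor() -> None:
    """Turn change events into jobs, skipping results that already exist."""
    while events:
        event = events.popleft()
        for name in EXTRACTORS:
            if event["sha256"] not in plugin_tables[name]:
                jobs.append((name, event["sha256"]))


def worker() -> None:
    """Run extractors and import their results into plugin tables."""
    while jobs:
        name, digest = jobs.popleft()
        plugin_tables[name][digest] = EXTRACTORS[name](objects[digest])


def search_string(needle: str) -> list:
    """Query operation: hashes of objects containing a matching string."""
    table = plugin_tables["strings"]
    return [d for d, found in table.items() if any(needle in s for s in found)]


digest = store("samples", "a.bin", b"\x00\x01hello world\x00ab\x00cmd.exe\x00")
reactor()
worker()
hits = search_string("cmd.exe")
```

Keeping the reactor between storage and the workers means every ingest path gets the same processing, since none of them talks to an extractor directly.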

Read the Concepts for the model, or jump to Use cases for concrete workflows.

Start here

Install in 5 minutes
Take the UI tour
API & CLI reference

Every binary, addressed by content.
Searchable, linked, similar-matched.

From `git push` to a searchable, matchable corpus
