GRAPHCTI
about // why, who, and what changes

Built because the analysis was always the analyst's problem

GraphCTI is a working prototype of an AI-native threat intelligence platform: a knowledge graph of assertions extracted from threat reporting, with a reasoning model as the interface. This page is the long-form version of the argument: where it came from, who it serves, and what actually changes if it works.

the why

Threat intelligence has a consumption problem, not a collection problem

The industry is excellent at producing intelligence: leak-site trackers, vendor reporting, government advisories, ATT&CK, vulnerability catalogs. The tooling built to hold all of it, the traditional Threat Intelligence Platform, is an enriched repository. Everything hangs off siloed report objects. The relationships that make intelligence intelligence (which actors share techniques, which campaigns answer which requirements, what your collection doesn't cover) exist only if an analyst manually constructs them, one pivot and one browser tab at a time.

The result is a quiet tax paid by every intel function: hours spent extracting relevance from PDFs, dashboards that answer last year's questions, requirements documents that live in SharePoint where no query can reach them, and leadership briefings assembled by copy-paste every Friday. The analysis, the actual job, happens in the analyst's head and nowhere else.

GraphCTI starts from a different premise. Extract what reports actually assert (claims about actors, campaigns, techniques, intent, and indicators) into a graph at ingestion. Make Priority Intelligence Requirements (PIRs) nodes in that same graph. Let an interpretation layer traverse it in plain language. The repository stops being the product. The reasoning over it becomes the product.
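To make the premise concrete, here is a minimal sketch of claims and requirements living in one graph. The class names, edge labels, and sample values are all hypothetical, chosen for illustration; they are not GraphCTI's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Node:
    """A graph node: an actor, technique, report, or requirement (PIR)."""
    kind: str  # e.g. "Actor", "Technique", "Report", "PIR" (assumed labels)
    name: str


@dataclass(frozen=True)
class Assertion:
    """One claim made by a report, stored as an edge with its provenance."""
    subject: Node
    predicate: str      # e.g. "USES", "ANSWERS" (assumed edge types)
    obj: Node
    source_report: str  # identifier of the report that made the claim


@dataclass
class Graph:
    assertions: list[Assertion] = field(default_factory=list)

    def add(self, subject, predicate, obj, source_report):
        self.assertions.append(Assertion(subject, predicate, obj, source_report))

    def about(self, node):
        """Every assertion that touches a node, in either direction."""
        return [a for a in self.assertions if node in (a.subject, a.obj)]


# Hypothetical example data, not drawn from any real report.
actor = Node("Actor", "ExampleGroup")
technique = Node("Technique", "T1566 Phishing")
pir = Node("PIR", "PIR-1: ransomware actors targeting our sector")
report = Node("Report", "report-001")

g = Graph()
g.add(actor, "USES", technique, source_report="report-001")
g.add(report, "ANSWERS", pir, source_report="report-001")

for a in g.about(actor):
    print(a.subject.name, a.predicate, a.obj.name, "from", a.source_report)
```

The point of the design is that the requirement is reachable by the same traversal as the claim, and every edge carries the report it came from.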

the impact

What changes, concretely

MONDAY MORNING · TRADITIONAL TIP

Leadership asks which ransomware actors matter to the company right now. The analyst opens the platform, runs a search, pages through report objects, opens six PDFs, extracts TTPs by hand, cross-references the requirements doc from memory, builds a spreadsheet, and writes the summary. By Wednesday, the deck is ready. The next question starts the process over.

MONDAY MORNING · GRAPHCTI

The same question is asked in a sentence. The interpretation layer (Claude, OpenAI, Gemini, etc.) matches it to the standing PIR, runs bounded Cypher queries, cross-checks observed victimology against community reporting, and returns the ranked answer with provenance, plus what the collection doesn't know, logged as a collection gap. The analyst spends the morning on judgment: so what, what next, what to brief.
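One way to read "bounded" is that every query the interpretation layer generates is read-only and row-capped before it reaches the database. The sketch below assumes that reading. The row ceiling, the keyword list, and the graph schema in the sample query (Actor, TARGETS, Sector) are illustrative assumptions, not the project's actual guard.

```python
import re

MAX_ROWS = 50  # assumed ceiling; the source does not state the real bound

# Clauses that modify the graph. This keyword check is deliberately crude and
# can misfire on string literals; a read-only database role is the safer guard.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|SET|REMOVE|DROP)\b", re.IGNORECASE
)


def bound_query(cypher, limit=MAX_ROWS):
    """Return the query with a row cap, or raise if it contains a write clause."""
    if WRITE_CLAUSES.search(cypher):
        raise ValueError("write clauses are not allowed in analyst queries")
    text = cypher.strip().rstrip(";")
    match = re.search(r"\bLIMIT\s+(\d+)\s*$", text, re.IGNORECASE)
    if match:
        # Keep a caller-supplied LIMIT only if it is tighter than the cap.
        limit = min(int(match.group(1)), limit)
        text = text[: match.start()].rstrip()
    return f"{text} LIMIT {limit}"


# Hypothetical query against an assumed schema.
QUERY = """
MATCH (a:Actor)-[:TARGETS]->(s:Sector {name: $sector})
RETURN a.name AS actor, count(*) AS victims
ORDER BY victims DESC
"""

print(bound_query(QUERY))
```

Capping rows keeps the model's context small and the answer reviewable, and refusing writes means a badly generated query can waste time but not damage the collection.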

"Which actors target organizations like ours, and how do they get in?"

Threat groups, initial access techniques, and exploited CVEs are extracted from threat reports during data ingestion and returned to analysts within seconds to minutes. The answer arrives with provenance.

"What changed in the landscape this month?"

Trend and novelty analysis across every report and every victim record in the collection, not the sample one analyst had time to read. New actors, new actor-technique pairings, and shifts in sector targeting surface automatically as new data is ingested into the knowledge graph.
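A new actor-technique pairing can be detected with a simple first-seen check. This is a sketch under the assumption that each observation reduces to an (actor, technique, date) tuple. The group names and dates are invented, and the technique IDs are ATT&CK identifiers used only as sample values.

```python
from datetime import date


def new_pairings(observations, since):
    """Return (actor, technique) pairs whose earliest sighting is on or after `since`."""
    first_seen = {}
    for actor, technique, seen_on in observations:
        pair = (actor, technique)
        if pair not in first_seen or seen_on < first_seen[pair]:
            first_seen[pair] = seen_on
    return sorted(pair for pair, seen in first_seen.items() if seen >= since)


# Hypothetical observations, not real reporting.
observations = [
    ("GroupA", "T1566", date(2024, 1, 10)),
    ("GroupA", "T1566", date(2024, 3, 2)),   # repeat sighting, not novel
    ("GroupA", "T1190", date(2024, 3, 5)),   # known actor, new technique
    ("GroupB", "T1078", date(2024, 3, 9)),   # new actor
]

novel = new_pairings(observations, since=date(2024, 3, 1))
print(novel)
```

A repeat sighting of an old pairing is filtered out because its first-seen date predates the window, which is what separates novelty from volume.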

"Are we collecting against what matters?"

PIR coverage is measurable. Requirements with heavy answering reporting and requirements with none look different in the graph, so known unknowns become collection tasks instead of footnotes.
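As a rough illustration of measuring coverage, the sketch below counts distinct answering reports per requirement and flags any PIR that falls under a threshold. The identifiers, the (PIR, report) edge list, and the threshold are hypothetical.

```python
from collections import defaultdict


def pir_coverage(pirs, answers):
    """Count distinct answering reports for every PIR, including those with none."""
    reports = defaultdict(set)
    for pir_id, report_id in answers:
        reports[pir_id].add(report_id)
    return {pir_id: len(reports[pir_id]) for pir_id in pirs}


def collection_gaps(coverage, minimum=1):
    """PIRs with fewer answering reports than `minimum` become collection tasks."""
    return sorted(pir_id for pir_id, n in coverage.items() if n < minimum)


# Hypothetical requirements and ANSWERS edges.
pirs = ["PIR-1", "PIR-2", "PIR-3"]
answers = [
    ("PIR-1", "report-001"),
    ("PIR-1", "report-002"),
    ("PIR-1", "report-002"),  # duplicate edge, counted once
    ("PIR-1", "report-003"),
    ("PIR-2", "report-004"),
]

coverage = pir_coverage(pirs, answers)
print(coverage)
print(collection_gaps(coverage))
```

Iterating over the list of PIRs rather than over the edges is what makes the silence visible: a requirement with no answering reports still appears in the result, with a count of zero.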

"What do we brief leadership?"

Grounded facts, labeled assessment, and explicit gaps, drafted from the graph in minutes. Instead of sifting through reports by hand, the analyst queries the intelligence tied to each PIR, and an afternoon's work becomes half an hour. The context window and reasoning of an LLM help surface trends that would otherwise be difficult to spot in a traditional TIP. Analysis becomes possible at the speed and scale of the whole collection.

known and unknown

An instrument for what you know, and for what you don't

Estimative tradecraft begins with an honest inventory of knowledge. Because requirements, reports, and claims share one graph, that inventory is queryable across all four quadrants:

Known knowns

Facts in the graph: span-grounded claims, victim records, technique mappings. Every figure traceable to a source. This is the only quadrant a traditional TIP ever shows you.

Known unknowns

PIRs with thin or no answering coverage. The graph measures the silence: heavy reporting against one requirement and nothing against another is an intelligence gap, surfaced automatically and turned into a collection task.

Unknown knowns

Connections you already collected but never made: two actors sharing a technique across reports nobody cross-read. Traversal finds what is latent in your own holdings.
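A latent connection of this kind can be found by grouping claims by technique and pairing the actors that share one. The sketch assumes each claim reduces to an (actor, technique, report) tuple. All names are invented.

```python
from collections import defaultdict
from itertools import combinations


def shared_techniques(claims):
    """Map each actor pair to the techniques both use, with the supporting reports."""
    by_technique = defaultdict(dict)  # technique -> actor -> set of reports
    for actor, technique, report in claims:
        by_technique[technique].setdefault(actor, set()).add(report)

    overlaps = defaultdict(dict)
    for technique, actors in by_technique.items():
        for a, b in combinations(sorted(actors), 2):
            overlaps[(a, b)][technique] = sorted(actors[a] | actors[b])
    return dict(overlaps)


# Hypothetical claims from reports that nobody cross-read.
claims = [
    ("GroupA", "T1566", "report-001"),
    ("GroupB", "T1566", "report-007"),
    ("GroupB", "T1078", "report-007"),
    ("GroupC", "T1190", "report-009"),
]

links = shared_techniques(claims)
print(links)
```

Each overlap keeps the reports behind it, so the analyst can check whether the shared technique is a meaningful link or a coincidence of common tradecraft.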

Unknown unknowns

Outside collection entirely. Broad ingestion and first-seen novelty detection shrink this quadrant, and honest reporting acknowledges it is never empty. Saying so in a briefing is tradecraft, not weakness.

who it's for

Analysts first, deliberately

CTI analysts are the first users because they can pressure-test an early system: they recognize a wrong attribution, a miscategorization, an overconfident claim. MSSP and MDR teams feel the pain most acutely (the same landscape questions, answered repeatedly across many clients), and intel-led enterprise teams are the natural second home.

Once the ingestion pipelines have matured, security leaders and IT teams without dedicated threat intelligence analysts should be able to consume real intelligence directly, through scoped briefings, sector landscape answers, and prioritized risk context, at a cost structure that traditional TIPs and intel service providers can't match. That expansion comes after the grounding has been hardened by expert use, not before. Sequence, not segment choice.

GraphCTI is not a replacement for premium threat intelligence such as that provided by CrowdStrike's Counter Adversary Operations, Google Threat Intelligence, Intel 471, or Recorded Future. Instead, GraphCTI would consume those datasets as it would any other data source, much as a traditional TIP does.

who's building it

A practitioner, in the open

GraphCTI is built by a practicing Senior Cyber Threat Intelligence Analyst defending a Fortune 500 enterprise, currently completing a graduate degree in cybersecurity. This was originally a personal project built on personal time, born from the daily experience of doing this job with tools that store intelligence but do little to help analyze it.

status

What's real today, and what isn't yet

Operational: automated ransomware victimology ingestion into a knowledge graph
Operational: natural-language analysis layer with validated, deduplicated query patterns
Operational: entity resolution and data-quality guards learned the hard way
Not built: report-to-assertion extraction at scale, the next milestone
Not built: automated dissemination, planned as augmentation: the graph drafts, the analyst owns the assessment
Not built: analytical methodologies drawn from CIA tradecraft to aid analysis
Not built: multi-tenant hosting, web application, anything resembling a product you can buy

The working prototype proves the architecture on one dataset (ransomware victim data); the roadmap extends it to the long-form reporting where campaigns and threat activity live.

the ask

Tell me the first question you'd ask it

If you're an analyst, an intel lead, or someone responsible for security decisions without an intel team: what would you ask this graph first? That single answer is the most useful thing you can offer a project at this stage.

Monthly landscape analyses generated from the live graph will be published here. Email signup is coming with launch.