ContentGraph

Documentation

Analysis logic, the concept and relationship model, the explanation framework, product assumptions, limitations, and open product risks.

1

How the tool works

ContentGraph maps the concept and relationship structure of explanatory content. The design is grounded in the same principle behind schema.org: that entities and the relationships between them are the primary unit of content signal, not prose alone. Where schema.org achieves this through structured markup generated for crawlers, ContentGraph achieves it through natural language — analyzing the relational structure already present in the writing, surfacing where it is thin, missing, or inconsistent, and producing actionable guidance for closing those gaps. The tool works in two phases.

Analysis pipeline
Phase 1: Content analysis (1 LLM call)
  Anchor concept · Observed concepts · Relationships · Question coverage
  → observed map

Phase 2: Framework generation (2 LLM calls)
  Explanation framework (optimal coverage model) · Writing guidance (editorial instructions)
  → framework + guidance
1.1

Why query fan-out matters

Modern LLM-driven retrieval systems rarely process a query as a single lookup. They decompose it into multiple sub-queries — a process called query fan-out — each targeting a different facet of the topic. A question like “how does DNS resolution work?” might fan out to sub-queries for “DNS definition,” “resolver role,” “recursive lookup process,” “TTL caching,” and more.
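
The decomposition can be sketched as a mapping from a user query to facet-level sub-queries. The sketch below is illustrative only; neither the mapping nor the sub-queries are drawn from any real system:

```python
# Illustrative sketch of query fan-out. No retrieval system publishes its
# decomposition behavior, so both this function and its data are invented.

def fan_out(query: str) -> list[str]:
    """Return hypothetical facet sub-queries for a user query."""
    facets = {
        "how does DNS resolution work?": [
            "DNS definition",
            "resolver role",
            "recursive lookup process",
            "TTL caching",
        ],
    }
    # A query with no decomposition entry falls through as a single lookup.
    return facets.get(query, [query])

sub_queries = fan_out("how does DNS resolution work?")
print(sub_queries)
```

Each sub-query then retrieves independently, which is why coverage is judged concept by concept rather than page by page.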

A piece of content does not succeed or fail as a whole in this model — it succeeds or fails concept by concept, as each sub-query attempts to match a specific facet of the topic. A page that covers five of eight expected concepts may answer five fan-out queries confidently and fail the rest, regardless of how well-written the covered sections are.

Query fan-out — concept-level retrieval

Fan-out sub-query → matched concept → retrieval outcome
  "What is it?" → anchor concept → well_integrated
  "How does it work?" → mechanism → underexplained
  "What does it depend on?" → prerequisite → weakly_integrated
  "What does it produce?" → (no matching concept) → absent
  "What does it compare to?" → (no matching concept) → absent

Each row is one sub-query in the fan-out. Absent concepts cannot satisfy their sub-query regardless of how well the rest of the page is written.

ContentGraph is built around this behavior. The anchor concept defines the primary query. The explanation framework models the expected fan-out space — the concepts a retrieval system is likely to sub-query for when a user asks about the anchor topic. Integration states predict how reliably each concept will satisfy its sub-query when found. The writing guidance closes the gaps.

1.1.1

What ContentGraph cannot know

No retrieval system documents how it decomposes queries. The sub-queries any given system generates are not published, not versioned, and not consistent across providers or over time. ContentGraph models an idealized fan-out space — a principled estimate of what sub-queries would be issued for the anchor topic — not an empirically observed one.

This means the framework is a structural target, not a guarantee. It reflects a well-reasoned theory of what an explanatory piece on this topic should cover. Whether any specific retrieval system fans out to exactly those concepts is unknown. The full limitations of this are documented in Section 7.

1.2

Phase 1: Content analysis

Phase 1 analyzes the content to produce an observed map — a graph of the concepts, relationships, and coverage gaps as they exist in the current text.

For each input, ContentGraph determines:

  • Anchor concept: the primary subject the content is explaining.
  • Observed concepts: all named entities, terms, processes, and ideas found in the content.
  • Relationships: connections between concepts, whether stated directly or inferred from context.
  • Question coverage: whether the content addresses eight standard diagnostic questions for the identified topic.
1.3

Phase 2: Framework generation

Once the observed map is complete, ContentGraph generates an explanation framework — a model of what concepts and relationships should be present for a complete explanation of the anchor topic — and translates the gap into writing guidance.

1.3.1

What Phase 2 is producing

For each concept in the framework, ContentGraph determines:

  • Is this concept already present in the content?
  • If present, is it adequately developed?
  • If absent or thin, how central is it to a complete explanation?
  • What relationships between concepts should be made explicit?
1.3.2

Why the framework matters

The observed graph shows what exists. The framework graph shows what should exist. The gap between them — not a score, but a structured comparison — is the main diagnostic output of ContentGraph. Phase 2 is where the tool transitions from description to direction.

1.4

Output format

ContentGraph produces three output layers:

  • the observed graph — a visual map of concepts as they appear in the content
  • the explanation framework — a visual model of optimal coverage for the anchor topic
  • writing guidance — actionable editorial instructions for closing the gap

These layers are complementary. The graphs communicate structure visually; the guidance communicates it editorially. Both are necessary to act on the analysis.

1.5

Model

All three pipeline calls use Claude Sonnet 4 (claude-sonnet-4-20250514) by Anthropic. Sonnet 4 is Anthropic’s capable general-purpose tier — chosen over faster, cheaper models because structured extraction, integration state judgment, and framework generation require strong reasoning and high instruction-following reliability. Haiku-class models produce noticeably less consistent results for these tasks.

Each analysis run costs three API calls against the user’s own Anthropic key: one for Phase 1 (content analysis) and two for Phase 2 (framework generation and writing guidance). Cost scales with content length. Typical analyses of a 1,500–3,000 word article run well within standard Sonnet pricing; very long documents cost proportionally more.

The explanation framework is generated from the model’s training knowledge of the anchor topic, not from a live index or curated ontology. This means the framework reflects the world as Claude understood it at its training cutoff. Topics that have evolved significantly since then — new protocols, renamed standards, emerging practices — may produce frameworks that are incomplete or subtly dated. Users working in fast-moving domains should treat the framework as an informed starting point and apply their own judgment to its coverage recommendations.

2

The observed graph

The observed graph is ContentGraph’s record of what a piece of content currently contains. It is built from two components: concepts — the named entities, terms, processes, and ideas identified in the content — and relationships between them. Schema.org encodes the same structural information as typed ontological properties in JSON-LD (schema:isPartOf, schema:produces); ContentGraph encodes it as natural language verb phrases — the predicate in a subject-verb-object triple — making the same semantic structure readable and actionable by writers rather than crawlers.
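
As a sketch, here is the same relationship encoded both ways. The JSON-LD property choice is illustrative; consult the schema.org vocabulary for the canonical properties:

```python
import json

# One relationship, two encodings.
# 1. Structured markup for crawlers (illustrative JSON-LD; property choice
#    is an assumption, not verified schema.org usage):
json_ld = {
    "@context": "https://schema.org",
    "@type": "Article",
    "about": {"@type": "Thing", "name": "HTTPS"},
    "mentions": {"@type": "Thing", "name": "TLS handshake"},
}

# 2. ContentGraph's form: a natural-language subject-verb-object triple,
#    where the verb phrase is the predicate.
subject, verb, obj = ("HTTPS", "uses", "TLS handshake")
sentence = f"{subject} {verb} the {obj}."

print(json.dumps(json_ld, indent=2))
print(sentence)
```

The triple carries the same semantic structure as the markup, but it lives in the prose itself, where a writer can read and revise it.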

2.1

The anchor concept

The anchor concept is the primary subject the content is explaining. It is the conceptual root of the observed graph: all other concepts are placed in relation to it.

Anchor identification is the first and most consequential judgment ContentGraph makes. If the anchor is wrong, the observed graph may still be internally consistent, but it will not reflect the content’s actual purpose, and the framework generated in Phase 2 will target the wrong topic.

2.1.1

What this means for users

The anchor is surfaced prominently in both the observed graph and the content findings panel. Users should verify it before interpreting integration states or acting on framework recommendations. It is the single most important thing to check after Phase 1 completes.

2.2

Integration states

Each concept in the observed graph is assigned one of four integration states describing how well it is developed and connected within the content.

Integration states
well_integrated
Present, defined, and meaningfully connected to other concepts through explicit relationships.
weakly_integrated
Present in the content but with few or thin connections to other concepts. Mentioned but not woven in.
underexplained
Appears in the content but is not given adequate definition or development for the identified topic.
naming_inconsistent
Referred to by multiple different names or phrasings without normalization, which can confuse extraction.
Example: extracted graph for an HTTPS article

Nodes: HTTPS · TLS handshake · Certificate · Cert. Authority · Public-key crypto · Session key
Edge labels: uses · requires · relies on · validates · issued by · generates

Hovering the Cert. Authority node (state: underexplained) surfaces:

  Role in explanation: validates authenticity of the server’s public key via certificate signing.
  Frequency: mentioned in content
  Also called: "CA", "certificate issuer"
  Relationships: Certificate → issued by → Cert. Authority

Legend: Well integrated · Weakly integrated · Underexplained · Naming inconsistent · Implied

Each node is sized by mention count and colored by integration state. Hovering a node surfaces its role, naming variants, and extracted relationships. Cert. Authority is marked underexplained — present in the content but never given enough context for a retrieval system to use it confidently.
2.2.1

Interpretation

Integration state is not a frequency count. A concept mentioned many times but never defined or connected to others may still be assigned underexplained. A concept mentioned once but given a clear definition and an explicit relationship to the anchor may be well_integrated.

Read through a fan-out lens, the states have a direct retrieval prediction. well_integrated concepts are likely to satisfy their sub-query. weakly_integrated concepts will be found but may return a passage too thin to answer the sub-query confidently. underexplained concepts will be found but fail to answer because they lack definition or context. naming_inconsistent concepts may not be found at all — if the retrieval system fans out using a name variant the content does not use, the passage will not match.

The most important states to act on are underexplained and naming_inconsistent. These are the primary drivers of the toClarify category in writing guidance.

2.3

Explanatory role

Each concept is assigned an explanatory role — a label describing the conceptual function it serves in the explanation. Common roles include mechanism, prerequisite, outcome, context, component, and contrast.

Roles are assigned by the model based on how the concept is used in the content. They are not a fixed taxonomy — two analyses of the same content may produce slightly different role labels while accurately capturing the same underlying function.

2.3.1

What this means for users

Roles are most useful as a reading aid, not a diagnostic signal. They help users understand why a concept appears in the framework and what explanatory job it is expected to do.

2.4

Explicit vs implied relationships

Relationships are classified as either explicit or implied:

Explicit
The relationship is stated directly in the text as a subject-verb-object structure. Rendered as a solid line in the observed graph.
Implied
The relationship is inferred by the model from context, co-occurrence, or proximity — not stated directly. Rendered as a dashed line in the observed graph.
2.4.1

What this means for users

Implied relationships represent the model’s inference, not the content’s statement. A high proportion of implied relationships may indicate that the content discusses concepts without connecting them explicitly — which is precisely the failure mode that the toMakeExplicit writing guidance category addresses.

2.5

SVO extraction

For each relationship, ContentGraph identifies a subject-verb-object triple from the text where possible. These triples are displayed in the content findings panel and serve as evidence for each relationship claim.

SVO structures are the most reliably extractable unit of propositional content. A relationship that cannot be reduced to an SVO triple is harder to verify and harder for a downstream system to interpret or reuse.
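
A toy illustration of why an SVO triple is a checkable unit of evidence. Real extraction is done by the LLM, not by pattern matching like this:

```python
import re

# Toy pattern: "<subject> <verb> <object>" for a few hard-coded verbs.
# This is NOT how ContentGraph extracts relationships; it only shows why
# a triple is easy to verify against the source sentence.
VERBS = ("uses", "requires", "validates", "generates", "issues")

def to_svo(sentence: str):
    pattern = rf"^(.+?)\s+({'|'.join(VERBS)})\s+(.+?)\.?$"
    m = re.match(pattern, sentence.strip())
    return (m.group(1), m.group(2), m.group(3)) if m else None

print(to_svo("The TLS handshake generates a session key"))
# ('The TLS handshake', 'generates', 'a session key')
print(to_svo("Certificates and keys co-occur in one paragraph"))
# None: co-occurrence without a connecting predicate yields no triple
```

The second case is exactly the failure pattern listed below: concepts that share a paragraph but are never syntactically connected produce no verifiable triple.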

2.5.1

Typical failure patterns

  • Concepts that co-occur in the same paragraph but are never syntactically connected
  • Relationships expressed through lists or tables rather than prose
  • Causal claims asserted without a connecting predicate
  • Ambiguous agency, where it is unclear which concept acts and which is acted on
2.6

Directionality

Relationships in ContentGraph are directional. “A depends on B” and “B depends on A” are not the same relationship, and the graph renders them with directional arrows accordingly.

When source and target are ambiguous in the text, the model infers direction. This inference may be wrong, particularly for symmetric, reciprocal, or loosely phrased relationships. Users who notice a directional error should read the relationship label alongside the graph edge rather than relying on arrow direction alone.

3

The explanation framework

The explanation framework is ContentGraph’s model of what concepts and relationships should be present for a complete explanation of the anchor topic. It is generated by a separate LLM call that operates independently of the observed map.

3.1

What the framework represents

The framework models the expected fan-out space for the anchor topic — the concepts and relationships a retrieval system is likely to generate sub-queries for when a user asks about the anchor.

It is most useful as a comparison layer. When rendered alongside the observed graph, it reveals which concepts are well covered, which are thin, and which are missing entirely. The writing guidance in Phase 2 is generated directly from this comparison — closing the gap between what the content covers and what the fan-out space expects.

Extracted Graph (left)
  Nodes: HTTPS · TLS handshake · Certificate · Cert. Authority · Public-key crypto · Session key
  Edge labels: uses · requires · relies on · validates · issued by · generates
  Legend: Well integrated · Weakly integrated · Underexplained · Naming inconsistent · Implied

Proposed Graph (right)
  Nodes: HTTPS · TLS handshake · Certificate · Cert. Authority · Public-key crypto · Session key · HSTS · Forward secrecy
  Edge labels: uses · requires · relies on · enforces · creates · issued by
  Legend: In content · Inferred · Optional · Implied
The Extracted Graph (left) shows concepts as they appear in the content, colored by integration state. The Proposed Graph (right) shows the expected fan-out space for the anchor topic, colored by basis — amber and violet nodes are absent from or thin in the current content.
3.1.1

What this means for users

The framework reflects the model’s trained knowledge of the topic, not a curated ontology or a domain expert’s judgment. For niche, technical, or audience-specific topics, it may suggest concepts that are correct in general but wrong for the specific context.

Users should treat the framework as a starting point and editorial reference — not a specification that must be satisfied.

3.2

Basis categories

Each concept in the framework is assigned a basis describing where its inclusion comes from.

Framework basis categories
observed_in_text
The concept was found in the content. The framework confirms it belongs and assesses how well it is covered.
strong_topic_inference
The concept is strongly expected for the anchor topic but is missing or thin in the current content.
optional_enrichment
The concept would add nuance or depth but is not essential to a complete explanation of the anchor topic.
3.2.1

Interpretation

Basis categories are a signal of source, not importance. An optional_enrichment concept may still be highly valuable to cover depending on the audience; a strong_topic_inference concept may turn out to be less relevant than assumed for the specific angle of the content.

Basis is best read alongside priority and the user’s own judgment about the content’s purpose and audience.

3.3

Priority levels

Each framework concept is assigned a priority reflecting its importance for a complete explanation of the anchor topic.

essential
Must be present and adequately developed for the explanation to be complete. Absence is a significant gap.
important
Significantly improves coverage and coherence. Absence weakens the explanation but does not make it incomplete.
useful
Adds nuance, depth, or supporting context. Appropriate for comprehensive treatments of the topic.

Priority reflects a concept’s expected centrality in the fan-out distribution for the anchor topic. An essential concept will appear in nearly every retrieval path for this topic — its absence is a broad gap, not a narrow one. A useful concept appears in a narrower range of sub-queries. Priority is not a directive; a concept may be marked essential by the framework but be intentionally out of scope for a specific piece of content.

3.4

Coverage status

Each framework concept is also assigned a coverage status indicating how well it is addressed in the observed content.

yes
The concept is present and adequately developed in the observed content.
partial
The concept is present but under-developed — present enough to be identified, not developed enough to be well-integrated.
no
The concept is absent from the observed content. If the concept is also marked essential or important, it will appear in the toAdd guidance category.
4

Writing guidance

Writing guidance is the actionable output of Phase 2. It translates the gap between the observed map and the explanation framework into four categories of specific editorial instructions.

Writing Guidance — "What to add"

Concepts missing or underdeveloped in your current content.

Instruction: Add a section explaining HTTP Strict Transport Security. HSTS headers instruct browsers to use HTTPS exclusively for the specified max-age period, preventing protocol downgrade attacks.
Example phrasing: HTTP Strict Transport Security (HSTS) is a browser directive that refuses all subsequent HTTP requests for the specified max-age duration.

Instruction: Add forward secrecy as a property of modern TLS configurations. It ensures session keys are not recoverable even if the server's private key is later compromised.
Example phrasing: Forward secrecy is achieved through ephemeral key exchange — each session generates a fresh key pair that is discarded after use.

The panel can be exported as Markdown (Export .md).
Writing guidance translates the gap between the Extracted and Proposed graphs into a two-column table: what to write, and an example sentence to seed it. Each toAdd item names a concept from the framework that is absent or undercovered.
4.1

Concepts to add

toAdd: absent or partial framework concepts

Identifies framework concepts with coverage status no or partial and priority essential or important. For each, the guidance describes the concept, explains why it matters for the anchor topic, and provides an example sentence for introducing it.

4.2

Concepts to clarify

toClarify: underexplained or naming_inconsistent observed concepts

Identifies concepts present in the observed map with integration state underexplained or naming_inconsistent. For each, the guidance describes the integration problem and suggests an approach to resolution.

These items target concepts the content is already trying to discuss but failing to handle clearly. Improving them typically requires expanding a definition, adding a relationship, or standardizing terminology.

4.3

Relationships to make explicit

toMakeExplicit: implied observed relationships

Identifies implied relationships in the observed map that should be stated directly. For each, the guidance names the source and target concepts, describes the relationship, and provides an example sentence that states it explicitly.

4.4

Sentence-level guidance

sentenceGuidance: concept-anchored writing directives

Sentence-level guidance ties specific editorial instructions to one or more concepts. Unlike the above categories, which are concept-centric, sentence guidance is phrased as a writing directive that can be applied locally to relevant passages.

4.4.1

How to use sentence guidance

This is the category most dependent on the model’s interpretation of the content’s intent. Read it as a suggestion, not a prescription, and verify that each directive makes sense for the specific audience and purpose.

5

Output structure

ContentGraph streams results as newline-delimited JSON (NDJSON). Each event arrives as the corresponding analysis stage completes. The full structured output is available only after Phase 2 finishes.

Output structure
NDJSON stream
├── extracted_content (Phase 1)
│   └── sentences[]
├── observed_analysis (Phase 1)
│   ├── context
│   │   └── anchorType · primaryAnchor · inferredTopic · inferredAudience · inferredGoal
│   ├── observedMap
│   │   ├── concepts ×n
│   │   │   └── id · label · explanatoryRole · integrationState · mentionCount · isAnchor · namingVariants · evidence · definitionSentence
│   │   └── relationships ×n
│   │       └── id · source · target · label · isExplicit
│   └── questionCoverage
│       └── whatIsIt · howDoesItWork · whatDoesItDependOn · whatDoesItProduce · whenDoesItApply · whatContrastsWithIt · whatGoesWrong · whatEvidence
├── explanation_framework (Phase 2)
│   ├── concepts ×n
│   │   └── id · label · isAnchor · explanatoryRole · whyItMatters · priority · basis · alreadyPresent · relatedConcepts
│   └── relationships ×n
│       └── id · source · target · label · basis
└── writing_guidance (Phase 2)
    └── summary · toAdd[] · toClarify[] · toMakeExplicit[] · sentenceGuidance[]
5.1

Phase 1 events

extracted_content: the numbered sentences extracted from the input after HTML parsing or text splitting.
observed_analysis: the content context, observed concept map, relationships, and question coverage.

The observed_analysis event contains the full data powering the observed graph and the content findings panel. It arrives as a single event once the Phase 1 LLM call completes.

5.2

Phase 2 events

explanation_framework: the optimal framework concepts and relationships generated from Phase 1 output.
writing_guidance: the four guidance categories with all items and examples.

These two events arrive sequentially. The framework graph renders when explanation_framework arrives. The writing guidance panel renders when writing_guidance arrives.

5.3

Concept fields

For each observed concept:

id: slug-normalized identifier derived from the concept label.
label: the canonical name for the concept.
explanatoryRole: the conceptual function the concept serves in the explanation.
integrationState: one of well_integrated, weakly_integrated, underexplained, naming_inconsistent.
mentionCount: number of times the concept appears in the extracted sentences.
isAnchor: true for the anchor concept only.
namingVariants: other names or phrasings used to refer to this concept in the content.
evidence: sentence indices from the extraction where this concept appears.
definitionSentence: the sentence that most clearly defines the concept, if one exists.

For each framework concept, the fields differ: priority, basis, alreadyPresent (yes / partial / no), whyItMatters, and relatedConcepts replace the observed-map-specific fields.

5.4

Relationship fields

For each observed relationship:

id: unique identifier for the relationship.
source: concept ID of the source node.
target: concept ID of the target node.
label: short description of the relationship type.
isExplicit: true if stated directly in the text; false if inferred.

Framework relationships carry the same source, target, and label fields, with basis replacing isExplicit.

5.5

Why the output is structured this way

The NDJSON streaming model serves two goals: progressive rendering and structured access. The observed graph begins rendering as soon as Phase 1 completes, without waiting for Phase 2. The structured fields support both graph rendering and writing guidance generation from the same data.

6

Product assumptions

ContentGraph relies on a number of assumptions. These assumptions are not incidental; they shape what the analysis means and where it can mislead users.

6.1

The model can identify the correct anchor concept

ContentGraph assumes the model can determine the primary subject of the content from the text alone. This assumption is necessary because the anchor defines the reference frame for all subsequent analysis — the observed graph, the framework, and the writing guidance are all organized around it.

6.1.1

What this means for users

If the content is multi-topic, weakly signposted, or begins with an extended preamble, the anchor may be identified incorrectly. All downstream analysis may still appear internally coherent but will be directed at the wrong subject. Users should check the anchor after Phase 1 before proceeding.

6.2

Sentence-level extraction captures meaningful relationships

ContentGraph assumes that breaking content into numbered sentences and presenting them to the model preserves enough structure for accurate concept and relationship extraction. HTML formatting, tables, captions, and visual layout are stripped before analysis.

6.2.1

What this means for users

Content whose meaning depends on layout, visual hierarchy, or tabular structure may be under-analyzed. Relationships expressed through comparison tables, specification lists, or visual sequencing may not be captured in the observed graph.

6.3

The explanation framework represents a useful editorial target

ContentGraph assumes the model's generated framework is a meaningful editorial goal — not an arbitrary list of concepts, but a principled model of what a complete explanation would cover for the identified anchor topic.

6.3.1

What this means for users

The framework is generated from model training knowledge, not from a curated ontology or domain expert. For niche, proprietary, or audience-specific topics, the framework may recommend concepts that are correct in general but wrong for the specific context. The framework is a starting point, not a specification.

6.4

The 8 diagnostic questions cover the relevant explanation dimensions

ContentGraph assesses coverage against eight questions: whether the content explains what the anchor is, how it works, what it depends on, what it produces, when it applies, what it contrasts with, what commonly goes wrong, and what evidence supports it.

6.4.1

What this means for users

These questions are well-suited to explanatory content. They are less appropriate for narrative, procedural, or persuasive content where the expected semantic structure is different. Users applying the tool to non-explanatory content should treat question coverage results with lower confidence.

6.5

Named concept variants can be normalized reliably

ContentGraph assumes the model can identify when different names or phrasings refer to the same concept and normalize them under a single label. The system also re-derives concept IDs from labels rather than trusting LLM-generated slugs, to reduce inconsistency.
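
A plausible sketch of label-to-ID normalization. ContentGraph's actual slug rules are not documented, so treat every detail here as an assumption:

```python
import re
import unicodedata

def slugify(label: str) -> str:
    """Derive a stable concept ID from a label: lowercase, ASCII, hyphenated.
    Hypothetical rules; ContentGraph's real normalization is not published."""
    value = unicodedata.normalize("NFKD", label).encode("ascii", "ignore").decode()
    value = re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")
    return value

print(slugify("Cert. Authority"))    # "cert-authority"
print(slugify("Public-key crypto"))  # "public-key-crypto"
```

Re-deriving IDs this way means two LLM responses that label the same concept identically always produce the same node ID, even if the model's own slugs differ.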

6.5.1

What this means for users

Inconsistent naming in the content can still produce duplicate or split nodes in the observed graph, particularly when naming variants are highly dissimilar. Users should verify that concepts they know to be synonymous appear as single nodes in the observed graph.

6.6

The selected model can judge consistently enough for analysis

The analysis is configured around claude-sonnet-4-20250514. ContentGraph assumes the model can make stable judgments about anchor identification, concept extraction, integration state assignment, framework generation, and guidance production.

6.6.1

What this means for users

Analysis is model-mediated, not rule-based. The same content may produce slightly different results across runs or model versions. Users should not expect exact repeatability, but should expect directional consistency for the same input.

7

Limitations

ContentGraph’s limitations are stated explicitly here because many users will otherwise over-interpret or over-apply the results.

7.1

HTML extraction requires semantic structure

ContentGraph looks for main or article elements when parsing HTML input. If neither is present, it falls back to all p tags. Navigation, headers, footers, scripts, styles, and sidebars are stripped.
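
ContentGraph's parser isn't published; a stdlib sketch of the documented behavior (prefer main/article content, else fall back to all p tags) might look like this. Stripping of nav, script, and style elements is not modeled here:

```python
from html.parser import HTMLParser

class BodyExtractor(HTMLParser):
    """Collect text inside <main>/<article>, plus <p> text as a fallback."""
    def __init__(self):
        super().__init__()
        self.depth = 0    # nesting inside <main> or <article>
        self.in_p = 0     # nesting inside <p>
        self.semantic: list[str] = []
        self.paragraphs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("main", "article"):
            self.depth += 1
        elif tag == "p":
            self.in_p += 1

    def handle_endtag(self, tag):
        if tag in ("main", "article"):
            self.depth -= 1
        elif tag == "p":
            self.in_p -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.semantic.append(data)
        elif self.in_p > 0:
            self.paragraphs.append(data)

def extract_body(html: str) -> str:
    parser = BodyExtractor()
    parser.feed(html)
    # Semantic containers win; <p> text is only the fallback.
    parts = parser.semantic or parser.paragraphs
    return " ".join(s.strip() for s in parts if s.strip())

print(extract_body("<nav>menu</nav><article><p>HTTPS uses TLS.</p></article>"))
```

Note how the nav text is dropped only because it sits outside both a semantic container and a p tag; this is the fragility the next subsection describes.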

7.1.1

User impact

Pages that use non-semantic HTML, render content via JavaScript, or present body copy inside divs without semantic roles may produce incomplete or inaccurate extractions. The model will analyze only what the parser can reach.

7.2

Sentence splitting is heuristic

ContentGraph splits content into sentences by detecting sentence-ending punctuation followed by a space and an uppercase letter, with a minimum sentence length of 10 characters.
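
The documented heuristic can be sketched with one regex split. The exact implementation is an assumption; only the rule (punctuation, then a space, then an uppercase letter, minimum 10 characters) comes from the documentation:

```python
import re

def split_sentences(text: str, min_len: int = 10) -> list[str]:
    """Split on . ! ? followed by whitespace and an uppercase letter;
    drop fragments shorter than min_len characters."""
    parts = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())
    return [p.strip() for p in parts if len(p.strip()) >= min_len]

text = "HTTPS uses TLS. Dr. Smith disagrees. The session key is ephemeral."
print(split_sentences(text))
# ['HTTPS uses TLS.', 'Smith disagrees.', 'The session key is ephemeral.']
```

The false split after "Dr." illustrates the abbreviation failure mode: the fragment is silently dropped by the length filter and the following sentence is incorrectly bounded.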

7.2.1

User impact

Complex sentences, abbreviations, decimal numbers, and unconventional punctuation can cause incorrect splits. The model may receive incomplete or incorrectly bounded sentences, which can affect the accuracy of concept and relationship extraction.

7.3

Phase 2 depends entirely on Phase 1 output

If Phase 1 produces an incorrect anchor, incomplete concept extraction, or poor question coverage assessment, Phase 2 uses that output as its foundation. Errors in Phase 1 are not corrected by Phase 2 — they are amplified.

7.3.1

User impact

Users should inspect the observed graph and verify the anchor concept after Phase 1 completes before acting on the framework or writing guidance. Phase 2 output is only as reliable as Phase 1.

7.4

Integration states are LLM-mediated

Integration states are assigned by the model based on its reading of the content. They are not computed from text statistics, keyword frequency, or parse-tree analysis.

7.4.1

User impact

A concept that a human editor considers well-explained may be assigned underexplained, and vice versa. Integration states are most useful directionally — as signals of where to look and what to investigate — rather than as precise classifications to act on uncritically.

7.5

Graph layout is non-deterministic

Both graphs use a D3 force-directed simulation. The layout converges to a stable position but is sensitive to initial conditions. The same content may render with different node positions on different runs.

7.5.1

User impact

Graph layouts are not semantically meaningful on their own. Proximity of nodes in the rendered graph does not indicate conceptual closeness. Relationships are expressed through edges and labels, not through spatial proximity.

7.6

No input size limit is enforced

ContentGraph does not reject large inputs. Very long documents may approach model context limits, causing the LLM to truncate its analysis or produce an incomplete concept map.

7.6.1

User impact

For long documents, the concept map may be incomplete without any visible error. Users working with long content should consider running analysis on individual sections rather than the full document.

7.7

Relationship extraction favors explicit prose

ContentGraph extracts relationships most reliably when they are expressed as clear subject-verb-object structures in prose. Relationships expressed through tables, comparison lists, headings, or other non-prose structures may not be captured.

7.7.1

User impact

Content that uses structured formats to express relationships may appear to have fewer or thinner connections in the observed graph than it actually communicates. The toMakeExplicit guidance may be over-populated for such content.

7.8

The API key is session-scoped

The user's Anthropic API key is stored in sessionStorage and cleared when the browser tab closes. It is never sent to any server other than the Anthropic API directly.

7.8.1

User impact

Users must re-enter their API key in each new browser session. The key is tied to the specific browser and device. Users who are uncomfortable entering an API key into a browser-based interface should treat this as a product trust consideration.

7.9

The framework is generated from general training knowledge

The explanation framework is produced by a single LLM call using the model's general training knowledge. It is not derived from a curated ontology, a domain expert, or the user's specific audience requirements.

7.9.1

User impact

For specialized or proprietary topics, the framework may surface concepts that are standard in the general domain but irrelevant to the specific content goal. Users should apply domain judgment when deciding which framework recommendations to act on.

7.10

ContentGraph is a diagnostic tool, not an editorial authority

The framework, guidance, and integration states are one model's structured reading of one piece of content at one point in time. They do not account for the user's audience, purpose, tone, voice, or strategic goals.

7.10.1

User impact

Users should treat ContentGraph as decision support — a structured prompt for editorial thinking — not as a final judgment on what their content should contain. The tool surfaces gaps; the author decides whether to close them.

7.11

Query fan-out behavior is model-specific and undocumented

No retrieval system publishes how it decomposes queries. The sub-queries any given system generates for a topic are not documented, not versioned, and not consistent across providers. ContentGraph's framework models an idealized fan-out space — a principled estimate of what sub-queries would be issued for the anchor topic — not an empirically observed one.

7.11.1

User impact

Content improved using ContentGraph may perform well across most retrieval paths while still missing those specific to a particular system's undocumented behavior. The framework is a structural target, not a guarantee of retrieval performance with any specific system.

7.12

Fan-out patterns change without notice

Retrieval system internals — including how queries are decomposed and expanded — are updated silently and frequently. A framework that accurately models the fan-out space for a topic today may be less accurate after an update that changes how the system fans out for that topic.

7.12.1

User impact

ContentGraph's value is more durable for structural improvements — explicit relationships, naming consistency, concept development — than for precise concept selection. Structural quality tends to remain valuable across changes to retrieval behavior; which specific concepts matter may shift as fan-out patterns evolve without notice.

7.13

The 8 diagnostic questions approximate fan-out rather than represent it

The 8 question coverage dimensions are a principled approximation of common fan-out sub-query patterns for explanatory content. They are not a taxonomy derived from observed retrieval system behavior and do not correspond to the internal query decomposition logic of any specific system.

7.13.1

User impact

Question coverage is directionally useful as a proxy for fan-out coverage breadth. It should not be read as a precise measurement of how well the content will perform against any specific system's actual fan-out behavior.

8

Open product risks

The following risks are not just theoretical; they follow directly from how ContentGraph is designed.

8.1

Risk: anchor misidentification cascade

The anchor concept is the foundation of the entire analysis. If it is wrong, the observed graph may be coherent but misaligned with the content’s purpose, the framework will target the wrong topic, and the writing guidance will suggest changes that move the content in the wrong direction.

8.1.1

Why this risk exists

  • Anchor identification is the first judgment the model makes — before any other analysis
  • Multi-topic or ambiguously signposted content makes anchoring harder
  • There is no mechanism for the user to correct the anchor before Phase 2 begins
8.1.2

What could go wrong

  • Framework recommendations target the wrong topic entirely
  • Writing guidance suggests adding concepts that are irrelevant to the actual content goal
  • The observed graph appears complete for the wrong anchor, masking real coverage gaps
8.1.3

Mitigation

  • Surface the anchor prominently in the observed graph and findings panel
  • Encourage users to verify the anchor as the first step after Phase 1
  • Allow users to re-run with a corrected anchor prompt in a future version
8.2

Risk: framework treated as a content specification

The explanation framework is an editorial reference, not a content requirement. Users may interpret essential or important concepts as mandatory additions, even when those concepts are not relevant to their specific content goal, audience, or angle.

8.2.1

Why this risk exists

The framework is structured and labeled, which implies authority. Priority labels like essential create a sense of obligation. The comparison view between observed and framework graphs is visually compelling in a way that can feel prescriptive.

8.2.2

What could go wrong

  • Users add concepts to satisfy the framework rather than serve their audience
  • Content becomes more generic as it converges toward the model’s expected structure
  • The tool’s perspective overrides the author’s judgment about what their content is actually for
8.2.3

Mitigation

  • Frame the framework explicitly as a starting point and editorial reference throughout the UI
  • Explain that priority labels reflect topic-relative judgments, not universal requirements
  • Keep writing guidance actionable but framed as options to consider, not directives to execute
8.3

Risk: model drift

Changes to the underlying model may change how ContentGraph identifies anchors, assigns integration states, generates frameworks, or produces guidance.

8.3.1

Why this risk exists

All four analytical stages depend on model judgment. The model currently used — claude-sonnet-4-20250514 — may be updated or deprecated. Any model change can alter the character of the analysis at every stage of the pipeline.

8.3.2

What could go wrong

  • The same content produces meaningfully different maps across sessions or model versions
  • Users notice instability and lose confidence in the analysis
  • Previously generated frameworks become incomparable to new ones
8.3.3

Mitigation

  • Display the model version used for each analysis run
  • Avoid direct comparison of results across major model changes without caveat
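The mitigations above can be sketched as result metadata. The field names and `comparable` helper are illustrative assumptions, not ContentGraph's actual data model; the model string is the one named in 8.3.1.

```typescript
// Sketch: tag each analysis result with the model version that
// produced it, so results from different model versions are never
// compared silently.
interface AnalysisResult {
  anchor: string;
  modelVersion: string;
  createdAt: string;
}

function tagResult(anchor: string, modelVersion: string): AnalysisResult {
  return { anchor, modelVersion, createdAt: new Date().toISOString() };
}

function comparable(a: AnalysisResult, b: AnalysisResult): boolean {
  // Direct comparison is only meaningful within one model version;
  // across versions, show a caveat instead.
  return a.modelVersion === b.modelVersion;
}
```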
8.4

Risk: false completeness in the observed graph

A dense observed graph — many nodes, many connections — may look thorough while still having significant coverage gaps. Content that mentions many concepts superficially will produce a visually rich graph, but integration states will reveal that most of those concepts are weakly_integrated or underexplained.

8.4.1

Why this risk exists

Graph density is visually compelling. Users who do not look past the observed graph to the integration state breakdown and the framework comparison may conclude that the content is structurally sound when it is not.

8.4.2

What could go wrong

  • Users stop at Phase 1 and conclude the content is well-structured based on graph density alone
  • Integration states signaling underexplained or weakly_integrated concepts are not acted on
  • Phase 2 is skipped, and the real coverage gaps remain invisible
8.4.3

Mitigation

  • Make integration state counts prominent in the Phase 1 results panel
  • Frame Phase 2 as a necessary step, not an optional enhancement, in the UI
9

Practical interpretation guidance

Taken together, Sections 1–8 imply a clear operating principle for users.

ContentGraph is most useful when it is treated as a structured editorial diagnostic. It is least useful when treated as a content generator, a completeness certification, or a specification the content must satisfy.

Strongest use cases

  • Understanding which concepts in an explanation are underdeveloped or poorly connected
  • Identifying relationships that are implied but never stated directly
  • Building a first draft of an explanation framework for a new topic
  • Comparing coverage across different versions or drafts of explanatory content

Weakest use cases

  • Treating the framework as a content brief that must be fully satisfied
  • Using integration states as a substitute for reading and judging the content directly
  • Applying the tool to narrative, opinion, or persuasive content where structural completeness is not the primary goal

The right way to use ContentGraph is to verify the anchor as the first validity check, read the integration states as signals of where retrieval is likely to fail, and treat the writing guidance as a prioritized editorial starting point — not a final authority on what the content should contain. The framework models an idealized fan-out space; whether it matches any specific system’s actual behavior is unknown. What it does reliably is surface structural gaps that are worth closing regardless of which retrieval system is reading the content.