Why Knowledge Graphs for Digital Equity Intelligence
Why Knowledge Graphs for Digital Equity Intelligence
Knowledge graphs provide the structured data foundation that stores relationships, enables complex queries, and grounds AI reasoning in your Digital Equity Intelligence System.
What This Document Covers:
- What knowledge graphs are (vs. traditional databases)
- Why they’re perfect for digital equity research
- How they connect to your theoretical framework
- Real Cypher query examples
- Integration with Bayesian networks and GraphRAG
Audience: Technical implementers, data analysts, policymakers curious about the “why”
The Core Idea: Relationships Are First-Class Citizens
Digital equity is fundamentally about RELATIONSHIPS:
Relationships that matter:
- Counties have infrastructure scores
- Infrastructure INFLUENCES aspiration
- Aspiration ENABLES skill development
- Hampton & Bauer VALIDATED the framework
- Sen's theory GROUNDS the Opportunity stage
- Digital navigators BUILD aspiration capacity
Traditional database (tables):
Counties table:
| ID | Name | Infrastructure | Aspiration | Skills |
Researchers table:
| ID | Name | Institution | Year |
Problem: Relationships are implicit, hard to query
"Show me all theories that ground the Aspiration stage
and the evidence validating them in Michigan"
→ Complex joins, slow, fragile
Knowledge graph (nodes + edges):
(County)-[:HAS_SCORE]->(Infrastructure:0.65)
(Infrastructure)-[:INFLUENCES]->(Aspiration)
(Aspiration)-[:GROUNDED_IN]->(Appadurai_Theory)
(Hampton_Bauer)-[:VALIDATES]->(Aspiration_Stage)
Query: Just follow the relationships!
MATCH (theory)-[:GROUNDS]->(stage)<-[:VALIDATES]-(evidence)
RETURN theory, stage, evidence
Relationships are EXPLICIT, queryable, traversable
What is a Knowledge Graph?
Simple definition:
A knowledge graph is a network of connected facts, where both the facts AND the connections have meaning.
Structure:
- Nodes = Entities (counties, researchers, theories, interventions)
- Edges = Relationships (influences, validates, grounds, measures)
- Properties = Attributes (scores, dates, values)
Example:
(County:Region {name: "Baraga County", population: 8,158})
-[:HAS_OPPORTUNITY_SCORE {value: 0.58, year: 2024}]->
(Score:Metric {component: "Connectivity"})
(Score)-[:MEASURED_BY]->(Dagg_Compass:Framework)
(Dagg_Compass)-[:OPERATIONALIZES]->(Sen_Theory:Capability_Approach)
You can ask:
- “What’s Baraga County’s opportunity score?” (property lookup)
- “What framework measured it?” (follow relationship)
- “What theory does that framework operationalize?” (multi-hop traversal)
All in one query!
Why Graphs > Tables for Digital Equity
1. Natural Representation of Your Framework
Your pathway:
Opportunity → Aspiration → Growth Mindset → Digital Equity
In a knowledge graph:
(Opportunity:Stage {name: "Opportunity"})
-[:ENABLES {mechanism: "Sen's conversion factors"}]->
(Aspiration:Stage {name: "Aspiration"})
-[:ENABLES {mechanism: "Dweck's learning"}]->
(Growth_Mindset:Stage {name: "Growth Mindset"})
-[:PRODUCES]->
(Digital_Equity:Outcome)
Query the pathway:
MATCH path = (start:Stage {name: "Opportunity"})
-[:ENABLES*]->(end:Outcome)
RETURN path
Result: Visual representation of complete pathway with mechanisms
2. Connect Theory to Evidence to Implementation
Knowledge graph captures:
(Sen:Researcher {name: "Amartya Sen", year: 1999})
-[:DEVELOPED]->(Capabilities:Theory)
-[:GROUNDS]->(Opportunity:Stage)
(Hampton_Bauer:Research {year: 2020, doi: "10.25335/BZGY-3V91"})
-[:VALIDATES]->(Opportunity:Stage)
-[:FOUND {gap: "Missing infrastructure"}]->
(Infrastructure_Gap:Finding)
(Dagg_Compass:Framework {year: 2023})
-[:MEASURES]->(Opportunity:Stage)
-[:COMPONENT {name: "Connectivity"}]->
(Connectivity_Metrics:Measurement)
(Your_System:Implementation)
-[:USES]->(Dagg_Compass)
-[:STORES_IN]->(Knowledge_Graph)
-[:PREDICTS_WITH]->(Bayesian_Network)
Query:
// Show complete lineage for Opportunity stage
MATCH (theory:Theory)-[:GROUNDS]->(stage:Stage {name: "Opportunity"}),
(evidence:Research)-[:VALIDATES]->(stage),
(measurement:Framework)-[:MEASURES]->(stage)
RETURN theory.name AS "Theoretical Foundation",
evidence.doi AS "Empirical Validation",
measurement.name AS "Measurement Tool"
Result:
Theoretical Foundation | Empirical Validation | Measurement Tool
Sen's Capabilities | 10.25335/BZGY-3V91 | Dagg Compass Connectivity
This is research lineage in one query!
3. Multi-Hop Reasoning
Question: “Which Michigan researchers influenced your Aspiration stage?”
In tables: Would require multiple joins across researchers, institutions, theories, stages
In knowledge graph:
MATCH (researcher:Person)-[:AFFILIATED_WITH]->(institution:University {state: "Michigan"}),
(researcher)-[:DEVELOPED|VALIDATED]->(work),
(work)-[:GROUNDS|MEASURES]->(stage:Stage {name: "Aspiration"})
RETURN researcher.name, institution.name, work.type
Result:
Researcher | Institution | Contribution
Arjun Appadurai | NYU (theory) | Grounding theory
Keith Hampton | MSU Quello | Validation research
Pierrette Dagg | Merit/UMich | Measurement framework
One query, multiple relationship types, complete answer
4. Flexible Schema Evolution
Traditional database:
ALTER TABLE counties ADD COLUMN new_metric FLOAT;
-- Must update ALL rows, even if data not available
-- Schema change is painful
Knowledge graph:
// Just add new relationships when data available
MATCH (c:County {name: "Baraga County"})
CREATE (c)-[:HAS_NEW_METRIC {value: 0.72, source: "2025 Survey"}]->(:Metric)
// Other counties unchanged
// No schema migration needed
Your framework is EVOLVING—graphs adapt easily
How Knowledge Graphs Connect to Your Framework:
Researchers as Nodes with Relationships
// Sen's Capabilities Approach
CREATE (sen:Researcher {
name: "Amartya Sen",
institution: "Harvard University",
nobel: 1998,
field: "Economics"
})
CREATE (capabilities:Theory {
name: "Capabilities Approach",
year: 1999,
key_insight: "Capability = Resources + Conversion factors"
})
CREATE (sen)-[:DEVELOPED {year: 1999}]->(capabilities)
CREATE (opportunity:Stage {name: "Opportunity"})
CREATE (capabilities)-[:GROUNDS]->(opportunity)
// Now queryable:
MATCH (r:Researcher)-[:DEVELOPED]->(t:Theory)-[:GROUNDS]->(s:Stage)
WHERE s.name = "Opportunity"
RETURN r.name, t.name
// Returns: "Amartya Sen", "Capabilities Approach"
Hampton & Bauer Research as Evidence Nodes
// Store research findings
CREATE (hb:Research {
title: "Broadband and Student Performance Gaps",
authors: ["Keith Hampton", "Johannes Bauer", "Luis Fernandez", "Craig Robertson"],
year: 2020,
institution: "MSU Quello Center",
doi: "10.25335/BZGY-3V91"
})
// Three gaps found
CREATE (gap1:Finding {
type: "Infrastructure gap",
description: "Missing infrastructure prevents access"
})
CREATE (gap2:Finding {
type: "Value gap",
description: "Unclear value prevents engagement"
})
CREATE (gap3:Finding {
type: "Skill gap",
description: "Insufficient skills prevent achievement"
})
CREATE (hb)-[:FOUND]->(gap1)
CREATE (hb)-[:FOUND]->(gap2)
CREATE (hb)-[:FOUND]->(gap3)
// Link to your framework stages
CREATE (gap1)-[:VALIDATES]->(opportunity:Stage {name: "Opportunity"})
CREATE (gap2)-[:VALIDATES]->(aspiration:Stage {name: "Aspiration"})
CREATE (gap3)-[:VALIDATES]->(mindset:Stage {name: "Growth Mindset"})
// Query: Which findings validate which stages?
MATCH (r:Research)-[:FOUND]->(f:Finding)-[:VALIDATES]->(s:Stage)
RETURN s.name, collect(f.type) AS validating_evidence
Dagg Compass as Measurement Framework
// Digital Opportunities Compass framework
CREATE (compass:Framework {
name: "Digital Opportunities Compass",
year: 2023,
institution: "MSU Quello Center",
purpose: "BEAD/DEA implementation guidance"
})
// Six components
CREATE (contexts:Component {name: "Contexts", measures: "Demographics, economics"})
CREATE (governance:Component {name: "Governance", measures: "Policy, funding"})
CREATE (connectivity:Component {name: "Connectivity", measures: "Infrastructure, adoption"})
CREATE (skills:Component {name: "Skills", measures: "Literacy, training"})
CREATE (application:Component {name: "Application", measures: "Use cases, relevance"})
CREATE (outcomes:Component {name: "Outcomes", measures: "Achievement, inclusion"})
CREATE (compass)-[:HAS_COMPONENT]->(contexts)
CREATE (compass)-[:HAS_COMPONENT]->(governance)
CREATE (compass)-[:HAS_COMPONENT]->(connectivity)
CREATE (compass)-[:HAS_COMPONENT]->(skills)
CREATE (compass)-[:HAS_COMPONENT]->(application)
CREATE (compass)-[:HAS_COMPONENT]->(outcomes)
// Link components to framework stages
CREATE (connectivity)-[:MEASURES]->(opportunity:Stage {name: "Opportunity"})
CREATE (application)-[:MEASURES]->(aspiration:Stage {name: "Aspiration"})
CREATE (skills)-[:MEASURES]->(mindset:Stage {name: "Growth Mindset"})
CREATE (outcomes)-[:MEASURES]->(equity:Outcome {name: "Digital Equity"})
// Query: How do we measure each stage?
MATCH (component:Component)-[:MEASURES]->(stage:Stage)
RETURN stage.name, component.name, component.measures
Counties with Compass Scores
// Store county assessment data
CREATE (baraga:County {
name: "Baraga County",
state: "Michigan",
region: "Upper Peninsula",
population: 8158,
rural: true
})
// Compass scores
CREATE (baraga)-[:HAS_SCORE {
component: "Connectivity",
value: 0.58,
year: 2024,
source: "FCC + ACS data"
}]->(connectivity:Component)
CREATE (baraga)-[:HAS_SCORE {
component: "Application",
value: 0.42,
year: 2024,
source: "Community survey"
}]->(application:Component)
CREATE (baraga)-[:HAS_SCORE {
component: "Skills",
value: 0.38,
year: 2024,
source: "Digital literacy assessment"
}]->(skills:Component)
// Link to predicted outcome
CREATE (baraga)-[:PREDICTED_EQUITY {
value: 0.31,
confidence: 0.85,
method: "Bayesian network",
date: "2024-11-07"
}]->(equity:Outcome)
// Query: Show complete county profile
MATCH (c:County {name: "Baraga County"})-[score:HAS_SCORE]->(comp:Component),
(c)-[pred:PREDICTED_EQUITY]->(outcome:Outcome)
RETURN c.name, comp.name, score.value, pred.value
Real Cypher Queries for Policy Questions:
Query 1: Which counties need navigators most?
// Find counties with good infrastructure but low aspiration
MATCH (c:County)-[conn:HAS_SCORE]->(connectivity:Component {name: "Connectivity"}),
(c)-[app:HAS_SCORE]->(application:Component {name: "Application"})
WHERE conn.value > 0.6 AND app.value < 0.5
RETURN c.name,
conn.value AS infrastructure,
app.value AS aspiration,
(conn.value - app.value) AS opportunity_gap
ORDER BY opportunity_gap DESC
LIMIT 10
Result: Counties where infrastructure exists but aspiration capacity is low
Policy implication: These counties are PRIME candidates for digital navigator programs
Theory: Appadurai’s navigation capacity gap + Toyama’s amplification principle
Query 2: Show research lineage for a specific finding
// Trace "unclear value" finding back to theory
MATCH path = (researcher:Researcher)-[:DEVELOPED]->(theory:Theory)
-[:GROUNDS]->(stage:Stage)<-[:VALIDATES]-(research:Research)
-[:FOUND]->(finding:Finding {type: "Value gap"})
<-[:MEASURED_BY]-(framework:Framework)
RETURN path
Result: Visual graph showing:
Appadurai → Capacity to Aspire → Aspiration Stage
↑ (validated by)
Hampton & Bauer → "Value gap" finding
↓ (measured by)
Dagg Compass → Application component
This is your theoretical foundation visualized!
Query 3: Compare similar counties to predict outcomes
// Find counties similar to target county
MATCH (target:County {name: "Ontonagon County"})-[t_scores:HAS_SCORE]->(comp:Component)
MATCH (similar:County)-[s_scores:HAS_SCORE]->(comp)
WHERE similar.name <> target.name
AND similar.rural = target.rural
AND abs(t_scores.value - s_scores.value) < 0.15 // Similar scores
WITH similar, count(*) AS matching_components
WHERE matching_components >= 4 // Similar on most dimensions
// Show their interventions and outcomes
MATCH (similar)-[:RECEIVED_INTERVENTION]->(intervention:Program)
MATCH (similar)-[:ACHIEVED_OUTCOME {year: intervention.year + 2}]->(outcome)
RETURN similar.name,
intervention.type,
intervention.budget,
outcome.gini_change AS equity_improvement
ORDER BY equity_improvement DESC
Result: Learn from similar counties’ experiences
Policy implication: “These counties had similar profiles and succeeded with this intervention mix”
Query 4: Intervention impact analysis
// Which interventions improved which stages most?
MATCH (county:County)-[:RECEIVED_INTERVENTION]->(intervention:Program),
(county)-[:HAS_SCORE {year: intervention.year}]->(comp_before),
(county)-[:HAS_SCORE {year: intervention.year + 1}]->(comp_after)
WHERE comp_before = comp_after // Same component, different years
WITH intervention.type AS intervention_type,
comp_before.name AS component,
avg(comp_after.value - comp_before.value) AS avg_improvement
RETURN intervention_type, component, avg_improvement
ORDER BY avg_improvement DESC
Result:
Intervention Type | Component | Avg Improvement
Digital Navigators | Application | +0.24
Digital Navigators | Skills | +0.18
Infrastructure | Connectivity | +0.31
Infrastructure | Application | +0.06
Training Programs | Skills | +0.22
Training Programs | Application | +0.08
Policy implication:
- Infrastructure → best for Connectivity (obvious)
- Navigators → best for Application (Appadurai’s navigation capacity!)
- Training → best for Skills (Dweck’s growth mindset development)
Query 5: Find gaps between theory and practice
// Theory predicts certain relationships—do they hold empirically?
MATCH (theory:Theory)-[:PREDICTS]->(relationship)
MATCH (empirical:Research)-[:FOUND]->(finding)
WHERE NOT (finding)-[:CONFIRMS]->(relationship)
RETURN theory.name,
relationship.description AS predicted,
finding.description AS actual_finding,
"DISCREPANCY" AS status
Result: Identifies where theory and evidence diverge
Use case: Refine your Bayesian network based on Michigan data
Integration with Bayesian Networks:
Knowledge graph FEEDS Bayesian networks
// Query county data for Bayesian network input
MATCH (c:County {name: "Baraga County"})-[scores:HAS_SCORE]->(comp:Component)
RETURN comp.name AS component, scores.value AS value
// Python code receives this data:
# Use knowledge graph query results as Bayesian evidence
from neo4j import GraphDatabase
from pgmpy.inference import VariableElimination
# Query graph
with driver.session() as session:
result = session.run("""
MATCH (c:County {name: $county})-[s:HAS_SCORE]->(comp:Component)
RETURN comp.name, s.value
""", county="Baraga County")
evidence = {row["comp.name"]: row["s.value"] for row in result}
# Feed to Bayesian network
inference = VariableElimination(bayesian_model)
prediction = inference.query(['Equity'], evidence=evidence)
# Store prediction back in graph
session.run("""
MATCH (c:County {name: $county})
CREATE (c)-[:PREDICTED_EQUITY {
value: $pred_value,
confidence: $confidence,
date: date()
}]->(outcome:Outcome)
""", county="Baraga County", pred_value=prediction.values[1], confidence=0.85)
Complete loop: Graph → Bayesian → Prediction → Back to Graph
Integration with GraphRAG:
Knowledge graph provides CONTEXT for AI responses
Without knowledge graph (regular LLM):
User: "Should we invest in navigators for Baraga County?"
LLM: "Digital navigators can help improve digital literacy..."
(Generic answer, no county-specific data, possible hallucination)
With knowledge graph (GraphRAG):
User: "Should we invest in navigators for Baraga County?"
GraphRAG:
1. Queries knowledge graph for Baraga County data
2. Retrieves: Connectivity=0.58, Application=0.42, Skills=0.38
3. Retrieves: Appadurai theory (navigation capacity)
4. Retrieves: Hampton & Bauer finding ("unclear value" gap)
5. Retrieves: Similar county outcomes with navigator programs
LLM response (grounded in retrieved data):
"Yes. Baraga County has moderate infrastructure (0.58) but LOW
aspiration capacity (0.42). This is exactly the 'unclear value'
gap Hampton & Bauer identified. Appadurai's theory says navigation
capacity building addresses this. Similar rural Michigan counties
(Ontonagon, Keweenaw) improved Application scores by +0.24 with
$2M navigator programs. Recommend 2-year navigator program focused
on showing relevant digital pathways."
Evidence-grounded, theory-informed, data-specific answer!
Read more: TrainingCompassGraphRAG.md
Graph Algorithms for Policy Insights:
Community Detection: Find Similar Counties
// Use graph algorithm to cluster similar counties
CALL gds.louvain.stream({
nodeProjection: 'County',
relationshipProjection: {
SIMILAR_SCORES: {
type: 'HAS_SCORE',
properties: 'value'
}
}
})
YIELD nodeId, communityId
RETURN gds.util.asNode(nodeId).name AS county, communityId
ORDER BY communityId
Result: Counties clustered by similar Compass profiles
Use case: Targeted interventions for county clusters, peer learning networks
Centrality: Which researchers are most influential?
// PageRank to find most-cited/connected researchers
CALL gds.pageRank.stream({
nodeProjection: ['Researcher', 'Theory', 'Research'],
relationshipProjection: ['DEVELOPED', 'CITES', 'BUILDS_ON']
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS entity, score
ORDER BY score DESC
LIMIT 10
Result: Sen, Appadurai, Dweck likely score highest
Use case: Prioritize which theories to emphasize in training materials
Shortest Path: Connect Theory to Practice
// Shortest path from Sen's theory to your county intervention
MATCH path = shortestPath(
(sen:Theory {name: "Capabilities Approach"})-[*]->
(intervention:Program {county: "Baraga County", type: "Navigator"})
)
RETURN path
Result:
Sen's Capabilities → Opportunity Stage → Connectivity Component
→ Baraga County Assessment → Navigator Intervention
Use case: Show stakeholders the complete lineage from theory to action
Why Neo4j (or Similar) for Your System:
Neo4j advantages:
- Native graph storage (not graph layer on SQL)
- Cypher query language (intuitive, readable)
- Graph algorithms library (clustering, centrality, pathfinding)
- Scalability (billions of nodes/relationships)
- ACID transactions (data integrity for policy decisions)
- Visualization tools (Neo4j Browser, Bloom)
Alternative graph databases:
- Amazon Neptune: AWS-managed, good for cloud deployments
- ArangoDB: Multi-model (graph + document)
- TigerGraph: High-performance analytics
For Michigan digital equity: Neo4j recommended (mature, well-documented, strong community)
Data Model for Your System:
Node Types:
// Entities
:County (name, population, region, rural/urban)
:Researcher (name, institution, field)
:Theory (name, year, key_insight)
:Research (title, year, doi, authors)
:Framework (name, year, purpose)
:Component (name, measures)
:Stage (name, description)
:Outcome (name, type)
:Intervention (type, budget, year, duration)
:Finding (type, description)
// Metrics
:Score (component, value, year, source)
:Prediction (value, confidence, method, date)
Relationship Types:
// Theory relationships
(:Researcher)-[:DEVELOPED]->(:Theory)
(:Theory)-[:GROUNDS]->(:Stage)
(:Theory)-[:BUILDS_ON]->(:Theory)
// Evidence relationships
(:Research)-[:VALIDATES]->(:Stage)
(:Research)-[:FOUND]->(:Finding)
(:Finding)-[:SUPPORTS]->(:Theory)
// Measurement relationships
(:Framework)-[:HAS_COMPONENT]->(:Component)
(:Component)-[:MEASURES]->(:Stage)
(:County)-[:HAS_SCORE]->(:Component)
// Intervention relationships
(:County)-[:RECEIVED_INTERVENTION]->(:Intervention)
(:Intervention)-[:TARGETS]->(:Stage)
(:County)-[:ACHIEVED_OUTCOME]->(:Outcome)
// Predictions
(:County)-[:PREDICTED_EQUITY]->(:Outcome)
(:Intervention)-[:EXPECTED_TO_IMPROVE]->(:Stage)
Building the Graph: Python Integration
from neo4j import GraphDatabase
class DigitalEquityGraph:
def __init__(self, uri, user, password):
self.driver = GraphDatabase.driver(uri, auth=(user, password))
def add_county_assessment(self, county_name, compass_scores):
"""Store county Compass scores in graph"""
with self.driver.session() as session:
session.run("""
MERGE (c:County {name: $county})
WITH c
UNWIND $scores AS score
MATCH (comp:Component {name: score.component})
MERGE (c)-[:HAS_SCORE {
value: score.value,
year: score.year,
source: score.source
}]->(comp)
""", county=county_name, scores=compass_scores)
def find_similar_counties(self, target_county, threshold=0.15):
"""Find counties with similar Compass profiles"""
with self.driver.session() as session:
result = session.run("""
MATCH (target:County {name: $target})-[t:HAS_SCORE]->(comp:Component)
MATCH (other:County)-[o:HAS_SCORE]->(comp)
WHERE other.name <> target.name
AND abs(t.value - o.value) < $threshold
WITH other, count(*) AS matching_dims
WHERE matching_dims >= 4
RETURN other.name AS county, matching_dims
ORDER BY matching_dims DESC
""", target=target_county, threshold=threshold)
return [record["county"] for record in result]
def get_intervention_evidence(self, intervention_type, component):
"""Find evidence for intervention effectiveness"""
with self.driver.session() as session:
result = session.run("""
MATCH (county:County)-[:RECEIVED_INTERVENTION]->(i:Intervention {type: $int_type})
MATCH (county)-[before:HAS_SCORE {year: i.year}]->(c:Component {name: $comp})
MATCH (county)-[after:HAS_SCORE {year: i.year + 1}]->(c)
RETURN county.name AS county,
i.budget AS budget,
(after.value - before.value) AS improvement
ORDER BY improvement DESC
""", int_type=intervention_type, comp=component)
return [dict(record) for record in result]
def store_prediction(self, county, equity_prediction, confidence):
"""Store Bayesian network prediction"""
with self.driver.session() as session:
session.run("""
MATCH (c:County {name: $county})
CREATE (c)-[:PREDICTED_EQUITY {
value: $prediction,
confidence: $confidence,
method: 'Bayesian network',
timestamp: datetime()
}]->(o:Outcome {type: 'Digital Equity'})
""", county=county, prediction=equity_prediction, confidence=confidence)
# Usage
graph = DigitalEquityGraph("bolt://localhost:7687", "neo4j", "password")
# Add assessment
graph.add_county_assessment("Baraga County", [
{"component": "Connectivity", "value": 0.58, "year": 2024, "source": "FCC"},
{"component": "Application", "value": 0.42, "year": 2024, "source": "Survey"},
{"component": "Skills", "value": 0.38, "year": 2024, "source": "Assessment"}
])
# Find similar counties
similar = graph.find_similar_counties("Baraga County")
print(f"Similar counties: {similar}")
# Get intervention evidence
evidence = graph.get_intervention_evidence("Digital Navigator", "Application")
print(f"Navigator effectiveness: {evidence}")
# Store prediction
graph.store_prediction("Baraga County", 0.31, 0.85)
Visualization: Making the Graph Understandable
Neo4j Browser visualization:
// Show complete framework
MATCH path = (r:Researcher)-[:DEVELOPED]->(t:Theory)-[:GROUNDS]->(s:Stage)
<-[:VALIDATES]-(research:Research)
RETURN path
Result: Visual network showing theory → evidence connections
For stakeholders: Export to D3.js, Gephi, or Tableau for presentations
Common Questions:
Q: “Why not just use SQL?”
A: SQL requires JOIN hell for relationship queries
SQL version:
-- "Which theories ground stages validated by Michigan research?"
SELECT t.name, s.name, r.title
FROM researchers res
JOIN theories t ON res.id = t.researcher_id
JOIN theory_stage_mapping tsm ON t.id = tsm.theory_id
JOIN stages s ON tsm.stage_id = s.id
JOIN research_validation rv ON s.id = rv.stage_id
JOIN research r ON rv.research_id = r.id
WHERE r.state = 'Michigan'
Graph version:
MATCH (r:Researcher {state: "Michigan"})-[:DEVELOPED]->(t:Theory)
-[:GROUNDS]->(s:Stage)<-[:VALIDATES]-(research:Research)
WHERE research.state = "Michigan"
RETURN t.name, s.name, research.title
Readable, intuitive, fast
Q: “How do we keep the graph updated?”
A: ETL pipelines from data sources
# Scheduled job (e.g., daily)
def update_county_scores():
# Pull latest FCC data
fcc_data = fetch_fcc_broadband_data()
# Pull latest ACS demographics
acs_data = fetch_census_acs_data()
# Calculate Compass scores
compass_scores = calculate_dagg_compass(fcc_data, acs_data)
# Update graph
for county, scores in compass_scores.items():
graph.add_county_assessment(county, scores)
# Trigger Bayesian predictions
update_predictions()
Q: “How big can this graph get?”
A: Very big. Neo4j handles:
- Billions of nodes
- Trillions of relationships
- Sub-second query times with proper indexing
Your scale:
- 83 Michigan counties
- ~10 researchers
- ~20 theories/frameworks
- ~100 research papers
- Thousands of interventions over time
Total: Tens of thousands of nodes, hundreds of thousands of relationships
Neo4j: No problem
Bottom Line:
Knowledge graphs are the data foundation of your digital equity intelligence system.
What they provide:
✅ Natural representation of relationships (theory, evidence, practice)
✅ Flexible schema (evolve as framework grows)
✅ Complex queries answered simply (Cypher is readable)
✅ Multi-hop reasoning (connect distant concepts)
✅ Integration hub (feeds Bayesian, GraphRAG, visualization)
✅ Audit trail (trace from theory to policy decision)
Why they’re perfect for digital equity:
- Research is about RELATIONSHIPS (theories ground stages, evidence validates theories)
- Framework EVOLVES (new research, new metrics, graph adapts)
- Policy questions are MULTI-HOP (“Which theory → evidence → measurement → action?”)
- Need EXPLAINABILITY (show reasoning path to stakeholders)
The complete system:
Data sources (FCC, ACS, surveys)
↓
Knowledge Graph (stores + relates)
↓
Bayesian Network (predicts with graph data)
↓
GraphRAG (answers questions with graph context)
↓
Policy decisions (evidence-grounded, explainable)
This is how you build institutional memory and organizational intelligence.
Next steps:
- TrainingCompassBayesian.md - How predictions work
- TrainingCompassGraphRAG.md - How AI uses the graph
- TrainingCompassMetrics.md - What data goes in the graph
Version: 1.0
Last Updated: November 2025
Part of: Project Compass (Merit Network) - Digital Opportunities Intelligence Network (DOIN) • Working draft