Why Knowledge Graphs for Digital Equity Intelligence

Knowledge graphs provide the structured data foundation that stores relationships, enables complex queries, and grounds AI reasoning in your Digital Equity Intelligence System.

What This Document Covers:

  • What knowledge graphs are (vs. traditional databases)
  • Why they’re perfect for digital equity research
  • How they connect to your theoretical framework
  • Real Cypher query examples
  • Integration with Bayesian networks and GraphRAG

Audience: Technical implementers, data analysts, policymakers curious about the “why”


The Core Idea: Relationships Are First-Class Citizens

Digital equity is fundamentally about RELATIONSHIPS:

Relationships that matter:
  - Counties have infrastructure scores
  - Infrastructure INFLUENCES aspiration
  - Aspiration ENABLES skill development
  - Hampton & Bauer VALIDATED the framework
  - Sen's theory GROUNDS the Opportunity stage
  - Digital navigators BUILD aspiration capacity

Traditional database (tables):

Counties table:
  | ID | Name | Infrastructure | Aspiration | Skills |

Researchers table:
  | ID | Name | Institution | Year |

Problem: Relationships are implicit, hard to query
         "Show me all theories that ground the Aspiration stage
          and the evidence validating them in Michigan"
         → Complex joins, slow, fragile

Knowledge graph (nodes + edges):

(County)-[:HAS_SCORE]->(Infrastructure:0.65)
(Infrastructure)-[:INFLUENCES]->(Aspiration)
(Aspiration)-[:GROUNDED_IN]->(Appadurai_Theory)
(Hampton_Bauer)-[:VALIDATES]->(Aspiration_Stage)

Query: Just follow the relationships!
       MATCH (theory)-[:GROUNDS]->(stage)<-[:VALIDATES]-(evidence)
       RETURN theory, stage, evidence

Relationships are EXPLICIT, queryable, traversable


What is a Knowledge Graph?

Simple definition:
A knowledge graph is a network of connected facts, where both the facts AND the connections have meaning.

Structure:

  • Nodes = Entities (counties, researchers, theories, interventions)
  • Edges = Relationships (influences, validates, grounds, measures)
  • Properties = Attributes (scores, dates, values)

Example:

(County:Region {name: "Baraga County", population: 8,158})
  -[:HAS_OPPORTUNITY_SCORE {value: 0.58, year: 2024}]->
  (Score:Metric {component: "Connectivity"})
  
(Score)-[:MEASURED_BY]->(Dagg_Compass:Framework)
(Dagg_Compass)-[:OPERATIONALIZES]->(Sen_Theory:Capability_Approach)

You can ask:

  • “What’s Baraga County’s opportunity score?” (property lookup)
  • “What framework measured it?” (follow relationship)
  • “What theory does that framework operationalize?” (multi-hop traversal)

All in one query!


Why Graphs > Tables for Digital Equity

1. Natural Representation of Your Framework

Your pathway:

Opportunity → Aspiration → Growth Mindset → Digital Equity

In a knowledge graph:

(Opportunity:Stage {name: "Opportunity"})
  -[:ENABLES {mechanism: "Sen's conversion factors"}]->
(Aspiration:Stage {name: "Aspiration"})
  -[:ENABLES {mechanism: "Dweck's learning"}]->
(Growth_Mindset:Stage {name: "Growth Mindset"})
  -[:PRODUCES]->
(Digital_Equity:Outcome)

Query the pathway:

MATCH path = (start:Stage {name: "Opportunity"})
             -[:ENABLES*]->(end:Outcome)
RETURN path

Result: Visual representation of complete pathway with mechanisms


2. Connect Theory to Evidence to Implementation

Knowledge graph captures:

(Sen:Researcher {name: "Amartya Sen", year: 1999})
  -[:DEVELOPED]->(Capabilities:Theory)
  -[:GROUNDS]->(Opportunity:Stage)
  
(Hampton_Bauer:Research {year: 2020, doi: "10.25335/BZGY-3V91"})
  -[:VALIDATES]->(Opportunity:Stage)
  -[:FOUND {gap: "Missing infrastructure"}]->
  (Infrastructure_Gap:Finding)
  
(Dagg_Compass:Framework {year: 2023})
  -[:MEASURES]->(Opportunity:Stage)
  -[:COMPONENT {name: "Connectivity"}]->
  (Connectivity_Metrics:Measurement)
  
(Your_System:Implementation)
  -[:USES]->(Dagg_Compass)
  -[:STORES_IN]->(Knowledge_Graph)
  -[:PREDICTS_WITH]->(Bayesian_Network)

Query:

// Show complete lineage for Opportunity stage
MATCH (theory:Theory)-[:GROUNDS]->(stage:Stage {name: "Opportunity"}),
      (evidence:Research)-[:VALIDATES]->(stage),
      (measurement:Framework)-[:MEASURES]->(stage)
RETURN theory.name AS "Theoretical Foundation",
       evidence.doi AS "Empirical Validation",
       measurement.name AS "Measurement Tool"

Result:

Theoretical Foundation | Empirical Validation | Measurement Tool
Sen's Capabilities     | 10.25335/BZGY-3V91  | Dagg Compass Connectivity

This is research lineage in one query!


3. Multi-Hop Reasoning

Question: “Which Michigan researchers influenced your Aspiration stage?”

In tables: Would require multiple joins across researchers, institutions, theories, stages

In knowledge graph:

MATCH (researcher:Person)-[:AFFILIATED_WITH]->(institution:University {state: "Michigan"}),
      (researcher)-[:DEVELOPED|VALIDATED]->(work),
      (work)-[:GROUNDS|MEASURES]->(stage:Stage {name: "Aspiration"})
RETURN researcher.name, institution.name, work.type

Result:

Researcher          | Institution      | Contribution
Arjun Appadurai    | NYU (theory)     | Grounding theory
Keith Hampton      | MSU Quello       | Validation research
Pierrette Dagg     | Merit/UMich      | Measurement framework

One query, multiple relationship types, complete answer


4. Flexible Schema Evolution

Traditional database:

ALTER TABLE counties ADD COLUMN new_metric FLOAT;
-- Must update ALL rows, even if data not available
-- Schema change is painful

Knowledge graph:

// Just add new relationships when data available
MATCH (c:County {name: "Baraga County"})
CREATE (c)-[:HAS_NEW_METRIC {value: 0.72, source: "2025 Survey"}]->(:Metric)

// Other counties unchanged
// No schema migration needed

Your framework is EVOLVING—graphs adapt easily


How Knowledge Graphs Connect to Your Framework:

Researchers as Nodes with Relationships

// Sen's Capabilities Approach
CREATE (sen:Researcher {
  name: "Amartya Sen",
  institution: "Harvard University",
  nobel: 1998,
  field: "Economics"
})

CREATE (capabilities:Theory {
  name: "Capabilities Approach",
  year: 1999,
  key_insight: "Capability = Resources + Conversion factors"
})

CREATE (sen)-[:DEVELOPED {year: 1999}]->(capabilities)

CREATE (opportunity:Stage {name: "Opportunity"})
CREATE (capabilities)-[:GROUNDS]->(opportunity)

// Now queryable:
MATCH (r:Researcher)-[:DEVELOPED]->(t:Theory)-[:GROUNDS]->(s:Stage)
WHERE s.name = "Opportunity"
RETURN r.name, t.name
// Returns: "Amartya Sen", "Capabilities Approach"

Hampton & Bauer Research as Evidence Nodes

// Store research findings
CREATE (hb:Research {
  title: "Broadband and Student Performance Gaps",
  authors: ["Keith Hampton", "Johannes Bauer", "Luis Fernandez", "Craig Robertson"],
  year: 2020,
  institution: "MSU Quello Center",
  doi: "10.25335/BZGY-3V91"
})

// Three gaps found
CREATE (gap1:Finding {
  type: "Infrastructure gap",
  description: "Missing infrastructure prevents access"
})
CREATE (gap2:Finding {
  type: "Value gap",
  description: "Unclear value prevents engagement"
})
CREATE (gap3:Finding {
  type: "Skill gap",
  description: "Insufficient skills prevent achievement"
})

CREATE (hb)-[:FOUND]->(gap1)
CREATE (hb)-[:FOUND]->(gap2)
CREATE (hb)-[:FOUND]->(gap3)

// Link to your framework stages
CREATE (gap1)-[:VALIDATES]->(opportunity:Stage {name: "Opportunity"})
CREATE (gap2)-[:VALIDATES]->(aspiration:Stage {name: "Aspiration"})
CREATE (gap3)-[:VALIDATES]->(mindset:Stage {name: "Growth Mindset"})

// Query: Which findings validate which stages?
MATCH (r:Research)-[:FOUND]->(f:Finding)-[:VALIDATES]->(s:Stage)
RETURN s.name, collect(f.type) AS validating_evidence

Dagg Compass as Measurement Framework

// Digital Opportunities Compass framework
CREATE (compass:Framework {
  name: "Digital Opportunities Compass",
  year: 2023,
  institution: "MSU Quello Center",
  purpose: "BEAD/DEA implementation guidance"
})

// Six components
CREATE (contexts:Component {name: "Contexts", measures: "Demographics, economics"})
CREATE (governance:Component {name: "Governance", measures: "Policy, funding"})
CREATE (connectivity:Component {name: "Connectivity", measures: "Infrastructure, adoption"})
CREATE (skills:Component {name: "Skills", measures: "Literacy, training"})
CREATE (application:Component {name: "Application", measures: "Use cases, relevance"})
CREATE (outcomes:Component {name: "Outcomes", measures: "Achievement, inclusion"})

CREATE (compass)-[:HAS_COMPONENT]->(contexts)
CREATE (compass)-[:HAS_COMPONENT]->(governance)
CREATE (compass)-[:HAS_COMPONENT]->(connectivity)
CREATE (compass)-[:HAS_COMPONENT]->(skills)
CREATE (compass)-[:HAS_COMPONENT]->(application)
CREATE (compass)-[:HAS_COMPONENT]->(outcomes)

// Link components to framework stages
CREATE (connectivity)-[:MEASURES]->(opportunity:Stage {name: "Opportunity"})
CREATE (application)-[:MEASURES]->(aspiration:Stage {name: "Aspiration"})
CREATE (skills)-[:MEASURES]->(mindset:Stage {name: "Growth Mindset"})
CREATE (outcomes)-[:MEASURES]->(equity:Outcome {name: "Digital Equity"})

// Query: How do we measure each stage?
MATCH (component:Component)-[:MEASURES]->(stage:Stage)
RETURN stage.name, component.name, component.measures

Counties with Compass Scores

// Store county assessment data
CREATE (baraga:County {
  name: "Baraga County",
  state: "Michigan",
  region: "Upper Peninsula",
  population: 8158,
  rural: true
})

// Compass scores
CREATE (baraga)-[:HAS_SCORE {
  component: "Connectivity",
  value: 0.58,
  year: 2024,
  source: "FCC + ACS data"
}]->(connectivity:Component)

CREATE (baraga)-[:HAS_SCORE {
  component: "Application",
  value: 0.42,
  year: 2024,
  source: "Community survey"
}]->(application:Component)

CREATE (baraga)-[:HAS_SCORE {
  component: "Skills",
  value: 0.38,
  year: 2024,
  source: "Digital literacy assessment"
}]->(skills:Component)

// Link to predicted outcome
CREATE (baraga)-[:PREDICTED_EQUITY {
  value: 0.31,
  confidence: 0.85,
  method: "Bayesian network",
  date: "2024-11-07"
}]->(equity:Outcome)

// Query: Show complete county profile
MATCH (c:County {name: "Baraga County"})-[score:HAS_SCORE]->(comp:Component),
      (c)-[pred:PREDICTED_EQUITY]->(outcome:Outcome)
RETURN c.name, comp.name, score.value, pred.value

Real Cypher Queries for Policy Questions:

Query 1: Which counties need navigators most?

// Find counties with good infrastructure but low aspiration
MATCH (c:County)-[conn:HAS_SCORE]->(connectivity:Component {name: "Connectivity"}),
      (c)-[app:HAS_SCORE]->(application:Component {name: "Application"})
WHERE conn.value > 0.6 AND app.value < 0.5
RETURN c.name, 
       conn.value AS infrastructure, 
       app.value AS aspiration,
       (conn.value - app.value) AS opportunity_gap
ORDER BY opportunity_gap DESC
LIMIT 10

Result: Counties where infrastructure exists but aspiration capacity is low
Policy implication: These counties are PRIME candidates for digital navigator programs
Theory: Appadurai’s navigation capacity gap + Toyama’s amplification principle


Query 2: Show research lineage for a specific finding

// Trace "unclear value" finding back to theory
MATCH path = (researcher:Researcher)-[:DEVELOPED]->(theory:Theory)
             -[:GROUNDS]->(stage:Stage)<-[:VALIDATES]-(research:Research)
             -[:FOUND]->(finding:Finding {type: "Value gap"})
             <-[:MEASURED_BY]-(framework:Framework)
RETURN path

Result: Visual graph showing:

Appadurai → Capacity to Aspire → Aspiration Stage 
  ↑ (validated by)
Hampton & Bauer → "Value gap" finding
  ↓ (measured by)
Dagg Compass → Application component

This is your theoretical foundation visualized!


Query 3: Compare similar counties to predict outcomes

// Find counties similar to target county
MATCH (target:County {name: "Ontonagon County"})-[t_scores:HAS_SCORE]->(comp:Component)
MATCH (similar:County)-[s_scores:HAS_SCORE]->(comp)
WHERE similar.name <> target.name
  AND similar.rural = target.rural
  AND abs(t_scores.value - s_scores.value) < 0.15  // Similar scores
WITH similar, count(*) AS matching_components
WHERE matching_components >= 4  // Similar on most dimensions

// Show their interventions and outcomes
MATCH (similar)-[:RECEIVED_INTERVENTION]->(intervention:Program)
MATCH (similar)-[:ACHIEVED_OUTCOME {year: intervention.year + 2}]->(outcome)
RETURN similar.name, 
       intervention.type, 
       intervention.budget,
       outcome.gini_change AS equity_improvement
ORDER BY equity_improvement DESC

Result: Learn from similar counties’ experiences
Policy implication: “These counties had similar profiles and succeeded with this intervention mix”


Query 4: Intervention impact analysis

// Which interventions improved which stages most?
MATCH (county:County)-[:RECEIVED_INTERVENTION]->(intervention:Program),
      (county)-[:HAS_SCORE {year: intervention.year}]->(comp_before),
      (county)-[:HAS_SCORE {year: intervention.year + 1}]->(comp_after)
WHERE comp_before = comp_after  // Same component, different years
WITH intervention.type AS intervention_type,
     comp_before.name AS component,
     avg(comp_after.value - comp_before.value) AS avg_improvement
RETURN intervention_type, component, avg_improvement
ORDER BY avg_improvement DESC

Result:

Intervention Type     | Component      | Avg Improvement
Digital Navigators    | Application    | +0.24
Digital Navigators    | Skills         | +0.18
Infrastructure        | Connectivity   | +0.31
Infrastructure        | Application    | +0.06
Training Programs     | Skills         | +0.22
Training Programs     | Application    | +0.08

Policy implication:

  • Infrastructure → best for Connectivity (obvious)
  • Navigators → best for Application (Appadurai’s navigation capacity!)
  • Training → best for Skills (Dweck’s growth mindset development)

Query 5: Find gaps between theory and practice

// Theory predicts certain relationships—do they hold empirically?
MATCH (theory:Theory)-[:PREDICTS]->(relationship)
MATCH (empirical:Research)-[:FOUND]->(finding)
WHERE NOT (finding)-[:CONFIRMS]->(relationship)
RETURN theory.name, 
       relationship.description AS predicted,
       finding.description AS actual_finding,
       "DISCREPANCY" AS status

Result: Identifies where theory and evidence diverge
Use case: Refine your Bayesian network based on Michigan data


Integration with Bayesian Networks:

Knowledge graph FEEDS Bayesian networks

// Query county data for Bayesian network input
MATCH (c:County {name: "Baraga County"})-[scores:HAS_SCORE]->(comp:Component)
RETURN comp.name AS component, scores.value AS value

// Python code receives this data:
# Use knowledge graph query results as Bayesian evidence
from neo4j import GraphDatabase
from pgmpy.inference import VariableElimination

# Query graph
with driver.session() as session:
    result = session.run("""
        MATCH (c:County {name: $county})-[s:HAS_SCORE]->(comp:Component)
        RETURN comp.name, s.value
    """, county="Baraga County")
    
    evidence = {row["comp.name"]: row["s.value"] for row in result}

# Feed to Bayesian network
inference = VariableElimination(bayesian_model)
prediction = inference.query(['Equity'], evidence=evidence)

# Store prediction back in graph
session.run("""
    MATCH (c:County {name: $county})
    CREATE (c)-[:PREDICTED_EQUITY {
        value: $pred_value,
        confidence: $confidence,
        date: date()
    }]->(outcome:Outcome)
""", county="Baraga County", pred_value=prediction.values[1], confidence=0.85)

Complete loop: Graph → Bayesian → Prediction → Back to Graph


Integration with GraphRAG:

Knowledge graph provides CONTEXT for AI responses

Without knowledge graph (regular LLM):

User: "Should we invest in navigators for Baraga County?"

LLM: "Digital navigators can help improve digital literacy..." 
     (Generic answer, no county-specific data, possible hallucination)

With knowledge graph (GraphRAG):

User: "Should we invest in navigators for Baraga County?"

GraphRAG:
  1. Queries knowledge graph for Baraga County data
  2. Retrieves: Connectivity=0.58, Application=0.42, Skills=0.38
  3. Retrieves: Appadurai theory (navigation capacity)
  4. Retrieves: Hampton & Bauer finding ("unclear value" gap)
  5. Retrieves: Similar county outcomes with navigator programs

LLM response (grounded in retrieved data):
  "Yes. Baraga County has moderate infrastructure (0.58) but LOW 
   aspiration capacity (0.42). This is exactly the 'unclear value' 
   gap Hampton & Bauer identified. Appadurai's theory says navigation 
   capacity building addresses this. Similar rural Michigan counties 
   (Ontonagon, Keweenaw) improved Application scores by +0.24 with 
   $2M navigator programs. Recommend 2-year navigator program focused 
   on showing relevant digital pathways."

Evidence-grounded, theory-informed, data-specific answer!

Read more: TrainingCompassGraphRAG.md


Graph Algorithms for Policy Insights:

Community Detection: Find Similar Counties

// Use graph algorithm to cluster similar counties
CALL gds.louvain.stream({
  nodeProjection: 'County',
  relationshipProjection: {
    SIMILAR_SCORES: {
      type: 'HAS_SCORE',
      properties: 'value'
    }
  }
})
YIELD nodeId, communityId
RETURN gds.util.asNode(nodeId).name AS county, communityId
ORDER BY communityId

Result: Counties clustered by similar Compass profiles
Use case: Targeted interventions for county clusters, peer learning networks


Centrality: Which researchers are most influential?

// PageRank to find most-cited/connected researchers
CALL gds.pageRank.stream({
  nodeProjection: ['Researcher', 'Theory', 'Research'],
  relationshipProjection: ['DEVELOPED', 'CITES', 'BUILDS_ON']
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS entity, score
ORDER BY score DESC
LIMIT 10

Result: Sen, Appadurai, Dweck likely score highest
Use case: Prioritize which theories to emphasize in training materials


Shortest Path: Connect Theory to Practice

// Shortest path from Sen's theory to your county intervention
MATCH path = shortestPath(
  (sen:Theory {name: "Capabilities Approach"})-[*]->
  (intervention:Program {county: "Baraga County", type: "Navigator"})
)
RETURN path

Result:

Sen's Capabilities → Opportunity Stage → Connectivity Component 
  → Baraga County Assessment → Navigator Intervention

Use case: Show stakeholders the complete lineage from theory to action


Why Neo4j (or Similar) for Your System:

Neo4j advantages:

  1. Native graph storage (not graph layer on SQL)
  2. Cypher query language (intuitive, readable)
  3. Graph algorithms library (clustering, centrality, pathfinding)
  4. Scalability (billions of nodes/relationships)
  5. ACID transactions (data integrity for policy decisions)
  6. Visualization tools (Neo4j Browser, Bloom)

Alternative graph databases:

  • Amazon Neptune: AWS-managed, good for cloud deployments
  • ArangoDB: Multi-model (graph + document)
  • TigerGraph: High-performance analytics

For Michigan digital equity: Neo4j recommended (mature, well-documented, strong community)


Data Model for Your System:

Node Types:

// Entities
:County (name, population, region, rural/urban)
:Researcher (name, institution, field)
:Theory (name, year, key_insight)
:Research (title, year, doi, authors)
:Framework (name, year, purpose)
:Component (name, measures)
:Stage (name, description)
:Outcome (name, type)
:Intervention (type, budget, year, duration)
:Finding (type, description)

// Metrics
:Score (component, value, year, source)
:Prediction (value, confidence, method, date)

Relationship Types:

// Theory relationships
(:Researcher)-[:DEVELOPED]->(:Theory)
(:Theory)-[:GROUNDS]->(:Stage)
(:Theory)-[:BUILDS_ON]->(:Theory)

// Evidence relationships
(:Research)-[:VALIDATES]->(:Stage)
(:Research)-[:FOUND]->(:Finding)
(:Finding)-[:SUPPORTS]->(:Theory)

// Measurement relationships
(:Framework)-[:HAS_COMPONENT]->(:Component)
(:Component)-[:MEASURES]->(:Stage)
(:County)-[:HAS_SCORE]->(:Component)

// Intervention relationships
(:County)-[:RECEIVED_INTERVENTION]->(:Intervention)
(:Intervention)-[:TARGETS]->(:Stage)
(:County)-[:ACHIEVED_OUTCOME]->(:Outcome)

// Predictions
(:County)-[:PREDICTED_EQUITY]->(:Outcome)
(:Intervention)-[:EXPECTED_TO_IMPROVE]->(:Stage)

Building the Graph: Python Integration

from neo4j import GraphDatabase

class DigitalEquityGraph:
    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))
    
    def add_county_assessment(self, county_name, compass_scores):
        """Store county Compass scores in graph"""
        with self.driver.session() as session:
            session.run("""
                MERGE (c:County {name: $county})
                WITH c
                UNWIND $scores AS score
                MATCH (comp:Component {name: score.component})
                MERGE (c)-[:HAS_SCORE {
                    value: score.value,
                    year: score.year,
                    source: score.source
                }]->(comp)
            """, county=county_name, scores=compass_scores)
    
    def find_similar_counties(self, target_county, threshold=0.15):
        """Find counties with similar Compass profiles"""
        with self.driver.session() as session:
            result = session.run("""
                MATCH (target:County {name: $target})-[t:HAS_SCORE]->(comp:Component)
                MATCH (other:County)-[o:HAS_SCORE]->(comp)
                WHERE other.name <> target.name
                  AND abs(t.value - o.value) < $threshold
                WITH other, count(*) AS matching_dims
                WHERE matching_dims >= 4
                RETURN other.name AS county, matching_dims
                ORDER BY matching_dims DESC
            """, target=target_county, threshold=threshold)
            
            return [record["county"] for record in result]
    
    def get_intervention_evidence(self, intervention_type, component):
        """Find evidence for intervention effectiveness"""
        with self.driver.session() as session:
            result = session.run("""
                MATCH (county:County)-[:RECEIVED_INTERVENTION]->(i:Intervention {type: $int_type})
                MATCH (county)-[before:HAS_SCORE {year: i.year}]->(c:Component {name: $comp})
                MATCH (county)-[after:HAS_SCORE {year: i.year + 1}]->(c)
                RETURN county.name AS county,
                       i.budget AS budget,
                       (after.value - before.value) AS improvement
                ORDER BY improvement DESC
            """, int_type=intervention_type, comp=component)
            
            return [dict(record) for record in result]
    
    def store_prediction(self, county, equity_prediction, confidence):
        """Store Bayesian network prediction"""
        with self.driver.session() as session:
            session.run("""
                MATCH (c:County {name: $county})
                CREATE (c)-[:PREDICTED_EQUITY {
                    value: $prediction,
                    confidence: $confidence,
                    method: 'Bayesian network',
                    timestamp: datetime()
                }]->(o:Outcome {type: 'Digital Equity'})
            """, county=county, prediction=equity_prediction, confidence=confidence)

# Usage
graph = DigitalEquityGraph("bolt://localhost:7687", "neo4j", "password")

# Add assessment
graph.add_county_assessment("Baraga County", [
    {"component": "Connectivity", "value": 0.58, "year": 2024, "source": "FCC"},
    {"component": "Application", "value": 0.42, "year": 2024, "source": "Survey"},
    {"component": "Skills", "value": 0.38, "year": 2024, "source": "Assessment"}
])

# Find similar counties
similar = graph.find_similar_counties("Baraga County")
print(f"Similar counties: {similar}")

# Get intervention evidence
evidence = graph.get_intervention_evidence("Digital Navigator", "Application")
print(f"Navigator effectiveness: {evidence}")

# Store prediction
graph.store_prediction("Baraga County", 0.31, 0.85)

Visualization: Making the Graph Understandable

Neo4j Browser visualization:

// Show complete framework
MATCH path = (r:Researcher)-[:DEVELOPED]->(t:Theory)-[:GROUNDS]->(s:Stage)
             <-[:VALIDATES]-(research:Research)
RETURN path

Result: Visual network showing theory → evidence connections

For stakeholders: Export to D3.js, Gephi, or Tableau for presentations


Common Questions:

Q: “Why not just use SQL?”

A: SQL requires JOIN hell for relationship queries

SQL version:

-- "Which theories ground stages validated by Michigan research?"
SELECT t.name, s.name, r.title
FROM researchers res
JOIN theories t ON res.id = t.researcher_id
JOIN theory_stage_mapping tsm ON t.id = tsm.theory_id
JOIN stages s ON tsm.stage_id = s.id
JOIN research_validation rv ON s.id = rv.stage_id
JOIN research r ON rv.research_id = r.id
WHERE r.state = 'Michigan'

Graph version:

MATCH (r:Researcher {state: "Michigan"})-[:DEVELOPED]->(t:Theory)
      -[:GROUNDS]->(s:Stage)<-[:VALIDATES]-(research:Research)
WHERE research.state = "Michigan"
RETURN t.name, s.name, research.title

Readable, intuitive, fast


Q: “How do we keep the graph updated?”

A: ETL pipelines from data sources

# Scheduled job (e.g., daily)
def update_county_scores():
    # Pull latest FCC data
    fcc_data = fetch_fcc_broadband_data()
    
    # Pull latest ACS demographics
    acs_data = fetch_census_acs_data()
    
    # Calculate Compass scores
    compass_scores = calculate_dagg_compass(fcc_data, acs_data)
    
    # Update graph
    for county, scores in compass_scores.items():
        graph.add_county_assessment(county, scores)
    
    # Trigger Bayesian predictions
    update_predictions()

Q: “How big can this graph get?”

A: Very big. Neo4j handles:

  • Billions of nodes
  • Trillions of relationships
  • Sub-second query times with proper indexing

Your scale:

  • 83 Michigan counties
  • ~10 researchers
  • ~20 theories/frameworks
  • ~100 research papers
  • Thousands of interventions over time

Total: Tens of thousands of nodes, hundreds of thousands of relationships
Neo4j: No problem


Bottom Line:

Knowledge graphs are the data foundation of your digital equity intelligence system.

What they provide: ✅ Natural representation of relationships (theory, evidence, practice)
✅ Flexible schema (evolve as framework grows)
✅ Complex queries answered simply (Cypher is readable)
✅ Multi-hop reasoning (connect distant concepts)
✅ Integration hub (feeds Bayesian, GraphRAG, visualization)
✅ Audit trail (trace from theory to policy decision)

Why they’re perfect for digital equity:

  • Research is about RELATIONSHIPS (theories ground stages, evidence validates theories)
  • Framework EVOLVES (new research, new metrics, graph adapts)
  • Policy questions are MULTI-HOP (“Which theory → evidence → measurement → action?”)
  • Need EXPLAINABILITY (show reasoning path to stakeholders)

The complete system:

Data sources (FCC, ACS, surveys)
  ↓
Knowledge Graph (stores + relates)
  ↓
Bayesian Network (predicts with graph data)
  ↓
GraphRAG (answers questions with graph context)
  ↓
Policy decisions (evidence-grounded, explainable)

This is how you build institutional memory and organizational intelligence.


Next steps:

  • TrainingCompassBayesian.md - How predictions work
  • TrainingCompassGraphRAG.md - How AI uses the graph
  • TrainingCompassMetrics.md - What data goes in the graph

Version: 1.0
Last Updated: November 2025
Part of: Project Compass (Merit Network) - Digital Opportunities Intelligence Network (DOIN) • Working draft