osint-methodologies

OSINT Methodologies Skill

Safety Notice

This listing is imported from skills.sh public index metadata. Review upstream SKILL.md and repository scripts before running.

Copy this and send it to your AI assistant to learn

Install skill "osint-methodologies" with this command: npx skills add hack23/riksdagsmonitor/hack23-riksdagsmonitor-osint-methodologies

OSINT Methodologies Skill

Purpose

This skill provides comprehensive Open-Source Intelligence (OSINT) collection methodologies specifically tailored for the Riksdagsmonitor (Riksdagsmonitor) platform's mission of democratic transparency and political accountability in Sweden. It ensures ethical, legal, and effective intelligence gathering from public sources while maintaining GDPR compliance and respecting privacy principles.

When to Use This Skill

Apply this skill when:

  • ✅ Integrating new public data sources (government APIs, open datasets)

  • ✅ Evaluating data source reliability and completeness

  • ✅ Implementing data collection pipelines for political intelligence

  • ✅ Verifying information from multiple independent sources

  • ✅ Conducting source triangulation for accuracy validation

  • ✅ Designing OSINT collection strategies for new intelligence requirements

  • ✅ Assessing data quality and establishing quality metrics

  • ✅ Creating automated data harvesting and validation workflows

Do NOT use for:

  • ❌ Collecting non-public or classified information

  • ❌ Social media scraping without clear public interest justification

  • ❌ Personal data collection beyond public official capacity

  • ❌ Intelligence gathering that violates GDPR or Swedish privacy laws

OSINT Collection Framework

Four Primary Sources Architecture

The Riksdagsmonitor platform integrates four high-integrity OSINT sources to create comprehensive political intelligence:

graph TB subgraph "OSINT Sources" A1[📡 Riksdagen API<br/>98.5% completeness<br/>3.5M+ votes, 89K documents] A2[📊 Election Authority<br/>99.2% completeness<br/>1970-present electoral data] A3[💰 World Bank<br/>94.1% completeness<br/>598K indicators, 211 countries] A4[🏛️ Financial Authority ESV<br/>97.8% completeness<br/>Monthly updates] end

subgraph "Collection Layer"
    A1 --> B[Data Ingestion Pipeline]
    A2 --> B
    A3 --> B
    A4 --> B
end

subgraph "Quality Assurance"
    B --> C{Source Verification}
    C --> D[Completeness Check]
    C --> E[Consistency Validation]
    C --> F[Temporal Continuity]
    C --> G[Cross-Reference Check]
end

subgraph "Data Integration"
    D &#x26; E &#x26; F &#x26; G --> H[(Integrated Dataset)]
    H --> I[Entity Resolution]
    I --> J[Data Enrichment]
end

subgraph "Quality Metrics"
    J --> K[📊 Quality Dashboard]
    K --> L[Completeness: 97.4%]
    K --> M[Accuracy: 99.1%]
    K --> N[Timeliness: 24h lag]
end

style A1 fill:#e1f5ff
style A2 fill:#e1f5ff
style A3 fill:#e1f5ff
style A4 fill:#e1f5ff
style C fill:#ffeb99
style H fill:#ccffcc
style K fill:#ffe6cc

Source Evaluation Matrix

  1. Riksdagen API (Swedish Parliament)

Source Profile:

  • URL: https://data.riksdagen.se/

  • Authority: Swedish Parliament (Legislative Branch)

  • Classification: Primary Source, Government Official

  • Completeness: 98.5%

  • Update Frequency: Daily (real-time for votes)

  • Historical Coverage: 1971-present (54 years)

  • Data Volume: 3.5M+ votes, 89K documents, 2.5K politicians

Quality Metrics:

@Component public class RiksdagenSourceQualityMetrics {

public SourceQualityReport evaluateRiksdagenAPI() {
    return SourceQualityReport.builder()
        .sourceName("Riksdagen API")
        .completeness(98.5)  // Measured against expected records
        .accuracy(99.8)       // Verified via manual audit sampling
        .timeliness(1.0)      // Hours lag for new data
        .reliability(5)       // 1-5 scale, 5 = highest
        .trustLevel("PRIMARY_SOURCE")
        .legalBasis("Public Records Act (Offentlighetsprincipen)")
        .gdprCompliance(true)
        .updateFrequency("DAILY")
        .build();
}

public List&#x3C;DataGap> identifyDataGaps() {
    // Example: Check for missing voting records
    String sql = """
        SELECT 
            COUNT(*) as expected_votes,
            COUNT(DISTINCT ballot_id) as actual_votes,
            COUNT(*) - COUNT(DISTINCT ballot_id) as missing_votes
        FROM view_riksdagen_vote_daily_summary
        WHERE created_date >= CURRENT_DATE - INTERVAL '30 days'
    """;
    
    // Calculate completeness percentage
    return jdbcTemplate.query(sql, (rs, rowNum) -> 
        DataGap.builder()
            .expectedRecords(rs.getInt("expected_votes"))
            .actualRecords(rs.getInt("actual_votes"))
            .missingRecords(rs.getInt("missing_votes"))
            .completeness((double) rs.getInt("actual_votes") / 
                          rs.getInt("expected_votes") * 100)
            .build()
    );
}

}

Data Categories:

Voting Records (vote_data table)

  • Individual votes on ballots

  • Party discipline metrics

  • Coalition voting patterns

  • Temporal voting trends

Documents (document_element , document_status_container )

  • Legislative proposals (motioner)

  • Government bills (propositioner)

  • Committee reports (utskottsbetänkanden)

  • Parliamentary questions (interpellationer)

Politicians (person_data , assignment_data )

  • Parliamentary roles and assignments

  • Committee memberships

  • Party affiliations

  • Career trajectories

Parliamentary Activities

  • Debate participation

  • Committee work

  • Government questions

  • Speaker time

OSINT Techniques:

-- Source Triangulation: Verify politician voting record completeness WITH expected_votes AS ( SELECT person_id, COUNT(DISTINCT ballot_id) as riksdagen_vote_count FROM vote_data WHERE vote_date >= '2022-09-11' -- Current parliament GROUP BY person_id ), document_activity AS ( SELECT person_id, COUNT(DISTINCT document_id) as document_count FROM document_person_reference_data WHERE document_type IN ('mot', 'ip') -- Motions & Questions GROUP BY person_id ) SELECT p.first_name || ' ' || p.last_name as politician_name, p.party as party_affiliation, ev.riksdagen_vote_count, da.document_count, CASE WHEN ev.riksdagen_vote_count > 500 THEN 'ACTIVE' WHEN ev.riksdagen_vote_count BETWEEN 100 AND 500 THEN 'MODERATE' WHEN ev.riksdagen_vote_count < 100 THEN 'INACTIVE' END as engagement_level FROM person_data p LEFT JOIN expected_votes ev ON p.person_id = ev.person_id LEFT JOIN document_activity da ON p.person_id = da.person_id WHERE p.status = 'Tjänstgörande riksdagsledamot' -- Active MPs ORDER BY ev.riksdagen_vote_count DESC;

  1. Election Authority (Valmyndigheten)

Source Profile:

  • URL: https://www.val.se/

  • Authority: Swedish Election Authority (Independent)

  • Classification: Primary Source, Government Official

  • Completeness: 99.2%

  • Update Frequency: Post-election (every 4 years + special elections)

  • Historical Coverage: 1970-present (14 elections)

  • Data Volume: 40 parties, national & regional results

Quality Metrics:

@Component public class ElectionAuthorityQualityMetrics {

public SourceQualityReport evaluateElectionAuthority() {
    return SourceQualityReport.builder()
        .sourceName("Valmyndigheten")
        .completeness(99.2)
        .accuracy(99.9)  // Official certified results
        .timeliness(168.0)  // Hours post-election for final results
        .reliability(5)
        .trustLevel("PRIMARY_SOURCE")
        .legalBasis("Election Act (Vallagen)")
        .gdprCompliance(true)
        .updateFrequency("ELECTION_CYCLE")
        .build();
}

public List&#x3C;ElectionDataValidation> validateElectionData() {
    // Cross-reference with Riksdagen seat assignments
    String sql = """
        SELECT 
            e.party_name,
            e.parliament_seats as election_authority_seats,
            COUNT(DISTINCT a.person_id) as riksdagen_active_mps,
            e.parliament_seats - COUNT(DISTINCT a.person_id) as seat_discrepancy
        FROM sweden_political_party e
        LEFT JOIN assignment_data a 
            ON e.party_id = a.party 
            AND a.role_code = 'Riksdagsledamot'
            AND a.from_date &#x3C;= CURRENT_DATE
            AND (a.to_date IS NULL OR a.to_date >= CURRENT_DATE)
        WHERE e.parliament_seats > 0
        GROUP BY e.party_name, e.parliament_seats
        HAVING e.parliament_seats != COUNT(DISTINCT a.person_id)
    """;
    
    return jdbcTemplate.query(sql, (rs, rowNum) -> 
        ElectionDataValidation.builder()
            .partyName(rs.getString("party_name"))
            .expectedSeats(rs.getInt("election_authority_seats"))
            .actualSeats(rs.getInt("riksdagen_active_mps"))
            .discrepancy(rs.getInt("seat_discrepancy"))
            .build()
    );
}

}

OSINT Techniques:

-- Historical Election Trend Analysis SELECT party_name, election_year, percentage, parliament_seats, LAG(percentage, 1) OVER (PARTITION BY party_name ORDER BY election_year) as previous_percentage, percentage - LAG(percentage, 1) OVER (PARTITION BY party_name ORDER BY election_year) as vote_swing, LAG(parliament_seats, 1) OVER (PARTITION BY party_name ORDER BY election_year) as previous_seats, parliament_seats - LAG(parliament_seats, 1) OVER (PARTITION BY party_name ORDER BY election_year) as seat_change FROM sweden_political_party WHERE parliament_seats IS NOT NULL ORDER BY party_name, election_year DESC;

  1. World Bank Open Data

Source Profile:

  • URL: https://data.worldbank.org/

  • Authority: World Bank (International Organization)

  • Classification: Secondary Source, International Authority

  • Completeness: 94.1%

  • Update Frequency: Quarterly

  • Historical Coverage: 1960-present (64 years)

  • Data Volume: 598K indicators, 211 countries

Quality Metrics:

@Component public class WorldBankQualityMetrics {

public SourceQualityReport evaluateWorldBankAPI() {
    return SourceQualityReport.builder()
        .sourceName("World Bank Open Data")
        .completeness(94.1)
        .accuracy(97.5)
        .timeliness(2160.0)  // 90 days lag (quarterly updates)
        .reliability(4)  // High but not primary source
        .trustLevel("SECONDARY_SOURCE")
        .legalBasis("Open Data License")
        .gdprCompliance(true)  // Aggregate data only
        .updateFrequency("QUARTERLY")
        .build();
}

}

Use Cases:

  • Economic context for political decisions

  • International comparison benchmarks

  • Policy effectiveness measurement

  • Correlation analysis (GDP vs. government spending)

  1. Financial Authority (ESV)

Source Profile:

  • URL: https://www.esv.se/

  • Authority: Swedish Financial Management Authority

  • Classification: Primary Source, Government Official

  • Completeness: 97.8%

  • Update Frequency: Monthly

  • Historical Coverage: 1990-present

  • Data Volume: Agency financial data

OSINT Collection Checklist

Pre-Collection Assessment

Legal & Ethical Review:

✅ Data is publicly available under Swedish Public Access Principle (Offentlighetsprincipen) ✅ Collection method complies with GDPR Article 6(1)(e) - public interest ✅ Data processing aligns with purpose limitation (Article 5(1)(b)) ✅ Data retention periods defined and documented ✅ Privacy impact assessment completed for personal data ✅ Source attribution and license compliance verified

Source Reliability Assessment:

public enum SourceReliability { A("COMPLETELY_RELIABLE", "Primary government source, official records"), B("USUALLY_RELIABLE", "Established media, verified aggregators"), C("FAIRLY_RELIABLE", "Secondary sources with known limitations"), D("NOT_USUALLY_RELIABLE", "Unverified sources, requires corroboration"), E("UNRELIABLE", "Known bias, propaganda, unverified claims"), F("RELIABILITY_CANNOT_BE_JUDGED", "New source, insufficient track record");

private final String description;
private final String criteria;

}

public enum InformationCredibility { ONE("CONFIRMED", "Verified by multiple independent sources"), TWO("PROBABLY_TRUE", "Corroborated by additional source"), THREE("POSSIBLY_TRUE", "Not confirmed but consistent with known facts"), FOUR("DOUBTFUL", "Contradicts other information"), FIVE("IMPROBABLE", "Contradicts established facts"), SIX("TRUTH_CANNOT_BE_JUDGED", "Insufficient information to evaluate");

private final String description;
private final String criteria;

}

Data Collection Workflow

  1. Source Identification

class OSINTSource: def init(self, name, url, api_endpoint=None): self.name = name self.url = url self.api_endpoint = api_endpoint self.reliability = None # A-F rating self.credibility = None # 1-6 rating self.completeness = None self.last_updated = None

def evaluate_reliability(self):
    """NATO Admiralty Code reliability assessment"""
    if self.is_government_official_source():
        self.reliability = SourceReliability.A
    elif self.is_verified_aggregator():
        self.reliability = SourceReliability.B
    # ... additional logic
    
def verify_completeness(self, expected_records, actual_records):
    """Calculate data completeness percentage"""
    self.completeness = (actual_records / expected_records) * 100
    return self.completeness

def assess_timeliness(self):
    """Calculate data lag from source update to platform availability"""
    time_lag = datetime.now() - self.last_updated
    return time_lag.total_seconds() / 3600  # Hours

2. Data Harvesting

@Service public class OSINTDataCollector {

@Scheduled(cron = "0 0 2 * * *")  // Daily at 02:00
public void collectRiksdagenData() {
    try {
        // 1. Fetch new data
        List&#x3C;VoteData> newVotes = riksdagenClient.getVotesSince(lastCollectionTimestamp);
        
        // 2. Validate data quality
        DataQualityReport report = validateDataQuality(newVotes);
        if (report.getQualityScore() &#x3C; 95.0) {
            alertService.sendAlert("Low quality data detected", report);
        }
        
        // 3. Deduplicate
        List&#x3C;VoteData> uniqueVotes = deduplicateRecords(newVotes);
        
        // 4. Store with audit trail
        voteRepository.saveAll(uniqueVotes);
        
        // 5. Update collection metadata
        updateCollectionMetrics(uniqueVotes.size(), report);
        
        log.info("Collected {} new votes with {}% quality score", 
                 uniqueVotes.size(), report.getQualityScore());
        
    } catch (Exception e) {
        log.error("Data collection failed", e);
        alertService.sendAlert("OSINT collection failure", e);
    }
}

private DataQualityReport validateDataQuality(List&#x3C;VoteData> data) {
    DataQualityReport report = new DataQualityReport();
    
    // Completeness check
    long missingFields = data.stream()
        .filter(v -> v.getPersonId() == null || v.getBallotId() == null)
        .count();
    report.setCompleteness(100.0 - (missingFields * 100.0 / data.size()));
    
    // Consistency check
    long invalidVotes = data.stream()
        .filter(v -> !Arrays.asList("Ja", "Nej", "Frånvarande", "Avstår").contains(v.getVote()))
        .count();
    report.setConsistency(100.0 - (invalidVotes * 100.0 / data.size()));
    
    // Temporal validity
    long futureVotes = data.stream()
        .filter(v -> v.getVoteDate().isAfter(LocalDate.now()))
        .count();
    report.setTemporalValidity(100.0 - (futureVotes * 100.0 / data.size()));
    
    report.calculateOverallScore();
    return report;
}

}

  1. Source Triangulation

-- Cross-Reference Verification: Validate politician data across sources WITH riksdagen_politicians AS ( SELECT person_id, first_name, last_name, party, status FROM person_data WHERE status = 'Tjänstgörande riksdagsledamot' ), riksdagen_seats AS ( SELECT party, COUNT(DISTINCT person_id) as active_mps FROM riksdagen_politicians GROUP BY party ), election_seats AS ( SELECT party_name, parliament_seats FROM sweden_political_party WHERE parliament_seats > 0 ) SELECT COALESCE(r.party, e.party_name) as party, r.active_mps as riksdagen_count, e.parliament_seats as election_authority_count, ABS(COALESCE(r.active_mps, 0) - COALESCE(e.parliament_seats, 0)) as discrepancy, CASE WHEN ABS(COALESCE(r.active_mps, 0) - COALESCE(e.parliament_seats, 0)) = 0 THEN '✅ VERIFIED' WHEN ABS(COALESCE(r.active_mps, 0) - COALESCE(e.parliament_seats, 0)) <= 2 THEN '⚠️ MINOR_DISCREPANCY' ELSE '🚨 MAJOR_DISCREPANCY' END as verification_status FROM riksdagen_seats r FULL OUTER JOIN election_seats e ON r.party = e.party_name ORDER BY discrepancy DESC;

Data Quality Metrics

Quality Dimensions

  1. Completeness

-- Calculate field completeness across all tables SELECT 'person_data' as table_name, COUNT() as total_records, SUM(CASE WHEN first_name IS NULL THEN 1 ELSE 0 END) as missing_first_name, SUM(CASE WHEN last_name IS NULL THEN 1 ELSE 0 END) as missing_last_name, SUM(CASE WHEN party IS NULL THEN 1 ELSE 0 END) as missing_party, ROUND(100.0 * (1.0 - ( SUM(CASE WHEN first_name IS NULL THEN 1 ELSE 0 END)::decimal + SUM(CASE WHEN last_name IS NULL THEN 1 ELSE 0 END) + SUM(CASE WHEN party IS NULL THEN 1 ELSE 0 END) ) / (COUNT() * 3)), 2) as completeness_percentage FROM person_data;

  1. Accuracy

@Component public class DataAccuracyVerifier {

public AccuracyReport verifyPoliticianData() {
    // Manual audit sampling: Verify 100 random politicians against Riksdagen website
    List&#x3C;PersonData> sample = personRepository.findRandomSample(100);
    
    int accurate = 0;
    int inaccurate = 0;
    
    for (PersonData person : sample) {
        // Fetch from official Riksdagen website for comparison
        OfficialPersonData official = riksdagenWebScraper.fetchPersonData(person.getPersonId());
        
        if (verifyMatch(person, official)) {
            accurate++;
        } else {
            inaccurate++;
            log.warn("Data mismatch for person_id: {}", person.getPersonId());
        }
    }
    
    return AccuracyReport.builder()
        .sampleSize(sample.size())
        .accurateRecords(accurate)
        .inaccurateRecords(inaccurate)
        .accuracyPercentage((double) accurate / sample.size() * 100)
        .build();
}

}

  1. Timeliness

-- Calculate data freshness lag SELECT 'vote_data' as data_type, MAX(created_date) as latest_record, CURRENT_TIMESTAMP as current_time, EXTRACT(EPOCH FROM (CURRENT_TIMESTAMP - MAX(created_date))) / 3600 as hours_lag, CASE WHEN EXTRACT(EPOCH FROM (CURRENT_TIMESTAMP - MAX(created_date))) / 3600 < 24 THEN '✅ CURRENT' WHEN EXTRACT(EPOCH FROM (CURRENT_TIMESTAMP - MAX(created_date))) / 3600 < 168 THEN '⚠️ RECENT' ELSE '🚨 STALE' END as freshness_status FROM vote_data;

  1. Consistency

-- Detect referential integrity violations SELECT 'Vote Data Integrity' as check_name, COUNT(DISTINCT v.ballot_id) as total_votes, COUNT(DISTINCT CASE WHEN p.person_id IS NULL THEN v.person_id END) as orphaned_person_refs, COUNT(DISTINCT CASE WHEN b.ballot_id IS NULL THEN v.ballot_id END) as orphaned_ballot_refs, ROUND(100.0 * (1.0 - ( COUNT(DISTINCT CASE WHEN p.person_id IS NULL THEN v.person_id END)::decimal + COUNT(DISTINCT CASE WHEN b.ballot_id IS NULL THEN v.ballot_id END) ) / (COUNT(DISTINCT v.ballot_id) * 2)), 2) as consistency_percentage FROM vote_data v LEFT JOIN person_data p ON v.person_id = p.person_id LEFT JOIN ballot_data b ON v.ballot_id = b.ballot_id;

ISMS Compliance Mapping

ISO 27001:2022 Controls

A.5.10 - Acceptable Use of Information

  • OSINT collection limited to public interest purposes

  • Data usage aligned with Swedish Public Access Principle

  • Personal data processing restricted to official capacity

A.5.33 - Protection of Records

  • Source attribution maintained for all collected data

  • Audit trails for data collection activities

  • Retention periods aligned with legal requirements

A.8.8 - Management of Technical Vulnerabilities

  • API endpoints monitored for availability and security

  • Rate limiting to prevent service disruption

  • Secure credential management for API access

NIST CSF 2.0 Functions

IDENTIFY (ID)

  • ID.AM-5: Resources prioritized based on classification and business value

  • Four OSINT sources classified by reliability and completeness

  • Data quality metrics tracked continuously

PROTECT (PR)

  • PR.DS-5: Protections against data leaks implemented

  • API keys stored in secure vaults (AWS Secrets Manager)

  • No sensitive credentials in code repositories

DETECT (DE)

  • DE.CM-1: Network monitored to detect potential cybersecurity events

  • API health checks every 5 minutes

  • Anomaly detection for unusual data patterns

CIS Controls v8.1

CIS Control 3: Data Protection

  • 3.3: Configure data access control lists

  • OSINT data classified by sensitivity

  • Access controls based on role-based permissions

CIS Control 12: Network Infrastructure Management

  • 12.4: Deny communication over unauthorized network protocols

  • HTTPS-only for all API communications

  • Certificate pinning for Riksdagen API

Hack23 ISMS Policy References

Data Classification Policy

Privacy Policy

Secure Development Policy

Third Party Management

References

Official Documentation:

CIA Platform Documentation:

  • Intelligence Data Flow: INTELLIGENCE_DATA_FLOW.md

  • Data Analysis Frameworks: DATA_ANALYSIS_INTOP_OSINT.md

  • Database View Catalog: DATABASE_VIEW_INTELLIGENCE_CATALOG.md

  • Risk Rules: RISK_RULES_INTOP_OSINT.md

Academic Sources:

  • NATO OSINT Reader: Intelligence Collection Taxonomy

  • Admiralty Code for Source Reliability Assessment

  • GDPR Article 6(1)(e): Processing necessary for public interest tasks

Standards:

  • ISO 27001:2022: Information Security Management

  • NIST Special Publication 800-53: Security and Privacy Controls

  • CIS Controls v8.1: Critical Security Controls

Source Transparency

This detail page is rendered from real SKILL.md content. Trust labels are metadata-based hints, not a safety guarantee.

Related Skills

Related by shared tags or category signals.

General

secrets-management

No summary provided by upstream source.

Repository SourceNeeds Review
General

ai governance

No summary provided by upstream source.

Repository SourceNeeds Review
General

business-model-canvas

No summary provided by upstream source.

Repository SourceNeeds Review