5. Use detect-secrets for secret scanning
Date: 21/08/2025
Status
Accepted
Context
The SmartEM Decisions project requires robust secret scanning to protect sensitive research data, database credentials, API keys, and Kubernetes cluster secrets. Because the project is part of the Diamond Light Source facility infrastructure, it must meet high security standards whilst still supporting scientific computing workflows.
The development team evaluated secret scanning tools for integration into the existing pre-commit and CI/CD pipeline (Python 3.12+, ruff, pyright). The organisational cybersecurity team recommended Gitleaks in order to standardise tooling across projects.
Key requirements included:
Integration with the Python 3.12+ ecosystem and the existing toolchain
Handling scientific computing patterns (chemical formulas, gene sequences, scientific notation) without excessive false positives
Support for high-throughput processing (1000+ images/hour) without workflow disruption
Enterprise-grade baseline management for tracking known-safe findings in research environments
Three tools were evaluated:
Gitleaks: High-performance Go implementation and the organisational preference, but more false positives in scientific contexts
TruffleHog: Advanced entropy analysis, but resource-intensive with SaaS dependencies
detect-secrets: Python-native, superior false positive handling, sophisticated baseline management
Decision
We will use detect-secrets as the primary secret scanning tool, integrated into both pre-commit hooks and CI/CD pipelines, despite the organisational preference for Gitleaks standardisation.
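As a rough illustration of the CI side of this decision, the sketch below wraps the `detect-secrets-hook` command in a small Python helper. The baseline filename `.secrets.baseline` and the helper names `build_hook_command` and `scan_files` are assumptions made for illustration; they are not taken from the project's actual configuration.

```python
import subprocess
from collections.abc import Sequence

# Assumed baseline location; the project's real path may differ.
DEFAULT_BASELINE = ".secrets.baseline"


def build_hook_command(files: Sequence[str], baseline: str = DEFAULT_BASELINE) -> list[str]:
    """Build the detect-secrets-hook invocation for the given files."""
    return ["detect-secrets-hook", "--baseline", baseline, *files]


def scan_files(files: Sequence[str], baseline: str = DEFAULT_BASELINE) -> int:
    """Run the hook and return its exit code.

    A non-zero code means potential secrets were reported or the scan failed.
    """
    if not files:
        return 0  # nothing to scan
    completed = subprocess.run(build_hook_command(files, baseline), check=False)
    return completed.returncode
```

A CI job could call `scan_files` with the list of changed files and fail the build on a non-zero return value. Passing only changed files, rather than the whole repository, is one way to keep the check fast.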
Consequences
Positive:
Native Python integration with existing development workflow
Superior false positive management for scientific computing patterns
Enterprise-grade baseline system for managing known-safe patterns
Faster CI/CD execution through an incremental scanning approach
Flexible plugin architecture for research-specific customisation
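To make the baseline idea concrete, the following minimal sketch filters scan findings against previously audited entries so that only new findings are reported. It models findings as dictionaries keyed by filename, loosely following the `results` section of a detect-secrets baseline file. The function name `new_findings` and the sample data are hypothetical, and the real tool performs this comparison itself.

```python
from typing import Any

# Findings keyed by filename; each entry describes one potential secret.
Findings = dict[str, list[dict[str, Any]]]


def new_findings(current: Findings, baseline: Findings) -> Findings:
    """Return findings in `current` that are absent from `baseline`.

    A finding is identified by (filename, type, hashed_secret), so a known
    entry that merely moves to a different line is not reported again.
    """
    known = {
        (filename, entry["type"], entry["hashed_secret"])
        for filename, entries in baseline.items()
        for entry in entries
    }
    fresh: Findings = {}
    for filename, entries in current.items():
        unseen = [
            entry
            for entry in entries
            if (filename, entry["type"], entry["hashed_secret"]) not in known
        ]
        if unseen:
            fresh[filename] = unseen
    return fresh


# Hypothetical sample data: one audited entry and one new finding.
baseline = {
    "config.py": [{"type": "Secret Keyword", "hashed_secret": "abc123", "line_number": 4}],
}
current = {
    "config.py": [{"type": "Secret Keyword", "hashed_secret": "abc123", "line_number": 9}],
    "deploy.py": [{"type": "Secret Keyword", "hashed_secret": "def456", "line_number": 2}],
}
print(new_findings(current, baseline))
```

Only the `deploy.py` entry is reported here, because the `config.py` entry matches an audited baseline item even though its line number changed.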
Negative:
Divergence from organisational tooling standardisation
Potential knowledge silos between teams using different tools
Responsibility for maintaining tool-specific expertise within the team
Mitigation:
Comprehensive documentation of configuration and workflows