Introducing Microsoft Sentinel Data Lake
The Microsoft Sentinel Data Lake, launched in public preview in July 2025, is a cloud-native, fully managed security data lake integrated into Microsoft Sentinel, Microsoft’s Security Information and Event Management (SIEM) platform.
Designed to tackle the challenges of managing vast amounts of security data, it offers cost-effective, long-term storage, advanced analytics, and AI-driven threat detection.
By centralizing security data in an open-format repository, it eliminates data silos and supports hyperscale ingestion, making it a transformative solution for modern Security Operations Centers (SOCs).
The product provides a unified approach to data ingestion, supporting over 350 native connectors for Microsoft services like Microsoft 365, Azure, Defender XDR, Entra, and Intune, as well as third-party sources such as AWS, GCP, Palo Alto, and Cisco. Organizations can also use custom connectors to ingest raw or transformed data, including firewall logs, DNS data, and asset inventories. Integration with Microsoft Defender Threat Intelligence (MDTI) enriches this data with insights from 84 trillion daily signals at no additional cost, enabling comprehensive threat analysis across cloud and on-premises environments.
A key feature of the Sentinel Data Lake is its tiered storage model, which balances performance and cost. The analytics tier stores high-fidelity, frequently accessed data for real-time querying, visualization, and alerting, while the data lake tier offers low-cost, long-term storage for high-volume, low-fidelity data, such as firewall and DNS logs, with retention periods extending up to 12 years. Data in the analytics tier is automatically mirrored to the data lake at no extra cost, so the same data does not need to be ingested twice. Storage in the data lake tier costs less than 15% of traditional analytics logs, which significantly reduces expenses. During the preview phase, pricing is $0.026 per GB per month for storage, $0.05 per GB for ingestion, and $0.005 per GB for querying. Additionally, the preview includes 30 days of free storage and free data processing until August 4, 2025.
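To make the preview prices above concrete, here is a minimal Python sketch of a monthly cost estimate for the data lake tier. The function name, the parameter names, and the example volumes are hypothetical and chosen only for illustration. The sketch ignores the 30 days of free storage and the free data processing offer, and it assumes the three per-GB rates simply add up.

```python
# Preview-phase prices quoted in the article (USD); they may change
# before general availability.
STORAGE_PER_GB_MONTH = 0.026
INGESTION_PER_GB = 0.05
QUERY_PER_GB = 0.005


def estimate_monthly_cost(ingested_gb, stored_gb, queried_gb):
    """Return a rough monthly cost breakdown in USD for the data lake tier.

    Hypothetical helper: it multiplies each volume by its per-GB rate
    and ignores preview promotions such as free storage days.
    """
    costs = {
        "ingestion": round(ingested_gb * INGESTION_PER_GB, 2),
        "storage": round(stored_gb * STORAGE_PER_GB_MONTH, 2),
        "query": round(queried_gb * QUERY_PER_GB, 2),
    }
    # Sum the three components before adding the total to the dict.
    costs["total"] = round(sum(costs.values()), 2)
    return costs


# Illustrative volumes only: 1 TB ingested, 5 TB retained, 200 GB queried.
example = estimate_monthly_cost(ingested_gb=1000, stored_gb=5000, queried_gb=200)
print(example)
# {'ingestion': 50.0, 'storage': 130.0, 'query': 1.0, 'total': 181.0}
```

In this example storage dominates the bill, so retention settings matter more than query volume at these rates.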
The Sentinel Data Lake leverages advanced analytics and AI to empower security teams. It supports Kusto Query Language (KQL) for querying across both tiers, enabling threat hunting, forensic analysis, and compliance reporting.
Integration with Jupyter notebooks, Python, and Apache Spark facilitates advanced analytics, machine learning, and visualizations, such as anomaly detection and behavioral baselining.
By providing a unified data foundation, it powers Microsoft Security Copilot and custom AI models, helping detect subtle attack patterns and generate high-fidelity alerts. The decoupled storage and compute architecture allows organizations to query data only when needed, further optimizing costs.
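The behavioral baselining mentioned above can be illustrated with a small, self-contained Python sketch of the kind an analyst might run in a notebook. It does not call any Sentinel or Spark API. The function name, the z-score threshold, and the sign-in counts are all hypothetical, and a simple z-score is only one of many possible baselining techniques.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return indexes of observed values that deviate from the baseline.

    A value is flagged when its z-score against the baseline mean and
    sample standard deviation exceeds `threshold`.
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A flat baseline: anything different from it is unusual.
        return [i for i, value in enumerate(observed) if value != mu]
    return [
        i for i, value in enumerate(observed)
        if abs(value - mu) / sigma > threshold
    ]


# Hypothetical daily sign-in counts for one account.
baseline_counts = [10, 12, 11, 9, 10, 11, 12, 10]
recent_counts = [11, 10, 45, 12]

flagged = find_anomalies(baseline_counts, recent_counts)
print(flagged)  # [2] -> the day with 45 sign-ins
```

Long retention in the data lake tier is what makes a baseline like this practical, since the reference window can cover months or years of history rather than a few weeks.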
Management is streamlined within the Microsoft Defender portal, eliminating the need for complex infrastructure like Azure Data Explorer or storage blobs. Automated onboarding, audit logging, and table management simplify operations, while KQL jobs enable one-time or scheduled data promotion from the data lake to the analytics tier with cost-optimizing filters.
The platform supports long-term threat detection, allowing retroactive threat hunting and incident reconstruction over years, which is ideal for identifying “low and slow” attacks or meeting regulatory requirements like GDPR, FCA, or NIS2. Its multi-tenant flexibility also makes it suitable for Managed Security Service Providers (MSSPs) and Managed Detection and Response (MDR) providers, offering tenant-specific workflows and data isolation.
Despite its strengths, the preview phase has limitations. Interactive KQL queries are capped at 30,000 rows or 64 MB and time out after 10 minutes, so larger queries must run as KQL jobs. Jobs are limited to three running concurrently, with a maximum of 100 enabled jobs per tenant and a one-hour execution timeout. Up to 20 workspaces can be onboarded, and the primary workspace must be in the tenant’s home region. After enabling the data lake, auxiliary logs from Defender XDR are accessible only via Data Lake Exploration KQL queries, not Advanced Hunting. Analysts may also need to hone KQL optimization skills to manage query costs effectively.
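The preview limits above lend themselves to a simple planning check. The following Python sketch encodes the quoted limits and decides whether an estimated workload fits an interactive query or needs a KQL job. The class, the function, and the returned labels are hypothetical, and the estimates are assumed to come from the analyst, not from any Sentinel API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreviewLimits:
    """Preview-phase limits as quoted in the article; subject to change."""
    interactive_max_rows: int = 30_000
    interactive_max_mb: float = 64.0
    interactive_timeout_min: float = 10.0
    job_timeout_min: float = 60.0


def choose_execution_mode(est_rows, est_mb, est_minutes, limits=PreviewLimits()):
    """Pick a run mode for a query from rough size and duration estimates."""
    if est_minutes > limits.job_timeout_min:
        # Even a KQL job would hit the one-hour execution timeout.
        return "exceeds_job_timeout"
    fits_interactive = (
        est_rows <= limits.interactive_max_rows
        and est_mb <= limits.interactive_max_mb
        and est_minutes <= limits.interactive_timeout_min
    )
    return "interactive" if fits_interactive else "kql_job"


print(choose_execution_mode(5_000, 10, 2))    # interactive
print(choose_execution_mode(50_000, 10, 2))   # kql_job
print(choose_execution_mode(5_000, 10, 90))   # exceeds_job_timeout
```

A query that lands in the last category would need a narrower time range or tighter filters before it could run under the preview limits.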
The Microsoft Sentinel Data Lake supports a range of use cases, from threat hunting and incident reconstruction to compliance and AI-driven automation. By offering scalable, cost-efficient storage and analytics, it redefines SIEM, enabling faster detection, smarter responses, and compliance-ready operations.
For enterprises, it maximizes existing Microsoft security investments, while MSSPs benefit from simplified multi-tenant management. To get started, organizations can enable the data lake through the Defender portal and refer to Microsoft’s onboarding guide and KQL documentation. For pricing details, visit the Microsoft Sentinel Pricing Page. As the product is in preview, features and pricing may evolve before general availability.