Commit 2918370

fmassot and estherk15 authored
Fix CloudPrem wording and add processing page. (#29140)
* Fix CloudPrem wording and add processing page.
* Apply suggestions from code review

Sounds good, thanks @estherk15 !

Co-authored-by: Esther Kim <esther.kim@datadoghq.com>

---------

Co-authored-by: Esther Kim <esther.kim@datadoghq.com>
1 parent a5fbcce commit 2918370

6 files changed: +248 -146 lines changed

content/en/cloudprem/_index.md

+16-45
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
22
title: CloudPrem
3-
description: Learn how to deploy and manage Datadog CloudPrem, a self-hosted log solution for cost-effective log ingestion, indexing, and search capabilities
3+
description: Learn how to deploy and manage Datadog CloudPrem, a self-hosted log management solution for cost-effective log ingestion, processing, indexing, and search capabilities
44
private: true # This removes this page from search and limits the availability of this doc to only those that have the link
55
further_reading:
66
- link: "/cloudprem/installation/"
@@ -22,40 +22,20 @@ further_reading:
2222

2323
## Overview
2424

25-
Datadog CloudPrem is a self-hosted log solution which provides cost-effective log ingestion, indexing, and search capabilities in your own infrastructure. Designed to address data residency, security or high-volume requirements, CloudPrem seamlessly integrates with the Datadog platform to offer powerful log analysis, visualization, and alerting while ensuring that your sensitive log data remains within your own infrastructure.
25+
Datadog CloudPrem is a self-hosted log management solution that enables cost-effective log ingestion, processing, indexing, and search capabilities within your own infrastructure. Built to meet data residency, stringent security, and high-volume requirements, CloudPrem integrates with the Datadog platform to provide log analysis, visualization, and alerting - all while keeping your log data at rest within your infrastructure boundaries.
2626

2727
{{< img src="/cloudprem/cloudprem_overview_diagram.png" alt="CloudPrem product overview diagram" style="width:100%;" >}}
2828

29-
### Core Capabilities
30-
<!-- This sections was populated with Cursor, we can delete if it's not relevant -->
31-
32-
CloudPrem enhances your log management strategy through several fundamental capabilities:
33-
- **Data Sovereignty**<br>
34-
Process and store logs within your own infrastructure while maintaining full integration with Datadog's analysis tools. This gives you complete oversight of your data's location and handling.
35-
36-
- **Infrastructure Efficiency**<br>
37-
Scale log processing and storage according to your needs by leveraging your existing infrastructure. This provides flexibility in resource allocation and management as your requirements evolve.
38-
39-
- **Deployment Flexibility**<br>
40-
Adapt the deployment to match your infrastructure requirements and security controls while preserving seamless integration with Datadog's platform features and functionality.
41-
4229
## Architecture
4330

44-
CloudPrem uses a modular architecture that separates processing tasks from data storage:
45-
46-
- Processing tasks like indexing and searching run independently
47-
- Log data is stored separately in object storage (like S3)
48-
- Each component can be scaled separately to match your needs
49-
- This separation allows you to optimize resources based on your specific workload
50-
51-
<!-- {{< img src="path/to/your/image-name-here.png" alt="TBD CloudPrem architecture and component diagram" style="width:100%;" >}} -->
31+
CloudPrem uses a decoupled architecture that separates compute (indexing and searching) from data, which is stored in object storage. This allows independent scaling and optimization of cluster components based on workload demands.
5232

5333
### Components
5434

5535
The CloudPrem cluster, typically deployed on Kubernetes (EKS), consists of several components:
5636

5737
**Indexers**
58-
: Responsible for receiving logs from Datadog Agents. Indexers process, index, and store logs in index files called splits to the object storage (such as Amazon S3).
38+
: Responsible for receiving logs from Datadog Agents. Indexers process, index, and store logs in index files called splits to the object storage (for example, Amazon S3).
5939

6040
**Searchers**
6141
: Handle search queries from the Datadog UI, reading metadata from Metastore and index data from the object storage.
@@ -67,41 +47,32 @@ The CloudPrem cluster, typically deployed on Kubernetes (EKS), consists of sever
6747
: Responsible for tasks such as scheduling indexing tasks and running delete tasks.
6848

6949

70-
## Prerequisites for getting started
50+
## Get started
51+
### Prerequisites
7152

7253
Before getting started with CloudPrem, ensure you have:
7354

7455
- AWS account with necessary permissions
7556
- Kubernetes cluster (EKS recommended)
7657
- S3 bucket for log storage
7758
- PostgreSQL database (RDS recommended)
78-
- Datadog agent installed
79-
- Required tools: `kubectl`, `helm`
80-
81-
For detailed instructions, see the [Installation][2] documentation.
59+
- Datadog Agent
60+
- `kubectl`
61+
- `helm`
8262

83-
## Additional considerations
63+
### Installation
8464

85-
### Log processing capabilities
65+
1. [Install CloudPrem][2]
66+
2. [Send logs to CloudPrem][2]
67+
3. [Configure log processing][3] (optional)
68+
4. [Configure your Datadog account to connect the Log Explorer to CloudPrem][2]
8669

87-
CloudPrem includes basic log processing capabilities out-of-the-box. For more advanced use cases such as dual shipping, sensitive data redaction, or log volume control, Datadog recommends using [Observability Pipelines][3] in conjunction with CloudPrem.
88-
89-
### Billing and usage
90-
91-
Logs sent to CloudPrem components are counted toward your Datadog usage, you will be billed for CloudPrem's internal telemetry.
92-
93-
### Network and cost
94-
95-
CloudPrem sends query results outside your network for display in the Datadog UI. These query results are compressed, resulting in negligible egress costs for most deployments.
96-
97-
### Deployment options
98-
99-
You cannot deploy multiple CloudPrem clusters.
70+
For detailed instructions, see the [Installation][2] documentation.
10071

10172
## Further reading
10273

10374
{{< partial name="whats-next/whats-next.html" >}}
10475

10576
[1]: https://kubernetes-sigs.github.io/aws-load-balancer-controller/latest/deploy/installation/
10677
[2]: /cloudprem/installation/
107-
[3]: /observability_pipelines/
78+
[3]: /cloudprem/processing/

content/en/cloudprem/advanced.md

+12-12
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
---
2-
title: Advanced Configuration
3-
description: Learn about advanced deployment scenarios and customization options for CloudPrem
2+
title: AWS Configuration
3+
description: Learn how to configure AWS for CloudPrem
44
further_reading:
55
- link: "/cloudprem/"
66
tag: "Documentation"
@@ -21,13 +21,15 @@ further_reading:
2121

2222
## Overview
2323

24-
This guide covers advanced configuration options and deployment scenarios for CloudPrem, including multiple cluster deployments, advanced processing features, and integration with external tools. For ingress configuration, refer to the [Ingress Configuration guide](/cloudprem/ingress/).
24+
This guide covers how to configure your AWS account for CloudPrem. For ingress configuration, refer to the [Ingress Configuration guide](/cloudprem/ingress/).
2525

26-
## AWS setup
26+
Setting up a CloudPrem cluster on AWS requires the configuration of three elements:
27+
- AWS credentials
28+
- AWS region
29+
- IAM permissions for S3
2730

28-
Setting up a CloudPrem cluster on AWS requires the configuration of the following elements:
31+
## AWS credentials
2932

30-
{{% collapse-content title="AWS credentials" level="h4" expanded=false %}}
3133
When starting a node, CloudPrem attempts to find AWS credentials using the credential provider chain implemented by [rusoto\_core::ChainProvider](https://docs.rs/rusoto_credential/latest/rusoto_credential/struct.ChainProvider.html) and looks for credentials in this order:
3234

3335
1. Environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, or `AWS_SESSION_TOKEN` (optional).
@@ -36,18 +38,18 @@ When starting a node, CloudPrem attempts to find AWS credentials using the crede
3638
4. Instance profile credentials, used on Amazon EC2 instances, and delivered through the Amazon EC2 metadata service.
3739

3840
An error is returned if no credentials are found in the chain.
39-
{{% /collapse-content %}}
4041

41-
{{% collapse-content title="AWS region" level="h4" expanded=false %}}
42+
## AWS Region
43+
4244
CloudPrem attempts to find an AWS region in multiple locations and with the following order of precedence:
4345

4446
1. Environment variables (`AWS_REGION` then `AWS_DEFAULT_REGION`)
4547
2. Config file, typically located at `~/.aws/config` or otherwise specified by the `AWS_CONFIG_FILE` environment variable if set and not empty.
4648
3. Amazon EC2 instance metadata service indicating the region of the currently running Amazon EC2 instance.
4749
4. Default value: `us-east-1`
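
The order of precedence above can be sketched as a simple fallback chain. This is only an illustration of the documented ordering, not CloudPrem's actual implementation: `resolve_region`, `config_region`, and `metadata_region` are hypothetical names, and the config file and EC2 metadata lookups are passed in as callables rather than implemented.

```python
from typing import Callable, Mapping, Optional

DEFAULT_REGION = "us-east-1"  # documented default value


def resolve_region(
    env: Mapping[str, str],
    config_region: Callable[[], Optional[str]] = lambda: None,
    metadata_region: Callable[[], Optional[str]] = lambda: None,
) -> str:
    """Return the first region found, following the documented order."""
    lookups = (
        lambda: env.get("AWS_REGION"),          # 1. environment variables
        lambda: env.get("AWS_DEFAULT_REGION"),  #    (AWS_REGION wins)
        config_region,                          # 2. config file (hypothetical hook)
        metadata_region,                        # 3. EC2 metadata (hypothetical hook)
    )
    for lookup in lookups:
        region = lookup()
        if region:  # empty strings are treated as "not set"
            return region
    return DEFAULT_REGION                       # 4. default value


if __name__ == "__main__":
    import os

    print(resolve_region(os.environ))
```

Passing the lookups as callables keeps the later, more expensive sources (file reads, metadata requests) from running when an environment variable already answers the question.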
48-
{{% /collapse-content %}}
4950

50-
{{% collapse-content title="IAM permissions for Amazon S3" level="h4" expanded=false %}}
51+
## IAM permissions for S3
52+
5153
Required authorized actions:
5254

5355
* `ListBucket` (on the bucket directly)
@@ -89,8 +91,6 @@ Here is an example of a bucket policy:
8991
]
9092
}
9193
```
92-
{{% /collapse-content %}}
93-
9494

9595
## Further reading
9696

content/en/cloudprem/cluster.md

+2-4
Original file line numberDiff line numberDiff line change
@@ -21,9 +21,7 @@ further_reading:
2121

2222
## Overview
2323

24-
This document offers comprehensive guidance on dimensioning and managing CloudPrem cluster components, covering indexers, searchers, and auxiliary services.
25-
26-
Find specific resource requirements (CPU, RAM, storage) for each component, along with practical examples for capacity planning. Use these guidelines to properly size your initial deployment and scale components as your needs grow. The recommendations provided help you maintain optimal performance while efficiently utilizing your infrastructure resources.
24+
This document gives recommendations on dimensioning your CloudPrem cluster components, particularly indexers and searchers.
2725

2826
<div class="alert alert-info">
2927
These are starting recommendations. Monitor your cluster's performance and resource utilization closely and adjust sizing as needed.
@@ -36,7 +34,7 @@ These are starting recommendations. Monitor your cluster's performance and resou
3634
- 2 vCPUs and 4GB of RAM
3735
- 4 vCPUs and 8GB of RAM
3836
- 8 vCPUs and 16GB of RAM
39-
- **Storage:** An indexer stores temporary data and requires persistent storage (e.g., AWS EBS).
37+
- **Storage:** Indexers require persistent storage (preferably SSDs, but local HDDs or remote EBS volumes can also be used) to store temporary data while constructing the index files.
4038
- Minimum: 100GB per pod
4139
- Recommendation (for pods > 4 vCPUs): 200GB per pod
4240
- **Example Calculation:** To index 1 TB per day (~11.6 MB/s):
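
The ~11.6 MB/s figure comes from dividing the daily volume by the number of seconds in a day. A minimal sketch of that arithmetic, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the function name is hypothetical:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def daily_volume_to_mb_per_second(terabytes_per_day: float) -> float:
    """Convert a daily log volume in TB to an average rate in MB/s."""
    bytes_per_day = terabytes_per_day * 10**12
    return bytes_per_day / SECONDS_PER_DAY / 10**6


if __name__ == "__main__":
    print(round(daily_volume_to_mb_per_second(1.0), 1))  # prints 11.6
```

This is an average rate; real traffic is rarely flat, so peak throughput may be noticeably higher than this figure.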

content/en/cloudprem/ingress.md

+13-52
Original file line numberDiff line numberDiff line change
@@ -12,19 +12,13 @@ further_reading:
1212

1313
## Overview
1414

15-
Ingress configuration is a critical component of your CloudPrem deployment that manages how external traffic reaches your services. A properly configured ingress controller ensures secure, efficient, and reliable access to your CloudPrem environment. It provides:
16-
17-
- **Traffic management**: Routes external requests to the appropriate CloudPrem services
18-
- **Load balancing**: Distributes incoming traffic across multiple instances for better performance
19-
- **TLS termination**: Handles HTTPS encryption and certificate management
20-
- **Access control**: Enables you to define rules for who can access your CloudPrem deployment
15+
Ingress is a critical component of your CloudPrem deployment. CloudPrem has one public ingress and one internal ingress.
2116

2217
## Public ingress
2318

2419
The public ingress is essential for enabling Datadog's control plane and query service to manage and query CloudPrem clusters over the public internet. It provides secure access to the CloudPrem gRPC API through the following mechanisms:
25-
2620
- Creates an internet-facing AWS Application Load Balancer (ALB) that accepts traffic from Datadog services
27-
- Implements TLS encryption with SSL termination at the load balancer level
21+
- Implements TLS encryption with termination at the load balancer level
2822
- Uses HTTP/2 (gRPC) for communication between the ALB and CloudPrem cluster
2923
- Requires mutual TLS (mTLS) authentication where Datadog services must present valid client certificates
3024
- Configures the ALB in TLS passthrough mode to forward client certificates to CloudPrem pods via the `X-Amzn-Mtls-Clientcert` header
@@ -34,6 +28,17 @@ This setup ensures that only authenticated Datadog services can access the Cloud
3428

3529
<!-- {{< img src="path/to/your/image-name-here.png" alt="TBD Public ingress diagram" style="width:100%;" >}} -->
3630

31+
<div class="alert alert-warning">Only the CloudPrem gRPC API endpoints (paths starting with `/cloudprem`) perform mutual TLS authentication. Exposing any other endpoints through the public ingress introduces a security risk, as those endpoints would be accessible over the internet without authentication. Always restrict non-gRPC endpoints to the internal ingress. </div>
32+
33+
### IP Ranges
34+
The Datadog control plane and query services connect to CloudPrem clusters using a set of fixed IP ranges, which can be retrieved for each Datadog site from the Datadog IP Ranges API, specifically under the "webhooks" section. For example, to fetch the IP ranges for the datadoghq.eu site, you can run:
35+
```shell
36+
curl -X GET "https://ip-ranges.datadoghq.eu/" \
37+
-H "Accept: application/json" |
38+
jq '.webhooks'
39+
```
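
If you want to feed these ranges into an allowlist, the same extraction can be done in a script. The sketch below is illustrative only: it assumes the `webhooks` section contains `prefixes_ipv4` and `prefixes_ipv6` lists, and the sample payload uses documentation-reserved addresses rather than real Datadog ranges. `webhook_prefixes` is a hypothetical helper name.

```python
import json
from typing import Any


def webhook_prefixes(ip_ranges: dict[str, Any]) -> list[str]:
    """Collect the CIDR prefixes listed under the "webhooks" section."""
    section = ip_ranges.get("webhooks", {})
    # Field names below are assumptions about the response layout.
    return [
        *section.get("prefixes_ipv4", []),
        *section.get("prefixes_ipv6", []),
    ]


if __name__ == "__main__":
    # Placeholder payload; in practice, parse the JSON returned by the
    # IP ranges endpoint for your Datadog site.
    sample = json.loads(
        '{"webhooks": {"prefixes_ipv4": ["192.0.2.0/24"],'
        ' "prefixes_ipv6": ["2001:db8::/32"]}}'
    )
    for prefix in webhook_prefixes(sample):
        print(prefix)
```

Verify the field names against the actual response for your site before relying on them.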
40+
41+
3742
## Internal ingress
3843

3944
The internal ingress enables log ingestion from Datadog Agents and other log collectors within your environment through HTTP.
@@ -96,50 +101,6 @@ rules:
96101

97102
<!-- {{< img src="path/to/your/image-name-here.png" alt="TBD Internal ingress using NGINX ingress controller" style="width:100%;" >}} -->
98103

99-
## Supported ingress controllers
100-
101-
<!-- This is generated by Cursor, if this is incorrect, I'll delete this entire section -->
102-
103-
CloudPrem supports various ingress controllers to accommodate different infrastructure requirements and preferences:
104-
105-
{{% collapse-content title="NGINX Configuration" level="h4" expanded=false %}}
106-
```yaml
107-
ingress:
108-
internal:
109-
create: false
110-
nginx:
111-
enabled: true
112-
annotations:
113-
kubernetes.io/ingress.class: nginx
114-
nginx.ingress.kubernetes.io/ssl-redirect: "true"
115-
```
116-
{{% /collapse-content %}}
117-
118-
{{% collapse-content title="HAProxy Configuration" level="h4" expanded=false %}}
119-
```yaml
120-
ingress:
121-
internal:
122-
create: false
123-
haproxy:
124-
enabled: true
125-
annotations:
126-
kubernetes.io/ingress.class: haproxy
127-
```
128-
{{% /collapse-content %}}
129-
130-
{{% collapse-content title="Traefik Configuration" level="h4" expanded=false %}}
131-
```yaml
132-
ingress:
133-
internal:
134-
create: false
135-
traefik:
136-
enabled: true
137-
annotations:
138-
kubernetes.io/ingress.class: traefik
139-
```
140-
{{% /collapse-content %}}
141-
142-
143104
## Further reading
144105

145106
{{< partial name="whats-next/whats-next.html" >}}
