🎯 Task 13 goal: Set up AWS X-Ray for distributed tracing - TRACE REQUESTS END-TO-END
Task 13 enables distributed tracing:
Tracing Flow: User Request → API Gateway → ECS → DynamoDB → X-Ray Analysis
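Each hop in this flow is tied together by the `X-Amzn-Trace-Id` header, which carries the trace ID and the sampling decision from one service to the next. The sketch below shows how that header can be read; `parseTraceHeader` and `traceStartTime` are hypothetical helper names for illustration, not part of the X-Ray SDK, and the header value is a made-up example.

```typescript
// Parsed form of an X-Amzn-Trace-Id header value.
interface TraceHeader {
  root?: string;     // trace ID shared by every segment of the request
  parent?: string;   // ID of the upstream segment
  sampled?: boolean; // true when the header carries Sampled=1
}

// Hypothetical helper: split "Root=...;Parent=...;Sampled=1" into fields.
function parseTraceHeader(value: string): TraceHeader {
  const result: TraceHeader = {};
  for (const part of value.split(";")) {
    const idx = part.indexOf("=");
    if (idx < 0) continue;
    const key = part.slice(0, idx).trim().toLowerCase();
    const val = part.slice(idx + 1).trim();
    if (key === "root") result.root = val;
    else if (key === "parent") result.parent = val;
    else if (key === "sampled") result.sampled = val === "1";
  }
  return result;
}

// Trace IDs look like 1-<8 hex digits of epoch seconds>-<24 hex digits>.
// Returns the request start time, or undefined if the ID is malformed.
function traceStartTime(root: string): Date | undefined {
  const match = /^1-([0-9a-f]{8})-[0-9a-f]{24}$/i.exec(root);
  const hex = match ? match[1] : undefined;
  if (hex === undefined) return undefined;
  return new Date(parseInt(hex, 16) * 1000);
}

const exampleHeader = parseTraceHeader(
  "Root=1-58406520-a006649127e371903a2de979;Parent=53995c3f42cd8ad8;Sampled=1"
);
const exampleStart = exampleHeader.root
  ? traceStartTime(exampleHeader.root)
  : undefined;
```

Searching the X-Ray console for the `Root` value is a quick way to find the trace that belongs to a specific request.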
Enable X-Ray for API Gateway:
✅ Enable X-Ray Tracing
Sampling rate: 10%
curl -X GET https://your-api-gateway-url/api/users
Update the ECS task definition with the X-Ray daemon:
{
  "containerDefinitions": [
    {
      "name": "user-service",
      "image": "ACCOUNT.dkr.ecr.ap-southeast-1.amazonaws.com/vinashoes-user-service:latest",
      "environment": [
        {
          "name": "AWS_XRAY_DAEMON_ADDRESS",
          "value": "localhost:2000"
        }
      ],
      "dependsOn": [
        {
          "containerName": "xray-daemon",
          "condition": "START"
        }
      ]
    },
    {
      "name": "xray-daemon",
      "image": "amazon/aws-xray-daemon:latest",
      "cpu": 32,
      "memoryReservation": 256,
      "portMappings": [
        {
          "containerPort": 2000,
          "protocol": "udp"
        }
      ]
    }
  ]
}
Required IAM permissions:
{
  "Effect": "Allow",
  "Action": [
    "xray:PutTraceSegments",
    "xray:PutTelemetryRecords"
  ],
  "Resource": "*"
}
Install X-Ray SDK:
npm install aws-xray-sdk-core aws-xray-sdk-express
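Configure in main.ts:
import { NestFactory } from '@nestjs/core';
import * as AWSXRay from 'aws-xray-sdk-express';
// AppModule is the application's root module; adjust the path if yours differs.
import { AppModule } from './app.module';

async function bootstrap() {
  const app = await NestFactory.create(AppModule);
  // Open an X-Ray segment for every incoming request
  app.use(AWSXRay.openSegment('user-service'));
  app.enableCors();
  // closeSegment is Express error-handling middleware. Registered here it sits
  // ahead of the Nest route handlers, so it may not see errors thrown by controllers.
  app.use(AWSXRay.closeSegment());
  await app.listen(3000);
}
bootstrap();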
Trace DynamoDB calls:
import { Injectable } from '@nestjs/common';
import * as AWSXRay from 'aws-xray-sdk-core';

// captureAWS wraps the AWS SDK for JavaScript v2 so each call is recorded as a subsegment
const AWS = AWSXRay.captureAWS(require('aws-sdk'));

@Injectable()
export class UsersService {
  private dynamoDB = new AWS.DynamoDB.DocumentClient();

  async findOne(id: string) {
    const params = { TableName: 'User', Key: { id } };
    return await this.dynamoDB.get(params).promise();
  }
}
Deploy updated services:
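# Build and push the new image with X-Ray support (the build tag must match the tag being pushed)
docker build -t ACCOUNT.dkr.ecr.ap-southeast-1.amazonaws.com/vinashoes-user-service:latest .
docker push ACCOUNT.dkr.ecr.ap-southeast-1.amazonaws.com/vinashoes-user-service:latest
# Update the ECS service. Passing only the family name selects the latest ACTIVE revision
aws ecs update-service \
  --cluster vinashoes-cluster \
  --service vinashoes-user-service \
  --task-definition vinashoes-user-service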
Test tracing:
# Generate test requests
curl -X GET https://your-api-gateway-url/api/users/123
curl -X GET https://your-api-gateway-url/api/users/nonexistent
# Check X-Ray Console → Service map
Service Map Analysis:
Performance Insights:
Example Trace Analysis:
Total Duration: 245ms
- API Gateway: 5ms
- User Service: 180ms
  - Business logic: 20ms
  - DynamoDB call: 160ms (BOTTLENECK!)
- Response: 60ms
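The same breakdown can be checked mechanically. The sketch below models the example trace as nested spans and picks the slowest leaf span; the `Span` shape and the `findBottleneck` helper are assumptions made for illustration and do not mirror the X-Ray segment document format.

```typescript
// Simplified span model (an assumption, not the X-Ray segment schema).
interface Span {
  name: string;
  durationMs: number;
  children?: Span[];
}

// Collect leaf spans: units of work with no further breakdown.
function leafSpans(spans: Span[]): Span[] {
  const leaves: Span[] = [];
  for (const span of spans) {
    if (span.children && span.children.length > 0) {
      leaves.push(...leafSpans(span.children));
    } else {
      leaves.push(span);
    }
  }
  return leaves;
}

// Return the leaf span with the largest duration, if any.
function findBottleneck(spans: Span[]): Span | undefined {
  let slowest: Span | undefined;
  for (const leaf of leafSpans(spans)) {
    if (slowest === undefined || leaf.durationMs > slowest.durationMs) {
      slowest = leaf;
    }
  }
  return slowest;
}

// The example trace from the analysis above.
const exampleTrace: Span[] = [
  { name: "API Gateway", durationMs: 5 },
  {
    name: "User Service",
    durationMs: 180,
    children: [
      { name: "Business logic", durationMs: 20 },
      { name: "DynamoDB call", durationMs: 160 },
    ],
  },
  { name: "Response", durationMs: 60 },
];

const totalDurationMs = 245;
const bottleneck = findBottleneck(exampleTrace);
// Fraction of the total request time spent in the slowest leaf span.
const bottleneckShare = bottleneck ? bottleneck.durationMs / totalDurationMs : 0;
```

Here the DynamoDB call accounts for roughly 65% of the 245ms request, which is why it is flagged as the bottleneck.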
CloudWatch ServiceLens Integration:
| Component | Status |
|---|---|
| ✅ API Gateway Tracing | ACTIVE |
| ✅ ECS X-Ray Integration | DEPLOYED |
| ✅ NestJS SDK | INTEGRATED |
| ✅ Service Map | VISIBLE |
| ✅ Performance Analysis | READY |
✅ Complete End-to-End Tracing:
Sampling Strategy:
Production: 5-10%
Development: 100%
Critical paths: Always sample
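As a rough illustration of this strategy, the sketch below decides whether a request is traced based on the environment and the request path. It is a simplification: real X-Ray sampling rules combine a reservoir with a fixed rate, and the `/api/checkout` critical path and the `shouldSample` helper are hypothetical names chosen for the example.

```typescript
type Environment = "production" | "development";

interface SamplingConfig {
  rate: number;           // fraction of requests to trace, 0..1
  alwaysSample: string[]; // path prefixes that are always traced
}

const samplingByEnv: Record<Environment, SamplingConfig> = {
  // 10% in production; "/api/checkout" is a hypothetical critical path.
  production: { rate: 0.1, alwaysSample: ["/api/checkout"] },
  // Trace everything in development.
  development: { rate: 1.0, alwaysSample: [] },
};

// Decide whether to trace one request. The random source is injectable
// so the decision can be tested deterministically.
function shouldSample(
  env: Environment,
  path: string,
  random: () => number = Math.random
): boolean {
  const cfg = samplingByEnv[env];
  for (const prefix of cfg.alwaysSample) {
    if (path.indexOf(prefix) === 0) return true; // critical path: always trace
  }
  return random() < cfg.rate;
}

const tracedInDev = shouldSample("development", "/api/users/123");
```

In a deployed setup this decision is normally made by the X-Ray sampling rules rather than by application code; the function only makes the intent of the three lines above concrete.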
Cost Management:
- Use intelligent sampling
- Configure retention policies
- Focus on critical services
Troubleshooting:
- Check X-Ray daemon logs
- Verify IAM permissions
- Monitor sampling rates
Next: Task 14 - Security monitoring with AWS Config 🚀
Disable X-Ray for the API Gateway stage:
# Disable X-Ray tracing for the prod stage
aws apigateway update-stage \
  --rest-api-id YOUR_API_ID \
  --stage-name prod \
  --patch-operations op=replace,path=/tracingEnabled,value=false
Update the ECS task definition to remove the X-Ray daemon:
{
  "containerDefinitions": [
    {
      "name": "user-service",
      "image": "ACCOUNT.dkr.ecr.ap-southeast-1.amazonaws.com/vinashoes-user-service:latest",
      "environment": []
    }
  ]
}
Deploy updated task definition:
# Update the ECS service with the new task definition
aws ecs update-service \
  --cluster vinashoes-cluster \
  --service vinashoes-user-service \
  --task-definition vinashoes-user-service:NEW_VERSION \
  --force-new-deployment
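Remove X-Ray permissions from the ECS task role:
# Detach the managed X-Ray policy from the role it was attached to.
# The daemon's credentials come from the task role, so replace
# ecsTaskExecutionRole below if the policy is on a different role.
aws iam detach-role-policy \
  --role-name ecsTaskExecutionRole \
  --policy-arn arn:aws:iam::aws:policy/AWSXRayDaemonWriteAccess
# Or remove the specific permissions from your custom policy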
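Remove X-Ray traces and service maps (optional):
# X-Ray has no API or CLI command for deleting individual traces
# (there is no `aws xray delete-trace`). Trace data expires on its own
# 30 days after it is recorded, and the service map is derived from that data.
# Custom sampling rules and groups, if you created any, can be removed with
# `aws xray delete-sampling-rule` and `aws xray delete-group`.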
⚠️ X-Ray cleanup order:
AWS X-Ray pricing structure:
| Service Component | Free Tier | Paid Rate | Estimated Cost |
|---|---|---|---|
| Traces | 100,000 traces/month | $0.000005/trace | $0.50/month |
| Retrieved Traces | - | $0.000005/retrieved trace | $0.10/month |
| Analytics Queries | - | $0.000005/query | $0.05/month |
| Service Map | Free | - | $0/month |
| CloudWatch Integration | Free | - | $0/month |
Estimated cost for an e-commerce platform:
Base X-Ray costs:
Traces: $0.50/month (100K billable traces)
Retrieved Traces: $0.10/month (20K retrieved)
Analytics: $0.05/month (10K queries)
Sampling & Storage:
Sampling Rate: 10% (90% fewer traces recorded)
Storage: Free (30-day retention)
Total monthly cost: $0.65/month
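The arithmetic behind this estimate can be reproduced with a few lines of code. The sketch below uses the $0.000005 per-unit rate from the pricing table in this section and treats every quantity as billable (it does not subtract any free tier); the function names and the one-million-request input are illustrative assumptions, and current rates should be confirmed on the AWS pricing page.

```typescript
interface MonthlyUsage {
  tracesRecorded: number;
  tracesRetrieved: number;
  analyticsQueries: number;
}

// $0.000005 per unit, expressed per million to keep the arithmetic readable.
const USD_PER_MILLION_UNITS = 5;

// Traces recorded = incoming requests x sampling rate.
function tracesFromRequests(requests: number, samplingRate: number): number {
  return Math.round(requests * samplingRate);
}

// Sum the three line items of the estimate, each at the same per-unit rate.
function estimateMonthlyCostUsd(usage: MonthlyUsage): number {
  const units =
    usage.tracesRecorded + usage.tracesRetrieved + usage.analyticsQueries;
  return (units / 1e6) * USD_PER_MILLION_UNITS;
}

// Hypothetical workload: 1,000,000 requests per month sampled at 10%.
const recordedTraces = tracesFromRequests(1000000, 0.1);
const monthlyCostUsd = estimateMonthlyCostUsd({
  tracesRecorded: recordedTraces,
  tracesRetrieved: 20000,
  analyticsQueries: 10000,
});
```

With these inputs the sampled workload yields 100,000 recorded traces and a total of about $0.65, matching the figures above. Raising the sampling rate to 100% would multiply the recorded-trace line item by ten.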
Reducing X-Ray costs:
Optimization tactics:
1. Smart sampling:
- Production: 5-10% sampling
- Development: 100% sampling
- Critical paths: always trace
2. Retention policy:
- 30 days is enough for most debugging
- Archive important traces
3. Query optimization:
- Use filters to reduce retrieved traces
- Schedule analytics queries
4. Service selection:
- Enable only for critical services
- Disable for background jobs
Observability benefits vs. cost:
| Benefit Type | Value | Cost Impact |
|---|---|---|
| Debugging Time | 50% lower MTTR | $10K+ per outage |
| Performance Optimization | Improved response time | $5K+ per second of slowdown |
| Service Reliability | Reduced downtime | $50K+ per hour of downtime |
| Development Efficiency | Faster troubleshooting | 5 hours/day saved |
| Customer Experience | Higher satisfaction | 2-5% revenue increase |
ROI calculation:
Track X-Ray spending:
# Check X-Ray costs (End is exclusive, so 2024-02-01 covers all of January)
aws ce get-cost-and-usage \
  --time-period Start=2024-01-01,End=2024-02-01 \
  --granularity MONTHLY \
  --metrics BlendedCost \
  --group-by Type=DIMENSION,Key=SERVICE \
  --filter '{
    "Dimensions": {
      "Key": "SERVICE",
      "Values": ["AWS X-Ray"]
    }
  }'
# Monitor the number of traces
aws xray get-trace-summaries \
  --start-time 2024-01-01T00:00:00Z \
  --end-time 2024-01-31T23:59:59Z \
  --query 'TraceSummaries[*].{Id:Id,Duration:Duration,ResponseTime:ResponseTime}'
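The `get-trace-summaries` query above prints a JSON array of `{Id, Duration, ResponseTime}` objects. The sketch below summarizes such a list once it has been parsed, on the assumption that `Duration` is reported in seconds; the sample IDs and values are made up, and `percentile` uses the simple nearest-rank method.

```typescript
// Shape produced by the --query expression above.
interface TraceSummary {
  Id: string;
  Duration: number;     // seconds
  ResponseTime: number; // seconds
}

// Nearest-rank percentile: the smallest value with at least p% of values at or below it.
function percentile(values: number[], p: number): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  const value = sorted[rank - 1];
  if (value === undefined) throw new Error("percentile of an empty list");
  return value;
}

// Count, mean and p95 of trace durations.
function summarizeDurations(summaries: TraceSummary[]) {
  const durations = summaries.map((s) => s.Duration);
  const total = durations.reduce((sum, d) => sum + d, 0);
  return {
    count: durations.length,
    meanSeconds: durations.length > 0 ? total / durations.length : 0,
    p95Seconds: durations.length > 0 ? percentile(durations, 95) : 0,
  };
}

// Made-up sample data standing in for the parsed CLI output.
const sampleSummaries: TraceSummary[] = [
  { Id: "trace-a", Duration: 0.1, ResponseTime: 0.1 },
  { Id: "trace-b", Duration: 0.2, ResponseTime: 0.2 },
  { Id: "trace-c", Duration: 0.3, ResponseTime: 0.3 },
  { Id: "trace-d", Duration: 0.4, ResponseTime: 0.4 },
  { Id: "trace-e", Duration: 2.0, ResponseTime: 1.9 },
];
const durationStats = summarizeDurations(sampleSummaries);
```

Tracking the count over time shows how many traces are being recorded (and billed), while a p95 that sits far above the mean points at a slow tail worth inspecting in the console.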
💡 Cost management best practices
Sampling Strategy:
Cost Monitoring:
Optimization:
Scaling Considerations:
🚀 Production-Ready AWS Microservices Platform with Complete Observability! 🚀