aws-monitoring

This skill helps you monitor and debug AWS resources for the SG Cars Trends platform.

Install skill "aws-monitoring" with this command: npx skills add sgcarstrends/sgcarstrends/sgcarstrends-sgcarstrends-aws-monitoring

When to Use This Skill

  • Investigating production errors

  • Checking Lambda function logs

  • Monitoring API performance

  • Debugging deployment failures

  • Analyzing CloudWatch metrics

  • Setting up alarms

  • Troubleshooting resource issues

Monitoring Tools

SST Console

SST provides a built-in console for monitoring:

Open SST console for specific stage

npx sst console --stage production
npx sst console --stage staging
npx sst console --stage dev

Features:

  • Real-time Lambda logs

  • Function invocations

  • Error tracking

  • Resource overview

  • Environment variables

CloudWatch Logs

Access Lambda logs via CloudWatch:

View logs using SST

npx sst logs --stage production

View specific function logs

npx sst logs --stage production --function api

Tail logs in real-time

npx sst logs --stage production --function api --tail

Filter logs

npx sst logs --stage production --function api --filter "ERROR"

Show logs from specific time

npx sst logs --stage production --function api --since 1h
npx sst logs --stage production --function api --since "2024-01-15 10:00"

AWS CLI

Use AWS CLI for advanced log queries:

List log groups

aws logs describe-log-groups \
  --log-group-name-prefix "/aws/lambda/sgcarstrends"

Get recent log streams

aws logs describe-log-streams \
  --log-group-name "/aws/lambda/sgcarstrends-api-production" \
  --order-by LastEventTime \
  --descending \
  --max-items 5

Tail logs

aws logs tail "/aws/lambda/sgcarstrends-api-production" --follow

Filter logs

aws logs filter-log-events \
  --log-group-name "/aws/lambda/sgcarstrends-api-production" \
  --filter-pattern "ERROR" \
  --start-time $(date -u -d '1 hour ago' +%s)000

Get logs for specific request

aws logs filter-log-events \
  --log-group-name "/aws/lambda/sgcarstrends-api-production" \
  --filter-pattern "request-id-here"

CloudWatch Metrics

Lambda Metrics

Get Lambda invocations

aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Invocations \
  --dimensions Name=FunctionName,Value=sgcarstrends-api-production \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --statistics Sum

Get errors

aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Errors \
  --dimensions Name=FunctionName,Value=sgcarstrends-api-production \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --statistics Sum

Get duration

aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Duration \
  --dimensions Name=FunctionName,Value=sgcarstrends-api-production \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --statistics Average Maximum
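
The Invocations and Errors queries above return raw datapoints; an error rate needs the two series joined on their timestamps. A minimal sketch, assuming the JSON output of both calls has already been parsed. `errorRateByPeriod` and the `SumDatapoint` type are hypothetical names for illustration, not part of the platform's code:

```typescript
// Shape of one entry in the "Datapoints" array returned by
// `aws cloudwatch get-metric-statistics --statistics Sum` (other fields omitted).
interface SumDatapoint {
  Timestamp: string;
  Sum: number;
}

interface PeriodErrorRate {
  timestamp: string;
  invocations: number;
  errors: number;
  errorRate: number; // errors / invocations, between 0 and 1
}

// Hypothetical helper: joins the Invocations and Errors series on Timestamp.
function errorRateByPeriod(
  invocations: SumDatapoint[],
  errors: SumDatapoint[]
): PeriodErrorRate[] {
  const errorsByTimestamp: Record<string, number> = {};
  for (const point of errors) {
    errorsByTimestamp[point.Timestamp] = point.Sum;
  }

  const rates = invocations
    .filter((point) => point.Sum > 0) // skip idle periods to avoid dividing by zero
    .map((point) => {
      const errorCount = errorsByTimestamp[point.Timestamp] || 0;
      return {
        timestamp: point.Timestamp,
        invocations: point.Sum,
        errors: errorCount,
        errorRate: errorCount / point.Sum,
      };
    });

  // CloudWatch does not return datapoints in chronological order, so sort explicitly.
  return rates.sort((a, b) =>
    a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0
  );
}

// Sample data (made up for illustration).
const sampleRates = errorRateByPeriod(
  [
    { Timestamp: "2024-01-15T09:05:00+00:00", Sum: 200 },
    { Timestamp: "2024-01-15T09:00:00+00:00", Sum: 100 },
  ],
  [{ Timestamp: "2024-01-15T09:00:00+00:00", Sum: 5 }]
);
console.log(sampleRates);
```

Periods with no Errors datapoint are treated as zero errors, which matches how the alarms later in this document treat missing data.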

API Gateway Metrics

Get API requests

aws cloudwatch get-metric-statistics \
  --namespace AWS/ApiGateway \
  --metric-name Count \
  --dimensions Name=ApiName,Value=sgcarstrends-api \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --statistics Sum

Get 4XX errors

aws cloudwatch get-metric-statistics \
  --namespace AWS/ApiGateway \
  --metric-name 4XXError \
  --dimensions Name=ApiName,Value=sgcarstrends-api \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --statistics Sum

Get latency

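aws cloudwatch get-metric-statistics \
  --namespace AWS/ApiGateway \
  --metric-name Latency \
  --dimensions Name=ApiName,Value=sgcarstrends-api \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --statistics Average Maximum

Get p99 latency (percentiles use --extended-statistics, which cannot be combined with --statistics in one call)

aws cloudwatch get-metric-statistics \
  --namespace AWS/ApiGateway \
  --metric-name Latency \
  --dimensions Name=ApiName,Value=sgcarstrends-api \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --extended-statistics p99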

CloudWatch Alarms

Creating Alarms

// infra/alarms.ts
import { StackContext, use } from "sst/constructs";
import * as cloudwatch from "aws-cdk-lib/aws-cloudwatch";
// SnsAction is exported from aws-cloudwatch-actions, not aws-cloudwatch
import * as actions from "aws-cdk-lib/aws-cloudwatch-actions";
import * as sns from "aws-cdk-lib/aws-sns";
import * as subscriptions from "aws-cdk-lib/aws-sns-subscriptions";
import { API } from "./api";

export function Alarms({ stack, app }: StackContext) {
  const { api } = use(API);

  // Only create alarms for production
  if (app.stage !== "production") {
    return;
  }

  // SNS topic for alarms
  const alarmTopic = new sns.Topic(stack, "AlarmTopic");

  // Add email subscription
  alarmTopic.addSubscription(
    new subscriptions.EmailSubscription("alerts@sgcarstrends.com")
  );

  // High error rate alarm
  new cloudwatch.Alarm(stack, "ApiHighErrorRate", {
    metric: api.metricErrors(),
    threshold: 10,
    evaluationPeriods: 2,
    datapointsToAlarm: 2,
    alarmDescription: "API has high error rate",
    treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
  }).addAlarmAction(new actions.SnsAction(alarmTopic));

  // High duration alarm
  new cloudwatch.Alarm(stack, "ApiHighDuration", {
    metric: api.metricDuration(),
    threshold: 5000, // 5 seconds
    evaluationPeriods: 2,
    datapointsToAlarm: 2,
    alarmDescription: "API response time is high",
    treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
  }).addAlarmAction(new actions.SnsAction(alarmTopic));

  // Throttle alarm
  new cloudwatch.Alarm(stack, "ApiThrottled", {
    metric: api.metricThrottles(),
    threshold: 1,
    evaluationPeriods: 1,
    alarmDescription: "API is being throttled",
    treatMissingData: cloudwatch.TreatMissingData.NOT_BREACHING,
  }).addAlarmAction(new actions.SnsAction(alarmTopic));
}
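
How `threshold`, `evaluationPeriods`, `datapointsToAlarm`, and `NOT_BREACHING` interact is easier to see in a small simulation. The sketch below is a simplified model, not CloudWatch's actual evaluation logic: it assumes the CDK default comparison (greater than or equal to the threshold) and looks only at the most recent periods. `wouldAlarm` and `PeriodSample` are hypothetical names:

```typescript
// One value per period; null stands for a period with no data.
type PeriodSample = number | null;

// Simplified "M out of N" check: the alarm fires when at least
// `datapointsToAlarm` of the last `evaluationPeriods` samples breach the threshold.
// Missing samples are treated as not breaching (TreatMissingData.NOT_BREACHING).
function wouldAlarm(
  samples: PeriodSample[],
  threshold: number,
  evaluationPeriods: number,
  datapointsToAlarm: number = evaluationPeriods
): boolean {
  const recent = samples.slice(-evaluationPeriods);
  const breaching = recent.filter(
    (value) => value !== null && value >= threshold
  ).length;
  return breaching >= datapointsToAlarm;
}

// ApiHighErrorRate settings: threshold 10, 2 of the last 2 periods.
console.log(wouldAlarm([3, 12, 15], 10, 2, 2)); // both recent periods breach
console.log(wouldAlarm([12, null], 10, 2, 2)); // the missing period does not count as breaching
// ApiThrottled settings: threshold 1, a single period is enough.
console.log(wouldAlarm([0, 1], 1, 1));
```

With two-of-two evaluation, a single bad five-minute period never pages anyone; the throttle alarm, by contrast, reacts to the first throttled period.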

Add to SST config:

// infra/sst.config.ts
import { Alarms } from "./alarms";

export default {
  stacks(app) {
    app
      .stack(DNS)
      .stack(API)
      .stack(Web)
      .stack(Alarms); // Add alarms stack
  },
} satisfies SSTConfig;

Managing Alarms via CLI

List alarms

aws cloudwatch describe-alarms

Get alarm state

aws cloudwatch describe-alarms \
  --alarm-names "sgcarstrends-ApiHighErrorRate"

Disable alarm

aws cloudwatch disable-alarm-actions \
  --alarm-names "sgcarstrends-ApiHighErrorRate"

Enable alarm

aws cloudwatch enable-alarm-actions \
  --alarm-names "sgcarstrends-ApiHighErrorRate"

Delete alarm

aws cloudwatch delete-alarms \
  --alarm-names "sgcarstrends-ApiHighErrorRate"

CloudWatch Insights

Querying Logs

Start query

aws logs start-query \
  --log-group-name "/aws/lambda/sgcarstrends-api-production" \
  --start-time $(date -u -d '1 hour ago' +%s) \
  --end-time $(date -u +%s) \
  --query-string 'fields @timestamp, @message | filter @message like /ERROR/ | sort @timestamp desc | limit 20'

Get query results

aws logs get-query-results --query-id <query-id>
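
The commands in this document express the same time window in three formats: epoch milliseconds for `filter-log-events`, epoch seconds for `start-query`, and ISO 8601 timestamps for `get-metric-statistics`. Mixing them up is an easy mistake, so here is a minimal sketch that derives all three from one value. `timeWindow` and `TimeWindow` are hypothetical names, and the fixed date is only there to make the output reproducible:

```typescript
// One time window in the three formats used by the CLI commands in this document.
interface TimeWindow {
  startMillis: number; // aws logs filter-log-events --start-time
  startSeconds: number; // aws logs start-query --start-time
  endSeconds: number; // aws logs start-query --end-time
  startIso: string; // aws cloudwatch get-metric-statistics --start-time
  endIso: string; // aws cloudwatch get-metric-statistics --end-time
}

// Hypothetical helper: a window ending at `now` and starting `minutesAgo` earlier.
function timeWindow(minutesAgo: number, now: Date = new Date()): TimeWindow {
  const endMillis = now.getTime();
  const startMillis = endMillis - minutesAgo * 60 * 1000;
  // Drop fractional seconds and keep an explicit Z so the value is unambiguous UTC.
  const toIso = (millis: number): string =>
    new Date(millis).toISOString().replace(/\.\d{3}Z$/, "Z");

  return {
    startMillis,
    startSeconds: Math.floor(startMillis / 1000),
    endSeconds: Math.floor(endMillis / 1000),
    startIso: toIso(startMillis),
    endIso: toIso(endMillis),
  };
}

const lastHour = timeWindow(60, new Date("2024-01-15T10:00:00Z"));
console.log(lastHour.startIso); // 2024-01-15T09:00:00Z
console.log(lastHour.startSeconds); // 1705309200
console.log(lastHour.startMillis); // 1705309200000
```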

Common Queries

Find errors:

fields @timestamp, @message | filter @message like /ERROR/ | sort @timestamp desc | limit 20

API performance:

fields @timestamp, @duration | stats avg(@duration), max(@duration), min(@duration)

Count errors by type:

fields @message | filter @message like /ERROR/ | parse @message /(?<errorType>\w+Error)/ | stats count() by errorType

Slow requests:

fields @timestamp, @duration, @requestId | filter @duration > 1000 | sort @duration desc | limit 20

Request rate:

fields @timestamp | stats count() by bin(5m)
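
When log lines have already been exported or fetched, the "Count errors by type" query can be reproduced locally. The following is a rough sketch of the same filter, parse, and count steps; `countErrorTypes` is a hypothetical helper and the sample messages are made up:

```typescript
// Mirrors: filter @message like /ERROR/ | parse ... (\w+Error) | stats count() by errorType
function countErrorTypes(messages: string[]): Array<[string, number]> {
  const counts: Record<string, number> = {};

  for (const message of messages) {
    if (message.indexOf("ERROR") === -1) continue; // same filter as the query
    const match = /(\w+Error)/.exec(message);
    if (match === null) continue; // an ERROR line without a recognisable error type
    const errorType = match[1];
    if (errorType === undefined) continue;
    counts[errorType] = (counts[errorType] || 0) + 1;
  }

  // Most frequent first; ties broken alphabetically for stable output.
  return Object.keys(counts)
    .map((errorType) => [errorType, counts[errorType] || 0] as [string, number])
    .sort((a, b) => b[1] - a[1] || (a[0] < b[0] ? -1 : 1));
}

const sampleCounts = countErrorTypes([
  "ERROR TypeError: cannot read properties of undefined",
  "INFO request completed",
  "ERROR ValidationError: month is required",
  "ERROR TypeError: x is not a function",
  "ERROR upstream timed out",
]);
console.log(sampleCounts); // [["TypeError", 2], ["ValidationError", 1]]
```

Unlike the Insights query, this sketch drops ERROR lines that carry no `...Error` type instead of grouping them under an empty key.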

X-Ray Tracing

Enable X-Ray

// infra/api.ts
import { StackContext, Function } from "sst/constructs";
import * as lambda from "aws-cdk-lib/aws-lambda";

export function API({ stack }: StackContext) {
  const api = new Function(stack, "api", {
    handler: "apps/api/src/index.handler",
    tracing: lambda.Tracing.ACTIVE, // Enable X-Ray
  });

  return { api };
}

Instrument Code

// apps/api/src/index.ts
import { captureAWSv3Client } from "aws-xray-sdk-core";
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";

// Wrap AWS SDK clients
const client = captureAWSv3Client(new DynamoDBClient({}));

View Traces

Get service graph

aws xray get-service-graph \
  --start-time $(date -u -d '1 hour ago' +%s) \
  --end-time $(date -u +%s)

Get trace summaries

aws xray get-trace-summaries \
  --start-time $(date -u -d '1 hour ago' +%s) \
  --end-time $(date -u +%s)

Get trace details

aws xray batch-get-traces --trace-ids <trace-id>

Resource Monitoring

Lambda Functions

List functions

aws lambda list-functions --query 'Functions[?starts_with(FunctionName, `sgcarstrends`)].FunctionName'

Get function config

aws lambda get-function-configuration \
  --function-name sgcarstrends-api-production

Get function code location

aws lambda get-function \
  --function-name sgcarstrends-api-production

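Invoke function (AWS CLI v2 reads --payload as base64 by default, so pass --cli-binary-format raw-in-base64-out for raw JSON)

aws lambda invoke \
  --function-name sgcarstrends-api-production \
  --cli-binary-format raw-in-base64-out \
  --payload '{"path": "/health"}' \
  response.json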

cat response.json

CloudFront Distributions

List distributions

aws cloudfront list-distributions \
  --query 'DistributionList.Items[*].[Id,DomainName,Status]' \
  --output table

Get distribution config

aws cloudfront get-distribution-config --id <distribution-id>

Create invalidation (cache clear)

aws cloudfront create-invalidation \
  --distribution-id <distribution-id> \
  --paths "/*"

List invalidations

aws cloudfront list-invalidations --distribution-id <distribution-id>

S3 Buckets

List buckets

aws s3 ls

Get bucket size

aws s3 ls s3://bucket-name --recursive --summarize | grep "Total Size"

Monitor bucket metrics

aws cloudwatch get-metric-statistics \
  --namespace AWS/S3 \
  --metric-name BucketSizeBytes \
  --dimensions Name=BucketName,Value=bucket-name Name=StorageType,Value=StandardStorage \
  --start-time $(date -u -d '1 day ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 86400 \
  --statistics Average

Cost Monitoring

Cost Explorer

Get cost and usage

aws ce get-cost-and-usage \
  --time-period Start=$(date -u -d '1 month ago' +%Y-%m-%d),End=$(date -u +%Y-%m-%d) \
  --granularity MONTHLY \
  --metrics BlendedCost \
  --group-by Type=SERVICE

Get cost by tag

aws ce get-cost-and-usage \
  --time-period Start=$(date -u -d '1 month ago' +%Y-%m-%d),End=$(date -u +%Y-%m-%d) \
  --granularity MONTHLY \
  --metrics BlendedCost \
  --group-by Type=TAG,Key=Environment

Budget Alerts

Create budget in AWS Console or via CLI:

Create budget

aws budgets create-budget \
  --account-id $(aws sts get-caller-identity --query Account --output text) \
  --budget file://budget.json \
  --notifications-with-subscribers file://notifications.json
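
The command above reads `budget.json` and `notifications.json`, which this document does not show. As a hedged illustration of their shape, the sketch below builds both payloads and prints them as JSON. The budget name, the 50 USD limit, and the 80% threshold are placeholder assumptions, and `buildBudget` and `buildNotifications` are hypothetical helpers; the email address is the one used for alarms earlier:

```typescript
// Payload for --budget (a monthly cost budget).
function buildBudget(name: string, monthlyLimitUsd: number) {
  return {
    BudgetName: name,
    BudgetLimit: { Amount: String(monthlyLimitUsd), Unit: "USD" },
    TimeUnit: "MONTHLY",
    BudgetType: "COST",
  };
}

// Payload for --notifications-with-subscribers: email once actual spend
// passes the given percentage of the budget limit.
function buildNotifications(email: string, thresholdPercent: number) {
  return [
    {
      Notification: {
        NotificationType: "ACTUAL",
        ComparisonOperator: "GREATER_THAN",
        Threshold: thresholdPercent,
        ThresholdType: "PERCENTAGE",
      },
      Subscribers: [{ SubscriptionType: "EMAIL", Address: email }],
    },
  ];
}

// Placeholder values; adjust before use.
const budgetPayload = buildBudget("sgcarstrends-monthly", 50);
const notificationsPayload = buildNotifications("alerts@sgcarstrends.com", 80);

// Save the first output as budget.json and the second as notifications.json.
console.log(JSON.stringify(budgetPayload, null, 2));
console.log(JSON.stringify(notificationsPayload, null, 2));
```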

Debugging Production Issues

  1. Check Recent Deployments

Get stack events

aws cloudformation describe-stack-events \
  --stack-name sgcarstrends-api-production \
  --max-items 50

Get deployment status

npx sst stacks info API --stage production

  2. Check Logs for Errors

Get recent errors

npx sst logs --stage production --function api --filter "ERROR" --since 1h

Or use AWS CLI

aws logs tail "/aws/lambda/sgcarstrends-api-production" \
  --follow \
  --filter-pattern "ERROR"

  3. Check Metrics

Check invocations and errors

aws cloudwatch get-metric-statistics \
  --namespace AWS/Lambda \
  --metric-name Invocations \
  --dimensions Name=FunctionName,Value=sgcarstrends-api-production \
  --start-time $(date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%S) \
  --end-time $(date -u +%Y-%m-%dT%H:%M:%S) \
  --period 300 \
  --statistics Sum

  4. Test Endpoint

Test API directly

curl -I https://api.sgcarstrends.com/health

Test with verbose output

curl -v https://api.sgcarstrends.com/health

  5. Check Resource Limits

Check Lambda quotas

aws service-quotas get-service-quota \
  --service-code lambda \
  --quota-code L-B99A9384 # Concurrent executions

Check API Gateway quotas

aws service-quotas list-service-quotas \
  --service-code apigateway

Common Issues

High Latency

Investigation:

  • Check Lambda duration metrics

  • Review CloudWatch Insights for slow queries

  • Check database connection pool

  • Review API response times

Solutions:

  • Increase Lambda memory

  • Optimize database queries

  • Add caching

  • Use connection pooling

High Error Rate

Investigation:

  • Check error logs

  • Review error types

  • Check external service status

  • Verify environment variables

Solutions:

  • Fix application bugs

  • Add error handling

  • Retry failed requests

  • Check API rate limits
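
For the "Retry failed requests" item, one common approach is retrying with capped exponential backoff. This is a generic sketch rather than the platform's actual retry logic: `backoffDelays` and `withRetry` are hypothetical helpers, the delay values are arbitrary, and the `sleep` function is injected so the caller decides how to wait:

```typescript
// Delay before each retry: base * 2^i, capped at capMs.
function backoffDelays(maxRetries: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let i = 0; i < maxRetries; i++) {
    delays.push(Math.min(capMs, baseMs * Math.pow(2, i)));
  }
  return delays;
}

// Runs `operation` once, then retries after each delay until it succeeds
// or the delays are used up, in which case the last error is rethrown.
async function withRetry<T>(
  operation: () => Promise<T>,
  delays: number[],
  sleep: (ms: number) => Promise<void>
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= delays.length; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const delay = delays[attempt];
      if (delay !== undefined) {
        await sleep(delay);
      }
    }
  }
  throw lastError;
}

const retrySchedule = backoffDelays(4, 200, 1000);
console.log(retrySchedule); // [200, 400, 800, 1000]
```

In an application, `sleep` would typically wrap a timer, for example `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`. Retries only suit transient failures and idempotent requests; adding random jitter to the delays is a common refinement that this sketch leaves out.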

Cold Starts

Investigation:

  • Check init duration

  • Review package size

  • Check provisioned concurrency

Solutions:

  • Enable provisioned concurrency

  • Reduce bundle size

  • Use ARM architecture

  • Optimize imports
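
"Check init duration" refers to the `Init Duration` field that Lambda adds to a function's `REPORT` log line on cold starts only. A minimal sketch of pulling that field out of a log line; `parseReportLine` and `LambdaReport` are hypothetical names and the sample lines are made up:

```typescript
interface LambdaReport {
  durationMs: number;
  billedDurationMs: number;
  maxMemoryUsedMb: number;
  initDurationMs: number | null; // null for warm invocations
}

// Returns the first captured number for `pattern`, or null when it does not match.
function firstNumber(text: string, pattern: RegExp): number | null {
  const match = pattern.exec(text);
  if (match === null || match[1] === undefined) return null;
  return parseFloat(match[1]);
}

// Parses a Lambda REPORT line; returns null for any other kind of log line.
function parseReportLine(line: string): LambdaReport | null {
  if (line.indexOf("REPORT RequestId:") !== 0) return null;

  // Anchor on RequestId so this does not pick up "Billed Duration" or "Init Duration".
  const durationMs = firstNumber(line, /RequestId: \S+\s+Duration: ([\d.]+) ms/);
  const billedDurationMs = firstNumber(line, /Billed Duration: ([\d.]+) ms/);
  const maxMemoryUsedMb = firstNumber(line, /Max Memory Used: ([\d.]+) MB/);
  if (durationMs === null || billedDurationMs === null || maxMemoryUsedMb === null) {
    return null;
  }

  return {
    durationMs,
    billedDurationMs,
    maxMemoryUsedMb,
    initDurationMs: firstNumber(line, /Init Duration: ([\d.]+) ms/),
  };
}

const coldStart = parseReportLine(
  "REPORT RequestId: abc-123\tDuration: 102.25 ms\tBilled Duration: 103 ms\tMemory Size: 128 MB\tMax Memory Used: 58 MB\tInit Duration: 250.13 ms\t"
);
const warmStart = parseReportLine(
  "REPORT RequestId: def-456\tDuration: 12.5 ms\tBilled Duration: 13 ms\tMemory Size: 128 MB\tMax Memory Used: 58 MB\t"
);
console.log(coldStart, warmStart);
```

Counting the lines where `initDurationMs` is not null gives a rough cold-start rate for the sampled period.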

Monitoring Scripts

Health Check Script

#!/bin/bash
# scripts/health-check.sh

STAGE=${1:-production}

# Production uses the bare domains (api.sgcarstrends.com is the endpoint tested earlier;
# the bare web domain is assumed). Other stages get a stage subdomain.
if [ "$STAGE" = "production" ]; then
  API_URL="https://api.sgcarstrends.com"
  WEB_URL="https://sgcarstrends.com"
else
  API_URL="https://api.$STAGE.sgcarstrends.com"
  WEB_URL="https://$STAGE.sgcarstrends.com"
fi

echo "Checking health of $STAGE environment..."

# Check API
API_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$API_URL/health")

if [ "$API_STATUS" -eq 200 ]; then
  echo "✓ API is healthy"
else
  echo "✗ API is down (status: $API_STATUS)"
  exit 1
fi

# Check Web
WEB_STATUS=$(curl -s -o /dev/null -w "%{http_code}" "$WEB_URL")

if [ "$WEB_STATUS" -eq 200 ]; then
  echo "✓ Web is healthy"
else
  echo "✗ Web is down (status: $WEB_STATUS)"
  exit 1
fi

echo "All services are healthy!"

Run:

chmod +x scripts/health-check.sh
./scripts/health-check.sh production

Log Analysis Script

#!/bin/bash
# scripts/analyze-logs.sh

STAGE=${1:-production}
LOG_GROUP="/aws/lambda/sgcarstrends-api-$STAGE"

echo "Analyzing logs for $STAGE..."

# Count errors in last hour
ERROR_COUNT=$(aws logs filter-log-events \
  --log-group-name "$LOG_GROUP" \
  --filter-pattern "ERROR" \
  --start-time $(date -u -d '1 hour ago' +%s)000 \
  --query 'events[*].message' \
  --output text | wc -l)

echo "Errors in last hour: $ERROR_COUNT"

# Get top errors
echo -e "\nTop error types:"
aws logs filter-log-events \
  --log-group-name "$LOG_GROUP" \
  --filter-pattern "ERROR" \
  --start-time $(date -u -d '1 hour ago' +%s)000 \
  --query 'events[*].message' \
  --output text |
  grep -oE '\w+Error' |
  sort | uniq -c | sort -rn | head -5

Best Practices

  • Log Levels: Use appropriate log levels (DEBUG, INFO, WARN, ERROR)

  • Structured Logging: Use JSON format for easier parsing

  • Correlation IDs: Track requests across services

  • Alarms: Set up alarms for critical metrics

  • Dashboards: Create CloudWatch dashboards for key metrics

  • Cost Monitoring: Track AWS costs regularly

  • Regular Reviews: Review logs and metrics weekly

  • Retention: Set appropriate log retention (7-30 days)
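
The first three practices (log levels, structured logging, correlation IDs) fit together in a single log helper. A minimal sketch, assuming one JSON object per log line; `formatLogLine` and its field names are hypothetical, not the platform's actual logger:

```typescript
type LogLevel = "DEBUG" | "INFO" | "WARN" | "ERROR";

// Builds one JSON log line. Core fields are written last so that
// extra fields cannot overwrite them.
function formatLogLine(
  level: LogLevel,
  message: string,
  correlationId: string,
  fields: Record<string, unknown> = {},
  now: Date = new Date()
): string {
  return JSON.stringify({
    ...fields,
    timestamp: now.toISOString(),
    level,
    correlationId,
    message,
  });
}

const sampleLine = formatLogLine(
  "ERROR",
  "Failed to load registration data",
  "req-1234", // made-up ID; in a Lambda handler the request ID is a natural choice
  { route: "/health" },
  new Date("2024-01-15T10:00:00Z")
);
console.log(sampleLine);
```

Lines written this way still contain the text ERROR, so the filter commands earlier in this document keep matching them, and CloudWatch Logs Insights can query fields such as `level` and `correlationId` from JSON log events directly.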
