Kubectl Skill
This skill enables comprehensive Kubernetes cluster management using kubectl and related tools.
Environment
Cluster Aliases
Three cluster/namespace combinations are pre-configured:
Alias Cluster Namespace Purpose
k1
AWS EKS Production production
生产环境
k2
AWS EKS Production staging
预发布环境
k
K3s (192.168.10.117) simplex
本地开发环境
Usage:
k1 get pods # 查看生产环境 pods k2 get pods # 查看预发布环境 pods k get pods # 查看本地环境 pods
Additional Tools
-
kubectx
-
Switch between clusters
-
kubens
-
Switch between namespaces
-
argocd
-
GitOps deployments (see separate skill)
-
kargo
-
Progressive delivery (see separate skill)
Safety Protocol
Dangerous Operations Requiring Confirmation
Before executing any of the following operations, explicitly confirm with the user:
-
Delete operations: delete pod , delete deployment , delete service , delete pvc
-
Scale to zero: scale --replicas=0
-
Production modifications: Any k1 command that modifies resources
-
Drain/cordon nodes: drain , cordon , uncordon
-
Apply/patch: Changes to production resources
Confirmation Format
⚠️ 危险操作确认
环境: [Production/Staging/Local] 操作: [具体操作描述] 资源: [受影响的资源] 影响: [潜在影响说明]
是否继续执行?
Common Operations Reference
Resource Viewing
Pods
List pods with status
k1 get pods k1 get pods -o wide # Include node and IP info k1 get pods --show-labels # Show labels k1 get pods -l app=simplex-api # Filter by label
Pod details
k1 describe pod <pod-name>
Watch pods in real-time
k1 get pods -w
Deployments
List deployments
k1 get deployments k1 get deploy -o wide
Deployment details
k1 describe deployment <name>
Rollout status
k1 rollout status deployment/<name>
Rollout history
k1 rollout history deployment/<name>
Services & Endpoints
List services
k1 get services k1 get svc
Service details with endpoints
k1 describe svc <name> k1 get endpoints <name>
All Resources
Get all common resources
k1 get all
Get specific resource types
k1 get pods,svc,deploy
Get all resources with labels
k1 get all -l app=simplex-api
Logs & Debugging
Viewing Logs
Basic logs
k1 logs <pod-name>
Follow logs (streaming)
k1 logs -f <pod-name>
Last N lines
k1 logs --tail=100 <pod-name>
Logs since time
k1 logs --since=1h <pod-name> k1 logs --since=10m <pod-name>
Previous container logs (after restart)
k1 logs --previous <pod-name>
Multi-container pod
k1 logs <pod-name> -c <container-name>
All containers in pod
k1 logs <pod-name> --all-containers=true
Executing Commands
Execute command in container
k1 exec <pod-name> -- <command>
Interactive shell
k1 exec -it <pod-name> -- /bin/sh k1 exec -it <pod-name> -- /bin/bash
Specific container in multi-container pod
k1 exec -it <pod-name> -c <container> -- /bin/sh
Debugging
Pod events and status
k1 describe pod <pod-name>
Get pod YAML
k1 get pod <pod-name> -o yaml
Debug with ephemeral container
k1 debug <pod-name> -it --image=busybox
Check resource usage
k1 top pods k1 top nodes
Deployment Management
Scaling
Scale deployment
k1 scale deployment/<name> --replicas=3
Autoscale
k1 autoscale deployment/<name> --min=2 --max=5 --cpu-percent=80
Rolling Updates
Update image
k1 set image deployment/<name> <container>=<image>:<tag>
Rollout status
k1 rollout status deployment/<name>
Pause/resume rollout
k1 rollout pause deployment/<name> k1 rollout resume deployment/<name>
Rollback
k1 rollout undo deployment/<name> k1 rollout undo deployment/<name> --to-revision=2
Restart
Restart deployment (rolling restart)
k1 rollout restart deployment/<name>
Configuration Resources
ConfigMaps
List ConfigMaps
k1 get configmaps k1 get cm
View ConfigMap content
k1 describe cm <name> k1 get cm <name> -o yaml
Create from file
k1 create configmap <name> --from-file=<path>
Create from literal
k1 create configmap <name> --from-literal=key=value
Secrets
List Secrets
k1 get secrets
View Secret (base64 encoded)
k1 get secret <name> -o yaml
Decode Secret value
k1 get secret <name> -o jsonpath='{.data.password}' | base64 -d
Create Secret
k1 create secret generic <name> --from-literal=password=xxx
PersistentVolumeClaims
List PVCs
k1 get pvc
PVC details
k1 describe pvc <name>
Network Operations
Port Forwarding
Forward local port to pod
k1 port-forward pod/<name> 8080:80
Forward to service
k1 port-forward svc/<name> 8080:80
Background port-forward
k1 port-forward pod/<name> 8080:80 &
Service Exposure
Expose deployment as service
k1 expose deployment/<name> --port=80 --target-port=8080
Get service external IP
k1 get svc <name> -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
Cluster Management
Nodes
List nodes
k1 get nodes k1 get nodes -o wide
Node details
k1 describe node <name>
Node resource usage
k1 top nodes
Namespaces
List namespaces
k1 get namespaces
Switch namespace (using kubens)
kubens <namespace>
Create namespace
k1 create namespace <name>
Context Management
List contexts
kubectx
Switch context
kubectx <context-name>
Show current context
kubectl config current-context
Resource Monitoring
Pod resource usage
k1 top pods k1 top pods --sort-by=cpu k1 top pods --sort-by=memory
Node resource usage
k1 top nodes
HPA status
k1 get hpa k1 describe hpa <name>
Output Formatting
For Status Checks
Provide concise summaries:
✅ Pod 状态 (production) ┌──────────────────────────┬─────────┬──────────┬─────────┐ │ Pod │ Status │ Restarts │ Age │ ├──────────────────────────┼─────────┼──────────┼─────────┤ │ simplex-api-xxx-abc │ Running │ 0 │ 2d │ │ simplex-api-xxx-def │ Running │ 0 │ 2d │ └──────────────────────────┴─────────┴──────────┴─────────┘
For Troubleshooting
When investigating issues, gather:
-
Pod status: k1 get pod <name>
-
Pod events: k1 describe pod <name>
-
Recent logs: k1 logs --tail=50 <name>
-
Resource usage: k1 top pod <name>
Custom Output Formats
JSON output
k1 get pods -o json
YAML output
k1 get pod <name> -o yaml
Custom columns
k1 get pods -o custom-columns=NAME:.metadata.name,STATUS:.status.phase
JSONPath
k1 get pods -o jsonpath='{.items[*].metadata.name}'
Troubleshooting Workflows
Pod Not Starting
-
Check pod status: k1 get pod <name>
-
Check events: k1 describe pod <name> (look at Events section)
-
Check logs: k1 logs <name> or k1 logs --previous <name>
-
Common issues:
-
ImagePullBackOff : Check image name and registry credentials
-
CrashLoopBackOff : Check application logs
-
Pending : Check resource requests and node capacity
High Resource Usage
-
Check pod usage: k1 top pods --sort-by=memory
-
Check node usage: k1 top nodes
-
Check HPA status: k1 get hpa
-
Consider scaling: k1 scale deployment/<name> --replicas=N
Service Not Accessible
-
Check service: k1 get svc <name>
-
Check endpoints: k1 get endpoints <name>
-
Check pod labels match service selector
-
Test from within cluster: k1 exec -it <pod> -- curl <service>:<port>
Integration Notes
For GitOps operations (deployments via git), use the ArgoCD and Kargo skills:
-
ArgoCD: Application sync, rollback, status
-
Kargo: Progressive delivery, freight promotion
For AWS infrastructure operations, use the AWS CLI skill.