Monday, 17th June 2024
Kubernetes doesn’t expose metrics, logs, and trace data in the same way traditional apps and VMs do. Instead, Kubernetes captures data snapshots at specific points in the lifecycle.
Additionally, Kubernetes doesn’t centralise logs out of the box, so each app and cluster records data in its respective environment.
To integrate logs from different environments (e.g., system logs, k8s logs), auditing must be enabled, and a log processor is required to aggregate, process, and route logs to an external system.
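To make the aggregate, process, and route steps concrete, here is a minimal Python sketch. It is an illustration only, not how any particular log processor works: the `LogRecord` type, the source labels ("system", "k8s", "audit"), and the destination names are all hypothetical, and a real deployment would normally use a dedicated tool such as Fluent Bit or Fluentd.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class LogRecord:
    source: str    # hypothetical label, e.g. "system", "k8s", "audit"
    severity: str
    message: str


def aggregate(*streams: Iterable[LogRecord]) -> List[LogRecord]:
    """Merge records from several environments into a single list."""
    merged: List[LogRecord] = []
    for stream in streams:
        merged.extend(stream)
    return merged


def process(record: LogRecord) -> LogRecord:
    """Normalise a record: upper-case the severity, trim the message."""
    return LogRecord(record.source, record.severity.upper(), record.message.strip())


def route(
    records: Iterable[LogRecord],
    rules: Dict[str, str],
    default: str = "archive",
) -> Dict[str, List[LogRecord]]:
    """Group records by destination, chosen from each record's source."""
    routed: Dict[str, List[LogRecord]] = {}
    for record in records:
        destination = rules.get(record.source, default)
        routed.setdefault(destination, []).append(record)
    return routed


# Example: two environments, routed to hypothetical destinations.
system_logs = [LogRecord("system", "info", " boot ok ")]
cluster_logs = [
    LogRecord("k8s", "warning", "pod restarted"),
    LogRecord("audit", "notice", "role changed"),
]
routed = route(
    [process(r) for r in aggregate(system_logs, cluster_logs)],
    rules={"system": "ops", "audit": "security"},
)
```

Even this toy version shows why the effort adds up: every source needs a normalisation step and a routing rule before anything reaches an external system.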
A lot of effort, right?
How Google Cloud makes this simple
GCP streamlines logging and monitoring by integrating these services directly into the GKE platform. Unlike traditional Kubernetes setups that rely heavily on CLI configurations, GCP offers both an intuitive user interface and CLI options for seamless interaction with your clusters.
You can also integrate your development workflows into a CI/CD pipeline or use a GitOps architecture. GitOps is the recommended option for Kubernetes environments because it applies the same declarative principles. Either approach works for both on-prem and cloud-based workloads. See our article on replacing your CI/CD pipeline with GitOps.
Building an observability strategy
Google Cloud and its managed Kubernetes service, Google Kubernetes Engine (GKE), offer robust logging and monitoring right out of the box. To maximise their benefits, it’s essential to implement these features thoughtfully from the design stage.
Starting simple and gradually scaling up ensures a smooth integration process without unnecessary complexity.
The Managed Prometheus service on GCP enables you to leverage powerful open-source tools, avoiding typical vendor lock-in. While configuring customised solutions might require a more hands-on approach, the flexibility and control you gain can outweigh the costs associated with a more complex setup. Every solution has its costs; understanding and managing these costs effectively is key to a successful observability strategy.
The first step
The first step in planning a GCP observability strategy is determining which specific cloud services your workloads use in GCP. Then, you can read the Google Cloud metrics documentation to determine which types of metrics are available for those services. If you run workloads hosted in VMs using Compute Engine, you have an entirely different set of metrics to collect than you do for workloads running in GKE.
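As a rough sketch of this first step, the snippet below maps the services a workload uses to the metric-type prefixes to look up in the metrics documentation. The dictionary keys and the `metric_filters` helper are hypothetical names chosen for illustration. The prefixes `compute.googleapis.com/` and `kubernetes.io/` are the ones Google Cloud uses for Compute Engine and GKE system metrics, and the filter strings follow the Cloud Monitoring `starts_with` filter form; treat both as something to verify against the current documentation.

```python
from typing import Dict, List

# Metric-type prefixes per service. Keys are hypothetical labels for this sketch.
METRIC_PREFIXES: Dict[str, str] = {
    "compute_engine": "compute.googleapis.com/",
    "gke": "kubernetes.io/",
}


def metric_filters(services: List[str]) -> List[str]:
    """Return one Cloud Monitoring filter string per service in use."""
    filters = []
    for service in services:
        if service not in METRIC_PREFIXES:
            raise KeyError(f"no known metric prefix for service: {service}")
        prefix = METRIC_PREFIXES[service]
        filters.append(f'metric.type = starts_with("{prefix}")')
    return filters


# A workload split across VMs and GKE needs both metric families.
filters = metric_filters(["compute_engine", "gke"])
```

Listing the prefixes up front makes it obvious when two workloads need entirely different sets of metrics.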
The costs
In GKE, logging and monitoring incur costs based on usage and configuration. The more complex your setup and the more services you integrate, the higher the resource usage and the resulting bill. The upside of usage-based pricing is that a small setup costs little, which makes it easy to get started while keeping your costs manageable.
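To get a feel for how usage drives the bill, here is a back-of-the-envelope sketch. It assumes a simple model with a free monthly allowance followed by a flat per-GiB rate; that model, the function name, and every number in the example are placeholders rather than actual GCP prices, so check the current pricing pages before relying on it.

```python
def estimate_monthly_cost(ingested_gib: float, free_gib: float, price_per_gib: float) -> float:
    """Estimate a monthly charge: only volume above the free allowance is billed."""
    billable_gib = max(0.0, ingested_gib - free_gib)
    return round(billable_gib * price_per_gib, 2)


# Placeholder figures, NOT real prices: 50 GiB free, 0.50 per GiB afterwards.
small_setup = estimate_monthly_cost(ingested_gib=30, free_gib=50, price_per_gib=0.50)
larger_setup = estimate_monthly_cost(ingested_gib=120, free_gib=50, price_per_gib=0.50)
```

Under these assumed figures the small setup stays inside the allowance, while the larger one is billed only for the 70 GiB above it.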
With GKE, you can leverage "metrics exporters" using various GCP native services, each designed to handle specific metrics efficiently (e.g. system metrics, kube-state metrics, cAdvisor/kubelet metrics, Google Cloud Managed Service for Prometheus, etc.). This allows for comprehensive and granular monitoring tailored to your needs.
For customised solutions, GKE provides the flexibility to disable managed services and deploy your own, ensuring your observability setup aligns perfectly with your business and technology requirements. For instance, if you need to customise Control Plane metrics, you can seamlessly integrate your metric exporter into the GCP platform. Cloud Logging and Cloud Monitoring are integrated with GKE, offering a solid foundation that you can build upon with specialised solutions to suit your unique needs.
Additional considerations
If you want to monitor the actions that human and machine users perform in your GCP environment, you can use Cloud Audit Logs, a GCP service that tracks administrative activities. Cloud Audit Logs only works for GCP services that generate audit logs. Though this includes most services, not all types of actions are recorded for every service.
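As a small illustration, the helper below builds a Cloud Logging filter that selects one type of audit log for a project. The function name and the project ID in the example are hypothetical; the log name pattern `cloudaudit.googleapis.com%2F<type>` and the four log types are the ones Cloud Audit Logs documents, but confirm them for the services you use.

```python
# Audit log types documented for Cloud Audit Logs.
AUDIT_LOG_TYPES = {"activity", "data_access", "system_event", "policy"}


def audit_log_filter(project_id: str, log_type: str = "activity") -> str:
    """Build a Cloud Logging filter string for one audit log type."""
    if log_type not in AUDIT_LOG_TYPES:
        raise ValueError(f"unknown audit log type: {log_type}")
    # "%2F" is the URL-encoded "/" that Cloud Logging uses inside log names.
    return f'logName="projects/{project_id}/logs/cloudaudit.googleapis.com%2F{log_type}"'


# "my-project" is a placeholder project ID.
admin_activity_filter = audit_log_filter("my-project")
```

The resulting string can be pasted into the Logs Explorer query box or passed as the filter argument to `gcloud logging read`.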