Outcold Solutions provides solutions for monitoring Kubernetes, OpenShift and Docker clusters in Splunk Enterprise and Splunk Cloud. We offer Splunk applications that give you insights across all container environments. We help businesses reduce the complexity of logging and monitoring by providing solutions for Linux and Windows containers that are easy to use and deploy. Our applications help developers monitor their applications and help operators keep their clusters healthy. With the power of Splunk Enterprise and Splunk Cloud, we offer a unique solution that keeps all your metrics and logs in one place, so you can quickly answer complex questions about container performance and cluster health.
We provide solutions for monitoring Kubernetes, OpenShift and Docker clusters in Splunk Enterprise and Splunk Cloud. With a 10-minute setup, you get a monitoring solution that includes log aggregation, performance and system metrics, metrics from the control plane and application metrics, a dashboard for reviewing network activity, and alerts to notify you about cluster or application performance issues.
All our solutions are powered by Collectord, container-native software built by Outcold Solutions that discovers, transforms and forwards logs, collects system metrics, collects metrics from the control plane of the orchestration frameworks, and forwards network activity. Collectord provides flexible and powerful tools for transforming logs. With it you can hide sensitive information in log lines before forwarding them, and you can reduce the licensing costs associated with log aggregation by choosing which data to forward from the log streams. Collectord forwards container logs and host logs, and can discover logs written by containerized applications.
See detailed metrics from containers and processes, including performance metrics, utilization metrics and security insights. Forward application-specific metrics, exported in Prometheus format. Use prebuilt Splunk dashboards for a comprehensive overview.
Aggregate logs from containers, applications, and servers. Use flexible mappings to filter logs enriched with container metadata, correlate logs with metrics, and leverage Splunk capabilities for analyzing logs. Use Collectord to transform logs before they reach Splunk and to remove sensitive information and PII, helping keep your logs GDPR compliant. With Collectord you can reduce licensing and storage costs by choosing which log lines you want to forward.
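As a rough illustration of what removing sensitive information before forwarding can look like, the sketch below masks email addresses and IPv4 addresses in a log line with regular expressions. It is not Collectord's implementation or configuration syntax; the function name `redact` and both patterns are assumptions made for this example.

```python
import re

# Hypothetical patterns for this example only; a real deployment would
# define its own rules for what counts as sensitive.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact(line: str) -> str:
    """Mask email addresses and IPv4 addresses in a single log line."""
    line = EMAIL.sub("<email>", line)
    line = IPV4.sub("<ip>", line)
    return line


if __name__ == "__main__":
    print(redact("login ok user=jane@example.com from 10.0.0.12"))
```

Masking before the line leaves the node means the original value is never indexed, which is the property that matters for PII handling.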
Diagnose cluster issues by looking at historical events, monitoring allocations, and regulating cluster capacity. Leverage pre-built alerts for monitoring the health of the clusters out of the box.
Define access to the data by clusters, namespaces and even pods or containers. Review network activity inside your cluster and connections to the outside. Verify containers running with elevated security permissions. Use audit logs for monitoring changes in deployments.
Use one tool to collect and forward the logs and metrics developers need to review the performance and health of their applications. With annotations, developers can define how they want to see the data in the log aggregation tool: specify multiline log patterns, remove terminal escape codes, and override types, sources and indexes.
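To make the multiline and escape-code handling mentioned above more concrete, here is a minimal sketch of the general technique: strip ANSI color sequences, then treat any line that does not start with a date as a continuation of the previous event. The names `strip_escape_codes`, `join_multiline` and the `EVENT_START` pattern are hypothetical and do not reflect Collectord's annotation syntax.

```python
import re
from typing import Iterable, Iterator

# Matches ANSI CSI sequences such as "\x1b[31m" (set color) or "\x1b[0m" (reset).
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")

# Assumed rule for this example: a new event starts with a YYYY-MM-DD date.
EVENT_START = re.compile(r"^\d{4}-\d{2}-\d{2}")


def strip_escape_codes(line: str) -> str:
    """Remove terminal escape sequences from one line."""
    return ANSI_CSI.sub("", line)


def join_multiline(lines: Iterable[str]) -> Iterator[str]:
    """Group lines into events; lines not matching EVENT_START extend the current event."""
    event = []
    for raw in lines:
        line = strip_escape_codes(raw)
        if EVENT_START.match(line) and event:
            yield "\n".join(event)
            event = []
        event.append(line)
    if event:
        yield "\n".join(event)
```

With this rule a stack trace that follows an error line stays attached to that line instead of being indexed as separate events.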
- Bug fix: events dashboard does not filter by the namespace name
- New dashboard: Collectord metrics
- Compatibility for Kubernetes 1.20
- Bug fix: broken link in Allocatable Resources dashboard
Collectord updates:
- Annotations for collecting prometheus metrics: authorization keys and CAName for SSL certificates
- Improvement for DNS resolutions of Splunk output FQDN
- Export internal collectord metrics in Prometheus format
- Forwarding internal collectord metrics to Splunk
- Allow hiding management fields for the watch objects inputs
- In the diag include all open file descriptors
- Upgrade go runtime to 1.14.13
- Remove `\0` symbol from the labels values in the prometheus metrics
- Allow to filter host logs with blacklist and whitelist
- Bug fix: less verbose warnings about not being able to load resources from API server
- Bug fix: performance improvements for Ack DB
- Bug fix: custom prometheus metrics forwarded by Collectord do not include cluster field or custom user fields
- Bug fix: addon pod terminates faster
....
5.15.300 - 2020-06-01
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.15.300 or above (see https://www.outcoldsolutions.com for latest configuration)
- Events dashboard: filters depend on selection of cluster and node labels
- Improvements for supporting Kubernetes 1.14 and higher (OpenShift 4.2+)
- Improvement for alert "Cluster Warning: high number of errors to Kubernetes API" (only alert on 5xx errors)
- Bug fix: node events aren't visible in Events tab
Collectord updates:
- Support for annotations to add custom user fields to data
- Support for blacklisting and whitelisting Prometheus metrics (significantly reducing the indexing cost of data)
- Verify command improvements - verify proper configurations for cgroup (memory/memory.use_hierarchy is 1)
- Bug fix: fix bug in prometheus metrics parser, empty fields can be filled with previous fields
- Bug fix: occasionally addon can report warnings about trying to delete expired keys
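The whitelisting and blacklisting of Prometheus metrics listed above can be pictured with the following sketch. It assumes glob-style patterns matched against metric names, which the release notes do not specify; `filter_metrics` and its parameters are hypothetical names used only for illustration.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence


def filter_metrics(
    names: Iterable[str],
    whitelist: Sequence[str] = (),
    blacklist: Sequence[str] = (),
) -> List[str]:
    """Keep names that match the whitelist (when one is given) and no blacklist pattern."""
    kept = []
    for name in names:
        # An empty whitelist means "allow everything".
        if whitelist and not any(fnmatchcase(name, p) for p in whitelist):
            continue
        if any(fnmatchcase(name, p) for p in blacklist):
            continue
        kept.append(name)
    return kept
```

Dropping histogram buckets or runtime metrics that no dashboard uses is a typical way such filters reduce the volume of indexed data.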
..
- Logs dashboard: filters depend on selection
- Overview dashboard: namespace counter for list of projects
Collectord updates:
- Support templates in the index, source and sourcetype
- Allow to exclude indexed fields when forwarding to Splunk
- Support annotation for stats interval for containers
- Support containerd runtime
- Bug fix: verify command can show incorrect error about verifying journald input
- Bug fix: index on namespace should set index for application logs
- Bug fix: warning about not being able to retrieve node information
- Improvement for the alerts verifying that control components or nodes are down.
5.12.271 - 2019-11-07
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.12.271 or above (see https://www.outcoldsolutions.com for latest configuration)
- Improvements for the macros for backward compatibility
Collectord updates:
- Bug fix: when event pattern is used for joining multi-line events, the error cannot be shown if raised by the input in pipeline.
- Bug fix: reduce warnings failed to get the new event in pipeline - submitted
- Stability improvements
- Compact metrics (pre-calculated on Collectord side)
- Switched stats for host and cgroup in different macros
- Use base macro for alerts
- Improved command extraction for exec in Audit Logs
- Add cluster name in the alert results
Collectord updates:
- Watch namespaces and workloads for changes
- Global configurations with Custom Resources and selectors
- Describe command to see applied annotations for pods
- Bug fix: panic when pipe join configuration is removed
- Bug fix: panic when proc stats is enabled and cgroup stats is disabled
- Bug fix: support ProxyBasicAuthorization for license server checks
- Bug fix: Fix for collecting first sample (can show high CPU usage for first sample)
- Bug fix: if list of URLs is used for Splunk output, the empty URL is still required
- Beta: dynamic index, source and sourcetype names based on the metafields
- Beta: cluster diagnostics with one rule: node entropy
5.11.260 - 2019-09-09
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.11.260 or above (see https://www.outcoldsolutions.com for latest configuration)
- GPU Monitoring (NVIDIA)
Collectord updates:
- Support for PVC volumes for application logs
- Bug fix: small memory leak in addon
- Bug fix: duplicate events when pipeline is getting throttled
- Bug fix: don't use throttling for devnull output
- Bug fix: better recovery for ack db corruption
- Bug fix: crash on journald input initialization when ack db is corrupted
- Bug fix: annotations joinmultiline requires joinpartial
- Bug fix: configurations for stdout only with annotations can crash collectord
- Set events = 50 by default for Splunk output batches
- Security dashboard: Access: access to host via ssh, sudo, exec commands, failed access
- Security dashboard: Audit (users and namespaces)
- Security dashboard: Network (traffic)
- Security dashboard: Network (connections)
- Security dashboard: Objects (pods) - review pods with host network, age of pods, image pull policy, attached host paths, security context and restart policies
- Review dashboard: Clusters (allocations and usage)
- Cluster field filters
- Base macro for overriding macros for other macros
Collectord updates:
- Support for volatile and persistent journald storage with default configuration
- Updated YAML configuration to include most common resources
- Better support for overriding sourcetype, that does not require to update the Splunk macros
- Bug fix: rarely when collectord fails to post to HEC it can panic
- Bug fix: better support for Kubernetes 1.14 and CRI-O storage
- Bug fix: space characters in index annotations can break the pipeline
5.9.240 - 2019-05-14
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.9.240 or above (see https://www.outcoldsolutions.com for latest configuration)
- Visual improvements on the graphs for the number of logs and events
- New alerts for the CPU and Memory reservation
Collectord updates:
- Support for multiple Splunk destinations (outputs)
- Support subdomains for annotations (to deploy multiple collectord instances)
- Support for streaming objects from Kubernetes API to Splunk
- Bug fix: journald input keeps fd open to the rotated files
- Bug fix: fix in the annotation parser for the interval annotations
- Bug fix: fix splunk url selection configuration for multiple splunk URLs
5.8.231 - 2019-04-25
--------------------------------------------------------------------------------
- Bug fix: Collectord usage report shows trial licenses for all instances
5.8.230 - 2019-04-22
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.8.230 or above (see https://www.outcoldsolutions.com for latest configuration)
- Use multiselect filters for most dashboards and filters with possibility to input custom filters.
- Reduce dedup usage to improve performance on dashboards.
- Add critical pod annotations for Kubernetes ...1.13, and priority class for Kubernetes 1.14...
- Fix: statefulset dashboard does not show data with filters.
- Add graph of number of pods per namespace on Overview dashboard.
Collectord updates:
- Bug fix: clogging collectord output with errors when incorrect index is used.
- Bug fix: short-lived containers can result in duplicated logs.
- Bug fix: clogging collectord output with warnings when kernel reports incorrect VmRss size.
- Bug fix: annotations cannot override timestamp location for fields extraction.
- Bug fix: verify command reports Journald input in incorrect place.
Requires collectorforkubernetes version 5.7.220 or above (see https://www.outcoldsolutions.com for latest configuration)
- Review savedsearches/alerts to support indexing delay (start searches from 2 minutes behind) and run them at more random times.
- Workload dashboard - change CPU (of host) in table to real CPU
- Fixed single value memory panel on host dashboard (missed span)
- Use SEGMENTATION=none for stats events to use less disk space (needs to be moved to indexers)
Collectord updates:
- Support hostname formatting with environment variables in configuration
- New rotated file logic uses fewer file descriptors and frees rotated files quicker
- Allow to specify a default sampling value for container logs
- Reimplemented shutdown sequence to stop collectord faster
- Allow to override sampling percent with annotations
- New Input: journald
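The sampling items above (a default sampling value, overridable per container) can be sketched as follows. This is only an illustration of percentage-based sampling under the assumption that each log line is kept independently at random; the release notes do not describe how Collectord selects lines, and `sample_lines` and `effective_percent` are hypothetical names.

```python
import random
from typing import Iterable, Iterator, Optional


def effective_percent(default: float, override: Optional[float]) -> float:
    """A per-container override, when present, wins over the default."""
    return default if override is None else override


def sample_lines(
    lines: Iterable[str],
    percent: float,
    rng: Optional[random.Random] = None,
) -> Iterator[str]:
    """Yield each line with probability percent / 100."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    if rng is None:
        rng = random.Random()
    for line in lines:
        # rng.random() is in [0, 1), so 100 keeps everything and 0 keeps nothing.
        if rng.random() * 100 < percent:
            yield line
```

Passing a seeded `random.Random` makes a run reproducible, which is convenient when checking how much volume a given percentage would forward.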
5.6.212 - 2019-02-19
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.6.212 or above (see https://www.outcoldsolutions.com for latest configuration)
- New: Alert: high CPU usage on the host.
- Fixed: Splunk usage dashboard - charts do not show the data when the used indexes aren't searchable by default.
- New: Support Dark theme.
- New: Free text search in Logs dashboard.
- New: Add auto-refresh options to the dashboard.
- Fixed: Revisited CPU limits and requests for Pods and Containers.
- New: add CPU Max, Memory Max and Project/Namespace labels to the Review-Namespaces dashboard.
- Fixed: Show deleted events
Read more https://www.outcoldsolutions.com/docs/monitoring-kubernetes/release-history/
5.5.202 - 2019-01-24
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.5.202 or above (see https://www.outcoldsolutions.com for latest configuration)
- New: Dashboard Review -> namespaces. Review allocations and requests for namespaces and pods.
- Fixed: kubernetes_stats_cpu_request_percent - is divided by the number of CPU.
Collectord updates:
- Fixed: Interval 0 in prometheus input can crash the collectord.
- Fixed: When both glob and match are set for the application logs, the glob pattern can block the match pattern from
finding the files in the volume.
5.4.201 - 2018-12-19
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.4.201 or above (see https://www.outcoldsolutions.com for latest configuration)
- Fixed: Alerts for licenses issued with AWS Subscriptions
Collectord updates:
- Fixed: Better handling rotated files (less open fd)
- Fixed: Events input can hang in the err loop.
5.4 - 2018-12-17
--------------------------------------------------------------------------------
Requires collectorforopenshift version 5.4 or above (see https://www.outcoldsolutions.com for latest configuration)
- Improved: etcd metrics representation for bucket values.
- Fixed: API latency alert - exclude imagestreamimports.
- Compatibility update for collectord 5.4.
Collectord updates:
- New: Attach EC2 metadata fields
- New: Basic Auth for Proxy (License Server and Splunk)
- Fixed: Collectord verify reports CRI-O as an unsupported runtime.
- Fixed: Rare crash on Prometheus metrics definition.
- Fixed: Better handling of acknowledgment database corruption.
- Fixed: When handling incorrect indexes, collectord can send index with an empty string, which Splunk recognizes as an incorrect index
5.3 - 2018-11-19
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.3 or above (see https://www.outcoldsolutions.com for latest configuration)
- Fixed: Improved Workload dashboard. Allows to filter by namespace, see all Pods in a specific namespace, filter by workload label.
- New: Alert for showing when Collectord reports errors in Processing pipelines (as an example if it failed to extract fields).
- New: Alert for showing when Collectord reports warnings.
- Fixed: Add node labels filter to Storage Dashboard and Control Plane Dashboards.
- New: Alert if lag in the indexing of the data.
- New: Splunk Usage (License usage, number of events) report under Setup.
- Fixed: adjusted high amount of errors to Kubernetes API dashboard to make it less verbose.
https://www.outcoldsolutions.com/docs/monitoring-kubernetes/release-history/
5.2.180 - 2018-10-29
- Fixed: misprint in the search for showing alerts
5.2.180 - 2018-10-28
- Fixed: lookup with alerts causing very frequent replication activities on SHC
5.2.179 - 2018-10-17
- Fixed: changed search time for a few alerts that cause false positives with indexing lag on large installations
5.2 - 2018-10-15
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 5.2 or above (see https://www.outcoldsolutions.com for latest configuration)
- New: Review/Storage dashboard based on storage metrics and PVC metrics.
- New: predefined alerts to help you monitor the health of the clusters and performance of the applications.
- Fixed: Performance improvements
...
For details https://www.outcoldsolutions.com/docs/monitoring-kubernetes/release-history/
- New: Network metrics (MB, Packets, Drops, and Errors) for host and containers.
- New: Network socket tables (list of the port that containers and hosts are listening on, connections to external resources).
- New: Network review dashboard to see the list of connections to public services and within the private network.
- Improvement: Replace python-based lookup with a macro written with eval.
- Improvement: Visual improvement for showing when the object was Last Seen (highlighting and showing minutes ago).
... and more
https://www.outcoldsolutions.com/docs/monitoring-kubernetes/release-history/
Highlights:
- Application logs
- Annotations for fields extraction, hiding sensitive information, time extraction, redirecting to /dev/null, stripping terminal colors and more
For more details:
https://www.outcoldsolutions.com/docs/monitoring-kubernetes/release-history/
- New dashboard: Cluster/Audit
- New dashboard: Cluster/Kubernetes API Server
- New dashboard: Cluster/Kubelet
- New dashboard: Cluster/etcd
- New dashboard: Cluster/Scheduler
- New dashboard: Cluster/Controller Manager.
- Include image name when listing containers.
- Added syslog component to the list of host logs.
- Fixed: Include Daemon Set on Overview dashboard, list of namespaces.
3.0.23 - bug fixes release
3.0.22
New overview, security and capacity dashboards. Workload aggregation dashboard.
A lot of bug fixes and performance improvements.
Release Notes: https://www.outcoldsolutions.com/docs/monitoring-kubernetes/release-history/#30-2018-02-07
Upgrade instructions: https://www.outcoldsolutions.com/docs/monitoring-kubernetes/upgrade-2-to-3/
Requires collectorforkubernetes version 3.0 or above (see https://www.outcoldsolutions.com for latest configuration)
2.1.21 - 2018-01-02
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 2.1.59.171209 or above
- Updated author and description
2.1.20 - 2017-12-09
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 2.1.59.171209 or above
- Fixed link to setup / installation instructions.
2.1.18 - 2017-12-09
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 2.1.59.171209 or above
- Implemented collectors dashboard to track number of collectors, their versions
and used licenses.
- Fallback to the process IO statistics when blkio is not available.
- Fix IO statistic graphs: average was shown when sum should be used.
- Fields extraction support for nginx ingress 0.9 and above.
- [collector] Improved resistance for storage failures.
- [collector] License checks reporting.
- [collector] Better support for openshift environment (default configuration).
2.0 - 2017-10-22
--------------------------------------------------------------------------------
Requires collectorforkubernetes version 2.0.37.171023 or above
- Better labels support in Dashboards.
Collector has a breaking feature, replacing format for labels from
`kubernetes_node_labels_LABEL1=VALUE1` to `kubernetes_node_labels=[LABEL1=VALUE1,LABEL2=VALUE2]`.
- Process level metrics.
- Uptime for hosts and processes.
- Fields extraction for kubernetes controller manager and scheduler.
- Fields extraction and support in dashboards for main kubernetes components (setup
host logs collection with collector).
- New top-like dashboards allow to monitor Hosts/Pods/Containers/Processes in realtime.
- Rewritten Kubernetes Objects Dashboards with support of Events and Labels.
- Improved dashboards navigation.
- Support for host logs.
- Other bugs and improvements based on user feedback.
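Because the label format change in this release is a breaking one, searches or scripts that read the old per-label fields need adjusting. The sketch below shows one way to convert between the two formats quoted above. It assumes label names and values contain no commas or brackets, since the notes do not describe any escaping; `old_to_new` and `parse_new` are hypothetical helper names.

```python
from typing import Dict

PREFIX = "kubernetes_node_labels_"
FIELD = "kubernetes_node_labels"


def old_to_new(fields: Dict[str, str]) -> str:
    """Fold old-style `kubernetes_node_labels_<LABEL>=<VALUE>` fields into one new-style field."""
    pairs = []
    for key, value in fields.items():
        if key.startswith(PREFIX):
            label = key[len(PREFIX):]
            pairs.append(label + "=" + value)
    return FIELD + "=[" + ",".join(pairs) + "]"


def parse_new(field: str) -> Dict[str, str]:
    """Parse `kubernetes_node_labels=[A=1,B=2]` into a dict of label names to values."""
    name, _, value = field.partition("=")
    if name != FIELD or not (value.startswith("[") and value.endswith("]")):
        raise ValueError("unexpected field: " + repr(field))
    inner = value[1:-1]
    if not inner:
        return {}
    # Split each pair on the first "=" only, so values may contain "=".
    return dict(pair.split("=", 1) for pair in inner.split(","))
```

Fields that do not carry the old prefix, such as a host name, are ignored by `old_to_new`.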
Updated links to official documentation for installation instructions.
Fix labels on Kubernetes Dashboards (most of the filters have the incorrect label Daemon Sets)
Monitoring Kubernetes