Prometheus (software)

Prometheus
Initial release: November 24, 2012
Stable release: v3.2.1[1] / February 26, 2025
Written in: Go
Operating system: Cross-platform
Type: Time series database
License: Apache License 2.0
Website: prometheus.io
Repository: github.com/prometheus/prometheus

Prometheus is a free software application for event monitoring and alerting.[2] It records metrics in a time series database, collecting them over an HTTP pull model, and supports high dimensionality through key-value label pairs, flexible queries, and real-time alerting.[3] The project is written in Go and licensed under the Apache 2.0 License, with source code available on GitHub.[4]

Prometheus originated at SoundCloud in 2012 and was accepted by the Cloud Native Computing Foundation (CNCF) in 2016, graduating from incubation in 2018. It is commonly paired with Grafana for dashboard visualization and supports a wide range of exporters and integrations.

History

Prometheus was developed at SoundCloud starting in 2012,[5] after the company found that its existing metrics tools, based on StatsD and Graphite, could not meet the demands of its containerized infrastructure. The design goals included a multi-dimensional data model, operational simplicity, scalable data collection, and a powerful query language in a single tool.[6] The project was open source from the start and was adopted by Boxever and Docker users before any official announcement.[6][7]

The design was influenced by Borgmon, Google's internal time-series monitoring system, which treated time-series data as a source for alert generation.[8][9]

By 2013, Prometheus was in production use at SoundCloud. The project was publicly announced in January 2015.[6]

In May 2016, the Cloud Native Computing Foundation accepted Prometheus as its second incubated project, after Kubernetes.[10] In August 2018, the CNCF announced that Prometheus had graduated from incubation.[11]

Versions

Prometheus 1.0 was released in July 2016.[12] Subsequent releases through 2016 and 2017 led to Prometheus 2.0 in November 2017, which introduced a new storage engine with significantly improved performance and reduced disk usage.[13]

Architecture

A typical Prometheus monitoring deployment consists of several components working together.[5] Exporters run on monitored hosts to collect and expose local metrics. The Prometheus server scrapes those exporters at a configured interval, aggregates the data, and stores it locally. Alertmanager[14] receives alerts from Prometheus and handles routing, grouping, and silencing before forwarding notifications. Grafana is commonly used to build dashboards from Prometheus data. Queries against all of these are written in PromQL, Prometheus's native query language.
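
Which targets the server scrapes, and how often, is declared in its YAML configuration file. A minimal illustrative fragment (the job name and target address are placeholders, not part of any real deployment):

```yaml
global:
  scrape_interval: 15s     # how often to scrape each target

scrape_configs:
  - job_name: "node"       # label attached to all series from this job
    static_configs:
      - targets: ["localhost:9100"]   # address of an exporter to scrape
```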

Data model

Prometheus data is organized as named metrics, each optionally qualified by an arbitrary number of key-value label pairs. Labels can identify the data source (server name, datacenter) or carry application-specific context such as HTTP status code, request method, or endpoint. Querying in real time against any combination of labels is what makes the data model multi-dimensional.[15][6][7]
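
The label-based model can be illustrated with a small sketch (metric names, labels, and values here are made up; this is not Prometheus's storage code). Each sample carries a name and arbitrary key-value labels, and a query is simply a set of label matchers:

```python
# Illustrative sketch of the multi-dimensional data model: a sample is a
# label set (including the metric name) plus a value.
samples = [
    ({"__name__": "http_requests_total", "method": "GET",  "code": "200"}, 1027.0),
    ({"__name__": "http_requests_total", "method": "POST", "code": "200"}, 53.0),
    ({"__name__": "http_requests_total", "method": "GET",  "code": "500"}, 4.0),
]

def select(samples, **matchers):
    """Return samples whose labels satisfy every matcher (equality only)."""
    return [(labels, value) for labels, value in samples
            if all(labels.get(k) == want for k, want in matchers.items())]

# All GET traffic, regardless of status code:
gets = select(samples, __name__="http_requests_total", method="GET")
```

Because any combination of labels can be matched, the same stored series answer queries along any dimension (per method, per status code, per instance) without pre-aggregation.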

Prometheus stores data locally on disk for fast writes and queries.[6] Metrics can also be forwarded to remote storage backends, including Grafana Mimir and other Prometheus-compatible systems.[16]

Data collection

Prometheus collects data through a pull model: the server periodically queries a configured list of targets (exporters) and aggregates the returned time-series values.[6] Prometheus includes several service discovery mechanisms to automatically locate targets in dynamic environments.[17]
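
The pull model can be sketched end to end with the standard library (the metric name and label are invented for illustration; a real exporter would serve many metrics in the text exposition format):

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A toy "exporter": exposes one metric line in the Prometheus text format.
class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b'app_requests_total{method="GET"} 42\n'
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# One scrape cycle: the server pulls the target and parses each line.
url = f"http://127.0.0.1:{server.server_port}/metrics"
text = urllib.request.urlopen(url).read().decode()
server.shutdown()

for line in text.splitlines():
    series, value = line.rsplit(" ", 1)
    print(series, "->", float(value))
```

In a real deployment the set of targets comes from static configuration or service discovery rather than a hard-coded URL.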

PromQL

Prometheus provides its own query language, PromQL (Prometheus Query Language), which allows users to select and aggregate time-series data. The language includes time-oriented constructs such as the rate() function, instant vectors, and range vectors that return multiple samples per series over a specified time window.[18]
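
What rate() computes over a range vector can be sketched as follows (a simplification: the real function also handles counter resets and extrapolates to the window boundaries):

```python
# Sketch of rate(): per-second average increase of a counter over a window,
# given the (timestamp, value) samples of one series in that window.
def simple_rate(samples):
    """samples: [(unix_ts, counter_value), ...], oldest first."""
    (t0, v0), (tn, vn) = samples[0], samples[-1]
    return (vn - v0) / (tn - t0)

# Counter sampled every 15s; it rose by 120 over 60s -> 2.0 per second.
window = [(0, 100.0), (15, 130.0), (30, 160.0), (45, 190.0), (60, 220.0)]
print(simple_rate(window))  # 2.0
```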

Prometheus defines four metric types that PromQL operates on:[19] Counter (a monotonically increasing value), Gauge (an arbitrary value that can go up or down), Histogram (samples observations and counts them in configurable buckets), and Summary (similar to Histogram but calculates quantiles on the client side).
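
The behavior of these types can be sketched with minimal classes (illustrative only, not the real client-library API; the histogram below keeps per-bucket counts, whereas Prometheus exposes cumulative "le" buckets):

```python
import bisect

class Counter:
    """Monotonically increasing value; only inc() is allowed."""
    def __init__(self):
        self.value = 0.0
    def inc(self, amount=1.0):
        assert amount >= 0, "counters only go up"
        self.value += amount

class Gauge:
    """Arbitrary value that can be set up or down."""
    def __init__(self):
        self.value = 0.0
    def set(self, v):
        self.value = v

class Histogram:
    """Counts observations into configurable buckets and sums them."""
    def __init__(self, buckets):
        self.buckets = sorted(buckets)           # bucket upper bounds
        self.counts = [0] * (len(buckets) + 1)   # last slot = +Inf bucket
        self.total = 0.0
    def observe(self, v):
        self.counts[bisect.bisect_left(self.buckets, v)] += 1
        self.total += v

h = Histogram(buckets=[0.1, 0.5, 1.0])
for latency in (0.05, 0.2, 0.7, 3.0):
    h.observe(latency)
```

A Summary differs in that the client itself computes quantiles over observations instead of shipping raw bucket counts.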

Example

# A metric with label filtering
go_gc_duration_seconds{instance="localhost:9090", job="alertmanager"}

# Aggregation operators
sum by (app, proc) (
  instance_memory_limit_bytes - instance_memory_usage_bytes
) / 1024 / 1024

[20]

Alerting

Alert rules in Prometheus specify a condition and a duration; if the condition holds for that duration, Prometheus fires an alert to Alertmanager. Alertmanager handles silencing, inhibition, and routing to notification destinations including email, Slack, and PagerDuty.[21] Additional targets such as Microsoft Teams[22] can be reached through the Alertmanager webhook receiver interface.[23]
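
The "condition plus duration" semantics can be sketched as a small state machine (an illustration of the concept, not Prometheus's rule-evaluation code): an alert is pending while the condition is true but young, and fires once the condition has held continuously for the configured duration.

```python
# alert_state is None, ("pending", since) or ("firing", since).
def evaluate(alert_state, condition_true, now, for_seconds=300):
    if not condition_true:
        return None                    # condition cleared: alert resolves
    if alert_state is None:
        return ("pending", now)        # condition just became true
    phase, since = alert_state
    if now - since >= for_seconds:
        return ("firing", since)       # held for the full duration: fire
    return alert_state                 # still pending

state = None
state = evaluate(state, True, now=0)     # pending since t=0
state = evaluate(state, True, now=360)   # 360s >= 300s: firing
```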

Time series database

Prometheus includes its own time series database. Recent data (by default, one to three hours) is held in a combination of memory[24] and mmap-backed files.[25] Older data is written to persistent blocks indexed with an inverted index, which suits Prometheus's label-based query patterns.[26][27] A background compaction process merges smaller blocks into larger ones to reduce read overhead.[28] Durability against crashes is provided by a write-ahead log (WAL).[29]
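
The effect of compaction can be sketched as merging adjacent blocks (a simplification: real compaction also rewrites indexes, deduplicates, and applies retention):

```python
# Sketch of compaction: adjacent small blocks, each covering a time range,
# are merged into larger blocks so queries touch fewer of them.
def compact(blocks, max_range):
    """blocks: time-sorted [(min_t, max_t, samples)]; merge neighbors while
    the combined time range stays within max_range."""
    merged = []
    for min_t, max_t, samples in blocks:
        if merged and max_t - merged[-1][0] <= max_range:
            last_min, _, last_samples = merged.pop()
            merged.append((last_min, max_t, last_samples + samples))
        else:
            merged.append((min_t, max_t, samples))
    return merged

blocks = [(0, 2, [1]), (2, 4, [2]), (4, 6, [3]), (6, 8, [4])]
print(compact(blocks, max_range=4))
# -> [(0, 4, [1, 2]), (4, 8, [3, 4])]
```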

Dashboards

Prometheus includes a basic expression browser but is not a full dashboard system. Grafana is the standard pairing, querying Prometheus via PromQL to produce dashboards; the need to deploy and maintain Grafana separately is sometimes cited as an operational drawback.[30]

Interoperability

Prometheus favors white-box monitoring, in which applications expose their internal metrics for collection. Exporters and agents are available for many applications and systems.[31] To ease migration from existing monitoring stacks, Prometheus also supports ingesting data via several other protocols: Graphite, StatsD, SNMP, JMX, and CollectD.[32]

Metrics are typically retained for a few weeks. For longer retention, Prometheus can stream data to remote storage backends.[16]

OpenMetrics

An effort to standardize the Prometheus exposition format as OpenMetrics has gained adoption from several vendors, including InfluxData's TICK suite,[33] InfluxDB, Google Cloud Platform,[34] Datadog,[35] and New Relic.[36][37] The OpenMetrics specification is maintained separately from the Prometheus project.[38]

Library support

Prometheus client libraries are available for most major programming languages. The POCO C++ Libraries expose Prometheus metrics through the Poco::Prometheus namespace.[39]

References

  1. ^ "Latest release". GitHub. Prometheus. Retrieved March 10, 2026.
  2. ^ "Overview". prometheus.io. Retrieved March 10, 2026.
  3. ^ James Turnbull (June 12, 2018). Monitoring with Prometheus. Turnbull Press. ISBN 978-0-9888202-8-9.
  4. ^ "Prometheus". GitHub. Retrieved December 26, 2018.
  5. ^ a b Brian Brazil (July 9, 2018). Prometheus: Up & Running: Infrastructure and Application Performance Monitoring. O'Reilly Media. p. 3. ISBN 978-1-4920-3409-4.
  6. ^ a b c d e f Volz, Julius; Rabenstein, Björn (January 26, 2015). "Prometheus: Monitoring at SoundCloud". SoundCloud. Retrieved March 10, 2026.
  7. ^ a b "Monitor Docker Containers with Prometheus". 5π Consulting. January 26, 2015. Archived from the original on January 3, 2019. Retrieved December 26, 2018.
  8. ^ Murphy, Niall; Beyer, Betsy; Jones, Chris; Petoff, Jennifer (2016). Site Reliability Engineering: How Google Runs Production Systems. O'Reilly Media. ISBN 978-1491929124. Even though Borgmon remains internal to Google, the idea of treating time-series data as a data source for generating alerts is now accessible to everyone through those open source tools like Prometheus ...
  9. ^ Volz, Julius (September 4, 2017). "PromCon 2017: Conference Recap". Retrieved March 10, 2026 – via YouTube. I joined SoundCloud back in 2012 coming from Google...we didn't yet have any monitoring tools that works with this kind of dynamic environment. We were kind of missing the way Google did its monitoring for its own internal cluster scheduler and we were very inspired by that and finally decided to build our own open-source solution.
  10. ^ "Cloud Native Computing Foundation Accepts Prometheus as Second Hosted Project". Cloud Native Computing Foundation. May 9, 2016. Retrieved December 26, 2018.
  11. ^ Evans, Kristen (August 9, 2018). "Cloud Native Computing Foundation Announces Prometheus Graduation". Retrieved December 26, 2018.
  12. ^ "Prometheus 1.0 Is Here". Cloud Native Computing Foundation. July 18, 2016. Retrieved December 26, 2018.
  13. ^ "New Features in Prometheus 2.0.0". Robust Perception. November 8, 2017. Retrieved December 26, 2018.
  14. ^ "Alertmanager". GitHub. May 17, 2022. Retrieved March 10, 2026.
  15. ^ "Data model". Prometheus. Retrieved December 26, 2018.
  16. ^ a b "Integrations - Prometheus". prometheus.io. Retrieved March 10, 2026.
  17. ^ "Prometheus: Collects metrics, provides alerting and graphs web UI". March 18, 2017. Retrieved December 26, 2018.
  18. ^ "Querying Prometheus". Retrieved November 4, 2019.
  19. ^ "Metric types". prometheus.io. Retrieved June 29, 2024.
  20. ^ pygments/tests/examplefiles/promql/example.promql at master · pygments/pygments on GitHub
  21. ^ Dubey, Abhishek (March 25, 2018). "AlertManager Integration with Prometheus". Retrieved December 26, 2018.
  22. ^ Danuka, Praneeth (March 8, 2020). "Alerting for Cloud-native Applications with Prometheus". Retrieved October 18, 2020.
  23. ^ "Integrations | Prometheus". Retrieved March 10, 2026.
  24. ^ "Prometheus TSDB (Part 1): The Head Block". ganeshvernekar.com. September 19, 2020. Retrieved January 17, 2025.
  25. ^ "Prometheus TSDB (Part 3): Memory Mapping of Head Chunks from Disk". ganeshvernekar.com. October 2, 2020. Retrieved January 17, 2025.
  26. ^ "Prometheus TSDB (Part 4): Persistent Block and its Index". ganeshvernekar.com. October 18, 2020. Retrieved January 17, 2025.
  27. ^ "Prometheus TSDB (Part 5): Queries". ganeshvernekar.com. January 4, 2021. Retrieved January 17, 2025.
  28. ^ "Prometheus TSDB (Part 6): Compaction and Retention". ganeshvernekar.com. July 27, 2021. Retrieved January 17, 2025.
  29. ^ "Prometheus TSDB (Part 2): WAL and Checkpoint". ganeshvernekar.com. September 26, 2020. Retrieved January 17, 2025.
  30. ^ Ryckbosch, Frederick (July 28, 2017). "Prometheus monitoring: Pros and cons". Retrieved December 26, 2018.
  31. ^ "Exporters". prometheus.io. Retrieved March 10, 2026.
  32. ^ "Instrumentation - Prometheus". prometheus.io. Retrieved March 10, 2026.
  33. ^ "Telegraf from InfluxData". GitHub. December 25, 2018. Retrieved March 10, 2026.
  34. ^ "Announcing Stackdriver Kubernetes Monitoring". Retrieved March 10, 2026.
  35. ^ "DataDog Prometheus". Retrieved March 10, 2026.
  36. ^ "Send Prometheus metric data to New Relic". docs.newrelic.com. Retrieved April 16, 2025.
  37. ^ "Configure Prometheus OpenMetrics integrations". docs.newrelic.com. Retrieved April 16, 2025.
  38. ^ "OpenMetrics". GitHub. November 13, 2018. Retrieved March 10, 2026.
  39. ^ "Namespace Poco::Prometheus". docs.pocoproject.org. POCO Project. Retrieved October 18, 2025.
