Each of these fills a real blind spot: throttling by principal lets you finger the specific client or service account causing backpressure rather than just seeing aggregate rate limits; rebalance duration gives a direct latency signal for consumer group instability that previously required log scraping. The connection spike and compacted partition metrics round out the picture for teams running retention-heavy or multi-tenant Kafka clusters where those two failure modes tend to sneak up quietly.