Hakboian describes a pattern in which specialised agents (one for logs, one for metrics, one for runbooks, and so on) are coordinated by a supervisor layer that decides who works on what and in what order. The aim, the author explains, is to reduce the cognitive load on the engineer by proposing hypotheses, drafting queries, and curating relevant context, rather than replacing the human entirely.
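As a rough illustration of that supervisor pattern (not the article's implementation), the sketch below dispatches an incident to hypothetical LogsAgent, MetricsAgent, and RunbookAgent classes in a fixed order and collects their findings for the engineer; all class and method names are invented for this example.

```python
from dataclasses import dataclass, field

# Hypothetical specialised agents; each returns findings for the supervisor to merge.
class LogsAgent:
    def investigate(self, incident: str) -> str:
        return f"log anomalies related to '{incident}'"

class MetricsAgent:
    def investigate(self, incident: str) -> str:
        return f"metric spikes correlated with '{incident}'"

class RunbookAgent:
    def investigate(self, incident: str) -> str:
        return f"runbook steps that match '{incident}'"

@dataclass
class Supervisor:
    """Decides which agent works on what and in what order, then curates the output."""
    agents: list = field(default_factory=lambda: [LogsAgent(), MetricsAgent(), RunbookAgent()])

    def triage(self, incident: str) -> list[str]:
        findings = []
        for agent in self.agents:   # fixed order here; a real supervisor could reorder or skip agents
            findings.append(agent.investigate(incident))
        return findings             # hypotheses and context for the engineer, not a final verdict

if __name__ == "__main__":
    for line in Supervisor().triage("checkout latency regression"):
        print("-", line)
```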
What do we do at Fly? We are a developer-focused cloud platform. That means we make it easy for developers to get their apps deployed, up and running. Something I think really differentiates us is that we make it easy to deploy your apps in different regions around the world. We are available in 40 different regions. It's basically like a CDN, but for your apps.
Automation is transforming IT service management (ITSM), moving service desks from reactive, manual workflows toward systems that can intelligently route, prioritize, and resolve issues with minimal human intervention. Recent research from Freshworks found that IT professionals lose nearly seven hours every week (almost a full workday) to fragmented tools and overly complicated work processes. Implementing ITSM automation reduces manual effort, accelerates resolution, improves consistency and accuracy, enables proactive issue prevention, and delivers faster, more reliable service that measurably improves employee and end-user satisfaction.
Charlie Marsh announced the Beta release of ty on Dec 16, "designed as an alternative to tools like mypy, Pyright, and Pylance." The tool is extremely fast even on a first run, and successive runs are incremental, rerunning only the computations needed as a user edits a file or function, which allows live updates.
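For illustration only (this is plain Python typing, not tied to ty's CLI or internals, and the function name is made up), this is the kind of mistake checkers in this family report without running the program:

```python
def average(values: list[float]) -> float:
    return sum(values) / len(values)

print(average([1.0, 2.0, 3.0]))  # fine: the argument matches list[float]

# A type checker flags the call below statically, because the string makes the
# argument list[float | str] rather than the declared list[float]:
# print(average([1.0, 2.0, "three"]))
```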
Amazon Web Services has launched Amazon EKS Capabilities, a set of fully managed, Kubernetes-native features designed to streamline workload orchestration, AWS cloud resource management, and Kubernetes resource composition and automation. The capabilities, now generally available across most AWS commercial regions, bundle popular open-source tools into a managed platform layer, reducing the operational burden on engineering teams and enabling faster application deployment and scaling on Amazon Elastic Kubernetes Service (EKS).
For more than a decade, many considered cloud outages a theoretical risk, something to address on a whiteboard and then quietly deprioritize during cost cuts. In 2025, this risk became real. A major Google Cloud outage in June caused hours-long disruptions to popular consumer and enterprise services, with ripple effects into providers that depend on Google's infrastructure. Microsoft 365 and Outlook also faced code failures and notable outages, as did collaboration platforms like Slack and Zoom. Even security platforms and enterprise backbones suffered extended downtime.
When a system is overwhelmed with more requests than it can effectively process, a cascade of problems can ensue, significantly undermining its performance and reliability. One of the most immediate and noticeable consequences is degraded performance: users face frustratingly slow response times or, in more severe cases, complete timeouts. This not only hampers the user experience but can also erode trust in the system's reliability.
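One common mitigation is to shed excess load before the backlog grows unbounded. A minimal sketch, assuming a capacity of 100 in-flight requests (the limit and the Overloaded exception are illustrative choices, not from the article):

```python
import threading
from typing import Callable

# Assumed capacity: beyond ~100 in-flight requests, latency starts to climb toward timeouts.
MAX_IN_FLIGHT = 100
_slots = threading.BoundedSemaphore(MAX_IN_FLIGHT)

class Overloaded(Exception):
    """Raised instead of queueing when the service is already at capacity."""

def handle_request(work: Callable[[], object]) -> object:
    # Reject immediately rather than letting every caller wait and eventually time out.
    if not _slots.acquire(blocking=False):
        raise Overloaded("shedding load: too many in-flight requests")
    try:
        return work()
    finally:
        _slots.release()
```

Rejecting fast keeps response times predictable for the requests the system does accept, which is usually preferable to letting all callers degrade together.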
Identity and authentication services company Authress shared its strategy for staying operational during major cloud infrastructure outages like the massive October 2025 AWS outage that disrupted many major services. The company's resilience architecture relies on multi-region deployment and minimal dependence on AWS control plane services, Authress CTO Warren Parad explains. Parad says the October 20 AWS incident was the worst seen in a decade. Even so, Authress maintained its SLA reliability commitments thanks to a reliability-first design centered on a failover routing strategy.
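The article does not publish Authress's code; as a rough sketch of what failover routing between regions can look like, the example below probes a primary regional endpoint and falls back to a secondary one. The URLs, the ordering, and the timeout are all hypothetical; real deployments typically drive this from health checks or DNS rather than per-request probing.

```python
import urllib.request

# Hypothetical regional endpoints, listed in preferred failover order.
REGIONS = [
    "https://api.us-east-1.example.com/health",
    "https://api.eu-west-1.example.com/health",
]

def pick_healthy_region(timeout: float = 1.0) -> str:
    """Return the first region whose health endpoint answers; fail over in order."""
    for url in REGIONS:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status == 200:
                    return url
        except OSError:
            continue  # unreachable or slow: try the next region
    raise RuntimeError("no healthy region available")
```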
Real-Time Visibility: Instantly monitor test progress and quality trends to catch regressions before they impact your release.
Comprehensive Analytics: Dive into historical data with built-in dashboards that break down results by outcome, priority, configuration, and failure type.
Effortless Management: Use powerful filters, such as timeline, run type, pipeline, and more, to find exactly what you need. Customize your view with persistent search and column visibility settings.
AWS has recently introduced regional availability for the managed NAT Gateway service. The new capability allows developers to create a single NAT Gateway that automatically spans multiple availability zones (AZs) in a VPC, providing high availability and eliminating the need to define separate gateways and public subnets in each zone. A NAT Gateway lets instances in a private subnet access the internet or other services outside the VPC using the NAT Gateway's IP address.
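For context, this is the per-AZ pattern the regional option removes: one gateway per zone, each created against its own public subnet and Elastic IP. A minimal boto3 sketch with placeholder subnet and allocation IDs (the new regional mode's parameters are not shown here):

```python
import boto3

ec2 = boto3.client("ec2")

# Placeholder IDs: one public subnet and one Elastic IP allocation per availability zone.
ZONAL_GATEWAYS = {
    "us-east-1a": {"SubnetId": "subnet-aaa", "AllocationId": "eipalloc-aaa"},
    "us-east-1b": {"SubnetId": "subnet-bbb", "AllocationId": "eipalloc-bbb"},
}

# Before the regional NAT Gateway, high availability meant repeating this for every AZ.
for az, ids in ZONAL_GATEWAYS.items():
    resp = ec2.create_nat_gateway(
        SubnetId=ids["SubnetId"],
        AllocationId=ids["AllocationId"],
        ConnectivityType="public",
    )
    print(az, resp["NatGateway"]["NatGatewayId"])
```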
Typically, what happens is that we plan for maybe 2x or 3x load, but when you put things on the internet, you don't have any control: who is coming in, when they're going to come, how it's going to be used, because that's how the internet is. Any event can potentially trigger it. It could be good for your business. It could be bad actors coming and trying to steal stuff.
The goal was simple: allow teams to take a work item from Azure Boards and send it directly to GitHub Copilot so the coding agent could begin working on it, track progress, and generate a pull request. We are happy to announce that this integration is now being rolled out as generally available 🎉. Customers who participated in the preview helped us validate the experience, find issues, and shape improvements.
Three pipelines spun up, three sets of plugins re-resolved half the internet, and one test failed because Repo C still referenced Repo B's previous artifact. I fixed it, pushed again, and watched the other two pipelines restart for moral support. By 9:30am I had three tabs of "Create Merge Request" open, three pom.xmls fighting me, and one cold coffee. We were living in a tiny-repo cul-de-sac - each house had its own rules, its own toolchain, and its own definition of "latest Jackson".
According to benchmarks published by hl's creator, the viewer achieves throughput of up to ~2 GiB/s with automatic indexing on initial scan and up to ~10 GiB/s when reindexing growing files. This performance appears to be a significant improvement over alternatives such as hlogf, humanlog, fblog, and fblog-d, making hl a compelling tool for DevOps engineers who work with very large log files from the command line.
Cloud computing has now entered its mature adolescence, i.e. it's still surprisingly developmental, changeable and occasionally irrational in some areas, but overall it's certainly old enough to know better and should really start behaving properly. With the debate between public and private cloud now long over and the hybrid norm now (mostly) a de facto standard for typical deployments, multi-cloud itself is still an oft-misunderstood state of being, with FinOps constantly berating us for waste and inefficiency.
Docker recently announced the release of Docker Desktop 4.50, marking another update for developers seeking faster, more secure workflows and expanded AI integration capabilities. The release introduces a free version of Docker Debug for all users, deeper IDE integration (including VS Code and Cursor), improved multi-service-to-Kubernetes conversion support, new enterprise-grade governance controls, and early support for Model Context Protocol (MCP) tooling.
Australian collaborationware company Atlassian has revealed it's spent four years trying to reduce dangerous internal dependencies, and while it has rebuilt its PaaS, it still has issues - but thinks they're now manageable. As explained in a Tuesday post by Senior Engineering Manager Andrew Ross, "Atlassian runs a large service-based platform with thousands of different services, most deployed by our custom orchestration system, 'Micros'."