DevOps

#kubernetes-135
#clickhouse
#observability
fromInfoQ
2 days ago
DevOps

Uber Gets Ready for AI in Network Observability with Cloud Native Overhaul

fromInfoQ
5 days ago
DevOps

Railway Highlights the Importance of Logs, Metrics, Traces, and Alerts for Diagnosing System Failure

fromInfoQ
2 days ago

OpenEverest: Open Source Platform for Database Automation

Percona recently announced OpenEverest, an open-source platform for automated database provisioning and management that supports multiple database technologies. Launched initially as Percona Everest, OpenEverest can be hosted on any Kubernetes infrastructure, in the cloud, or on-premises. The main goal of the project is to avoid vendor lock-in while still providing an automated private DBaaS. Built on top of Kubernetes operators, it aims to avoid complex deployments that depend on a single cloud provider's technology.
DevOps
fromNew Relic
4 days ago

Preventing network outages: How we use New Relic to monitor our multi-cloud infrastructure

Running a global observability platform means one thing above all: your infrastructure must never go down. When you're responsible for monitoring thousands of customers' applications 24/7, network failures aren't just inconvenient, they're existential threats. At New Relic, hundreds of clusters run across multiple clouds and regions. These clusters depend on a complex web of network connections: regional transit gateways, inter-regional hubs, and cross-cloud links.
DevOps
DevOps
fromZDNET
4 days ago

7 open-source apps I'd happily pay for - because they're that good

Many high-quality open-source applications exist across Linux, macOS, and Windows; some are indispensable enough that users would willingly pay for them.
fromZDNET
5 days ago

Need to manage virtual machines on Linux? I found an easier way

I recently wrote about my migration away from VirtualBox to KVM/Virt-Machine for my virtual machine needs. I've found those tools to be far superior to VirtualBox (albeit with a bit more of a learning curve). Since then, however, I've found another method of working with KVM (the Linux kernel virtual machine technology), one that allows me to create and manage virtual machines not only on my local computer, but also from any machine on my LAN. That tool is Cockpit, which makes managing your Linux machines considerably easier.
DevOps
fromZDNET
6 days ago

The only Linux command you need for monitoring network traffic - and how to use it

Linux has a tool for everything. Sometimes those tools come in the form of an easy-to-use GUI, and other times a command is necessary. For monitoring network traffic, your best bet is the command line. Once you dive down the rabbit hole of possible commands for this task, you could become overwhelmed with choices -- and with the complexity of some of those commands.
DevOps
DevOps
fromInfoWorld
1 week ago

12 principles for improving devsecops

Apply SaaS-derived devsecops principles—shift-left practices, expanded test automation, and SLO-driven observability—to deliver reliable, performant, and secure enterprise applications.
fromInfoWorld
1 week ago

Stop treating force multiplication as a side gig. Make it intentional

Lead without authority. You may not have direct reports, yet you shape architecture, quality and the roadmap. Your leverage comes from artifacts, reviews and clear standards, not from title. I started by publishing a lightweight architecture template and a rollout checklist that the team could copy. That reduced ambiguity during design and cut review cycles by nearly 30 percent.
DevOps
#kubernetes
fromInfoQ
2 weeks ago
DevOps

Pinterest's Moka: How Kubernetes Is Rewriting the Rules of Big Data Processing

fromInfoQ
1 month ago
DevOps

Kubernetes 1.35 Released with In-Place Pod Resize and AI-Optimized Scheduling

fromInfoQ
1 month ago
DevOps

Google Cloud Demonstrates Massive Kubernetes Scale with 130,000-Node GKE Cluster

DevOps
fromInfoQ
1 week ago

OpenCost Looks Back on 2025 Milestones and Charts a Roadmap for 2026

OpenCost expanded cost visibility and automation in 2025 with 11 releases, an AI-ready MCP server, improved multi-cloud tracking, enhanced usability, and stronger community contributions.
fromMedium
1 month ago

Securing Microservice Communication with Istio and Envoy Sidecars

As organizations increasingly adopt cloud-native architectures, managing communication between microservices becomes a critical challenge. Modern applications are often distributed across multiple Kubernetes pods, and ensuring secure, reliable, and observable interactions between these services is essential. This is where Istio and Envoy sidecars come into play. Together they form a service mesh solution that abstracts networking complexities, enforces security policies, and provides deep observability, all without requiring changes to application code.
DevOps
DevOps
fromTechzine Global
1 week ago

Chainguard expands EmeritOSS with ten new projects

Chainguard's EmeritOSS assumes maintenance for ten mature open-source projects, providing dependency updates, builds, and releases to ensure continued reliability.
DevOps
fromInfoQ
1 week ago

Salesforce Migrates 1,000+ EKS Clusters to Karpenter to Improve Scaling Speed and Efficiency

Migrating 1,000+ EKS clusters to Karpenter reduced scaling latency, simplified operations, lowered costs, and enabled more flexible self-service infrastructure for developers.
DevOps
fromTechzine Global
1 week ago

Culture, not code, is the biggest challenge for Kubernetes

Cloud native technologies are widely adopted, but further growth depends on overcoming cultural resistance within organizations rather than technical limitations.
fromInfoQ
2 weeks ago

Human-Centred AI for SRE: Multi-Agent Incident Response without Losing Control

Hakboian describes a pattern in which specialised agents (one for logs, one for metrics, one for runbooks, and so on) are coordinated by a supervisor layer that decides who works on what and in what order. The aim, the author explains, is to reduce the cognitive load on the engineer by proposing hypotheses, drafting queries, and curating relevant context, rather than replacing the human entirely.
DevOps
DevOps
fromInfoQ
2 weeks ago

Pulumi Adds Native Support for Terraform and HCL

Pulumi now natively supports HashiCorp Terraform and OpenTofu, executing HCL and hosting Terraform state to enable mixed-tool infrastructure and migration.
DevOps
fromInfoWorld
2 weeks ago

How Ansible does the real work in hyperautomation

Hyperautomation combines RPA, IaC, AI/ML, NLP, intelligent workflows and process mining, with Ansible executing infrastructure and configuration changes across environments.
DevOps
fromAmazon Web Services
2 weeks ago

From AI agent prototype to product: Lessons from building AWS DevOps Agent | Amazon Web Services

AWS DevOps Agent employs a lead-and-sub-agent architecture to provide accurate, performant incident response and root-cause analysis for native AWS applications.
DevOps
fromInfoQ
2 weeks ago

Platform-as-a-Product: Declarative Infrastructure for Developer Velocity

A unified configuration layer centralizes application and infrastructure intent, simplifying developer workflows while enabling FinOps validation, consistent deployments, and platform-aligned visibility and compliance.
DevOps
fromTechzine Global
2 weeks ago

What Microsoft Azure Local can and cannot do

Azure Local delivers Azure cloud functionality on-premises, using Hyper-V/Stack HCI, validated server hardware, and Azure Portal management for gradual hybrid migration.
#linux
fromMedium
2 months ago
DevOps

What is swap memory in linux? What It Really Is, Why It Exists, and How to Actually Use It

DevOps
fromInfoWorld
2 weeks ago

From distributed monolith to composable architecture on AWS: A modern approach to scalable software

Migrating distributed monoliths to a composable AWS architecture yields loosely coupled, autonomous services that improve scalability, resilience, deployment velocity, and team autonomy.
DevOps
fromMedium
2 months ago

Unified Observability Through Open Standards and Distributed Tracing

Unified observability requires open standards and distributed tracing (e.g., OpenTelemetry) to correlate logs, metrics, and traces across distributed cloud-native systems.
#docker
fromMedium
2 months ago
DevOps

Mastering Docker Daemon Configuration on Linux: systemd, Sockets, TLS & daemon.json Explained

#docker-compose
DevOps
fromInfoQ
3 weeks ago

Cloudflare Scales Infrastructure as Code with Shift-Left Security Practices

Infrastructure-as-Code with mandatory peer review and automated policy enforcement prevents configuration incidents, increases velocity, and catches security violations before deployment across hundreds of production accounts.
DevOps
fromTheregister
3 weeks ago

Microsoft euthanizes ancient deployment toolkit

Microsoft has immediately retired Microsoft Deployment Toolkit (MDT), ending updates, patches, and support and urging migration to Autopilot or Configuration Manager OSD.
DevOps
fromMedium
3 weeks ago

Who's Spotting You When You Automate

Temporal awareness in ITSM approval automation builds trust by providing past, present, and future visibility so automation and humans can share judgement safely.
fromInfoQ
3 weeks ago

Fast Eventual Consistency: Inside Corrosion, the Distributed System Powering Fly.io

What do we do at Fly? We are a developer-focused cloud platform. That means we make it easy for developers to get their apps deployed, up and running. Something I think that really differentiates us is that we make it easy to deploy your apps in different regions over the world. We are available in 40 different regions. It's basically like a CDN, but for your apps.
DevOps
DevOps
fromStephane's Blog
3 weeks ago

Automating TLS Certificate Monitoring with GitHub Actions, certificate_watcher, and Slack

Combine certificate_watcher with a weekly GitHub Actions workflow and Slack notifications to monitor SSL/TLS certificate expirations serverlessly using a Git-hosted hosts list.
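The core check behind this kind of monitoring can be sketched with the Python standard library alone. This is a minimal illustration, not the certificate_watcher tool or its API: the function names `days_until_expiry` and `fetch_not_after` are hypothetical, and the Slack notification and GitHub Actions scheduling are left out.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Return whole days between `now` and a certificate's notAfter string.

    `not_after` uses the format returned by ssl's getpeercert(),
    for example 'Jan 31 00:00:00 2030 GMT'.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Open a TLS connection and return the peer certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `fetch_not_after` for each entry in a hosts list and raise an alert when `days_until_expiry` falls below a chosen threshold.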
#devops
fromComputerworld
3 weeks ago

5 areas of ITSM being transformed by automation in 2026

Automation is transforming IT service management (ITSM), moving service desks from reactive, manual workflows toward systems that can intelligently route, prioritize, and resolve issues with minimal human intervention. Recent research from Freshworks found that IT professionals lose nearly seven hours every week, almost a full workday, to fragmented tools and overly complicated work processes. Implementing ITSM automation reduces manual effort, accelerates resolution, improves consistency and accuracy, enables proactive issue prevention, and delivers faster, more reliable service that measurably improves employee and end-user satisfaction.
DevOps
DevOps
fromInfoQ
3 weeks ago

Docker Kanvas Challenges Helm and Kustomize for Kubernetes Dominance

Docker Kanvas enables developers to convert local Docker Compose setups into production-ready Kubernetes deployments with automated cloud provisioning and Infrastructure-as-Code generation.
DevOps
fromInfoQ
3 weeks ago

Slack Enhances Chef Infrastructure to Improve Safety and Reduce Blast Radius in Deployments

Slack reduced deployment risk by splitting the Chef production environment into availability‑zone tied buckets and using Chef Summoner for staggered, artifact‑triggered runs.
fromPythonbytes
3 weeks ago

Malicious Package? No Build For You!

Charlie Marsh announced the beta release of ty on Dec 16, "designed as an alternative to tools like mypy, Pyright, and Pylance." It is extremely fast even from the first run. Successive runs are incremental, rerunning only the necessary computations as a user edits a file or function, which allows live updates.
DevOps
#swap
fromMedium
2 months ago
DevOps

What is swap memory in linux? What It Really Is, Why It Exists, and How to Actually Use It

DevOps
fromInfoQ
1 month ago

From Confusion to Clarity: Advanced Observability Strategies for Media Workflows at Netflix

One hour-long Netflix episode encoding generates millions of trace spans, thousands of microservice calls, hundreds of media encodes, and over 100,000 CPU hours.
#opentelemetry
fromInfoQ
1 month ago

AWS Announces New Amazon EKS Capabilities to Simplify Workload Orchestration

Amazon Web Services has launched Amazon EKS Capabilities, a set of fully managed, Kubernetes-native features designed to streamline workload orchestration, AWS cloud resource management, and Kubernetes resource composition and automation. The capabilities, now generally available across most AWS commercial regions, bundle popular open-source tools into a managed platform layer, reducing the operational burden on engineering teams and enabling faster application deployment and scaling on Amazon Elastic Kubernetes Service (EKS).
DevOps
DevOps
fromFast Company
1 month ago

Software resilience testing is more critical than ever

Many companies lack resilience testing, leaving systems vulnerable to cascading outages that cause massive financial, operational, and reputational damage; resilience testing limits such risks.
fromInfoWorld
1 month ago

2026: The year we stop trusting any single cloud

For more than a decade, many considered cloud outages a theoretical risk, something to address on a whiteboard and then quietly deprioritize during cost cuts. In 2025, this risk became real. A major Google Cloud outage in June caused hours-long disruptions to popular consumer and enterprise services, with ripple effects into providers that depend on Google's infrastructure. Microsoft 365 and Outlook also faced code failures and notable outages, as did collaboration platforms like Slack and Zoom. Even security platforms and enterprise backbones suffered extended downtime.
DevOps
fromMedium
2 years ago

Navigating Through the Storm

When a system is overwhelmed with more requests than it can effectively process, a cascade of problems can ensue, significantly undermining its performance and reliability. One of the most immediate and noticeable consequences is the degradation of performance. In such scenarios, users may face frustratingly slow response times or complete timeouts in more severe cases. This not only hampers the user experience but can also erode trust in the system's reliability.
DevOps
DevOps
fromZDNET
1 month ago

How I ditched Google Photos for my own private self-hosted alternative - for free

Immich provides a free, self-hosted Google Photos–like service that requires Docker and can be installed on local Linux, macOS, or Windows machines.
DevOps
fromInfoQ
1 month ago

Docker Makes Hardened Images Free in Container Security Shift

Docker released over 1,000 hardened container images under Apache 2.0, providing secure, non-root, minimal base images with SBOMs and SLSA provenance for all developers.
fromInfoQ
1 month ago

How Authress Designed for Resilience and Survived a Major AWS Outage

Identity and authentication services company Authress shared its strategy to stay operational during major cloud infrastructure outages like the massive October 2025 AWS outage that disrupted many major services. The company's resilience architecture relies on strategies like multi-region deployment and minimizing reliance on AWS control plane services, Authress CTO Warren Parad explains. Parad says the AWS October 20 incident was the worst seen in a decade. Even so, Authress maintained its SLA reliability commitments thanks to a reliability-first design centered on a failover routing strategy.
DevOps
#docker-daemon
fromMedium
2 months ago
DevOps

Mastering Docker Daemon Configuration on Linux: systemd, Sockets, TLS & daemon.json Explained

fromAzure DevOps Blog
1 month ago

The New Test Run Hub is Going Generally Available! - Azure DevOps Blog

Real-Time Visibility: Instantly monitor test progress and quality trends to catch regressions before they impact your release. Comprehensive Analytics: Dive into historical data with built-in dashboards that break down results by outcome, priority, configuration, and failure type. Effortless Management: Use powerful filters such as timeline, run type, pipeline, and more, to find exactly what you need. Customize your view with persistent search and column visibility settings.
DevOps
DevOps
fromLondon Business News | Londonlovesbusiness.com
1 month ago

Beyond the migration: Why cloud strategy consulting is the architect of modern business growth - London Business News | Londonlovesbusiness.com

Nearly 30% of cloud spend is wasted due to poor architecture and planning; cloud strategy consulting is essential to align IT with business outcomes.
DevOps
fromInfoQ
1 month ago

AWS Launches ECS Express Mode to Simplify Containerised Application Deployment

Amazon ECS Express Mode enables single-step deployment of production-ready containerised web applications and APIs, automating infrastructure provisioning while keeping resources in the user's AWS account.
fromInfoQ
1 month ago

AWS Introduces Regional Availability for NAT Gateway

AWS has recently introduced regional availability for the managed NAT Gateway service. The new capability allows developers to create a single NAT Gateway that automatically spans multiple availability zones (AZs) in a VPC, providing high availability and eliminating the need to define separate gateways and public subnets in each zone. A NAT Gateway lets instances in a private subnet access the internet or other services outside a VPC using the NAT Gateway's IP address.
DevOps
DevOps
fromInfoQ
1 month ago

Pinterest Engineering Reduces Android CI Build Times by 36% with Runtime-Aware Sharding

Android CI build times dropped over 36% by using a runtime-aware test-sharding algorithm and an in-house emulator testing platform for balanced, duration-based shard distribution.
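Duration-based shard balancing can be illustrated with a simple greedy heuristic: sort tests by recorded runtime, longest first, and always hand the next test to the shard with the smallest total so far. The sketch below is an assumption about the general technique, not Pinterest's actual algorithm; the function names and the sample test durations are hypothetical.

```python
import heapq


def shard_by_runtime(durations, num_shards):
    """Assign tests to shards so that total runtimes stay roughly balanced.

    `durations` maps test name -> historical runtime in seconds.
    Uses the greedy longest-processing-time-first heuristic.
    """
    # Min-heap of (accumulated_seconds, shard_index): lightest shard on top.
    heap = [(0, i) for i in range(num_shards)]
    heapq.heapify(heap)
    shards = [[] for _ in range(num_shards)]
    # Longest tests first; ties broken by name for deterministic output.
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, index = heapq.heappop(heap)
        shards[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return shards


def shard_totals(durations, shards):
    """Return the summed runtime of each shard."""
    return [sum(durations[name] for name in shard) for shard in shards]


# Hypothetical historical runtimes, in seconds.
durations = {"login": 120, "feed": 90, "search": 60,
             "profile": 45, "settings": 30, "share": 15}
shards = shard_by_runtime(durations, 2)
print(shards)                           # [['login', 'profile', 'share'], ['feed', 'search', 'settings']]
print(shard_totals(durations, shards))  # [180, 180]
```

Splitting by test count alone would ignore runtime differences, whereas weighting by past duration keeps the slowest shard, and therefore the overall build, closer to the average.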
DevOps
fromInfoQ
1 month ago

Lessons Learned from Migrating a Legacy Test Suite to Gauge with Kotlin

Unified Kotlin + Gauge framework with Fabric8, Terraform, and Ansible replaced brittle bash/kubectl tests, reducing feedback loops and increasing shared ownership.
DevOps
fromInfoQ
1 month ago

QCon AI New York 2025: Moving Mountains: Migrating Legacy Code in Weeks Instead of Years

ServiceTitan migrated legacy reporting datasets to a DBT Labs MetricFlow metrics platform using validation and an Assembly Line Pattern to accelerate migration timelines to weeks.
#cicd
fromMedium
2 months ago
DevOps

CI/CD and GitOps with Microservices: Open Ecosystem vs AWS Native

CI/CD and GitOps automate testing, deployment, and management of microservices, making cloud deployments faster, safer, and more scalable.
fromInfoQ
1 month ago

Scaling Cloud and Distributed Applications: Lessons and Strategies From chase.com, #1 Banking Portal in the US

Typically, what happens is that we plan for maybe 2x, 3x load, but when you put things into the internet, you don't have any control. Who is coming in, when they're going to come, how is this going to be used, because that's how the internet is. Any event can potentially trigger it. It could be good for your business. It could be bad actors coming and trying to steal stuff.
DevOps
fromAzure DevOps Blog
1 month ago

Azure Boards integration with GitHub Copilot - Azure DevOps Blog

The goal was simple: allow teams to take a work item from Azure Boards and send it directly to GitHub Copilot so the coding agent could begin working on it, track progress, and generate a pull request. We are happy to announce that this integration is now being rolled out as generally available 🎉. Customers who participated in the preview helped us validate the experience, find issues, and shape improvements.
DevOps
DevOps
fromIT Pro
1 month ago

Autonomous IT: Driving Efficiency and Security with Tanium

Autonomous patch management automates vulnerable-system patching, saving time and money, improving security, and freeing IT teams for higher-value tasks.
DevOps
fromTechzine Global
1 month ago

JFrog brings order back to a software supply chain under AI pressure

AI-driven development drastically increases release velocity, overwhelming traditional CI/CD and versioning; semantic, metadata-rich release systems like Fly aim to restore manageable delivery at scale.
DevOps
fromNieman Lab
1 month ago

Automation arrives in newsrooms

End-to-end AI automation with human review multiplies software development speed three to four times while maintaining or improving quality, but increases reviewer cognitive load.
DevOps
fromComputerWeekly.com
1 month ago

Unlocking intelligent automotive: why openness wins | Computer Weekly

Operators must transform 5G networks into programmable, developer-focused platforms providing predictable latency, dynamic QoS, edge computing, and network intelligence to unlock automotive value.
DevOps
fromNew Relic
1 month ago

Unified Observability for Serverless + AWS Lambda | New Relic

Unified Serverless Monitoring sends all serverless telemetry into New Relic APM views, enabling per-invocation traces, cold start tracking, error grouping, and AI-driven cost optimization.
fromTheregister
1 month ago

Researchers spot 700 percent increase in hypervisor attacks

Huntress case data revealed a stunning surge in hypervisor ransomware: its role in malicious encryption rocketed from just three percent in the first half of the year to 25 percent so far in the second half.
DevOps
DevOps
fromLondon Business News | Londonlovesbusiness.com
1 month ago

Why digital services slow down during big events - And how modern tech is solving it

Sudden, concentrated spikes in user demand overload servers and databases, causing widespread slowdowns during major events; coordination and storage under extreme load are primary causes.
fromMedium
2 months ago

From Many Maven Repos to One SBT Monorepo: The Dev Story

Three pipelines spun up, three sets of plugins re-resolved half the internet, and one test failed because Repo C still referenced Repo B's previous artifact. I fixed it, pushed again, and watched the other two pipelines restart for moral support. By 9:30am I had three tabs of "Create Merge Request" open, three pom.xmls fighting me, and one cold coffee. We were living in a tiny-repo cul-de-sac - each house had its own rules, its own toolchain, and its own definition of "latest Jackson."
DevOps