Data science

Data science
from Treehouse Blog
1 day ago

Beginning SQL: 10 Essential Query Patterns

Recognizing common SQL query patterns enables beginners to retrieve, filter, summarize, and reason about data effectively across industries.
from moz.com
1 day ago

Vibe Coding Your Own SEO Tools Whiteboard Friday

You can always make it better. You can improve things. But it does give you a good taste of what can be done in vibe coding. Those are things that I made maybe in 15 minutes, half an hour. It is quite simple to get those first steps and say, "Oh, this works." Maybe you want to do some improvements, and you refine the code and what you're expecting.
Data science
Data science
from InfoQ
4 days ago

How Agoda Unified Multiple Data Pipelines Into a Single Source of Truth

A centralized Apache Spark-based financial pipeline (FINUDP) creates a single source of truth and a multi-layered quality framework to ensure accurate, consistent financial metrics.
from Gael Varoquaux
3 days ago

Stepping up as probabl's CSO to supercharge scikit-learn and its ecosystem

I'm thrilled to announce that I'm stepping up as Probabl's CSO (Chief Science Officer) to supercharge scikit-learn and its ecosystem, pursuing my dreams of tools that help go from data to impact. Scikit-learn is central to data scientists' work: it is the most used machine-learning package. It has grown over more than a decade, supported by volunteers' time, donations, and grant funding, with a central role of Inria.
Data science
#spark
from Medium
3 days ago
Data science

How I Fixed a Critical Spark Production Performance Issue (and Cut Runtime by 70%)

from New Relic
1 week ago

The Power and Cost of Data Cardinality

The more attributes you add to your metrics, the more complex and valuable questions you can answer. Every additional attribute provides a new dimension for analysis and troubleshooting. For instance, adding an infrastructure attribute, such as region, can help you determine if a performance issue is isolated to a specific geographic area or is widespread. Similarly, adding business context, like a store location attribute for an e-commerce platform, allows you to understand if an issue is specific to a particular set of stores
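A rough sketch of why each added attribute carries a cost as well as a benefit: every distinct combination of attribute values becomes its own series to store and query. The attribute names, sample values, and the `cardinality` helper below are hypothetical illustrations and are not taken from the New Relic article.

```python
# Hypothetical metric data points, each tagged with attributes.
points = [
    {"region": "us-east", "store": "A"},
    {"region": "us-east", "store": "B"},
    {"region": "eu-west", "store": "A"},
    {"region": "eu-west", "store": "A"},  # repeats an existing combination
]


def cardinality(points, attributes):
    """Count the distinct combinations of the given attribute values."""
    return len({tuple(p[a] for a in attributes) for p in points})


by_region = cardinality(points, ["region"])
by_region_and_store = cardinality(points, ["region", "store"])
```

Adding the `store` attribute lets you ask per-store questions, but the number of distinct combinations can only stay the same or grow with each attribute added.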
Data science
Data science
from Medium
1 week ago

The Complete Guide to Optimizing Apache Spark Jobs: From Basics to Production-Ready Performance

Optimize Spark jobs by using lazy evaluation awareness, early filter and column pruning, partition pruning, and appropriate join strategies to minimize shuffles and I/O.
Data science
from www.bbc.com
1 week ago

Excel: The software that's hard to quit

Excel's ubiquity enables quick analysis but spreadsheet-based workflows and macros create maintenance, security, centralization, and AI integration problems.
Data science
from Computerworld
1 week ago

Accenture to acquire UK AI startup Faculty

Faculty, renamed from ASI Data Science, built NHS Covid predictive systems and aligns with Accenture's AI-focused Reinvention Services.
#aws
from Business Insider
1 week ago

CEO of AI training startup says humans will still be involved in data creation for decades

"When I first started this job, the main push back I always got was that synthetic data will take over and you just will not need human feedback two to three years from now," said Fitzpatrick, who joined the startup last year. "From first principles, that actually doesn't make very much sense." Synthetic data refers to data that is artificially created.
Data science
Data science
from Medium
2 weeks ago

Migrating from Historical Batch Processing to Incremental CDC Using Apache Iceberg (Glue 4...

Use Apache Iceberg Copy-on-Write tables in AWS Glue 4 to migrate from full historical batch reprocessing to incremental CDC, reducing redundant computation, I/O, and costs.
Data science
from www.housingwire.com
2 weeks ago

The spreadsheet trap: Why investor reporting still operates like it's 2005

Investor reporting offices in loan servicing rely on legacy, spreadsheet-based processes due to historical adoption, cultural inertia, and perceived transparency despite significant operational risk.
#charts
#data-quality
#data-visualization
from FlowingData
2 weeks ago
Data science

Best Data Visualization Projects of 2025

Human-centered data visualizations translated complex topics into empathetic, timely, and beautifully designed interactive experiences across sizing, astronomy, wildfires, museums, and environmental change.
from Perspective-dev
2 months ago
Data science

Perspective

Perspective is an interactive analytics and data visualization component for large and streaming datasets, deployable in-browser or with Python and JupyterLab.
#ai-data-centers
Data science
from InfoQ
3 weeks ago

Beyond Win Rates: How Spotify Quantifies Learning in Product Experiments

Experiments should be judged by decision-ready learning—valid and actionable outcomes that tell teams to ship, abort, or iterate—rather than by win rates alone.
Data science
from Theregister
3 weeks ago

AI has pumped hyperscale - but how long can it last?

Driven by surging demand for AI workloads since late 2022, hyperscale datacenter operators nearly tripled infrastructure spending and increased quarterly operational capacity by roughly 170%.
Data science
from InfoQ
4 weeks ago

Decathlon Switches to Polars to Optimize Data Pipelines and Infrastructure Costs

Migrating small-to-medium data workloads from Apache Spark to Polars yields major performance and cost improvements by enabling single-node execution and faster in-memory processing.
Data science
from Medium
1 month ago

Data Quality on Spark, Part 4: Deequ

Deequ is a Spark-based open-source library for expressing, evaluating, and profiling data quality checks at scale, with analyzers, automatic suggestions, and Scala/Python support.
Data science
from Medium
1 month ago

Ten Open-Source Business Intelligence Tools for Improved ROI and Productivity

Open-source BI tools deliver flexible, transparent, cost-effective analytics that enable nontechnical users to build dashboards and achieve higher ROI.
Data science
from The ODI
in 4 days

Data Ethics Professional #10: Advertising & Ethics - Audience Selection & Proxy Data

Digital advertising requires ethical, inclusive audience selection practices to prevent harmful exclusion and prioritize human safety alongside brand safety.
Data science
from Yahoo Creators
1 month ago

Boring remote jobs that pay at least $100,000 a year and employers can't fill fast enough

Numerous data-heavy, low-drama remote careers pay six figures and offer steady, repetitive tasks with strong demand and clear career paths.
from InfoQ
1 month ago

Breaking Silos: Netflix Introduces Upper Metamodel to Bring Consistency Across Content Engineering

Upper is based on W3C standards such as RDF for conceptual graph representation and SHACL for validation, and it enables the principle of "model once, represent everywhere" across the data ecosystem. Upper organizes concepts through keyed entities, their attributes, and their relationships across domain boundaries. The modeling grammar and validation structure are designed to maintain consistency as definitions evolve. Keyed concepts can be extended monotonically, allowing new attributes or relationships without modifying existing definitions, so domains can expand over time without breaking existing models.
Data science
Data science
from ZDNET
1 month ago

This company's AI success was built on 5 essential steps - see how they work for you

AI initiatives succeed when grounded in strong data foundations, clear user-focused goals, measurable value, governance, and an iterative approach that builds confidence and delivers outcomes.
Data science
from Treehouse Blog
1 month ago

Beginning Data Analysis: From Questions to Insights

Learning data analysis enables beginners to turn raw information into meaningful insights, spot trends, and support evidence-based decision-making across many fields.
Data science
from Medium
2 months ago

From Zero to Scala Expertise: My Step-by-Step Homework Path

Learning Scala requires overcoming unfamiliar functional syntax and errors, but mastery enables high-performance, cleaner code and access to big data frameworks like Apache Spark.
Data science
from ComputerWeekly.com
1 month ago

Interview: Paul Neville, director of digital, data and technology, The Pensions Regulator | Computer Weekly

TPR is shifting from compliance-based to risk-based regulation by building strong IT foundations, improving data, automation, and cross-organisational information flows.
#chart-templates
from Barchart.com
1 month ago
Data science

Meta Platforms Has Lost $73 Billion on Reality Labs. Are Its Spending Cuts Enough for META Stock?

Data science
from InfoWorld
1 month ago

OpenAI to acquire AI training tracker Neptune

Neptune's hosted experiment-tracking SaaS will shut down March 4, 2026; users have months to export data while stability and security fixes continue.
from Realpython
1 month ago

Introduction to pandas - Real Python

The pandas DataFrame is a structure that contains two-dimensional data and its corresponding labels. DataFrames are widely used in data science, machine learning, scientific computing, and many other data-intensive fields. DataFrames are similar to SQL tables or the spreadsheets that you work with in Excel or Calc.
Data science
from Theregister
1 month ago

MongoDB talks up its AI chops by talking down PostgreSQL

Speaking to investment analysts, he said that while MongoDB had all the elements needed to be the right foundational platform for AI workloads, it was too early to say what might be the platform of choice. However, he said MongoDB had been winning work from AI-native companies, citing a customer that recently "switched from PostgreSQL to MongoDB because PostgreSQL could not just scale."
Data science
Data science
from Theregister
1 month ago

HPE pumps AI cloud lineup with extra Nvidia capabilities

HPE upgrades Private Cloud AI with Nvidia Blackwell GPUs, GPU fractionalization, STIG-hardened NIMs, Juniper networking integration, and Alletra storage for inline data preparation.
#data-engineering
from InfoQ
1 month ago
Data science

Reliable Data Flows and Scalable Platforms: Tackling Key Data Challenges

from Techzine Global
1 month ago

Snowflake acquires Select Star for broader data context

Snowflake has signed an agreement to acquire Select Star. This company's technology will expand Snowflake Horizon Catalog by integrating with databases, BI tools, and data pipelines. This will increase the context for AI agents such as Snowflake Intelligence. The full context of data assets is often scattered across upstream and downstream systems. This fragmentation makes it difficult to find the right data and understand the full context. In the AI era, this limited context poses a problem for both humans and agents.
Data science
Data science
from IT Pro
1 month ago

Chief data officers believe they'll be a 'pivotal' force in the C-suite within five years

CDOs will become equal or highly influential C-suite leaders as data, AI, budgets, and teams expand.
Data science
from Fortune
1 month ago

A World Bank expert thinks countries should leverage 'small AI' - and avoid competing with the biggest tech giants | Fortune

Smaller Southeast Asian countries can pursue targeted 'small AI' but require expanded data centers, reliable power infrastructure, and regulatory collaboration to scale.
Data science
from eLearning Industry
1 month ago

Data Like Hips Don't Lie

Treat data as a language: ask the right questions, apply critical thinking, and measure learning impact beyond superficial metrics like completions and time.
Data science
from Battery Power
1 month ago

2025 Atlanta Braves Player Review: Vidal Brujan

Vidal Bruján, a versatile but previously below-average offensive player, provided needed depth for the Braves and delivered a stronger-than-expected 2025 performance.
Data science
from Ars Technica
1 month ago

Data-driven sport: How Red Bull and AT&T move terabytes of F1 info

Race teams use hundreds of sensors and high-speed, low-latency, secure data links to optimize setups, strategy, and efficiency while reducing costs.
Data science
from www.ocregister.com
1 month ago

California shoppers intensely searching for bargains

California searches for budget-related terms rose 12% year-over-year, reflecting increased consumer thriftiness amid post-pandemic inflationary pressure.
Data science
from InfoWorld
1 month ago

Improving annotation quality with machine learning

Treat annotation as data understanding to systematically reduce error rates, development time, and cost while improving dataset quality for machine learning.
from Business Insider
1 month ago

The gruesome new data on tech jobs

Data and analytics jobs really stand out, though. This sector had a Jobs Posting Index of 60, the lowest of all sectors Indeed tracked as of the end of October. That means there are 40% fewer data and analytics job openings than before the pandemic. Even worse: There is still a rising number of applications per job in this sector, according to Indeed.
Data science
Data science
from Pycoders
1 month ago

PyCoder's Weekly | Issue #709

copy.deepcopy() fully clones nested object graphs which can be costly; upcoming JIT, REPL, and build improvements target faster execution and developer workflows.
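To make the `copy.deepcopy()` point concrete, here is a minimal sketch contrasting it with a shallow `copy.copy()` on a nested structure. The `config` dictionary is a hypothetical example and does not come from the newsletter.

```python
import copy

# Hypothetical nested structure, used only for illustration.
config = {"name": "job", "steps": [{"id": 1}, {"id": 2}]}

shallow = copy.copy(config)    # new outer dict; nested objects are shared
deep = copy.deepcopy(config)   # recursively clones every nested object

config["steps"][0]["id"] = 99  # mutate a nested object in the original

shallow_sees_change = shallow["steps"][0]["id"] == 99  # shares the inner dict
deep_sees_change = deep["steps"][0]["id"] == 99        # independent clone
```

Because `deepcopy()` walks the entire object graph, its cost grows with the number of nested objects, which is why it can be expensive on large structures.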
from MarTech
1 month ago

Data readiness is the missing foundation of AI-powered marketing | MarTech

Our industry is rushing headlong toward an AI-powered future. The promise is captivating: intelligent systems that can predict market shifts, personalize customer experiences and drive unprecedented growth. Yet in that race, many organizations are short-changing or even skipping a critical first step. They are building sophisticated engines but trying to run them on unrefined fuel. The result is a quiet crisis of confidence, where powerful technology underwhelms because the marketers don't trust the data it relies on.
Data science
Data science
from Miami Herald
2 months ago

The Best AI Jobs Are No Longer Concentrated in Silicon Valley - Report

AI job growth and high salaries are shifting nationwide, with top-paying roles and remote options emerging outside Silicon Valley.
from Psychology Today
2 months ago

The Data Within

It is clean and complete. It captures almost everything I have watched over the last decade, with the exception of a couple of hours of viewing on flights or in hotel rooms. Normally, the algorithm serves up a menu of options that includes something that will satisfy me. And that's the thing about algorithms: They are tuned to normality. They make predictions based on statistical likelihoods, past behavior, and expectations about the continuation of trends.
Data science
Data science
from Business Matters
2 months ago

Managing Big Data Storage: The Role of Object Storage

Object storage provides scalable, flat-namespace management and archiving of massive unstructured big data that overwhelms traditional hierarchical storage systems.
from Business Matters
2 months ago

Lessons in Leadership and Logic - A Conversation with Aadeesh Shastry

Aadeesh Shastry is a New York-based professional known for his analytical thinking, structured approach, and calm leadership style. He builds his work around the principles of focus, discipline, and long-term strategy - habits he began developing long before his career started. Raised in Fremont, California, Aadeesh combined academics with athletics. He was a hurdler on his school's track team, played competitive basketball, and studied chess theory in his free time.
Data science
from Berlin Startup Jobs
2 months ago

Job Vacancy: Senior Data Scientist // DATATRONiQ | IT / Software Development Jobs | Berlin Startup Jobs

🚀 DATATRONiQ is a deep-tech 💪 startup in Germany, driving the future of Industrial IoT and Edge AI. Our platform provides a one-stop solution for companies to monitor, analyze, and optimize their assets and processes. Through advanced machine learning and real-time analytics, we transform high-quality industrial data into meaningful insights that drive smarter decisions. We believe in an open and collaborative environment to foster the best ideas from the most creative people - if you are excited about applying AI and data science to real industrial challenges, we'd love to talk with you!
Data science
Data science
from InfoWorld
2 months ago

Was data mesh just a fad?

Data mesh transfers dataset ownership to source teams to reduce duplication and improve accuracy but demands ongoing schema maintenance and organizational commitment.
Data science
from Medium
3 months ago

Enhancing Data Efficiency with Snowflake Storage Lifecycle Management

Snowflake Storage Lifecycle Policy automatically archives aged table rows to lower-cost storage and eventually purges them, reducing storage costs and enforcing retention policies.
Data science
from London Business News | Londonlovesbusiness.com
2 months ago

How crypto prediction software works: Algorithms, APIs, and data sources - London Business News | Londonlovesbusiness.com

Crypto prediction software uses machine learning, deep learning models, and API integrations to analyze market data and deliver real-time probabilistic forecasts for trading decisions.
from InfoQ
2 months ago

Cloudflare Introduces Data Platform with Zero Egress Fees

Micah Wylde, principal engineer at Cloudflare, Alex Graham, senior systems engineer at Cloudflare, and Jérôme Schneider, staff software engineer at Cloudflare, explain: Analytical data is critical for modern companies. It allows you to understand your users' behavior, your company's performance, and alerts you to issues. But traditional data infrastructure is expensive and hard to operate, requiring fixed cloud infrastructure and in-house expertise. We built the Cloudflare Data Platform to be easy enough for anyone to use with affordable, usage-based pricing.
Data science
Data science
from Digiday
2 months ago

Walmart develops AI tools to help suppliers better understand customer data

Walmart will add AI tools to Scintilla to help suppliers interpret first-party data, summarize surveys, and improve marketing, advertising and operational decisions.
from eLearning Industry
2 months ago

Checklist: Your AI Training Rollout [eBook Launch]

You know your team needs to build or strengthen their AI skills, but how do you provide them with the necessary know-how? This AI training rollout checklist covers all the essentials you need, from finding internal AI champions to establishing quarterly review processes. AI Training Rollout Checklist: How To Get Started. While some employees might already use AI every day in their workflow, others might be relatively unfamiliar with this emerging tech.
Data science
Data science
from Social Media Today
2 months ago

X Reports Higher Usage in EU in Latest DSA Report

X's reported AMARS combine logged-in and logged-out users, showing regional usage patterns, a 29% logged-in increase, and a short DAU spike tied to topical trends.
Data science
from Aol
2 months ago

9 Remote Jobs That Pay $50 an Hour or More (Yes, They're Legit)

Nine remote-friendly professional roles pay around $50+ per hour and leverage experienced workers' expertise for flexible, high-paying careers.
Data science
from Food & Beverage Magazine
2 months ago

Real-Time Analytics: Boosting U.S. Hotels in 2026 - Food & Beverage Magazine

Hotels that combine smart revenue strategy with real-time analytics maximize pricing, ancillary sales, and rapid market-response to protect margin and grow RevPAR in 2026.
Data science
from Axios
2 months ago

The real trouble with the Fed's jobs data turmoil

Private firms’ data often lack long-term consistency, universal availability, and maximal reliability due to inherent limitations and conflicting incentives.
from Non Profit News | Nonprofit Quarterly
2 months ago

Digital Transformation, A Nonprofit Primer - Non Profit News | Nonprofit Quarterly

On a weekday morning in suburban Maryland, a behavioral health therapist logs into her dashboard before meeting her first client. The screen displays real-time caseloads, treatment plans, and risk alerts. One name flashes yellow - a client whose recent history suggests heightened hospitalization risk. Rather than waiting for crisis, the therapist addresses this proactively. This moment illustrates how thoughtfully designed digital systems don't replace human care; they sharpen it.
Data science
from London Business News | Londonlovesbusiness.com
2 months ago

How FIRST.com is redefining smarter betting for UK punters - London Business News | Londonlovesbusiness.com

The appeal of FIRST.com lies in how it blends editorial integrity with practical insight. Visitors find detailed sportsbook reviews that examine not only odds competitiveness but also mobile performance, withdrawal policies, customer service and regulatory licensing. Each review is written to help users understand both strengths and weaknesses, with the aim of providing clarity in an industry that often thrives on confusion.
Data science