#web-scraping

#data-collection
from Hackernoon
3 years ago
Artificial intelligence

Behind the Scenes of Using Web Scraping and AI in Investigative Journalism | HackerNoon

Web scraping is essential for journalists to extract public information and hold authorities accountable.
from Hackernoon
1 month ago
Privacy professionals

Web Scraping in 2025: Staying on Track with New Rules | HackerNoon

AI advancements present new challenges for web scraping, requiring innovative techniques to navigate increased security measures.
from Hackernoon
1 month ago
Artificial intelligence

AI and Proxies: Are They Connected? | HackerNoon

Proxies are crucial for overcoming data collection barriers in machine learning.
from Business Matters
2 weeks ago
Privacy professionals

Scraping Proxies: Why They're a Game-Changer for Modern Web Scraping

Scraping proxies are essential for effective web scraping to avoid rate limits and geo-restrictions.
#data-analysis
E-Commerce
from Entrepreneur
4 days ago

How Web Data Helps You Stay Ahead of the Competition | Entrepreneur

Ecommerce businesses need to leverage public web data for better decision-making across industries.
from Hackernoon
2 years ago
JavaScript

Let's Build a Free Web Scraping Tool That Combines Proxies and AI for Data Analysis | HackerNoon

The article focuses on building an AI-powered web scraper that can bypass advanced website security measures and automate data analysis.
from Hackernoon
4 months ago
Data science

In the Future, Your Data Is More Valuable Than Gold | HackerNoon

Data is the new currency driving business decisions and competitive advantage.
Web scraping is a vital method for data extraction, experiencing significant market growth.
#ai-technology
Artificial intelligence
from Ars Technica
1 month ago

Cloudflare turns AI against itself with endless maze of irrelevant facts

Cloudflare introduces 'AI Labyrinth' to combat unauthorized AI web scraping by serving fake content that wastes scraper resources.
Privacy technologies
from Ars Technica
1 month ago

AI bots strain Wikimedia as bandwidth surges 50%

AI crawlers are circumventing established rules, creating challenges for content platforms.
Wikimedia is focusing on a systemic initiative to address scraping issues and protect its infrastructure.
from Hackernoon
2 years ago
Artificial intelligence

What Does Your AI Agent Need to Conquer the Web? | HackerNoon

AI agents must evolve to outperform humans in speed and accuracy.
Real-time data extraction is crucial for AI agents to succeed online.
EU data protection
from Hackernoon
1 month ago

A Guide on How to Legally Web Scrape EU Data | HackerNoon

The Markup emphasizes the importance of web scraping for data journalism while navigating legal risks, especially in the EU.
#cybersecurity
from Hackernoon
1 month ago
Cryptocurrency

The TechBeat: Swift init(), Once and for All (4/5/2025) | HackerNoon

Cryptocurrency exchange security is crucial for user trust as adoption grows.
Web scraping faces new challenges and requires strategic adaptation for compliance.
Effective onboarding processes are key to new hire integration and productivity.
The $1.5 billion Bybit hack emphasizes the importance of security in cryptocurrency.
from Techzine Global
1 month ago
Marketing tech

Bots now generate majority web traffic

Automated bot traffic now constitutes over half of all web page visits, impacting various sectors significantly.
from Hackernoon
2 years ago
Web design

Avoid Getting Caught in a Honeypot Trap When Scraping the Web | HackerNoon

Honeypots are traps used by websites to detect and thwart web scraping, often leading to consequences like IP blocking.
Marketing tech
from Forbes
2 months ago

New Data Shows Just How Badly OpenAI And Perplexity Are Screwing Over Publishers

AI-powered search engines are sending significantly less referral traffic to news sites compared to traditional search engines.
#data-extraction
from Medium
2 months ago
Scala

Scala Web Scraping: Step-by-Step Tutorial 2025

Scala's unique strengths make it a viable alternative for web scraping, offering simplicity, interoperability with Java, and flexible data handling.
from DATAVERSITY
11 months ago
Data science

Advanced Tips for Effective Data Extraction - DATAVERSITY

Understanding advanced data extraction techniques is crucial for organizations to maximize efficiency and accuracy in data analytics.
from Hackernoon
2 years ago
Web design

Navigating Advanced Web Scraping: Insights and Expectations | HackerNoon

Web scraping automates the process of extracting data from websites, making it efficient and scalable.
from Hackernoon
2 years ago
JavaScript

Web Scraping Optimization: Tips for Faster, Smarter Scrapers | HackerNoon

Advanced web scraping requires a shift from basic practices to more sophisticated strategies for scalability and long-term effectiveness.
#seo
from Hackernoon
1 year ago
Miscellaneous

The HackerNoon Newsletter: Surviving the Google SERP Data Crisis (1/23/2025) | HackerNoon

The rise of crypto regulations and the surge in phishing attacks highlight key challenges in the tech landscape.
Smart hotels are redefining customer experiences through advanced technologies.
from GeekSided
9 months ago
Python

How to Create a Python Keyword Analyzer for SEO Optimization

Keyword analysis is crucial for website traffic. Python tools aid in building custom scripts. Libraries like beautifulsoup4, requests, and nltk are essential.
from Hackernoon
2 years ago
Miscellaneous

The HackerNoon Newsletter: Managing Stress May Be A Lot Simpler Than You Think (12/17/2024) | HackerNoon

Effective stress management is crucial in tech.
Bluesky API enhances content curation and management.
BadGPT-4o showcases a shift in AI experimentation.
Web scraping and AI facilitate efficient data extraction.
#python
from Pycoders
6 months ago
Python

PyCoder's Weekly | Issue #652

Structural pattern matching in Python allows developers to express complex data handling more clearly and concisely.
from Realpython
6 months ago
Python

Beautiful Soup: Build a Web Scraper With Python Quiz - Real Python

Interactive quiz aimed at testing web scraping skills using Python and relevant libraries.
from Pycoders
5 months ago
Python

PyCoder's Weekly | Issue #658

Django performance tuning is crucial for web project efficiency.
Python's pathlib facilitates easy file path management.
Poetry streamlines dependency management for Python projects.
ZenRows simplifies web scraping with comprehensive tools.
from Hackernoon
2 years ago
Data science

Mastering Scraped Data Management (AI Tips Inside) | HackerNoon

Data processing and export are crucial next steps after scraping data from websites.
#automation
from LogRocket Blog
5 months ago
JavaScript

Using curl-impersonate in Node.js to avoid blocks - LogRocket Blog

curl-impersonate helps automate web interactions by mimicking legitimate browser requests, bypassing common anti-bot protections.
from LogRocket Blog
6 months ago
JavaScript

Playwright Extra: extending Playwright with plugins - LogRocket Blog

Playwright Extra enhances Playwright's capabilities by adding extensibility with plugin support for automation and scraping tasks.
from Hackernoon
2 years ago
JavaScript

The Role of the TLS Fingerprint in Web Scraping | HackerNoon

TLS fingerprinting can silently identify automated requests, leading to blocking even with proper HTTP headers in place.
from Hackernoon
2 years ago
Miscellaneous

How To Implement IP Rotation With Proxies | HackerNoon

IP rotation enhances online privacy and prevents IP bans in web scraping tasks.
It allows dynamic IP address changes for secure web automation and data gathering.
from Hackernoon
2 years ago
JavaScript

How To Scrape Modern SPAs, PWAs, and AI-Driven Dynamic Sites | HackerNoon

Understand advanced web scraping techniques to adapt to modern web changes.
Recognize the differences between SPAs, PWAs, and AI-powered sites for effective scraping.
from Hackernoon
1 year ago
Miscellaneous

The HackerNoon Newsletter: Netflix and Amazon: A Tale of Two Ad Tiers (11/14/2024) | HackerNoon

The emergence of AGI poses critical questions for humanity's survival alongside superintelligence.
from TechRadar
5 months ago
Miscellaneous

Best mobile proxies for 2024

Mobile proxies are essential for effective online tasks requiring anonymity and geolocation access.
Oxylabs offers unparalleled mobile proxy services with extensive coverage and customizable features.
#cloudflare
from Hackernoon
2 years ago
JavaScript

Bypassing JavaScript Challenges for Effective Web Scraping | HackerNoon

JavaScript challenges block web scraping by requiring execution of scripts that verify human presence.
from TechCrunch
6 months ago
Miscellaneous

Perplexity is reportedly looking to fundraise at an $8B valuation | TechCrunch

Perplexity aims to raise $500 million to enhance its valuation, despite facing scrutiny from news publishers.
The company emphasizes its growth in query volume and revenue while seeking cooperative relationships with content publishers.
from Medium
6 months ago
Python

Concurrency vs Parallelism

Concurrency efficiently manages multiple tasks without blocking, improving resource use, especially during I/O waits.
Parallelism executes multiple tasks simultaneously, enhancing performance in computation-intensive processes.
#data-restrictions
Artificial intelligence
from Futurism
9 months ago

Crisis Looms as AI Companies Rapidly Losing Access to Training Data

The restrictions imposed by content hosts on publicly available data can severely impact the effectiveness of AI models.
AI companies relying on web-scraped data may face bias and a loss of diversity and freshness as content hosts tighten restrictions.
from Realpython
8 months ago
JavaScript

Web Scraping With Scrapy and MongoDB - Real Python

Web scraping with Scrapy involves the ETL process: extracting, transforming, and loading data into storage like MongoDB.
from Hackernoon
3 years ago
Data science

Harnessing Public Web Data for AI | HackerNoon

Effective data acquisition is crucial for AI performance, with web scraping being a key method.
Bright Data provides solutions for successful web data scraping such as proxy networks and pre-configured datasets.
from Zato
11 months ago
JavaScript

Web scraping as an API service

Web scraping is a last resort in backend integrations due to its brittleness and deviation from traditional API interactions.