*what you will do*
- *web scraping & data extraction*: design, develop, and optimize web scraping strategies for large-scale data extraction from dynamic websites; identify and assess relevant data sources, ensuring alignment with business objectives; implement automated web scraping solutions using python and libraries like scrapy, beautifulsoup, and selenium; build resilient and adaptable scrapers that can handle website structure changes, rate limits, and anti-scraping measures;
- *data processing & integration*: cleanse, validate, and transform extracted data to ensure accuracy, consistency, and usability; store and manage large volumes of scraped data using best-in-class storage solutions; develop etl pipelines to integrate scraped data into data warehouses and analytics platforms; collaborate with cross-functional teams, including data scientists and engineers, to make scraped data actionable;
- *scraping optimization & maintenance*: optimize scraping procedures to improve efficiency, reliability, and scalability across multiple data sources; implement solutions for bypassing captchas, rotating user agents, and managing proxy services; continuously monitor, troubleshoot, and maintain scraping scripts to minimize disruptions due to site changes;
- *compliance & documentation*: stay up to date with legal, ethical, and compliance considerations related to web scraping and data collection; ensure data collection processes align with best practices and regulatory requirements; maintain clear and detailed documentation of scraping methodologies, data pipelines, and best practices.
*must haves*
- *5+ years* of hands-on experience in web scraping, data extraction, and integration;
- strong proficiency in *python* and web scraping frameworks (*scrapy*, *beautifulsoup*, *selenium*);
- expertise in handling dynamic content, browser fingerprinting, and bypassing anti-bot mechanisms (e.g., captchas, rate limits, proxy rotation);
- deep understanding of *html*, *css*, *xpath*, and *javascript-rendered content*;
- experience working with *large-scale data storage* solutions and optimizing retrieval performance;
- strong grasp of *etl* *processes*, *data pipelines*, and *data warehousing*;
- familiarity with *apis* for data extraction and integration from public and restricted sources;
- strong problem-solving skills with an ability to debug and adapt to changing web structures;
- solid understanding of *web scraping ethics*, *legal implications*, and *compliance guidelines*;
- upper-intermediate english level.
*nice to haves*
- *bachelor’s degree* in computer science, data science, information technology, or a related field;
- experience with *cloud-based distributed scraping systems (aws, gcp, azure)*;
- knowledge of *big data frameworks* and experience handling high-volume datasets within *snowflake*;
- familiarity with *machine learning techniques* for data extraction and natural language processing (nlp);
- experience working with *json*, *xml*, *csv*, and other *structured data formats*;
- proficiency with *version control systems* (*git*).
*the benefits of joining us*
- *professional growth*: accelerate your professional journey with mentorship, tech talks, and personalized growth roadmaps.
- *competitive compensation*: we match your ever-growing skills, talent, and contributions with competitive usd-based compensation and budgets for education, fitness, and team activities.
- *a selection of exciting projects*: join projects that deliver modern solutions to top-tier clients, including fortune 500 enterprises and leading product brands.
- *flextime*: tailor your schedule for an optimal work-life balance, with the option to work from home or at the office, whichever makes you happiest and most productive.
*next steps after you apply*
work location: remote