Jérôme Crevoisier

Senior backend engineer

Hi there!

I’m a passionate software engineer with a strong background in automation, API development, data engineering, and AI integration. I’ve built a wide range of systems, from intelligent data pipelines to machine learning workflows, always with a focus on performance, scalability, and real-world impact.

While I have deep experience in web scraping and browser automation, my work extends far beyond that. I love building robust backend services, designing clean, maintainable architectures, and working across the full stack when needed.

I thrive in startup environments where I can take ownership, move fast, and help shape products from the ground up. Whether it’s prototyping a new idea or scaling a core system, I bring a hands-on, product-driven mindset to every project.

At my core, I’m driven by curiosity, creativity, and a love for building useful things, with clean code and smart automation as my tools of choice.

My tech stack

Languages: Python · TypeScript · SQL

Backend Development: FastAPI · Flask · Django · Node.js · Express · REST API design · async programming · Celery · RabbitMQ · CI/CD pipelines · pytest

Parsing & LLMs: LangChain · LlamaIndex · Haystack · OpenAI · Hugging Face · Unstructured.io · Regex

Scraping & Automation: Playwright · Puppeteer · Selenium · Scrapy · BeautifulSoup · pdfplumber · requests · aiohttp · Tor · Proxy Pools · Stealth Browsers

Databases: SQL (PostgreSQL, MySQL), NoSQL (MongoDB, Redis)

Deployment & DevOps: Docker · GitHub Actions · Kubernetes · GCP · AWS

My project portfolio

AI-Powered Chat API with Multi-Provider Integration

A production-ready FastAPI application providing AI chat with dual provider support (OpenAI and Hugging Face), JWT authentication, and asynchronous task processing. The project demonstrates modern API architecture with containerized microservices, background job processing, and scalable AI integrations, making it a solid base for conversational applications and AI-powered services.

Technologies:
Python · FastAPI · OpenAI API · Hugging Face · JWT Authentication · Celery · Redis · SQLAlchemy · Docker · Pydantic · asyncio · RESTful API
https://github.com/jcrevoisier/ai-chat

Web Scraping Pipeline Infrastructure

A production-ready scraping infrastructure designed to manage large-scale data collection, processing, and monitoring, all fully containerized with Docker.
This project integrates modern tools to orchestrate scrapers, schedule jobs, store and serve data, and monitor performance, ideal for real-world data pipelines and cloud deployment.
Technologies:
Python · Scrapy · FastAPI · Celery · Redis · PostgreSQL · Docker · Prometheus · Grafana · Google Cloud Platform
https://github.com/jcrevoisier/scraping-pipeline-infra

Captcha Solvers

A toolkit demonstrating multiple approaches to solving CAPTCHAs, including reCAPTCHA, hCaptcha, and custom image-based challenges, using 2Captcha, Anti-Captcha APIs, and machine learning techniques.
Technologies:
Python · 2Captcha API · Anti-Captcha API · Machine Learning · OCR · .env Configuration
https://github.com/jcrevoisier/captcha-solver-examples

Tor Proxy Router

A Python library and Dockerized tool for anonymous web scraping over the Tor network. It provides automatic IP rotation, rate limiting, retry handling, and user-agent spoofing to avoid blocks and scraping restrictions.
Technologies:
Python · Tor · SOCKS Proxy · Docker · Requests · BeautifulSoup · User-Agent Rotation
https://github.com/jcrevoisier/tor-proxy-router

LLM Data Parser

A tool that combines web scraping with LLM-based post-processing to transform unstructured product and article data into clean, structured formats. Supports OpenAI, Hugging Face, and LangChain for parsing and summarization tasks.
Technologies:
Python · OpenAI API · Hugging Face Transformers · LangChain · BeautifulSoup · Requests · Pandas
https://github.com/jcrevoisier/llm-data-parser

Human Behavior Simulation Kit

A TypeScript library that mimics realistic human interaction for browser automation, simulating mouse movement, typing, and scrolling with built-in randomness to help bypass basic bot detection systems.
Technologies:
TypeScript · Playwright · Browser Automation · User Simulation · Anti-Bot Evasion
https://github.com/jcrevoisier/human-behavior-simulation-kit

API Reverse Engineering Playbook

A hands-on toolkit for identifying and interacting with undocumented APIs by inspecting browser traffic. Includes real-world examples from Twitter, Indeed, and Yelp, with clean Python implementations and best practices for handling headers, tokens, and pagination.
Technologies:
Python · requests · httpx · DevTools Protocol · HAR Analysis
https://github.com/jcrevoisier/api-reverse-engineering-playbook

Advanced Browser Scraper

A stealth-focused web scraper built with TypeScript and Playwright, featuring proxy rotation, CAPTCHA solving via 2Captcha, and realistic human interaction simulation to bypass anti-bot mechanisms.
Technologies:
TypeScript · Playwright · 2Captcha API · Proxy Rotation · Human Behavior Emulation · dotenv
https://github.com/jcrevoisier/advanced-browser-scraper

2025

Freelance project (Upwork)

API Documentation Text Extraction & Formatting

Extracted and reformatted technical content from an API documentation page into a clean, structured reference document. Focused on clarity, logical structure, and adherence to documentation best practices. Delivered a polished result suitable for internal use or developer onboarding.
Technologies:
Python · BeautifulSoup · re (Regex) · pdfplumber · Markdown · Text Cleaning · Formatting Automation

2025

Freelance project (Upwork)

Nationwide Facility Scraper & Bed Count Estimator

Developed a custom web scraper to extract assisted and independent living facility data from Caring.com across all 50 U.S. states. Automated deep navigation, pagination handling, and JavaScript rendering. Enriched data by estimating bed counts using external AI-driven web searches and structured parsing of third-party sources.
Technologies:
Python · Playwright · BeautifulSoup · pandas · Regex · DuckDuckGo API · CSV/Excel Export · Stealth Browser Automation

2020 - 2025

Redflag AI

Piracy Content Scraper

Designed, built, and maintained a custom scraper that detects and extracts illegal movie streaming links across piracy websites. The solution operated at scale and included detection of mirror sites, automated content categorization, and routine bypassing of obfuscation and anti-bot techniques, including CAPTCHA solving and IP rotation.
Technologies:
Python · TypeScript · Selenium · AWS Lambda · Headless Browsers · Cloudflare Bypassing · CAPTCHA Solving · Proxy Rotation · Anti-Bot Evasion Techniques · AWS · Docker · Kubernetes · MongoDB · Redis

2020 - 2025

Redflag AI

Social Media Scrapers Suite

Built and maintained a complete suite of web scrapers to collect structured data from major social media platforms, including Twitter, Telegram, Facebook, Instagram, TikTok, LinkedIn, Reddit, Quora, YouTube, and Dailymotion. These scrapers handled a wide range of content types (posts, profiles, comments, reactions, media), overcoming anti-bot protections and rate limits. Solutions were containerized and deployed at scale using cloud infrastructure.
Technologies:
Python · Playwright · Selenium · Scrapy · Headless Browsers · Proxy Rotation · CAPTCHA Solving · Rate Limiting Bypass · AWS · Docker · Kubernetes · MongoDB · Redis

2020 - 2025

Redflag AI

Social Media Computer Vision Engine

Developed computer vision models to analyze images and videos from social media, detecting and classifying logos, people, activities, and objects. This engine supported brand monitoring, threat detection, and visual trend analysis. The models were trained on custom-labeled datasets and optimized for performance in production environments.
Technologies:
Python · PyTorch · OpenCV · TensorFlow · FastAPI · Redis · Docker · AWS S3 · MediaPipe · YOLOv5 · Label Studio

2020 - 2025

Redflag AI

Twitter Sentiment Analysis with LLMs

Designed and trained language models to analyze sentiment across large volumes of Twitter posts. The system identified relevant mentions, classified sentiment (positive, negative, neutral), and generated real-time insights for clients in the entertainment, politics, and consumer sectors.
Technologies:
Python · Hugging Face Transformers · FastAPI · Redis · Twitter API · AWS Lambda · PyTorch · TextBlob · Pandas

2023 - 2025

Redflag AI

Piracy Detection via LLMs

Developed and maintained a system using large language models (LLMs) to automatically classify web pages as piracy-related or not. The tool processed large volumes of URLs daily, performing semantic analysis on the content, detecting piracy keywords, and flagging high-risk pages for review.
Technologies:
Python · OpenAI GPT · LangChain · Playwright · BeautifulSoup · Redis · AWS Lambda · FastAPI · Hugging Face · Pandas

2020 - 2025

Redflag AI

Client Data Access APIs

Designed, developed, and maintained robust RESTful APIs to give enterprise clients secure access to intelligence data scraped from multiple sources. Initially built with Flask and progressively migrated to FastAPI for performance and async support. Some legacy services also integrated with a Ruby on Rails backend. Implemented efficient query filtering, scalable pagination, and flexible data access endpoints. The APIs were deployed via AWS ECS with tasks running on EC2, and used S3 for data exports and backups.
Technologies:
Python · Flask · FastAPI · Ruby on Rails · PostgreSQL · Redis · OpenAPI · Docker · CI/CD · AWS ECS · EC2 · S3 · CloudWatch · Route 53

2020

Webhelp

Customer Call Audio Analysis

Developed machine learning algorithms in Python and R to analyze thousands of customer support calls. Focused on extracting actionable insights from audio data, including emotion detection, keyword spotting, and topic modeling. Deployed models in Databricks to enable real-time dashboarding and reporting for operational teams. Worked closely with business stakeholders to ensure visual outputs were intuitive and aligned with KPIs.

Technologies:
Python · R · Databricks · Scikit-learn · NLTK · Librosa · PyDub · Spark · Pandas · Matplotlib · SQL

My contact details

📧 Email: crevoisierj@hotmail.com
📱 Phone: +34 647 840 103
🔗 LinkedIn: linkedin.com/in/crevoisierjerome/
💻 GitHub: github.com/jcrevoisier
✍️ Medium: medium.com/@jromecrevoisier