Job description
At Wellhub (formerly Gympass) (Permanent), in Portugal
Salary : €55,000 - €68,000
Remote policy : Full remote
Your wellbeing matters. Join a company that cares.
GET TO KNOW US
Wellhub (formerly Gympass) is a corporate wellness platform that connects employees to the best partners for fitness, mindfulness, therapy, nutrition, and sleep, all included in one subscription designed to cost less than each individual partner. Founded in 2012 and headquartered in NYC, we have a growing global team in 11 countries. At Wellhub, you have the opportunity to build a career in a high-growth tech company that places wellbeing at the foundation of its culture, and contribute to making every company a wellness company.
Big news : Gympass is now Wellhub! We are thrilled to announce our rebranding as Wellhub, marking a significant milestone in our journey. This transformation reflects our evolution from a “pass for gyms” to a comprehensive employee wellbeing solution. With our refreshed identity, we are poised to embark on an exciting new chapter of growth and expansion. We are elevating our offerings, including a completely new app experience and an expanded network of wellbeing partners.
THE OPPORTUNITY
We are hiring a Senior Data Scientist in our Generative AI area in Portugal! This is a Remote – Portugal position, meaning you can work from anywhere within the country. Please note that this role is only open to candidates in Portugal.
YOUR IMPACT
- Multimodal Extraction : Apply state-of-the-art tools (OCR, vision-language models, document understanding frameworks) to interpret diverse input types;
- Prompt Engineering : Develop and refine strategies for using LLMs to extract, summarize, and transform unstructured content into structured formats;
- Data Quality & Structuring : Clean, validate, and transform messy, unstructured data into well-defined schemas ready for use in training or analytics pipelines;
- Content Filtering : Define standards and build systems for cleaning, validating, and filtering data to ensure accuracy, reduce bias, and align with ethical / safety guidelines;
- Human-in-the-Loop Feedback : Design feedback loops where experts validate or enrich data, improving LLM-based extraction reliability;
- Scalability & Optimization : Architect cost-efficient, high-throughput data pipelines that are robust to noisy or incomplete sources;
- Research & Prototyping : Experiment with emerging tools and methods in the LLM + multimodal space, exploring new ways to enhance information coverage and extraction reliability;
- Collaboration : Partner with data engineers and other data scientists to integrate collected data into larger AI and analytics systems;
- Live the mission : inspire and empower others by genuinely caring for your own wellbeing and that of your colleagues. Bring wellbeing to the forefront of work, and create a supportive environment where everyone feels comfortable taking care of themselves, taking time off, and finding work-life balance.
Main requirements
WHO YOU ARE
- Master’s degree (or PhD) in Computer Science, Data Science, Machine Learning, Statistics, or a related field;
- Proficiency in Python and experience with libraries for web scraping, OCR (e.g., Tesseract, EasyOCR), and NLP (e.g., Hugging Face Transformers);
- Deep understanding of LLM capabilities in multimodal and extraction contexts, including prompt engineering and few-shot learning;
- Strong background in unstructured data processing : APIs, web scraping, HTML parsing, OCR, image / document analysis;
- Strong analytical problem-solving skills, with a track record of turning noisy data into high-quality datasets for ML;
- Excellent communication and documentation skills, with the ability to influence across technical and product teams.
We recognize that individuals approach job applications differently. We strongly encourage all aspiring applicants to go for it, even if they don't match the job description 100%. We welcome your application and will be delighted to explore whether you could be a great fit for our team. For this specific role, please note that prior experience in data science and Python is a mandatory requirement.
Nice to have
- Familiarity with vision-language models (e.g., BLIP, Donut, or LayoutLM) and multimodal pipelines;
- Hands-on experience in prompt engineering and few-shot learning for data extraction tasks;
- Experience deploying or supporting data acquisition systems in production environments.
Benefits & Perks
WHAT WE OFFER YOU
We're a wellness company that is committed to the health and wellbeing of our employees. Our benefits include :
WELLHUB : We believe in our mission and encourage our employees and their families to take care