Welo Data works with technology companies to provide datasets that are high-quality, ethically sourced, relevant, diverse, and scalable to supercharge their AI models. As a Welocalize brand, Welo Data leverages over 25 years of experience in partnering with the world’s most innovative companies and brings together a curated global community of over 500,000 AI training and domain experts to offer services that span:
ANNOTATION & LABELING: Transcription, summarization, image and video classification and labeling.
ENHANCING LLMs: Prompt engineering, SFT, RLHF, red teaming and adversarial model training, model output ranking.
DATA COLLECTION & GENERATION: From institutional languages to remote field audio collection.
RELEVANCE & INTENT: Culturally nuanced and aware ranking, relevance, and evaluation to train models for search, ads, and LLM output.
Want to join our Welo Data team? We bring practical, applied AI expertise to projects. We have both strong academic experience and a deep working knowledge of state-of-the-art AI tools, frameworks, and best practices. Help us elevate our clients' data at Welo Data.
About the Role
We are looking for Data Engineers to support the development and refinement of high-quality datasets used to train Large Language Models (LLMs).
In this role, you will design and evaluate complex prompts modeled after real customer support ticket journeys, ensuring model outputs align with customer expectations.
You will work closely with engineering teams to generate, annotate, validate, and QA task data used in AI training workflows.
Project Details & Commitment
- Location: US / North America
- Language Requirement: English (Native or C1/C2)
- Contract Type: Freelance, Project-based
- Contract Duration: December 22nd – January 31st (possibility of extension)
- Work Schedule (choose one):
  - Minimum 4 hours per day, Monday to Friday
  - OR 2 hours Monday–Friday plus 10 hours over the weekend
- Commitment: Reliable and consistent availability is mandatory
- Hourly Rate: 100 USD
- Start Date: Monday, December 22nd. Only candidates who are able to start on this date will be considered.
Please note: This opportunity is only available for candidates located in the United States.
Key Responsibilities
Query & Prompt Generation: Design complex LLM prompts that accurately represent real customer journeys and service interactions.
Data Shaping & Collaboration: Partner with Field Engineers to transform raw data into structured, high-quality tasks for model training.
Annotation & Evaluation: Annotate and review tasks to ensure strict quality standards and alignment with expected customer outcomes.
Quality Assurance: Validate and assess model responses to ensure accuracy, relevance, and confidence in outputs.
Required Skills & Qualifications
Language: Native or professional fluency (C1/C2) in English
LLM & Prompting Knowledge: Understanding of LLM behavior and prompt engineering principles
Analytical Skills: Strong attention to detail, critical thinking, and comfort working with ambiguous scenarios
Technical Skills:
- SQL for data extraction
- Python (Pandas, NumPy) for data manipulation
- Experience with annotation tools (e.g., Labelbox, Prodigy, or similar platforms)
- Advanced proficiency in Google Sheets/Drive
- Familiarity with version control tools (GitHub)
AI/ML Tools: Experience working with playground environments and prompt debugging
Communication: Excellent technical writing skills and ability to clearly explain data requirements
Nice to Have:
- Prior experience in data labeling, technical support analysis, or AI model evaluation
- Background in or exposure to AI-related projects
Ready to Join?
If you’re excited about working hands-on with cutting-edge AI models and shaping how LLMs understand real customer journeys, we’d love to hear from you.
This is a great opportunity to collaborate with experienced engineering teams, apply your technical and analytical skills to real-world AI challenges, and make a direct impact on model quality and performance.
Apply now and be part of building the next generation of AI-powered solutions. 🚀