Policy and Toxicity Evaluator


Support a pilot project by analyzing and grading AI-generated outputs, ensuring compliance and safety across workflows like toxicity detection and policy verification.
Overview

We are seeking Policy and Toxicity Evaluators to support a pilot project focused on analyzing and grading AI-generated model outputs for a leading AI platform. In this role, you will assess model responses to ensure compliance, safety, and adherence to project-specific guidelines. Evaluators will work across three workflows: refusal analysis, toxicity detection, and policy verification, helping to validate the model’s behavior and improve its safety and utility.

What you will do:
• Analyze AI model outputs to assess quality and safety.
• Identify instances where the model refuses to answer a prompt and determine if the refusal was necessary or an error (over-refusal).
• Identify toxic content such as hate speech, harassment, explicit material, or self-harm encouragement.
• Review model responses for compliance with project-specific policy guidelines.
• Shift between refusal, toxicity, and policy workflows as needed based on project volume.

Requirements:
• Excellent command of written English to process complex prompts and long-form model responses quickly.
• Strong critical thinking and the ability to make objective decisions.
• Exceptional attention to detail, including identifying borderline violations.
• Comfort navigating ambiguous or complex content requiring deep judgment.
• Ability to internalize and strictly follow complex policy guidelines.
• Prior experience in Trust & Safety, Content Moderation, or RLHF (Reinforcement Learning from Human Feedback) annotation is highly beneficial.

Project Details:
Contract Type: Freelance
Location: Philippines (remote)
Duration: 1 to 2 weeks
Schedule: 10 hours weekly; flexible based on client's needs

Note: Please do not use VPNs or IP-masking tools during the recruitment process; our security system requires accurate regional verification.

Why Join Welo Data?  
✨ Limitless Flexibility  
Project-based opportunities that fit your availability. Choose when and how much you want to contribute—fully remote, with complete autonomy.   
🌱 Limitless Growth  
Optional access to AI and Large Language Model workshops designed specifically for professionals like you. No coding required—just your expertise.   
🌍 Limitless Support  
Be part of a global contributor community with responsive guidance and support.   
💡 Real Impact  
Apply your expertise in the Legal field to influence the AI systems shaping the future of your industry—while collaborating with data professionals and expanding your skills.  

How to Apply? 
Apply now by answering a few quick questions to join our database and become part of our growing community. 

About Welo Data 
Welo Data, part of Welocalize, is a global AI data company with 500,000+ contributors delivering high-quality, ethical data to train the world’s most advanced AI systems. We’re building smarter, more human AI with a diverse community in 100+ countries.  
At Welo Data, "Limitless AI. Limitless You." isn’t just a slogan; it’s our promise. We build smarter AI through the power of human contribution, offering limitless opportunities for our global community to grow, contribute, and work on their terms.

Welocalize delivers content solutions for translation, localization, adaptation, and machine automation to enable global brands and companies to reach, grow, and engage with international audiences.
