OpenAI Seeks Real-World Work Samples to Train AI Models
OpenAI is reportedly requesting third-party contractors to submit authentic work examples from their current or past jobs. This data will be used to benchmark the performance of its next-generation AI models, including those aiming for Artificial General Intelligence (AGI). The initiative is part of a broader effort to measure AI capabilities against human professionals across various industries.
Establishing a Human Baseline for AI Performance
OpenAI launched a new evaluation process in September to directly compare its AI models to human performance. Achieving AGI – an AI system capable of exceeding human capabilities in most economically valuable tasks – is a key goal for the company. To reach this milestone, OpenAI needs a robust understanding of how humans perform real-world tasks.
According to a confidential OpenAI document, contractors are being asked to transform past work, taking “existing pieces of long-term or complex work (hours or days+)” and turning them into individual tasks. The emphasis is on providing concrete deliverables – documents, presentations, spreadsheets, images, or code repositories – rather than summaries.
What Kind of Work Is OpenAI Requesting?
The project focuses on capturing both the task request (the initial instructions) and the task deliverable (the completed work). OpenAI specifically asks for “real, on-the-job work” that contractors have “actually done.”
An example provided in OpenAI’s presentation details a task for a “Senior Lifestyle Manager” involving the creation of a draft itinerary for a luxury yacht trip. The deliverable would be a real itinerary previously created for a client.
Data Security and Confidentiality Concerns
OpenAI instructs contractors to remove any personally identifiable details, proprietary data, or confidential material from submitted files. They even provide access to a ChatGPT tool, “Superstar Scrubbing,” to assist with this process.
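To illustrate what this kind of pre-submission scrubbing involves, here is a minimal, hypothetical sketch of an automated redaction pass. It is not OpenAI’s “Superstar Scrubbing” tool, whose implementation has not been made public; the patterns and function names below are assumptions chosen for the example.

```python
import re

# Hypothetical illustration only: mask common PII patterns (emails,
# phone numbers, and names from a caller-supplied list) before a file
# is shared. Real scrubbing would need far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d\b"),
}

def scrub(text: str, known_names: list[str]) -> str:
    """Replace emails, phone numbers, and listed names with tags."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label} REDACTED]", text)
    for name in known_names:
        text = re.sub(re.escape(name), "[NAME REDACTED]", text,
                      flags=re.IGNORECASE)
    return text

sample = "Contact Jane Roe at jane.roe@example.com or +1 415 555 0100."
print(scrub(sample, ["Jane Roe"]))
# → Contact [NAME REDACTED] at [EMAIL REDACTED] or [PHONE REDACTED].
```

Pattern-based scrubbing like this is inherently best-effort, which is precisely why, as the experts below note, so much trust still rests on the contractor’s own judgment.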
However, legal experts warn that this approach carries risks. Evan Brown, an intellectual property lawyer, notes that AI labs receiving confidential information at this scale could face trade secret misappropriation claims. Contractors sharing work examples, even after attempting to anonymize them, could violate non-disclosure agreements or expose sensitive company information.
Brown emphasizes the significant trust OpenAI places in contractors to accurately identify confidential information and questions whether the company is adequately assessing the risk of inadvertently handling trade secrets.
Key Takeaways
- OpenAI is actively collecting real-world work samples from contractors to improve its AI models.
- The goal is to establish a human performance baseline for comparison with AI capabilities.
- Contractors are asked to submit actual deliverables from past jobs, not just descriptions.
- Data security and confidentiality are concerns, despite OpenAI’s instructions and provided tools.
- Legal risks exist for both contractors and OpenAI regarding potential trade secret violations.
This initiative highlights the growing demand for high-quality training data in the development of advanced AI. As OpenAI and other companies push the boundaries of AI capabilities, the ethical and legal implications of data collection will continue to be a critical area of focus.