OpenAI Is Asking Contractors to Upload Work From Past Jobs to Evaluate the Performance of AI Agents

According to records from OpenAI and training data company Handshake AI obtained by WIRED, OpenAI is asking third-party contractors to upload actual assignments and tasks from their current or past workplaces so it can use the data to evaluate the performance of its next-generation AI models.

The project appears to be part of OpenAI’s efforts to establish a human baseline for various tasks against which AI models can be compared. In September, the company launched a new evaluation process to measure the performance of its AI models against human professionals in various industries. OpenAI says this is a key indicator of its progress toward achieving AGI, or an AI system that outperforms humans in most economically valuable tasks.

“We hired people in a variety of professions to help us collect real-world tasks you perform in your full-time jobs, so we can measure how well AI models perform on those tasks,” a confidential document from OpenAI reads. “Take existing pieces of long-term or complex work (hours or days+) done in your business and turn each into a task.”

According to an OpenAI presentation about the project seen by WIRED, OpenAI is asking contractors to describe the work they’ve done in their current job or in the past and to upload real examples of the work they’ve done. Each example “must be a concrete output (not a summary of the file, but the actual file), e.g., Word doc, PDF, PowerPoint, Excel, image, repo,” the presentation notes. OpenAI says people can also share fabricated work examples created to demonstrate how they would realistically react in specific scenarios.

OpenAI and Handshake AI declined to comment.

According to the OpenAI presentation, real-world tasks have two components: work requests (what a person’s manager or coworker asked them to do) and deliverable work (the actual work they did in response to that request). The company emphasizes several times in the instructions that examples shared by contractors should reflect “actual, on-the-job work” that the individual has performed.

An example in the OpenAI presentation outlines the job of “Senior Lifestyle Manager at a luxury concierge company for ultra-high-net-worth individuals.” The goal is to “create a short, 2-page PDF draft of a 7-day boat trip overview to the Bahamas for a family who will be traveling there for the first time.” This includes additional details about the family’s interests and what the itinerary should look like. “Experienced Human Delivery” then shows what the contractor would upload, in this case: an actual Bahamas itinerary created for a client.

OpenAI directs contractors to remove corporate intellectual property and personally identifiable information from the work files they upload. Under the section labeled “Important Reminders”, OpenAI asks workers to “delete or anonymize any of the personal information, proprietary or confidential data, material non-public information (for example, internal strategies, unpublished product descriptions).”

One of the documents seen by WIRED mentions a ChatGPT tool called “Superstar Scrubbing” that provides advice on how to remove confidential information.

Evan Brown, an intellectual property attorney at Neal & McDevitt, told WIRED that AI labs that receive confidential information from contractors on this scale could be subject to claims of trade secret misappropriation. Contractors who provide documents from their previous workplaces to an AI company, even after scrubbing them, may risk violating their previous employers’ nondisclosure agreements or exposing trade secrets.

“AI labs are relying heavily on their contractors to decide what is confidential and what is not,” says Brown. “If they let something slip, are AI labs really taking the time to determine what is a trade secret and what is not? I feel like the AI lab is putting itself at great risk.”


