MIT tested AI on thousands of workplace tasks. Most of the time, it just barely got by
Key Points:
- A recent MIT study found that current AI models perform workplace tasks at a minimally sufficient level about 65% of the time, comparable to a disenchanted intern, but often require human refinement to ensure quality.
- AI struggles with complex, multi-step, creative, or precision-demanding tasks: its success rate for superior-quality work rarely exceeds 50%, which points to limits on fully automating skilled roles.
- The study evaluated 41 large language models on more than 11,000 text-based tasks across a range of professions. The models did better on routine tasks typical of construction and maintenance than on tasks from more skilled legal and IT jobs.
- Real-world applications of AI in workplaces have encountered issues such as fabricated reports and inaccurate outputs, underscoring the need for human oversight despite ongoing improvements in AI capabilities.
- Researchers estimate AI's ability to meet minimal task benchmarks could reach 80-95% by 2029, but achieving excellent or error-free performance remains uncertain, suggesting widespread automation in sensitive fields is still distant.