OpenAI Nears Launch of AI Agent Tool To Automate Tasks For Users

An anonymous reader quotes a report from Bloomberg: OpenAI is preparing to launch a new artificial intelligence agent codenamed "Operator" that can use a computer to take actions on a person's behalf, such as writing code or booking travel [...]. In a staff meeting on Wednesday, OpenAI's leadership announced plans to release the tool in January as a research preview and through the company's application programming interface for developers [...]. The one nearest completion will be a general-purpose tool that executes tasks in a web browser, one of the people said. OpenAI Chief Executive Officer Sam Altman hinted at the shift to agents in response to a question last month during an Ask Me Anything session on Reddit. "We will have better and better models," Altman wrote. "But I think the thing that will feel like the next giant breakthrough will be agents." The move to release an agentic AI tool also comes as OpenAI and its competitors have seen diminishing returns from their costly efforts to develop more advanced AI models. Read more of this story at Slashdot.

Microsoft Gaming Handheld Device ‘Few Years’ Away, Says Xbox Chief

Microsoft's gaming division is developing prototypes for a handheld gaming device that won't launch for "a few years," gaming chief Phil Spencer said Wednesday. In an interview with Bloomberg, Spencer said that while Microsoft is actively working on prototypes, the company will first focus on improving its Xbox app performance on existing portable devices and establishing hardware partnerships. The gaming unit wants to be "informed by learning and what's happening now" before introducing its own device, Spencer said. "Longer term, I love us building devices," Spencer said, adding that Microsoft's team "could do some real innovative work."

How Italy Became an Unexpected Spyware Hub

Italy has emerged as a major global spyware hub alongside Israel and India, with at least six major vendors operating in the country with limited oversight, The Record reported this week, citing researchers and Italian experts. Companies like RCS Labs, which has operated since 1992, sell surveillance tools to both domestic law enforcement and foreign governments including Kazakhstan, Syria, and several Asian nations. Italian authorities can rent spyware for $160 per day without large acquisition costs, leading to thousands of domestic surveillance operations in recent years. While new regulations taking effect in February 2024 will require judges to evaluate specific reasons for spyware use, critics cited in the story say the reform package won't address core issues like the lack of centralized oversight. The country's competitive marketplace and relatively lax export controls have also enabled Italian vendors to expand their overseas sales.

AI Systems Solve Just 2% of Advanced Maths Problems in New Benchmark Test

Leading AI systems are solving less than 2% of problems in a new advanced mathematics benchmark, revealing significant limitations in their reasoning capabilities, research group Epoch AI reported this week. The benchmark, called FrontierMath, consists of hundreds of original research-level mathematics problems developed in collaboration with over 60 mathematicians, including Fields Medalists Terence Tao and Timothy Gowers. While top AI models like GPT-4 and Gemini 1.5 Pro achieve over 90% accuracy on traditional math tests, they struggle with FrontierMath's problems, which range from computational number theory to algebraic geometry and require complex reasoning. "These are extremely challenging. [...] The only way to solve them is by a combination of a semi-expert like a graduate student in a related field, maybe paired with some combination of a modern AI and lots of other algebra packages," Tao said. The problems are designed to be "guessproof," with large numerical answers or complex mathematical objects as solutions, making them nearly impossible to solve without proper mathematical reasoning. Further reading: New secret math benchmark stumps AI models and PhDs alike.