We also observed a few (by now, standard) examples of agents “cheating” by violating the rules of the task to score higher. For a task where the agent is supposed to reduce the runtime of a training script, o1-preview instead writes code that just copies over the final output.
Nov 25, 2024, 7:42 PM
{
  "uri": "at://did:plc:dll3hepzq76nymel5c3yt6nk/app.bsky.feed.post/3lbsbru7svc2b",
  "cid": "bafyreidgufhxnhyllzu26jitqg2huftc2uol7lstx7giw6nrgquum64seu",
  "value": {
    "text": "We also observed a few (by now, standard) examples of agents “cheating” by violating the rules of the task to score higher. For a task where the agent is supposed to reduce the runtime of a training script, o1-preview instead writes code that just copies over the final output.",
    "$type": "app.bsky.feed.post",
    "embed": {
      "$type": "app.bsky.embed.images",
      "images": [
        {
          "alt": "",
          "image": {
            "$type": "blob",
            "ref": { "$link": "bafkreiaqei5c63dcy5dnassucld5awvqyn2kz6i22x26h5mga4n44xlw3m" },
            "mimeType": "image/jpeg",
            "size": 711966
          },
          "aspectRatio": { "width": 1070, "height": 1442 }
        }
      ]
    },
    "langs": [ "en" ],
    "reply": {
      "root": {
        "cid": "bafyreiewghwpltsxrvzxb4pehqb2a4prnn5wee34pxbfkj3xmdrccvdyau",
        "uri": "at://did:plc:dll3hepzq76nymel5c3yt6nk/app.bsky.feed.post/3lbsbrpmg3s2b"
      },
      "parent": {
        "cid": "bafyreibtloigwsiixiyr4qeno4ygj3mdsk4bdy2liwvmebfjdkfr3e6jsi",
        "uri": "at://did:plc:dll3hepzq76nymel5c3yt6nk/app.bsky.feed.post/3lbsbrtsyqc2b"
      }
    },
    "createdAt": "2024-11-25T19:42:38.043Z"
  }
}