However, it is not yet clear whether standard RL techniques like PPO/GRPO are capable of discovering truly new behaviors. 3/
Mar 27, 2025, 5:28 PM
{
  "uri": "at://did:plc:x2a3inabvfsn4wntrlbbndrv/app.bsky.feed.post/3llet5rpqlk2c",
  "cid": "bafyreic75xgy7hzxqmgdsmuqe4fdbiw43jqgr72l3vkbpupww6ecnpgvha",
  "value": {
    "text": "However, it is not yet clear whether standard RL techniques like PPO/GRPO are capable of discovering truly new behaviors. \n\n3/",
    "$type": "app.bsky.feed.post",
    "langs": ["en"],
    "reply": {
      "root": {
        "cid": "bafyreidzv7o2gllewwfqqi4ixremjopxmuyntc5gstydqektrxkcjni4pi",
        "uri": "at://did:plc:x2a3inabvfsn4wntrlbbndrv/app.bsky.feed.post/3llet5p66ac2c"
      },
      "parent": {
        "cid": "bafyreiajq5psz6xmuw2xbnjhjheii5nepctkj6b6c6a2bvh67keqswejiu",
        "uri": "at://did:plc:x2a3inabvfsn4wntrlbbndrv/app.bsky.feed.post/3llet5qyih22c"
      }
    },
    "createdAt": "2025-03-27T17:28:13.772Z"
  }
}