Yet, language model alignment has unique structure not present in general reinforcement learning—most prominently, access to the pre-trained base model—which gives some hope that we might be able to do better. 8/
Mar 27, 2025, 5:28 PM
{
  "uri": "at://did:plc:x2a3inabvfsn4wntrlbbndrv/app.bsky.feed.post/3llet5st3h32c",
  "cid": "bafyreibfqiqoriipbdwr2dww6xjwbxhgqz7lhbmgtbfqqhbn6tgrrmbrba",
  "value": {
    "text": "Yet, language model alignment has unique structure not present in general reinforcement learning—most prominently, access to the pre-trained base model—which gives some hope that we might be able to do better.\n\n8/",
    "$type": "app.bsky.feed.post",
    "langs": [
      "en"
    ],
    "reply": {
      "root": {
        "cid": "bafyreidzv7o2gllewwfqqi4ixremjopxmuyntc5gstydqektrxkcjni4pi",
        "uri": "at://did:plc:x2a3inabvfsn4wntrlbbndrv/app.bsky.feed.post/3llet5p66ac2c"
      },
      "parent": {
        "cid": "bafyreih7b32qn2utojo6576qpk2fmarpyz6h7rgpe5krvngw5nxuj3pcfq",
        "uri": "at://did:plc:x2a3inabvfsn4wntrlbbndrv/app.bsky.feed.post/3llet5st3h22c"
      }
    },
    "createdAt": "2025-03-27T17:28:13.777Z"
  }
}