“This replicates our finding for aligned models and contradicts the Waluigi thesis that representations learned by RLHF enable EM.”

This sentence sounds like a parody of AI jargon, and is also an extremely interesting research result.
May 6, 2025, 5:40 PM
{
  "uri": "at://did:plc:p572wxnsuoogcrhlfrlizlrb/app.bsky.feed.post/3lojgij4pm22c",
  "cid": "bafyreidwl2slnk7ihhs5aiymjjjiiggwjiyyh6ldgsxpg6q4v4voyobive",
  "value": {
    "text": "“This replicates our finding for aligned models and contradicts the Waluigi thesis that representations learned by RLHF enable EM.”\n\nThis sentence sounds like a parody of AI jargon, and is also an extremely interesting research result",
    "$type": "app.bsky.feed.post",
    "embed": {
      "$type": "app.bsky.embed.record",
      "record": {
        "cid": "bafyreiamfcfsnsgy7fmr3o22qcir7nibiy2oukymlwy3zyrbztfgd6frvu",
        "uri": "at://did:plc:p572wxnsuoogcrhlfrlizlrb/app.bsky.feed.post/3lojgbdg6sk2c"
      }
    },
    "langs": ["en"],
    "createdAt": "2025-05-06T17:40:31.465Z"
  }
}