AxBench argues that most of the excitement around SAEs for steering lacked systematic evals, which overhyped their effectiveness. This is echoed by Google moving away from them after negative results in more systematic evals: www.lesswrong.com/posts/4uXCAJ...
May 6, 2025, 3:50 PM
{
"text": "AxBench makes the argument that most of the excitement around SAEs for steering lacked systematic evals which over hyped their effectiveness.\n\nThis is echoed by Google moving away from them after negative results in more systematic evals: www.lesswrong.com/posts/4uXCAJ...",
"$type": "app.bsky.feed.post",
"langs": [
"en"
],
"reply": {
"root": {
"cid": "bafyreidn7a7oxe7zz3jvhnejadleof3dtxup2qxbtu2y2j6vislf2olm7e",
"uri": "at://did:plc:565ebob5f6hw33hjdkxty6qj/app.bsky.feed.post/3logv43nbuk2v"
},
"parent": {
"cid": "bafyreifci3ceip4nfnsukhg6poajzbissnxwh7qnpscnycgxrzntgeys7e",
"uri": "at://did:plc:565ebob5f6hw33hjdkxty6qj/app.bsky.feed.post/3loh2bp6cl22a"
}
},
"facets": [
{
"index": {
"byteEnd": 272,
"byteStart": 239
},
"features": [
{
"uri": "https://www.lesswrong.com/posts/4uXCAJNuPKtKBsi28/negative-results-for-saes-on-downstream-tasks",
"$type": "app.bsky.richtext.facet#link"
}
]
}
],
"createdAt": "2025-05-06T15:50:43.235Z"
}