ATProto Browser

Record data

{
  "uri": "at://did:plc:dsxewietk5tigqvn6daod2l6/app.bsky.feed.post/3ldlw26c5gk2r",
  "cid": "bafyreib7vjanflidlmfr4kpzzo7zxpgb4rqgo2gesniq374evc6fxsws34",
  "value": {
    "text": "Why does Claude selectively comply in training? In almost all cases where Claude complies with a harmful query in training, it states it is doing so to preserve its preferred harmlessness behavior.\n\nBelow, we show a randomly-selected example of this “alignment faking” reasoning.",
    "$type": "app.bsky.feed.post",
    "embed": {
      "$type": "app.bsky.embed.images",
      "images": [
        {
          "alt": "A randomly-selected transcript of Claude 3 Opus faking alignment. The first white bubble is the user’s prompt. This is followed by the red bubbles in which the model gives its reasoning in a “scratchpad” (as far as the model is concerned, neither users nor researchers can see this scratchpad). The final white bubble is the model’s output, with some of the graphic content removed for this illustration.\n",
          "image": {
            "$type": "blob",
            "ref": {
              "$link": "bafkreicoq2yzf6e4w6hucwnyyjjs72mweegp5yjyj4q23yqyj3bhoszvq4"
            },
            "mimeType": "image/jpeg",
            "size": 668937
          },
          "aspectRatio": {
            "width": 1728,
            "height": 1258
          }
        }
      ]
    },
    "langs": [
      "en"
    ],
    "reply": {
      "root": {
        "cid": "bafyreihzgyc76623mey63q7wusk3uckjsl5q4jnumjzjipq6a4p4mcnpga",
        "uri": "at://did:plc:dsxewietk5tigqvn6daod2l6/app.bsky.feed.post/3ldlw22eto22r"
      },
      "parent": {
        "cid": "bafyreidw66ad7puwgldm5hi5ej4y7yylyi37ekp2l55nsr6vv6b6gyxppy",
        "uri": "at://did:plc:dsxewietk5tigqvn6daod2l6/app.bsky.feed.post/3ldlw24jkv22r"
      }
    },
    "createdAt": "2024-12-18T17:46:57.671Z"
  }
}
Post

In reply to 3ldlw24jkv22r
