ATProto Browser

Experimental browser for the Atmosphere

Post

Indeed the article is truly terrible, using "hallucinat*" 20x vs "inaccurate" 1x, and making all kinds of quantitative comparisons of benchmark outcomes. (An LLM benchmark score is %age of correct answers on today's collection of tricky questions, providing no info on system accuracy in the wild.)

Apr 27, 2025, 9:31 AM

Record data

{
  "uri": "at://did:plc:iw4ngu7e6vevjog34kermab3/app.bsky.feed.post/3lnrwxwg52c2n",
  "cid": "bafyreih3fgxmbo3pb67nmpif4l56rc7f2ejoo34dfktxsui6s2udecpvg4",
  "value": {
    "text": "Indeed the article is truly terrible, using \"hallucinat* 20x vs \"inaccurate\" 1x, and making all kinds of quantitative comparisons of benchmark outcomes. (An LLM benchmark score is %age of correct answers on today's collection of tricky questions, providing no info on system accuracy in the wild.)",
    "$type": "app.bsky.feed.post",
    "langs": [
      "en"
    ],
    "reply": {
      "root": {
        "cid": "bafyreibf7y27l7cdjuzobo5pqzmil7b2zczycatr2x74di7cq7ls25al5a",
        "uri": "at://did:plc:iw4ngu7e6vevjog34kermab3/app.bsky.feed.post/3lnrwgrckz22n"
      },
      "parent": {
        "cid": "bafyreibf7y27l7cdjuzobo5pqzmil7b2zczycatr2x74di7cq7ls25al5a",
        "uri": "at://did:plc:iw4ngu7e6vevjog34kermab3/app.bsky.feed.post/3lnrwgrckz22n"
      }
    },
    "createdAt": "2025-04-27T09:31:34.868Z"
  }
}