ATProto Browser

Experimental browser for the Atmosphere

Post

Could you explain more about why KTO or other unpaired methods wouldn't have similar issues with off-policy data? If the data is off-policy, I would expect users' unpaired ratings to change often, since the likelihood of the possible alternatives has changed.

Dec 19, 2024, 6:54 PM

{
  "text": "Could you explain more why KTO or other unpaired methods wouldn't have similar issues with off-policy data?\n\nIf the data is off-policy, my expectation would be that the users unpaired ratings would often change since the likelihood of possible alternatives has changed.",
  "$type": "app.bsky.feed.post",
  "langs": [
    "en"
  ],
  "reply": {
    "root": {
      "cid": "bafyreic44c5fbtlfsjkjjhhxvg7sdvjkk556vruetrh4amdmebp2orffzq",
      "uri": "at://did:plc:brkj2yocng7vtggmyujy4khq/app.bsky.feed.post/3ldjl7torno2t"
    },
    "parent": {
      "cid": "bafyreigmb6tun6pmpaakcsdc7cau5kmgncxzl4kghso6ozs3ixlgbtl2mi",
      "uri": "at://did:plc:j7tmwpecoad43t6jhp5t5ovn/app.bsky.feed.post/3ldohfhgjfc2b"
    }
  },
  "createdAt": "2024-12-19T18:54:48.210Z"
}