Experimental browser for the Atmosphere
understand about this model is that the scores you see on MATH500 and other benchmarks are not due to overfitting. I generated private datasets of math and coding questions (using SOTA LLMs, with Python scripts to verify correctness), and the model also scores high on those problems.
Apr 22, 2025, 10:57 PM
{ "uri": "at://did:plc:ipt7y6qaf6fn7oeeduboqe44/app.bsky.feed.post/3lngrovjusk2h", "cid": "bafyreidimimio525s62vqx3mdxyu7mdnr2tbm4j4z7jxq6pwy6k525q4fq", "value": { "text": "understand about this model is that the score you see in MATH500 and other benchmarks is not due to overfitting. I generated (with SOTA LLMs and Python scripts to verify the correctness) private datasets of math and coding questions, and the model scores so high in such problems as well.", "$type": "app.bsky.feed.post", "langs": [ "en" ], "reply": { "root": { "cid": "bafyreiegdxclqjq3ln27h56hvvkuhcgxemfkzzmmqmathnmxec5yaybagu", "uri": "at://did:plc:ipt7y6qaf6fn7oeeduboqe44/app.bsky.feed.post/3lngrovjrus2h" }, "parent": { "cid": "bafyreiegdxclqjq3ln27h56hvvkuhcgxemfkzzmmqmathnmxec5yaybagu", "uri": "at://did:plc:ipt7y6qaf6fn7oeeduboqe44/app.bsky.feed.post/3lngrovjrus2h" } }, "createdAt": "2025-04-22T22:57:46.121Z" } }