Balancing data across domains is key to training the best generalist LLMs! In my summer work on the Meta Llama team, we introduce UtiliMax and MEDU, new methods to estimate data utility and optimize data mixes efficiently. HF Blog: huggingface.co/blog/WillHel... ArXiv: arxiv.org/abs/2501.11747
Jan 22, 2025, 8:06 PM
{
  "text": "Balancing data across domains is key to training the best generalist LLMs!\n\nIn my summer work on the Meta Llama team, we introduce UtiliMax and MEDU, new methods to estimate data utility and optimize data mixes efficiently.\n\nHF Blog: huggingface.co/blog/WillHel...\nArXiv: arxiv.org/abs/2501.11747",
  "$type": "app.bsky.feed.post",
  "langs": [
    "en"
  ],
  "facets": [
    {
      "index": {
        "byteEnd": 264,
        "byteStart": 234
      },
      "features": [
        {
          "uri": "https://huggingface.co/blog/WillHeld/utilimax-and-medu",
          "$type": "app.bsky.richtext.facet#link"
        }
      ]
    },
    {
      "index": {
        "byteEnd": 296,
        "byteStart": 272
      },
      "features": [
        {
          "uri": "https://arxiv.org/abs/2501.11747",
          "$type": "app.bsky.richtext.facet#link"
        }
      ]
    }
  ],
  "createdAt": "2025-01-22T20:06:19.076Z"
}