csvbase is a simple website for sharing table data.

6 rows, last changed 3 months ago
| Row ID | dataset_id | data_name_split | domain | gemma3_token_cnt | epochs | desired_token_cnt | training_stage | link |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | DS-100001 | experiment_1 | CRAWL-Gen | 1500000000000.0 | 0.5 | 3000000000000.0 | SMoE-Phase1 | https://huggingface.co/datasets/experiment-1 |
| 2 | DS-100002 | experiment_2 | SFT | 500000000000.0 | 1.0 | 1000000000000.0 | BMoE-Phase2 | https://huggingface.co/datasets/experiment-2 |
| 3 | DS-100003 | experiment_3 | CRAWL-Gen | 2300000000000.0 | 0.25 | 5000000000000.0 | SMoE-Phase1 | <NA> |
| 4 | DS-100004 | experiment_4 | SFT | 100000000000.0 | 2.0 | 200000000000.0 | BMoE-Phase2 | https://huggingface.co/datasets/experiment-4 |
| 5 | DS-100005 | experiment_5 | CRAWL-Gen | 850000000000.0 | 0.75 | 1200000000000.0 | SMoE-Phase1 | https://github.com/experiment-5 |
| 6 | DS-0001 | name-1 | general | 10000.0 | 10.0 | 1000.0 | pre-train | |