| Row ID | dataset_id | data_name_split | domain | gemma3_token_cnt | epochs | desired_token_cnt | training_stage | link |
|---|---|---|---|---|---|---|---|---|
| 1 | DS-100001 | experiment_1 | CRAWL-Gen | 1500000000000.0 | 0.5 | 3000000000000.0 | SMoE-Phase1 | https://huggingface.co/datasets/experiment-1 |
| 2 | DS-100002 | experiment_2 | SFT | 500000000000.0 | 1.0 | 1000000000000.0 | BMoE-Phase2 | https://huggingface.co/datasets/experiment-2 |
| 3 | DS-100003 | experiment_3 | CRAWL-Gen | 2300000000000.0 | 0.25 | 5000000000000.0 | SMoE-Phase1 | <NA> |
| 4 | DS-100004 | experiment_4 | SFT | 100000000000.0 | 2.0 | 200000000000.0 | BMoE-Phase2 | https://huggingface.co/datasets/experiment-4 |
| 5 | DS-100005 | experiment_5 | CRAWL-Gen | 850000000000.0 | 0.75 | 1200000000000.0 | SMoE-Phase1 | https://github.com/experiment-5 |
| 6 | DS-0001 | name-1 | general | 10000.0 | 10.0 | 1000.0 | pre-train | |