arxiv:2402.11485
Ryokan Ri
ryo0634
AI & ML interests
Multilingual NLP, Pretrained Language Models, Information Retrieval
Organizations
Models (21)
ryo0634/TinySwallow-1.5B-Math-DPO • Text Generation • 2B • 6
ryo0634/TinySwallow-1.5B-Math-SFT • Text Generation • 2B • 4
ryo0634/Swallow-7b-hf-oasst1-21k-ja-alert-dpo-100-steps-beta-2e-1 • Text Generation • 7B • 2
ryo0634/Swallow-7b-hf-oasst1-21k-ja-alert-dpo-100-steps-beta-1e-1 • Text Generation • 7B • 5
ryo0634/Swallow-7b-hf-oasst1-21k-ja-hh-rlhf-12k-ja-200-steps • Text Generation • 7B • 3
ryo0634/Swallow-7b-hf-oasst1-21k-ja-hh-rlhf-12k-ja-safety-150-steps • Text Generation • 7B • 3
ryo0634/Swallow-7b-hf-oasst1-21k-ja-hh-rlhf-12k-ja-100-steps • Text Generation • 7B • 4
ryo0634/Swallow-7b-hf-oasst1-21k-ja-aio-retriever-200-steps • Text Generation • 7B • 5
ryo0634/Swallow-7b-hf-oasst1-21k-ja-hh-rlhf-12k-ja • Text Generation • 7B • 5
ryo0634/Swallow-7b-plus-hf-oasst1-21k-ja • Text Generation • 7B • 5
Datasets (22)
ryo0634/gsm8k-ja-noisy-dpo-on-policy-4 • Viewer • 890 • 22
ryo0634/gsm8k-ja-noisy-dpo-on-policy-3 • Viewer • 900 • 39
ryo0634/gsm8k-ja-noisy-dpo-on-policy • Viewer • 706 • 33
ryo0634/gsm8k-ja-noisy-dpo-on-policy-2 • Viewer • 1.07k • 30
ryo0634/gsm8k-ja-noisy-dpo • Viewer • 1k • 26
ryo0634/gsm8k-ja-noisy-sft • Viewer • 1k • 40
ryo0634/gsm8k-ja-filtered-dev • Viewer • 400 • 22
ryo0634/gsm8k-ja-filtered-sft • Viewer • 3k • 27
ryo0634/math-short-thought-filtered • Viewer • 757 • 14
ryo0634/math-thought-filtered • Viewer • 923 • 10