Ainiketan ainiketan.in

Policy Model

Appears in 1 paper

The language model being trained and improved across rounds.

As used in Paper 24 — rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking →

In rStar-Math, the policy model starts at 42% accuracy and improves to 90% over successive rounds of self-evolution.
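To make "improved across rounds" concrete, here is a toy sketch of a round-based loop: the policy attempts problems, only solved attempts are kept, and the policy is retrained on them before the next round. This is not the rStar-Math implementation. `ToyPolicy`, `attempt`, `train`, `accuracy`, and `self_evolve` are hypothetical names, and the "model" is reduced to a single skill number so the loop can run without any ML library.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPolicy:
    # Hypothetical stand-in for a language model: one number for its ability.
    skill: int
    training_data: list = field(default_factory=list)

    def attempt(self, difficulty: int) -> bool:
        # Assumption: extra search lets the policy reach one level past its skill.
        return difficulty <= self.skill + 1

    def train(self, solved: list) -> None:
        # Toy "training": skill rises to the hardest problem solved so far.
        self.training_data.extend(solved)
        self.skill = max([self.skill, *self.training_data])


def accuracy(policy: ToyPolicy, problems: list) -> float:
    # Fraction of problems the policy solves without the extra search step.
    return sum(d <= policy.skill for d in problems) / len(problems)


def self_evolve(policy: ToyPolicy, problems: list, rounds: int) -> list:
    # Record accuracy before training and after each round.
    history = [accuracy(policy, problems)]
    for _ in range(rounds):
        solved = [d for d in problems if policy.attempt(d)]  # keep solved attempts only
        policy.train(solved)                                 # retrain on them
        history.append(accuracy(policy, problems))
    return history


history = self_evolve(ToyPolicy(skill=2), problems=[1, 2, 3, 4, 5], rounds=3)
print(history)  # [0.4, 0.6, 0.8, 1.0]
```

The numbers here are made up for illustration and are unrelated to the accuracies reported in the paper. The point is only the shape of the loop: each round's policy produces the data that trains the next round's policy.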

Appears in papers

Paper 24 — rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking →
© 2026 Ainiketan · Built for India, for free, forever · Suggest a correction

Content license: CC BY 4.0 · Hosted on Vercel · Privacy-friendly analytics (no cookies)

All summaries are original writing by Ainiketan — we link to sources and do not reproduce copyrighted text. Copyright concerns: askainiketan@gmail.com · Terms & Copyright