Business Operations · EP7
📋 Tax & Accounting
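Shell: industry domain
Tax and accounting (individuals and businesses combined)
Content: AI capabilities measured
- Korean tax law accuracy (VAT, comprehensive income tax, corporate tax, year-end tax settlement)
- Multi-step calculation reasoning
- Accounting treatment and financial statement structuring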
Grader: editor · max_tokens 32768 · temp 0.7 · attempts 3 · reasoning_effort medium
| Rank | Model | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| 1 | GPT-5.5 | 5/5 | 84 | 92 | 92 | 100 | 100 | 94.8 |
| 2 | Claude Opus 4.8 | 5/5 | 84 | 92 | 87 | 100 | 100 | 93.8 |
| 3 | Gemini 3.1 Pro | 5/5 | 80 | 96 | 80 | 100 | 100 | 92.6 |
| 4 | Claude Sonnet 4.6 | 5/5 | 80 | 84 | 80 | 88 | 96 | 86.4 |
| 5 | Gemini 3.5 Flash | 5/5 | 80 | 80 | 80 | 88 | 92 | 85.2 |
| 6 | Gemma 4 26B A4B | 5/5 | 80 | 80 | 80 | 80 | 92 | 82.4 |
| 7 | Gemma 4 12B | 5/5 | 79 | 82 | 81 | 78 | 85 | 80.4 |
| 8 | GPT-5.4 Mini | 5/5 | 76 | 80 | 80 | 80 | 80 | 79.2 |
| 9 | Gemini 3.1 Flash Lite | 5/5 | 76 | 80 | 76 | 76 | 84 | 78.0 |
| 10 | MiniMax M3 | 5/5 | 62 | 88 | 80 | 72 | 92 | 76.8 |
| 11 | Gemma 4 31B | 5/5 | 73 | 80 | 78 | 72 | 80 | 75.6 |
| 12 | | 5/5 | 80 | 76 | 72 | 72 | 68 | 73.2 |
| 13 | Qwen 3.7 Plus | 5/5 | 68 | 80 | 76 | 68 | 80 | 72.8 |
| 14 | DeepSeek V4 Pro | 5/5 | 72 | 80 | 76 | 60 | 88 | 72.4 |
| 15 | Mimo V2.5 Pro | 5/5 | 64 | 84 | 68 | 60 | 100 | 72.4 |
| 16 | Step 3.7 Flash | 5/5 | 66 | 82 | 78 | 63 | 87 | 72.4 |
| 17 | Qwen 3.7 Max | 5/5 | 68 | 80 | 76 | 68 | 76 | 72.0 |
| 18 | DeepSeek V4 Flash | 5/5 | 72 | 80 | 68 | 60 | 84 | 70.4 |
| 19 | Nemotron 3 Ultra 550B | 5/5 | 52 | 76 | 64 | 60 | 96 | 67.8 |
| 20 | HyperCLOVAX SEED Think 32B | 5/5 | 52 | 76 | 52 | 68 | 80 | 65.6 |
| 21 | | 5/5 | 56 | 72 | 56 | 56 | 80 | 62.4 |
| 22 | EXAONE 4.5 33B | 5/5 | 40 | 72 | 44 | 56 | 80 | 57.4 |
| 23 | Qwen 3.5 9B | 5/5 | 47 | 75 | 53 | 45 | 77 | 56.0 |
| 24 | Qwen 3.6 27B | 5/5 | 48 | 80 | 60 | 36 | 80 | 55.2 |
| 25 | Kimi K2.6 | 5/5 | 44 | 64 | 44 | 48 | 80 | 54.6 |
| 26 | Solar Pro 3 | 5/5 | 40 | 64 | 40 | 40 | 76 | 49.6 |
| 27 | Qwen 3.6 35B A3B | 5/5 | 36 | 52 | 48 | 36 | 80 | 48.2 |
| 28 | Mistral Small 4 | 5/5 | 32 | 52 | 36 | 28 | 72 | 41.2 |
| 29 | Kanana 2 30B-A3B Thinking | 5/5 | 40 | 52 | 44 | 28 | 56 | 40.8 |
| 30 | Gemma 4 E2B | 5/5 | 37 | 48 | 35 | 31 | 53 | 39.0 |
| 31 | HyperCLOVAX SEED 1.5B | 5/5 | 18 | 39 | 18 | 13 | 36 | 21.8 |
| 32 | LFM2.5 8B-A1B | 5/5 | 13 | 32 | 15 | 9 | 32 | 17.6 |
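
The page does not say how the last column is derived from the five score columns. As a rough check, the sketch below assumes it is a fixed weighted sum. The weights (0.20, 0.10, 0.15, 0.35, 0.20) are an inference obtained by fitting the published rows, not something the source states, and the names `INFERRED_WEIGHTS` and `weighted_total` are hypothetical.

```python
# ASSUMPTION: these weights were inferred by fitting the published rows;
# the leaderboard itself does not state them.
INFERRED_WEIGHTS = (0.20, 0.10, 0.15, 0.35, 0.20)


def weighted_total(scores, weights=INFERRED_WEIGHTS):
    """Combine five column scores into one overall score, rounded to 1 decimal."""
    if len(scores) != len(weights):
        raise ValueError("expected one score per weight")
    return round(sum(s * w for s, w in zip(scores, weights)), 1)


# (five column scores, published overall score), copied from leaderboard rows
ROWS = {
    "Gemini 3.1 Pro": ((80, 96, 80, 100, 100), 92.6),
    "Gemma 4 26B A4B": ((80, 80, 80, 80, 92), 82.4),
    "DeepSeek V4 Flash": ((72, 80, 68, 60, 84), 70.4),
    "Mistral Small 4": ((32, 52, 36, 28, 72), 41.2),
}

for model, (scores, published) in ROWS.items():
    # Print the recomputed score next to the published one for comparison.
    print(f"{model}: recomputed {weighted_total(scores)}, published {published}")
```

Rows whose column scores are themselves rounded averages (for example Gemma 4 12B) may differ from the published figure by about 0.1 under this assumption.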
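
| Model | Vendor | | | | | | |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 80 | 80 | 80 | 100 | 100 | 91 |
| Claude Opus 4.8 | Anthropic | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemini 3.1 Pro | Google | 80 | 100 | 80 | 100 | 100 | 93 |
| Claude Sonnet 4.6 | Anthropic | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemma 4 12B | Google | 80 | 80 | 82 | 80 | 84 | 81 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| MiniMax M3 | Minimax | 70 | 80 | 80 | 60 | 80 | 71 |
| Gemma 4 31B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Qwen 3.7 Plus | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| DeepSeek V4 Pro | DeepSeek | 80 | 80 | 80 | 60 | 80 | 73 |
| Mimo V2.5 Pro | Xiaomi | 80 | 80 | 80 | 80 | 100 | 84 |
| Step 3.7 Flash | StepFun | 72 | 80 | 78 | 68 | 85 | 75 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| DeepSeek V4 Flash | DeepSeek | 80 | 80 | 80 | 60 | 80 | 73 |
| Nemotron 3 Ultra 550B | NVIDIA | 60 | 80 | 80 | 80 | 100 | 80 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 60 | 80 | 80 | 73 |
| EXAONE 4.5 33B | LG AI | 60 | 80 | 60 | 80 | 80 | 73 |
| Qwen 3.5 9B | Alibaba | 58 | 80 | 65 | 54 | 75 | 63 |
| Qwen 3.6 27B | Alibaba | 40 | 80 | 60 | 40 | 80 | 55 |
| Kimi K2.6 | Moonshot | 80 | 80 | 80 | 100 | 80 | 87 |
| Solar Pro 3 | Upstage | 40 | 80 | 60 | 60 | 100 | 66 |
| Qwen 3.6 35B A3B | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| Mistral Small 4 | Mistral | 40 | 60 | 60 | 40 | 80 | 53 |
| Kanana 2 30B-A3B Thinking | Kakao | 60 | 40 | 60 | 60 | 40 | 54 |
| Gemma 4 E2B | Google | 41 | 50 | 36 | 35 | 56 | 42 |
| HyperCLOVAX SEED 1.5B | Naver | 18 | 40 | 18 | 11 | 35 | 21 |
| LFM2.5 8B-A1B | Liquid AI | 11 | 33 | 13 | 7 | 30 | 16 |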
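
| Model | Vendor | | | | | | |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 80 | 100 | 100 | 100 | 100 | 96 |
| Claude Opus 4.8 | Anthropic | 80 | 100 | 90 | 100 | 100 | 94 |
| Gemini 3.1 Pro | Google | 80 | 100 | 80 | 100 | 100 | 93 |
| Claude Sonnet 4.6 | Anthropic | 100 | 80 | 80 | 100 | 100 | 95 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| Gemma 4 12B | Google | 76 | 82 | 80 | 74 | 86 | 78 |
| GPT-5.4 Mini | OpenAI | 60 | 80 | 80 | 60 | 80 | 69 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| MiniMax M3 | Minimax | 60 | 80 | 80 | 60 | 100 | 73 |
| Gemma 4 31B | Google | 60 | 80 | 80 | 60 | 80 | 69 |
| Qwen 3.7 Plus | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| DeepSeek V4 Pro | DeepSeek | 60 | 80 | 80 | 60 | 100 | 73 |
| Mimo V2.5 Pro | Xiaomi | 40 | 80 | 40 | 40 | 100 | 56 |
| Step 3.7 Flash | StepFun | 58 | 80 | 72 | 55 | 85 | 67 |
| Qwen 3.7 Max | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| DeepSeek V4 Flash | DeepSeek | 100 | 100 | 80 | 100 | 100 | 97 |
| Nemotron 3 Ultra 550B | NVIDIA | 40 | 80 | 60 | 60 | 100 | 66 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 60 | 60 | 60 | 80 | 64 |
| EXAONE 4.5 33B | LG AI | 60 | 60 | 60 | 60 | 80 | 64 |
| Qwen 3.5 9B | Alibaba | 68 | 85 | 70 | 62 | 85 | 71 |
| Qwen 3.6 27B | Alibaba | 40 | 80 | 60 | 40 | 80 | 55 |
| Kimi K2.6 | Moonshot | 40 | 60 | 40 | 40 | 80 | 50 |
| Solar Pro 3 | Upstage | 40 | 60 | 40 | 20 | 60 | 39 |
| Qwen 3.6 35B A3B | Alibaba | 60 | 60 | 40 | 60 | 80 | 61 |
| Mistral Small 4 | Mistral | 20 | 40 | 20 | 20 | 80 | 34 |
| Kanana 2 30B-A3B Thinking | Kakao | 40 | 60 | 40 | 20 | 60 | 39 |
| Gemma 4 E2B | Google | 32 | 46 | 32 | 26 | 51 | 35 |
| HyperCLOVAX SEED 1.5B | Naver | 16 | 38 | 16 | 11 | 34 | 20 |
| LFM2.5 8B-A1B | Liquid AI | 9 | 29 | 11 | 6 | 29 | 14 |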
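
| Model | Vendor | | | | | | |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 80 | 100 | 100 | 100 | 100 | 96 |
| Claude Opus 4.8 | Anthropic | 80 | 100 | 90 | 100 | 100 | 94 |
| Gemini 3.1 Pro | Google | 80 | 100 | 80 | 100 | 100 | 93 |
| Claude Sonnet 4.6 | Anthropic | 60 | 100 | 80 | 80 | 100 | 82 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemma 4 12B | Google | 80 | 82 | 80 | 78 | 84 | 80 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| MiniMax M3 | Minimax | 60 | 100 | 80 | 80 | 100 | 82 |
| Gemma 4 31B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Qwen 3.7 Plus | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| DeepSeek V4 Pro | DeepSeek | 60 | 80 | 60 | 40 | 100 | 63 |
| Mimo V2.5 Pro | Xiaomi | 40 | 80 | 60 | 40 | 100 | 59 |
| Step 3.7 Flash | StepFun | 62 | 85 | 78 | 60 | 88 | 71 |
| Qwen 3.7 Max | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| DeepSeek V4 Flash | DeepSeek | 40 | 60 | 40 | 40 | 80 | 50 |
| Nemotron 3 Ultra 550B | NVIDIA | 40 | 60 | 40 | 40 | 100 | 54 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 60 | 80 | 80 | 73 |
| EXAONE 4.5 33B | LG AI | 20 | 60 | 20 | 40 | 80 | 43 |
| Qwen 3.5 9B | Alibaba | 38 | 68 | 40 | 38 | 78 | 49 |
| Qwen 3.6 27B | Alibaba | 60 | 80 | 60 | 40 | 80 | 59 |
| Kimi K2.6 | Moonshot | 20 | 60 | 20 | 20 | 80 | 36 |
| Solar Pro 3 | Upstage | 40 | 60 | 20 | 20 | 60 | 36 |
| Qwen 3.6 35B A3B | Alibaba | 20 | 40 | 40 | 20 | 80 | 37 |
| Mistral Small 4 | Mistral | 20 | 40 | 20 | 20 | 60 | 30 |
| Kanana 2 30B-A3B Thinking | Kakao | 40 | 60 | 40 | 20 | 60 | 39 |
| Gemma 4 E2B | Google | 34 | 47 | 34 | 27 | 50 | 36 |
| HyperCLOVAX SEED 1.5B | Naver | 28 | 47 | 28 | 24 | 45 | 32 |
| LFM2.5 8B-A1B | Liquid AI | 27 | 43 | 28 | 20 | 46 | 30 |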
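
| Model | Vendor | | | | | | |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 80 | 80 | 80 | 100 | 100 | 91 |
| Claude Opus 4.8 | Anthropic | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemini 3.1 Pro | Google | 80 | 80 | 80 | 100 | 100 | 91 |
| Claude Sonnet 4.6 | Anthropic | 100 | 80 | 80 | 100 | 100 | 95 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| Gemma 4 12B | Google | 82 | 84 | 82 | 82 | 84 | 83 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemini 3.1 Flash Lite | Google | 60 | 80 | 60 | 80 | 80 | 73 |
| MiniMax M3 | Minimax | 40 | 80 | 80 | 80 | 80 | 72 |
| Gemma 4 31B | Google | 70 | 80 | 72 | 70 | 80 | 73 |
| Qwen 3.7 Plus | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| DeepSeek V4 Pro | DeepSeek | 80 | 80 | 80 | 80 | 80 | 80 |
| Mimo V2.5 Pro | Xiaomi | 100 | 100 | 80 | 100 | 100 | 97 |
| Step 3.7 Flash | StepFun | 74 | 85 | 80 | 72 | 88 | 78 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| DeepSeek V4 Flash | DeepSeek | 60 | 80 | 60 | 40 | 80 | 59 |
| Nemotron 3 Ultra 550B | NVIDIA | 40 | 80 | 60 | 60 | 100 | 66 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 40 | 60 | 80 | 63 |
| EXAONE 4.5 33B | LG AI | 40 | 80 | 40 | 40 | 80 | 52 |
| Qwen 3.5 9B | Alibaba | 32 | 65 | 40 | 30 | 68 | 43 |
| Qwen 3.6 27B | Alibaba | 60 | 80 | 60 | 40 | 80 | 59 |
| Kimi K2.6 | Moonshot | 40 | 60 | 40 | 40 | 80 | 50 |
| Solar Pro 3 | Upstage | 20 | 40 | 20 | 20 | 60 | 30 |
| Qwen 3.6 35B A3B | Alibaba | 20 | 40 | 40 | 20 | 80 | 37 |
| Mistral Small 4 | Mistral | 20 | 40 | 20 | 20 | 60 | 30 |
| Kanana 2 30B-A3B Thinking | Kakao | 20 | 60 | 40 | 20 | 60 | 35 |
| Gemma 4 E2B | Google | 42 | 52 | 39 | 38 | 56 | 44 |
| HyperCLOVAX SEED 1.5B | Naver | 13 | 34 | 13 | 7 | 29 | 16 |
| LFM2.5 8B-A1B | Liquid AI | 10 | 33 | 12 | 7 | 33 | 16 |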
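
| Model | Vendor | | | | | | |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 100 | 100 | 100 | 100 | 100 | 100 |
| Claude Opus 4.8 | Anthropic | 100 | 98 | 96 | 100 | 98 | 99 |
| Gemini 3.1 Pro | Google | 80 | 100 | 80 | 100 | 100 | 93 |
| Claude Sonnet 4.6 | Anthropic | 60 | 80 | 80 | 60 | 80 | 69 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| Gemma 4 12B | Google | 76 | 84 | 80 | 78 | 86 | 80 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 100 | 80 | 87 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 80 | 60 | 80 | 73 |
| MiniMax M3 | Minimax | 80 | 100 | 80 | 80 | 100 | 86 |
| Gemma 4 31B | Google | 76 | 80 | 80 | 72 | 80 | 76 |
| Qwen 3.7 Plus | Alibaba | 60 | 80 | 60 | 60 | 80 | 66 |
| DeepSeek V4 Pro | DeepSeek | 80 | 80 | 80 | 60 | 80 | 73 |
| Mimo V2.5 Pro | Xiaomi | 60 | 80 | 80 | 40 | 100 | 66 |
| Step 3.7 Flash | StepFun | 62 | 82 | 80 | 60 | 88 | 71 |
| Qwen 3.7 Max | Alibaba | 60 | 80 | 60 | 60 | 60 | 62 |
| DeepSeek V4 Flash | DeepSeek | 80 | 80 | 80 | 60 | 80 | 73 |
| Nemotron 3 Ultra 550B | NVIDIA | 80 | 80 | 80 | 60 | 80 | 73 |
| HyperCLOVAX SEED Think 32B | Naver | 20 | 80 | 40 | 60 | 80 | 55 |
| EXAONE 4.5 33B | LG AI | 20 | 80 | 40 | 60 | 80 | 55 |
| Qwen 3.5 9B | Alibaba | 40 | 75 | 50 | 42 | 80 | 54 |
| Qwen 3.6 27B | Alibaba | 40 | 80 | 60 | 20 | 80 | 48 |
| Kimi K2.6 | Moonshot | 40 | 60 | 40 | 40 | 80 | 50 |
| Solar Pro 3 | Upstage | 60 | 80 | 60 | 80 | 100 | 77 |
| Qwen 3.6 35B A3B | Alibaba | 20 | 40 | 40 | 20 | 80 | 37 |
| Mistral Small 4 | Mistral | 60 | 80 | 60 | 40 | 80 | 59 |
| Kanana 2 30B-A3B Thinking | Kakao | 40 | 40 | 40 | 20 | 60 | 37 |
| Gemma 4 E2B | Google | 36 | 47 | 33 | 30 | 52 | 38 |
| HyperCLOVAX SEED 1.5B | Naver | 16 | 37 | 16 | 11 | 35 | 20 |
| LFM2.5 8B-A1B | Liquid AI | 7 | 24 | 10 | 6 | 24 | 12 |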
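
The five per-model score lists above appear to feed the ranked leaderboard: averaging a model's entries column by column seems to reproduce its leaderboard row. The sketch below checks this for GPT-5.5 only. The function name `aggregate` is hypothetical, and the averaging rule is an assumption the page does not confirm.

```python
from statistics import mean

# GPT-5.5's entry in each of the five lists:
# five column scores followed by that list's total.
GPT_5_5 = [
    (80, 80, 80, 100, 100, 91),
    (80, 100, 100, 100, 100, 96),
    (80, 100, 100, 100, 100, 96),
    (80, 80, 80, 100, 100, 91),
    (100, 100, 100, 100, 100, 100),
]


def aggregate(entries):
    """Average each score column and the totals across all entries.

    ASSUMPTION: the leaderboard uses a plain mean; this is inferred, not stated.
    """
    columns = list(zip(*entries))  # transpose rows into columns
    column_means = [round(mean(col)) for col in columns[:-1]]
    overall = round(mean(columns[-1]), 1)
    return column_means, overall


print(aggregate(GPT_5_5))  # → ([84, 92, 92, 100, 100], 94.8)
```

The result matches the first leaderboard row (84, 92, 92, 100, 100 with an overall score of 94.8), which is consistent with the assumed averaging.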