AXyNow · AX IS NOW
Business operations · EP7

📋 Tax and accounting

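Shell: industry domain
Tax and accounting (individual and corporate combined)
Content: AI capabilities measured
  • Accuracy on Korean tax law (VAT, comprehensive income tax, corporate tax, year-end tax settlement)
  • Multi-step calculation reasoning
  • Accounting treatment and financial statement structuring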

Overall scores by model

✓ Chatbot, single turn

Measured 2026-06-05T02:40:47+00:00 · 5 items, each scored out of 100

Grader: editor · max_tokens 32768 · temp 0.7 · attempts 3 · reasoning_effort medium

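| Rank | Model | Provider | Items | Accuracy | Intent | Caution | Korean context | Structure | Overall |
|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-5.5 | OpenAI | 5/5 | 84 | 92 | 92 | 100 | 100 | 94.8 |
| 2 | Claude Opus 4.8 | Anthropic | 5/5 | 84 | 92 | 87 | 100 | 100 | 93.8 |
| 3 | Gemini 3.1 Pro | Google | 5/5 | 80 | 96 | 80 | 100 | 100 | 92.6 |
| 4 | Claude Sonnet 4.6 | Anthropic | 5/5 | 80 | 84 | 80 | 88 | 96 | 86.4 |
| 5 | Gemini 3.5 Flash | Google | 5/5 | 80 | 80 | 80 | 88 | 92 | 85.2 |
| 6 | Gemma 4 26B A4B | Google | 5/5 | 80 | 80 | 80 | 80 | 92 | 82.4 |
| 7 | Gemma 4 12B | Google | 5/5 | 79 | 82 | 81 | 78 | 85 | 80.4 |
| 8 | GPT-5.4 Mini | OpenAI | 5/5 | 76 | 80 | 80 | 80 | 80 | 79.2 |
| 9 | Gemini 3.1 Flash Lite | Google | 5/5 | 76 | 80 | 76 | 76 | 84 | 78.0 |
| 10 | MiniMax M3 | MiniMax | 5/5 | 62 | 88 | 80 | 72 | 92 | 76.8 |
| 11 | Gemma 4 31B | Google | 5/5 | 73 | 80 | 78 | 72 | 80 | 75.6 |
| 12 | Grok 4.3 | xAI | 5/5 | 80 | 76 | 72 | 72 | 68 | 73.2 |
| 13 | Qwen 3.7 Plus | Alibaba | 5/5 | 68 | 80 | 76 | 68 | 80 | 72.8 |
| 14 | DeepSeek V4 Pro | DeepSeek | 5/5 | 72 | 80 | 76 | 60 | 88 | 72.4 |
| 15 | Mimo V2.5 Pro | Xiaomi | 5/5 | 64 | 84 | 68 | 60 | 100 | 72.4 |
| 16 | Step 3.7 Flash | StepFun | 5/5 | 66 | 82 | 78 | 63 | 87 | 72.4 |
| 17 | Qwen 3.7 Max | Alibaba | 5/5 | 68 | 80 | 76 | 68 | 76 | 72.0 |
| 18 | DeepSeek V4 Flash | DeepSeek | 5/5 | 72 | 80 | 68 | 60 | 84 | 70.4 |
| 19 | Nemotron 3 Ultra 550B | NVIDIA | 5/5 | 52 | 76 | 64 | 60 | 96 | 67.8 |
| 20 | HyperCLOVAX SEED Think 32B | Naver | 5/5 | 52 | 76 | 52 | 68 | 80 | 65.6 |
| 21 | GLM 5.1 | Z.ai | 5/5 | 56 | 72 | 56 | 56 | 80 | 62.4 |
| 22 | EXAONE 4.5 33B | LG AI | 5/5 | 40 | 72 | 44 | 56 | 80 | 57.4 |
| 23 | Qwen 3.5 9B | Alibaba | 5/5 | 47 | 75 | 53 | 45 | 77 | 56.0 |
| 24 | Qwen 3.6 27B | Alibaba | 5/5 | 48 | 80 | 60 | 36 | 80 | 55.2 |
| 25 | Kimi K2.6 | Moonshot | 5/5 | 44 | 64 | 44 | 48 | 80 | 54.6 |
| 26 | Solar Pro 3 | Upstage | 5/5 | 40 | 64 | 40 | 40 | 76 | 49.6 |
| 27 | Qwen 3.6 35B A3B | Alibaba | 5/5 | 36 | 52 | 48 | 36 | 80 | 48.2 |
| 28 | Mistral Small 4 | Mistral | 5/5 | 32 | 52 | 36 | 28 | 72 | 41.2 |
| 29 | Kanana 2 30B-A3B Thinking | Kakao | 5/5 | 40 | 52 | 44 | 28 | 56 | 40.8 |
| 30 | Gemma 4 E2B | Google | 5/5 | 37 | 48 | 35 | 31 | 53 | 39.0 |
| 31 | HyperCLOVAX SEED 1.5B | Naver | 5/5 | 18 | 39 | 18 | 13 | 36 | 21.8 |
| 32 | LFM2.5 8B-A1B | Liquid AI | 5/5 | 13 | 32 | 15 | 9 | 32 | 17.6 |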
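
The page does not say how the overall row is derived from the per-question scores, but the published numbers appear to follow a simple pattern: each of the five criterion columns looks like the mean of that criterion across the five questions, and the final score looks like the mean of the five per-question avg values. The sketch below checks this reading against Claude Opus 4.8, using the rows listed for it in the per-question tables further down the page. The helper names `criterion_means` and `overall_score` are illustrative, not part of the site.

```python
# Claude Opus 4.8 rows copied from the five per-question tables on this page:
# (accuracy, intent, caution, Korean context, structure, avg)
rows = [
    (80, 80, 80, 100, 100, 91),   # question 1
    (80, 100, 90, 100, 100, 94),  # question 2
    (80, 100, 90, 100, 100, 94),  # question 3
    (80, 80, 80, 100, 100, 91),   # question 4
    (100, 98, 96, 100, 98, 99),   # question 5
]


def criterion_means(rows):
    """Mean of each of the five criterion columns across all questions."""
    return [sum(r[i] for r in rows) / len(rows) for i in range(5)]


def overall_score(rows):
    """Mean of the per-question avg column, rounded to one decimal place.

    Assumption: the site averages the already-rounded per-question avg values.
    """
    return round(sum(r[5] for r in rows) / len(rows), 1)


print([round(m) for m in criterion_means(rows)])  # [84, 92, 87, 100, 100]
print(overall_score(rows))                        # 93.8
```

Both outputs match the Claude Opus 4.8 row in the overall table, so the reading is at least consistent with the published data, though the site may compute the values differently.
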

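Scores by question

5 questions

Detailed per-model scores for each question. The full responses and grading rationale are linked on the right side of each question card.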

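Tax and accounting · Question 1 · Preparing a first-quarter VAT return (general taxpayer) · Public

Preparing a first-quarter VAT return (general taxpayer)

Body · raw · rationale →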
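| Model | Provider | Accuracy | Intent | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 80 | 80 | 80 | 100 | 100 | 91 |
| Claude Opus 4.8 | Anthropic | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemini 3.1 Pro | Google | 80 | 100 | 80 | 100 | 100 | 93 |
| Claude Sonnet 4.6 | Anthropic | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemma 4 12B | Google | 80 | 80 | 82 | 80 | 84 | 81 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| MiniMax M3 | MiniMax | 70 | 80 | 80 | 60 | 80 | 71 |
| Gemma 4 31B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Grok 4.3 | xAI | 80 | 80 | 80 | 60 | 80 | 73 |
| Qwen 3.7 Plus | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| DeepSeek V4 Pro | DeepSeek | 80 | 80 | 80 | 60 | 80 | 73 |
| Mimo V2.5 Pro | Xiaomi | 80 | 80 | 80 | 80 | 100 | 84 |
| Step 3.7 Flash | StepFun | 72 | 80 | 78 | 68 | 85 | 75 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| DeepSeek V4 Flash | DeepSeek | 80 | 80 | 80 | 60 | 80 | 73 |
| Nemotron 3 Ultra 550B | NVIDIA | 60 | 80 | 80 | 80 | 100 | 80 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 60 | 80 | 80 | 73 |
| GLM 5.1 | Z.ai | 60 | 80 | 60 | 60 | 80 | 66 |
| EXAONE 4.5 33B | LG AI | 60 | 80 | 60 | 80 | 80 | 73 |
| Qwen 3.5 9B | Alibaba | 58 | 80 | 65 | 54 | 75 | 63 |
| Qwen 3.6 27B | Alibaba | 40 | 80 | 60 | 40 | 80 | 55 |
| Kimi K2.6 | Moonshot | 80 | 80 | 80 | 100 | 80 | 87 |
| Solar Pro 3 | Upstage | 40 | 80 | 60 | 60 | 100 | 66 |
| Qwen 3.6 35B A3B | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| Mistral Small 4 | Mistral | 40 | 60 | 60 | 40 | 80 | 53 |
| Kanana 2 30B-A3B Thinking | Kakao | 60 | 40 | 60 | 60 | 40 | 54 |
| Gemma 4 E2B | Google | 41 | 50 | 36 | 35 | 56 | 42 |
| HyperCLOVAX SEED 1.5B | Naver | 18 | 40 | 18 | 11 | 35 | 21 |
| LFM2.5 8B-A1B | Liquid AI | 11 | 33 | 13 | 7 | 30 | 16 |
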
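
The avg value in each row is not a plain mean of the five criterion scores: GPT-5.5 scores 80, 80, 80, 100, 100 on question 1, which averages to 88, yet the listed avg is 91. The page does not document the formula. One weighting that reproduces the listed values is 20% accuracy, 10% intent, 15% caution, 35% Korean context, and 20% structure. These weights are an inference from the published rows, not something the site states, and `INFERRED_WEIGHTS` and `weighted_avg` are hypothetical names used only for this sketch.

```python
# Hypothetical weights in percent, inferred by fitting the published rows.
# The page itself does not state how avg is computed.
INFERRED_WEIGHTS = (20, 10, 15, 35, 20)  # accuracy, intent, caution, Korean context, structure


def weighted_avg(scores):
    """Weighted mean of five criterion scores, rounded to an integer.

    Integer weights are used so the sum stays exact before the final division.
    """
    total = sum(w * s for w, s in zip(INFERRED_WEIGHTS, scores))
    return round(total / 100)


# Rows copied from the question 1 table: criterion scores, then the listed avg.
question_1_rows = {
    "GPT-5.5": ((80, 80, 80, 100, 100), 91),
    "Kimi K2.6": ((80, 80, 80, 100, 80), 87),
    "Gemma 4 12B": ((80, 80, 82, 80, 84), 81),
    "LFM2.5 8B-A1B": ((11, 33, 13, 7, 30), 16),
}

for model, (scores, listed) in question_1_rows.items():
    print(model, weighted_avg(scores), listed)
```

For these four rows the computed value equals the listed avg. How the site breaks exact .5 ties is unknown; Python's `round` sends them to the nearest even integer, which is an additional assumption here.
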
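Tax and accounting · Question 2 · Year-end tax settlement: simulating personal deductions and tax credits (salaried employee with one child) · Private

Year-end tax settlement simulation

Body · raw · rationale →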
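| Model | Provider | Accuracy | Intent | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 80 | 100 | 100 | 100 | 100 | 96 |
| Claude Opus 4.8 | Anthropic | 80 | 100 | 90 | 100 | 100 | 94 |
| Gemini 3.1 Pro | Google | 80 | 100 | 80 | 100 | 100 | 93 |
| Claude Sonnet 4.6 | Anthropic | 100 | 80 | 80 | 100 | 100 | 95 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| Gemma 4 12B | Google | 76 | 82 | 80 | 74 | 86 | 78 |
| GPT-5.4 Mini | OpenAI | 60 | 80 | 80 | 60 | 80 | 69 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| MiniMax M3 | MiniMax | 60 | 80 | 80 | 60 | 100 | 73 |
| Gemma 4 31B | Google | 60 | 80 | 80 | 60 | 80 | 69 |
| Grok 4.3 | xAI | 80 | 80 | 80 | 80 | 80 | 80 |
| Qwen 3.7 Plus | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| DeepSeek V4 Pro | DeepSeek | 60 | 80 | 80 | 60 | 100 | 73 |
| Mimo V2.5 Pro | Xiaomi | 40 | 80 | 40 | 40 | 100 | 56 |
| Step 3.7 Flash | StepFun | 58 | 80 | 72 | 55 | 85 | 67 |
| Qwen 3.7 Max | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| DeepSeek V4 Flash | DeepSeek | 100 | 100 | 80 | 100 | 100 | 97 |
| Nemotron 3 Ultra 550B | NVIDIA | 40 | 80 | 60 | 60 | 100 | 66 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 60 | 60 | 60 | 80 | 64 |
| GLM 5.1 | Z.ai | 40 | 60 | 40 | 40 | 80 | 50 |
| EXAONE 4.5 33B | LG AI | 60 | 60 | 60 | 60 | 80 | 64 |
| Qwen 3.5 9B | Alibaba | 68 | 85 | 70 | 62 | 85 | 71 |
| Qwen 3.6 27B | Alibaba | 40 | 80 | 60 | 40 | 80 | 55 |
| Kimi K2.6 | Moonshot | 40 | 60 | 40 | 40 | 80 | 50 |
| Solar Pro 3 | Upstage | 40 | 60 | 40 | 20 | 60 | 39 |
| Qwen 3.6 35B A3B | Alibaba | 60 | 60 | 40 | 60 | 80 | 61 |
| Mistral Small 4 | Mistral | 20 | 40 | 20 | 20 | 80 | 34 |
| Kanana 2 30B-A3B Thinking | Kakao | 40 | 60 | 40 | 20 | 60 | 39 |
| Gemma 4 E2B | Google | 32 | 46 | 32 | 26 | 51 | 35 |
| HyperCLOVAX SEED 1.5B | Naver | 16 | 38 | 16 | 11 | 34 | 20 |
| LFM2.5 8B-A1B | Liquid AI | 9 | 29 | 11 | 6 | 29 | 14 |
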
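Tax and accounting · Question 3 · Comprehensive income tax: choosing a filing method (freelancer) · Private

Choosing a comprehensive income tax filing method

Body · raw · rationale →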
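| Model | Provider | Accuracy | Intent | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 80 | 100 | 100 | 100 | 100 | 96 |
| Claude Opus 4.8 | Anthropic | 80 | 100 | 90 | 100 | 100 | 94 |
| Gemini 3.1 Pro | Google | 80 | 100 | 80 | 100 | 100 | 93 |
| Claude Sonnet 4.6 | Anthropic | 60 | 100 | 80 | 80 | 100 | 82 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemma 4 12B | Google | 80 | 82 | 80 | 78 | 84 | 80 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| MiniMax M3 | MiniMax | 60 | 100 | 80 | 80 | 100 | 82 |
| Gemma 4 31B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Grok 4.3 | xAI | 80 | 80 | 60 | 80 | 80 | 77 |
| Qwen 3.7 Plus | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| DeepSeek V4 Pro | DeepSeek | 60 | 80 | 60 | 40 | 100 | 63 |
| Mimo V2.5 Pro | Xiaomi | 40 | 80 | 60 | 40 | 100 | 59 |
| Step 3.7 Flash | StepFun | 62 | 85 | 78 | 60 | 88 | 71 |
| Qwen 3.7 Max | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| DeepSeek V4 Flash | DeepSeek | 40 | 60 | 40 | 40 | 80 | 50 |
| Nemotron 3 Ultra 550B | NVIDIA | 40 | 60 | 40 | 40 | 100 | 54 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 60 | 80 | 80 | 73 |
| GLM 5.1 | Z.ai | 40 | 60 | 40 | 40 | 80 | 50 |
| EXAONE 4.5 33B | LG AI | 20 | 60 | 20 | 40 | 80 | 43 |
| Qwen 3.5 9B | Alibaba | 38 | 68 | 40 | 38 | 78 | 49 |
| Qwen 3.6 27B | Alibaba | 60 | 80 | 60 | 40 | 80 | 59 |
| Kimi K2.6 | Moonshot | 20 | 60 | 20 | 20 | 80 | 36 |
| Solar Pro 3 | Upstage | 40 | 60 | 20 | 20 | 60 | 36 |
| Qwen 3.6 35B A3B | Alibaba | 20 | 40 | 40 | 20 | 80 | 37 |
| Mistral Small 4 | Mistral | 20 | 40 | 20 | 20 | 60 | 30 |
| Kanana 2 30B-A3B Thinking | Kakao | 40 | 60 | 40 | 20 | 60 | 39 |
| Gemma 4 E2B | Google | 34 | 47 | 34 | 27 | 50 | 36 |
| HyperCLOVAX SEED 1.5B | Naver | 28 | 47 | 28 | 24 | 45 | 32 |
| LFM2.5 8B-A1B | Liquid AI | 27 | 43 | 28 | 20 | 46 | 30 |
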
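Tax and accounting · Question 4 · Corporate tax: from operating income to the tax base (five tax adjustment items) · Private

Corporate tax adjustments

Body · raw · rationale →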
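| Model | Provider | Accuracy | Intent | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 80 | 80 | 80 | 100 | 100 | 91 |
| Claude Opus 4.8 | Anthropic | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemini 3.1 Pro | Google | 80 | 80 | 80 | 100 | 100 | 91 |
| Claude Sonnet 4.6 | Anthropic | 100 | 80 | 80 | 100 | 100 | 95 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| Gemma 4 12B | Google | 82 | 84 | 82 | 82 | 84 | 83 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemini 3.1 Flash Lite | Google | 60 | 80 | 60 | 80 | 80 | 73 |
| MiniMax M3 | MiniMax | 40 | 80 | 80 | 80 | 80 | 72 |
| Gemma 4 31B | Google | 70 | 80 | 72 | 70 | 80 | 73 |
| Grok 4.3 | xAI | 80 | 80 | 80 | 80 | 60 | 76 |
| Qwen 3.7 Plus | Alibaba | 60 | 80 | 80 | 60 | 80 | 69 |
| DeepSeek V4 Pro | DeepSeek | 80 | 80 | 80 | 80 | 80 | 80 |
| Mimo V2.5 Pro | Xiaomi | 100 | 100 | 80 | 100 | 100 | 97 |
| Step 3.7 Flash | StepFun | 74 | 85 | 80 | 72 | 88 | 78 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| DeepSeek V4 Flash | DeepSeek | 60 | 80 | 60 | 40 | 80 | 59 |
| Nemotron 3 Ultra 550B | NVIDIA | 40 | 80 | 60 | 60 | 100 | 66 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 40 | 60 | 80 | 63 |
| GLM 5.1 | Z.ai | 80 | 80 | 80 | 80 | 80 | 80 |
| EXAONE 4.5 33B | LG AI | 40 | 80 | 40 | 40 | 80 | 52 |
| Qwen 3.5 9B | Alibaba | 32 | 65 | 40 | 30 | 68 | 43 |
| Qwen 3.6 27B | Alibaba | 60 | 80 | 60 | 40 | 80 | 59 |
| Kimi K2.6 | Moonshot | 40 | 60 | 40 | 40 | 80 | 50 |
| Solar Pro 3 | Upstage | 20 | 40 | 20 | 20 | 60 | 30 |
| Qwen 3.6 35B A3B | Alibaba | 20 | 40 | 40 | 20 | 80 | 37 |
| Mistral Small 4 | Mistral | 20 | 40 | 20 | 20 | 60 | 30 |
| Kanana 2 30B-A3B Thinking | Kakao | 20 | 60 | 40 | 20 | 60 | 35 |
| Gemma 4 E2B | Google | 42 | 52 | 39 | 38 | 56 | 44 |
| HyperCLOVAX SEED 1.5B | Naver | 13 | 34 | 13 | 7 | 29 | 16 |
| LFM2.5 8B-A1B | Liquid AI | 10 | 33 | 12 | 7 | 33 | 16 |
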
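Tax and accounting · Question 5 · Apportioned penalty tax calculation: 40% on amounts concealed in borrowed-name accounts + 10% on simple omissions + reduction tiers for amended returns · Private

Apportioned penalty tax for fraudulent conduct

Body · raw · rationale →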
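| Model | Provider | Accuracy | Intent | Caution | Korean context | Structure | avg |
|---|---|---|---|---|---|---|---|
| GPT-5.5 | OpenAI | 100 | 100 | 100 | 100 | 100 | 100 |
| Claude Opus 4.8 | Anthropic | 100 | 98 | 96 | 100 | 98 | 99 |
| Gemini 3.1 Pro | Google | 80 | 100 | 80 | 100 | 100 | 93 |
| Claude Sonnet 4.6 | Anthropic | 60 | 80 | 80 | 60 | 80 | 69 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 100 | 100 | 91 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 80 | 100 | 84 |
| Gemma 4 12B | Google | 76 | 84 | 80 | 78 | 86 | 80 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 100 | 80 | 87 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 80 | 60 | 80 | 73 |
| MiniMax M3 | MiniMax | 80 | 100 | 80 | 80 | 100 | 86 |
| Gemma 4 31B | Google | 76 | 80 | 80 | 72 | 80 | 76 |
| Grok 4.3 | xAI | 80 | 60 | 60 | 60 | 40 | 60 |
| Qwen 3.7 Plus | Alibaba | 60 | 80 | 60 | 60 | 80 | 66 |
| DeepSeek V4 Pro | DeepSeek | 80 | 80 | 80 | 60 | 80 | 73 |
| Mimo V2.5 Pro | Xiaomi | 60 | 80 | 80 | 40 | 100 | 66 |
| Step 3.7 Flash | StepFun | 62 | 82 | 80 | 60 | 88 | 71 |
| Qwen 3.7 Max | Alibaba | 60 | 80 | 60 | 60 | 60 | 62 |
| DeepSeek V4 Flash | DeepSeek | 80 | 80 | 80 | 60 | 80 | 73 |
| Nemotron 3 Ultra 550B | NVIDIA | 80 | 80 | 80 | 60 | 80 | 73 |
| HyperCLOVAX SEED Think 32B | Naver | 20 | 80 | 40 | 60 | 80 | 55 |
| GLM 5.1 | Z.ai | 60 | 80 | 60 | 60 | 80 | 66 |
| EXAONE 4.5 33B | LG AI | 20 | 80 | 40 | 60 | 80 | 55 |
| Qwen 3.5 9B | Alibaba | 40 | 75 | 50 | 42 | 80 | 54 |
| Qwen 3.6 27B | Alibaba | 40 | 80 | 60 | 20 | 80 | 48 |
| Kimi K2.6 | Moonshot | 40 | 60 | 40 | 40 | 80 | 50 |
| Solar Pro 3 | Upstage | 60 | 80 | 60 | 80 | 100 | 77 |
| Qwen 3.6 35B A3B | Alibaba | 20 | 40 | 40 | 20 | 80 | 37 |
| Mistral Small 4 | Mistral | 60 | 80 | 60 | 40 | 80 | 59 |
| Kanana 2 30B-A3B Thinking | Kakao | 40 | 40 | 40 | 20 | 60 | 37 |
| Gemma 4 E2B | Google | 36 | 47 | 33 | 30 | 52 | 38 |
| HyperCLOVAX SEED 1.5B | Naver | 16 | 37 | 16 | 11 | 35 | 20 |
| LFM2.5 8B-A1B | Liquid AI | 7 | 24 | 10 | 6 | 24 | 12 |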