AXyNow · AX IS NOW
Technical output

📑 Document output

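Surface: industry domain
Documents, slides, charts, tables
Substance: AI capabilities measured
  • Information structuring (hierarchy, prioritization)
  • Visual diagram code (PlantUML, Mermaid, SVG)
  • Table and structured-data conversion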

Overall score by model

✓ Chatbot, single turn

Measured 2026-06-05T02:40:49+00:00 · 5 items × 100-point scale

Grader: editor · max_tokens 32768 · temp 0.7 · attempts 3 · reasoning_effort medium

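| Rank | Model | Vendor | Completed | Accuracy | Intent understanding | Caution | Korean context | Structure | Overall |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.8 | Anthropic | 5/5 | 99 | 100 | 90 | 84 | 99 | 96.8 |
| 2 | MiniMax M3 | MiniMax | 5/5 | 94 | 93 | 92 | 79 | 95 | 92.2 |
| 3 | Gemini 3.1 Pro | Google | 5/5 | 96 | 92 | 84 | 80 | 92 | 90.8 |
| 4 | Claude Sonnet 4.6 | Anthropic | 5/5 | 84 | 84 | 84 | 80 | 96 | 87.8 |
| 5 | GPT-5.5 | OpenAI | 5/5 | 92 | 88 | 80 | 76 | 88 | 86.8 |
| 6 | Qwen 3.7 Plus | Alibaba | 5/5 | 84 | 82 | 82 | 78 | 85 | 83.0 |
| 7 | Mimo V2.5 Pro | Xiaomi | 5/5 | 84 | 83 | 84 | 76 | 84 | 82.8 |
| 8 | GLM 5.1 | Z.ai | 5/5 | 80 | 84 | 84 | 80 | 80 | 81.4 |
| 9 | Gemini 3.1 Flash Lite | Google | 5/5 | 83 | 82 | 83 | 75 | 80 | 81.0 |
| 10 | Gemini 3.5 Flash | Google | 5/5 | 80 | 76 | 84 | 80 | 84 | 80.8 |
| 11 | Nemotron 3 Ultra 550B | NVIDIA | 5/5 | 85 | 77 | 72 | 75 | 85 | 80.6 |
| 12 | Gemma 4 31B | Google | 5/5 | 80 | 80 | 84 | 80 | 80 | 80.4 |
| 13 | Step 3.7 Flash | StepFun | 5/5 | 73 | 88 | 81 | 70 | 79 | 79.6 |
| 14 | Gemma 4 26B A4B | Google | 5/5 | 82 | 80 | 78 | 75 | 79 | 79.4 |
| 15 | Qwen 3.6 27B | Alibaba | 5/5 | 80 | 79 | 80 | 75 | 79 | 79.0 |
| 16 | Gemma 4 12B | Google | 5/5 | 81 | 80 | 75 | 62 | 83 | 79.0 |
| 17 | Qwen 3.7 Max | Alibaba | 5/5 | 80 | 80 | 76 | 80 | 76 | 78.2 |
| 18 | Qwen 3.6 35B A3B | Alibaba | 5/5 | 76 | 81 | 70 | 72 | 80 | 77.8 |
| 19 | Kimi K2.6 | Moonshot AI | 5/5 | 80 | 72 | 80 | 76 | 76 | 76.2 |
| 20 | GPT-5.4 Mini | OpenAI | 5/5 | 80 | 76 | 72 | 80 | 72 | 75.4 |
| 21 | EXAONE 4.5 33B | LG AI | 5/5 | 70 | 80 | 67 | 50 | 83 | 75.0 |
| 22 | DeepSeek V4 Pro | DeepSeek | 5/5 | 74 | 76 | 67 | 74 | 72 | 73.0 |
| 23 | Grok 4.3 | xAI | 5/5 | 77 | 73 | 71 | 75 | 71 | 73.0 |
| 24 | Mistral Small 4 | Mistral AI | 5/5 | 73 | 75 | 67 | 74 | 73 | 72.8 |
| 25 | DeepSeek V4 Flash | DeepSeek | 5/5 | 73 | 72 | 66 | 63 | 74 | 71.4 |
| 26 | Qwen 3.5 9B | Alibaba | 5/5 | 68 | 69 | 58 | 71 | 72 | 68.8 |
| 27 | Solar Pro 3 | Upstage | 5/5 | 63 | 74 | 57 | 50 | 72 | 66.8 |
| 28 | HyperCLOVAX SEED Think 32B | Naver | 5/5 | 64 | 64 | 48 | 60 | 68 | 63.4 |
| 29 | Kanana 2 30B-A3B Thinking | Kakao | 5/5 | 54 | 64 | 48 | 42 | 60 | 56.8 |
| 30 | Gemma 4 E2B | Google | 5/5 | 47 | 52 | 44 | 44 | 49 | 48.4 |
| 31 | LFM2.5 8B-A1B | Liquid AI | 5/5 | 35 | 44 | 34 | 35 | 37 | 38.0 |
| 32 | HyperCLOVAX SEED 1.5B | Naver | 5/5 | 32 | 42 | 31 | 32 | 35 | 35.4 |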
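
The page does not state how the overall row is derived. The figures are consistent with a simple rule: the overall score is the unweighted mean of the five per-question avg values, and each criterion column is the mean of that criterion across the five questions, rounded to an integer. The Python sketch below checks that assumption for two models, using scores copied from the per-question tables on this page. The names `PER_QUESTION`, `overall_score`, and `criterion_means` are hypothetical and exist only for this illustration.

```python
from statistics import mean

# Scores copied from the per-question tables on this page (Questions 1 to 5).
# Each row: (accuracy, intent, caution, korean_context, structure, avg).
PER_QUESTION = {
    "Claude Opus 4.8": [
        (100, 100, 80, 80, 100, 96),
        (100, 100, 100, 80, 100, 98),
        (100, 100, 80, 100, 100, 98),
        (100, 100, 100, 80, 100, 98),
        (95, 100, 90, 80, 95, 94),
    ],
    "MiniMax M3": [
        (95, 92, 90, 80, 95, 92),
        (92, 98, 92, 75, 95, 93),
        (96, 92, 90, 80, 95, 92),
        (92, 92, 98, 78, 92, 91),
        (95, 92, 88, 82, 96, 93),
    ],
}


def overall_score(rows):
    """Assumed rule: unweighted mean of the per-question avg, one decimal."""
    return round(mean(r[5] for r in rows), 1)


def criterion_means(rows):
    """Assumed rule: mean of each criterion over the questions, as an integer."""
    return [round(mean(r[i] for r in rows)) for i in range(5)]


for model, rows in PER_QUESTION.items():
    print(model, criterion_means(rows), overall_score(rows))
# Claude Opus 4.8 [99, 100, 90, 84, 99] 96.8
# MiniMax M3 [94, 93, 92, 79, 95] 92.2
```

The per-question avg itself is not a plain mean of the five criteria (Claude Opus 4.8 on Question 1 has criteria averaging 92 but an avg of 96), so whatever weighting is applied at that level cannot be recovered from this page.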

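Scores by question

5 questions

Detailed per-model scores for each question. The original responses and grading rationale are linked at the right of each question card.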

Document output · Question 1: Seed IR traction slide, single slide, precise constraints (public)
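
| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | Avg |
|---|---|---|---|---|---|---|---|
| Claude Opus 4.8 | Anthropic | 100 | 100 | 80 | 80 | 100 | 96 |
| MiniMax M3 | MiniMax | 95 | 92 | 90 | 80 | 95 | 92 |
| Gemini 3.1 Pro | Google | 100 | 80 | 80 | 80 | 80 | 84 |
| Claude Sonnet 4.6 | Anthropic | 80 | 80 | 80 | 80 | 80 | 80 |
| GPT-5.5 | OpenAI | 80 | 60 | 60 | 60 | 40 | 57 |
| Qwen 3.7 Plus | Alibaba | 82 | 78 | 80 | 78 | 84 | 81 |
| Mimo V2.5 Pro | Xiaomi | 80 | 85 | 80 | 75 | 80 | 81 |
| GLM 5.1 | Z.ai | 80 | 100 | 80 | 80 | 80 | 85 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 80 | 75 | 80 | 80 |
| Gemini 3.5 Flash | Google | 80 | 60 | 80 | 80 | 80 | 75 |
| Nemotron 3 Ultra 550B | NVIDIA | 84 | 70 | 78 | 74 | 72 | 75 |
| Gemma 4 31B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Step 3.7 Flash | StepFun | 85 | 85 | 80 | 70 | 82 | 82 |
| Gemma 4 26B A4B | Google | 80 | 80 | 80 | 75 | 80 | 80 |
| Qwen 3.6 27B | Alibaba | 75 | 70 | 75 | 75 | 75 | 74 |
| Gemma 4 12B | Google | 78 | 80 | 80 | 62 | 84 | 79 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| Qwen 3.6 35B A3B | Alibaba | 80 | 80 | 75 | 75 | 80 | 79 |
| Kimi K2.6 | Moonshot | 80 | 40 | 60 | 60 | 60 | 59 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| EXAONE 4.5 33B | LG AI | 70 | 80 | 60 | 50 | 85 | 75 |
| DeepSeek V4 Pro | DeepSeek | 75 | 70 | 75 | 75 | 60 | 68 |
| Grok 4.3 | xAI | 80 | 75 | 75 | 75 | 75 | 76 |
| Mistral Small 4 | Mistral | 75 | 70 | 75 | 75 | 70 | 72 |
| DeepSeek V4 Flash | DeepSeek | 80 | 80 | 80 | 75 | 80 | 80 |
| Qwen 3.5 9B | Alibaba | 78 | 65 | 70 | 72 | 55 | 65 |
| Solar Pro 3 | Upstage | 55 | 65 | 50 | 50 | 55 | 56 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 40 | 40 | 60 | 60 | 53 |
| Kanana 2 30B-A3B Thinking | Kakao | 50 | 55 | 45 | 30 | 60 | 52 |
| Gemma 4 E2B | Google | 53 | 57 | 49 | 49 | 55 | 54 |
| LFM2.5 8B-A1B | Liquid AI | 44 | 51 | 42 | 43 | 46 | 46 |
| HyperCLOVAX SEED 1.5B | Naver | 43 | 50 | 40 | 42 | 44 | 45 |
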
Document output · Question 2: Traction visualization judgment, vanity metrics and missing metrics (private)

IR traction vanity-metric judgment
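
| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | Avg |
|---|---|---|---|---|---|---|---|
| Claude Opus 4.8 | Anthropic | 100 | 100 | 100 | 80 | 100 | 98 |
| MiniMax M3 | MiniMax | 92 | 98 | 92 | 75 | 95 | 93 |
| Gemini 3.1 Pro | Google | 80 | 80 | 80 | 80 | 100 | 87 |
| Claude Sonnet 4.6 | Anthropic | 80 | 100 | 80 | 80 | 100 | 92 |
| GPT-5.5 | OpenAI | 80 | 100 | 80 | 80 | 100 | 92 |
| Qwen 3.7 Plus | Alibaba | 84 | 86 | 82 | 78 | 86 | 84 |
| Mimo V2.5 Pro | Xiaomi | 80 | 85 | 80 | 75 | 85 | 82 |
| GLM 5.1 | Z.ai | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemini 3.1 Flash Lite | Google | 80 | 90 | 80 | 75 | 80 | 82 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Nemotron 3 Ultra 550B | NVIDIA | 90 | 92 | 88 | 72 | 90 | 88 |
| Gemma 4 31B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Step 3.7 Flash | StepFun | 62 | 88 | 82 | 70 | 68 | 73 |
| Gemma 4 26B A4B | Google | 80 | 85 | 80 | 75 | 80 | 81 |
| Qwen 3.6 27B | Alibaba | 80 | 85 | 80 | 75 | 80 | 81 |
| Gemma 4 12B | Google | 80 | 88 | 82 | 62 | 85 | 82 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| Qwen 3.6 35B A3B | Alibaba | 80 | 85 | 80 | 75 | 80 | 81 |
| Kimi K2.6 | Moonshot | 80 | 80 | 80 | 80 | 80 | 80 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| EXAONE 4.5 33B | LG AI | 75 | 85 | 70 | 50 | 85 | 78 |
| DeepSeek V4 Pro | DeepSeek | 80 | 90 | 80 | 75 | 80 | 82 |
| Grok 4.3 | xAI | 75 | 75 | 70 | 75 | 70 | 73 |
| Mistral Small 4 | Mistral | 80 | 85 | 80 | 75 | 80 | 81 |
| DeepSeek V4 Flash | DeepSeek | 80 | 60 | 70 | 20 | 70 | 64 |
| Qwen 3.5 9B | Alibaba | 80 | 90 | 80 | 72 | 82 | 82 |
| Solar Pro 3 | Upstage | 75 | 85 | 70 | 50 | 80 | 76 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 60 | 60 | 60 | 65 |
| Kanana 2 30B-A3B Thinking | Kakao | 65 | 75 | 60 | 50 | 65 | 66 |
| Gemma 4 E2B | Google | 57 | 61 | 53 | 51 | 59 | 58 |
| LFM2.5 8B-A1B | Liquid AI | 42 | 49 | 40 | 42 | 44 | 44 |
| HyperCLOVAX SEED 1.5B | Naver | 49 | 56 | 46 | 46 | 52 | 51 |
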
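Document output · Question 3: Dual-axis chart, derived operating-profit calculation + zero line for negative values (private)

Dual-axis derived-calculation consistency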
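
| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | Avg |
|---|---|---|---|---|---|---|---|
| Claude Opus 4.8 | Anthropic | 100 | 100 | 80 | 100 | 100 | 98 |
| MiniMax M3 | MiniMax | 96 | 92 | 90 | 80 | 95 | 92 |
| Gemini 3.1 Pro | Google | 100 | 100 | 80 | 80 | 100 | 96 |
| Claude Sonnet 4.6 | Anthropic | 100 | 80 | 80 | 80 | 100 | 91 |
| GPT-5.5 | OpenAI | 100 | 80 | 80 | 80 | 100 | 91 |
| Qwen 3.7 Plus | Alibaba | 86 | 82 | 80 | 78 | 86 | 84 |
| Mimo V2.5 Pro | Xiaomi | 85 | 80 | 80 | 75 | 85 | 82 |
| GLM 5.1 | Z.ai | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemini 3.1 Flash Lite | Google | 85 | 80 | 80 | 75 | 80 | 80 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Nemotron 3 Ultra 550B | NVIDIA | 80 | 85 | 70 | 78 | 88 | 83 |
| Gemma 4 31B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Step 3.7 Flash | StepFun | 60 | 88 | 78 | 70 | 72 | 74 |
| Gemma 4 26B A4B | Google | 85 | 80 | 80 | 75 | 80 | 80 |
| Qwen 3.6 27B | Alibaba | 85 | 80 | 80 | 75 | 80 | 80 |
| Gemma 4 12B | Google | 84 | 80 | 80 | 62 | 80 | 79 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| Qwen 3.6 35B A3B | Alibaba | 60 | 80 | 40 | 60 | 80 | 70 |
| Kimi K2.6 | Moonshot | 80 | 80 | 80 | 80 | 80 | 80 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| EXAONE 4.5 33B | LG AI | 60 | 70 | 50 | 50 | 75 | 66 |
| DeepSeek V4 Pro | DeepSeek | 85 | 80 | 80 | 75 | 85 | 82 |
| Grok 4.3 | xAI | 80 | 75 | 75 | 75 | 70 | 74 |
| Mistral Small 4 | Mistral | 85 | 80 | 80 | 75 | 80 | 80 |
| DeepSeek V4 Flash | DeepSeek | 85 | 80 | 80 | 75 | 80 | 80 |
| Qwen 3.5 9B | Alibaba | 55 | 70 | 50 | 72 | 75 | 67 |
| Solar Pro 3 | Upstage | 45 | 65 | 40 | 50 | 65 | 57 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 80 | 60 | 60 | 80 | 72 |
| Kanana 2 30B-A3B Thinking | Kakao | 45 | 60 | 40 | 50 | 60 | 54 |
| Gemma 4 E2B | Google | 42 | 49 | 40 | 42 | 44 | 44 |
| LFM2.5 8B-A1B | Liquid AI | 31 | 41 | 31 | 32 | 32 | 34 |
| HyperCLOVAX SEED 1.5B | Naver | 17 | 29 | 17 | 19 | 22 | 22 |
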
Document output · Question 4: Chart with a false premise, y-axis minimum and bars, forced correction (private)

Correcting a false-premise chart
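
| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | Avg |
|---|---|---|---|---|---|---|---|
| Claude Opus 4.8 | Anthropic | 100 | 100 | 100 | 80 | 100 | 98 |
| MiniMax M3 | MiniMax | 92 | 92 | 98 | 78 | 92 | 91 |
| Gemini 3.1 Pro | Google | 100 | 100 | 100 | 80 | 80 | 91 |
| Claude Sonnet 4.6 | Anthropic | 80 | 80 | 100 | 80 | 100 | 89 |
| GPT-5.5 | OpenAI | 100 | 100 | 100 | 80 | 100 | 98 |
| Qwen 3.7 Plus | Alibaba | 84 | 84 | 86 | 78 | 84 | 84 |
| Mimo V2.5 Pro | Xiaomi | 85 | 85 | 100 | 80 | 85 | 86 |
| GLM 5.1 | Z.ai | 80 | 80 | 100 | 80 | 80 | 82 |
| Gemini 3.1 Flash Lite | Google | 80 | 80 | 95 | 75 | 80 | 81 |
| Gemini 3.5 Flash | Google | 80 | 80 | 100 | 80 | 100 | 89 |
| Nemotron 3 Ultra 550B | NVIDIA | 80 | 48 | 38 | 74 | 82 | 68 |
| Gemma 4 31B | Google | 80 | 80 | 100 | 80 | 80 | 82 |
| Step 3.7 Flash | StepFun | 78 | 88 | 85 | 68 | 85 | 83 |
| Gemma 4 26B A4B | Google | 75 | 75 | 70 | 75 | 75 | 74 |
| Qwen 3.6 27B | Alibaba | 80 | 80 | 85 | 75 | 80 | 80 |
| Gemma 4 12B | Google | 78 | 68 | 52 | 60 | 78 | 71 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 60 | 80 | 60 | 71 |
| Qwen 3.6 35B A3B | Alibaba | 80 | 80 | 80 | 75 | 80 | 80 |
| Kimi K2.6 | Moonshot | 80 | 80 | 100 | 80 | 80 | 82 |
| GPT-5.4 Mini | OpenAI | 80 | 60 | 40 | 80 | 40 | 57 |
| EXAONE 4.5 33B | LG AI | 70 | 80 | 85 | 50 | 80 | 76 |
| DeepSeek V4 Pro | DeepSeek | 40 | 60 | 20 | 70 | 55 | 51 |
| Grok 4.3 | xAI | 70 | 70 | 60 | 75 | 60 | 66 |
| Mistral Small 4 | Mistral | 40 | 60 | 20 | 70 | 55 | 51 |
| DeepSeek V4 Flash | DeepSeek | 40 | 60 | 20 | 70 | 60 | 53 |
| Qwen 3.5 9B | Alibaba | 40 | 35 | 12 | 68 | 60 | 46 |
| Solar Pro 3 | Upstage | 65 | 75 | 55 | 50 | 80 | 70 |
| HyperCLOVAX SEED Think 32B | Naver | 60 | 40 | 20 | 60 | 60 | 51 |
| Kanana 2 30B-A3B Thinking | Kakao | 55 | 65 | 45 | 50 | 55 | 56 |
| Gemma 4 E2B | Google | 33 | 42 | 31 | 33 | 35 | 36 |
| LFM2.5 8B-A1B | Liquid AI | 27 | 38 | 25 | 27 | 29 | 30 |
| HyperCLOVAX SEED 1.5B | Naver | 27 | 39 | 26 | 28 | 30 | 31 |
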
Document output · Question 5: Refund workflow, defect > 7 days priority inversion (private)

Refund priority inversion
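
| Model | Vendor | Accuracy | Intent understanding | Caution | Korean context | Structure | Avg |
|---|---|---|---|---|---|---|---|
| Claude Opus 4.8 | Anthropic | 95 | 100 | 90 | 80 | 95 | 94 |
| MiniMax M3 | MiniMax | 95 | 92 | 88 | 82 | 96 | 93 |
| Gemini 3.1 Pro | Google | 100 | 100 | 80 | 80 | 100 | 96 |
| Claude Sonnet 4.6 | Anthropic | 80 | 80 | 80 | 80 | 100 | 87 |
| GPT-5.5 | OpenAI | 100 | 100 | 80 | 80 | 100 | 96 |
| Qwen 3.7 Plus | Alibaba | 84 | 82 | 80 | 78 | 84 | 82 |
| Mimo V2.5 Pro | Xiaomi | 90 | 80 | 80 | 75 | 85 | 83 |
| GLM 5.1 | Z.ai | 80 | 80 | 80 | 80 | 80 | 80 |
| Gemini 3.1 Flash Lite | Google | 90 | 80 | 80 | 75 | 80 | 82 |
| Gemini 3.5 Flash | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Nemotron 3 Ultra 550B | NVIDIA | 92 | 90 | 85 | 75 | 92 | 89 |
| Gemma 4 31B | Google | 80 | 80 | 80 | 80 | 80 | 80 |
| Step 3.7 Flash | StepFun | 82 | 90 | 82 | 70 | 90 | 86 |
| Gemma 4 26B A4B | Google | 90 | 80 | 80 | 75 | 80 | 82 |
| Qwen 3.6 27B | Alibaba | 80 | 80 | 80 | 75 | 80 | 80 |
| Gemma 4 12B | Google | 86 | 86 | 82 | 62 | 88 | 84 |
| Qwen 3.7 Max | Alibaba | 80 | 80 | 80 | 80 | 80 | 80 |
| Qwen 3.6 35B A3B | Alibaba | 80 | 80 | 75 | 75 | 80 | 79 |
| Kimi K2.6 | Moonshot | 80 | 80 | 80 | 80 | 80 | 80 |
| GPT-5.4 Mini | OpenAI | 80 | 80 | 80 | 80 | 80 | 80 |
| EXAONE 4.5 33B | LG AI | 75 | 85 | 70 | 50 | 90 | 80 |
| DeepSeek V4 Pro | DeepSeek | 90 | 80 | 80 | 75 | 80 | 82 |
| Grok 4.3 | xAI | 80 | 70 | 75 | 75 | 80 | 76 |
| Mistral Small 4 | Mistral | 85 | 80 | 80 | 75 | 80 | 80 |
| DeepSeek V4 Flash | DeepSeek | 80 | 80 | 80 | 75 | 80 | 80 |
| Qwen 3.5 9B | Alibaba | 85 | 85 | 80 | 72 | 88 | 84 |
| Solar Pro 3 | Upstage | 75 | 80 | 70 | 50 | 80 | 75 |
| HyperCLOVAX SEED Think 32B | Naver | 80 | 80 | 60 | 60 | 80 | 76 |
| Kanana 2 30B-A3B Thinking | Kakao | 55 | 65 | 50 | 30 | 60 | 56 |
| Gemma 4 E2B | Google | 48 | 53 | 47 | 47 | 50 | 50 |
| LFM2.5 8B-A1B | Liquid AI | 33 | 42 | 31 | 33 | 35 | 36 |
| HyperCLOVAX SEED 1.5B | Naver | 24 | 36 | 24 | 25 | 27 | 28 |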