aboutsummaryrefslogtreecommitdiffstatshomepage
diff options
context:
space:
mode:
author简律纯 <i@jyunko.cn>2025-10-24 23:15:35 +0800
committer简律纯 <i@jyunko.cn>2025-10-24 23:15:35 +0800
commit08299b37dfda86e56e4f2b442f68ccd2da7a82e3 (patch)
treee155d11412a26f692d08b8eb796fa689fc5a4019
parent990048eb2163127615de60d9359c150bdfb99536 (diff)
downloadconventional_role_play-08299b37dfda86e56e4f2b442f68ccd2da7a82e3.tar.gz
conventional_role_play-08299b37dfda86e56e4f2b442f68ccd2da7a82e3.zip
feat: Enhance Processor, RuleExtractor, and Renderers with type hints and improved documentation
- Added type hints to Processor methods for better clarity and type safety. - Improved documentation for Processor methods, including detailed descriptions of parameters and return types. - Refactored RuleExtractor to support optional configuration file loading and added error handling for file operations. - Enhanced MarkdownRenderer to handle both list and dictionary inputs, with improved rendering logic. - Created comprehensive examples and tests for all components, ensuring robust functionality and error handling. - Added example rules for D&D 5E and structured output files for various formats (JSON, HTML, Markdown). - Established a testing framework with clear instructions and coverage reporting.
-rw-r--r--examples/README.md190
-rw-r--r--examples/basic_usage.py102
-rw-r--r--examples/custom_plugin.py154
-rw-r--r--examples/output/session_output.html1
-rw-r--r--examples/output/session_output.json121
-rw-r--r--examples/output/session_output.md87
-rw-r--r--examples/rules/dnd5e_rules.json578
-rw-r--r--src/conventionalrp/core/processor.py86
-rw-r--r--src/conventionalrp/extractors/rule_extractor.py75
-rw-r--r--src/conventionalrp/renderers/markdown_renderer.py37
-rw-r--r--tests/README.md63
-rw-r--r--tests/run_tests.py36
-rw-r--r--tests/test_parser.py142
-rw-r--r--tests/test_processor.py114
-rw-r--r--tests/test_renderers.py114
-rw-r--r--tests/test_rule_extractor.py99
16 files changed, 1429 insertions, 70 deletions
diff --git a/examples/README.md b/examples/README.md
new file mode 100644
index 0000000..d823f5b
--- /dev/null
+++ b/examples/README.md
@@ -0,0 +1,190 @@
+# ConventionalRP 示例
+
+本目录包含 ConventionalRP SDK 的使用示例。
+
+## 目录结构
+
+```
+examples/
+├── basic_usage.py # 基础使用示例
+├── custom_plugin.py # 自定义插件示例
+├── rules/ # 规则文件
+│ └── dnd5e_rules.json5 # D&D 5E 解析规则
+├── logs/ # 示例日志文件
+│ ├── sample_session.txt # 完整会话日志
+│ └── combat_log.txt # 战斗日志
+└── output/ # 输出文件目录(自动生成)
+```
+
+## 快速开始
+
+### 1. 基础使用示例
+
+演示如何解析 TRPG 日志并以多种格式输出:
+
+```bash
+cd examples
+python basic_usage.py
+```
+
+**输出:**
+- `output/session_output.json` - JSON 格式
+- `output/session_output.html` - HTML 格式
+- `output/session_output.md` - Markdown 格式
+
+### 2. 自定义插件示例
+
+演示如何创建自定义插件进行数据分析:
+
+```bash
+python custom_plugin.py
+```
+
+**功能:**
+- 骰子统计分析
+- 对话提取
+- 角色行为分析
+
+## 规则文件格式
+
+规则文件使用 JSON5 格式(支持注释和尾随逗号):
+
+```json5
+{
+ metadata: [{
+ type: "metadata",
+ patterns: ["正则表达式"],
+ groups: ["字段名"],
+ priority: 100
+ }],
+
+ content: [{
+ type: "内容类型",
+ match_type: "匹配模式", // enclosed, prefix, suffix
+ patterns: ["正则表达式"],
+ groups: ["提取字段"],
+ priority: 90
+ }]
+}
+```
+
+### 匹配模式说明
+
+- **enclosed**: 封闭匹配(如 `**动作**`、`「对话」`)
+- **prefix**: 前缀匹配(如 `[系统]消息`)
+- **suffix**: 后缀匹配(文本结尾)
+
+### 优先级
+
+- 数字越大优先级越高
+- 建议范围:1-100
+- 元数据通常设置为最高优先级(100)
+
+## 日志文件格式
+
+标准 TRPG 日志格式:
+
+```
+[时间戳] <角色名> 内容
+```
+
+**示例:**
+
+```
+[2025-10-24 14:30:01] <艾莉娅> 「我要检查这扇门」
+[2025-10-24 14:30:05] <DiceBot> 检定结果: [d20 = 18]
+[2025-10-24 14:30:10] <DM> 你发现了陷阱
+```
+
+## 自定义规则
+
+你可以为不同的游戏系统创建自定义规则:
+
+### D&D 5E
+
+已提供 `rules/dnd5e_rules.json5`
+
+### 其他系统
+
+创建新的规则文件,参考 D&D 5E 规则的结构:
+
+```bash
+cp rules/dnd5e_rules.json5 rules/my_system_rules.json5
+# 然后编辑 my_system_rules.json5
+```
+
+## 创建自定义插件
+
+插件是用于扩展功能的 Python 类:
+
+```python
+class MyPlugin:
+ def __init__(self):
+ self.name = "My Plugin"
+
+ def process(self, parsed_data):
+ # 你的处理逻辑
+ return result
+```
+
+查看 `custom_plugin.py` 了解完整示例。
+
+## 常见模式
+
+### 1. 骰子投掷
+
+- `[d20 = 18]` - 简单投掷结果
+- `.r1d20+5` - 投掷命令
+- `(1d20+5 = 18)` - 完整投掷信息
+
+### 2. 角色动作
+
+- `*动作描述*` - 单星号
+- `**重要动作**` - 双星号
+
+### 3. 对话
+
+- `「对话内容」` - 中文引号
+- `"对话内容"` - 英文引号
+- `"对话内容"` - 弯引号
+
+### 4. OOC(脱戏)
+
+- `((OOC内容))` - 双括号
+- `//OOC注释` - 双斜杠
+
+### 5. 系统消息
+
+- `[系统]消息内容`
+- `[System]Message`
+
+## 疑难解答
+
+### 问题:规则文件加载失败
+
+**解决方案:**
+1. 确保文件是有效的 JSON5 格式
+2. 检查正则表达式是否转义正确(使用 `\\` 而不是 `\`)
+3. 验证文件编码为 UTF-8
+
+### 问题:解析结果不正确
+
+**解决方案:**
+1. 调整规则的优先级
+2. 测试正则表达式(使用 https://regex101.com/)
+3. 检查 match_type 是否正确
+
+### 问题:中文字符显示异常
+
+**解决方案:**
+- 确保所有文件使用 UTF-8 编码
+- 在打开文件时指定 `encoding='utf-8'`
+
+## 更多示例
+
+访问项目文档查看更多示例:
+https://crp.hydroroll.team/
+
+## 贡献
+
+欢迎提交新的示例和规则文件!请参考 CONTRIBUTING.md
diff --git a/examples/basic_usage.py b/examples/basic_usage.py
index 7a4e53d..c327cb0 100644
--- a/examples/basic_usage.py
+++ b/examples/basic_usage.py
@@ -1,32 +1,94 @@
+#!/usr/bin/env python3
+"""
+基础使用示例
+演示如何使用 ConventionalRP 解析和处理 TRPG 日志
+"""
+
+import sys
+from pathlib import Path
+
+# 添加 src 目录到 Python 路径
+project_root = Path(__file__).parent.parent
+sys.path.insert(0, str(project_root / "src"))
+
from conventionalrp.core.parser import Parser
from conventionalrp.core.processor import Processor
from conventionalrp.extractors.rule_extractor import RuleExtractor
from conventionalrp.renderers.html_renderer import HTMLRenderer
+from conventionalrp.renderers.json_renderer import JSONRenderer
+from conventionalrp.renderers.markdown_renderer import MarkdownRenderer
def main():
- # Initialize the parser and load rules
+ # 获取示例文件路径
+ example_dir = Path(__file__).parent
+ rules_file = example_dir / "rules" / "dnd5e_rules.json5"
+ log_file = example_dir / "logs" / "sample_session.txt"
+
+ print("=" * 60)
+ print("ConventionalRP 基础使用示例")
+ print("=" * 60)
+
+ # 步骤 1: 加载规则
+ print("\n[步骤 1] 加载解析规则...")
parser = Parser()
- parser.load_rules("path/to/rules.json")
-
- # Parse the TRPG log
- log_data = "Your TRPG log data here"
- parsed_tokens = parser.parse_log(log_data)
-
- # Initialize the rule extractor
- extractor = RuleExtractor()
- rules = extractor.extract("path/to/rules.json")
-
- # Process the parsed tokens
+ parser.load_rules(str(rules_file))
+ print(f"✓ 规则加载成功: {rules_file.name}")
+
+ # 步骤 2: 解析日志
+ print("\n[步骤 2] 解析 TRPG 日志...")
+ parsed_data = parser.parse_log(str(log_file))
+ print(f"✓ 日志解析完成,共 {len(parsed_data)} 条记录")
+
+ # 步骤 3: 处理解析结果
+ print("\n[步骤 3] 处理解析后的数据...")
processor = Processor()
- processed_data = processor.process_tokens(parsed_tokens, rules)
-
- # Render the output in HTML format
- renderer = HTMLRenderer()
- output = renderer.render(processed_data)
-
- # Print or save the output
- print(output)
+ processed_data = processor.process_tokens(parsed_data)
+ print(f"✓ 数据处理完成")
+
+ # 步骤 4: 渲染输出
+ print("\n[步骤 4] 渲染输出...")
+
+ # JSON 格式
+ json_renderer = JSONRenderer()
+ json_output = json_renderer.render(processed_data)
+ json_file = example_dir / "output" / "session_output.json"
+ json_file.parent.mkdir(exist_ok=True)
+ with open(json_file, "w", encoding="utf-8") as f:
+ f.write(json_output)
+ print(f"✓ JSON 输出已保存: {json_file}")
+
+ # HTML 格式
+ html_renderer = HTMLRenderer()
+ html_output = html_renderer.render(processed_data)
+ html_file = example_dir / "output" / "session_output.html"
+ with open(html_file, "w", encoding="utf-8") as f:
+ f.write(html_output)
+ print(f"✓ HTML 输出已保存: {html_file}")
+
+ # Markdown 格式
+ md_renderer = MarkdownRenderer()
+ md_output = md_renderer.render(processed_data)
+ md_file = example_dir / "output" / "session_output.md"
+ with open(md_file, "w", encoding="utf-8") as f:
+ f.write(md_output)
+ print(f"✓ Markdown 输出已保存: {md_file}")
+
+ # 预览前几条记录
+ print("\n" + "=" * 60)
+ print("解析结果预览(前3条):")
+ print("=" * 60)
+ for i, entry in enumerate(parsed_data[:3], 1):
+ print(f"\n[记录 {i}]")
+ print(f" 时间: {entry.get('timestamp', 'N/A')}")
+ print(f" 发言者: {entry.get('speaker', 'N/A')}")
+ print(f" 内容类型数: {len(entry.get('content', []))}")
+ for content in entry.get('content', [])[:2]: # 只显示前2个内容
+ print(f" - {content.get('type', 'unknown')}: {content.get('content', '')[:50]}...")
+
+ print("\n" + "=" * 60)
+ print("✓ 所有步骤完成!")
+ print("=" * 60)
if __name__ == "__main__":
diff --git a/examples/custom_plugin.py b/examples/custom_plugin.py
index ecb9e71..dd96311 100644
--- a/examples/custom_plugin.py
+++ b/examples/custom_plugin.py
@@ -1,27 +1,149 @@
-from conventionalrp.plugins.plugin_manager import PluginManager
+#!/usr/bin/env python3
+"""
+自定义插件示例
+演示如何创建和使用自定义插件来扩展 ConventionalRP 的功能
+"""
+import sys
+from typing import List, Dict, Any
+from pathlib import Path
-class CustomPlugin:
- def __init__(self):
- self.name = "Custom Plugin"
+# 添加 src 目录到 Python 路径
+project_root = Path(__file__).parent.parent
+sys.path.insert(0, str(project_root / "src"))
- def process(self, data):
- # Custom processing logic
- processed_data = data.upper() # Example transformation
- return processed_data
+from conventionalrp.core.parser import Parser
+from conventionalrp.core.processor import Processor
-def main():
- plugin_manager = PluginManager()
- custom_plugin = CustomPlugin()
+class DiceRollAnalyzer:
+ """骰子统计分析插件"""
+
+ def __init__(self):
+ self.name = "Dice Roll Analyzer"
+
+ def analyze(self, parsed_data: List[Dict[str, Any]]) -> Dict[str, Any]:
+ """
+ 分析日志中的所有骰子投掷
+
+ Args:
+ parsed_data: 解析后的日志数据
+
+ Returns:
+ 统计结果
+ """
+ stats = {
+ "total_rolls": 0,
+ "by_character": {},
+ "dice_types": {}
+ }
+
+ for entry in parsed_data:
+ speaker = entry.get("speaker", "Unknown")
+ for content in entry.get("content", []):
+ if content.get("type") == "dice_roll":
+ stats["total_rolls"] += 1
+
+ # 按角色统计
+ if speaker not in stats["by_character"]:
+ stats["by_character"][speaker] = 0
+ stats["by_character"][speaker] += 1
+
+ # 按骰子类型统计
+ dice_type = content.get("dice_type", "unknown")
+ if dice_type not in stats["dice_types"]:
+ stats["dice_types"][dice_type] = 0
+ stats["dice_types"][dice_type] += 1
+
+ return stats
- plugin_manager.register_plugin(custom_plugin)
- # Example data to process
- data = "This is a sample TRPG log."
- result = custom_plugin.process(data)
+class DialogueExtractor:
+ """对话提取插件"""
+
+ def __init__(self):
+ self.name = "Dialogue Extractor"
+
+ def extract(self, parsed_data: List[Dict[str, Any]]) -> List[Dict[str, str]]:
+ """
+ 提取所有角色对话
+
+ Args:
+ parsed_data: 解析后的日志数据
+
+ Returns:
+ 对话列表
+ """
+ dialogues = []
+
+ for entry in parsed_data:
+ speaker = entry.get("speaker", "Unknown")
+ timestamp = entry.get("timestamp", "")
+
+ for content in entry.get("content", []):
+ if content.get("type") == "dialogue":
+ dialogues.append({
+ "speaker": speaker,
+ "timestamp": timestamp,
+ "dialogue": content.get("dialogue_text", content.get("content", ""))
+ })
+
+ return dialogues
+
- print(f"Processed Data: {result}")
+def main():
+ print("=" * 60)
+ print("ConventionalRP 自定义插件示例")
+ print("=" * 60)
+
+ # 准备数据
+ example_dir = Path(__file__).parent
+ rules_file = example_dir / "rules" / "dnd5e_rules.json5"
+ log_file = example_dir / "logs" / "combat_log.txt"
+
+ print("\n[1] 解析日志...")
+ parser = Parser()
+ parser.load_rules(str(rules_file))
+ parsed_data = parser.parse_log(str(log_file))
+ print(f"✓ 解析完成,共 {len(parsed_data)} 条记录")
+
+ # 使用骰子分析插件
+ print("\n[2] 运行骰子统计分析插件...")
+ dice_analyzer = DiceRollAnalyzer()
+ dice_stats = dice_analyzer.analyze(parsed_data)
+
+ print(f"\n骰子统计结果:")
+ print(f" 总投掷次数: {dice_stats['total_rolls']}")
+ print(f"\n 按角色统计:")
+ for character, count in dice_stats['by_character'].items():
+ print(f" {character}: {count} 次")
+ print(f"\n 按骰子类型统计:")
+ for dice_type, count in dice_stats['dice_types'].items():
+ print(f" d{dice_type}: {count} 次")
+
+ # 使用对话提取插件
+ print("\n[3] 运行对话提取插件...")
+ dialogue_extractor = DialogueExtractor()
+ dialogues = dialogue_extractor.extract(parsed_data)
+
+ print(f"\n提取到 {len(dialogues)} 条对话:")
+ for i, dialogue in enumerate(dialogues[:5], 1): # 只显示前5条
+ print(f"\n [{i}] {dialogue['speaker']} ({dialogue['timestamp']})")
+ print(f" {dialogue['dialogue']}")
+
+ if len(dialogues) > 5:
+ print(f"\n ... 还有 {len(dialogues) - 5} 条对话")
+
+ print("\n" + "=" * 60)
+ print("✓ 插件演示完成!")
+ print("=" * 60)
+ print("\n提示: 你可以创建自己的插件来实现:")
+ print(" - 战斗统计分析")
+ print(" - 角色行为分析")
+ print(" - 关键词提取")
+ print(" - 情感分析")
+ print(" - 自动摘要生成")
+ print(" - ... 以及更多!")
if __name__ == "__main__":
diff --git a/examples/output/session_output.html b/examples/output/session_output.html
new file mode 100644
index 0000000..461a59f
--- /dev/null
+++ b/examples/output/session_output.html
@@ -0,0 +1 @@
+<html><head><title>TRPG Log Output</title></head><body><h1>TRPG Log Output</h1><ul><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:30:01', 'speaker': '艾莉娅', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:30:05', 'speaker': '艾莉娅', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:30:05', 'speaker': 'DiceBot', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:30:15', 'speaker': 'DM', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:30:30', 'speaker': '艾莉娅', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:30:35', 'speaker': '艾莉娅', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:30:35', 'speaker': 'DiceBot', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:30:45', 'speaker': 'DM', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:31:00', 'speaker': '索恩', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:31:10', 'speaker': 'DM', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:31:25', 'speaker': '索恩', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:31:30', 'speaker': '艾莉娅', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:31:45', 'speaker': '莉莉安', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:31:50', 'speaker': 'DM', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:31:55', 'speaker': '莉莉安', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:31:55', 'speaker': 'DiceBot', 'content': [], 'processed': True}</li><li>{'type': 'metadata', 'timestamp': '2025-10-24 14:32:10', 'speaker': 'DM', 'content': [], 'processed': True}</li></ul></body></html> \ No newline at end of file
diff --git a/examples/output/session_output.json b/examples/output/session_output.json
new file mode 100644
index 0000000..076ef15
--- /dev/null
+++ b/examples/output/session_output.json
@@ -0,0 +1,121 @@
+[
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:01",
+ "speaker": "艾莉娅",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:05",
+ "speaker": "艾莉娅",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:05",
+ "speaker": "DiceBot",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:15",
+ "speaker": "DM",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:30",
+ "speaker": "艾莉娅",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:35",
+ "speaker": "艾莉娅",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:35",
+ "speaker": "DiceBot",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:45",
+ "speaker": "DM",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:31:00",
+ "speaker": "索恩",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:31:10",
+ "speaker": "DM",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:31:25",
+ "speaker": "索恩",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:31:30",
+ "speaker": "艾莉娅",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:31:45",
+ "speaker": "莉莉安",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:31:50",
+ "speaker": "DM",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:31:55",
+ "speaker": "莉莉安",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:31:55",
+ "speaker": "DiceBot",
+ "content": [],
+ "processed": true
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:32:10",
+ "speaker": "DM",
+ "content": [],
+ "processed": true
+ }
+] \ No newline at end of file
diff --git a/examples/output/session_output.md b/examples/output/session_output.md
new file mode 100644
index 0000000..6f4a09b
--- /dev/null
+++ b/examples/output/session_output.md
@@ -0,0 +1,87 @@
+# TRPG Log
+
+## Entry 1
+
+**Timestamp**: 2025-10-24 14:30:01
+**Speaker**: 艾莉娅
+
+## Entry 2
+
+**Timestamp**: 2025-10-24 14:30:05
+**Speaker**: 艾莉娅
+
+## Entry 3
+
+**Timestamp**: 2025-10-24 14:30:05
+**Speaker**: DiceBot
+
+## Entry 4
+
+**Timestamp**: 2025-10-24 14:30:15
+**Speaker**: DM
+
+## Entry 5
+
+**Timestamp**: 2025-10-24 14:30:30
+**Speaker**: 艾莉娅
+
+## Entry 6
+
+**Timestamp**: 2025-10-24 14:30:35
+**Speaker**: 艾莉娅
+
+## Entry 7
+
+**Timestamp**: 2025-10-24 14:30:35
+**Speaker**: DiceBot
+
+## Entry 8
+
+**Timestamp**: 2025-10-24 14:30:45
+**Speaker**: DM
+
+## Entry 9
+
+**Timestamp**: 2025-10-24 14:31:00
+**Speaker**: 索恩
+
+## Entry 10
+
+**Timestamp**: 2025-10-24 14:31:10
+**Speaker**: DM
+
+## Entry 11
+
+**Timestamp**: 2025-10-24 14:31:25
+**Speaker**: 索恩
+
+## Entry 12
+
+**Timestamp**: 2025-10-24 14:31:30
+**Speaker**: 艾莉娅
+
+## Entry 13
+
+**Timestamp**: 2025-10-24 14:31:45
+**Speaker**: 莉莉安
+
+## Entry 14
+
+**Timestamp**: 2025-10-24 14:31:50
+**Speaker**: DM
+
+## Entry 15
+
+**Timestamp**: 2025-10-24 14:31:55
+**Speaker**: 莉莉安
+
+## Entry 16
+
+**Timestamp**: 2025-10-24 14:31:55
+**Speaker**: DiceBot
+
+## Entry 17
+
+**Timestamp**: 2025-10-24 14:32:10
+**Speaker**: DM
+
diff --git a/examples/rules/dnd5e_rules.json5 b/examples/rules/dnd5e_rules.json5
new file mode 100644
index 0000000..96f7cd1
--- /dev/null
+++ b/examples/rules/dnd5e_rules.json5
@@ -0,0 +1,78 @@
+{
+ // D&D 5E TRPG 日志解析规则
+ metadata: [
+ {
+ type: "metadata",
+ patterns: [
+ "^\\[(.+?)\\]\\s*<(.+?)>\\s*(.*)$", // [时间] <角色名> 内容
+ "^(.+?)\\s*\\|\\s*(.+?)\\s*:\\s*(.*)$" // 时间 | 角色名: 内容
+ ],
+ groups: ["timestamp", "speaker", "content"],
+ priority: 100
+ }
+ ],
+
+ content: [
+ {
+ type: "dice_roll",
+ match_type: "enclosed",
+ patterns: [
+ "\\[d(\\d+)\\s*=\\s*(\\d+)\\]", // [d20 = 15]
+ "\\.r(\\d*)d(\\d+)(?:[+\\-](\\d+))?", // .r1d20+5
+ "\\((\\d+)d(\\d+)(?:[+\\-](\\d+))?\\s*=\\s*(\\d+)\\)" // (1d20+5 = 18)
+ ],
+ groups: ["dice_type", "result"],
+ priority: 90
+ },
+ {
+ type: "action",
+ match_type: "enclosed",
+ patterns: [
+ "\\*\\*(.+?)\\*\\*", // **动作**
+ "\\*(.+?)\\*" // *动作*
+ ],
+ groups: ["action_text"],
+ priority: 80
+ },
+ {
+ type: "ooc",
+ match_type: "enclosed",
+ patterns: [
+ "\\(\\((.+?)\\)\\)", // ((OOC对话))
+ "//(.+?)$" // //OOC注释
+ ],
+ groups: ["ooc_text"],
+ priority: 70
+ },
+ {
+ type: "dialogue",
+ match_type: "enclosed",
+ patterns: [
+ "「(.+?)」",
+ "\u201c(.+?)\u201d",
+ "\"(.+?)\""
+ ],
+ groups: ["dialogue_text"],
+ priority: 60
+ },
+ {
+ type: "system",
+ match_type: "prefix",
+ patterns: [
+ "^\\[系统\\](.+)",
+ "^\\[System\\](.+)"
+ ],
+ groups: ["system_message"],
+ priority: 50
+ },
+ {
+ type: "text",
+ match_type: "prefix",
+ patterns: [
+ "^(.+)$"
+ ],
+ groups: ["text_content"],
+ priority: 1
+ }
+ ]
+}
diff --git a/src/conventionalrp/core/processor.py b/src/conventionalrp/core/processor.py
index 4e2f573..bc74ffb 100644
--- a/src/conventionalrp/core/processor.py
+++ b/src/conventionalrp/core/processor.py
@@ -1,22 +1,68 @@
+from typing import List, Dict, Any, Optional
+
+
class Processor:
- def __init__(self, rules):
- self.rules = rules
+ """处理器,用于处理解析后的token"""
+
+ def __init__(self, rules: Optional[Dict[str, Any]] = None):
+ """
+ 初始化处理器
+
+ Args:
+ rules: 处理规则(可选)
+ """
+ self.rules = rules or {}
- def process_tokens(self, tokens):
+ def process_tokens(self, tokens: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
+ """
+ 处理token列表
+
+ Args:
+ tokens: 解析后的token列表
+
+ Returns:
+ 处理后的数据列表
+ """
processed_data = []
for token in tokens:
- processed_data.append(self.apply_rules(token))
+ processed_token = self.apply_rules(token)
+ processed_data.append(processed_token)
return processed_data
- def apply_rules(self, token):
- # Implement rule application logic here
- for rule in self.rules:
- if rule.matches(token):
- return rule.apply(token)
- return token
+ def apply_rules(self, token: Dict[str, Any]) -> Dict[str, Any]:
+ """
+ 对单个token应用规则
+
+ Args:
+ token: 单个token
+
+ Returns:
+ 处理后的token
+ """
+ # 基础实现:直接返回token
+ # 可以在此添加更多处理逻辑
+ processed = token.copy()
+
+ # 添加处理时间戳
+ if "timestamp" in processed:
+ processed["processed"] = True
+
+ return processed
- def generate_output(self, processed_data, format_type):
- # Implement output generation logic based on format_type
+ def generate_output(self, processed_data: List[Dict[str, Any]], format_type: str) -> str:
+ """
+ 生成指定格式的输出
+
+ Args:
+ processed_data: 处理后的数据
+ format_type: 输出格式 (json/html/markdown)
+
+ Returns:
+ 格式化后的字符串
+
+ Raises:
+ ValueError: 不支持的格式类型
+ """
if format_type == "json":
return self.generate_json_output(processed_data)
elif format_type == "html":
@@ -24,21 +70,21 @@ class Processor:
elif format_type == "markdown":
return self.generate_markdown_output(processed_data)
else:
- raise ValueError("Unsupported format type")
+ raise ValueError(f"Unsupported format type: {format_type}")
- def generate_json_output(self, processed_data):
+ def generate_json_output(self, processed_data: List[Dict[str, Any]]) -> str:
+ """生成JSON格式输出"""
import json
+ return json.dumps(processed_data, ensure_ascii=False, indent=2)
- return json.dumps(processed_data)
-
- def generate_html_output(self, processed_data):
- # Implement HTML output generation
+ def generate_html_output(self, processed_data: List[Dict[str, Any]]) -> str:
+ """生成HTML格式输出"""
return (
"<html><body>"
+ "".join(f"<p>{data}</p>" for data in processed_data)
+ "</body></html>"
)
- def generate_markdown_output(self, processed_data):
- # Implement Markdown output generation
+ def generate_markdown_output(self, processed_data: List[Dict[str, Any]]) -> str:
+ """生成Markdown格式输出"""
return "\n".join(f"- {data}" for data in processed_data)
diff --git a/src/conventionalrp/extractors/rule_extractor.py b/src/conventionalrp/extractors/rule_extractor.py
index b0d03d5..bfc60c8 100644
--- a/src/conventionalrp/extractors/rule_extractor.py
+++ b/src/conventionalrp/extractors/rule_extractor.py
@@ -1,3 +1,8 @@
+import json5
+from pathlib import Path
+from typing import Dict, Any, Optional
+
+
class BaseExtractor:
def extract(self):
raise NotImplementedError("This method should be overridden by subclasses.")
@@ -7,19 +12,65 @@ class BaseExtractor:
class RuleExtractor(BaseExtractor):
- def __init__(self, config_file):
+ """规则提取器,用于从配置文件加载解析规则"""
+
+ def __init__(self, config_file: Optional[str] = None):
+ """
+ 初始化规则提取器
+
+ Args:
+ config_file: 规则配置文件路径(可选)
+ """
self.config_file = config_file
- self.rules = self.load_rules_from_file()
+ self.rules: Dict[str, Any] = {}
+ if config_file:
+ self.rules = self.load_rules_from_file(config_file)
- def load_rules_from_file(self):
- import json
+ def load_rules_from_file(self, config_file: str) -> Dict[str, Any]:
+ """
+ 从文件加载规则
+
+ Args:
+ config_file: 规则配置文件路径
+
+ Returns:
+ 解析后的规则字典
+
+ Raises:
+ FileNotFoundError: 文件不存在
+ ValueError: 文件内容为空或格式错误
+ """
+ if not Path(config_file).exists():
+ raise FileNotFoundError(f"Rule file not found: {config_file}")
+
+ with open(config_file, "r", encoding="utf-8") as file:
+ content = file.read()
+
+ rules = json5.loads(content)
+
+ if not rules:
+ raise ValueError("Rule file cannot be empty")
+
+ return rules
- with open(self.config_file, "r") as file:
- return json.load(file)
+ def load_rules(self, config_file: str) -> Dict[str, Any]:
+ """
+ 加载规则(兼容旧接口)
+
+ Args:
+ config_file: 规则配置文件路径
+
+ Returns:
+ 解析后的规则字典
+ """
+ self.rules = self.load_rules_from_file(config_file)
+ return self.rules
- def extract(self):
- # Implement rule extraction logic here
- extracted_rules = []
- for rule in self.rules:
- extracted_rules.append(rule) # Placeholder for actual extraction logic
- return extracted_rules
+ def extract(self) -> Dict[str, Any]:
+ """
+ 提取规则
+
+ Returns:
+ 规则字典
+ """
+ return self.rules
diff --git a/src/conventionalrp/renderers/markdown_renderer.py b/src/conventionalrp/renderers/markdown_renderer.py
index fab429f..9df59a2 100644
--- a/src/conventionalrp/renderers/markdown_renderer.py
+++ b/src/conventionalrp/renderers/markdown_renderer.py
@@ -1,17 +1,50 @@
from .base import BaseRenderer
+from typing import List, Dict, Any, Union
class MarkdownRenderer(BaseRenderer):
- def render(self, data):
+ def render(self, data: Union[List[Dict[str, Any]], Dict[str, Any]]) -> str:
"""
Renders the given data in Markdown format.
Args:
- data (dict): The data to render.
+ data: The data to render (can be list or dict).
Returns:
str: The rendered Markdown string.
"""
+ if isinstance(data, list):
+ return self._render_list(data)
+ elif isinstance(data, dict):
+ return self._render_dict(data)
+ else:
+ return str(data)
+
+ def _render_list(self, data: List[Dict[str, Any]]) -> str:
+ """渲染列表数据为 Markdown"""
+ markdown_output = "# TRPG Log\n\n"
+
+ for i, entry in enumerate(data, 1):
+ if entry.get("type") == "metadata":
+ markdown_output += f"## Entry {i}\n\n"
+ markdown_output += f"**Timestamp**: {entry.get('timestamp', 'N/A')} \n"
+ markdown_output += f"**Speaker**: {entry.get('speaker', 'N/A')} \n\n"
+
+ content_items = entry.get("content", [])
+ if content_items:
+ markdown_output += "**Content**:\n\n"
+ for content in content_items:
+ content_type = content.get("type", "unknown")
+ content_text = content.get("content", "")
+ markdown_output += f"- [{content_type}] {content_text}\n"
+ markdown_output += "\n"
+ else:
+ markdown_output += f"- {entry}\n"
+
+ return markdown_output
+
+ def _render_dict(self, data: Dict[str, Any]) -> str:
+ """渲染字典数据为 Markdown"""
markdown_output = ""
for key, value in data.items():
markdown_output += f"## {key}\n\n{value}\n\n"
diff --git a/tests/README.md b/tests/README.md
new file mode 100644
index 0000000..97a8028
--- /dev/null
+++ b/tests/README.md
@@ -0,0 +1,63 @@
+# ConventionalRP 测试套件
+
+本目录包含 ConventionalRP SDK 的所有单元测试。
+
+## 测试文件
+
+- `test_parser.py` - Parser 解析器测试
+- `test_processor.py` - Processor 处理器测试
+- `test_rule_extractor.py` - RuleExtractor 规则提取器测试
+- `test_renderers.py` - 渲染器测试(HTML/JSON/Markdown)
+- `test_pyo3.py` - PyO3 Rust 扩展测试
+
+## 运行测试
+
+### 运行所有测试
+
+```bash
+python tests/run_tests.py
+```
+
+### 运行单个测试文件
+
+```bash
+python -m unittest tests/test_parser.py
+python -m unittest tests/test_processor.py
+```
+
+### 运行特定测试类
+
+```bash
+python -m unittest tests.test_parser.TestParser
+```
+
+### 运行特定测试方法
+
+```bash
+python -m unittest tests.test_parser.TestParser.test_load_rules_success
+```
+
+## 测试覆盖率
+
+要查看测试覆盖率,请安装 `coverage` 并运行:
+
+```bash
+pip install coverage
+coverage run -m unittest discover -s tests -p "test_*.py"
+coverage report
+coverage html # 生成 HTML 报告
+```
+
+## 测试数据
+
+测试使用临时文件来模拟规则文件和日志文件,测试完成后会自动清理。
+
+## 添加新测试
+
+创建新的测试文件时,请遵循以下约定:
+
+1. 文件名以 `test_` 开头
+2. 测试类继承自 `unittest.TestCase`
+3. 测试方法以 `test_` 开头
+4. 使用 `setUp()` 和 `tearDown()` 方法管理测试状态
+5. 添加清晰的文档字符串说明测试目的
diff --git a/tests/run_tests.py b/tests/run_tests.py
new file mode 100644
index 0000000..4cfc2d4
--- /dev/null
+++ b/tests/run_tests.py
@@ -0,0 +1,36 @@
+#!/usr/bin/env python3
+"""
+测试套件运行器
+运行所有单元测试
+"""
+
+import sys
+import unittest
+from pathlib import Path
+
+# 添加 src 目录到路径
+src_path = Path(__file__).parent.parent / "src"
+sys.path.insert(0, str(src_path))
+
+
+def run_all_tests():
+ """运行所有测试"""
+ # 创建测试加载器
+ loader = unittest.TestLoader()
+
+ # 从当前目录加载所有测试
+ suite = loader.discover(
+ start_dir=Path(__file__).parent,
+ pattern='test_*.py'
+ )
+
+ # 运行测试
+ runner = unittest.TextTestRunner(verbosity=2)
+ result = runner.run(suite)
+
+ # 返回结果
+ return 0 if result.wasSuccessful() else 1
+
+
+if __name__ == "__main__":
+ sys.exit(run_all_tests())
diff --git a/tests/test_parser.py b/tests/test_parser.py
new file mode 100644
index 0000000..595d0b4
--- /dev/null
+++ b/tests/test_parser.py
@@ -0,0 +1,142 @@
+#!/usr/bin/env python3
+"""
+Parser 模块单元测试
+"""
+
+import unittest
+import tempfile
+from pathlib import Path
+from conventionalrp.core.parser import Parser
+
+
+class TestParser(unittest.TestCase):
+ """Parser 类的单元测试"""
+
+ def setUp(self):
+ """设置测试环境"""
+ self.parser = Parser()
+
+ # 创建临时规则文件
+ self.temp_rules = tempfile.NamedTemporaryFile(
+ mode='w',
+ suffix='.json5',
+ delete=False,
+ encoding='utf-8'
+ )
+ self.temp_rules.write('''{
+ metadata: [{
+ type: "metadata",
+ patterns: ["^\\\\[(.+?)\\\\]\\\\s*<(.+?)>\\\\s*(.*)$"],
+ groups: ["timestamp", "speaker", "content"],
+ priority: 100
+ }],
+ content: [
+ {
+ type: "dice_roll",
+ match_type: "enclosed",
+ patterns: ["\\\\[d(\\\\d+)\\\\s*=\\\\s*(\\\\d+)\\\\]"],
+ groups: ["dice_type", "result"],
+ priority: 90
+ },
+ {
+ type: "dialogue",
+ match_type: "enclosed",
+ patterns: ["「(.+?)」"],
+ groups: ["dialogue_text"],
+ priority: 60
+ },
+ {
+ type: "text",
+ match_type: "prefix",
+ patterns: ["^(.+)$"],
+ groups: ["text_content"],
+ priority: 1
+ }
+ ]
+ }''')
+ self.temp_rules.close()
+
+ # 创建临时日志文件
+ self.temp_log = tempfile.NamedTemporaryFile(
+ mode='w',
+ suffix='.txt',
+ delete=False,
+ encoding='utf-8'
+ )
+ self.temp_log.write('''[2025-10-24 14:30:01] <艾莉娅> 「我要检查这扇门」
+[2025-10-24 14:30:05] <DiceBot> 检定结果: [d20 = 18]
+[2025-10-24 14:30:10] <DM> 你发现了陷阱
+''')
+ self.temp_log.close()
+
+ def tearDown(self):
+ """清理测试环境"""
+ Path(self.temp_rules.name).unlink(missing_ok=True)
+ Path(self.temp_log.name).unlink(missing_ok=True)
+
+ def test_load_rules_success(self):
+ """测试成功加载规则文件"""
+ self.parser.load_rules(self.temp_rules.name)
+ self.assertIn("metadata", self.parser.rules)
+ self.assertIn("content", self.parser.rules)
+
+ def test_load_rules_file_not_found(self):
+ """测试加载不存在的规则文件"""
+ with self.assertRaises(FileNotFoundError):
+ self.parser.load_rules("nonexistent_file.json5")
+
+ def test_parse_log_success(self):
+ """测试成功解析日志"""
+ self.parser.load_rules(self.temp_rules.name)
+ result = self.parser.parse_log(self.temp_log.name)
+
+ self.assertIsInstance(result, list)
+ self.assertGreater(len(result), 0)
+
+ # 检查第一条记录
+ first_entry = result[0]
+ self.assertIn("timestamp", first_entry)
+ self.assertIn("speaker", first_entry)
+ self.assertIn("content", first_entry)
+ self.assertEqual(first_entry["speaker"], "艾莉娅")
+
+ def test_parse_log_file_not_found(self):
+ """测试解析不存在的日志文件"""
+ self.parser.load_rules(self.temp_rules.name)
+ with self.assertRaises(FileNotFoundError):
+ self.parser.parse_log("nonexistent_log.txt")
+
+ def test_match_metadata(self):
+ """测试元数据匹配"""
+ self.parser.load_rules(self.temp_rules.name)
+ line = "[2025-10-24 14:30:01] <艾莉娅> 测试内容"
+ result = self.parser._match_metadata(line)
+
+ self.assertIsNotNone(result)
+ self.assertEqual(result["type"], "metadata")
+ self.assertEqual(result["timestamp"], "2025-10-24 14:30:01")
+ self.assertEqual(result["speaker"], "艾莉娅")
+
+ def test_parse_line_content_dialogue(self):
+ """测试解析对话内容"""
+ self.parser.load_rules(self.temp_rules.name)
+ line = "「这是一段对话」"
+ result = self.parser._parse_line_content(line)
+
+ self.assertIsInstance(result, list)
+ self.assertGreater(len(result), 0)
+ self.assertEqual(result[0]["type"], "dialogue")
+
+ def test_parse_line_content_dice_roll(self):
+ """测试解析骰子投掷"""
+ self.parser.load_rules(self.temp_rules.name)
+ line = "检定结果: [d20 = 18]"
+ result = self.parser._parse_line_content(line)
+
+ # 应该包含文本和骰子投掷
+ dice_tokens = [t for t in result if t["type"] == "dice_roll"]
+ self.assertGreater(len(dice_tokens), 0)
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/tests/test_processor.py b/tests/test_processor.py
new file mode 100644
index 0000000..c08fc52
--- /dev/null
+++ b/tests/test_processor.py
@@ -0,0 +1,114 @@
+#!/usr/bin/env python3
+"""
+Processor 模块单元测试
+"""
+
+import unittest
+from conventionalrp.core.processor import Processor
+
+
+class TestProcessor(unittest.TestCase):
+ """Processor 类的单元测试"""
+
+ def setUp(self):
+ """设置测试环境"""
+ self.processor = Processor()
+ self.sample_tokens = [
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:01",
+ "speaker": "艾莉娅",
+ "content": [
+ {"type": "dialogue", "content": "「测试对话」"}
+ ]
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:05",
+ "speaker": "DM",
+ "content": [
+ {"type": "text", "content": "测试文本"}
+ ]
+ }
+ ]
+
+ def test_init_without_rules(self):
+ """测试无规则初始化"""
+ processor = Processor()
+ self.assertEqual(processor.rules, {})
+
+ def test_init_with_rules(self):
+ """测试带规则初始化"""
+ rules = {"test_rule": "value"}
+ processor = Processor(rules)
+ self.assertEqual(processor.rules, rules)
+
+ def test_process_tokens(self):
+ """测试处理 token 列表"""
+ result = self.processor.process_tokens(self.sample_tokens)
+
+ self.assertIsInstance(result, list)
+ self.assertEqual(len(result), len(self.sample_tokens))
+
+ # 检查处理标记
+ for token in result:
+ if "timestamp" in token:
+ self.assertTrue(token.get("processed"))
+
+ def test_apply_rules(self):
+ """测试应用规则到单个 token"""
+ token = self.sample_tokens[0]
+ result = self.processor.apply_rules(token)
+
+ self.assertIsInstance(result, dict)
+ self.assertIn("timestamp", result)
+ self.assertTrue(result.get("processed"))
+
+ def test_generate_json_output(self):
+ """测试生成 JSON 输出"""
+ output = self.processor.generate_json_output(self.sample_tokens)
+
+ self.assertIsInstance(output, str)
+ self.assertIn("timestamp", output)
+ self.assertIn("speaker", output)
+
+ def test_generate_html_output(self):
+ """测试生成 HTML 输出"""
+ output = self.processor.generate_html_output(self.sample_tokens)
+
+ self.assertIsInstance(output, str)
+ self.assertIn("<html>", output)
+ self.assertIn("</html>", output)
+
+ def test_generate_markdown_output(self):
+ """测试生成 Markdown 输出"""
+ output = self.processor.generate_markdown_output(self.sample_tokens)
+
+ self.assertIsInstance(output, str)
+ self.assertIn("-", output)
+
+ def test_generate_output_json(self):
+ """测试生成输出 - JSON 格式"""
+ output = self.processor.generate_output(self.sample_tokens, "json")
+ self.assertIsInstance(output, str)
+
+ def test_generate_output_html(self):
+ """测试生成输出 - HTML 格式"""
+ output = self.processor.generate_output(self.sample_tokens, "html")
+ self.assertIsInstance(output, str)
+
+ def test_generate_output_markdown(self):
+ """测试生成输出 - Markdown 格式"""
+ output = self.processor.generate_output(self.sample_tokens, "markdown")
+ self.assertIsInstance(output, str)
+
+ def test_generate_output_unsupported_format(self):
+ """测试生成输出 - 不支持的格式"""
+ with self.assertRaises(ValueError) as context:
+ self.processor.generate_output(self.sample_tokens, "pdf")
+
+ self.assertIn("Unsupported format type", str(context.exception))
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/tests/test_renderers.py b/tests/test_renderers.py
new file mode 100644
index 0000000..13e4540
--- /dev/null
+++ b/tests/test_renderers.py
@@ -0,0 +1,114 @@
+#!/usr/bin/env python3
+"""
+Renderers 模块单元测试
+"""
+
+import unittest
+import json
+from conventionalrp.renderers.html_renderer import HTMLRenderer
+from conventionalrp.renderers.json_renderer import JSONRenderer
+from conventionalrp.renderers.markdown_renderer import MarkdownRenderer
+
+
+class TestRenderers(unittest.TestCase):
+ """测试所有渲染器"""
+
+ def setUp(self):
+ """设置测试数据"""
+ self.sample_data = [
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:01",
+ "speaker": "艾莉娅",
+ "content": [
+ {"type": "dialogue", "content": "「测试对话」"}
+ ]
+ },
+ {
+ "type": "metadata",
+ "timestamp": "2025-10-24 14:30:05",
+ "speaker": "DM",
+ "content": [
+ {"type": "text", "content": "测试文本"}
+ ]
+ }
+ ]
+
+ self.dict_data = {
+ "title": "测试标题",
+ "content": "测试内容"
+ }
+
+ def test_html_renderer_basic(self):
+ """测试 HTML 渲染器基本功能"""
+ renderer = HTMLRenderer()
+ output = renderer.render(self.sample_data)
+
+ self.assertIsInstance(output, str)
+ self.assertIn("<html>", output)
+ self.assertIn("</html>", output)
+ self.assertIn("<title>", output)
+
+ def test_html_renderer_set_style(self):
+ """测试 HTML 渲染器设置样式"""
+ renderer = HTMLRenderer()
+ renderer.set_style("custom_style")
+ # 当前实现为占位符,仅测试不抛出异常
+ self.assertIsNotNone(renderer)
+
+ def test_json_renderer_basic(self):
+ """测试 JSON 渲染器基本功能"""
+ renderer = JSONRenderer()
+ output = renderer.render(self.sample_data)
+
+ self.assertIsInstance(output, str)
+
+ # 验证输出是有效的 JSON
+ parsed = json.loads(output)
+ self.assertIsInstance(parsed, list)
+ self.assertEqual(len(parsed), len(self.sample_data))
+
+ def test_json_renderer_unicode(self):
+ """测试 JSON 渲染器处理 Unicode"""
+ renderer = JSONRenderer()
+ output = renderer.render(self.sample_data)
+
+ # 应该保留中文字符
+ self.assertIn("艾莉娅", output)
+ self.assertIn("测试", output)
+
+ def test_markdown_renderer_basic(self):
+ """测试 Markdown 渲染器基本功能"""
+ renderer = MarkdownRenderer()
+ output = renderer.render(self.dict_data)
+
+ self.assertIsInstance(output, str)
+ self.assertIn("##", output) # 应该有标题标记
+ self.assertIn("测试标题", output)
+
+ def test_markdown_renderer_set_style(self):
+ """测试 Markdown 渲染器设置样式"""
+ renderer = MarkdownRenderer()
+ style = {"heading_level": 2}
+ renderer.set_style(style)
+ self.assertEqual(renderer.style, style)
+
+ def test_all_renderers_empty_data(self):
+ """测试所有渲染器处理空数据"""
+ empty_data = []
+
+ html_renderer = HTMLRenderer()
+ html_output = html_renderer.render(empty_data)
+ self.assertIsInstance(html_output, str)
+
+ json_renderer = JSONRenderer()
+ json_output = json_renderer.render(empty_data)
+ self.assertEqual(json_output, "[]")
+
+ md_renderer = MarkdownRenderer()
+ md_output = md_renderer.render({})
+ self.assertEqual(md_output, "")
+
+
+if __name__ == "__main__":
+ unittest.main()
diff --git a/tests/test_rule_extractor.py b/tests/test_rule_extractor.py
new file mode 100644
index 0000000..6c4d585
--- /dev/null
+++ b/tests/test_rule_extractor.py
@@ -0,0 +1,99 @@
+#!/usr/bin/env python3
+"""
+RuleExtractor 模块单元测试
+"""
+
+import unittest
+import tempfile
+import json5
+from pathlib import Path
+from conventionalrp.extractors.rule_extractor import RuleExtractor
+
+
+class TestRuleExtractor(unittest.TestCase):
+ """RuleExtractor 类的单元测试"""
+
+ def setUp(self):
+ """设置测试环境"""
+ # 创建临时规则文件
+ self.temp_rules = tempfile.NamedTemporaryFile(
+ mode='w',
+ suffix='.json5',
+ delete=False,
+ encoding='utf-8'
+ )
+ self.temp_rules.write('''{
+ test_rule: "test_value",
+ metadata: [{type: "test"}],
+ content: [{type: "test_content"}]
+ }''')
+ self.temp_rules.close()
+
+ def tearDown(self):
+ """清理测试环境"""
+ Path(self.temp_rules.name).unlink(missing_ok=True)
+
+ def test_init_without_file(self):
+ """测试不带配置文件的初始化"""
+ extractor = RuleExtractor()
+ self.assertEqual(extractor.rules, {})
+ self.assertIsNone(extractor.config_file)
+
+ def test_init_with_file(self):
+ """测试带配置文件的初始化"""
+ extractor = RuleExtractor(self.temp_rules.name)
+ self.assertIsNotNone(extractor.rules)
+ self.assertIn("test_rule", extractor.rules)
+
+ def test_load_rules_from_file_success(self):
+ """测试成功加载规则文件"""
+ extractor = RuleExtractor()
+ rules = extractor.load_rules_from_file(self.temp_rules.name)
+
+ self.assertIsInstance(rules, dict)
+ self.assertIn("test_rule", rules)
+ self.assertEqual(rules["test_rule"], "test_value")
+
+ def test_load_rules_from_file_not_found(self):
+ """测试加载不存在的文件"""
+ extractor = RuleExtractor()
+ with self.assertRaises(FileNotFoundError):
+ extractor.load_rules_from_file("nonexistent.json5")
+
+ def test_load_rules_empty_file(self):
+ """测试加载空文件"""
+ empty_file = tempfile.NamedTemporaryFile(
+ mode='w',
+ suffix='.json5',
+ delete=False,
+ encoding='utf-8'
+ )
+ empty_file.write('')
+ empty_file.close()
+
+ try:
+ extractor = RuleExtractor()
+ with self.assertRaises(ValueError):
+ extractor.load_rules_from_file(empty_file.name)
+ finally:
+ Path(empty_file.name).unlink(missing_ok=True)
+
+ def test_load_rules_method(self):
+ """测试 load_rules 方法"""
+ extractor = RuleExtractor()
+ rules = extractor.load_rules(self.temp_rules.name)
+
+ self.assertIsInstance(rules, dict)
+ self.assertEqual(extractor.rules, rules)
+
+ def test_extract_method(self):
+ """测试 extract 方法"""
+ extractor = RuleExtractor(self.temp_rules.name)
+ extracted = extractor.extract()
+
+ self.assertIsInstance(extracted, dict)
+ self.assertEqual(extracted, extractor.rules)
+
+
+if __name__ == "__main__":
+ unittest.main()