From 69a6c865c584a87693513e01cce5c2ab44ae92aa Mon Sep 17 00:00:00 2001
From: 简律纯
Date: Wed, 29 Oct 2025 17:42:01 +0800
Subject: refactor: Refactor code structure for improved readability and maintainability

---
 README.md | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

(limited to 'README.md')

diff --git a/README.md b/README.md
index cc5eca6..ca12630 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,7 @@ Conventional Role Play (CVRP) is a Python SDK designed for structured processing
 
 * **Rule Extraction**: Easily extract rules from JSON configuration files using the `RuleExtractor` class.
 * **Multi-format Rendering**: Render outputs in various formats such as HTML, Markdown, and JSON using the respective renderer classes (e.g., `HTMLRenderer`).
+* **THULAC Smart Parser**: 🆕 Intelligent parsing using Tsinghua THULAC (THU Lexical Analyzer for Chinese) for automatic content recognition with minimal configuration. See [THULAC Parser Documentation](docs/THULAC_PARSER.md).
 * **Extensibility**: Create custom plugins to extend the functionality of the SDK. See custom-plugins for details.
 * **Comprehensive API**: Full API documentation available for all modules and classes. See api-documentation.
 
@@ -24,6 +25,8 @@ pip install conventionalrp
 
 ## Basic Usage
 
+### Traditional Parser (Regex-based)
+
 Here is a simple example of how to use the TRPG Log Processor:
 
 ```python
@@ -53,6 +56,38 @@ with open('output.html', 'w') as f:
     f.write(html_output)
 ```
 
+### THULAC Smart Parser
+
+Simplified parsing with automatic content recognition:
+
+```python
+from conventionalrp.core.thulac_parser import THULACParser
+
+# Step 1: Create parser
+parser = THULACParser(seg_only=False)
+
+# Step 2: Load simplified rules (just delimiters!)
+parser.load_rules('examples/rules/thulac_rules.json5')
+
+# Step 3: Parse a line
+text = '[15:30] "Hello!"(waves)'
+result = parser.parse_line(text)
+
+# Result:
+# {
+#   "metadata": {"timestamp": "15:30", "speaker": "Alice"},
+#   "content": [
+#     {"type": "dialogue", "content": "Hello!", "confidence": 1.0},
+#     {"type": "action", "content": "waves", "confidence": 1.0}
+#   ]
+# }
+
+# Step 4: Parse entire log file
+results = parser.parse_log('path/to/log.txt')
+stats = parser.get_statistics()
+print(f"Parsed {stats['total_parsed']} lines")
+```
+
 ## Custom Plugins
 
 To create a custom plugin, you can follow the example provided in
-- 
cgit v1.2.3-70-g09d2
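
The README example added in this patch shows the shape of the `parse_line` result but not how delimiters could map to content types. The sketch below illustrates one plausible delimiter-based classification using only the standard-library `re` module. It is not the `THULACParser` implementation and does not call THULAC. The function name `parse_line_sketch`, the `<Speaker>` tag syntax, and the choice of double quotes for dialogue and parentheses for actions are all assumptions made for illustration. The README's sample input carries no speaker tag, so a hypothetical `<Alice>` prefix is used here to produce a `speaker` field.

```python
import re

# Hypothetical line layout: [HH:MM] <Speaker> body
# The <Speaker> tag is an assumption; the README sample shows only a timestamp.
HEADER = re.compile(
    r'^\[(?P<timestamp>\d{1,2}:\d{2})\]\s*(?:<(?P<speaker>[^>]+)>\s*)?'
)

# Assumed delimiter rules: double quotes mark dialogue, parentheses mark actions.
SEGMENT = re.compile(r'"(?P<dialogue>[^"]*)"|\((?P<action>[^)]*)\)')


def parse_line_sketch(line):
    """Split one log line into metadata and typed content segments."""
    metadata = {}
    body = line
    header = HEADER.match(line)
    if header:
        metadata["timestamp"] = header.group("timestamp")
        if header.group("speaker"):
            metadata["speaker"] = header.group("speaker")
        body = line[header.end():]

    content = []
    for match in SEGMENT.finditer(body):
        # Exactly one named group participates per match; lastgroup names it.
        kind = match.lastgroup
        content.append({"type": kind, "content": match.group(kind)})
    return {"metadata": metadata, "content": content}


result = parse_line_sketch('[15:30] <Alice> "Hello!"(waves)')
print(result)
```

The sketch omits the `confidence` field shown in the README, since a fixed delimiter match has no score to report; a lexical analyzer would presumably be the source of that value.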