<feed xmlns='http://www.w3.org/2005/Atom'>
<title>base-model/src/utils, branch v0.1.9</title>
<subtitle>BaseModel for HydroRoll</subtitle>
<id>https://git.hydroroll.team/base-model/atom?h=v0.1.9</id>
<link rel='self' href='https://git.hydroroll.team/base-model/atom?h=v0.1.9'/>
<link rel='alternate' type='text/html' href='https://git.hydroroll.team/base-model/'/>
<updated>2025-12-30T11:54:08Z</updated>
<entry>
<title>feat: Refactor and enhance TRPG NER model SDK</title>
<updated>2025-12-30T11:54:08Z</updated>
<author>
<name>HsiangNianian</name>
<email>i@jyunko.cn</email>
</author>
<published>2025-12-30T11:54:08Z</published>
<link rel='alternate' type='text/html' href='https://git.hydroroll.team/base-model/commit/?id=575114661ef9afb95df2a211e1d8498686340e6b'/>
<id>urn:sha1:575114661ef9afb95df2a211e1d8498686340e6b</id>
<content type='text'>
- Removed deprecated `word_conll_to_char_conll.py` utility and integrated its functionality into the new `utils` module.
- Introduced a comprehensive GitHub Actions workflow for automated publishing to PyPI and GitHub Releases.
- Added `__init__.py` files to establish the package structure for the `basemodel`, `inference`, `training`, and `utils` modules.
- Implemented model downloading functionality in `download_model.py` to fetch pre-trained ONNX models.
- Developed `TRPGParser` class for ONNX-based inference, including methods for parsing TRPG logs.
- Created training utilities in `training/__init__.py` for NER model training with Hugging Face Transformers.
- Enhanced utility functions for CoNLL file parsing and dataset creation.
- Added a command-line interface for converting CoNLL files to datasets, with validation options.
</content>
</entry>
<entry>
<title>feat: Implement TRPG NER training and inference script with robust model path detection and enhanced timestamp/speaker handling</title>
<updated>2025-12-30T11:14:39Z</updated>
<author>
<name>HsiangNianian</name>
<email>i@jyunko.cn</email>
</author>
<published>2025-12-30T11:14:39Z</published>
<link rel='alternate' type='text/html' href='https://git.hydroroll.team/base-model/commit/?id=7ac684f1f82023c6284cd7d7efde11b8dc98c149'/>
<id>urn:sha1:7ac684f1f82023c6284cd7d7efde11b8dc98c149</id>
<content type='text'>
- Added main training and inference logic in `main.py`, including CoNLL parsing, tokenization, and model training.
- Introduced `TRPGParser` class for inference with entity aggregation and special handling for timestamps and speakers.
- Developed utility functions for converting word-level CoNLL to char-level and saving datasets in various formats.
- Added ONNX export functionality for the trained model.
- Created a comprehensive `requirements.txt` and updated `pyproject.toml` with necessary dependencies.
- Implemented tests for ONNX inference to validate model outputs.
</content>
</entry>
</feed>
