Reader - Reader crashes after reading some special characters #55

Open
opened 2023-10-26 05:57:31 +00:00 by DSRLIN · 2 comments
Contributor

Currently, the reader in artemis uses the default parser provided by ElementTree.


This causes the reader to crash when the data contains an unescaped special character such as &.


e.g. 曲:矢鴇つかさ & 脇眞富(Arte Refact)/歌:星咲 あかり(CV:赤尾 ひかる)、高瀬 梨緒(CV:久保 ユリカ)


Running the reader on a file that contains this line raises an error.
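A minimal reproduction of the failure with the standard-library parser might look like the sketch below. The XML fragment and its tag names are made up for illustration; the real data files are not shown in this issue.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only the unescaped "&" matters here.
bad_xml = "<artistName><str>Victor Kong & Yukino</str></artistName>"

try:
    ET.fromstring(bad_xml)
    error = None
except ET.ParseError as exc:
    # A bare "&" is not well-formed XML, so the default parser rejects it.
    error = exc

print(type(error).__name__, error)
```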


The solution is to create a parser with recovery enabled first and pass it in whenever a tree root is parsed.

    from lxml import etree  # recover=True is an lxml option, not a stdlib ElementTree one
    parser = etree.XMLParser(recover=True)
    troot = etree.fromstring(f.read(), parser=parser)

I've tested this on "Victor Kong & Yukino", "Jehezukiel (feat.Sagi & KURORAK)", "矢鴇つかさ & 脇眞富", "Katzeohr & Spiegel" and other strings, and it works fine.
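If adding a dependency is not an option, a different technique from the recovering parser proposed above is to escape bare ampersands before handing the text to the standard-library parser. The sketch below is only an illustration: `parse_lenient` and `BARE_AMP` are hypothetical names, and it assumes the file has already been decoded to `str`.

```python
import re
import xml.etree.ElementTree as ET

# Matches "&" that does NOT start an entity or character reference
# such as "&amp;", "&#38;" or "&#x26;".
BARE_AMP = re.compile(r"&(?!(?:[A-Za-z_][A-Za-z0-9_.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")


def parse_lenient(text: str) -> ET.Element:
    """Escape bare ampersands, then parse with the stdlib parser (hypothetical helper)."""
    return ET.fromstring(BARE_AMP.sub("&amp;", text))


root = parse_lenient("<name><str>Katzeohr & Spiegel</str></name>")
print(root.findtext("str"))
```

This only handles stray ampersands. It does not repair other malformed markup, and it would also rewrite an ampersand inside a CDATA section, so it is a narrower fix than a recovering parser.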
Collaborator

I do remember seeing this issue. I think it only happened with Chunithm and Ongeki.

Feel free to push a PR and I'll give it a test.


I might be seeing the same issue when importing Chunithm SUN SDHD_2.10.01_20220913130916_0 with a clean DB:

    python read.py --series SDHD --version 13 --binfolder d:\ChunithmSUN --optfolder d:\ChunithmSUN\Option

    [2023-11-12 21:42:10] Reader | INFO | Inserted event 1376
    Traceback (most recent call last):
      File "D:\services\artemis\read.py", line 139, in <module>
        handler.read()
      File "D:\services\artemis\titles\chuni\read.py", line 41, in read
        self.read_events(f"{dir}/event")
      File "D:\services\artemis\titles\chuni\read.py", line 128, in read_events
        xml_root = ET.fromstring(strdata)
      File "C:\Program Files\Python310\lib\xml\etree\ElementTree.py", line 1342, in XML
        parser.feed(text)
    xml.etree.ElementTree.ParseError: mismatched tag: line 70, column 6
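To check whether a failure like this one comes from the same cause, it can help to print the line the parser stopped on. The sketch below uses the `position` attribute of `ParseError`, which holds the line and column of the error. `locate_parse_error` is a hypothetical helper and the sample document is made up; neither comes from artemis.

```python
import xml.etree.ElementTree as ET


def locate_parse_error(text: str):
    """Return (line, column, offending_line) if parsing fails, else None."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        line, column = exc.position  # line numbers start at 1
        return line, column, text.split("\n")[line - 1]
    return None


# Hypothetical sample with a bare "&" on its third line.
sample = "<root>\n  <a>ok</a>\n  <b>Sagi & KURORAK</b>\n</root>"
result = locate_parse_error(sample)
print(result)
```

Note that the reported line is where the parser gave up, which for errors such as mismatched tags may be later than the place where the markup first went wrong.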
Midorica added the bug label 2023-11-13 14:54:49 +00:00
Reference: Hay1tsme/artemis#55