Create README.md
Commit c23db8c50e (parent 5aa954bf2d): adds `README.md` as a new file, 47 lines.

# jitenbot
Jitenbot is a program for scraping Japanese dictionary websites and
compiling the scraped data into compact dictionary file formats.

### Supported Dictionaries
* Online
    * [四字熟語辞典オンライン](https://yoji.jitenon.jp/)
    * [故事・ことわざ・慣用句オンライン](https://kotowaza.jitenon.jp/)
* Offline
    * [新明解国語辞典 第八版](https://www.monokakido.jp/ja/dictionaries/smk8/index.html)
    * [大辞林 第四版](https://www.monokakido.jp/ja/dictionaries/daijirin2/index.html)

### Supported Output Formats

* [Yomichan](https://github.com/foosoft/yomichan)

# Usage
```
usage: jitenbot [-h] [-p PAGE_DIR] [-i IMAGE_DIR]
                {jitenon-yoji,jitenon-kotowaza,smk8,daijirin2}

Convert Japanese dictionary files to new formats.

positional arguments:
  {jitenon-yoji,jitenon-kotowaza,smk8,daijirin2}
                        name of dictionary to convert

options:
  -h, --help            show this help message and exit
  -p PAGE_DIR, --page-dir PAGE_DIR
                        path to directory containing XML page files
  -i IMAGE_DIR, --image-dir IMAGE_DIR
                        path to directory containing image folders (gaiji,
                        graphics, etc.)
```
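The help text above has the shape of Python `argparse` output. As a rough sketch of how such an interface could be declared (not jitenbot's actual source), the parser below reproduces the same positional choices and flags. The function name `build_parser` and the attribute name `target` are assumptions made for illustration.

```python
import argparse

# Dictionary names copied from the usage text above.
TARGETS = ["jitenon-yoji", "jitenon-kotowaza", "smk8", "daijirin2"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --help output resembles the usage text above."""
    parser = argparse.ArgumentParser(
        prog="jitenbot",
        description="Convert Japanese dictionary files to new formats.",
    )
    parser.add_argument(
        "target",  # hypothetical attribute name; the help text does not reveal it
        choices=TARGETS,
        help="name of dictionary to convert",
    )
    parser.add_argument(
        "-p", "--page-dir",
        help="path to directory containing XML page files",
    )
    parser.add_argument(
        "-i", "--image-dir",
        help="path to directory containing image folders (gaiji, graphics, etc.)",
    )
    return parser


if __name__ == "__main__":
    # Example: an offline target with both directories supplied.
    args = build_parser().parse_args(["smk8", "-p", "pages", "-i", "images"])
    print(args.target, args.page_dir, args.image_dir)
```

Both flags are optional at the parser level in this sketch, so any check that an offline target actually received its directories would have to happen after parsing.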
### Online Targets
Jitenbot scrapes the target website and saves the pages to the [user's cache directory](https://pypi.org/project/platformdirs/).
As a courtesy to the website owners, jitenbot is configured to pause for 10 seconds between page requests. Consequently,
a complete crawl of a target website may take several hours.

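The crawl behaviour described above (cache each page, pause between requests) can be sketched as follows. This is an illustration under stated assumptions, not jitenbot's implementation: the names `crawl` and `fetch_page`, the use of `urllib`, and the percent-encoded cache file names are all hypothetical. Only the 10-second delay comes from the text.

```python
import time
from pathlib import Path
from typing import Callable, Iterable
from urllib.parse import quote
from urllib.request import urlopen

REQUEST_DELAY_SECONDS = 10  # pause between page requests, as stated above


def fetch_page(url: str) -> bytes:
    """Download one page (plain urllib here; jitenbot may use another client)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def crawl(
    urls: Iterable[str],
    cache_dir: Path,
    fetch: Callable[[str], bytes] = fetch_page,
    delay: float = REQUEST_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Fetch every URL that is not cached yet; return the number of requests made."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    requests_made = 0
    for url in urls:
        # Hypothetical naming scheme: the percent-encoded URL is the file name.
        cached = cache_dir / quote(url, safe="")
        if cached.exists():
            continue  # already downloaded, so no request and no pause
        if requests_made > 0:
            sleep(delay)  # courtesy pause before every request after the first
        cached.write_bytes(fetch(url))
        requests_made += 1
    return requests_made
```

A cache location could be obtained with `platformdirs.user_cache_dir("jitenbot")`, where the application name is an assumption. With a 10-second pause, a hypothetical site of 1,000 pages would need about 10,000 seconds, or roughly 2.8 hours, which is consistent with the "several hours" estimate. Because cached pages are skipped, an interrupted crawl can resume without repeating requests.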
### Offline Targets
Page data and image data must be supplied by the user and passed to jitenbot via the `--page-dir` and `--image-dir` command line flags.

# Attribution
`Adobe-Japan1_sequences.txt` is provided by [The Adobe-Japan1-7 Character Collection](https://github.com/adobe-type-tools/Adobe-Japan1).