ひらがな)
3. Kanji compounds (熟語)
- **Individual Kanji characters** (字)
---
## So, let me tell you about [Kanji](https://en.wikipedia.org/wiki/Kanji)...
Ancient [logographic](https://en.wikipedia.org/wiki/Logogram) writing system originating from China.
One Kanji ≈ one word/concept. Multiple Kanji ≈ one word/concept.
停 電 停電 電停
[About a thousand Kanji](https://en.wikipedia.org/wiki/Ky%C5%8Diku_kanji) are taught in primary school, and [roughly another thousand](https://en.wikipedia.org/wiki/J%C5%8Dy%C5%8D_kanji) in secondary school.
--
Learning to **recognize**, read, pronounce and write Kanji is difficult.
---
# Kanji Primer
Some simple examples:
月 火 水 木 金 土 日
--
_Combine_ multiple Kanji together to make a single "complex" Kanji:
朋 棚 林 森 明 昌
--
Knowing the number of strokes in each Kanji is helpful:
一 九 土 木 本 旭 体 侍 待 倉
Complex Kanji typically have a higher stroke count.
--
There are thousands of Kanji in use. How can we remember them?!
---
# Getting a Grip on the Kanji
金 + 失 = 鉄
If you can't _say_ it, you can't _use_ it.
How can we refer to these things?
--
We **cheat**. We assign _our own English keyword_ to each Kanji.
The keyword _may_ be related to the "meaning" of the Kanji.
金metal
+
失lose
=
鉄iron
--
We make up _our own story_ to connect the keyword to the Kanji, e.g.:
> Iron is an essential metal for the human body, so don't lose too much of it.
The [Heisig System](https://en.wikipedia.org/wiki/Remembering_the_Kanji_and_Remembering_the_Hanzi): 80% of the time, it works *every* time.
---
# Divide and Conquer
Complex Kanji often consist of simpler _parts_. Some complex examples:
露dew
雪snow
雷thunder
電electric
--
Examples of parts:
雨rain
足foot
各each
夂walking
口mouth
ヨbroom
田field
These **parts** get called different things: [部首](https://ja.wikipedia.org/wiki/%E9%83%A8%E9%A6%96), [radicals](https://kanjialive.com/214-traditional-kanji-radicals), roots, primitive components...
Some parts can be Kanji themselves. Others only occur as part of something else.
---
# Questions
1. How do we **organize** Kanji in a way that's easier to remember?
2. How do we **identify** Kanji that are similar to each other?
3. How do we **break down** complex Kanji into simpler parts?
4. How do we **look up** Kanji that we've never seen before?
5. How do we avoid doing the hard, manual work by ourselves?
---
layout: true
class: inverse, center, middle
---
# [Time for some Python!](https://github.com/mpenkov/heisig/blob/master/demo.ipynb)
.footnote[https://git.io/fA21v]
---
layout: true
name: topleft
---
# Heisig
Covers **2,042** commonly used Kanji, dividing them into **56** _Lessons_.
Example:
```python
{'#heisignumber': 846,
 'indexordinal': 1626,
 'kanji': '鉄',
 'keyword': 'iron',
 'lessonnumber': 25,
 'strokecount': 13}
```
We can now refer to 2,042 Kanji using an unambiguous English _keyword_.
For other Kanji, we have to improvise, but they are relatively rare.
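A minimal sketch of how records like the one above could be indexed for lookup in either direction. The `records` list holds only the single entry shown on this slide, and the names `by_kanji`, `by_keyword` and `keyword_for` are made up for illustration.

```python
# The one record shown on the slide; a full dataset would hold all 2,042 entries.
records = [
    {"#heisignumber": 846, "indexordinal": 1626, "kanji": "鉄",
     "keyword": "iron", "lessonnumber": 25, "strokecount": 13},
]

# Hypothetical lookup tables: Kanji -> record and keyword -> record.
by_kanji = {r["kanji"]: r for r in records}
by_keyword = {r["keyword"]: r for r in records}


def keyword_for(kanji):
    """Return the keyword for a Kanji, or None if it is not in the dataset."""
    record = by_kanji.get(kanji)
    return record["keyword"] if record else None


print(keyword_for("鉄"))            # iron
print(by_keyword["iron"]["kanji"])  # 鉄
print(keyword_for("爨"))            # None: not in the list, so we improvise
```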
---
# [KRAD](http://www.edrdg.org/krad/kradinf.html)
From the [Electronic Dictionary Research & Development Group](http://www.edrdg.org/edrdg/index.html).
Decomposes **6,355** Kanji into **254** unique radicals.
Examples:
```
哀 : 衣 口 亠
愛 : 心 爪 冖 夂
旭 : 日 九
梓 : 十 辛 木 立
圧 : 土 厂
```
Example applications: searching for Kanji by radicals (parts), handwriting recognition, **comparing Kanji**.
Problems: arbitrary order of radicals, no positional information.
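Searching by radicals could look roughly like the sketch below. It parses only the five example lines from this slide. The function names are hypothetical, and skipping lines that start with `#` is an assumption about how comments appear in the real file.

```python
KRAD_SAMPLE = """\
哀 : 衣 口 亠
愛 : 心 爪 冖 夂
旭 : 日 九
梓 : 十 辛 木 立
圧 : 土 厂
"""


def parse_krad(text):
    """Map each Kanji to the set of radicals listed after the colon."""
    table = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        kanji, _, radicals = line.partition(":")
        # A set reflects the fact that the order of radicals carries no meaning.
        table[kanji.strip()] = set(radicals.split())
    return table


def search(table, *radicals):
    """Return the Kanji that contain every one of the given radicals."""
    wanted = set(radicals)
    return [kanji for kanji, parts in table.items() if wanted <= parts]


krad = parse_krad(KRAD_SAMPLE)
print(search(krad, "口"))        # ['哀']
print(search(krad, "木", "立"))  # ['梓']
```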
---
# Radically Confusing
Example of _different_ Kanji that have exactly the same radicals:
日sun + 木tree: 椙 杲 杳 橸
--
More examples:
```
含 吟 合 哈
戎 戈 戉 戔
可 司 叮 哥
押 抽 抻
脅 肋 脇
桂 杜 埜
甲 申 由
細 累 縲
```
Luckily, most of these are pretty rare.
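One way to find such collisions automatically is to group Kanji by their radical set. This is a sketch only: the `radicals` table is hand-written from the 日 + 木 example and the KRAD entry for 旭, and `ambiguous_groups` is a hypothetical helper name.

```python
from collections import defaultdict

# Radical sets as described in the slides; 旭 is included as a non-colliding entry.
radicals = {
    "椙": {"日", "木"},
    "杲": {"日", "木"},
    "杳": {"日", "木"},
    "橸": {"日", "木"},
    "旭": {"日", "九"},
}


def ambiguous_groups(table):
    """Return the groups of Kanji whose radical sets are identical."""
    groups = defaultdict(list)
    for kanji, parts in table.items():
        # frozenset is hashable, so it can serve as a dictionary key.
        groups[frozenset(parts)].append(kanji)
    return [group for group in groups.values() if len(group) > 1]


print(ambiguous_groups(radicals))  # [['椙', '杲', '杳', '橸']]
```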
---
# [CHISE](http://www.chise.org/ids/index.html)
Covers **20,092** Kanji, decomposing each Kanji into smaller parts using [Polish notation](https://en.wikipedia.org/wiki/Polish_notation).
鉄iron
⿰
金metal
失lose
--
嵐storm
⿱
山mountain
風wind
--
亭pavilion
⿱
⿳
亠top-hat
口mouth
冖crown
丁street
--
A total of 12 layouts, each written with its own Unicode Ideographic Description Character:
⿰ ⿱ ⿸ ⿺ ⿵ ⿳ ⿴ ⿹ ⿲ ⿷ ⿻ ⿶
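Because the notation is prefix (Polish), a short recursive parser is enough to turn a sequence into a tree. The sketch below assumes every part is a single character. Real data may contain parts that are not, and those are not handled here. The names `ARITY`, `parse_ids` and `leaves` are invented for this example.

```python
# Number of operands per layout character: ⿲ and ⿳ take three, the rest take two.
ARITY = {char: 2 for char in "⿰⿱⿴⿵⿶⿷⿸⿹⿺⿻"}
ARITY.update({"⿲": 3, "⿳": 3})


def parse_ids(sequence):
    """Parse a prefix-notation sequence into nested (layout, parts) tuples."""
    def parse(pos):
        char = sequence[pos]
        pos += 1
        if char not in ARITY:
            return char, pos  # a leaf: an ordinary part
        parts = []
        for _ in range(ARITY[char]):
            part, pos = parse(pos)
            parts.append(part)
        return (char, parts), pos

    tree, end = parse(0)
    if end != len(sequence):
        raise ValueError("trailing characters in " + sequence)
    return tree


def leaves(tree):
    """Flatten a parsed tree into its leaf parts, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for part in tree[1] for leaf in leaves(part)]


print(parse_ids("⿰金失"))           # ('⿰', ['金', '失'])
print(parse_ids("⿱⿳亠口冖丁"))     # ('⿱', [('⿳', ['亠', '口', '冖']), '丁'])
print(leaves(parse_ids("⿱⿳亠口冖丁")))  # ['亠', '口', '冖', '丁']
```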
---
# Encountering the Unknown
Kanji search challenge: look up the characters below in a dictionary.
顳 鸚 爨 驪 纜 鱸 鸛
--
When you stumble upon Kanji you don't recognize, your options include:
1. Copy-paste into a dictionary (trivial)
--
2. Enter it using the IME (easy)
--
3. Look it up in a Kanji dictionary using the _main radical_ (tricky)
--
4. Estimate the stroke count, narrow down the results (tedious)
--
5. Enter it via handwriting recognition (tedious)
--
6. Ask a friend (potentially annoying for your Kanji-literate friend)
--
7. Give up. Pretend that you never saw it and move on with life (unrewarding)
--
Armed with Python and the knowledge from the previous slides, you have more options:
8. If you recognize _any_ of the radicals, look them up via KRAD
9. If you recognize any of the parts, look them up via CHISE
耳ear
頁page
貝shellfish
女woman
鳥bird
木tree
大large
火fire
糸thread
---
# Let's talk about [Graphs](https://en.wikipedia.org/wiki/Graph_theory)
Labeled vs unlabeled graphs
Connected components
Subgraphs
Application of graphs to Kanji visualization
---
# Unlabeled Graph Example
The graph below has 26 *nodes* (vertices, points) and 17 *edges* (arcs, lines).
---
# Connected Components
There are 9 *connected components* (clusters, groups) in the graph below.
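Counting components by hand gets old quickly. A breadth-first search does it in a few lines. The toy graph below is hypothetical and is not the one pictured on the slide.

```python
from collections import deque


def connected_components(nodes, edges):
    """Return a list of components, each a list of nodes, using BFS."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        queue, component = deque([start]), []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        components.append(component)
    return components


# Toy graph: 6 nodes, 3 edges.
nodes = list(range(6))
edges = [(0, 1), (1, 2), (3, 4)]
parts = connected_components(nodes, edges)
print(len(parts))  # 3
```

For a graph without cycles, the number of components equals nodes minus edges (here 6 - 3 = 3). If the pictured graph has no cycles, 26 - 17 = 9 is consistent with the count above.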
---
# Labeled Graph Example
Each node corresponds to a Kanji. Edges connect related Kanji.
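What counts as "related" is a design choice. The sketch below assumes one possible definition: two Kanji are related if they share at least one radical. The `radicals` table is hand-copied from the KRAD examples in earlier slides, and `build_edges` is a hypothetical name.

```python
from itertools import combinations

# Radical sets from the earlier slides.
radicals = {
    "旭": {"日", "九"},
    "椙": {"日", "木"},
    "杲": {"日", "木"},
    "梓": {"十", "辛", "木", "立"},
    "哀": {"衣", "口", "亠"},
}


def build_edges(table):
    """Connect each pair of Kanji sharing a radical; label the edge with what they share."""
    edges = {}
    for a, b in combinations(table, 2):
        shared = table[a] & table[b]
        if shared:
            edges[(a, b)] = shared
    return edges


edges = build_edges(radicals)
for (a, b), shared in edges.items():
    print(a, b, sorted(shared))
```

Under this definition 哀 shares nothing with the others, so it stays an isolated node.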
---
# [Subgraph](https://en.wikipedia.org/wiki/Glossary_of_graph_theory_terms#subgraph) Example
A subgraph containing only 16 nodes. Helps simplify complex graphs.
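Extracting a subgraph can be as simple as keeping only the edges whose endpoints are both in a chosen set of nodes (an "induced" subgraph). This is a sketch with a small hand-made edge list. The pairs connect Kanji that share 日 or 木 according to the radical lists in earlier slides, and the function name is made up.

```python
def induced_subgraph(edges, keep):
    """Keep only the edges whose two endpoints are both in `keep`."""
    keep = set(keep)
    return [(a, b) for a, b in edges if a in keep and b in keep]


# Toy edge list.
edges = [("旭", "椙"), ("旭", "杲"), ("椙", "杲"), ("椙", "梓"), ("杲", "梓")]

# Focus on the three Kanji that contain 日.
print(induced_subgraph(edges, ["旭", "椙", "杲"]))
# [('旭', '椙'), ('旭', '杲'), ('椙', '杲')]
```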
---
# Demo Time
Visually identify related Kanji
Study related Kanji together
Reinforce "problematic" Kanji
---
# Conclusion
## Stuff We Talked About
- Brief introduction to Kanji
- Kanji resources: Heisig, KRAD, CHISE
- Basic graph theory: nodes, edges, connected components, subgraphs
- Putting it all together: visualizing Kanji graphs
--
## Future Work
- Kanji pronunciation
- Identifying ambiguous keywords
超exceed
越overtake
追chase
逐pursue
収income
稼earnings
給salary
--
- Application to words (Kanji compounds)
- Introduce directed graphs
Want to collaborate? Talk to me!
---
name: last-page
template: inverse
## Thank you for listening!
ご静聴ありがとうございました。
Michael Penkov
@mpenkov
.footnote[
slides: https://git.io/fA2PP
notebook: https://git.io/fA21v
demo: https://kanji.now.sh
]