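I like programming because it satisfies my curiosity and my creativity.
On January 12 of this year I started writing a parser, more or less on a whim.
After about a month and a half it is finally (more or less) finished, so here is a "demonstration" that doubles as a bug hunt.
There are four demos, numbered 1 to 4.
First, a very simple HTML document.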
<html>
<head>
</head>
<body>
</body>
</html>
Parse result:
Object: [html]
[0] Object: [kakko<>]
[0] "<"
[1] "html"
[2] ">"
[1] "\n"
[2] Object: [kakko<>]
[0] "<"
[1] "head"
[2] ">"
[3] "\n"
[4] Object: [kakko<>]
[0] "<"
[1] "/head"
[2] ">"
[5] "\n"
[6] Object: [kakko<>]
[0] "<"
[1] "body"
[2] ">"
[7] "\n"
[8] Object: [kakko<>]
[0] "<"
[1] "/body"
[2] ">"
[9] "\n"
[10] Object: [kakko<>]
[0] "<"
[1] "/html"
[2] ">"
[11] "\n"
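Each Object is given a name (a type name) such as [html] or [kakko<>] ("kakko" is Japanese for "bracket").
The children of an Object are stored in an array, which gives the tree its parent-child hierarchy.
Anything in double quotes, such as "<", is a string. The tree is not nested according to the HTML element structure.
Instead, each HTML tag becomes its own object.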
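The node structure described above can be sketched in a few lines. This is only an illustration under assumptions: createNode and dump are hypothetical names, the fields (name, parent, children) are modeled on the createStatement function that shows up in the self-parse output later in this article, and the indentation of the printout is my own choice.

```javascript
// Hypothetical node factory, modeled on createStatement( name, parent ).
function createNode(name, parent) {
  const node = { name, parent, children: [] };
  if (parent) parent.children.push(node); // register with the parent
  return node;
}

// Render the children of a node, one line per array entry.
function dumpChildren(node, indent) {
  const lines = [];
  node.children.forEach((child, i) => {
    if (typeof child === "string") {
      // Strings are shown in double quotes with escapes such as \n visible.
      lines.push(`${indent}[${i}] ${JSON.stringify(child)}`);
    } else {
      lines.push(`${indent}[${i}] Object: [${child.name}]`);
      lines.push(...dumpChildren(child, indent + "    "));
    }
  });
  return lines;
}

function dump(root) {
  return [`Object: [${root.name}]`, ...dumpChildren(root, "    ")].join("\n");
}

// Build the first two entries of the simple HTML example by hand.
const root = createNode("html", null);
const tag = createNode("kakko<>", root);
tag.children.push("<", "html", ">");
root.children.push("\n");
console.log(dump(root));
```

The printout has the same shape as the parse results in this article: an object line with its type name, followed by its indexed children, which are either strings or nested objects.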
Next, let's add some JavaScript to this HTML.
<html>
<head>
<script>
function test( abc ) {
alert( abc );
}
</script>
</head>
<body>
</body>
</html>
Parse result:
Object: [html]
[0] Object: [kakko<>]
[0] "<"
[1] "html"
[2] ">"
[1] "\n\t"
[2] Object: [kakko<>]
[0] "<"
[1] "head"
[2] ">"
[3] "\n\t\t"
[4] Object: [kakko<>]
[0] "<"
[1] "script"
[2] ">"
[5] Object: [javascript]
[0] "\n\t\t\t"
[1] Object: [functionStatement]
[0] "function test"
[1] Object: [kakko()]
[0] "("
[1] " abc "
[2] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\t\t\t\talert"
[2] Object: [kakko()]
[0] "("
[1] " abc "
[2] ")"
[3] ";"
[4] "\n\t\t\t"
[5] "}"
[2] "\n\t\t"
[6] Object: [kakko<>]
[0] "<"
[1] "/script"
[2] ">"
[7] "\n\t"
[8] Object: [kakko<>]
[0] "<"
[1] "/head"
[2] ">"
[9] "\n\t"
[10] Object: [kakko<>]
[0] "<"
[1] "body"
[2] ">"
[11] "\n\t"
[12] Object: [kakko<>]
[0] "<"
[1] "/body"
[2] ">"
[13] "\n"
[14] Object: [kakko<>]
[0] "<"
[1] "/html"
[2] ">"
[15] "\n"
The part enclosed in SCRIPT tags is recognized as JavaScript, so the parser can handle "a language inside another language".
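One way to picture this switch between languages is the sketch below. It is an assumption-laden illustration, not the parser from this article: splitHtml is a hypothetical name, tags are found with a plain indexOf scan, and the script body is kept as one raw string instead of being parsed further into statements.

```javascript
// Split HTML text into tag objects and strings. After a <script> tag,
// everything up to "</script" is handed to a [javascript] object.
function splitHtml(src) {
  const root = { name: "html", children: [] };
  let pos = 0;
  while (pos < src.length) {
    const open = src.indexOf("<", pos);
    if (open === -1) {            // no more tags: the rest is plain text
      root.children.push(src.slice(pos));
      break;
    }
    if (open > pos) root.children.push(src.slice(pos, open));
    const close = src.indexOf(">", open);
    if (close === -1) throw new SyntaxError("unexpected end of input (expected >)");
    const inner = src.slice(open + 1, close);
    root.children.push({ name: "kakko<>", children: ["<", inner, ">"] });
    pos = close + 1;
    if (/^script\b/i.test(inner)) {
      // Switch languages: "<" inside the script must not be read as a tag.
      const end = src.toLowerCase().indexOf("</script", pos);
      const stop = end === -1 ? src.length : end;
      root.children.push({ name: "javascript", children: [src.slice(pos, stop)] });
      pos = stop;
    }
  }
  return root;
}

const page = splitHtml("<html><script>if (a < b) f();</script></html>");
console.log(page.children.map(c => (typeof c === "string" ? c : c.name)));
```

The point of the example is the "a < b" inside the script: because the scanner has switched to the JavaScript region, that "<" is not mistaken for the start of a tag.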
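The program is written along the lines of: "once function appears, a () marking the parameter area and a {} marking the body should follow."
When you actually run code in a programming language, you often see an error message like
ERROR: unexpected ~ (~ was not expected)
and I think most people have run into it.
While writing this parser I realized that in "language processing", whether in a compiler or a parser, there is always this common behavior of some word or symbol being "expected" next.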
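The idea of "expecting" the next token can be sketched as follows. This is a minimal illustration under assumptions: tokenize, checkFunctionStatement, expect and skipBalanced are hypothetical names, the tokenizer only knows identifiers and single characters, and the error wording simply imitates the message quoted above.

```javascript
// Identifiers become one token each; any other non-space character is a token.
function tokenize(src) {
  return src.match(/[A-Za-z_$][\w$]*|\S/g) || [];
}

// Check that the source has the shape: function NAME ( ... ) { ... }
function checkFunctionStatement(src) {
  const tokens = tokenize(src);
  let pos = 0;

  // Consume one token, or complain that it was not the expected one.
  const expect = (test, what) => {
    const tok = tokens[pos];
    if (tok === undefined || !test(tok)) {
      const found = tok === undefined ? "end of input" : tok;
      throw new SyntaxError(`unexpected ${found} (expected ${what})`);
    }
    pos++;
    return tok;
  };

  // Consume a bracketed area, allowing nested brackets of the same kind.
  const skipBalanced = (open, close) => {
    expect(t => t === open, open);
    let depth = 1;
    while (depth > 0) {
      const tok = expect(() => true, close);
      if (tok === open) depth++;
      else if (tok === close) depth--;
    }
  };

  expect(t => t === "function", "function");
  const name = expect(t => /^[A-Za-z_$]/.test(t), "a function name");
  skipBalanced("(", ")"); // parameter area
  skipBalanced("{", "}"); // body area
  return name;
}

console.log(checkFunctionStatement("function test( abc ) { alert( abc ); }"));
```

Calling checkFunctionStatement("function test { }") throws a SyntaxError whose message starts with "unexpected {", because a "(" was expected at that position.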
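The problem I raised in the previous article was this:
if( A ) if( B ) test4(); else test5(); else test6();
Here the inner part, if( B ) test4(); else test5(); (highlighted in pink in the original), is a single statement from the point of view of if( A ), and the question was how to get the parser to recognize it as such.
Parse result: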
Object: [javascript]
[0] Object: [ifStatement]
[0] "if"
[1] Object: [kakko()]
[0] "("
[1] " A "
[2] ")"
[2] " "
[3] Object: [ifStatement]
[0] "if"
[1] Object: [kakko()]
[0] "("
[1] " B "
[2] ")"
[2] " test4"
[3] Object: [kakko()]
[0] "("
[1] ")"
[4] ";"
[5] " "
[6] Object: [elseStatement]
[0] "else"
[1] " test5"
[2] Object: [kakko()]
[0] "("
[1] ")"
[3] ";"
[4] " "
[5] Object: [elseStatement]
[0] "else"
[1] " test6"
[2] Object: [kakko()]
[0] "("
[1] ")"
[3] ";"
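if( B ) sits inside the level of if( A ), the else clause that belongs to if( B ) is inside the level of if( B ), and after that the else clause that belongs to if( A ) follows at the level of if( A ).
So the nesting is recognized correctly.
As for how it is recognized, the parser performs steps such as [...].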
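For comparison, here is how a conventional recursive-descent parser attaches each else to the nearest if. This is a generic textbook technique shown under assumptions, not necessarily the method used in this article: parseStatement is a hypothetical name, conditions are limited to a single identifier, statements are limited to calls without arguments, and the else branch is stored in a field instead of as a child entry.

```javascript
function tokenize(src) {
  return src.match(/[A-Za-z_$][\w$]*|\S/g) || [];
}

// Parse one statement starting at tokens[pos]; return [node, nextPos].
function parseStatement(tokens, pos) {
  if (tokens[pos] === "if") {
    // Condition: "(" IDENT ")"
    if (tokens[pos + 1] !== "(" || tokens[pos + 3] !== ")") {
      throw new SyntaxError(`unexpected ${tokens[pos + 1]}`);
    }
    const node = { type: "if", cond: tokens[pos + 2], then: null, else: null };
    pos += 4;
    // The body is exactly one statement, which may itself be an if.
    [node.then, pos] = parseStatement(tokens, pos);
    // An else seen right after the body belongs to this if. The inner if
    // returns first, so it gets the first chance to claim an else.
    if (tokens[pos] === "else") {
      [node.else, pos] = parseStatement(tokens, pos + 1);
    }
    return [node, pos];
  }
  // Call statement: IDENT "(" ")" ";"
  if (tokens[pos + 1] !== "(" || tokens[pos + 2] !== ")" || tokens[pos + 3] !== ";") {
    throw new SyntaxError(`unexpected ${tokens[pos + 1]}`);
  }
  return [{ type: "call", name: tokens[pos] }, pos + 4];
}

const src = "if( A ) if( B ) test4(); else test5(); else test6();";
const [tree] = parseStatement(tokenize(src), 0);
console.log(JSON.stringify(tree, null, 2));
```

In the resulting tree, test5 ends up as the else of if( B ) and test6 as the else of if( A ), which matches the nesting in the parse result above.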
"A compiler that compiles a compiler" is something you hear about often, and this parser can likewise "parse itself".
This is where its real ability is put to the test.
Parse result:
Object: [html]
[0] Object: [kakko<>]
[0] "<"
[1] "!DOCTYPE html"
[2] ">"
[1] "\n"
[2] Object: [kakko<>]
[0] "<"
[1] "html"
[2] ">"
[3] "\n"
[4] Object: [kakko<>]
[0] "<"
[1] "head"
[2] ">"
[5] "\n"
(omitted)
[6] Object: [kakko<>]
[0] "<"
[1] "title"
[2] ">"
[7] Object: [kakko<>]
[0] "<"
[1] "/title"
[2] ">"
[8] "\n"
[9] Object: [kakko<>]
[0] "<"
[1] "meta content="
[2] Object: [doubleQuote]
[0] """
[1] "text/html; charset=UTF-8"
[2] """
[3] " http-equiv="
[4] Object: [doubleQuote]
[0] """
[1] "content-type"
[2] """
[5] ">"
[10] "\n"
[11] Object: [kakko<>]
[0] "<"
[1] "script"
[2] ">"
[12] Object: [javascript]
[0] "\nconsole.clear"
[1] Object: [kakko()]
[0] "("
[1] ")"
[2] ";"
[3] "\naddEventListener"
[4] Object: [kakko()]
[0] "("
[1] " "
[2] Object: [doubleQuote]
[0] """
[1] "load"
[2] """
[3] ", "
[4] Object: [functionExpression]
[0] "function"
[1] Object: [kakko()]
[0] "("
[1] " e "
[2] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\tlet div = document.createElement"
[2] Object: [kakko()]
[0] "("
[1] " "
[2] Object: [doubleQuote]
[0] """
[1] "div"
[2] """
[3] " "
[4] ")"
[3] ";"
[4] "\n\t\tObject.assign"
[5] Object: [kakko()]
[0] "("
[1] " div.style, "
[2] Object: [kakko{}]
[0] "{"
[1] "\n\t\t\tposition : "
[2] Object: [doubleQuote]
[0] """
[1] "fixed"
[2] """
[3] ",\n\t\t\ttop : 0,\n\t\t\tright : 0,\n\t\t\tfontSize : "
[4] Object: [doubleQuote]
[0] """
[1] "small"
[2] """
[5] ",\n\t\t\tpadding : "
[6] Object: [doubleQuote]
[0] """
[1] ".5em"
[2] """
[7] ",\n\t\t\tcolor : "
[8] Object: [doubleQuote]
[0] """
[1] "gray"
[2] """
[9] ",\n\t\t"
[10] "}"
[3] " "
[4] ")"
[6] ";"
[7] "\n\t\tdiv.innerHTML = decodeURI"
[8] Object: [kakko()]
[0] "("
[1] " location.href.match"
[2] Object: [kakko()]
[0] "("
[1] " "
[2] Object: [regExp]
[0] "/"
[1] "[^\/]+\/[^\/]+$"
[2] "/"
[3] " "
[4] ")"
[3] " "
[4] ")"
[9] ";"
[10] "\n\tdocument.body.appendChild"
[11] Object: [kakko()]
[0] "("
[1] " div "
[2] ")"
[12] ";"
[13] "\n"
[14] "}"
[5] " "
[6] ")"
[5] ";"
[6] "\n\n\n"
[7] Object: [functionStatement]
[0] "function createStatement"
[1] Object: [kakko()]
[0] "("
[1] " name, parent "
[2] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\tlet statement = "
[2] Object: [kakko{}]
[0] "{"
[1] "\n\t\tname : name,\n\t\tparent : parent,\n\t\tchildren : new Array"
[2] Object: [kakko()]
[0] "("
[1] ")"
[3] ",\n\t"
[4] "}"
[3] "\n\treturn statement"
[4] ";"
[5] "\n"
[6] "}"
[8] "\n\n"
[9] Object: [functionStatement]
[0] "function allEscape"
[1] Object: [kakko()]
[0] "("
[1] " string "
[2] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\t\t"
[2] Object: [lineComment]
[0] "//"
[1] "check."
[2] "\n"
[3] "\t\t"
[4] Object: [ifStatement]
[0] "if"
[1] Object: [kakko()]
[0] "("
[1] " typeof string === "
[2] Object: [doubleQuote]
[0] """
[1] "undefined"
[2] """
[3] " "
[4] ")"
[2] " return"
[3] ";"
[5] "\n\t\t\n\tlet res = "
[6] Object: [doubleQuote]
[0] """
[1] """
[7] ";"
[8] "\n\t"
[9] Object: [forStatement]
[0] "for"
[1] Object: [kakko()]
[0] "("
[1] " let ch of string "
[2] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\t\t"
[2] Object: [ifStatement]
[0] "if"
[1] Object: [kakko()]
[0] "("
[1] " "
[2] Object: [doubleQuote]
[0] """
[1] "[](){}/*.+|^$\\"
[2] """
[3] ".indexOf"
[4] Object: [kakko()]
[0] "("
[1] " ch "
[2] ")"
[5] " > -1 "
[6] ")"
[2] "\n\t\t\tres += "
[3] Object: [doubleQuote]
[0] """
[1] "\\"
[2] """
[4] " + ch"
[5] ";"
[6] "\n\t\t"
[7] Object: [elseStatement]
[0] "else"
[1] "\n\t\t\tres += ch"
[2] ";"
[3] "\n\t"
[4] "}"
[10] "\n\treturn res"
[11] ";"
[12] "\n"
[13] "}"
[10] "\n\n\n"
[11] Object: [functionStatement]
[0] "function AL"
[1] Object: [kakko()]
[0] "("
[1] " array "
[2] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\treturn array"
[2] Object: [kakko[]]
[0] "["
[1] " array.length - 1 "
[2] "]"
[3] ";"
[4] "\n"
[5] "}"
[12] "\n\n"
[13] Object: [functionStatement]
[0] "function LOMToText"
[1] Object: [kakko()]
[0] "("
[1] " LOM "
[2] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\tlet text = "
[2] Object: [doubleQuote]
[0] """
[1] """
[3] ";"
[4] "\n\t"
[5] Object: [forStatement]
[0] "for"
[1] Object: [kakko()]
[0] "("
[1] " let child of LOM.children "
[2] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\t\t"
[2] Object: [ifStatement]
[0] "if"
[1] Object: [kakko()]
[0] "("
[1] " typeof child === "
[2] Object: [doubleQuote]
[0] """
[1] "string"
[2] """
[3] " "
[4] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\t\t\ttext += child"
[2] ";"
[3] "\n\t\t"
[4] "}"
[4] " "
[5] Object: [elseStatement]
[0] "else"
[1] " "
[2] Object: [kakko{}]
[0] "{"
[1] "\n\t\t\ttext += LOMToText"
[2] Object: [kakko()]
[0] "("
[1] " child "
[2] ")"
[3] ";"
[4] "\n\t\t"
[5] "}"
[3] "\n\t"
[4] "}"
[6] "\n\treturn text"
[7] ";"
[8] "\n"
[9] "}"
[14] "\n\n\n"
[15] Object: [lineComment]
[0] "//"
[1] "===●●● main."
[2] "\n"
[16] Object: [functionStatement]
[0] "function run"
[1] Object: [kakko()]
[0] "("
[1] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\n\tlet ta0 = document.getElementById"
[2] Object: [kakko()]
[0] "("
[1] " "
[2] Object: [doubleQuote]
[0] """
[1] "ta0"
[2] """
[3] " "
[4] ")"
[3] ";"
[4] "\n\t\tta0.style.backgroundColor = "
[5] Object: [doubleQuote]
[0] """
[1] "navy"
[2] """
[6] ";"
[7] "\n\t\tta0.style.color = "
[8] Object: [doubleQuote]
[0] """
[1] "white"
[2] """
[9] ";"
[10] "\n\n\tLOM = textToLOM"
[11] Object: [kakko()]
[0] "("
[1] " text0.value, ta0.value "
[2] ")"
[12] ";"
[13] "\n\n\tconsole.log"
[14] Object: [kakko()]
[0] "("
[1] " "
[2] Object: [doubleQuote]
[0] """
[1] "//=== ここから debug1"
[2] """
[3] " "
[4] ")"
[15] ";"
[16] "\n\ttaLog.value = "
[17] Object: [doubleQuote]
[0] """
[1] """
[18] ";"
[19] "\n\tlog = "
[20] Object: [functionExpression]
[0] "function"
[1] Object: [kakko()]
[0] "("
[1] ")"
[2] " "
[3] Object: [kakko{}]
[0] "{"
[1] "\n\t\tlet array = Array.from"
[2] Object: [kakko()]
[0] "("
[1] " arguments "
[2] ")"
[3] ";"
[4] "\n\t\ttaLog.value += array.join"
[5] Object: [kakko()]
[0] "("
[1] " "
[2] Object: [doubleQuote]
[0] """
[1] " "
[2] """
[3] " "
[4] ")"
[6] ".replace"
[7] Object: [kakko()]
[0] "("
[1] " "
[2] Object: [regExp]
[0] "/"
[1] "^ +"
[2] "/"
[3] ", "
[4] Object: [doubleQuote]
[0] """
[1] """
[5] " "
[6] ")"
[8] ";"
[9] "\n\t\ttaLog.value += "
[10] Object: [doubleQuote]
[0] """
[1] "\n"
[2] """
[11] ";"
[12] "\n\t"
[13] "}"
(omitted)
[23] "\n"
[13] Object: [kakko<>]
[0] "<"
[1] "/script"
[2] ">"
[14] "\n"
[15] Object: [kakko<>]
[0] "<"
[1] "style"
[2] ">"
[16] "\n"
[17] Object: [kakko<>]
[0] "<"
[1] "/style"
[2] ">"
[18] "\n"
[19] Object: [kakko<>]
[0] "<"
[1] "/head"
[2] ">"
[20] "\n"
[21] Object: [kakko<>]
[0] "<"
[1] "body onload="
[2] Object: [doubleQuote]
[0] """
[1] "run()"
[2] """
[3] ">"
[22] "\nHello World!"
[23] Object: [kakko<>]
[0] "<"
[1] "span id="
[2] Object: [doubleQuote]
[0] """
[1] "gage"
[2] """
[3] ">"
[24] Object: [kakko<>]
[0] "<"
[1] "/span"
[2] ">"
[25] "\n"
[26] Object: [kakko<>]
[0] "<"
[1] "BR"
[2] ">"
[27] "\n"
[28] Object: [kakko<>]
[0] "<"
[1] "input type="
[2] Object: [doubleQuote]
[0] """
[1] "text"
[2] """
[3] " id="
[4] Object: [doubleQuote]
[0] """
[1] "text0"
[2] """
[5] ">"
[29] Object: [kakko<>]
[0] "<"
[1] "BR"
[2] ">"
[30] "\n"
[31] Object: [kakko<>]
[0] "<"
[1] "textarea id="
[2] Object: [doubleQuote]
[0] """
[1] "ta0"
[2] """
[3] " cmt="
[4] Object: [doubleQuote]
[0] """
[1] "disabled autocomplete=off"
[2] """
[5] " style="
[6] Object: [doubleQuote]
[0] """
[1] "\n\twidth : 100%;\n\theight : 45vh;\n"
[2] """
[7] ">"
[32] Object: [taggedByTextarea]
[33] Object: [kakko<>]
[0] "<"
[1] "/textarea"
[2] ">"
[34] "\n"
[35] Object: [kakko<>]
[0] "<"
[1] "textarea id="
[2] Object: [doubleQuote]
[0] """
[1] "taLog"
[2] """
[3] " cmt="
[4] Object: [doubleQuote]
[0] """
[1] "disabled autocomplete=off"
[2] """
[5] " style="
[6] Object: [doubleQuote]
[0] """
[1] "\n\twidth : 100%;\n\theight : 45vh;\n"
[2] """
[7] ">"
[36] Object: [taggedByTextarea]
[37] Object: [kakko<>]
[0] "<"
[1] "/textarea"
[2] ">"
[38] "\n\n\n"
[39] Object: [kakko<>]
[0] "<"
[1] "/body"
[2] ">"
[40] "\n"
[41] Object: [kakko<>]
[0] "<"
[1] "/html"
[2] ">"
[42] "\n"
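It is recognized correctly.
Incidentally, "regular expressions" are a handy feature for recognizing strings: writing /abc/i finds abc or ABC in a string (the i flag ignores case), and you can also get at the text before and after the match.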
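As a small illustration of that feature, the sketch below uses String.prototype.match, whose result carries the matched text and its index. The helper name splitAround is hypothetical, and the pattern is assumed to have no g flag (with g, match returns only the list of matches, without an index).

```javascript
// Return the text before the first match, the match itself, and the text after it.
function splitAround(text, pattern) {
  const m = text.match(pattern);
  if (m === null) return null; // pattern does not occur
  return {
    before: text.slice(0, m.index),
    matched: m[0],
    after: text.slice(m.index + m[0].length),
  };
}

const parts = splitAround("xyzABCdef", /abc/i);
console.log(parts);
```

Here the i flag lets /abc/i match the upper-case ABC, and the pieces before and after it come back as xyz and def.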
In a parser, you have to take care that the / symbol that marks the start and end of a regular expression is not confused with the division operator.
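A common heuristic for this, shown here as a sketch and not as the approach taken in this article, is to look at what comes just before the slash: after an operand (an identifier, a number, ")" or "]") it is division, otherwise it starts a regular expression. The function name slashStartsRegExp and the keyword list are assumptions, and the rule has known exceptions, for example a regular expression placed right after the ")" of an if condition.

```javascript
// Keywords after which an expression (and so a regular expression) may start.
const EXPRESSION_KEYWORDS = [
  "return", "typeof", "case", "in", "of", "delete",
  "void", "throw", "new", "instanceof", "do", "else",
];

// `before` is the source text up to, but not including, the slash.
function slashStartsRegExp(before) {
  const prev = before.trimEnd();
  if (prev === "") return true;              // start of input
  const last = prev[prev.length - 1];
  if (/[\w$)\]]/.test(last)) {
    // Looks like the end of an operand, unless the word is a keyword.
    const word = (prev.match(/[A-Za-z_$][\w$]*$/) || [""])[0];
    return EXPRESSION_KEYWORDS.includes(word);
  }
  return true;                               // after "(", ",", "=", and so on
}

console.log(slashStartsRegExp("location.href.match( ")); // after "("
console.log(slashStartsRegExp("total = sum "));          // after an identifier
```

The first call mirrors the location.href.match( /.../ ) line in the self-parse output, where the slash follows "(" and is treated as the start of a regular expression; in the second call the slash follows an identifier and is treated as division.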
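Tricks like this for getting things "recognized correctly" need to be kept to a minimum.
If the program becomes too long and complex, fixing bugs gets hard and so does extending it later, so it is better to aim for the minimum.
As for what is fun about being able to do this kind of parsing,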
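I bought some drawing mannequins.
A set of two cost 1,280 yen including tax.
When I took one out of the box and stood it up without adjusting anything, it was already striking a "JoJo pose".
Gogogogogo...
"Whaaat!? This guy!! A JoJo pose right from the start!"
The world of Hirohiko Araki.
I straightened it up.
Checking the range of motion, part 1.
Checking the range of motion, part 2.
I felt that if the hips could open wider, the range of expression would be wider too.
Still, I think this is normal for a drawing mannequin.
The most "closed pose" and the most "open pose" it can manage.
A free pose.
It managed the "kneeling on one knee" pose, which is said to be difficult even for Gunpla.
A piggyback ride.
Something like "Daddy! Look at that!"
I did not want to end on a joke, so I drew one picture.
It is a "medieval European warrior".
I do not want to draw nothing but girls, so I drew a manly man.
I hope that at least a little of a painter's soul will find its way into me from here on.
More realistic mannequins with an excellent range of motion are also on sale, but I figured the old-fashioned kind would make it easier to put some soul into the drawing.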