CS 498MC Martian Computing at the University of Illinois at Urbana–Champaign


The basic text types in Hoon are the cord, an atom with aura @t (and its derivatives), and the tape, which is a list of characters.
> `@t`'Excalibur'
'Excalibur'
> `tape`"Excalibur"
"Excalibur"
There are many helpful text conversion arms:
++cass: convert upper-case text to lower-case (tape→tape)
++cuss: convert lower-case text to upper-case (tape→tape)
++crip: convert tape to cord (tape→cord)
++trip: convert cord to tape (cord→tape)
You can convert to and from atoms with particular auras as well, using arms that take a dime.
But wait: what’s a dime? A dime is a pair of aura as @ta and a value. This helps the function know what to render the value as.
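The conversions above can be tried directly in the Dojo. The following is a short sketch rather than an exhaustive tour; ++scot and ++scow are named here as examples of standard-library arms that take a dime (an aura tag followed by a value), since the text above does not name them.

```hoon
> (cass "Excalibur")
"excalibur"
> (cuss "Excalibur")
"EXCALIBUR"
> (crip "Excalibur")
'Excalibur'
> (trip 'Excalibur')
"Excalibur"
::  dime-taking arms: the sample is [aura value]
> (scot %ud 54.321)
~.54.321
> (scow %ud 54.321)
"54.321"
```

Note that ++scot produces a @ta knot (printed with the ~. prefix), while ++scow produces a tape.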
++scan is used with a parsing rule to parse a tape into an atom.
dim:ag parses a decimal number out of a tape (tape→@ud)
dem:ag parses an Urbit-style decimal number (with 1.000 dot separators) out of a tape (tape→@ud)
hex:ag parses a hexadecimal number out of a tape (tape→@ux)
d:ne renders a decimal digit from an atom (@ud→cord)
x:ne renders a hexadecimal digit from an atom (@ux→cord)
There are many more of these, but you get the flavor of it.
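As a rough illustration of the parsing and rendering arms just listed, the Dojo session below parses a few tapes with ++scan and renders single digits with the ++ne door. It assumes that dem:ag is the standard-library name of the dot-separated decimal parser, and it casts results explicitly because these arms produce untyped atoms.

```hoon
::  parse a plain decimal number
> (scan "54321" dim:ag)
54.321
::  parse an Urbit-style decimal number with dot separators
> (scan "54.321" dem:ag)
54.321
::  parse lower-case hexadecimal, then cast so the Dojo prints it as @ux
> `@ux`(scan "dead.beef" hex:ag)
0xdead.beef
::  ne is a door over an atom; d and x render one digit as a character
> `@t`~(d ne 7)
'7'
> `@t`~(x ne 12)
'c'
```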

Many more advanced text structures are available. These contain metadata tags such as %leaf to hint to the prettyprinter and other tools how to represent and process the text data.
Recall wain and wall from earlier: a wain is a list of cords and a wall is a list of tapes.
The primary data structure used by the prettyprinter is a tank, which is a nested tagged structure of tapes. A tank element can be tagged in one of three ways:
%leaf or leaf+"": a simple printed line
%rose: a list of tank delimited by strings
%palm: a list of tank delimited by strings with backsteps (least common)
For instance, to make a single %leaf statement, you can type
leaf+"Rhongomyniad"
You can also use the >1.000< format, which converts a value to a $tank directly (and can be used with faces/names).
A tang is a list of tanks. It is used with ++slog to produce conditional error messages.
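A minimal sketch of ++slog in use follows. It assumes the common idiom in which ++slog takes a tang and produces a gate that prints the messages as a side effect and then returns its argument unchanged; the value 42 is only a placeholder.

```hoon
::  build a one-element tang from a %leaf tank, print it, and pass 42 through
((slog ~[leaf+"Rhongomyniad"]) 42)
```

Because the printing is a side effect, the same expression can be wrapped around any intermediate value while debugging without changing the result of the computation.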

One of the most straightforward tools to use with text is ++trim, which splits a tape into two parts at a given index. You can use this together with ++find to produce a simple text tokenizer.
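++  parse
  |=  [in-words=tape]
  =/  out-words  *(list tape)
  |-  ^+  out-words
  ::  find the next space; when none is left, emit the remainder and stop
  =/  next-index  (find " " in-words)
  ?:  =(next-index ~)  (weld out-words ~[in-words])
  ::  split just past the space, so each token keeps its trailing space
  =/  values  (trim +(+:next-index) in-words)
  ~&  values
  $(in-words +:values, out-words (weld out-words ~[-:values]))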
Let’s consider abstractly manipulating XML entities. There are a number of ; mic runes which support this.
Sail is a part of Hoon used for creating and operating on nouns that represent XML nodes. With the appropriate rendering pipeline, a Sail document can be used to generate a static website.
The Sail runes are the ; mic macros. These operate on ++manx and ++marl (list of manx) values. Because a manx is a single XML node and HTML can be expressed in XML form, Sail can be used to produce HTML as a function of Hoon operations.
For example, a paragraph node holding a single line of text is the following manx:
[[%p ~] [[%$ [%$ "This is the first node."] ~] ~] ~]
Generally, one produces and manipulates marls rather than directly working with manxs.
;p: This will be rendered as an XML node.
=; ;p:
;p:
;p:
These are ultimately parsed by and from %zuse’s ++en-xml:html and ++de-xml:html arms.
`manx`+:(de-xml:html (crip "<element attribute=\"1\">text<!-- comment --></element>"))
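To see the two arms working together, the sketch below parses a small XML cord into a manx and serializes it back. It assumes that ++de-xml:html produces a unit (hence ++need to unwrap it) and that ++en-xml:html accepts the resulting manx; the `<p>Hello</p>` input is only illustrative.

```hoon
::  cord -> (unit manx) -> manx -> serialized XML text
(en-xml:html (need (de-xml:html '<p>Hello</p>')))
```

Using ++need crashes on input that fails to parse, which is acceptable in the Dojo but should be replaced with explicit unit handling in real code.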
JSON is handled similarly to XML/HTML: there are a number of tools (but not runes) for it in the ++enjs and ++dejs arms of %zuse.
> (tape:enjs:format "hello world")
[%s p='hello world']
> (sa:dejs:format (tape:enjs:format "hello world"))
"hello world"
See zuse.hoon, ++enjs and ++dejs arms.
It is convenient when parsing (and performing many other operations) to curry a function or cork it.
To curry a function means to wrap one of its arguments inside of it so that it becomes a function not of $n$ variables but of $n-1$ variables. Use ++cury to accomplish this.
> =add-1 (cury add:rs .1)
> (add-1 .2)
.3
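To cork functions is to compose them forwards; that is, (cork f g) produces a gate which applies f first and then applies g to the result. Chaining the same gate several times applies it repeatedly: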
> (:(cork dec dec dec dec dec) 1.000)
995
Use ++cork to cork a function.
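Corking is not limited to repeating one gate. As a small sketch using arms introduced earlier in this lesson, the composition below first converts a cord to a tape with ++trip and then upper-cases the tape with ++cuss.

```hoon
::  apply trip first, then cuss to its result
> ((cork trip cuss) 'excalibur')
"EXCALIBUR"
```

The order matters: ++cuss expects a tape, so it must come after ++trip in the forward composition.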
Parse the following HTML block into Sail elements such as manxs and marls:
<p>The <a href="/wiki/Alliterative_Morte_Arthure" title="Alliterative Morte Arthure">Alliterative <i>Morte Arthure</i></a>, a <a href="/wiki/Middle_English" title="Middle English">Middle English</a> poem, mentions Clarent, a sword of peace meant for knighting and ceremonies as opposed to battle, which <a href="/wiki/Mordred" title="Mordred">Mordred</a> stole and then used to kill Arthur at Camlann. The Prose <i>Lancelot</i> of the Vulgate Cycle mentions a sword called Seure (Sequence), or Secace in some manuscripts, which belonged to Arthur but was borrowed by Lancelot.</p>
Produce an %ask generator which accepts a text value and produces the Morse code equivalent. You may use the following core as a point of departure for composition.
You may decide how to handle spaces (omit or emit a space), but you should convert the message to lower-case first.
|%
++  en-morse  !!
++  table
  %-  my
  :~  :-  'a'  '·-'  :-  'b'  '-···'  :-  'c'  '-·-·'  :-  'd'  '-··'
      :-  'e'  '·'  :-  'f'  '··-·'  :-  'g'  '--·'  :-  'h'  '····'
      :-  'i'  '··'  :-  'j'  '·---'  :-  'k'  '-·-'  :-  'l'  '·-··'
      :-  'm'  '--'  :-  'n'  '-·'  :-  'o'  '---'  :-  'p'  '·--·'
      :-  'q'  '--·-'  :-  'r'  '·-·'  :-  's'  '···'  :-  't'  '-'
      :-  'u'  '··-'  :-  'v'  '···-'  :-  'w'  '·--'  :-  'x'  '-··-'
      :-  'y'  '-·--'  :-  'z'  '--··'  :-  '0'  '-----'  :-  '1'  '·----'
      :-  '2'  '··---'  :-  '3'  '···--'  :-  '4'  '····-'  :-  '5'  '·····'
      :-  '6'  '-····'  :-  '7'  '--···'  :-  '8'  '---··'  :-  '9'  '----·'
  ==
--