Contributing to dlang.org

From D Wiki
Revision as of 12:53, 18 June 2019 by Kriyszig (talk | contribs) (Fixes spelling errors)

This document is aimed at people who want to contribute to the dlang.org website. The dlang.org repository is used to build the language specification and library documentation of the D programming language.

DAutoTest

As an alternative to building the documentation locally, you can use the DAutoTest service. For every GitHub pull request, it automatically builds all documentation pages, generates a diff of the results, and adds a link to the pull request.

[Image: CI DAutoTest.png]

Installing

We assume you already have a dmd rig up and running. If not, follow the steps at Starting as a Contributor to get it running. The directory structure we'll assume in the following is:

~/
  dlang/
    dmd/
    druntime/
    phobos/

With this setup, let's proceed to downloading the source of dlang.org as follows:

cd ~/dlang
git clone https://github.com/dlang/dlang.org

After this, the dlang.org directory will end up parallel to dmd, druntime, and phobos. Let's build the site (for now without the standard library documentation) by using the following command:

cd ~/dlang/dlang.org
make -j32 -f posix.mak html

The html target instructs make to only build the site. By default (if you specify no target), make also builds the core runtime and standard library documentation, both for the latest release and for the current code residing on your machine. That may get pretty involved, so let's leave it for later. For now, let's inspect the result of the build, all of which goes in the dlang.org/web directory. On Linux, for example, the command xdg-open opens your default browser with a given file or address so we can use it as such:

xdg-open ~/dlang/dlang.org/web/index.html

On OS X:

open ~/dlang/dlang.org/web/index.html


At this point, if all went well, a nicely formatted HTML file pops up featuring a local replica of the dlang.org homepage. Congratulations!

Build commands

If you are just interested in previewing your change to the Phobos documentation, building only the html and phobos-prerelease targets will save time:

 make -f posix.mak html phobos-prerelease

Of course, parallelizing with -j improves speed as well.

To informally test it, open the appropriate HTML documents in that directory. Note that the currently released phobos would be in ~/dlang/dlang.org/web/phobos, whereas the current (fresh) build of phobos's documentation will reside in ~/dlang/dlang.org/web/phobos-prerelease. So, for example, if you change the embedded documentation in ~/dlang/phobos/std/conv.d, the changes are visible in ~/dlang/dlang.org/web/phobos-prerelease/std_conv.html. (The build process replaces the slashes in submodules with underscores.)
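The naming rule in the parenthetical above can be sketched in a few lines of Python. This is only an illustration of the slash-to-underscore mapping described here; the function name doc_page_for is made up for this example, and the sketch assumes a plain module path such as std/conv.d, given relative to the phobos checkout.

```python
from pathlib import PurePosixPath

def doc_page_for(module_path: str) -> str:
    """Map a module path like 'std/conv.d' to the name of its HTML page.

    Hypothetical helper: it only mirrors the rule stated in the text
    (drop the .d extension, turn slashes into underscores).
    """
    stem = PurePosixPath(module_path).with_suffix("")  # std/conv.d -> std/conv
    return "_".join(stem.parts) + ".html"

print(doc_page_for("std/conv.d"))
```

The printed name is the file to look for under ~/dlang/dlang.org/web/phobos-prerelease.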

You can also check the dlang.org contribution guide for more detailed build instructions.

Avoiding network traffic

Note that one of the first lines output during the make run looks like this:

LATEST=2.110.0 <-- place in the command line to skip network traffic.

That's advice worth heeding because fetching LATEST automatically involves network traffic, which adds time to the build. So for future builds use this:

make -f posix.mak LATEST=2.110.0 html phobos-prerelease

Full build

Note that the full build produces all of the documentation in all forms, including Kindle builds and various other things that may require installing additional tools, and may download and build old versions of DMD. This may take some time and can be run with:

make -f posix.mak

Or if you want to avoid network traffic:

make -f posix.mak LATEST=2.110.0

Ddoc Fundamentals

Browsing through ~/dlang/dlang.org/ reveals that most files have the .dd extension. Those files are in Ddoc format; in order to work on dlang.org a basic understanding of the Ddoc format is needed.

At its core, Ddoc is a pure macro expansion system. In this context "pure" means the macro language has no relationship to any other file format; all the expansion engine does is take in macro definitions and then munch through text and expand macros as they come along. A few macros are predefined to values that make HTML generation easy, so in that sense a slight affinity with HTML does exist; however, those macros can be trivially redefined for any other purpose. Also, Ddoc recognizes sections marked in a particular way as D source code. There are a few subtleties inherent to macro processors (e.g. how recursion works or in what order nested macros are expanded), but aside from those Ddoc is deceptively simple and very flexible. Exploiting the favorable relationship between Ddoc's simplicity and power is key to using it effectively.

Ddoc source files have the following structure:

Ddoc
Text with embedded macros such as $(MACRO1) and $(MACRO2) goes here.
Macros:
  MACRO1=definition1
  MACRO2=definition2

That is, a Ddoc file consists of the actual word "Ddoc" followed by a newline, then followed by the actual text of the document, followed by a line containing "Macros:", followed by macro definitions of the form NAME=value. The "Macros:" section is optional (as is the indentation of the macro definitions; it's present here for aesthetic reasons only). You might have guessed already that the syntax $(MACRONAME) expands the macro called MACRONAME into whatever text was ascribed to it in the "Macros:" section. Let's actually test that by saving the following text into a file called e.g. test.dd:

Ddoc
Text with embedded macros such as $(MACRO1) and $(MACRO2) goes here.
Macros:
  MACRO1=definition1
  MACRO2=definition2
  DDOC=$(BODY)

The last macro definition seems to come out of nowhere and deserves some explanation, which this document will provide soon. For now let's "build" this file like this:

~/dlang/dmd/generated/linux/release/64/dmd test.dd
cat test.html

(You may of course just type dmd if it's in your $PATH.) The produced file, which by default carries the .html extension, contains:

Text with embedded macros such as definition1 and definition2 goes here.

So the macro names got expanded to the text in their respective definitions. Sweet!

The Ddoc Expansion Process

Time to clear the air about the mysterious DDOC=$(BODY) definition. When dmd processes a .dd file, it doesn't immediately expand the text; the process goes as follows:

  • accumulate the text (sans the opening Ddoc\n) in memory and put it in a macro called BODY;
  • when processing reaches the \nMacros:\n section, read and memorize the macro definitions underneath;
  • expand and output the macro DDOC. In all likelihood DDOC expands $(BODY) with some adornments before and/or after it.
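To make the three steps above concrete, here is a toy model of them in Python. It is a sketch of the process as described in this document, not dmd's implementation: the function name render is invented, only argument-less macros are handled, each definition must fit on one line, and DDOC must be defined explicitly (the real processor has a built-in default).

```python
import re

def render(source: str) -> str:
    """Toy model of the three steps above; handles argument-less macros only."""
    if not source.startswith("Ddoc\n"):
        raise ValueError("not a Ddoc document")
    text = source[len("Ddoc\n"):]
    # Step 1: everything before the "Macros:" line becomes the BODY macro.
    body, _, definitions = text.partition("\nMacros:\n")
    macros = {"BODY": body}
    # Step 2: memorize the NAME=value definitions (one per line in this sketch).
    for line in definitions.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            macros[name.strip()] = value.strip()
    # Step 3: expand DDOC, replacing each $(NAME) by its (expanded) definition.
    # Unknown macros expand to nothing; self-referential ones are not guarded.
    def expand(s: str) -> str:
        return re.sub(r"\$\((\w+)\)",
                      lambda m: expand(macros.get(m.group(1), "")), s)
    return expand(macros["DDOC"])

source = (
    "Ddoc\n"
    "Text with embedded macros such as $(MACRO1) and $(MACRO2) goes here.\n"
    "Macros:\n"
    "  MACRO1=definition1\n"
    "  MACRO2=definition2\n"
    "  DDOC=$(BODY)\n"
)
print(render(source))
```

Note how the body text is never printed directly: it only reaches the output because DDOC mentions $(BODY).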

The rationale for going slightly roundabout is simple: many file formats include a prologue and an epilogue structure. Instead of asking the user to write the prologue and epilogue with each document, Ddoc allows the user to define them in a single place (the DDOC macro definition) and then use that definition across many files.

The default value of DDOC is geared toward building simple HTML files. Indeed, if we remove the DDOC macro definition from the test.dd file and rebuild, the resulting test.html contains:

<html><head>
        <META http-equiv="content-type" content="text/html; charset=utf-8">
        <title>test</title>
        </head><body>
        <h1>test</h1>
        <!-- Generated by Ddoc from test.dd -->

Text with embedded macros such as definition1 and definition2 goes here.

        <hr><small>Page generated by <a href="http://dlang.org/ddoc.html">Ddoc</a>. </small>
        </body></html>

which is a serviceable albeit bland HTML document.

Macros with Parameters

Macros are key to the power of Ddoc for at least two reasons. First, they shorten and simplify the document by allowing you to replace clumsy typesetting directives such as <span class="important">Attention!</span> with $(ATTENTION). Second, they elevate the level of the document you're writing by leveraging the traditional "extra level of indirection". Depending on how you define ATTENTION, you get to format $(ATTENTION) in various ways. For example, for HTML you'd use:

ATTENTION=<span class="important">Attention!</span>

If you don't care to define css classes etc. you may want to just go sloppy:

ATTENTION=<font color="red">Attention!</font>

To format the document as plain text, just write:

ATTENTION=Attention!

And if you want to output LaTeX, write something like:

ATTENTION={\color{red}Attention!}

In short, you get to expand Ddoc documents into many other formats by using macros and combining the documents with appropriately-defined macro batteries. (This document will discuss soon how to effect such combinations.)

For now, let's note that ATTENTION is not quite a sterling example of flexibility; it allows us to render "Attention!" in a special way, but often the need is to render various other words and phrases as attention-attracting text. So what's needed is a macro taking the text to render as a parameter. Here's how to do so in Ddoc.

In a macro definition, certain constructs access arguments as follows:

  • $1, $2, $3, ..., $9 expand to the first, second, third, ..., ninth argument;
  • $+ expands to all arguments except for the first;
  • $0 simply expands to all macro arguments.

To pass arguments to a macro, insert a space after the macro name, then pass the arguments separated by commas, then close the parenthesis: for example, $(MYMACRO how are you doing, Jeff?) passes the arguments "how are you doing" and "Jeff?" to a macro called MYMACRO.

There are a couple of subtleties related to passing and expanding parameters. The first whitespace after the macro name is "munched", i.e. it just disappears from the expansion. But if there's any whitespace character following the first one, it will be considered part of the first argument. An example will make this clear. Consider the macro definition TEST=|$1|. Then, the invocation

$(TEST abc)

produces

|abc|

whereas the invocation (note the extra space)

$(TEST  abc)

produces (note the extra space before the letters):

| abc|

The whitespace after a comma (if present) in multiple-argument lists is handled with similar logic. The first whitespace character immediately after the comma is considered "aesthetic" and is not present in the expanded text. However, if the source inserts more than one whitespace character after the comma, the extra ones are considered part of the argument. By way of example, consider the macro definition TEST=|$1|$2|. Then we have the following expansions:

$(TEST abc,xyz)

produces

|abc|xyz|

then the invocation (note the extra space)

$(TEST abc, xyz)

produces no change in output:

|abc|xyz|

However, the expansion (there are THREE spaces after the comma):

$(TEST abc,   xyz)

produces the output:

|abc|  xyz|

which inserts TWO spaces before "xyz". One has been munched, the other two copied.

Ddoc "understands" parentheses and likes them paired. If a comma occurs within parenthesized text, it won't be considered a macro argument separator. For example, in the invocation $(MYMACRO abc, (xyz, tuv)), MYMACRO receives two arguments, not three. Nested parentheses work as expected, too; only top-level commas within the macro are considered argument separators.
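The argument rules of this section (one munched whitespace character, top-level commas as separators, paired parentheses) fit in a short Python model. This is a sketch of the rules as stated in this document, not a port of dmd's code; split_args is an invented name, and its input is the text that follows the macro name, up to but not including the closing parenthesis.

```python
def split_args(text: str) -> list[str]:
    """Split the text following a macro name into arguments (toy model)."""
    args, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            # Only a comma outside any parentheses separates arguments.
            args.append("".join(current))
            current = []
        else:
            current.append(ch)
    args.append("".join(current))
    # Munch exactly one whitespace character at the start of each argument.
    return [a[1:] if a[:1].isspace() else a for a in args]

print(split_args(" abc,   xyz"))       # the $(TEST abc,   xyz) case
print(split_args(" abc, (xyz, tuv)"))  # the $(MYMACRO abc, (xyz, tuv)) case
```

The first call keeps two of the three spaces in front of "xyz"; the second returns two arguments because the inner comma sits inside parentheses.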

Whitespace at the Beginning or End of Macro Definitions

In the "Macros:" section, any number of whitespace characters preceding or following the = sign are ignored. That makes it problematic to define macros that start with a space. To remedy that, define this macro:

SPACE = $(SPACE) $(SPACE)

The recursive expansion (which will be explained in detail below) expands to nothing while the macro itself is being expanded; so the final expansion will be a space flanked by two empty strings, which is exactly what we needed. Now it's easy to use space to e.g. indent some text:

INDENT = $(SPACE)$(SPACE)$(SPACE)$(SPACE)$0

The INDENT macro inserts four spaces before its arguments. Note that the space between "=" and the first "$(SPACE)" is not part of the output.

Similarly, to define a macro that inserts exactly one newline:

NEWLINE = $(NEWLINE)
$(NEWLINE)

Special characters: "(", ")", "$", ","

As shown above, the characters "(", ")", "$", and "," have special meaning to Ddoc. Sometimes it's necessary to "escape" them, i.e. make them part of the produced output without them being interpreted by Ddoc. To achieve that, the following macros are useful:

ARGS = $0
COMMA = ,
COMMENT = 
DOLLAR = $
LPAREN = (
RPAREN = )
EQUAL = =
TAIL = $+

The ARGS macro is useful when we want to pass a large text (possibly including comma) as a single argument to another macro. Consider:

SECTION=<h2>$1</h2>$+

Then if the first argument to SECTION includes one or more commas, use it like this:

$(SECTION $(ARGS Trinkets, Treasures, and Other Tchotchkes),
Here goes the text of the section...
)

Alternatively (and this brings us to the second useful macro), you could use COMMA for the same effect:

$(SECTION Trinkets$(COMMA) Treasures$(COMMA) and Other Tchotchkes,
Here goes the text of the section...
)

The macro COMMENT expands to absolutely nothing whatsoever, which makes it a great macro for inserting comments in Ddoc files (including commenting out portions of the document). Pay attention, however: comments must still pair parentheses properly.

The DOLLAR macro expands to the dollar sign in a way that prevents it from later initiating a macro expansion. The same goes for LPAREN and RPAREN: they expand to parentheses, but only in a blind textual way; they don't carry the usual meaning of parentheses. So you get to use these macros if you want e.g. to define a macro that includes the dollar sign and/or includes unbalanced parentheses.

The EQUAL macro helps with defining multiline macros that themselves contain the equal sign. Say we want to define a macro such that $(ASSIGN 1, 2) expands to two lines:

x = 1
y = 2

If we define the macro like this:

Macros:
...
ASSIGN=x = $1
y = $2
...

things won't work as needed because the macro processor will interpret y = $2 as a new macro definition, not the second line of ASSIGN. To fix this, just use:

Macros:
...
ASSIGN=x = $1
y $(EQUAL) $2
...

and now the equal sign will be generated properly but not interpreted as punctuation during macro creation.
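The difference between the two ASSIGN definitions is easy to see with a toy parser for the "Macros:" section. The Python below is only a model of the rule described here (a line shaped like NAME = value opens a new definition, any other line continues the previous one); parse_macros and the regular expression are assumptions made for illustration, not dmd's actual parsing code.

```python
import re

# A line opens a new definition when it looks like NAME = value.
DEFINITION = re.compile(r"\s*(\w+)\s*=\s*(.*)")

def parse_macros(section: str) -> dict[str, str]:
    """Toy parser for the text under "Macros:"."""
    macros, name = {}, None
    for line in section.splitlines():
        match = DEFINITION.fullmatch(line)
        if match:
            name = match.group(1)
            macros[name] = match.group(2)
        elif name is not None:
            # No NAME= prefix: the line continues the previous definition.
            macros[name] += "\n" + line
    return macros

broken = parse_macros("ASSIGN=x = $1\ny = $2")
fixed = parse_macros("ASSIGN=x = $1\ny $(EQUAL) $2")
print(sorted(broken))  # ['ASSIGN', 'y']
print(sorted(fixed))   # ['ASSIGN']
```

In the first case a stray macro named y appears; in the second, the line starting with y stays part of ASSIGN.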

Last but (somewhat ironically) not least, the TAIL macro expands to all of its arguments except the first one. Why isn't $+ enough on its own? Because you may apply TAIL to $+ itself to access all arguments except the first two. Consider:

SAYS=$1 $2 says: ==$(TAIL $+)==

Then, $(SAYS Mad, Hatter, A land full of wonder, mystery, and danger!) expands to "Mad Hatter says: ==A land full of wonder, mystery, and danger!==" as we'd need it to.

Unfortunately, $(TAIL $(TAIL $+)) does not expand to all arguments except the first three, as we'd wish. Instead, it expands to nothing at all. This is to keep the overall expansion mechanics simple: once the first macro expansion occurs, the commas found in it are "painted" to be inert so they can't be reused as commas to separate arguments to the outer macro.

All is not lost, however, with a little handwritten drudgery. These macros do work properly:

TAIL=$+
TAIL2=$(TAIL $+)
TAIL3=$(TAIL2 $+)
TAIL4=$(TAIL3 $+)

You may continue this for as long as needed. A natural limit is 9 because numbered arguments go up to $9.
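If it helps to think of the argument list as a sequence, the TAIL family amounts to repeated "drop the first element". The Python below models only that view (it deliberately ignores the painted-comma subtlety, which has no list equivalent); the function names are invented for the illustration.

```python
def tail(args: list[str]) -> list[str]:
    """Models TAIL=$+ : every argument except the first."""
    return args[1:]

def tail2(args: list[str]) -> list[str]:
    """Models TAIL2=$(TAIL $+) : every argument except the first two."""
    return tail(args[1:])

def says(args: list[str]) -> str:
    """Models SAYS=$1 $2 says: ==$(TAIL $+)== from the example above."""
    rest = tail(args[1:])  # $+ drops the first argument, TAIL drops the second
    return f"{args[0]} {args[1]} says: =={', '.join(rest)}=="

print(says(["Mad", "Hatter", "A land full of wonder", "mystery", "and danger!"]))
```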

Recursive Macros

Ddoc strives to avoid being a Turing-complete language, thus keeping the complexity/power balance in a safe area. That means recursion is theoretically off the table. However, there are two crude mechanisms that make recursive macros usable in Ddoc:

  • If a macro is expanded within its own expansion more than 1000 levels, ddoc compilation fails with an error message.
  • If a macro is expanded without any argument within its own expansion, it "disappears" i.e. it expands to the null string.

The first rule is a hamfisted means to enforce that any ddoc expansion will finish and is rather uninteresting. The only thing to note is the limit on expansion depth, which is generous enough to not make it a nuisance. The second rule is the interesting one because it allows us to write recursive macros as long as the recursive invocation is always smaller than the input.

Consider, for example, generating an HTML list which needs to look like this:

<ul>
<li>Apples</li>
<li>Oranges</li>
<li>Bananas</li>
<li>Pineapples</li>
</ul>

One direct but awkward approach is:

UL = <ul>$0</ul>
LI = <li>$0</li>

which is then used like this:

$(UL
$(LI Apples)
$(LI Oranges)
$(LI Bananas)
$(LI Pineapples)
)

This works, but there's a more compact way of creating the list by using recursive macros:

UL = <ul>$(LIs $0)</ul>
LIs = <li>$1</li>$(LIs $+)

Now all you need to type to create the list is $(UL Apples, Oranges, Bananas, Pineapples). Sweet! (Literally.) Got an argument with commas inside? No problem, ARGS is ready to help: $(UL Apples, Oranges, $(ARGS Mandarins, Tangerines, Clementines, etc.), Bananas, Pineapples) expands as desired.
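For readers more at home with ordinary recursion, the UL/LIs pair corresponds to the following Python sketch. It is an analogy rather than a description of how dmd evaluates macros: the empty-list check plays the role of the rule that an argument-less recursive invocation expands to nothing, and the function names are made up.

```python
def lis(items: list[str]) -> str:
    """Models LIs = <li>$1</li>$(LIs $+)."""
    if not items:
        # A recursive call without arguments expands to the empty string.
        return ""
    return f"<li>{items[0]}</li>" + lis(items[1:])

def ul(items: list[str]) -> str:
    """Models UL = <ul>$(LIs $0)</ul>."""
    return f"<ul>{lis(items)}</ul>"

print(ul(["Apples", "Oranges", "Bananas", "Pineapples"]))
```

Each call handles the first item and hands a strictly shorter list to the next one, which is why the expansion terminates.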

Other Special Characters and Patterns

There are two more patterns you should know about. Snippets of D code should go in between lines consisting of three or more minus signs, as follows:

---
void main() {
    import std.stdio;
    writeln("Hello, world");
}
---

Then the Ddoc processor understands and syntax-colorizes the code appropriately. The coloring is done by means of predefined macros, all of which may be redefined by the user. Only D is currently supported; no other languages are "understood" by the Ddoc engine.

Another pattern worth noting is words or short phrases enclosed in backticks (those small back apostrophes ` that look like specks of dust on the screen to those of us missing our glasses). Such `text` enclosed in backticks, when both the opening and the closing backtick occur on the same line, is rewritten as $(DDOC_BACKQUOTED text). By default the macro expands to an HTML span tag, but you may change it as needed.
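The backtick rule can be approximated with a single regular expression. The Python below is a rough model of the rewriting described above, assuming it is applied to one line at a time; rewrite_backticks is an invented name, and the sketch ignores any escaping rules the real processor may have.

```python
import re

def rewrite_backticks(line: str) -> str:
    """Rewrite `text` as $(DDOC_BACKQUOTED text) when both backticks share a line."""
    return re.sub(r"`([^`\n]+)`", r"$(DDOC_BACKQUOTED \1)", line)

print(rewrite_backticks("Call `writeln` to print a line."))
print(rewrite_backticks("A lone ` backtick is left alone."))
```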

One other pattern you should know about is the section. A section is a line starting with a word followed by a colon (e.g., "Summary:"). So "Macros:" itself is a section, and needs to be the last one in a .dd file. Ddoc recognizes a number of standardized sections (such as "License:" or "History:") and also has a special treatment of user-defined sections by means of the macros DDOC_SECTION_H and DDOC_SECTION, both of which are overridable. For more information about sections check the Ddoc documentation.

Predefined Macros

There were a few references to magical "predefined" macros throughout this document. Indeed there are a few macros that the Ddoc processor defines and uses. Take a look at the full list.

Macro names starting with D_ and DDOC_ are reserved. That doesn't mean you shouldn't redefine them (on the contrary, you always could, often should, and sometimes must); it just means macros named that way may have special meaning to Ddoc, so you shouldn't use them naively for your own macros.

The predefined macros are geared toward producing HTML output, but since they are all trivially overridable, this doesn't mean Ddoc locks you into using HTML or an HTML-like format.

Macro Definition Batteries: .ddoc files

So far, this document discussed .dd files as defining both their content and macros (the latter by means of the "Macros:" section). However, the whole point of a macro system is that you get to swap easily what the generated text looks like by swapping one set of macros for another. So we need to explore the opportunity of defining the content in a file, and the macros controlling the expansion of the content in a distinct file. Enter .ddoc files containing macros alone.

Ddoc has no file inclusion feature, but it has a simple mechanism of combining files in the command line. Simply specify the files in the command line in the order you'd like them processed, and dmd loads each in turn and processes it. (The order does matter; for example, a macro may be defined by multiple files; only the last definition sticks.) The .ddoc files may only contain macros, i.e. the entire content of a .ddoc file is processed the same way as the "Macros:" section of a .dd file.

Consider, for example, we have a file html.ddoc with the following content:

DDOC=<!DOCTYPE html>
<html>
<head><title>$(TITLE)</title></head>
<body>
$(BODY)
</body>
</html>
EMPH=<i>$0</i>
FORMAT=HTML

Now, say we also have a file called test.dd with the content:

Ddoc
Yo, this is one $(EMPH fine) $(FORMAT) doc!

To process test.dd in conjunction with html.ddoc, run this:

~/dlang/dmd/generated/linux/release/64/dmd html.ddoc test.dd

The generated content of test.html is:

<!DOCTYPE html>
<html>
<head><title>test</title></head>
<body>
Yo, this is one <i>fine</i> HTML doc!
</body>
</html>

To illustrate the flexibility gained, let's also define latex.ddoc as follows:

DDOC=\documentclass{article}
\title{$(TITLE)}
\begin{document}
\maketitle
$(BODY)
\end{document}
EMPH=\emph{$0}
FORMAT=\LaTeX

Now let's build test.tex as follows:

~/dlang/dmd/generated/linux/release/64/dmd -oftest.tex latex.ddoc test.dd

Amazingly, we now get:

\documentclass{article}
\title{test}
\begin{document}
\maketitle
Yo, this is one \emph{fine} \LaTeX doc!
\end{document}

which is suitable for direct submission to the latex processor to generate a beautiful PS or PDF document. You may define similar macros and .ddoc files for Markdown, XML, plaintext, and more. This is where the power of macro processors of Ddoc's kind comes through shining.

The ~/dlang/dlang.org directory contains several .ddoc files, some of which are listed below:

  • macros.ddoc: Fundamental macros useful for all of Ddoc, such as ARGS and TAIL.
  • html.ddoc: Macros for general HTML file creation.
  • latex.ddoc: Macros for general LaTeX file creation.
  • plaintext.ddoc: Macros for plaintext file creation.
  • verbatim.ddoc: Macros that expand to "themselves", i.e. the output shows all of the internal macros silently used by Ddoc during expansion. Very useful for debugging documents and for figuring out what particular macro needs to be (re)defined for a desired effect.
  • dlang.org.ddoc: Macros that define the look and feel of dlang.org.
  • doc.ddoc: Supplemental macros that help the website but not the standard library documentation.
  • std.ddoc: Supplemental macros that help the standard library documentation.

Back to dlang.org

Whew! That was quite a ride. Now you should be much better equipped to make changes on the files in ~/dlang/dlang.org. Each page on the site is built with a line like the following:

~/dlang/dmd/generated/linux/release/64/dmd -odweb macros.ddoc html.ddoc dlang.org.ddoc doc.ddoc somefile.dd

That command (which is part of the makefile) produces ~/dlang/dlang.org/web/somefile.html. Note that the .ddoc files are listed in a cascading manner, from the most general to the most specific. This is so a more specific macro definition may override a more general one.

If you take a look at somefile.dd and see a macro name you don't know about, look for it in the following order:

  • In the "Macros:" section of somefile.dd, if present;
  • In the files doc.ddoc, dlang.org.ddoc, html.ddoc, macros.ddoc in this order;
  • Among the predefined macros listed on the Ddoc page.
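The search order above is simply "most specific source first". As a sketch, the lookup can be modelled in Python with an ordered list of (source, macros) pairs. The macro names and values in the example are invented purely for illustration and do not reflect the real contents of these files.

```python
def find_definition(name, layers):
    """Return (source, value) from the first layer defining name, else None.

    layers must be ordered from most specific to most general.
    """
    for source, macros in layers:
        if name in macros:
            return source, macros[name]
    return None

# Hypothetical contents, for illustration only.
layers = [
    ('somefile.dd "Macros:" section', {"TITLE": "Some Page"}),
    ("doc.ddoc", {}),
    ("dlang.org.ddoc", {"NOTE": "<div class=\"note\">$0</div>"}),
    ("html.ddoc", {"NOTE": "<b>$0</b>", "EMPH": "<i>$0</i>"}),
    ("macros.ddoc", {"ARGS": "$0"}),
]

print(find_definition("NOTE", layers))
```

Here NOTE is reported as coming from dlang.org.ddoc even though html.ddoc also defines it, matching the rule that a more specific definition overrides a more general one.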

Things To Improve

Many macros were written before the recursion helpers were introduced and are clumsier than they could be.

The macros have been growing through accretion and it shows. Certain macros are redundant. Others are duplicated across the "Macros:" sections of different files. Certain macros seem too low-level or too much trouble to use instead of their expansion. (However, generally writing raw HTML inside Ddoc files is considered poor style because it relies on specific tricks instead of general CSS-driven formatting.)

Hopefully this document makes it a lot easier to understand and work with dlang.org. Good luck!