studies

Assorted study notes and memos

Trying out digraphs in C

Purely a bit of trivia.

According to ISO/IEC 9899:1999 (6.4.6 Punctuators),

In all aspects of the language, the six tokens 67)

<:    :>    <%    %>    %:    %:%:
behave, respectively, the same as the six tokens
[    ]    {    }    #    ##
except for their spelling. 68)

67) These tokens are sometimes called "digraphs".
68) Thus [ and <: behave differently when "stringized" (see 6.10.3.2), but can otherwise be freely interchanged.

So source code like the following is perfectly valid.

%:include <stdio.h>
%:define GLUE(x,y) x%:%:y

/* the familiar hello, world */
int main(void)
<%
    char hello<::> = "hello, world\n";
    int i;

    /* sizeof counts the terminating NUL, so stop one short */
    for (i=0; i<sizeof GLUE(hell,o) - 1; i++) <%
        putchar(hello<:i:>);
    %>
    return 0;
%>
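
Footnote 68 can be checked directly. The following is a minimal sketch, not part of the original program: STR and the two function names are hypothetical helpers introduced here for illustration. It stringizes <: and [ with the # operator and compares the resulting strings.

```c
#include <string.h>

/* STR stringizes its argument with the preprocessor # operator
   (hypothetical helper, introduced only for this illustration). */
#define STR(x) #x

/* Returns 1 if the digraph <: and the bracket [ produce the same text
   when stringized, 0 otherwise. Footnote 68 says they differ. */
int same_spelling_when_stringized(void)
{
    return strcmp(STR(<:), STR([)) == 0;
}

/* Returns the stringized spelling of the digraph <: itself. */
const char *stringized_digraph(void)
{
    return STR(<:);
}
```

Stringizing keeps the spelling of the token as written, so the digraph yields the two characters <: rather than [, and the comparison function reports that the two strings differ.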

There are trigraphs as well.

5.2.1.1 Trigraph sequences
1 All occurrences in a source file of the following sequences of three characters (called trigraph sequences 12)) are replaced with the corresponding single character.

??=  #        ??(  [        ??/  \
??)  ]        ??'  ^        ??<  {
??!  |        ??>  }        ??-  ~
No other trigraph sequences exist. Each ? that does not begin one of the trigraphs listed above is not changed.
2 EXAMPLE The following source line

printf("Eh???/n");

becomes (after replacement of the trigraph sequence ??/)

printf("Eh?\n");

12) The trigraph sequences enable the input of characters that are not defined in the Invariant Code Set as described in ISO/IEC 646, which is a subset of the seven-bit US ASCII code set.

Rewriting the hello, world program above with trigraphs gives:

??=include <stdio.h>
??=define GLUE(x,y) x??=??=y

int main(void)
??<
    char hello??(??) = "hello, world??/n";
    int i = 0;  /* initialized so that the xor below reads a defined value */

    /* sizeof counts the terminating NUL, so stop one short */
    for (i??'=i; ??-i>??-(sizeof GLUE(hell,o) - 1); i++) ??<
        putchar(hello??(i??));
    ??>
    return 0;
??>

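However, gcc in its default GNU mode will not compile this unless the -trigraphs option is given (strict ISO modes such as -std=c99 enable trigraphs implicitly).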

tokens.c:1:1: warning: trigraph ??= ignored, use -trigraphs to enable [-Wtrigraphs]
??=include
^

The reason is that, as the EXAMPLE in the standard shows, trigraphs have a dangerous property: they are replaced even inside string literals.
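
To make the replacement rule of 5.2.1.1 concrete, here is a minimal sketch of it as an ordinary C function. It is an illustration under stated assumptions, not how gcc implements this translation phase: replace_trigraphs is a hypothetical name, and the caller is assumed to supply an output buffer at least as large as the input. Because the scan ignores quotes entirely, it shows why text inside a string literal is affected too.

```c
#include <stddef.h>
#include <string.h>

/* Sketch of trigraph replacement (hypothetical helper).
   Copies src to dst, replacing each of the nine trigraph sequences
   with its single-character equivalent. dst must have room for
   strlen(src) + 1 bytes; the output is never longer than the input.
   Returns the length of the result. */
size_t replace_trigraphs(const char *src, char *dst)
{
    /* third[k] is the third character of a trigraph,
       single[k] is the character that replaces the whole sequence. */
    static const char third[]  = "=(/)'<!>-";
    static const char single[] = "#[\\]^{|}~";
    size_t n = 0;

    while (*src != '\0') {
        /* Two question marks followed by a listed character form a trigraph. */
        if (src[0] == '?' && src[1] == '?' && src[2] != '\0') {
            const char *hit = strchr(third, src[2]);
            if (hit != NULL) {
                dst[n++] = single[hit - third];
                src += 3;
                continue;
            }
        }
        /* Any other character, including a lone question mark, is copied as-is. */
        dst[n++] = *src++;
    }
    dst[n] = '\0';
    return n;
}
```

Given the standard's example line as input (written in a test with \? escapes so that the literal itself cannot be altered by a compiler that has trigraphs enabled), the first ? is copied unchanged and the following three characters collapse into a backslash, leaving printf("Eh?\n"); in the output buffer.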