isposix

A tool for checking whether a command is POSIX

git clone git://bebou.netlib.re/isposix

awk.html (144120B)


      1 <!-- Copyright 2001-2024 IEEE and The Open Group, All Rights Reserved -->
      2 <!DOCTYPE HTML>
      3 <html lang="en">
      4 <head>
      5 <meta name="generator" content="HTML Tidy for HTML5 for Linux version 5.8.0">
      6 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      7 <link type="text/css" rel="stylesheet" href="style.css"><!-- Generated by The Open Group rhtm tool v1.2.4 -->
      8 <!-- Copyright (c) 2001-2024 The Open Group, All Rights Reserved -->
      9 <title>awk</title>
     10 </head>
     11 <body bgcolor="white">
     12 <div class="NAVHEADER">
     13 <table summary="Header navigation table" class="nav" width="100%" border="0" cellpadding="0" cellspacing="0">
     14 <tr class="nav">
     15 <td class="nav" width="15%" align="left" valign="bottom"><a href="../utilities/at.html" accesskey="P">&lt;&lt;&lt;
     16 Previous</a></td>
     17 <td class="nav" width="70%" align="center" valign="bottom"><a href="contents.html">Home</a></td>
     18 <td class="nav" width="15%" align="right" valign="bottom"><a href="../utilities/basename.html" accesskey="N">Next
     19 &gt;&gt;&gt;</a></td>
     20 </tr>
     21 </table>
     22 <hr align="left" width="100%"></div>
     23 <script language="JavaScript" src="../jscript/codes.js"></script><basefont size="3">
     24 <center><font size="2">The Open Group Base Specifications Issue 8<br>
     25 IEEE Std 1003.1-2024<br>
     26 Copyright © 2001-2024 The IEEE and The Open Group</font></center>
     27 <hr size="2" noshade>
     28 <a name="top" id="top"></a> <a name="awk" id="awk"></a> <a name="tag_20_06" id="tag_20_06"></a><!-- awk -->
     29 <h4 class="mansect"><a name="tag_20_06_01" id="tag_20_06_01"></a>NAME</h4>
     30 <blockquote>awk — pattern scanning and processing language</blockquote>
     31 <h4 class="mansect"><a name="tag_20_06_02" id="tag_20_06_02"></a>SYNOPSIS</h4>
     32 <blockquote class="synopsis">
     33 <p><code><tt>awk</tt> <b>[</b><tt>-F</tt> <i>sepstring</i><b>] [</b><tt>-v</tt> <i>assignment</i><b>]</b><tt>...</tt> <i>program</i>
     34 <b>[</b><i>argument</i><tt>...</tt><b>]</b> <tt><br>
     35 <br>
     36 awk</tt> <b>[</b><tt>-F</tt> <i>sepstring</i><b>]</b> <tt>-f</tt> <i>progfile</i> <b>[</b><tt>-f</tt>
     37 <i>progfile</i><b>]</b><tt>...</tt> <b>[</b><tt>-v</tt> <i>assignment</i><b>]</b><tt>...<br>
     38 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</tt> <b>[</b><i>argument</i><tt>...</tt><b>]</b> <tt><br></tt></code></p>
     39 </blockquote>
     40 <h4 class="mansect"><a name="tag_20_06_03" id="tag_20_06_03"></a>DESCRIPTION</h4>
     41 <blockquote>
     42 <p>The <i>awk</i> utility shall execute programs written in the <i>awk</i> programming language, which is specialized for textual
     43 data manipulation. An <i>awk</i> program is a sequence of patterns and corresponding actions. When input is read that matches a
     44 pattern, the action associated with that pattern is carried out.</p>
     45 <p>Input shall be interpreted as a sequence of records. By default, a record is a line, less its terminating &lt;newline&gt;, but
     46 this can be changed by using the <b>RS</b> built-in variable. Each record of input shall be matched in turn against each pattern in
     47 the program. For each pattern matched, the associated action shall be executed.</p>
     48 <p>The <i>awk</i> utility shall interpret each input record as a sequence of fields where, by default, a field is a string of
     49 non-&lt;blank&gt; non-&lt;newline&gt; characters. This default &lt;blank&gt; and &lt;newline&gt; field delimiter can be changed by
     50 using the <b>FS</b> built-in variable or the <b>-F</b> <i>sepstring</i> option. The <i>awk</i> utility shall denote the first field
     51 in a record $1, the second $2, and so on. The symbol $0 shall refer to the entire record; setting any other field causes the
     52 re-evaluation of $0. Assigning to $0 shall reset the values of all other fields and the <b>NF</b> built-in variable.</p>
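<p>As a rough illustration of the record and field model described above, the following sketch prints <b>NF</b>, <tt>$1</tt> and <tt>$0</tt>, and shows that assigning to a field causes <tt>$0</tt> to be rebuilt. The sample input line and the function name <tt>show_fields</tt> are made up for this example.</p>

```shell
# Hypothetical sample record; any blank-separated line would do.
show_fields() {
    printf 'alpha  beta gamma\n' |
    awk '{
        print NF          # number of fields in the record
        print $1          # first field
        $2 = "BETA"       # assigning to a field re-evaluates $0
        print $0          # fields are rejoined with OFS (one space by default)
    }'
}

show_fields
# prints:
# 3
# alpha
# alpha BETA gamma
```

<p>Note that the doubled space in the input disappears from the rebuilt record, because the fields are rejoined with the output field separator.</p>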
     53 </blockquote>
     54 <h4 class="mansect"><a name="tag_20_06_04" id="tag_20_06_04"></a>OPTIONS</h4>
     55 <blockquote>
     56 <p>The <i>awk</i> utility shall conform to XBD <a href="../basedefs/V1_chap12.html#tag_12_02"><i>12.2 Utility Syntax
     57 Guidelines</i></a> .</p>
     58 <p>The following options shall be supported:</p>
     59 <dl compact>
     60 <dd></dd>
     61 <dt><b>-F&nbsp;</b><i>sepstring</i></dt>
     62 <dd>Define the input field separator. This option shall be equivalent to:
     63 <pre>
     64 <tt>-v FS=</tt><i>sepstring
     65 </i></pre>
     66 <p>except that if <b>-F</b> <i>sepstring</i> and <b>-v</b> <i><tt>FS=</tt>sepstring</i> are both used, it is unspecified whether
     67 the <b>FS</b> assignment resulting from <b>-F</b> <i>sepstring</i> is processed in command line order or is processed after the
     68 last <b>-v</b> <i><tt>FS=</tt>sepstring</i>. See the description of the <b>FS</b> built-in variable, and how it is used, in the
     69 EXTENDED DESCRIPTION section.</p>
     70 </dd>
     71 <dt><b>-f&nbsp;</b><i>progfile</i></dt>
     72 <dd>Specify the pathname of the file <i>progfile</i> containing an <i>awk</i> program. A pathname of <tt>'-'</tt> shall denote the
     73 standard input. If multiple instances of this option are specified, the concatenation of the files specified as <i>progfile</i> in
     74 the order specified shall be the <i>awk</i> program. The <i>awk</i> program can alternatively be specified in the command line as a
     75 single argument.</dd>
     76 <dt><b>-v&nbsp;</b><i>assignment</i></dt>
     77 <dd>
     78 The application shall ensure that the <i>assignment</i> argument is in the same form as an <i>assignment</i> operand. The specified
     79 variable assignment shall occur prior to executing the <i>awk</i> program, including the actions associated with <b>BEGIN</b>
     80 patterns (if any). Multiple occurrences of this option can be specified.</dd>
     81 </dl>
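<p>A minimal sketch of the <b>-F</b> and <b>-v</b> options follows. The input data, the variable name <tt>name</tt>, and the function names are assumptions chosen for illustration only.</p>

```shell
# -F sets the input field separator; here a colon, as in passwd-style data.
split_colon() {
    printf 'root:x:0:0\n' | awk -F : '{ print $1, $3 }'
}

# -v assigns before the program starts, so BEGIN actions can see the value.
greet() {
    awk -v name=world 'BEGIN { print "hello, " name }'
}

split_colon   # prints: root 0
greet         # prints: hello, world
```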
     82 </blockquote>
     83 <h4 class="mansect"><a name="tag_20_06_05" id="tag_20_06_05"></a>OPERANDS</h4>
     84 <blockquote>
     85 <p>The following operands shall be supported:</p>
     86 <dl compact>
     87 <dd></dd>
     88 <dt><i>program</i></dt>
     89 <dd>If no <b>-f</b> option is specified, the first operand to <i>awk</i> shall be the text of the <i>awk</i> program. The
     90 application shall supply the <i>program</i> operand as a single argument to <i>awk</i>. If the text does not end in a
     91 &lt;newline&gt;, <i>awk</i> shall interpret the text as if it did.</dd>
     92 <dt><i>argument</i></dt>
     93 <dd>Either of the following two types of <i>argument</i> can be intermixed:
     94 <dl compact>
     95 <dd></dd>
     96 <dt><i>file</i></dt>
     97 <dd>A pathname of a file that contains the input to be read, which is matched against the set of patterns in the program. If no
     98 <i>file</i> operands or their equivalents, achieved by modifying the <i>awk</i> variables <b>ARGV</b> and <b>ARGC</b>, are
     99 specified, or if a <i>file</i> operand is <tt>'-'</tt>, the standard input shall be used.</dd>
    100 <dt><i>assignment</i></dt>
    101 <dd>An operand that begins with an &lt;underscore&gt; or alphabetic character from the portable character set (see the table in XBD
    102 <a href="../basedefs/V1_chap06.html#tag_06_01"><i>6.1 Portable Character Set</i></a> ), followed by a sequence of underscores,
    103 digits, and alphabetics from the portable character set, followed by the <tt>'='</tt> character, shall specify a variable
    104 assignment rather than a pathname. The characters before the <tt>'='</tt> represent the name of an <i>awk</i> variable; if that
    105 name is an <i>awk</i> reserved word (see <a href="#tag_20_06_13_16">Grammar</a> ) the behavior is undefined. The characters
    106 following the &lt;equals-sign&gt; shall be interpreted as if they appeared in the <i>awk</i> program preceded and followed by a
    107 double-quote (<tt>'"'</tt> ) character, as a <b>STRING</b> token (see <a href="#tag_20_06_13_16">Grammar</a> ), except that if the
    108 last character is an unescaped &lt;backslash&gt;, it shall be interpreted as a literal &lt;backslash&gt; rather than as the first
    109 character of the sequence <tt>"\""</tt>. The variable shall be assigned the value of that <b>STRING</b> token; if that value
    110 qualifies as a <i>numeric string</i> (see <a href="#tag_20_06_13_02">Expressions in awk</a> ), the variable shall also be
    111 assigned its numeric value. Each such variable assignment shall occur just prior to the processing of the following <i>file</i>, if
    112 any. Thus, an assignment before the first <i>file</i> argument shall be executed after the <b>BEGIN</b> actions (if any), while an
    113 assignment after the last <i>file</i> argument shall occur before the <b>END</b> actions (if any). If there are no <i>file</i>
    114 arguments or their equivalents, achieved by modifying the <i>awk</i> variables <b>ARGV</b> and <b>ARGC</b>, assignments shall be
    115 executed before processing the standard input.</dd>
    116 </dl>
    117 </dd>
    118 </dl>
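<p>The timing of <i>assignment</i> operands can be seen in the following sketch, in which the variable <tt>n</tt> and the function name <tt>assign_timing</tt> are illustrative assumptions. The operand <tt>n=5</tt> precedes the <i>file</i> operand <tt>-</tt>, so it takes effect after the <b>BEGIN</b> actions but before the first record is read.</p>

```shell
# The operand n=5 is processed just before standard input ("-") is read,
# so it is not yet visible in BEGIN but is visible for records and in END.
assign_timing() {
    printf 'one\n' |
    awk 'BEGIN { print "BEGIN: n=[" n "]" }
         { print "record: n=[" n "]" }
         END { print "END: n=[" n "]" }' n=5 -
}

assign_timing
# prints:
# BEGIN: n=[]
# record: n=[5]
# END: n=[5]
```

<p>When the value is needed inside <b>BEGIN</b>, the <b>-v</b> option is the form to use instead.</p>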
    119 </blockquote>
    120 <h4 class="mansect"><a name="tag_20_06_06" id="tag_20_06_06"></a>STDIN</h4>
    121 <blockquote>
    122 <p>The standard input shall be used only if no <i>file</i> operands or their equivalents, achieved by modifying the <i>awk</i>
    123 variables <b>ARGV</b> and <b>ARGC</b>, are specified; or if a <i>file</i> operand, or its equivalent, is <tt>'-'</tt>; or if a
    124 <i>progfile</i> option-argument is <tt>'-'</tt>; see the INPUT FILES section. If the <i>awk</i> program contains no actions and no
    125 patterns, but is otherwise a valid <i>awk</i> program, standard input and any <i>file</i> operands shall not be read and <i>awk</i>
    126 shall exit with a return status of zero.</p>
    127 </blockquote>
    128 <h4 class="mansect"><a name="tag_20_06_07" id="tag_20_06_07"></a>INPUT FILES</h4>
    129 <blockquote>
    130 <p>Input files to the <i>awk</i> program from any of the following sources shall be text files:</p>
    131 <ul>
    132 <li>
    133 <p>Any <i>file</i> operands or their equivalents, achieved by modifying the <i>awk</i> variables <b>ARGV</b> and <b>ARGC</b></p>
    134 </li>
    135 <li>
    136 <p>Standard input in the absence of any <i>file</i> operands, or their equivalents</p>
    137 </li>
    138 <li>
    139 <p>Arguments to the <b>getline</b> function</p>
    140 </li>
    141 </ul>
    142 <p>Whether the variable <b>RS</b> is set to a value other than a &lt;newline&gt; or not, for these files, implementations shall
    143 support records terminated with the specified separator up to {LINE_MAX} bytes and may support longer records.</p>
    144 <p>If <b>-f</b> <i>progfile</i> is specified, the application shall ensure that the files named by each of the <i>progfile</i>
    145 option-arguments are text files and their concatenation, in the same order as they appear in the arguments, is an <i>awk</i>
    146 program.</p>
    147 </blockquote>
    148 <h4 class="mansect"><a name="tag_20_06_08" id="tag_20_06_08"></a>ENVIRONMENT VARIABLES</h4>
    149 <blockquote>
    150 <p>The following environment variables shall affect the execution of <i>awk</i>:</p>
    151 <dl compact>
    152 <dd></dd>
    153 <dt><i>LANG</i></dt>
    154 <dd>Provide a default value for the internationalization variables that are unset or null. (See XBD <a href=
    155 "../basedefs/V1_chap08.html#tag_08_02"><i>8.2 Internationalization Variables</i></a> for the precedence of internationalization
    156 variables used to determine the values of locale categories.)</dd>
    157 <dt><i>LC_ALL</i></dt>
    158 <dd>If set to a non-empty string value, override the values of all the other internationalization variables.</dd>
    159 <dt><i>LC_COLLATE</i></dt>
    160 <dd>
    161 Determine the locale for the behavior of ranges, equivalence classes, and multi-character collating elements within regular
    162 expressions and in comparisons of string values.</dd>
    163 <dt><i>LC_CTYPE</i></dt>
    164 <dd>Determine the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as
    165 opposed to multi-byte characters in arguments and input files), the behavior of character classes within regular expressions, the
    166 identification of characters as letters, and the mapping of uppercase and lowercase characters for the <b>toupper</b> and
    167 <b>tolower</b> functions.</dd>
    168 <dt><i>LC_MESSAGES</i></dt>
    169 <dd>
    170 Determine the locale that should be used to affect the format and contents of diagnostic messages written to standard error.</dd>
    171 <dt><i>LC_NUMERIC</i></dt>
    172 <dd>
    173 Determine the radix character used when interpreting numeric input, performing conversions between numeric and string values, and
    174 formatting numeric output. Regardless of locale, the &lt;period&gt; character (the decimal-point character of the POSIX locale) is
    175 the decimal-point character recognized in processing <i>awk</i> programs (including assignments in command line arguments).</dd>
    176 <dt><i>NLSPATH</i></dt>
    177 <dd><sup>[<a href="javascript:open_code('XSI')">XSI</a>]</sup> <img src="../images/opt-start.gif" alt="[Option Start]" border="0">
    178 Determine the location of message objects and message catalogs. <img src="../images/opt-end.gif" alt="[Option End]" border=
    179 "0"></dd>
    180 <dt><i>PATH</i></dt>
    181 <dd>Determine the search path when looking for commands executed by <i>system</i>(<i>expr</i>), or input and output pipes; see XBD
    182 <a href="../basedefs/V1_chap08.html#tag_08"><i>8. Environment Variables</i></a> .</dd>
    183 </dl>
    184 <p>In addition, all environment variables shall be visible via the <i>awk</i> variable <b>ENVIRON</b>.</p>
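<p>For example, a variable placed in the environment can be read through <b>ENVIRON</b>. In this sketch <tt>DEMO_VAR</tt> and <tt>show_env</tt> are made-up names used only for illustration.</p>

```shell
# DEMO_VAR is a hypothetical variable name used only for this sketch.
show_env() {
    DEMO_VAR=42 awk 'BEGIN { print ENVIRON["DEMO_VAR"] + 1 }'
}

show_env   # prints: 43
```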
    185 </blockquote>
    186 <h4 class="mansect"><a name="tag_20_06_09" id="tag_20_06_09"></a>ASYNCHRONOUS EVENTS</h4>
    187 <blockquote>
    188 <p>Default.</p>
    189 </blockquote>
    190 <h4 class="mansect"><a name="tag_20_06_10" id="tag_20_06_10"></a>STDOUT</h4>
    191 <blockquote>
    192 <p>The nature of the output files depends on the <i>awk</i> program.</p>
    193 </blockquote>
    194 <h4 class="mansect"><a name="tag_20_06_11" id="tag_20_06_11"></a>STDERR</h4>
    195 <blockquote>
    196 <p>The standard error shall be used only for diagnostic messages.</p>
    197 </blockquote>
    198 <h4 class="mansect"><a name="tag_20_06_12" id="tag_20_06_12"></a>OUTPUT FILES</h4>
    199 <blockquote>
    200 <p>The nature of the output files depends on the <i>awk</i> program.<br></p>
    201 </blockquote>
    202 <h4 class="mansect"><a name="tag_20_06_13" id="tag_20_06_13"></a>EXTENDED DESCRIPTION</h4>
    203 <blockquote>
    204 <h5><a name="tag_20_06_13_01" id="tag_20_06_13_01"></a>Overall Program Structure</h5>
    205 <p>An <i>awk</i> program is composed of pairs of the form:</p>
    206 <pre>
    207 <i>pattern</i><tt> { </tt><i>action</i><tt> }
    208 </tt></pre>
    209 <p>Either the pattern or the action (including the enclosing brace characters) can be omitted.</p>
    210 <p>A missing pattern shall match any record of input, and a missing action shall be equivalent to:</p>
    211 <pre>
    212 <tt>{ print }
    213 </tt></pre>
    214 <p>Execution of the <i>awk</i> program shall start by first executing the actions associated with all <b>BEGIN</b> patterns in the
    215 order they occur in the program. Then each <i>file</i> operand (or standard input if no files were specified) shall be processed in
    216 turn by reading data from the file until a record separator is seen (&lt;newline&gt; by default). Before the first reference to a
    217 field in the record is evaluated, the record shall be split into fields, according to the rules in <a href=
    218 "#tag_20_06_13_04">Regular Expressions</a> , using the value of <b>FS</b> that was current at the time the record was read. Each
    219 pattern in the program then shall be evaluated in the order of occurrence, and the action associated with each pattern that matches
    220 the current record executed. The action for a matching pattern shall be executed before evaluating subsequent patterns. Finally,
    221 the actions associated with all <b>END</b> patterns shall be executed in the order they occur in the program.</p>
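<p>The execution order described above can be sketched as follows. The two-line input and the function name <tt>run_order</tt> are assumptions for the example. One rule has only a pattern, so its action defaults to printing the record, and one has only an action, so it runs for every record.</p>

```shell
run_order() {
    printf 'a\nb\n' |
    awk 'BEGIN { print "start" }
         /a/                      # pattern only: the action defaults to { print }
         { print "saw " $0 }      # action only: runs for every record
         END { print "done" }'
}

run_order
# prints:
# start
# a
# saw a
# saw b
# done
```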
    222 <h5><a name="tag_20_06_13_02" id="tag_20_06_13_02"></a>Expressions in awk</h5>
    223 <p>Expressions describe computations used in <i>patterns</i> and <i>actions</i>. In the following table, valid expression
    224 operations are given in groups from highest precedence first to lowest precedence last, with equal-precedence operators grouped
    225 between horizontal lines. In expression evaluation, where the grammar is formally ambiguous, higher precedence operators shall be
    226 evaluated before lower precedence operators. In this table <i>expr</i>, <i>expr1</i>, <i>expr2</i>, and <i>expr3</i> represent any
    227 expression, while lvalue represents any entity that can be assigned to (that is, on the left side of an assignment operator). The
    228 precise syntax of expressions is given in <a href="#tag_20_06_13_16">Grammar</a> .</p>
    229 <p class="caption"><a name="tagtcjh_14" id="tagtcjh_14"></a> Table: Expressions in Decreasing Precedence in awk</p>
    230 <center>
    231 <table border="1" cellpadding="3" align="center">
    232 <tr valign="top">
    233 <th align="center">
    234 <p class="tent"><b>Syntax</b></p>
    235 </th>
    236 <th align="center">
    237 <p class="tent"><b>Name</b></p>
    238 </th>
    239 <th align="center">
    240 <p class="tent"><b>Type of Result</b></p>
    241 </th>
    242 <th align="center">
    243 <p class="tent"><b>Associativity</b></p>
    244 </th>
    245 </tr>
    246 
    247 <tr valign="top">
    248 <td align="left">
    249 <p class="tent">(<i>expr</i>)</p>
    250 </td>
    251 <td align="left">
    252 <p class="tent">Grouping</p>
    253 </td>
    254 <td align="left">
    255 <p class="tent">Type of <i>expr</i></p>
    256 </td>
    257 <td align="left">
    258 <p class="tent">N/A</p>
    259 </td>
    260 </tr>
    261 
    262 <tr valign="top">
    263 <td align="left">
    264 <p class="tent">$<i>expr</i></p>
    265 </td>
    266 <td align="left">
    267 <p class="tent">Field reference</p>
    268 </td>
    269 <td align="left">
    270 <p class="tent">Uninitialized or String</p>
    271 </td>
    272 <td align="left">
    273 <p class="tent">N/A</p>
    274 </td>
    275 </tr>
    276 
    277 <tr valign="top">
    278 <td align="left">
    279 <p class="tent">lvalue ++</p>
    280 <p class="tent">lvalue --</p>
    281 </td>
    282 <td align="left">
    283 <p class="tent">Post-increment</p>
    284 <p class="tent">Post-decrement</p>
    285 </td>
    286 <td align="left">
    287 <p class="tent">Numeric</p>
    288 <p class="tent">Numeric</p>
    289 </td>
    290 <td align="left">
    291 <p class="tent">N/A</p>
    292 <p class="tent">N/A</p>
    293 </td>
    294 </tr>
    295 
    296 <tr valign="top">
    297 <td align="left">
    298 <p class="tent">++ lvalue</p>
    299 <p class="tent">-- lvalue</p>
    300 </td>
    301 <td align="left">
    302 <p class="tent">Pre-increment</p>
    303 <p class="tent">Pre-decrement</p>
    304 </td>
    305 <td align="left">
    306 <p class="tent">Numeric</p>
    307 <p class="tent">Numeric</p>
    308 </td>
    309 <td align="left">
    310 <p class="tent">N/A</p>
    311 <p class="tent">N/A</p>
    312 </td>
    313 </tr>
    314 
    315 
    316 <tr valign="top">
    317 <td align="left">
    318 <p class="tent"><i>expr</i> ^ <i>expr</i></p>
    319 </td>
    320 <td align="left">
    321 <p class="tent">Exponentiation</p>
    322 </td>
    323 <td align="left">
    324 <p class="tent">Numeric</p>
    325 </td>
    326 <td align="left">
    327 <p class="tent">Right</p>
    328 </td>
    329 </tr>
    330 
    331 <tr valign="top">
    332 <td align="left">
    333 <p class="tent">! <i>expr</i></p>
    334 <p class="tent">+ <i>expr</i></p>
    335 <p class="tent">- <i>expr</i></p>
    336 </td>
    337 <td align="left">
    338 <p class="tent">Logical not</p>
    339 <p class="tent">Unary plus</p>
    340 <p class="tent">Unary minus</p>
    341 </td>
    342 <td align="left">
    343 <p class="tent">Numeric</p>
    344 <p class="tent">Numeric</p>
    345 <p class="tent">Numeric</p>
    346 </td>
    347 <td align="left">
    348 <p class="tent">N/A</p>
    349 <p class="tent">N/A</p>
    350 <p class="tent">N/A</p>
    351 </td>
    352 </tr>
    353 
    354 <tr valign="top">
    355 <td align="left">
    356 <p class="tent"><i>expr</i> * <i>expr</i></p>
    357 <p class="tent"><i>expr</i> / <i>expr</i></p>
    358 <p class="tent"><i>expr</i> % <i>expr</i></p>
    359 </td>
    360 <td align="left">
    361 <p class="tent">Multiplication</p>
    362 <p class="tent">Division</p>
    363 <p class="tent">Modulus</p>
    364 </td>
    365 <td align="left">
    366 <p class="tent">Numeric</p>
    367 <p class="tent">Numeric</p>
    368 <p class="tent">Numeric</p>
    369 </td>
    370 <td align="left">
    371 <p class="tent">Left</p>
    372 <p class="tent">Left</p>
    373 <p class="tent">Left</p>
    374 </td>
    375 </tr>
    376 
    377 <tr valign="top">
    378 <td align="left">
    379 <p class="tent"><i>expr</i> + <i>expr</i></p>
    380 <p class="tent"><i>expr</i> - <i>expr</i></p>
    381 </td>
    382 <td align="left">
    383 <p class="tent">Addition</p>
    384 <p class="tent">Subtraction</p>
    385 </td>
    386 <td align="left">
    387 <p class="tent">Numeric</p>
    388 <p class="tent">Numeric</p>
    389 </td>
    390 <td align="left">
    391 <p class="tent">Left</p>
    392 <p class="tent">Left</p>
    393 </td>
    394 </tr>
    395 
    396 
    397 <tr valign="top">
    398 <td align="left">
    399 <p class="tent"><i>expr</i> <i>expr</i></p>
    400 </td>
    401 <td align="left">
    402 <p class="tent">String concatenation</p>
    403 </td>
    404 <td align="left">
    405 <p class="tent">String</p>
    406 </td>
    407 <td align="left">
    408 <p class="tent">Left</p>
    409 </td>
    410 </tr>
    411 
    412 <tr valign="top">
    413 <td align="left">
    414 <p class="tent"><i>expr</i> &lt; <i>expr</i></p>
    415 <p class="tent"><i>expr</i> &lt;= <i>expr</i></p>
    416 <p class="tent"><i>expr</i> != <i>expr</i></p>
    417 <p class="tent"><i>expr</i> == <i>expr</i></p>
    418 <p class="tent"><i>expr</i> &gt; <i>expr</i></p>
    419 <p class="tent"><i>expr</i> &gt;= <i>expr</i></p>
    420 </td>
    421 <td align="left">
    422 <p class="tent">Less than</p>
    423 <p class="tent">Less than or equal to</p>
    424 <p class="tent">Not equal to</p>
    425 <p class="tent">Equal to</p>
    426 <p class="tent">Greater than</p>
    427 <p class="tent">Greater than or equal to</p>
    428 </td>
    429 <td align="left">
    430 <p class="tent">Numeric</p>
    431 <p class="tent">Numeric</p>
    432 <p class="tent">Numeric</p>
    433 <p class="tent">Numeric</p>
    434 <p class="tent">Numeric</p>
    435 <p class="tent">Numeric</p>
    436 </td>
    437 <td align="left">
    438 <p class="tent">None</p>
    439 <p class="tent">None</p>
    440 <p class="tent">None</p>
    441 <p class="tent">None</p>
    442 <p class="tent">None</p>
    443 <p class="tent">None</p>
    444 </td>
    445 </tr>
    446 
    447 
    448 <tr valign="top">
    449 <td align="left">
    450 <p class="tent"><i>expr</i> ~ <i>expr</i></p>
    451 <p class="tent"><i>expr</i> !~ <i>expr</i></p>
    452 </td>
    453 <td align="left">
    454 <p class="tent">ERE match</p>
    455 <p class="tent">ERE non-match</p>
    456 </td>
    457 <td align="left">
    458 <p class="tent">Numeric</p>
    459 <p class="tent">Numeric</p>
    460 </td>
    461 <td align="left">
    462 <p class="tent">None</p>
    463 <p class="tent">None</p>
    464 </td>
    465 </tr>
    466 
    467 <tr valign="top">
    468 <td align="left">
    469 <p class="tent"><i>expr</i> in <i>array</i></p>
    470 <p class="tent">(<i>index</i>) in <i>array</i></p>
    471 </td>
    472 <td align="left">
    473 <p class="tent">Array membership</p>
    474 <p class="tent">Multi-dimension array membership</p>
    475 </td>
    476 <td align="left">
    477 <p class="tent">Numeric</p>
    478 <p class="tent">Numeric</p>
    479 </td>
    480 <td align="left">
    481 <p class="tent">Left</p>
    482 <p class="tent">Left</p>
    483 </td>
    484 </tr>
    485 
    486 <tr valign="top">
    487 <td align="left">
    488 <p class="tent"><i>expr</i> &amp;&amp; <i>expr</i></p>
    489 </td>
    490 <td align="left">
    491 <p class="tent">Logical AND</p>
    492 </td>
    493 <td align="left">
    494 <p class="tent">Numeric</p>
    495 </td>
    496 <td align="left">
    497 <p class="tent">Left</p>
    498 </td>
    499 </tr>
    500 
    501 <tr valign="top">
    502 <td align="left">
    503 <p class="tent"><i>expr</i> || <i>expr</i></p>
    504 </td>
    505 <td align="left">
    506 <p class="tent">Logical OR</p>
    507 </td>
    508 <td align="left">
    509 <p class="tent">Numeric</p>
    510 </td>
    511 <td align="left">
    512 <p class="tent">Left</p>
    513 </td>
    514 </tr>
    515 
    516 <tr valign="top">
    517 <td align="left">
    518 <p class="tent"><i>expr1</i> ? <i>expr2</i> : <i>expr3</i></p>
    519 </td>
    520 <td align="left">
    521 <p class="tent">Conditional expression</p>
    522 </td>
    523 <td align="left">
    524 <p class="tent">Type of selected<br><i>expr2</i> or <i>expr3</i></p>
    525 </td>
    526 <td align="left">
    527 <p class="tent">Right</p>
    528 </td>
    529 </tr>
    530 
    531 <tr valign="top">
    532 <td align="left">
    533 <p class="tent">lvalue ^= <i>expr</i></p>
    534 <p class="tent">lvalue %= <i>expr</i></p>
    535 <p class="tent">lvalue *= <i>expr</i></p>
    536 <p class="tent">lvalue /= <i>expr</i></p>
    537 <p class="tent">lvalue += <i>expr</i></p>
    538 <p class="tent">lvalue -= <i>expr</i></p>
    539 <p class="tent">lvalue = <i>expr</i></p>
    540 </td>
    541 <td align="left">
    542 <p class="tent">Exponentiation assignment</p>
    543 <p class="tent">Modulus assignment</p>
    544 <p class="tent">Multiplication assignment</p>
    545 <p class="tent">Division assignment</p>
    546 <p class="tent">Addition assignment</p>
    547 <p class="tent">Subtraction assignment</p>
    548 <p class="tent">Assignment</p>
    549 </td>
    550 <td align="left">
    551 <p class="tent">Numeric</p>
    552 <p class="tent">Numeric</p>
    553 <p class="tent">Numeric</p>
    554 <p class="tent">Numeric</p>
    555 <p class="tent">Numeric</p>
    556 <p class="tent">Numeric</p>
    557 <p class="tent">Type of <i>expr</i></p>
    558 </td>
    559 <td align="left">
    560 <p class="tent">Right</p>
    561 <p class="tent">Right</p>
    562 <p class="tent">Right</p>
    563 <p class="tent">Right</p>
    564 <p class="tent">Right</p>
    565 <p class="tent">Right</p>
    566 <p class="tent">Right</p>
    567 </td>
    568 </tr>
    569 
    570 </table>
    571 </center>
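<p class="tent">A few rows of the table can be checked with a short sketch such as the one below. The expressions and the function name <tt>precedence_demo</tt> are chosen purely for illustration.</p>

```shell
precedence_demo() {
    awk 'BEGIN {
        print 2 ^ 3 ^ 2      # ^ is right-associative: 2 ^ (3 ^ 2)
        print -2 ^ 2         # ^ binds tighter than unary minus: -(2 ^ 2)
        print 1 " " 2 + 3    # + binds tighter than concatenation: 1 " " (2 + 3)
        print 2 * 3 % 4      # * and % share a level and group left: (2 * 3) % 4
    }'
}

precedence_demo
# prints:
# 512
# -4
# 1 5
# 2
```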
    572 <p class="tent">Each expression shall have either a string value, a numeric value, or both. Except as stated for specific contexts,
    573 the value of an expression shall be implicitly converted to the type needed for the context in which it is used. A string value
    574 shall be converted to a numeric value either by the equivalent of the following calls to functions defined by the ISO&nbsp;C
    575 standard:</p>
    576 <pre>
    577 <tt>setlocale(LC_NUMERIC, "");
    578 </tt><i>numeric_value</i><tt> = atof(</tt><i>string_value</i><tt>);
    579 </tt></pre>
    580 <p class="tent">or by converting the initial portion of the string to type <b>double</b> representation as follows:</p>
    581 <blockquote>The input string is decomposed into two parts: an initial, possibly empty, sequence of white-space characters (as
    582 specified by <a href="../functions/isspace.html"><i>isspace</i>()</a>) and a subject sequence interpreted as a floating-point
    583 constant.
    584 <p class="tent">The expected form of the subject sequence is an optional <tt>'+'</tt> or <tt>'-'</tt> sign, then a non-empty
    585 sequence of digits optionally containing a radix character, then an optional exponent part. An exponent part consists of
    586 <tt>'e'</tt> or <tt>'E'</tt>, followed by an optional sign, followed by one or more decimal digits.</p>
    587 <p class="tent">The sequence starting with the first digit or the radix character (whichever occurs first) is interpreted as a
    588 floating constant of the C language, except that the radix character shall be used in place of a &lt;period&gt;, and if neither an
    589 exponent part nor a radix character appears, a radix character is assumed to follow the last digit in the string. If the subject
    590 sequence begins with a &lt;hyphen-minus&gt;, the value resulting from the conversion is negated.</p>
    591 </blockquote>
    592 <p class="tent">A numeric value that is exactly equal to the value of an integer (see <a href=
    593 "../utilities/V3_chap01.html#tag_18_01_02"><i>1.1.2 Concepts Derived from the ISO C Standard</i></a> ) shall be converted to a
    594 string by the equivalent of a call to the <b>sprintf</b> function (see <a href="#tag_20_06_13_13">String Functions</a> ) with the
    595 string <tt>"%d"</tt> as the <i>fmt</i> argument and the numeric value being converted as the first and only <i>expr</i> argument.
    596 Any other numeric value shall be converted to a string by the equivalent of a call to the <b>sprintf</b> function with the value of
    597 the variable <b>CONVFMT</b> as the <i>fmt</i> argument and the numeric value being converted as the first and only <i>expr</i>
    598 argument. The result of the conversion is unspecified if the value of <b>CONVFMT</b> is not a floating-point format specification.
    599 This volume of POSIX.1-2024 specifies no explicit conversions between numbers and strings. An application can force an expression
    600 to be treated as a number by adding zero to it, or can force it to be treated as a string by concatenating the null string
    601 (<tt>""</tt>) to it.</p>
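<p class="tent">The conversion rules above might be exercised as in the following sketch. The variable names and the function name <tt>convert_demo</tt> are illustrative, and <tt>LC_ALL=C</tt> is set only so that the radix character is a &lt;period&gt; regardless of the caller's locale.</p>

```shell
convert_demo() {
    LC_ALL=C awk 'BEGIN {
        CONVFMT = "%.2g"
        a = 3.14159
        b = a ""        # concatenating "" forces a string, using CONVFMT
        print b
        c = 17
        d = c ""        # integer values are converted as if by "%d"
        print d
        e = "3x" + 0    # adding zero forces a number; the leading "3" is used
        print e
    }'
}

convert_demo
# prints:
# 3.1
# 17
# 3
```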
    602 <p class="tent">A string value shall be considered a <i>numeric string</i> if it comes from one of the following:</p>
    603 <ol>
    604 <li class="tent">Field variables</li>
    605 <li class="tent">Input from the <i>getline</i>() function</li>
    606 <li class="tent"><b>FILENAME</b></li>
    607 <li class="tent"><b>ARGV</b> array elements</li>
    608 <li class="tent"><b>ENVIRON</b> array elements</li>
    609 <li class="tent">Array elements created by the <i>split</i>() function</li>
    610 <li class="tent">A command line variable assignment</li>
    611 <li class="tent">Variable assignment from another numeric string variable</li>
    612 </ol>
    613 <p class="tent">and an implementation-dependent condition corresponding to either case (a) or (b) below is met.</p>
    614 <ol type="a">
    615 <li class="tent">After the equivalent of the following calls to functions defined by the ISO&nbsp;C standard,
    616 <i>string_value_end</i> would differ from <i>string_value</i>, and any characters before the terminating null character in
    617 <i>string_value_end</i> would be &lt;blank&gt; characters:
    618 <pre>
    619 <tt>char *string_value_end;
    620 setlocale(LC_NUMERIC, "");
    621 numeric_value = strtod (string_value, &amp;string_value_end);
    622 </tt></pre></li>
    623 <li class="tent">After all the following conversions have been applied, the resulting string would lexically be recognized as a
    624 <b>NUMBER</b> token as described by the lexical conventions in <a href="#tag_20_06_13_16">Grammar</a> :
    625 <ul>
    626 <li class="tent">All leading and trailing &lt;blank&gt; characters are discarded.</li>
    627 <li class="tent">If the first non-&lt;blank&gt; is <tt>'+'</tt> or <tt>'-'</tt>, it is discarded.</li>
    628 <li class="tent">Each occurrence of the radix character from the current locale is changed to a &lt;period&gt;.</li>
    629 </ul>
    630 </li>
    631 </ol>
    632 In case (a) the numeric value of the <i>numeric string</i> shall be the value that would be returned by the <a href=
    633 "../functions/strtod.html"><i>strtod</i>()</a> call. In case (b) if the first non-&lt;blank&gt; is <tt>'-'</tt>, the numeric value
    634 of the <i>numeric string</i> shall be the negation of the numeric value of the recognized <b>NUMBER</b> token; otherwise, the
    635 numeric value of the <i>numeric string</i> shall be the numeric value of the recognized <b>NUMBER</b> token. Whether or not a
    636 string is a <i>numeric string</i> shall be relevant only in contexts where that term is used in this section.
    637 <p class="tent">When an expression is used in a Boolean context, if it has a numeric value, a value of zero shall be treated as
    638 false and any other value shall be treated as true. Otherwise, a string value of the null string shall be treated as false and any
    639 other value shall be treated as true. A Boolean context shall be one of the following:</p>
    640 <ul>
    641 <li class="tent">The first subexpression of a conditional expression</li>
    642 <li class="tent">An expression operated on by logical NOT, logical AND, or logical OR</li>
    643 <li class="tent">The second expression of a <b>for</b> statement</li>
    644 <li class="tent">The expression of an <b>if</b> statement</li>
    645 <li class="tent">The expression of the <b>while</b> clause in either a <b>while</b> or <b>do</b>...<b>while</b> statement</li>
    646 <li class="tent">An expression used as a pattern (as in Overall Program Structure)</li>
    647 </ul>
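<p class="tent">A minimal sketch of these rules, using the first subexpression of a conditional expression as the Boolean context. The input line and the variable <tt>out</tt> are assumptions for illustration; the last line relies on the field being treated as a numeric string.</p>

```shell
out=$(printf '0\n' | awk '{
    a = (0   ? "true" : "false")    # numeric zero is false
    b = (""  ? "true" : "false")    # the null string is false
    c = ("0" ? "true" : "false")    # a non-null string constant is true
    d = ($1  ? "true" : "false")    # the field 0 is a numeric string with value zero
    print a, b, c, d
}')
printf '%s\n' "$out"
```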
    648 <p class="tent">All arithmetic shall follow the semantics of floating-point arithmetic as specified by the ISO&nbsp;C standard (see
    649 <a href="../utilities/V3_chap01.html#tag_18_01_02"><i>1.1.2 Concepts Derived from the ISO C Standard</i></a> ).</p>
    650 <p class="tent">The value of the expression:</p>
    651 <pre>
    652 <i>expr1</i><tt> ^ </tt><i>expr2</i><tt>
    653 </tt></pre>
    654 <p class="tent">shall be equivalent to the value returned by the ISO&nbsp;C standard function call:</p>
    655 <pre>
    656 <tt>pow(</tt><i>expr1</i><tt>, </tt><i>expr2</i><tt>)
    657 </tt></pre>
    658 <p class="tent">The expression:</p>
    659 <pre>
    660 <tt>lvalue ^= </tt><i>expr</i><tt>
    661 </tt></pre>
    662 <p class="tent">shall be equivalent to the ISO&nbsp;C standard expression:</p>
    663 <pre>
    664 <tt>lvalue = pow(lvalue, </tt><i>expr</i><tt>)
    665 </tt></pre>
    666 <p class="tent">except that lvalue shall be evaluated only once. The value of the expression:</p>
    667 <pre>
    668 <i>expr1</i><tt> % </tt><i>expr2</i><tt>
    669 </tt></pre>
    670 <p class="tent">shall be equivalent to the value returned by the ISO&nbsp;C standard function call:</p>
    671 <pre>
    672 <tt>fmod(</tt><i>expr1</i><tt>, </tt><i>expr2</i><tt>)
    673 </tt></pre>
    674 <p class="tent">The expression:</p>
    675 <pre>
    676 <tt>lvalue %= </tt><i>expr</i><tt>
    677 </tt></pre>
    678 <p class="tent">shall be equivalent to the ISO&nbsp;C standard expression:</p>
    679 <pre>
    680 <tt>lvalue = fmod(lvalue, </tt><i>expr</i><tt>)
    681 </tt></pre>
    682 <p class="tent">except that lvalue shall be evaluated only once.</p>
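<p class="tent">The following sketch exercises the operators described above; the operand values are arbitrary and the variable <tt>out</tt> is an assumption of the example. Because <tt>%</tt> follows <i>fmod</i>(), the result takes the sign of the first operand and can be fractional.</p>

```shell
out=$(awk 'BEGIN {
    print 2 ^ 10, 7 % 3, -7 % 3, 5.5 % 2
    x = 2;  x ^= 3     # same as x = pow(x, 3)
    y = 10; y %= 4     # same as y = fmod(y, 4)
    print x, y
}')
printf '%s\n' "$out"
```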
    683 <p class="tent">Variables and fields shall be set by the assignment statement:</p>
    684 <pre>
    685 <tt>lvalue = </tt><i>expression</i><tt>
    686 </tt></pre>
    687 <p class="tent">and the type of <i>expression</i> shall determine the resulting variable type. The assignment includes the
    688 arithmetic assignments (<tt>"+="</tt>, <tt>"-="</tt>, <tt>"*="</tt>, <tt>"/="</tt>, <tt>"%="</tt>, <tt>"^="</tt>, <tt>"++"</tt>,
    689 <tt>"--"</tt>) all of which shall produce a numeric result. The left-hand side of an assignment and the target of increment and
    690 decrement operators can be one of a variable, an array with index, or a field selector.</p>
    691 <p class="tent">The <i>awk</i> language supplies arrays that are used for storing numbers or strings. Arrays need not be declared.
    692 They shall initially be empty, and their sizes shall change dynamically. The subscripts, or element identifiers, are strings,
    693 providing a type of associative array capability. An array name followed by a subscript within square brackets can be used as an
    694 lvalue and thus as an expression, as described in the grammar; see <a href="#tag_20_06_13_16">Grammar</a> . Unsubscripted array
    695 names can be used in only the following contexts:</p>
    696 <ul>
    697 <li class="tent">A parameter in a function definition or function call</li>
    698 <li class="tent">The <b>NAME</b> token following any use of the keyword <b>in</b> as specified in the grammar (see <a href=
    699 "#tag_20_06_13_16">Grammar</a> ); if the name used in this context is not an array name, the behavior is undefined</li>
    700 <li class="tent">The <b>NAME</b> token following the keyword <b>delete</b> without a subscript as specified in the grammar (see
    701 <a href="#tag_20_06_13_16">Grammar</a> ); if the name used in this context is not an array name, the behavior is undefined.</li>
    702 </ul>
    703 <p class="tent">A valid array <i>index</i> shall consist of one or more &lt;comma&gt;-separated expressions, similar to the way in
    704 which multi-dimensional arrays are indexed in some programming languages. Because <i>awk</i> arrays are really one-dimensional,
    705 such a &lt;comma&gt;-separated list shall be converted to a single string by concatenating the string values of the separate
    706 expressions, each separated from the other by the value of the <b>SUBSEP</b> variable. Thus, the following two index operations
    707 shall be equivalent:</p>
    708 <pre>
    709 <i>var</i><b>[</b><i>expr1</i><tt>, </tt><i>expr2</i><tt>, ... </tt><i>exprn</i><b>]
    710 <br class="tent">
    711 </b><i>var</i><b>[</b><i>expr1</i><tt> SUBSEP </tt><i>expr2</i><tt> SUBSEP ... SUBSEP </tt><i>exprn</i><b>]</b><tt>
    712 </tt></pre>
    713 <p class="tent">The application shall ensure that a multi-dimensioned <i>index</i> used with the <b>in</b> operator is
    714 parenthesized. The <b>in</b> operator, which tests for the existence of a particular array element, shall not cause that element to
    715 exist. Any other reference to a nonexistent array element shall automatically create it.</p>
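<p class="tent">A short sketch of multi-dimensional indexing and the <b>in</b> operator. The array name <tt>grid</tt> and its keys are hypothetical; the final count shows that testing for <tt>"q"</tt> did not create an element.</p>

```shell
out=$(awk 'BEGIN {
    grid["x", "y"] = 1
    key = "x" SUBSEP "y"
    same  = (key in grid)            # the comma form and the SUBSEP form name one element
    multi = (("x", "y") in grid)     # a multi-dimensional index must be parenthesized
    other = ("q" in grid)            # existence test only, nothing is created
    n = 0
    for (k in grid) n++
    print same, multi, other, n
}')
printf '%s\n' "$out"
```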
    716 <p class="tent">Comparisons (with the <tt>'&lt;'</tt>, <tt>"&lt;="</tt>, <tt>"!="</tt>, <tt>"=="</tt>, <tt>'&gt;'</tt>, and
    717 <tt>"&gt;="</tt> operators) shall be made numerically:</p>
    718 <ul>
    719 <li class="tent">if both operands are numeric,</li>
    720 <li class="tent">if one is numeric and the other has a string value that is a numeric string,</li>
    721 <li class="tent">if both have string values that are numeric strings, or</li>
    722 <li class="tent">if one is numeric and the other has the uninitialized value.</li>
    723 </ul>
    724 <p class="tent">Otherwise, operands shall be converted to strings as required and a string comparison shall be made as follows:</p>
    725 <ul>
    726 <li class="tent">For the <tt>"!="</tt> and <tt>"=="</tt> operators, the strings shall be compared to check if they are identical
    727 (not to check if they collate equally).</li>
    728 <li class="tent">For the other operators, the strings shall be compared using the locale-specific collation sequence.</li>
    729 </ul>
    730 <p class="tent">The value of the comparison expression shall be 1 if the relation is true, or 0 if the relation is false.</p>
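<p class="tent">The difference between numeric and string comparison can be sketched as follows, assuming an input line whose two fields are numeric strings. The names <tt>numeric</tt>, <tt>textual</tt>, and <tt>out</tt> are illustrative only.</p>

```shell
out=$(printf '1.0 1\n' | awk '{
    numeric = ($1 == $2)                 # both fields are numeric strings: 1.0 equals 1
    textual = (($1 "") == ($2 ""))       # concatenation yields plain strings, which differ
    print numeric, textual
}')
printf '%s\n' "$out"
```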
    731 <h5><a name="tag_20_06_13_03" id="tag_20_06_13_03"></a>Variables and Special Variables</h5>
    732 <p class="tent">Variables can be used in an <i>awk</i> program by referencing them. With the exception of function parameters (see
    733 <a href="#tag_20_06_13_15">User-Defined Functions</a> ), they are not explicitly declared. Function parameter names shall be local
    734 to the function; all other variable names shall be global. The same name shall not be used as both a function parameter name and as
    735 the name of a function or a special <i>awk</i> variable. The same name shall not be used both as a variable name with global scope
    736 and as the name of a function. The same name shall not be used within the same scope both as a scalar variable and as an array.
    737 Uninitialized variables, including scalar variables, array elements, and field variables, shall have an uninitialized value. An
    738 uninitialized value shall have both a numeric value of zero and a string value of the empty string. Evaluation of variables with an
    739 uninitialized value, to either string or numeric, shall be determined by the context in which they are used.</p>
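<p class="tent">As a rough illustration, a variable that has never been assigned (here the hypothetical name <tt>unset_var</tt>) compares equal to both zero and the empty string, depending on context:</p>

```shell
out=$(awk 'BEGIN {
    as_number = (unset_var == 0)     # numeric context: value zero
    as_string = (unset_var == "")    # string context: empty string
    print as_number, as_string, length(unset_var)
}')
printf '%s\n' "$out"
```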
    740 <p class="tent">Field variables shall be designated by a <tt>'$'</tt> followed by a number or numerical expression. The effect of
    741 the field number <i>expression</i> evaluating to anything other than a non-negative integer is unspecified; uninitialized variables
    742 or string values need not be converted to numeric values in this context. New field variables can be created by assigning a value
    743 to them. References to nonexistent fields (that is, fields after $<b>NF</b>), shall evaluate to the uninitialized value. Such
    744 references shall not create new fields. However, assigning to a nonexistent field (for example, $(<b>NF</b>+2)=5) shall increase
    745 the value of <b>NF</b>; create any intervening fields with the uninitialized value; and cause the value of $0 to be recomputed,
    746 with the fields being separated by the value of <b>OFS</b>. Each field variable shall have a string value or an uninitialized value
    747 when created. Field variables shall have the uninitialized value when created from $0 using <b>FS</b> and the variable does not
    748 contain any characters. If appropriate, the field variable shall be considered a numeric string (see <a href=
    749 "#tag_20_06_13_02">Expressions in awk</a> ).</p>
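<p class="tent">The difference between referencing and assigning a nonexistent field can be sketched like this. The input record and the choice of <tt>"-"</tt> for <b>OFS</b> are assumptions made so that the recomputed $0 is easy to read.</p>

```shell
out=$(printf 'a b c\n' | awk 'BEGIN { OFS = "-" } {
    unused = $(NF + 2)     # reference only: no field is created, NF stays 3
    before = NF
    $(NF + 2) = "e"        # assignment: NF becomes 5 and $0 is rebuilt with OFS
    print before, NF
    print $0
}')
printf '%s\n' "$out"
```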
    750 <p class="tent">Implementations shall support the following other special variables that are set by <i>awk</i>:</p>
    751 <dl compact>
    752 <dd></dd>
    753 <dt><b>ARGC</b></dt>
    754 <dd>A number determining when the iteration described for <b>ARGV</b> stops. When an <i>awk</i> program starts, <b>ARGC</b> shall
    755 be initialized to the number of elements in the <b>ARGV</b> array. <b>ARGC</b> can be updated by the <i>awk</i> program and by
    756 assignment operands. If <b>ARGC</b> is set to a value less than 1, the behavior is unspecified. It is unspecified whether
    757 alterations to <b>ARGC</b> can be made using the <b>-v</b> option.</dd>
    758 <dt><b>ARGV</b></dt>
    759 <dd>An array containing, initially, the command name (see <a href="../utilities/V3_chap02.html#tag_19_09_01"><i>2.9.1 Simple
    760 Commands</i></a> ) used to invoke <i>awk</i> in <tt>ARGV[0]</tt> and the command line arguments, if any, excluding options and the
    761 <i>program</i> operand, in <tt>ARGV[1]</tt> through <tt>ARGV[ARGC-1]</tt>. The elements in <b>ARGV</b> can be assigned new values
    762 or deleted, and new elements can be added. Note that alterations to <b>ARGV</b> cannot be made using either the <i>assignment</i>
    763 operand or the <b>-v</b> option, because an operand with a <tt>'['</tt> before <tt>'='</tt> is treated as a <i>file</i> operand,
    764 not an <i>assignment</i> operand, and applications are required to ensure that the <b>-v</b> option-argument has the same form as
    765 an <i>assignment</i> operand. (See the OPTIONS and OPERANDS sections.)
    766 <p class="tent">After processing the <b>BEGIN</b> actions, if any, <i>awk</i> begins iterating over the elements of <b>ARGV</b>,
    767 processing them as if they were <i>argument</i> operands. It shall behave as if the implementation maintains an internal counter
    768 that is initialized to 1 and increments by 1 at the end of each iteration. For each iteration, the following shall occur:</p>
    769 <ul>
    770 <li class="tent">If the internal counter is greater than or equal to the current value of <b>ARGC</b> and no <i>file</i> operands
    771 have been processed, <i>awk</i> shall set <b>FILENAME</b> to <tt>'-'</tt> and process standard input as if it was given as a file
    772 operand. The internal counter shall not be incremented at the end of this iteration.</li>
    773 <li class="tent">Otherwise, if the internal counter is greater than or equal to the current value of <b>ARGC</b>, the iterations
    774 shall stop and processing of the <b>END</b> actions, if any, shall begin. Any <b>ARGV</b> elements with index values greater than
    775 or equal to <b>ARGC</b> shall not be processed as <i>argument</i> operands.</li>
    776 <li class="tent">Otherwise, if the element <tt>ARGV[</tt> <i>internal counter value</i><tt>]</tt> does not exist, it is unspecified
    777 whether that element is created. No other action shall be taken.</li>
    778 <li class="tent">Otherwise, if <tt>ARGV[</tt> <i>internal counter value</i><tt>]</tt> is a null string, no action shall be
    779 taken.</li>
    780 <li class="tent">Otherwise, if <tt>ARGV[</tt> <i>internal counter value</i><tt>]</tt> matches the format of an <i>assignment</i>
    781 operand (see OPERANDS), <i>awk</i> shall process the assignment.</li>
    782 <li class="tent">Otherwise, <tt>ARGV[</tt> <i>internal counter value</i><tt>]</tt> shall be treated as a <i>file</i> operand,
    783 <b>FILENAME</b> shall be set to that value, and the named file, or standard input if the value is <tt>'-'</tt>, shall be processed
    784 as an input file.</li>
    785 </ul>
    786 <p class="tent">Since only non-null elements are processed, setting an element of <b>ARGV</b> to the null string or deleting it
    787 means that it shall not be treated as an <i>argument</i> operand.</p>
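<p class="tent">A small sketch of the iteration described above. The operand names <tt>one</tt>, <tt>two</tt>, and <tt>ignored-operand</tt> are hypothetical and are never opened: the first program has only a <b>BEGIN</b> action, and the second sets its only operand to the null string so that standard input is read instead.</p>

```shell
# ARGV[0] is the command name; operands start at ARGV[1].
args=$(awk 'BEGIN { print ARGC; print ARGV[1]; print ARGV[2] }' one two)

# A null ARGV element is skipped, so no file operand is processed and stdin is used.
skipped=$(printf 'from stdin\n' | awk 'BEGIN { ARGV[1] = "" } { print $0 }' ignored-operand)

printf '%s\n' "$args" "$skipped"
```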
    788 </dd>
    789 <dt><b>CONVFMT</b></dt>
    790 <dd>The <b>printf</b> format for converting numbers to strings (except for output statements, where <b>OFMT</b> is used);
    791 <tt>"%.6g"</tt> by default.</dd>
    792 <dt><b>ENVIRON</b></dt>
    793 <dd>An array representing the value of the environment, as described in the <i>exec</i> functions defined in the System Interfaces
    794 volume of POSIX.1-2024. The indices of the array shall be strings consisting of the names of the environment variables, and the
    795 value of each array element shall be a string consisting of the value of that variable. If appropriate, the environment variable
    796 shall be considered a <i>numeric string</i> (see <a href="#tag_20_06_13_02">Expressions in awk</a> ); the array element shall also
    797 have its numeric value.
    798 <p class="tent">In all cases where the behavior of <i>awk</i> is affected by environment variables (including the environment of
    799 any commands that <i>awk</i> executes via the <b>system</b> function or via pipeline redirections with the <b>print</b> statement,
    800 the <b>printf</b> statement, or the <b>getline</b> function), the environment used shall be the environment at the time <i>awk</i>
    801 began executing; it is implementation-defined whether any modification of <b>ENVIRON</b> affects this environment.</p>
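<p class="tent">For illustration, assuming a hypothetical environment variable <tt>GREETING_COUNT</tt> set only for this invocation, the element can be used directly in arithmetic:</p>

```shell
out=$(GREETING_COUNT=42 awk 'BEGIN { print ENVIRON["GREETING_COUNT"] + 1 }')
printf '%s\n' "$out"
```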
    802 </dd>
    803 <dt><b>FILENAME</b></dt>
    804 <dd>The pathname used to open the current input file, or <tt>'-'</tt> if the file is standard input. Inside a <b>BEGIN</b> action
    805 <b>FILENAME</b> shall be unset. Inside an <b>END</b> action the value shall be the name of the last input file processed. If an
    806 application changes the value of <b>FILENAME</b>, the results are unspecified.</dd>
    807 <dt><b>FNR</b></dt>
    808 <dd>The ordinal number of the current record in the current file. Inside a <b>BEGIN</b> action the value shall be zero. Inside an
    809 <b>END</b> action the value shall be the number of the last record processed in the last file processed.</dd>
    810 <dt><b>FS</b></dt>
    811 <dd>Input field separator regular expression; a &lt;space&gt; by default.</dd>
    812 <dt><b>NF</b></dt>
    813 <dd>The number of fields in the current record. Inside a <b>BEGIN</b> action, the use of <b>NF</b> is undefined unless a
    814 <b>getline</b> function without a <i>var</i> argument is executed previously. Inside an <b>END</b> action, <b>NF</b> shall retain
    815 the value it had for the last record read, unless a subsequent, redirected, <b>getline</b> function without a <i>var</i> argument
    816 is performed prior to entering the <b>END</b> action.</dd>
    817 <dt><b>NR</b></dt>
    818 <dd>The ordinal number of the current record from the start of input. Inside a <b>BEGIN</b> action the value shall be zero. Inside
    819 an <b>END</b> action the value shall be the number of the last record processed. Records skipped by the <b>nextfile</b> statement
    820 shall not be included.</dd>
    821 <dt><b>OFMT</b></dt>
    822 <dd>The <b>printf</b> format for converting numbers to strings in output statements (see <a href="#tag_20_06_13_10">Output
    823 Statements</a> ); <tt>"%.6g"</tt> by default. The result of the conversion is unspecified if the value of <b>OFMT</b> is not a
    824 floating-point format specification.</dd>
    825 <dt><b>OFS</b></dt>
    826 <dd>The <b>print</b> statement output field separator; &lt;space&gt; by default.</dd>
    827 <dt><b>ORS</b></dt>
    828 <dd>The <b>print</b> statement output record separator; a &lt;newline&gt; by default.</dd>
    829 <dt><b>RLENGTH</b></dt>
    830 <dd>The length of the string matched by the <b>match</b> function.</dd>
    831 <dt><b>RS</b></dt>
    832 <dd>The first character of the string value of <b>RS</b> shall be the input record separator; a &lt;newline&gt; by default. If
    833 <b>RS</b> contains more than one character, the results are unspecified. If <b>RS</b> is null, then records are separated by
    834 sequences consisting of a &lt;newline&gt; plus one or more blank lines, leading or trailing blank lines shall not result in empty
    835 records at the beginning or end of the input, and a &lt;newline&gt; shall always be a field separator, no matter what the value of
    836 <b>FS</b> is.</dd>
    837 <dt><b>RSTART</b></dt>
    838 <dd>The starting position of the string matched by the <b>match</b> function, numbering from 1. This shall always be equivalent to
    839 the return value of the <b>match</b> function.</dd>
    840 <dt><b>SUBSEP</b></dt>
    841 <dd>The subscript separator string for multi-dimensional arrays; the default value is implementation-defined.</dd>
    842 </dl>
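<p class="tent">The sketch below touches a few of the variables listed above: <b>CONVFMT</b> versus <b>OFMT</b>, <b>RSTART</b> and <b>RLENGTH</b>, and a null <b>RS</b>. All values, format strings, and shell variable names are assumptions chosen for the example.</p>

```shell
fmt=$(awk 'BEGIN {
    CONVFMT = "%.2f"; OFMT = "%.4f"
    x = 3.14159265
    s = x ""      # number-to-string conversion uses CONVFMT
    print s       # already a string, printed unchanged
    print x       # a number in an output statement uses OFMT
}')

# match() sets RSTART to the start position and RLENGTH to the match length.
pos=$(awk 'BEGIN { if (match("foobar", /o+/)) print RSTART, RLENGTH }')

# With RS null, blank lines separate records and newline separates fields.
recs=$(printf 'a\nb\n\n\nc\n' | awk 'BEGIN { RS = "" } { print NR ": " $1 "," $2 }')

printf '%s\n' "$fmt" "$pos" "$recs"
```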
    843 <h5><a name="tag_20_06_13_04" id="tag_20_06_13_04"></a>Regular Expressions</h5>
    844 <p class="tent">The <i>awk</i> utility shall make use of the extended regular expression notation (see XBD <a href=
    845 "../basedefs/V1_chap09.html#tag_09_04"><i>9.4 Extended Regular Expressions</i></a> ) except that it shall allow the use of
    846 C-language conventions for escaping special characters within the EREs, as specified in the table in XBD <a href=
    847 "../basedefs/V1_chap05.html#tag_05"><i>5. File Format Notation</i></a> for <tt>'\\'</tt>, <tt>'\a'</tt>, <tt>'\b'</tt>,
    848 <tt>'\f'</tt>, <tt>'\n'</tt>, <tt>'\r'</tt>, <tt>'\t'</tt>, <tt>'\v'</tt> and in the following table for other sequences; these
    849 escape sequences shall be recognized both inside and outside bracket expressions. Note that records need not be separated by
    850 &lt;newline&gt; characters and string constants can contain &lt;newline&gt; characters, so even the <tt>"\n"</tt> sequence is valid
    851 in <i>awk</i> EREs. Using a &lt;slash&gt; character within the lexical token <b>ERE</b> (except as one of the two delimiters)
    852 requires the escaping shown in the following table.<br></p>
    853 <p class="caption"><a name="tagtcjh_15" id="tagtcjh_15"></a> Table: Escape Sequences in awk</p>
    854 <center>
    855 <table border="1" cellpadding="3" align="center">
    856 <tr valign="top">
    857 <th align="center">
    858 <p class="tent"><b>Escape Sequence</b></p>
    859 </th>
    860 <th align="center">
    861 <p class="tent"><b>Description</b></p>
    862 </th>
    863 <th align="center">
    864 <p class="tent"><b>Meaning</b></p>
    865 </th>
    866 </tr>
    867 <tr valign="top">
    868 <td align="left">
    869 <p class="tent">\"</p>
    870 </td>
    871 <td align="left">
    872 <p class="tent">&lt;backslash&gt; &lt;quotation-mark&gt;</p>
    873 </td>
    874 <td align="left">
    875 <p class="tent">In the lexical token <b>STRING</b>, &lt;quotation-mark&gt; character. Otherwise undefined.</p>
    876 </td>
    877 </tr>
    878 <tr valign="top">
    879 <td align="left">
    880 <p class="tent">\/</p>
    881 </td>
    882 <td align="left">
    883 <p class="tent">&lt;backslash&gt; &lt;slash&gt;</p>
    884 </td>
    885 <td align="left">
    886 <p class="tent">In the lexical token <b>ERE</b>, &lt;slash&gt; character. Otherwise undefined.</p>
    887 </td>
    888 </tr>
    889 <tr valign="top">
    890 <td align="left">
    891 <p class="tent">\ddd</p>
    892 </td>
    893 <td align="left">
    894 <p class="tent">A &lt;backslash&gt; character followed by the longest sequence of one, two, or three octal-digit characters
    895 (01234567). If all of the digits are 0 (that is, representation of the NUL character), the behavior is undefined. If the digits
    896 produce a value greater than octal 377, the behavior is undefined.</p>
    897 </td>
    898 <td align="left">
    899 <p class="tent">The character whose encoding is represented by the one, two, or three-digit octal integer. Multi-byte characters
    900 require multiple, concatenated escape sequences of this type, including the leading &lt;backslash&gt; for each byte.</p>
    901 </td>
    902 </tr>
    903 <tr valign="top">
    904 <td align="left">
    905 <p class="tent">\., \[, \(,\*, \+, \?, \{, \|, \^, \$</p>
    906 </td>
    907 <td align="left">
    908 <p class="tent">A &lt;backslash&gt; character followed by a character that has a special meaning in EREs (see XBD <a href=
    909 "../basedefs/V1_chap09.html#tag_09_04"><i>9.4 Extended Regular Expressions</i></a> ), other than &lt;backslash&gt;.</p>
    910 </td>
    911 <td align="left">
    912 <p class="tent">In the lexical token <b>ERE</b> when not inside a bracket expression, the sequence shall represent itself.
    913 Otherwise undefined.</p>
    914 </td>
    915 </tr>
    916 <tr valign="top">
    917 <td align="left">
    918 <p class="tent">\\</p>
    919 </td>
    920 <td align="left">
    921 <p class="tent">Two &lt;backslash&gt; characters.</p>
    922 </td>
    923 <td align="left">
    924 <p class="tent">In the lexical token <b>ERE</b>, the sequence shall represent itself. In the lexical token <b>STRING</b>, it shall
    925 represent a single &lt;backslash&gt;.</p>
    926 </td>
    927 </tr>
    928 <tr valign="top">
    929 <td align="left">
    930 <p class="tent">\c</p>
    931 </td>
    932 <td align="left">
    933 <p class="tent">A &lt;backslash&gt; character followed by any character not described in this table or in the table in XBD <a href=
    934 "../basedefs/V1_chap05.html#tag_05"><i>5. File Format Notation</i></a> (<tt>'\\'</tt>, <tt>'\a'</tt>, <tt>'\b'</tt>, <tt>'\f'</tt>,
    935 <tt>'\n'</tt>, <tt>'\r'</tt>, <tt>'\t'</tt>, <tt>'\v'</tt>).</p>
    936 </td>
    937 <td align="left">
    938 <p class="tent">Undefined</p>
    939 </td>
    940 </tr>
    941 </table>
    942 </center>
    943 <p class="tent">A regular expression can be matched against a specific field or string by using one of the two regular expression
    944 matching operators, <tt>'~'</tt> and <tt>"!~"</tt>. These operators shall interpret their right-hand operand as a regular
    945 expression and their left-hand operand as a string. If the regular expression matches the string, the <tt>'~'</tt> expression shall
    946 evaluate to a value of 1, and the <tt>"!~"</tt> expression shall evaluate to a value of 0. (The regular expression matching
    947 operation is as defined by the term matched in XBD <a href="../basedefs/V1_chap09.html#tag_09_01"><i>9.1 Regular Expression
    948 Definitions</i></a> , where a match occurs on any part of the string unless the regular expression is limited with the
    949 &lt;circumflex&gt; or &lt;dollar-sign&gt; special characters.) If the regular expression does not match the string, the
    950 <tt>'~'</tt> expression shall evaluate to a value of 0, and the <tt>"!~"</tt> expression shall evaluate to a value of 1. If the
    951 right-hand operand is any expression other than the lexical token <b>ERE</b>, the string value of the expression shall be
    952 interpreted as an extended regular expression, including the escape conventions described above. Note that these escape conventions
    953 shall also be applied in determining the value of a string literal (the lexical token <b>STRING</b>), and thus shall be applied a
    954 second time when a string literal is used in this context.</p>
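<p class="tent">The double application of escape conventions can be sketched as follows. The test strings are arbitrary; the point is that a string literal needs two backslashes to deliver one to the regular expression.</p>

```shell
out=$(awk 'BEGIN {
    s = "a.b"
    lit  = (s ~ /a\.b/)          # ERE token: backslash-period matches a literal period
    dyn  = (s ~ "a\\.b")         # string literal: two backslashes become one in the ERE
    miss = ("axb" ~ "a\\.b")     # literal period does not match x
    any  = ("axb" ~ "a.b")       # unescaped period matches any character
    print lit, dyn, miss, any
}')
printf '%s\n' "$out"
```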
    955 <p class="tent">When an <b>ERE</b> token appears as an expression in any context other than as the right-hand of the <tt>'~'</tt>
    956 or <tt>"!~"</tt> operator or as one of the built-in function arguments described below, the value of the resulting expression shall
    957 be the equivalent of:</p>
    958 <pre>
    959 <tt>$0 ~ /</tt><i>ere</i><tt>/
    960 </tt></pre>
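<p class="tent">As an informal check of this equivalence, an <b>ERE</b> token assigned to a variable (the hypothetical <tt>hit</tt>) yields the result of matching it against $0:</p>

```shell
out=$(printf 'apple\nbanana\n' | awk '{ hit = /an/; print hit }')
printf '%s\n' "$out"
```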
    961 <p class="tent">The <i>ere</i> argument to the <b>gsub</b>, <b>match</b>, and <b>sub</b> functions, and the <i>fs</i> argument to the
    962 <b>split</b> function (see <a href="#tag_20_06_13_13">String Functions</a> ) shall be interpreted as extended regular expressions.
    963 These can be either <b>ERE</b> tokens or arbitrary expressions, and shall be interpreted in the same manner as the right-hand side
    964 of the <tt>'~'</tt> or <tt>"!~"</tt> operator.</p>
    965 <p class="tent">An extended regular expression can be used to separate fields by assigning a string containing the expression to
    966 the built-in variable <b>FS</b>, either directly or as a consequence of using the <b>-F</b> <i>sepstring</i> option. The default
    967 value of the <b>FS</b> variable shall be a single &lt;space&gt;. The following describes <b>FS</b> behavior:</p>
    968 <ol>
    969 <li class="tent">If <b>FS</b> is a null string, the behavior is unspecified.</li>
    970 <li class="tent">If <b>FS</b> is a single character:
    971 <ol type="a">
    972 <li class="tent">If <b>FS</b> is &lt;space&gt;, skip leading and trailing &lt;blank&gt; and &lt;newline&gt; characters; fields
    973 shall be delimited by sets of one or more &lt;blank&gt; or &lt;newline&gt; characters.</li>
    974 <li class="tent">Otherwise, if <b>FS</b> is any other character <i>c</i>, fields shall be delimited by each single occurrence of
    975 <i>c</i>.</li>
    976 </ol>
    977 </li>
    978 <li class="tent">Otherwise, the string value of <b>FS</b> shall be considered to be an extended regular expression. Each occurrence
    979 of a sequence of one or more characters matching the extended regular expression shall delimit fields.</li>
    980 </ol>
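<p class="tent">The three <b>FS</b> cases above can be sketched with small inputs. The sample records and shell variable names are assumptions for illustration.</p>

```shell
# FS is a single space: leading and trailing blanks are skipped, runs of blanks delimit.
nf_default=$(printf '  a   b  \n' | awk '{ print NF }')

# FS is any other single character: every occurrence delimits, so a::b has an empty field.
nf_colon=$(printf 'a::b\n' | awk -F: '{ print NF }')

# FS is longer than one character: it is an ERE, here one or more digits.
joined=$(printf 'a1b22c\n' | awk -F'[0-9]+' '{ print $1 $2 $3 }')

printf '%s\n' "$nf_default" "$nf_colon" "$joined"
```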
    981 <p class="tent">When ERE matching is performed against input records; that is, the match is against $0 and the current value of $0
    982 resulted from processing an input record, record separator characters (the first character of the value of the variable <b>RS</b>,
    983 &lt;newline&gt; by default) cannot be embedded in the expression, and no expression shall match the record separator character. If
    984 the record separator is not &lt;newline&gt;, &lt;newline&gt; characters embedded in the expression can be matched. When ERE
    985 matching is not performed against input records, it shall be based on text strings; any character (including &lt;newline&gt; and
    986 the record separator) can be embedded in the pattern, and an appropriate pattern shall match any character. However, in all
    987 <i>awk</i> ERE matching, the use of one or more NUL characters in the pattern, input record, or text string produces undefined
    988 results.</p>
    989 <h5><a name="tag_20_06_13_05" id="tag_20_06_13_05"></a>Patterns</h5>
    990 <p class="tent">A <i>pattern</i> is any valid <i>expression</i>, a range specified by two expressions separated by a comma, or one
    991 of the two special patterns <b>BEGIN</b> or <b>END</b>.</p>
    992 <h5><a name="tag_20_06_13_06" id="tag_20_06_13_06"></a>Special Patterns</h5>
    993 <p class="tent">The <i>awk</i> utility shall recognize two special patterns, <b>BEGIN</b> and <b>END</b>. Each <b>BEGIN</b> pattern
    994 shall be matched once and its associated action executed before the first record of input is read—except possibly by use of the
    995 <b>getline</b> function (see <a href="#tag_20_06_13_14">Input/Output and General Functions</a> ) in a prior <b>BEGIN</b> action—and
    996 before command line assignment is done. Each <b>END</b> pattern shall be matched once and its associated action executed after the
    997 last record of input has been read, or if there is no further input file to process following a <b>nextfile</b> statement. These
    998 two patterns shall have associated actions.</p>
    999 <p class="tent"><b>BEGIN</b> and <b>END</b> shall not combine with other patterns. Multiple <b>BEGIN</b> and <b>END</b> patterns
   1000 shall be allowed. The actions associated with the <b>BEGIN</b> patterns shall be executed in the order specified in the program, as
   1001 are the <b>END</b> actions. An <b>END</b> pattern can precede a <b>BEGIN</b> pattern in a program.</p>
   1002 <p class="tent">If an <i>awk</i> program consists of only actions with the pattern <b>BEGIN</b>, and the <b>BEGIN</b> action
   1003 contains no <b>getline</b> function, <i>awk</i> shall exit without reading its input when the last statement in the last
   1004 <b>BEGIN</b> action is executed. If an <i>awk</i> program consists of only actions with the pattern <b>END</b> or only actions with
   1005 the patterns <b>BEGIN</b> and <b>END</b>, the input shall be read before the statements in the <b>END</b> actions are executed.</p>
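<p class="tent">A minimal sketch showing that the position of <b>END</b> relative to <b>BEGIN</b> in the program text does not change when each action runs. The two-line input is an assumption of the example.</p>

```shell
out=$(printf 'x\ny\n' | awk '
    END   { print "end", NR }    # runs after the last record, although written first
    BEGIN { print "begin" }      # runs before any input is read
')
printf '%s\n' "$out"
```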
   1006 <h5><a name="tag_20_06_13_07" id="tag_20_06_13_07"></a>Expression Patterns</h5>
   1007 <p class="tent">An expression pattern shall be evaluated as if it were an expression in a Boolean context. If the result is true,
   1008 the pattern shall be considered to match, and the associated action (if any) shall be executed. If the result is false, the action
   1009 shall not be executed.</p>
   1010 <h5><a name="tag_20_06_13_08" id="tag_20_06_13_08"></a>Pattern Ranges</h5>
   1011 <p class="tent">A pattern range consists of two expressions separated by a comma; in this case, the action shall be performed for
   1012 all records between a match of the first expression and the following match of the second expression, inclusive. At this point, the
   1013 pattern range can be repeated starting at input records subsequent to the end of the matched range.</p>
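<p class="tent">For example, assuming input that uses the hypothetical marker lines <tt>START</tt> and <tt>END</tt>, a pattern range selects each delimited block inclusively and can match more than once:</p>

```shell
out=$(printf 'a\nSTART\nb\nEND\nc\nSTART\nd\nEND\n' | awk '/START/, /END/ { print }')
printf '%s\n' "$out"
```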
   1014 <h5><a name="tag_20_06_13_09" id="tag_20_06_13_09"></a>Actions</h5>
   1015 <p class="tent">An action is a sequence of statements as shown in the grammar in <a href="#tag_20_06_13_16">Grammar</a> . Any
   1016 single statement can be replaced by a statement list enclosed in curly braces. The application shall ensure that statements in a
   1017 statement list are separated by &lt;newline&gt; or &lt;semicolon&gt; characters. Statements in a statement list shall be executed
   1018 sequentially in the order that they appear.</p>
   1019 <p class="tent">The <i>expression</i> acting as the conditional in an <b>if</b> statement shall be evaluated and if it is non-zero
   1020 or non-null, the following statement shall be executed; otherwise, if <b>else</b> is present, the statement following the
   1021 <b>else</b> shall be executed.</p>
   1022 <p class="tent">The <b>if</b>, <b>while</b>, <b>do</b>...<b>while</b>, <b>for</b>, <b>break</b>, and <b>continue</b> statements are
   1023 based on the ISO&nbsp;C standard (see <a href="../utilities/V3_chap01.html#tag_18_01_02"><i>1.1.2 Concepts Derived from the ISO C
   1024 Standard</i></a> ), except that the Boolean expressions shall be treated as described in <a href="#tag_20_06_13_02">Expressions in
   1025 awk</a> , and except in the case of:</p>
   1026 <pre>
   1027 <tt>for (</tt><i>variable</i><tt> in </tt><i>array</i><tt>)
   1028 </tt></pre>
   1029 <p class="tent">which shall iterate, assigning each <i>index</i> of <i>array</i> to <i>variable</i> in an unspecified order. The
   1030 results of adding new elements to <i>array</i> within such a <b>for</b> loop are undefined. If a <b>break</b> or <b>continue</b>
   1031 statement occurs outside of a loop, the behavior is undefined.</p>
   1032 <p class="tent">The <b>delete</b> statement shall remove either a specified individual array element or, if no element is
   1033 specified, all array elements. Thus, the following code:</p>
   1034 <pre>
   1035 <tt>for (key in array)
   1036     delete array[key]
   1037 </tt></pre>
   1038 <p class="tent">is equivalent to:</p>
   1039 <pre>
   1040 <tt>delete array
   1041 </tt></pre>
   1042 <p class="tent">Both delete all elements of the array.</p>
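<p class="tent">A minimal sketch of both forms of <b>delete</b>, assuming nothing beyond the statements described above. The
names <tt>arr</tt>, <tt>k</tt>, <tt>left</tt>, and <tt>after</tt> are illustrative only. Because the iteration order of
<tt>for (k in arr)</tt> is unspecified, the example only counts elements.</p>
<pre>
<tt>del_out=$(awk 'BEGIN {
    n = split("a b c", arr, " ")
    delete arr[2]                 # remove one element
    left = 0
    for (k in arr) left++         # count what remains
    delete arr                    # remove all elements
    after = 0
    for (k in arr) after++
    print n, left, after
}')
echo "$del_out"
</tt></pre>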
   1043 <p class="tent">The <b>next</b> statement shall cause all further processing of the current input record to be abandoned. The
   1044 behavior is undefined if a <b>next</b> statement appears or is invoked in a <b>BEGIN</b> or <b>END</b> action.</p>
   1045 <p class="tent">The <b>nextfile</b> statement shall cause all further processing of the current input file to be abandoned. The
   1046 behavior is undefined if a <b>nextfile</b> statement appears or is invoked in a <b>BEGIN</b> or <b>END</b> action, or in a
   1047 user-defined function.</p>
   1048 <p class="tent">The <b>exit</b> statement shall invoke all <b>END</b> actions in the order in which they occur in the program
   1049 source and then terminate the program without reading further input. An <b>exit</b> statement inside an <b>END</b> action shall
   1050 terminate the program without further execution of <b>END</b> actions. If an expression is specified in an <b>exit</b> statement,
   1051 its numeric value shall be the exit status of <i>awk</i>, unless subsequent errors are encountered or a subsequent <b>exit</b>
   1052 statement with an expression is executed.</p>
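<p class="tent">The interaction between <b>exit</b> and the <b>END</b> actions can be sketched as follows. The variable names
<tt>msg</tt> and <tt>status</tt> are assumptions of this example, and the <tt>||</tt> construct is used only so that a non-zero
exit status does not abort a shell running with <tt>set -e</tt>.</p>
<pre>
<tt># exit in BEGIN skips the input, still runs the END action, and the
# expression given to exit becomes the exit status of awk.
status=0
msg=$(awk 'BEGIN { exit 3 } END { print "end ran" }') || status=$?
echo "$msg (status $status)"
</tt></pre>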
   1053 <h5><a name="tag_20_06_13_10" id="tag_20_06_13_10"></a>Output Statements</h5>
   1054 <p class="tent">Both <b>print</b> and <b>printf</b> statements shall write to standard output by default. The output shall be
   1055 written to the location specified by <i>output_redirection</i> if one is supplied, as follows:</p>
   1056 <pre>
   1057 <tt>&gt; </tt><i>expression</i><tt>
   1058 &gt;&gt; </tt><i>expression</i><tt>
   1059 | </tt><i>expression</i><tt>
   1060 </tt></pre>
   1061 <p class="tent">In all cases, the <i>expression</i> shall be evaluated to produce a string that is used as a pathname into which to
   1062 write (for <tt>'&gt;'</tt> or <tt>"&gt;&gt;"</tt>) or as a command to be executed (for <tt>'|'</tt>). Using the first two forms, if
   1063 the file of that name is not currently open, it shall be opened, creating it if necessary and using the first form, truncating the
   1064 file. The output then shall be appended to the file. As long as the file remains open, subsequent calls in which <i>expression</i>
   1065 evaluates to the same string value shall simply append output to the file. The file remains open until the <b>close</b> function
   1066 (see <a href="#tag_20_06_13_14">Input/Output and General Functions</a> ) is called with an expression that evaluates to the same
   1067 string value.</p>
   1068 <p class="tent">The third form shall write output onto a stream piped to the input of a command. The stream shall be created if no
   1069 stream is currently open with the value of <i>expression</i> as its command name. The stream created shall be equivalent to one
   1070 created by a call to the <a href="../functions/popen.html"><i>popen</i>()</a> function defined in the System Interfaces volume of
   1071 POSIX.1-2024 with the value of <i>expression</i> as the <i>command</i> argument and a value of <i>w</i> as the <i>mode</i>
   1072 argument. As long as the stream remains open, subsequent calls in which <i>expression</i> evaluates to the same string value shall
   1073 write output to the existing stream. The stream shall remain open until the <b>close</b> function (see <a href=
   1074 "#tag_20_06_13_14">Input/Output and General Functions</a> ) is called with an expression that evaluates to the same string value.
   1075 At that time, the stream shall be closed as if by a call to the <a href="../functions/pclose.html"><i>pclose</i>()</a> function
   1076 defined in the System Interfaces volume of POSIX.1-2024.</p>
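<p class="tent">As a hedged illustration of the piped form, the sketch below sends two records to one <i>sort</i> stream and then
closes it. The variable names <tt>cmd</tt> and <tt>sorted</tt> are arbitrary. Both <b>print</b> statements use the same string
value, so they share a single stream.</p>
<pre>
<tt>sorted=$(awk 'BEGIN {
    cmd = "sort"
    print "pear"  | cmd
    print "apple" | cmd    # same string value, so the same stream
    close(cmd)             # sort sees end-of-file and writes its output
    print "done"
}')
echo "$sorted"
</tt></pre>
<p class="tent">Calling <b>close</b> before the final <b>print</b> is what lets the sorted records appear ahead of the last
line in this example.</p>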
   1077 <p class="tent">As described in detail by the grammar in <a href="#tag_20_06_13_16">Grammar</a> , these output statements shall
   1078 take a &lt;comma&gt;-separated list of <i>expression</i>s referred to in the grammar by the non-terminal symbols <b>expr_list</b>,
   1079 <b>print_expr_list</b>, or <b>print_expr_list_opt</b>. This list is referred to here as the <i>expression list</i>, and each member
   1080 is referred to as an <i>expression argument</i>.</p>
   1081 <p class="tent">The <b>print</b> statement shall write the value of each expression argument onto the indicated output stream
   1082 separated by the current output field separator (see variable <b>OFS</b> above), and terminated by the output record separator (see
   1083 variable <b>ORS</b> above). All expression arguments shall be taken as strings, being converted if necessary; this conversion shall
   1084 be as described in <a href="#tag_20_06_13_02">Expressions in awk</a> , with the exception that the <b>printf</b> format in
   1085 <b>OFMT</b> shall be used instead of the value in <b>CONVFMT</b>. An empty expression list shall stand for the whole input record
   1086 ($0).</p>
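<p class="tent">A small sketch of how <b>OFS</b>, <b>ORS</b>, and <b>OFMT</b> affect <b>print</b>. The values chosen here are
illustrative, and <tt>LC_ALL=C</tt> is set only to keep the decimal-point character predictable.</p>
<pre>
<tt># Expression arguments are joined with OFS and terminated with ORS;
# the non-integer number is converted using OFMT.
line=$(LC_ALL=C awk 'BEGIN { OFS = "-"; ORS = "!\n"; OFMT = "%.2f"; print "a", "b", 3.14159, 17 }')
echo "$line"
</tt></pre>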
   1087 <p class="tent">The <b>printf</b> statement shall produce output based on a notation similar to the File Format Notation used to
   1088 describe file formats in this volume of POSIX.1-2024 (see XBD <a href="../basedefs/V1_chap05.html#tag_05"><i>5. File Format
   1089 Notation</i></a> ). Output shall be produced as specified with the first <i>expression</i> argument as the string <i>format</i> and
   1090 subsequent <i>expression</i> arguments as the strings <i>arg1</i> to <i>argn</i>, inclusive, with the following exceptions:</p>
   1091 <ol>
   1092 <li class="tent">The <i>format</i> shall be an actual character string rather than a graphical representation. Therefore, it cannot
   1093 contain empty character positions. The &lt;space&gt; in the <i>format</i> string, in any context other than a <i>flag</i> of a
   1094 conversion specification, shall be treated as an ordinary character that is copied to the output.</li>
   1095 <li class="tent">If the character set contains a <tt>'Δ'</tt> character and that character appears in the <i>format</i> string, it
   1096 shall be treated as an ordinary character that is copied to the output.</li>
   1097 <li class="tent">The <i>escape sequences</i> beginning with a &lt;backslash&gt; character shall be treated as sequences of ordinary
   1098 characters that are copied to the output. Note that these same sequences shall be interpreted lexically by <i>awk</i> when they
   1099 appear in literal strings, but they shall not be treated specially by the <b>printf</b> statement.</li>
   1100 <li class="tent">A <i>field width</i> or <i>precision</i> can be specified as the <tt>'*'</tt> character instead of a digit string.
   1101 In this case the next argument from the expression list shall be fetched and its numeric value taken as the field width or
   1102 precision.</li>
   1103 <li class="tent">The implementation shall not precede or follow output from the <tt>d</tt> or <tt>u</tt> conversion specifier
   1104 characters with &lt;blank&gt; characters not specified by the <i>format</i> string.</li>
   1105 <li class="tent">The implementation shall not precede output from the <tt>o</tt> conversion specifier character with leading zeros
   1106 not specified by the <i>format</i> string.</li>
   1107 <li class="tent">For the <tt>c</tt> conversion specifier character: if the argument has a numeric value, the character whose
   1108 encoding is that value shall be output. If the value is zero or is not the encoding of any character in the character set, the
   1109 behavior is undefined. If the argument does not have a numeric value, the first character of the string value shall be output; if
   1110 the string does not contain any characters, the behavior is undefined.</li>
   1111 <li class="tent">For each conversion specification that consumes an argument, the next expression argument shall be evaluated. With
   1112 the exception of the <tt>c</tt> conversion specifier character, the value shall be converted (according to the rules specified in
   1113 <a href="#tag_20_06_13_02">Expressions in awk</a> ) to the appropriate type for the conversion specification.</li>
   1114 <li class="tent">If there are insufficient expression arguments to satisfy all the conversion specifications in the <i>format</i>
   1115 string, the behavior is undefined.</li>
   1116 <li class="tent">If any character sequence in the <i>format</i> string begins with a <tt>'%'</tt> character, but does not form a
   1117 valid conversion specification, the behavior is unspecified.</li>
   1118 </ol>
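<p class="tent">Some of the exceptions listed above can be sketched with one <b>printf</b> statement. The format string and
arguments are invented for the example, and the <tt>c</tt> conversion of 65 assumes a character set in which that value encodes
<tt>'A'</tt>.</p>
<pre>
<tt># %*d takes its field width (4) from the expression list;
# %c prints a character from a numeric value or the first character of a string.
fmt_out=$(awk 'BEGIN { printf "[%*d] [%-5s] [%c] [%c]\n", 4, 42, "ab", 65, "xyz" }')
echo "$fmt_out"
</tt></pre>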
   1119 <p class="tent">Both <b>print</b> and <b>printf</b> can output at least {LINE_MAX} bytes.</p>
   1120 <h5><a name="tag_20_06_13_11" id="tag_20_06_13_11"></a>Functions</h5>
   1121 <p class="tent">The <i>awk</i> language has a variety of built-in functions: arithmetic, string, input/output, and general.</p>
   1122 <p class="tent">Function parameters, if present, can be either scalars or arrays; the behavior is undefined if an array name is
   1123 passed as a parameter that the function uses as a scalar, or if a scalar expression is passed as a parameter that the function uses
   1124 as an array. Function parameters shall be passed by value if scalar and by reference if array name.</p>
   1125 <h5><a name="tag_20_06_13_12" id="tag_20_06_13_12"></a>Arithmetic Functions</h5>
   1126 <p class="tent">The arithmetic functions, except for <b>int</b>, shall be based on the ISO&nbsp;C standard (see <a href=
   1127 "../utilities/V3_chap01.html#tag_18_01_02"><i>1.1.2 Concepts Derived from the ISO C Standard</i></a> ). The behavior is undefined
   1128 in cases where the ISO&nbsp;C standard specifies that an error be returned or that the behavior is undefined. Although the grammar
   1129 (see <a href="#tag_20_06_13_16">Grammar</a> ) permits built-in functions to appear with no arguments or parentheses, unless the
   1130 argument or parentheses are indicated as optional in the following list (by displaying them within the <tt>"[]"</tt> brackets),
   1131 such use is undefined.</p>
   1132 <dl compact>
   1133 <dd></dd>
   1134 <dt><b>atan2</b>(<i>y</i>,<i>x</i>)</dt>
   1135 <dd>Return arctangent of <i>y</i>/<i>x</i> in radians in the range [-ℼ,ℼ].</dd>
   1136 <dt><b>cos</b>(<i>x</i>)</dt>
   1137 <dd>Return cosine of <i>x</i>, where <i>x</i> is in radians.</dd>
   1138 <dt><b>sin</b>(<i>x</i>)</dt>
   1139 <dd>Return sine of <i>x</i>, where <i>x</i> is in radians.</dd>
   1140 <dt><b>exp</b>(<i>x</i>)</dt>
   1141 <dd>Return the exponential function of <i>x</i>.</dd>
   1142 <dt><b>log</b>(<i>x</i>)</dt>
   1143 <dd>Return the natural logarithm of <i>x</i>.</dd>
   1144 <dt><b>sqrt</b>(<i>x</i>)</dt>
   1145 <dd>Return the square root of <i>x</i>.</dd>
   1146 <dt><b>int</b>(<i>x</i>)</dt>
   1147 <dd>Return the argument truncated to an integer. Truncation shall be toward 0 when <i>x</i>&gt;0.</dd>
   1148 <dt><b>rand</b>()</dt>
   1149 <dd>Return a floating point pseudo-random number <i>n</i>, such that 0&lt;=<i>n</i>&lt;1.</dd>
   1150 <dt><b>srand</b>(<b>[</b><i>expr</i><b>]</b>)</dt>
   1151 <dd>Set the seed value for <b>rand</b> to <i>expr</i> or use the seconds since the Epoch if <i>expr</i> is omitted. The previous
   1152 seed value shall be returned. The behavior is unspecified if <i>expr</i> is not an integer expression or if the value of
   1153 <i>expr</i> is not within the range 0 through 2<sup><small>31</small></sup>-1 (2147483647), inclusive. The initial seed value is
   1154 unspecified if <b>rand</b> is called without calling <b>srand</b> first. The <b>srand</b> function uses the argument as a seed for
   1155 a new sequence of pseudo-random numbers to be returned by subsequent calls to <b>rand</b>. If <b>srand</b> is then called with the
   1156 same seed value, the sequence of pseudo-random numbers shall be repeated.</dd>
   1157 </dl>
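<p class="tent">The sketch below exercises a few of the arithmetic functions. It is an illustration only: the seed value 42 and
the variable names <tt>a</tt>, <tt>b</tt>, and <tt>math</tt> are arbitrary, and <tt>LC_ALL=C</tt> is set only to keep the
decimal-point character predictable.</p>
<pre>
<tt>math=$(LC_ALL=C awk 'BEGIN {
    srand(42); a = rand()
    srand(42); b = rand()      # same seed, so the sequence repeats
    printf "%d %.4f %d %d\n", int(3.9), atan2(0, -1), sqrt(16), (a == b)
}')
echo "$math"
</tt></pre>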
   1158 <h5><a name="tag_20_06_13_13" id="tag_20_06_13_13"></a>String Functions</h5>
   1159 <p class="tent">The string functions in the following list shall be supported. Although the grammar (see <a href=
   1160 "#tag_20_06_13_16">Grammar</a> ) permits built-in functions to appear with no arguments or parentheses, unless the argument or
   1161 parentheses are indicated as optional in the following list (by displaying them within the <tt>"[]"</tt> brackets), such use is
   1162 undefined.</p>
   1163 <dl compact>
   1164 <dd></dd>
   1165 <dt><b>gsub</b>(<i>ere</i>,&nbsp;<i>repl</i><b>[</b>,&nbsp;<i>in</i><b>]</b>)</dt>
   1166 <dd>
   1167 Behave like <b>sub</b> (see below), except that it shall replace all occurrences of the regular expression (like the <a href=
   1168 "../utilities/ed.html"><i>ed</i></a> utility global substitute) in $0 or in the <i>in</i> argument, when specified.</dd>
   1169 <dt><b>index</b>(<i>s</i>,&nbsp;<i>t</i>)</dt>
   1170 <dd>Return the position, in characters, numbering from 1, in string <i>s</i> where string <i>t</i> first occurs, or zero if it does
   1171 not occur at all.</dd>
   1172 <dt><b>length[</b>(<b>[</b><i>arg</i><b>]</b>)<b>]</b></dt>
   1173 <dd>
   1174 If <i>arg</i> is an array, return the number of elements in the array; otherwise, return the length, in characters, of <i>arg</i>
   1175 taken as a string, or of the whole record, $0, if there is no argument.</dd>
   1176 <dt><b>match</b>(<i>s</i>,&nbsp;<i>ere</i>)</dt>
   1177 <dd>Return the position, in characters, numbering from 1, in string <i>s</i> where the extended regular expression <i>ere</i>
   1178 occurs, or zero if it does not occur at all. RSTART shall be set to the starting position (which is the same as the returned
   1179 value), zero if no match is found; RLENGTH shall be set to the length of the matched string, -1 if no match is found.</dd>
   1180 <dt><b>split</b>(<i>s</i>,&nbsp;<i>a</i><b>[</b>,&nbsp;<i>fs&nbsp;</i><b>]</b>)</dt>
   1181 <dd>
   1182 Split the string <i>s</i> into array elements <i>a</i>[1], <i>a</i>[2], ..., <i>a</i>[<i>n</i>], and return <i>n</i>. All elements
   1183 of the array shall be deleted before the split is performed. The separation shall be done with the ERE <i>fs</i> or with the field
   1184 separator <b>FS</b> if <i>fs</i> is not given. Each array element shall have a string value when created and, if appropriate, the
   1185 array element shall be considered a numeric string (see <a href="#tag_20_06_13_02">Expressions in awk</a> ). The effect of a null
   1186 string as the value of <i>fs</i> is unspecified.</dd>
   1187 <dt><b>sprintf</b>(<i>fmt</i>,&nbsp;<i>expr</i>,&nbsp;<i>expr</i>,&nbsp;...)</dt>
   1188 <dd>
   1189 Format the expressions according to the <b>printf</b> format given by <i>fmt</i> and return the resulting string.</dd>
   1190 <dt><b>sub(</b><i>ere</i>,&nbsp;<i>repl</i><b>[</b>,&nbsp;<i>in&nbsp;</i><b>]</b>)</dt>
   1191 <dd>
   1192 Substitute the string <i>repl</i> in place of the first instance of the extended regular expression <i>ere</i> in string <i>in</i>
   1193 and return the number of substitutions. An &lt;ampersand&gt; (<tt>'&amp;'</tt>) appearing in the string <i>repl</i> shall be
   1194 replaced by the string from <i>in</i> that matches the ERE. An &lt;ampersand&gt; preceded with a &lt;backslash&gt; shall be
   1195 interpreted as the literal &lt;ampersand&gt; character. An occurrence of two consecutive &lt;backslash&gt; characters shall be
   1196 interpreted as just a single literal &lt;backslash&gt; character. Any other occurrence of a &lt;backslash&gt; (for example,
   1197 preceding any other character) shall be treated as a literal &lt;backslash&gt; character. Note that if <i>repl</i> is a string
   1198 literal (the lexical token <b>STRING</b>; see <a href="#tag_20_06_13_16">Grammar</a> ), the handling of the &lt;ampersand&gt;
   1199 character occurs after any lexical processing, including any lexical &lt;backslash&gt;-escape sequence processing. If <i>in</i> is
   1200 specified and it is not an lvalue (see <a href="#tag_20_06_13_02">Expressions in awk</a> ), the behavior is undefined. If <i>in</i>
   1201 is omitted, <i>awk</i> shall use the current record ($0) in its place.</dd>
   1202 <dt><b>substr</b>(<i>s</i>,&nbsp;<i>m</i><b>[</b>,&nbsp;<i>n&nbsp;</i><b>]</b>)</dt>
   1203 <dd>
   1204 Return the at most <i>n</i>-character substring of <i>s</i> that begins at position <i>m</i>, numbering from 1. If <i>n</i> is
   1205 omitted, or if <i>n</i> specifies more characters than are left in the string, the length of the substring shall be limited by the
   1206 length of the string <i>s</i>.</dd>
   1207 <dt><b>tolower</b>(<i>s</i>)</dt>
   1208 <dd>Return a string based on the string <i>s</i>. Each character in <i>s</i> that is an uppercase letter specified to have a
   1209 <b>tolower</b> mapping by the <i>LC_CTYPE</i> category of the current locale shall be replaced in the returned string by the
   1210 lowercase letter specified by the mapping. Other characters in <i>s</i> shall be unchanged in the returned string.</dd>
   1211 <dt><b>toupper</b>(<i>s</i>)</dt>
   1212 <dd>Return a string based on the string <i>s</i>. Each character in <i>s</i> that is a lowercase letter specified to have a
   1213 <b>toupper</b> mapping by the <i>LC_CTYPE</i> category of the current locale shall be replaced in the returned string by the uppercase
   1214 letter specified by the mapping. Other characters in <i>s</i> shall be unchanged in the returned string.</dd>
   1215 </dl>
   1216 <p class="tent">All of the preceding functions that take <i>ere</i> as a parameter expect a pattern or a string-valued expression
   1217 that is a regular expression as defined in <a href="#tag_20_06_13_04">Regular Expressions</a> .</p>
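<p class="tent">Several of the string functions can be seen together in the following sketch. The sample strings and the names
<tt>s</tt>, <tt>arr</tt>, <tt>pos</tt>, and <tt>parts</tt> are assumptions made for the example.</p>
<pre>
<tt>str_out=$(awk 'BEGIN {
    s = "hello world"
    n = gsub(/o/, "0", s)               # replace every match, return the count
    pos = match(s, /w.*d/)              # also sets RSTART and RLENGTH
    parts = split("a:b:c", arr, ":")
    print n, s
    print pos, RSTART, RLENGTH, substr(s, pos, 3)
    print parts, arr[2], index("banana", "nan"), length("abc"), toupper("abc")
}')
echo "$str_out"
</tt></pre>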
   1218 <h5><a name="tag_20_06_13_14" id="tag_20_06_13_14"></a>Input/Output and General Functions</h5>
   1219 <p class="tent">The input/output and general functions are:</p>
   1220 <dl compact>
   1221 <dd></dd>
   1222 <dt><b>close</b>(<i>expression</i>)</dt>
   1223 <dd>
   1224 Close the file or pipe opened by a <b>print</b> or <b>printf</b> statement or a call to <b>getline</b> with the same string-valued
   1225 <i>expression</i>. The limit on the number of open <i>expression</i> arguments is implementation-defined. If the close was
   1226 successful, the function shall return zero; otherwise, it shall return non-zero.</dd>
   1227 <dt><b>fflush</b>(<b>[</b><i>expression</i><b>]</b>)</dt>
   1228 <dd>
   1229 Write any unwritten data to the file or piped stream opened by a <b>print</b> or <b>printf</b> statement with the same
   1230 string-valued <i>expression</i>. If no argument, or if <i>expression</i> evaluates to the null string, then write all such data for
   1231 all such open files and piped streams, and standard output.
   1232 <p class="tent">If <b>fflush</b> is successful, it shall return 0; otherwise, it shall return non-zero.</p>
   1233 </dd>
   1234 <dt><i>expression&nbsp;|&nbsp;</i><b>getline&nbsp;[</b><i>var</i><b>]</b></dt>
   1235 <dd>
   1236 Read a record of input from a stream piped from the output of a command. The stream shall be created if no stream is currently open
   1237 with the value of <i>expression</i> as its command name. The stream created shall be equivalent to one created by a call to the
   1238 <a href="../functions/popen.html"><i>popen</i>()</a> function with the value of <i>expression</i> as the <i>command</i> argument
   1239 and a value of <i>r</i> as the <i>mode</i> argument. As long as the stream remains open, subsequent calls in which
   1240 <i>expression</i> evaluates to the same string value shall read subsequent records from the stream. The stream shall remain open
   1241 until the <b>close</b> function is called with an expression that evaluates to the same string value. At that time, the stream
   1242 shall be closed as if by a call to the <a href="../functions/pclose.html"><i>pclose</i>()</a> function. If <i>var</i> is omitted,
   1243 $0 and <b>NF</b> shall be set; otherwise, <i>var</i> shall be set and, if appropriate, it shall be considered a numeric string (see
   1244 <a href="#tag_20_06_13_02">Expressions in awk</a> ).
   1245 <p class="tent">The <b>getline</b> operator can form ambiguous constructs when there are unparenthesized operators (including
   1246 concatenate) to the left of the <tt>'|'</tt> (to the beginning of the expression containing <b>getline</b>). In the context of the
   1247 <tt>'$'</tt> operator, <tt>'|'</tt> shall behave as if it had a lower precedence than <tt>'$'</tt>. The result of evaluating other
   1248 operators is unspecified, and conforming applications shall parenthesize properly all such usages.</p>
   1249 </dd>
   1250 <dt><b>getline</b></dt>
   1251 <dd>Set $0 to the next input record from the current input file. This form of <b>getline</b> shall set the <b>NF</b>, <b>NR</b>,
   1252 and <b>FNR</b> variables.</dd>
   1253 <dt><b>getline&nbsp;</b><i>var</i></dt>
   1254 <dd>Set variable <i>var</i> to the next input record from the current input file and, if appropriate, <i>var</i> shall be
   1255 considered a numeric string (see <a href="#tag_20_06_13_02">Expressions in awk</a> ). This form of <b>getline</b> shall set the
   1256 <b>FNR</b> and <b>NR</b> variables.</dd>
   1257 <dt><b>getline&nbsp;[</b><i>var</i><b>]&nbsp;</b>&lt;&nbsp;<i>expression</i></dt>
   1258 <dd>
   1259 Read the next record of input from a named file. The <i>expression</i> shall be evaluated to produce a string that is used as a
   1260 pathname. If the file of that name is not currently open, it shall be opened. As long as the stream remains open, subsequent calls
   1261 in which <i>expression</i> evaluates to the same string value shall read subsequent records from the file. The file shall remain
   1262 open until the <b>close</b> function is called with an expression that evaluates to the same string value. If <i>var</i> is
   1263 omitted, $0 and <b>NF</b> shall be set; otherwise, <i>var</i> shall be set and, if appropriate, it shall be considered a numeric
   1264 string (see <a href="#tag_20_06_13_02">Expressions in awk</a> ).
   1265 <p class="tent">The <b>getline</b> operator can form ambiguous constructs when there are unparenthesized binary operators
   1266 (including concatenate) to the right of the <tt>'&lt;'</tt> (up to the end of the expression containing the <b>getline</b>). The
   1267 result of evaluating such a construct is unspecified, and conforming applications shall parenthesize properly all such usages.</p>
   1268 </dd>
   1269 <dt><b>system</b>(<i>expression</i>)</dt>
   1270 <dd>
   1271 Execute the command given by <i>expression</i> in a manner equivalent to the <a href="../functions/system.html"><i>system</i>()</a>
   1272 function defined in the System Interfaces volume of POSIX.1-2024 and return the exit status of the command.</dd>
   1273 </dl>
   1274 <p class="tent">All forms of <b>getline</b> shall return 1 for successful input, zero for end-of-file, and -1 for an error.</p>
   1275 <p class="tent">Where strings are used as the name of a file or pipeline, the application shall ensure that the strings are
   1276 textually identical. The terminology &quot;same string value&quot; implies that &quot;equivalent strings&quot;, even those that differ only by
   1277 &lt;space&gt; characters, represent different files.</p>
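<p class="tent">A minimal sketch of the piped form of <b>getline</b> and its return values. The command string and the names
<tt>cmd</tt>, <tt>line</tt>, and <tt>n</tt> are illustrative. The <b>getline</b> expression is parenthesized, in line with the
advice above about ambiguous constructs.</p>
<pre>
<tt>gl=$(awk 'BEGIN {
    cmd = "echo one; echo two"
    n = 0
    while ((cmd | getline line) == 1) n++   # 1 on success, 0 at end-of-file
    close(cmd)                              # same string value as used to open
    print n, line
}')
echo "$gl"
</tt></pre>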
   1278 <h5><a name="tag_20_06_13_15" id="tag_20_06_13_15"></a>User-Defined Functions</h5>
   1279 <p class="tent">The <i>awk</i> language also provides user-defined functions. Such functions can be defined as:</p>
   1280 <pre>
   1281 <tt>function </tt><i>name</i><tt>(</tt><b>[</b><i>parameter</i><tt>, ...</tt><b>]</b><tt>) { </tt><i>statements</i><tt> }
   1282 </tt></pre>
   1283 <p class="tent">A function can be referred to anywhere in an <i>awk</i> program; in particular, its use can precede its definition.
   1284 The scope of a function is global.</p>
   1285 <p class="tent">The number of parameters in the function definition need not match the number of parameters in the function call.
   1286 Excess formal parameters can be used as local variables. If fewer arguments are supplied in a function call than are in the
   1287 function definition, the extra parameters that are used in the function body as scalars shall evaluate to the uninitialized value
   1288 until they are otherwise initialized, and the extra parameters that are used in the function body as arrays shall be treated as
   1289 uninitialized arrays where each element evaluates to the uninitialized value until otherwise initialized.</p>
   1290 <p class="tent">When invoking a function, no white space can be placed between the function name and the opening parenthesis.
   1291 Function calls can be nested and recursive calls can be made upon functions. Upon return from any nested or recursive function
   1292 call, the values of all of the calling function's parameters shall be unchanged, except for array parameters passed by reference.
   1293 The <b>return</b> statement can be used to return a value. If a <b>return</b> statement appears outside of a function definition,
   1294 the behavior is undefined.</p>
   1295 <p class="tent">In the function definition, &lt;newline&gt; characters shall be optional before the opening brace and after the
   1296 closing brace. Function definitions can appear anywhere in the program where a <i>pattern-action</i> pair is allowed.</p>
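<p class="tent">The rules above can be sketched with two small functions. The names <tt>fact</tt>, <tt>fill</tt>, and
<tt>sq</tt> are hypothetical. In <tt>fill</tt>, the extra formal parameter <tt>i</tt> serves as a local variable, and the array
argument is passed by reference.</p>
<pre>
<tt>fn_out=$(awk '
function fact(n) { return n == 0 ? 1 : n * fact(n - 1) }    # recursive call
function fill(arr, n,    i) { for (i = n; i; i--) arr[i] = i * i }
BEGIN {
    i = "global"
    fill(sq, 3)                 # sq is filled in by the function
    print fact(5), sq[3], i     # the global i is not changed by fill
}')
echo "$fn_out"
</tt></pre>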
   1297 <h5><a name="tag_20_06_13_16" id="tag_20_06_13_16"></a>Grammar</h5>
   1298 <p class="tent">The grammar in this section and the lexical conventions in the following section shall together describe the syntax
   1299 for <i>awk</i> programs. The general conventions for this style of grammar are described in <a href=
   1300 "../utilities/V3_chap01.html#tag_18_03"><i>1.3 Grammar Conventions</i></a> . A valid program can be represented as the non-terminal
   1301 symbol <i>program</i> in the grammar. This formal syntax shall take precedence over the preceding text syntax description.</p>
   1302 <pre>
   1303 <tt>%token NAME NUMBER STRING ERE
   1304 %token FUNC_NAME   /* Name followed by '(' without white space. */
   1305 <br class="tent">
   1306 /* Keywords */
   1307 %token       Begin   End
   1308 /*          'BEGIN' 'END'                                */
   1309 <br class="tent">
   1310 %token       Break   Continue   Delete   Do   Else
   1311 /*          'break' 'continue' 'delete' 'do' 'else'      */
   1312 <br class="tent">
   1313 %token       Exit   For   Function   If   In   Next
   1314 /*          'exit' 'for' 'function' 'if' 'in' 'next'     */
   1315 <br class="tent">
   1316 %token       Nextfile   Print   Printf   Return   While
   1317 /*          'nextfile' 'print' 'printf' 'return' 'while' */
   1318 <br class="tent">
   1319 /* Reserved function names */
   1320 %token BUILTIN_FUNC_NAME
   1321             /* One token for the following:
   1322              * atan2 cos sin exp log sqrt int rand srand
   1323              * gsub index length match split sprintf sub
   1324              * substr tolower toupper close fflush system
   1325              */
   1326 %token GETLINE
   1327             /* Syntactically different from other built-ins. */
   1328 <br class="tent">
   1329 /* Two-character tokens. */
   1330 %token ADD_ASSIGN SUB_ASSIGN MUL_ASSIGN DIV_ASSIGN MOD_ASSIGN POW_ASSIGN
   1331 /*     '+='       '-='       '*='       '/='       '%='       '^=' */
   1332 <br class="tent">
   1333 %token OR   AND  NO_MATCH   EQ   LE   GE   NE   INCR  DECR  APPEND
   1334 /*     '||' '&amp;&amp;' '!~' '==' '&lt;=' '&gt;=' '!=' '++'  '--'  '&gt;&gt;'   */
   1335 <br class="tent">
   1336 /* One-character tokens. */
   1337 %token '{' '}' '(' ')' '[' ']' ',' ';' NEWLINE
   1338 %token '+' '-' '*' '%' '^' '!' '&gt;' '&lt;' '|' '?' ':' '~' '$' '='
   1339 <br class="tent">
   1340 %start program
   1341 %%
   1342 <br class="tent">
   1343 program          : item_list
   1344                  | item_list item
   1345                  ;
   1346 <br class="tent">
   1347 item_list        : /* empty */
   1348                  | item_list item terminator
   1349                  ;
   1350 <br class="tent">
   1351 item             : action
   1352                  | pattern action
   1353                  | normal_pattern
   1354                  | Function NAME      '(' param_list_opt ')'
   1355                        newline_opt action
   1356                  | Function FUNC_NAME '(' param_list_opt ')'
   1357                        newline_opt action
   1358                  ;
   1359 <br class="tent">
   1360 param_list_opt   : /* empty */
   1361                  | param_list
   1362                  ;
   1363 <br class="tent">
   1364 param_list       : NAME
   1365                  | param_list ',' NAME
   1366                  ;
   1367 <br class="tent">
   1368 pattern          : normal_pattern
   1369                  | special_pattern
   1370                  ;
   1371 <br class="tent">
   1372 normal_pattern   : expr
   1373                  | expr ',' newline_opt expr
   1374                  ;
   1375 <br class="tent">
   1376 special_pattern  : Begin
   1377                  | End
   1378                  ;
   1379 <br class="tent">
   1380 action           : '{' newline_opt                             '}'
   1381                  | '{' newline_opt terminated_statement_list   '}'
   1382                  | '{' newline_opt unterminated_statement_list '}'
   1383                  ;
   1384 <br class="tent">
   1385 terminator       : terminator NEWLINE
   1386                  |            ';'
   1387                  |            NEWLINE
   1388                  ;
   1389 <br class="tent">
   1390 terminated_statement_list : terminated_statement
   1391                  | terminated_statement_list terminated_statement
   1392                  ;
   1393 <br class="tent">
   1394 unterminated_statement_list : unterminated_statement
   1395                  | terminated_statement_list unterminated_statement
   1396                  ;
   1397 <br class="tent">
   1398 terminated_statement : action newline_opt
   1399                  | If '(' expr ')' newline_opt terminated_statement
   1400                  | If '(' expr ')' newline_opt terminated_statement
   1401                        Else newline_opt terminated_statement
   1402                  | While '(' expr ')' newline_opt terminated_statement
   1403                  | For '(' simple_statement_opt ';'
   1404                       expr_opt ';' simple_statement_opt ')' newline_opt
   1405                       terminated_statement
   1406                  | For '(' NAME In NAME ')' newline_opt
   1407                       terminated_statement
   1408                  | ';' newline_opt
   1409                  | terminatable_statement NEWLINE newline_opt
   1410                  | terminatable_statement ';'     newline_opt
   1411                  ;
   1412 <br class="tent">
   1413 unterminated_statement : terminatable_statement
   1414                  | If '(' expr ')' newline_opt unterminated_statement
   1415                  | If '(' expr ')' newline_opt terminated_statement
   1416                       Else newline_opt unterminated_statement
   1417                  | While '(' expr ')' newline_opt unterminated_statement
   1418                  | For '(' simple_statement_opt ';'
   1419                   expr_opt ';' simple_statement_opt ')' newline_opt
   1420                       unterminated_statement
   1421                  | For '(' NAME In NAME ')' newline_opt
   1422                       unterminated_statement
   1423                  ;
   1424 <br class="tent">
   1425 terminatable_statement : simple_statement
   1426                  | Break
   1427                  | Continue
   1428                  | Next
   1429                  | Nextfile
   1430                  | Exit expr_opt
   1431                  | Return expr_opt
   1432                  | Do newline_opt terminated_statement While '(' expr ')'
   1433                  ;
   1434 <br class="tent">
   1435 simple_statement_opt : /* empty */
   1436                  | simple_statement
   1437                  ;
   1438 <br class="tent">
   1439 simple_statement : Delete NAME '[' expr_list ']'
   1440                  | Delete NAME
   1441                  | expr
   1442                  | print_statement
   1443                  ;
   1444 <br class="tent">
   1445 print_statement  : simple_print_statement
   1446                  | simple_print_statement output_redirection
   1447                  ;
   1448 <br class="tent">
   1449 simple_print_statement : Print  print_expr_list_opt
   1450                  | Print  '(' multiple_expr_list ')'
   1451                  | Printf print_expr_list
   1452                  | Printf '(' multiple_expr_list ')'
   1453                  ;
   1454 <br class="tent">
   1455 output_redirection : '&gt;'    expr
   1456                  | APPEND expr
   1457                  | '|'    expr
   1458                  ;
   1459 <br class="tent">
   1460 expr_list_opt    : /* empty */
   1461                  | expr_list
   1462                  ;
   1463 <br class="tent">
   1464 expr_list        : expr
   1465                  | multiple_expr_list
   1466                  ;
   1467 <br class="tent">
   1468 multiple_expr_list : expr ',' newline_opt expr
   1469                  | multiple_expr_list ',' newline_opt expr
   1470                  ;
   1471 <br class="tent">
   1472 expr_opt         : /* empty */
   1473                  | expr
   1474                  ;
   1475 <br class="tent">
   1476 expr             : unary_expr
   1477                  | non_unary_expr
   1478                  ;
   1479 <br class="tent">
   1480 unary_expr       : '+' expr
   1481                  | '-' expr
   1482                  | unary_expr '^'      expr
   1483                  | unary_expr '*'      expr
   1484                  | unary_expr '/'      expr
   1485                  | unary_expr '%'      expr
   1486                  | unary_expr '+'      expr
   1487                  | unary_expr '-'      expr
   1488                  | unary_expr          non_unary_expr
   1489                  | unary_expr '&lt;'      expr
   1490                  | unary_expr LE       expr
   1491                  | unary_expr NE       expr
   1492                  | unary_expr EQ       expr
   1493                  | unary_expr '&gt;'      expr
   1494                  | unary_expr GE       expr
   1495                  | unary_expr '~'      expr
   1496                  | unary_expr NO_MATCH expr
   1497                  | unary_expr In NAME
   1498                  | unary_expr AND newline_opt expr
   1499                  | unary_expr OR  newline_opt expr
   1500                  | unary_expr '?' expr ':' expr
   1501                  | unary_input_function
   1502                  ;
   1503 <br class="tent">
   1504 non_unary_expr   : '(' expr ')'
   1505                  | '!' expr
   1506                  | non_unary_expr '^'      expr
   1507                  | non_unary_expr '*'      expr
   1508                  | non_unary_expr '/'      expr
   1509                  | non_unary_expr '%'      expr
   1510                  | non_unary_expr '+'      expr
   1511                  | non_unary_expr '-'      expr
   1512                  | non_unary_expr          non_unary_expr
   1513                  | non_unary_expr '&lt;'      expr
   1514                  | non_unary_expr LE       expr
   1515                  | non_unary_expr NE       expr
   1516                  | non_unary_expr EQ       expr
   1517                  | non_unary_expr '&gt;'      expr
   1518                  | non_unary_expr GE       expr
   1519                  | non_unary_expr '~'      expr
   1520                  | non_unary_expr NO_MATCH expr
   1521                  | non_unary_expr In NAME
   1522                  | '(' multiple_expr_list ')' In NAME
   1523                  | non_unary_expr AND newline_opt expr
   1524                  | non_unary_expr OR  newline_opt expr
   1525                  | non_unary_expr '?' expr ':' expr
   1526                  | NUMBER
   1527                  | STRING
   1528                  | lvalue
   1529                  | ERE
   1530                  | lvalue INCR
   1531                  | lvalue DECR
   1532                  | INCR lvalue
   1533                  | DECR lvalue
   1534                  | lvalue POW_ASSIGN expr
   1535                  | lvalue MOD_ASSIGN expr
   1536                  | lvalue MUL_ASSIGN expr
   1537                  | lvalue DIV_ASSIGN expr
   1538                  | lvalue ADD_ASSIGN expr
   1539                  | lvalue SUB_ASSIGN expr
   1540                  | lvalue '=' expr
   1541                  | FUNC_NAME '(' expr_list_opt ')'
   1542                       /* no white space allowed before '(' */
   1543                  | BUILTIN_FUNC_NAME '(' expr_list_opt ')'
   1544                  | BUILTIN_FUNC_NAME
   1545                  | non_unary_input_function
   1546                  ;
   1547 <br class="tent">
   1548 print_expr_list_opt : /* empty */
   1549                  | print_expr_list
   1550                  ;
   1551 <br class="tent">
   1552 print_expr_list  : print_expr
   1553                  | print_expr_list ',' newline_opt print_expr
   1554                  ;
   1555 <br class="tent">
   1556 print_expr       : unary_print_expr
   1557                  | non_unary_print_expr
   1558                  ;
   1559 <br class="tent">
   1560 unary_print_expr : '+' print_expr
   1561                  | '-' print_expr
   1562                  | unary_print_expr '^'      print_expr
   1563                  | unary_print_expr '*'      print_expr
   1564                  | unary_print_expr '/'      print_expr
   1565                  | unary_print_expr '%'      print_expr
   1566                  | unary_print_expr '+'      print_expr
   1567                  | unary_print_expr '-'      print_expr
   1568                  | unary_print_expr          non_unary_print_expr
   1569                  | unary_print_expr '~'      print_expr
   1570                  | unary_print_expr NO_MATCH print_expr
   1571                  | unary_print_expr In NAME
   1572                  | unary_print_expr AND newline_opt print_expr
   1573                  | unary_print_expr OR  newline_opt print_expr
   1574                  | unary_print_expr '?' print_expr ':' print_expr
   1575                  ;
   1576 <br class="tent">
   1577 non_unary_print_expr : '(' expr ')'
   1578                  | '!' print_expr
   1579                  | non_unary_print_expr '^'      print_expr
   1580                  | non_unary_print_expr '*'      print_expr
   1581                  | non_unary_print_expr '/'      print_expr
   1582                  | non_unary_print_expr '%'      print_expr
   1583                  | non_unary_print_expr '+'      print_expr
   1584                  | non_unary_print_expr '-'      print_expr
   1585                  | non_unary_print_expr          non_unary_print_expr
   1586                  | non_unary_print_expr '~'      print_expr
   1587                  | non_unary_print_expr NO_MATCH print_expr
   1588                  | non_unary_print_expr In NAME
   1589                  | '(' multiple_expr_list ')' In NAME
   1590                  | non_unary_print_expr AND newline_opt print_expr
   1591                  | non_unary_print_expr OR  newline_opt print_expr
   1592                  | non_unary_print_expr '?' print_expr ':' print_expr
   1593                  | NUMBER
   1594                  | STRING
   1595                  | lvalue
   1596                  | ERE
   1597                  | lvalue INCR
   1598                  | lvalue DECR
   1599                  | INCR lvalue
   1600                  | DECR lvalue
   1601                  | lvalue POW_ASSIGN print_expr
   1602                  | lvalue MOD_ASSIGN print_expr
   1603                  | lvalue MUL_ASSIGN print_expr
   1604                  | lvalue DIV_ASSIGN print_expr
   1605                  | lvalue ADD_ASSIGN print_expr
   1606                  | lvalue SUB_ASSIGN print_expr
   1607                  | lvalue '=' print_expr
   1608                  | FUNC_NAME '(' expr_list_opt ')'
   1609                      /* no white space allowed before '(' */
   1610                  | BUILTIN_FUNC_NAME '(' expr_list_opt ')'
   1611                  | BUILTIN_FUNC_NAME
   1612                  ;
   1613 <br class="tent">
   1614 lvalue           : NAME
   1615                  | NAME '[' expr_list ']'
   1616                  | '$' expr
   1617                  ;
   1618 <br class="tent">
   1619 non_unary_input_function : simple_get
   1620                  | simple_get '&lt;' expr
   1621                  | non_unary_expr '|' simple_get
   1622                  ;
   1623 <br class="tent">
   1624 unary_input_function : unary_expr '|' simple_get
   1625                  ;
   1626 <br class="tent">
   1627 simple_get       : GETLINE
   1628                  | GETLINE lvalue
   1629                  ;
   1630 <br class="tent">
   1631 newline_opt      : /* empty */
   1632                  | newline_opt NEWLINE
   1633                  ;
   1634 </tt></pre>
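<p class="tent">One practical consequence of the separate <tt>print_expr</tt> productions is that an unparenthesized <tt>'&gt;'</tt> after the arguments of <b>print</b> is read as an output redirection, not as a comparison. The following is a minimal shell sketch of that difference, not part of the specification. It assumes an <i>awk</i> on the PATH and that <tt>mktemp -d</tt> is available; the variable names are illustrative only.</p>

```shell
# Work in a scratch directory so the redirected output lands somewhere disposable.
dir=$(mktemp -d)

# Parenthesized: '(' expr ')' is a print_expr, so this is a comparison.
cmp=$(cd "$dir" && awk 'BEGIN { print (2 > 1) }')

# Unparenthesized: '>' starts an output_redirection, so 2 is written to a file named "1".
(cd "$dir" && awk 'BEGIN { print 2 > 1 }')
redir=$(cat "$dir/1")

rm -rf "$dir"
echo "comparison=$cmp redirected=$redir"
```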
   1635 <p class="tent">This grammar has several ambiguities that shall be resolved as follows:</p>
   1636 <ul>
   1637 <li class="tent">Operator precedence and associativity shall be as described in <a href="#tagtcjh_14">Expressions in Decreasing
   1638 Precedence in awk</a> .</li>
   1639 <li class="tent">In case of ambiguity, an <b>else</b> shall be associated with the most immediately preceding <b>if</b> that would
   1640 satisfy the grammar.</li>
   1641 <li class="tent">In some contexts, a &lt;slash&gt; (<tt>'/'</tt>) that is used to surround an ERE could also be the division
   1642 operator. This shall be resolved in such a way that wherever the division operator could appear, a &lt;slash&gt; is assumed to be
   1643 the division operator. (There is no unary division operator.)</li>
   1644 </ul>
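<p class="tent">The <b>else</b> rule above can be illustrated with a short shell sketch (an informal example, assuming an <i>awk</i> on the PATH; the variable names are hypothetical). In the first program the <b>else</b> attaches to the nearest <b>if</b>, so the false outer condition suppresses both branches; braces select the other grouping.</p>

```shell
# The else pairs with the nearest if, i.e. "if (1)", so the outer "if (0)" guards both.
nested=$(awk 'BEGIN { if (0) if (1) print "inner"; else print "else" }')

# Braces make the other grouping explicit: the else now belongs to "if (0)".
braced=$(awk 'BEGIN { if (0) { if (1) print "inner" } else print "else" }')

echo "nested=[$nested] braced=[$braced]"
```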
   1645 <p class="tent">Each expression in an <i>awk</i> program shall conform to the precedence and associativity rules, even when this is
   1646 not needed to resolve an ambiguity. For example, because <tt>'$'</tt> has higher precedence than <tt>'++'</tt>, the string
   1647 <tt>"$x++--"</tt> is not a valid <i>awk</i> expression, even though it is unambiguously parsed by the grammar as
   1648 <tt>"$(x++)--"</tt>.</p>
   1649 <p class="tent">One convention that might not be obvious from the formal grammar is where &lt;newline&gt; characters are
   1650 acceptable. There are several obvious placements such as terminating a statement, and a &lt;backslash&gt; can be used to escape
   1651 &lt;newline&gt; characters between any lexical tokens. In addition, &lt;newline&gt; characters without &lt;backslash&gt; characters
   1652 can follow a comma, an open brace, logical AND operator (<tt>"&amp;&amp;"</tt>), logical OR operator (<tt>"||"</tt>), the <b>do</b>
   1653 keyword, the <b>else</b> keyword, and the closing parenthesis of an <b>if</b>, <b>for</b>, or <b>while</b> statement. For
   1654 example:</p>
   1655 <pre>
   1656 <tt>{ print  $1,
   1657          $2 }
   1658 </tt></pre>
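<p class="tent">A slightly larger sketch of the same convention follows, with unescaped &lt;newline&gt; characters after an open brace, <tt>"&amp;&amp;"</tt>, the closing parenthesis of an <b>if</b>, a comma, and <b>else</b>. It is an informal illustration that assumes an <i>awk</i> on the PATH; the sample input is made up.</p>

```shell
# Each line break below falls at a position where the text above allows one.
out=$(printf '1 2\n' | awk '{
    if ($1 == 1 &&
        $2 == 2)
        print $1,
              $2
    else
        print "no"
}')

echo "$out"
```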
   1659 <h5><a name="tag_20_06_13_17" id="tag_20_06_13_17"></a>Lexical Conventions</h5>
   1660 <p class="tent">The lexical conventions for <i>awk</i> programs, with respect to the preceding grammar, shall be as follows:</p>
   1661 <ol>
   1662 <li class="tent">Except as noted, <i>awk</i> shall recognize the longest possible token or delimiter beginning at a given
   1663 point.</li>
   1664 <li class="tent">A comment shall consist of any characters beginning with the &lt;number-sign&gt; character and terminated by, but
   1665 excluding the next occurrence of, a &lt;newline&gt;. Comments shall have no effect, except to delimit lexical tokens.</li>
   1666 <li class="tent">The &lt;newline&gt; shall be recognized as the token <b>NEWLINE</b>.</li>
   1667 <li class="tent">A &lt;backslash&gt; character immediately followed by a &lt;newline&gt; shall have no effect.</li>
   1668 <li class="tent">The token <b>STRING</b> shall represent a string constant. A string constant shall begin with the character
   1669 <tt>'"'</tt>. Within a string constant, a &lt;backslash&gt; character shall be considered to begin an escape sequence as specified
   1670 in the table in XBD <a href="../basedefs/V1_chap05.html#tag_05"><i>5. File Format Notation</i></a> (<tt>'\\'</tt>, <tt>'\a'</tt>,
   1671 <tt>'\b'</tt>, <tt>'\f'</tt>, <tt>'\n'</tt>, <tt>'\r'</tt>, <tt>'\t'</tt>, <tt>'\v'</tt>). In addition, the escape sequences in
   1672 <a href="#tagtcjh_15">Escape Sequences in awk</a> shall be recognized. A &lt;newline&gt; shall not occur within a string constant.
   1673 A string constant shall be terminated by the first unescaped occurrence of the character <tt>'"'</tt> after the one that begins the
   1674 string constant. The value of the string shall be the sequence of all unescaped characters and values of escape sequences between,
   1675 but not including, the two delimiting <tt>'"'</tt> characters.</li>
   1676 <li class="tent">The token <b>ERE</b> represents an extended regular expression constant. An ERE constant shall begin with the
   1677 &lt;slash&gt; character. Within an ERE constant, a &lt;backslash&gt; character shall be considered to begin an escape sequence as
   1678 specified in the table in XBD <a href="../basedefs/V1_chap05.html#tag_05"><i>5. File Format Notation</i></a> . In addition, the
   1679 escape sequences in <a href="#tagtcjh_15">Escape Sequences in awk</a> shall be recognized. The application shall ensure that a
   1680 &lt;newline&gt; does not occur within an ERE constant. An ERE constant shall be terminated by the first unescaped occurrence of the
   1681 &lt;slash&gt; character after the one that begins the ERE constant. The extended regular expression represented by the ERE constant
   1682 shall be the sequence of all unescaped characters and values of escape sequences between, but not including, the two delimiting
   1683 &lt;slash&gt; characters.</li>
   1684 <li class="tent">A &lt;blank&gt; shall have no effect, except to delimit lexical tokens or within <b>STRING</b> or <b>ERE</b>
   1685 tokens.</li>
   1686 <li class="tent">The token <b>NUMBER</b> shall represent a numeric constant. Its form and numeric value shall either be equivalent
   1687 to the <b>decimal-floating-constant</b> token as specified by the ISO&nbsp;C standard, or it shall be a sequence of decimal digits
   1688 and shall be evaluated as an integer constant in decimal. In addition, implementations may accept numeric constants with the form
   1689 and numeric value equivalent to the <b>hexadecimal-constant</b> and <b>hexadecimal-floating-constant</b> tokens as specified by the
   1690 ISO&nbsp;C standard. Note that these forms do not use the radix character from the current locale; they always use a
   1691 &lt;period&gt;.
   1692 <p class="tent">If the value is too large or too small to be representable (see <a href=
   1693 "../utilities/V3_chap01.html#tag_18_01_02"><i>1.1.2 Concepts Derived from the ISO C Standard</i></a> ), the behavior is
   1694 undefined.</p>
   1695 </li>
   1696 <li class="tent">A sequence of underscores, digits, and alphabetics from the portable character set (see XBD <a href=
   1697 "../basedefs/V1_chap06.html#tag_06_01"><i>6.1 Portable Character Set</i></a> ), beginning with an &lt;underscore&gt; or alphabetic
   1698 character, shall be considered a word.</li>
   1699 <li class="tent">The following words are keywords that shall be recognized as individual tokens; the name of the token is the same
   1700 as the keyword:
   1701 <table cellpadding="3">
   1702 <tr valign="top">
   1703 <td align="left">
   1704 <p class="tent"><b><br>
   1705 BEGIN<br>
   1706 break<br>
   1707 continue<br></b></p>
   1708 </td>
   1709 <td align="left">
   1710 <p class="tent"><b><br>
   1711 delete<br>
   1712 do<br>
   1713 else<br></b></p>
   1714 </td>
   1715 <td align="left">
   1716 <p class="tent"><b><br>
   1717 END<br>
   1718 exit<br>
   1719 for<br></b></p>
   1720 </td>
   1721 <td align="left">
   1722 <p class="tent"><b><br>
   1723 function<br>
   1724 getline<br>
   1725 if<br></b></p>
   1726 </td>
   1727 <td align="left">
   1728 <p class="tent"><b><br>
   1729 in<br>
   1730 next<br>
   1731 nextfile<br></b></p>
   1732 </td>
   1733 <td align="left">
   1734 <p class="tent"><b><br>
   1735 print<br>
   1736 printf<br>
   1737 return<br></b></p>
   1738 </td>
   1739 <td align="left">
   1740 <p class="tent"><b><br>
   1741 while<br></b></p>
   1742 </td>
   1743 </tr>
   1744 </table>
   1745 </li>
   1746 <li class="tent">The following words are names of built-in functions and shall be recognized as the token <b>BUILTIN_FUNC_NAME</b>:
   1747 <table cellpadding="3">
   1748 <tr valign="top">
   1749 <td align="left">
   1750 <p class="tent"><b><br>
   1751 atan2<br>
   1752 close<br>
   1753 cos<br>
   1754 exp<br></b></p>
   1755 </td>
   1756 <td align="left">
   1757 <p class="tent"><b><br>
   1758 fflush<br>
   1759 gsub<br>
   1760 index<br></b></p>
   1761 </td>
   1762 <td align="left">
   1763 <p class="tent"><b><br>
   1764 int<br>
   1765 length<br>
   1766 log<br></b></p>
   1767 </td>
   1768 <td align="left">
   1769 <p class="tent"><b><br>
   1770 match<br>
   1771 rand<br>
   1772 sin<br></b></p>
   1773 </td>
   1774 <td align="left">
   1775 <p class="tent"><b><br>
   1776 split<br>
   1777 sprintf<br>
   1778 sqrt<br></b></p>
   1779 </td>
   1780 <td align="left">
   1781 <p class="tent"><b><br>
   1782 srand<br>
   1783 sub<br>
   1784 substr<br></b></p>
   1785 </td>
   1786 <td align="left">
   1787 <p class="tent"><b><br>
   1788 system<br>
   1789 tolower<br>
   1790 toupper<br></b></p>
   1791 </td>
   1792 </tr>
   1793 </table>
   1794 <p class="tent">The above-listed keywords and names of built-in functions are considered reserved words.</p>
   1795 </li>
   1796 <li class="tent">The token <b>NAME</b> shall consist of a word that is not a keyword or a name of a built-in function and is not
   1797 followed immediately (without any delimiters) by the <tt>'('</tt> character.</li>
   1798 <li class="tent">The token <b>FUNC_NAME</b> shall consist of a word that is not a keyword or a name of a built-in function,
   1799 followed immediately (without any delimiters) by the <tt>'('</tt> character. The <tt>'('</tt> character shall not be included as
   1800 part of the token.</li>
   1801 <li class="tent">The following two-character sequences shall be recognized as the named tokens:
   1802 <center>
   1803 <table border="1" cellpadding="3" align="center">
   1804 <tr valign="top">
   1805 <th align="center">
   1806 <p class="tent"><b>Token Name</b></p>
   1807 </th>
   1808 <th align="center">
   1809 <p class="tent"><b>Sequence</b></p>
   1810 </th>
   1811 <th align="center">
   1812 <p class="tent"><b>Token Name</b></p>
   1813 </th>
   1814 <th align="center">
   1815 <p class="tent"><b>Sequence</b></p>
   1816 </th>
   1817 </tr>
   1818 <tr valign="top">
   1819 <td align="left">
   1820 <p class="tent"><b>ADD_ASSIGN</b></p>
   1821 </td>
   1822 <td align="center">
   1823 <p class="tent">+=</p>
   1824 </td>
   1825 <td align="left">
   1826 <p class="tent"><b>NO_MATCH</b></p>
   1827 </td>
   1828 <td align="center">
   1829 <p class="tent">!~</p>
   1830 </td>
   1831 </tr>
   1832 <tr valign="top">
   1833 <td align="left">
   1834 <p class="tent"><b>SUB_ASSIGN</b></p>
   1835 </td>
   1836 <td align="center">
   1837 <p class="tent">-=</p>
   1838 </td>
   1839 <td align="left">
   1840 <p class="tent"><b>EQ</b></p>
   1841 </td>
   1842 <td align="center">
   1843 <p class="tent">==</p>
   1844 </td>
   1845 </tr>
   1846 <tr valign="top">
   1847 <td align="left">
   1848 <p class="tent"><b>MUL_ASSIGN</b></p>
   1849 </td>
   1850 <td align="center">
   1851 <p class="tent">*=</p>
   1852 </td>
   1853 <td align="left">
   1854 <p class="tent"><b>LE</b></p>
   1855 </td>
   1856 <td align="center">
   1857 <p class="tent">&lt;=</p>
   1858 </td>
   1859 </tr>
   1860 <tr valign="top">
   1861 <td align="left">
   1862 <p class="tent"><b>DIV_ASSIGN</b></p>
   1863 </td>
   1864 <td align="center">
   1865 <p class="tent">/=</p>
   1866 </td>
   1867 <td align="left">
   1868 <p class="tent"><b>GE</b></p>
   1869 </td>
   1870 <td align="center">
   1871 <p class="tent">&gt;=</p>
   1872 </td>
   1873 </tr>
   1874 <tr valign="top">
   1875 <td align="left">
   1876 <p class="tent"><b>MOD_ASSIGN</b></p>
   1877 </td>
   1878 <td align="center">
   1879 <p class="tent">%=</p>
   1880 </td>
   1881 <td align="left">
   1882 <p class="tent"><b>NE</b></p>
   1883 </td>
   1884 <td align="center">
   1885 <p class="tent">!=</p>
   1886 </td>
   1887 </tr>
   1888 <tr valign="top">
   1889 <td align="left">
   1890 <p class="tent"><b>POW_ASSIGN</b></p>
   1891 </td>
   1892 <td align="center">
   1893 <p class="tent">^=</p>
   1894 </td>
   1895 <td align="left">
   1896 <p class="tent"><b>INCR</b></p>
   1897 </td>
   1898 <td align="center">
   1899 <p class="tent">++</p>
   1900 </td>
   1901 </tr>
   1902 <tr valign="top">
   1903 <td align="left">
   1904 <p class="tent"><b>OR</b></p>
   1905 </td>
   1906 <td align="center">
   1907 <p class="tent">||</p>
   1908 </td>
   1909 <td align="left">
   1910 <p class="tent"><b>DECR</b></p>
   1911 </td>
   1912 <td align="center">
   1913 <p class="tent">--</p>
   1914 </td>
   1915 </tr>
   1916 <tr valign="top">
   1917 <td align="left">
   1918 <p class="tent"><b>AND</b></p>
   1919 </td>
   1920 <td align="center">
   1921 <p class="tent">&amp;&amp;</p>
   1922 </td>
   1923 <td align="left">
   1924 <p class="tent"><b>APPEND</b></p>
   1925 </td>
   1926 <td align="center">
   1927 <p class="tent">&gt;&gt;</p>
   1928 </td>
   1929 </tr>
   1930 </table>
   1931 </center>
   1932 </li>
   1933 <li class="tent">The following single characters shall be recognized as tokens whose names are the character:
   1934 <pre>
   1935 <tt>&lt;newline&gt; { } ( ) [ ] , ; + - * % ^ ! &gt; &lt; | ? : ~ $ =
   1936 </tt></pre></li>
   1937 </ol>
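<p class="tent">Several of these conventions can be seen together in a small program. The sketch below is illustrative only and assumes an <i>awk</i> on the PATH; the function name <tt>add</tt> is hypothetical. It uses a comment, a <b>FUNC_NAME</b> token (a word followed immediately by <tt>'('</tt>), and a &lt;backslash&gt;-&lt;newline&gt; pair between tokens.</p>

```shell
out=$(awk '
# A comment runs to the end of the line and only separates tokens.
function add(a, b) { return a + b }

# "add" followed immediately by "(" is lexed as FUNC_NAME;
# the backslash-newline between the arguments has no effect.
BEGIN { print add(1, \
                  2) }
')

echo "$out"
```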
   1938 <p class="tent">There is a lexical ambiguity between the token <b>ERE</b> and the tokens <tt>'/'</tt> and <b>DIV_ASSIGN</b>. When
   1939 an input sequence begins with a &lt;slash&gt; character in any syntactic context where the token <tt>'/'</tt> or <b>DIV_ASSIGN</b>
   1940 could appear as the next token in a valid program, the longer of those two tokens that can be recognized shall be recognized. In
   1941 any other syntactic context where the token <b>ERE</b> could appear as the next token in a valid program, the token <b>ERE</b>
   1942 shall be recognized.</p>
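<p class="tent">As an informal illustration of this rule (assuming an <i>awk</i> on the PATH; variable names are hypothetical), the same character sequence <tt>/b/</tt> is read as two divisions after an operand and as an <b>ERE</b> token after <tt>'~'</tt>:</p>

```shell
# After an operand a division operator could appear, so both slashes divide: (8 / 2) / 2.
quotient=$(awk 'BEGIN { a = 8; b = 2; print a /b/ 2 }')

# After "~" no division operator could appear, so the slash begins an ERE token.
matched=$(printf 'abc\n' | awk '$0 ~ /b/ { print "match" }')

echo "quotient=$quotient matched=$matched"
```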
   1943 </blockquote>
   1944 <h4 class="mansect"><a name="tag_20_06_14" id="tag_20_06_14"></a>EXIT STATUS</h4>
   1945 <blockquote>
   1946 <p>The following exit values shall be returned:</p>
   1947 <dl compact>
   1948 <dd></dd>
   1949 <dt>&nbsp;0</dt>
   1950 <dd>All input files were processed successfully.</dd>
   1951 <dt>&gt;0</dt>
   1952 <dd>An error occurred.</dd>
   1953 </dl>
   1954 <p class="tent">The exit status can be altered within the program by using an <b>exit</b> expression.</p>
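<p class="tent">A minimal shell sketch of this behavior, assuming an <i>awk</i> on the PATH; the value 3 and the variable name are arbitrary:</p>

```shell
# Capture the exit status without letting a non-zero status abort a "set -e" script.
status=0
awk 'BEGIN { exit 3 }' || status=$?

echo "status=$status"
```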
   1955 </blockquote>
   1956 <h4 class="mansect"><a name="tag_20_06_15" id="tag_20_06_15"></a>CONSEQUENCES OF ERRORS</h4>
   1957 <blockquote>
   1958 <p>If any <i>file</i> operand is specified and the named file cannot be accessed, <i>awk</i> shall write a diagnostic message to
   1959 standard error and terminate without any further action.</p>
   1960 <p class="tent">If the program specified by either the <i>program</i> operand or a <i>progfile</i> operand is not a valid
   1961 <i>awk</i> program (as specified in the EXTENDED DESCRIPTION section), the behavior is undefined.</p>
   1962 </blockquote>
   1963 <hr>
   1964 <div class="box"><em>The following sections are informative.</em></div>
   1965 <h4 class="mansect"><a name="tag_20_06_16" id="tag_20_06_16"></a>APPLICATION USAGE</h4>
   1966 <blockquote>
   1967 <p>Since &lt;backslash&gt; has a special meaning both in the <i>assignment</i> option-argument to the <b>-v</b> option and in the
   1968 <i>assignment</i> operand, applications that need to pass strings to <i>awk</i> without special interpretation of &lt;backslash&gt;
   1969 should not use these methods but should instead make use of the <b>ARGV</b> or <b>ENVIRON</b> array.</p>
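<p class="tent">The difference can be sketched as follows (an informal example, assuming an <i>awk</i> on the PATH; the variable names <tt>raw</tt>, <tt>s</tt>, and <tt>S</tt> are hypothetical). The <b>-v</b> assignment interprets <tt>\t</tt> as a &lt;tab&gt;, while the value read through <b>ENVIRON</b> arrives unchanged.</p>

```shell
raw='a\tb'   # four characters: a, backslash, t, b

# -v processes escape sequences, so "\t" becomes a tab character.
via_v=$(awk -v s="$raw" 'BEGIN { print s }')

# ENVIRON delivers the environment value without escape processing.
via_env=$(S="$raw" awk 'BEGIN { print ENVIRON["S"] }')

printf '%s\n' "-v: $via_v" "ENVIRON: $via_env"
```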
   1970 <p class="tent">The <b>index</b>, <b>length</b>, <b>match</b>, and <b>substr</b> functions should not be confused with similar
   1971 functions in the ISO&nbsp;C standard; the <i>awk</i> versions deal with characters, while the ISO&nbsp;C standard deals with
   1972 bytes.</p>
   1973 <p class="tent">Because the concatenation operation is represented by adjacent expressions rather than an explicit operator, it is
   1974 often necessary to use parentheses to enforce the proper evaluation precedence.</p>
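<p class="tent">For instance, multiplication binds more tightly than concatenation, so the grouping has to be written out when the concatenation is meant to happen first. The sketch below is illustrative only and assumes an <i>awk</i> on the PATH; the variable names are hypothetical.</p>

```shell
# Without parentheses: 2 * 3 is evaluated first, then 1, " " and 6 are concatenated.
implicit=$(awk 'BEGIN { print 1 " " 2 * 3 }')

# With parentheses: the string "1 2" is built first; its numeric value is 1, so the result is 3.
grouped=$(awk 'BEGIN { print (1 " " 2) * 3 }')

echo "implicit=[$implicit] grouped=[$grouped]"
```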
   1975 <p class="tent">When using <i>awk</i> to process pathnames, it is recommended that LC_ALL, or at least LC_CTYPE and LC_COLLATE, are
   1976 set to POSIX or C in the environment, since pathnames can contain byte sequences that do not form valid characters in some locales,
   1977 in which case the utility's behavior would be undefined. In the POSIX locale each byte is a valid single-byte character, and
   1978 therefore this problem is avoided.</p>
   1979 <p class="tent">Since the <tt>"=="</tt> operator checks if strings are identical, not whether they collate equally, applications
   1980 needing to check whether strings collate equally can use:</p>
   1981 <pre>
   1982 <tt>a &lt;= b &amp;&amp; a &gt;= b
   1983 </tt></pre>
   1984 <p class="tent">To specify a <i>file</i> operand naming a file with a name containing an &lt;equals-sign&gt;, users can use
   1985 <tt>"./"</tt> as the first two characters of a relative file pathname that starts with an &lt;underscore&gt; or an alphabetic
   1986 character to keep the <i>file</i> operand from being interpreted as an <i>assignment</i> operand. Similarly, <tt>"./-"</tt> can be
   1987 used to access a file named <tt>'-'</tt> in the current directory rather than use standard input.</p>
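<p class="tent">A small sketch of the <tt>"./"</tt> technique follows. It is an informal example that assumes an <i>awk</i> on the PATH and that <tt>mktemp -d</tt> is available; the file name <tt>a=b</tt> and the variable names are made up for illustration.</p>

```shell
dir=$(mktemp -d)
printf 'hello\n' > "$dir/a=b"

# "a=b" is taken as an assignment operand; with no file operand left, awk reads
# standard input (empty here), so nothing is printed.
as_assign=$(cd "$dir" && awk '{ print }' a=b </dev/null)

# "./a=b" does not start with an underscore or alphabetic character, so it is a file operand.
as_file=$(cd "$dir" && awk '{ print }' ./a=b)

rm -rf "$dir"
echo "assignment=[$as_assign] file=[$as_file]"
```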
   1988 </blockquote>
   1989 <h4 class="mansect"><a name="tag_20_06_17" id="tag_20_06_17"></a>EXAMPLES</h4>
   1990 <blockquote>
   1991 <p>The <i>awk</i> program specified in the command line is most easily specified within single-quotes (for example,
   1992 '<i>program</i>') for applications using <a href="../utilities/sh.html"><i>sh</i></a>, because <i>awk</i> programs commonly contain
   1993 characters that are special to the shell, including double-quotes. In the cases where an <i>awk</i> program contains single-quote
   1994 characters, it is usually easiest to specify most of the program as strings within single-quotes concatenated by the shell with
   1995 quoted single-quote characters. For example:</p>
   1996 <pre>
   1997 <tt>awk '/'\''/ { print "quote:", $0 }'
   1998 </tt></pre>
   1999 <p class="tent">prints all lines from the standard input containing a single-quote character, prefixed with <i>quote</i>:.</p>
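<p class="tent">Run against two made-up input lines (an informal sketch, assuming an <i>awk</i> on the PATH), only the line containing a single-quote is written:</p>

```shell
# The shell joins '/' + \' + '/ { ... }' into the awk program: /'/ { print "quote:", $0 }
out=$(printf '%s\n' "it's quoted" 'plain text' |
      awk '/'\''/ { print "quote:", $0 }')

echo "$out"
```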
   2000 <p class="tent">The following are examples of simple <i>awk</i> programs:</p>
   2001 <ol>
   2002 <li class="tent">Write to the standard output all input lines for which field 3 is greater than 5:
   2003 <pre>
   2004 <tt>$3 &gt; 5
   2005 </tt></pre></li>
   2006 <li class="tent">Write every tenth line:
   2007 <pre>
   2008 <tt>(NR % 10) == 0
   2009 </tt></pre></li>
   2010 <li class="tent">Write any line with a substring matching the regular expression:
   2011 <pre>
   2012 <tt>/(G|D)(2[0-9][[:alpha:]]*)/
   2013 </tt></pre></li>
   2014 <li class="tent">Print any line with a substring containing a <tt>'G'</tt> or <tt>'D'</tt>, followed by a sequence of digits and
   2015 characters. This example uses character classes <b>digit</b> and <b>alpha</b> to match language-independent digit and alphabetic
   2016 characters respectively:
   2017 <pre>
   2018 <tt>/(G|D)([[:digit:][:alpha:]]*)/
   2019 </tt></pre></li>
   2020 <li class="tent">Write any line in which the second field matches the regular expression and the fourth field does not:
   2021 <pre>
   2022 <tt>$2 ~ /xyz/ &amp;&amp; $4 !~ /xyz/
   2023 </tt></pre></li>
   2024 <li class="tent">Write any line in which the second field contains a &lt;backslash&gt;:
   2025 <pre>
   2026 <tt>$2 ~ /\\/
   2027 </tt></pre></li>
   2028 <li class="tent">Write any line in which the second field contains a &lt;backslash&gt;. Note that &lt;backslash&gt;-escapes are
   2029 interpreted twice; once in lexical processing of the string and once in processing the regular expression:
   2030 <pre>
   2031 <tt>$2 ~ "\\\\"
   2032 </tt></pre></li>
   2033 <li class="tent">Write the second to the last and the last field in each line. Separate the fields by a &lt;colon&gt;:
   2034 <pre>
   2035 <tt>{OFS=":";print $(NF-1), $NF}
   2036 </tt></pre></li>
   2037 <li class="tent">Write the line number and number of fields in each line. The three strings representing the line number, the
   2038 &lt;colon&gt;, and the number of fields are concatenated and that string is written to standard output:
   2039 <pre>
   2040 <tt>{print NR ":" NF}
   2041 </tt></pre></li>
   2042 <li class="tent">Write lines longer than 72 characters:
   2043 <pre>
   2044 <tt>length($0) &gt; 72
   2045 </tt></pre></li>
   2046 <li class="tent">Write the first two fields in opposite order separated by <b>OFS</b>:
   2047 <pre>
   2048 <tt>{ print $2, $1 }
   2049 </tt></pre></li>
   2050 <li class="tent">Same, with input fields separated by a &lt;comma&gt; or &lt;space&gt; and &lt;tab&gt; characters, or both:
   2051 <pre>
   2052 <tt>BEGIN { FS = ",[ \t]*|[ \t]+" }
   2053       { print $2, $1 }
   2054 </tt></pre></li>
   2055 <li class="tent">Add up the first column, print sum, and average:
   2056 <pre>
   2057 <tt>      {s += $1 }
   2058 END   {print "sum is ", s, " average is", s/NR}
   2059 </tt></pre></li>
   2060 <li class="tent">Write fields in reverse order, one per line (many lines out for each line in):
   2061 <pre>
   2062 <tt>{ for (i = NF; i &gt; 0; --i) print $i }
   2063 </tt></pre></li>
   2064 <li class="tent">Write all lines between occurrences of the strings <b>start</b> and <b>stop</b>:
   2065 <pre>
   2066 <tt>/start/, /stop/
   2067 </tt></pre></li>
   2068 <li class="tent">Write all lines whose first field is different from the previous one:
   2069 <pre>
   2070 <tt>$1 != prev { print; prev = $1 }
   2071 </tt></pre></li>
   2072 <li class="tent">Simulate <a href="../utilities/echo.html"><i>echo</i></a>:
   2073 <pre>
   2074 <tt>BEGIN  {
   2075         for (i = 1; i &lt; ARGC; ++i)
   2076         printf("%s%s", ARGV[i], i==ARGC-1?"\n":" ")
   2077 }
   2078 </tt></pre></li>
   2079 <li class="tent">Write the path prefixes contained in the <i>PATH</i> environment variable, one per line:
   2080 <pre>
   2081 <tt>BEGIN  {
   2082         n = split (ENVIRON["PATH"], path, ":")
   2083         for (i = 1; i &lt;= n; ++i)
   2084         print path[i]
   2085 }
   2086 </tt></pre></li>
   2087 <li class="tent">If there is a file named <b>input</b> containing page headers of the form <tt>Page #</tt>
   2088 <p class="tent">and a file named <b>program</b> that contains:</p>
   2089 <pre>
   2090 <tt>/Page/   { $2 = n++; }
   2091          { print }
   2092 </tt></pre>
   2093 then the command line:
   2094 <pre>
   2095 <tt>awk -f program n=5 input
   2096 </tt></pre>
   2097 <p class="tent">prints the file <b>input</b>, filling in page numbers starting at 5.</p>
   2098 </li>
   2099 </ol>
   2100 </blockquote>
   2101 <h4 class="mansect"><a name="tag_20_06_18" id="tag_20_06_18"></a>RATIONALE</h4>
   2102 <blockquote>
   2103 <p>This description is based on the new <i>awk</i>, &quot;nawk&quot;, (see the referenced <i>The AWK Programming Language</i>), which
   2104 introduced a number of new features to the historical <i>awk</i>:</p>
   2105 <ol>
   2106 <li class="tent">New keywords: <b>delete</b>, <b>do</b>, <b>function</b>, <b>return</b></li>
   2107 <li class="tent">New built-in functions: <b>atan2</b>, <b>close</b>, <b>cos</b>, <b>gsub</b>, <b>match</b>, <b>rand</b>,
   2108 <b>sin</b>, <b>srand</b>, <b>sub</b>, <b>system</b></li>
   2109 <li class="tent">New predefined variables: <b>FNR</b>, <b>ARGC</b>, <b>ARGV</b>, <b>RSTART</b>, <b>RLENGTH</b>, <b>SUBSEP</b></li>
   2110 <li class="tent">New expression operators: <b>?</b>, <b>:</b>, <b>,</b>, <b>^</b></li>
   2111 <li class="tent">The <b>FS</b> variable and the third argument to <b>split</b>, now treated as extended regular expressions.</li>
   2112 <li class="tent">The operator precedence, changed to more closely match the C language. Two examples of code that operate
   2113 differently are:
   2114 <pre>
   2115 <tt>while ( n /= 10 &gt; 1) ...
   2116 if (!"wk" ~ /bwk/) ...
   2117 </tt></pre></li>
   2118 </ol>
   2119 <p class="tent">Several features have been added based on newer implementations of <i>awk</i>:</p>
   2120 <ul>
   2121 <li class="tent">Multiple instances of <b>-f</b> <i>progfile</i> are permitted.</li>
   2122 <li class="tent">The new option <b>-v</b> <i>assignment</i>.</li>
   2123 <li class="tent">The new predefined variable <b>ENVIRON</b>.</li>
   2124 <li class="tent">New built-in functions <b>toupper</b> and <b>tolower</b>.</li>
   2125 <li class="tent">More formatting capabilities are added to <b>printf</b> to match the ISO&nbsp;C standard.</li>
   2126 </ul>
   2127 <p class="tent">Earlier versions of this standard required implementations to support multiple adjacent &lt;semicolon&gt;s, lines
   2128 with one or more &lt;semicolon&gt; before a rule (<i>pattern-action</i> pairs), and lines with only &lt;semicolon&gt;(s). These are
   2129 not required by this standard and are considered poor programming practice, but can be accepted by an implementation of <i>awk</i>
   2130 as an extension.</p>
   2131 <p class="tent">The overall <i>awk</i> syntax has always been based on the C language, with a few features from the shell command
   2132 language and other sources. Because of this, it is not completely compatible with any other language, which has caused confusion
   2133 for some users. It is not the intent of the standard developers to address such issues. A few relatively minor changes toward
   2134 making the language more compatible with the ISO&nbsp;C standard were made; most of these changes are based on similar changes in
   2135 recent implementations, as described above. There remain several C-language conventions that are not in <i>awk</i>. One of the
   2136 notable ones is the &lt;comma&gt; operator, which is commonly used to specify multiple expressions in the C language <b>for</b>
   2137 statement. Also, there are various places where <i>awk</i> is more restrictive than the C language regarding the type of expression
   2138 that can be used in a given context. These limitations are due to the different features that the <i>awk</i> language does
   2139 provide.</p>
   2140 <p class="tent">Regular expressions in <i>awk</i> have been extended somewhat from historical implementations to make them a pure
   2141 superset of extended regular expressions, as defined by POSIX.1-2024 (see XBD <a href="../basedefs/V1_chap09.html#tag_09_04"><i>9.4
   2142 Extended Regular Expressions</i></a> ). The main extensions are internationalization features and interval expressions. Historical
   2143 implementations of <i>awk</i> have long supported &lt;backslash&gt;-escape sequences as an extension to extended regular
   2144 expressions, and this extension has been retained despite inconsistency with other utilities. The number of escape sequences
   2145 recognized in both extended regular expressions and strings has varied (generally increasing with time) among implementations. The
   2146 set specified by POSIX.1-2024 includes most sequences known to be supported by popular implementations and by the ISO&nbsp;C
   2147 standard. One sequence that is not supported is hexadecimal value escapes beginning with <tt>'\x'</tt>. This would allow values
   2148 expressed in more than 9 bits to be used within <i>awk</i> as in the ISO&nbsp;C standard. However, because this syntax has a
   2149 non-deterministic length, it does not permit the subsequent character to be a hexadecimal digit. This limitation can be dealt with
   2150 in the C language by the use of lexical string concatenation. In the <i>awk</i> language, concatenation could also be a solution
   2151 for strings, but not for extended regular expressions (either lexical ERE tokens or strings used dynamically as regular
   2152 expressions). Because of this limitation, the feature has not been added to POSIX.1-2024.</p>
   2153 <p class="tent">When a string variable is used in a context where an extended regular expression normally appears (where the
   2154 lexical token ERE is used in the grammar) the string does not contain the literal &lt;slash&gt; characters.</p>
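        <p class="tent">As a minimal sketch of this point (the variable name <tt>re</tt> and the sample pattern are illustrative
        assumptions, not text from the standard), a string used as a dynamic regular expression is written without the delimiting
        &lt;slash&gt; characters, and a &lt;slash&gt; inside the string needs no escaping:</p>
        <pre>
        <tt>BEGIN { re = "^/usr/(bin|lib)" }   # the string holds only the ERE itself
        $1 ~ re                            # same match as $1 ~ /^\/usr\/(bin|lib)/
        </tt></pre>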
   2155 <p class="tent">Some versions of <i>awk</i> allow the form:</p>
   2156 <pre>
   2157 <tt>func name(args, ... ) { statements }
   2158 </tt></pre>
   2159 <p class="tent">This has been deprecated by the authors of the language, who asked that it not be specified.</p>
   2160 <p class="tent">Historical implementations of <i>awk</i> produce an error if a <b>next</b> statement is executed in a <b>BEGIN</b>
   2161 action, and cause <i>awk</i> to terminate if a <b>next</b> statement is executed in an <b>END</b> action. This behavior has not
   2162 been documented, and it was not believed that it was necessary to standardize it.</p>
   2163 <p class="tent">The specification of conversions between string and numeric values is much more detailed than in the documentation
   2164 of historical implementations or in the referenced <i>The AWK Programming Language</i>. Although most of the behavior is designed
   2165 to be intuitive, the details are necessary to ensure compatible behavior from different implementations. This is especially
   2166 important in relational expressions since the types of the operands determine whether a string or numeric comparison is performed.
   2167 From the perspective of an application developer, it is usually sufficient to expect intuitive behavior and to force conversions
   2168 (by adding zero or concatenating a null string) when the type of an expression does not obviously match what is needed. The intent
   2169 has been to specify historical practice in almost all cases. The one exception is that, in historical implementations, variables
   2170 and constants maintain both string and numeric values after their original value is converted by any use. This means that
   2171 referencing a variable or constant can have unexpected side-effects. For example, with historical implementations the following
   2172 program:</p>
   2173 <pre>
   2174 <tt>{
   2175     a = "+2"
   2176     b = 2
   2177     if (NR % 2)
   2178         c = a + b
   2179     if (a == b)
   2180         print "numeric comparison"
   2181     else
   2182         print "string comparison"
   2183 }
   2184 </tt></pre>
   2185 <p class="tent">would perform a numeric comparison (and output numeric comparison) for each odd-numbered line, but perform a string
   2186 comparison (and output string comparison) for each even-numbered line. POSIX.1-2024 ensures that comparisons will be numeric if
   2187 necessary. With historical implementations, the following program:</p>
   2188 <pre>
   2189 <tt>BEGIN {
   2190     OFMT = "%e"
   2191     print 3.14
   2192     OFMT = "%f"
   2193     print 3.14
   2194 }
   2195 </tt></pre>
   2196 <p class="tent">would output <tt>"3.140000e+00"</tt> twice, because in the second <b>print</b> statement the constant
   2197 <tt>"3.14"</tt> would have a string value from the previous conversion. POSIX.1-2024 requires that the output of the second
   2198 <b>print</b> statement be <tt>"3.140000"</tt>. The behavior of historical implementations was seen as too unintuitive and
   2199 unpredictable.</p>
   2200 <p class="tent">It was pointed out that with the rules contained in early drafts, the following script would print nothing:</p>
   2201 <pre>
   2202 <tt>BEGIN {
   2203     y[1.5] = 1
   2204     OFMT = "%e"
   2205     print y[1.5]
   2206 }
   2207 </tt></pre>
   2208 <p class="tent">Therefore, a new variable, <b>CONVFMT</b>, was introduced. The <b>OFMT</b> variable is now restricted to affecting
   2209 output conversions of numbers to strings and <b>CONVFMT</b> is used for internal conversions, such as comparisons or array
   2210 indexing. The default value is the same as that for <b>OFMT</b>, so unless a program changes <b>CONVFMT</b> (which no historical
   2211 program would do), it will receive the historical behavior associated with internal string conversions.</p>
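        <p class="tent">The division of labor between the two variables can be sketched as follows. This is an illustration only; the
        array name <tt>y</tt> and the chosen formats are assumptions made for the example:</p>
        <pre>
        <tt>BEGIN {
            y[1.5] = 1          # subscript converted with CONVFMT, giving "1.5"
            OFMT = "%e"         # affects only numbers written by print
            print y[1.5]        # subscript is still "1.5", so the element is found
            print 1.5           # written using OFMT
            CONVFMT = "%.2f"    # changes how 1.5 is converted for subscripting
            print (1.5 in y)    # looks up "1.50", which was never assigned
        }
        </tt></pre>
        <p class="tent">Changing <b>OFMT</b> leaves the array lookup intact, whereas changing <b>CONVFMT</b> alters the subscript
        string and so the final membership test is expected to fail.</p>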
   2212 <p class="tent">The POSIX <i>awk</i> lexical and syntactic conventions are specified more formally than in other sources. Again the
   2213 intent has been to specify historical practice. One convention that may be less obvious from the formal grammar than from
   2214 verbal descriptions is where &lt;newline&gt; characters are acceptable. There are several obvious placements such as terminating a
   2215 statement, and a &lt;backslash&gt; can be used to escape &lt;newline&gt; characters between any lexical tokens. In addition,
   2216 &lt;newline&gt; characters without &lt;backslash&gt; characters can follow a comma, an open brace, a logical AND operator
   2217 (<tt>"&amp;&amp;"</tt>), a logical OR operator (<tt>"||"</tt>), the <b>do</b> keyword, the <b>else</b> keyword, and the closing
   2218 parenthesis of an <b>if</b>, <b>for</b>, or <b>while</b> statement. For example:</p>
   2219 <pre>
   2220 <tt>{ print $1,
   2221         $2 }
   2222 </tt></pre>
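        <p class="tent">A further sketch, using field comparisons chosen only for illustration, shows unescaped &lt;newline&gt;
        characters after a logical AND operator, after the closing parenthesis of an <b>if</b> statement, and after the <b>else</b>
        keyword:</p>
        <pre>
        <tt>$1 &gt; 0 &amp;&amp;
        $2 &gt; 0 {
            if ($1 &gt; $2)
                print $1
            else
                print $2
        }
        </tt></pre>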
   2223 <p class="tent">The requirement that <i>awk</i> add a trailing &lt;newline&gt; to the program argument text is to simplify the
   2224 grammar, making it match a text file in form. There is no way for an application or test suite to determine whether a literal
   2225 &lt;newline&gt; is added or whether <i>awk</i> simply acts as if it did.</p>
   2226 <p class="tent">POSIX.1-2024 requires several changes from historical implementations in order to support internationalization.
   2227 Probably the most subtle of these is the use of the decimal-point character, defined by the <i>LC_NUMERIC</i> category of the
   2228 locale, in representations of floating-point numbers. This locale-specific character is used in recognizing numeric input, in
   2229 converting between strings and numeric values, and in formatting output. However, regardless of locale, the &lt;period&gt;
   2230 character (the decimal-point character of the POSIX locale) is the decimal-point character recognized in processing <i>awk</i>
   2231 programs (including assignments in command line arguments). This is essentially the same convention as the one used in the
   2232 ISO&nbsp;C standard. The difference is that the C language includes the <a href=
   2233 "../functions/setlocale.html"><i>setlocale</i>()</a> function, which permits an application to modify its locale. Because of this
   2234 capability, a C application begins executing with its locale set to the C locale, and only executes in the environment-specified
   2235 locale after an explicit call to <a href="../functions/setlocale.html"><i>setlocale</i>()</a>. However, adding such an elaborate
   2236 new feature to the <i>awk</i> language was seen as inappropriate for POSIX.1-2024. It is possible to execute an <i>awk</i> program
   2237 explicitly in any desired locale by setting the environment in the shell.</p>
   2238 <p class="tent">The undefined behavior resulting from NULs in extended regular expressions allows future extensions for the GNU
   2239 <i>gawk</i> program to process binary data.</p>
   2240 <p class="tent">The behavior in the case of invalid <i>awk</i> programs (including lexical, syntactic, and semantic errors) is
   2241 undefined because it was considered overly limiting on implementations to specify. In most cases such errors can be expected to
   2242 produce a diagnostic and a non-zero exit status. However, some implementations may choose to extend the language in ways that make
   2243 use of certain invalid constructs. Other invalid constructs might be deemed worthy of a warning, but otherwise cause some
   2244 reasonable behavior. Still other constructs may be very difficult to detect in some implementations. Also, different
   2245 implementations might detect a given error during an initial parsing of the program (before reading any input files) while others
   2246 might detect it when executing the program after reading some input. Implementors should be aware that diagnosing errors as early
   2247 as possible and producing useful diagnostics can ease debugging of applications, and thus make an implementation more usable.</p>
   2248 <p class="tent">The unspecified behavior from using multi-character <b>RS</b> values is to allow possible future extensions based
   2249 on extended regular expressions used for record separators. Historical implementations take the first character of the string and
   2250 ignore the others.</p>
   2251 <p class="tent">Unspecified behavior when <a href=
   2252 "../utilities/split.html"><i>split</i></a>(<i>string</i>,<i>array</i>,&lt;null&gt;) is used is to allow a proposed future extension
   2253 that would split up a string into an array of individual characters.</p>
   2254 <p class="tent">In the context of the <b>getline</b> function, equally good arguments for different precedences of the <b>|</b> and
   2255 <b>&lt;</b> operators can be made. Historical practice has been that:</p>
   2256 <pre>
   2257 <tt>getline &lt; "a" "b"
   2258 </tt></pre>
   2259 <p class="tent">is parsed as:</p>
   2260 <pre>
   2261 <tt>( getline &lt; "a" ) "b"
   2262 </tt></pre>
   2263 <p class="tent">although many would argue that the intent was that the file <b>ab</b> should be read. However:</p>
   2264 <pre>
   2265 <tt>getline &lt; "x" + 1
   2266 </tt></pre>
   2267 <p class="tent">parses as:</p>
   2268 <pre>
   2269 <tt>getline &lt; ( "x" + 1 )
   2270 </tt></pre>
   2271 <p class="tent">Similar problems occur with the <b>|</b> version of <b>getline</b>, particularly in combination with <b>$</b>. For
   2272 example:</p>
   2273 <pre>
   2274 <tt>$"echo hi" | getline
   2275 </tt></pre>
   2276 <p class="tent">(This situation is particularly problematic when used in a <b>print</b> statement, where the <b>|getline</b> part
   2277 might be a redirection of the <b>print</b>.)</p>
   2278 <p class="tent">Since in most cases such constructs are not used, or at least should not be (because they have a natural ambiguity
   2279 for which there is no conventional parsing), the meaning of these constructs has been made explicitly unspecified. (The effect is
   2280 that a conforming application that runs into the problem must parenthesize to resolve the ambiguity.) There appeared to be few if
   2281 any actual uses of such constructs.</p>
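        <p class="tent">A conforming application can avoid the unspecified cases by parenthesizing both the operand and the
        <b>getline</b> expression itself. The following is a sketch under stated assumptions: the file name <b>ab</b> and the variable
        names <tt>line</tt> and <tt>greeting</tt> are hypothetical.</p>
        <pre>
        <tt>BEGIN {
            # Read one line from the file named by the concatenation "a" "b".
            if ((getline line &lt; ("a" "b")) &gt; 0)
                print line
            # Read one line from the output of a command.
            if (("echo hi" | getline greeting) &gt; 0)
                print greeting
        }
        </tt></pre>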
   2282 <p class="tent">Grammars can be written that would cause an error under these circumstances. Where backwards-compatibility is not a
   2283 large consideration, implementors may wish to use such grammars.</p>
   2284 <p class="tent">Some historical implementations have allowed some built-in functions to be called without an argument list, the
   2285 result being a default argument list chosen in some &quot;reasonable&quot; way. Use of <b>length</b> as a synonym for <b>length($0)</b> is
   2286 the only one of these forms that is thought to be widely known or widely used; this particular form is documented in various places
   2287 (for example, most historical <i>awk</i> reference pages, although not in the referenced <i>The AWK Programming Language</i>) as
   2288 legitimate practice. With this exception, default argument lists have always been undocumented and vaguely defined, and it is not
   2289 at all clear how (or if) they should be generalized to user-defined functions. They add no useful functionality and preclude
   2290 possible future extensions that might need to name functions without calling them. Not standardizing them seems the simplest
   2291 course. The standard developers considered that <b>length</b> merited special treatment, however, since it has been documented in
   2292 the past and sees possibly substantial use in historical programs. Accordingly, this usage has been made legitimate;
   2293 Issue&nbsp;5 removed the obsolescent marking for XSI-conforming implementations, and many otherwise conforming applications depend
   2294 on this feature.</p>
   2295 <p class="tent">In <b>sub</b> and <b>gsub</b>, if <i>repl</i> is a string literal (the lexical token <b>STRING</b>), then two
   2296 consecutive &lt;backslash&gt; characters should be used in the string to ensure a single &lt;backslash&gt; will precede the
   2297 &lt;ampersand&gt; when the resultant string is passed to the function. (For example, to specify one literal &lt;ampersand&gt; in
   2298 the replacement string, use <b>gsub</b>(<b>ERE</b>, <tt>"\\&amp;"</tt>).)</p>
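        <p class="tent">As an illustrative sketch (the patterns and the variable name <tt>line</tt> are assumptions chosen for the
        example), the two uses of the &lt;ampersand&gt; in a replacement string literal look like this:</p>
        <pre>
        <tt>{
            line = $0
            gsub(/and/, "[&amp;]", line)   # unescaped: stands for the matched text
            print line
            gsub(/or/, "\\&amp;")          # two backslashes in the literal: one literal ampersand
            print
        }
        </tt></pre>
        <p class="tent">For an input line such as <tt>"this and that or those"</tt>, the first call wraps each match in brackets and
        the second replaces each match with a single &lt;ampersand&gt;.</p>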
   2299 <p class="tent">Historically, the only special character in the <i>repl</i> argument of <b>sub</b> and <b>gsub</b> string functions
   2300 was the &lt;ampersand&gt; (<tt>'&amp;'</tt>) character and preceding it with the &lt;backslash&gt; character was used to turn off
   2301 its special meaning.</p>
   2302 <p class="tent">The description in the ISO&nbsp;POSIX-2:1993 standard introduced behavior such that the &lt;backslash&gt; character
   2303 was another special character and it was unspecified whether there were any other special characters. This description introduced
   2304 several portability problems, some of which are described below, and so it has been replaced with the more historical description.
   2305 Some of the problems include:</p>
   2306 <ul>
   2307 <li class="tent">Historically, to create the replacement string, a script could use <b>gsub</b>(<b>ERE</b>, <tt>"\\&amp;"</tt>),
   2308 but with the ISO&nbsp;POSIX-2:1993 standard wording, it was necessary to use <b>gsub</b>(<b>ERE</b>, <tt>"\\\\&amp;"</tt>). The
   2309 &lt;backslash&gt; characters are doubled here because all string literals are subject to lexical analysis, which would reduce each
   2310 pair of &lt;backslash&gt; characters to a single &lt;backslash&gt; before being passed to <b>gsub</b>.</li>
   2311 <li class="tent">Since it was unspecified what the special characters were, for portable scripts to guarantee that characters are
   2312 printed literally, each character had to be preceded with a &lt;backslash&gt;. (For example, a portable script had to use
   2313 <b>gsub</b>(<b>ERE</b>, <tt>"\\h\\i"</tt>) to produce a replacement string of <tt>"hi"</tt>.)</li>
   2314 </ul>
   2315 <p class="tent">The description for comparisons in the ISO&nbsp;POSIX-2:1993 standard did not properly describe historical practice
   2316 because of the way numeric strings are compared as numbers. The current rules cause the following code:</p>
   2317 <pre>
   2318 <tt>if (0 == "000")
   2319     print "strange, but true"
   2320 else
   2321     print "not true"
   2322 </tt></pre>
   2323 <p class="tent">to do a numeric comparison, causing the <b>if</b> to succeed. It should be intuitively obvious that this is
   2324 incorrect behavior, and indeed, no historical implementation of <i>awk</i> actually behaves this way.</p>
   2325 <p class="tent">To fix this problem, the definition of <i>numeric string</i> was enhanced to include only those values obtained
   2326 from specific circumstances (mostly external sources) where it is not possible to determine unambiguously whether the value is
   2327 intended to be a string or a numeric.</p>
   2328 <p class="tent">Variables that are assigned to a numeric string shall also be treated as a numeric string. (That is, the notion
   2329 of a numeric string can be propagated across assignments.) In comparisons, all variables having the uninitialized value are to be
   2330 treated as a numeric operand evaluating to the numeric value zero.</p>
   2331 <p class="tent">Uninitialized variables include all types of variables including scalars, array elements, and fields. The
   2332 definition of an uninitialized value in <a href="#tag_20_06_13_03">Variables and Special Variables</a> is necessary to describe the
   2333 value placed on uninitialized variables and on fields that are valid (for example, <b>&lt;</b> <b>$NF</b>) but have no characters
   2334 in them and to describe how these variables are to be used in comparisons. A valid field, such as <b>$1</b>, that has no characters
   2335 in it can be obtained from an input line of <tt>"\t\t"</tt> when <b>FS=</b><tt>'\t'</tt>. Historically, the comparison
   2336 (<b>$1&lt;</b>10) was done numerically after evaluating <b>$1</b> to the value zero.</p>
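        <p class="tent">The distinction can be sketched with a short program. It assumes, for illustration only, an input line
        consisting of <tt>000</tt> and a variable named <tt>unset</tt> that is never assigned:</p>
        <pre>
        <tt>{
            if ($1 == 0)        # field from input is a numeric string: numeric comparison
                print "field compares equal to 0"
            if ("000" != 0)     # string constant: string comparison of "000" with "0"
                print "string constant does not compare equal to 0"
            if (unset == 0 &amp;&amp; unset == "")
                print "uninitialized value equals both 0 and the null string"
        }
        </tt></pre>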
   2337 <p class="tent">The phrase &quot;... also shall have the numeric value of the numeric string&quot; was removed from several sections of the
   2338 ISO&nbsp;POSIX-2:1993 standard because it specifies an unnecessary implementation detail. It is not necessary for POSIX.1-2024 to
   2339 specify that these objects be assigned two different values. It is only necessary to specify that these objects may evaluate to two
   2340 different values depending on context.</p>
   2341 <p class="tent">Historical implementations of <i>awk</i> did not parse hexadecimal integer or floating constants like
   2342 <tt>"0xa"</tt> and <tt>"0xap0"</tt>. Due to an oversight, the 2001 through 2004 editions of this standard required support for
   2343 hexadecimal floating constants. This was due to the reference to <a href="../functions/atof.html"><i>atof</i>()</a>. This version
   2344 of the standard allows but does not require implementations to use <a href="../functions/atof.html"><i>atof</i>()</a> and includes
   2345 a description of how floating-point numbers are recognized as an alternative to match historic behavior. The intent of this change
   2346 is to allow implementations to recognize floating-point constants according to either the ISO/IEC&nbsp;9899:1990 standard or
   2347 ISO/IEC&nbsp;9899:1999 standard, and to allow (but not require) implementations to recognize hexadecimal integer constants.</p>
   2348 <p class="tent">Historical implementations of <i>awk</i> did not support floating-point infinities and NaNs in <i>numeric
   2349 strings</i>; e.g., <tt>"-INF"</tt> and <tt>"NaN"</tt>. However, implementations that use the <a href=
   2350 "../functions/atof.html"><i>atof</i>()</a> or <a href="../functions/strtod.html"><i>strtod</i>()</a> functions to do the conversion
   2351 picked up support for these values if they used an ISO/IEC&nbsp;9899:1999 standard version of the function instead of an
   2352 ISO/IEC&nbsp;9899:1990 standard version. Due to an oversight, the 2001 through 2004 editions of this standard did not allow support
   2353 for infinities and NaNs, but in this revision support is allowed (but not required). This is a silent change to the behavior of
   2354 <i>awk</i> programs; for example, in the POSIX locale the expression:</p>
   2355 <pre>
   2356 <tt>("-INF" + 0 &lt; 0)
   2357 </tt></pre>
   2358 <p class="tent">formerly had the value 0 because <tt>"-INF"</tt> converted to 0, but now it may have the value 0 or 1.</p>
   2359 <p class="tent">Deleting all elements of an array one element at a time, via:</p>
   2360 <pre>
   2361 <tt>for (index in array)
   2362     delete array[index]
   2363 </tt></pre>
   2364 <p class="tent">is usually not efficient. This standard requires <tt>delete array</tt> to have the same effects, and this was
   2365 supported in most implementations as a more efficient operation. It is also possible to use <tt>split("", array)</tt> to achieve
   2366 the same effect and efficiency.</p>
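        <p class="tent">Both forms can be sketched as follows; the array name <tt>arr</tt> and the counter <tt>n</tt> are hypothetical
        names used only for this illustration:</p>
        <pre>
        <tt>BEGIN {
            split("a b c", arr)     # arr has three elements
            delete arr              # removes every element in one statement
            split("d e", arr)       # refill the array
            split("", arr)          # idiom with the same effect as delete arr
            n = 0
            for (k in arr) n++      # count whatever is left
            print n                 # prints 0
        }
        </tt></pre>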
   2367 </blockquote>
   2368 <h4 class="mansect"><a name="tag_20_06_19" id="tag_20_06_19"></a>FUTURE DIRECTIONS</h4>
   2369 <blockquote>
   2370 <p>If this utility is directed to create a new directory entry that contains any bytes that have the encoded value of a
   2371 &lt;newline&gt; character, implementations are encouraged to treat this as an error. A future version of this standard may require
   2372 implementations to treat this as an error.</p>
   2373 <p class="tent">A future version of this standard may require <b>srand</b> to accept any numeric value and calculate the seed by
   2374 taking the provided value, converting it to an integer, and calculating the integer value modulo
   2375 2<sup><small><i>n</i></small></sup> where <i>n</i> is an implementation-defined value greater than or equal to 32.</p>
   2376 <p class="tent">A future version of this standard may require the initial seed for the <b>rand</b> function (the seed value used if
   2377 <b>srand</b> is not called) to be an integer between 0 and 2<sup><small><i>n</i></small></sup>-1 inclusive where <i>n</i> is an
   2378 implementation-defined value greater than or equal to 32. Additionally, the initial seed value may be required to be a
   2379 (pseudo-)random value such that two invocations of <i>awk</i> are unlikely to emit the same sequence of random values (unless the
   2380 seed is explicitly set to the same value via <b>srand</b>).</p>
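        <p class="tent">The parenthetical remark about explicit seeding can be illustrated with a small sketch. The seed value 42 and
        the variable names are arbitrary assumptions:</p>
        <pre>
        <tt>BEGIN {
            srand(42)           # explicit seed
            a = rand()
            srand(42)           # same seed again restarts the same sequence
            b = rand()
            if (a == b)
                print "same seed, same first value"
        }
        </tt></pre>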
   2381 <p class="tent">A future version of this standard may define a new <b>posix_srand</b> function that enables application authors to
   2382 set the seed to a (pseudo-)random value generated by the system. Alternatively, the specification of the <b>srand</b> function may
   2383 be altered to provide some means to set the default seed value to a (pseudo-)random value.</p>
   2384 </blockquote>
   2385 <h4 class="mansect"><a name="tag_20_06_20" id="tag_20_06_20"></a>SEE ALSO</h4>
   2386 <blockquote>
   2387 <p><a href="../utilities/V3_chap01.html#tag_18_03"><i>1.3 Grammar Conventions</i></a> , <a href=
   2388 "../utilities/grep.html#"><i>grep</i></a> , <a href="../utilities/lex.html#"><i>lex</i></a> , <a href=
   2389 "../utilities/sed.html#"><i>sed</i></a></p>
   2390 <p class="tent">XBD <a href="../basedefs/V1_chap05.html#tag_05"><i>5. File Format Notation</i></a> , <a href=
   2391 "../basedefs/V1_chap06.html#tag_06_01"><i>6.1 Portable Character Set</i></a> , <a href="../basedefs/V1_chap08.html#tag_08"><i>8.
   2392 Environment Variables</i></a> , <a href="../basedefs/V1_chap09.html#tag_09"><i>9. Regular Expressions</i></a> , <a href=
   2393 "../basedefs/V1_chap12.html#tag_12_02"><i>12.2 Utility Syntax Guidelines</i></a></p>
   2394 <p class="tent">XSH <a href="../functions/atof.html#"><i>atof</i></a> , <a href="../functions/exec.html#tag_17_129"><i>exec</i></a>
   2395 , <a href="../functions/isspace.html#"><i>isspace</i></a> , <a href="../functions/popen.html#"><i>popen</i></a> , <a href=
   2396 "../functions/setlocale.html#"><i>setlocale</i></a> , <a href="../functions/strtod.html#"><i>strtod</i></a></p>
   2397 </blockquote>
   2398 <h4 class="mansect"><a name="tag_20_06_21" id="tag_20_06_21"></a>CHANGE HISTORY</h4>
   2399 <blockquote>
   2400 <p>First released in Issue 2.</p>
   2401 </blockquote>
   2402 <h4 class="mansect"><a name="tag_20_06_22" id="tag_20_06_22"></a>Issue 5</h4>
   2403 <blockquote>
   2404 <p>The FUTURE DIRECTIONS section is added.</p>
   2405 </blockquote>
   2406 <h4 class="mansect"><a name="tag_20_06_23" id="tag_20_06_23"></a>Issue 6</h4>
   2407 <blockquote>
   2408 <p>The <i>awk</i> utility is aligned with the IEEE&nbsp;P1003.2b draft standard.</p>
   2409 <p class="tent">The normative text is reworded to avoid use of the term &quot;must&quot; for application requirements.<br></p>
   2410 <p class="tent">IEEE PASC Interpretation 1003.2 #211 is applied, adding the sentence &quot;An occurrence of two consecutive
   2411 &lt;backslash&gt; characters shall be interpreted as just a single literal &lt;backslash&gt; character.&quot; into the description of
   2412 the <b>sub</b> string function.</p>
   2413 </blockquote>
   2414 <h4 class="mansect"><a name="tag_20_06_24" id="tag_20_06_24"></a>Issue 7</h4>
   2415 <blockquote>
   2416 <p>PASC Interpretation 1003.2-1992 #107 (SD5-XCU-ERN-73) is applied, updating the description of the <b>OFS</b> variable.</p>
   2417 <p class="tent">Austin Group Interpretation 1003.1-2001 #189 is applied.</p>
   2418 <p class="tent">Austin Group Interpretation 1003.1-2001 #201 is applied, permitting implementations to support infinities and
   2419 NaNs.</p>
   2420 <p class="tent">SD5-XCU-ERN-79 is applied, restoring the horizontal lines to <a href="#tagtcjh_14">Expressions in Decreasing
   2421 Precedence in awk</a> , and SD5-XCU-ERN-80 is applied, changing the order of some table entries.</p>
   2422 <p class="tent">SD5-XCU-ERN-87 is applied, updating the descriptive text of the Grammar.</p>
   2423 <p class="tent">SD5-XCU-ERN-97 is applied, updating the SYNOPSIS.</p>
   2424 <p class="tent">The EXTENDED DESCRIPTION is changed to make the support of hexadecimal integer and floating constants optional.</p>
   2425 <p class="tent">POSIX.1-2008, Technical Corrigendum 1, XCU/TC1-2008/0057 [224], XCU/TC1-2008/0058 [454], XCU/TC1-2008/0059 [224],
   2426 XCU/TC1-2008/0060 [224], XCU/TC1-2008/0061 [254], XCU/TC1-2008/0062 [254], XCU/TC1-2008/0063 [224], and XCU/TC1-2008/0064 [454] are
   2427 applied.</p>
   2428 <p class="tent">POSIX.1-2008, Technical Corrigendum 2, XCU/TC2-2008/0058 [584], XCU/TC2-2008/0059 [963], XCU/TC2-2008/0060 [226],
   2429 XCU/TC2-2008/0061 [663], XCU/TC2-2008/0062 [963], XCU/TC2-2008/0063 [226], and XCU/TC2-2008/0064 [963] are applied.</p>
   2430 </blockquote>
   2431 <h4 class="mansect"><a name="tag_20_06_25" id="tag_20_06_25"></a>Issue 8</h4>
   2432 <blockquote>
   2433 <p>Austin Group Defect 251 is applied, encouraging implementations to disallow the creation of filenames containing any bytes that
   2434 have the encoded value of a &lt;newline&gt; character.</p>
   2435 <p class="tent">Austin Group Defects 544 and 1136 are applied, requiring implementations to accept the <b>delete</b> statement with
   2436 an unsubscripted array name.</p>
   2437 <p class="tent">Austin Group Defect 607 is applied, adding the <b>nextfile</b> statement.</p>
   2438 <p class="tent">Austin Group Defect 634 is applied, adding the <b>fflush</b> function.</p>
   2439 <p class="tent">Austin Group Defects 974 and 1451 are applied, clarifying the <b>ARGC</b>, <b>ARGV</b> and <b>FILENAME</b>
   2440 variables, and adding to APPLICATION USAGE.</p>
   2441 <p class="tent">Austin Group Defect 983 is applied, changing the descriptions of the <b>rand</b> and <b>srand</b> functions and the
   2442 FUTURE DIRECTIONS section.</p>
   2443 <p class="tent">Austin Group Defect 1070 is applied, requiring the <tt>"!="</tt> and <tt>"=="</tt> operators to perform string
   2444 comparisons by checking if the strings are identical (and not by checking if they collate equally).</p>
   2445 <p class="tent">Austin Group Defect 1105 is applied, clarifying the requirements for &lt;backslash&gt; escaping.</p>
   2446 <p class="tent">Austin Group Defect 1122 is applied, changing the description of <i>NLSPATH</i>.</p>
   2447 <p class="tent">Austin Group Defect 1198 is applied, requiring comparisons to be performed numerically when both operands have
   2448 string values that are numeric strings.</p>
   2449 <p class="tent">Austin Group Defect 1277 is applied, clarifying that using a &lt;slash&gt; character within an ERE requires
   2450 escaping only if it is within the lexical token <b>ERE</b>.</p>
   2451 <p class="tent">Austin Group Defect 1320 is applied, clarifying the condition under which ERE matching is against input
   2452 records.</p>
   2453 <p class="tent">Austin Group Defect 1395 is applied, changing the requirements for string to number conversion.</p>
   2454 <p class="tent">Austin Group Defect 1468 is applied, clarifying the behavior when <b>FS</b> is an ERE that can match the null
   2455 string.</p>
   2456 <p class="tent">Austin Group Defect 1566 is applied, specifying the behavior of the <b>length</b> function when passed an array
   2457 argument.</p>
   2458 </blockquote>
   2459 <div class="box"><em>End of informative text.</em></div>
   2460 <hr>
   2461 <p>&nbsp;</p>
   2462 <a href="#top"><span class="topOfPage">return to top of page</span></a><br>
   2463 <hr size="2" noshade>
   2464 <center><font size="2">UNIX® is a registered Trademark of The Open Group.<br>
   2465 POSIX™ is a Trademark of The IEEE.<br>
   2466 Copyright © 2001-2024 The IEEE and The Open Group, All Rights Reserved<br>
   2467 [ <a href="../mindex.html">Main Index</a> | <a href="../basedefs/contents.html">XBD</a> | <a href=
   2468 "../functions/contents.html">XSH</a> | <a href="../utilities/contents.html">XCU</a> | <a href="../xrat/contents.html">XRAT</a>
   2469 ]</font></center>
   2470 <hr size="2" noshade>
   2482 </body>
   2483 </html>