       subject: Introduction to ksh-93              date: December 21, 1993
                Charge Case 311466-6713
                File Case 61175                     from: David G. Korn
                                                       MH 11267
                                                       3C-526B x7975
                                                       (research!dgk)

                                                       TM

                           MEMORANDUM FOR FILE

       1.  INTRODUCTION

       The term "shell" is used to describe a program that provides
       a command language interface.   Because  the  UNIX*   system
       shell is a user level program, and not part of the operating
       system itself, anyone can write a new  shell  or  modify  an
       existing one.  This has led to evolutionary progress in
       the design and implementation of shells, with the better
       ones surviving.  The most widely available UNIX system
       shells are the Bourne shell[7], written by Steve  Bourne  at
       AT&T  Bell Laboratories, the C shell[8], written by Bill Joy
       at the University of California, Berkeley, and the KornShell
       language   [9],   written   by   David  Korn  at  AT&T  Bell
       Laboratories.  The Bourne shell is available on  almost  all
       versions of the UNIX system.  The C shell is available with
       all Berkeley Software Distribution (BSD) UNIX systems and on
       many  other systems.  The KornShell is available on System V
       Release 4 systems.  In addition, it  is  available  on  many
       other  systems.   The  source  for the KornShell language is
       available from the AT&T Toolchest,  an  electronic  software
       distribution  system.   It runs on all known versions of the
       UNIX system and on many UNIX system look-alikes.

       There have been several articles comparing the  UNIX  system
       shells.    Jason  Levitt[10]  highlights  some  of  the  new
       features  introduced  by  the  KornShell   language.    Rich
       Bilancia[11]  explains  some  of the advantages of using the
       KornShell language.  John Sebes[12] provides a more detailed
       comparison  of  the three shells, both as a command language
       and as a programming language.

       The KornShell language is a superset of  the  Bourne  shell.
       The  KornShell  language  has  many  of  the popular C shell
       features, plus additional features of its own.  Its  initial
       popularity  stems  primarily  from  its  improvements  as  a
       command language.  The primary interactive benefit of the
       KornShell command language is a visual command line editor
       that allows you to make corrections to your current command
       line or to earlier command lines, without having to retype
       them.

       ____________________

       *  UNIX is a registered trademark of USL

       However, in  the  long  run,  the  power  of  the  KornShell
       language  as a high-level programming language, as described
       by Dolotta and  Mashey[13],  may  prove  to  be  of  greater
       significance.   ksh-93  provides  the  programming  power of
       several other interpretive languages such as awk, FIT, PERL,
       and  tcl.  An application that was originally written in the
       C  programming  language  was  rewritten  in  the  KornShell
       language.   More  than  20,000 lines of C code were replaced
       with KornShell scripts totaling fewer than  700  lines.   In
       most  instances  there  was  no  perceptible  difference  in
       performance between the two versions of the code.

       The KornShell language  has  been  embedded  into  windowing
       systems  allowing  graphical user interfaces to be developed
       in shell rather than having to build applications that  need
       to  be  compiled.  The wksh program[14] provides a method of
       developing OpenLook or Motif applications as ksh scripts.

       This memo is an introduction to  ksh-93,  the  program  that
       implements  an  enhanced  version of the KornShell language.
       It is referred to as ksh in the rest of this memo.  The memo
       describes  the  KornShell  language based on the features of
       the 12/28/93 release of ksh.  This memo is not  a  tutorial,
       only  an  introduction.  The second edition of reference [9]
       gives a more complete treatment of the KornShell language.

       A concerted effort has been made to achieve  both  System  V
       Bourne  shell  compatibility and IEEE POSIX compatibility so
       that scripts written for either  of  these  shells  can  run
       without modification with ksh.  In addition, ksh-93 attempts
       to be compatible with older versions of ksh.  When there are
       conflicts  between versions of the shell, ksh-93 selects the
       behavior  dictated  by  the  IEEE   POSIX   standard.    The
       description of features in this memo assumes that the reader
       is already familiar with the Bourne shell.


       2.  COMMAND LANGUAGE

       There is no separate command language.  All features of  the
       language,  except  job  control,  can  be used both within a
       script and interactively from a terminal.  However, features
       that  are  more  likely  to  be  used while running commands
       interactively from a terminal are presented here.

       2.1  Setting Options

       By convention, UNIX  commands  consist  of  a  command  name
       followed by options and other arguments.  Options are either
       of the form -letter, or -letter value.  In the former  case,
       several  options  may  be  grouped  after  a  single -.  The
       argument -- signifies an end to the option list and is  only
       required when the first non-option argument begins with a -.
       Most commands print  an  error  message  which  shows  which
       options  are  permitted  when given incorrect arguments.  In
       addition, the option sequence -? causes most commands to
       print a usage message which lists the valid options.

       Ordinarily, ksh executes a command by using the command name
       to locate a program to run and by running the program  as  a
       separate  process.  Some commands, referred to as built-ins,
       are carried out by ksh itself, without creating  a  separate
       process.   The  reasons  that some commands are built-in are
       presented  later.   In  nearly  all  cases  the  distinction
       between  a  command  that is built-in and one that is not is
       invisible to the user.  However, nearly  all  commands  that
       are built-in follow command line conventions.

       ksh  has  several  options  that  can  be set by the user as
       command line arguments at invocation and as option arguments
       to  the  set  command.  Most other options can be set with a
       single letter option or  as  a  name  that  follows  the  -o
       option.   Use set -o to display the current option settings.
       Some of these options, such as interactive and monitor  (see
       Job  Control  below),  are enabled automatically by ksh when
       the shell is connected to a terminal device.  Other options,
       such  as  noclobber  and ignoreeof, are normally placed in a
       startup file.  The noclobber option causes ksh to  print  an
       error  message  when  you use > to redirect output to a file
       that already exists.  If you want to redirect to an existing
       file,  then  you  have  to  use >| to override the noclobber
       option.  The ignoreeof option is used to prevent the
       end-of-file character, normally ^D (Control-d), from exiting
       the shell and possibly logging you out.  You must type exit to
       log  out.  Most of the options are described in this memo as
       appropriate.
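
       The effect of noclobber and the >| override can be sketched
       as follows.  The scratch file and the variable names demo
       and refused are illustrative assumptions, not part of ksh.

```shell
# Sketch: noclobber refuses > on an existing file; >| overrides it.
demo=$(mktemp)                  # scratch file (hypothetical name); it already exists
set -o noclobber

refused=no
# Run the redirection in a subshell so that a redirection error cannot
# terminate the current shell; the error message is discarded.
( echo first > "$demo" ) 2>/dev/null || refused=yes

echo second >| "$demo"          # explicit override of noclobber
set +o noclobber                # restore the default behavior
```

       After this runs, refused is yes and the scratch file holds
       only the line written with >|.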

       2.2  Command Aliases

       Command aliases provide a mechanism for associating a command
       name and arguments with a shorter name.  Aliases are defined
       with the alias built-in.   The  form  of  an  alias  command
       definition is:
                             alias name=value
       As  with  most  other shell assignments, no space is allowed
       before or after the =.  The  characters  of  an  alias  name
       cannot  be  characters  that  are special to the shell.  The
       replacement string,  value,  can  contain  any  valid  shell
       script,  including  meta-characters such as pipe symbols and
       i/o-redirection provided that they are quoted.  Unlike  csh,
       aliases  in  ksh  cannot  take  arguments.   The  equivalent
       functionality of aliases with arguments can be achieved with
       shell functions, described later.
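
       As a minimal sketch of that last point, a csh-style alias
       that reuses its argument might be written as a function
       instead.  The name mkcd is hypothetical, and the POSIX
       name() form of definition is used here so that the example
       also runs under other Bourne-compatible shells.

```shell
# Hypothetical helper: the kind of command a csh user might write as
# an alias with an argument; in ksh it is written as a function.
mkcd() {
    # $1 is used twice, which a ksh alias cannot express.
    mkdir -p "$1" && cd "$1"
}
```

       Typing mkcd project/src would then create the directory if
       needed and change to it.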

       As  a  command  is  being  read, the command name is checked
       against a list of alias names.  If it is found, the name  is
       replaced  by  the  alias value associated with the alias and
       then rescanned.  When rescanning the  value  for  an  alias,
       alias  substitutions  are performed except for an alias that
       is currently being processed.  This prevents infinite  loops
       in alias substitutions.  For example, with the aliases
       alias l=ls 'ls=ls -C', the command name l becomes ls, which
       becomes  ls -C.   Ordinarily,  only the command name word is
       processed for alias substitution.  However, if the value  of
       an  alias ends in a space, then the word following the alias
       is also checked  for  alias  substitution.   This  makes  it
       possible to define an alias whose first argument is the name
       of a command and have alias substitution performed on this
       argument, for example nohup='nohup '.

       Aliases  can  be  used to redefine built-in commands so that
       the alias,
                            alias test=./test
       can be used  to  look  for  test  in  your  current  working
       directory  rather  than  using  the  built-in  test command.
       Reserved words such as for and while cannot  be  changed  by
       aliasing.  The command alias, without arguments, generates a
       list of aliases and corresponding alias values.  The unalias
       command removes the name and text of an alias.

       Aliases  are  used to save typing and to improve readability
       of scripts.  Several aliases are  predefined  by  ksh.   For
       example, the predefined alias
                        alias integer='typeset -i'
       allows  the  integer  variables  i  and j to be declared and
       initialized with the command
                             integer i=0 j=1

       While aliases can be defined in scripts, it is not
       recommended.  The location of an alias command can be
       important since aliases are only processed when a command is
       read.  A . procedure (the shell equivalent of an include
       file) is read all at once (unlike startup files, which are
       read a command at a time) so that any aliases defined there
       will not affect any commands within this script.  Predefined
       aliases do not have this problem.

       2.3  Command Re-entry

       When run interactively, ksh saves the commands you type at a
       terminal in a file.  If the variable HISTFILE is set to  the
       name  of a file to which the user has write access, then the
       commands are stored in this  history  file.   Otherwise  the
       file  $HOME/.sh_history  is  checked for write access and if
       this fails an unnamed file  is  used  to  hold  the  history
       lines.    Commands   are   always  appended  to  this  file.
       Instances of ksh that run concurrently and use the same
       history file name share access to the history file so that
       a command entered in one shell will be available for editing
       in  another  shell.   The  file  may  be  truncated when ksh
       determines that no other shell is using  the  history  file.
       The  number of commands accessible to the user is determined
       by the value of the HISTSIZE variable at the time the  shell
       is  invoked.   The  default  value is 256.  Each command may
       consist of one or more lines since  a  compound  command  is
       considered  one  command.   If  the  character  !  is placed
       within the primary prompt string, PS1, then it  is  replaced
       by the command number each time the prompt is given.
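
       A startup-file fragment along the following lines would set
       up the history described above.  This is only a sketch: the
       file name and the size chosen are assumptions, and the
       variables are exported on the assumption that they should be
       visible to shells started later.

```shell
# Startup-file fragment (a sketch; file name and size are assumptions).
HISTFILE=$HOME/.ksh_history     # hypothetical history file location
HISTSIZE=500                    # commands kept accessible (default is 256)
PS1='[!] $ '                    # each ! is replaced by the command number
export HISTFILE HISTSIZE
```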

       A  built-in  command  named hist is used to list and/or edit
       any of these saved commands.   The  option  -l  is  used  to
       specify  listing  of  previous  commands.   The  command can
       always be specified with a range of one  or  more  commands.
       The  range  can  be  specified by giving the command number,
       relative or absolute, or by giving the  first  character  or
       characters  of  the  command.  When given without specifying
       the range, the last 16 commands are listed, each preceded by
       the command number.






       If  the  listing  option  is not selected, then the range of
       commands specified, or the  last  command  if  no  range  is
       given,  is  passed  to  an  editor  program before being re-
       executed by ksh.  The editor to be  used  may  be  specified
       with  the  option  -e and following it with the editor name.
       If this option is not specified,  the  value  of  the  shell
       variable  HISTEDIT  is  used  as  the  name  of  the editor,
       providing that this variable has a non-null value.  If  this
       variable  is  not set, or is null, and the -e option has not
       been selected, then /bin/ed is used.  When editing has been
       completed, the edited text automatically becomes the input
       for ksh.  As this text is read by ksh, it is echoed onto the
       terminal.

       The -s option causes the editing to be bypassed and just re-
       executes the command.  In this case only  a  single  command
       can  be  specified  as the range and an optional argument of
       the form old=new may be added which requests a simple string
       substitution prior to evaluation.  A convenient alias,
                            alias r='hist -s'
       has  been pre-defined so that the single key-stroke r can be
       used to re-execute the previous command and  the  key-stroke
       sequence,  r abc=def c  can  be  used to re-execute the last
       command that  starts  with  the  letter  c  with  the  first
       occurrence  of  the string abc replaced with the string def.
       Typing  r c > file  re-executes  the  most  recent   command
       starting  with the letter c, with standard output redirected
       to file.

       2.4  In-line editing

       Lines typed from a terminal  frequently  need  changes  made
       before entering them.  With the Bourne shell the only method
       to fix up commands is by backspacing or  killing  the  whole
       line.   ksh offers options that allow the user to edit parts
       of the current command line before submitting  the  command.
       The in-line edit options make the command line into a single
       line screen edit window.  When the command  is  longer  than
       the  width of the terminal, only a portion of the command is
       visible.  Moving within the line  automatically  makes  that
       portion  visible.   Editing  can be performed on this window
       until the return key is pressed.   The  editing  modes  have
       editing  directives  that  access  the history file in which
       previous commands are saved.  A user can  copy  any  of  the
       most  recent HISTSIZE commands from this file into the input
       edit window.  You can locate commands  by  searching  or  by
       position.

       The  in-line  editing  options  do  not  use  the termcap or
       terminfo databases.  They work on most  standard  terminals.
       They  only  require  that  the backspace character moves the
       cursor left and the space character overwrites  the  current
       character  on  the screen and moves the cursor to the right.
       Very few terminals or terminal emulators do  not  have  this
       behavior.

       There  is  a choice of editor options.  The emacs, gmacs, or
       vi option is selected by turning on the corresponding option
       of  the  set  command.  If the value of the EDITOR or VISUAL
       variables ends with any of these suffixes, the corresponding
       option is turned on.  A large subset of each of these
       editors' features is available within the shell.  Additional
       functions,  such  as  file  name  completion, have also been
       added.

       In the emacs or gmacs mode the user positions the cursor  to
       the  point  needing  correction  and  inserts,  deletes,  or
       replaces characters as needed.  The only difference  between
       these two modes is the meaning of the directive ^T.  Control
       keys and escape sequences are used  for  cursor  positioning
       and  control functions.  The available editing functions are
       listed in the manual page.

       The vi editing mode starts in insert mode and enters control
       mode when the user types ESC ( 033 ).  The return key, which
       submits the current command for processing, can  be  entered
       from  either  mode.  The cursor can be anywhere on the line.
       A subset of commonly used vi editing directives is
       available.  The k and j directives, which normally move up and
       down by one line, move up and down one command in the
       history  file,  copying  the  command  into  the  input edit
       window.  For reasons of efficiency, the terminal is kept  in
       canonical  mode  until  an ESC is typed.  On some terminals,
       and on earlier versions of the UNIX operating  system,  this
       doesn't work correctly.  The viraw option, which always uses
       raw or cbreak mode, must be used in this case.

       Most of the code for the editing options does  not  rely  on
       the ksh code and can be used in a stand-alone mode with almost
       any command to add in-line edit capability.  However, all
       versions  of the in-line editors have some features that use
       some shell specific code.  For example, with all edit modes,
       the ESC-= directive applied to command words (the first word
       on the line, or the first word after a ;, |, (, or &)  lists
       all  aliases,  functions, or commands that match the portion
       of the given current word.  When  applied  to  other  words,
       this  directive  prints  the  names  of files that match the
       current word.  The ESC-* directive adds the expanded list of
       matching  files  to the command line.  A trailing * is added
       to the word if it doesn't contain any file pattern  matching
       characters  before  the expansion.  In emacs and gmacs mode,
       ESC-ESC indicates command completion when applied to command
       names,  otherwise  it  indicates  pathname completion.  With
       command or pathname completion, the list  generated  by  the
       ESC-=  directive  is  examined  to  find  the longest common
       prefix.  With command completion, only the last component of
       the pathname is used to compute the longest common prefix.
       If the longest common prefix is a complete match, then the
       word is replaced by the pathname, and a / is appended if the
       pathname is a directory; otherwise a space is added.  In vi
       mode, \ from control mode gives the same behavior.

       2.5  Key Binding

       It  is  possible  to  intercept keys as they are entered and
       apply new meanings or  bindings.   A  trap  named  KEYBD  is
       evaluated  each  time  ksh processes characters entered from
       the keyboard, other than those typed while entering a search
       string or an argument to an edit directive such as r in
       vi-mode.  The action associated with this trap can change the
       value of the entered key to cause the key to perform a
       different operation.

       When the KEYBD trap  is  entered,  the  .sh.edtext  variable
       contains  the  contents  of  the  current input line and the
       .sh.edcol variable gives the current cursor position  within
       this   line.   The  .sh.edmode  variable  contains  the  ESC
       character when the trap is  entered  from  vi  insert  mode.
       Otherwise,  this  value  is  null.   The .sh.edchar variable
       contains the character or escape sequence  that  caused  the
       trap.   A  key  sequence  is  either a single character, ESC
       followed by a single character, or ESC[ followed by a single
       character.   In  the  vi edit mode, the characters after the
       ESC must be entered within half a second after the ESC.  The
       value  of  .sh.edchar at the end of the trap will be used as
       the input sequence.

       Using the associative array facility of ksh described later,
       and  the  function  facility  of  ksh, it is easy to write a
       single trap so that keys  can  be  bound  dynamically.   For
       example,

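            # Keytable maps a key sequence to the shell code run for it
            typeset -A Keytable
            # On each key event, evaluate the entry stored for that key
            # (an unbound key yields an empty string, so nothing changes)
            trap 'eval "${Keytable[${.sh.edchar}]}"' KEYBD
            function keybind # key action
            {
                    # Quote the action so that it survives the eval above
                    typeset key=$(print -f "%q" "$2")
                    case $# in
                    # Bind: replace the typed key with the action text
                    2)      Keytable[$1]='.sh.edchar=${.sh.edmode}'"$key"
                            ;;
                    # Unbind: remove any entry for the key
                    1)      unset Keytable[$1]
                            ;;
                    *)      print -u2 "Usage: $0 key [action]"
                            ;;
                    esac
            }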


       2.6  Job Control

       The job control mechanism is almost identical to the version
       introduced in csh of the  Berkeley  UNIX  operating  system,
       version  4.1  and later.  The job control feature allows the
       user to stop and restart programs, and to move  programs  to
       and  from  the  foreground and the background.  It will only
       work on systems that provide  support  for  these  features.
       However,  even  systems  without  job control have a monitor
       option which, when enabled,  will  report  the  progress  of
       background  jobs  and  enable  the  user to kill jobs by job
       number or job name.

       An interactive shell associates a  job  with  each  pipeline
       typed  in  from  the terminal and assigns it a small integer
       number  called  the  job  number.   If  the   job   is   run
       asynchronously,  the  job number is printed at the terminal.
       At any given time, only one job  owns  the  terminal,  i.e.,
       keyboard  signals are only sent to the processes in one job.
       When ksh creates a foreground job, it gives it ownership  of
       the  terminal.  If you are running a job and wish to stop it
       you hit the key ^Z (control-Z) which sends a STOP signal  to
       all  processes  in  the  current  job.   The  shell receives
       notification that the processes have stopped and takes back
       control of the terminal.

       There  are  commands  to continue programs in the foreground
       and background.  There are several ways to  refer  to  jobs.
       The  character  %  introduces  a job name.  You can refer to
       jobs by name or number as described in the manual page.  The
       built-in  command  bg  allows  you  to continue a job in the
       background, while the built-in  command  fg  allows  you  to
       continue  a  job  in the foreground even though you may have
       started it in the background.

       A job being run in the background will stop if it  tries  to
       read  from  the  terminal.   It  is  also  possible  to stop
       background jobs that try to write on the terminal by setting
       the terminal options appropriately.

       There  is  a  built-in command jobs that lists the status of
       all running and stopped jobs.  In addition, you are informed
       of   the  change  of  state  (running  or  stopped)  of  any
       background jobs just before each prompt.  If you want to  be
       notified  about  background  job completions as soon as they
       occur without waiting for a  prompt,  then  use  the  notify
       option.   When  you  try  to  exit  the shell while jobs are
       stopped or running, you will receive a message from ksh.  If
       you  ignore  this message and try to exit again, all stopped
       processes  will  be  terminated.   In  addition,  for  login
       shells,  the  HUP signal will be sent to all background jobs
       unless the job has been disowned with the disown command.

       A built-in version of kill makes  it  possible  to  use  job
       numbers  as targets for signals.  Signals can be selected by
       number or name.  The name of the signal is the name found in
       the  include  file /usr/include/sys/signal.h with the prefix
       SIG removed.  The -l option of kill provides a means to  map
       individual  signal  names  to  and  from  signal number.  In
       addition, if no signal name  or  number  is  given,  kill -l
       generates a list of valid signal names.
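
       The mapping in both directions can be sketched with signal
       15, which is TERM on UNIX systems.  The variable names are
       illustrative only.

```shell
# Map between signal numbers and names with the built-in kill.
name=$(kill -l 15)        # number to name
number=$(kill -l TERM)    # name to number (SIG prefix omitted)
echo "signal 15 is $name; TERM is signal $number"
```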

       2.7  Changing Directories

       By  default, ksh maintains a logical view of the file system
       hierarchy  which  makes  symbolic  links  transparent.   For
       systems that have symbolic links, this means that if /bin is
       a symbolic link to /usr/bin  and  you  change  directory  to
       /bin,  pwd will indicate that you are in /bin, not /usr/bin.
       pwd -P  generates  the  physical  pathname  of  the  present
       working  directory  by resolving all the symbolic links.  By
       default, the cd command will take you where you expect to go
       even if you cross symbolic links.  A subsequent cd .. in the
       example above will place you in /,  not  /usr.   On  systems
       with   symbolic  links,  cd -P  causes  ..   to  be  treated
       physically.
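
       The difference between the logical and physical views can
       be sketched with a scratch directory and a symbolic link.
       All path and variable names below are hypothetical.

```shell
# Build  $base/a/real  and a symbolic link  $base/link -> a/real
base=$(cd "$(mktemp -d)" && pwd -P)   # physical path of a scratch directory
mkdir -p "$base/a/real"
ln -s a/real "$base/link"

cd "$base/link"
logical=$(pwd)            # logical view:  $base/link
physical=$(pwd -P)        # physical view: $base/a/real

cd ..                     # logical parent of the link
logical_parent=$PWD       # $base

cd "$base/link"
cd -P ..                  # physical parent of a/real
physical_parent=$PWD      # $base/a
```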

       ksh remembers your last directory in  the  variable  OLDPWD.
       The  cd  built-in  can be given with argument - to return to
       the previous directory and print the name of the  directory.
       Note  that  cd -  done  twice  returns  you  to the starting
       directory, not the second previous directory.   A  directory
       stack  manager  has  been written as shell functions to push
       and pop directories from the stack.
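
       A short sketch of cd - follows, using two scratch
       directories whose names are assumptions for illustration.

```shell
start=$(mktemp -d)        # hypothetical scratch directories
other=$(mktemp -d)
cd "$start"
cd "$other"               # OLDPWD is now $start

cd - > /dev/null          # back to $start (the printed name is discarded)
after_first=$PWD
cd - > /dev/null          # back again: returns to $other, not further back
after_second=$PWD
```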







       2.8  Prompts

       When ksh reads commands from a terminal, it issues a  prompt
       whenever it is ready to accept more input and then waits for
       the user to respond.  The TMOUT variable can be  set  to  be
       the  number  of  seconds  that the shell will wait for input
       before terminating.  A 60 second warning message is  printed
       before terminating.

       The  shell uses two prompts.  The primary prompt, defined by
       the value of the PS1 variable, is issued  at  the  start  of
       each command.  The secondary prompt, defined by the value of
       the PS2 variable, is issued when more  input  is  needed  to
       complete a command.

       ksh   allows  the  user  to  specify  a  list  of  files  or
       directories to check before issuing  the  PS1  prompt.   The
       variable  MAILPATH  is  a colon ( : ) separated list of file
       names to be checked for changes periodically.  The  user  is
       notified  before the next prompt.  Each of the names in this
       list can be followed by a ?  and a message to be given  when
       a change has been detected in the file.  The message will be
       evaluated for parameter expansion, command substitution, and
       arithmetic expansion, which are described later.  The
       parameter $_ within a mail message will evaluate to the name
       of  the  file  that has changed.  The parameter MAILCHECK is
       used to specify the minimal interval in seconds  before  new
       mail is checked for.

       In  addition  to  replacing  each  !  in the prompt with the
       command number, ksh expands the value of  the  PS1  variable
       for parameter expansions, arithmetic expansions, and command
       substitutions as described below  to  generate  the  prompt.
       The  expansion  characters  that  are to be applied when the
       prompt is issued must be quoted to  prevent  the  expansions
       from  occurring  when  assigning  the  value  to  PS1.   For
       example, PS1="$PWD" causes PS1 to be set to the value of PWD
       at  the time of the assignment whereas PS1='$PWD' causes PWD
       to be expanded at the time the prompt is issued.
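
       The two assignments can be compared directly.  In this
       sketch the helper variables fixed and deferred are
       hypothetical and only record what PS1 holds after each
       assignment.

```shell
cd /
PS1="$PWD> "              # expanded now: PS1 holds the text "/> " from here on
fixed=$PS1
PS1='$PWD> '              # stored unexpanded: $PWD is expanded at each prompt
deferred=$PS1
```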

       Command substitution  may  require  a  separate  process  to
       execute  and  cause  the prompt display to be somewhat slow,
       especially when the return key is pressed several times in a
       row.   Therefore,  its  use within PS1 is discouraged.  Some
       variables are maintained by ksh so that their values can  be
       used  with PS1.  The PWD variable stores the pathname of the
       current working directory.  The value of the SECONDS variable
       is the value of the most recent assignment plus the elapsed
       time.  By default, the time is measured in milliseconds,
       but since SECONDS is a floating point variable, the number
       of places after the decimal point in the expanded value can
       be specified with typeset -Fplaces SECONDS.  In a roundabout
       way, this variable can be used to generate a time stamp into
       the PS1 prompt without creating a process at each prompt.
       The following code shows how you can do this on System V.
       On BSD, you need a different command to initialize the
       SECONDS variable.

            # . this script and use $TIME as part of your PS1 string to
            # get the time of day in your prompt
            typeset -RZ2  _x1 _x2 _x3
            (( SECONDS=$(date  '+3600*%H+60*%M+%S') ))
            _s='_x1=(SECONDS/3600)%24,_x2=(SECONDS/60)%60,_x3=SECONDS%60,0'
            TIME='"${_d[_s]}$_x1:$_x2:$_x3"'
            # PS1=${TIME}whatever



       2.9  Tilde substitution

       The character ~ at the  beginning  of  a  word  has  special
       meaning  to  ksh.   If  the characters after the ~ up to a /
       match a user login name in the password database, then the ~
       and  the  name  are replaced by that user's login directory.
       If no match is found, the original word is unchanged.   A  ~
       by  itself,  or in front of a /, is replaced by the value of
       the HOME parameter.  A ~ followed by a + or - is replaced by
       the value of $PWD or $OLDPWD respectively.
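
       The ~+ and ~- forms can be sketched as follows.  The scratch
       directory and the variable names are assumptions made for
       the example.

```shell
scratch=$(mktemp -d)      # hypothetical scratch directory
cd "$scratch"
cd /                      # now PWD is / and OLDPWD is $scratch

current=~+                # same as $PWD
previous=~-               # same as $OLDPWD
home=~                    # same as $HOME
```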

       2.10  Output formats

       The output of built-in commands and traces has values
       quoted so that they can be re-input to the shell.  This
       makes it easy to cut and paste shell output on systems which
       use a pointing device such as a mouse.  In addition,  output
       can be saved in a file for reuse.

       2.11  The ENV file

       When  an  interactive  ksh  starts,  it  evaluates  the $ENV
       variable to arrive at a file name.  If  this  value  is  not
       null, ksh attempts to read and process commands in a file by
       this name.  Earlier versions of ksh read the  ENV  file  for
       all  invocations  of  the  shell primarily to allow function
       definitions to be available for all shell invocations.   The
       function search path, FPATH, described later, eliminated the
       primary need for this capability and it was removed  because
       the high performance cost was no longer deemed acceptable.


       3.  PROGRAMMING LANGUAGE

       The  KornShell  vastly  extends the set of applications that
       can be implemented efficiently at the shell level.  It  does
       this by providing simple yet powerful mechanisms for
       arithmetic, pattern matching, substring generation, and
       arrays.   Users can write applications as separate functions
       that can be defined in the same file  or  in  a  library  of
       functions stored in a directory and loaded on demand.

       3.1  String Processing

       The  shell  is  primarily  a string processing language.  By
       default, variables hold variable length strings.  There  are
       no  limits  to the length of strings.  Storage management is
       handled by the shell automatically.   Declarations  are  not
       required.  With most programming languages, string constants
       are designated by enclosing characters in single  quotes  or
       double  quotes.  Since most of the words in the language are
       strings, the  shell  requires  quotes  only  when  a  string
       contains characters that are normally processed specially by
       the shell, but their literal meaning is intended.   However,
       since  the  shell  is a string processing language, and some
       characters  can  occur   as   literals   and   as   language
       metacharacters,   quoting   is  an  important  part  of  the
       language.

       There are four quoting mechanisms in ksh.  The  simplest  is
       to  enclose  a  sequence of characters inside single quotes.
       All characters between a pair of single  quotes  have  their
       literal meaning; the single quote itself cannot appear.  A $
       immediately preceding a single quoted string causes all  the
       characters until the matching single quote to be interpreted
       as  an  ANSI-C  language  string.   Thus,  '\n'   represents
       characters  \  and n, whereas, $'\n' represents the new-line
       character.  Double quoted strings remove the special meaning
       of  all  characters  except  $,  `, and \, so that parameter
       expansion  and  command  substitution  (defined  below)  are
       performed.   The  final mechanism for quoting a character is
       by preceding it with the escape character \.  This mechanism
       works outside of quoted strings and for the characters $, `,
       ", and \ in double quoted strings.

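       The four mechanisms can be compared side by side.  This is
       only a sketch; the variable names are illustrative:

```shell
single='$HOME is not expanded here'   # every character is literal
ansi=$'one\ttwo'                      # ANSI-C string: \t is a tab
name=world
double="hello $name"                  # $name is still expanded
escaped=\$name                        # \ makes the $ literal
```
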
       Variables are designated by one or more strings of
       alphanumeric characters, beginning with an alphabetic
       character and separated by a . (dot).  Upper and lower case
       characters are distinct, so that A and a are names of
       different variables.  There is no limit to the length of the
       name  of  a variable.  You do not have to declare variables.
       You can assign a value to a variable by writing the name  of
       the  variable,  followed  by  an  equal  sign, followed by a
       character string that represents its  value.   To  create  a
       variable  whose  name  contains a ., the variable whose name
       consists of the characters before the last .   must  already
       exist.   You reference a variable by putting the name inside
       curly braces and preceding the braces with  a  dollar  sign.
       The braces may be omitted when the name is alphanumeric.  If
       x and y are two  shell  variables,  then  to  define  a  new
       variable,  z, whose value is the concatenation of the values
       of x and y, you just say z=$x$y.  It is that easy.

       The $ can be thought of as meaning "value of."  You can also
       capture   the  output  of  any  command  with  the  notation
       $(command).  This is referred to  as  command  substitution.
       For  example,  x=$(date)  assigns  the  output from the date
       command to the variable  x.   Command  substitution  in  the
       Bourne  shell  is  denoted  by enclosing the command between
       backquotes,  (``).   This   notation   suffers   from   some
       complicated  quoting  rules.   Thus, it is hard to write sed
       patterns that contain backslashes within command
       substitution.   Putting  the  pattern in single quotes is of
       little  help.   ksh  accepts  the   Bourne   shell   command
       substitution   syntax   for   backward  compatibility.   The
       $(command) notation allows the  command  itself  to  contain
       quoted strings even if the substitution occurs within double
       quotes. Nesting is legal.

       The special command substitution of the form $(cat file) can
       be  replaced  by  $(< file), which is faster because the cat
       command doesn't have to run.
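
       The following sketch illustrates these forms.  It assumes
       that the mktemp, basename, dirname, and sed utilities are
       available; the file and variable names are illustrative.

```shell
tmpfile=$(mktemp)                     # scratch file for the sketch
printf 'first line\nsecond line\n' > "$tmpfile"

contents=$(< "$tmpfile")              # like $(cat file), without cat

# Nested substitution, with quoted strings inside double quotes
base="$(basename "$(dirname "/a/b c/file")")"

# A sed pattern containing a backslash needs no extra quoting
dotted=$(printf 'a.b\n' | sed 's/\./-/')

rm -f "$tmpfile"
```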







       3.2  Shell Parameters and Variables

       There are three types of parameters used by ksh: special
       parameters, positional parameters, and named parameters,
       which are called variables.  ksh defines the same special
       parameters,  0,  *,  @,  #, ?, $, !, and -, as in the Bourne
       shell.

       Positional parameters are set when the shell is invoked,  as
       arguments  to  the  set  built-in, and by calls to functions
       (see below) and .  procedures.  They are  named  by  numbers
       starting at 1.

       The  third  type  of  parameter is a variable.  As mentioned
       earlier, ksh uses variables whose names consist of one or
       more alphanumeric strings separated by a . (dot).  There is
       no need to specify the type of a variable in the shell because,
       by  default, variables store strings of arbitrary length and
       values will automatically be converted to numbers when  used
       in  an  arithmetic context.  However, ksh variables can have
       one  or  more   attributes   that   control   the   internal
       representation  of  the  variable,  the  way the variable is
       printed, and its access or scope.  In addition,  ksh  allows
       variables  to  represent  arrays of values and references to
       other  variables.   The  typeset  built-in  command  of  ksh
       assigns  attributes  to  variables.   Two of the attributes,
       readonly and export, are  available  in  the  Bourne  shell.
       Most  of  the  remaining attributes are discussed here.  The
       complete list of attributes  appears  in  the  manual.   The
       unset  built-in  of  ksh  removes  values  and attributes of
       variables.  When a variable  is  exported,  certain  of  its
       attributes are also exported.

       Whenever  a  value  is  assigned to a variable, the value is
       transformed according to the  attributes  of  the  variable.
       Changing  the  attribute of a variable can change its value.
       The attributes -L and  -R  are  for  left  and  right  field
       justification  respectively.   They  are useful for aligning
       columns in a report.  For each of these attributes, a  width
       can  be  defined  explicitly or else it is defined the first
       time an assignment is made to the variable.  Each assignment
       causes  justification of the field, truncating if necessary.
       Assignment to fixed sized  variables  provides  one  way  to
       generate  a  substring  consisting  of  a  fixed  number  of
       characters from the beginning or end  of  a  string.   Other
       methods are discussed later.

       The  attributes  -u and -l are used for upper case and lower
       case formatting, respectively.  Since it makes no  sense  to
       have both attributes on simultaneously, turning on either of
       these attributes turns the other off.  The following script,
       using  read and print which are described later, provides an
       example of the use of shell variables with attributes.  This
       script  reads a file of lines each consisting of five fields
       separated by : and prints fields 4 and 2 in  upper  case  in
       columns  1-15,  left  justified,  and  columns  20-25 right-
       justified respectively.

            typeset -uL15 f4                # 15 character left justified
            typeset -uR6 f2                 # 6 character right justified
            IFS=:                           # set field separator to :
            while   read -r f1 f2 f3 f4 f5  # read line, split into fields
            do      print -r -- "$f4  $f2"  # print fields 4 and 2
            done


       The -i, -E, and -F attributes are used to represent
       numbers.   Each can be followed by a decimal number.  The -i
       attribute causes the value to be represented as  an  integer
       and  it can be followed by a number representing the numeric
       base when expanding its value.  Whenever a value is assigned
       to  an  integer  variable,  it is evaluated as an arithmetic
       expression and then truncated to an integer.

       The -E attribute causes  the  value  to  be  represented  in
       scientific  notation  whenever  its  value is expanded.  The
       number following the -E determines the number of significant
       figures,  and  defaults  to  6.  The -F attribute causes the
       value to be represented with a fixed number of places  after
       the  decimal point.  Assignments to variables with the -E or
       -F attributes cause the evaluation of the right hand side of
       the assignment.
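
       A brief sketch of the -i attribute follows; the variable
       names are illustrative, and the -E and -F attributes are not
       shown.

```shell
typeset -i total          # integer attribute
total=6*7                 # right hand side is evaluated: 42
typeset -i count=10
count=count+5             # no $ needed on the variable name: 15
typeset -i half=7/2       # result is an integer: 3
```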

       ksh  allows  one-dimensional  arrays  in  addition to simple
       variables.  There are two types of arrays: associative
       arrays and indexed arrays.  The subscript for an associative
       array is an arbitrary string, whereas the subscript for an
       indexed array is an arithmetic expression that is evaluated
       to yield an integer index.  Any variable can become an
       indexed array by referring to it with an integer subscript.
       Not all elements of an array need exist.  Subscripts for
       indexed arrays must evaluate to an integer between 0 and
       some maximum value, otherwise an error results.  The maximum
       value  may  vary from one machine to another but is at least
       4095.  Evaluation of subscripts is  described  in  the  next
       section.  Attributes apply to the whole array.

       Assignments  to  array  variables  can be made to individual
       elements via parameter assignment commands  or  the  typeset
       built-in.  Additionally, values can be assigned sequentially
       with compound assignment as described below, or  by  the  -A
       name  option of the set command.  Referencing of subscripted
       variables requires the character $, but also requires braces
       around  the  array  element  name.  The braces are needed to
       avoid conflicts with the  file  name  generation  mechanism.
       The form of any array element reference is:
                            ${name[subscript]}
       Subscript  values  of  *  and  @ can be used to generate all
       elements of an array, as they  are  used  for  expansion  of
       positional   parameters.   The  list  of  currently  defined
       subscripts for  a  given  variable  can  be  generated  with
       ${!name[@]}, or ${!name[*]}.
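
       For instance, an indexed array might be built and examined
       as sketched below (fruit and the other names are
       illustrative):

```shell
fruit[0]=apple            # fruit becomes an indexed array
fruit[1]=banana
fruit[5]=cherry           # elements need not be contiguous
i=4
last=${fruit[i+1]}        # subscript is an arithmetic expression
all="${fruit[*]}"         # all elements
subs="${!fruit[*]}"       # subscripts that are currently defined
count=${#fruit[*]}        # number of elements that are set
```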

       The  -n  or  nameref  attribute  causes  the  variable to be
       treated as a reference to the variable defined by its value.
       Once  this attribute is set, all references to this variable
       become references to the variable named by the value of this
       variable.  For example, if foo=bar, then setting the
       reference attribute on foo will cause all subsequent
       references to foo to behave as if the variable whose name
       is $foo, in this case bar, had been referenced.
       Unsetting this attribute breaks the association.  Reference
       variables are usually used inside functions whose  arguments
       are  the  names of shell variables.  The names for reference
       variables cannot contain a . (dot).  Whenever a shell variable is
       referenced,  the  portion  of the variable up to the first .
       is checked to see whether it matches the name of a reference
       variable.   If  it  does,  then  the  name  of  the variable
       actually used consists of the concatenation of the  name  of
       the  variable  defined  by  the reference plus the remaining
       portion of the original variable name.  For  example,  using
       the predefined alias, alias nameref='typeset -n',

            .bar.home.bam="hello world"
            nameref foo=.bar.home
            print ${foo.bam}
            hello world
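
       The more common use, a function whose argument is the name
       of a variable, might be sketched as follows.  The function
       double_it is hypothetical and only serves as an illustration.

```shell
# Hypothetical helper: doubles the variable whose name is passed in.
function double_it
{
        typeset -n ref=$1       # ref refers to the caller's variable
        (( ref = ref * 2 ))
}

total=21
double_it total                 # total is now 42
```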


       3.3  Compound Assignment

       Compound assignments are used to assign values to arrays and
       compound  data  structures.   The  syntax  for  a   compound
       assignment  is name=(assignment-list) where name is the name
       of the variable to which you  want  to  assign  values.   No
       space  is  permitted between the variable name and the = but
       can appear between the = and  the  open  parenthesis.   New-
       lines can appear between the parentheses.

       The  assignment-list  can  be  in  several  different  forms
       yielding different results.  If assignment-list is simply  a
       list of words, then the words are processed as they are with
       the for command and  assigned  sequentially  as  an  indexed
       array.  For example,
                                foo=( * )
       creates  an  indexed array foo and assigns the file names in
       the current directory to each index starting at zero.

       The second form for assignment-list is a list of assignments
       of  the  special  form  [word]=word.   No space is permitted
       before or after the =.  In this case, the variable given  by
       name  becomes  an associative array with the given arguments
       as subscripts.  For example,
                     bar=( [color]=red [shape]=box )
       creates an associative array named bar whose subscripts are
       color and shape.
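
       The first two forms can be sketched as follows.  The
       explicit typeset -A is an addition made here so that the
       sketch does not depend on the shell inferring the array
       type; the names are illustrative.

```shell
colors=( red green "dark blue" )  # indexed array: subscripts 0, 1, 2
first=${colors[0]}
third=${colors[2]}

typeset -A bar                    # declare bar as associative
bar=( [color]=red [shape]=box )
shape=${bar[shape]}
nsubs=${#bar[*]}                  # number of subscripts
```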

       The  third  form  for  assignment-list  is  a list of normal
       assignments,   including   compound   assignments.     These
       assignments cause sub-variables to be assigned corresponding
       to the given assignments.  In addition to  assignments,  the
       assignment-list  can  contain typeset commands.  In addition
       to  creating  sub-variables,  the  effect  of   a   compound
       assignment  is to make the value of the original variable be
       a parenthesized assignment  list  of  its  components.   For
       example, the assignment

            foo=(
                    left=bar
                    typeset -i count=3
                    point=(
                            x=50
                            y=60
                    )
                    colors=( red green yellow )
                    right=bam
            )

       is equivalent to the assignments

            foo.left=bar
            foo.count=3
            foo.point.x=50
            foo.point.y=60
            foo.colors=( red green yellow )
            foo.right=bam

       In addition, the value of "$foo" is

            (
                    colors=( red green yellow )
                    left=bar
                    typeset -i count=3
                    point=(
                            y=60
                            x=50
                    )
                    right=bam
            )


       3.4  Substring Generation

       The  expansion of a variable or parameter can be modified so
       that only a portion of  the  value  results.   It  is  often
       necessary  to  extract  a  portion  of a shell variable or a
       portion of an array.  There are several parameter  expansion
       operators  that  can  do  this.   One  method  to generate a
       substring   is   with   an    expansion    of    the    form
       ${name:offset:length}   where   offset   is   an  arithmetic
       expression that defines the offset of  the  first  character
       starting from 0, and length is an arithmetic expression that
       defines the length of the substring.  If :length is omitted,
       the  length of the value of name starting at offset is used.
       The :offset:length operators can also be  applied  to  array
       expansions and to parameters * and @ to generate portions of
       an     array.      For     example,      the      expansion,
       ${name[@]:offset:length},  yields  up  to length elements of
       the array name starting at the element offset.
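
       A short sketch of these expansions, using illustrative
       names and values:

```shell
word=KornShell
head=${word:0:4}              # Korn
tail=${word:4}                # Shell

set -- a b c d e              # positional parameters
middle="${*:2:3}"             # b c d

letters=( v w x y z )
slice="${letters[*]:1:2}"     # w x
```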

       The other parameter expansion modifiers use  shell  patterns
       to  describe portions of the string to modify and delete.  A
       description of shell  patterns  is  contained  below.   When
       these modifiers are applied to special parameters @ and * or
       to  array  parameters  given  as  name[@]  or  name[*],  the
       operation  is  performed  on  each  element.  There are four
       parameter expansion modifiers that  strip  off  leading  and
       trailing  substrings  during parameter expansion by removing
       the characters matching a given pattern.   An  expansion  of
       the form ${name#pattern} causes the smallest matching prefix
       of the value of name to  be  removed.   The  largest  prefix
       matching  pattern  is  removed  by  using  ##  instead of #.
       Similarly, an expansion of the form  ${name%pattern}  causes
       the  smallest  matching  substring  at the end of name to be
       removed.  Again, using %% instead of %, causes  the  largest
       matching  trailing substring to be deleted.  For example, if
       the shell variable file has value foo.c, then the expression
       ${file%.c}.o has value foo.o.
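
       The four modifiers can be compared on a single value.  The
       pathname below is illustrative:

```shell
path=/usr/local/lib/libfoo.so.1
base=${path##*/}      # largest prefix */ removed:   libfoo.so.1
top=${path#/*/}       # smallest prefix /*/ removed: local/lib/libfoo.so.1
dir=${path%/*}        # smallest suffix /* removed:  /usr/local/lib
stem=${base%%.*}      # largest suffix .* removed:   libfoo
file=foo.c
obj=${file%.c}.o      # foo.o
```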

       The value of an expansion can be changed by specifying,
       after the parameter expansion modifier /, a pattern that
       matches the part that needs to be changed.  An expansion of
       the form ${name/pattern/string} replaces the first match of
       pattern in the value of variable name with string.  The
       second / is not necessary when string is null.  The
       expansion ${name//pattern/string} changes all occurrences of
       the pattern into string.  The parameter expansion  modifiers
       /#  and  /% cause the matching pattern to be anchored to the
       beginning and end respectively.
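
       As a sketch, with an illustrative value:

```shell
text="the cat sat on the mat"
one=${text/the/a}         # first match only
all=${text//the/a}        # every match
front=${text/#the/THE}    # match anchored to the beginning
back=${text/%mat/rug}     # match anchored to the end
gone=${text/ cat}         # null string, second / omitted
```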

       Finally, there are parameter expansion modifiers that  yield
       the name of the variable, the string length of the value, or
       the number of elements of an  array.   ${!name}  yields  the
       name  of  the variable which will be name itself except when
       name is a reference variable.  In this case  it  will  yield
       the  name  of the variable it refers to.  When applied to an
       array variable, ${!name[@]}  and  ${!name[*]}  generate  the
       names  of  all  subscripts.   ${#name} will be the length in
       bytes of $name.  For an array variable ${#name[*]} gives the
       number of elements in the array.

       3.5  Arithmetic Evaluation

       For  the  most  part,  the  shell  is  a  string  processing
       language.  However, the need for arithmetic  has  long  been
       obvious.   Many  of  the  characters that are special to the
       Bourne shell are needed as arithmetic  operators.   To  make
       arithmetic  easy  to use, and to maintain compatibility with
       the Bourne shell, ksh uses matching (( and ))  to  delineate
       arithmetic expressions.  While single parentheses might have
       been more desirable, these already  mean  subshell  so  that
       another  notation  was  required.  The arithmetic expression
       inside the  double  parentheses  follows  the  same  syntax,
       associativity  and  precedence as the ANSI-C[15] programming
       language.   The  characters  between  the  matching   double
       parentheses  are  processed  with  the  same  rules used for
       double quotes so that spaces can be used to aid  readability
       without additional quoting.

       All   arithmetic  evaluations  are  performed  using  double
       precision  floating  point   arithmetic.    Floating   point
       constants  follow  the  same rules as the ANSI-C programming
       language.  Integer arithmetic constants are written as
                               base#number,
       where base is a decimal integer between two  and  sixty-four
       and  number  is  any  non-negative number.  Base ten is used
       when no base is specified.  The digits  are  represented  by
       the characters 0-9a-zA-Z_@.  For bases less than or equal to
       36,  upper  and  lower   case   characters   can   be   used
       interchangeably to represent the digits from 10 thru 35.
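
       A few constants in this notation are sketched below, using
       the $(( ... )) arithmetic expansion that is described later
       in this section.  The variable names are illustrative.

```shell
bin=$(( 2#1010 ))         # 10
hex=$(( 16#ff ))          # 255
same=$(( 16#FF ))         # 255, case is interchangeable here
b36=$(( 36#z ))           # 35
```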

       Arithmetic  expressions  are made from constants, variables,
       and operators.  Parentheses may be used for  grouping.   The
       contents inside the double parentheses are processed with
       the same expansions as occur in a double quoted string, so
       that all $ expansions are performed before the expression is
       evaluated.  However, there is usually no need to use  the  $
       to  get  the  value  of  a  variable  because the arithmetic
       evaluator replaces the name of the  variable  by  its  value
       within  an arithmetic expression.  The $ cannot be used when
       the variable is the subject of assignment  or  an  increment
       operation.   As a rule it is better not to use $ in front of
       variables in an arithmetic expression.

       An arithmetic command of the form (( ...  ))  is  a  command
       that  evaluates  the  enclosed  arithmetic  expression.  For
       example, the command
                                (( x++ ))
       can be used to increment the variable  x,  assuming  that  x
       contains  some  numerical  value.  The arithmetic command is
       true (return value 0) when the value of the expression is
       non-zero, and false (return value 1) when the expression
       evaluates to zero.  This makes the command easy to use  with
       the if and while compound commands.
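
       The return value convention can be sketched as follows
       (names and values are illustrative):

```shell
x=5
if (( x > 3 )); then size=big; else size=small; fi

# x - 5 evaluates to zero, so the command is false
if (( x - 5 )); then diff=nonzero; else diff=zero; fi

n=0
while (( n < 3 ))
do      (( n += 1 ))
done
```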

       The  for  compound  command  has  been  extended  for use in
       arithmetic contexts.  The syntax,
                      for (( expr1; expr2 ; expr3 ))
       can be used as the first line of a for loop  with  the  same
       semantics  as  the  for  statement in the ANSI-C programming
       language.
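
       For example, a loop that sums the integers from 1 to 10
       might be sketched as:

```shell
sum=0
for (( i = 1; i <= 10; i++ ))
do      (( sum += i ))
done
```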

       Arithmetic evaluations can also be performed as part of  the
       evaluation of a command line.  The syntax $(( ... )) expands
       to the value of the enclosed  arithmetic  expression.   This
       expansion   can   occur   wherever  parameter  expansion  is
       performed.   For  example  using  the  ksh   command   print
       (described later)
                              print $((2+2))
       prints the number 4.

       The  following  script  prints  the  first  n  lines  of its
       standard input onto its standard  output,  where  n  can  be
       supplied  as an optional argument whose default value is 20.

            integer n=${1-20}                       # set n
            while   (( n-- > 0 )) && read -r line   # at most n lines
            do      print -r -- "$line"
            done


       3.6  Shell Expansions

       The commands you enter from the terminal or  from  a  script
       are  divided  into  words  and  each  word undergoes several
       expansions to generate the command name and  its  arguments.
       This  is  done  in  two  phases.  The first phase recognizes
       reserved words, spaces and operators to decide where command
       boundaries  lie.  Alias substitutions take place during this
       phase.   The  second  phase  performs  expansions   in   the
       following order:

          o Tilde  substitution,  parameter  expansion,  arithmetic
            expansion, and command substitution are performed  from
            left to right.  The option -u, or nounset, will cause
            an error to occur when any variable that is not set is
            expanded.

          o The characters that result from parameter expansion and
            command  substitution  above  are  checked   with   the
            characters  in  the  IFS  variable  for  possible field
            splitting.  (See a description of read below to see how
            IFS is used.)  Setting IFS to a null value causes field
            splitting to be skipped.

          o Pathname generation (as described below)  is  performed
            on  each of the fields.  Any field that doesn't match a
            pathname is left alone.  The option, -f or  noglob,  is
            used to disable pathname generation.
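
       The effect of IFS and of the noglob option on the second
       phase can be sketched as follows.  The names are
       illustrative, and set -- is used only to count the resulting
       fields.

```shell
list="alpha:beta:gamma"
old_ifs=$IFS

IFS=:
set -- $list              # unquoted expansion is split on :
nfields=$#
second=$2

IFS=
set -- $list              # null IFS: field splitting is skipped
unsplit=$#
IFS=$old_ifs

set -f                    # noglob: no pathname generation
set -- *
star=$1                   # the literal character *
set +f
```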

       3.7  Pattern Matching

       The shell is primarily a string processing language and uses
       patterns for matching file names as  well  as  for  matching
       strings.  The characters ?, *, and [ are processed specially
       by the shell when not quoted.  These characters are used  to
       form  patterns that match strings.  Patterns are used by the
       shell to match pathnames, to  specify  substrings,  and  for
       case  commands.  The character ?  matches any one character.
       The character  *  matches  zero  or  more  characters.   The
       character  sequence  [...]  defines  a  character class that
       matches any character  contained  within  [].   A  range  of
       characters can be specified by putting a - between the first
       and last character of the range.  An  exclamation  mark,  !,
       immediately  after  the [, means match all characters except
       the  characters  specified.   For   example,   the   pattern
       a?c*.[!a-z]  matches  any  string beginning with an a, whose
       third character is a c, and that ends in .   (dot)  followed
       by  any  character  except the lower case letters, a-z.  The
       sequence [:alpha:] inside a character class matches any
       character in the ANSI-C alpha class.  Similarly, [:class:]
       matches any character in the given class, for each of the
       ANSI-C character classes.  For example,
       [[:alnum:]_] matches  any  alpha-numeric  character  or  the
       character _.
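
       The example pattern can be tried with a case command.  The
       function classify below is hypothetical and exists only for
       this sketch.

```shell
# Hypothetical helper: sets result according to whether $1 matches.
function classify
{
        case $1 in
        a?c*.[!a-z])    result=match ;;
        *)              result=nomatch ;;
        esac
}

classify abcdef.7;  r1=$result    # match
classify abcdef.x;  r2=$result    # nomatch: ends in a lower case letter
classify abd.7;     r3=$result    # nomatch: third character is not c
```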

       ksh treats strings of the form (pattern-list), where
       pattern-list is a list of one or more patterns separated  by
       a  |,  specially  when  preceded  by  *, ?, +, @, or !.  A ?
       preceding  (pattern-list)  means  that  the   pattern   list
       enclosed  in () is optional.  An @(pattern-list) matches any
       pattern  in  the  list  of  patterns  enclosed  in  ().    A
       *(pattern-list) matches zero or more occurrences of any of
       the enclosed patterns, whereas +(pattern-list) requires a
       match of one or more occurrences of any of the given
       patterns.  For instance, the pattern +([0-9])?(.) matches
       one  or  more  digits  optionally  followed  by a .(dot).  A
       !(pattern-list) matches anything except  any  of  the  given
       patterns.  For example, print !(*.o) displays all file names
       in the current directory that do not end in .o.
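
       These pattern lists can be tried with the [[ ... ]] command
       described in the section on conditional expressions.  The
       strings tested below are illustrative.

```shell
if [[ 123. == +([0-9])?(.) ]];  then a=yes; else a=no; fi
if [[ 12a == +([0-9])?(.) ]];   then b=yes; else b=no; fi
if [[ main.c == !(*.o) ]];      then c=yes; else c=no; fi
if [[ foo == @(foo|bar) ]];     then d=yes; else d=no; fi
if [[ "" == *(x|y) ]];          then e=yes; else e=no; fi
```

       Only the second test fails; the last one succeeds because
       *(pattern-list) also accepts zero occurrences.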

       When patterns are used to generate pathnames while expanding
       commands, several other rules apply.  A separate match is
       made for each file name component  of  the  pathname.   Read
       permission  is required for any portion of the pathname that
       contains any special pattern character.   Search  permission
       is required for every component except possibly the last.

       By  default,  file names in each directory that begin with .
       are skipped when performing a match.  If the pattern  to  be
       matched  starts  with a leading ., then only files beginning
       with a ., are examined when reading each directory  to  find
       matching  files.   If the FIGNORE variable is set, then only
       files that do not match this pattern are  considered.   This
       overrides  the  special  meaning of .  in a pattern and in a
       file name.

       If the markdirs option is set, each matching  pathname  that
       is  the name of a directory has a trailing / appended to the
       name.

       3.8  Conditional Expressions

       The Bourne shell uses the test command, or the equivalent  [
       command, to test files for attributes and to compare strings
       or numbers.  The problem with test is  that  the  shell  has
       expanded  the  words of the test command and split them into
       arguments  before  test  begins  execution.    test   cannot
       distinguish  between  operators and operands.  In most cases
       test "$1"  will  test  whether  argument  1   is   non-null.
       However,  if argument 1 is -f, then test will treat -f as an
       operator and yield a syntax error.  One of the most frequent
       errors  with  test  occurs  when its operands are not within
       double quotes.  In this case, the  argument  may  expand  to
       more  than  a  single argument or to no argument at all.  In
       either case this will likely cause  a  syntax  error.   What
       makes   this   most  insidious  is  that  these  errors  are
       frequently data dependent.  A script  that  appears  to  run
       correctly may abort if given unexpected data.

       To get around these problems, ksh has a compound command for
       conditional expression testing as part of the language.  The
       reserved  words  [[ and ]] delimit the range of the command.
       Because they are reserved words,  not  operator  characters,
       they  require  spaces  to separate them from arguments.  The
       words  between  [[  and  ]]  are  not  processed  for  field
       splitting  or  for  pathname generation.  In addition, since
       ksh determines the  operators  before  parameter  expansion,
       expansions  that  yield  no  argument cause no problem.  The
       operators within [[...]] are almost the same  as  those  for
       the  test  command.   All  unary  operators  are of the form
       -letter and are followed by a single operand.  Instead of -a
       and  -o,  [[...]] uses && and || to indicate "and" and "or".
       Parentheses are used without quoting for grouping.

       The right hand side of the string  comparison  operators  ==
       and  !=  takes  a  pattern  and  tests whether the left hand
       operand matches this pattern.  Quoting the  pattern  results
       in a string comparison rather than a pattern match.  The
       operators < and > within [[...]]  designate  lexicographical
       comparison.
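
       The difference between a pattern and a quoted string on the
       right hand side can be sketched as follows (the names and
       values are illustrative):

```shell
name='*.c'
if [[ main.c == *.c ]];    then p1=match; else p1=nomatch; fi  # pattern
if [[ main.c == "*.c" ]];  then p2=match; else p2=nomatch; fi  # string
if [[ $name == "*.c" ]];   then p3=match; else p3=nomatch; fi  # string

empty=
if [[ -n $empty ]];        then p4=set; else p4=null; fi  # no quotes needed
if [[ apple < banana ]];   then p5=before; else p5=after; fi
```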

       In   addition   there   are  several  other  new  comparison
       primitives.  The binary operators -ot and  -nt  compare  the
       modification  times  of two files to see which file is older
       than or newer than the other.  The binary operator -ef tests
       whether  two  files  have the same device and i-node number,
       i.e., whether they are links to the same file.

       The unary operator -L returns  true  if  its  operand  is  a
       symbolic  link.   The unary operator -O (-G) returns true if
       the owner (or group) of the file operand matches that of the
       caller.  The unary operator -o returns true when its operand
       is the name of an option that is currently on.
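
       Some of these operators are sketched below.  The sketch
       assumes that mktemp and ln are available and that the
       scratch directory supports hard and symbolic links; all
       names are illustrative.

```shell
dir=$(mktemp -d)                  # scratch directory for the sketch
: > "$dir/orig"
ln "$dir/orig" "$dir/hard"        # second link to the same file
ln -s "$dir/orig" "$dir/soft"     # symbolic link

if [[ $dir/orig -ef $dir/hard ]]; then same=yes; else same=no; fi
if [[ -L $dir/soft ]];            then islink=yes; else islink=no; fi
if [[ -O $dir/orig ]];            then mine=yes; else mine=no; fi

set -f                            # turn the noglob option on
if [[ -o noglob ]];               then opt=on; else opt=off; fi
set +f

rm -rf "$dir"
```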

       The  following  script  illustrates  some  of  the  uses  of
       [[...]].  The reference manual contains the complete list of
       operators.

            for i
            do      # execute foo for numeric directory
                    if      [[ -d $i && $i == +([0-9]) ]]
                    then    foo
                    # otherwise if writable or executable file and not mine
                    elif    [[ (-w $i||-x $i) && ! -O $i ]]
                    then    bar
                    fi
            done


       3.9  Input and Output

       ksh has extended I/O capabilities to enhance the use of  the
       shell  as a programming language.  As with the Bourne shell,
       you use the I/O redirection operator, <,  to  control  where
       input  comes  from,  and the I/O redirection operator, >, to
       control where output goes to.  Each of these  operators  can
       be  preceded  with a single digit that specifies a file unit
       number to associate with the file  stream.   Ordinarily  you
       specify  these  I/O  redirection  operators  with a specific
       command to which they apply.  However, if you  specify  I/O
       redirections  with  the  exec  command,  and  don't  specify
       arguments to exec, then the I/O redirection applies  to  the
       current  program.   For  example,  the command exec < foobar
       opens file foobar for reading.  The  exec  command  is  also
       used  to  close files.  A file descriptor unit can be opened
       as a copy of an  existing  file  descriptor  unit  by  using
       either  of  the  <&  or  >&  operators  and putting the file
       descriptor unit of the original file  after  the  &.   Thus,
       2>&1 means open standard error (file descriptor 2) as a copy
       of standard output (file descriptor 1).  A  file  descriptor
       value  of  -  after  the & indicates that the file should be
       closed.  To close file unit 5, specify exec 5<&-.  There are
       two  additional redirection operators with ksh and the POSIX
       shell that are  not  part  of  the  Bourne  shell.   The  >|
       operator  overrides  the  effect  of  the  noclobber  option
       described earlier.  The <> operator  causes  a  file  to  be
       opened for both reading and writing.
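
       The following sketch exercises several of these operators
       in one place.  It assumes a mktemp utility for the scratch
       file; the file unit numbers 3 and 4 are arbitrary choices.

```shell
# Sketch: redirections applied to the current shell with exec.
tmp=$(mktemp)

exec 3> "$tmp"                  # open file unit 3 for writing
printf 'first line\n' >&3       # write through the open descriptor
exec 3>&-                       # close file unit 3

exec 4< "$tmp"                  # open file unit 4 for reading
read -r line <&4
exec 4<&-                       # close file unit 4

set -o noclobber
# With noclobber on, > refuses to overwrite an existing file ...
( printf 'lost\n' > "$tmp" ) 2> /dev/null || refused=yes
# ... but >| overrides the option.
printf 'replaced\n' >| "$tmp"
set +o noclobber

read -r replaced < "$tmp"
printf '%s / %s / %s\n' "$line" "$refused" "$replaced"
rm -f "$tmp"
```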

       ksh  recognizes certain pathnames and treats them specially.
       Pathnames of the form /dev/fd/n are treated as equivalent to
       the  file  defined  by file descriptor n.  These names can be
       used as the  script  argument  to  ksh  and  in  conditional
       testing  as  described  above.   On  underlying systems that
       support /dev/fd in the  file  system,  these  names  can  be
       passed   to   other   commands.    Pathnames   of  the  form
       /dev/tcp/hostid/port and /dev/udp/hostid/port can be used to
       create  tcp  and  udp  connections  to services given by the
       hostid number  and  port  number.   The  hostid  cannot  use
       symbolic  values.  In  practice  these numbers are typically
       generated   by   command   substitution.     For    example,
       exec 5> /dev/tcp/$(service name)  would open file descriptor
       5 for sending messages to hostid and port number defined  by
       the output of service name.

       The  Bourne  shell  has  a built-in command read for reading
       lines from standard input (file descriptor 0) and  splitting
       them into fields based on the value of the IFS variable, and a
       command echo to write strings to standard output.  (On  some
       systems,   echo   is  not  a  built-in  command  and  incurs
       considerable overhead to use.)   Unfortunately,  neither  of
       these  commands  is  able  to perform some very basic tasks.
       For example,  with  the  Bourne  shell,  the  read  built-in
       cannot read a single line that ends in \.  With ksh the read
       built-in has a -r option to remove the special meaning for \
       which  allows it to be treated as a regular character rather
       than the  line  continuation  character.   With  the  Bourne
       shell,  there  is  no  simple way to have more than one file
       open at any time for reading.  ksh has options on  the  read
       command  to  specify the file descriptor for the input.  The
       fields that are read from a  line  can  be  stored  into  an
       indexed  array  with  the  -A option to read.  This allows a
       line to be split into an arbitrary number of fields.
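
       A short sketch of the -r and -u options follows.  The file
       names and contents are invented for the example, and mktemp
       is assumed to be available.

```shell
# Sketch: read -r and read -u with two files open at once.
dir=$(mktemp -d)
printf '%s\n' 'C:\dir\' > "$dir/paths"  # a line that ends in \
printf '%s\n' 'alpha' > "$dir/words"

exec 3< "$dir/paths" 4< "$dir/words"    # open both files for reading

read -r -u3 path        # -r keeps each \ as an ordinary character
read -r -u4 word        # -u4 reads from file unit 4, not unit 0

exec 3<&- 4<&-
printf '%s %s\n' "$path" "$word"
rm -rf "$dir"
```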

       The way the Bourne shell uses  the  IFS  variable  to  split
       lines  into  fields  greatly limits its utility.  Often data
       files consist of lines that use a character  such  as  :  to
       delimit  fields  with  two adjacent delimiters that denote a
       null field.  The Bourne shell treats adjacent delimiters  as
       a  single  field  delimiter.   With ksh, delimiters that are
       considered white space characters have the behavior  of  the
       Bourne  shell,  but  other adjacent delimiters separate null
       fields.
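
       For instance, a colon-delimited record with an empty second
       field can be split as sketched below.  The record and the
       variable names are hypothetical.

```shell
# Sketch: adjacent non-white-space delimiters separate a null field.
record='guest::1001:users'

# IFS is changed only for the duration of the read command.
IFS=: read -r name passwd uid group <<EOF
$record
EOF

printf 'name=%s passwd=[%s] uid=%s group=%s\n' \
        "$name" "$passwd" "$uid" "$group"
```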

       The read command is often used in scripts that interact with
       the  user  by  prompting  the  user and then requesting some
       input.  With the Bourne shell two commands are  needed;  one
       to prompt the user, the other to read the reply.  ksh allows
       these two commands to be combined.  The  first  argument  of
       the read command can be followed by a ?  and a prompt string
       which is used whenever  the  input  device  is  a  terminal.
       Because the prompt is associated with the read built-in, the
       built-in command line editors will be able to re-output  the
       prompt  whenever the line needs to be refreshed when reading
       from a terminal device.

       With the Bourne shell, there is no way to set a  time  limit
       for waiting for the user response to read.  The -t option to
       read takes a floating point argument that gives the time  in
       seconds,  or fractions of seconds that the shell should wait
       for a reply.

       The version of the echo command in System V  treats  certain
       sequences beginning with \ as control sequences.  This makes
       it hard to output strings without interpretation.  Most  BSD
       derived  systems  do  not  interpret  \  control  sequences.
       Unfortunately, the BSD version of echo accepts a -n   option
       to  prevent a trailing new-line, but has no way to cause the
       string -n to be  printed.   Neither  of  these  versions  is
       adequate.  Also,  because  they are incompatible, it is very
       hard to write portable shell scripts using  echo.   The  ksh
       built-in,  print, outputs characters to the terminal or to a
       file and subsumes the functions of  all  versions  of  echo.
       Ordinarily,  escape  sequences in arguments beginning with \
       are processed the same as for the  System  V  echo  command.
       However, print  follows the standard conventions for options
       and has options that make  print  very  versatile.   The  -r
       option  can  be  used  to  output  the arguments without any
       special meaning.  The -n option can be used here to suppress
       the  trailing new-line that is ordinarily appended.  As with
       read, it is possible to specify the file  descriptor  number
       as  an  option  to  the  command  to  avoid  having  to  use
       redirection operators with each occurrence of the command.

       The IEEE POSIX shell and utilities  standard  committee  was
       unable to reconcile the differences between the System V and
       BSD versions of echo.  They introduced a new  command  named
       printf  which  takes  an  ANSI-C format string and a list of
       arguments and outputs the strings using the ANSI-C formatting
       rules.   Since  ksh  is POSIX conforming, it accepts printf.
       However, there is a -f option to print that can be  used  to
       specify  a  format  string which processes the arguments the
       same way that printf does.

       The format processing for print and printf has been extended
       slightly.  There are three additional formatting directives.
       The %b format causes the \ escape sequences to  be  expanded
       as  they  are with the System V echo command.  The %q format
       causes quotes to be placed on the output as required so that
       it  can  be  used as shell input.  Special characters in the
       output of most ksh built-in commands and in the output  from
       an execution trace are quoted in an equivalent fashion.  The
       %P format causes an extended regular expression string to be
       converted  into a shell pattern.  This is useful for writing
       shell applications that have to accept  regular  expressions
       as  input.  Finally, the escape sequence \E which expands to
       the terminal escape character (octal 033) has been added.
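
       The %b and %q directives can be sketched as follows.  The
       sample strings are made up, and because the exact quoting
       that %q produces is left to the shell, the example checks
       only that the quoted text reads back unchanged.

```shell
# Sketch: the %b and %q formatting directives of printf.
tab_text=$(printf '%b' 'name\tvalue')   # %b expands \t to a tab

original='two words; $HOME *'
quoted=$(printf '%q' "$original")       # quoted for reuse as shell input
eval "restored=$quoted"                 # read the quoted text back in

printf '%s\n' "$tab_text"
[[ $restored == "$original" ]] && printf 'round trip ok\n'
```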

       The shell is frequently used as a programming  language  for
       interactive  dialogues.  The select statement has been added
       to the language to make it easier to present menu  selection
       alternatives  to  the user and evaluate the reply.  The list
       of alternatives is numbered and  put  in  columns.   A  user
       settable  prompt,  PS3,  is  issued  and  if the answer is a
       number corresponding to one of the alternatives, the  select
       loop  variable is set to this value.  In any case, the REPLY
       variable is used to store the user entered reply.  The shell
       variables  LINES  and COLUMNS are used to control the layout
       of select lists.
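
       A minimal menu might look like the sketch below.  The
       function name, the prompt text and the menu entries are all
       hypothetical.

```shell
# Sketch: a menu built with select.
function choose_color
{
        PS3='Color? '                   # prompt issued after the menu
        select color in red green blue
        do      if      [[ -n $color ]]
                then    printf '%s\n' "$color"
                        break
                fi
                # No match: REPLY still holds what the user typed.
                printf 'invalid reply: %s\n' "$REPLY" >&2
        done
}
```

       Running choose_color and answering 2 prints green; a reply
       that matches no entry is reported and the prompt is issued
       again.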

       3.10  Option Parsing

       The getopts built-in command can be used to process  command
       arguments  in  a manner consistent with the way ksh does for
       its own built-in commands.

       The getopts built-in allows  users  to  specify  options  as
       separate  arguments  or  to  group  options that do not take
       arguments together.  Options that require arguments  do  not
       require  space  to  separate  them from the option argument.
       The OPTARG variable stores the value of the option  argument
       after  finding an option that takes an argument.  The OPTIND
       variable holds the index of the next argument to be processed.
       After processing options, the arguments should be shifted by
       OPTIND-1 to make the remaining arguments be "$@".

       The   getopts   argument   description   allows   additional
       information  to  be specified along with the options that is
       used to generate usage messages for incorrect arguments  and
       for  the  option  argument  -?.  The example in the APPENDIX
       uses getopts to process its arguments.
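
       A small sketch of a getopts loop follows.  It is not the
       APPENDIX example: the function name, the options -v and -o,
       and the result variables are hypothetical.

```shell
# Sketch: option parsing with getopts.
function parse_args
{
        verbose=0 outfile=
        OPTIND=1                        # restart parsing on every call
        while   getopts :vo: opt
        do      case $opt in
                v)      verbose=1 ;;
                o)      outfile=$OPTARG ;;
                *)      printf 'usage: parse_args [-v] [-o file] [arg ...]\n' >&2
                        return 2 ;;
                esac
        done
        shift $((OPTIND-1))             # leave only the operands in "$@"
        operands=$*
}

parse_args -vo result.txt one two
printf '%s %s %s\n' "$verbose" "$outfile" "$operands"
```

       Here -v and -o are grouped into one argument and the option
       argument of -o follows as a separate argument; writing
       -oresult.txt would be accepted as well.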

       3.11  Co-process

       ksh can spawn a co-process by adding a |& after  a  command.
       This  process  will  be  run with its standard input and its
       standard  output  connected  to  the  shell.   The  built-in
       command  print  with  the  -p  option  will  write  into the
       standard input of this process and the built-in command read
       with  the  -p  option  will  read  from  the  output of this
       process.

       In addition, the I/O redirection operators <& and >& can  be
       used to move the input or output pipe of the co-process to a
       numbered file descriptor.  Use exec 3>& p to move the  input
       of  the  co-process  to  file  descriptor 3.  After you have
       connected to file descriptor 3, you can direct the output of
       any command to the co-process by running command >&3.  Also,
       by  moving  the  input  of  the  co-process  to  a  numbered
       descriptor,  it is possible to run a second co-process.  The
       output of both co-processes  will  be  the  file  descriptor
       associated  with  read -p.   You can use exec 4<& p to cause
       the output of these co-processes to go to file descriptor  4
       of the shell.  Once you have moved the pipe to descriptor 4,
       it is possible to connect a  server  to  the  co-process  by
       running  command 4<& p  or to close the co-process pipe with
       exec 4<& -.

       3.12  Functions

       Function definitions are of the form

            function name
            {
                    any shell script
            }

       A function whose name contains a .  is called  a  discipline
       function.   The portion of the name after the last .  is the
       name of the discipline.   Discipline  functions  named  get,
       set,  and unset can be assigned to any variable to intercept
       lookups, assignments and unsetting of the  variable  defined
       by  the portion of the name before the last ..  Applications
       can create additional disciplines  for  variables  that  are
       created  as  part of user defined built-ins.  The portion of
       the name before the last .  must refer to  the  name  of  an
       existing  variable.  Thus, if p is a reference to PATH, then
       the function name p.get  and  PATH.get  refer  to  the  same
       function.

       The  function  is  invoked  either by specifying name as the
       command name and optionally following it with  arguments  or
       by using it as an  argument  to  the  .  built-in  command.
       Positional parameters are saved before  each  function  call
       and  restored when completed.  The arguments that follow the
       function  name  on  the  calling  line   become   positional
       parameters  inside the function.  The return built-in can be
       used to cause  the  function  to  return  to  the  statement
       following the point of invocation.

       Functions can also be defined with the System V notation,

            name ()
            {
                    any shell script
            }

       Functions  defined  with  this  syntax cannot be used as the
       first argument to the . command.  ksh accepts this  notation
       for  compatibility  only.   There  is  no  need  to use this
       notation when writing ksh scripts.

       Functions defined with the function name syntax and  invoked
       by  name  are  executed in the current shell environment and
       can  share  named  variables  with  the   calling   program.
       Options,  other  than execution trace -x, set by the calling
       program are passed down to a function.  The options are  not
       shared  with  the  function so that any options set within a
       function  are  restored  when  the  function  exits.   Traps
       ignored  by  the  caller are ignored within the function and
       cannot be enabled.  Traps caught by the calling program  are
       reset  to their default action within the function.  In most
       instances, the default action is to cause  the  function  to
       terminate.   A  trap  on  EXIT  defined  within  a  function
       executes after the function completes but before the  caller
       resumes.    Therefore,  any  variable  assignments  and  any
       options set as part of a trap action will be effective after
       the caller resumes.

       By  default,  variables  are  inherited  by the function and
       shared with the  calling  program.   However,  for functions
       defined  with  the  function name syntax that are invoked by
       name, environment substitutions preceding the function  call
       apply  only  to  the  scope  of  the  function  call.  Also,
       variables whose names do not contain a .  that  are  defined
       with  the typeset built-in command are local to the function
       that they are declared in.  Thus, for the function defined

            function  name
            {
                 typeset -i x=10
                 let z=x+y
                 print $z
            }

       invoked as y=13 name, x  and  y  are  local  variables  with
       respect to the function name while z is global.

       Functions  defined  with  the  name()  syntax, and functions
       invoked as an argument to the .  command,  share  everything
       other   than   positional   parameters   with   the  caller.
       Assignments that precede the call remain in effect after the
       function completes.

       Alias  and  function  names  are  not  passed  down to shell
       scripts or carried across separate invocations of ksh.   The
       $FPATH  variable gives a colon separated list of directories
       that is searched for function  definitions  when  trying  to
       resolve the command name.  Whenever a file name contained in
       $FPATH is found, the complete file is read and all functions
       contained within become defined.

       Calls that reference functions can be recursive.  Except for
       special  built-ins,  function  names  take  precedence  over
       built-in  names  and  names of programs when used as command
       names.  To write a replacement  function  that  invokes  the
       command  that  you  wish to replace, you can use the command
       built-in command.  The arguments to command are the name and
       arguments  of  the program you want to execute.  For example
       to write a cd  function  which  changes  the  directory  and
       prints out the directory name, you can write

            function  cd
            {
                 if      command cd  "$@"
                 then    print  -r -- "$PWD"
                 fi
            }


       The  FPATH  variable is a colon separated list that ksh uses
       to search for function definitions.  When ksh encounters  an
       autoload  function,  it  runs  the  .  command on the script
       containing the function, and then executes the function.

       For interactive shells, function  definitions  may  also  be
       placed  in  the ENV file.  However, this causes the shell to
       take longer to begin executing.

       3.13  Process Substitution

       This feature is only  available  on  versions  of  the  UNIX
       operating  system  which  support  the /dev/fd directory for
       naming open  files.   Each  command  argument  of  the  form
       <(list)  or  >(list)  will  run  process list asynchronously
       connected to some file in the /dev/fd directory.   The  name
       of  this  file  will become the argument to the command.  If
       the form with > is selected then writing on this  file  will
       provide  input for list.  If < is used, then the file passed
       as an argument will contain the output of the list  process.
       For example,

            paste  <(cut -f1 file1)  <(cut -f3 file2) | tee >(process1)  >(process2)

       extracts  fields  1  and  3  from  the files file1 and file2
       respectively, places the results side by side, and sends  it
       to  the  processes process1 and process2, as well as putting
       it onto the standard output.  Note that the  file  which  is
       passed  as  an  argument  to  the  command  is a UNIX system
       pipe(2) so that the programs that expect to lseek(2) on  the
       file will not work.
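
       A self-contained variant, with made-up data standing in for
       the files, can be sketched as follows.  It assumes a system
       on which process substitution is available.

```shell
# Sketch: each <(list) argument is replaced by the name of a file
# connected to the output of list.
merged=$(paste <(printf 'a\nb\n') <(printf '1\n2\n'))
printf '%s\n' "$merged"
```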

       3.14  Finding Commands

       The  addition  of aliases, functions, and more built-ins has
       made it substantially more difficult to know  what  a  given
       command name really means.

       Commands that begin with reserved words are an integral part
       of the  shell  language  itself  and  typically  define  the
       control  flow  of  the language.  Some control flow commands
       are not reserved words  in  the  language  but  are  special
       built-ins.    Special   built-ins  are  built-ins  that  are
       considered a part of the language rather than user definable
       commands.   The  best  examples  of  commands  that fit this
       description are break and continue.  Because  they  are  not
       reserved  words,  they can be the result of shell expansions
       and are not affected by quoting.  These  commands  have  the
       following special properties:

          o Assignments  that  precede  them  apply  to the current
            shell process, not just to the given command.

          o An error in the format of these commands causes a shell
            script or function that contains them to abort.

          o They cannot be overridden by shell functions.

       Other  commands  are  built-in  because  they  perform  side
       effects on the current  environment  that  would  be  nearly
       impossible to implement otherwise.  Built-ins such as cd and
       read are examples of such built-ins.   These  built-ins  are
       semantically  equivalent  to  commands that are not built-in
       except that they don't take a path search to locate.

       A third reason to have a command built-in is so that it will
       be  unaffected  by  the  setting  of the PATH variable.  The
       print  command fits this category.  Scripts that  use  print
       will be portable to all sites that run ksh.

       The  final  reason for having a command be a built-in is for
       performance.  On most systems it is more than  an  order  of
       magnitude faster to initiate a command that is built-in than
       to create a separate process to run the  command.   Examples
       that fit this category are test and pwd.

       Given  a  command  name  ksh decides what it means using the
       following order:

          o Reserved words define commands that form  part  of  the
            shell grammar.  They cannot be quoted.

          o Alias  substitutions occur first as part of the reading
            of commands.  Using quotes in  the  command  name  will
            prevent alias substitutions.

          o Special built-ins.

          o Functions.

          o Commands that are built-in that are not associated with
            a pathname such as cd and print.

          o If the command name contains a /, the program or script
            corresponding to the given name is executed.

          o A path search locates the pathname corresponding to the
            command.  If the pathname where it is found matches the
            pathname associated with a built-in command, the built-
            in command is executed.  If  the  directory  where  the
            command  is  found is listed in the FPATH variable, the
            file is read into the shell like a dot  script,  and  a
            function  by  that name is invoked.  Once a pathname is
            found, ksh  remembers  its  location  and  only  checks
            relative  directories in PATH the next time the command
            name is used.  Assigning a value to PATH causes ksh  to
            forget the location of all command names.

          o The  FPATH  variable  is  searched  and files found are
            treated as described above.

       The  first  argument  of  the  command  built-in,  described
       earlier,  skips  the  checks  for  reserved  words  and  for
       function definitions.  In all other  ways,  command  behaves
       like  a built-in that is not associated with a pathname.  As
       a result, if the first argument  of  command  is  a  special
       built-in,  the  special  properties  of this built-in do not
       apply.   For  example,  whereas  exec 3< foo  will  cause  a
       script   containing   it   to   abort  if  the  open  fails,
       command exec 3< foo results in a non-zero  exit  status  but
       does not abort the script.

       You can get a complete list of the special built-in commands
       with builtin -s.   In  addition, builtin  without  arguments
       gives  a list of the current built-ins and the pathname that
       they are associated  with.   A  built-in  can  be  bound  to
       another  pathname  by  giving the pathname for the built-in.
       The basename of this path must be the name  of  an  existing
       built-in  for  this  to succeed.  Specifying the name of the
       built-in without a pathname causes this built-in to be found
       before  a  path search.  A built-in can be deleted  with the
       -d option.

       On systems with run  time  loading  of  libraries,  built-in
       commands  can  be  added  with  the  builtin  command.  Each
       command that is to be  built-in  must  be  written  as  a  C
       function whose name is of the form b_name, where name is the
       name of the built-in that is to be added.  The function  has
       the  same  argument  calling  convention as main.  The lower
       eight bits of the return value become the  exit  status  for
       this   built-in.   Built-ins are  added  by  specifying  the
       pathname of the library as an argument to the -f  option  of
       builtin.

       The  built-in command, whence, when used with the -v option,
       tells how a given command is bound.  A line is  printed  for
       each  argument  to  whence telling what would happen if this
       argument were  used  as  a  command  name.   It  reports  on
       reserved  words,  aliases, built-ins, and functions.  If the
       command is none of the above, it  follows  the  path  search
       rules  and  prints  the full path-name, if any, otherwise it
       prints an error message.

       3.15  Symbolic Names

       To  avoid  implementation  dependencies,  ksh  accepts   and
       generates  symbolic  names  for built-ins that use numerical
       values in the Bourne shell.  The  -S  option  of  the  umask
       built-in  command accepts and displays default file creation
       permissions  symbolically.   It  uses  the   same   symbolic
       notation as the chmod command.

       The  trap and kill built-in commands allow  the signal names
       to be given symbolically.  The names of  signals  and  traps
       corresponding  to  signals  are  the same as the signal name
       with the SIG prefix removed.  The trap 0 is named EXIT.
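
       These symbolic forms can be sketched as follows.  The mask
       value and the signal number are arbitrary choices, and each
       command runs in a subshell so that the caller's umask and
       traps are left alone.

```shell
# Sketch: symbolic names in umask, kill and trap.
mask=$(umask 022; umask -S)     # the mask in chmod notation
signame=$(kill -l 15)           # the name for signal number 15

# The trap on EXIT (trap 0) runs when the subshell finishes.
log=$(trap 'printf "cleanup\n"' EXIT; printf 'working\n')

printf '%s\n%s\n%s\n' "$mask" "$signame" "$log"
```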

       3.16  Additional Variables

       In addition to the  variables  discussed  earlier,  ksh  has
       other  variables  that  it  handles specially.  The variable
       RANDOM produces a random number in the range 0 to 32767 each
       time it is referenced.  Assignment to this variable sets the
       seed for the random number generator.
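
       As a brief sketch, assigning the same seed twice reproduces
       the same value.  The seed 42 and the variable names are
       arbitrary.

```shell
# Sketch: seeding RANDOM makes the sequence repeatable.
RANDOM=42
first=$RANDOM
RANDOM=42
again=$RANDOM

roll=$(( RANDOM % 6 + 1 ))      # a number from 1 to 6

[[ $first == "$again" ]] && printf 'same seed, same value\n'
```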

       The parameter PPID expands to the  process  id  of  the
       process which invoked this shell.

       3.17  Added Traps

       A  new  trap named ERR has been added.  This trap is invoked
       whenever the shell would exit if the  -e  option  were  set.
       This  trap  is used by Fourth Generation Make[16] which runs
       ksh as a co-process.

       A trap named DEBUG gets executed after each  command.   This
       trap can be used for debugging and other purposes.

       The KEYBD trap was described earlier.

       3.18  Debugging

       The  primary method for debugging Bourne shell scripts is to
       use the -x option to enable the execution trace.  After  all
       the  expansions have been performed, but before each command
       is executed, the trace writes to standard error the name and
       arguments  of each command preceded by a +.  While the trace
       is very useful, there is no way to find  out  what  line  of
       source  a given trace line corresponds to.  With ksh the PS4
       variable  is  evaluated  for  parameter  expansion  and   is
       displayed before each command, instead of the +.

       The  LINENO  variable  is  set  to  the  current line number
       relative to the beginning of the current script or function.
       It is most useful as part of the PS4 prompt.
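
       One possible PS4 setting is sketched below.  The prompt
       text and the function name are hypothetical; the single
       quotes delay the expansion of LINENO until each trace line
       is written.

```shell
# Sketch: line numbers in the execution trace.
function trace_demo
{
        PS4='+ line $LINENO: '
        set -x
        printf 'hello\n'
        set +x
}

# Run in a subshell and keep only the trace from standard error.
trace=$( (trace_demo) 2>&1 > /dev/null )
printf '%s\n' "$trace"
```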

       The  DEBUG  trap  can  be  used to write a break point shell
       debugger  in  ksh.   An  example  of  such  a  debugger   is
       kshdb.[17]

       3.19  Timing Commands

       Finding  the  time  it  takes to execute commands has been a
       serious problem with  the  Bourne  shell.   Since  the  time
       command  is  not  part  of  the language, it is necessary to
       write a script in order to time a for or  while  loop.   The
       extra  time  in invoking the shell and processing the script
       is accumulated along with the time to execute the script.

       More seriously, the Bourne shell does not give correct times
       for  pipelines.   The  reason for this is that the times for
       some members of a pipeline are not  counted  when  computing
       the time.  As an extreme example, running time on the script

            cat < /dev/null | sort -u bigfile | wc

       with the Bourne shell will show very little user and  system
       time no matter how large bigfile is.

       To  correct  these  problems,  a reserved word time has been
       added to replace the time command.  Any function, command or
       pipeline  can  be  preceded  by this reserved word to obtain
       information about  the  elapsed,  user,  and  system  times.
       Since  I/O  redirections  bind  to the command, not to time,
       parentheses  should  be  used   to   redirect   the   timing
       information  which is normally printed on file descriptor 2.
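
       As a sketch, the report for a whole pipeline can be captured
       like this.  The timed pipeline is a stand-in for real work,
       and the layout of the report depends on the shell.

```shell
# Sketch: time covers the entire pipeline; the parentheses let the
# report, written to file descriptor 2, be redirected.
report=$( ( time printf 'b\na\n' | sort > /dev/null ) 2>&1 )
printf '%s\n' "$report"
```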


       4.  SECURITY

       There are several documented problems  associated  with  the
       security  of  shell  procedures[18].   These  security holes
       occur  primarily  because  a   user   can   manipulate   the
       environment   to  subvert  the  intent  of  a  setuid  shell
       procedure.  Sometimes, shell procedures are  initiated  from
       binary  programs, without the author's awareness, by library
       routines which invoke shells to carry out their tasks.  When
       the  binary  program  is run setuid then the shell procedure
       runs with the permissions  afforded  to  the  owner  of  the
       binary file.

       In the Bourne shell, the IFS parameter is used to split each
       word into separate command arguments.  If a user knows  that
       some  setuid  program  will run sh -c /bin/pwd (or any other
       command in /bin) then  the  user  sets  and  exports  IFS=/.
       Instead  of running /bin/pwd the shell will run bin with pwd
       as an argument.  The user puts his or her  own  bin  program
       into  the current directory.  This program can create a copy
       of the shell, make this  shell  setuid,  and  then  run  the
       /bin/pwd  program  so that the original program continues to
       run successfully.  This kind of penetration is not  possible
       with  ksh since the IFS parameter only splits arguments that
       result from command or parameter substitution.
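
       The difference can be observed with the sketch below, in
       which count_args is a hypothetical helper that reports how
       many arguments it received.

```shell
# Sketch: IFS splits the results of expansions, not literal words.
function count_args
{
        printf '%s\n' "$#"
}

saved_ifs=$IFS
IFS=/
literal=$(count_args /bin/pwd)  # a literal word is not split
target=/bin/pwd
expanded=$(count_args $target)  # splits into "", bin and pwd
IFS=$saved_ifs

printf '%s %s\n' "$literal" "$expanded"
```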

       Some setuid programs run  programs  using  system()  without
       giving  the  full  pathname.   If  the  user  sets  the PATH
       variable so that the desired command will be found in his or
       her  local  bin, then the same technique described above can
       be employed to compromise the security of  the  system.   To
       close  up  this  and  other  security  holes, ksh resets the
       effective user id to the real  user  id  and  the  effective
       group  id  to the real group id unless the privileged option
       (-p)  is  specified  at  invocation.   In  this  mode,   the
       privileged   mode,  the  .profile  and  ENV  files  are  not
       processed.  Instead, the file /etc/suid_profile is read  and
       executed.   This  gives  an  administrator  control over the
       environment to set the PATH variable or to log setuid  shell
       invocations.  Clearly, security of the system is compromised
       if /etc or this file is publicly writable.

       Some versions of the UNIX  operating  system  look  for  the
       characters  #!  as the first two characters of an executable
       file.  If these characters are found, then the next word  on
       this  line  is  taken  as the interpreter to invoke for this
       command and the interpreter is execed with the name  of  the
       script  as argument zero and argument one.  If the setuid or
       setgid bits are on for this file, then  the  interpreter  is
       run with the effective uid and/or gid set accordingly.  This
       scheme has three major drawbacks.  First of all, putting the
       pathname of the interpreter into the script makes the script
       less portable since the interpreter may be  installed  in  a
       different directory on another system.  Secondly, the #!
       notation forces an exec of the interpreter even when the
       script is invoked from the very interpreter that it names.
       This is inefficient, since ksh can handle a failed exec much
       faster than it can start up again.  More importantly, setuid and
       setgid procedures provide an easy target for intrusion.   By
       linking  a  setuid  or  setgid procedure to a name beginning
       with a - the interpreter is fooled into thinking that it  is
       being  invoked  with  a  command line option rather than the
       name of a file.  When the interpreter is the shell, the user
       gets  a  privileged interactive shell.  There is code in ksh
       to guard against this simple form of intrusion.
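
       Script authors face the same hazard whenever a file name
       supplied by a user begins with a -.  The following sketch
       shows one common defense.  It is not the code used inside
       ksh, and the function name safe_path is hypothetical.

```shell
# safe_path prints its argument in a form that cannot be
# mistaken for a command line option.
safe_path() {
    case $1 in
    -*) printf '%s\n' "./$1" ;;   # anchor the name to the current directory
    *)  printf '%s\n' "$1" ;;     # other names are passed through unchanged
    esac
}

safe_path -i          # prints ./-i
safe_path script.sh   # prints script.sh
```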

       A more reliable way to handle setuid and  setgid  procedures
       is  provided  with  ksh.  The technique does not require any
       changes  to  the  operating  system  and   provides   better
       security.   Another advantage to this method is that it also
       allows scripts which have execute  permission  but  no  read
       permission  to  run.   Taking  away  read  permission  makes
       scripts more secure.

       The method relies on a setuid root program  to  authenticate
       the request and exec the shell with the correct mode bits to
       carry  out  the  task.   This  shell  is  invoked  with  the
       requested  file  already  open  for reading.  A script which
       cannot be opened for reading or which has its setuid  and/or
       setgid bits turned on causes this setuid root program to get
       execed.  For security reasons, this  program  is  given  the
       full   pathname   /etc/suid_exec.    A  description  of  the
       implementation of the /etc/suid_exec program can be found in
       a separate paper[19].
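
       The condition that causes the setuid root program to be used
       can be approximated from a script with the test operators
       -r, -u, and -g.  This is a sketch of the decision described
       above, not the ksh implementation, and the function name
       needs_suid_exec is hypothetical.

```shell
# needs_suid_exec succeeds when the named file cannot be opened
# for reading or has its setuid or setgid bit turned on.
needs_suid_exec() {
    [ ! -r "$1" ] || [ -u "$1" ] || [ -g "$1" ]
}

# Report on each file named on the command line, if any.
for f in "$@"
do
    if needs_suid_exec "$f"
    then
        printf '%s: would be run through the setuid root program\n' "$f"
    else
        printf '%s: can be read and run directly\n' "$f"
    fi
done
```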


       5.  CODE CHANGES

       ksh  is  written  in ANSI-C as a reusable library.  The code
       can be compiled with C++ and older K&R C as well.  The  code
       uses  the  IEEE  POSIX  1003.1  and  ISO 9945-1 standard[20]
       wherever possible so that ksh should be able to run  on  any
       POSIX  compliant  system.   In  addition,  it is possible to
       compile ksh for older systems.

       Unlike earlier versions of the Bourne shell, ksh treats
       eight bit characters transparently without stripping off the
       leading bit.  There is also a compile time switch to enable
       handling of multi-byte and multi-width character sets.

       On systems with dynamic libraries, it is possible to add
       built-in commands at run time with the built-in command
       builtin described earlier.  It is also possible to embed ksh
       in applications in a manner analogous to Tcl.


       6.  EXAMPLE

       An example of a ksh script  is  included  in  the  Appendix.
       This  one  page  program  is  a  variant  of the UNIX system
       grep(1) program.  Pattern matching for this version of  grep
       means shell patterns.

       The  first  half uses the getopts command to find the option
       flags.  Nearly  all  options  have  been  implemented.   The
       second  half goes through each line of each file to look for
       a pattern match.
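
       The overall shape of such a script can be sketched as
       follows.  This is not the Appendix program.  It is a minimal
       illustration that handles only three options (-c, -n, and
       -v, chosen here as assumptions) and requires at least one
       file argument.  The function name shgrep is hypothetical.

```shell
# shgrep [-cnv] pattern file ...
# A line is selected when the whole line matches the shell pattern.
shgrep() {
    cflag= nflag= want=yes
    OPTIND=1
    # First half: use getopts to find the option flags.
    while getopts cnv opt
    do
        case $opt in
        c) cflag=1 ;;       # print only a count of selected lines
        n) nflag=1 ;;       # prefix each line with its line number
        v) want=no ;;       # select lines that do not match
        *) return 2 ;;      # getopts has already printed a message
        esac
    done
    shift $((OPTIND - 1))
    if [ $# -lt 2 ]
    then
        echo "usage: shgrep [-cnv] pattern file ..." >&2
        return 2
    fi
    pattern=$1
    shift
    # Second half: go through each line of each file.
    count=0
    for file
    do
        lineno=0
        while IFS= read -r line
        do
            lineno=$((lineno + 1))
            case $line in
            $pattern) matched=yes ;;
            *)        matched=no ;;
            esac
            [ "$matched" = "$want" ] || continue
            count=$((count + 1))
            [ -n "$cflag" ] || printf '%s\n' "${nflag:+$lineno:}$line"
        done < "$file"
    done
    [ -z "$cflag" ] || echo "$count"
    [ "$count" -gt 0 ]      # status 0 only if some line was selected
}
```

       Because the pattern must match the entire line, a search for
       a substring needs a * on each side.  For example, with a
       hypothetical file named notes, shgrep -n '*Korn*' notes
       prints each line containing Korn, prefixed by its line
       number.  No auxiliary process is created, since getopts,
       read, case, and printf are all handled by the shell.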

       This program is not intended to serve as a replacement for
       grep, which has been highly tuned for performance.  It does,
       however, illustrate the programming power of ksh.  Note that no
       auxiliary  processes  are  spawned  by  this script.  It was
       written and debugged in under two hours.  While  performance
       is acceptable for small files, this program runs at only one
       tenth the speed of grep for large files.


       7.  PERFORMANCE

       ksh executes many scripts faster than the  System  V  Bourne
       shell;  in  some  cases  more  than  10  times as fast.  The
       primary reason for this is that ksh creates fewer processes.
       Executing a built-in command or a function is one or two
       orders of magnitude faster than performing a fork() and
       exec() to create a separate process.  Command
       substitution and commands inside parentheses  are  performed
       without   creating  another  process,  unless  necessary  to
       preserve correct behavior.

       Another reason for improved performance is the use of the
       sfio[21] library for I/O.  The sfio library buffers all I/O
       and buffers are flushed only when required.  The  algorithms
       used  in  sfio  perform  better than traditional versions of
       standard I/O so that programs that spend most of their  time
       formatting  output may actually perform better than versions
       written in C.

       Several of the internal algorithms have been changed so that
       the  number  of  subroutine  calls  has  been  substantially
       reduced.  ksh uses variable sized hash tables for variables.
       Scripts  that  rely heavily on referencing variables execute
       faster.  More processing  is  performed  while  reading  the
       script  so that execution time is saved while running loops.
       These changes are not noticeable for scripts that fork() and
       run  processes,  but  they  reduce the time that it takes to
       interpret commands by more than a factor of two.

       Most importantly, ksh provides mechanisms to write
       applications that do not require as many processes.  The
       arithmetic provided by the shell eliminates the need for the
       expr   command.    The   pattern   matching   and  substring
       capabilities eliminate the need to use sed or awk to process
       strings.
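
       A brief sketch of these facilities follows.  The file name
       and the numbers are hypothetical values chosen for
       illustration.  Each step below would otherwise call for a
       separate process such as expr, basename, dirname, or sed.

```shell
file=/home/user/report.txt    # hypothetical pathname
n=7

# Arithmetic without expr: evaluated by the shell itself.
total=$((n * 6 + 1))

# Substring operations without basename, dirname, or sed.
base=${file##*/}      # remove the longest prefix matching */
dir=${file%/*}        # remove the shortest suffix matching /*
stem=${base%.txt}     # remove a trailing .txt

# Pattern matching without grep or awk: case uses shell patterns.
case $base in
*.txt) kind=text ;;
*)     kind=other ;;
esac

echo "$total $base $dir $stem $kind"
# prints: 43 report.txt /home/user report text
```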

       The  architecture  of  ksh  makes  it  easy to make commands
       built-ins without changing the semantics  at  all.   Systems
       that  have  run-time binding of libraries allow applications
       to be sped up by supplying the critical  programs  as  shell
       built-in commands.  Implementations on other systems can add
       built-in commands at compile time.  The procedure for
       writing built-in commands that can be loaded at run time is
       described in a separate document[22].


       8.  CONCLUSION

       The 1988 version of ksh has tens  of  thousands  of  regular
       users  and  is  a suitable replacement for the Bourne shell.
       The 1993 version of ksh is essentially upward compatible
       with both the 1988 version of ksh and the recent IEEE
       POSIX and ISO shell standard.  The 1993 version offers many
       advantages  for  programming  applications,  and it has been
       rewritten so that it can be used in  embedded  applications.
       It also offers improved performance.



       MH-11267-DGK-dgk              David G. Korn


                                 APPENDIX


                                REFERENCES

         7. S.  R.  Bourne, An Introduction to the UNIX Shell, Bell
            System Technical Journal, Vol. 57, No. 6, Part  2,  pp.
            1947-1972, July 1978.

         8. W. Joy, An Introduction to the C Shell, Unix
            Programmer's Manual, Berkeley Software Distribution,
            University of California, Berkeley, 1980.

         9. Morris Bolsky and David Korn, The KornShell Command and
            Programming Language, Prentice Hall, 1989.

        10. Jason Levitt, The Korn  Shell:  An  Emerging  Standard,
            UNIX/World, pp. 74-81, September 1986.

        11. Rich Bilancia, Proficiency and Power are Yours With the
            Korn Shell, UNIX/World, pp. 103-107, September 1987.

        12. John Sebes, Comparing UNIX Shells, UNIX Papers,  Edited
            by the Waite Group, Howard W. Sams & Co., 1987.

        13. T.  A.  Dolotta  and J. R. Mashey, Using the shell as a
            Primary Programming Tool,  Proc.  2nd.  Int.  Conf.  on
            Software Engineering, 1976, pages 169-176.

        14. J. S. Pendergrast, WKSH - Korn Shell with X-Windows
            Support, USL, 1991.

        15. American National Standard for  Information  Systems  -
            Programming Language - C, ANSI X3.159-1989.

        16. G.  S.  Fowler, The Fourth Generation Make, Proceedings
            of the Portland USENIX meeting, pp. 159-174, 1985.

        17. Bill Rosenblatt, Debugging Shell  Scripts  with  kshdb,
            Unix World, Volume X, No. 5, 1993.

        18. F.  T.  Grampp  and R. H. Morris, UNIX Operating System
            Security, AT&T Bell Labs Tech. Journal, Vol. 63, No. 8,
            Part 2, pp. 1649-1671, 1984.

        19. D. G. Korn, Parlez-vous Kanji?, TM-59554-860602-03, 1986.

        20. POSIX - Part 1: System Application  Program  Interface,
            IEEE Std 1003.1-1990, ISO/IEC 9945-1:1990.

        21. David  Korn  and  Kiem-Phong  Vo,  SFIO  -  A Safe/Fast
            String/File I/O, Proceedings of the Summer Usenix,  pp.
            235-255, 1991.

        22. David Korn, Guidelines for writing ksh-93 built-in
            commands, to be published, 1994.