ACC SHELL

Path : /usr/share/doc/packages/ksh/
File Upload :
Current File : //usr/share/doc/packages/ksh/Builtins

       





             Guidelines for writing ksh-93 built-in commands
                              David G. Korn







       1.  INTRODUCTION

       A  built-in  command is executed without creating a separate
       process.  Instead, the command is invoked as a C function by
       ksh.   If  this  function  has  no side effects in the shell
       process, then the behavior of this built-in is identical  to
       that  of  the  equivalent  stand-alone command.  The primary
       difference in this case is  performance.   The  overhead  of
       process  creation  is  eliminated.   For  commands  of short
       duration, the effect can be dramatic.  For example,  on  SUN
       OS  4.1,  the  time  do run wc on a small file of about 1000
       bytes, runs about 50 times faster as a built-in command.

       In addition, built-in commands that have side effects on the
       shell  environment  can be written.  This is usually done to
       extend the application domain for  shell  programming.   For
       example,  an X-windows extension that makes heavy use of the
       shell variable namespace was added as a group  of  built-ins
       commands  that  are  added  at  run  time.   The result is a
       windowing  shell  that  can  be  used  to  write   X-windows
       applications.

       While  there  are  definite  advantages  to  adding built-in
       commands, there are some disadvantages as well.   Since  the
       built-in  command  and  ksh  share the same address space, a
       coding error in the built-in program may affect the behavior
       of  ksh; perhaps causing it to core dump or hang.  Debugging
       is also more complex since your code is  now  a  part  of  a
       larger entity.  The isolation provided by a separate process
       guarantees that all resources used by the  command  will  be
       freed  when  the command completes.  Also, since the address
       space of ksh will be larger, this may increase the  time  it
       takes  ksh  to  fork() and exec() a non-builtin command.  It
       makes no sense to add a built-in command that takes  a  long
       time  to run or that is run only once, since the performance
       benefits will  be  negligible.   Built-ins  that  have  side
       effects   in   the   current   shell  environment  have  the
       disadvantage of increasing the coupling between the built-in
       and  ksh  making  the  overall  system less modular and more
       monolithic.

       Despite these drawbacks, in  many  cases  extending  ksh  by
       adding built-in commands makes sense and allows reuse of the
       shell scripting ability in an application  specific  domain.
       This memo describes how to write ksh extensions.








       2.  WRITING BUILT-IN COMMANDS

       There  is a development kit available for writing ksh built-
       ins.  The development kit has  three  directories,  include,
       lib,  and  bin.   The  include  directory  contains  a  sub-
       directory named ast that contains interface  prototypes  for
       functions  that  you  can  call  from  built-ins.   The  lib
       directory contains the ast  library1  and  a  library  named
       libcmd  that  contains  a version of several of the standard
       POSIX[1] utilities that can be made run time built-ins.   It
       is  best  to  set  the  value  of  the  environment variable
       PACKAGE_ast to the pathname of the directory containing  the
       development  kit.  Users of nmake[2] 2.3 and above will then
       be able to use the rule
            :PACKAGE:       ast
       in their makefiles and not have to specify any  -I  switches
       to the compiler.

       A  built-in  command has a calling convention similar to the
       main function of a program,
            int main(int argc, char *argv[]).
       However, instead of main, you must  use  the  function  name
       b_name,  where  name is the name of the built-in you wish to
       define.  The built-in function takes a third void*  argument
       which  you can define as NULL.  Instead of exit, you need to
       use return to terminate your  command.   The  return  value,
       will become the exit status of the command.

       The  steps  necessary  to create and add a run time built-in
       are illustrated in the following simple  example.   Suppose,
       you  wish  to  add  a  built-in  command  named  hello which
       requires one argument and prints the word hello followed  by
       its  argument.   First,  write  the following program in the
       file hello.c:
            #include     <stdio.h>
            int b_hello(int argc, char *argv[], void *context)
            {
                    if(argc != 2)
                    {
                            fprintf(stderr,"Usage: hello arg\n");
                            return(2);
                    }
                    printf("hello %s\n",argv[1]);
                    return(0);
            }

       Next, the program needs to be compiled.  On some systems  it
       is  necessary  to  specify  a  compiler  option  to  produce
       position independent code for dynamic linking.   If  you  do
       not  compile  with  nmake  it  is important to specify the a
       special include directory when compiling built-ins.
            cc -pic -I$PACKAGE_ast/include -c hello.c
       since the special version of <stdio.h>  in  the  development
       kit  is  required.   This  command  generates hello.o in the
       current directory.


       ____________________

       1. ast stands for Advanced Software Technology




       On some systems, you cannot load hello.o directly, you  must
       build  a  shared library instead.  Unfortunately, the method
       for generating  a  shared  library  differs  with  operating
       system.   However,  if  you  are building with the ATT nmake
       program you can use the :LIBRARY: rule to specify this in  a
       system  independent  fashion.   In  addition,  if  you  have
       several built-ins, it is desirable to build a shared library
       that contains them all.

       The final step is using the built-in.  This can be done with
       the  ksh  command  builtin.   To  load  the  shared  library
       hello.so and to add the built-in hello, invoke the command,
            builtin -f hello hello
       The  suffix  for  the shared library can be omitted in which
       case the shell will add an appropriate suffix for the system
       that  it  is  loading  from.   Once  this  command  has been
       invoked, you can invoke hello as you do any other command.

       It is often desirable to make a command built-in  the  first
       time  that  it  is  referenced.   The  first  time  hello is
       invoked,  ksh  should  load  and  execute  it,  whereas  for
       subsequent invocations ksh should just execute the built-in.
       This can be done by creating a file  named  hello  with  the
       following contents:
            function hello
            {
                    unset -f hello
                    builtin -f hello hello
                    hello "$@"
            }
       This file hello needs to be placed in a directory that is in
       your FPATH variable.  In addition,  the  full  pathname  for
       hello.so  should be used in this script so that the run time
       loader will be able to find this shared  library  no  matter
       where the command hello is invoked.


       3.  CODING REQUIREMENTS AND CONVENTIONS

       As mentioned above, the entry point for built-ins must be of
       the form b_name.  Your built-ins can call functions from the
       standard  C  library,  the  ast library, interface functions
       provided by ksh, and your own functions.  You  should  avoid
       using  any  global  symbols beginning with sh_, nv_, and ed_
       since these are used by ksh itself.   In  addition,  #define
       constants in ksh interface files, use symbols beginning with
       SH_ to that you should avoid using names beginning with SH_.

       3.1  Header Files

       The  development  kit provides a portable interface to the C
       library and to libast.  The header files in the  development
       kit are compatible with K&R C[3], ANSI-C[4], and C++[5].

       The  best  thing  to  do  is  to  include  the  header  file
       <shell.h>.  This header file causes the <ast.h> header,  the
       <error.h>  header  and the <stak.h> header to be included as
       well as defining prototypes for functions that you can  call
       to  get  shell  services for your builtins.  The header file
       <ast.h> provides prototypes for many  libast  functions  and
       all  the  symbol  and  function  definitions from the ANSI-C





       headers, <stddef.h>, <stdlib.h>, <stdarg.h>, <limits.h>, and
       <string.h>.    It   also   provides   all  the  symbols  and
       definitions  for   the   POSIX[6]   headers   <sys/types.h>,
       <fcntl.h>,  and  <unistd.h>.   You  should  include  <ast.h>
       instead of one or more  of  these  headers.   The  <error.h>
       header  provides  the  interface  to  the  error  and option
       parsing  routines  defined  below.   The   <stak.h>   header
       provides  the  interface  to  the memory allocation routines
       described below.

       Programs that want to use the  information  in  <sys/stat.h>
       should  include  the file <ls.h> instead.  This provides the
       complete POSIX interface to stat() related functions even on
       non-POSIX systems.

       3.2  Input/Output

       ksh  uses sfio, the Safe/Fast I/O library[7], to perform all
       I/O operations.  The sfio library, which is part of  libast,
       provides  a  superset  of  the functionality provided by the
       standard I/O library defined in  ANSI-C.   If  none  of  the
       additional  functionality  is  required,  and if you are not
       familiar with sfio and you do not want  to  spend  the  time
       learning  it,  then  you  can use sfio via the stdio library
       interface.   The  development  kit   contains   the   header
       <stdio.h>  which  maps  stdio  calls to sfio calls.  In most
       instances the mapping is done by macros or inline  functions
       so  that  there  is  no overhead.  The man page for the sfio
       library is in an Appendix.

       However, there are some very nice extensions and performance
       improvements  in sfio and if you plan any major extensions I
       recommend that you use it natively.

       3.3  Error Handling

       For error messages  it  is  best  to  use  the  ast  library
       function  errormsg() rather that sending output to stderr or
       the equivalent sfstderr  directly.   Using  errormsg()  will
       make   error  message  appear  more  uniform  to  the  user.
       Furthermore, using errormsg() should make it  easier  to  do
       error  message  translation  for  other  locales  in  future
       versions of ksh.

       The first argument to errormsg() specifies the dictionary in
       which  the  string  will  be  searched for translation.  The
       second argument to errormsg() contains that error  type  and
       value.   The third argument is a printf style format and the
       remaining arguments are arguments to be printed as  part  of
       the  message.   A  new-line  is  inserted at the end of each
       message and therefore, should not  appear  as  part  of  the
       format  string.   The  second  argument should be one of the
       following:

       ERROR_exit(n): If n is not-zero, the builtin will exit value
            n after printing the message.

       ERROR_system(n): Exit   builtin  with  exit  value  n  after
            printing the message.  The  message  will  display  the
            message  corresponding  to errno enclosed within [ ] at
            the end of the message.






       ERROR_usage(n): Will generate a usage message and exit.   If
            n is non-zero, the exit value will be 2.  Otherwise the
            exit value will be 0.

       ERROR_debug(n): Will print a level n debugging  message  and
            will then continue.

       ERROR_warn(n): Prints a warning message. n is ignored.

       3.4  Option Parsing

       The  first  thing  that a built-in should do is to check the
       arguments for correctness and to print any usage messages on
       standard error.  For consistency with the rest of ksh, it is
       best to use the libast functions optget() and  optusage()for
       this  purpose.  The header <error.h> included prototypes for
       these functions.  The optget() function is  similar  to  the
       System  V  C  library  function  getopt(), but provides some
       additional  capabilities.   Built-ins  that   use   optget()
       provide a more consistent user interface.

       The optget() function is invoked as
            int optget(char *argv[], const char *optstring)
       where  argv  is  the argument list and optstring is a string
       that  specifies  the  allowable  arguments  and   additional
       information  that is used to format usage messages.  In fact
       a complete man page in troff or html  can  be  generated  by
       passing  a usage string as described by the getopts command.
       Like getopt(), single letter options are represented by  the
       letter  itself,  and options that take a string argument are
       followed by  the  :  character.   Option  strings  have  the
       following special characters:

       :    Used  after a letter option to indicate that the option
            takes an option argument.   The  variable  opt_info.arg
            will  point  to  this value after the given argument is
            encountered.

       #    Used after a letter option to indicate that the  option
            can   only   take  a  numerical  value.   The  variable
            opt_info.num will contain this value  after  the  given
            argument is encountered.

       ?    Used  after  a  :  or  # (and after the optional ?)  to
            indicate the  the  preceding  option  argument  is  not
            required.

       [...] After  a  :  or #, the characters contained inside the
            brackets are used to identify the option argument  when
            generating a usage message.

       space The  remainder  of  the  string will only be used when
            generating usage messages.

       The optget() function returns the matching option letter  if
       one  of  the  legal  option is matched.  Otherwise, optget()
       returns

       ':'  If there is  an  error.   In  this  case  the  variable
            opt_info.arg contains the error string.






       0    Indicates   the   end   of   options.    The   variable
            opt_info.index  contains  the   number   of   arguments
            processed.

       '?'  A  usage  message has been required.  You normally call
            optusage() to generate and display the usage message.

       The following is an example of the option parsing portion of
       the wc utility.
            #include <shell.h>
            while(1) switch(n=optget(argv,"xf:[file]"))
            {
                    case 'f':
                            file = opt_info.arg;
                            break;
                    case ':':
                            error(ERROR_exit(0), opt_info.arg);
                            break;
                    case '?':
                            error(ERROR_usage(2), opt_info.arg);
                            break;
            }

       3.5  Storage Management

       It  is  important  that  any memory used by your built-in be
       returned.  Otherwise, if your built-in is called frequently,
       ksh  will  eventually  run  out of memory.  You should avoid
       using  malloc()  for  memory  that  must  be  freed   before
       returning  from  you  built-in, because by default, ksh will
       terminate you built-in in the event of an interrupt and  the
       memory will not be freed.

       The  best  way  to  to  allocate  variable  sized storage is
       through calls to the  stak  library  which  is  included  in
       libast and which is used extensively by ksh itself.  Objects
       allocated with the stakalloc() function are freed  when  you
       function  completes  or aborts.  The stak library provides a
       convenient way to build variable length  strings  and  other
       objects  dynamically.   The man page for the stak library is
       contained in the Appendix.

       Before ksh calls each built-in command, it saves the current
       stack  location and restores it after it returns.  It is not
       necessary to save and restore the stack location in  the  b_
       entry function, but you may want to write functions that use
       this stack are restore it when leaving  the  function.   The
       following  coding  convention  will  do this in an efficient
       manner:
            yourfunction()
            {
                    char    *savebase;
                    int     saveoffset;
                    if(saveoffset=staktell())
                            savebase = stakfreeze(0);
                    ...
                    if(saveoffset)
                            stakset(savebase,saveoffset);
                    else
                            stakseek(0);





            }


       4.  CALLING ksh SERVICES

       Some of the more interesting  applications  are  those  that
       extend  the  functionality  of  ksh  in application specific
       directions.  A  prime  example  of  this  is  the  X-windows
       extension  which adds builtins to create and delete widgets.
       The nval library is used to interface with  the  shell  name
       space.   The  shell  library  is  used to access other shell
       services.

       4.1  The nval library

       A great deal of power is derived from  the  ability  to  use
       portions  of  the  hierarchal variable namespace provided by
       ksh-93 and turn these names into active objects.

       The nval library is used to interface with shell  variables.
       A  man  page  for this file is provided in an Appendix.  You
       need to include the header <nval.h> to access the  functions
       defined  in the nval library.  All the functions provided by
       the nval library begin with  the  prefix  nv_.   Each  shell
       variable  is  an  object  in  an  associative  table that is
       referenced by name.  The type  Namval_t*  is  pointer  to  a
       shell  variable.   To operate on a shell variable, you first
       get a handle to the variable with the nv_open() function and
       then supply the handle returned as the first argument of the
       function that provides an operation on  the  variable.   You
       must call nv_close() when you are finished using this handle
       so that the space can be freed once the value is unset.  The
       two  most  frequent  operations  are to get the value of the
       variable,  and  to  assign  value  to  the  variable.    The
       nv_getval() returns a pointer the the value of the variable.
       In some cases the pointer returned is to a region that  will
       be  overwritten  by the next nv_getval() call so that if the
       value isn't used immediately, it  should  be  copied.   Many
       variables   can   also   generate   a  numeric  value.   The
       nv_getnum() function returns a numeric value for  the  given
       variable   pointer,  calling  the  arithmetic  evaluator  if
       necessary.

       The nv_putval() function is used to assign a new value to  a
       given  variable.   The  second  argument  to putval() is the
       value to be assigned and the third argument is a flag  which
       is used in interpreting the second argument.

       Each  shell  variable  can have one or more attributes.  The
       nv_isattr() is used to test for the existence of one or more
       attributes.   See  the  appendix  for  a  complete  list  of
       attributes.

       By default, each shell variable passively stores the  string
       you  give  with with nv_putval(), and returns the value with
       getval().  However, it is possible to turn any node into  an
       active  entity  by  assigning  functions  to it that will be
       called whenever nv_putval() and/or  nv_getval()  is  called.
       In  fact  there are up to five functions that can associated
       with each variable to override  the  default  actions.   The
       type Namfun_t is used to define these functions.  Only those





       that are non-NULL override the default actions.  To override
       the  default  actions,  you  must  allocate  an  instance of
       Namfun_t, and then assign the functions  that  you  wish  to
       override.    The   putval()   function   is  called  by  the
       nv_putval()  function.   A  NULL  for  the  value   argument
       indicates  a  request  to  unset  the  variable.   The  type
       argument might contain the NV_INTEGER bit so you  should  be
       prepared  to  do  a  conversion  if necessary.  The getval()
       function is called by nv_getval() value and  must  return  a
       string.    The   getnum()  function  is  called  by  by  the
       arithmetic evaluator and must return  double.   If  omitted,
       then  it  will  call nv_getval() and convert the result to a
       number.

       The functionality of a variable can further be increased  by
       adding  discipline functions that can be associated with the
       variable.  A discipline function allows a script  that  uses
       your   variable   to   define   functions   whose   name  is
       varname.discname where varname is the name of the  variable,
       and  discname  is the name of the discipline.  When the user
       defines such a function,  the  settrap()  function  will  be
       called  with the name of the discipline and a pointer to the
       parse tree corresponding to the  discipline  function.   The
       application  determines  when  these  functions are actually
       executed.  By default, ksh defines get, set,  and  unset  as
       discipline functions.

       In addition, it is possible to provide a data area that will
       be passed as an argument to each of these functions whenever
       any  of  these  functions are called.  To have private data,
       you need to define and allocate a structure that looks like
            struct yours
            {
                    Namfun_t        fun;
                    your_data_fields;
            };

       4.2  The shell library

       There are several functions that are used by ksh itself that
       can also be called from built-in commands.  The man page for
       these routines are in the Appendix.

       The sh_addbuiltin() function can be used to  add  or  delete
       builtin  commands.   It  takes the name of the built-in, the
       address of the function that implements the built-in, and  a
       void*  pointer  that  will be passed to this function as the
       third agument whenever  it  is  invoked.   If  the  function
       address  is  NULL,  the  specified built-in will be deleted.
       However, special built-in functions  cannot  be  deleted  or
       modified.

       The  sh_fmtq()  function takes a string and returns a string
       that is quoted as necessary so that it can be used as  shell
       input.   This function is used to implement the %q option of
       the shell built-in printf command.

       The sh_parse() function returns a parse  tree  corresponding
       to  a  give  file  stream.   The  tree  can  be  executed by
       supplying it as the first argument to the sh_trap() function
       and   giving   a   value   of  1  as  the  second  argument.





       Alternatively, the sh_trap() function can parse and  execute
       a  string  by  passing  the string as the first argument and
       giving 0 as the second argument.

       The sh_isoption() function can be used to set to see whether
       one or more of the option settings is enabled.




























































                                REFERENCES

         1. POSIX   -   Part  2:  Shell  and  Utilities,  IEEE  Std
            1003.2-1992, ISO/IEC 9945-2:1993.

         2. Glenn Fowler, Nmake reference needed

         3. Brian W. Kernighan and Dennis M. Ritchie,  The  C  Pro-
            gramming Language, Prentice Hall, 1978.

         4. American  National  Standard  for Information Systems -
            Programming Language - C, ANSI X3.159-1989.

         5. Bjarne Stroustroup, C++, Addison Wesley, xxxx

         6. POSIX - Part 1: System Application  Program  Interface,
            IEEE Std 1003.1-1990, ISO/IEC 9945-1:1990.

         7. David  Korn  and  Kiem-Phong Vo, SFIO - A Safe/Fast In-
            put/Output library, Proceedings of the  Summer  Usenix,
            pp. , 1991.














































ACC SHELL 2018