
Standards, APIs, Interfaces and Bindings

David Emery

This note addresses the problem of using API standards from multiple programming languages. Many (but not all) computer standards are specified as Application Program Interfaces (APIs). APIs are most commonly expressed as a set of operations, associated data definitions, and the semantics of the operations on some underlying system.

The Problem

APIs are only useful in the context of an executable computer program, and computer programs must be written in some programming language. So the problem here is how to provide access to the API from multiple programming languages. To explain the problem, we define two related terms:
Interface:
Within the context of an API standard, the abstract expression of the requirements and behavior of the interface.
Binding:
The realization of an API standard interface in terms of a given programming language.
For example, the POSIX operating system interface (ISO 9945-1:1990) specifies that a file can be opened for input, for output, or for both input and output. It also specifies that there is a data value associated with each open file. The C binding for POSIX specifies that there is an operation named "open", with the following (ANSI C) signature:
   #include <fcntl.h>
   #include <errno.h>
   /* contains definitions for the following
    *  #define O_RDONLY 	read-only access
    *  #define O_RDWR		read-write access
    *  #define O_WRONLY		write-only access
    *  #define EACCES       	access denied
    *  #define EROFS		cannot write to read-only file system
    *  extern int errno;
    *  ...
    */

   int open (const char* path, int oflag, ... /* int mode */);
   /*  return -1 for failure and set errno */
This provides a C programmer with access to the POSIX interface for opening files. But what happens if you are programming in a language other than C? You need a binding from your programming language to the POSIX interface, one that maps calls in your language to the underlying POSIX operations.
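As a minimal sketch of what using the C binding looks like in practice, the helper below calls open() and applies the "-1 plus errno" convention described above. The function name try_open_readonly is hypothetical and not part of POSIX; only open(), close(), errno and the O_ and E constants come from the standard.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper (not part of POSIX): try to open `path` read-only.
 * Returns 0 on success, or the errno value set by open() on failure. */
int try_open_readonly(const char *path)
{
    assert(path != NULL);            /* precondition of this sketch */

    /* The mode argument is only needed when O_CREAT is given. */
    int fd = open(path, O_RDONLY);
    if (fd == -1) {
        return errno;                /* e.g. ENOENT or EACCES */
    }
    close(fd);                       /* the descriptor is not needed here */
    return 0;
}
```

A caller has to remember to compare the result against the error constants by hand; nothing in the types distinguishes a file descriptor, a flag word, or an error code, since all of them are plain int values.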

Approaches for Language Bindings

It turns out that there are actually two related problems in describing language bindings to standard API interfaces. One problem deals with the mapping of specific language features/capabilities to the interface, and the other problem relates to how the language binding is documented as a standard.

Most APIs are described in terms of a specific language binding. This comes from the historical fact that most API standards are derived from existing practice that evolved in a specific programming language. The POSIX standards, for instance, derive from the use of the C language in the original Unix system. In this case, a single document specifies both the interface and an associated language binding. Some APIs are written to be 'programming language independent', and use a metalanguage or formal description language to specify the interface. See the work of ISO JTC1/SC22/WG11, or ISO 9075-3, Call Level Interface for Database Language SQL.

Consider the POSIX 'open file' code mentioned earlier, and assume that you want to call this operation from a strongly-typed language such as Ada, Pascal or Modula-2. A "direct mapping" would lexically substitute your Ada/Pascal/Modula-2 identifiers for the equivalent C identifiers. Thus a "direct Ada binding" would preserve the C binding's structure and names as much as possible:

    package POSIX is
	...
	Errno : Integer;
	function Open 
	  (Path : in String;  Oflags : in Integer;  Mode : in Integer)
	  return Integer;
	-- returns -1 for failure and sets the global variable
	-- Errno with information about the reason for failure
	...
    end POSIX;
This direct binding makes no use of the strong type definitions available in Ada. Nor does this direct binding use Ada's exception mechanism.

The other way to construct a new language binding for an API that is specified in terms of an existing binding is the "abstract mapping" approach. An abstract mapping starts with the API's interface semantics, and then produces a representation of the abstract interface in terms appropriate to the language at hand. An abstract binding for the POSIX open operation might look like:

	package POSIX is 
	   type POSIX_String is ...
	   No_Such_File : exception;
	   Read_Only_File_System : exception;
	   subtype Pathname is POSIX_String;
	   type Error_Code is ...
	   function Last_Error return Error_Code;
	   ...
	end POSIX;

	with POSIX; 
	package POSIX_IO is
	   type File_Descriptor is private;
	   type Open_Mode is (Read_Only, Write_Only, Read_Write);
	   type Open_Flags is (...);
	
	   function Open 
		(The_File : in POSIX.Pathname;
		 With_Flags : in Open_Flags;
		 With_Mode : in Open_Mode)
	      return File_Descriptor;
	   -- raises POSIX.No_Such_File if the named file does not exist
	   --   or if access is denied.
	   -- raises POSIX.Read_Only_File_System if the named file is on a
	   --   read-only file system and the mode is Write_Only or Read_Write
     	   ...
	end POSIX_IO;
A direct binding is generally simpler to produce, since it can often be done with machine translation from one programming language to another. But sometimes a concept in one language does not translate to a similar concept in another language (for example, the semantics of the C preprocessor do not map to language features in most other programming languages). The resulting binding also does not take full advantage of the target programming language, particularly when going from a less expressive language (e.g. one without strong typing) to a more semantically rich language (e.g. one with strong typing or an exception model). An abstract binding can make full use of the semantics of the target language, but requires substantial human analysis to derive the abstractions from the API standard and then to represent them in the target language.
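The sketch below shows, in C, the kind of translation layer that sits between the raw "-1 plus errno" convention and a more abstract interface. C has no exceptions, so a status enumeration stands in for the Ada exceptions, and a struct wrapper stands in for the private File_Descriptor type. Every name here (open_mode, open_status, file_descriptor, open_result, posix_io_open) is hypothetical and merely echoes the Ada example; none of them is taken from a published binding.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical names throughout, chosen to echo the Ada sketch. */
typedef enum { READ_ONLY, WRITE_ONLY, READ_WRITE } open_mode;

typedef enum {
    OPEN_OK,
    NO_SUCH_FILE,            /* ENOENT or EACCES, as in the Ada comments */
    READ_ONLY_FILE_SYSTEM,   /* EROFS */
    OTHER_ERROR              /* anything this sketch does not classify */
} open_status;

/* Wrapping the descriptor keeps it from being mixed up with plain ints. */
typedef struct { int fd; } file_descriptor;

typedef struct {
    open_status status;
    file_descriptor file;    /* meaningful only when status == OPEN_OK */
} open_result;

/* Map the abstract access mode onto the C binding's flag constants. */
static int to_oflag(open_mode mode)
{
    switch (mode) {
    case READ_ONLY:  return O_RDONLY;
    case WRITE_ONLY: return O_WRONLY;
    default:         return O_RDWR;
    }
}

/* Open an existing file and translate errno into a typed status. */
open_result posix_io_open(const char *path, open_mode mode)
{
    assert(path != NULL);    /* precondition of this sketch */

    open_result result = { OPEN_OK, { -1 } };
    result.file.fd = open(path, to_oflag(mode));
    if (result.file.fd == -1) {
        switch (errno) {
        case ENOENT:
        case EACCES: result.status = NO_SUCH_FILE;          break;
        case EROFS:  result.status = READ_ONLY_FILE_SYSTEM; break;
        default:     result.status = OTHER_ERROR;           break;
        }
    }
    return result;
}
```

Deciding that ENOENT and EACCES collapse into one condition, or that the access mode deserves its own enumeration, is exactly the human analysis that an abstract binding requires and that a mechanical translation would not perform.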

Documenting Language Bindings

The other problem in language binding is a problem in standardization. How should the new language binding be documented? One approach, analogous to the "direct mapping" approach for describing the binding, is to produce a "thin document". A thin Ada language binding standard for the operation Open would cite the previously standardized C binding directly:
    The Ada operation POSIX.Open behaves the same as the C operation
    open().  The meaning of the parameters Path, Oflags and Mode match
    their C counterparts, and the integer return value has the same
    meaning as the int value returned by open().
The alternative is to produce each language binding as a self-contained document, a "thick" document. In this case, the full semantics of the file open operation would be described by the appropriate language binding document. Both the C and Ada documents would describe the behavior of Open, the meaning of the parameters, the errors that are detected, etc.

A "thin document" has the advantage of being easy to generate. It also provides a single point of reference for the semantics of the Interface, by citing one document (the POSIX C binding, in our example) as the single normative reference for the file open operation. The user of the Ada binding would have to refer to both the Ada binding (thin) document, and also the C binding document. "Thick documents" duplicate the underlying interface semantics in each language binding. The Ada programmer would have to read only the Ada binding to find out the semantics of the file open operation. Because each binding has a copy of the semantics, there is a potential for conflicts between multiple specifications for the same interface.

Summary

This note has presented two related problems in defining language bindings to standard interfaces, particularly those specified in terms of a specific programming language. "Direct vs Abstract" refers to how the features of the underlying interface are mapped to programming language features. "Thick vs Thin" refers to the documentation for a set of language bindings to the same API.

See [Moore, Emery & Rada, Sharing Standards: Language-Independent Standards, CACM 37.12 (Dec 94)] for some experiences in developing language bindings. The POSIX/Ada binding rationale ISO 14519-1:1995/IEEE 1003.5:1992 explains the analysis that produced an "abstract binding" and associated "thick" document.

