Recommendation: Basic I/O class types

Version: 1.0

This text recommends how to use certain class types to make libraries for the Objective Caml language more interoperable. The recommendation is the result of a discussion between Nicolas Cannasse (extlib), Gerd Stolpmann (ocamlnet), and Yamagata Yoriyuki (camomile). The libraries mentioned have been or will be changed to support the recommendation.

The nature of the class system allows us to use objects of a class X as instances of another class Y when the type of X is a subtype of Y. It is not required that X and Y have an inheritance relation. Moreover, X and Y can even be defined in different, separately compiled libraries that do not know of each other. The latter observation is the basis for this recommendation, as it becomes possible to define interoperable class types by convention. In particular, it is not necessary to have one formal definition in the Objective Caml system to which the participating libraries refer. The formal definition is absent and is informally replaced by this convention.

From an abstract point of view, we are going to establish a minimum class type T that contains the I/O methods the authors regard as relevant for this recommendation. A library will usually have additional methods in a class type TL. Formally, this is reflected by the requirement that TL is a subtype of T. The language then allows instances of TL to be coerced to T. Assuming we have two libraries L and L' defining I/O class types TL and TL', respectively, it is now possible to use instances x of type TL in the context of L' by coercing x to T, wherever L' accepts values of type T.
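The coercion described above can be sketched as follows. This is only an illustration: the names min_out, buffer_out and write_ok are hypothetical and belong to none of the libraries mentioned. The class buffer_out plays the role of TL, min_out the role of T, and write_ok stands for a function of library L'.

```ocaml
(* Minimal class type T, agreed on by convention only (hypothetical name). *)
class type min_out = object
  method put : char -> unit
end

(* "Library L": a richer class that never mentions min_out. *)
class buffer_out = object
  val buf = Buffer.create 16
  method put (c : char) = Buffer.add_char buf c
  method put_string (s : string) = Buffer.add_string buf s
  method contents = Buffer.contents buf
end

(* "Library L'": a consumer that knows only the minimal type T. *)
let write_ok (ch : min_out) =
  ch#put 'o';
  ch#put 'k'

let x = new buffer_out

(* Explicit coercion TL :> T; no inheritance relation is involved. *)
let () = write_ok (x :> min_out)
```

The coercion must be written explicitly, because Objective Caml does not apply subtyping implicitly at function calls.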

For example, this allows us to XXX (fill in: use an ocamlnet channel in the context of camomile - when the libraries are changed).

I/O of octets

Octet streams are the first kind of I/O classes defined by this recommendation. They play a special role because all I/O must ultimately be transformed into operations on octets, as current operating systems require.

Formal type definition. The following class types are recommended. The names of the class types are not normative, but the names of the methods and their types are.

class type rec_octet_in_channel =
object
  method input : string -> int -> int -> int
  method close_in : unit -> unit
end

class type rec_octet_out_channel =
object
  method output : string -> int -> int -> int
  method flush : unit -> unit
  method close_out : unit -> unit
end

Meaning. Both input and output channels have two states: open and closed. It is outside the scope of this recommendation how a channel is opened. When it is open, however, the defined methods can be called, and have a defined meaning. When a channel is closed, it is usually regarded as an error to invoke the methods. It is not defined what happens in this case, but it is suggested to raise a Failure or library-specific exception. The implementation may even opt to silently ignore such calls.
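As an illustration of the two states, the following sketch shows one possible way for an implementation to follow the suggestion of raising Failure. The class buffer_out_channel is hypothetical, and making close_out itself fail on a closed channel is a choice of this sketch, not a requirement of the recommendation.

```ocaml
class type rec_octet_out_channel = object
  method output : string -> int -> int -> int
  method flush : unit -> unit
  method close_out : unit -> unit
end

(* Hypothetical in-memory output channel that tracks the open/closed state. *)
class buffer_out_channel (buf : Buffer.t) = object (self)
  val mutable closed = false

  (* Raise Failure when a method is invoked on a closed channel. *)
  method private check_open name =
    if closed then failwith (name ^ ": channel is closed")

  method output (s : string) (pos : int) (len : int) : int =
    self#check_open "output";
    Buffer.add_substring buf s pos len;
    len

  (* Nothing is buffered separately, so flush only checks the state. *)
  method flush () = self#check_open "flush"

  method close_out () =
    self#check_open "close_out";
    closed <- true
end

let buf = Buffer.create 16
let ch = (new buffer_out_channel buf :> rec_octet_out_channel)
let n = ch#output "hello world" 0 5
let () = ch#close_out ()

(* After close_out, this sketch rejects further calls with Failure. *)
let failed_after_close =
  try ignore (ch#output "x" 0 1); false with Failure _ -> true
```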

There are two kinds of semantics: blocking I/O and non-blocking I/O. An implementation must define which semantics it selects, or the conditions under which a certain semantic model is effective.

Note that the following definition differs from the Unix API in two respects: the "end of file" condition is indicated by an exception, and the "operation would block" condition is expressed by the return value 0.

method input : string -> int -> int -> int:
Data is read from the channel and put into the string. The first int argument is the position in the string at which to store the octets, and the second int argument is the maximum number of octets to read. The method returns the actual number of octets read from the channel and put into the string. The implementation is always free to read fewer octets than requested. (This is even allowed when reading files from a local disk!) When the end of the stream is reached, and there are no more octets that could be read, the method raises the exception End_of_file. An implementation for blocking I/O must read at least one octet, or raise End_of_file. An implementation for non-blocking I/O can return the value 0 to indicate that there are currently no octets to read.
As a special case, when zero octets are requested for a blocking implementation, the method must return the number 0.
method close_in : unit -> unit:
The input channel is closed.
method output : string -> int -> int -> int:
Data is taken from the string and written to the channel. The first int argument is the position in the string from which the octets are taken, and the second int argument is the number of octets to write. The method returns the actual number of octets written to the channel. The implementation is always free to write fewer octets than requested. (This is even allowed when writing to files on a local disk!) An implementation for blocking I/O must write at least one octet. An implementation for non-blocking I/O can return the value 0 to indicate that currently no octets can be processed.
As a special case, when zero octets are requested for a blocking implementation, the method must return the number 0.
method flush : unit -> unit:
The implementation may choose to have output write into a buffer instead of writing directly to the underlying resource. In this case, the call of flush writes the contents of the buffer to the resource. When there is no such buffer, the call does nothing.
method close_out : unit -> unit:
The buffer, if any, is flushed, and the output channel is closed.
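Because input and output may transfer fewer octets than requested, callers typically loop. The following is a minimal sketch under stated assumptions: the classes chunked_in and chunked_out and the functions read_all and really_output are hypothetical, both channels are blocking, and the buffer argument of input is typed Bytes.t instead of string because strings are immutable in current versions of the language.

```ocaml
(* Assumption: Bytes.t replaces string as the buffer type of input. *)

(* Hypothetical blocking source over a string, at most 3 octets per call. *)
class chunked_in (src : string) = object
  val mutable pos = 0
  method input (buf : Bytes.t) (off : int) (len : int) : int =
    if len = 0 then 0
    else if pos >= String.length src then raise End_of_file
    else begin
      let n = min 3 (min len (String.length src - pos)) in
      Bytes.blit_string src pos buf off n;
      pos <- pos + n;
      n
    end
  method close_in () = ()
end

(* Hypothetical blocking sink into a Buffer, at most 2 octets per call. *)
class chunked_out (sink : Buffer.t) = object
  method output (s : string) (off : int) (len : int) : int =
    let n = min 2 len in
    Buffer.add_substring sink s off n;
    n
  method flush () = ()
  method close_out () = ()
end

(* Read until End_of_file, accepting short reads (blocking channels only). *)
let read_all ch =
  let chunk = Bytes.create 8 in
  let acc = Buffer.create 64 in
  (try
     while true do
       let n = ch#input chunk 0 (Bytes.length chunk) in
       Buffer.add_subbytes acc chunk 0 n
     done
   with End_of_file -> ());
  Buffer.contents acc

(* Write the whole string, calling output again after each short write. *)
let really_output ch s =
  let len = String.length s in
  let rec loop off =
    if off < len then loop (off + ch#output s off (len - off))
  in
  loop 0

let data = read_all (new chunked_in "hello, world")
let sink = Buffer.create 16
let () = really_output (new chunked_out sink) data
```

With a non-blocking channel these loops would spin whenever 0 is returned, so a real caller would have to wait for readiness by some other means.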

Rationale. The authors examined a number of alternatives, and came to the conclusion that this definition is the best for general-purpose I/O channels. The reasons:

Polymorphic I/O

Another case is that the I/O class is polymorphic in the elementary (character-level) type subject to the I/O operations. This case is treated separately to allow I/O of Unicode characters. We intentionally define this case on the level of characters and not on the level of the implied monoid to reduce complexity.

Formal type definition. The following class types are recommended. The names of the class types are not normative, but the names of the methods and their types are.

class type ['t] rec_poly_in_channel =
object
  method get : unit -> 't
  method close_in : unit -> unit
end

class type ['t] rec_poly_out_channel =
object
  method put : 't -> unit
  method flush : unit -> unit
  method close_out : unit -> unit
end

Meaning. As in the octet case, the channels have the two states open and closed. Again, we do not define how to open a channel, and we leave it up to the implementation how to handle the case when methods of a closed channel are called.

Polymorphic channels always perform blocking I/O.

method get : unit -> 't:
A character is read from the channel and returned. When there are no more characters, and the channel is at its end, the exception End_of_file is raised.
method close_in : unit -> unit:
The input channel is closed.
method put : 't -> unit:
The character passed as argument is written to the channel.
method flush : unit -> unit:
The implementation may choose to have put write into a buffer instead of writing directly to the underlying resource. In this case, the call of flush writes the contents of the buffer to the resource. When there is no such buffer, the call does nothing.
method close_out : unit -> unit:
The buffer, if any, is flushed, and the output channel is closed.
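The polymorphic methods can be exercised with a small sketch. The classes list_in and queue_out and the function copy are hypothetical, and representing characters as int code points is an assumption of this example only. The class types are repeated so that the block is self-contained.

```ocaml
class type ['t] rec_poly_in_channel = object
  method get : unit -> 't
  method close_in : unit -> unit
end

class type ['t] rec_poly_out_channel = object
  method put : 't -> unit
  method flush : unit -> unit
  method close_out : unit -> unit
end

(* Hypothetical source delivering the elements of a list. *)
class ['t] list_in (items : 't list) = object
  val mutable rest = items
  method get () : 't =
    match rest with
    | [] -> raise End_of_file
    | x :: tl -> rest <- tl; x
  method close_in () = ()
end

(* Hypothetical sink that collects the elements in a queue. *)
class ['t] queue_out (q : 't Queue.t) = object
  method put (x : 't) = Queue.add x q
  method flush () = ()
  method close_out () = ()
end

(* Copy until End_of_file, then flush; works for any element type. *)
let copy (src : 't rec_poly_in_channel) (dst : 't rec_poly_out_channel) =
  (try
     while true do dst#put (src#get ()) done
   with End_of_file -> ());
  dst#flush ()

(* Code points of "OCaml", represented as plain ints in this sketch. *)
let code_points = [0x4f; 0x43; 0x61; 0x6d; 0x6c]
let q : int Queue.t = Queue.create ()
let () = copy (new list_in code_points) (new queue_out q)
let copied = List.rev (Queue.fold (fun acc x -> x :: acc) [] q)
```

No coercion is needed here because the object types of list_in and queue_out have exactly the methods of the recommended class types.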

Rationale. The authors examined a number of alternatives, and came to the conclusion that this definition is the best for general-purpose I/O channels. The reasons:


Gerd Stolpmann
Last modified: Fri May 28 09:27:52 CEST 2004