connections               package:base               R Documentation

_F_u_n_c_t_i_o_n_s _t_o _M_a_n_i_p_u_l_a_t_e _C_o_n_n_e_c_t_i_o_n_s

_D_e_s_c_r_i_p_t_i_o_n:

     Functions to create, open and close connections.

_U_s_a_g_e:

     file(description = "", open = "", blocking = TRUE,
          encoding = getOption("encoding"))

     url(description, open = "", blocking = TRUE,
         encoding = getOption("encoding"))

     gzfile(description, open = "", encoding = getOption("encoding"),
            compression = 6)

     bzfile(description, open = "", encoding = getOption("encoding"))

     unz(description, filename, open = "",
         encoding = getOption("encoding"))

     pipe(description, open = "", encoding = getOption("encoding"))

     fifo(description, open = "", blocking = FALSE,
          encoding = getOption("encoding"))

     socketConnection(host = "localhost", port, server = FALSE,
                      blocking = FALSE, open = "a+",
                      encoding = getOption("encoding"))

     open(con, ...)
     ## S3 method for class 'connection':
     open(con, open = "r", blocking = TRUE, ...)

     close(con, ...)
     ## S3 method for class 'connection':
     close(con, type = "rw", ...)

     flush(con)

     isOpen(con, rw = "")
     isIncomplete(con)

_A_r_g_u_m_e_n_t_s:

description: character string. A description of the connection: see
          'Details'.

    open: character.  A description of how to open the connection (if
          at all).  See 'Details' for possible values.

blocking: logical.  See the 'Blocking' section below.

encoding: The name of the encoding to be used.  See the 'Encoding'
          section below.

compression: integer in 0-9.  The amount of compression to be applied
          when writing, from none to maximal.  The default is a good
          space/time compromise.

filename: a filename within a zip file.

    host: character.  Host name for port.

    port: integer.  The TCP port number.

  server: logical.  Should the socket be a client or a server?

     con: a connection.

    type: character. Currently ignored.

      rw: character.  Empty or '"read"' or '"write"', partial matches
          allowed.

     ...: arguments passed to or from other methods.

_D_e_t_a_i_l_s:

     The first eight functions create connections.  By default the
     connection is not opened (except for 'socketConnection'), but may
     be opened by setting a non-empty value of argument 'open'.

     For 'file' the description is either a path to the file to be
     opened or a complete URL, or '""' (the default) or '"stdin"' or
     '"clipboard"' (see below).

     For 'url' the description is a complete URL, including scheme
     (such as 'http://', 'ftp://' or 'file://').

     For 'gzfile' the description is the path to a file that is
     compressed by 'gzip': it can also open uncompressed files.

     For 'bzfile' the description is the path to a file that is
     compressed by 'bzip2'.

     'unz' reads (only) single files within zip files, in binary mode.
     The description is the full path to the zip file, with '.zip'
     extension if required.

     For 'pipe' the description is the command line to be piped to or
     from.

     For 'fifo' the description is the path of the fifo.

     'file' allows 'description="stdin"' to refer to the C-level
     'stdin' of the process (which need not be connected to anything in
     a console or embedded version of R), provided the POSIX function
     'fdopen' is supported on the platform.

     'gzfile' and 'bzfile' open the actual file in binary mode and so
     no translations are done if the original file was a text file.
     (See 'gzcon' for a way to add compression to non-file connections
     such as URLs.)
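
     As a small illustration of the remark that 'gzfile' can also open
     uncompressed files, the sketch below writes a plain text file and
     reads it back through a 'gzfile' connection.  The file name comes
     from 'tempfile()' and the object names are chosen for this example
     only.

```r
# Write an ordinary, uncompressed text file.
plain <- tempfile(fileext = ".txt")
writeLines(c("alpha", "beta"), plain)

# Read it through gzfile(): the connection is created unopened, and
# readLines() opens it, reads it and closes it again.
con <- gzfile(plain)
plain_lines <- readLines(con)
close(con)                     # destroy the connection object

print(plain_lines)
unlink(plain)
```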

     All platforms support 'file', 'gzfile', 'bzfile', 'unz' and
     'url("file://")' connections.  The other types may be partially
     implemented or not implemented at all.  (They do work on most Unix
     platforms, and all but 'fifo' on Windows.)

     Proxies can be specified for 'url' connections: see
     'download.file'.

     'open', 'close' and 'seek' are generic functions: the following
     applies to the methods relevant to connections.

     'open' opens a connection.  In general, functions using
     connections will open them if they are not already open, but then
     close them again; to leave a connection open, call 'open'
     explicitly.
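
     The difference between an unopened and an explicitly opened
     connection can be sketched as follows.  This is an illustration
     only: the temporary file and the variable names are assumptions
     of the example, not part of the interface.

```r
tmp <- tempfile()
writeLines(c("one", "two", "three"), tmp)

# Unopened connection: each readLines() call opens the connection,
# reads from the start of the file, and closes it again.
con <- file(tmp)
first_a <- readLines(con, n = 1)
first_b <- readLines(con, n = 1)   # starts from the beginning again
close(con)

# Explicitly opened connection: the read position persists between calls.
con <- file(tmp)
open(con, "r")
line_1 <- readLines(con, n = 1)
line_2 <- readLines(con, n = 1)    # continues where the last call stopped
close(con)

unlink(tmp)
```

     Here 'first_a' and 'first_b' both hold the first line, whereas
     'line_1' and 'line_2' hold the first and second lines.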

     Possible values for the mode 'open' to open a connection are

     '"_r"' _o_r '"_r_t"' Open for reading in text mode.

     '"_w"' _o_r '"_w_t"' Open for writing in text mode.

     '"_a"' _o_r '"_a_t"' Open for appending in text mode.

     '"_r_b"' Open for reading in binary mode.

     '"_w_b"' Open for writing in binary mode.

     '"_a_b"' Open for appending in binary mode.

     '"_r+"', '"_r+_b"' Open for reading and writing.

     '"_w+"', '"_w+_b"' Open for reading and writing, truncating file
          initially.

     '"_a+"', '"_a+_b"' Open for reading and appending.

     Not all modes are applicable to all connections: for example URLs
     can only be opened for reading.  Only file and socket connections
     can be opened for reading and writing/appending. For many
     connections there is little or no difference between text and
     binary modes, but there is for file-like connections on Windows,
     and 'pushBack' is text-oriented and is only allowed on connections
     open for reading in text mode. If a file or fifo is created on a
     Unix-alike, its permissions will be the maximal allowed by the
     current setting of 'umask' (see 'Sys.umask').
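
     A minimal sketch of how the '"w"' and '"a"' modes differ on a
     file connection is given below; the file name is a temporary one
     invented for the example.

```r
log_file <- tempfile()

# "w" creates (or truncates) the file before writing.
con <- file(log_file, "w")
writeLines("first", con)
close(con)

# "a" keeps the existing contents and writes at the end.
con <- file(log_file, "a")
writeLines("second", con)
close(con)
after_append <- readLines(log_file)

# Opening with "w" again discards what was there.
con <- file(log_file, "w")
writeLines("third", con)
close(con)
after_truncate <- readLines(log_file)

unlink(log_file)
```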

     'gzfile' connections are an exception, as the file always has to
     be opened in binary mode.  Thus modes such as 'r' are binary, and
     'rt' is needed to have a text-mode connection.

     'close' closes and destroys a connection.  Note that this will
     happen automatically in due course if there is no R object
     referring to the connection.
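
     Because 'close' destroys the connection, the R object that
     referred to it can no longer be used afterwards.  The following
     sketch (with an illustrative temporary file) catches the
     resulting error rather than letting it stop the script.

```r
path <- tempfile()
con <- file(path, "w")
writeLines("some output", con)
was_open <- isOpen(con)        # TRUE while the connection exists and is open

close(con)                     # closes and destroys the connection

# Using 'con' now signals an error; keep its message instead of a logical.
after_close <- tryCatch(isOpen(con),
                        error = function(e) conditionMessage(e))
print(after_close)

unlink(path)
```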

     A maximum of 128 connections can be allocated (not necessarily
     open) at any one time.  Three of these are pre-allocated (see
     'stdout').   The OS will impose limits on the numbers of
     connections of various types, but these are usually larger than
     125.
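
     The pre-allocated connections can be inspected with
     'showConnections(all = TRUE)'.  The sketch below assumes a
     session in which no other code has reordered or replaced the
     first three entries.

```r
# List every allocated connection, including the pre-allocated ones.
all_cons <- showConnections(all = TRUE)

# The first three rows are the standard connections.
std <- all_cons[1:3, c("description", "class"), drop = FALSE]
std_names <- unname(std[, "description"])
print(std)
```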

     'flush' flushes the output stream of a connection open for
     write/append (where implemented).

     If for a 'file' or 'fifo' connection the description is '""', the
     file/fifo is immediately opened (in '"w+"' mode unless
     'open="w+b"' is specified) and unlinked from the file system. 
     This provides a temporary file/fifo to write to and then read
     from.

     A note on 'file://' URLs.  The most general form (from RFC1738) is
     'file://host/path/to/file', but R only accepts the form with an
     empty 'host' field referring to the local machine. This is then
     'file:///path/to/file', where 'path/to/file' is relative to '/'. 
     So although the third slash is strictly part of the specification,
     not part of the path, this can be regarded as a way to specify the
     file '/path/to/file'.  It is not possible to specify a relative
     path using a file URL. Also, no attempt is made to decode an
     encoded URL: call 'URLdecode' if necessary.
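
     As a sketch of the 'file:///path/to/file' form on a Unix-alike
     (where absolute paths begin with '/'), the example below builds a
     file URL from a temporary file and reads it through 'url'.  The
     path passed to 'URLdecode' is a made-up one, used only to show
     the decoding step.

```r
tmp <- tempfile()
writeLines("via file URL", tmp)

# An absolute path starting with "/" yields the three-slash form.
abs_path <- normalizePath(tmp)
file_url <- paste("file://", abs_path, sep = "")

con <- url(file_url)
txt <- readLines(con)
close(con)
unlink(tmp)

# Encoded URLs are not decoded automatically; decode them first.
decoded <- utils::URLdecode("file:///tmp/my%20data.txt")
```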

     Note that 'https://' connections are not supported.

_V_a_l_u_e:

     'file', 'pipe', 'fifo', 'url', 'gzfile', 'bzfile', 'unz' and
     'socketConnection' return a connection object which inherits from
     class '"connection"' and has a first more specific class.

     'isOpen' returns a logical value, whether the connection is
     currently open.

     'isIncomplete' returns a logical value, whether the last read
     attempt was blocked, or for an output text connection whether
     there is unflushed output.
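
     The class of the returned object and the behaviour of 'isOpen'
     can be checked with a short sketch such as the following, which
     uses a temporary file invented for the example.

```r
path <- tempfile()
con <- file(path)                    # created, but not yet opened

con_class <- class(con)              # specific class first, then "connection"
open_before <- isOpen(con)

open(con, "w")                       # open for writing only
open_read  <- isOpen(con, "read")    # not open for reading
open_write <- isOpen(con, "write")   # open for writing

close(con)
unlink(path)
```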

_E_n_c_o_d_i_n_g:

     The encoding of the input/output stream of a connection in _text_
     mode can be specified by name, in the same way as it would be
     given to 'iconv': see that help page for how to find out what
     names are recognized on your platform.  Additionally, '""' and
     '"native.enc"' both mean the 'native' encoding, that is the
     internal encoding of the current locale and hence no translation
     is done. Not all builds of R support this, and if yours does not,
     specifying a non-default encoding will give an error when the
     connection is opened.

     Re-encoding only works for connections in text mode.

     The encoding '"UCS-2LE"' is treated specially, as it is the
     appropriate value for Windows 'Unicode' text files.  If the first
     two bytes are the Byte Order Mark '0xFFFE' then these are removed
     as most implementations of 'iconv' do not accept BOMs.  Note that
     some implementations will handle BOMs using encoding '"UCS-2"' but
     many will not.

     Exactly what happens when the requested translation cannot be done
     is in general undocumented.  Requesting a conversion that is not
     supported is an error, reported when the connection is opened.  On
     output, the result is likely to be the output up to the error,
     with a warning.  On input, it will most likely be all or some of
     the input up to the error.

_B_l_o_c_k_i_n_g:

     The default condition for all but fifo and socket connections is
     to be in blocking mode.  In that mode, functions do not return to
     the R evaluator until they are complete.   In non-blocking mode,
     operations return as soon as possible, so on input they will
     return with whatever input is available (possibly none) and for
     output they will return whether or not the write succeeded.

     The function 'readLines' behaves differently in respect of
     incomplete last lines in the two modes: see its help page.

     Even when a connection is in blocking mode, attempts are made to
     ensure that it does not block the event loop and hence the
     operation of GUI parts of R.  These do not always succeed, and the
     whole process will be blocked during a DNS lookup on Unix, for
     example.

     Most blocking operations on URLs and sockets are subject to the
     timeout set by 'options("timeout")'.  Note that this is a timeout
     for no response at all, not for the whole operation.  The timeout
     is set at the time the connection is opened (more precisely, when
     the last connection of that type - 'http:', 'ftp:' or socket - was
     opened).
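
     Since the timeout is read when the connection is opened, it
     should be set beforehand.  A minimal sketch, in which the value
     of 10 seconds is an arbitrary choice for illustration:

```r
# Set the timeout (in seconds) before opening URL or socket connections,
# keeping the previous setting so that it can be restored.
old <- options(timeout = 10)
during <- getOption("timeout")

# ... open and use url() or socketConnection() connections here ...

options(old)                         # restore the previous timeout
restored <- getOption("timeout")
```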

_F_i_f_o_s:

     Fifos default to non-blocking.  That follows S version 4 and is
     probably most natural, but it does have some implications. In
     particular, opening a non-blocking fifo connection for writing
     (only) will fail unless some other process is reading on the fifo.

     Opening a fifo for both reading and writing (in any mode: one can
     only append to fifos) connects both sides of the fifo to the R
     process, and provides a similar facility to 'file()'.

_C_l_i_p_b_o_a_r_d:

     'file' can also be used with 'description = "clipboard"' in mode
     '"r"' only.  This reads the X11 primary selection (see <URL:
     http://standards.freedesktop.org/clipboards-spec/clipboards-latest.txt>),
     which can also be specified as '"X11_primary"' and the secondary
     selection as '"X11_secondary"'.  On most systems the clipboard
     selection (that used by 'Copy' from an 'Edit' menu) can be
     specified as '"X11_clipboard"'.

     When a clipboard is opened for reading, the contents are
     immediately copied to internal storage in the connection.

     Unix users wishing to _write_ to one of the selections may be able
     to do so via 'xclip' (<URL:
     http://people.debian.org/~kims/xclip/>), for example by
     'pipe("xclip -i", "w")' for the primary selection.

     MacOS X users can use 'pipe("pbpaste")' and 'pipe("pbcopy", "w")'
     to read from and write to that system's clipboard.

_N_o_t_e:

     R's connections are modelled on those in S version 4 (see
     Chambers, 1998).  However R goes well beyond the S model, for
     example in output text connections and URL, 'gzfile', 'bzfile' and
     socket connections.

     The default mode in R is '"r"' except for socket connections. This
     differs from S, where it is the equivalent of '"r+"', known as
     '"*"'.

     On (rare) platforms where 'vsnprintf' does not return the needed
     length of output there is a 100,000 character output limit on the
     length of line for 'fifo', 'gzfile' and 'bzfile' connections:
     longer lines will be truncated with a warning.

_R_e_f_e_r_e_n_c_e_s:

     Chambers, J. M. (1998) _Programming with Data.  A Guide to the S
     Language._ Springer.

_S_e_e _A_l_s_o:

     'textConnection', 'seek', 'showConnections', 'pushBack'.

     Functions making direct use of connections are 'readLines',
     'readBin', 'readChar', 'writeLines', 'writeBin', 'writeChar',
     'cat', 'sink', 'scan', 'parse', 'read.dcf', 'load', 'save', 'dput'
     and 'dump'.

     'capabilities' to see if 'url', 'fifo' and 'socketConnection' are
     supported by this build of R.

     'gzcon' to wrap gzip (de)compression around a connection.

_E_x_a_m_p_l_e_s:

     zz <- file("ex.data", "w")  # open an output file connection
     cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
     cat("One more line\n", file = zz)
     close(zz)
     readLines("ex.data")
     unlink("ex.data")

     zz <- gzfile("ex.gz", "w")  # compressed file
     cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
     close(zz)
     readLines(zz <- gzfile("ex.gz"))
     close(zz)
     unlink("ex.gz")

     zz <- bzfile("ex.bz2", "w")  # bzip2-ed file
     cat("TITLE extra line", "2 3 5 7", "", "11 13 17", file = zz, sep = "\n")
     close(zz)
     print(readLines(zz <- bzfile("ex.bz2")))
     close(zz)
     unlink("ex.bz2")

     ## An example of a file open for reading and writing
     Tfile <- file("test1", "w+")
     c(isOpen(Tfile, "r"), isOpen(Tfile, "w")) # both TRUE
     cat("abc\ndef\n", file=Tfile)
     readLines(Tfile)
     seek(Tfile, 0, rw="r") # reset to beginning
     readLines(Tfile)
     cat("ghi\n", file=Tfile)
     readLines(Tfile)
     close(Tfile)
     unlink("test1")

     ## We can do the same thing with an anonymous file.
     Tfile <- file()
     cat("abc\ndef\n", file=Tfile)
     readLines(Tfile)
     close(Tfile)

     ## fifo example -- may fail, e.g. on Cygwin, even with OS support for fifos
     if(capabilities("fifo")) {
       zz <- fifo("foo-fifo", "w+")
       writeLines("abc", zz)
       print(readLines(zz))
       close(zz)
       unlink("foo-fifo")
     }


     ## Unix examples of use of pipes

     # read listing of current directory
     readLines(pipe("ls -1"))

     # remove trailing commas. Suppose

     ## Not run: 
     % cat data2
     450, 390, 467, 654,  30, 542, 334, 432, 421,
     357, 497, 493, 550, 549, 467, 575, 578, 342,
     446, 547, 534, 495, 979, 479
     ## End(Not run)
     # Then read this by
     scan(pipe("sed -e s/,$// data2"), sep=",")


     # convert decimal point to comma in output: see also write.table
     # both R strings and (probably) the shell need \ doubled
     zz <- pipe(paste("sed s/\\\\./,/ >", "outfile"), "w")
     cat(format(round(stats::rnorm(48), 4)), fill=70, file = zz)
     close(zz)
     file.show("outfile", delete.file=TRUE)

     ## example for a machine running a finger daemon

     con <- socketConnection(port = 79, blocking = TRUE)
     writeLines(paste(system("whoami", intern=TRUE), "\r", sep=""), con)
     gsub(" *$", "", readLines(con))
     close(con)


     ## Not run: 
     ## two R processes communicating via non-blocking sockets
     # R process 1
     con1 <- socketConnection(port = 6011, server=TRUE)
     writeLines(LETTERS, con1)
     close(con1)

     # R process 2
     con2 <- socketConnection(Sys.info()["nodename"], port = 6011)
     # as non-blocking, may need to loop for input
     readLines(con2)
     while(isIncomplete(con2)) {Sys.sleep(1); readLines(con2)}
     close(con2)

     ## examples of use of encodings
     cat(x, file = (con <- file("foo", "w", encoding="UTF-8"))); close(con)
     # read a 'Windows Unicode' file
     A <- read.table(con <- file("students", encoding="UCS-2LE")); close(con)
     ## End(Not run)

