Plain text files vs. word processor files
- A plain text file uses a character encoding such as UTF-8 or ASCII to represent numbers, letters, and symbols. The only non-printing characters in the file that can be used to format the text are newline, tab, and formfeed. Plain text files are often displayed using a monospace font so horizontal alignment and columnar formatting is sometimes done using space characters.
- Word processor documents are generally stored in a binary format to allow for localization and formatted text, such as boldface, italics and multiple fonts, and to be structured into columns and tables.
Word processors were developed to allow formatting of text for presentation on a printed page, while text produced by text editors is generally used for other purposes, such as input data for a computer program.
When both formats are available, the user must select with care. Saving a plain text file in a word-processor format adds formatting information that can make the text unreadable by a program that expects plain text. Conversely, saving a word-processor document as plain text removes any formatting information.
Before text editors existed, computer text was punched into cards with keypunch machines. Physical boxes of these thin cardboard cards were then inserted into a card-reader. Magnetic tape and disk "card-image" files created from such card decks often had no line-separation characters at all, and assumed fixed-length 80-character records. An alternative to cards was punched paper tape. It could be created by some teleprinters (such as the Teletype), which used special characters to indicate ends of records.
The first text editors were "line editors" oriented to teleprinter- or typewriter-style terminals without displays. Commands (often a single keystroke) effected edits to a file at an imaginary insertion point called the "cursor". Edits were verified by typing a command to print a small section of the file, and periodically by printing the entire file. In some line editors, the cursor could be moved by commands that specified the line number in the file, text strings (context) for which to search, and eventually regular expressions. Line editors were major improvements over keypunching. Some line editors could be used by keypunch; editing commands could be taken from a deck of cards and applied to a specified file. Some common line editors supported a "verify" mode in which change commands displayed the altered lines.
When computer terminals with video screens became available, screen-based text editors (sometimes called just "screen editors") became common. One of the earliest full-screen editors was O26, which was written for the operator console of the CDC 6000 series computers in 1967. Another early full-screen editor was vi. Written in the 1970s, it is still a standard editor on Unix and Linux operating systems. Emacs, one of the first open source and free software projects, is another early full-screen or real-time editor, one that was ported to many systems. A full-screen editor's ease-of-use and speed (compared to the line-based editors) motivated many early purchases of video terminals.
The core data structure in a text editor is the one that manages the string (sequence of characters) or list of records that represents the current state of the file being edited. While the former could be stored in a single long consecutive array of characters, the desire for text editors that could more quickly insert text, delete text, and undo/redo previous edits led to the development of more complicated sequence data structures. A typical text editor uses a gap buffer, a linked list of lines (as in PaperClip), a piece table, or a rope, as its sequence data structure.
Types of text editors
Some text editors are small and simple, while others offer broad and complex functions. For example, Unix and Unix-like operating systems have the pico editor (or a variant), but many also include the vi and Emacs editors. Microsoft Windows systems come with the simple Notepad, though many people—especially programmers—prefer other editors with more features. Under Apple Macintosh's classic Mac OS there was the native SimpleText, which was replaced in Mac OS X by TextEdit, which combines features of a text editor with those typical of a word processor such as rulers, margins and multiple font selection. These features are not available simultaneously, but must be switched by user command, or through the program automatically determining the file type.
Most word processors can read and write files in plain text format, allowing them to open files saved from text editors. Saving these files from a word processor, however, requires ensuring the file is written in plain text format, and that any text encoding or BOM settings won't obscure the file for its intended use. Non-WYSIWYG word processors, such as WordStar, are more easily pressed into service as text editors, and in fact were commonly used as such during the 1980s. The default file format of these word processors often resembles a markup language, with the basic format being plain text and visual formatting achieved using non-printing control characters or escape sequences. Later word processors like Microsoft Word store their files in a binary format and are almost never used to edit plain text files.
Some text editors can edit unusually large files such as log files or an entire database placed in a single file. Simpler text editors may just read files into the computer's main memory. With larger files, this may be a slow process, and the entire file may not fit. Some text editors do not let the user start editing until this read-in is complete. Editing performance also often suffers in nonspecialized editors, with the editor taking seconds or even minutes to respond to keystrokes or navigation commands. By only storing the visible portion of large files in memory, editing performance improves.
Some editors are programmable, meaning they can be customized for specific uses. One motive for customizing is to make a text editor use the commands of another text editor with which the user is more familiar, or to duplicate missing functionality the user has come to depend on. Software developers often use editor customizations tailored to the programming language or development environment they are working in. The programmability of some text editors is limited to enhancing the core editing functionality of the program, but Emacs can be extended far beyond editing text files—for web browsing, reading email, online chat, managing files or playing games. Emacs can even emulate Vi, its rival in the traditional editor wars of Unix culture.
An important group of programmable editors uses REXX as a scripting language. These "orthodox editors" contain a "command line" into which commands and macros can be typed and text lines into which line commands and macros can be typed. Most such editors are derivatives of ISPF/PDF EDIT or of XEDIT, IBM's flagship editor for VM/SP through z/VM. Among them are THE, KEDIT, X2, Uni-edit, and SEDIT.
A text editor written or customized for a specific use can determine what the user is editing and assist the user, often by completing programming terms and showing tooltips with relevant documentation. Many text editors for software developers include source code syntax highlighting and automatic indentation to make programs easier to read and write. Programming editors often let the user select the name of an include file, function or variable, then jump to its definition. Some also allow for easy navigation back to the original section of code by storing the initial cursor location or by displaying the requested definition in a popup window or temporary buffer. Some editors implement this ability themselves, but often an auxiliary utility like ctags is used to locate the definitions.
- Find and replace – Text editors provide extensive facilities for searching and replacing text, either on groups of files or interactively. Advanced editors can use regular expressions to search and edit text or code.
- Cut, copy, and paste – most text editors provide methods to duplicate and move text within the file, or between files.
- Ability to handle UTF-8 encoded text.
- Text formatting – Text editors often provide basic formatting features like line wrap, auto-indentation, bullet list formatting using ASCII characters, comment formatting, syntax highlighting and so on.
- Undo and redo – As with word processors, text editors provide a way to undo and redo the last edit. Often—especially with older text editors—there is only one level of edit history remembered and successively issuing the undo command will only "toggle" the last change. Modern or more complex editors usually provide a multiple level history such that issuing the undo command repeatedly will revert the document to successively older edits. A separate redo command will cycle the edits "forward" toward the most recent changes. The number of changes remembered depends upon the editor and is often configurable by the user.
- Data transformation – Reading or merging the contents of another text file into the file currently being edited. Some text editors provide a way to insert the output of a command issued to the operating system's shell.
- Filtering – Some advanced text editors allow the editor to send all or sections of the file being edited to another utility and read the result back into the file in place of the lines being "filtered". This, for example, is useful for sorting a series of lines alphabetically or numerically, doing mathematical computations, indenting source code, and so on.
- Syntax highlighting – contextually highlights source code, markup languages, config files and other text that appears in an organized or predictable format. Editors generally allow users to customize the colors or styles used for each language element. Some text editors also allow users to install and use themes to change the look and feel of the editor's entire user interface.
- Extensibility - a text editor intended for use by programmers must provide some plugin mechanism, or be scriptable, so a programmer can customize the editor with features needed to manage individual software projects, customize functionality or key bindings for specific programming languages or version control systems, or conform to specific coding styles.
Some editors include special features and extra functions, for instance,
- Source code editors are text editors with additional functionality to facilitate the production of source code. These often feature user-programmable syntax highlighting and code navigation functions as well as coding tools or keyboard macros similar to an HTML editor (see below).
- Folding editors. This subclass includes so-called "orthodox editors" that are derivatives of Xedit. Editors that implement folding without programing-specific features are usually called outliners (see below).
- IDEs (integrated development environments) are designed to manage and streamline large programming projects. They are usually only used for programming as they contain many features unnecessary for simple text editing.
- World Wide Web authors are offered a variety of HTML editors dedicated to the task of creating web pages. These include: Dreamweaver, KompoZer and E Text Editor. Many offer the option of viewing a work in progress on a built-in HTML rendering engine or standard web browser. Most web development is done in a dynamic programming language such as Ruby or PHP using a source code editor or IDE. The HTML delivered by all but the simplest static web sites is stored as individual template files that are assembled by the software controlling the site and do not compose a complete HTML document.
- Mathematicians, physicists, and computer scientists often produce articles and books using TeX or LaTeX in plain text files. Such documents are often produced by a standard text editor, but some people use specialized TeX editors.
- Outliners. Also called tree-based editors, because they combine a hierarchical outline tree with a text editor. Folding (see above) can be considered a specialized form of outlining.
- Collaborative editors allow multiple users to work on the same document simultaneously from remote locations over a network. The changes made by individual users are tracked and merged into the document automatically to eliminate the possibility of conflicting edits. These editors also typically include an online chat component for discussion among editors.
- Simultaneous editing is a technique in End-user development research to edit all items in a multiple selection. It allows the user to manipulate all the selected items at once through direct manipulation. The Lapis text editor and the multi edit plugin for gedit are examples of this technique. The Lapis editor can also create an automatic multiple selection based on an example item.
- Distraction-free editors provide a minimalistic interface with the purpose of isolating the writer from the rest of the applications and operating system, thus being able to focus on the writing without distractions from interface elements like a toolbar or notification area.
Programmable editors can usually be enhanced to perform any or all of these functions, but simpler editors focus on just one, or, like gPHPedit, are targeted at a single programming language.
- List of text editors
- Comparison of text editors
- Editor war
- File viewer – does not change file, faster for very large files
- Hex editor – used for editing binary files
- Stream editor – used for non-interactive editing
- Originally macros were written in assembler, CLIST (TSO), CMS EXEC (VM), EXEC2 (VM/SE) or PL/I, but most users dropped CLIST, EXEC and EXEC2 once REXX was available.
- A line command is a command typed into the sequence number entry area associated with a specific line of text and whose scope is limited to that line, or, in the case of a block command, associated with the block of lines between the beginning and ending line commands. An example of the latter would be typing the command ucc (block upper case) into the entry areas of two lines; this has the same effect as typing uc (upper case) into the entry area of each line in the range.
- H. Albert Napier; Ollie N. Rivers; Stuart Wagner (2005). Creating a Winning E-Business. Cengage Learning. p. 330. ISBN 1111796092.
- Peter Norton; Scott H. Clark (2002). Peter Norton's New Inside the PC. Sams Publishing. p. 54. ISBN 0672322897.
- L. Gopalakrishnan; G. Padmanabhan; Sudhat Shukla (2003). Your Home PC: Making The Most Of Your Personal Computer. Tata McGraw-Hill Education. p. 190. ISBN 0070473544.
- "The Best Free Text Editors for Windows, Linux, and Mac".
Every operating system comes with a default, basic text editor, but most of us install our own enhanced text editors to get more features.
- "The Open Group Base Specifications Issue 6, IEEE Std 1003.1, 2004 Edition". The IEEE and The Open Group. 2004. Retrieved January 18, 2010.
- "Introducing the Emacs editing environment".
- "Multics Emacs: The History, Design and Implementation".
Some Multics users purchased these terminals ..., using them either as "glass teletypes" or via "local editing."
- Charles Crowley. "Data Structures for Text Sequences". Section "Introduction".
- "Text Editors for Programmeres - Programming Tools".
If you open a .doc file in a text editor, you will notice that most of the file is formatting codes. Text editors, however, do not add formatting codes, which makes it easier to compile your code.
- "From Vim to Emacs+Evil chaotic migration guide".
- "Gitorious". Retrieved 27 May 2015.
- "LAPIS: Smart Editing with Text Structure". Retrieved 27 May 2015.
- "Lightweight Structured Text Processing". Retrieved 27 May 2015.
- New gedit plugin: multi edit, and a demo video.
- Orthodox Editors as a Special Class of Advanced Editors, discusses Xedit and its clones with an emphasis of folding capabilities and programmability