Reverse engineering

Reverse engineering, also called back engineering, is the processes of extracting knowledge or design information from anything man-made and re-producing it or re-producing anything based on the extracted information.[1]:3 The process often involves disassembling something (a mechanical device, electronic component, computer program, or biological, chemical, or organic matter) and analyzing its components and workings in detail.

The reasons and goals for obtaining such information vary widely from everyday or socially beneficial actions, to criminal actions, depending upon the situation. Often no intellectual property rights are breached, such as when a person or business cannot recollect how something was done, or what something does, and needs to reverse engineer it to work it out for themselves. Reverse engineering is also beneficial in crime prevention, where suspected malware is reverse engineered to understand what it does, and how to detect and remove it, and to allow computers and devices to work together ("interoperate") and to allow saved files on obsolete systems to be used in newer systems. By contrast, reverse engineering can also be used to "crack" software and media to remove their copy protection,[1]:5 or to create a (possibly improved) copy or even a knockoff; this is usually the goal of a competitor.[1]:4

Reverse engineering has its origins in the analysis of hardware for commercial or military advantage.[2]:13 However, the reverse engineering process in itself is not concerned with creating a copy or changing the artifact in some way; it is only an analysis in order to deduce design features from products with little or no additional knowledge about the procedures involved in their original production.[2]:15 In some cases, the goal of the reverse engineering process can simply be a redocumentation of legacy systems.[2]:15[3] Even when the product reverse engineered is that of a competitor, the goal may not be to copy them, but to perform competitor analysis.[4] Reverse engineering may also be used to create interoperable products; despite some narrowly tailored US and EU legislation, the legality of using specific reverse engineering techniques for this purpose has been hotly contested in courts worldwide for more than two decades.[5]

Motivation

Reasons for reverse engineering:

Common situations

Reverse engineering of machines

As computer-aided design (CAD) has become more popular, reverse engineering has become a viable method to create a 3D virtual model of an existing physical part for use in 3D CAD, CAM, CAE or other software.[7] The reverse-engineering process involves measuring an object and then reconstructing it as a 3D model. The physical object can be measured using 3D scanning technologies like CMMs, laser scanners, structured light digitizers, or Industrial CT Scanning (computed tomography). The measured data alone, usually represented as a point cloud, lacks topological information and is therefore often processed and modeled into a more usable format such as a triangular-faced mesh, a set of NURBS surfaces, or a CAD model.[8]

Hybrid Modelling is commonly used term when NURBS and Parametric modelling are implemented together. Using a combination of geometric and freeform surfaces can provide a powerful method of 3D modelling. Areas of freeform data can be combined with exact geometric surfaces to create a hybrid model. A typical example of this would be the reverse engineering of a cylinder head, which includes freeform cast features, such as water jackets and high tolerance machined areas.[9]

Reverse engineering is also used by businesses to bring existing physical geometry into digital product development environments, to make a digital 3D record of their own products, or to assess competitors' products. It is used to analyse, for instance, how a product works, what it does, and what components it consists of, estimate costs, and identify potential patent infringement, etc.

Value engineering is a related activity also used by businesses. It involves de-constructing and analysing products, but the objective is to find opportunities for cost cutting.

Reverse engineering of software

The term reverse engineering as applied to software means different things to different people, prompting Chikofsky and Cross to write a paper researching the various uses and defining a taxonomy. From their paper, they state, "Reverse engineering is the process of analyzing a subject system to create representations of the system at a higher level of abstraction."[10] It can also be seen as "going backwards through the development cycle".[11] In this model, the output of the implementation phase (in source code form) is reverse-engineered back to the analysis phase, in an inversion of the traditional waterfall model. Another term for this technique is program comprehension.[3]

Reverse engineering is a process of examination only: the software system under consideration is not modified (which would make it re-engineering). Software anti-tamper technology like obfuscation is used to deter both reverse engineering and re-engineering of proprietary software and software-powered systems. In practice, two main types of reverse engineering emerge. In the first case, source code is already available for the software, but higher-level aspects of the program, perhaps poorly documented or documented but no longer valid, are discovered. In the second case, there is no source code available for the software, and any efforts towards discovering one possible source code for the software are regarded as reverse engineering. This second usage of the term is the one most people are familiar with. Reverse engineering of software can make use of the clean room design technique to avoid copyright infringement.

On a related note, black box testing in software engineering has a lot in common with reverse engineering. The tester usually has the API, but their goals are to find bugs and undocumented features by bashing the product from outside.[12]

Other purposes of reverse engineering include security auditing, removal of copy protection ("cracking"), circumvention of access restrictions often present in consumer electronics, customization of embedded systems (such as engine management systems), in-house repairs or retrofits, enabling of additional features on low-cost "crippled" hardware (such as some graphics card chip-sets), or even mere satisfaction of curiosity.

Binary software

This process is sometimes termed Reverse Code Engineering, or RCE.[13] As an example, decompilation of binaries for the Java platform can be accomplished using Jad. One famous case of reverse engineering was the first non-IBM implementation of the PC BIOS which launched the historic IBM PC compatible industry that has been the overwhelmingly dominant computer hardware platform for many years. Reverse engineering of software is protected in the U.S. by the fair use exception in copyright law.[14] The Samba software, which allows systems that are not running Microsoft Windows systems to share files with systems that are, is a classic example of software reverse engineering,[15] since the Samba project had to reverse-engineer unpublished information about how Windows file sharing worked, so that non-Windows computers could emulate it. The Wine project does the same thing for the Windows API, and OpenOffice.org is one party doing this for the Microsoft Office file formats. The ReactOS project is even more ambitious in its goals, as it strives to provide binary (ABI and API) compatibility with the current Windows OSes of the NT branch, allowing software and drivers written for Windows to run on a clean-room reverse-engineered Free Software (GPL) counterpart. WindowsSCOPE allows for reverse-engineering the full contents of a Windows system's live memory including a binary-level, graphical reverse engineering of all running processes.

Another classic, if not well-known example is that in 1987 Bell Laboratories reverse-engineered the Mac OS System 4.1, originally running on the Apple Macintosh SE, so they could run it on RISC machines of their own.[16]

Binary software techniques

Reverse engineering of software can be accomplished by various methods. The three main groups of software reverse engineering are

  1. Analysis through observation of information exchange, most prevalent in protocol reverse engineering, which involves using bus analyzers and packet sniffers, for example, for accessing a computer bus or computer network connection and revealing the traffic data thereon. Bus or network behavior can then be analyzed to produce a stand-alone implementation that mimics that behavior. This is especially useful for reverse engineering device drivers. Sometimes, reverse engineering on embedded systems is greatly assisted by tools deliberately introduced by the manufacturer, such as JTAG ports or other debugging means. In Microsoft Windows, low-level debuggers such as SoftICE are popular.
  2. Disassembly using a disassembler, meaning the raw machine language of the program is read and understood in its own terms, only with the aid of machine-language mnemonics. This works on any computer program but can take quite some time, especially for someone not used to machine code. The Interactive Disassembler is a particularly popular tool.
  3. Decompilation using a decompiler, a process that tries, with varying results, to recreate the source code in some high-level language for a program only available in machine code or bytecode.

Software classification

Software classification is the process of identifying similarities between different software binaries (for example, two different versions of the same binary) used to detect code relations between software samples. This task was traditionally done manually for several reasons (such as patch analysis for vulnerability detection and copyright infringement) but nowadays can be done somewhat automatically for large numbers of samples.

This method is being used mostly for long and thorough reverse engineering tasks (complete analysis of a complex algorithm or big piece of software). In general, statistical classification is considered to be a hard problem and this is also true for software classification, therefore there aren't many solutions/tools that handle this task well.

Source code

A number of UML tools refer to the process of importing and analysing source code to generate UML diagrams as "reverse engineering". See List of UML tools.

Although UML is one approach to providing "reverse engineering" more recent advances in international standards activities have resulted in the development of the Knowledge Discovery Metamodel (KDM). This standard delivers an ontology for the intermediate (or abstracted) representation of programming language constructs and their interrelationships. An Object Management Group standard (on its way to becoming an ISO standard as well), KDM has started to take hold in industry with the development of tools and analysis environments which can deliver the extraction and analysis of source, binary, and byte code. For source code analysis, KDM's granular standards' architecture enables the extraction of software system flows (data, control, & call maps), architectures, and business layer knowledge (rules, terms, process). The standard enables the use of a common data format (XMI) enabling the correlation of the various layers of system knowledge for either detailed analysis (e.g. root cause, impact) or derived analysis (e.g. business process extraction). Although efforts to represent language constructs can be never-ending given the number of languages, the continuous evolution of software languages and the development of new languages, the standard does allow for the use of extensions to support the broad language set as well as evolution. KDM is compatible with UML, BPMN, RDF and other standards enabling migration into other environments and thus leverage system knowledge for efforts such as software system transformation and enterprise business layer analysis.

Reverse engineering of protocols

Protocols are sets of rules that describe message formats and how messages are exchanged (i.e., the protocol state-machine). Accordingly, the problem of protocol reverse-engineering can be partitioned into two subproblems; message format and state-machine reverse-engineering.

The message formats have traditionally been reverse-engineered through a tedious manual process, which involved analysis of how protocol implementations process messages, but recent research proposed a number of automatic solutions.[17][18][19] Typically, these automatic approaches either group observed messages into clusters using various clustering analyses, or emulate the protocol implementation tracing the message processing.

There has been less work on reverse-engineering of state-machines of protocols. In general, the protocol state-machines can be learned either through a process of offline learning, which passively observes communication and attempts to build the most general state-machine accepting all observed sequences of messages, and online learning, which allows interactive generation of probing sequences of messages and listening to responses to those probing sequences. In general, offline learning of small state-machines is known to be NP-complete,[20] while online learning can be done in polynomial time.[21] An automatic offline approach has been demonstrated by Comparetti et al.[19] and an online approach very recently by Cho et al.[22]

Other components of typical protocols, like encryption and hash functions, can be reverse-engineered automatically as well. Typically, the automatic approaches trace the execution of protocol implementations and try to detect buffers in memory holding unencrypted packets.[23]

Reverse engineering of integrated circuits/smart cards

Reverse engineering is an invasive and destructive form of analyzing a smart card. The attacker grinds away layer after layer of the smart card and takes pictures with an electron microscope. With this technique, it is possible to reveal the complete hardware and software part of the smart card. The major problem for the attacker is to bring everything into the right order to find out how everything works. The makers of the card try to hide keys and operations by mixing up memory positions, for example, bus scrambling.[24][25] In some cases, it is even possible to attach a probe to measure voltages while the smart card is still operational. The makers of the card employ sensors to detect and prevent this attack.[26] This attack is not very common because it requires a large investment in effort and special equipment that is generally only available to large chip manufacturers. Furthermore, the payoff from this attack is low since other security techniques are often employed such as shadow accounts.

Reverse engineering for military applications

Reverse engineering is often used by people in order to copy other nations' technologies, devices, or information that have been obtained by regular troops in the fields or by intelligence operations. It was often used during the Second World War and the Cold War. Well-known examples from WWII and later include:

Overlap with patent law

Reverse engineering applies primarily to gaining understanding of a process or artifact, where the manner of its construction, use, or internal processes is not made clear by its creator.

Patented items do not of themselves have to be reverse-engineered to be studied, since the essence of a patent is that the inventor provides detailed public disclosure themselves, and in return receives legal protection of the invention involved. However, an item produced under one or more patents could also include other technology that is not patented and not disclosed. Indeed, one common motivation of reverse engineering is to determine whether a competitor's product contains patent infringements or copyright infringements.

Legality

United States

In the United States even if an artifact or process is protected by trade secrets, reverse-engineering the artifact or process is often lawful as long as it has been legitimately obtained.[29]

Reverse engineering of computer software in the US often falls under both contract law as a breach of contract as well as any other relevant laws. This is because most EULA's (end user license agreement) specifically prohibit it, and U.S. courts have ruled that if such terms are present, they override the copyright law which expressly permits it (see Bowers v. Baystate Technologies[30][31]).

Sec. 103(f) of the DMCA (17 U.S.C. § 1201 (f)) says that a person who is in legal possession of a program, is permitted to reverse-engineer and circumvent its protection if this is necessary in order to achieve "interoperability" - a term broadly covering other devices and programs being able to interact with it, make use of it, and to use and transfer data to and from it, in useful ways. A limited exemption exists that allows the knowledge thus gained to be shared and used for interoperability purposes. The section states:

(f) Reverse Engineering.—

(1) Notwithstanding the provisions of subsection (a)(1)(A), a person who has lawfully obtained the right to use a copy of a computer program may circumvent a technological measure that effectively controls access to a particular portion of that program for the sole purpose of identifying and analyzing those elements of the program that are necessary to achieve interoperability of an independently created computer program with other programs, and that have not previously been readily available to the person engaging in the circumvention, to the extent any such acts of identification and analysis do not constitute infringement under this title.

(2) Notwithstanding the provisions of subsections (a)(2) and (b), a person may develop and employ technological means to circumvent a technological measure, or to circumvent protection afforded by a technological measure, in order to enable the identification and analysis under paragraph (1), or for the purpose of enabling interoperability of an independently created computer program with other programs, if such means are necessary to achieve such interoperability, to the extent that doing so does not constitute infringement under this title.

(3) The information acquired through the acts permitted under paragraph (1), and the means permitted under paragraph (2), may be made available to others if the person referred to in paragraph (1) or (2), as the case may be, provides such information or means solely for the purpose of enabling interoperability of an independently created computer program with other programs, and to the extent that doing so does not constitute infringement under this title or violate applicable law other than this section.

(4) For purposes of this subsection, the term 「interoperability」 means the ability of computer programs to exchange information, and of such programs mutually to use the information which has been exchanged.

European Union

EU Directive 2009/24, on the legal protection of computer programs, governs reverse engineering in the European Union. The directive states:[32]

(15) The unauthorised reproduction, translation, adaptation or transformation of the form of the code in which a copy of a computer program has been made available constitutes an infringement of the exclusive rights of the author. Nevertheless, circumstances may exist when such a reproduction of the code and translation of its form are indispensable to obtain the necessary information to achieve the interoperability of an independently created program with other programs. It has therefore to be considered that, in these limited circumstances only, performance of the acts of reproduction and translation by or on behalf of a person having a right to use a copy of the program is legitimate and compatible with fair practice and must therefore be deemed not to require the authorisation of the rightholder. An objective of this exception is to make it possible to connect all components of a computer system, including those of different manufacturers, so that they can work together. Such an exception to the author's exclusive rights may not be used in a way which prejudices the legitimate interests of the rightholder or which conflicts with a normal exploitation of the program.

This superseded an earlier 1991 Directive.[33]

See also

References

  1. 1 2 3 Eilam, Eldad (2005). Reversing: secrets of reverse engineering. John Wiley & Sons. ISBN 978-0-7645-7481-8.
  2. 1 2 3 Chikofsky, E. J. & Cross, J. H., II (1990). "Reverse Engineering and Design Recovery: A Taxonomy". IEEE Software. 7 (1): 13–17. doi:10.1109/52.43044.
  3. 1 2 A Survey of Reverse Engineering and Program Comprehension. Michael L. Nelson, April 19, 1996, ODU CS 551 – Software Engineering Survey.arXiv:cs/0503068v1
  4. Vinesh Raja; Kiran J. Fernandes (2007). Reverse Engineering: An Industrial Perspective. Springer Science & Business Media. p. 3. ISBN 978-1-84628-856-2.
  5. Jonathan Band; Masanobu Katoh (2011). Interfaces on Trial 2.0. MIT Press. p. 136,. ISBN 978-0-262-29446-1.
  6. Internet Engineering Task Force RFC 2828 Internet Security Glossary
  7. Varady, T; Martin, R; Cox, J (1997). "Reverse engineering of geometric models–an introduction" (PDF). Computer-Aided Design. 29 (4): 255–268. doi:10.1016/S0010-4485(96)00054-1.
  8. "Haman Engineering Solutions".
  9. "Reverse Engineering".
  10. Chikofsky, E. J.; Cross, J. H. (January 1990). "Reverse engineering and design recovery: A taxonomy" (PDF). IEEE Software. 7: 13–17. doi:10.1109/52.43044.
  11. Warden, R. (1992). Software Reuse and Reverse Engineering in Practice. London, England: Chapman & Hall. pp. 283–305.
  12. Shahbaz, Muzammil (2012). Reverse Engineering and Testing of Black-Box Software Components: by Grammatical Inference techniques. LAP LAMBERT Academic Publishing. ISBN 978-3659140730.
  13. Chuvakin, Anton; Cyrus Peikari (January 2004). Security Warrior (1st ed.). O'Reilly.
  14. Samuelson, Pamela & Scotchmer, Suzanne (2002). "The Law and Economics of Reverse Engineering". Yale Law Journal. 111 (7): 1575–1663. doi:10.2307/797533. JSTOR 797533.
  15. "Samba: An Introduction". 2001-11-27. Retrieved 2009-05-07.
  16. Lee, Newton (2013). Counterterrorism and Cybersecurity: Total Information Awareness (2nd Edition). Springer Science+Business Media. p. 110.
  17. W. Cui, J. Kannan, and H. J. Wang. Discoverer: Automatic protocol reverse engineering from network traces. In Proceedings of 16th USENIX Security Symposium on USENIX Security Symposium, pp. 1–14.
  18. W. Cui, M. Peinado, K. Chen, H. J. Wang, and L. Irún-Briz. Tupni: Automatic reverse engineering of input formats. In Proceedings of the 15th ACM Conference on Computer and Communications Security, pp. 391–402. ACM, Oct 2008.
  19. 1 2 P. M. Comparetti, G. Wondracek, C. Kruegel, and E. Kirda. Prospex: Protocol specification extraction. In Proceedings of the 2009 30th IEEE Symposium on Security and Privacy, pp. 110–125, Washington, 2009. IEEE Computer Society.
  20. Gold, E (1978). "Complexity of automaton identification from given data". Information and Control. 37 (3): 302–320. doi:10.1016/S0019-9958(78)90562-4.
  21. D. Angluin (1987). "Learning regular sets from queries and counterexamples". Information and Computation. 75 (2): 87–106. doi:10.1016/0890-5401(87)90052-6.
  22. C.Y. Cho, D. Babic, R. Shin, and D. Song. Inference and Analysis of Formal Models of Botnet Command and Control Protocols, 2010 ACM Conference on Computer and Communications Security.
  23. Polyglot: automatic extraction of protocol message format using dynamic binary analysis. J. Caballero, H. Yin, Z. Liang, and D. Song. Proceedings of the 14th ACM conference on Computer and communications security, p. 317-329.
  24. Wolfgang Rankl, Wolfgang Effing, Smart Card Handbook (2004)
  25. T. Welz: Smart cards as methods for payment (2008), Seminar ITS-Security Ruhr-Universität Bochum
  26. David C. Musker: Protecting & Exploiting Intellectual Property in Electronics, IBC Conferences, 10 June 1998
  27. "Redstone rocket". centennialofflight.net. Retrieved 2010-04-27.
  28. "The Chinese Air Force: Evolving Concepts, Roles, and Capabilities", Center for the Study of Chinese Military Affairs (U.S), by National Defense University Press, pg. 277
  29. "Trade Secrets 101," Feature Article, March 2011. ASME. Retrieved on 2013-10-31.
  30. Baystate v. Bowers Discussion. Utsystem.edu. Retrieved on 2011-05-29.
  31. Gross, Grant. (2003-06-26) Contract case could hurt reverse engineering | Developer World. InfoWorld. Retrieved on 2011-05-29.
  32. DIRECTIVE 2009/24/EC OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 23 April 2009 on the legal protection of computer programs
  33. Council Directive 91/250/EEC of 14 May 1991 on the legal protection of computer programs. Eur-lex.europa.eu. Retrieved on 2011-05-29.

Further reading

This article is issued from Wikipedia - version of the 11/30/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.