How Does Linux and Open Source Software Fit Into Enterprise Computing?

In a previous article, I discussed the questions: What is an enterprise? What is open source software? What is Linux? I then explained the unique requirements of enterprise computing and why mainframes are so important. Here is a summary.

An enterprise is an organization that is:

1. Of immense size, widely distributed geographically.

2. Organized into divisions and managed by a large hierarchy, with a complex IT (Information Technology) infrastructure maintained by a combination of human workers and automated systems.

3. Dependent on software to manipulate and store massive amounts of data.

4. Dependent on hardware consisting of large, complicated, interconnecting systems that cannot be allowed to fail, degrade, interfere with one another, or run inefficiently.

Open source software refers to programs for which the source code is freely available. A significant amount of open source software is available at no cost, with liberal licensing terms that permit modifications by anyone. Linux refers to any operating system based on the Linux kernel. All Linux systems are open source, and many of them are available for free.

Enterprise computing is carried out within a very complex — and expensive — environment. The priorities are total reliability, stability, fault tolerance, security and efficiency. At the same time, enterprise systems require immense amounts of computing power, I/O throughput, data storage and network connectivity. Enterprises tend to build their IT facilities around mainframe computers, because they are uniquely capable of consolidating the data processing needs of a huge organization, while simultaneously supporting up to tens of thousands of applications and users efficiently.

Let us continue our discussion by asking: What role does Linux and open source software have to play in enterprise computing?
Note: This article was written for Technical Support magazine, a publication of the Network and Systems Professionals Association (NaSPA).
It has been a long time since users and applications have interacted directly with computing hardware. Instead, we and our programs use user-friendly and application-friendly interfaces that hide the underlying details, a process referred to as "virtualization". For example, virtual memory allows programs to access more memory than actually exists. The ultimate virtualization is to create a virtual machine: an entire system that can be devoted exclusively to a single user or to a single application. A virtual machine may or may not be the same type of computer as the underlying hardware. The important point is that the machine, as viewed by the user or by the program, doesn't really exist. It is created by a sophisticated combination of hardware and system software. To understand why such systems are so important, we need to take a moment to go back in time and see how the idea of virtualization arose.

The first true virtual machine facility was created in 1967 at the IBM Cambridge Scientific Center (CSC) in Massachusetts. The CSC programmers, working closely with researchers from MIT, wrote an operating system called CP that ran on the IBM System/360 model 40. (CP stood for "Control Program".) CP was the first operating system able to emulate a virtual machine that supported all the hardware characteristics of the target computer, in this case, itself. A System/360 model 40 running CP could emulate up to 14 virtual machines that, for practical purposes, were clones of itself. Indeed, the emulation was so good that it was possible to use a virtual machine to install, test, and use other System/360 operating systems.

The project to develop CP was part of a small but important effort within IBM to create a true time-sharing system. As part of this project, the CSC programmers developed a new interactive, single-user operating system called CMS to run under CP. (CMS stood for Cambridge Monitor System.) CMS ran in its own virtual machine, essentially giving each user his or her own System/360 model 40 computer. (Take a moment to think about what an incredible achievement that must have been in 1967.)

A year earlier, in 1966, IBM had begun to ship a brand new computer designed specifically to support time sharing, the System/360 model 67. The actual time-sharing facility was provided by a new operating system called TSS (Time Sharing System). The CSC researchers liked the model 67, but not TSS, so they completely rewrote and enhanced CP to run on the new machine. They then ported CMS to the new system. Although the official IBM party line was to lease the model 67 with TSS, the combination proved disappointing and unpopular. One reason was poor performance. Another reason was that most of IBM was still oriented to traditional "batch processing", not time sharing. When users found out how well CP/CMS and the virtual machine facility ran on the model 67, they started to demand it instead of TSS.

In 1968, IBM responded to the pressure by releasing CP/CMS as "Type-III" software, IBM's label for programs "contributed" by IBM employees or customers. The source code was distributed for free, with no promise of support and no acceptance of liability. As far as IBM was concerned, any organization could run Type-III software but, when it came to fixing bugs and enhancing the product, users were on their own. The users, however, liked virtualization, and they rose to the challenge. Because IBM offered no support, the user community supported CP and CMS themselves.
They worked hard, studying the source code and modifying it significantly. Many of the enhancements were shared widely, and some of them found their way back into the core products. The result was the creation of the first significant open-source community, long before the Internet came into wide use and more than two decades before Linus Torvalds began the Linux project.

In 1970, IBM announced the System/370, a brand new family of computers designed to replace the System/360. In 1972, they announced a number of important enhancements to the 370 line. Among these was a brand new operating system called VM/370, a much-enhanced, completely rewritten version of CP. The virtual machine facility was now a very important part of mainstream IBM, and CMS was rechristened the Conversational Monitor System (surely one of the strangest names in operating system history). CMS soon became the principal interactive interface for most enterprise users, furnishing every user with his or her own (virtual) computer. In this way, five years before the Apple II and almost a decade before the introduction of the IBM PC, enterprise users became the first large group of people in history to have their own personal computers.

Since then, IBM has continually enhanced their mainframe offerings, resulting in a series of new architectures. Following System/360 (1964) and System/370 (1970), IBM developed ESA/370 (1988), ESA/390 (1990) and the current z/Architecture (2000). With each new architecture, the virtual machine facility has grown in importance until, today, many people consider the most important use of a mainframe to be running virtual machines.

Since 1986, IBM has supported Unix — in the form of AIX — on many of their computers including, starting in 1990, mainframes. However, it was only in 2000 that Linux was introduced to mainframes. Since then, it has become clear that a powerful computer supporting virtual machines running Linux is an excellent platform on which to support an enterprise. As a result, the world of enterprise computing — which pioneered the concepts of the open-source community and personal computing — has now become the incubator of the ultimate platform for Linux and open source software: virtual machines running on mainframe computers.
There are many ways to integrate open source software into enterprise computing. However, the most fascinating configuration to contemplate is running Linux on a mainframe computer using a virtual machine: the sheer power of the mainframe, coupled with the popularity and ubiquity of Linux, combine to create a compelling picture. To illustrate what I mean, let me take a moment to describe one of the most awesome experiments in computing history.

The experiment was conducted in late 1999 by a talented systems design engineer named David Boyes. At the time, the economy was nearing the end of the late-1990s tech boom, and a large, East Coast telecommunications company was planning to develop a full-service, high-availability Internet service. They expected to start with about 250 customers, but wanted a system that could scale up to support 1,000 customers at the same time.

The company hired a very expensive, national consulting firm to design a plan. The firm proposed a solution that would use three Sun servers per customer. The initial deployment would require 750 (250 x 3) separate computers, along with the associated racks, floor space, power, software, cables, routers and switches, not to mention a great deal of engineering and management time. The total three-year cost of the system was estimated at 44 million dollars. Moreover, the price would go up proportionally as the service grew to handle 1,000 customers.

In spite of the cost, the company's executives were prepared to implement the proposed plan. However, before making a final decision, they wanted a second opinion. For this they called in Boyes, a telecommunications and network engineer with a great deal of mainframe experience. After a careful review, Boyes suggested building a new system around an IBM System/390 mainframe, running Linux and other open source tools under VM. Boyes' calculations showed that the price of the new mainframe, along with associated costs, would be only two to three million dollars. It looked good but, at the time, this was leading-edge technology, and the executives were reluctant to commit so much money without seeing a working prototype.

In the fall of 1999, Boyes set about creating a test system using the unused capacity on one of the company's mid-level mainframes, a System/390 9672-G5. Specifically, Boyes was given an LPAR (logical partition) that contained two CPUs, 128 MB of memory, and a large, fast EMC disk unit. His goal was to simulate the eventual production system by creating a test configuration to serve Web pages upon demand.

By the beginning of 2000, Boyes was ready. He had created four different Linux configurations to act as templates, each running in its own virtual machine. In the parlance of the mainframe, such virtual systems are called "images" or "instances". Each of the four Linux images was configured to provide a particular service:
• Network router
• File server
• Web server
• Test client (load generator)
All the images used the same version of Linux and were built using open source tools and applications. To clone more images, Boyes wrote a few short REXX execs (scripts). Once this was done, he was able to create and configure a new working image in about 90 seconds, using only two commands. For external network connectivity, he used the VM TCP/IP communications stack that came with the mainframe.

To run the test, Boyes created a Web server farm built from multiple Linux images, each running in its own virtual machine. A single test load generator would simulate one "customer" by sending out 150 Web client requests per minute. To service the requests, each "customer" was given three dedicated virtual systems of its own: a network router, a file server and a Web server. To measure performance, the test load generator would record the number of seconds it took to receive each Web page.

To start, Boyes wanted to see if his configuration could meet the client's initial requirement of 250 customers. Doing so would require 1,000 Linux systems (4 per customer) to run simultaneously on a single mainframe. The test — which he named Test Plan Able — was successful. There were no major problems, and the response time was satisfactory.

Boyes then embarked on the next phase, a more ambitious configuration he called Test Plan Baker. He wanted to see if the system could support the secondary requirement of being able to scale up to 1,000 customers. Slowly, he expanded the number of virtual machines. Within a few days, he found himself running 10,000 Linux images at the same time. Boyes had proved his point: a suitably configured mainframe, running Linux and open source software in multiple virtual machines, could handle more than 1,000 customers at the same time. His client was convinced and (incidentally) IBM was astonished.

However, Boyes was not yet satisfied. He wanted to see how far he could go. He pushed ahead with one more configuration: Test Plan Charlie. The goal of Test Plan Charlie was to see what it would take to bring the machine to its knees. Slowly, Boyes started to increase the number of Linux images. By the end of the second week in January (2000), he had found the limit: using only part of a mid-level IBM mainframe, Boyes was able to run 41,400 working Linux images at the same time before the machine slowed to a halt.

After Test Plan Charlie, Boyes' client was satisfied. The telecommunications company adopted the plan and went into the Internet hosting business. At their peak, they ran more than 8,000 virtual Linux machines on a single mainframe at the same time. However, this was the early 2000s, and the tech bubble had burst. Within a short time, business fell off significantly, and the company discontinued the service.

For David Boyes, Test Plan Charlie had been an unparalleled success. But was this really the end? Comparatively speaking, Boyes did not have that much computing and I/O power at his disposal. Although he had proved his point, he couldn't help but wonder: what could he do with a really powerful computer? In April 2000, three months after his initial tests, Boyes got a chance to find out. A large hosting provider had a System/390 9672-ZZ7 system temporarily available. At the time, it was the most powerful of the standard mainframes, and Boyes was given four hours in which he could use the machine any way he wanted. In a short time, he had ported all the tools from the previous test.
He then embarked on what would come to be known as Test Plan Omega: a second attempt to push an enterprise computer as far as it could go. This time, however, Boyes had more computing power at his disposal than virtually any single person in history. Starting from the endpoint of Test Plan Charlie, Boyes began to create more and more Linux images, each one a fully functional open source system running continuously in its own virtual machine. Within hours, Boyes had his answer: the most powerful mainframe in the world was able to run 97,427 simultaneous Linux images before grinding to a halt. In a few short hours, David Boyes had pushed virtualization to a new limit, and enterprise computing had — in a most convincing and astonishing way — been irrevocably wed to Linux and open source software.
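Before moving on, it may help to make the test setup a little more concrete. Boyes' actual scripts are not reproduced here, so what follows is only a rough sketch, in modern Python, of the kind of single-"customer" load generator described above: it issues about 150 Web requests per minute and records how many seconds each page takes to arrive. The target URL, the rate-keeping details, and the error handling are illustrative assumptions, not a reconstruction of his code.

```python
# A minimal sketch (not Boyes' actual tooling) of a test load generator that
# plays the role of one "customer": roughly 150 Web requests per minute against
# that customer's Web server image, timing how long each page takes to arrive.

import time
import urllib.request

TARGET_URL = "http://customer-001.example.test/index.html"  # hypothetical test server
REQUESTS_PER_MINUTE = 150
INTERVAL = 60.0 / REQUESTS_PER_MINUTE  # one request every 0.4 seconds

def fetch_once(url: str) -> float:
    """Request one page and return the elapsed time in seconds."""
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()  # pull the whole page, as a real Web client would
    return time.monotonic() - start

def run(duration_minutes: float = 1.0) -> None:
    timings = []
    deadline = time.monotonic() + duration_minutes * 60
    while time.monotonic() < deadline:
        slot_end = time.monotonic() + INTERVAL
        try:
            timings.append(fetch_once(TARGET_URL))
        except OSError as err:  # network failures are recorded, not fatal
            print("request failed:", err)
        # Sleep away whatever is left of this request's time slot to hold the rate.
        time.sleep(max(0.0, slot_end - time.monotonic()))
    if timings:
        print(f"{len(timings)} pages, average {sum(timings) / len(timings):.3f} s, "
              f"worst {max(timings):.3f} s")

if __name__ == "__main__":
    run()
```

In the actual tests, one such client ran for each simulated customer, driving that customer's dedicated router, file server and Web server images; 250 customers therefore meant 1,000 Linux images running at once.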
Consider the responsibilities of an enterprise IT department. By their nature, enterprises require very large amounts of computing power, I/O throughput, data storage, and network connectivity. At the same time, there must be a high degree of stability and load balancing, along with guaranteed performance and efficient maintenance (with no downtime!).

At the most granular level, the IT department must plan for and support thousands to hundreds of thousands of individual users. Each of these users must have access to the organization's databases, its networks, and the Internet. The users also need a large variety of applications: company-wide tools, such as email, word processing, spreadsheets, and presentation programs, for everyone; and specialist tools, such as image editing and Web design programs for, say, the marketing department. Moreover, users expect their computers to always work and their data to always be backed up seamlessly, with no extra effort on their part.

Sitting under all of this are the computers and the operating systems. Every computer (real or virtual) needs an operating system, and all of those operating systems must be maintained, patched, and — from time to time — upgraded. Moreover, all the hardware must also be maintained and upgraded. In principle, nothing should ever be allowed to fail: every computer must be up and available at all times.

Now consider the needs of the systems behind the scenes. On the software side, the IT department must maintain the applications that operate silently in the background. At the back end, there are large distributed databases and (possibly) transaction systems that form the backbone of the enterprise. These programs must run on mainframes or, at the very least, on very large servers. One level down is the middleware that allows programs to interact across a network. Then there is all the networking, data storage, and I/O required by both the back end and the middleware.

At the highest level, the enterprise as a whole must balance the needs for security and privacy against a certain amount of openness, sharing, and access to the outside world. Although some of this is achieved by the use of policy and rules, much of the work is done by software, again the purview of the IT department.

Finally, no enterprise is static. As the world changes, so do the needs of the organization. The IT department must not only take care of current needs, it must continually be developing, testing, and deploying new tools, and it must do so without disrupting the status quo. So how does open source software fit in? Carefully, and here is why.
A great deal of commercial and custom software is written specifically for enterprises. Such applications are created to fit into very large-scale environments and to be managed by an IT department. Open source software is not designed in this way. Although it is free and often of high quality, almost all open source software is written for personal workstations and servers, perhaps connected to a local area network. For this reason, an IT department must integrate open source software into the enterprise, a slow and thoughtful process with special consideration given to the overall, long-term needs of the organization. This requires a thorough understanding of the relevant industry standards, as well as expertise with interoperability tools. At times, it also requires a crystal ball, particularly when it comes to money, resources, and future technology. With this in mind, let me describe the ways in which open source software can be used within an enterprise.

First, at the personal level, there are a great many open source tools that can be used effectively. For example, for regular users, OpenOffice is compatible with — and can replace — every major office productivity suite, including Microsoft Office; Firefox can replace Internet Explorer; GIMP (the GNU Image Manipulation Program) can replace Adobe Photoshop; and so on. For programmers, the world of open source software is even richer. Virtually all programmers can find a wealth of high-quality open source development tools and libraries to help with their work. When new programmers are hired, it is common to find that they are already familiar with Linux and open source tools, and they will look for them in their work environment. Indeed, if they aren't given the tools they want, most programmers will find a way to download and install them for themselves. (This is especially true for programmers who come from a university environment, where open source software is used widely in order to reduce licensing costs.)

At the middleware level, the situation is different. Middleware is used to "glue together" a variety of disparate applications. Such tools are used to bridge the gaps between mainframe and client/server applications, and to provide dependable communication across heterogeneous platforms. Compared to the world of personal applications, most middleware is still proprietary. Although there are notable open source alternatives, they are, for the most part, not well-developed enough to scale to the level required by an enterprise.

When it comes to servers, enterprises often use both open source and proprietary software. The exact choice depends on the application and the scale of the requirements. For example, for Web services, the most widely used software is, by far, the open-source Apache Web server. (In fact, worldwide, it has been the most popular Web server since 1996.) In addition, there are over 40 other related open source products supported by the Apache Software Foundation, many of which have found their way into enterprise computing. However, many open source servers that work just fine on a smaller scale do not scale well enough and are not robust enough for an enterprise. For example, the most popular open source database server is MySQL, based on the industry-standard SQL (Structured Query Language), developed at IBM in the early 1970s. There is even a special version, MySQL Enterprise Server, specifically designed for large-scale data warehousing, OLTP (online transaction processing) and e-commerce.
Nevertheless, the vast majority of enterprises depend on very sophisticated mainframe-based tools for OLTP and data management, and that is not going to change in the foreseeable future. These tools are the end product of decades of development, and very large companies have no interest in abandoning them. For example, virtually all Fortune 500 companies use IBM mainframes running CICS (Customer Information Control System) for OLTP, and DB2 for data management. The likelihood that any of these companies will change to MySQL just because it is open source is remote.

At the most basic level, there are the operating systems and the operating environments. In most cases, this part of the enterprise world is still proprietary. For example, most personal computers and workstations run some form of Microsoft Windows (and, to a lesser extent, Mac OS) although, at the server level, you do see machines running open source operating systems (mostly Linux, some FreeBSD). The mainframe world, however, still depends on IBM, for good reason. The IBM operating systems are very sophisticated products, designed specifically to run large- and very large-scale systems. Thus, the most common mainframe operating systems are still proprietary: z/OS, z/VSE (the replacement for VSE/ESA) and, for high-volume transaction processing, TPF.

But what about the operating environments, the systems that support entire operating systems? The most important such environment is VM, the virtual machine facility we discussed earlier. This facility — which IBM calls the "z/VM hypervisor" — makes it possible to run Linux within a virtual machine. In fact, on the right sort of mainframe, VM can support thousands to tens of thousands of virtual Linux machines, all at the same time. (Recall the experiments of David Boyes.)

When it comes to Linux and enterprise computing, it is difficult to overstate the importance of z/VM. Indeed, for enterprises, this is where the future lies, for several important reasons. First, the system is remarkably dependable. In the words of IBMer Romney White, one of the hardworking geniuses behind VM, "The IBM System z is the most reliable large-scale computing system in existence." Sounds like hubris, but he's right: it would take nothing less than an act of God to bring down an entire System z running VM. (An intriguing thought, especially if you are an atheist.)

Second, mainframes and mainframe software are expensive. However, when you run a large number of virtual machines on a single computer, the unit cost of one such machine is relatively low. Moreover, a virtual machine is more powerful, easier to administer, and far safer than a real computer. Consider, for example, a standalone Linux workstation or server within an enterprise. The hardware itself is relatively inexpensive. However, the support environment — management, backups, connectivity, and maintenance — is costly. A virtual machine that runs within an existing VM operating environment has very low operating costs, with a hardware cost of, literally, zero.

Just as important: when Linux is run under VM, the IT department has access to all the existing VM management tools. This allows them to compile detailed system performance measurements, which can be used for capacity planning in a way that is practically impossible with standalone systems. This brings us to the most important benefit of integrating Linux and open source software into an enterprise: embedding the modern Internet-based, open-standards culture into the mainframe environment.
As we have discussed, one physical mainframe computer can support a vast number of virtual Linux machines. Each of these can be used as a server or as a personal workstation. In one sense, the idea is not new: for a long time, mainframe users have been running CMS within their own virtual machines. However, it is now possible to provide each user with a mature, rich Unix-like environment of his or her own, backed up by the power and stability of a mainframe. Once Linux users become members of the mainframe club, they have access to a wealth of mature tools that would otherwise never be available to them. In this way, the benefits of 40 years of mainframe technology are extended to the world of Linux. As George Gershwin remarked the first time he used his own virtual Linux machine, "Who could ask for anything more?"
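As a small illustration of what joining the "mainframe club" looks like from inside a Linux guest, here is a minimal sketch, assuming a Linux image running under z/VM with the s390-tools package installed. The vmcp command lets the guest pass a command to CP, the hypervisor, and read the reply, which is the same kind of information the VM management tools mentioned above are built on. The wrapper function and the particular queries are my own illustration, not part of any IBM tool.

```python
# A minimal sketch, assuming a Linux guest running under z/VM with the
# s390-tools "vmcp" command (and its kernel module) available. vmcp sends a
# CP command to the hypervisor and returns CP's response as text.

import subprocess

def cp_command(command: str) -> str:
    """Issue a CP command from inside the Linux guest and return CP's reply."""
    result = subprocess.run(
        ["vmcp", command],          # e.g. vmcp "QUERY USERID"
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    # Which virtual machine am I, and on which z/VM system am I running?
    print(cp_command("QUERY USERID"))
    # How loaded does CP consider the system to be right now?
    print(cp_command("INDICATE LOAD"))
```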
For almost every important business need, there is an open source alternative. In a perfect world, enterprises would choose the very best open source solutions to integrate into their existing computing environments. Along the way, the IT department would have the full support and cooperation of, not only the executives, but each individual user. And, of course, there would be plenty of money in the IT budget to ensure that none of their noble goals would ever be compromised. Sounds good, but let's skip all the salted nuts and little pink candies. Enterprise computing is carried out in an environment of inadequate budgets, unrealistic expectations, and political infighting.

In the real world, the computing needs of an enterprise are met by a complex system consisting of four components: hardware, software, people, and the organization. We have talked a lot about hardware and software, but the other two components are just as important and far more buggy, for several good reasons.

First, there is a strong dichotomy between the needs of the enterprise as a whole and the needs of individual users. IT departments need to control and manage computing and information resources for the overall good of the organization. Individuals, however, prefer to use their tools and explore the limits of their thinking in ways that feel personally satisfying. This is a conflict that will always exist because, although every enterprise is composed of individuals, the organization has a life of its own, independent of any person or persons, including its executives and managers. Can you ever imagine, for instance, that the departure of any person, or any group of people, could be damaging enough to force the collapse of an IBM or a Wal-Mart?

Within an enterprise, neither the employees nor the organization is perfect. Employees are people, with all the imperfections and emotions that come with being human. They will break rules, ignore policy and — more than we like to admit — make bad decisions, often for personal reasons. Organizations, on the other hand, are complex, soulless organisms that care more about their continued existence and their short- and long-term goals than they do about the individuals who do the work.

Earlier in this article, I summarized the many — often conflicting — responsibilities that fall upon the shoulders of the people working in an enterprise IT department. If you do not work in such an environment, you are probably ready to take a moment right now, sit back, smile, and count your blessings. But what if you do work in a large IT department? If so, be proud of yourself for doing an impossible job in an improbable environment. You too should take a moment, sit back, smile, and count your blessings. The systems you put together and maintain are nothing short of amazing.

But there's more. At this very moment, there are millions of highly skilled, intelligent, open source programmers around the world, happy to volunteer their time and share the fruits of their labor with you for free. Take a moment to think about the vast amount of high-quality open source software available to anyone, from individuals to small groups, to the largest enterprises in the world. Is that not also amazing? None of these open source programmers, designers or documenters know you. Nor do they understand your personal goals or the needs of your organization. All they know is that you are a human being trying to use a computer for something useful or interesting and, for that reason, they want to help.
And when you think about it, that may be the most amazing thing of all.
© All contents Copyright 2024, Harley Hahn