In computer science, asynchronous I/O (also called non-blocking I/O) is a form of input/output processing that permits other processing to continue before the transmission has finished.

Input and output (I/O) operations on a computer can be extremely slow compared to the processing of data. An I/O device can incorporate mechanical devices that must physically move, such as a hard drive seeking a track to read or write; this is often orders of magnitude slower than the switching of electric current. For example, during a disk operation that takes ten milliseconds to perform, a processor that is clocked at one gigahertz could have performed ten million instruction-processing cycles.

A simple approach to I/O would be to start the access and then wait for it to complete. But such an approach (called synchronous I/O or blocking I/O) would block the progress of a program while the communication is in progress, leaving system resources idle. When a program makes many I/O operations, this means that the processor can spend almost all of its time idle waiting for I/O operations to complete.

Alternatively, it is possible to start the communication and then perform processing that does not require that the I/O has completed. This approach is called asynchronous input/output. Any task that actually depends on the I/O having completed (this includes both using the input values and critical operations that claim to assure that a write operation has been completed) still needs to wait for the I/O operation to complete, and thus is still blocked, but other processing that does not have a dependency on the I/O operation can continue.

Many operating system functions exist to implement asynchronous I/O at many levels. In fact, one of the main functions of all but the most rudimentary of operating systems is to perform at least some form of basic asynchronous I/O, though this may not be particularly apparent to the operator or programmer. In the simplest software solution, the hardware device status is polled at intervals to detect whether the device is ready for its next operation. (For example, the CP/M operating system was built this way. Its system call semantics did not require any more elaborate I/O structure than this, though most implementations were more complex, and thereby more efficient.) Direct memory access (DMA) can greatly increase the efficiency of a polling-based system, and hardware interrupts can eliminate the need for polling entirely. Multitasking operating systems can exploit the functionality provided by hardware interrupts, whilst hiding the complexity of interrupt handling from the user. Spooling was one of the first forms of multitasking designed to exploit asynchronous I/O. Finally, multithreading and explicit asynchronous I/O APIs within user processes can exploit asynchronous I/O further, at the cost of extra software complexity.

Asynchronous I/O is used to improve throughput, latency, and/or responsiveness.


Forms

All forms of asynchronous I/O open applications up to potential resource conflicts and associated failure. Careful programming (often using mutual exclusion, semaphores, etc.) is required to prevent this.

When exposing asynchronous I/O to applications there are a few broad classes of implementation. The form of the API provided to the application does not necessarily correspond to the mechanism actually provided by the operating system; emulations are possible. Furthermore, more than one method may be used by a single application, depending on its needs and the desires of its programmer(s). Many operating systems provide more than one of these mechanisms; some may provide all of them.

Process

Available in early Unix. In a computer multitasking operating system, processing can be distributed across different processes, which run independently, have their own memory, and process their own I/O flows; these flows are typically connected in pipelines. Processes are fairly expensive to create and maintain, so this solution only works well if the set of processes is small and relatively stable. It also assumes that the individual processes can operate independently, apart from processing each other's I/O; if they need to communicate in other ways, coordinating them can become difficult.
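
As a rough illustration of this style in C (a sketch only; error handling omitted), the following program creates a pipe and forks a child: the child writes into the pipe while the parent reads from it, each flow proceeding as its own process.

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/wait.h>

    int main(void)
    {
        int pipefd[2];
        pipe(pipefd);                 /* pipefd[0] = read end, pipefd[1] = write end */

        if (fork() == 0) {            /* child: the "producer" process */
            close(pipefd[0]);
            const char *msg = "data produced by the child\n";
            write(pipefd[1], msg, strlen(msg));
            close(pipefd[1]);
            _exit(0);
        }

        /* parent: the "consumer" process; it blocks only on its own reads */
        close(pipefd[1]);
        char buf[128];
        ssize_t n;
        while ((n = read(pipefd[0], buf, sizeof buf)) > 0)
            fwrite(buf, 1, (size_t)n, stdout);
        close(pipefd[0]);
        wait(NULL);
        return 0;
    }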

An extension of this approach is dataflow programming, which allows more complicated networks than just the chains that pipes support.

Polling

Variations:

  • Error if it cannot be done yet (reissue later)
  • Report when it can be done without blocking (then issue it)

Available in traditional Unix and Windows. Its major problem is that it can waste CPU time polling repeatedly when there is nothing else for the issuing process to do, reducing the time available for other processes. Also, because a polling application is essentially single-threaded it may be unable to fully exploit I/O parallelism that the hardware is capable of.
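
A minimal sketch of the first variation ("error if it cannot be done yet") in C, assuming an already-open descriptor fd and a hypothetical do_other_work() standing in for whatever else the program can do; error handling is trimmed:

    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    extern void do_other_work(void);           /* hypothetical placeholder for useful work */

    /* Poll by attempting a non-blocking read; fd is assumed to be open already. */
    ssize_t poll_read(int fd, void *buf, size_t len)
    {
        int flags = fcntl(fd, F_GETFL, 0);
        fcntl(fd, F_SETFL, flags | O_NONBLOCK);    /* ask for "error if not ready" behaviour */

        for (;;) {
            ssize_t n = read(fd, buf, len);
            if (n >= 0)
                return n;                          /* data arrived (or end of file) */
            if (errno != EAGAIN && errno != EWOULDBLOCK)
                return -1;                         /* a real error */
            do_other_work();                       /* not ready yet: reissue the read later */
        }
    }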

Select(/poll) loops

Available in BSD Unix, and almost anything else with a TCP/IP protocol stack that either utilizes or is modeled after the BSD implementation. A variation on the theme of polling, a select loop uses the select system call to sleep until a condition occurs on a file descriptor (e.g., when data is available for reading), a timeout occurs, or a signal is received (e.g., when a child process dies). By examining the return parameters of the select call, the loop finds out which file descriptor has changed and executes the appropriate code. Often, for ease of use, the select loop is implemented as an event loop, perhaps using callback functions; the situation lends itself particularly well to event-driven programming.
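
A stripped-down select loop in C might look like the sketch below (two already-open descriptors fd1 and fd2 are assumed, and handle_fd1/handle_fd2 are hypothetical handlers): the process sleeps in select() until a descriptor becomes readable, then dispatches on which one it was.

    #include <sys/select.h>

    extern void handle_fd1(int fd);              /* hypothetical handlers for ready data */
    extern void handle_fd2(int fd);

    void select_loop(int fd1, int fd2)
    {
        for (;;) {
            fd_set readfds;
            FD_ZERO(&readfds);
            FD_SET(fd1, &readfds);
            FD_SET(fd2, &readfds);
            int maxfd = (fd1 > fd2) ? fd1 : fd2;

            /* Sleep until at least one descriptor is readable (no timeout given). */
            if (select(maxfd + 1, &readfds, NULL, NULL, NULL) < 0)
                continue;                        /* e.g. interrupted by a signal */

            if (FD_ISSET(fd1, &readfds))
                handle_fd1(fd1);
            if (FD_ISSET(fd2, &readfds))
                handle_fd2(fd2);
        }
    }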

While this method is reliable and relatively efficient, it depends heavily on the Unix paradigm that "everything is a file"; any blocking I/O that does not involve a file descriptor will block the process. The select loop also relies on being able to involve all I/O in the central select call; libraries that conduct their own I/O are particularly problematic in this respect. An additional potential problem is that the select and the I/O operations are still sufficiently decoupled that select's result may effectively be a lie. If two processes are reading from a single file descriptor (arguably bad design), the select may indicate the availability of read data that has disappeared by the time the read is issued, thus resulting in blocking. If two processes are writing to a single file descriptor (not that uncommon), the select may indicate immediate writability, yet the write may still block, because a buffer has been filled by the other process in the interim, or because the write is too large for the available buffer or otherwise unsuitable to the recipient.

The select loop does not reach the ultimate system efficiency possible with, say, the completion queues method, because the semantics of the select call, allowing as it does per-call tuning of the acceptable event set, consume some amount of time per invocation traversing the selection array. This creates little overhead for user applications that might have one file descriptor open for the windowing system and a few for open files, but becomes more of a problem as the number of potential event sources grows, and can hinder development of many-client server applications; other asynchronous methods may be noticeably more efficient in such cases. Some Unixes provide system-specific calls with better scaling, for example epoll in Linux (which fills the returned array with only those event sources on which an event has occurred), kqueue in FreeBSD, and /dev/poll in Solaris.
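
For comparison, a sketch of the same loop using Linux epoll (error handling omitted): the descriptors are registered once, and each wait returns only those on which an event is actually pending, so the per-call cost no longer grows with the size of the watched set. handle_ready is a hypothetical handler.

    #include <sys/epoll.h>

    extern void handle_ready(int fd);            /* hypothetical per-descriptor handler */

    void epoll_loop(const int *fds, int nfds)
    {
        int epfd = epoll_create1(0);

        /* Register each descriptor once, up front. */
        for (int i = 0; i < nfds; i++) {
            struct epoll_event ev = { .events = EPOLLIN, .data.fd = fds[i] };
            epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev);
        }

        struct epoll_event events[64];
        for (;;) {
            /* Returns only descriptors that actually have events pending. */
            int n = epoll_wait(epfd, events, 64, -1);
            for (int i = 0; i < n; i++)
                handle_ready(events[i].data.fd);
        }
    }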

SVR3 Unix provided the poll system call. Though arguably better named than select, it is, for the purposes of this discussion, essentially the same thing. SVR4 Unixes (and thus POSIX) offer both calls.

Signals (interrupts)

Available in BSD and POSIX Unix. I/O is issued asynchronously, and when it is complete a signal (interrupt) is generated. As in low-level kernel programming, the facilities available for safe use within the signal handler are limited, and the main flow of the process could have been interrupted at nearly any point, resulting in inconsistent data structures as seen by the signal handler. The signal handler is usually not able to issue further asynchronous I/O by itself.
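
One concrete POSIX form of this is aio_read() with signal-based completion notification. The sketch below (fd and buffer assumed to be set up elsewhere, error handling omitted, and only async-signal-safe work done in the handler) shows the general shape; on some systems it must be linked with -lrt.

    #include <aio.h>
    #include <signal.h>
    #include <string.h>

    static volatile sig_atomic_t io_done = 0;

    /* Handler: do only async-signal-safe work here; just record that I/O finished. */
    static void on_io_signal(int sig, siginfo_t *info, void *ctx)
    {
        (void)sig; (void)info; (void)ctx;
        io_done = 1;
    }

    void start_async_read(int fd, char *buf, size_t len)
    {
        static struct aiocb cb;                  /* must outlive the operation */
        struct sigaction sa;

        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = on_io_signal;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGUSR1, &sa, NULL);

        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = len;
        cb.aio_sigevent.sigev_notify = SIGEV_SIGNAL;   /* deliver SIGUSR1 when done */
        cb.aio_sigevent.sigev_signo  = SIGUSR1;

        aio_read(&cb);                           /* returns immediately */
        /* ... other processing continues; later check io_done and aio_return(&cb) ... */
    }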

The signal approach, though relatively simple to implement within the OS, brings to the application program the unwelcome baggage associated with writing an operating system's kernel interrupt system. Its worst characteristic is that every blocking (synchronous) system call is potentially interruptible; the programmer must usually incorporate retry code at each call.
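
That retry code typically looks like the following C fragment, which simply reissues a read() that was interrupted by a signal before it transferred any data:

    #include <errno.h>
    #include <unistd.h>

    /* Retry a read that was interrupted by a signal before transferring data. */
    ssize_t read_retry(int fd, void *buf, size_t len)
    {
        ssize_t n;
        do {
            n = read(fd, buf, len);
        } while (n == -1 && errno == EINTR);
        return n;
    }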

Callback functions

Available in Mac OS (pre-Mac OS X), VMS and Windows. Bears many of the characteristics of the signal method as it is fundamentally the same thing, though rarely recognized as such. The difference is that each I/O request usually can have its own completion function, whereas the signal system has a single callback.
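
The Mac OS, VMS, and Windows interfaces differ in their details, but the POSIX aio interface can also be asked to deliver a per-request completion callback, which gives a rough feel for the model. A sketch, with error handling omitted (note that on most systems this callback runs on a separate thread rather than interrupting the main flow):

    #include <aio.h>
    #include <signal.h>
    #include <string.h>

    /* Completion callback: invoked when this particular request finishes. */
    static void on_complete(union sigval sv)
    {
        struct aiocb *cb = sv.sival_ptr;
        ssize_t n = aio_return(cb);              /* bytes transferred, or -1 */
        (void)n;                                 /* ... consume the data, maybe start more I/O ... */
    }

    void read_with_callback(int fd, char *buf, size_t len)
    {
        static struct aiocb cb;                  /* must outlive the operation */
        memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = len;
        cb.aio_sigevent.sigev_notify          = SIGEV_THREAD;  /* call a function on completion */
        cb.aio_sigevent.sigev_notify_function = on_complete;
        cb.aio_sigevent.sigev_value.sival_ptr = &cb;           /* handed to the callback */
        aio_read(&cb);                           /* returns immediately */
    }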

A potential problem is that stack depth can grow unmanageably, as an extremely common thing to do when one I/O is finished is to schedule another. If the new request can be satisfied immediately, the first callback is not 'unwound' off the stack before the next one is invoked. Systems to prevent this (such as 'mid-ground' scheduling of new work) add complexity and reduce performance. In practice, however, this is generally not a problem, because the call that issues the new I/O usually returns as soon as the new I/O has been started, allowing the stack to be 'unwound'.

Light-weight processes or threads

Light-weight processes (LWPs) or threads are available in more modern Unixes, originating in Plan 9. Like the process method, but without the data isolation that hampers coordination of the flows. This lack of isolation introduces its own problems, usually requiring kernel-provided synchronization mechanisms and thread-safe libraries. Each LWP or thread itself uses traditional blocking synchronous I/O. The requisite separate per-thread stack may preclude large-scale implementations using very large numbers of threads. The separation of textual (code) and time (event) flows provides fertile ground for errors.
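
A sketch of the thread-per-flow style in C with POSIX threads (descriptors are assumed to come from elsewhere and to outlive the threads; error handling omitted): each thread simply issues ordinary blocking reads, and the kernel overlaps the waits.

    #include <pthread.h>
    #include <unistd.h>

    /* Each thread services one I/O flow using plain blocking calls. */
    static void *serve_one(void *arg)
    {
        int fd = *(int *)arg;
        char buf[4096];
        ssize_t n;
        while ((n = read(fd, buf, sizeof buf)) > 0) {
            /* ... process buf[0..n) ... */
        }
        close(fd);
        return NULL;
    }

    /* fds is assumed to outlive the threads it is handed to. */
    void serve_all(int *fds, int nfds)
    {
        for (int i = 0; i < nfds; i++) {
            pthread_t t;
            pthread_create(&t, NULL, serve_one, &fds[i]);
            pthread_detach(t);       /* each flow runs to completion on its own */
        }
    }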

This approach is also used in the Erlang programming language runtime system. The Erlang virtual machine performs asynchronous I/O using a small pool of threads, or sometimes just a single process, to handle I/O from up to millions of Erlang processes. I/O handling in each process is written mostly using blocking synchronous I/O. In this way the high performance of asynchronous I/O is merged with the simplicity of ordinary I/O. Many I/O problems in Erlang are mapped to message passing, which can be processed easily using the built-in selective receive.

Completion queues/ports

Available in Microsoft Windows, Solaris, and DNIX. I/O requests are issued asynchronously, but notifications of completion are provided via a synchronizing queue mechanism in the order they are completed. Usually associated with a state-machine structuring of the main process (event-driven programming), which can bear little resemblance to a process that does not use asynchronous I/O or that uses one of the other forms, hampering code reuse. Does not require additional special synchronization mechanisms or thread-safe libraries, nor are the textual (code) and time (event) flows separated.
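
The sketch below is not the Windows or Solaris API; it is only a portable illustration of the model in C, using a mutex- and condition-protected FIFO onto which hypothetical worker code posts completion records and from which the main state machine takes them in completion order.

    #include <pthread.h>

    /* A toy completion queue: workers post a record when an I/O finishes, and
       the main loop removes records in the order they were completed. */
    struct completion {
        int  fd;                     /* which request this completion belongs to */
        long result;                 /* e.g. byte count or error code */
        struct completion *next;
    };

    static struct completion *cq_head, *cq_tail;
    static pthread_mutex_t cq_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  cq_cond = PTHREAD_COND_INITIALIZER;

    /* Called by whatever code performs the I/O, once it has finished. */
    void cq_post(struct completion *c)
    {
        c->next = NULL;
        pthread_mutex_lock(&cq_lock);
        if (cq_tail) cq_tail->next = c; else cq_head = c;
        cq_tail = c;
        pthread_cond_signal(&cq_cond);
        pthread_mutex_unlock(&cq_lock);
    }

    /* The main (state-machine) loop blocks here for the next completion. */
    struct completion *cq_wait(void)
    {
        pthread_mutex_lock(&cq_lock);
        while (cq_head == NULL)
            pthread_cond_wait(&cq_cond, &cq_lock);
        struct completion *c = cq_head;
        cq_head = c->next;
        if (cq_head == NULL)
            cq_tail = NULL;
        pthread_mutex_unlock(&cq_lock);
        return c;
    }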

Event flags

Available in VMS. Bears many of the characteristics of the completion queue method, as it is essentially a completion queue of depth one. To simulate the effect of queue 'depth', an additional event flag is required for each potential unprocessed (but completed) event, or event information can be lost. Waiting for the next available event in such a clump requires synchronizing mechanisms that may not scale well to larger numbers of potentially parallel events.

Implementation

The vast majority of general-purpose computing hardware relies entirely upon two methods of implementing asynchronous I/O: polling and interrupts. Usually both methods are used together; the balance depends heavily upon the design of the hardware and its required performance characteristics. (DMA is not itself another independent method; it is merely a means by which more work can be done per poll or interrupt.)

Pure polling systems are entirely possible; small microcontrollers (such as systems using the PIC) are often built this way. CP/M systems could also be built this way (though rarely were), with or without DMA. Also, when the utmost performance is necessary for only a few tasks, at the expense of any other potential tasks, polling may be appropriate, as the overhead of taking interrupts may be unwelcome. (Servicing an interrupt requires time, and space, to save at least part of the processor state, along with the time required to resume the interrupted task.)
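
On such a microcontroller the pure-polling style reduces to a loop that watches a device status register. The C sketch below is purely illustrative; the register addresses and bit mask are invented.

    #include <stdint.h>

    /* Hypothetical memory-mapped UART registers; addresses and bits are invented. */
    #define UART_STATUS (*(volatile uint8_t *)0x4000)
    #define UART_DATA   (*(volatile uint8_t *)0x4001)
    #define RX_READY    0x01

    uint8_t uart_getc(void)
    {
        while (!(UART_STATUS & RX_READY))
            ;                        /* busy-wait: poll until a byte has arrived */
        return UART_DATA;
    }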

Most general-purpose computing systems rely heavily upon interrupts. A pure interrupt system may be possible, though usually some component of polling is also required, as it is very common for multiple potential sources of interrupts to share a common interrupt signal line, in which case polling is used within the device driver to resolve the actual source. (This resolution time also contributes to an interrupt system's performance penalty. Over the years a great deal of work has been done to try to minimize the overhead associated with servicing an interrupt. Current interrupt systems are rather lackadaisical when compared to some highly-tuned earlier ones, but the general increase in hardware performance has greatly mitigated this.)

Hybrid approaches are also possible, wherein an interrupt can trigger the beginning of some burst of asynchronous I/O, and polling is used within the burst itself. This technique is common in high-speed device drivers, such as network or disk, where the time lost in returning to the pre-interrupt task is greater than the time until the next required servicing. (Common I/O hardware in use these days relies heavily upon DMA and large data buffers to make up for a relatively poorly-performing interrupt system. These characteristically use polling inside the driver loops, and can exhibit tremendous throughput. Ideally the per-datum polls are always successful, or at most repeated a small number of times.)

At one time this sort of hybrid approach was common in disk and network drivers where there was no DMA or significant buffering available. Because the desired transfer speeds were faster than even the minimum four-operation per-datum loop (bit-test, conditional-branch-to-self, fetch, and store) could tolerate, the hardware would often be built with automatic wait state generation on the I/O device, pushing the data-ready poll out of software and onto the processor's fetch or store hardware and reducing the programmed loop to two operations. (In effect using the processor itself as a DMA engine.) The 6502 processor offered an unusual means to provide a three-element per-datum loop, as it had a hardware pin that, when asserted, would cause the processor's Overflow bit to be set directly. (Obviously one would have to take great care in the hardware design to avoid overriding the Overflow bit outside of the device driver!)

Synthesis

Using only these two tools (polling, and interrupts), all the other forms of asynchronous I/O discussed above may be (and in fact, are) synthesized.

In an environment such as a Java Virtual Machine (JVM), asynchronous I/O can be synthesized even though the environment the JVM is running in may not offer it at all. This is due to the interpreted nature of the JVM. The JVM may poll (or take an interrupt) periodically to institute an internal flow of control change, effecting the appearance of multiple simultaneous processes, at least some of which presumably exist in order to perform asynchronous I/O. (Of course, at the microscopic level the parallelism may be rather coarse and exhibit some non-ideal characteristics, but on the surface it will appear to be as desired.)

That, in fact, is the problem with using polling in any form to synthesize a different form of asynchronous I/O. Every CPU cycle that is a poll is wasted, and lost to overhead rather than accomplishing a desired task. Every CPU cycle that is not a poll represents an increase in latency of reaction to pending I/O. Striking an acceptable balance between these two opposing forces is difficult. (This is why hardware interrupt systems were invented in the first place.)

The trick to maximizing efficiency is to minimize the amount of work that has to be done upon reception of an interrupt in order to awaken the appropriate application. Secondary, but perhaps no less important, is the method the application itself uses to determine what it needs to do.

Particularly problematic (for application efficiency) are the exposed polling methods, including the select/poll mechanisms. Though the underlying I/O events they are interested in are in all likelihood interrupt-driven, the interaction with this mechanism is polled and can consume a large amount of time in the poll. This is particularly true of the potentially large-scale polling possible through select (and poll). Interrupts map very well to Signals, Callback functions, Completion queues, and Event flags; such systems can be very efficient.








JRE (Java Runtime Environment)

Often known simply as the "Java runtime", the JRE is part of the JDK, the Java application development kit.

The JRE provides the minimum requirements needed to run a Java application; it consists of the JVM, the core classes, and various supporting files.

Source - http://terms.co.kr/JRE.htm






The Java Runtime Environment (also called the JRE) is a software framework developed by Oracle Corporation that runs regardless of the computer architecture. It provides the Java virtual machine and a large class library, which together allow applications written in the Java programming language to run[1].

It consists of:

  • a Java virtual machine. The heart of the Java platform is the concept of a "virtual machine" that executes Java bytecode programs. This bytecode is the same no matter what hardware or operating system the program is running under.
  • its associated Java Class Library, which is the standard library available to Java programs.




'Apache Hadoop' is an open-source project developed by Doug Cutting, creator of the open-source search library Apache Lucene, after Google published its file system paper. In 2004 Google announced its Google File System technology for processing very large volumes of data, and when the open-source community built something similar on the same concept, 'Apache Hadoop' was born. Hadoop refers to an open-source framework that implements the Hadoop Distributed File System, an alternative to Google's distributed file system, and MapReduce, a technique for processing data in a distributed fashion and then combining the results.

With the Hadoop framework, very large data sets can be analyzed cheaply and quickly. Data that once required running a supercomputer for days can, with Hadoop, be analyzed in real time on x86 servers. The database and data-warehouse (DB and DW) industries rushed to embrace Hadoop, and it became the de facto standard in the platform market for big-data processing and analysis. In its report 'Hadoop-MapReduce Ecosystem Software Landscape 2012', IDC projected that the Hadoop and MapReduce market, worth about 77 million dollars in 2011, would reach 812.8 million dollars by 2016, growing by more than 60 percent a year.

Companies that specialize in developing open-source Hadoop and distributing it as commercial solutions have also appeared; Cloudera, Hortonworks, and MapR are the leading examples. These companies build Hadoop-based platforms and distribute them to existing DB and DW solution vendors.

Cloudera was founded in 2009 around Christophe Bisciglia, known as an originator of the cloud-computing concept, and became well known when Doug Cutting joined the company. Hortonworks was founded last June by the portal company Yahoo and the Silicon Valley venture-capital firm Benchmark to pursue the open-source Apache Hadoop business. MapR, established last August, is a supplier of an Apache Hadoop distribution. All of them can be thought of as consulting companies that sell Hadoop-based systems to customers in a variety of fields.

Except for Cloudera, all of them are startups barely a year past their founding. Even so, enterprise solution vendors are partnering with them to target the big-data platform market. Which companies have joined hands with whom?

■ MapR: EMC, Informatica

After acquiring the DW vendor Greenplum, EMC released the 'Greenplum DCA' last September, with Hadoop on board for storing unstructured data. Until this appliance appeared, the market analyzed structured and unstructured data separately. Building on MapR, EMC produced its own Hadoop, much as vendors like Cloudera do, and then created a data warehouse that bundles it with the company's relational DBMS.

Informatica, a specialist in data-integration software, partnered with MapR last May. By supporting the MapR distribution for Hadoop on its own data platform, the 'Informatica Platform', it aims to let customers integrate and replicate large volumes of data more quickly. The two companies announced that they would cooperate to handle unstructured data.

■ Cloudera: Oracle, IBM

Oracle joined hands with Hadoop through the 'Oracle Big Data Appliance' released last November. A powerhouse in the database market, Oracle partnered with Cloudera and put Cloudera's Hadoop on its appliance to handle unstructured data smoothly.

IBM showed interest in the Hadoop market relatively early. In 2010 it announced that it would develop a Hadoop-based big-data platform; the strategy was to combine Hadoop with the analytics solutions it had acquired, such as Cognos and SPSS, so that large volumes of information could be processed and analyzed cost-effectively. In April, however, IBM announced that it had chosen Cloudera as the Hadoop distribution for its big-data platform and would partner with Cloudera through its analytics software 'InfoSphere BigInsights'.

■ Hortonworks: Teradata, Microsoft, Symantec, VMware

Unlike other vendors that adopted Hadoop solutions or developed Hadoop themselves, Teradata folded Hadoop technology into its own data-processing approach. Teradata already had 'Teradata Aster', which combines MapReduce with SQL, the traditional database processing language. In June, in cooperation with Hortonworks, it introduced a technology called 'Aster SQL-H', whose distinguishing feature is that it lets users analyze vast amounts of Hadoop data directly, without having to understand how the data is stored.

In March, Microsoft (MS) announced that, in cooperation with Hortonworks, it had developed a 'connector' that loads data stored in Hadoop into Excel, so that business users who know only Excel, and not Hadoop, can easily work with Hadoop data. That is not all: in October MS unveiled the 'MS Hadoop preview edition', which integrates the Hadoop platform with Windows Server and its cloud infrastructure 'Azure'. MS explained that integrating Azure with the Hadoop platform makes Hadoop more convenient to use and manage in Windows Server and Azure environments.

In August, Symantec partnered with Hortonworks to release 'Symantec Enterprise Solution for Hadoop', a new big-data management solution that adds Apache Hadoop to its existing data-management offering, the 'Cluster File Solution'. By adding a Hadoop connector, MapReduce, and the Hadoop stack to the existing solution, it set out to help customers process and analyze large volumes of data more smoothly.

VMware, which also partnered with Hortonworks, is a somewhat unusual case: although it is a virtualization vendor rather than a DB or DW company, it adopted Apache Hadoop, the goal being to make Hadoop environments manageable in virtualized environments as well. In June, VMware announced the 'Serengeti' project for deploying and managing open-source Hadoop in virtualized environments, and development of the solution is now in full swing.


Source - http://www.bloter.net/archives/133048


