TASK QUARTERLY 8 No 4

TECHNIQUES TO IMPROVE SCALABILITY AND PERFORMANCE OF J2EE-BASED APPLICATIONS

MARCIN JARZĄB, JACEK KOSIŃSKI AND KRZYSZTOF ZIELIŃSKI
Department of Computer Science, AGH University of Science and Technology, Al. Mickiewicza 30, Cracow, Poland
{mj, jgk,

(Received 6 June 2004; revised manuscript received 28 August 2004)

Abstract: This paper reports research on techniques for improving the scalability and performance of Java 2 Platform, Enterprise Edition (J2EE) based applications. The study deals with operating system and Java Virtual Machine (JVM) tuning, the setting of Enterprise JavaBeans (EJB) Application Server configuration parameters and clustering methods. The theoretical principles of achieving high performance and scalability of J2EE applications are considered. The experimental environment and scenarios are described. The experimental results of the evaluation of the considered techniques are presented and analyzed.

Keywords: J2EE, EJB, performance, scalability, tuning, load-balancing

1. Introduction

The Java 2 Platform, Enterprise Edition (J2EE) [1] is the standard server-side environment for developing enterprise applications in the Java programming language. These applications are typically composed of several tiers that handle specific aspects, notably web modules (servlets and JavaServer Pages) for interaction and presentation, EJB modules for business logic, and resource adapter modules for accessing legacy applications. These modules are hosted in containers that interpose between the application modules and the available services such as transaction management, persistence, resource management and component life-cycle, security, concurrency, and remote accessibility. EJB containers are responsible for managing enterprise beans and interact with beans by calling management methods as required. The exact management schemes and the configuration attributes used by an EJB container are specific to the implementation of the application server. This makes evaluation of Application Servers offered by various vendors rather difficult and opens an area for an interesting performance study.

The J2EE architecture allows certain containers to be logically separated, viz. a web container and an EJB container can be located in different JVMs, with communication through well-defined interfaces. In this way, the J2EE platform offers better scalability and more sophisticated deployment strategies. It is achieved by replication of application servers and transparent content switching of requests to support suitable load-balancing between the clustered instances. This technique can be further elaborated to provide fail-over when fault-tolerant service operation must be assured for a large number of simultaneously working users, whose number might reach a few thousand. The choice of a suitable load-balancing strategy and fail-over scheme is another interesting research area.

The solutions proposed so far can be divided into two categories. One addresses the increase of application processing performance through proper selection of configuration parameters at the application server side as well as at the operating system side. The selection of proper application and operating system parameters is a complex and time-consuming task, requiring substantial knowledge and experience. The other category assumes increasing the number of machines serving a given service.
This mechanism is simpler in realization and guarantees performance growth through the purchase of additional servers and setting them up in the form of a farm. The performance benefits arise because simultaneous user requests are processed by distributing them among the replicated nodes.

This paper summarizes research on techniques for improving the scalability and performance of J2EE-based applications performed during the last two years at the Department of Computer Science of AGH-UST. The goal of the paper is to practically demonstrate the influence of operating system and JVM tuning, the setting of EJB Application Server configuration parameters and clustering methods on system performance. It proposes a methodology for the implementation and evaluation of the integration layer. A thorough consideration of the structuralization of business processing and data access results in a proposal of design patterns [2]. The design patterns have been used in the presented study to eliminate the additional overhead introduced by inefficient usage of the EJB technology.

The paper is structured as follows. In Section 2, J2EE patterns used to structure database access via an application server are shortly described. In Section 3, tuning mechanisms applicable to most J2EE Application Servers, such as control of transaction behavior, tuning of thread counts and some vendors' proprietary application server features, are discussed. In Section 4 we present clustering techniques applicable in J2EE environments. Section 5 contains the performance study's methodology and the scenarios used for the tests. The performance test results of the three most popular Application Servers, BEA WebLogic [3], JBoss [4] and Sun ONE [5], are presented in Section 6. The paper ends with conclusions.

2. J2EE Design Patterns

Building applications with the J2EE technology in an efficient way requires a very good understanding of the key characteristics of this middleware platform. The experience gained by application programmers in many areas of software engineering has been summarized as Design Patterns, popularized by the classic book [6]. Specifically for EJB problems and solutions, we now have Core J2EE Patterns: Best Practices and Design Strategies defined by the Sun Java Center [7] and EJB Design Patterns [8]. This section focuses on performance improvement practices using patterns in EJB. As there are many reports [2, 9] referring to this issue, we have limited the presentation to selected solutions applied in our study. To understand the organization of these patterns it is necessary to understand the EJB components' life cycle first.

2.1. EJB beans life cycle-related techniques

While analyzing the EJB beans' life cycle it is useful to distinguish between session and entity beans. Session beans are business process objects which are relatively short-lived components. Their lifetime is roughly equivalent to a session or the lifetime of the client code that is calling the session bean. There are two subtypes of session beans: stateful session beans and stateless session beans. A stateless bean is a bean that holds conversations spanning a single method call. After each method call, the container may choose to destroy a stateless session bean or recreate it, cleaning it out of all information pertaining to past invocations. It may also choose to keep the instance around, reusing it for all clients who want to use the same session bean class. The exact algorithm is container-specific. In fact, stateless session beans can be pooled, reused and swapped from one client to another on each method call. This saves object instantiation time and memory.
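To make this life-cycle management concrete, the sketch below shows a minimal EJB 2.x-style stateless session bean; the bean name and business method are hypothetical, and the home and remote interfaces are omitted:

    import javax.ejb.SessionBean;
    import javax.ejb.SessionContext;

    // Hypothetical stateless session bean illustrating the container-managed life cycle.
    public class ExchangeRateBean implements SessionBean {
        private SessionContext context;

        // Called by the container when a new instance is added to the pool.
        public void ejbCreate() { }

        // Callbacks required by the SessionBean interface; for a stateless bean
        // ejbActivate/ejbPassivate are effectively unused, since there is no
        // conversational state to passivate.
        public void setSessionContext(SessionContext ctx) { this.context = ctx; }
        public void ejbRemove() { }
        public void ejbActivate() { }
        public void ejbPassivate() { }

        // Business method exposed through the (not shown) remote interface;
        // any pooled instance may serve any client's call.
        public double convert(double amount, String from, String to) {
            return amount; // placeholder logic
        }
    }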
With stateful session beans, pooling is not so simple. When a client invokes a method on a bean, the client starts a conversation with the bean, and the conversational state stored in the bean must be available for the same client's next method request. But we still need to achieve the effect of pooling for stateful session beans, to conserve resources and enhance the overall scalability of the system. EJB containers limit the number of stateful session bean instances in memory by swapping out a stateful bean, saving its conversational state to a hard disk or other storage. This is called passivation.

Entity beans are persistent objects which are constructed in memory from database data and can survive for long periods of time. This means that if an in-memory entity bean instance is updated, the database should automatically be updated as well. Therefore, there must be a mechanism to transfer information back and forth between Java objects and the database. This data transfer is accomplished with two special methods that the entity bean class must implement, called ejbLoad and ejbStore. These methods are called by the container when a bean instance needs to be refreshed, depending on the current transactional state.

The EJB technology assumes that only a single thread can ever be running within a bean instance. To boost performance, it is necessary to allow containers to instantiate multiple instances of the same entity bean class. This allows many clients to concurrently interact with separate instances, each representing the same underlying entity data. If many bean instances represent data via caching, we are dealing with multiple replicas cached in memory. Similarly to session beans, entity bean instances are objects that may be pooled depending on the container's policy. It saves resources and shortens instantiation time.

2.2. EJB common Design Patterns

Enterprise beans encapsulate business logic together with business data and expose their interfaces, with all the complexity of the distributed services, to the client. This can create problems when too many method invocations between a client and a server lead to a network performance bottleneck and to the overhead of many simple transactions being processed. To solve this problem, session beans should be used as a facade to encapsulate the complexity of interactions between the business objects participating in a business transaction. The Session Facade manages the business objects and provides a uniform, coarse-grained service layer accessed by clients, reducing the network overhead. It is also important in the situation when entity beans are transactional components, which means that each method call may result in invoking a new transaction, possibly reducing the performance. This behavior can be controlled by encapsulating method calls of entity beans inside session beans, which act as a transactional shell for all transactions raised by entity beans, thus leading to better performance. The Session Facade is one of the most popular EJB Design Patterns; it helps to achieve a proper partitioning of business logic, minimizes dependencies between a client and a server, and forces a business transaction to execute in one networked call. The use of the Session Facade pattern can therefore result in a reduction of remote calls.
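A minimal sketch of a Session Facade is given below; the bean and entity names are hypothetical, and the home/remote interfaces as well as the deployment descriptor (where the transaction attribute of the business method would typically be set to Required) are omitted:

    import javax.ejb.SessionBean;
    import javax.ejb.SessionContext;

    // Hypothetical Session Facade: one remote call performs a complete business
    // transaction instead of the client touching each entity bean individually.
    public class OrderFacadeBean implements SessionBean {

        // Coarse-grained business method declared in the remote interface; with
        // container-managed transactions all entity-bean calls below take part
        // in a single transaction started for this call.
        public void placeOrder(String customerId, String productId, int quantity) {
            // Entity bean interfaces are assumed to exist and are looked up elsewhere:
            // Customer customer = customerHome.findByPrimaryKey(customerId);
            // Product  product  = productHome.findByPrimaryKey(productId);
            // Order    order    = orderHome.create(customer, product, quantity);
            // All of the above happens server-side, in one network round trip.
        }

        public void ejbCreate() { }
        public void setSessionContext(SessionContext ctx) { }
        public void ejbRemove() { }
        public void ejbActivate() { }
        public void ejbPassivate() { }
    }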
A pattern which addresses only the data transfer overhead is the Value Object pattern. A Value Object encapsulates a set of attributes and provides set/get methods to access them. Value Objects are transported by value from the enterprise bean to the client component. When the client requests business data from an enterprise bean, the bean constructs a Value Object, populates it with the attribute values and passes it by value to the client. A client who calls an enterprise bean that uses a Value Object makes only one remote call instead of numerous remote calls fetching each attribute value separately. The client receives the Value Object and locally invokes set/get methods on it to access the attribute values. It is worth pointing out that the same pattern can be used to optimize access to data stored in a database.

Another problem is that access to data varies depending on the data source. Access to persistent storage varies greatly depending on the type of storage (RDBMS, OODBMS, LDAP, flat files, etc.) and the vendor implementation. These data must be accessed and manipulated from business components, such as enterprise beans, which are responsible for persistence logic. These components require transparency with respect to the actual persistent store or data source implementation, to enable easy migration to different vendor products, different storage types and different data source types. The solution is to use the Data Access Object (DAO) design pattern, which abstracts and encapsulates all access to the data source.

The DAO design pattern enables transparency between business components and data storage. It acts as a separate layer which can be changed easily in case an application migrates to another database implementation. Because a Data Access Object manages all the data access complexities, it simplifies the code in the business components that use the data access objects. All the implementation-related code (such as SQL statements) is coded in the DAO and not in the business object. This improves code readability and development productivity. Another important point should be emphasized at this stage: DAO is not useful for Container Managed Persistence (CMP) entity beans, as the EJB container provides and implements all the persistence logic.
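The two patterns combine naturally, since a DAO can return Value Objects populated directly from a JDBC result set. The sketch below illustrates the idea; the class names, the customer table and the way the DataSource is obtained are assumptions:

    import java.io.Serializable;
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import javax.sql.DataSource;

    // Value Object: a serializable bag of attributes transferred by value in one remote call.
    public class CustomerVO implements Serializable {
        private String id;
        private String name;

        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // DAO: hides the SQL and data-source details from the business components.
    class CustomerDAO {
        private final DataSource dataSource; // e.g. obtained from JNDI by the caller

        CustomerDAO(DataSource dataSource) { this.dataSource = dataSource; }

        CustomerVO findById(String id) throws Exception {
            Connection con = dataSource.getConnection();
            try {
                PreparedStatement ps = con.prepareStatement(
                    "SELECT id, name FROM customer WHERE id = ?"); // hypothetical table
                ps.setString(1, id);
                ResultSet rs = ps.executeQuery();
                CustomerVO vo = null;
                if (rs.next()) {
                    vo = new CustomerVO();
                    vo.setId(rs.getString("id"));
                    vo.setName(rs.getString("name"));
                }
                return vo;
            } finally {
                con.close(); // closing the connection also releases the statement and result set
            }
        }
    }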
There are also some tips worth considering in the implementation phase of enterprise beans, which tend to significantly increase performance. They are briefly described below:

- Serialization of Value Objects transferred between remote enterprise beans should be considered and implemented in the most efficient way possible.
- References to enterprise beans' EJBHome objects should be cached. There is already a pattern, called the Service Locator, which is responsible for retrieving such objects from the Java Naming and Directory Interface (JNDI) tree and putting them into a cache. A subsequent request for any of these objects does not result in a JNDI call; instead, the object already stored in the cache is returned (see the sketch after this list).
- Transactions should be controlled by avoiding transactions for non-transactional methods. If method calls must participate in a transaction, appropriate transaction attributes should always be declared for the method signatures to increase the performance of the transaction raised by such a method call.
- Use JDBC for reading. The most common use case in distributed applications originates from the need to present a set of data resulting from certain search criteria, known as the read-only use case. When a client requests data for read-only purposes, a solution which uses entity beans has some unnecessary overhead, often called the n+1 problem. In order to read n database rows when entity beans are used, one must first call the finder method, which is one database call. The ejbLoad method is then called for each acquired row, represented by an entity bean. When JDBC queries are used to fetch the data, the query is performed in one database call. Comparing this behavior to entity beans, we can notice a significant improvement in performance.
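A minimal Service Locator along the lines of the second tip might look as follows; the singleton structure and the use of a plain HashMap as the cache are assumptions, and a production version would additionally narrow remote home references with PortableRemoteObject.narrow():

    import java.util.HashMap;
    import java.util.Map;
    import javax.naming.InitialContext;
    import javax.naming.NamingException;

    // Hypothetical Service Locator: caches objects obtained from the JNDI tree
    // (e.g. EJBHome references) so that repeated lookups avoid the JNDI call.
    public final class ServiceLocator {
        private static final ServiceLocator INSTANCE = new ServiceLocator();

        private InitialContext context;
        private final Map cache = new HashMap(); // JNDI name -> looked-up object

        private ServiceLocator() {
            try {
                context = new InitialContext();
            } catch (NamingException e) {
                throw new RuntimeException("Cannot create InitialContext", e);
            }
        }

        public static ServiceLocator getInstance() { return INSTANCE; }

        // Returns the cached reference if present, otherwise performs the JNDI
        // lookup once and stores the result for subsequent callers.
        public synchronized Object lookup(String jndiName) throws NamingException {
            Object ref = cache.get(jndiName);
            if (ref == null) {
                ref = context.lookup(jndiName);
                cache.put(jndiName, ref);
            }
            return ref;
        }
    }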
Some disadvantages of using entity beans have already been mentioned above. To solve some of the performance problems of entity beans, the EJB specification offers an option of read-only entity beans. The advantage of read-only entity beans is that their data can be cached in memory, on one or many servers, when dealing with clustering. Read-only entity beans do not use expensive logic to keep the distributed caches coherent. Instead, the deployer specifies a timeout value and the entity bean's cached state is refreshed after the timeout has expired. Moreover, read-only beans do not have to participate in transactions.

Read-only and read-write entities can coexist in the read-mostly design pattern. The concept of this pattern is to use the optional read-only EJB deployment setting and to deploy the same bean code twice in the same application: once as a read/write bean to support transactional behavior and once as a read-only bean to enable rapid data access. In the read-mostly pattern, a read-only entity EJB retrieves bean data at intervals specified by the refresh-period element of the deployment descriptor. A separate read-write entity EJB models the same data as the read-only EJB and updates the data at the required intervals. The main factor which should be considered when using the read-mostly pattern, in order to reduce the data consistency problem, is the choice of an appropriate value of the refresh interval. It should be set to the smallest time-frame that yields acceptable performance levels.

3. Setting the operating system and application server parameters

Setting the operating system and application server parameters is most commonly referred to as tuning. The tuning that should be considered during the J2EE servers' installation and application deployment phase is related to:
- the operating system and TCP stack configuration;
- JVM parameter modification;
- J2EE servers' deployment settings.
In this section the main tuning techniques for each of these will be briefly described. This will illustrate the multidimensionality and complexity of setting up a J2EE environment.

3.1. The operating system and TCP stack

Tuning the operating system consists mainly in selecting configurable system parameters so that the usage of hardware resources is as close to optimal as possible. The selection of proper values strongly depends on the type of application whose performance matters most. Typical parameters with special influence on operating system performance are the size of the swap space, its physical localization, and the manner of data access: the size of disk buffers or the settings related to communication with data storage devices, such as DMA channel settings. The total performance may be improved by manipulating the configuration and parameters of the processor task scheduling algorithm. This algorithm and its settings are particularly important for minimization of the so-called reaction time, crucial for seriously overloaded systems.

The Linux operating system enables modification of all these parameters and, through the availability of alternative scheduler implementations, matching the scheduler with a given application and the degree of system overload. In the case of the Linux operating system, performance can also be improved by a kernel configuration in which all unnecessary drivers are disabled. This significantly decreases the kernel size and the amount of required memory.

Performance improvement of the TCP communication subsystem means appropriate manipulation of the time parameters that control the possibility of reusing a given socket, retransmission timers, connection establishment and dropping, etc. Present TCP stack implementations use efficient and effective methods of access to the data structures that store information about open connections. These methods use hash-tables. However, the parameters related to memory allocation for these hash-tables can be modified by selecting values that do not create a bottleneck for TCP communications.

3.2. JVM tuning

When tuning the JVM, the first parameters to be set are the overall allocated heap space and the sizes of the various generational areas [10]. Extensive testing has shown that the default parameters supplied by the Java server VM (using the -server option) provide almost the best throughput, with the exception of a few parameters. To increase the predictability of garbage collection, the settings for the minimum (-Xms) and maximum (-Xmx) heap size should be set to the same value. The default value for both parameters is 64 MB, which may be too small in some cases, causing errors during runtime (java.lang.OutOfMemoryError). This factor should be considered especially with server applications, which in most cases have a much greater memory footprint than common applications. The heap is also divided into areas, viz. the young and old generations, which are supposed to hold objects of different ages. In some cases we may wish to customize the generation sizes, especially when the garbage collector becomes a bottleneck. Another important factor is the method of garbage collection (GC). Under heavy load, two factors have to be considered. First, if a large amount of heap space is allocated, garbage collection can take a long time, in the order of one, two or even three seconds. Second, hundreds or thousands of requests arrive per second and all of them have to be queued while the server performs GC. Often, there may be too many of them to be queued by the server process and the operating system's TCP stack. The TCP/IP connectio
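As an illustration of the heap-sizing advice in this subsection, the flags discussed above could be passed to the server JVM roughly as follows; the concrete values are hypothetical and should be derived from measurements on the target machine, and the exact way of passing them depends on the application server's start scripts:

    JAVA_OPTS="-server -Xms512m -Xmx512m -XX:NewSize=128m -XX:MaxNewSize=128m -verbose:gc"

Setting -Xms equal to -Xmx avoids heap resizing at runtime, while -verbose:gc makes the length and frequency of collections visible, so that the generation sizes can be adjusted when the garbage collector becomes a bottleneck.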