Improving Web Site Performance
<blockquote data-quote="andargor" data-source="post: 2113691" data-attributes="member: 7231"><p>The boards have been very sluggish lately. I'd like to offer some tips on some things that can be done to improve web site performance.</p><p></p><p>I don't want to sound condescending, it is not my purpose. But my work experience has been networks and the Internet, and particularily the architecture and implementation of high-end hosting data centers providing multi-gigabit connectivity and 5/9+ high-availability (5/9: that's 99.999% and better).</p><p></p><p>Here are the steps I propose, with "quick hits" identified which require a minimum of effort and expenses:</p><p></p><ul> <li data-xf-list-type="ul"> <strong>Caching: </strong>I've mentioned it in <a href="http://www.enworld.org/showthread.php?t=121914" target="_blank">this thread</a>. Use it whenever you can, especially on images. If you can afford it, use services like Akamai which cache images and other "weighy" objects throughout the world so they are accessed locally by users, so you save bandwidth and processing power (this is also especially useful with streaming content and downloads). Another alternative is a reverse proxy, which is basically a cache server in front of your web server which does the grunt work of transferring images, CSS, javascript, zips and static HTML.<br /> <br /> <strong>Quick Hit: </strong>Remove any "no-cache" pragmas from your web server configuration and HTML templates, as well as any "max-age: 0" or other expiration directives. This will quickly reap benefits in both bandwidth, number of requests, and processing power by allowing the user's browser to cache frequently fetched but seldom changing elements. There may be some items that cannot be cached (I haven't seen those on EN World, but you never know), so you may have to have a few separate templates. Alternatively, you can also specify an expiration ("max age") of several minutes, which should still provide significant performance improvements, since it will cut down on transfers during a user's session.<br /> <br /> </li> <li data-xf-list-type="ul"> <strong>Separation of Function: </strong>Most high-bandwidth, high-performance sites separate all functions into layers: presentation (HTTP), application logic (PHP or other), and database typically. This allows a "funnel" architecture where you may have several low-cost front-end web servers (e.g. 2-10 500 MHz linux boxes) whose sole task is serving files (HTML, javascript, images, etc.). The smallest of linux boxes can serve an impressive amount of files per second. If a dynamic request is required, the request is proxied back to the application logic layer which just returns the dynamically generated HTML, and all other static HTML elements are returned by the front-end. This is a 15:1 ratio typically (e.g. for every dynamic request, there are an average of 15 requests for the static HTML elements in the page, such as images, CSS, javascript, etc.).<br /> <br /> This is the most significant performance boost. Although there is a 15:1 ratio, application logic is usually more CPU intensive, and require about 1 box for each 2-5 front-end servers (depending on the application architecture). The boxes are usually mid-size (say, 1 GHz compared to the 500 MHz front-ends). If a database query is required, then that is passed on to the database layer. How many database requests there are depends entirely on the application architecture. 
**Separation of Function:** Most high-bandwidth, high-performance sites separate all functions into layers: typically presentation (HTTP), application logic (PHP or other), and database. This allows a "funnel" architecture where you may have several low-cost front-end web servers (e.g. 2-10 500 MHz Linux boxes) whose sole task is serving files (HTML, JavaScript, images, etc.). The smallest of Linux boxes can serve an impressive number of files per second. If a dynamic request is required, it is proxied back to the application logic layer, which returns just the dynamically generated HTML; all other static elements are returned by the front-end. The ratio is typically 15:1 (i.e. for every dynamic request there are on average 15 requests for the static elements in the page, such as images, CSS, JavaScript, etc.).

This is the most significant performance boost. Although there is a 15:1 ratio, application logic is usually more CPU intensive and requires about one box for each 2-5 front-end servers (depending on the application architecture). These boxes are usually mid-size (say, 1 GHz compared to the 500 MHz front-ends). If a database query is required, it is passed on to the database layer. How many database requests there are depends entirely on the application architecture. From what I can tell, vBulletin goes to the database on every request, so there would be a 1:1 relationship with EN World, which is high. However, a performance boost here is the use of a pool of persistent connections, so that the application layer doesn't have to bring up and tear down connections on every request. Having the database on a separate machine also allows for better capacity planning and tuning, since you can see what is causing CPU drain.

**Quick Hit:** The full separation of function is usually a non-negligible expense, and I don't know what your financial resources are. You can also go hybrid, combining either the front-end servers with the application layer, or the application layer with the database (meaning two layers); that all depends on where the load is coming from. You can also do a "poor man's" separation of function, which is what I recommend as a quick hit. If you can only have one server and use a web server like Apache, you can separate HTTP services and application logic by running two different versions of Apache on the same machine. I'm not talking version numbers, but rather one version that is "light" and compiled with only the most basic modules, and another that is "heavy" with all the PHP, Perl, et al. modules.

What this does is let you serve the 15:1 ratio of static elements with the "light" Apache processes (typically 200-500K per process), and do an internal proxy (using rewrite rules) to the "heavy" version (typically a few megs per process) only for dynamic elements (such as the HTML produced by PHP). This gives you a good performance boost: you don't have to spawn "heavy" processes during high-load periods, you conserve memory, you respond much more quickly since there are many "light" processes for all the static elements, and you keep a relatively constant number of pooled persistent connections to the database from the "heavy" processes (which is significant). I would still separate out the database, for the reasons mentioned above, but that's your call if you want to leave it all on the same machine.
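To illustrate the internal proxy, here is a minimal sketch of the rewrite rule for the "light" instance, assuming mod_rewrite and mod_proxy are compiled in and that the "heavy" PHP-enabled Apache listens on 127.0.0.1:8080 (the port and layout are my assumptions, not EN World's actual configuration):

```
# "Light" Apache: serve static files itself, hand PHP requests to the
# "heavy" instance on an internal port (8080 here is an assumption).
RewriteEngine On

# Dynamic content is proxied to the heavy, PHP-enabled Apache.
RewriteRule ^/(.*\.php)$ http://127.0.0.1:8080/$1 [P,L]

# Everything else (images, CSS, JavaScript, static HTML) is served directly
# from the local document root by the light processes.
```

Query strings (e.g. ?t=121914) are passed along automatically by the proxy flag, so existing forum links keep working.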
**Minimize dynamic requests:** This is a tough one because of vBulletin's and EN World's nature. As long as you have a home page that is dynamically served, you will require tons of processing power. Our customers would complain that "the hosting architecture is slow" until we pointed this out to them and made corrections for them. The bottom line: serve as much static content as you can. We improved performance (simultaneous connections) by as much as 5000% in one case. That's not easy with a constantly changing site, but there is a method: static publication. Static publication is the periodic, automated fetching of dynamic pages, which are saved as static HTML and served to users. How periodic depends on how often the information on the page changes.

For example, can the home page be statically published every hour? Every day? A good example is the main forum page. Personally, I would not mind if it were updated every minute or so, and it is probably one of the most requested pages other than the home page. How many page requests do you get in one minute during peak periods? 10? 100? 1000? 10000? If the page were static, you would go dynamic only 10%, 1%, 0.1%, or 0.01% of the time, respectively, with the associated savings in processing power and database connections.

**Quick Hit:** Implement a set of cron jobs for the home page and the main forum page, set at intervals you are comfortable with, and use a simple program like wget to fetch and save the result as static pages. Make sure all appropriate links point to those static files; the simplest way is a rewrite rule in the "light" Apache processes. Or, if you want to go more radical, save all "archived" content to static HTML and redirect any queries to those pages using rewrite rules. That way they are still indexed, but you don't go to PHP or the database for these unchanging threads.
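As a sketch of the cron side, assuming wget is installed and the static copies live under a path like /var/www/static/ (the URLs, paths, and intervals are illustrative assumptions, not the site's real layout):

```
# Crontab sketch: republish the main forum page every minute and the home
# page every hour. Paths and URLs are made-up examples.
# m  h  dom mon dow  command
*   *  *   *   *     wget -q -O /var/www/static/forums.html http://localhost/forums/index.php
0   *  *   *   *     wget -q -O /var/www/static/index.html  http://localhost/index.php
```

If you want to be tidy, fetch to a temporary file and mv it into place so a visitor never gets a half-written page; a rewrite rule in the "light" Apache then maps the public URLs for those pages onto the static copies.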
**Request throttling:** I do not know whether you have protection from DoS attacks or other inappropriate use. This is a drain on any dynamic site: the harvesting of threads for offline viewing or other purposes, or the wanton generation of requests for the sole purpose of making your life more difficult. Most web spiders play nice, but some do not, and will spider your site as fast as you can serve it. What you can do is limit the number of requests per second from any one IP address, or from any range of IP addresses (better, since it protects against distributed DoS bots within a single network). There is a module called mod_throttle that does exactly this, and probably others.

**Quick Hit:** Installing a module like mod_throttle is relatively easy, and should be done as soon as possible.

**Firewall optimization:** I don't know your firewall setup, but in my experience the firewall can be a critical performance factor. I really don't know if this is an issue here, but I'll give you a few leads. Where I saw performance drain was when the firewall had stateful inspection rules (meaning it checks both sides of the connection or session, the request and the return). Stateful inspection is slow compared to simple single-sided "drop" rules. It can also, unfortunately, be DoSed relatively easily by opening an inordinate number of connections until the firewall's state table runs out, at which point all further connections are dropped. This can also happen accidentally during high-load periods.

The solution: turn off stateful inspection (a higher risk, but it's a tradeoff), or put the front-end "light" servers outside the firewall. Linux servers can be well protected on their own using the built-in Netfilter firewall, and BSD servers are said to be the most secure. Then you refuse any connections from outside your firewall except for HTTP connections from the front-end servers to the application servers. Because of the 15:1 ratio in requests, you increase your available connections by that much.

**Quick Hit:** I don't know until I know what your hosting setup is.
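To make the Netfilter idea concrete, here is a minimal sketch for an application server sitting behind the front-end "light" boxes. The 192.168.1.0/24 front-end network, the database and admin hosts, the backend port, and the connection limit are all assumptions for illustration, and the rules rely on simple port/source matching rather than stateful inspection:

```
# Sketch only: app server accepts proxied HTTP solely from the front-ends.
# Addresses, ports, and limits are illustrative assumptions.

# Local traffic is always fine.
iptables -A INPUT -i lo -j ACCEPT

# Crude per-source throttle: refuse a source holding too many simultaneous
# connections to the backend port (requires the connlimit match).
iptables -A INPUT -p tcp --syn --dport 8080 -m connlimit --connlimit-above 50 -j DROP

# Accept backend HTTP only from the front-end servers.
iptables -A INPUT -p tcp --dport 8080 -s 192.168.1.0/24 -j ACCEPT

# Let the database box answer our queries, and allow SSH from one admin host.
iptables -A INPUT -s 192.168.1.20 -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -s 192.168.1.10 -j ACCEPT

# Refuse everything else.
iptables -P INPUT DROP
```

The same approach works on the front-ends themselves if they sit outside the firewall: accept port 80 from anywhere, limit aggressive sources, and drop the rest.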
[QUOTE="andargor, post: 2113691, member: 7231"] The boards have been very sluggish lately. I'd like to offer some tips on some things that can be done to improve web site performance. I don't want to sound condescending, it is not my purpose. But my work experience has been networks and the Internet, and particularily the architecture and implementation of high-end hosting data centers providing multi-gigabit connectivity and 5/9+ high-availability (5/9: that's 99.999% and better). Here are the steps I propose, with "quick hits" identified which require a minimum of effort and expenses: [list] [*] [b]Caching: [/b]I've mentioned it in [url=http://www.enworld.org/showthread.php?t=121914]this thread[/url]. Use it whenever you can, especially on images. If you can afford it, use services like Akamai which cache images and other "weighy" objects throughout the world so they are accessed locally by users, so you save bandwidth and processing power (this is also especially useful with streaming content and downloads). Another alternative is a reverse proxy, which is basically a cache server in front of your web server which does the grunt work of transferring images, CSS, javascript, zips and static HTML. [b]Quick Hit: [/b]Remove any "no-cache" pragmas from your web server configuration and HTML templates, as well as any "max-age: 0" or other expiration directives. This will quickly reap benefits in both bandwidth, number of requests, and processing power by allowing the user's browser to cache frequently fetched but seldom changing elements. There may be some items that cannot be cached (I haven't seen those on EN World, but you never know), so you may have to have a few separate templates. Alternatively, you can also specify an expiration ("max age") of several minutes, which should still provide significant performance improvements, since it will cut down on transfers during a user's session. [*] [b]Separation of Function: [/b]Most high-bandwidth, high-performance sites separate all functions into layers: presentation (HTTP), application logic (PHP or other), and database typically. This allows a "funnel" architecture where you may have several low-cost front-end web servers (e.g. 2-10 500 MHz linux boxes) whose sole task is serving files (HTML, javascript, images, etc.). The smallest of linux boxes can serve an impressive amount of files per second. If a dynamic request is required, the request is proxied back to the application logic layer which just returns the dynamically generated HTML, and all other static HTML elements are returned by the front-end. This is a 15:1 ratio typically (e.g. for every dynamic request, there are an average of 15 requests for the static HTML elements in the page, such as images, CSS, javascript, etc.). This is the most significant performance boost. Although there is a 15:1 ratio, application logic is usually more CPU intensive, and require about 1 box for each 2-5 front-end servers (depending on the application architecture). The boxes are usually mid-size (say, 1 GHz compared to the 500 MHz front-ends). If a database query is required, then that is passed on to the database layer. How many database requests there are depends entirely on the application architecture. From what I can tell, VBulletin goes to database at every request, so there would be a 1:1 relationship with EN World, which is high. 
However, a performance boost here is the use of a pool of persistent connections so that the application layer doesn't have to bring up and tear down connections at every request. Having the database on a separate machine allows for better capacity planning and tuning, since you see what is causing CPU drain. [b]Quick Hit: [/b]The full separation of function is usually a non-negligible expense, and I don't know what are your financial resources. You can also go hybrid, by combining either the front-end servers with the application layer, or the application layer with the database (meaning two layers). That all depends on where the load is coming from. You can also do "poor man's" separation of function, which is what I recommend as a quick hit. If you can only have one server, and use a web server like Apache, you can separate HTTP services and application logic by running two different versions of Apache on the same server. I'm not talking version numbers, but rather one version that is "light" and is compiled with only the most basic modules, and another which is "heavy" with all the PHP, Perl, et. al., modules. What this does is allow you to have the 15:1 ratio in Apache processes to serve static elements with the "light" version (typically 200-500K per process), and do an internal proxy (using Rewrite Rules) for dynamic elements (such as the HTML resulting from PHP) with the "heavy" version (typically a few megs per process). This gives you a good performance boost, since you don't have to spawn "heavy" processes during high load periods, you conserve memory, you respond much more quickly since there are many "light" processes for all the static elements, and you keep a relatively constant number of pooled persistent connections to the database from the "heavy" processes (which is significant). I would still separate out the database, for the reasons mentioned above, but that's your call if you want to leave it all on the same machine. [*] [b]Minimize dynamic requests: [/b]This is a tough one because of VBulletin's and EN World's nature. As long as you will have a home page that is dynamically served, you will require tons of processing power. Our customers would complain that "the hosting architecture is slow", until we pointed this out to them and made corrections for them. The bottom line: serve as much static content as you can. We improved performance (simultaneous connections) by as much as 5000% in one case. Not easy with a constantly changing site, but there is a method: static publication. Static publication is basically the periodic automated fetching of dynamic pages, which are saved as static HTML and served to users. How periodic depends on how often the information on the page changes. For example, can the home page be statically published every hour? Every day? A good example is the main forum page. Personally, I would not mind if it is updated every minute or so. It is probably one of the most requested pages other than the home page. How many page requests do you get in one minute during peak periods? 10? 100? 1000? 10000? If the page were static, you would go to dynamic 10%, 1%, 0.1%, or 0.01% of the time, respectively, with the associated savings in processing power and database connections. [b]Quick Hit: [/b]Implement a set of cron jobs for the home page and the main forum page, set at intervals you are comfortable with, and use a simple program like [i]wget[/i] to fetch and save the result as static pages. 
Make sure all appropriate links point to those static files. The simplest way is a Rewrite Rule in the "light" Apache processes. Or, if you want to go more radical, save all "archived" content to static HTML and redirect any queries using Rewrite rules to those pages. That way they are still indexed, but you don't go to PHP/database for these unchanging threads. [*] [b]Request throttling: [/b]I do not know if you have DoS or protections from inappropriate use. This is a drain on any dynamic site: the harvesting of threads for either offline viewing or other purposes. Or the wanton generation of requests for the sole purpose of making your life more difficult. Most web spiders play nice, but some play not so nice, and will spider your site as fast as you can serve it. What you can do is limit the number of request per second to any one IP address, or any range of IP addresses (better, since it protects from distributed DoS bots in any one network). These is a module called [i]mod_throttle[/i] that does exactly this, and probably others. [b]Quick hit [/b]: Installing a module like [i]mod_throttle[/i] is relatively easy, and should be implemented as soon as possible. [*] [b]Firewall optimization: [/b]I don't know your firewall setup, but during my experience, I have found that the firewall can be a critical performance factor. I really don't know if this is an issue, but I'll give ou a few leads. Where I saw performance drain was when the firewall had stateful inspection rules (meaning it checks both sides of the connection or session, the request and the return). Stateful inspections are slow, compared to simple single-sided "drop" rules. Also, unfortunately, it can be DoSed relatively easily by opening an inordinate amount of connections until the firewall runs out. All further connections are dropped. This can also occur accidentally during high load periods. The solution: turn off stateful inspection (which is a higher risk, but its a tradeoff), or put the front-end "light" servers outside the firewall. Linux servers can be well-protected on their own using the built-in Netfilter firewall, and BSD servers are said to be the most secure. Then you refuse any connections from the outside of your firewall, except for HTTP connections from the front-end servers to the application servers. Because of the 15:1 ratio in request, you increase your available connections by that much. [b]Quick Hit: [/b]I don't know until I know what your hosting setup is. [/list] The bottom line: All of the above is nice and good, but until you know what is causing your performance drain, you are shooting in the dark. Get some good (free) monitoring tools and take a look at your server activity: memory, HTTP requests, database connections, disk activity, etc.. Analyze web logs for the most number of requests or suspected DoS attacks (and firewall them out). Hopefully there are a few gems in this post that will enhance the community's experience of this great site. Andargor [/QUOTE]