HTTP
Overview
HTTP: Hypertext Transfer Protocol
HTTP is an application-layer protocol that works in a request-response manner and contains many elements that are all important when building a website or application. By default it uses port 80, and port 443 for HTTPS (HTTP Secure).

Source: https://www.webnots.com
HTTP Transactions consist of 6 flows:
- DNS Lookup
- Connect
- Send
- Wait
- Load
- Close
Source: http://blog.catchpoint.com/
Below are short descriptions of the different versions, status codes and methods available. This information is followed by a complete list of request and response headers along with best practice for each.
Breakdown of an HTTP request
Example HTTP URL: http://www.domain.com/test/folder/image.jpg?v=1.0

HTTP Request | Example |
---|---|
Protocol | http:// or https:// |
Domain | www.domain.com |
URI (note this is not the URL) | test/folder/image.jpg |
Query String | ?v=1.0 |
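
The same breakdown can be reproduced in code. The following is a minimal sketch using Python's standard `urllib.parse` module; the function name `break_down` is only an illustrative choice. Note that the parser reports the path with its leading slash and the query string without the `?`.

```python
from urllib.parse import urlsplit


def break_down(url: str) -> dict:
    """Split a URL into the parts listed in the table above."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # "http" or "https"
        "domain": parts.hostname,   # host name without any port
        "path": parts.path,         # includes the leading "/"
        "query": parts.query,       # excludes the leading "?"
    }


example = break_down("http://www.domain.com/test/folder/image.jpg?v=1.0")
print(example)
```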
Version
Two versions of the protocol are covered here: HTTP/1.1 and HTTP/2. HTTP/1.1 has served most internet traffic for over 15 years; more recently, HTTP/2 has arrived. The table below notes the benefits of moving to the newer version. Most major CDN vendors and browsers now support HTTP/2.
Feature | Benefits of HTTP/2 |
---|---|
Multiplexing | HTTP/1.1 handles only one request at a time on a connection, with further requests queued behind it. HTTP/2 allows multiple requests to be in flight on the same connection, so content is delivered in parallel. |
Single connection to server | A single TCP connection is opened to the server and is kept alive for as long as the website is open. There is no need for several TCP connections while browsing the same site. |
Server pushing | Similar to pre-heating the CDN with content from the origin server, you can now push content to the end user (browser) for future use. |
Prioritisation | As it sounds, HTTP/2 allows for content to be loaded by priority. |
HPACK | Header Compression to reduce overheads |
Status Codes
Overview
Status codes are 3-digit codes that are split into 5 categories. These codes are sent from a server to an end user (browser) in the response to every request. The current standard for these codes was defined in HTTP/1.1 and there are (when this document was written) no changes moving towards HTTP/2. The five categories are shown below with best practice use (where required). The list contains the most commonly seen status codes only. For a complete list please follow the link below: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
10x – Information
Code | Description and Best Practice (where applicable to applications/CDN) |
---|---|
100 | Continue The server has received the request headers and is now ready for the request body. This is generally only used for specific methods such as POST. |
101 | Switching Protocols A request to switch protocols and the server has agreed. |
20x – Success
Code | Description and Best Practice |
---|---|
200 | OK Standard response to all successful HTTP requests. Remember that applications and CDN configurations can manipulate response codes. It is very important not to send 200 OK responses for failed requests. For example, do not configure an application or the CDN to send 200 OK responses for 404 pages. This is not only bad practice for analytics but also affects how proxies and clients (browsers) cache or otherwise handle the responses. |
203 | Non-Authoritative Information Specific to a web proxy (such as a CDN), when a 200 OK is received from the origin server, the web proxy sends a modified version to the end user. |
206 | Partial Content The server is only delivering part of the resource due to a range request sent by the client. This is commonly used for large files and is very useful for resuming large downloads after a connection is broken. For example, sending a 1 GB file in chunks of 100 MB. If the connection breaks after the first 5 parts of the file have been received, the client can request the same file again and resume from that point rather than starting over. With proper use of range requests, a 206 is received for every requested range, including the last one; a 200 OK is returned only when the server ignores the Range header and sends the whole file. Note that range requests (chunking) must be enabled on both the CDN configuration and the origin server to accept these requests. Also note that the CDN may have a default list of file types for chunking. Specific file types may require customised rules. |
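
To make the resume scenario for 206 responses concrete, the sketch below builds the `Range` request header a client could send after an interrupted download and parses the `Content-Range` header of a 206 response. The function names are hypothetical, and the parser assumes the simple `bytes first-last/total` form (it does not handle an unknown total written as `*`).

```python
from typing import Dict, Tuple


def resume_headers(bytes_already_received: int) -> Dict[str, str]:
    """Request everything from the first missing byte to the end of the file."""
    if bytes_already_received < 0:
        raise ValueError("byte count cannot be negative")
    return {"Range": f"bytes={bytes_already_received}-"}


def parse_content_range(value: str) -> Tuple[int, int, int]:
    """Parse 'bytes first-last/total' into (first, last, total)."""
    unit, _, rest = value.strip().partition(" ")
    if unit != "bytes":
        raise ValueError(f"unsupported range unit: {unit!r}")
    span, _, total = rest.partition("/")
    first, _, last = span.partition("-")
    return int(first), int(last), int(total)


# Five 100 MB parts of a 1 GB file were received before the connection broke.
received = 5 * 100 * 1024 ** 2
print(resume_headers(received))
print(parse_content_range("bytes 524288000-1073741823/1073741824"))
```

Byte positions are zero-based and inclusive, which is why the last byte of a 1 GB file is 1073741823 rather than 1073741824.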
30x – Redirections
Code | Description and Best Practice |
---|---|
301 | Moved Permanently The requested URL has been definitively moved to another URL, which is given in the Location header. This request and all further requests for the original URL will be redirected. Given proper use of this response code, it makes perfect sense for caching to be enabled so that browsers and CDN servers can answer the redirects themselves. |
302 | Found Although the name of this response code is “Found”, it is more easily described as a temporary redirect. It works in the same way as a 301, but responses should not be cached. |
304 | Not Modified Indicates that there is no need to retrieve the requested resource because a fresh version is already cached. This is usually based on the request header If-Modified-Since or If-None-Match. |
40x – Client Errors
Code | Description and Best Practice |
---|---|
400 | Bad Request The server cannot or will not process the request due to a client-side error. |
401 | Unauthorized Used specifically for authentication. It is returned when authentication was not provided or the attempt failed. A very common cause of a 401 is Basic Authentication, usually a username and password check that can be set up on the origin or the CDN very easily. |
403 | Forbidden The request is valid, but the server has blocked the end user from accessing the site. When setting up IP blocks on the CDN, this is the most used status code for any rejected/blocked user attempts. |
404 | Not Found The request is accepted by the server, but the object or resource does not exist. This is very common with specific, long URL links that can easily have changed over time. |
405 | Method Not Allowed A basic status code that states the requested HTTP method is not supported or allowed. For example, if a GET request is used to send information in a form it may be rejected as this should be a POST request. |
50x – Server Errors
Code | Description and Best Practice |
---|---|
500 | Internal Server Error A general message provided by the server when an issue occurs that was unexpected and does not sit within any specific category. |
501 | Not Implemented Response code that suggests the server does not accept the requested HTTP method or it does not have the ability to complete the request. |
502 | Bad Gateway Returned when a server acting as a proxy or gateway (e.g. a CDN) receives an invalid response from the upstream server. This is mostly seen when troubleshooting and watching the hops between servers over HTTP (not TCP). |
503 | Service Unavailable Server is unavailable. This is a classic code used when a CDN server cannot get a response or build a TCP connection to the origin server. This could be for a number of reasons:
Note, the general error message seen on a browser for 503 responses via a CDN is: |
Request Headers
Host
Explanation | The domain name of the server and if used, the port number required. |
---|---|
Best Practice | When the CDN is in play, it is important to configure the Host header on the CDN as required by the origin. In most cases the CDN domain will differ from the origin domain, therefore, the initial request to the CDN will carry a specific value for the host header. We do not want this value used when the CDN needs to connect to the origin (unless the origin is configured to accept that specific value). |
Example | CDN domain: cdn.domain.com Origin domain: origin.domain.com End User -> CDN – Host: cdn.domain.com CDN -> Origin – Host: origin.domain.com |
References |
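
The Host rewrite described above can be sketched as a small function. This is only an illustration of the idea, not how any particular CDN implements it; `headers_for_origin` is a hypothetical name and the domains are the example values from the table.

```python
from typing import Dict


def headers_for_origin(client_headers: Dict[str, str], origin_host: str) -> Dict[str, str]:
    """Copy the client's headers, replacing Host with the value the origin expects."""
    # Header names are case-insensitive, so drop any spelling of "Host".
    forwarded = {k: v for k, v in client_headers.items() if k.lower() != "host"}
    forwarded["Host"] = origin_host
    return forwarded


incoming = {"Host": "cdn.domain.com", "Accept": "*/*"}
print(headers_for_origin(incoming, "origin.domain.com"))
```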
User-Agent
Explanation | A string of information that allows network protocol peers to identify application type, software vendor, operating system and software version. |
---|---|
Best Practice | As this request header is set automatically by the browser, best practice for developers concerns how their application responds with different content depending on its value. It is important to know if and how this can be done on the CDN, saving the request from having to go to the origin. |
Example | A common use is to read the User-Agent to determine the type of device. Although the URLs for the desktop and mobile versions of a site might be the same, a CDN can cache both as separate instances based on the User-Agent. |
References | User-Agent List: http://www.useragentstring.com/pages/useragentstring.php |
Method
Explanation | Request methods are not strictly request headers but are still a major part of each request. The different methods available (most common at the top of the list) are:
|
---|---|
Best Practice | CDNs are generally closed to HTTP methods other than GET and HEAD. If you require POST or other methods, these need to be enabled or explicitly defined in the CDN configuration. There is no specific best practice except the correct use of each method; however, it is important to understand how your CDN vendor caches requests and responses based on different methods. |
References | A complete list of methods with explanations: https://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Request_methods |
Referer
Explanation | Contains the URL of the previous webpage from which the currently requested page was reached. Note that if the previous page was HTTPS and the current page is HTTP, a browser may not send the Referer header. In most cases, this information is used for analytics, logging or optimising caching. |
---|---|
Best Practice | Used in certain cases to route/proxy requests (CSS, JS etc.) to the same origin as an HTML page. |
References |
Origin
Explanation | This is similar to the Referer but only states the origin (scheme, host and port) from which the request originated, without the path. |
---|---|
Best Practice | Directly linked to CORS headers that are explained later in the document. |
References |
Accept-Encoding
Explanation | A set of values that are related to content encoding. These are usually compression-based values that are sent by the client (browser) to the server. Although an encoding may be advertised and supported by both the client and the server, this does not guarantee that compression is applied, for example if the content is already compressed and/or the server is overloaded. As a general rule of thumb, servers may skip on-the-fly compression when CPU usage is high (around 80%). |
---|---|
Best Practice | Note that this header is sent by the client (browser); therefore, there is no need to actively enforce it unless the client does not send the header for specific content types that you deem compressible. It is imperative that the application team understands that compression (if required) needs to be enabled in the CDN configuration and possibly customised for specific content or “the amount of compression”. These are generally basic settings on the CDN but can be customised. The idea is to have the CDN request the content from the origin and receive a compressed version that can be cached. CDN servers will decompress content on the fly.
|
References | Full encoding tokens: https://en.wikipedia.org/wiki/HTTP_compression |
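
As an illustration of how a server or CDN might act on Accept-Encoding, the sketch below picks an encoding that both sides support. It is a simplified reading of the header: the function names and the server preference order (`br` before `gzip`) are assumptions made for the example, and the `*` wildcard is not handled.

```python
from typing import Dict, Sequence


def parse_accept_encoding(header: str) -> Dict[str, float]:
    """Map each advertised encoding to its q-value (1.0 when omitted)."""
    accepted = {}
    for item in header.split(","):
        token, _, params = item.strip().partition(";")
        token = token.strip().lower()
        if not token:
            continue
        quality = 1.0
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not acceptable"
        accepted[token] = quality
    return accepted


def choose_encoding(header: str, server_supported: Sequence[str] = ("br", "gzip")) -> str:
    """Return the first server-preferred encoding the client accepts, else 'identity'."""
    accepted = parse_accept_encoding(header)
    for encoding in server_supported:
        if accepted.get(encoding, 0.0) > 0:
            return encoding
    return "identity"  # send the content uncompressed


print(choose_encoding("gzip, deflate, br"))
print(choose_encoding("gzip;q=0, deflate"))
```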
Transfer-Encoding
Explanation | A set of values that are valid on a hop-by-hop basis between servers. It denotes the form of encoding used to transfer an object or entity to the next hop. Similar to Accept-Encoding, there are several options, but the most commonly used is “chunked”. With chunked encoding, the body of a single response is sent as a series of small chunks instead of one block, so the server does not need to know the total size in advance. The whole response carries one status code (for example 200). Resuming a failed transfer from where it stopped is handled separately by range requests, where each partial response carries a 206, and a CDN can cache those ranges. Both are very useful when delivering VoD (Video on Demand). |
---|---|
Best Practice |
|
References | https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Transfer-Encoding |
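
To show what the “chunked” value means on the wire, here is a minimal sketch of the HTTP/1.1 chunked body format: each chunk is prefixed with its size in hexadecimal, and a zero-length chunk marks the end. The function name and the tiny default chunk size are illustrative only; real servers use much larger chunks and the sketch omits optional trailers.

```python
def encode_chunked(body: bytes, chunk_size: int = 8) -> bytes:
    """Encode a body using HTTP/1.1 chunked transfer encoding."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    out = bytearray()
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        # Size line in hex, then the chunk data, each terminated by CRLF.
        out += f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    out += b"0\r\n\r\n"  # terminating zero-length chunk
    return bytes(out)


print(encode_chunked(b"hello world", chunk_size=5))
```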
Accept-Language
Explanation | This header is sent by the client (browser) by default and is very rarely amended by the end user. By default the browser sends the preferred language and locale of the end user. This header is passed (by default) from the CDN to the origin, where the website/application may or may not be configured to read the values present and make use of them. In general, the syntax used is a 2 or 3 letter language code followed by a “-” (dash) followed by a 2 letter locale. For example: Accept-Language: en-US |
---|---|
Best Practice | Example: If a CDN has cached versions of a homepage in different language and country variations, there is no need for the request to be sent to the origin. The CDN can be customised to read the Accept-Language header and provide the suitable version for that end user. It is more common to use the following structure when creating sites in different languages/locales: www.domain.com/<language-locale>/ (for example, www.domain.com/en-US/). These redirects can easily be set up or cached on the CDN. The suggestion that the first request should always hit the origin is an older theory that applied before CDN configurations could handle these varied requests successfully. Best practice would be to push as many processes as possible to the CDN without the need for the origin to be sent any requests. |
References | https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Accept-Language |
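
The language selection a CDN or application performs on Accept-Language could look roughly like the sketch below. It is an assumption-laden simplification: the function name, the list of available versions and the default are hypothetical, q-values are honoured, and only exact tag matches are considered (no fallback from `en-GB` to `en`).

```python
from typing import Sequence


def preferred_language(header: str, available: Sequence[str], default: str = "en-US") -> str:
    """Return the available language the client prefers most, else the default."""
    ranked = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag:
            continue
        quality = 1.0
        params = params.strip().lower()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by the order the client listed them.
        ranked.append((-quality, position, tag))
    ranked.sort()

    lookup = {name.lower(): name for name in available}
    for negative_quality, _, tag in ranked:
        if negative_quality == 0:
            continue  # q=0 means "not acceptable"
        if tag.lower() in lookup:
            return lookup[tag.lower()]
    return default


versions = ["en-US", "de-DE"]
print(preferred_language("fr-FR, de-DE;q=0.9, en-US;q=0.8", versions))
```

The returned value could then drive a redirect to a path such as www.domain.com/de-DE/.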
Cookie
Explanation | The cookie header contains previously stored cookies (from the Set-Cookie response header – explained later). |
---|---|
Best Practice |
|
References |
If-Modified-Since
Explanation | Makes the request conditional: the server responds with a 200 OK and the object only if it has been modified after the date given in the If-Modified-Since header. Otherwise it responds with a 304 Not Modified and no body. |
---|---|
Best Practice | The header is ignored if used in conjunction with If-None-Match, and it can only be used with GET and HEAD requests. The most common use of If-Modified-Since is to keep cache entries up to date where no ETag header is being used. Syntax example: If-Modified-Since: Wed, 21 Oct 2015 07:28:00 GMT Take note that the day of the week and month are case-sensitive and the time zone is always GMT (not local). |
References |
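
The conditional logic behind If-Modified-Since can be sketched with Python's standard `email.utils` helpers, which read and write the HTTP date format. The function name `conditional_status` is hypothetical, and treating an unparseable date as "ignore the condition" is an assumption of this sketch.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(if_modified_since: str, last_modified: datetime) -> int:
    """Return 200 if the object changed after the given date, else 304."""
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unparseable date: ignore the condition and send the object
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    last_modified = last_modified.replace(microsecond=0)
    return 200 if last_modified > since else 304


stamp = datetime(2015, 10, 21, 7, 28, tzinfo=timezone.utc)
header_value = format_datetime(stamp, usegmt=True)
print(header_value)
print(conditional_status(header_value, stamp))
```

`last_modified` is expected to be a timezone-aware datetime, in line with the note above that HTTP dates are always GMT.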
Cache-Control
Explanation | Note, Cache-Control is a header that can be used as both a request and a response header; however, it is unidirectional, which means that a directive sent on the request does not need to be matched by what is sent on the response. Cache-Control headers carry a varied set of directives and values that control how a resource is cached and validated. In this case, it is used as a request header sent from a browser to the server. Cache-Control does not often come up as a request header (or is often overlooked). |
---|---|
Best Practice | Cache-Control: no-cache The most popular use of this header in a request is with the value “no-cache”, which tells proxies to revalidate content regardless of whether it is fresh (i.e. has not expired or reached the max-age). |
References | Cache-Control contains many values and uses. Please refer to the following for a full list of usage. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control |
Response Headers
Cache-Control
Explanation | Note, Cache-Control is a header that can be used as both a request and a response header; however, it is unidirectional, which means that a directive sent on the request does not need to be matched by what is sent on the response.
Cache-Control is mainly used as a response header and is now the main header used to control how, and for how long, a resource or object is cached. There are many values but the most common are:
|
---|---|
Best Practice | Each application should spend time analysing the life cycle of all their assets. |
Example | |
References | Cache-Control contains many values and uses. Please refer to the following for a full list of usage. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control |
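
As a rough illustration of how a cache reads a Cache-Control response header, the sketch below splits the value into directives and derives how long a shared cache (such as a CDN) may keep the object. The function names are hypothetical, the directives used (`max-age`, `s-maxage`, `no-store`, `no-cache`) are standard ones chosen for the example, and the parser is simplified: it does not cope with commas inside quoted values.

```python
from typing import Dict, Union


def parse_cache_control(value: str) -> Dict[str, Union[str, bool]]:
    """Split a Cache-Control value into {directive: argument or True}."""
    directives: Dict[str, Union[str, bool]] = {}
    for item in value.split(","):
        name, separator, argument = item.strip().partition("=")
        name = name.strip().lower()
        if not name:
            continue
        directives[name] = argument.strip().strip('"') if separator else True
    return directives


def shared_cache_lifetime(value: str) -> int:
    """Seconds a shared cache may serve the object without revalidating."""
    directives = parse_cache_control(value)
    if "no-store" in directives or "no-cache" in directives:
        return 0
    # s-maxage applies to shared caches and takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        raw = directives.get(name)
        if isinstance(raw, str) and raw.isdigit():
            return int(raw)
    return 0


print(parse_cache_control("public, max-age=3600"))
print(shared_cache_lifetime("public, max-age=3600, s-maxage=86400"))
```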
Date
Explanation | This contains the date and time at which the message was sent. It is the responsibility of the server that replies to the request to include the Date header, and this date must be as accurate as possible and follow the HTTP-date format. Date = "Date" ":" HTTP-date |
---|---|
Best Practice | The main course of best practice with the date header is as follows:
|
References |
Set-Cookie
Explanation | Please note that this header is used to send cookie information from the origin server to the user-agent (not the end user as such). The same end user could make a request for the same content on a laptop or smart device; the responses could, and most likely will, differ per device. The syntax is as follows, where the prefix is optional: Set-Cookie: <cookie-name>=<cookie-value>; <prefix> |
---|---|
Best Practice | Only in rare occurrences does the CDN look at that header. |
References | For a complete list and usage of this header, please read the following: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Set-Cookie |
Content-Encoding
Explanation | Used in conjunction with the request header Accept-Encoding. The header describes the type of compression used so the client (browser) understands how to decompress the resource or object.
Some of the most common uses are:
|
---|---|
Best Practice | It is highly recommended to compress resources wherever possible; however, be careful about which resources you apply compression to, as some may already be compressed. Applying compression to an object that is already compressed may not decrease its size but will most likely increase the load time (as compression is handled on the fly). |
References |
Last-Modified
Explanation | Date and time at which the origin server believes the object or resource was last modified. Very useful to validate or determine if an object or resource received is the same. |
---|---|
Best Practice | Used in conjunction with request headers If-Modified-Since and If-Unmodified-Since, best practice would be to have accurate dates and times. However, some would argue best practice would be to use the ETag header (explained later). |
References |
Content-Type
Explanation | The client can read this header to determine the type of content of a specific object or resource. However, many browsers have MIME sniffing enabled, which infers this information from the content, so they may ignore the header. |
---|---|
Best Practice | Although browsers may enable MIME sniffing, application developers should always configure the content-type header correctly as assumptions cannot be made. More importantly, a CDN most likely does not MIME sniff; therefore, proper use of this header is very important, especially when a CDN considers compression, caching and chunking.
Example: |
References | MIME types: https://developer.mozilla.org/en-US/docs/Web/HTTP/Basics_of_HTTP/MIME_types |
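
One simple way for an application to set Content-Type consistently is to derive it from the file extension. The sketch below uses Python's standard `mimetypes` module; the function name and the `application/octet-stream` fallback are choices made for this example, and the exact mapping for less common extensions can vary between systems.

```python
import mimetypes


def content_type_for(path: str, default: str = "application/octet-stream") -> str:
    """Guess a Content-Type value from the file extension in the path."""
    guessed, _encoding = mimetypes.guess_type(path)
    return guessed or default


print(content_type_for("test/folder/image.jpg"))
print(content_type_for("archive.unknownext123"))
```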
CORS Headers – Access-Control-Allow-Origin
Explanation | Cross Origin Resource Sharing (CORS) is a way for the client (browser) to access a resource or object that is on a server different to the origin. There are many CORS headers; however, the most common is Access-Control-Allow-Origin. This header is sent from the origin server and indicates whether the response can be shared with requesting code from the specified origin. The two main forms are: Access-Control-Allow-Origin: * (a wildcard that allows access from all origins) and Access-Control-Allow-Origin: <origin> (access from a specific origin is allowed). |
---|---|
Best Practice | Generally, CDNs will automatically pass what are considered “safe” headers between the origin and the end user without them having to be specified. With CORS headers, many CDN vendors will require the headers to be strictly defined in each configuration. Best practice is to avoid wildcard (*) values, as these defeat the purpose of using the header; stick with specific domains for proper use. This will also prevent other applications or websites from hijacking resources. Fonts are a very common example of objects that are hijacked and loaded on other websites. |
References | CORS overview and all CORS headers: https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS |
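
The recommendation to use specific domains rather than a wildcard can be sketched as an allow-list check. The origins listed and the function name are hypothetical values for illustration; the sketch returns the header only when the request's Origin is on the list.

```python
from typing import Dict, Optional

# Hypothetical allow-list: the sites permitted to load these resources.
ALLOWED_ORIGINS = {"https://www.domain.com", "https://app.domain.com"}


def cors_headers(request_origin: Optional[str]) -> Dict[str, str]:
    """Echo the request's Origin back only if it is on the allow-list."""
    if request_origin in ALLOWED_ORIGINS:
        return {"Access-Control-Allow-Origin": request_origin}
    return {}  # no CORS header: the browser will refuse cross-origin use


print(cors_headers("https://www.domain.com"))
print(cors_headers("https://other-site.example"))
```

Because the response now differs per Origin, any shared cache in front of it would need to take the Origin request header into account when storing the object.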
Location
Explanation | Simple but important. This header tells the browser which URL to redirect that particular request to. Note, that this is only valid for redirections (generally with a 30x response code) but not for rewrites. |
---|---|
Best Practice | The only point to consider here is the correct use of the status code. Should the redirect be a 301 or a 302? There are other options, but these are the main two. As discussed before (in status codes), it is important to consider which redirects should be permanent and which temporary. |
References |
ETag
Explanation | A specific identifier for the version of a resource or object. It is a more efficient way to determine whether a server has the latest version. To save bandwidth, a server can send a request to the origin for an object along with the ETag of the copy it holds in cache. If the ETags match, there is no need to send the object again. |
---|---|
Best Practice | Many CDNs support ETag but do not actively promote its use. It is very easy to use properly: for example, ensure a new ETag value is generated for each new version of the object sent. |
References | https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag |
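
A minimal sketch of ETag validation follows. Deriving the ETag from a hash of the content is one common approach (an assumption here, not something the header requires), and the function names are hypothetical. The comparison against If-None-Match ignores the weak `W/` prefix.

```python
import hashlib
from typing import Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted ETag that changes whenever the content changes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, str, bytes]:
    """Return (status, etag, payload); 304 with no payload if the ETag matches."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = []
        for token in if_none_match.split(","):
            token = token.strip()
            # Weak validators are written W/"..."; compare on the quoted part only.
            candidates.append(token[2:] if token.startswith("W/") else token)
        if "*" in candidates or etag in candidates:
            return 304, etag, b""
    return 200, etag, body


status, etag, _ = respond(b"version 1")
print(status, etag)
print(respond(b"version 1", if_none_match=etag)[0])
print(respond(b"version 2", if_none_match=etag)[0])
```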
Vary
Explanation | A very sensitive response header that uses the matching of specified request headers to determine if a cached resource can be served or a fresh copy should be sent. The format could be:
The use of a wildcard (*) indicates that a new copy needs to be provided for every request. Listing a header asks the server (or cache) to compare the value of that request header on the incoming request against the value stored with the cached copy. For example, it is possible to cache different versions of a resource based on User-Agent. This is very common with websites/applications that have desktop and mobile versions. |
---|---|
Best Practice | The recommendation is to avoid the Vary header and instead use meaningful Cache-Control headers and Content Variation. Vary headers are easily misinterpreted and are more often than not used incorrectly.
Cache-Control used correctly can clearly mark objects that should not be cached. Content Variation (supported by most CDN vendors) automatically caches different versions of objects based on specified headers. |
References |
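
To show why Vary is sensitive, the sketch below builds the key a cache might use to store a response: the URL plus the values of every request header named in Vary. The function name and key layout are assumptions for illustration only. A value of `*` yields no key at all, meaning the response cannot be reused.

```python
from typing import Dict, Optional, Tuple


def cache_key(url: str, request_headers: Dict[str, str], vary: str) -> Optional[Tuple]:
    """Return a cache key for the response, or None if it must not be reused."""
    if vary.strip() == "*":
        return None
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    # A missing request header is treated as an empty value.
    return (url, tuple((name, lowered.get(name, "")) for name in names))


desktop = cache_key("/home", {"User-Agent": "DesktopBrowser/1.0"}, "User-Agent")
mobile = cache_key("/home", {"User-Agent": "MobileBrowser/1.0"}, "User-Agent")
print(desktop != mobile)
```

Every distinct header value creates a separate cache entry, so varying on a header with many possible values, such as a full User-Agent string, can fragment the cache badly.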
3rd Party Objects
Explanation | Objects that are not loaded through your domain. For example, Google features, fonts from specific vendors etc. |
---|---|
Best Practice | Third-party objects are a common blind spot when looking at analytics for websites and applications. Always consider running tests on your specific domain and not only on webpage URLs. Slow and often restricted third-party domains can and will affect the load times of your website. Bear in mind that the 3rd party domains may not use a CDN to accelerate their content. Do not cover your website or application with 3rd party objects; it is detrimental to performance and security. Note that some countries restrict major domains. For example, Google is blocked in China. |
References |