Cloud CDN overview | Google Cloud (2024)

Cloud CDN (Content Delivery Network) uses Google's global edge network to serve content closer to users, which accelerates your websites and applications.

Cloud CDN works with the global external Application Load Balancer or the classic Application Load Balancer to deliver content to your users. The external Application Load Balancer provides the frontend IP addresses and ports that receive requests and the backends that respond to the requests.

Cloud CDN content can be sourced from various types of backends.

In Cloud CDN, these backends are also called origin servers. Figure 1 illustrates how responses from origin servers that run on virtual machine (VM) instances flow through an external Application Load Balancer before being delivered by Cloud CDN. In this situation, the Google Front End (GFE) comprises Cloud CDN and the external Application Load Balancer.

How Cloud CDN works

When a user requests content from an external Application Load Balancer, the request arrives at a GFE that is at the edge of Google's network as close as possible to the user.

If the load balancer's URL map routes traffic to a backend service or backend bucket that has Cloud CDN configured, the GFE uses Cloud CDN.

Cache hits and cache misses

A cache is a group of servers that stores and manages content so that future requests for that content can be served faster. The cached content is a copy of cacheable content that is stored on origin servers.

If the GFE looks in the Cloud CDN cache and finds a cached response to the user's request, the GFE sends the cached response to the user. This is called a cache hit. When a cache hit occurs, the GFE looks up the content by its cache key and responds directly to the user, shortening the round-trip time and saving the origin server from having to process the request.

A partial hit occurs when a request is served partially from cache and partially from a backend. This can happen if only part of the requested content is stored in a Cloud CDN cache, as described in Support for byte range requests.

The first time that a piece of content is requested, the GFE determines that it can't fulfill the request from the cache. This is called a cache miss. When a cache miss occurs, the GFE forwards the request to the external Application Load Balancer. The load balancer then forwards the request to one of your origin servers. When the cache receives the content, the GFE forwards the content to the user.

If the origin server's response to this request is cacheable, Cloud CDN stores the response in the Cloud CDN cache for future requests. Data transfer from a cache to a client is called cache egress. Data transfer to a cache is called cache fill.
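The hit and miss flow described above can be sketched with a toy in-memory cache. This is only an illustration under simplifying assumptions, not how Cloud CDN is implemented: the EdgeCache class, the hypothetical_origin function, and the idea that the origin returns a "cacheable" flag alongside the body are all invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class EdgeCache:
    """Toy stand-in for a single cache (hypothetical, for illustration only)."""

    # Assumed origin interface: returns (response body, whether it is cacheable).
    fetch_from_origin: Callable[[str], Tuple[bytes, bool]]
    store: Dict[str, bytes] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0
    fill_bytes: int = 0  # bytes written into the cache (cache fill)

    def handle(self, cache_key: str) -> Tuple[bytes, str]:
        # Cache hit: respond directly, without contacting the origin.
        if cache_key in self.store:
            self.hits += 1
            return self.store[cache_key], "hit"
        # Cache miss: forward to the origin, then store the response if cacheable.
        self.misses += 1
        body, cacheable = self.fetch_from_origin(cache_key)
        if cacheable:
            self.store[cache_key] = body
            self.fill_bytes += len(body)
        return body, "miss"


def hypothetical_origin(cache_key: str) -> Tuple[bytes, bool]:
    # Every response is treated as cacheable in this sketch.
    return f"content for {cache_key}".encode(), True


cache = EdgeCache(fetch_from_origin=hypothetical_origin)
outcomes = [cache.handle("/logo.png")[1], cache.handle("/logo.png")[1]]
print(outcomes)
```

The first request for a key is a miss that fills the cache, and the repeat request is a hit that never reaches the origin function.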

Figure 2 shows a cache hit and a cache miss:

Cache hit ratio

The cache hit ratio is the percentage of times that a requested object is served from the cache. If the cache hit ratio is 60%, it means that the requested object is served from the cache 60% of the time and must be retrieved from the origin 40% of the time.
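As a worked example of that arithmetic, the following sketch computes a hit ratio from request counts. The function name and the choice to return None when there are no requests (loosely mirroring the n/a value shown in the console) are assumptions made for this example.

```python
from typing import Optional


def cache_hit_ratio(hits: int, misses: int) -> Optional[float]:
    """Return the percentage of requests served from cache, or None if no requests."""
    total = hits + misses
    if total == 0:
        return None  # nothing was requested, so there is no ratio to report
    return 100.0 * hits / total


ratio = cache_hit_ratio(hits=600, misses=400)
print(ratio)
```

With 600 hits out of 1,000 requests, the ratio is 60%, so the remaining 40% of requests had to be retrieved from the origin.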

For information about how cache keys can affect the cache hit ratio, see Using cache keys. For troubleshooting information, see Cache hit ratio is low.

View the cache hit ratio for a small time period

To view the cache hit ratio for a small time period (the last few minutes):

In the Google Cloud console, go to the Cloud CDN page.
For each origin, see the Cache hit ratio column.
n/a means that the load-balanced content isn't currently cached or hasn't been requested recently.

View the cache hit ratio for a longer time period

To view the cache hit ratio for a time period from 1 hour to 30 days:

In the Google Cloud console, go to the Cloud CDN page.
In the Origin name column, click the origin name.
Click the Monitoring tab.
Optional: select a specific backend.

The CDN hit rate is one of the available monitoring graphs. A graph that displays n/a means that the content isn't cached or hasn't been requested in the displayed time range.

You can adjust the time period by selecting a different time range. The following image is an example of time ranges that you can select:

Inserting content into the cache

Caching is reactive in that an object is stored in a particular cache if a request goes through that cache and if the response is cacheable. An object stored in one cache does not automatically replicate into other caches; cache fill happens only in response to a client-initiated request. You cannot preload caches except by causing the individual caches to respond to requests.
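The per-cache, request-driven behavior can be illustrated with a small sketch. The location names and the request function are hypothetical, and every response is assumed to be cacheable.

```python
from typing import Dict, List, Set

# Hypothetical cache locations; real cache locations are not named this way.
caches: Dict[str, Set[str]] = {"location-a": set(), "location-b": set()}


def request(location: str, url: str) -> str:
    """Serve a request through one cache, filling it only on demand."""
    cache = caches[location]
    if url in cache:
        return "hit"
    # Fill happens only because a client request went through this cache.
    cache.add(url)
    return "miss"


results: List[str] = [
    request("location-a", "/video.mp4"),  # first request fills location-a
    request("location-a", "/video.mp4"),  # served from location-a
    request("location-b", "/video.mp4"),  # location-b was never filled
]
print(results)
```

The third request misses even though the object is already cached elsewhere, because nothing replicates it from one cache to another.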

When the origin server supports byte range requests, Cloud CDN can initiate multiple cache fill requests in reaction to a single client request.

Serving content from a cache

After you enable Cloud CDN, caching happens automatically for all cacheable content. Your origin server uses HTTP headers to indicate which responses are cached. You can also control cacheability by using cache modes.

When you use a backend bucket, the origin server is Cloud Storage. When you use VM instances, the origin server is the web server software that you run on those instances.

Cloud CDN uses caches in numerous locations around the world. Because of the nature of caches, it is impossible to predict whether a particular request is served out of a cache. You can, however, expect that popular requests for cacheable content are served from a cache most of the time, yielding significantly reduced latencies, reduced cost, and reduced load on your origin servers.

For more information about what Cloud CDN caches and for how long, see the Caching overview.

To see what Cloud CDN is serving from a cache, you can view logs.

Removing content from the cache

To remove an item from a cache, you can invalidate cached content. For more information, see:

Cache invalidation overview
Invalidating cached content

Cache bypass

To bypass Cloud CDN, you can request an object directly from a Cloud Storage bucket or a Compute Engine VM. For example, a URL for a Cloud Storage bucket object looks like this:

https://storage.googleapis.com/STORAGE_BUCKET/FILENAME
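As a small sketch, a direct URL of this shape can be built as follows. The bucket and object names are hypothetical placeholders, and percent-encoding the object name with urllib.parse.quote is an assumption about what a typical client would do.

```python
from urllib.parse import quote


def storage_object_url(bucket: str, object_name: str) -> str:
    """Build a direct Cloud Storage URL that doesn't go through the load balancer."""
    # Percent-encode the object name; "/" is left as-is so nested names stay readable.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name, safe='/')}"


# "my-bucket" and the object name are made-up values for illustration.
url = storage_object_url("my-bucket", "images/cat photo.png")
print(url)
```

Requesting this URL fetches the object from Cloud Storage itself, so the response isn't served from a Cloud CDN cache.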

Eviction and expiration

For content to be served from a cache, it must have been inserted into the cache, it must not be evicted, and it must not be expired.

Eviction and expiration are two different concepts. They both affect what gets served, but they don't directly affect each other.

Eviction

If you are testing content caching with a small number of requests, you might notice that the content gets evicted.

Every cache has a limit on how much it can hold. However, Cloud CDN adds content to caches even after they're full. To insert content into a full cache, the cache first removes something else to make room. This is called eviction. Caches are usually full, so they are constantly evicting content. They generally evict content that hasn't recently been accessed, regardless of the content's expiration time. The evicted content might be expired, and it might not be. Setting an expiration time doesn't affect eviction.

Unpopular content means content that hasn't been accessed in a while. A while and unpopular are both relative to the bulk of other items in the cache. Multiple Google Cloud projects share a common pool of cache space because the projects are served from the same set of GFEs. The relative popularity of content is compared across multiple projects, not only within a single project.

As caches receive more traffic, they also evict more cached content.

As with all large-scale caches, content can be evicted unpredictably, so no particular request is guaranteed to be served from the cache.
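To make eviction concrete, here is a minimal least-recently-used (LRU) cache sketch. LRU is used only as a stand-in for "evict content that hasn't recently been accessed"; the exact eviction policy of Cloud CDN isn't specified here, and the LruCache class and its item-count capacity are assumptions for this example.

```python
from collections import OrderedDict
from typing import Optional


class LruCache:
    """Toy cache that evicts the least recently accessed item when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity  # assumed limit, counted in items rather than bytes
        self.items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as recently accessed
        return self.items[key]

    def put(self, key: str, value: bytes) -> None:
        self.items[key] = value
        self.items.move_to_end(key)
        # A full cache makes room by removing the least recently accessed item.
        # Expiration times are never consulted here.
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)


cache = LruCache(capacity=2)
cache.put("/a", b"A")
cache.put("/b", b"B")
cache.get("/a")        # "/a" is now more recently accessed than "/b"
cache.put("/c", b"C")  # cache is full, so "/b" is evicted
remaining = list(cache.items)
print(remaining)
```

Inserting a third item into a two-item cache pushes out "/b", the item that was accessed least recently, whether or not it had expired.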

Expiration

Content in HTTP(S) caches can have a configurable expiration time. The expiration time informs the cache not to serve old content, even if the content hasn't been evicted.

For example, consider a picture-of-the-hour URL. Its responses should be set to expire in under one hour. Otherwise, the served content might be an old picture from a cache.
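One way an origin could handle the picture-of-the-hour case is to set the response lifetime to the time remaining until the next hour. The sketch below assumes that the origin controls expiration through a standard Cache-Control max-age directive; the function names are hypothetical, and the appropriate headers for a real deployment depend on the configured cache mode.

```python
from datetime import datetime, timedelta, timezone


def seconds_until_next_hour(now: datetime) -> int:
    """Return the whole seconds remaining until the top of the next hour."""
    next_hour = now.replace(minute=0, second=0, microsecond=0) + timedelta(hours=1)
    return int((next_hour - now).total_seconds())


def cache_control_for_hourly_picture(now: datetime) -> str:
    # The response expires no later than the moment the picture changes.
    return f"public, max-age={seconds_until_next_hour(now)}"


# Example: a request handled at 10:45:00 UTC.
header = cache_control_for_hourly_picture(
    datetime(2024, 1, 1, 10, 45, 0, tzinfo=timezone.utc)
)
print(header)
```

At 10:45 the response is given a 900-second lifetime, so a cached copy expires when the 11:00 picture becomes current.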

For information about fine-tuning expiration times, see Using TTL settings and overrides.

Requests initiated by Cloud CDN

When your origin server supports byte range requests, Cloud CDN can send multiple requests to the origin server in reaction to a single client request. As described in Support for byte range requests, Cloud CDN can initiate two types of requests: validation requests and byte range requests.
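The following sketch shows one plausible way a cache could turn a single client range into several chunk-aligned byte range requests. It is illustrative only: the chunk size, the alignment strategy, and the chunk_ranges function are assumptions, and the values that Cloud CDN actually uses are not given in this overview.

```python
from typing import List, Tuple


def chunk_ranges(first_byte: int, last_byte: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split an inclusive byte range into chunk-aligned inclusive ranges."""
    if first_byte < 0 or last_byte < first_byte or chunk_size <= 0:
        raise ValueError("invalid range or chunk size")
    ranges = []
    # Align the first request down to a chunk boundary.
    start = (first_byte // chunk_size) * chunk_size
    while start <= last_byte:
        ranges.append((start, start + chunk_size - 1))
        start += chunk_size
    return ranges


# A client asks for bytes 1500-3200; the chunk size of 1000 is a made-up value.
fills = chunk_ranges(first_byte=1500, last_byte=3200, chunk_size=1000)
range_headers = [f"bytes={start}-{end}" for start, end in fills]
print(range_headers)
```

Here one client request leads to three origin requests, each covering one aligned chunk that overlaps the requested range.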

Data location settings of other Cloud Platform Services

Using Cloud CDN means that data may be stored at serving locations outside of the region or zone of your origin server. This is normal and how HTTP caching works on the internet. Under the Service Specific Terms of the Google Cloud Platform Terms of Service, the Data Location Setting that is available for certain Cloud Platform Services does not apply to Core Customer Data for the respective Cloud Platform Service when used with other Google products and services (in this case the Cloud CDN service). If you don't want this outcome, don't use the Cloud CDN service.

Support for Google-managed SSL certificates

You can use Google-managed certificates when Cloud CDN is enabled.

Google Cloud Armor with Cloud CDN

Google Cloud Armor with Cloud CDN features two types of security policies:

Edge security policies. These policies can be applied to your Cloud CDN-enabled origin servers. They apply to all traffic, before CDN lookup.
Backend security policies. These policies are enforced only for requests for dynamic content, cache misses, or other requests that are destined for your origin server.
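The ordering of the two policy types relative to the cache lookup can be sketched as follows. This is a simplified model, not the Google Cloud Armor API: the handle_request function, the callable policies, and the example paths are all hypothetical.

```python
from typing import Callable, Set


def handle_request(
    path: str,
    cached_paths: Set[str],
    edge_allows: Callable[[str], bool],
    backend_allows: Callable[[str], bool],
) -> str:
    # Edge policy: evaluated for all traffic, before the cache lookup.
    if not edge_allows(path):
        return "denied by edge policy"
    # Cache hit: the request never reaches the origin server.
    if path in cached_paths:
        return "served from cache"
    # Backend policy: evaluated only for requests destined for the origin.
    if not backend_allows(path):
        return "denied by backend policy"
    return "served from origin"


cached = {"/static/app.js"}
edge = lambda p: not p.startswith("/blocked")
backend = lambda p: not p.startswith("/admin")

outcomes = [
    handle_request(p, cached, edge, backend)
    for p in ["/blocked/x", "/static/app.js", "/admin/panel", "/api/data"]
]
print(outcomes)
```

In this model a cached response is returned without the backend policy ever being evaluated, which is why rules that must apply to cached content belong in an edge policy.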

For more information, see the Google Cloud Armor documentation.

What's next

To enable Cloud CDN for your HTTP(S) load balanced instances and storage buckets, see Using Cloud CDN.
To use Cloud CDN with Google Kubernetes Engine, see Configure Cloud CDN through Ingress.
To find GFE points of presence, see Cache locations.