Loading static files from disk
Authorizing and Mapping Urls and Domains
By default PageSpeed loads sub-resources via an HTTP fetch. Loading them directly from the filesystem would be faster, but this is not always safe: a sub-resource may be dynamically generated, or it may not be stored on the same server.
However, you can explicitly tell PageSpeed to load static sub-resources from disk by using the
LoadFromFile directive. For example:
pagespeed LoadFromFile "http://www.example.com/static/" "c:\www\static/"
tells PageSpeed to load all resources whose URLs start with
http://www.example.com/static/ from the filesystem under
c:\www\static/. For example,
http://www.example.com/static/images/foo.png will be loaded from the file
c:\www\static/images/foo.png. However,
http://www.example.com/bar.jpg will still be fetched using HTTP.
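The prefix mapping described above can be sketched in a few lines of Python. This is only an illustration of the documented behavior, not PageSpeed's implementation; the names `MAPPINGS` and `map_url_to_file` are hypothetical.

```python
from typing import Optional

# Hypothetical table mirroring the LoadFromFile directive above:
# (URL prefix, filesystem prefix).
MAPPINGS = [
    ("http://www.example.com/static/", "c:\\www\\static/"),
]

def map_url_to_file(url: str) -> Optional[str]:
    """Return the mapped file path, or None if the URL is fetched over HTTP."""
    for url_prefix, file_prefix in MAPPINGS:
        if url.startswith(url_prefix):
            # Swap the URL prefix for the filesystem prefix and keep the rest.
            return file_prefix + url[len(url_prefix):]
    return None

print(map_url_to_file("http://www.example.com/static/images/foo.png"))
# c:\www\static/images/foo.png
print(map_url_to_file("http://www.example.com/bar.jpg"))
# None
```

A URL outside every configured prefix returns None here, which corresponds to the normal HTTP fetch.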
If you need more sophisticated prefix-matching behavior, you can use the
LoadFromFileMatch directive, which supports RE2-formatted regular expressions. (Note that this is not the same format as the wildcards used above and elsewhere in PageSpeed.) For example:
pagespeed LoadFromFileMatch "^https?://example.com/~([^/]*)/static/" "c:\www\static/\\1"
With this directive, a URL such as http://example.com/~al/static/css/ie is loaded from
c:\www\static/al/css/ie. The resource
http://example.com/~pat/images/static/puppy.gif, however, would not be matched by this directive and would be fetched using HTTP.
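To see which URLs the pattern accepts, it can be tried out in Python. Note the assumptions: Python's `re` module is not RE2, although this particular pattern behaves the same in both, and the helper name `captured_user` is hypothetical. The sketch only checks matching and the captured group, not how PageSpeed builds the final file path.

```python
import re

# Pattern copied from the LoadFromFileMatch example above.
PATTERN = re.compile(r"^https?://example.com/~([^/]*)/static/")

def captured_user(url):
    """Return the text captured by group 1, or None if the URL does not match."""
    m = PATTERN.match(url)
    return m.group(1) if m else None

print(captured_user("https://example.com/~al/static/css/ie"))
# al
print(captured_user("http://example.com/~pat/images/static/puppy.gif"))
# None
```

The second URL fails because `/static/` must follow the user name immediately, and here `/images/` comes first.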
Because PageSpeed is loading the files directly from the filesystem, no custom headers will be set.
You can also use the
LoadFromFile directive to load HTTPS resources that would not otherwise be fetchable directly. For example:
pagespeed LoadFromFile "https://www.example.com/static/" "c:\www\static/";
The filesystem path must be an absolute path.
You can specify multiple
LoadFromFile associations in configuration files. Note that large numbers of such directives may impact performance.
If the sub-resource cannot be loaded from file in the directory specified, the sub-request will fail (rather than fall back to HTTP fetch). Part of the reason for this is to indicate a configuration error more clearly.
As an added benefit, if resources are loaded from file, the rewritten versions will be updated immediately when you change the associated file. Resources loaded via normal HTTP fetches are refreshed only when they expire from the cache (by default every 5 minutes). Therefore, the rewritten versions are only updated as often as the cache is refreshed. Resources loaded from file are not subject to caching behavior because they are accessed directly from the filesystem for every request for the rewritten version.
This directive cannot be used in location-specific configuration sections.
Limiting Direct Loading
A mapping set up with
LoadFromFile allows filesystem loading for anything it matches. If you have directories or file types that cannot be loaded directly from the filesystem,
LoadFromFileRule lets you add fine-grained rules to control which files will be loaded directly and which will fall back to the standard process, over HTTP.
When given a URL, PageSpeed first determines whether any LoadFromFile mappings apply. If one does, it calculates the mapped filename and checks for applicable LoadFromFileRules. Considering rules in the reverse order of definition, it takes the first applicable one and uses that to determine whether to load from file or fall back to HTTP.
Some examples may be helpful. Consider a website that is entirely static content except for a /bin directory:
c:\www\index.html
c:\www\css\style.css
c:\www\gfx\image.png
c:\www\bin\webapp.dll
While most of the site can be loaded directly from the filesystem,
web.config are files that need to be interpreted before serving -- or not served at all! Adding a rule disallowing the
/bin directory tells us to fall back to HTTP appropriately:
pagespeed LoadFromFile http://example.com/ c:\www\
pagespeed LoadFromFileRule Disallow c:\www\bin
The LoadFromFileRule directive takes two arguments. The first must be either
Allow or Disallow, while the second is a prefix that specifies which filesystem paths it should apply to. Because the default is to allow loading from the filesystem for all paths listed in any
LoadFromFile statement, most of the time you will be using
Disallow to turn off filesystem loading for some subset of those paths. You would use
Allow only after a
Disallow that was overly general.
Not all sites are well suited for prefix-based control. Consider a site with aspx files mixed in with ordinary static files:
c:\www\index.html
c:\www\webmail.aspx
c:\www\webmail.css
c:\www\blog/index.aspx
c:\www\blog/header.png
c:\www\blog/blog.css
Blacklisting just the
.aspx files so they fall back to an HTTP fetch allows everything else to be loaded directly from the filesystem:
pagespeed LoadFromFile http://example.com/ c:\www\;
pagespeed LoadFromFileRuleMatch Disallow \.aspx$;
The LoadFromFileRuleMatch directive also takes two arguments. The first is either
Allow or Disallow and functions just like for
LoadFromFileRule above. The second argument, however, is an RE2-format regular expression instead of a file prefix. Remember to escape characters that have special meaning in regular expressions. For example, if instead of
\.aspx$ we had simply
.aspx$ then a file named
example.notaspx would still be forced to load over HTTP because "
." is special syntax for "match any single character".
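The effect of the missing backslash is easy to check. The sketch below uses Python's `re` module as a stand-in for RE2 (the two agree on these simple patterns), and `example.notaspx` is a hypothetical file name chosen to show the over-match.

```python
import re

escaped = re.compile(r"\.aspx$")   # literal dot followed by "aspx" at the end
unescaped = re.compile(r".aspx$")  # any single character followed by "aspx"

for name in ["webmail.aspx", "example.notaspx"]:
    print(name, bool(escaped.search(name)), bool(unescaped.search(name)))
# webmail.aspx True True
# example.notaspx False True
```

Only the escaped pattern restricts the rule to files that really end in the .aspx extension.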
Consider a site with the opposite problem: a few file types can be reliably loaded from file but the rest need interpretation first. For example:
c:\www\index.html
c:\www\site.css
c:\www\script-using-ssi.js
c:\www\generate-image.ashx
c:\www\
In this site
generate-image.ashx needs to be interpreted to make images. The only resources on the site that are generally safe to load are
.css ones. By first blacklisting everything and then whitelisting only the
.css files, we can make PageSpeed do this:
pagespeed LoadFromFile http://example.com/ c:\www\
pagespeed LoadFromFileRuleMatch disallow .*
pagespeed LoadFromFileRuleMatch allow \.css$
This works because order is significant: later rules take precedence over earlier ones.
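The precedence behavior can be modeled with a short Python sketch. It is a simplified illustration under stated assumptions, not PageSpeed's code: the rule table and the function `should_load_from_file` are hypothetical, Python's `re` stands in for RE2, and regular-expression rules are assumed to match anywhere in the filename (which is what an anchored pattern such as `\.css$` implies).

```python
import re
from typing import List, Tuple

# Hypothetical rule table: (kind, allow, pattern), in order of definition.
# "prefix" stands in for LoadFromFileRule, "regex" for LoadFromFileRuleMatch.
Rule = Tuple[str, bool, str]

def should_load_from_file(filename: str, rules: List[Rule]) -> bool:
    """Decide whether a mapped filename may be read directly from disk."""
    # Walk the rules from last-defined to first-defined; the first hit wins.
    for kind, allow, pattern in reversed(rules):
        if kind == "prefix" and filename.startswith(pattern):
            return allow
        if kind == "regex" and re.search(pattern, filename):
            return allow
    # No rule applies: a LoadFromFile mapping allows loading by default.
    return True

rules = [
    ("regex", False, r".*"),     # disallow everything ...
    ("regex", True, r"\.css$"),  # ... then allow only .css files
]
print(should_load_from_file("c:\\www\\site.css", rules))             # True
print(should_load_from_file("c:\\www\\generate-image.ashx", rules))  # False
```

Swapping the two rules would disallow every file, because the catch-all `.*` would then be considered first.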