Pivotal Knowledge Base

How to use Elastic Runtime BLOB storage data

Environment

Product: Pivotal Cloud Foundry®
Version: 1.7.x, 1.8.x, 1.9.x, 1.10.x, 1.11.x, 1.12.x

Purpose

The Elastic Runtime uses a BLOB store for its data. This can be configured as an external S3 or S3-compatible endpoint, or it can use an internal server. This article discusses the two choices and describes the data stored there.

Procedure

BLOB Store Configuration

Elastic Runtime uses a BLOB store for its persistent data. This is configured in the Ops Manager under the Elastic Runtime tile's File Storage config screen. Selecting "Internal" causes an NFS or WebDav VM to be spun up and used for storage. Selecting "External" allows you to provide an external S3-compatible file store. When possible, Pivotal recommends using an external file store. See the Impact / Risks section below for more details.

BLOB Store Data

Elastic Runtime uses the BLOB store to persist five different types of data: application bits, droplets, build packs, build pack cache data, and resource cache data.

  • Application Bits: This is the full set of data for an application when it is pushed to the platform. This is the complete set of application files that is injected into the container when an application is staged.
  • Droplet: This is the full output of the staging process. It includes the application bits plus any additional software that is necessary to run your application.
  • Build Packs: These are the files for all of the build packs that are installed on the platform. These are downloaded to the DEA/Diego Cell and used to stage an application.
  • Build Pack Cache: Build packs have the ability to cache data between staging runs. This allows them to reuse large downloads and other files from run to run, typically so that the build pack runs more quickly. Once staging is complete, the build pack's cache is persisted to the BLOB store.
  • Resource Cache: Before the cf CLI uploads large files to the platform, it first checks with the cloud controller to see if cached copies of the files already exist on the platform. This saves bandwidth and makes uploads faster. The cached data is stored in the BLOB store.
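As a concrete illustration of that check, the CLI computes a SHA1 fingerprint and byte size for each file and asks the cloud controller which of them it already has. The payload shape below is a sketch, not the exact API format; the file is a stand-in created just for this example:

```shell
# Create a throwaway file so the example is self-contained.
FILE="$(mktemp)"
printf 'hello' > "$FILE"

# The fingerprint the CLI computes per file: SHA1 digest plus size in bytes.
SHA1=$(sha1sum "$FILE" | awk '{print $1}')
SIZE=$(wc -c < "$FILE")

# One entry in the resource-match request sent to the cloud controller:
echo "{\"sha1\":\"$SHA1\",\"size\":$SIZE}"
rm -f "$FILE"
```

Files whose fingerprints the cloud controller recognizes are skipped during upload, which is where the bandwidth saving comes from.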

BLOB Store Structure

The BLOB store is organized into the following structure, which corresponds to the different types of data being stored:

  • /cc-packages - contains the application bits
  • /cc-droplets - contains the droplets
  • /cc-buildpacks - contains the build packs
  • /cc-droplets/buildpack_cache - contains the build pack cache
  • /cc-resources - contains the resource cache

When using the internal storage, these directories will be located at /var/vcap/store/shared. When using external storage, you can specify bucket names for packages, droplets, build packs, and resources. The build pack cache will be stored in the droplets bucket.
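For orientation when browsing these directories: blobs are typically sharded into two levels of subdirectories taken from the first four characters of the blob's key. This is an implementation detail of the cloud controller, so verify it against your version; the GUID below is made up for the example:

```shell
# Compute the sharded path for a package blob; the GUID is invented.
GUID="7b8228a2-51a5-4b4f-9f33-d8f8e36d1d4e"
P1=$(printf '%s' "$GUID" | cut -c1-2)
P2=$(printf '%s' "$GUID" | cut -c3-4)

# Prints: /cc-packages/7b/82/7b8228a2-51a5-4b4f-9f33-d8f8e36d1d4e
echo "/cc-packages/$P1/$P2/$GUID"
```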

BLOB Store Clean Up

In general, the BLOB store manages its own storage and cleans up after itself. There are two exceptions: the build pack cache and the resource cache. These are allowed to grow and have no automated clean-up.

Fortunately, it is easy to clean up these two caches manually. The build pack cache can be cleared by running cf curl -X DELETE /v2/blobstores/buildpack_cache as a platform admin user. This completely clears the build pack cache and should not cause problems for well-behaved build packs.

The resource cache can simply be deleted. If you're using the internal store, that's done by using bosh ssh to connect to the VM for the internal store (NFS or WebDav) and deleting the contents of the /var/vcap/store/shared/cc-resources directory. If you're using an external file store, then you would use the file store's API to delete the contents of the resources bucket.
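A sketch of both variants follows. The bucket name is a placeholder, and the destructive commands are left commented out so nothing runs by accident; uncomment them only after substituting your real values:

```shell
# Placeholder bucket name; substitute the resources bucket you configured.
RESOURCES_BUCKET="my-pcf-resources"

# Internal store: run this on the NFS/WebDav VM after connecting with `bosh ssh`.
# rm -rf /var/vcap/store/shared/cc-resources/*

# External S3-compatible store: remove every object in the resources bucket
# (this assumes the AWS CLI; other stores have equivalent bulk-delete APIs).
# aws s3 rm "s3://${RESOURCES_BUCKET}" --recursive
```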

There are no cases where you should delete items from the packages, droplets, or build packs locations directly. If you need to free up resources from those locations, you would need to run either cf delete-buildpack or cf delete, depending on the resource you want to delete.

Impact/Risks

As mentioned above, Pivotal recommends using an external S3-compatible file store because the internal store is a single point of failure. The internal store will also likely perform worse, and it will have scalability issues as the platform's storage needs grow. It is possible to scale up the size of the persistent disk used by the internal store, but this requires downtime because this component is a single point of failure. The amount of downtime grows with the size of the internal store's persistent disk (more data on the internal store means more downtime).

It is not possible to switch from an internal to an external store, or vice versa. The choice must be made during the initial installation of PCF and cannot be changed after the fact. If you need to switch at a later date, then you must take a backup, install PCF from scratch, and restore the backup into the newly installed environment.

Additional Information

For PCF 1.5.x and older, the BLOB store does not use the directory structure described above. In PCF 1.6.x, the structure was changed to the format documented in the BLOB Store Structure section. For PCF 1.5.x, and for any PCF 1.6.x environment that was upgraded from PCF 1.5.x, there is only a single folder containing all of the different resource files (see the BLOB Store Data section). Because the different file types are all mixed together in this single, flat, top-level folder, it is not safe to delete any files from the BLOB store.

In PCF 1.8.x, the internal NFS server was replaced by an internal WebDav server. It is still a single point of failure, and all of the information above applies to the WebDav server as well.

Comments

  • Todd Robbins

    Per this line:

    "If you need to switch at a later date then you must take a backup, install PCF from scratch, and restore the backup into the newly installed environment."

    The backup/restore procedure has you back up WebDav data and restore it afterwards. So I don't think doing backup/restore will help you switch to external storage. It seems there is not a procedure to migrate packages/droplets to an external store. There seems to be no choice but to lose all of these existing files if you need to switch.

  • Daniel Mikusa

    >The backup/restore procedure has you backup webdav data and restore it afterwards. So I don't think doing backup/restore will help you switch to external storage. It seems there is not a procedure to migrate packages/droplets to external store.

    There is no supported way at the moment to move from internal NFS/WebDav to external storage (S3, etc.), but you can manually work through it. The rough process is similar to a backup & restore, with a couple of differences.

    1.) When you restore, after you have imported the Ops Manager configuration but before you've applied changes for the first time, you need to change your storage configuration and point it to the external storage. This should reconfigure the system to point to the external storage. You might be able to do this without doing a complete restore, but without testing I can't say what that would do to a working environment.

    2.) You need to manually restore your backup files to the external storage. The backup is just an archive of files, so it should be a matter of extracting the files and copying / uploading them to your external storage with the same directory structure. Exactly how you do this will likely depend on your external storage. If it's S3, you'd just be uploading files to the buckets you configure in #1.
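    A rough sketch of step 2 for an S3 store; the archive and bucket names are placeholders, and the `aws s3 sync` lines are commented out so they only run once you fill in real values:

```shell
# Backup archive produced by your backup procedure; the name is an example.
BACKUP_ARCHIVE="blobstore-backup.tar.gz"
RESTORE_DIR="$(mktemp -d)"

# Extract only if the archive is actually present.
if [ -f "$BACKUP_ARCHIVE" ]; then
  tar -xzf "$BACKUP_ARCHIVE" -C "$RESTORE_DIR"
fi

# Upload each directory to the bucket configured for it in step 1:
# aws s3 sync "$RESTORE_DIR/cc-packages"   "s3://my-packages-bucket"
# aws s3 sync "$RESTORE_DIR/cc-droplets"   "s3://my-droplets-bucket"
# aws s3 sync "$RESTORE_DIR/cc-buildpacks" "s3://my-buildpacks-bucket"
# aws s3 sync "$RESTORE_DIR/cc-resources"  "s3://my-resources-bucket"
```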

    Hope that helps!

  • Daniel Mikusa

    Interestingly, there appears to be a tool to do this, i.e., migrate from NFS to an S3 provider. It's not supported by Pivotal Support, but here's a link to the GitHub page -> https://github.com/pivotal-cf/goblob.
